
Re: [Omaha.pm] A regex "best fit" finder?



Nothing like that comes to mind. If the regex has to match only the names on your predefined list, most of your example patterns won't do that, because they also match names that aren't on the list. If it's just a heuristic to help you throw out obviously bad input, then it depends on how loose a heuristic you can tolerate.
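
To make the over-matching concrete, here is a small sketch that runs the per-position character-class pattern from the question against a few names. The helper name false_positive and the candidate names OMDDWS003 and OMADNS003 are made up for illustration; the pattern is anchored on the assumption that it has to match the whole name.

```perl
use strict;
use warnings;

# The eight server names from the example list in this thread.
my @names = qw(OMAWWW001 OMAWWW002 OMADNS001 ORDWWW001
               ORDWWW002 ORDWWW003 ORDDNS001 ORDDNS002);
my %listed = map { $_ => 1 } @names;

# The character-class pattern from the question, anchored at both ends.
my $loose = qr/\AO[MR][AD][WD][WN][WS]00[123]\z/;

# Hypothetical helper: 1 if the name matches the pattern but is not listed.
sub false_positive {
    my ($name) = @_;
    return ( $name =~ $loose && !$listed{$name} ) ? 1 : 0;
}

# The last two candidates are not on the list, yet every character
# falls inside the class allowed at its position.
for my $candidate (qw(OMAWWW001 OMDDWS003 OMADNS003)) {
    printf "%s matches=%s listed=%s\n", $candidate,
        ( $candidate =~ $loose ? 'yes' : 'no' ),
        ( $listed{$candidate}  ? 'yes' : 'no' );
}
```

Each position is checked independently, so the pattern accepts any mix of the allowed characters, not just the combinations that actually occur in the list.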

Personally, I'd probably use a tied hash or maybe MongoDB or something similar, fill that with the list and then hit the database for a verification. The database "hash table" could be reloaded every week.

This is old school, but it works quite well for this simple task:

# Common code (shared by the loader and the checker)
use DB_File;
tie %valid, 'DB_File', 'valid_things.db', O_RDWR|O_CREAT, 0644, $DB_HASH
    or die "Cannot open valid_things.db: $!";

# Loader: rebuild the table from the latest list (one valid string per line)
%valid = ();
while (<>) { chomp; $valid{$_}++ }

# Checker: look up an unvalidated string
if ($valid{ $unvalidated_input }) { print "YES!\n" }
else { print "NO!\n" }


2011/9/29 Dan Linder <dan@linder.org>
I have a list of server names that I want to create a regex match against.  It could be done by hand, but the list changes (adds, removes) on a weekly basis.

Does anyone know of a program that can take a list of matches and create a regular expression that will match them?

Example:
OMAWWW001
OMAWWW002
OMADNS001
ORDWWW001
ORDWWW002
ORDWWW003
ORDDNS001
ORDDNS002

I guess the "shortest" match would be /O.......[123]/ but it's kinda 'loose'.

I *think* what I'd like is something like this: /O[MR][AD][WD][WN][WS]00[123]/
(But a smarter regex tool might find something tighter...)

What I *don't* want is: /OMAWWW001|OMAWWW002|...|ORDDNS002/
I don't have enough space in my tool for a 10K long string! :)
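
One middle ground between the loose character classes and the flat alternation is to merge common prefixes into a trie and emit the regex from that. What follows is only a sketch under stated assumptions: build_trie and trie_to_regex are hypothetical helper names, not part of any module mentioned in this thread, and whether the result fits in a size-limited tool depends on how much the real names share prefixes.

```perl
use strict;
use warnings;

# Hypothetical helper: insert each name into a nested-hash trie.
# The empty-string key marks "a name ends at this node".
sub build_trie {
    my @names = @_;
    my %trie;
    for my $name (@names) {
        my $node = \%trie;
        $node = $node->{$_} //= {} for split //, $name;
        $node->{''} = 1;
    }
    return \%trie;
}

# Hypothetical helper: turn a trie node into regex source text.
sub trie_to_regex {
    my ($node) = @_;
    my @alts;
    my $can_end = 0;
    for my $char (sort keys %$node) {
        if ($char eq '') { $can_end = 1; next }
        push @alts, quotemeta($char) . trie_to_regex($node->{$char});
    }
    return '' unless @alts;    # leaf: nothing more to match
    my $body = (@alts == 1 && !$can_end)
        ? $alts[0]
        : '(?:' . join('|', @alts) . ')';
    $body .= '?' if $can_end;  # a shorter name may end here
    return $body;
}

my @names = qw(OMAWWW001 OMAWWW002 OMADNS001 ORDWWW001
               ORDWWW002 ORDWWW003 ORDDNS001 ORDDNS002);
my $source = trie_to_regex(build_trie(@names));
my $tight  = qr/\A(?:$source)\z/;

print "$source\n";
print "OMDDWS003 ", ('OMDDWS003' =~ $tight ? "matches" : "does not match"), "\n";
```

The generated pattern matches exactly the listed names, so it is as strict as the flat alternation, while each shared prefix such as OMA or ORD is written only once.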

Any thoughts?

Thanks,
DanL


--
Andrew Sterling Hanenkamp
sterling@hanenkamp.com
785.370.4454