
Re: [Omaha.pm] A regex "best fit" finder?



Yeah, my example list was much simpler than the actual list of files.  I envision the final solution being a number of match strings joined together with "|"...
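
Roughly what I have in mind, as a minimal sketch (Python used purely for illustration; the `sites` and `functions` lists and the `alternation` helper are hypothetical stand-ins, not the real inventory):

```python
import re

# Hypothetical code lists; in practice these would come from the real inventory.
sites = ["OMA", "ORD"]
functions = ["WWW", "DNS", "SMTP"]

def alternation(words):
    """Join words with "|", escaping any regex metacharacters."""
    # Longest first, so a shorter code cannot shadow a longer one it prefixes.
    ordered = sorted(words, key=len, reverse=True)
    return "|".join(re.escape(w) for w in ordered)

# Site code, then function code, then a three-digit number.
pattern = re.compile(rf"({alternation(sites)})({alternation(functions)})\d{{3}}")
print(pattern.pattern)
```

Generating the alternation from lists keeps the pattern in step with the data, though it only validates against codes that are already known.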

Thanks, I'll keep looking.

Dan

On Thu, Sep 29, 2011 at 16:15, Christopher Cashell <topher-pm@zyp.org> wrote:
2011/9/29 Dan Linder <dan@linder.org>:
> Example:
> OMAWWW001
> OMAWWW002
> OMADNS001
> ORDWWW001
> ORDWWW002
> ORDWWW003
> ORDDNS001
> ORDDNS002
> Any thoughts?

I've dealt with a similar thing at work.  It can be incredibly tricky,
depending on the names in question, how variable they are, and whether
you just want to match them roughly or want to validate them.

For example, from the data listed, they appear to be all of the form:
3 letter site/city code, followed by 3 letter function/machine code,
followed by a 3 digit number.  If you just wanted to catch anything
that matches that format, you could possibly do something like:

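/[A-Z]{3}[A-Z]{3}\d{3}/

(\w{3} would also work in place of [A-Z]{3}, but \w matches digits and
underscores as well as letters.)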
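
To see how loose that kind of format check can be, here is a small sketch in Python (chosen only for illustration; the patterns carry over to Perl unchanged). It compares a \w-based pattern with a letters-only one. The last two sample names are made up, and using fullmatch() to require the whole string to match is an assumption about how the check would be applied.

```python
import re

# Loose format check: three word characters, three more, then three digits.
LOOSE = re.compile(r"\w{3}\w{3}\d{3}")
# Stricter variant (an assumption about intent): uppercase letters only.
LETTERS = re.compile(r"[A-Z]{3}[A-Z]{3}\d{3}")

names = ["OMAWWW001", "ORDDNS002", "OMA_12345", "omawww001"]

for name in names:
    # fullmatch() requires the entire name to fit the pattern.
    print(name, bool(LOOSE.fullmatch(name)), bool(LETTERS.fullmatch(name)))
```

The \w version accepts "OMA_12345" because \w also matches digits and underscores; the letters-only version rejects it.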

Depending on the number of site/city codes and the number of
function/machine codes, you could do something like (Note: start of
line/field anchor added to improve performance with alternations;
depending on how much data you're processing, it may not matter or be
applicable):

/^(OMA|ORD)(WWW|DNS)\d{3}/

This would let you validate not only that a name matches the 3 letter,
3 letter, 3 digit form, but also that it uses expected site and
function codes.  This also has the advantage that it works with codes
that aren't exactly 3 letters (e.g. if you want to use SMTP for a mail
server).
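
A minimal sketch of that validation, again in Python for illustration only. The parse_name helper is hypothetical, and two things differ from the pattern above by assumption: the digits get their own capture group, and fullmatch() is used so that trailing characters are rejected as well.

```python
import re

# Same alternation idea as above, with the number captured too.
NAME = re.compile(r"(OMA|ORD)(WWW|DNS)(\d{3})")

def parse_name(name):
    """Return (site, function, number) for a valid name, else None."""
    # fullmatch() anchors at both ends, so "OMAWWW0012" is rejected.
    m = NAME.fullmatch(name)
    return m.groups() if m else None

print(parse_name("ORDWWW003"))   # a name from the example list
print(parse_name("DENWWW001"))   # DEN is not among the allowed site codes here
```

Capturing the pieces means the same match that validates a name also gives you the site and function codes to work with.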

If you've got a decent number of entries, you might want to reformat
it with /x for increased readability:

/^ (OMA|ORD|DEN|SEA|LAX)
  (WWW|DNS|SMTP|IRC|DB)
  \d{3} /x
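
For comparison, a sketch of the same layout outside Perl: Python's re.VERBOSE flag plays the role of /x, ignoring whitespace in the pattern and allowing comments. The comments and sample names below are illustrative additions.

```python
import re

NAME = re.compile(
    r"""
    ^ (OMA|ORD|DEN|SEA|LAX)    # site or city code
      (WWW|DNS|SMTP|IRC|DB)    # function or machine code
      \d{3}                    # three-digit sequence number
    """,
    re.VERBOSE,
)

for name in ["SEASMTP001", "LAXDB042", "NYCWWW001"]:
    # match() anchors at the start only, like the ^ in the pattern.
    print(name, bool(NAME.match(name)))
```

With one alternative list per line, adding a new site or function code is a one-word change that is easy to review.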

Without knowing more about the current names, as well as potential
future names, that's probably the best I can think of.

> Thanks,
> DanL

--
Christopher



--
***************** ************* *********** ******* ***** *** **
"Quis custodiet ipsos custodes?"
    (Who can watch the watchmen?)
    -- from the Satires of Juvenal
"I do not fear computers, I fear the lack of them."
    -- Isaac Asimov (Author)
** *** ***** ******* *********** ************* *****************