Re: [Omaha.pm] Help with parsing HTML
> What are you trying to extract?
For example, I'd like the content inside each set of
<SCRIPT></SCRIPT> tags in a given file.
Jay, I tried your suggestion of Text::Balanced, but didn't have any
luck.
Here's what I did with Text::Balanced:
__________________________
use Text::Balanced qw( extract_tagged );
foreach $arg ( @ARGV ) {
    open (IN,$arg) or next;
    local $/;
    $filecontent = <IN>;
    ($extracted, $remainder)
        = extract_tagged($filecontent, '<SCRIPT>', '</SCRIPT>', undef, undef);
    print "extracted: $extracted\n";
    print "remainder: $remainder\n";
}
___________________________
But nothing was ever returned in $extracted; the whole file always ended
up in $remainder. I tried many variations of the 2nd and 3rd arguments
to extract_tagged(), but nothing worked. Is there anything obviously
wrong with how I am using it? Once I get that to work, I plan to put it
inside a while loop that keeps calling extract_tagged() until I've gone
through the whole file.
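
One thing I have not ruled out: as I read the Text::Balanced
documentation, extract_tagged() only matches an opening tag at the
current position in the string, after skipping an optional prefix (the
4th argument, which defaults to whitespace). My files do not start with
<SCRIPT>, so that may be why everything lands in the remainder. If that
is the cause, the loop I have in mind might look like the sketch below.
It assumes the tags are written exactly as <SCRIPT> and </SCRIPT> (upper
case, no attributes), and script_bodies is just a name made up for
illustration.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_tagged);

# Hypothetical helper: returns the text between each <SCRIPT>...</SCRIPT>
# pair. Assumes literal upper-case tags without attributes.
sub script_bodies {
    my ($text) = @_;
    my @bodies;
    while (1) {
        # 4th argument: a prefix pattern to skip before the opening tag.
        # (?s) lets "." cross newlines; the lookahead stops the skip just
        # before the next <SCRIPT>.
        my ($extracted, $remainder, $prefix, $open, $body) =
            extract_tagged($text, '<SCRIPT>', '</SCRIPT>',
                           '(?s).*?(?=<SCRIPT>)');
        # On failure nothing is extracted, so stop looping.
        last unless defined $extracted && length $extracted;
        push @bodies, $body;
        $text = $remainder;    # carry on after the closing tag
    }
    return @bodies;
}

foreach my $arg (@ARGV) {
    open(my $in, '<', $arg) or next;
    my $filecontent = do { local $/; <$in> };    # slurp the whole file
    close($in);
    print "script: $_\n" for script_bodies($filecontent);
}
```

In list context the 5th return value is the text between the tags, which
is the part I actually want. Real pages often use lower-case tags or
attributes such as <script type="...">, so the tag and prefix patterns
would need loosening for those.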
-Ryan