[Omaha.pm] WWW::Mechanize & http://screen.yahoo.com/stocks.html




Last meeting we talked about Dean's attempt to spider Yahoo. It wasn't quite working.

I figured it out. Here's the solution:

   http://jays.net/tmp/j.pl.txt
   (Program and output)

I believe the problem was this:

   $agent->follow_link(text => "Next 20", n => 240)

That syntax asks WWW::Mechanize to follow the 240th link whose text is exactly "Next 20". That's not right, of course: there's only one link labelled "Next 20" on a page, so nothing matches. And the link text isn't always "Next 20" anyway; it matches /Next \d\d/, so I changed the syntax to this:

   $agent->follow_link(text_regex => qr/Next \d\d/)
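For anyone who wants to walk the whole result set, here is a rough sketch of how that call could sit inside a paging loop. It is not the program linked above; the sub name crawl_pages, the page cap, and taking the start URL from the command line are all assumptions made for illustration. It uses find_link rather than follow_link so that the missing "Next" link on the last page comes back as undef instead of an error when autocheck is on.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper (not from the original program): fetch $start_url,
# then keep following the "Next NN" link until it disappears or until
# $max_pages pages have been fetched. Returns the number of pages fetched.
sub crawl_pages {
    my ($mech, $start_url, $max_pages) = @_;

    $mech->get($start_url);
    my $pages = 1;

    while ($pages < $max_pages) {
        # find_link returns undef when no link text matches the regex.
        my $link = $mech->find_link(text_regex => qr/Next \d\d/);
        last unless $link;

        $mech->get($link->url_abs);
        $pages++;
    }
    return $pages;
}

# Run only when invoked as a script with a start URL argument.
unless (caller) {
    if (@ARGV) {
        my $mech = WWW::Mechanize->new(autocheck => 1);
        my $count = crawl_pages($mech, $ARGV[0], 50);  # 50 is an arbitrary cap
        print "Fetched $count page(s)\n";
    }
    else {
        warn "usage: $0 START_URL\n";
    }
}
```

The page cap is just a safety net in case the site keeps offering a "Next" link forever; drop it or raise it as needed.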

Looks like it's working. Pretty slick.

Enjoy!

j