Forum:Automated scrape
From Woot Wiki
--ikishk 09:07, 19 December 2006 (UTC)
Figured talking about the scrape script fits under forum more than discussion... So I copied it over:
So what's the deal with the main page? Why hasn't it been updated in a week, and who's been doing it before? I'll volunteer to write a Perl script (with WWW::Mechanize) to update it every morning; just gimme a template to post with.--ikishk 16:12, 15 December 2006 (UTC)
Ok, scripted and cronned at 00:02:30 Central. I didn't script the price pulls from froogle or streetprices, but a link to the resulting pages is there. I'll try to script those pulls later.--ikishk 20:03, 15 December 2006 (UTC)
- MSRP, froogle, yahoo, streetprices, and amazon price pulls automated too. That was fun.--ikishk 21:07, 15 December 2006 (UTC)
- added side deal--ikishk 08:51, 19 December 2006 (UTC)
- added nextag pull--ikishk 08:51, 19 December 2006 (UTC)
- added an asterisk on Tuesdays denoting that comparison prices are for ONE item, not two.--ikishk 08:51, 19 December 2006 (UTC)
- added archiving of the product the script replaces, e.g. WootArchive18Dec2006.--ikishk 08:54, 19 December 2006 (UTC)
- added shopzilla--ikishk 09:07, 19 December 2006 (UTC)
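For anyone wanting to reproduce the archive naming and the Tuesday asterisk from the list above, here is a minimal sketch in Python (not the actual Perl script, which isn't posted here). The function names `archive_title` and `comparison_note` are made up for illustration, and the title format is inferred from the single example WootArchive18Dec2006, so the zero-padding of single-digit days is an assumption.

```python
from datetime import date

# English month abbreviations, spelled out so the result does not depend on locale.
MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def archive_title(day: date) -> str:
    """Build an archive page title shaped like WootArchive18Dec2006.

    Assumption: days below 10 are zero-padded; the thread only shows one example.
    """
    return f"WootArchive{day.day:02d}{MONTHS[day.month - 1]}{day.year}"


def comparison_note(day: date) -> str:
    """Return '*' on Tuesdays (comparison prices are for ONE item), else ''."""
    # date.weekday() counts Monday as 0, so Tuesday is 1.
    return "*" if day.weekday() == 1 else ""
```

Calling `archive_title(date(2006, 12, 18))` reproduces the example title from the list.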
If there are other locations you'd like to see scripted, please post a sample URL with the current woot item. I can only script something whose results follow a consistent template. For instance, pricegrabber is all over the place and they don't sort low to high overall: they sort low to high on featured, then low to high on everything else.--ikishk 08:51, 19 December 2006 (UTC)
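To illustrate the pricegrabber problem described above: if results come back as a featured group sorted low to high followed by everything else sorted low to high, the first row is not necessarily the cheapest, so a script has to look at every row. A rough sketch in Python, assuming the rows have already been parsed; the `Listing` type, the `lowest_price` helper, and the sample sellers and prices are all hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Listing(NamedTuple):
    seller: str
    price: float
    featured: bool


def lowest_price(listings: Sequence[Listing]) -> Optional[Listing]:
    """Return the cheapest listing regardless of which group it sits in."""
    if not listings:
        return None
    return min(listings, key=lambda item: item.price)


# Made-up rows in the order described: featured first, then the rest.
rows = [
    Listing("FeaturedShopA", 59.99, True),
    Listing("FeaturedShopB", 64.50, True),
    Listing("OtherShopC", 49.95, False),
    Listing("OtherShopD", 71.00, False),
]

first_row = rows[0]            # what a naive "take the top result" scrape would use
cheapest = lowest_price(rows)  # the real minimum, found in the non-featured group
```

This only covers the sorting issue; whether the page markup itself is regular enough to parse is a separate question.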
- I've started adding product review links and manufacturer's site as of today. I'll do my best to keep this up going forward, but if it isn't updated by 9AM EST, feel free to jump in. Also feel free to add any additional reviews, etc. --noctis 01:36, 20 December 2006
- use the "Review" template for reviews. see http://woot.wikia.com/wiki/Talk:Main_Page#Every_day_item
- wikia altered the "Move" link, broke script. fixed. --ikishk 05:30, 21 March 2007 (UTC)
- woot updated some HTML layouts last week in preparation for shirt.woot and it broke my scrape. I've now updated it to use the Salerss.aspx. --ikishk 04:07, 29 June 2007 (UTC)
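As a rough idea of why reading a feed tends to survive layout changes better than scraping HTML, here is a small Python sketch that pulls the first item out of an RSS 2.0 document using only the standard library. The sample feed below is invented for illustration and is not woot's actual feed; its element names just follow the generic RSS 2.0 layout, and `first_item` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Invented sample feed; the real feed's contents and fields are not shown in this thread.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example deals</title>
    <item>
      <title>Example Widget</title>
      <link>http://example.com/widget</link>
    </item>
  </channel>
</rss>"""


def first_item(feed_xml: str) -> Optional[Dict[str, str]]:
    """Return the title and link of the first <item>, or None if the feed has none."""
    root = ET.fromstring(feed_xml)
    item = root.find("./channel/item")
    if item is None:
        return None
    return {
        "title": item.findtext("title", default=""),
        "link": item.findtext("link", default=""),
    }


current = first_item(SAMPLE_FEED)
```

Because the lookup depends on element names rather than page layout, a site redesign that leaves the feed alone would not break it.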
- sellout.woot
Could you add sellout.woot.com to the scrape? --Will N. Dowd 13:47, 9 October 2007 (UTC)
- Nope. They have elected not to offer an RSS feed for sellout.woot, which I read as meaning they don't want it syndicated. I can go back and use the old pre-RSS scrape for sellout.woot, but it just seems wrong if they don't want it scraped.--ikishk 15:17, 9 October 2007 (UTC)