TinyApps.Org
Small is beautiful


Using wget on Amazon to download search results
Neither Google nor Amazon indexes the descriptions that sellers write for used books, which makes it difficult to find unique or mislisted volumes. To download these seller comments and search them locally with ack, wget was tried:
$ wget "http://www.amazon.com/gp/offer-listing/0131103628/?condition=used"
...
HTTP request sent, awaiting response... 204 NoContent
...
Hrm. How about omitting the User-Agent header? (Passing an empty --user-agent value tells wget not to send one at all.)
$ wget --user-agent="" "http://www.amazon.com/gp/offer-listing/0131103628/?condition=used"
...
HTTP request sent, awaiting response... 200 OK
...
Bingo. If there is more than one page of listings (i.e., more than 15 used books available), all pages can be downloaded via something like wget -i urls.txt --user-agent="", where urls.txt contains one URL per line:
http://www.amazon.com/gp/offer-listing/0131103628/ref=olp_page_1?startIndex=0&condition=used
http://www.amazon.com/gp/offer-listing/0131103628/ref=olp_page_2?startIndex=15&condition=used
http://www.amazon.com/gp/offer-listing/0131103628/ref=olp_page_3?startIndex=30&condition=used
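For longer listings, urls.txt does not have to be typed by hand. The following is only a rough sketch: it assumes 15 offers per page and that the ref=olp_page_N / startIndex pattern in the three URLs above continues unchanged on later pages. The function name gen_offer_urls and its two arguments are made up for illustration.

```shell
#!/bin/sh
# Print one offer-listing URL per results page (hypothetical helper).
# Assumptions: 15 offers per page, and the ref=olp_page_N / startIndex
# pattern from the three example URLs holds for every page.
gen_offer_urls() {
    book_id=$1   # identifier taken from the URL, e.g. 0131103628
    pages=$2     # number of result pages to cover
    page=1
    while [ "$page" -le "$pages" ]; do
        # Page 1 starts at offer 0, page 2 at 15, page 3 at 30, ...
        start=$(( (page - 1) * 15 ))
        printf 'http://www.amazon.com/gp/offer-listing/%s/ref=olp_page_%s?startIndex=%s&condition=used\n' \
            "$book_id" "$page" "$start"
        page=$(( page + 1 ))
    done
}
```

With the function defined, gen_offer_urls 0131103628 3 > urls.txt should reproduce the three lines above, after which wget -i urls.txt --user-agent="" fetches them as before. The page count still has to be read off the first results page by hand.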

/nix | Dec 01, 2011


