I need a scraping expert. I'm looking to scrape [url removed, login to view] and store the following attributes in a CSV file:
Main category title
Sub category title
Item photo filename (filename only, not the URL)
* Download of the main product image into /imgs/, saved under the filename above
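
To illustrate the "filename only, not the URL" requirement, here is a minimal Python sketch. The function names image_filename and download_image are hypothetical, the example URL is a placeholder, and the sketch assumes the image URL ends in the file's name (any query string is ignored). It is one possible approach, not a required implementation.

```python
import os
import posixpath
from urllib.parse import unquote, urlsplit
from urllib.request import urlopen


def image_filename(image_url):
    """Return only the filename part of an image URL (no host, path, or query)."""
    path = urlsplit(image_url).path          # drops scheme, host, and query string
    return posixpath.basename(unquote(path))  # keeps the text after the last "/"


def download_image(image_url, img_dir="imgs"):
    """Download the image into img_dir and return the filename stored in the CSV.

    Assumption: img_dir is the local /imgs/ folder from the brief.
    """
    os.makedirs(img_dir, exist_ok=True)
    filename = image_filename(image_url)
    target = os.path.join(img_dir, filename)
    with urlopen(image_url, timeout=30) as response, open(target, "wb") as out:
        out.write(response.read())
    return filename
```

The value returned by download_image is what would go in the "Item photo filename" column, so the CSV and the /imgs/ folder always agree.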
The script will begin here:
[url removed, login to view]
1. Your script must scrape all the sub-categories of each main category.
2. Your script must collect up to 1,000 items per category. This requires it to follow the "next" pagination link to obtain more results if 1,000 hasn't been reached yet. I believe they show 30 items per page.
3. For each of the 30 items on a results page, your script must visit the item detail page and scrape the title, description, price, shipping cost and the image filename of the product photo, and download that photo so it is saved locally as /imgs/[url removed, login to view]. The CSV simply stores the filename of the image, but the image itself must also be downloaded to /imgs/filename.jpg. It is very important that we have the pictures saved locally.
I need the photos because these listings may expire, and by the time I import my CSV the remote images may be dead, so they need to be downloaded and stored in /imgs/.
4. Some categories have fewer than 1,000 items. In that case, take what is there and move on to the next category.
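
The pagination rule in steps 2 and 4 can be sketched roughly as below, in Python. Here fetch_page is a hypothetical callable (not something the site provides) that returns the list of item links on a given results page and an empty list once the pages run out. The real script would implement it with whatever HTTP and HTML parsing tools it uses.

```python
def collect_items(fetch_page, limit=1000):
    """Collect up to `limit` items for one category by following pagination.

    `fetch_page(page_number)` is a hypothetical callable returning the items
    found on that results page, or an empty list past the last page.
    """
    items = []
    page = 1
    while len(items) < limit:
        page_items = fetch_page(page)
        if not page_items:
            break  # category has fewer than `limit` items: stop and move on
        items.extend(page_items)
        page += 1
    return items[:limit]  # trim any overshoot from the final page
```

With 30 items per page, a large category stops after 34 pages (1,020 items, trimmed to 1,000), while a small category simply ends when a page comes back empty.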
At the end, I should expect a CSV format like:
'Main category title','Category title','Product title','Product description','Price','Shipping','Condition','Item photo filename'
and also an /imgs/ folder containing all the item photos, named as specified in the CSV above.
Please be sure your script escapes single quote (') characters as \' so they don't break the CSV formatting of the fields.
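
As a rough illustration of the CSV format and escaping described above, here is a minimal Python sketch using the standard csv module. The helper name make_writer and the sample row are made up for illustration. The column names are the ones listed in the expected format.

```python
import csv
import io

FIELDS = [
    "Main category title", "Category title", "Product title",
    "Product description", "Price", "Shipping", "Condition",
    "Item photo filename",
]


def make_writer(fileobj):
    """Return a CSV writer that wraps every field in single quotes
    and escapes embedded single quotes as \\' instead of doubling them."""
    return csv.writer(
        fileobj,
        quotechar="'",
        quoting=csv.QUOTE_ALL,
        doublequote=False,
        escapechar="\\",
        lineterminator="\n",
    )


# Hypothetical sample data, written to an in-memory buffer for demonstration.
buffer = io.StringIO()
writer = make_writer(buffer)
writer.writerow(FIELDS)
writer.writerow(["Clothing", "Men's shoes", "Leather boots", "Worn once",
                 "49.99", "5.00", "Used", "12345.jpg"])
print(buffer.getvalue())
```

In the real script the buffer would be replaced by a file opened with newline="" so the csv module controls the line endings.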
Looking forward to a quick turnaround.