Hello, we sell digital cameras and need to populate product specs in osCommerce quickly. I am looking for a small script that will extract data from www.shopping.com. Before I go into details: I do have permission to spider their site and to use the data, since we buy their advertising.
I would prefer a friendly script, one that does not hammer [url removed, login to view] or use a lot of resources.
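One way to read "friendly" is a fixed minimum pause between requests. The following is only a minimal sketch of that idea: the `Throttle` class and its parameter names are made up for illustration, and the right delay value is an assumption to be agreed on.

```python
import time


class Throttle:
    """Keep consecutive requests at least min_delay_seconds apart."""

    def __init__(self, min_delay_seconds, clock=time.monotonic, sleep=time.sleep):
        # clock and sleep are injectable so the class can be tested without waiting
        self.min_delay = min_delay_seconds
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request; None before the first one

    def wait(self):
        """Call right before each page request; sleeps only when needed."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` before every page fetch would space the requests out, for example `Throttle(2.0)` for roughly one request every two seconds.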
Here is what the script needs to do:
1. From our db, pull the manufacturer name and SKU (I can write this query).
2. From the starting page: [url removed, login to view] submit the manufacturer name and SKU as the search terms.
3. The product page should be found. The example of "Fuji S5000" will give:
[url removed, login to view]+s5000
4. On the right side of this page there will be a box. We need to visit the link that reads "See product details".
5. This page is where the data will be collected and stored into our db. Here is the breakdown:
COLLECT EVERYTHING BETWEEN THESE TAGS
do: "insert into product_desc" into db according to product SKU
// product specs
ForEach spec header:
COLLECT THIS TEXT
find spec header info:
COLLECT THIS TEXT
do: insert spec header and spec header info into db according to SKU
Jump to step one
Continue until no more query results from step one
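The spec collection in step 5 could be sketched roughly as follows. This is an illustration under loud assumptions, not the actual page layout: the real tags are not shown in this post, so the `spec-header` and `spec-info` class names on `td` cells are hypothetical, as is the `product_specs` table, and `sqlite3` stands in for the real shop database.

```python
import sqlite3
from html.parser import HTMLParser


class SpecParser(HTMLParser):
    """Collect (spec header, spec header info) pairs from a details page.

    ASSUMPTION: headers sit in <td class="spec-header"> and values in
    <td class="spec-info">. The real markup must be checked on the site.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []      # finished (header, info) tuples, in page order
        self._field = None   # "header", "info", or None when outside a spec cell
        self._text = []      # text fragments of the cell being read
        self._header = None  # last header seen, waiting for its info

    def handle_starttag(self, tag, attrs):
        css = dict(attrs).get("class")
        if tag == "td" and css == "spec-header":
            self._field, self._text = "header", []
        elif tag == "td" and css == "spec-info":
            self._field, self._text = "info", []

    def handle_data(self, data):
        if self._field is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "td" or self._field is None:
            return
        text = " ".join("".join(self._text).split())  # collapse whitespace
        if self._field == "header":
            self._header = text
        elif self._header is not None:
            self.pairs.append((self._header, text))
            self._header = None
        self._field = None


def extract_specs(html):
    """Return a list of (spec header, spec header info) tuples."""
    parser = SpecParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


def store_specs(conn, sku, pairs):
    """Insert every pair under the given SKU (hypothetical table layout)."""
    conn.executemany(
        "INSERT INTO product_specs (sku, spec_header, spec_info) VALUES (?, ?, ?)",
        [(sku, header, info) for header, info in pairs],
    )
    conn.commit()
```

Because each header is held until its info cell arrives, every Spec Header is stored with its own Spec Header Info, and a header without a value is simply dropped rather than paired with the wrong text.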
OK, that is the breakdown. You do not have to write the program exactly as I have indicated, but what is important is that I get all Spec Headers with the correct Spec Header Info. If you want to use arrays, that is fine by me.
I need this script to be in one file, for cron purposes. Since the [url removed, login to view] website changes from time to time, I am also looking for the code to be clean in case I need to make modifications. I am not a real programmer, but I can make small changes when needed. I can only read clean code.
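As a rough idea of the kind of single-file layout that would be easy to maintain, here is a minimal sketch of the outer loop. Everything named here is an assumption for illustration: `SEARCH_URL` is a placeholder (the real address is not shown in this post), lowercasing and joining the terms with `+` is inferred from the "+s5000" example, and `fetch_page` and `save_specs` are hypothetical callables standing in for the page lookup (steps 2 to 4) and the db insert (step 5).

```python
from urllib.parse import quote_plus

# Placeholder only: the real search address is elided in the post.
SEARCH_URL = "https://example.invalid/search?q="


def build_search_url(manufacturer, sku):
    """Join manufacturer and SKU into one query, e.g. 'fuji+s5000'."""
    return SEARCH_URL + quote_plus(f"{manufacturer} {sku}".lower())


def run(products, fetch_page, save_specs):
    """Process every (manufacturer, sku) row and return the SKUs that failed.

    One bad product is logged and skipped so the cron run can finish.
    """
    failed = []
    for manufacturer, sku in products:
        try:
            html = fetch_page(build_search_url(manufacturer, sku))
            save_specs(sku, html)
        except Exception as exc:
            print(f"skipping {sku}: {exc}")
            failed.append(sku)
    return failed
```

Keeping the site-specific parts in a few small, named functions means that when the site layout changes, only one function needs editing.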
I need this done ASAP. This job shouldn't take more than 3-4 hours with testing included. I am willing to pay $35/hour or a max of $140. Please don't bid if you can't finish this within 2 days from the closing date of bidding.