I need a data scraper for the Italian Yellow Pages.
The scraper needs to collect the following information:
- Category (e.g. plumbers)
- Business name
- Description (id="textDescriptor")
- All phone and fax numbers
- Website address
- Email address
A business may have more than one phone number; these should be broken into the following fields:
- AH Contact
I also need the address broken into separate fields:
- Street number and name
The script must be able to:
- Scrape a large number of entries without being blocked by the site.
- Export the data to a CSV or MySQL file, or both.
- Provide a simple HTML interface that lets me start and stop the script and shows basic progress feedback.
- Extract the data from the sponsored listings.
- Automatically follow the continuation pages (2, 3, 4 and onwards) to collect the full data.
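To make the CSV export requirement concrete, here is a rough sketch of what I have in mind. It is only an illustration, not a required design: the `Listing` class, the column names and the `write_csv` helper are all hypothetical, and the phone numbers are joined into one placeholder column because the per-type phone fields and the address fields are not spelled out here.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Iterable, List, TextIO


@dataclass
class Listing:
    """One scraped business entry (hypothetical structure)."""
    category: str
    business_name: str
    description: str = ""
    phones: List[str] = field(default_factory=list)  # a business may have several
    website: str = ""
    email: str = ""


# Assumed column order; the real export would add the phone and address fields.
CSV_COLUMNS = ["category", "business_name", "description",
               "phones", "website", "email"]


def write_csv(listings: Iterable[Listing], out: TextIO) -> None:
    """Write a header row followed by one row per listing."""
    writer = csv.DictWriter(out, fieldnames=CSV_COLUMNS)
    writer.writeheader()
    for item in listings:
        writer.writerow({
            "category": item.category,
            "business_name": item.business_name,
            "description": item.description,
            # Placeholder: all numbers in one column, separated by "; ".
            "phones": "; ".join(item.phones),
            "website": item.website,
            "email": item.email,
        })


if __name__ == "__main__":
    buffer = io.StringIO()
    sample = Listing("plumbers", "Example Business", phones=["02 1234", "02 5678"])
    write_csv([sample], buffer)
    print(buffer.getvalue())
```

When writing to a real file, open it with `newline=""` so the csv module controls the line endings.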
I should be able to specify the maximum number of records to retrieve.
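As a rough sketch of how the paging and the record limit could work together: the loop below walks pages 1, 2, 3 and so on, pauses between requests, and stops at the limit. The names `crawl_category`, `fetch_page`, `max_records` and `delay_seconds` are hypothetical, and a fixed pause is only one assumed way of staying polite to the site; the actual fetching and parsing of a results page is left to whatever `fetch_page` does.

```python
import time
from typing import Callable, Dict, List


def crawl_category(
    fetch_page: Callable[[int], List[Dict]],
    max_records: int,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Dict]:
    """Collect listings page by page until max_records is reached
    or a page comes back empty.

    fetch_page(page_number) is assumed to download and parse one
    results page and return its listings as a list of dicts.
    """
    records: List[Dict] = []
    page = 1
    while len(records) < max_records:
        listings = fetch_page(page)
        if not listings:
            break  # no more result pages
        # Take only as many listings as are still needed.
        records.extend(listings[: max_records - len(records)])
        if len(records) >= max_records:
            break
        page += 1
        sleep(delay_seconds)  # pause before requesting the next page
    return records


if __name__ == "__main__":
    # Stand-in for real page fetching: three pages of fake listings.
    fake_pages = {
        1: [{"name": "A"}, {"name": "B"}],
        2: [{"name": "C"}, {"name": "D"}],
        3: [{"name": "E"}],
    }
    result = crawl_category(lambda p: fake_pages.get(p, []),
                            max_records=3, delay_seconds=0)
    print([r["name"] for r in result])
```

Passing the fetch and sleep functions in as arguments keeps the loop testable without touching the live site.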