For this exciting project you will be scraping a large content website.
We will ideally accept bids in the $50-$100 range.
We will give you the address of a website with ~30,000 pages.
For all of the normal content pages, you will need to:
1) Scrape the content
2) Present the result as CSV or SQL.
3) Parse the content and save the following fields: title, abstract, keywords, body, category (some fields will not be available for every page)
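As a rough illustration of the CSV deliverable, here is a minimal Python sketch. The column order, the `records_to_csv` helper name, and the choice to leave unavailable fields as empty cells are our assumptions for illustration, not fixed requirements; you are free to use another tool or layout.

```python
import csv
import io

# Field names taken from the list above; the column order is an assumption.
FIELDS = ["title", "abstract", "keywords", "body", "category"]

def records_to_csv(records):
    """Serialize scraped records (dicts) to CSV text.

    Fields missing from a record are written as empty cells, and any
    extra keys are ignored.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()

# Hypothetical sample record: this page has no abstract or keywords.
sample = [
    {
        "title": "Example page",
        "body": "First paragraph.\n\nSecond paragraph.",
        "category": "News",
    }
]
print(records_to_csv(sample))
```

The csv module quotes the body automatically, so paragraph breaks inside a cell survive a round trip through any standard CSV reader.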
The resulting content must be free of any images and HTML tags, but must preserve spacing and paragraph breaks. For instance, bolded text should come through as plain text.
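To show the kind of cleanup we mean, here is a minimal Python sketch using only the standard library. The `TextExtractor` class, the `html_to_text` helper, the set of tags treated as paragraph boundaries, and the use of a blank line as the paragraph indicator are all assumptions for illustration; the real site may need a different tag list.

```python
from html.parser import HTMLParser

# Assumption: these tags mark paragraph boundaries on the target site.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
# Text inside these tags is never page content.
SKIP_TAGS = {"script", "style"}

class TextExtractor(HTMLParser):
    """Collect visible text, starting a new paragraph at each block tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.paragraphs = [[]]
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.paragraphs.append([])
        # All other tags (b, i, img, a, ...) are dropped; their text is kept.

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.paragraphs.append([])

    def handle_data(self, data):
        if not self.skip_depth:
            self.paragraphs[-1].append(data)

def html_to_text(html):
    """Return plain text with paragraphs separated by one blank line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace inside each paragraph, drop empty ones.
    cleaned = (" ".join("".join(chunks).split()) for chunks in parser.paragraphs)
    return "\n\n".join(text for text in cleaned if text)

page = "<p>Some <b>bold</b> text.</p><img src='x.png'><p>Second   paragraph.</p>"
print(html_to_text(page))
# Some bold text.
#
# Second paragraph.
```

The bold markup and the image are removed, the words keep their spacing, and each paragraph stays separate.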
We are aware of several website scraping tools (Velocityscape Web Scraper Plus, etc.), and are happy for you to use one of them.
We are looking to complete this project quickly, by Sunday, March 7, at the latest.
We will need the freelancer to show us a small number of records for our approval before completing the full project.
Please include the phrase "I'm your scraper" in your response, so we know you have read this description.
We expect to have additional work like this.