Hi, I'm looking for a very, very easy web scraper. It's very easy because the site actually encourages this sort of thing and makes it convenient. I just want to automate the process to fill in a SQL Server 2008 database (although your scraper can simply output everything to CSV if that's easier).
The program you make should do two things, described below. I have attached an Excel file (compressed as a RAR) which clearly shows what I'm looking for in the final product.
1) Go to [url removed, login to view] and click on Screener, then select Overview. See the part that says "export" in blue in the lower right corner? Click that and you get a CSV file. Now select the Valuation view next to Overview. Again, click "export" to download the CSV. Continue doing this for Financial, Ownership, Performance, and Technical. Notice that some of those views repeat the same columns, such as Market Cap. I want all of the columns from these CSV files to be merged into one CSV file, with the duplicate columns removed. The desired result is shown as the FinishedFullList sheet in my Excel example.
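To make the merge in step 1 concrete, here is a minimal sketch in Python. It is for illustration only (the deliverable would still ideally be C#). It assumes that every exported view contains a Ticker column that can be used to line rows up, and that when a column appears in more than one view, keeping the first copy is acceptable. The function names `merge_views` and `to_csv` are hypothetical.

```python
import csv
import io

def merge_views(csv_texts, key="Ticker"):
    """Merge several exported views, keeping the first copy of each column.

    Assumption: every view has a `key` column identifying the row.
    """
    columns = []  # merged header, in first-seen order
    rows = {}     # key value -> merged row (dicts keep insertion order)
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        for name in reader.fieldnames or []:
            if name not in columns:
                columns.append(name)
        for record in reader:
            merged = rows.setdefault(record[key], {})
            for name, value in record.items():
                merged.setdefault(name, value)  # first value seen wins
    return columns, list(rows.values())

def to_csv(columns, rows):
    """Render the merged table as CSV text; missing cells become empty."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="",
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Passing the text of the six downloaded files to `merge_views` in the order Overview, Valuation, Financial, Ownership, Performance, Technical would give a column order that follows the views from left to right.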
2) Notice that the top of the page has a drop-down box labeled "Signals". Click that and select "Top Gainers", then click "export". Repeat this for the other 18 signals. The idea here is to take just the Ticker column from each of these CSV files and put it into a column labeled with the name of the signal (this corresponds to the SignalLists sheet in my Excel file). After this, the signal lists should be used to generate a table with one row per ticker and one column per signal, containing a 1 if the ticker is in the list for that signal and a 0 if it is not. This corresponds to the SignalMatrix sheet in my Excel file.
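The SignalMatrix step can be sketched roughly as follows, again in Python purely as an illustration. The function name `build_signal_matrix` is hypothetical, and sorting the tickers alphabetically is an assumption; the Excel example is the reference for the actual row order.

```python
def build_signal_matrix(signal_lists):
    """Turn {signal name: [tickers]} into a 0/1 membership table.

    Returns (header, rows): one row per ticker, one column per signal.
    """
    signals = list(signal_lists)  # keep the signals in the order given
    members = {name: set(tickers) for name, tickers in signal_lists.items()}
    # Assumption: rows are sorted alphabetically by ticker.
    all_tickers = sorted(set().union(*members.values()))
    header = ["Ticker"] + signals
    rows = [
        [ticker] + [1 if ticker in members[s] else 0 for s in signals]
        for ticker in all_tickers
    ]
    return header, rows
```

The input dictionary would be filled from the Ticker column of each signal's exported CSV, keyed by the signal name as it appears in the drop-down box.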
This whole thing should take less than an hour for a skilled programmer. Ideally this would be in C# .NET and I would keep the source code. I'd like something that runs easily on Windows Server 2008 and which I can run automatically at night. If you can also easily make it so that the FinishedFullList and SignalMatrix tables are uploaded to a SQL Server 2008 instance, then that is great (and might be worth a tiny bit extra). Otherwise, CSV outputs of those two tables (FinishedFullList and SignalMatrix) are acceptable.
Please don't bother bidding a high price; I know this is a trivial task for anyone who knows his stuff. I have numerous other small tasks like this that I want to do, so if you do a great job for a good price (hint: $30 sounds high for this, but I know that's the minimum), I will probably use you for future work like this. Thanks for looking!
I want to amend this task to do two more things (should be easy):
3) Click on Groups (two items to the right of Screener). Then click Overview, Valuation, and Performance. I want to export the CSVs for all of these, first by Sector, then Industry, Country, and Capitalization. For example, the Sector output should be one CSV file that has all of the columns from Overview, Valuation, and Performance, with any duplicate columns removed.
4) Click Insider (two items to the right of Groups). Click "Latest Insider Trading". Save this list as a CSV file. Do the same for "Top Insider Trading Recent Week" and "Top 10% Owner Trading Recent Week". That's it.
All of the final CSV files that result from your program should have a descriptive filename saying what kind of CSV file it is and the date it was created (for example, "IndustryTable-- 2008-10-01").
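As a small illustration of the naming scheme, a Python sketch might look like the following. The helper name `output_filename` is hypothetical, and appending a ".csv" extension is an assumption, since the example name above does not show one.

```python
from datetime import date

def output_filename(table_name, created=None):
    """Build a name like 'IndustryTable-- 2008-10-01.csv'.

    Assumption: a '.csv' extension is wanted after the date.
    """
    created = created or date.today()  # default to the day the file is made
    return f"{table_name}-- {created.isoformat()}.csv"
```

Calling it with only a table name stamps the file with the date of the run, which fits a job scheduled to run nightly.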