This project consists of creating a web spider (crawler) that will: 1) periodically collect financial statement data from a well-known financial information web site; 2) store the collected data in an orderly fashion in a Microsoft Access database; 3) perform automated calculations on the collected data and create output tables.
This is how the web spider/crawler will work in detail:
1) Data collection
The web site to be crawled belongs to a financial information group and is updated continuously. It is divided into sections: news, discussion, markets, country analysis, etc. One section is devoted to displaying financial statement data for many thousands of companies whose shares are listed on many stock exchanges. Each company's financial statement is displayed in detail: a) balance sheet; b) income statement; c) cash flow statement. To access the financial statement data, one needs to enter the company's name into a form. The web site answers the query by displaying the data.
The crawler/spider will read a list of company names from a Microsoft Access database. Then it will enter the names, one at a time, into the web site's query form. When each company's financial statement is displayed, the crawler/spider will collect the data, line by line, of the balance sheet, income statement and cash flow statement. It will also collect other data, such as the last traded price of the company's shares, the description of what the company does, etc.
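To give bidders an idea of the "line by line" collection step, here is a minimal sketch of parsing one statement into (label, value) pairs. The table markup, labels and figures below are made up; the real site's HTML would have to be inspected first, and fetching the page (e.g. with a form POST) is left out.

```python
# Sketch: extract (label, value) rows from a financial-statement HTML table.
# Assumes a simple <tr><td>label</td><td>value</td></tr> layout -- an
# assumption, not the real site's markup.
from html.parser import HTMLParser

class StatementParser(HTMLParser):
    """Collects (label, value) pairs from two-cell table rows."""
    def __init__(self):
        super().__init__()
        self.rows = []        # finished (label, value) tuples
        self._cells = []      # cells of the row currently being read
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
        elif tag == "td":
            self._in_td = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False
        elif tag == "tr" and len(self._cells) == 2:
            self.rows.append((self._cells[0], self._cells[1]))

    def handle_data(self, data):
        if self._in_td and data.strip():
            self._cells.append(data.strip())

def parse_statement(html):
    p = StatementParser()
    p.feed(html)
    return p.rows

# Made-up balance-sheet fragment for illustration:
sample = """
<table>
  <tr><td>Total assets</td><td>1,250.4</td></tr>
  <tr><td>Total liabilities</td><td>830.1</td></tr>
</table>
"""
print(parse_statement(sample))
# [('Total assets', '1,250.4'), ('Total liabilities', '830.1')]
```

Each parsed pair maps directly to one line of the balance sheet, income statement or cash flow statement to be stored.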
2) Data storage
All collected data will then be stored, line by line and field by field, in the Microsoft Access database. When new data for a company is collected, it will be appended to the already stored information. As an example, in the first run the crawler/spider will collect financial information for the fiscal year ended September 30, 2008; one year later, in a second run, it will collect financial information for the fiscal year ended September 30, 2009; and so on.
The crawler/spider will be able to recognize whether data has already been collected and stored, and it will not collect it twice unless the data refers to a different fiscal period (e.g. 2004, 2006, 2010, etc.).
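The store-once-per-fiscal-period rule above can be sketched as follows. For simplicity the "database" here is an in-memory dict; in the real project this check would be a SELECT on a (company, fiscal_period) key in the Access database (e.g. via pyodbc or DAO), and the table/column names are assumptions.

```python
# Sketch: append statement lines only if this (company, fiscal_period)
# pair has not been stored yet, per the requirement above.

def store_statement(db, company, fiscal_period, rows):
    """Append rows for (company, fiscal_period) unless already stored.
    Returns True if appended, False if it was a duplicate period."""
    key = (company, fiscal_period)
    if key in db["keys"]:
        return False                      # same fiscal period: skip
    db["keys"].add(key)
    for label, value in rows:
        db["lines"].append((company, fiscal_period, label, value))
    return True

db = {"keys": set(), "lines": []}
rows = [("Total assets", "1,250.4")]
print(store_statement(db, "ACME Corp", "2008-09-30", rows))   # True: first run, appended
print(store_statement(db, "ACME Corp", "2008-09-30", rows))   # False: duplicate, skipped
print(store_statement(db, "ACME Corp", "2009-09-30", rows))   # True: new fiscal year, appended
```

With Access as the backing store, the same effect can also be enforced with a unique index on the (company, fiscal_period) columns.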
3) Data manipulation
Once the collection and storage process is complete, the crawler/spider will perform various calculations on the Microsoft Access database and create output tables.
Such tables will be exported to Microsoft Excel.
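As an illustration of the calculation-and-export step, the sketch below computes one ratio from stored statement lines and writes an Excel-readable CSV. The ratio chosen (total liabilities / total assets) and the figures are illustrative assumptions; the actual calculations would be specified by the client, and the export could equally use Access's built-in Excel export.

```python
# Sketch: compute an example ratio from stored lines and emit a CSV
# table that Excel can open. Data below is made up for illustration.
import csv
import io

lines = [
    ("ACME Corp", "2008-09-30", "Total assets", 1250.4),
    ("ACME Corp", "2008-09-30", "Total liabilities", 830.1),
]

def leverage_ratio(lines, company, period):
    """Total liabilities / total assets for one company and fiscal period."""
    vals = {label: value for c, p, label, value in lines if (c, p) == (company, period)}
    return vals["Total liabilities"] / vals["Total assets"]

out = io.StringIO()                       # in practice: open("output.csv", "w", newline="")
writer = csv.writer(out)
writer.writerow(["company", "fiscal_period", "liabilities_to_assets"])
writer.writerow(["ACME Corp", "2008-09-30",
                 round(leverage_ratio(lines, "ACME Corp", "2008-09-30"), 3)])
print(out.getvalue())
```

The same pattern extends to any number of ratios and companies, with one output row per company and fiscal period.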
Any programming option or language is welcome, provided it is 100% suitable for the project and the programmer is a true master in using it.
My budget is $145 for this. Please include the word "12653xas" in your bid.
Thank you !