In Progress

Web Site Scraping

We want to build a service that scrapes web sites in order to maintain an external database and to extract data from dynamic web pages. Each targeted web site has to be entered through a login page.

The service will be started by an external scheduler. The scheduler sends an XML message that contains all the information the service needs. The service shall execute the following steps:

a) receive XML

b) pass the login page

c) maintain the external database

d) extract data

e) send XML

Once the service has finished, it shall report its success status as XML.
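The steps above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the element names (`job`, `id`, `url`, `user`, `password`, `result`, `success`, `payload`) are hypothetical placeholders, since the real names are defined by the given XML schema, and `ScrapeJob`, `SiteAdapter`, and `ScrapeService` are invented names for this sketch.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical job description; the real fields come from the given XML schema.
final class ScrapeJob {
    final String jobId, siteUrl, user, password;
    ScrapeJob(String jobId, String siteUrl, String user, String password) {
        this.jobId = jobId; this.siteUrl = siteUrl;
        this.user = user; this.password = password;
    }
}

// Hypothetical per-site adapter: one implementation per targeted web site.
interface SiteAdapter {
    void login(ScrapeJob job) throws Exception;            // b) pass the login page
    void maintainDatabase(ScrapeJob job) throws Exception; // c) maintain the external database
    String extractData(ScrapeJob job) throws Exception;    // d) extract data
}

final class ScrapeService {
    // a) receive XML: parse the scheduler's request into a job object.
    static ScrapeJob parseJob(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Refuse DOCTYPE declarations to avoid XML external entity attacks.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        return new ScrapeJob(text(root, "id"), text(root, "url"),
                text(root, "user"), text(root, "password"));
    }

    // Returns the trimmed text of the first descendant element with this tag.
    static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    // Runs steps a) to d) in order and returns the XML report for step e).
    static String run(String requestXml, SiteAdapter adapter) {
        String jobId = "";
        try {
            ScrapeJob job = parseJob(requestXml);
            jobId = job.jobId;
            adapter.login(job);
            adapter.maintainDatabase(job);
            String data = adapter.extractData(job);
            return report(jobId, true, data);
        } catch (Exception e) {
            // Any failing step ends the run and is reported as a failure.
            return report(jobId, false, String.valueOf(e.getMessage()));
        }
    }

    // e) send XML: build the success or failure report.
    static String report(String jobId, boolean success, String payload) {
        return "<result><id>" + escape(jobId) + "</id><success>" + success
                + "</success><payload>" + escape(payload) + "</payload></result>";
    }

    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Keeping the site-specific logic behind one small interface is a design choice for this sketch: with 250 web sites planned, each site would get its own adapter while the XML handling stays shared.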

Technical details: Communication takes place only via the XML interface. The XML schema is given. We expect an implementation based on cURL or Java. It must be possible to run multiple instances on the same machine.
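For the Java option, the login step could look roughly like the sketch below, using the JDK's built-in `java.net.http.HttpClient` (Java 11 or later). It rests on assumptions: the form field names `username` and `password` are hypothetical and will differ per web site, and `LoginSession` is an invented name. Each object owns its own cookie store, which is one possible way to keep parallel runs from sharing a logged-in session.

```java
import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class LoginSession {
    // One client and one cookie store per session object, so sessions
    // running side by side never see each other's login cookies.
    final HttpClient client = HttpClient.newBuilder()
            .cookieHandler(new CookieManager())
            .followRedirects(HttpClient.Redirect.NORMAL)
            .build();

    // Encodes form fields as application/x-www-form-urlencoded.
    static String formEncode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Builds the POST request for a login form.
    // The field names are assumptions; real sites use their own.
    static HttpRequest loginRequest(String loginUrl, String user, String password) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("username", user);
        fields.put("password", password);
        return HttpRequest.newBuilder(URI.create(loginUrl))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
                .build();
    }
}
```

A request built this way would be sent with `client.send(request, HttpResponse.BodyHandlers.ofString())`; later requests through the same `client` then reuse the session cookies set by the login response.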

As a contractor you can use a test system for the XML interface. For the third-party web sites you will receive the login data for a user account and screenshot documentation of the manual maintenance steps for every targeted web site. Please note that we cannot provide a test system for third-party web sites: every change is made on the live system and has to be restored to the original data.

We want to scrape 250 web sites successively over the next months. This is an enquiry for the first package of 25 web sites. After that we need another 10 a month, eventually up to 25 a month.

At the moment we are asking for external development only and will do the ongoing maintenance ourselves. At a later stage we will outsource this work as well.

Skills: Data Entry, Data Processing, Java, PHP, XML

About the Employer:
( 9 reviews ) Berlin, Germany

Project ID: #64189