
273163 Project for loker

In Progress
Posted more than 15 years ago

Paid on delivery
C++ crawler that can index and reindex pages and download their content, producing an XML file for each page. The main requirements are:

* Can be scheduled
* The agent can accept multiple crawl start locations per web site
* Support for [login to view URL]
* Forbidden strings in URLs (for example, do not follow URLs containing ?, %, or a given keyword)
* Can leave the domain or stay within it
* Maximum pages per domain (user input)
* The agent can support exclusions of files beyond those of the server's standard [login to view URL]
* Specify how many levels deep to follow links from each starting location
* Multi-threaded for concurrent scans
* Reindex new or modified files only
* Complete cache management
* Download to a specific storage (web, news)
* Download title, description, keywords, and page content, plus the following fields: date indexed, page size, URL
* Make an XML file for each downloaded page with the info above

-------------------------------------------------------------------

* Web-based administration
* List of URLs to crawl
* Start/Stop/Hold/Continue
* Scheduled index/reindex time for a specific storage and list of sites
* File types: HTML-based (html, htm, php, asp, js, do ...)
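
Several of the requirements above (forbidden strings in URLs, staying on or leaving the domain, maximum pages per domain, and link depth) come down to one decision per discovered link: fetch it or skip it. The following is only a minimal sketch of how that check might look, not a prescribed design. The names `CrawlPolicy`, `CrawlFilter`, `hostOf`, and `admit` are hypothetical, and the host extraction is deliberately simplified: it assumes absolute URLs and ignores ports, user info, and letter case.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical per-site settings mirroring the listed requirements.
struct CrawlPolicy {
    std::vector<std::string> forbidden;    // substrings that disqualify a URL, e.g. "?" or "%"
    bool stayOnDomain = true;              // false = the crawler may leave the start domain
    std::size_t maxPagesPerDomain = 100;   // user-supplied cap per host
    int maxDepth = 3;                      // link levels to follow from the start location
};

// Simplified host extraction: the text between "://" and the next '/', '?' or '#'.
// Returns an empty string when the URL has no scheme separator.
std::string hostOf(const std::string& url) {
    const std::size_t scheme = url.find("://");
    if (scheme == std::string::npos) return "";
    const std::size_t start = scheme + 3;
    const std::size_t end = url.find_first_of("/?#", start);
    return url.substr(start, end == std::string::npos ? std::string::npos : end - start);
}

// Decides whether a discovered link should be fetched.
// Not thread-safe as written; a multi-threaded crawler would need to guard admit().
class CrawlFilter {
public:
    CrawlFilter(CrawlPolicy policy, const std::string& startUrl)
        : policy_(std::move(policy)), startHost_(hostOf(startUrl)) {
        assert(policy_.maxDepth >= 0 && "depth limit must not be negative");
    }

    // Returns true if the URL passes every rule, and counts it against its host's quota.
    bool admit(const std::string& url, int depth) {
        if (depth > policy_.maxDepth) return false;
        for (const std::string& s : policy_.forbidden) {
            if (!s.empty() && url.find(s) != std::string::npos) return false;
        }
        const std::string host = hostOf(url);
        if (host.empty()) return false;
        if (policy_.stayOnDomain && host != startHost_) return false;
        std::size_t& count = pagesPerHost_[host];  // starts at 0 for a new host
        if (count >= policy_.maxPagesPerDomain) return false;
        ++count;
        return true;
    }

private:
    CrawlPolicy policy_;
    std::string startHost_;
    std::unordered_map<std::string, std::size_t> pagesPerHost_;
};
```

A crawler loop would call `admit(url, depth)` for each extracted link before queueing it. Supporting several start locations per site, as the brief asks, could be done by keeping one filter per site and comparing against a set of allowed hosts instead of a single `startHost_`.
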
Project ID: 2019447

About the project

Remote project
Active 12 years ago

About the client

Nicosia, Cyprus
5.0
2
Member since Jan 17, 2009
