Web Crawler Development in Arabic

What I Need

The development of a web crawler that searches the Arabic blogosphere and social media networks and presents the results in a way similar to [url removed, login to view]. The main requirement is a web crawler and indexer closely modeled on [url removed, login to view]. Please refer to: [url removed, login to view]

Functional and Content Identification

The required solution is inspired by a recent trend in web development, ‘Social Story Telling’. The trend aims to produce meaningful stories by connecting online expressions into an emotional composition.

The application will consist of two core components:

1. Content collection and qualification.

2. Content delivery and distribution.

Automated blogosphere and social media search crawler / indexer / data collector

• The search crawler will methodically search the tags and textual content of the Arabic blogosphere and social media sites such as Flickr for a given term or phrase.

• Once a sentence containing one of the predefined search terms is found, the system looks backward to the beginning of the sentence and forward to the end of the sentence, and then saves the full sentence in a database. Alternatively, the application could extract a predefined number of words before and after the identified search term and record them in the database.

• Once saved, the sentence will be scanned to see if it includes one or more terms in a pre-identified list.

• Every qualified sentence or extract will represent an Arabic Voice.

• If an image is found in the post, the image is saved along with the sentence.

• The application will extract the date and time of the post where the search term / qualified voice/sentence was found.

• Because a high percentage of all blogs are hosted by one of several large blogging companies (Blogger, MySpace, MSN Spaces, LiveJournal, etc.), the application will examine the URL format of the blog posts and use it to extract the username of the post's author. Given the author's username, the application will automatically traverse the given blogging site to find that user's profile page. From the profile page, the application will extract the age, gender, country, state, and city of the blog's owner.
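The sentence-extraction and qualification steps described above could be sketched roughly as follows. This is only a minimal illustration, not a required design: the function names, the set of sentence terminators, and the default window size are all assumptions, and real Arabic blog text will likely need more careful sentence segmentation.

```python
import re
from typing import Iterable, Optional

# Assumed sentence terminators: ". ! ?", the Arabic question mark (U+061F),
# and newlines. Blog text is often loosely punctuated, so this is a guess.
SENTENCE_END = re.compile(r"[.!?\u061F\n]")


def extract_sentence(text: str, term: str) -> Optional[str]:
    """Return the full sentence around the first occurrence of term."""
    pos = text.find(term)
    if pos == -1:
        return None
    # Look backward: the sentence starts after the last terminator before the term.
    start = 0
    for match in SENTENCE_END.finditer(text, 0, pos):
        start = match.end()
    # Look forward: the sentence ends at the next terminator after the term.
    match = SENTENCE_END.search(text, pos + len(term))
    end = match.end() if match else len(text)
    return text[start:end].strip()


def extract_window(text: str, term: str, n_words: int = 5) -> Optional[str]:
    """Alternative: return up to n_words before and after the term."""
    pos = text.find(term)
    if pos == -1:
        return None
    before = text[:pos].split()[-n_words:] if n_words > 0 else []
    after = text[pos + len(term):].split()[:max(n_words, 0)]
    return " ".join(before + [term] + after)


def qualifies(sentence: str, qualifying_terms: Iterable[str]) -> bool:
    """True if the saved sentence contains at least one pre-identified term."""
    return any(t in sentence for t in qualifying_terms)


if __name__ == "__main__":
    post = "الجو جميل اليوم. أنا أشعر بالسعادة لأنني في البيت! ماذا عنك؟"
    sentence = extract_sentence(post, "أشعر")
    print(sentence)
    print(qualifies(sentence or "", ["السعادة", "الحزن"]))
```

Plain substring matching is used here for simplicity; it does not account for Arabic morphology (prefixes, suffixes, diacritics), which a production crawler would probably have to normalize first.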
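Extracting the author's username from a blog post URL might look something like the sketch below. The host lists and URL conventions shown (username as a subdomain, or as the first path segment) are hypothetical placeholders for illustration; each blogging site's actual URL format would have to be checked, and the profile-page traversal is not covered here.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical conventions, for illustration only. The real URL format of
# each hosting site must be confirmed before relying on these rules.
SUBDOMAIN_HOSTS = ("blogspot.com", "livejournal.com")  # <username>.<host>
PATH_HOSTS = ("myspace.com",)                          # <host>/<username>


def username_from_url(url: str) -> Optional[str]:
    """Guess the author's username from a post URL, or return None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Case 1: the username is assumed to be the subdomain.
    for base in SUBDOMAIN_HOSTS:
        suffix = "." + base
        if host.endswith(suffix):
            return host[: -len(suffix)] or None
    # Case 2: the username is assumed to be the first path segment.
    if host in PATH_HOSTS:
        segments = [s for s in parsed.path.split("/") if s]
        return segments[0] if segments else None
    return None


if __name__ == "__main__":
    print(username_from_url("http://example-user.blogspot.com/2009/05/post.html"))
```

Keeping the per-site rules in small tables like these makes it easier to add or correct a host once its real URL format is known.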

Other Requirements

The application will be in Arabic.


3-4 weeks for the development of the crawler and indexer (excluding interface and content delivery)

Skills: C# Programming, Perl, PHP, Windows Desktop, XML

About the Employer:
(121 reviews) Belfast, Ireland

Project ID: #610869

4 freelancers are bidding an average of $483 for this job


Please check PM.

$750 USD in 20 days
(123 reviews)

Please see PM. Thank you.

$250 USD in 5 days
(0 reviews)

Ready to accept your work. Hello sir, we have gone through your details and can complete your work easily and efficiently. Please let us know. Sincerely, R.K

$480 USD in 20 days
(0 reviews)

Hello sir, please check my PM. Thanks.

$450 USD in 21 days
(0 reviews)