Document categorization database


I need a database of English words for an automatic content categorization system. It must contain at least 200,000 words.

Each word needs probability values for 3-5 categories (for use with the Naive Bayes method).

The list of categories must look like:

[url removed, login to view]

It MUST be hierarchical.

An example:




Shopping -> 0.3

Library -> 0.3

Library:Education -> 0.2

Entertainment:Humor & Fun -> 0.2

(etc.) I hope you get the idea :)
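To make the intended use a little more concrete, here is a minimal Python sketch of how entries like the ones above might be stored and combined. Everything in it is an assumption for illustration: the words, the probabilities, and the names `WORD_CATEGORIES`, `FLOOR`, `top_level_totals`, `score_document` and `best_category` are hypothetical and not part of any existing database. Since the example values sum to 1, the sketch reads them as P(category | word) and multiplies them across words, which matches Naive Bayes only under the assumption of equal category priors.

```python
import math

# Hypothetical entries: word -> {colon-separated category path: probability}.
WORD_CATEGORIES = {
    "book": {
        "Shopping": 0.3,
        "Library": 0.3,
        "Library:Education": 0.2,
        "Entertainment:Humor & Fun": 0.2,
    },
    "school": {"Library:Education": 0.7, "Library": 0.3},
}

FLOOR = 1e-6  # assumed small value for categories a word does not list


def top_level_totals(word):
    """Roll a word's probabilities up to the top level of the hierarchy."""
    totals = {}
    for path, prob in WORD_CATEGORIES.get(word, {}).items():
        top = path.split(":")[0]
        totals[top] = totals.get(top, 0.0) + prob
    return totals


def score_document(words):
    """Sum log-probabilities per category over all known words."""
    known = [w for w in words if w in WORD_CATEGORIES]
    categories = {c for w in known for c in WORD_CATEGORIES[w]}
    return {
        c: sum(math.log(WORD_CATEGORIES[w].get(c, FLOOR)) for w in known)
        for c in categories
    }


def best_category(words):
    """Return the highest-scoring category path, or None if no word is known."""
    scores = score_document(words)
    return max(scores, key=scores.get) if scores else None
```

With these made-up entries, `best_category(["book", "school"])` picks `Library:Education`, and `top_level_totals("book")` folds `Library:Education` into `Library`.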

The database must also include stop-words (e.g. "to be" -> no category) and word combinations.
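As a rough sketch of how stop-words and word combinations could be handled before any category lookup, the Python below matches known multi-word entries first and then drops stop-words. The contents of `STOP_WORDS` and `PHRASES`, and the function name `extract_terms`, are hypothetical placeholders, and splitting on whitespace is a simplifying assumption.

```python
# Hypothetical data: none of these entries come from an actual database.
STOP_WORDS = {"to", "be", "the", "a", "of"}
PHRASES = {("credit", "card"), ("high", "school")}
MAX_PHRASE_LEN = max(len(p) for p in PHRASES)


def extract_terms(text):
    """Turn text into lookup terms: known phrases first, then non-stop words."""
    tokens = text.lower().split()
    terms = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest phrase first so multi-word entries win over single words.
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 1, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in PHRASES:
                terms.append(" ".join(candidate))
                i += n
                matched = True
                break
        if matched:
            continue
        # Stop-words carry no category, so they are skipped entirely.
        if tokens[i] not in STOP_WORDS:
            terms.append(tokens[i])
        i += 1
    return terms
```

For instance, `extract_terms("The credit card of a high school student")` yields `["credit card", "high school", "student"]`, while `extract_terms("to be")` yields an empty list.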

Please quote your price. I hope someone already has such a database.

I don't mind which data format is used; any format that can be converted is fine.

Skills: Data Processing, Internet Marketing, Research, Translation, XML


About the Employer:
(0 reviews) Moscow, Russian Federation

Project ID: #370947

2 freelancers are bidding an average of $250 for this job


I will give you 50,000 words.

$250 USD in 1 day
(0 Reviews)

Choose me for professional results.

$250 USD in 60 days
(0 Reviews)