In Progress

-Text correction with Hidden Markov Model-

This is a project for a Machine Learning course. The goal is to design a system that corrects mistyped letters in a document using a Hidden Markov Model. States represent the letters that should have been written, and outputs represent the letters that were actually written. Given the misspelled text (the observations), the most likely sequence of letters (the hidden states) is generated with the Viterbi algorithm. The document at the link below may help in understanding the project.

[url removed, login to view]

1. Training: The calculations below are done on the training documents, for which the correct spelling is known.

a) Calculate the probability that a word starts in state s[i].

b) Calculate the probability of a transition from state s[i] to state s[j].

c) Calculate the probability that each character is observed in state s[i].

The calculations above are used to build the following parameters.

Initial State Probability - I: N is the number of distinct states. I[s] is the probability that letter s is the first letter of a correct word.

State Transition Probability Matrix - A: A is an NxN matrix in which A[i][j] is the probability of moving from the i-th state to the j-th state. In other words, it gives the probability that each letter follows each other letter in correct words.

Output Probability Matrix - B: It gives the probability of each letter in the misspelled word given the corresponding letter in the correctly spelled word.

M is the number of distinct letters in the misspelled words and N is the number of distinct letters in the correctly spelled words. B is an MxN matrix in which B[o][s] is the probability of seeing output letter o in state s.
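The three parameters above can be estimated by counting. The following is only a minimal sketch under several assumptions that are not stated in the project text: the function name `train_hmm` and its arguments are hypothetical, `_` is treated as a word boundary (as in the sample data), the correct and misspelled columns are aligned one character per row, every character appears in `alphabet`, and add-one smoothing is used so that no probability is zero.

```python
def train_hmm(pairs, alphabet, smoothing=1.0):
    """Estimate I, A and B from (correct, typed) character pairs.

    Hypothetical helper: '_' is assumed to mark a word boundary and
    every character in `pairs` is assumed to be in `alphabet`.
    """
    # Start all counts at `smoothing` (add-one smoothing by default).
    start = {s: smoothing for s in alphabet}
    trans = {s: {t: smoothing for t in alphabet} for s in alphabet}
    emit = {s: {o: smoothing for o in alphabet} for s in alphabet}

    prev = None  # previous correct letter inside the current word
    for correct, typed in pairs:
        if correct == "_":
            prev = None  # the next letter starts a new word
            continue
        if prev is None:
            start[correct] += 1
        else:
            trans[prev][correct] += 1
        emit[correct][typed] += 1
        prev = correct

    def normalise(counts):
        total = sum(counts.values())
        return {k: v / total for k, v in counts.items()}

    I = normalise(start)
    A = {s: normalise(trans[s]) for s in alphabet}
    emit_prob = {s: normalise(emit[s]) for s in alphabet}
    # Store B as B[o][s] to match the indexing used in the description.
    B = {o: {s: emit_prob[s][o] for s in alphabet} for o in alphabet}
    return I, A, B
```

Resetting at each word boundary matches the definition of I as the probability of a letter being the first letter of a word. A model that treats `_` as an ordinary state would be an equally reasonable design.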

[url removed, login to view]: The most likely sequence of letters for the given test document is obtained using the Viterbi algorithm.
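As an illustration of this decoding step, here is a minimal sketch of the Viterbi algorithm for a single word. It assumes the parameters are stored as nested dictionaries indexed as I[s], A[i][j] and B[o][s]. The function name `viterbi` and the use of log-probabilities (to avoid underflow on long inputs) are choices made for this sketch, not requirements of the project.

```python
import math

def viterbi(observed, states, I, A, B):
    """Return the most likely hidden letter sequence for `observed`.

    Assumed layout: I[s], A[i][j] (from i to j), B[o][s] (output o in state s).
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    if not observed:
        return ""

    # Best log-score of any path that ends in each state after the first letter.
    score = {s: log(I[s]) + log(B[observed[0]][s]) for s in states}
    back = []  # back[t][j] = best predecessor of state j at step t + 1

    for o in observed[1:]:
        new_score, pointers = {}, {}
        for j in states:
            best = max(states, key=lambda i: score[i] + log(A[i][j]))
            new_score[j] = score[best] + log(A[best][j]) + log(B[o][j])
            pointers[j] = best
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return "".join(reversed(path))
```

With N states and a word of length T, this takes on the order of T x N x N steps, which is small for a 26-letter alphabet.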

Document Examples: English example documents are found in the docs data document. The first 20,000 characters of the document are for testing and the rest are used for training. In the file that contains the documents, the first column shows the correct spelling and the second column shows the misspelled version, one character per row. For testing, only the misspelled column is used as input to generate the correct words.

Example from the document:

p p

e e

o o

p l

l l

e e

_ _

w w

h h

o k

_ _

p p

u u

r r

s s

u u

e e

_ _

Do not use training examples for testing.

The application must calculate and report the following:

a) How many misspelled words and how many misspelled letters there are in the test data.

b) How many misspelled words and how many misspelled letters were corrected by the program.

c) How many correct words and how many correct letters were broken by the program.

d) The success rate as a percentage, computed from the values above.

e) For 50 words of your choice, the misspelled form and the corrected form of each word.
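The counts in items a) to d) can be sketched as follows. This is only one possible reading of the requirements. The function name `evaluate` is hypothetical, each word is assumed to be given as a (correct, typed, predicted) triple of equal length, and the "success rate" is assumed here to be the percentage of letters that are correct after the program has run, since the project text does not define it.

```python
def evaluate(triples):
    """Count errors, fixes and breakages over (correct, typed, predicted) words."""
    stats = {
        "misspelled_words": 0, "misspelled_letters": 0,
        "fixed_words": 0, "fixed_letters": 0,
        "broken_words": 0, "broken_letters": 0,
        "total_letters": 0,
    }
    for correct, typed, predicted in triples:
        if typed != correct:
            stats["misspelled_words"] += 1
            if predicted == correct:
                stats["fixed_words"] += 1
        elif predicted != correct:
            stats["broken_words"] += 1

        for c, t, p in zip(correct, typed, predicted):
            stats["total_letters"] += 1
            if t != c:
                stats["misspelled_letters"] += 1
                if p == c:
                    stats["fixed_letters"] += 1
            elif p != c:
                stats["broken_letters"] += 1

    total = stats["total_letters"]
    right_before = total - stats["misspelled_letters"]
    right_after = right_before + stats["fixed_letters"] - stats["broken_letters"]
    # Assumed definition of success: share of correct letters, as a percentage.
    stats["letter_accuracy_before"] = 100.0 * right_before / total if total else 0.0
    stats["letter_accuracy_after"] = 100.0 * right_after / total if total else 0.0
    return stats
```

Reporting the accuracy before and after correction side by side shows whether the program helps overall, because fixes in b) can be cancelled out by breakages in c).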

Urgent: due within one day

Skills: C# Programming, Data Mining, Machine Learning, Matlab and Mathematica


About the Employer:
(13 reviews) Istanbul, Turkey

Project ID: #4104158