In Progress

Matching Problem (II)

I posted this project once before but it was not successfully completed.

This project involves devising a method to match inconsistently coded data. We have a dataset of worksite inspections in which entries describing the same worksite are often recorded differently. For example, one entry might have its "address" cell recorded as 123 Elm Rd while another has it recorded as 123 Elm Road. In other cases, the same company's "company name" cell is recorded differently; for example, Acme Inc. might be misspelled as Amce Inc. in one entry. We would like a program that matches these inconsistently coded entries, where a successful match is one in which there is a high probability that the two entries describe one and the same worksite or company. This must be an automated process because our dataset contains a few hundred thousand observations.
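
To make the kind of matching described above concrete, here is a minimal Python sketch using only the standard library. It is one possible approach, not a required design. The field names ("address", "company"), the abbreviation table, and the 0.85 threshold are all assumptions chosen for illustration and would need tuning against the real data.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table (an assumption); a real one would be longer.
ABBREVIATIONS = {"rd": "road", "st": "street", "ave": "avenue",
                 "inc": "incorporated"}


def normalize(text):
    """Lowercase, replace punctuation with spaces, expand known abbreviations."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", text.lower()).split()
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in tokens)


def similarity(a, b):
    """Return a similarity ratio between 0 and 1 for two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_match(rec_a, rec_b, threshold=0.85):
    """Call two records a match only if BOTH fields are similar enough.

    The threshold is an illustrative assumption, not a validated value.
    """
    addr = similarity(rec_a["address"], rec_b["address"])
    name = similarity(rec_a["company"], rec_b["company"])
    return min(addr, name) >= threshold


if __name__ == "__main__":
    a = {"address": "123 Elm Rd", "company": "Acme Inc."}
    b = {"address": "123 Elm Road", "company": "Amce Inc."}
    print(is_match(a, b))
```

Comparing every pair of records this way grows quadratically, so with a few hundred thousand observations it would likely be necessary to first group records by a cheap key (for example, the street number or a postal code, if one is available) and compare only within each group.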

I have attached a sample of the data.

Skills: Data Processing


About the Employer:
(0 reviews) Berkeley, United States

Project ID: #33442