Our inventory catalog is built from many different supplier inventory feeds. Because many of the suppliers carry the same product, we end up listing the same product multiple times on our website and on the marketplaces we sell through. We need a script that can go through our inventory and match up duplicate products based on various data. The script should populate a MySQL table in which each row contains our unique product IDs for one set of matched products.
ID A -- 1234
ID B -- 5678
ID C -- 9876
Assuming A, B, and C are matches, their IDs should be stored in one row of the table. The table should also include an integer code indicating which type of match was found (see below).
Types of matches:
1) UPC match
2) Manufacturer AND SKU match
The script should match products that have either the same UPC or the same manufacturer and SKU. For #2, we have a table that contains manufacturer variations. For example, one supplier may list a manufacturer as "ACER Electronics" while another may just say "ACER", so the variations must be checked as well.
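To illustrate the two match types, here is a rough Python sketch, not a required implementation. The names `find_matches`, `canonical_manufacturer`, and `MANUFACTURER_ALIASES` are made up for this example; the alias dictionary stands in for our manufacturer variations table, and the product field names (`id`, `upc`, `manufacturer`, `sku`) are assumptions. It also assumes UPCs have already been cleaned up and that SKUs can be compared case-insensitively, which may not hold for every supplier.

```python
from collections import defaultdict

UPC_MATCH = 1      # match type 1: same UPC
MFR_SKU_MATCH = 2  # match type 2: same manufacturer and SKU

# Hypothetical stand-in for the manufacturer variations table:
# maps a lower-cased variation to one canonical name.
MANUFACTURER_ALIASES = {"acer electronics": "acer"}


def canonical_manufacturer(name):
    """Collapse case and whitespace, then resolve known variations."""
    key = " ".join(name.lower().split())
    return MANUFACTURER_ALIASES.get(key, key)


def find_matches(products):
    """Return a list of (match_type, sorted product IDs) tuples.

    `products` is an iterable of dicts with the assumed keys
    'id', 'upc', 'manufacturer', and 'sku'.
    """
    by_upc = defaultdict(list)
    by_mfr_sku = defaultdict(list)

    for p in products:
        if p.get("upc"):
            by_upc[p["upc"]].append(p["id"])
        if p.get("manufacturer") and p.get("sku"):
            # Assumption: SKUs are compared case-insensitively.
            key = (canonical_manufacturer(p["manufacturer"]),
                   p["sku"].strip().upper())
            by_mfr_sku[key].append(p["id"])

    matches = []
    for ids in by_upc.values():
        if len(ids) > 1:  # only groups with at least two products are matches
            matches.append((UPC_MATCH, sorted(ids)))
    for ids in by_mfr_sku.values():
        if len(ids) > 1:
            matches.append((MFR_SKU_MATCH, sorted(ids)))
    return matches
```

This sketch reports UPC groups and manufacturer/SKU groups separately, so the same products can show up under both types. Whether overlapping groups should be merged into a single row is left open here.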
Things to keep in mind:
• UPCs are sometimes provided to us with leading or trailing "0"s or blank spaces.
• Probably 99% of the matches found in Type #1 will also match in Type #2, meaning that if the UPCs match, chances are the manufacturer and SKU will match as well.
• Not all products have a UPC; in fact, only about 50% do. We would still like to use it for matching because when there is a match, it is almost 100% certain not to be a false positive.
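As a rough idea of the UPC cleanup mentioned above, here is a minimal Python sketch. The function name `normalize_upc` and its `strip_trailing_zeros` flag are made up for this example. Stripping trailing zeros is off by default, on the assumption that a genuine UPC can end in "0" and that removing those digits could produce false matches; whether that is safe depends on the actual feed data.

```python
def normalize_upc(raw, strip_trailing_zeros=False):
    """Return a cleaned UPC string for comparison, or None if nothing is left.

    Removes all whitespace and leading zeros. Trailing zeros are removed
    only when strip_trailing_zeros is True (an assumption-laden option,
    since a real UPC may legitimately end in 0).
    """
    if raw is None:
        return None
    digits = "".join(str(raw).split())  # drop spaces anywhere in the value
    digits = digits.lstrip("0")         # drop leading zero padding
    if strip_trailing_zeros:
        digits = digits.rstrip("0")
    return digits or None               # empty result means "no UPC"
```

With this, " 0012345678905 " and "12345678905" both normalize to "12345678905", so they would compare as equal, while blank or all-zero values are treated as having no UPC.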