We need a program written in C++ (or Java) that runs continuously on a Linux server, reads one or more MySQL tables, and writes their data to Lucene full-text index files.
The Lucene index structure, the MySQL tables and databases, and all other structures should be configurable (XML or INI).
The most important requirement is performance: the program should read only new rows from each table, it must index more than 3 GB of data (roughly 3 million rows or more), and it should run with minimal RAM.
Further information is available on request.
Because we received many questions about the configuration option, here is some additional information:
Our goal is to replace the MySQL index functions (unique, index, fulltext) with Lucene.
Basically, what needs to be configurable is:
TableA.FieldA,.FieldB,.FieldC => Lucene IndexA
TableA.FieldD => Lucene IndexB
TableB.FieldA,.FieldB => Lucene IndexC
So we provide "rules" that state, for each MySQL table, which fields are indexed into which Lucene index. If there is no rule for a table, the table is not indexed.
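To make the rule format above concrete, here is a minimal Java sketch of how such rules could be read. It is only an illustration: the class name IndexRules, its methods, and the exact one-rule-per-line syntax ("Table.Field,.Field => IndexName", with "#" starting a comment) are assumptions and not a fixed part of the requirement. The final config format (XML or INI) is still open.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Lucene or MySQL): parses rule lines such as
//   "TableA.FieldA,.FieldB,.FieldC => IndexA"
// into a map from index name to fully qualified "Table.Field" names.
class IndexRules {
    private final Map<String, List<String>> fieldsByIndex = new LinkedHashMap<>();

    static IndexRules parse(List<String> lines) {
        IndexRules rules = new IndexRules();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments (assumed syntax)
            }
            String[] sides = line.split("=>", 2);
            if (sides.length != 2) {
                throw new IllegalArgumentException("Missing '=>' in rule: " + line);
            }
            String left = sides[0].trim();
            String index = sides[1].trim();
            int dot = left.indexOf('.');
            if (dot <= 0 || index.isEmpty()) {
                throw new IllegalArgumentException("Malformed rule: " + line);
            }
            String table = left.substring(0, dot);
            List<String> fields =
                    rules.fieldsByIndex.computeIfAbsent(index, k -> new ArrayList<>());
            // The remainder looks like ".FieldA,.FieldB"; strip the leading dots.
            for (String part : left.substring(dot).split(",")) {
                String field = part.trim();
                if (field.startsWith(".")) {
                    field = field.substring(1);
                }
                if (field.isEmpty()) {
                    throw new IllegalArgumentException("Empty field in rule: " + line);
                }
                fields.add(table + "." + field);
            }
        }
        return rules;
    }

    // Fields that feed the given index; empty if the index is unknown.
    List<String> fieldsFor(String index) {
        return fieldsByIndex.getOrDefault(index, List.of());
    }

    // A table without any rule is not indexed at all.
    boolean hasRuleFor(String table) {
        String prefix = table + ".";
        return fieldsByIndex.values().stream()
                .flatMap(List::stream)
                .anyMatch(f -> f.startsWith(prefix));
    }
}
```

With the three example rules above, fieldsFor("IndexA") would return TableA.FieldA, TableA.FieldB and TableA.FieldC, and hasRuleFor("TableC") would return false, so TableC would be skipped.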
An additional (important) config option for a field in a MySQL table: if the config marks ".FieldD" as FulltextIndex, this field should be indexed in a separate full-text index with the same name, so that every word in the string gets indexed by Lucene.
Basically, the program should index only new rows, but it should be possible via configuration to reset the internal "lastRow" to 0, so that the whole table gets indexed again.
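The "lastRow" idea could be sketched in Java roughly as follows. This is only one possible approach under stated assumptions: it assumes every indexed table has an ascending integer key column (for example an auto-increment id), and the class name RowCheckpoint, its methods, and the properties file used for storage are hypothetical. Reading rows in limited batches above the stored key is one way to keep RAM usage low.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper: remembers the highest indexed key per table in a
// small properties file, so a restart continues where the last run stopped.
class RowCheckpoint {
    private final Path file;
    private final Properties state = new Properties();

    RowCheckpoint(Path file) {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                state.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Highest key already indexed for this table; 0 means "index everything".
    long lastRow(String table) {
        return Long.parseLong(state.getProperty(table, "0"));
    }

    // Call after a batch has been written to the index; never moves backwards.
    void advance(String table, long highestIndexedId) {
        if (highestIndexedId > lastRow(table)) {
            state.setProperty(table, Long.toString(highestIndexedId));
            save();
        }
    }

    // The configurable reset: the next run re-indexes the whole table.
    void reset(String table) {
        state.setProperty(table, "0");
        save();
    }

    // Parameterized query for the next batch: bind lastRow(table) and a
    // batch size to the two placeholders (assumes an ascending integer key).
    static String nextBatchQuery(String table, String idColumn) {
        return "SELECT * FROM " + table
                + " WHERE " + idColumn + " > ?"
                + " ORDER BY " + idColumn
                + " LIMIT ?";
    }

    private void save() {
        try (OutputStream out = Files.newOutputStream(file)) {
            state.store(out, "last indexed row per table");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A run would then repeat three steps per table: fetch a batch with nextBatchQuery, add the rows to the index, and call advance with the highest key of that batch. Calling reset for a table has the same effect as setting "lastRow" to 0 in the config.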