I need this script modified (or a new script written) so that the program dedupes records across two different source files, based on several fields that I select. The program must read files that use either a comma (CSV) or a pipe (|) as the field delimiter. Ignore spaces before or after the delimiter (these are all the same: ", field" ",field " ",field"). It is OK if some of the records do not have the same number of fields.
Example:
[login to view URL]:
field1,field2,"field, 3", field4
field1_a,field2_a,"field, 3_a", field4_a
field1_b,field2_b,"field, 3_b", field4_b
field1_c,field2_c,"field, 3_c", field4_c
[login to view URL]:
field1|field2|"field, 3"|field4
field1_a|field2_a|"field3_a"| field4_a
field1_b|"field2_b"|"field3_b"
field1_c|field2_XXX|"field3_c"| field4_c
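A minimal sketch of how lines like the ones in the two sample files above could be split into fields, using Python's standard csv module. The helper names guess_delimiter and parse_line are hypothetical, and the sketch assumes that each line uses only one of the two delimiters and that the delimiter can be guessed by counting commas and pipes outside of double quotes.

```python
import csv


def guess_delimiter(line: str) -> str:
    """Guess ',' or '|' by counting each character outside double quotes.

    Assumption: a line uses only one of the two delimiters.
    """
    in_quotes = False
    pipes = commas = 0
    for ch in line:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes:
            if ch == "|":
                pipes += 1
            elif ch == ",":
                commas += 1
    return "|" if pipes > commas else ","


def parse_line(line: str) -> list[str]:
    """Split one record into fields, dropping quotes and surrounding spaces."""
    line = line.rstrip("\r\n")
    reader = csv.reader(
        [line],
        delimiter=guess_delimiter(line),
        skipinitialspace=True,  # allows a space between delimiter and opening quote
    )
    fields = next(reader, [])
    # Strip any remaining spaces before or after each field.
    return [field.strip() for field in fields]
```

Records with fewer fields simply produce shorter lists, so nothing here requires every record to have the same length.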
When I run the program, I will select these two input files (one as the existing source file and one as the new source file). I also need to select which fields are used to look for matching duplicates. In this example, I will choose fields 1 and 2. If fields 1 and 2 of a record in the new file are the same as fields 1 and 2 of a record in the existing file (ignoring the quotation marks), then that record is not written to the output file.
In this example, the output would be:
field1_c|field2_XXX|"field3_c"| field4_c
because this is the only record in the new file whose field1 and field2 do not both match a record in the existing file. Only the non-matching records from the NEW source file are output.
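The matching rule can be sketched roughly as follows, assuming each record has already been split into a list of fields. The names dedupe_key and new_only are hypothetical, field positions are zero-based (so fields 1 and 2 are indexes 0 and 1), matching is assumed to be exact and case-sensitive, and a field missing from a short record is assumed to compare as an empty string.

```python
def dedupe_key(fields, key_indexes):
    """Build the comparison key from the selected field positions.

    Assumption: a field missing from a short record counts as "".
    """
    return tuple(
        fields[i].strip().strip('"') if i < len(fields) else ""
        for i in key_indexes
    )


def new_only(existing_rows, new_rows, key_indexes):
    """Return the raw lines of new_rows whose key is absent from existing_rows.

    existing_rows: list of field lists from the existing source file.
    new_rows: list of (raw_line, fields) pairs from the new source file.
    """
    seen = {dedupe_key(fields, key_indexes) for fields in existing_rows}
    return [
        raw for raw, fields in new_rows
        if dedupe_key(fields, key_indexes) not in seen
    ]


# The example data from this posting, already split into fields.
existing_rows = [
    ["field1", "field2", "field, 3", "field4"],
    ["field1_a", "field2_a", "field, 3_a", "field4_a"],
    ["field1_b", "field2_b", "field, 3_b", "field4_b"],
    ["field1_c", "field2_c", "field, 3_c", "field4_c"],
]
new_rows = [
    ('field1|field2|"field, 3"|field4', ["field1", "field2", "field, 3", "field4"]),
    ('field1_a|field2_a|"field3_a"| field4_a', ["field1_a", "field2_a", "field3_a", "field4_a"]),
    ('field1_b|"field2_b"|"field3_b"', ["field1_b", "field2_b", "field3_b"]),
    ('field1_c|field2_XXX|"field3_c"| field4_c', ["field1_c", "field2_XXX", "field3_c", "field4_c"]),
]

result = new_only(existing_rows, new_rows, key_indexes=(0, 1))
print("\n".join(result))  # prints: field1_c|field2_XXX|"field3_c"| field4_c
```

Keeping the raw line next to its parsed fields lets the output preserve the new file's original delimiter and quoting.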
----
If you use the attached script, you will need to adjust it so that it works with standard CSV files (i.e. files where the quotes are not necessary for escaping, so a field may appear without them).