This job is for simple transcription from typed pages to text files.
There are 930 pages, some containing only a few lines and some full. The source is a set of scanned pages, many typed between 50 and 90 years ago. Each output will be a text file with the same name as its input file but with a .txt extension. The source pages follow a general format, with minor variations from page to page. The output format is very specific, with fields delimited by semicolons. Please download the sample files, which contain input data and output files. The output layout is at the top of the file.
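For anyone who wants to check their work with a script, here is a minimal Python sketch of the naming rule and the semicolon delimiter. The function names, the example file name, and the example field values are hypothetical. The real field order is the layout given in the sample files, and the sketch assumes no field contains a semicolon itself.

```python
from pathlib import Path


def output_name(input_name: str) -> str:
    """Return the input file name with its extension replaced by .txt."""
    return str(Path(input_name).with_suffix(".txt"))


def format_record(fields) -> str:
    """Join one record's fields with semicolons, trimming stray spaces."""
    return ";".join(field.strip() for field in fields)


if __name__ == "__main__":
    # Hypothetical scan name and field values, for illustration only.
    print(output_name("page_001.tif"))
    print(format_record([" Smith ", "John", "1942"]))
```

The output file name keeps the input name and changes only the extension, and each record becomes one line of semicolon-separated fields.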