We require a desktop software solution that will assist in extracting article information from PDF scans of newspapers and magazines.

Use case:

1. The user imports a PDF scan of a newspaper or magazine into the tool. It can be an image scan (in which case the tool will have to integrate with an OCR platform, e.g. Omniscan, ABBYY FineReader or Readiris, and perform the OCR itself) or a PDF that has already been OCR-ed into searchable form.
2. The tool searches the document for a list of keywords and highlights the matched terms. The user can skip to the next matched term or navigate page by page. The user must be able to add and edit keywords in a simple fashion.
3. Ideally, the tool does layout analysis (or uses the layout analysis of the integrated OCR solution) and identifies each article on the page. Alternatively, the user identifies the article.
4. Where the tool is able, it should identify the headline, author, date, text and picture. Otherwise, it should be made as simple as possible for the user to identify these fields.
5. If possible, the tool should create an image/PDF file containing only the content of the clip.
6. On completion of clipping a magazine, the tool should FTP an XML or RSS file plus images into our MySQL article database.

I would envisage the layout of the tool having the PDF page on one side of the screen and the article fields on the other side. We are looking for maximum automation and for minimising the amount of time it takes to clip a publication.

Please describe how long you think it will take to develop this tool (timing is urgent), which OCR tool you would integrate with if you build this into your solution, and as much detail as possible about the proposed solution.
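To make the keyword search and the XML export more concrete, here is a minimal sketch in Python of what we have in mind. It assumes the OCR step has already produced plain text for each page. The names `Clip`, `find_keywords` and `clip_to_xml`, and the XML element names, are illustrative only and are not a required schema; whole-word, case-insensitive matching is likewise an assumption.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Clip:
    """Fields of one clipped article (hypothetical structure)."""
    headline: str
    author: str
    date: str
    text: str
    image_files: list = field(default_factory=list)


def find_keywords(pages, keywords):
    """Return (page_number, keyword, start, end) for each match.

    `pages` is a list of OCR-ed page texts. Matching is whole-word and
    case-insensitive (an assumption), and hits are sorted in reading
    order so the user can step to the next match.
    """
    hits = []
    for page_number, page_text in enumerate(pages, start=1):
        for keyword in keywords:
            pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
            for match in pattern.finditer(page_text):
                hits.append((page_number, keyword, match.start(), match.end()))
    hits.sort(key=lambda hit: (hit[0], hit[2]))
    return hits


def clip_to_xml(clip):
    """Serialise one clip to an XML string (element names are illustrative)."""
    root = ET.Element("article")
    for tag in ("headline", "author", "date", "text"):
        ET.SubElement(root, tag).text = getattr(clip, tag)
    images = ET.SubElement(root, "images")
    for name in clip.image_files:
        ET.SubElement(images, "image").text = name
    return ET.tostring(root, encoding="unicode")
```

The character offsets returned by `find_keywords` would be mapped back to word positions reported by the OCR engine in order to draw the highlights on the page; that mapping depends on the OCR tool chosen and is not shown here.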

## Platform

Windows XP/Vista

