We require a desktop software solution that will assist in extracting article information from PDF scans of newspapers and magazines.

Use case:
1. The user imports a PDF scan of a newspaper or magazine into the tool. The PDF may be an image scan (in which case the tool must integrate with an OCR platform, e.g. Omniscan, ABBYY FineReader or Readiris, and perform the OCR) or may already have been OCR-ed into a searchable PDF.
2. The tool searches the document for a list of keywords and highlights the matched terms. The user must be able to skip to the next matched term or navigate page by page, and must be able to add and edit keywords in a simple fashion.
3. Ideally, the tool performs layout analysis (or uses the layout analysis of the integrated OCR solution) and identifies each article on the page. Alternatively, the user identifies the article manually.
4. Where the tool is able, it should identify the headline, author, date, text and picture. Otherwise, it should be made as simple as possible for the user to identify these fields.
5. If possible, the tool should create an image or PDF file containing only the content of the clip.
6. Once a magazine has been fully clipped, the tool should FTP an XML or RSS file plus images into our MySQL article database.

I envisage the layout of the tool having the PDF page on one side of the screen and the article fields on the other side. We are looking for maximum automation and for the time it takes to clip a publication to be kept to a minimum.

Please describe how long you think it will take to develop this tool (timing is urgent), which OCR tool you would integrate with if you build this into your solution, and as much detail as possible about your proposed solution.
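To make the keyword search and XML export requirements more concrete for bidders, here is a minimal Python sketch. It is only an illustration under stated assumptions: the page text is assumed to have already been extracted by whichever OCR tool is chosen, the function names `find_keywords` and `clip_to_xml` are hypothetical, and the XML element names are placeholders, since the actual article database schema is not specified here. FTP upload and on-screen highlighting are not shown.

```python
import re
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Match:
    page: int     # 1-based page number
    keyword: str  # keyword as entered by the user
    start: int    # character offset of the match within the page text
    end: int      # offset one past the last matched character


def find_keywords(pages, keywords):
    """Return case-insensitive whole-word keyword matches in reading order.

    `pages` is a list of strings, one per page of OCR text (assumed input).
    """
    if not keywords:
        return []
    # Try longer keywords first so "press release" wins over "press".
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in ordered) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): k for k in keywords}
    matches = []
    for page_no, text in enumerate(pages, start=1):
        for m in pattern.finditer(text):
            found = m.group(1)
            keyword = lookup.get(found.lower(), found)
            matches.append(Match(page_no, keyword, m.start(), m.end()))
    return matches


def clip_to_xml(headline, author, date, text, image_file):
    """Serialize one clipped article to an XML string.

    Element names are hypothetical placeholders, not the real schema.
    """
    article = ET.Element("article")
    fields = [
        ("headline", headline),
        ("author", author),
        ("date", date),
        ("text", text),
        ("image", image_file),
    ]
    for tag, value in fields:
        ET.SubElement(article, tag).text = value
    return ET.tostring(article, encoding="unicode")


if __name__ == "__main__":
    sample_pages = ["Acme Corp announced results.", "Nothing relevant here."]
    for hit in find_keywords(sample_pages, ["acme corp"]):
        print(hit)
    print(clip_to_xml("Results", "J. Smith", "2009-01-01", "Body text", "clip1.png"))
```

The match offsets give the user interface enough information to jump to the next hit and to highlight it; a real implementation would need to map those offsets back to page coordinates supplied by the OCR engine.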
1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.
2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):
a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.
b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.
3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).