PROJECT: PDF Processing
OVERVIEW: I have many large catalogs stored as PDF files. Some of the PDFs run to several hundred pages and exceed 50MB.
I need these files processed so we can make better use of them in our shopping system. You will need to be able to:
1. extract, process and then write out individual pages of the PDF
2. search for strings within the pages and report what is found
3. insert hyperlinks around strings in the PDF file
INPUT:
1. A source PDF file, {source}.pdf. It may be several hundred pages long and over 50MB in size.
2. A SearchString definition. This search string is a 'regular expression' used to locate part numbers within the PDF (see [login to view URL]).
3. A URLtemplate string. This template will be used to insert links into the PDF. Each part number found will be inserted into the template string, and the resulting link added back into the PDF document. For example:
URLtemplate = [login to view URL]{partnumber}
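As a rough illustration of how SearchString and URLtemplate fit together, here is a minimal Python sketch that works on already-extracted page text. The pattern and the URL below are hypothetical stand-ins, since the real values are not shown in this posting, and the function name is made up for the example. It does not touch the PDF itself; it only shows the match-and-substitute step.

```python
import re
from urllib.parse import quote

# Hypothetical values: the posting does not show the real pattern or URL.
SEARCH_STRING = r"\b[A-Z]{2}-\d{4}\b"
URL_TEMPLATE = "https://example.com/parts/{partnumber}"


def find_part_links(page_text, search_string=SEARCH_STRING, url_template=URL_TEMPLATE):
    """Return (partnumber, url) pairs for every match in one page's text."""
    pattern = re.compile(search_string)
    links = []
    for match in pattern.finditer(page_text):
        part = match.group(0)
        # Percent-encode the part number so it is safe inside a URL,
        # then drop it into the {partnumber} slot of the template.
        url = url_template.replace("{partnumber}", quote(part, safe=""))
        links.append((part, url))
    return links
```

Calling `find_part_links` on the text of each page would give the list of links to add to that page; how the link is actually attached to the matched text depends on the PDF library chosen.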
OUTPUT:
1. A .csv spreadsheet which lists each part number found in the first column and a comma-separated list of the pages where it was found in the second column. This file is written to {sourcefilename}[login to view URL] in a subfolder called {sourcefilename}_pages.
2. A subfolder {sourcefilename}_pages where each page of the source is written out as {source}_{page}.pdf. Before the page is written out, it is searched for SearchString, and each match found is hyperlinked using URLtemplate.
3. The whole {source} document is searched for SearchString, and each match found is hyperlinked using URLtemplate. This processed source file is written as {source}.pdf in the {sourcefilename}_pages subfolder.
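The CSV and file-naming rules above could be sketched roughly as follows in Python. This is only an illustration under several assumptions: the page texts have already been extracted by some PDF library, pages are numbered from 1, {source} and {sourcefilename} both mean the file name without its extension, and all function names are invented for the example. The CSV is written to any file-like object because the exact output file name is not shown in this posting.

```python
import csv
import io
import re
from collections import defaultdict
from pathlib import PurePosixPath


def build_index(pages, search_string):
    """Map each part number to the sorted 1-based pages where it appears."""
    pattern = re.compile(search_string)
    index = defaultdict(set)
    for page_number, text in enumerate(pages, start=1):
        for match in pattern.finditer(text):
            index[match.group(0)].add(page_number)
    # Sort part numbers and page lists so the output is stable.
    return {part: sorted(found) for part, found in sorted(index.items())}


def write_index_csv(index, out):
    """Write one row per part number: the part, then its page list."""
    writer = csv.writer(out, lineterminator="\n")
    for part, page_numbers in index.items():
        # The page list is a single cell; csv quotes it when it contains commas.
        writer.writerow([part, ",".join(str(n) for n in page_numbers)])


def page_output_path(source_stem, page_number):
    """Build {sourcefilename}_pages/{source}_{page}.pdf for one page."""
    return PurePosixPath(f"{source_stem}_pages") / f"{source_stem}_{page_number}.pdf"
```

Keeping the page list in one quoted cell means a spreadsheet program will still show exactly two columns, even for part numbers that appear on many pages.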
DELIVERABLE:
1. I prefer the code to be written in ASP/VB, but will also consider other languages provided you also support installing the language on my server.
2. The code should be written in 2 parts: one as a [login to view URL] library that can be included in other programs, and the second as a [login to view URL] which will ask the user for the inputs and then produce the outputs.
3. I will provide you with a .pdf document. You will host the code on an ASP server (I can provide one if you like) and provide a URL where the code can be demonstrated.
4. The project should be completed in no more than 10 days from the start of the project.
RESPONSE:
If you are interested in bidding on this project, please provide the following:
1. Brief description of your experience with ASP and working with PDF files
2. If you have a company website, please provide the URL
3. The day you will be able to start the project
4. When you expect the project to be finished
5. What your fee will be for the project
6. Your contact information
Regards, Andy
Project ID: 113969
About the project
Remote project
Active 17 years ago