PDF to text conversion for voter rolls in India

The objective is to create a command-line tool that converts a structured PDF file containing publicly available voter rolls in India into text.

The output will be stored in a CSV file. Further details are in the attached instruction file.

Some points to bear in mind:

- The text is in the Devanagari script (the language is Hindi)

- The voter rolls are arranged in a grid (3 columns and n rows) - see attached PDF

- There are known issues with fidelity of information during a simple copy-paste from PDF to text

- The tool is expected to be run on a Linux system and take two command-line parameters: the path + file name of the source PDF and the path + file name of the output CSV file
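
The points above could be sketched roughly as follows. This is only a minimal illustration, not the required implementation: the function names, the `(x, y, text)` fragment format, and the assumption of equal-width columns and equal-height rows are all hypothetical, and the step that reads positioned text out of the PDF is deliberately left out because it depends on the library chosen. The real CSV layout is defined in the attached instruction file, which is not reproduced here.

```python
import argparse
import csv
import unicodedata


def parse_args(argv=None):
    """Parse the two positional parameters described in the brief."""
    parser = argparse.ArgumentParser(
        description="Convert a voter-roll PDF to CSV (illustrative sketch).")
    parser.add_argument("source_pdf", help="path + file name of the source PDF")
    parser.add_argument("output_csv", help="path + file name of the output CSV")
    return parser.parse_args(argv)


def fragments_to_grid(fragments, page_width, row_height, n_columns=3):
    """Bucket (x, y, text) fragments into a grid of n_columns cells per row.

    Assumptions (hypothetical): y grows downward from the top of the grid,
    columns are equally wide, and every grid row is row_height tall.
    """
    column_width = page_width / n_columns
    cells = {}
    # Visit fragments top-to-bottom, then left-to-right, so text inside a
    # cell is joined in reading order.
    for x, y, text in sorted(fragments, key=lambda f: (f[1], f[0])):
        col = min(int(x // column_width), n_columns - 1)
        row = int(y // row_height)
        # NFC normalisation gives Devanagari text one canonical encoding.
        cells.setdefault((row, col), []).append(
            unicodedata.normalize("NFC", text))
    n_rows = max((r for r, _ in cells), default=-1) + 1
    return [[" ".join(cells.get((r, c), [])) for c in range(n_columns)]
            for r in range(n_rows)]


def write_csv(grid, output_path):
    """Write one CSV row per grid row, encoded as UTF-8."""
    with open(output_path, "w", encoding="utf-8", newline="") as handle:
        csv.writer(handle).writerows(grid)
```

A wrapper script would call `parse_args()`, obtain the positioned fragments from `args.source_pdf` with whatever PDF library is selected, and pass the result of `fragments_to_grid` to `write_csv(grid, args.output_csv)`. Working from text positions rather than copied text is one way to approach the copy-paste fidelity issue mentioned above, though it does not by itself fix fonts that map glyphs to the wrong code points.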

Skills: Java, Linux, PDF (Portable Document Format), Perl, Python


About the Employer:
(8 reviews) Mumbai, India

Project ID: #4084852

Awarded to:


Hi, I have significant experience working with automated PDF extraction and will be able to deliver quality code in 2 milestones. The first one will demonstrate accurate data extraction; the second will be to

$300 USD in 7 days
(4 reviews)

3 freelancers are bidding an average of $283 for this job


Best work guaranteed. Please check PM.

$250 USD in 4 days
(0 reviews)

Hi, I have extensive experience in Python. I can do this within a week.

$300 USD in 7 days
(0 reviews)