EDIT: Sorry folks! This session won’t happen. Because of the one-day postponement and some personal work, I will end up missing this Barcamp. See you next time, and many thanks to the volunteers of BCB!
Proposed session details below:
The CEO (Chief Electoral Officer) of every state publishes the electoral rolls of the state.
These are typically published as PDF files (e.g. the Lok Sabha 2014 electoral rolls are at http://ceokarnataka.kar.nic.in/ElectionFinalroll2014/). The files contain the voter's name, the name of a relative and the relation, address, age and sex. Photographs are not available in the publicly downloadable files. The voter lists include all voters, and are naturally voluminous. The voter lists for Karnataka alone take up gigabytes of space.
This is a treasure trove of information, and anyone can download it.
Unfortunately, there is a problem. The PDF files need to be processed to unlock this information. More than 90% of the electoral rolls are in regional languages, making this even harder.
In this session, I will
– narrate a story of how I extracted the information to run a voter list search during the recent elections (http://www.shreekumar.in/?p=565)
– describe my methods of processing the PDF files
– discuss the challenges of processing Indian-language voter lists, and how I am solving them (you are welcome to contribute as well!)
Note that this is ongoing work, with a public repository at https://github.com/shreekumar3d/voter-list
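As a rough illustration of the kind of processing involved (this is not the code in the repository), here is a minimal Python sketch. It assumes the PDF has already been converted to plain text, and that each voter appears as a block with English labels such as `Name`, `Father's Name`, `House No.`, `Age` and `Sex`. That layout, the `Voter` class and the `parse_records` function are all hypothetical; real rolls vary by state and are mostly in regional languages, so the labels would differ.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Voter:
    """One entry of the roll, holding the fields listed above."""
    name: str
    relation: str        # e.g. "Father" or "Husband"
    relative_name: str
    address: str
    age: int
    sex: str


# ASSUMPTION: a made-up four-line text layout per voter, with English labels.
RECORD_RE = re.compile(
    r"^Name\s*:\s*(?P<name>.+?)\s*\n"
    r"(?P<relation>Father|Husband|Mother|Other)'s Name\s*:\s*(?P<relative_name>.+?)\s*\n"
    r"House No\.?\s*:\s*(?P<address>.+?)\s*\n"
    r"Age\s*:\s*(?P<age>\d+)\s+Sex\s*:\s*(?P<sex>Male|Female)",
    re.MULTILINE,
)


def parse_records(text: str) -> List[Voter]:
    """Return every voter block found in text that matches the assumed layout."""
    voters = []
    for m in RECORD_RE.finditer(text):
        voters.append(
            Voter(
                name=m["name"],
                relation=m["relation"],
                relative_name=m["relative_name"],
                address=m["address"],
                age=int(m["age"]),
                sex=m["sex"],
            )
        )
    return voters
```

Once the records are structured like this, a name search becomes a simple filter over the list. The hard part in practice is getting clean text out of the PDF files in the first place, especially for regional-language rolls.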
Session difficulty level: Intro/101