March | 2019 | un, deux ou trois

Over the last few weeks I was faced with the task of making a large stack of PDF files of scanned documents electronically searchable. The PDF files are to be stored in a new archive that searches not only the metadata of the documents but also their full text.

Since the work environment consists of MS Windows PCs, I was looking for a solution for Windows 7 to 10. There are many solutions for Linux and Mac, and of course there are also solutions for Windows, such as Adobe Acrobat, which comes with a feature set for creating searchable PDF files from scanned or photographed documents.

But I was also looking for a solution that can handle multiple files at once. Since I was not sure whether Adobe Acrobat can do that, or whether it would be possible to acquire a license, I started looking for a simple, self-programmed solution. I had read somewhere that you can drag and drop files onto a batch file (.bat) to run it with the names of the dropped files as parameters (probably here: https://en.wikibooks.org/wiki/Windows_Batch_Scripting or in the German version). I started reading about the Tesseract OCR software and ImageMagick and wrote a little batch file that acts as a wrapper for these two programs: https://github.com/timberger/Searchable-Image-PDF-Creat-O-Mat.
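To illustrate the idea, here is a minimal sketch of such a wrapper, written in Python rather than as a batch file. It is not the code from the repository above, which may work differently. It assumes that ImageMagick 7 (the `magick` command), Ghostscript (which ImageMagick needs to read PDFs) and Tesseract are installed and on the PATH. The function name `build_commands`, the `_searchable` suffix, the 300 dpi resolution and the `eng` language are choices made for this example only.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(pdf_path, language="eng", dpi=300):
    """Return the two command lines (as argument lists) for one scanned PDF.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    pdf = Path(pdf_path)
    tiff = pdf.with_suffix(".tiff")                     # intermediate image file
    out_base = pdf.with_name(pdf.stem + "_searchable")  # Tesseract appends ".pdf"

    # Step 1: ImageMagick rasterizes the PDF pages into a multi-page TIFF.
    rasterize = ["magick", "-density", str(dpi), str(pdf),
                 "-depth", "8", "-background", "white", "-alpha", "off",
                 str(tiff)]

    # Step 2: Tesseract runs OCR on the TIFF; the "pdf" config makes it
    # write a PDF that contains the page image plus a hidden text layer.
    ocr = ["tesseract", str(tiff), str(out_base), "-l", language, "pdf"]
    return rasterize, ocr


def main(paths):
    # Files dropped onto a script arrive as command-line arguments,
    # so every argument is treated as one scanned PDF to process.
    for path in paths:
        for command in build_commands(path):
            subprocess.run(command, check=True)  # stop on the first failure


if __name__ == "__main__":
    main(sys.argv[1:])
```

Called as `python make_searchable.py scan1.pdf scan2.pdf` (the script name is made up), this would write `scan1_searchable.pdf` and `scan2_searchable.pdf` next to the originals. In a batch file, the same loop would run over the dropped file names that Windows passes in as `%1`, `%2` and so on.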


Searchable Image PDF Creat-O-Mat