Extracting Data From PDF Reports to Create Text-based Files for Import into Other Packages

Last year a client asked for help converting data from a small online accounting system to a new accounting system. Unfortunately, the old software could only output reports as PDF. No Excel. No CSV. No tab-delimited files. So the problem was how to convert PDF reports into structured text files that could be imported into the new accounting package.

The problem was compounded by the fact that the data I needed was spread across two different reports, and PDF reports cannot be merged directly. The good news was that the two reports shared a common key.

Using Arbutus Analyzer this was simple. It can read the PDF reports, extract the data as text, merge the resulting files on the common key, and output the result as .txt files ready for import into the other package.
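For readers who want to see the logic of those steps outside of Analyzer, here is a minimal sketch in Python. It is not Arbutus Analyzer script and not what I ran for the client; it only illustrates the same idea: pick the detail lines out of report text, join two reports on the common key, and write a tab-delimited file. The report layouts, the customer IDs (assumed to look like C001) and the function names are all made up for illustration, and the step of getting the text out of the PDF is assumed to have been done already.

```python
import csv
import io
import re

# Hypothetical report text, standing in for text already extracted from the two PDFs.
REPORT_A = """\
Customer Listing                 Page 1
ID     Name
C001   Acme Supplies
C002   Birch Trading
"""

REPORT_B = """\
Customer Balances                Page 1
ID     Balance
C001   1,250.00
C002     310.50
"""

# Detail lines start with the key (assumed format: "C" plus three digits).
# Report titles and column headers do not match, so they are skipped.
DETAIL = re.compile(r"^(C\d{3})\s+(.+?)\s*$")


def parse_report(text):
    """Return {key: value} for every detail line in the report text."""
    rows = {}
    for line in text.splitlines():
        match = DETAIL.match(line)
        if match:
            rows[match.group(1)] = match.group(2)
    return rows


def merge_reports(text_a, text_b):
    """Join the two reports on the common key, keeping keys found in both."""
    a = parse_report(text_a)
    b = parse_report(text_b)
    # Strip thousands separators so the balance imports as a plain number.
    return [(key, a[key], b[key].replace(",", "")) for key in sorted(a.keys() & b.keys())]


def to_tab_delimited(rows):
    """Render the merged rows as tab-delimited text with a header line."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["ID", "Name", "Balance"])
    writer.writerows(rows)
    return out.getvalue()


merged = merge_reports(REPORT_A, REPORT_B)
print(to_tab_delimited(merged), end="")
```

In a real conversion the pattern for a detail line would have to be worked out from the actual report layout, and keys that appear in only one report would need to be reviewed rather than silently dropped as they are here.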

Here is a screenshot of the Print Image Reader data definition wizard.
