I’ve been a word processor now for 35 years, man and boy.  You know, like the guy in After Hours.  And I’ve been at it since word processing was done on dedicated machines, little more than typewriters with memory that printed their documents on paper with daisy wheel printers.

It all seems so primitive now, but I was unwittingly part of the digital revolution.  I watched business and legal documents go from pieces of paper to digitized e-documents that are, in fact, small computer programs consisting of nothing more than ones and zeroes (bits), stored in memory as eight-bit bytes, which in the aggregate make up this sentence.
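To make the ones and zeroes concrete, here is a minimal Python sketch of how a word breaks down into eight-bit bytes and back again.  It assumes UTF-8 encoding, and the names `to_bits` and `from_bits` are made up for illustration; real document formats wrap far more structure around these bytes.

```python
def to_bits(text: str) -> list[str]:
    """Encode text as UTF-8 and return each byte as an eight-character bit string."""
    return [format(byte, "08b") for byte in text.encode("utf-8")]


def from_bits(bits: list[str]) -> str:
    """Rebuild the original text from a list of eight-character bit strings."""
    return bytes(int(chunk, 2) for chunk in bits).decode("utf-8")


if __name__ == "__main__":
    bits = to_bits("Word")
    print(bits)             # one bit string per byte
    print(from_bits(bits))  # round-trips back to the original word
```

Four letters go in, four bytes (32 bits) come out, and nothing resembling a letter is stored anywhere in between.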

This has changed the job itself, from being a typist to being a program converter of sorts.  My most common assignment, for example, is to convert a PDF file of a document into an editable Word document, and I do several of them a day.  Most people I work with believe that in that OCR (optical character recognition) process, the OCR software (also known as an engine) extracts the text from the PDF file and drops it into a Word document, where it can be edited and formatted.

The only problem with that idea is that there is no text.  That is, there is nowhere in a PDF file where actual letters and words are embossed, like on a piece of paper.  Instead, the PDF file contains coding, which at its most basic level is ones and zeroes.  What the OCR engine does is read the PDF coding and more or less convert it to Word coding.  It’s a digital world, remember?  Then I have to run several other subroutines to distill the newly created coding into a properly coded Word document.  You know, the kind that can do tricks.

Thinking in terms of “text,” to me, is throwing things back into their analog days.  I find it incredibly annoying and have to check myself when talking about it to co-workers.

