I've been meaning to blog about this for a while: I'm going paperless in my office. I've been scanning my large library of photocopied journal articles and reducing them to PDFs, which I store on my RAID network-attached storage server. It's a slow process, but I'd like to be done by the end of the semester, which is when we're moving office spaces (again).

My current workflow is:
- Scan using a Fujitsu ScanSnap
- OCR using Adobe Acrobat
- Index using Spotlight
- Rinse, repeat.

Adobe Acrobat 7.0 for the Mac is buggy and has problems with some of the PDFs that the ScanSnap generates, so I'm looking forward to seeing whether Acrobat 8 fixes them. I haven't been able to find other good OCR solutions for batch-processing PDFs, so if you have any ideas, I'd love to hear them.
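
For the "rinse, repeat" part, it can help to know which scans in a folder still need OCR. Here is a minimal Python sketch of one way to track that. It assumes a purely hypothetical naming convention in which the OCR'd copy of `paper.pdf` is saved next to it as `paper_ocr.pdf`. That convention, the `OCR_SUFFIX` constant, and the `pending_ocr` function are illustrative names of my own, not anything Acrobat or the ScanSnap software does by default.

```python
from pathlib import Path

# Hypothetical convention: the OCR'd copy of "paper.pdf" is "paper_ocr.pdf".
OCR_SUFFIX = "_ocr"


def pending_ocr(folder):
    """Return the PDFs in `folder` that have no '<name>_ocr.pdf' sibling yet."""
    folder = Path(folder)
    # Only look at regular files with a .pdf extension (case-insensitive).
    pdfs = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    stems = {p.stem for p in pdfs}

    pending = []
    for pdf in pdfs:
        if pdf.stem.endswith(OCR_SUFFIX):
            continue  # this file is itself an OCR output
        if pdf.stem + OCR_SUFFIX not in stems:
            pending.append(pdf)  # no OCR'd copy found alongside it
    return pending
```

Calling `pending_ocr("/path/to/scans")` returns a sorted list of `Path` objects for the scans that still need a pass through whatever OCR tool you use. It only compares file names; it does not inspect the PDFs to check whether they actually contain a text layer.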

