Polverone - 19-12-2007 at 10:43
I have long been a satisfied user of FineReader OCR software, but the recently released version 9 seems to be a step in the wrong direction in many
respects.
First, the benefits of the new program:
-Recognition can use multiple processors or cores for a speed boost
-Recognition accuracy is claimed to be improved (though it was excellent before)
-Layout recognition is more capable
-Custom quality settings are again available in PDF export (this was missing in 8)
-Recognition can proceed simultaneously with opening, so you don't have to completely finish loading large files before recognition starts
The downsides:
-Much slower, both in recognition and in reviewing recognized pages
-Uses more RAM
-Thumbnail page view can no longer be sorted by anything other than page number
-Page-by-page view of recognition process with text highlighting no longer appears when recognizing all pages
-Buggier: I had to manually kill a recognition process that had locked some critical resource, re-recognizing one page in a 2000-page batch seemed to
freeze the program, and unlike 8 there is no batch recovery after an unexpected shutdown
I wonder if I am being unfair to the program, since I am using a cracked version. It's possible that the bugginess, slowness, and malformed features
are subtle anti-piracy measures that the crack authors did not catch and patch. It's also possible that I'm seeing the real face of FR 9 but that
later point releases will fix some of these warts. Right now it feels like a downgrade overall. I haven't read any software reviews that mention these
problems, but many software reviews are practically regurgitated product brochures and don't really stress a program.
Have others had experience with FR 9? I have several hundred thousand pages that need OCR. This will be a multi-month project no matter what software
I use. I'm going to try running several thousand pages through the latest Omnipage and see how it compares with FR 8 and FR 9.
Edit: Omnipage 16 crashed with an unhandled exception when I tried to load and recognize the same test data I'm using with FineReader. It looks like
FR 8 may be my only option due to bugs in other programs, never mind feature sets.
[Edited on 12-19-2007 by Polverone]
chemrox - 19-12-2007 at 15:03
It seems like a lot of software manufacturers put out new versions for the sake of sales without making real improvements. Witness how MS builds on top of
older systems without cleaning out the garbage and makes memory hogs out of everything as a result. Memory gets cheaper, hard drives get bigger, and the
software developers demand more and more of the space. It's unending. I've been trying to scan a book for solo and others using Microtek software. The OCR
is ABBYY. It has been difficult, a start-and-stop, learn-and-start-again kind of thing. Would you suggest my trying FineReader 7 or 8 instead?
Polverone - 19-12-2007 at 15:22
I've always scanned through bundled scanner software or a Photoshop TWAIN plugin, so I can't comment on the experience of actually scanning through
any of the OCR packages.
I don't mind newer software requiring more hardware resources if it does things better. The worst part about FR 9 is the bugginess, followed by the
missing thumbnail sort options. The slowness is minor in comparison and would be worth it, IMO, if it didn't have the regressions compared to FR 8. FR
9 does genuinely offer some new and improved features, but I can't really enjoy them because of other shortcomings.
Part of my problem may be that my use is atypical. I use scripts and programs to take whole nested folders of PDF files, extract all the images, and
turn them into one enormous multipage TIFF file. That way I can run OCR on all the page images and review them at the end rather than reviewing one
file at a time. After I've reviewed the data I save it as PDF and split the PDF to recreate the individual articles. In the review stage, viewing
thumbnails sorted by error/warning messages is tremendously useful because it lets me quickly spot wrongly oriented pages.
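The bookkeeping behind that merge-and-split workflow can be sketched roughly as follows. This is a minimal Python sketch, not the actual scripts described above: `find_pdfs` and `build_manifest` are hypothetical helper names, and the per-file page counts are assumed to come from whatever tool extracts the images. It only records which page range of the combined batch belongs to which source PDF, so the reviewed output can later be split back into the individual articles.

```python
import os
from typing import List, Tuple


def find_pdfs(root: str) -> List[str]:
    """Return every PDF path under root (nested folders included), sorted."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".pdf"):
                found.append(os.path.join(dirpath, name))
    # Sorting keeps the batch order stable between runs.
    return sorted(found)


def build_manifest(page_counts: List[Tuple[str, int]]) -> List[Tuple[str, int, int]]:
    """Map each source file to its 1-based, inclusive page range in the batch.

    page_counts holds (path, number_of_page_images) pairs; the counts are
    assumed to be supplied by the image-extraction step.
    """
    manifest = []
    next_page = 1
    for path, count in page_counts:
        if count <= 0:
            continue  # no page images came out of this file, so skip it
        manifest.append((path, next_page, next_page + count - 1))
        next_page += count
    return manifest
```

With a manifest like this saved alongside the big TIFF, splitting the final PDF is just a matter of cutting at the recorded page ranges, whichever PDF tool does the cutting.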
My initial test for this project used a bundle of 2800 page images. I didn't have any problems using FR 9 on documents of just a few hundred pages,
though it was still slow.
It turns out that Omnipage isn't totally buggy like I thought. For some reason it needed more space on my C drive even though it was loading data from
a different, much larger drive. In the past I've experienced more rejected page images with it than with FR, so we'll see.
FR 8 has been rock solid for me. I don't think you can go wrong using it for recognition of the book you're scanning. Like I said earlier, I've never
actually used it for image acquisition from a scanner, only for doing the OCR.