Digitising microfiches

A flatbed scanner doesn't work, but it proved possible using my Nikon P7700 camera in combination with an extra lens.


The microfiche:

Noticing this microfiche for my motorcycle on eBay, I decided to give it a go, thinking that digitising it might work on my scanner.


01: Two sheets, together containing 289 pictures.


02: Each picture measures 8 x 6 mm.
For copyright reasons, the page info has been blacked out.


Left: The poor result of a scan on my CanoScan 9000F Mark II.
Though this is a very good scanner, it just hasn't got enough optical resolution for a good quality result from such a tiny original image.
Even very expensive scanners will likely not produce much better quality.


Taking the pictures:

Why doesn't a flatbed scanner work? It is a matter of pixels per inch. The images on the microfiche I obtained are 8 x 6 mm in size.
A flatbed scanner advertised as 4800 dpi (dots per inch) will likely have an optical resolution of 1000 - 1200 ppi (pixels per inch), whilst the more expensive ones claiming 9600 dpi mostly manage between 1500 and 2000 ppi in reality.
1 inch = 25.4 mm. This means that scanning an 8 mm wide picture at 2000 ppi results in an image 2000 / 25.4 x 8 = 630 pixels wide, and probably less on most scanners. This is not exactly high resolution.
My camera produces images of 4000 x 3000 pixels. The extra lens I can attach allows a width of 16 mm to be photographed. It is possible to zoom in further, but that zoom is digital, not optical.
This means that the resolution in theory is 4000 pixels / 16 mm x 25.4 = 6350 ppi. Of course, just like with the scanners, output resolution and optical resolution are not the same.
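The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are my own, and the figures are the nominal ones quoted above, not measured optical resolution.

```python
MM_PER_INCH = 25.4

def pixels_across(width_mm, ppi):
    """Number of pixels covering width_mm at a given resolution in ppi."""
    return width_mm / MM_PER_INCH * ppi

def ppi_from_pixels(pixels, width_mm):
    """Resolution in ppi when `pixels` span width_mm of the original."""
    return pixels / width_mm * MM_PER_INCH

# Scanner: an 8 mm wide picture scanned at an optimistic 2000 ppi.
scanner_px = pixels_across(8, 2000)

# Camera: 4000 pixels across a 16 mm field of view.
camera_ppi = ppi_from_pixels(4000, 16)
# The same 8 mm picture then covers half the frame width.
camera_px = pixels_across(8, camera_ppi)

print(f"scanner: {scanner_px:.0f} px across 8 mm")
print(f"camera:  {camera_ppi:.0f} ppi, {camera_px:.0f} px across 8 mm")
```

On paper the camera puts roughly three times as many pixels across each picture as the scanner does, before any optical losses are taken into account.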
Tests show that the output quality of the camera with this lens is at least twice the output quality of my high quality scanner.
The difference is between barely readable text from the scanner and easily readable text from the camera.
Of course, a high-end camera equipped with the perfect lens and accessories will produce much better results, but that comes at a price.



03: The photo of one of the pictures on the fiche.
Click to open the large picture. Download size is 1MB.


04: The result after some work on the picture:
Click to open the large picture. Download size is 0.8MB.
Note that the original picture is in two colours, whilst my result is in greyscale.
Though this considerably increases file size, it also greatly increases readability, which in this case is to be preferred.
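To give a rough feel for the size difference, the sketch below compares the uncompressed size of a two-colour image (1 bit per pixel) with an 8-bit greyscale one. The 4000 x 3000 frame is the camera's full output; a cropped picture is smaller and real files are compressed, so these numbers are an illustration only, not the actual download sizes.

```python
def raw_size_bytes(width_px, height_px, bits_per_pixel):
    """Uncompressed image size in bytes, ignoring headers and row padding."""
    return width_px * height_px * bits_per_pixel // 8

two_colour = raw_size_bytes(4000, 3000, 1)  # black and white only
greyscale = raw_size_bytes(4000, 3000, 8)   # 256 shades of grey

print(f"two colours: {two_colour / 1e6:.1f} MB")
print(f"greyscale:   {greyscale / 1e6:.1f} MB ({greyscale // two_colour}x larger)")
```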


Editing the pictures:

Being able to take one picture of a microfiche is one thing. It is something else when you realise that 200 pictures easily fit on one microfiche. If you are willing to straighten and crop these one at a time, it can be done as a hobby. Professionally, the time this takes is not acceptable.
One possible solution is a cross table, like those found on milling machines. With it, each picture can be positioned exactly under the camera. After that, it is possible to batch process all images.
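As a sketch of how the cross table idea could be planned, the snippet below lists the table position for every picture on a regular grid. The grid size and the pitch (the distance from one picture to the next) are hypothetical values chosen for illustration; they would have to be measured on the actual fiche.

```python
def table_positions(columns, rows, pitch_x_mm, pitch_y_mm):
    """Yield (column, row, x_mm, y_mm) for every picture, in serpentine order.

    Alternating the direction on each row keeps the table travel short.
    """
    for row in range(rows):
        cols = range(columns)
        if row % 2 == 1:
            cols = reversed(cols)
        for col in cols:
            yield col, row, col * pitch_x_mm, row * pitch_y_mm

# Hypothetical layout: 4 columns x 3 rows, 10 mm x 8 mm pitch.
for col, row, x, y in table_positions(4, 3, 10.0, 8.0):
    print(f"picture c{col} r{row}: move table to x={x:.1f} mm, y={y:.1f} mm")
```

If every picture ends up in the same place in the frame, a single crop rectangle can then be applied to the whole batch.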
Once again: Professional equipment likely can do better, but at a price.


OCR (Optical Character Recognition):

Presentation would be best in PDF format, with selectable and searchable text. When the text may not convert accurately, it is also possible to make PDF files that present the original image but contain the OCR text in a separate layer in the background. The last option is to present the image only. For that, there is a wide range of file types available.
For presentation, readability and file size reasons, it usually seems logical to run OCR on the files. OCR is possible with this quality. However, it is not without faults, and because the text is technical, running a spelling check is virtually useless.
In the above picture, there is damage visible in the text of Ref. Nr. 31 and 32. It is not a problem to see what it should be on the original picture, but for OCR it is a different matter.



If you have similar microfiches and have trouble digitising them, drop me a mail. I might be able to help: rob (remove this and join) (at) panzerbasics (dot) com.