INL120 Theme 4 - 2024
2024/08/20
Semester Test 1
Date: 23 August 2024 (Friday)
Scope: Themes 1 to 3 and practical class notes 1
Types: MCQs, Short Questions & Long Questions
Time: 17:30-19:00 (90 minutes)
Total: 50 marks
Venue: Thuto 1-1: A – Mac
Thuto 1-2: Mad – Z
Learning outcomes
After completing this theme, you should be able to discuss
the following concepts:
• The techniques used to capture information objects, such
as Document Image Processing (DIP).
• The various types of scanners used to create digital
objects.
• The types of technologies used in DIP.
What is DIP? (Document Image Processing)
A system for storing and retrieving information in the form of
bitmapped images of paper documents, captured with a scanner,
rather than in the form of text or numeric files.
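To make the difference between a bitmapped image and a text file concrete, the following is a rough sketch of the storage arithmetic. The page size, resolution, and character count are illustrative assumptions, not figures from the course material.

```python
def bitmap_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size of a scanned page, in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Assumed page: A4 (about 8.27 x 11.69 inches) scanned at 300 dpi.
bitonal = bitmap_size_bytes(8.27, 11.69, 300, 1)   # 1 bit per pixel (black and white)
colour = bitmap_size_bytes(8.27, 11.69, 300, 24)   # 24 bits per pixel (full colour)

# The same page as plain text, assuming about 3,000 characters at 1 byte each.
text_size = 3000

print(bitonal, colour, text_size)
```

Under these assumptions the bitonal image is roughly 1 MB and the colour image roughly 26 MB, against a few kilobytes of text. The image also holds pixels rather than characters, so it cannot be searched as text without further processing.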
Strategic advantages
• Improves quality of service.
• Reduces response time.
• Improves management of documents.
• Reduces cost.
Disadvantages of DIP
• Scanning still requires manual handling.
• Face-up contact-free scanning is preferable.
• Digital cameras and overhead scanners are better.
• Lighting can affect fragile material.
• Image quality may be lower than that of the original.
Technology problems
• Costs of technology may be a problem (in smaller institutions).
• Need to customise equipment.
• Level of participation by conservators and user resistance.
• Unrealistic expectations.
• Resulting document may have limited search capabilities.
Capturing document images through scanning
Optical scanners provide the means to capture the image of
paper documents.
Examples of use:
• Scanning blueprints: capture the finest technical details.
• Scanning artwork and posters: capture quality highlight and
shadow detail.
• Map and GIS scans: create high-resolution, dimensionally
accurate images.
From scanning to digitization
• A variety of processing steps follow scanning. Such
procedures may occur at any point in the digitization chain,
from immediately after scanning to just prior to delivery to
end-users.
• These may be customized modifications that affect only
certain files, or mass, automated processing of all files
(batch processing).
• They may be one-time operations or done repeatedly on
an as-needed basis.
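The difference between a customized modification of certain files and batch processing of all files can be sketched as below. The function names, the `*.tif` pattern, and the renaming step are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def tidy_name(path):
    """Hypothetical per-file step: work out a normalised file name."""
    return path.name.lower().replace(" ", "_")


def process_files(folder, step, only=None):
    """Apply `step` to every *.tif file in `folder`.

    If `only` is given (a set of file names), just those files are
    processed; otherwise every matching file is handled (batch processing).
    """
    results = []
    for path in sorted(Path(folder).glob("*.tif")):
        if only is not None and path.name not in only:
            continue  # customized run: skip files that were not selected
        results.append(step(path))
    return results
```

Calling `process_files("scans", tidy_name)` would handle every TIFF in an assumed `scans` folder, while passing `only={"Page 2.tif"}` restricts the same step to a single file. Because the function can be called again at any time, it also covers the repeated, as-needed case.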
File/image processing operations
• Editing: Touch-up and enhancement features can be built into
the scanning software, or separate image-editing tools (e.g.
Adobe Photoshop, Corel Photo-Paint, ImageMagick) can be
used.
• Compression: Sometimes carried out by dedicated scanner
firmware or dedicated hardware in the computer.
• Compression can also be a software-only operation, although
dedicated hardware is faster and should be considered
when creating very large files or very large numbers of
files.
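As a minimal sketch of software-only compression, the example below compresses a synthetic greyscale page with Python's standard `zlib` module. The image data is made up, and zlib (DEFLATE) simply stands in for whatever scheme a particular scanner or file format actually uses.

```python
import zlib

# Hypothetical 8-bit greyscale "scan": 200 rows of 300 white pixels (255),
# with a dark band (0) in the middle standing in for a line of text.
width, height = 300, 200
rows = []
for y in range(height):
    value = 0 if 90 <= y < 110 else 255
    rows.append(bytes([value]) * width)
raw = b"".join(rows)

compressed = zlib.compress(raw, level=9)  # lossless compression in software
restored = zlib.decompress(compressed)    # gives back the original bytes

ratio = len(raw) / len(compressed)
print(len(raw), len(compressed), round(ratio, 1))
```

Scanned pages with large uniform areas, like this one, tend to compress well. Real scans contain noise and detail, so the ratio achieved in practice will usually be lower.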
File/image processing operations (cont.)
• Scaling: Some scans captured at high resolution will not be
suitable for on-screen display. Scaling (reducing resolution
by discarding image data) is often necessary to create
images for Web delivery.
• Metadata creation: Addition of text that helps describe or
organize an image for retrieval.
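Scaling can be illustrated with a very simple sketch that keeps every n-th pixel and discards the rest. This is only one possible approach, assumed here for clarity; image-editing tools typically average or interpolate neighbouring pixels to get a better-looking result. The pixel values and resolutions are made up.

```python
def downsample(pixels, factor):
    """Keep every `factor`-th pixel in each direction and discard the rest."""
    return [row[::factor] for row in pixels[::factor]]


def scaled_dimensions(width_px, height_px, source_dpi, target_dpi):
    """Pixel dimensions after reducing resolution from source_dpi to target_dpi."""
    new_width = round(width_px * target_dpi / source_dpi)
    new_height = round(height_px * target_dpi / source_dpi)
    return new_width, new_height


# Hypothetical 4 x 4 greyscale image (values 0-255).
image = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
small = downsample(image, 2)  # [[10, 30], [90, 110]]

# An assumed 2481 x 3507 pixel scan (A4 at 300 dpi) reduced to 100 dpi.
web_size = scaled_dimensions(2481, 3507, 300, 100)  # (827, 1169)
```

The scaled copy holds one ninth of the original pixels, which is why a high-resolution master file is usually kept and smaller derivatives are created from it for on-screen use.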
File/image processing operations (cont.)
• File format conversion: The original scan may not be in a
format suitable for all intended uses, thus requiring
conversion.
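A small sketch of file format conversion follows, using the plain-text Netpbm formats because they can be handled without extra libraries. It converts a greyscale PGM (P2) image into a black-and-white PBM (P1) image. The function name and the half-of-maximum threshold are assumptions, and the parser does not handle comment lines in the input.

```python
def pgm_to_pbm(pgm_text):
    """Convert an ASCII PGM (P2) greyscale image to an ASCII PBM (P1) image."""
    tokens = pgm_text.split()
    if tokens[0] != "P2":
        raise ValueError("expected an ASCII PGM (P2) file")
    width, height, maxval = int(tokens[1]), int(tokens[2]), int(tokens[3])
    values = [int(t) for t in tokens[4:4 + width * height]]
    # In PBM, 1 means black and 0 means white, so pixels darker than
    # half of the maximum value become 1.
    bits = ["1" if v * 2 < maxval else "0" for v in values]
    lines = [" ".join(bits[r * width:(r + 1) * width]) for r in range(height)]
    return "P1\n{} {}\n{}\n".format(width, height, "\n".join(lines))


# Hypothetical 2 x 2 greyscale image.
pgm = "P2\n2 2\n255\n0 255\n200 10\n"
pbm = pgm_to_pbm(pgm)
print(pbm)
```

Converting to a bitonal format discards the grey levels, so a conversion like this suits a delivery copy rather than the master file.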
Additional reading
OCR? ICR? IWR? OMG! Get the Most from Your Scanned Text
Link to article:
https://www.thecrowleycompany.com/ocr-icr-iwr-omg-get-scanned-text/