Lecture 4 Data Formats

This document discusses common data formats used in computer systems. It begins by explaining that computers process and store all data in binary format, while human communication includes language, images, and sounds. It then defines data formats as specifications for converting human data into a computer-usable form. The rest of the document discusses specific data format standards for alphanumeric text, images, video, audio, and methods for data compression.

CS2842 Computer Systems – Lecture IV

Data Formats

Dr. Sapumal Ahangama


Department of Computer Science and Engineering

1
DATA FORMATS
 Computers
 Process and store all forms of data in binary format

 Human communication
 Includes language, images and sounds

2
DATA FORMATS
 Specifications for converting data into computer usable form

 Define the different ways human data may be represented, stored and processed by a computer

 The data must have the ability to be moved between computers
 Metadata: information that describes or interprets the meaning of the data

3
DATA FORMATS
 Proprietary formats
 Individual programs can store and process data in any format that they want

 Standard data representations
 to be used as interfaces between different programs,
 between a program and the I/O devices used by the program,
 between interconnected hardware,
 between systems that share data

4
COMMON DATA REPRESENTATIONS

5
ALPHANUMERIC DATA
 Much of the data that will be used in a computer is originally provided in human-readable form,
 Letters of the alphabet, numbers, and punctuation,
 English or some other language

 Alphanumeric data are a combination of alphabetical and numerical characters

 Since alphanumeric data must be stored and processed within the computer in binary form, each character must be translated to a binary representation

6
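
As a minimal sketch of this translation, the Python snippet below maps each character of a short string to its numeric code and an 8-bit binary pattern. The helper name `to_binary` and the sample text are illustrative assumptions, not part of the lecture.

```python
def to_binary(text):
    """Return (character, numeric code, 8-bit pattern) for each character."""
    return [(ch, ord(ch), format(ord(ch), "08b")) for ch in text]

# One letter, one digit, and one punctuation mark.
for ch, code, bits in to_binary("A1?"):
    print(ch, code, bits)
```

Each character ends up as a fixed-width pattern of bits, which is the form in which the computer stores and processes it.
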
ALPHANUMERIC DATA
 Three alphanumeric codes are in common use,
 ASCII (American Standard Code for Information Interchange)
 EBCDIC (Extended Binary Coded Decimal Interchange Code)
 Unicode

 Nearly every system today uses Unicode or ASCII

7
ASCII
 Each character represented with a 7 bit code
 128 characters

 Consists of,
 digits 0 to 9,
 lowercase letters a to z,
 uppercase letters A to Z,
 punctuation symbols,
 33 non-printing control codes

 Extended to 8 bit code – Latin-1

8
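
The counts on this slide can be checked with a short Python sketch. It assumes the usual split of the 7-bit range into control codes (0 to 31, plus 127) and printing characters (32 to 126, counting the space); the variable names are illustrative.

```python
codes = range(2 ** 7)                                  # 7 bits give 128 codes

control = [c for c in codes if c < 32 or c == 127]     # non-printing control codes
printing = [c for c in codes if 32 <= c < 127]         # space, digits, letters, punctuation

print(len(control), len(printing))                     # 33 95
print(ord("0"), ord("A"), ord("a"))                    # 48 65 97
```
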
ASCII

9
UNICODE
 ASCII and EBCDIC have limitations
 An 8-bit code limits the number of possible characters to 256
 Other major languages?
 Omitted characters [, ], ^, {, }, ~

 These issues led to Unicode, originally designed as a 16 bit standard and stored that way in UTF-16
 65,536 possible 16-bit codes
 About 49,000 are defined to represent the world’s most used characters
 6,400 16-bit codes are reserved for private use
 Each of these characters can be stored in 2 bytes
 Characters outside the 16-bit range take two 16-bit codes (4 bytes) in UTF-16

10
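
To make the 2-byte storage concrete, here is a small Python sketch that encodes a few characters as UTF-16. The helper `utf16_bytes` and the sample characters are assumptions chosen for illustration; the last sample lies outside the 16-bit range and so needs two 16-bit codes.

```python
def utf16_bytes(ch):
    """Encode one character as big-endian UTF-16 without a byte-order mark."""
    return ch.encode("utf-16-be")

# Latin A, e with acute accent, a Sinhala letter, and an emoji.
samples = ["A", "\u00e9", "\u0d85", "\U0001F600"]
for ch in samples:
    data = utf16_bytes(ch)
    print(f"U+{ord(ch):04X} -> {len(data)} bytes: {data.hex()}")
```
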
UNICODE

11
UNICODE

12
2 CLASSES OF CODE
 Printing characters
 Produced on the screen or printer

 Control characters

13
KEYBOARD INPUT
 Scan code
 When a key is struck on the keyboard, the circuitry in the
keyboard generates a binary code

14
KEYBOARD INPUT
 Other alphanumeric inputs:
 OCR
 Barcode
 Magnetic Strip Reader
 RFID

15
IMAGE DATA
 Images come in many different shapes, sizes, textures, colors, and shadings
 Different uses call for different forms of image data
 Quality of the image
 Storage space required
 Time to transmit
 Ease of modification

 These competing requirements make it difficult to define a single universal format

16
IMAGE DATA
 Two distinct categories
 Bitmap or raster images
 Characterized by continuous variations in shading, color, shape, and texture
 Examples: JPEG, GIF

 Graphical objects
 Made up of graphical shapes such as lines and curves that can be defined geometrically

 The nature of display technology makes it much more convenient and cost effective to display and print most images as bitmaps
17
IMAGE DATA

19
BITMAP IMAGES
 Bitmap image format
 A rectangular image is divided into rows and columns
 The junction of each row and column is a point known as a pixel
 A pixel is a set of one or more binary numerical values that define its visual characteristics

 Preferred when the image contains a large amount of detail and processing requirements are fairly simple

20
BITMAP IMAGES
 Example: each point is represented by a 4-bit code corresponding to 1 of 16 shades
 Metadata
 Pixel data
 Stored from top to bottom, one row at a time

21
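
The layout described on this slide might be sketched in Python as follows. The three-byte header (width, height, bits per pixel) is a made-up metadata format for illustration only, not a real file format, and `pack_bitmap` is a hypothetical helper.

```python
def pack_bitmap(rows):
    """Pack a grid of 4-bit pixel values (0-15) into bytes, row by row."""
    height, width = len(rows), len(rows[0])
    header = bytes([width, height, 4])          # assumed metadata layout
    pixels = [p for row in rows for p in row]   # top to bottom, one row at a time
    if len(pixels) % 2:
        pixels.append(0)                        # pad so pixels pair up into whole bytes
    # Two 4-bit pixels share one byte: first pixel in the high half.
    body = bytes((pixels[i] << 4) | pixels[i + 1] for i in range(0, len(pixels), 2))
    return header + body

image = [
    [0, 5, 10, 15],
    [15, 10, 5, 0],
]
print(pack_bitmap(image).hex())   # 04020405affa50
```

The metadata tells a reader how to interpret the pixel bytes that follow; without the width, the row boundaries could not be recovered.
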
BITMAP IMAGES
 Data value representing a pixel
 Could be as simple as one bit
 For color image, might consist of many bytes
 RGB
 Additional bytes for other characteristics such as transparency and
color correction.

22
BITMAP IMAGES
 File size affected by
 Resolution
 Reducing the size of a pixel to improve detail
 Levels: number of bits used to represent each pixel

 Image formats
 GIF (Graphics Interchange Format)
 JPEG (Joint Photographic Experts Group)
 PNG (Portable Network Graphics)

23
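
How resolution and levels drive file size can be estimated with a rough Python sketch. It covers uncompressed pixel data only, ignoring metadata and compression; the function name and the sample dimensions are illustrative assumptions.

```python
def bitmap_bytes(width, height, bits_per_pixel):
    """Uncompressed pixel data size in bytes, rounded up to whole bytes."""
    return (width * height * bits_per_pixel + 7) // 8

print(bitmap_bytes(640, 480, 1))     # 38400    (black and white)
print(bitmap_bytes(640, 480, 8))     # 307200   (256 levels)
print(bitmap_bytes(1024, 768, 24))   # 2359296  (8 bits each for red, green, blue)
```
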
OBJECT IMAGES
 Object images are made up of simple elements such as straight or curved lines, circles, and arcs
 Each element can be defined mathematically by parameters
 A circle requires 3 parameters: the Cartesian coordinates of its center + radius
 A straight line needs the coordinates of its end points

24
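
A small Python sketch of this idea follows: each element is stored as a handful of parameters rather than as pixels. The class names and the `scaled` method are hypothetical, chosen only to show how such a description can be manipulated.

```python
from dataclasses import dataclass

@dataclass
class Circle:
    cx: float       # x coordinate of the center
    cy: float       # y coordinate of the center
    radius: float

    def scaled(self, factor):
        """Scale about the origin by changing only the three parameters."""
        return Circle(self.cx * factor, self.cy * factor, self.radius * factor)

@dataclass
class Line:
    x1: float       # first end point
    y1: float
    x2: float       # second end point
    y2: float

print(Circle(10, 20, 5).scaled(2))   # Circle(cx=20, cy=40, radius=10)
```

Resizing the circle touches three numbers regardless of how large it is drawn, whereas a bitmap of the same circle would need every affected pixel recomputed.
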
OBJECT IMAGES
 Advantages
 Require less storage space
 Can be manipulated easily

 Photographs as object images?

25
VIDEO DATA
 Requires a large amount of data
 1024 × 768 pixel true-color images at a frame rate of 30 frames per
second?
 70.8 megabytes of data per second!
 4.25 gigabytes per minute

 How to reduce video size?

26
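
The figures on this slide can be reproduced with the arithmetic below. The sketch assumes that true color means 3 bytes per pixel and that megabytes and gigabytes are decimal (10^6 and 10^9 bytes).

```python
width, height = 1024, 768
bytes_per_pixel = 3            # assumed: one byte each for red, green, blue
frames_per_second = 30

bytes_per_second = width * height * bytes_per_pixel * frames_per_second
bytes_per_minute = bytes_per_second * 60

print(f"{bytes_per_second / 1e6:.1f} MB per second")   # 70.8 MB per second
print(f"{bytes_per_minute / 1e9:.2f} GB per minute")   # 4.25 GB per minute
```
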
AUDIO DATA
 Sound is naturally an analog wave that needs to be digitized

 Sampling
 1000 samples per second = 1 kHz (kilohertz)
 Example: Audio CD sampling rate = 44.1 kHz

27
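
Sampling can be illustrated with a short Python sketch that measures a sine wave at regular intervals. The 440 Hz tone, the 10 ms duration, and the function name `sample_sine` are assumptions made for the example.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Measure a sine wave of the given frequency at regular time intervals."""
    count = round(sample_rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]

samples = sample_sine(freq_hz=440, sample_rate_hz=44_100, duration_s=0.01)
print(len(samples))   # 441 samples for 10 ms of audio at 44.1 kHz
```
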
AUDIO DATA
 Sampling Rate

 Height (amplitude) of each sample is saved as,
 8 bit number for radio quality recordings
 16 bit number for high fidelity recordings
 2 x 16 bits for stereo sound

28
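
The effect of sample size on storage can be sketched in Python as below. Storing samples as signed integers scaled from the range -1.0 to 1.0 is an assumption made for illustration (real formats differ in detail), and both function names are hypothetical.

```python
def quantize(sample, bits):
    """Map a sample in [-1.0, 1.0] to a signed integer of the given width."""
    max_level = 2 ** (bits - 1) - 1      # 127 for 8 bits, 32767 for 16 bits
    return round(sample * max_level)

def bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed audio data rate."""
    return sample_rate_hz * bits_per_sample * channels // 8

print(quantize(1.0, 8), quantize(1.0, 16))   # 127 32767
print(bytes_per_second(44_100, 16, 2))       # 176400 for stereo at the CD sampling rate
```
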
DATA COMPRESSION
 Compression: reducing data so that it requires fewer bytes of storage space
 Compression ratio: the size of the original data relative to the size of the compressed data

 Lossless Compression
 Inverse algorithm restores data to exact original form
 Examples: GIF, PCX, TIFF

 Original: 05573200001473291000006682732732
 Each run of zeros replaced by 0 and the run length: 0155732041473291056682732732
 Each occurrence of 732 replaced by Z: 0155Z0414Z91056682ZZ

29
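
The two-step digit example on this slide can be sketched in Python as follows. The function names are illustrative, and the sketch assumes the input contains only digits, so that the symbol Z is free to stand for the pattern 732. Runs longer than nine zeros are split into chunks so that each count stays a single digit.

```python
import re

def encode(digits):
    """Zero-run encoding followed by a pattern substitution."""
    # Replace each run of 1 to 9 zeros with "0" plus the run length.
    runs = re.sub(r"0{1,9}", lambda m: "0" + str(len(m.group())), digits)
    # Replace the recurring pattern "732" with the single symbol "Z".
    return runs.replace("732", "Z")

def decode(packed):
    """Inverse algorithm: undo the substitution, then expand the zero runs."""
    runs = packed.replace("Z", "732")
    return re.sub(r"0(\d)", lambda m: "0" * int(m.group(1)), runs)

original = "05573200001473291000006682732732"
packed = encode(original)
print(packed)                        # 0155Z0414Z91056682ZZ
print(decode(packed) == original)    # True
print(len(original) / len(packed))   # 1.6
```

Because `decode` restores the input exactly, the scheme is lossless; here 32 characters shrink to 20, a compression ratio of 1.6 to 1.
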
DATA COMPRESSION
 Lossy Compression
 Trades off data degradation for file size and download speed
 Much higher compression ratios, often 10 to 1
 JPEG

 MPEG-2?

30
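
The trade-off can be shown with a deliberately simple Python sketch. This is not how JPEG or MPEG work; it only assumes 8-bit values and discards the low 4 bits of each one, which halves the storage but makes exact recovery impossible. The function names are hypothetical.

```python
def lossy_compress(values):
    """Keep only the top 4 bits of each 8-bit value (half the storage)."""
    return [v >> 4 for v in values]

def restore(compressed):
    """Approximate the originals; the discarded low bits cannot be recovered."""
    return [(c << 4) | 0x08 for c in compressed]   # midpoint of each 16-value range

original = [12, 200, 201, 255]
restored = restore(lossy_compress(original))
print(restored)               # [8, 200, 200, 248]
print(restored == original)   # False
```

The restored values are close to the originals but not identical, which is the degradation that lossy methods accept in exchange for smaller files.
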
THANK YOU

31
