Multimedia and Content
The following chapter introduces terminology and gives a sense of the commonality of the elements of multimedia. The introduction of terminology begins with a clarification of the notion of multimedia, followed by a description of media and the important properties of multimedia systems. Subsequently, the characteristics of data streams in such systems and the notion of a Logical Data Unit (LDU) are introduced.
One way of defining multimedia can be found in the meaning of the composed word:
Multi [Lat.: much]: many; multiple.
Medium [Lat.: middle]: an intervening substance or means through which something is conveyed or transmitted.
This description is derived from the common forms of human interaction. It is not very exact and has to be adapted to computer processing. Therefore, the next section discusses the notion of medium in more detail with respect to computer processing.
Medium
Representation media are characterized by internal computer representations of information; the central question is: how is information coded in the computer? For example, an image can be in JPEG format, an audio stream can be represented using simple PCM (Pulse Code Modulation) with a linear quantization of 16 bits per sample, and a text character can be coded in ASCII or EBCDIC. Various formats are thus used to represent media information in a computer.
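As a small, hedged illustration of a representation medium, the sketch below quantizes a synthetic signal to 16-bit linear PCM samples; the sample rate, tone frequency and function name are assumptions chosen for the example rather than anything prescribed by the text.

    import math

    SAMPLE_RATE = 44100        # samples per second (CD quality, see the Sound section)
    BITS_PER_SAMPLE = 16       # linear quantization, as mentioned above
    MAX_AMPLITUDE = 2 ** (BITS_PER_SAMPLE - 1) - 1   # 32767 for signed 16-bit samples

    def quantize_sine(frequency_hz, duration_s):
        """Return a list of 16-bit PCM samples for a sine tone (illustrative only)."""
        samples = []
        for n in range(int(SAMPLE_RATE * duration_s)):
            value = math.sin(2 * math.pi * frequency_hz * n / SAMPLE_RATE)
            samples.append(int(round(value * MAX_AMPLITUDE)))
        return samples

    pcm = quantize_sine(440.0, 0.01)   # 10 ms of a 440 Hz tone
    print(len(pcm), "samples; first few:", pcm[:5])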
Presentation media refer to the tools and devices for the input and output of information. The central question is: through which medium is information delivered by the computer, or introduced into the computer? Media such as paper, screen and speaker are used to deliver information from the computer (output media); keyboard, mouse, camera and microphone are the input media.
Storage media refer to data carriers which enable the storage of information. However, the storage of data is not limited to the components available within a computer; therefore, paper is also a storage medium. The central question is: where will the information be stored? Microfilm, floppy disk, hard disk, and CD-ROM are examples of storage media.
The information exchange medium includes all information carriers for transmission, that is, all storage and transmission media. Information can be exchanged by transporting a storage medium outside of computer networks to the destination, through direct transmission using computer networks, or through combined use of storage and transmission media (e.g., an electronic mailing system).
The above classification of media can be used as a basis for characterizing the notion of medium in the context of information processing. Here, the description of the perception medium comes closest to our notion of a medium: the media appeal to the human senses. Each medium defines representation values and representation spaces [HD90, HS91], which address the five senses. For example, stereo and quadraphonic sound determine acoustic representation spaces.
Components of Multimedia
There are five components of multimedia: text, sound, images, animation and video.
Text
Text or written language is the most common way of communicating information. It is
one of the basic components of multimedia. It was originally defined by printed media such as
books and newspapers that used various typefaces to display the alphabet, numbers, and special
characters. Although multimedia products include pictures, audio and video, text may be the
most common data type found in multimedia applications. Besides this, multimedia also provides opportunities to extend the traditional power of text by linking it to other media, thus making it an interactive medium.
Hypertext
A hypertext system consists of nodes, which contain the text, and links between the nodes, which define the paths the user can follow to access the text in non-sequential ways. The links represent associations of meaning and can be thought of as cross-references. This structure is created by the author of the system, although in more sophisticated hypertext systems the user is able to define their own paths. Hypertext provides the user with the flexibility and choice to navigate through the material. Text should be used to convey imperative information and should be positioned at appropriate places in a multimedia product. Well-formatted sentences and paragraphs are vital factors, and spacing and punctuation also affect the readability of the text. Fonts and styles should be used to communicate the message more effectively.
Image
Images are an important component of multimedia. They are generated by the computer in two ways: as bitmap or raster images and as vector images.
Compression techniques are used to reduce the file size of images, which is useful for storing large numbers of images and for speeding up transmission in networked applications. Compression formats used for this purpose include GIF, TIFF and JPEG.
Animation
Animation consists of still images displayed so quickly that they give the impression of
continuous movement. The screen object is a vector image in animation. The movement of that
image along paths is calculated using numerical transformations applied to their defining
coordinates. To give the impression of smoothness the frame rate has to be at least 16 frames per
second, and for natural looking motion it should be at least 25 frames per second. Animations
may be two or three dimensional. In two dimensional animations the visual changes that bring an
image alive occur on the flat X and Y axis of the screen, while in three dimensional animations it
occurs along the entire three axes X, Y and Z showing the image from all the angles. Such
animations are typically rendered frame by high-end three dimensional animation softwares.
Animation tools are very powerful and effective. There are two basic types of animations, path
animation and frame animation.
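As a rough sketch of path animation, the code below linearly interpolates an object's position along a straight path at 25 frames per second; the path endpoints, duration and helper name are invented for the example.

    FRAME_RATE = 25   # frames per second, the rate suggested above for natural-looking motion

    def interpolate_path(start, end, duration_s):
        """Yield one (x, y) position per frame along a straight path (illustrative sketch)."""
        frames = int(duration_s * FRAME_RATE)
        for i in range(frames + 1):
            t = i / frames
            x = start[0] + t * (end[0] - start[0])
            y = start[1] + t * (end[1] - start[1])
            yield (x, y)

    # Move an object from (0, 0) to (100, 50) over 2 seconds: 51 frame positions.
    positions = list(interpolate_path((0, 0), (100, 50), 2.0))
    print(len(positions), positions[0], positions[-1])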
Sound
Sound is probably the most sensuous element of multimedia. It is meaningful speech in any language, from a whisper to a scream. It can provide the listening pleasure of music, the startling accent of special effects, or the ambience of a mood-setting background. It can promote an artist, add interest to a text site by humanizing the author, or teach the pronunciation of words in another language. Sound pressure level (volume) is measured in decibels, which express, on a logarithmic scale, the ratio between the level that is actually experienced and a chosen reference level.
The main benefit of audio is that it provides a channel that is separate from that of the display (Nielson, 1995). Sound plays a major role in multimedia applications, but there is a very fine balance between getting it right and overdoing it (Philips, 1997). Multimedia products benefit from digital audio both as informational content, such as a speech or voice-over, and as special effects that indicate the program is executing various actions, such as jumping to new screens. The three sampling frequencies used in multimedia are CD-quality 44.1 kHz, 22.05 kHz and 11.025 kHz. Digital audio also plays a key role in digital video.
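To make the sampling figures concrete, the sketch below estimates the uncompressed storage needed for digital audio; the bit depth, channel count and function name are assumptions chosen for the example.

    def audio_storage_bytes(sample_rate_hz, bits_per_sample, channels, duration_s):
        """Uncompressed PCM storage: rate x depth x channels x time, divided by 8 bits per byte."""
        return sample_rate_hz * bits_per_sample * channels * duration_s // 8

    # One minute of CD-quality stereo audio (44.1 kHz, 16 bits per sample, 2 channels).
    size = audio_storage_bytes(44100, 16, 2, 60)
    print(round(size / (1024 * 1024), 2), "MB")   # roughly 10 MB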
Video
Video is defined as the display of recorded real events on a television-type screen. The embedding of video in multimedia applications is a powerful way to convey information. It can incorporate a personal element which other media lack: the personality of the presenter can be displayed in a video (Philips, 1997). Video may be categorized into two types, analog video and digital video.
Component analog video is considered more advanced than composite video. It takes the different components of video, such as colour, brightness and synchronization, and breaks them into separate signals (Hillman, 1998). S-VHS and Hi-8 are examples of this type of analog video, in which colour and brightness information are stored on two separate tracks. In the early 1980s, Sony launched a new portable, professional video format, Betacam, in which the signals are stored on three separate tracks (Vaughan, 2008).
There are certain analogue broadcast video standards commonly used around the globe. These are the National Television System Committee (NTSC) standard, Phase Alternating Line (PAL), Sequential Colour with Memory (SECAM) and HDTV. In the United States, Canada and Japan the NTSC standard is used, while in the United Kingdom, China and South Africa PAL is used. SECAM is used in France. A newer standard known as High Definition Television (HDTV) has been developed, which offers better image and colour quality than the other standards.
Component digital is an uncompressed format with very high image quality, and it is highly expensive. Popular formats in this category are Digital Betacam and D-5, developed in 1994, and DVCAM, developed in 1996. There are certain standards for the digital broadcast of video: the Advanced Television Systems Committee (ATSC) standard, Digital Video Broadcasting (DVB), and Integrated Services Digital Broadcasting (ISDB). ATSC is the digital television standard for the United States, Canada and South Korea, DVB is used commonly in Europe, and ISDB is used in Japan to allow radio and television stations to convert to digital format (Molina & Villamil, 1998). Video can be used in many applications. Motion pictures enhance comprehension only if they match the explanation. For example, to show the dance steps used in different cultures, video is easier and more effective than any graphics or animation (Thibodeau, 1997).
A multimedia system is characterized by the following properties:
1. Combination of media
2. Independence
3. Computer-supported integration (computer control)
4. Communication systems
1. Combination of media
According to the definition of a multimedia system, such a system must be composed of different media and devices; only when they all work together do they form a multimedia system.
2. Independence
In a multimedia system the different media should be independent of each other, even though an inherently tight connection between different media may be needed for them to work together.
3. Computer-supported integration
The different independent media are combined in arbitrary forms to work together as a system with the support of computers. Computer-supported integration is also called control through the computer in multimedia systems.
4. Communication systems
Image Data Types
Images can be created using different techniques of data representation, called data types, such as monochrome and colored images. A monochrome image is created using a single color, whereas a colored image is created using multiple colors. Some important data types of images are the following:
1-bit images - An image is a set of pixels, where a pixel is a picture element in a digital image. In 1-bit images, each pixel is stored as a single bit (0 or 1). A bit has only two states: on or off, white or black, true or false. Therefore, such an image is also referred to as a binary image, since only two states are available. A 1-bit image is also known as a 1-bit monochrome image because it contains only one color: black for the off state and white for the on state.
A 1-bit image with resolution 640 x 480 needs a storage space of 640 x 480 bits = (640 x 480) / 8 bytes = (640 x 480) / (8 x 1024) KB = 37.5 KB.
The clarity or quality of a 1-bit image is very low.
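The same arithmetic applies to the other image types discussed below. As a small sketch, the helper here computes uncompressed image storage for a given resolution and bit depth; the function name is chosen for the example.

    def image_storage_kb(width, height, bits_per_pixel):
        """Uncompressed size in KB: width x height x depth bits, 8 bits per byte, 1024 bytes per KB."""
        return width * height * bits_per_pixel / 8 / 1024

    print(image_storage_kb(640, 480, 1))    # 1-bit image:  37.5 KB
    print(image_storage_kb(640, 480, 8))    # 8-bit image:  300.0 KB
    print(image_storage_kb(640, 480, 24))   # 24-bit image: 900.0 KB
    print(image_storage_kb(640, 480, 32))   # 32-bit image: 1200.0 KB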
8-bit Gray level images - Each pixel of an 8-bit gray level image is represented by a single byte (8 bits). Therefore each pixel of such an image can hold one of 2^8 = 256 values between 0 and 255, giving each pixel a brightness value on a scale from black (0, for no brightness or intensity) to white (255, for full brightness or intensity). For example, a dark pixel might have a value of 15 and a bright one might be 240.
A grayscale digital image is an image in which the value of each pixel is a single sample, which
carries intensity information. Images are composed exclusively of gray shades, which vary from
black being at the weakest intensity to white being at the strongest. Grayscale images carry many
shades of gray from black to white. Grayscale images are also called monochromatic, denoting
the presence of only one (mono) color (chrome). An image is represented by a bitmap. A bitmap is
a simple matrix of the tiny dots (pixels) that form an image and are displayed on a computer
screen or printed.
An 8-bit image with resolution 640 x 480 needs a storage space of 640 x 480 bytes = (640 x 480) / 1024 KB = 300 KB. Therefore an 8-bit image needs 8 times more storage space than a 1-bit image.
24-bit color images - In a 24-bit color image, each pixel is represented by three bytes, usually representing RGB (Red, Green and Blue). True color is usually defined to mean 256 shades each of red, green and blue, for a total of 2^24 = 16,777,216 color variations. This provides a method of representing and storing graphical image information in an RGB color space such that a large number of colors, shades and hues can be displayed in an image, as in high-quality photographic images or complex graphics.
Many 24-bit color images are stored as 32-bit images, with the extra byte for each pixel used to store an alpha value representing special-effect information.
A 24-bit color image with resolution 640 x 480 needs a storage space of 640 x 480 x 3 bytes = (640 x 480 x 3) / 1024 KB = 900 KB without any compression. A 32-bit color image with resolution 640 x 480 needs a storage space of 640 x 480 x 4 bytes = (640 x 480 x 4) / 1024 KB = 1200 KB without any compression.
Disadvantages
o Require large storage space
o Many monitors can display only 256 different colors at any one time. Therefore,
in this case it is wasteful to store more than 256 different colors in an image.
8-bit color images - 8-bit color graphics is a method of storing image information in a computer's memory or in an image file in which one byte (8 bits) represents each pixel. The maximum number of colors that can be displayed at once is 256. 8-bit color graphics come in two forms. In the first form, the image stores for each pixel not the full 24-bit color value but an 8-bit index into a color map. Therefore, such 8-bit image formats consist of two parts: a color map describing the colors present in the image, and the array of index values for each pixel. In most color maps each color is chosen from a palette of 16,777,216 colors (24 bits: 8 red, 8 green, 8 blue).
In the other form, the 8 bits use 3 bits for red, 3 bits for green and 2 bits for blue. This second form is often called 8-bit true color, as it does not use a palette at all. When a 24-bit full-color image is turned into an 8-bit image, some of the colors have to be eliminated; this is known as the color quantization process.
An 8-bit color image with resolution 640 x 480 needs a storage space of 640 x 480 bytes = (640 x 480) / 1024 KB = 300 KB without any compression.
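A minimal sketch of the second form described above (3 bits red, 3 bits green, 2 bits blue packed into one byte), assuming 8-bit R, G and B inputs; the function names are chosen for the example.

    def pack_rgb332(r, g, b):
        """Pack 8-bit R, G, B channels into one byte: 3 bits red, 3 bits green, 2 bits blue."""
        return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)

    def unpack_rgb332(byte):
        """Approximately recover the channels; the low-order bits lost in quantization stay zero."""
        return (((byte >> 5) & 0b111) << 5, ((byte >> 2) & 0b111) << 5, (byte & 0b11) << 6)

    packed = pack_rgb332(200, 120, 40)      # one byte instead of three
    print(packed, unpack_rgb332(packed))    # 204 (192, 96, 0): a quantized colour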
A color look-up table (LUT) is a mechanism used to transform a range of input colors into another range of colors. A color look-up table converts the logical color numbers stored in each pixel of video memory into physical colors, represented as RGB triplets, which can be displayed on a computer monitor. Each pixel of the image stores only an index value, or logical color number. For example, if a pixel stores the value 30, the meaning is to go to row 30 in the color look-up table (LUT). The LUT is often called a palette.
The characteristics of a LUT are the following:
The number of entries in the palette determines the maximum number of colors which can appear on screen simultaneously.
The width of each entry in the palette determines the number of colors which the full palette can represent.
A common example would be a palette of 256 colors: the number of entries is 256, and thus each entry is addressed by an 8-bit pixel value. Each color can be chosen from a full palette with a total of 16.7 million colors; that is, each entry is 24 bits wide, with 8 bits per channel, giving 256 levels for each of the red, green and blue components and 256 x 256 x 256 = 16,777,216 combinations.
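The sketch below shows the idea of an indexed image with a 256-entry palette; the palette contents and the tiny image are invented for the example.

    # A palette (LUT) with 256 entries; each entry is a 24-bit RGB triplet.
    palette = [(i, i, i) for i in range(256)]   # start from a simple grayscale palette
    palette[30] = (255, 0, 0)                   # redefine entry 30 as pure red

    # An "image" in which each pixel stores only an 8-bit index into the palette.
    indexed_image = [
        [30, 30, 0],
        [128, 255, 30],
    ]

    # Displaying the image means looking up each index to obtain the physical RGB colour.
    rgb_image = [[palette[index] for index in row] for row in indexed_image]
    print(rgb_image[0][0])   # (255, 0, 0): pixel value 30 refers to row 30 of the LUT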
JPEG - Joint Photographic Experts Group - The JPEG format was developed by the Joint Photographic Experts Group. JPEG files are bitmapped images that store information as 24-bit color. This is the format of choice for nearly all photographic images on the internet. Digital cameras save images in JPEG format by default. It has become the main graphics file format for the World Wide Web, and any browser can support it without plug-ins. In order to keep files small, JPEG uses lossy compression. It works well on photographs, artwork and similar materials, but not so well on lettering, simple cartoons or line drawings. For photographs, JPEG images work much better than GIFs. Though JPEG images can be interlaced, the format lacks many of the other special abilities of GIFs, such as animation and transparency, and is really only suitable for photos.
PNG - Portable Network Graphics - PNG is the only lossless format that web browsers support. PNG supports 8-bit, 24-bit, 32-bit and 48-bit data types. One version of the format, PNG-8, is similar to the GIF format, but PNG is superior to GIF: it produces smaller files with more options for colors, and it also supports partial transparency. PNG-24 is another flavor of PNG, with 24-bit color support, allowing ranges of color akin to a high-color JPEG. PNG-24 is in no way a replacement for JPEG, because it is a lossless compression format; this means that file sizes can be rather large compared with a comparable JPEG. PNG also supports up to 48 bits of color information.
TIFF - Tagged Image File Format - The TIFF format was developed by the Aldus Corporation in the 1980s and was later supported by Microsoft. The TIFF file format is a widely used bitmapped file format. It is supported by many image-editing applications, by software used by scanners, and by photo-retouching programs.
TIFF can store many different types of images, ranging from 1-bit images and grayscale images to 8-bit color images, 24-bit RGB images, and so on. TIFF files originally used lossless compression; today TIFF files may also use lossy compression according to the requirement. Therefore, it is a very flexible format. This file format is suitable when the output is printed. Multi-page documents can be stored as a single TIFF file, which is one reason this file format is so popular. The TIFF format is now used and controlled by Adobe.
BMP - Bitmap - The bitmap file format (BMP) is a very basic format supported by most Windows applications. BMP can store many different types of images: 1-bit images, grayscale images, 8-bit color images, 24-bit RGB images, and so on. BMP files are usually uncompressed and are therefore not suitable for the internet, although they can be compressed using lossless data compression algorithms.
EPS - Encapsulated PostScript - The EPS format is a vector-based graphic format. EPS is popular for saving image files because it can be imported into nearly any kind of application. This file format is suitable for printed documents. The main disadvantage of this format is that it requires more storage compared with other formats.
PDF - Portable Document Format - The PDF format holds vector graphics with embedded pixel graphics and offers many compression options. It is used when a document is ready to be shared with others or for publication, and it is platform independent. With Adobe Acrobat you can print any document to a PDF file, and from Illustrator you can save a file directly as .PDF.
EXIF - Exchangeable Image File - Exif is an image format for digital cameras. A variety of tags are available to facilitate higher-quality printing, since information about the camera and the picture-taking conditions can be stored and used by printers for possible color-correction algorithms. Exif also includes a specification of a file format for the audio that accompanies digital images.
WMF - Windows Metafile - WMF is the vector file format for the MS-Windows operating environment. It consists of a collection of graphics device interface function calls to the MS-Windows graphics drawing library. Metafiles are both small and flexible, but these images can be displayed properly only by their proprietary software.
PICT - PICT images are useful in Macintosh software development, but you should avoid them in desktop publishing. Avoid using the PICT format in electronic publishing, as PICT images are prone to corruption.
Photoshop- This is the native Photoshop file format created by Adobe. You can import
this format directly into most desktop publishing applications.
Image Recognition
Image recognition is usually performed on digital images which are represented by a pixel
matrix. The only information available to an image recognition system is the light intensities of
each pixel and the location of a pixel in relation to its neighbors. From this information, image
recognition systems must recover information which enables objects to be located and
recognized, and, in the case of stereoscopic images, depth information which informs us of the
spatial relationship between objects in a scene.
Image Formatting
Image Formatting means capturing an image by bringing it into a digital form -- already
covered in the section on digitizing images.
Conditioning
In an image, there are usually features which are uninteresting, either because they were
introduced into the image during the digitization process as noise, or because they form part of a
background. An observed image is composed of informative patterns modified by uninteresting
random variations. Conditioning suppresses, or normalizes, the uninteresting variations in the
image, effectively highlighting the interesting parts of the image.
Labeling
Informative patterns in an image have structure. Patterns are usually composed of
adjacent pixels which share some property such that it can be inferred that they are part of the
same structure (e.g., an edge). Edge detection techniques focus on identifying continuous
adjacent pixels which differ greatly in intensity or colour, because these are likely to mark
boundaries between objects, or between an object and the background, and hence form an edge. After the edge detection process is complete, many edges will have been identified. However, not all of the
edges are significant. Thresholding filters out insignificant edges. The remaining edges are
labeled. More complex labeling operations may involve identifying and labeling shape primitives
and corner finding.
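As a rough sketch of the labeling step, the code below computes a simple intensity gradient for each pixel and thresholds it to mark edge pixels; the tiny test image and the threshold value are invented for the example and stand in for a full edge detector.

    # A tiny grayscale image (rows of pixel intensities), invented for the example.
    image = [
        [10, 10, 10, 200, 200],
        [10, 10, 10, 200, 200],
        [10, 10, 10, 200, 200],
    ]
    THRESHOLD = 50   # minimum intensity difference for a pixel to be labeled as an edge

    def label_edges(img, threshold):
        """Mark pixels whose horizontal or vertical intensity difference exceeds the threshold."""
        h, w = len(img), len(img[0])
        edges = [[False] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                dx = abs(img[y][x] - img[y][x - 1]) if x > 0 else 0
                dy = abs(img[y][x] - img[y - 1][x]) if y > 0 else 0
                edges[y][x] = max(dx, dy) > threshold
        return edges

    for row in label_edges(image, THRESHOLD):
        print(["E" if e else "." for e in row])   # the "E" column marks the object boundary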
Grouping
Labeling finds primitive objects, such as edges. Grouping can turn edges into lines by
determining that different edges belong to the same spatial event. The first three operations represent the image as a digital image data structure (pixel information); from the grouping operation onwards, however, the data structure also needs to record the spatial events to which each pixel belongs.
This information is stored in a logical data structure.
Extracting
Grouping only records the spatial event(s) to which pixels belong. Feature extraction
involves generating a list of properties for each set of pixels in a spatial event. These may include
a set's centroid, area, orientation, spatial moments, grey tone moments, spatial-grey tone
moments, circumscribing circle, inscribing circle, etc. Additional properties depend on whether
the group is considered a region or an arc. If it is a region, then the number of holes might be
useful. In the case of an arc, the average curvature of the arc might be useful to know. Feature
extraction can also describe the topographical relationships between different groups. Do they
touch? Does one occlude another? Where are they in relation to each other? Etc.
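A minimal sketch of feature extraction, assuming a group is given as a set of (x, y) pixel coordinates; it computes only the area and centroid mentioned above, and the input region is invented for the example.

    def extract_features(pixels):
        """Compute simple region properties (area and centroid) for a set of (x, y) pixels."""
        area = len(pixels)
        cx = sum(x for x, _ in pixels) / area
        cy = sum(y for _, y in pixels) / area
        return {"area": area, "centroid": (cx, cy)}

    # A small square region produced by grouping, invented for the example.
    region = {(x, y) for x in range(4, 8) for y in range(10, 14)}
    print(extract_features(region))   # area 16, centroid (5.5, 11.5)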
Matching
Finally, once the pixels in the image have been grouped into objects and the relationship
between the different objects has been determined, the final step is to recognize the objects in the
image. Matching involves comparing each object in the image with previously stored models and
determining the best match template matching.
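A rough sketch of template matching, assuming grayscale images stored as nested lists; it slides a small template over the image and reports the offset with the smallest sum of squared differences. The image and template values are invented for the example.

    def match_template(image, template):
        """Return the (x, y) offset where the template differs least from the image (sum of squared differences)."""
        ih, iw = len(image), len(image[0])
        th, tw = len(template), len(template[0])
        scores = []
        for y in range(ih - th + 1):
            for x in range(iw - tw + 1):
                ssd = sum((image[y + j][x + i] - template[j][i]) ** 2
                          for j in range(th) for i in range(tw))
                scores.append((ssd, (x, y)))
        return min(scores)[1]   # position of the best (lowest-difference) match

    image = [
        [0, 0, 0, 0],
        [0, 9, 8, 0],
        [0, 7, 9, 0],
    ]
    template = [[9, 8],
                [7, 9]]
    print(match_template(image, template))   # (1, 1): the stored model matches best here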
A digital image is stored as a 2-dimensional array of values, where each value represents the data
associated with a pixel in the image. In the case of bitmaps, the value is 0 or 1, representing a monochrome image. In the case of a colour image, the value can be:
3 numbers representing the intensities of the red, green, and blue components of the
colour at that pixel;
An indirect address to tables of red, green and blue intensities;
An indirect address to a table of colour triples;
An indirect address to any table capable of representing colour codes;
4 or 5 spectral samples for each colour.
The storage space required for an image is the resolution of the image multiplied by the colour
depth. For example, a 640x480 resolution image in millions of colours requires 640x480x24 =
7,372,800 bits, or 900K. Smaller space requirements can be obtained by compressing the image.
Computer-generated graphics
Graphics can also be created from scratch using a graphics editor, e.g. xphigs. In this
case, a graphic is specified through graphics primitives and their attributes, rather than by a pixel
matrix. This gives the advantage that components of the image can be manipulated through the
primitives (e.g., line, square, ellipse), whereas with a digitized image it is only possible to
manipulate the image at the pixel level. These graphics occupy less space than a corresponding
digitized image of the same resolution and colour-depth. However, before the graphic can be
rendered on the screen it needs to be converted into a pixel matrix. Some graphics packages also
allow objects to be labeled (e.g., if you draw a chair you can label that object as a chair). This is
of particular interest to content-based image retrieval.
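A small sketch of the idea that a computer-generated graphic is stored as primitives and attributes rather than as a pixel matrix; the primitive records and the "chair" label are invented for the example (no particular graphics package such as xphigs is used here).

    # A graphic described by primitives and their attributes, not by pixels.
    graphic = [
        {"primitive": "line", "start": (0, 0), "end": (100, 50), "colour": "black", "label": None},
        {"primitive": "ellipse", "centre": (60, 40), "rx": 20, "ry": 10, "colour": "red", "label": "chair"},
    ]

    # Manipulating the graphic means editing primitives, not pixels: move the ellipse.
    graphic[1]["centre"] = (80, 40)

    # Labeled objects support content-based retrieval, e.g. finding every labeled part.
    labeled = [p for p in graphic if p["label"]]
    print(labeled[0]["primitive"], labeled[0]["label"])   # ellipse chair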