Module 2

Module-2 discusses various spatial data models used in Geographic Information Systems (GIS), including relational and object-oriented databases, as well as raster and vector data structures. It highlights the importance of Entity-Relationship (ER) diagrams for database design and the characteristics of different database structures, including their advantages and disadvantages. The module also covers raster data models like GRID and IMGRID, explaining their methodologies for storing and analyzing spatial data.

Uploaded by

fareenahmed08

BCV654 B Module-2

Module-2
Spatial Data Models
Database Structures – Relational, Object Oriented – Entities – ER diagram - data models -
conceptual, logical and physical models - spatial data models – Raster Data Structures – Raster
Data Compression - Vector Data Structures - Raster vs Vector Models- TIN and GRID data
models.

Database Structures in GIS


A GIS database stores and organises spatial and attribute data for efficient access,
management, and analysis. Its structure determines how data is stored, retrieved, and processed.
Types of Database Structures in GIS
GIS databases use different structures to handle spatial and non-spatial (attribute) data:
1. Flat File Database
2. Hierarchical Database
3. Network Database
4. Relational Database (RDBMS)
5. Object-Oriented Database (OODBMS)
6. Object-Relational Database (ORDBMS)

Relational Database (RDBMS) in GIS


A Relational Database Management System (RDBMS) is a structured way of storing and
managing data using tables that are related to each other. In GIS (Geographic Information
Systems), RDBMS is widely used to store and manage spatial and attribute data efficiently.

GIS applications require the ability to store, retrieve, query, and analyze spatial data, and
RDBMS helps in managing this efficiently through spatial extensions such as PostGIS
(PostgreSQL), Oracle Spatial, and Microsoft SQL Server Spatial.
Features of RDBMS in GIS

✔ Table-based storage – Stores spatial and attribute data in structured tables.


✔ SQL support – Uses Structured Query Language for data manipulation.
✔ Data relationships – Uses Primary Keys (PK) and Foreign Keys (FK) to relate tables.
✔ Indexing and Optimization – Uses spatial indexing for faster querying and retrieval.
✔ Scalability & Security – Supports multi-user access, authentication, and data integrity.
✔ Spatial Extensions – Supports spatial data types, spatial functions, and GIS operations.
Key Characteristics of RDBMS in GIS:

✔ Stores data in tables (rows and columns).


✔ Uses Primary Keys (PK) and Foreign Keys (FK) to establish relationships.
✔ Supports SQL (Structured Query Language) for data retrieval and management.


✔ Allows data integrity, security, and concurrent access by multiple users.


✔ Can store both spatial (maps, coordinates) and non-spatial (attributes) data.

Example:
Cities Table (Attribute Data)

City_ID (PK)   City_Name   Population
1              City A      500,000
2              City B      200,000
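
The primary key / foreign key idea can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch: the Cities rows come from the table above, while the Parks table and its rows are hypothetical, added purely to show a foreign key pointing back at City_ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.execute("""
    CREATE TABLE Cities (
        City_ID    INTEGER PRIMARY KEY,
        City_Name  TEXT NOT NULL,
        Population INTEGER
    )""")
# Hypothetical second table: each park belongs to exactly one city (1:N).
conn.execute("""
    CREATE TABLE Parks (
        Park_ID   INTEGER PRIMARY KEY,
        Park_Name TEXT NOT NULL,
        City_ID   INTEGER NOT NULL REFERENCES Cities(City_ID)
    )""")

conn.executemany("INSERT INTO Cities VALUES (?, ?, ?)",
                 [(1, "City A", 500000), (2, "City B", 200000)])
conn.executemany("INSERT INTO Parks VALUES (?, ?, ?)",
                 [(10, "Park X", 1), (11, "Park Y", 1), (12, "Park Z", 2)])

# Relate the two tables through the key relationship.
parks_per_city = conn.execute("""
    SELECT c.City_Name, COUNT(p.Park_ID)
    FROM Cities AS c
    LEFT JOIN Parks AS p ON p.City_ID = c.City_ID
    GROUP BY c.City_ID
    ORDER BY c.City_ID""").fetchall()
print(parks_per_city)
```

A spatial extension such as PostGIS would add a geometry column to tables like these; plain SQLite is used here only because it ships with Python.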

Advantages and Disadvantages

• Data Integrity & Consistency
  o Advantage: Ensures accurate and consistent data using primary keys, foreign keys, and constraints.
  o Disadvantage: Complex integrity rules can slow down data insertion and updates.
• Data Organization
  o Advantage: Stores data in structured tables, making it easy to manage and retrieve.
  o Disadvantage: Fixed schema requires careful planning before designing the database.
• Querying & Data Retrieval
  o Advantage: Supports SQL for efficient querying, filtering, and manipulation of data.
  o Disadvantage: Complex queries may require optimized indexing for performance improvement.
• Data Security
  o Advantage: Provides access control, user authentication, and encryption.
  o Disadvantage: Requires administration and regular maintenance to ensure security.
• Data Relationships
  o Advantage: Supports one-to-one, one-to-many, and many-to-many relationships for structured data storage.
  o Disadvantage: Relationship management can become complicated for large datasets.
• Multi-User Access
  o Advantage: Allows multiple users to access and modify data simultaneously.
  o Disadvantage: Locking mechanisms are needed to prevent conflicts, which may slow down performance.
• Scalability
  o Advantage: Can handle large datasets efficiently by indexing and partitioning data.
  o Disadvantage: Vertical scaling (adding more resources to a single server) can be costly.
• Data Redundancy
  o Advantage: Reduces redundancy by normalization, avoiding duplicate data.
  o Disadvantage: Too much normalization can lead to complex queries and slower performance.
• Backup & Recovery
  o Advantage: Provides features for data backup, rollback, and recovery.
  o Disadvantage: Data corruption or crashes require complex recovery procedures.
• GIS Integration
  o Advantage: Supports spatial extensions (e.g., PostGIS, Oracle Spatial) for GIS applications.
  o Disadvantage: Requires specialized extensions for full GIS functionality.

Object-Oriented Database (OODBMS) in GIS


An Object-Oriented Database Management System (OODBMS) integrates object-
oriented programming (OOP) principles with database functionalities. Unlike traditional
Relational Database Management Systems (RDBMS), which store data in tables, OODBMS
stores data as objects, similar to how data is represented in object-oriented programming
languages such as Java, C++, and Python.

Why OODBMS in GIS?


Geographic Information Systems (GIS) handle complex spatial data involving real-world
entities, relationships, and behaviours. OODBMS provides better flexibility, efficiency, and
scalability for managing spatial data models than relational databases.

Features of OODBMS in GIS

✔ Object-Oriented Data Storage – Stores spatial data as objects, encapsulating attributes and
behaviours.
✔ Complex Data Representation – Supports hierarchical and multi-level relationships for
geographic features.
✔ Inheritance & Polymorphism – Allows objects to inherit properties and behaviours from
other objects.
✔ Spatial Data Support – Integrates spatial data types such as points, lines, polygons, and
3D models.
✔ Direct Object Manipulation – Objects are stored and retrieved without complex joins,
improving performance.
✔ Better Mapping with Programming Languages – Works efficiently with OOP languages
like Java, C++, and Python.
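
Encapsulation, inheritance, and polymorphism can be illustrated with ordinary Python classes. The sketch below is not an OODBMS; the class names, attributes, and sample values are hypothetical and only show how a geographic feature can carry both its data and its behaviour.

```python
import math
from dataclasses import dataclass


@dataclass
class Feature:
    """Base class: attributes shared by every geographic feature."""
    name: str

    def describe(self) -> str:
        return f"{type(self).__name__} '{self.name}'"


@dataclass
class Road(Feature):
    """A line feature stored as an ordered list of (x, y) vertices."""
    vertices: list

    def length(self) -> float:
        # Behaviour lives with the data (encapsulation).
        return sum(math.dist(a, b)
                   for a, b in zip(self.vertices, self.vertices[1:]))

    def describe(self) -> str:  # polymorphism: overrides the base method
        return f"{super().describe()} with length {self.length():.1f}"


@dataclass
class City(Feature):
    """Inherits name and describe() unchanged from Feature."""
    population: int


features = [Road("Main St", [(0, 0), (3, 4), (3, 10)]), City("City A", 500000)]
descriptions = [f.describe() for f in features]
print(descriptions)
```

An object database would persist instances like these directly, whereas a relational design would spread the same information across tables.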
Advantages

• Better Representation of Real-World GIS Data: Stores GIS features as real-world objects (e.g.,
cities, roads, rivers) rather than tables.
• Encapsulation of Data & Methods: Objects can contain both data (attributes) and
functions (methods), improving modularity.
• Efficient Querying: No need for complex JOIN operations like in RDBMS, leading to faster
queries.
• Supports Complex Spatial Data: Handles hierarchical, network-based, and multi-dimensional
spatial data effectively.
• Inheritance & Reusability: Objects can inherit properties from other objects, reducing
redundancy.
• Scalability & Performance: Works well with large-scale GIS applications requiring complex
spatial relationships.
• Better Integration with Object-Oriented Programming (OOP): Easily integrates with languages
like Java, C++, and Python for GIS applications.
• Real-Time Spatial Data Handling: Useful for real-time GIS applications, such as traffic
monitoring and environmental modelling.

Disadvantages
• Less Mature than RDBMS: OODBMS is not as widely adopted in GIS as relational databases.
• Complex Query Language: No standard query language like SQL; querying depends on
OOP-based languages.
• Steeper Learning Curve: Requires knowledge of object-oriented programming (OOP), making
it harder for GIS professionals unfamiliar with coding.
• Limited Support in GIS Software: Fewer GIS tools (e.g., ArcGIS, QGIS) fully support
OODBMS compared to RDBMS.
• Complex Data Management: Managing object relationships and versioning can be challenging
in large-scale GIS applications.
• Higher Storage Requirements: Object-oriented databases often require more storage due to
their complex data structures.
• Difficult Migration from RDBMS: Converting existing GIS databases from RDBMS to
OODBMS is complex and time-consuming.

Entity-Relationship (ER) Diagram


An ER diagram is a visual way to represent the entities in a database and the relationships
between them. It's a key tool for database design.

• Why are ER diagrams important in GIS?
  o Data Organization: GIS data can be complex, involving both spatial and non-spatial
    information. ER diagrams help organize this data logically.
  o Database Design: They provide a blueprint for creating a GIS database, ensuring
    that data is stored efficiently and accurately.
  o Spatial Relationships: GIS is all about spatial relationships (e.g., adjacency,
    containment, intersection). ER diagrams help model these relationships.
  o Data Integrity: They help define rules and constraints that maintain the accuracy
    and consistency of the data.
• Components of an ER Diagram:
  o Entities: Represented by rectangles (as you see in the image).
  o Attributes: Properties or characteristics of an entity. Represented by ovals. For
    example, a "Road" entity might have attributes like "Name," "Length," and
    "Type."
  o Relationships: Associations between entities. Represented by diamonds or lines.
    For example, a "City" entity "contains" a "Park" entity.
  o Cardinality: Specifies the number of instances of one entity that can be related
    to another entity. Common types are:
    ▪ One-to-one (1:1)
    ▪ One-to-many (1:N)
    ▪ Many-to-many (M:N)

How ER Diagrams are Used in GIS

1. Modeling Geographic Features:
  o For example, you might have entities like "Land Parcels," "Buildings," and
    "Owners." An ER diagram would show how these are related (e.g., a "Land
    Parcel" contains a "Building," and a "Land Parcel" is owned by an "Owner").
2. Representing Spatial Relationships:
  o ER diagrams can model relationships like:
    ▪ Adjacency: "Parcels" that share a boundary.
    ▪ Containment: "Cities" that contain "Parks."
    ▪ Intersection: "Roads" that intersect "Rivers."
3. Designing GIS Databases:
  o ER diagrams help determine:
    ▪ What tables are needed in the database.
    ▪ What columns each table should have (attributes).
    ▪ How the tables should be linked together (relationships).

The above figure is an Entity-Relationship (ER) Diagram designed to represent the
relationships between different entities in a database, likely for a GIS (Geographic Information
System) application related to land parcels and buildings. Let's break it down:

Entities (Rectangles):

• Land Parcel: This represents a piece of land.
• Building: This represents a structure located on a land parcel.
• Occupant: This represents a person or entity that occupies a building.

Attributes (Ovals):

• Land Parcel:
  o ID #: A unique identifier for each land parcel.
  o Owner: The person or entity that owns the land parcel.
  o Address: The location of the land parcel.
• Building:
  o ID #: A unique identifier for each building.
  o Type: The type of building (e.g., residential, commercial).
  o # Floors: The number of floors in the building.
• Occupant:
  o ID #: A unique identifier for each occupant.
  o Name: The name of the occupant.
  o Person: Additional information about the occupant (e.g., contact details).

Relationships (Diamonds):

• Contains: This represents the relationship between "Land Parcel" and "Building". It
indicates that a land parcel contains a building.

• Has: This represents the relationship between "Building" and "Occupant". It indicates
that a building has an occupant.

Interpretation:

The diagram shows that:

1. Each Land Parcel has a unique ID, an owner, and an address.
2. A Land Parcel can contain one or more Buildings (as indicated by the "Contains"
   relationship).
3. Each Building has a unique ID, a type, and a number of floors.
4. A Building can have one or more Occupants (as indicated by the "Has" relationship).
5. Each Occupant has a unique ID, a name, and potentially other personal information.

Key Concepts Illustrated:

• Entities: The fundamental objects or concepts being modeled.

• Attributes: The properties that describe the entities.

• Relationships: How the entities are connected to each other.

• Cardinality: (Implicit) While not explicitly shown with notation like 1:1, 1:N, or M:N,
we can infer that:
  o One Land Parcel can contain multiple Buildings (1:N).
  o One Building can have multiple Occupants (1:N).

Purpose:

This ER diagram serves as a blueprint for designing a database to store and manage information
about land parcels, buildings, and occupants. It helps to ensure data consistency, integrity, and
efficient retrieval of information.
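
One way to turn this ER diagram into tables is sketched below with Python's sqlite3 module. The table and column names are assumptions derived from the entity and attribute names above, the "Person" attribute is left out for brevity, and all rows are made-up sample data. Each 1:N relationship becomes a foreign key on the "many" side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for SQLite to check FKs
conn.executescript("""
    CREATE TABLE land_parcel (
        id      INTEGER PRIMARY KEY,
        owner   TEXT,
        address TEXT
    );
    CREATE TABLE building (
        id        INTEGER PRIMARY KEY,
        type      TEXT,
        floors    INTEGER,
        parcel_id INTEGER NOT NULL REFERENCES land_parcel(id)  -- "Contains"
    );
    CREATE TABLE occupant (
        id          INTEGER PRIMARY KEY,
        name        TEXT,
        building_id INTEGER NOT NULL REFERENCES building(id)   -- "Has"
    );
""")

conn.execute("INSERT INTO land_parcel VALUES (1, 'Owner A', '12 Hill Road')")
conn.executemany("INSERT INTO building VALUES (?, ?, ?, ?)",
                 [(1, "residential", 2, 1), (2, "commercial", 5, 1)])
conn.executemany("INSERT INTO occupant VALUES (?, ?, ?)",
                 [(1, "Occupant P", 1), (2, "Occupant Q", 1), (3, "Occupant R", 2)])

# A building that points at a non-existent parcel breaks the "Contains" rule.
try:
    conn.execute("INSERT INTO building VALUES (3, 'residential', 1, 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

buildings_on_parcel_1 = conn.execute(
    "SELECT COUNT(*) FROM building WHERE parcel_id = 1").fetchone()[0]
print(orphan_rejected, buildings_on_parcel_1)
```

The rejected insert is the data integrity benefit in action: the database itself refuses a building with no valid land parcel.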

RASTER DATA MODELS

Grid-based GIS spatial data can be stored, manipulated, analysed, and referenced using any
one of three models: the GRID model, the IMGRID model, and the MAP model. All of these
models use grid cell values, their attributes, coverages, and corresponding legends. The models
were developed over time as requirements changed. Depending on the application of interest,
the availability of software, and other related information, any one of them can be selected for
a particular GIS project.

GRID MODEL:

The first and foremost model for the representation of raster data is the GRID model. The
method of storing, manipulating, and analysing grid-based data was first conceptualised in
the attempt to develop the GRID model. Burrough (1983) described this approach, which was
used by the early GIS systems.

The figure below illustrates the GRID model. In this method, each grid cell is referenced and
addressed individually and is associated with identically positioned grid cells in all other
coverages, rather like a vertical column of grid cells, each dealing with a separate theme.
Comparisons between coverages are therefore performed on a single column at a time. For
example, to compare soil attributes in one coverage with vegetation attributes in a second
coverage and land use/land cover attributes in a third coverage, each X and Y location must be
examined individually. So a soil grid cell at location X10-Y10 will be compared to its vegetation
counterpart and to the third-layer land use/land cover cell at location X10-Y10. You might be
able to envision this by imagining a geological core in which each rock type is lying directly on
top of the next; to get a picture of the entire study area, it will be necessary to put a large number
of cores together.

The advantage of this model is that computational comparison of multiple themes or coverages
for each grid cell location is relatively easy. This is a reasonable approach and has proven
successful. The main disadvantage is that it limits the efficient examination of relationships
between themes to one-to-one relationships within the spatial framework. In other words, it is
inconvenient to compare groups in one coverage to groups in another coverage because each
grid cell location must be addressed individually. The second disadvantage is that more storage
space is needed for the cell data, and the representation is vertical rather than horizontal,
whereas a horizontal representation would more closely resemble our notion of maps.
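
The column-at-a-time comparison described for the GRID model can be sketched in a few lines of Python. The three small coverages and their numeric codes are hypothetical; the point is only that every query walks the grid one location at a time.

```python
# Three coverages of the same 2 x 3 study area; values are hypothetical codes.
soil       = [[1, 1, 2],
              [3, 2, 2]]
vegetation = [[5, 5, 6],
              [6, 6, 7]]
land_use   = [[9, 8, 8],
              [9, 9, 8]]

coverages = {"soil": soil, "vegetation": vegetation, "land_use": land_use}


def cell_column(row, col):
    """Return the vertical 'core' of values stored at one grid cell location."""
    return {name: grid[row][col] for name, grid in coverages.items()}


# Comparing themes means visiting each location individually.
matches = [(r, c)
           for r in range(2) for c in range(3)
           if soil[r][c] == 2 and vegetation[r][c] == 6]
print(cell_column(1, 1), matches)
```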


IMGRID MODEL

With a slight modification of the checkerboard analogue, the second basic raster data model,
the IMGRID data model, can be illustrated as in the figure below. This model was also used
in early GIS systems (Burrough, 1983). Let us assume that the red squares on the
checkerboard map contain a single attribute, rather than a whole theme. For example, we can
use the number 1 (red squares) to represent water and 0 (black squares) to indicate the absence
of water. How can we represent a thematic map of land use that contains, say, four categories,
namely recreation, agriculture, industry, and residences? Each of these four attributes would
have to be separated into an individual layer. One layer would stand for agriculture only, with
1's and 0's representing the presence or absence of this activity for each grid cell. Recreation,
industry, and residences would be represented in the same way, with each variable referenced
directly.


The IMGRID system has two major advantages. First, we have a contiguous object that more
closely resembles how we think about a map. That is, our primary storage object is a
two-dimensional array of numbers, rather than a column of numbers for different themes. Second,
we reduce the numbers that must be contained in each coverage to 0's and 1's. This
simplifies our computations and eliminates the need for map legends. A third advantage is that,
since each variable is uniquely identified, a single attribute value can be assigned to a single
grid cell. There is, however, a difficulty. Let us assume that a given grid cell is occupied partly
by agriculture and partly by recreation, and that each of these attributes of the land use theme
is stored separately. In such a case, we may encounter difficulties when creating our final
thematic coverage if multiple values occur in individual cells. To avoid such problems, we must
ensure that each grid cell has only a single value for each variable.
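
The IMGRID idea of one 0/1 layer per attribute can be sketched as follows. The small land use grid and the single-letter category codes are hypothetical. The final check reflects the single-value requirement: if every cell belongs to exactly one category, the binary layers sum to 1 everywhere.

```python
# Hypothetical land use coverage:
# R = recreation, A = agriculture, I = industry, H = residences
land_use = [["A", "A", "R"],
            ["I", "H", "R"],
            ["A", "H", "H"]]


def to_binary_layers(grid):
    """Split one categorical coverage into one 0/1 layer per category."""
    categories = sorted({value for row in grid for value in row})
    return {cat: [[1 if value == cat else 0 for value in row] for row in grid]
            for cat in categories}


layers = to_binary_layers(land_use)

# Each cell holds exactly one category, so the layers sum to 1 per location.
sums = [[sum(layers[cat][r][c] for cat in layers) for c in range(3)]
        for r in range(3)]
print(layers["A"], sums)
```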

MAP MODEL

The third raster GIS model, the Map Analysis Package (MAP) model developed by C. Dana Tomlin
(Burrough, 1983), formally integrates the advantages of the above two raster data structure
methods. In this data model, each thematic coverage is recorded and accessed separately by
map name or title. This is accomplished by recording each variable, or mapping unit, of the
coverage's theme as a separate number code or label, which can be accessed individually when
the coverage is retrieved. The label corresponds to a portion of the legend and has its own
symbol assigned to it. In this way, it is easy to operate on individual grid cells and on groups of
similar grid cells, and changes in value require rewriting only a single number per
mapping unit, thus simplifying the computations. The overall major improvement is that the
MAP method allows ready manipulation of the data in a many-to-one relationship between the
attribute values and the sets of grid cells.
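
The many-to-one link between grid cells and a legend entry can be sketched like this. The soil codes, labels, and helper name are hypothetical and not taken from the Map Analysis Package itself. Many cells share one code, so changing a mapping unit means editing one legend entry rather than every cell.

```python
# One thematic coverage stored as integer codes plus a legend (hypothetical data).
soil_map = [[1, 1, 2],
            [1, 3, 2],
            [3, 3, 2]]
legend = {1: "sand", 2: "clay", 3: "loam"}


def cells_with_label(grid, legend, label):
    """Return (row, col) of every cell whose code maps to the given label."""
    codes = {code for code, name in legend.items() if name == label}
    return [(r, c) for r, row in enumerate(grid)
            for c, code in enumerate(row) if code in codes]


before = cells_with_label(soil_map, legend, "clay")

# Reassigning a mapping unit touches one legend entry, not every grid cell.
legend[3] = "clay"
after = cells_with_label(soil_map, legend, "clay")
print(before, after)
```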


The MAP data model is compatible with almost all computer systems from its original
mainframe version to Macintosh and PC versions and modern UNIX-based workstation
versions. It can be used as a teaching version of GIS as it is very flexible and has also become
a major module in commercial GIS packages like ARC/INFO.

Although raster GIS systems have traditionally been developed to allow single attributes to be
stored individually for each grid cell, some have evolved to include direct links to existing
database management systems. This approach extends the utility of the raster GIS by
minimising the number of coverages and substituting multiple variables for each grid cell in
each coverage. Such extensions to the raster data model have also allowed direct linkage to
existing GIS systems that use a vector data model, with data converted back and forth between
raster and vector, so the user can operate with the advantages of both data structures. The
conversion process is often quite transparent, allowing the user to perform the analyses needed
without concern for the original data structure. This feature is particularly important because it
strengthens the relationship between traditional digital image processing software, used to
manipulate grid cell-based, remotely sensed data, and GIS software. Many software systems
already have both sets of capabilities, and more are likely in the future. Together with the linkage
to existing statistical packages, we are rapidly approaching systems that operate with a superset
of spatial analytical techniques, resulting in a maturing of automated geography.


RASTER DATA STRUCTURES

A raster data structure refers to the way raster data is stored so that it can be processed by the
computer. Raster data is stored as a matrix, and the cell values are written into files by rows and
columns. The different types of raster data structures are:

i. Band Interleaved by Pixel (BIP)

ii. Band Interleaved by Line (BIL)

iii. Band Sequential (BSQ)

Band Interleaved by Pixel (BIP)

One of the earliest digital formats used for satellite data is the band interleaved by pixel (BIP)
format. This format treats pixels as the separate storage unit. Brightness values for each pixel
are stored one after another. It is practical to use if all bands in an image are to be used. The
figure shows the logic of how the data is recorded to the computer tape in sequential values for a
four band image in BIP format. All four bands are written to the tape before values for the next
pixel are represented. Any given pixel located on the tape contains values for all four bands
written directly in sequence. In order to read all four bands of the image, all four panels must be
pieced together to form the entire scene.


Band Interleaved by Line (BIL)

Just as the BIP format treats each pixel of data as the separate unit, the band interleaved by line
(BIL) format is stored by lines. The figure shows the logic of how the data is recorded to the
computer tape in sequential values for a four band image in BIL format. Each line is represented
in all four bands before the next line is recorded. Like the BIP format, it is useful if all
bands of the imagery are to be used in the analysis. If some bands are not of interest, the format
is inefficient if the data are on tape, since it is necessary to read serially past unwanted data.

Band Sequential Format (BSQ)

The band sequential format requires that all data for a single band covering the entire scene be
written as one file. Thus, if an analyst wanted to extract the area in the center of a scene in four
bands, it would be necessary to read into this location in four separate files to extract the desired
information. Many researchers like this format because it is not necessary to read serially past
unwanted information if certain bands are of no value, especially when the data are on a number
of different tapes. Random access optical disk technology, however, makes this serial argument
obsolete.
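
The three storage orders differ only in which index varies fastest when the values are written out. The sketch below uses a hypothetical scene with two bands, two rows, and two columns (rather than the four bands in the text) so that the sequences stay short.

```python
# image[band][row][col]: a hypothetical 2-band, 2-row, 2-column scene.
image = [
    [[11, 12],
     [13, 14]],   # band 1
    [[21, 22],
     [23, 24]],   # band 2
]
bands, rows, cols = len(image), len(image[0]), len(image[0][0])

# BIP: all band values for one pixel, then the next pixel.
bip = [image[b][r][c]
       for r in range(rows) for c in range(cols) for b in range(bands)]

# BIL: one line of band 1, the same line of band 2, then the next line.
bil = [image[b][r][c]
       for r in range(rows) for b in range(bands) for c in range(cols)]

# BSQ: the whole of band 1, then the whole of band 2.
bsq = [image[b][r][c]
       for b in range(bands) for r in range(rows) for c in range(cols)]

print(bip, bil, bsq, sep="\n")
```

In BSQ the values of one band are contiguous, which is why a single band can be read without passing over the others.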

Types of Raster Data


Satellite Imagery

Remotely sensed satellite data are recorded in raster format. The pixel value in a satellite image
represents light energy reflected or emitted from the earth's surface. By analyzing the pixel
values, an image processing system can extract a variety of themes from satellite images, such
as land use and land cover, hydrography, water quality, and other themes. Satellite images can be
displayed in black and white or in color. Satellite images can also simulate color photographs
if they have pixel values from the red, green, and blue spectral bands. The image looks like a
color photograph if bands 3, 2, and 1 are assigned to red, green, and blue respectively, and like a
color infrared photograph if bands 4, 3, and 2 are assigned to red, green, and blue respectively.

Graphics Files

This type of raster data includes maps, photographs, and images that can be stored
as digital graphic files. Popular graphic file formats in raster form are GIF (Graphics
Interchange Format), TIFF (Tagged Image File Format), and JPEG (Joint Photographic Experts
Group).

Digital Elevation Models

A digital elevation model (DEM) consists of an array of uniformly spaced elevation data. A DEM
is point based, but it can easily be converted to raster data by placing each elevation point at the
center of a cell.

Digital Orthophotos

A digital orthophoto quad (DOQ) is a digitized image prepared from an aerial photograph or other
remotely sensed data, in which the displacement caused by camera tilt and terrain relief has
been removed. A digital orthophoto is georeferenced and can be registered with topographic
and other maps.

Raster Data Compression

Raster data compression techniques aim to reduce storage space and improve efficiency by
encoding information using fewer bits than the original, employing either lossless or lossy
methods. Common techniques include (a) run-length codes, (b) raster chain codes, (c) block
codes, and (d) the unique structure called quadtrees.


Run-Length Coding

The first method of compacting raster data is run-length coding. In raster data, each grid cell
has a numerical value corresponding to a category of data on the map, and each value must be
put (generally typed) into the computer.

For example, for a map of 500 x 500 grid cells, 250,000 numbers have to be typed into the
computer. As you begin typing, you will quickly see patterns emerging from the data that
present opportunities for reducing your workload. Specifically, there are long strings of the
same number in each row. Think how much time you could save if, for a given row, you could
just tell the computer that starting at column 8 all the numbers are 1s, representing some map
variable, until you get to column 56, then at column 57 the numbers are 2s until the end of the
row. Indeed, you could also save a great deal of space by simply giving starting and ending
points for each string and the value that should be stored for that string. This method of storing
the data is called run-length coding.

For a 10 x 10 example grid, each row is written below as pairs of a cell value followed by the
number of consecutive cells that hold it:

0,10
0,10
0,4, 1,4, 0,2
0,4, 1,4, 0,2
0,2, 1,6, 0,2
0,2, 1,6, 0,2
0,2, 1,6, 0,2
0,2, 1,6, 0,2
0,10
0,10
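
The codes above can be reproduced with a short run-length encoder. This is a minimal sketch: the 10 x 10 grid is rebuilt from the listed codes, and the function names are illustrative only.

```python
from itertools import groupby


def rle_encode_row(row):
    """Encode one raster row as (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(row)]


def rle_decode_row(pairs):
    """Expand (value, run_length) pairs back into the original row."""
    return [value for value, count in pairs for _ in range(count)]


# 10 x 10 grid rebuilt from the run-length codes listed above.
grid = ([[0] * 10] * 2
        + [[0] * 4 + [1] * 4 + [0] * 2] * 2
        + [[0] * 2 + [1] * 6 + [0] * 2] * 4
        + [[0] * 10] * 2)

encoded = [rle_encode_row(row) for row in grid]
stored_numbers = sum(2 * len(pairs) for pairs in encoded)

for pairs in encoded:
    print(", ".join(f"{value},{count}" for value, count in pairs))
print(stored_numbers, "numbers stored instead of", 10 * 10)
```

For this grid the encoded form holds 44 numbers instead of 100. The saving grows with longer runs and shrinks for rasters whose values change from cell to cell.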


Raster Chain Codes

Chain coding defines the outer boundary using relative positions from a start point. The
sequence of the exterior is stored where the endpoint finishes at the start point. During the
encoding, the direction is stored as an integer. For example, the value 0 is north and 1 is east.
The disadvantage is that it is difficult to modify and edit the boundaries, such as merging and
inserting them. Local modification will change the overall structure, which is inefficient.
Moreover, because chain code stores the boundaries of each area as a unit, the boundaries of
adjacent areas will be stored repeatedly, resulting in redundancy.

For example: we start at the position (5,2). From here we define the border using cardinal
directions and the number of movements.

Direction and number of movements:
• E3, S4, W1,
• S1, W1, N1,
• W1, N3, E1,
• N1.

The same border written step by step:
• E1, E2, E3
• S1, S2, S3, S4
• W1
• S1, W1, N1
• W1, N3, E1
• N1
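
Decoding a chain code amounts to following the moves from the start point. The sketch below uses its own hypothetical L-shaped boundary rather than the one from the figure. The codes 0 = north and 1 = east follow the text, while 2 = south, 3 = west, and the choice of north as increasing y are assumptions made here.

```python
# Direction codes: 0 = north and 1 = east follow the text; 2 = south and
# 3 = west are assumed here to complete the set. North is taken as +y.
STEP = {0: (0, 1), 1: (1, 0), 2: (0, -1), 3: (-1, 0)}


def decode_chain(start, chain):
    """Follow (direction, count) moves and return the vertices visited."""
    x, y = start
    points = [(x, y)]
    for direction, count in chain:
        dx, dy = STEP[direction]
        x, y = x + dx * count, y + dy * count
        points.append((x, y))
    return points


# Hypothetical boundary starting at (5, 2): E3, N2, W1, N2, W2, S4
chain = [(1, 3), (0, 2), (3, 1), (0, 2), (3, 2), (2, 4)]
outline = decode_chain((5, 2), chain)

# A valid boundary chain must end where it started.
is_closed = outline[0] == outline[-1]
print(outline, is_closed)
```

Checking that the decoded outline returns to its start point is a cheap way to catch a mistyped chain.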

Block Codes

The third method of storing grid-based data, or of reducing the storage it needs, is block codes.
The block codes method is a modification of run-length codes. Instead of giving starting and
ending points plus a grid cell code, select a square group of cells, assign a starting point (the
centre or a corner), pick a grid cell value, and tell the computer how wide the square of grid cells
is, based on the number of cells. Block coding is also called a two-dimensional run-length code.
Each square group of grid cells, including individual grid cells, can be stored in this way with
a minimum group of numbers. Block coding is a very effective method of reducing
the storage space for most thematically layered digital data in a GIS.

In the example raster, instead of storing 64 grid cells, it takes just 7 blocks.
Using block coding, it requires one 3×3 block, two 2×2 blocks, and four 1×1 cell blocks to
encode this raster image.
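
A simple way to experiment with block codes is a greedy scan that grows the largest possible square from each uncovered cell. This is an illustrative sketch, not the procedure of any particular GIS package: the 4 x 4 grid is hypothetical, blocks are anchored at their top-left corner, and a greedy scan does not guarantee the smallest possible number of blocks.

```python
def block_encode(grid, target=1):
    """Greedily cover all cells equal to `target` with square blocks.

    Returns (row, col, size) triples: top-left corner and side length.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    covered = [[False] * n_cols for _ in range(n_rows)]

    def fits(r, c, size):
        if r + size > n_rows or c + size > n_cols:
            return False
        return all(grid[i][j] == target and not covered[i][j]
                   for i in range(r, r + size) for j in range(c, c + size))

    blocks = []
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c] != target or covered[r][c]:
                continue
            size = 1
            while fits(r, c, size + 1):  # grow the square while it still fits
                size += 1
            for i in range(r, r + size):
                for j in range(c, c + size):
                    covered[i][j] = True
            blocks.append((r, c, size))
    return blocks


def block_decode(blocks, n_rows, n_cols, target=1, background=0):
    """Rebuild the grid from the stored blocks."""
    grid = [[background] * n_cols for _ in range(n_rows)]
    for r, c, size in blocks:
        for i in range(r, r + size):
            for j in range(c, c + size):
                grid[i][j] = target
    return grid


grid = [[1, 1, 1, 0],
        [1, 1, 1, 0],
        [1, 1, 1, 1],
        [0, 0, 1, 1]]
blocks = block_encode(grid)
print(blocks)
```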


Quadtrees

The final method of compact storage is a rather difficult approach. Still, at least one commercial
system, called Spatial Analysis System (SPANS), from Tydac, and one experimental system
called Quilt are based on this scheme. Like block codes, quadtrees operate on square groups of
cells. Here the entire map is successively divided into uniform square groups of grid cells
with the same attribute value. Starting with the entire map as the entry point, the map is
divided into four quadrants (NW, NE, SW, and SE). If any of these quadrants is homogeneous,
containing grid cells with the same value, that quadrant is stored and no further subdivision is
necessary. Each remaining quadrant is further divided into four quadrants, again NW, NE, SW,
and SE. Each quadrant is examined for homogeneity. All homogeneous quadrants are again
stored, and each of the remaining quadrants is further divided and tested in the same way until
the entire map is stored as square groups of cells, each with the same attribute value. In the
quadtree structure, the smallest unit of representation is a single grid cell.
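
The recursive subdivision can be sketched as a small function. The sketch assumes a square grid whose side is a power of two, and it represents a homogeneous quadrant by its value and a mixed quadrant by a dictionary of four children. The 4 x 4 grid is hypothetical.

```python
def build_quadtree(grid, row=0, col=0, size=None):
    """Return a leaf value if the square is homogeneous, otherwise a dict
    of four sub-quadrants (NW, NE, SW, SE). Assumes a 2^n x 2^n grid."""
    if size is None:
        size = len(grid)
    first = grid[row][col]
    if all(grid[r][c] == first
           for r in range(row, row + size) for c in range(col, col + size)):
        return first  # homogeneous: store once, no further subdivision
    half = size // 2
    return {
        "NW": build_quadtree(grid, row, col, half),
        "NE": build_quadtree(grid, row, col + half, half),
        "SW": build_quadtree(grid, row + half, col, half),
        "SE": build_quadtree(grid, row + half, col + half, half),
    }


def count_leaves(node):
    """Count the stored squares (leaves) in the quadtree."""
    if isinstance(node, dict):
        return sum(count_leaves(child) for child in node.values())
    return 1


grid = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 1, 0],
        [0, 0, 0, 0]]
tree = build_quadtree(grid)
print(tree, count_leaves(tree))
```

Here only the south-east quadrant is mixed, so it alone is split down to single cells, and the 16 cells are stored as 7 squares.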

One of the advantages of the raster model is that each cell can be subdivided into smaller cells
of the same shape and orientation. This unique feature of the raster data model has produced a
range of innovative data storage and data reduction methods based on the principle of
recursively subdividing space. The most popular of these is the area or region quadtree, which
recursively subdivides the cells in a raster image into quads (or quarters). The subdivision
process continues until each cell in the image can be classed as having the spatial entity either
present or absent within the bounds of its geographical domain. The number of subdivisions
required to represent an entity will be a trade-off between the complexity of the feature and the
dimensions of the smallest grid cell.

The quadtree principle is illustrated in the figure, where the division of the image region
depends on the resolution of the system, that is, the minimum mappable unit. Systems based
on quadtrees are therefore called variable resolution systems, because they can operate at any
level of quadtree subdivision. Thus users can decide how fine the resolution needs to be for
various manipulations and applications. In addition, because of the compactness of storage from
this method, a very large database, perhaps of a continental or even global scale, can be stored
in a single system.

VECTOR DATA MODEL


Vector data structures allow the representation of geographic space in an intuitive way
reminiscent of the familiar analog map. Geographic space is represented by the spatial
location of items, whose attributes are stored in another file for later access. As with the raster
spatial data model, there are many potential vector data models that can be used to store the
geometric representation of entities in the computer.

A point is the simplest spatial entity that can be represented in the vector world with topology.
To be topologically correct, a point needs a geographical reference system that locates it with
respect to other spatial entities. To have topology, a line entity must consist of an ordered set of
points (known as an arc, segment, or chain) with defined start and end points (nodes).
Knowledge of the start and end points gives a line direction. To create topologically correct area
entities, data about the points and lines used in their construction, and knowledge of how these
are connected to define the boundary, are required. A combination of points gives a line entity,
and a combination of points and line segments forms an area entity. The two basic types of
vector data models are (i) the spaghetti model and (ii) the topological model.

Spaghetti Model

The simplest vector data structure that can be used to reproduce a geographical image in the
computer is a file containing (x, y) coordinate pairs that represent the location of individual
point features. The figure below is essentially a one-for-one translation of the graphical image
or map, which is also termed the conceptual model.

Let us consider a conceptual model in which an analog map covering each graphic object is shown
in the figure. Each graphic object can be represented with a piece of spaghetti, and each piece
of spaghetti acts as a single entity. The shortest spaghetti represents a point; a collection of
point spaghettis forms a line entity; and a collection of line segments that meet at the
beginning and end of the area they surround forms an area entity. Each entity is a single logical
record in the computer, coded as a variable-length string of (x, y) coordinate pairs. Let us
assume that two polygons lie adjacent to each other in a thematic coverage. These two adjacent
polygons must have separate pieces of spaghetti for their adjacent sides. That is, no two
adjacent polygons share the same string of spaghetti. Each side of a polygon is uniquely defined
by its own set of lines and coordinate pairs. In this model of representing vector data, all the
spaghetti strings are recorded separately for each polygon, but in the computer the shared sides
should have the same coordinates.
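
A minimal Python sketch of the spaghetti model follows. The feature names (`well`, `road`, `parcel_A`, `parcel_B`), the coordinates and the helper functions are hypothetical and chosen only for illustration. Each entity is an independent coordinate string, so the side common to the two parcels is stored twice, and adjacency can be found only by comparing coordinates.

```python
# Each entity is one independent record: a variable-length list of (x, y) pairs.
# All names and coordinates here are hypothetical.
spaghetti = {
    "well": [(2.0, 3.0)],                                  # point
    "road": [(0.0, 0.0), (1.0, 1.0), (3.0, 1.0)],          # line
    "parcel_A": [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)],  # closed polygon ring
    "parcel_B": [(2, 0), (4, 0), (4, 2), (2, 2), (2, 0)],  # closed polygon ring
}


def edges(ring):
    """Return the undirected edges of a coordinate string as endpoint sets."""
    return {frozenset(pair) for pair in zip(ring, ring[1:])}


def shared_edges(name_a, name_b):
    """Find common sides by brute-force coordinate comparison.

    Nothing in the spaghetti records says the two polygons are neighbours;
    the relationship has to be computed every time it is needed.
    """
    return edges(spaghetti[name_a]) & edges(spaghetti[name_b])


common = shared_edges("parcel_A", "parcel_B")
print(common)  # the side from (2, 0) to (2, 2), stored once in each parcel
```

The duplicated side is harmless for drawing the map, but every spatial query has to rediscover relationships such as adjacency from raw coordinates.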


Topological Models:

To use the data manipulation and analysis subsystem efficiently, and to apply advanced
analytical techniques to GIS data in any project area, much spatial information must be created
explicitly. The topological data model incorporates solutions to some of the operations most
frequently used in advanced GIS analytical techniques. It does so by explicitly recording
adjacency information in the basic logical entity of topological data structures, the line
segment, which begins and ends where it contacts or intersects another line, or where the line
changes direction.

Each line then has two sets of numbers: a pair of coordinates and an associated node number.
The node is the intersection of two or more lines, and its number is used to refer to any line to
which it is connected. In addition, each line segment, called a link, has its own identification
number that is used as a pointer to the set of nodes that represent its beginning and ending.
These links also have identification codes that relate polygon numbers, showing which two
polygons are adjacent to each other along the link's length. In fact, the left and right
polygons are also stored explicitly, so that even this tedious step is eliminated. This design
feature allows the computer to know the actual relationships among all its graphical parts and
so to identify the spatial relationships contained in an analog map document.
SY

Fundamentally, the topological models available in GIS ensure (a) that no node or line segment
is duplicated, (b) that line segments and nodes can be referenced to more than one polygon,
and (c) that all polygons can be adequately represented. The figure below shows one possible
topological data structure for the vector representation. To understand the topological vector
data structure, let us consider a network with 8 nodes encoded as n1 to n8.


The links joining all these nodes are encoded as l1 to l14, and the polygons created by these
line segments/links are coded as A1 to A8. The creation of this structure for complex area
features is carried out in a series of stages. Burrough (1986) identifies these stages as
identifying a boundary network of arcs (the envelope polygon), checking polygons for closure,
and linking arcs into polygons. The area of polygons can then be calculated and unique
identification numbers attached. This identifier allows nonspatial information to be
linked to a specific polygon.
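
The link records described above (start node, end node, left polygon, right polygon) can be sketched in Python as follows. This is a deliberately reduced, hypothetical example with two nodes, three links and two polygons rather than the 8-node network of the figure; the field names, the `OUTSIDE` label and the helper functions are assumptions made for illustration.

```python
from collections import namedtuple

# One record per link: start node, end node, polygon on the left and on the
# right when travelling from start to end, plus the coordinate string.
Link = namedtuple("Link", "from_node to_node left_poly right_poly vertices")

OUTSIDE = "outside"  # assumed label for the area beyond the mapped polygons

nodes = {"n1": (2, 0), "n2": (2, 2)}

links = {
    # Shared side, stored once and referenced by both polygons.
    "l1": Link("n1", "n2", "A1", "A2", [(2, 0), (2, 2)]),
    # Remaining boundary of the left-hand square A1.
    "l2": Link("n2", "n1", "A1", OUTSIDE, [(2, 2), (0, 2), (0, 0), (2, 0)]),
    # Remaining boundary of the right-hand square A2.
    "l3": Link("n1", "n2", "A2", OUTSIDE, [(2, 0), (4, 0), (4, 2), (2, 2)]),
}


def neighbours(polygon_id):
    """Read adjacency directly from the stored left/right polygon codes."""
    result = set()
    for link in links.values():
        sides = {link.left_poly, link.right_poly}
        if polygon_id in sides:
            result |= sides - {polygon_id, OUTSIDE}
    return result


def boundary_links(polygon_id):
    """List the links that together form the boundary of a polygon."""
    return sorted(
        name
        for name, link in links.items()
        if polygon_id in (link.left_poly, link.right_poly)
    )


print(neighbours("A1"))      # {'A2'}
print(boundary_links("A2"))  # ['l1', 'l3']
```

Adjacency is answered by a table lookup instead of a coordinate comparison, and the shared side l1 is stored only once.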



| Feature | Raster Data Model | Vector Data Model |
|---|---|---|
| Definition | Represents geographic data as a grid of pixels (cells). | Represents geographic data using points, lines, and polygons. |
| Data Structure | Uses a matrix of cells (rows and columns) where each cell has a value. | Uses coordinate-based geometry to define spatial features. |
| Best Suited For | Continuous data (e.g., elevation, temperature, satellite images). | Discrete data (e.g., roads, boundaries, buildings). |
| Data Representation | Each cell has a value representing an attribute (e.g., land cover type, temperature). | Objects are defined by vertices and paths (e.g., a river as a polyline, a city as a point). |
| File Size | Larger file size due to the need to store every pixel. | Smaller file size as it stores only coordinate points and attributes. |
| Resolution | Depends on pixel size; finer resolution means more detail but larger file size. | Independent of resolution; maintains high accuracy at different scales. |
| Spatial Accuracy | Lower accuracy, as data is generalized to fit within a pixel. | Higher accuracy, as features are defined with precise coordinates. |
| Data Processing Speed | Faster for overlay and mathematical operations (e.g., elevation analysis, NDVI). | Slower for complex spatial analysis but efficient for queries and topology. |
| Storage Format | Common formats: GeoTIFF (.tif), JPEG2000 (.jp2), GRID. | Common formats: Shapefile (.shp), GeoJSON (.geojson), KML (.kml). |
| Analysis Type | Good for spatial analysis like terrain modeling, land use classification, and remote sensing. | Best for network analysis, topology-based operations, and feature-based mapping. |
| Scalability | Becomes pixelated when zoomed in beyond resolution limit. | Maintains quality at all zoom levels due to coordinate-based representation. |
| Example Use Cases | Satellite imagery, elevation models, rainfall distribution maps. | Road networks, administrative boundaries, land parcels, utility networks. |
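
The resolution and file-size trade-off of the raster model can be illustrated with a short Python sketch. The function names, the origin at (0, 0) with rows counted upward from it, and the sample coordinates are all assumptions made for this example. A vector point keeps its exact coordinates, whereas a raster can only place it somewhere within a cell.

```python
import math


def to_cell(x, y, cell_size):
    """Return the (row, col) of the cell containing (x, y).

    Assumes the grid origin is at (0, 0) and rows are counted upward from it.
    """
    return int(y // cell_size), int(x // cell_size)


def cell_centre(row, col, cell_size):
    """Return the (x, y) centre of a cell, the best position a raster can report."""
    return ((col + 0.5) * cell_size, (row + 0.5) * cell_size)


def cell_count(width, height, cell_size):
    """Number of cells needed to cover a width x height area."""
    return math.ceil(width / cell_size) * math.ceil(height / cell_size)


point = (12.3, 47.8)            # hypothetical vector point, stored exactly
row, col = to_cell(*point, 10)  # 10-unit cells
print((row, col))               # (4, 1)
print(cell_centre(row, col, 10))  # (15.0, 45.0): position generalized to the cell

print(cell_count(1000, 1000, 10))  # 10000 cells
print(cell_count(1000, 1000, 5))   # 40000 cells: half the cell size, four times the cells
```

Halving the cell size improves positional detail but quadruples the number of stored cells, which is the resolution and file-size relationship listed in the table.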


TIN vs GRID Data Models

| Feature | TIN (Triangulated Irregular Network) | GRID (Raster/Grid Data Model) |
|---|---|---|
| Definition | A vector-based model that represents terrain using a network of irregularly spaced triangles. | A raster-based model that represents terrain using a regular grid of square cells (pixels). |
| Data Structure | Uses irregularly spaced points connected by triangles to form a continuous surface. | Uses a regular matrix of cells (rows and columns) where each cell has a single elevation value. |
| Best Suited For | Variable-resolution terrain data where details are needed in specific areas. | Uniform-resolution terrain data for continuous surface analysis. |
| Data Representation | Represents elevation as triangles with vertices at known points. | Represents elevation as pixels with a value for each cell. |
| Accuracy | Higher accuracy due to adaptive resolution (denser in complex areas, sparser in flat areas). | Lower accuracy in areas with high variation since the resolution is fixed. |
| Storage Size | Smaller file size since data is stored only where needed. | Larger file size since all cells are stored, even in flat areas. |
| Resolution | Variable resolution; adapts to terrain complexity. | Fixed resolution; same grid size everywhere. |
| Computational Efficiency | Faster for terrain modeling and 3D visualization due to adaptive sampling. | More efficient for grid-based spatial analysis like hydrology and land use classification. |
| Surface Representation | Creates a network of triangles that can represent terrain in 3D. | Creates a grid-based 2.5D surface, less smooth than TIN. |
| Common Applications | 3D terrain modeling, contour generation, engineering applications, LiDAR data processing. | DEM (Digital Elevation Model), remote sensing, hydrological modeling, land cover classification. |
| File Formats | Common formats: .tin, .shp (with TIN structure). | Common formats: GeoTIFF (.tif), GRID, ASCII raster (.asc). |
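
The difference in how the two models return an elevation can be sketched in Python. This is an illustrative sketch under stated assumptions, not the method of any specific GIS package: `tin_elevation` uses standard linear (barycentric) interpolation across one triangle, `grid_elevation` simply reads the value of the containing cell, and the sample triangle, grid and origin convention are hypothetical.

```python
def tin_elevation(p, triangle):
    """Linearly interpolate z at p = (x, y) inside a triangle.

    `triangle` holds three (x, y, z) vertices; the weights are the
    barycentric coordinates of p with respect to those vertices.
    """
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    x, y = p
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    return w1 * z1 + w2 * z2 + w3 * z3


def grid_elevation(p, grid, cell_size):
    """Return the single elevation value of the cell containing p.

    Assumes the grid origin is at (0, 0) and row 0 starts at y = 0.
    """
    x, y = p
    return grid[int(y // cell_size)][int(x // cell_size)]


# Hypothetical data: one TIN facet and a 2 x 2 grid with 10-unit cells.
facet = [(0, 0, 100.0), (10, 0, 110.0), (0, 10, 120.0)]
dem = [
    [100.0, 110.0],
    [120.0, 130.0],
]

z_tin = tin_elevation((2, 3), facet)     # close to 108.0, varies across the facet
z_grid = grid_elevation((2, 3), dem, 10)  # 100.0 everywhere inside the cell
print(z_tin, z_grid)
```

Within a TIN facet the elevation changes continuously between the surveyed vertices, while a grid cell reports one value for its whole extent.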

******
