Lesson 4. Spatial Data Input and Editing
Introduction
Collecting data and creating a GIS database is a time-consuming but important
task. There are many sources of geographic data and many ways to enter that data
into a GIS. A data pool can be generated by either data capture or data transfer.
Data sources are divided into two main classes: (1) primary data and
(2) secondary data.
Learning Outcomes
Upon completion of this lesson, the students will be able to:
1. Differentiate primary from secondary data.
2. Discuss the different data errors.
ACTIVITY
Please refer to the attached activity.
Activity No. 6 Working with Raster Data
Activity No. 7 Determination of Flood Prone Areas
ANALYSIS
1. What are the common problems faced when obtaining data from secondary
sources?
2. What are the three main types of data error?
3. Will a repeated generalisation make the boundary of a polygon more precise?
Explain.
ABSTRACTION
A. Primary Data
GIS 205 – GIS and Remote Sensing
Both satellite images and aerial photographs can provide stereo imagery
from overlapping pairs of images, i.e., they can be used to generate a
three-dimensional model of the earth’s surface. Other advantages include
global coverage and repetitive monitoring, which make these datasets useful
for large-area projects and short-lived events.
Sampling
Since it is neither practical nor worthwhile to observe the value of a
variable at every point in the study area, we adopt a sampling strategy.
With sampling, we measure a subset of the features in the area that best
captures the spatial variation of the attribute of interest over the study
area. The following five sampling patterns may be considered:
a. Simple random
This method ensures that all parts of the project area
have an equal chance of being sampled. The project area
is divided into a grid with numbered coordinates. A
random site is picked by selecting coordinate pairs
from a random number table and plotting them on the
project area map. Each random site is a sample point.
Figure 25. Simple random pattern
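The simple random pattern can be sketched in a few lines of Python. This is
only an illustration: it assumes a 10 x 10 grid, and Python's pseudo-random
generator stands in for the printed random number table. The function name,
grid size, and seed are hypothetical choices, not part of the lesson.

```python
import random


def simple_random_sample(n_cols, n_rows, n_points, seed=None):
    """Pick n_points distinct grid cells (col, row) uniformly at random."""
    rng = random.Random(seed)  # seeded so the draw can be repeated
    cells = [(col, row) for col in range(n_cols) for row in range(n_rows)]
    # sample() draws without replacement, so no site is picked twice
    return rng.sample(cells, n_points)


# Assumed example: five sample sites on a 10 x 10 grid
points = simple_random_sample(n_cols=10, n_rows=10, n_points=5, seed=42)
print(points)
```

Every cell has the same chance of selection, but nothing prevents the points
from bunching in one part of the map.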
b. Stratified random
This pattern maintains randomness while avoiding
the risk of an uneven distribution of points among
the map classes. A specific number of sample points
is assigned to each class according to its size and
its significance for the project. Within a class, the
random sites are generated in the same way as in
the simple random pattern.
Figure 26. Stratified random pattern
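A minimal sketch of the stratified random pattern follows. It assumes that
sample points are allocated in proportion to class area only (the
"significance" weighting mentioned above is left out), and it uses a small
made-up class map with the hypothetical labels F, W, and A.

```python
import random


def allocate_samples(class_areas, total_points):
    """Split total_points among classes in proportion to their area."""
    total_area = sum(class_areas.values())
    exact = {k: total_points * a / total_area for k, a in class_areas.items()}
    counts = {k: int(v) for k, v in exact.items()}
    leftover = total_points - sum(counts.values())
    # Hand the remaining points to the classes with the largest fractions
    by_fraction = sorted(exact, key=lambda k: exact[k] - counts[k], reverse=True)
    for k in by_fraction[:leftover]:
        counts[k] += 1
    return counts


def stratified_random_sample(class_cells, counts, seed=None):
    """Draw the allocated number of random cells inside each class."""
    rng = random.Random(seed)
    return {k: rng.sample(class_cells[k], counts[k]) for k in counts}


# Assumed 6 x 4 class map; each character is the class of one grid cell
class_map = ["FFFFWW", "FFFAWW", "FFAAAW", "FAAAAA"]
class_cells = {}
for row, line in enumerate(class_map):
    for col, label in enumerate(line):
        class_cells.setdefault(label, []).append((col, row))

class_areas = {label: len(cells) for label, cells in class_cells.items()}
counts = allocate_samples(class_areas, total_points=8)
samples = stratified_random_sample(class_cells, counts, seed=42)
print(counts)  # {'F': 3, 'W': 2, 'A': 3}
```

Here the smallest class (W, 5 of 24 cells) still receives two points, which
a purely random draw over the whole map would not guarantee.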
c. Systematic
It arranges sample points at equidistant intervals,
thus forming a grid. The orientation of the grid is
chosen randomly.
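As a rough sketch of the systematic pattern, the snippet below lays a square
grid of points over a rectangular project area and rotates it by a random
angle about the centre. The rectangular area, the units, and the function
name are assumptions made for illustration.

```python
import math
import random


def systematic_sample(width, height, spacing, seed=None):
    """Equidistant grid of points with a randomly chosen orientation."""
    rng = random.Random(seed)
    angle = rng.uniform(0.0, math.pi / 2)  # random grid orientation
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    cx, cy = width / 2, height / 2
    # Generate enough grid steps to cover the area at any rotation
    reach = int(math.hypot(width, height) / (2 * spacing)) + 1
    points = []
    for i in range(-reach, reach + 1):
        for j in range(-reach, reach + 1):
            gx, gy = i * spacing, j * spacing
            x = cx + gx * cos_a - gy * sin_a
            y = cy + gx * sin_a + gy * cos_a
            if 0 <= x <= width and 0 <= y <= height:  # keep points inside
                points.append((x, y))
    return points


# Assumed example: a 100 x 60 area sampled every 20 units
points = systematic_sample(width=100, height=60, spacing=20, seed=1)
print(len(points), "sample points")
```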
d. Systematic unaligned
It divides the project area into a grid and assigns
the positions of sample points randomly within the
grid cells.
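The description above can be sketched as follows, assuming one sample point
per grid cell placed at a uniformly random position inside that cell. The
one-point-per-cell rule and the function name are assumptions; the lesson
does not state how many points each cell receives.

```python
import random


def systematic_unaligned_sample(width, height, cell_size, seed=None):
    """One randomly positioned sample point inside every grid cell."""
    rng = random.Random(seed)
    n_cols = int(width // cell_size)
    n_rows = int(height // cell_size)
    points = []
    for row in range(n_rows):
        for col in range(n_cols):
            # Random offset inside the cell keeps the points unaligned
            x = (col + rng.random()) * cell_size
            y = (row + rng.random()) * cell_size
            points.append((x, y))
    return points


# Assumed example: a 100 x 60 area with 20-unit cells gives 5 x 3 = 15 points
points = systematic_unaligned_sample(width=100, height=60, cell_size=20, seed=7)
print(len(points))  # 15
```

The grid guarantees even coverage, while the random offsets avoid lining the
points up with regular features such as roads or field boundaries.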
e. Clustered
In this method, nodal points are the centers for
clusters of sample points. The nodal locations are
selected randomly, stratified by classes, or by
identification of accessible sites.
Advantage: In terrain with poor access, the operator can make the
most of accessible sites.
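A minimal sketch of the clustered pattern is given below. It assumes that the
nodal points are chosen randomly (the first of the three options above) and
that each cluster is a set of points within a fixed radius of its node. The
radius, counts, and function name are hypothetical.

```python
import math
import random


def clustered_sample(width, height, n_nodes, points_per_node, radius, seed=None):
    """Random nodal points, each surrounded by a cluster of sample points."""
    rng = random.Random(seed)
    clusters = []
    for _ in range(n_nodes):
        # Keep each node far enough from the edge that its cluster fits inside
        node_x = rng.uniform(radius, width - radius)
        node_y = rng.uniform(radius, height - radius)
        members = []
        for _ in range(points_per_node):
            # The square root spreads points evenly over the disc
            dist = radius * math.sqrt(rng.random())
            theta = rng.uniform(0.0, 2 * math.pi)
            members.append((node_x + dist * math.cos(theta),
                            node_y + dist * math.sin(theta)))
        clusters.append(((node_x, node_y), members))
    return clusters


# Assumed example: 3 nodes with 4 sample points each, within 5 units of the node
clusters = clustered_sample(100, 60, n_nodes=3, points_per_node=4, radius=5, seed=3)
for node, members in clusters:
    print(node, len(members))
```

In practice the nodes would often be replaced by a list of known accessible
sites rather than random locations.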
B. Secondary Data
Secondary data refers to data obtained from maps, hardcopy documents, etc.
Some of the methods used to capture secondary data are as follows:
Scanned data: A scanner converts an analog source map or document into a
digital image by scanning successive lines across it and recording the
amount of light reflected from the source. Documents such as building
plans, CAD drawings, images and maps are scanned prior to vectorization.
Scanning reduces wear and tear on the originals, improves access, and
provides integrated storage.
There are three different types of scanner that are widely used:
o Flatbed scanner
o Rotating drum scanner
o Large format feed scanner
Vectorization
Vectorization is the process of converting a raster image into a vector image. It is
a faster way of creating vector data than manual digitizing. Automatic vectorization
is performed in either batch or interactive mode. Batch vectorization takes one
raster file and converts it into vector objects in a single operation;
post-vectorization editing is then required to remove errors. In interactive
vectorization, software is used to automate digitizing: the operator snaps the
cursor to a pixel and indicates the direction in which the line is to be digitized,
and the software then digitizes the line automatically. The operator can set
various parameters, such as the density of points, whether to pause at junctions
for the operator’s intervention, or whether to trace in a specific direction.
Although the process still involves manual labor, it produces high-quality data
with greater productivity than manual digitizing.
Obtaining data from external sources: Creating the same dataset multiple times
for the same area is a time- and resource-intensive process. Data can instead be
imported from data repositories. Some of these are freely available, while others
are available at a price. The internet is often the most practical way to search
for geographic data, since it provides information about geographic data catalogs
and vendors. National agencies of a state or country also disseminate geographic
data through their web portals or, on request, through other digital media.
C. Data Editing
Errors affect the quality of GIS data. Once the data has been collected and
prepared for visualization and analysis, it must be checked for errors.
Burrough (1986) divided the sources of error into the following categories:
1. Common sources of error
2. Errors resulting from original measurements
3. Errors arising through processing
Lack of data: The data for a given area may be incomplete or entirely
lacking. For example, the land-use map for border regions may not be
available.
Map scale: The details shown on a map depend on the scale used. Maps
or data at a scale appropriate to the level of detail required must be used
for the project. Using the wrong scale would make the analysis erroneous.
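The effect of scale can be made concrete with a little arithmetic: a distance
on the map multiplied by the scale denominator gives the distance on the
ground. The sketch below assumes a drawn line 0.5 mm wide, an illustrative
figure that does not come from this lesson, and shows how much ground that
line covers at three common scales.

```python
def ground_distance_m(map_distance_mm, scale_denominator):
    """Ground distance in metres for a map distance given in millimetres."""
    return map_distance_mm * scale_denominator / 1000.0


LINE_WIDTH_MM = 0.5  # assumed width of a drawn line, for illustration only

for denominator in (10_000, 50_000, 250_000):
    metres = ground_distance_m(LINE_WIDTH_MM, denominator)
    print(f"1:{denominator:,} -> {metres:g} m")
# 1:10,000 -> 5 m
# 1:50,000 -> 25 m
# 1:250,000 -> 125 m
```

A feature narrower than the ground width of a single line cannot be located
reliably at that scale, which is one reason a map at the wrong scale leads to
erroneous analysis.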
o Sliver: A narrow gap created between two adjacent polygons when
snapping is not used while digitizing them.
These errors can be corrected using the constraints or rules that are
defined for the layers. Topology rules define the permissible spatial
relationships between features. To learn the rules, see Topology Rules.
Raster data editing is concerned with correcting the specific contents of raster
images rather than their general geometric characteristics. The objective of the
editing is to produce an image suitable for raster geoprocessing. The following
editing functions are most commonly used for raster data editing:
Filling holes and gaps: To fill holes and gaps that appear in the raster image
Filtering: To remove speckles, i.e., random high- or low-valued pixels in the
image
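The two editing functions above can be sketched on a small raster stored as a
list of rows. This is only one possible approach, assuming a 3 x 3
neighbourhood and the median as the replacement value. The `NODATA` marker
and the function names are hypothetical, and GIS packages provide their own
tools for both operations.

```python
from statistics import median_low

NODATA = None  # hypothetical marker for a missing cell (a hole)


def neighbours(grid, r, c):
    """Values of the up-to-8 cells around (r, c), skipping NODATA cells."""
    values = []
    for rr in range(max(0, r - 1), min(len(grid), r + 2)):
        for cc in range(max(0, c - 1), min(len(grid[0]), c + 2)):
            if (rr, cc) != (r, c) and grid[rr][cc] is not NODATA:
                values.append(grid[rr][cc])
    return values


def fill_holes(grid):
    """Replace each NODATA cell with the median of its valid neighbours."""
    out = [row[:] for row in grid]
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value is NODATA:
                values = neighbours(grid, r, c)
                if values:  # leave the hole if it has no valid neighbour
                    out[r][c] = median_low(values)
    return out


def median_filter(grid):
    """3 x 3 median filter; assumes holes have already been filled."""
    out = [row[:] for row in grid]
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            out[r][c] = median_low(neighbours(grid, r, c) + [value])
    return out


# Assumed example: one hole at (1, 1) and one bright speckle at (2, 2)
raster = [
    [10, 10, 10, 10],
    [10, NODATA, 10, 10],
    [10, 10, 255, 10],
    [10, 10, 10, 10],
]
cleaned = median_filter(fill_holes(raster))
for row in cleaned:
    print(row)
```

After both steps every cell in this example holds the value 10: the hole is
filled from its neighbours and the isolated value 255 is replaced by the
local median.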
Vector data editing is a post-digitizing process that ensures that the data is
free from errors. This means that:
All polygons are closed and each of them contains a label point
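The polygon check can be sketched as follows, assuming each polygon is stored
as a list of (x, y) vertices whose first and last vertex coincide when the
ring is closed. The ray-casting test used for the label point is a standard
technique chosen here for illustration, and the function names are
hypothetical.

```python
def is_closed(ring):
    """A ring is closed when it ends on the vertex where it started."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def contains(ring, point):
    """Ray-casting test: is the point inside the closed ring?"""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:  # crossing lies to the right of the point
                inside = not inside
    return inside


def polygon_is_valid(ring, label_point):
    """Closed polygon that contains its label point."""
    return is_closed(ring) and contains(ring, label_point)


# Assumed example: a closed square, and the same square left open
square = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
open_square = square[:-1]

print(polygon_is_valid(square, (2, 2)))       # True
print(polygon_is_valid(square, (5, 5)))       # False, label lies outside
print(polygon_is_valid(open_square, (2, 2)))  # False, ring is not closed
```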
APPLICATION
Please refer to the attached activity.
Activity No. 6 Working with Raster Data
Activity No. 7 Determination of Flood Prone Areas
Closure
You have finished the lesson on the common sources of spatial data and the common
errors in mapping with GIS. Always remember that in mapping, data from a
reliable source is a PRIORITY AND REQUIREMENT. In the next lesson, we will
look at the importance of setting the correct coordinate reference system to
get correct measurements in our maps. If you are ready, then let’s go!