Database Theory

Download as pdf or txt
Download as pdf or txt
You are on page 1of 30

What are the limitation of flat-file databases?

In a flat-file database all the data is held in a single table and there are
several problems associated with this. Consider the following scenario:

 A doctor wishes to use a database to keep track of his patients’ test


results. It needs to record:
 patient name
 patient age
 test type
 test results.
 A flat-file database table set up to store the data might look like the
one below (the key field has been ignored).

Duplicated data – highlights show typing errors made during data entry
The problems that can then occur:

Data redundancy: The limitations of trying to represent this information in


a single file quickly become apparent. The first row represents the Pulse
reading for patient “Kit Kline”. The second row represents a different test
but requires us to duplicatethe information (forename, surname, age)
about the patient. This continues for each test result. This makes the size
of the database unnecessarily large and the process of data entry
cumbersome and repetitive. The process of unnecessarily duplicating data
is called data redundancy.

Data inconsistency: Note the data entry errors highlighted in the above
table. Such data inconsistency occurs when large amounts of duplicate
data have to be entered into a database. By duplicating data across
records in flat files we introduce a risk of data inconsistency.

SUMMARY: The problems with flat-file database systems are that they can
lead to:
 Data redundancy – the needless duplication of data
 Data inconsistency – errors in the values or format of data that
should be identical

What is a relational database?


The key difference between a relational database and a flat-file database is
that in a relational database the data is grouped into entities and stored
in multiple linked tables.

What is an entity and how are they related to database tables?

An entity is a “real world thing” about which data can is held in a database.
Examples of entities include:

 Customers
 Products
 Pupils
 Suppliers
 Loans
 Videos/DVD’s
 Flights
 Employees
 Treatments
 Contracts
 Library books
 Cars
 Orders
 Zoo animals
 Rentals

In a relational database, each entity corresponds to a separate table in


the database.

What is an attribute and how are they related to database tables?

Attributes are the facts, aspects, properties, or details about an entity.


Examples of attributes include:

 Entity – Library books. Attributes – ISBN number, author, category


etc.
 Entity – Flights. Attributes – Flight No., aircraft type, departure/arrival
date/time, destination etc.
 Entity – employee. Attributes – Name, gender, address, DOB,
qualifications, job title etc.

In a database, each attribute corresponds to a separate field in the


database.

In general, entities can have attributes but attributes have no smaller parts.
An attribute belongs to a single entity.

What is a key field and how are they used to link database tables?

The key field (sometimes called a primary key) is used to store an attribute
that makes that particular entity entry in the database unique. For example,
a NHS number, a passport number, a vehicle registration, a booking
reference, a flight number etc.

The tables in a relational database are linked by the key field in one
table being added as a foreign key in another table to form a
relationship between the entities.

Consider a library database. Entities such as book, author and library


memberswould all have separate but linked tables.

A diagram
of a simplified relational database showing the tables linked by the key fields
The book table would only have details about the book itself while details of
the author would be stored in a separate table. A further table would hold
details of the library members who could borrow books.
Another further table would also be needed to store the details of the
actual loans, such as the lending data and the return date.

This loan table would not need to store details of the book, the author or
the borrower, just the links to such data in the other tables.

The simplified diagram shows how the key fields in each table are used to
link each entity.

What are the advantages of relational databases?


 No data redundancy – in a well-designed relational database there
should be no duplicated data (other than the key field).
 No data inconsistency – as data is not duplicated, there is no risk of
the same data item being stored differently in another record.
 Flexibility – A relational database can be queried with greater
flexibility than a flat-file system. Relationships mean that data can be
combined in a variety of ways to produce the views that different areas
of an organisation require.

What are the components of a relational database?


The components of a relational database are:

 Tables with Records and Fields


 Relationships between Entities (tables) using Primary
Keys and Foreign keys
 Forms
 Query facilities
 Reporting facilities
 Modules

What is a database table?

In a relational database, each table represents the attributes of one


entity. Database tables are made up of record and fields.
The components of a database table
What is a database record?

Each record in a relational database table corresponds to an instance of


an entity. In the example table of ‘Students’ above there are 5 records (the
top row containing the field names is not a record). Each record
corresponds to an instance of a Student entity, i.e. each row represents an
individual student. Note that although there are two students called Philip
Barker with the same date of birth, they have different Student IDs and are
therefore considered to be different students.

What is a database field?

An attribute is a piece of information or a characteristic of an entity.


Attributes of entities are represented in database tables
by fields (columns). A field stores one item of data for a record.

In the table above, each student is represented in the relational database


by a record and the student attributes are stored in the following fields:

 Student ID
 Forename
 Surname
 Date of Birth

For example, the student with Student ID 6777 is called Cynthia


Ferguson and she was born on 3rd January 1996.
Fields have the following characteristics:

 Each field in a table has a unique name.


 Each field stores a single item of data – For example, a field called
Date of Birth would store no more than one date of birth value.
 Each field has a particular data type – for example, text, Boolean,
integer, date/time, etc.
 Each field can have its own validation rules – these ensure that data
recorded meets certain rules.

What is a key field?

In a relational database each entity must have one or more


fields that uniquely identify that entity. Sometimes an existing attribute
can be used because it is unique but most of the time some sort of numeric
ID is created. This unique identifier is called the key field or primary key.
In the Student database table above the Student ID is the Key Field,
needed because there is no other unique item of data about each student.
Sometimes one or more fields can be combined to make a primary key, for
example a cinema reservation could be uniquely defined by a particular
seat number and a particular screening.

How is the key field used to connect tables and avoid data
redundancy?

To link tables, the key field in one table must appear as a foreign key field
in the other table. This is best explained with an example, in this case a
simplified relational database for a doctor’s surgery.

Two tables in a
relational database showing the link between them
There are two entities, ‘Patient’ and ‘Test Results’ so two tables are
created.
 ‘Forename’, ‘Surname’ etc. are examples of attributes in the
‘Patient’ entity.
 ‘Test Performed’ and ‘Test Result’ are examples of attributes in the
‘Test Result’ entity.

As neither patients or treatments have any unique attributes, each entity is


given a numeric ID as the key field. All the data associated with the patient
is now held in one table and all the data associated with test results is held
in another.

In order to create a relationship between the two tables, a link is created


by adding the key field in the Patient Table as a foreign key field in the
Test Result Table. This means that when a new test result is added to the
test result table, the only data about the patient that needs to be added is
the Patient ID.

The example below shows how the two tables might look with some real
data.

Data stored in two tables in a relational database, showing the linked field
As can be seen, the Patient ID key field makes a link between the two
tables and, apart from the foreign key field, there is no duplication of data
so no data redundancy. If more details of the patient who was tested are
needed they can easily be accessed because of the relationship between
the two tables of data.

Problems with data inconsistency and data integrity are also avoided
because there is only one set of data.

What types of relationships can there be in a relational database?


There are three different types of relationship between entities:

 One-to-one – e.g. Husband and wife. A husband can have one wife
and a wife can have one husband
 One-to-many – e.g. Political party and politician. A political party has
many politicians, but a politician belongs to only one political party
 Many-to-many – e.g. Actor and film. An actor stars in many films and
a film can have many actors

Each of the above relationships can be shown in an Entity Relationship


diagram(E-R diagram) such as the ones below:

Diagram showing the


three different types of entity relationships
The diagram below shows an E-R diagram drawn to represent the entities
of a relational database planned to store details of student and the
resources that they have borrowed from their school library (books, videos,
music, etc).

This diagram shows


that there is a one-to-many relationship between students and their library loans.
The Student entity is at the one end of the relationship while the Library
Loan entity is at the many end (the end of the link that looks like “crow’s
feet”). The relationship between instances of Student entities and instances
of Library Loan entities can be stated as:
When converted into database tables, each entity needs its own table
containing its own fields (attributes). All of the student attributes (name,
date of birth, etc) will be held in a ‘Students’ table while a ‘Library Loans’
table will hold all the data associated with students’ library loans. See the
tables below:

Database table showing primary keys


The Students table has a field called Student ID as its key field, while the
Library Loans table has a field called Loan ID as its key field. Because the
key field is a field that has to have a unique value for each record in the
table, the ‘Student ID’ value is used to uniquely identify any student.
Similarly, each Library Loan record is uniquely identified by the ‘Loan ID’
field of the Library Loans table.

Note that database tables are often designed so that when a new record is
created, the key field entry is generated automatically as an incremental
number (e.g. in Microsoft Access you can assign a data type of
“AutoNumber” to a field). Sometimes however there is a more natural
choice of unique data which can be used instead, e.g. “Car Registration
Number”, “National Insurance Number” or “Product Serial Number”.

In order to create a relationship between two tables the primary key of one
of the tables must be added as a field to the other table where it will is
becomes the foreign key.

In the case of the library database, the shared field would be the ‘Student
ID’ field, i.e. the primary key of the Students table. It is selected because,
as a general rule, in one-to-many relationships between entities,
the primary key of the entity that is on the one end of the relationship is
used as the foreign key field of the entity that is on the many end of the
relationship (see the E-R diagram below).

After the Student ID field is added to the Library Loans table, the tables
would now look like the one below.
Database relationships
showing primary and foreign keys
By appearing in both tables, the ‘Student ID’ field has created a relationship
between the two tables. It acts as the primary key field in the Students
table and as a foreign key field in the Library Loans table.

Unlike the ‘Loan ID’ field, the ‘Student ID’ field can occur in more than one
record of the Library Loans table because a student may have borrowed
more than one resource from the library.

The diagram below shows example data in the Students and Library Loans
tables. Notice how the values of the primary key in the Students table
match the value of the foreign key in the Library Loans table. This creates a
link between the two tables and permits questions such as “What loans has
student 6783 borrowed?” to be answered.

Data entries in a relational database


Notice how every library loan record is associated with a student, however
not every student has a library loan. When a user inputs details of a new
library loan, their entry will be rejected if they do not also associate the loan
with an existing student (i.e. a valid Student ID). This is known as ensuring
referential integrity and is one of the methods of validating data on entry.
SUMMARY:

 The use of foreign keys enables tables to be linked to form


relationships. A foreign key is a field in a table that is also a key
field in another table.
 The relationships remove data redundancy, i.e. the need for data to
be duplicated (as happens in the case in flat-file databases).
 To create a relationship between two tables, we use the key
field from one of the tables and place it as a foreign key in the other
table.
 A key field is a field that is guaranteed to have a unique value for
each record in the table.
 Foreign key relationships are used to ensure the referential
integrity of data in a database.

What are forms?

A form is used for data INPUT and for displaying and editing selected
data, rather than having to use a database table directly. When edited, the
changes update the database table.

Forms offer many advantages over tables as a user interface:

 Data can be laid out in a meaningful way with clear labels rather
than just columns of data and field names.
 Font colour, type and size can be customised to improve clarity.
 Graphics such as boxes and lines can be used to group and separate
information.
 Images can be included on forms, both as fields and as graphics such
as logos.
 Forms can be tailored to accept only certain information or only
information in the correct format.
 Drop-down list boxes can speed up data input as well as reducing
errors.
 Forms can include control buttons to navigate to other records or
forms, close forms, open help screens or print out information.
A typical database form
In the GCSE Computing exam, you might well be asked to design a layout
for a form. Make sure you take note of the following:

 always include a title that clearly describes the purpose of the form;
 arrange the fields in a logical order, grouping similar fields if
necessary;
 make the field labels as clear as possible (they do not have to
exactly match the field names);
 position each field label carefully, either above or to the left of the
field;
 make sure the field label and the field itself are easily
distinguished(showing the field as a box with the field name inside is
the best option);
 if the form uses data from multiple tables, make sure you include all
the fields relevant to the form;
 optional – add symbols to show drop down lists and navigation
buttons (but label these clearly if you include them);
 if an image field is required, think about the orientation (portrait or
landscape) and don’t forget to add a label;
 make sure you fill the box you are given to create your design in and
spread things out.
A typical examination question and design sketch answeral
What is a query?

The interrogation of a database is called querying the database and a


question used to interrogate the data is called a query.

There are many reasons why users may wish to query data, including:

 To identify a group of records that share a certain attribute – e.g. a list


of students with nut allergies, products from a particular supplier, etc.
 To calculate totals based on the information held in records – e.g.
calculating the total value of the assets held by a company.
 To update the details of a specific record or group of records.
 To view data in particular combinations etc.

The above list shows that queries are a means of producing information
from data. This information is used by the decision makers in
organisations to plan strategies and tactics. Databases usually allow users
to create and reuse their own queries.
A query design specifies which records the user is searching for and what
fields to display out of those records. There are two types of query:

 Simple query – looking for data in one field only.


 Complex query – looking for data in multiple fields and across
linked tables.

The example of a simple query below uses an Microsoft Access database


table and a query to find the names of all students who are in form 10B.
This involves the following steps:

 Adding the Students table to the query;


 Selecting the fields to display;
 Adding the search criteria into the correct field.

A simple query – showing the design view and the result


A query can also be written using a query language such
as SQL (Structured Query Language). The above example would be
written as: SELECT Surname, Forename, Form FROM Students
WHERE Form = "10B"

For the GCSE Computing examination, this could be simplified to Form =


"10B"

Further examples of queries on this data could be:

 (Form = "10A") AND (Gender = “Female”)


 (Form = "10A") OR (Form = "10B")
 (Gender = "Male") AND (Date of Birth < 01/01/1995)
 (Student ID > 1100) AND (Student ID < 1200)

What are reports?

A
simple report from a database
A report is used for data OUTPUT. It can either be displayed on a screen or
more typically be printed.

A printed report is therefore a snapshot of the data in the database at a


particular point in time.

Reports offer a number of advantages over just displaying or printing out a


database table or query result:

 Query results or data from linked tables can be presented in a


professional manner, using different fonts and colours;
 Reports can be used to calculate totals, averages and other
aggregated data;
 Reports can group, sort and summarise data to give information that
is useful for the decision makers in organisations;
 Reports can include images and and charts generated from the data
in the database.

Most relational databases include facilities for reports to be generated


from table data or from the results of queries on data held in the
database.

Reports can include charts and formatting to make their information easy
to assimilate. As with forms, reports can be customised to include the
logos, colours and font styles of an organisation.

Different types of reports from databases


In the GCSE Computing exam, you might well be asked to design a layout
for a report. Make sure you take note of the following:

 always include a title that clearly describes the purpose


and time/date of the report;
 each field forms a column, a bit like in a database table. Label each
column clearly, then position each field underneath it;
 if the report uses data from multiple tables, make sure you include all
the fields relevant to the report;
 do not include fields that do not relate to the report, just because
they are in the tables described in the question;
 show any grouping or sorting that you think should be built into the
report;
 make sure you fill the box you are given to create your design in,
spread things out.
A typical examination question and design sketch answer
What data types are there in a database?

When a database is designed, all the fields are set to accept a particular
data type. This helps check for the wrong type of data being entered
(validation). For example, a database will not allow you to store text in a
field of data type: ‘integer’.

Data types also help to make sure that the data is stored as efficiently as
possible. For example a field of data type ‘integer’ requires less storage
space to hold the value 2 compared a field of data type “real”.

Data types also help ensure that data can be sorted correctly. For example,
if the integer values 1, 2, 3 and 20 are stored in a number field then they
can easily be sorted correctly. However, if the same values were stored in
a text/alphanumeric field as text then they would sort in the alphabetical
order: 1, 2, 20, 3. In the same way, date fields allow data to be sorted by
year, then month and then day.

Text/String/Alphanumeric
A field of data type text will accept both numbers and text.

 When to use: Use text data types if you want to enter text or a
mixture of text and numbers into a field.
 When NOT to use: if you want to store only numbers or only dates.
You can enter numbers or dates into text fields but they would get
sorted as if they were words and could not easily be used in number
or date calculations. They would also be more difficult to validate
using range checks.
 Note, telephone ‘numbers’ are always stored as text, this allows a
leading zero, spaces, brackets etc.

Number

Numeric fields are used to store numbers. This is because:

 The data will be sorted correctly: the numbers 2, 3, 20, 30 would be


sorted as: 2, 20, 3, 30 if they were stored in a text field.
 Validation rules such as range checks can used (e.g. a range check
such as BETWEEN 10 and 20)
 Arithmetic operations can be carried out on number fields.

There are two types of numeric field you need to know


about: Real and Integer.

 Real: This is for numbers which can have decimal places. For
example:
 132.01
 45.7
 9.18652493
 Real numbers can be formatted as currency (e.g. £5.67) and/or to a
fixed number of decimal places (e.g. entering 3.1427 into a real
number field formatted to two decimal places would mean it appeared
as 3.14)
 When to use the real data type:
 use the real data type if the field is going to be used to store
numbers where precision is important. For example:
 Height of people (e.g. 1.56m)
 Length of a room (e.g. 3.7m)
 Price of goods (e.g. £4.25) (Note: this data would be
formatted as currency but would still be a real number)
 Integer: This for whole numbers with no decimal places. For example:
 44
 -763
 9321389
 When to use the integer data type:
 Integers are used when accuracy is not of vital importance or
when the value allocated is specifically a whole number (e.g. a
count of something).
 For example:
 Number of car doors
 Large amounts of money (e.g. Average house price
£155000)
 Number of children

Logical/Boolean

 A field of data type Boolean is one which can contain only one of two
values: true and false. These two values can be used to represent
anything containing two states.
 When to use a Boolean data type – when you need to store a data
item that has only one of two possible values. It is often phrased as a
question:
 Has the pet been vaccinated?
 Does the property have a garden?
 Are you female?

Date/Time

A field of this type stores days, months and years so that records can
be sorted correctly (by year, then month and finally day) and be easily
validated using range checks. Date fields can display the date information
in different formats such as the full name of the day/month (28th August
1961) or the numerical versions (28/08/61).

 When to use a date/time field data type – when you need to store
dates or times and have the data easily validated and sorted.

What are database modules?


Modules are packages of code that database developers can use to add
extra functionality to a relational database. Each module is a collection of
procedures that are stored together as a unit in the database. Modules can
be associated with specific forms or reports in which case the procedures
tend to be specific to the form or report to which they are attached.
Modules can be used to present a company’s business rules in a user
friendly interface. An example business rule might state that no credit
check is to be performed on existing customers when processing loan
applications. In a large organisation, some modules can contain general
procedures so that they can be reused in different sections of the company.
This saves different users from having to write their own versions of the
same queries on databases and helps to ensure that company standards
and practices are followed. Well written modules tend to be optimised to
run faster and more efficiently than queries written by end users.

Modules can be presented to users as applications tailored to the users’


needs. For example, the user interface can be forms for use by a dental
receptionist. She can use these to query the appointments database in
order to find and book a slot for a patient. By such means, modules save
end users from having to understand the tables, fields and relationships of
a database.

In Microsoft Access, modules are produced using a language called Visual


Basic for Applications. Other database vendors similarly have their own
languages for producing modules. Some vendors have specialised in
producing professional modules for common business functions. For
example, you can purchase individual systems from Oracle that are
composed of the databases and modules for:

 Inventory Management
 Payroll
 Human Resources (i.e. staff management)
 Marketing
 Sales
 Purchasing Order Management

How are logical operators used in framing database


queries?
A complex query looks for data in two or more fields and uses the logical
operators OR, AND or NOT.

The following example uses a complex query to find all of the pupils in
Form 10B who were born before 1995. This query uses the logical
operator AND and is written as:
(Form = “10B”) AND (Date of Birth < 01/01/1995)

The following operators can be used to refine search:

 = (equals)
 < (less than)
 ≤ (less than or equal to)
 > (greater than)
 ≥ (greater than or equal to)
 <> (not equal to)

A query design in MS Access is shown below. Note that there are two
entries in the search criteria row and that the query has been given a
meaningful name (“Select Query – Pupils in 10B born before 1995”).

A complex query using more than one field


Below is a complex query that uses the logical operator OR to find pupils
who are in Form 10A OR Form 10C:

(Form = “10A”) OR (Form = “10C”)

This time, in the query definition there will be two criteria rows. The query
and its results are shown below:
A complex
query using OR
What are wildcards in queries?

Wildcard characters can be used in database queries. For example you


may want a list of all pupils born in November, or all of the pupils whose
surname starts with a ‘C’. Wildcard searches allow you to specify the part
of the data that you know and leave the data handling software to fill in the
blanks.

Note that wildcard characters do vary between different relational


databases. For example, the wildcard characters you would use in an
Microsoft Access database differ from the wildcard characters used in an
internationally recognised standard called ANSI-92. ANSI-92 is used by
several popular relational databases such as ORACLE and MYSQL. The
following table shows some wildcards you can use when you compose
query criteria using Microsoft Access.

Wildcards in database queries


How can data be validated?
A validation check is a rule that is built into a database to check that the
data entered is:

 sensible
 reasonable
 within acceptable boundaries
 complete

It does NOT mean that the data is actually correct, that


requires verification.

There are a number of different validation rules that can be used in a


database:

Type Checks

Field data types provide a basic method of validation. Field data types are
assigned to fields during the creation of the database table and data types
such as Numeric, Boolean, Date/Time and Image restrict what can be
entered.

 Example – if a user tries to enter text in a date field or alphabetic


characters in a numeric field, their entry will be rejected.

Range checks

These are used with numerical data to limit the range of numbers a user
can enter.

Examples – The ‘day’ part of a date must be in the range 1 to 31.

Further examples of range check validation rules

Rule What is being checked


Valid Invalid
data data

>20 If a numerical entry is greater than 20 21 20

If a numerical entry is greater than OR equal


>=20 20 19
to 20

If a numerical entry is greater than OR equal


BETWEEN 20 AND 30 25 31
to 20 AND less than OR equal to 30

If a numerical entry greater than OR equal to


>=20 AND <=30 25 18
20 AND less than OR equal to 30

Check digits

This type of check is used with numbers. An extra ‘check digit’ is calculated
from the numbers to be entered and added to the end. The numbers can
then be checked at any stage by re-calculating the check digit from the
other numbers and seeing if it matches the one entered. One example
where a check digit is used is in the 10 digit ISBN number which uniquely
identifies books. The last number of the ISBN is actually the check digit for
the other numbers, for example – the ISBN 0192761501.

 The Modulus-11 system is one example of a check digit system and


can apparently detect 99% of input errors (there is still a small chance
that more than one error occurs and the resulting number has the
same check digit).

How the Modulus-11 system works:


An ISBN stored using a European Article Number (EAN)
barcode

 Each number input is ‘weighted’ by multiplying it by it’s position (see


the diagram below)
 The weightings are adding together (the checksum) and this is divided
by the prime number 11
 If there is a remainder then it is subtracted from 11 to get the check
digit*, otherwise the check digit is 0.
 The check digit is then added to the end of the number.
 To check if the number entered is correct the calculation is repeated
but this time the check digit is included in the calculation.
 Because of this the remainder should now be 0 and it it is then there is
around a 99% chance the other numbers were entered correctly.

*if the check digit is a 10 then this is printed as an x to keep it as a single


character.

i.e. the number 019276150 would have a check digit of 1 as shown below.

Check

digit

Position 10 9 8 7 6 5 4 3 2 1

ISBN 0 1 9 2 7 6 1 5 0 1 Checksum Checksum

with with check


check digit digit / 11

(Number) x 17 remainder
0 9 72 14 42 30 4 15 0 1 187
(position) 0

 The weighted checksum without the check digit is 186. 186 divided
by 11 = 16 remainder 10. 11 minus the remainder therefore gives a
check digit of 1
 When the check digit is included in the calculation in position 1 then
the checksum is 187. 187 divided by 11 = 17 remainder 0 so the
check digit matches the rest of the numbers which means they were
probably entered correctly.

Presence checks

A presence check
These simply check that an entry has been made in a particular field
before the user can continue i.e. a null value (empty field) is not permitted.
Usually, not every field in a record needs to be filled in, however there are
likely to be some that must have a value and the presence check means
that the system will not allow the record to be saved until an entry is made.

 Example – An application for a passport must have the applicant’s


surname.
 Example – When recording the details of a new employee, presence
validation will cause rejection of a record that has nothing entered in
the ‘Job Title’ field.

Length Checks
All alphanumeric data has a length. A single character has a length of 1
and a string of text such as “Hello World” has a length of 11 (spaces are
counted in text strings). A length check ensures that such data is either an
exact length or does not exceed a specified number of characters.

 Examples – Bank account numbers are stored as text and should be


8 characters in length. Postcodes consist of no more than 8
characters. Mobile phone numbers are stored as text and should be
11 characters in length.

Lookup

A lookup check takes the value entered and compares it against a list of
values in a separate table. It can then return confirmation of the value
entered or a second list based on the value.

Dropdown list
One use of lookups restricts users to pre-defined input using drop-down
lists. A user is forced to use a list box to select from a predetermined list of
valid values. For example, they may have to select Male or Female or an
exam grade that should be in the range ‘A’…’G’ or ‘U’. , etc.

An example is frequently encountered when entering a postcode in an


online form. The database looks up the postcode and then returns a list of
possible addresses for the user to select from.

A database system that manages airline seat bookings could use this
check to limit the entries available in a list box selection. Rather than
checking for double-bookings after data has been entered, travel agents
could be presented with a list box that only shows the seats that are still
available. This would eliminate double-bookings and is an example of more
complicated, ‘real-time’ lookup validation.
Validation using referential integrity in a
database
Referential integrity is a special form of lookup. It is used in relational
databases to enforce valid inputs and involves checking entries in certain
fields against values in other tables.

For example, in an Orders table, when a new record is entered, the


database could check the ‘Customer_ID’ field of the new order against the
existing ‘Customer_ID’ data already in the Customers table to prevent the
entry of orders for customers that do not exist.

Input Masks

Certain alphanumeric fields in a database may require entry in a particular


format, e.g. a mixture of numbers and letters. Simple examples of this are
date structures or text of a specific length. More complex examples include
data items such as postcodes, National Insurance Numbers, driving licence
numbers or product codes.

Most databases allow formats to be defined for database fields by an input


mask, which defines the valid characters permitted in a field. For example,
some item codes in a factory may look like:

 TY856J
 FS871K
 AP120N

The first two characters are letter, the next three characters are numbers
and the last character is a letter.

This would allow a format check to be applied to the field, for


example: LLNNNLwhere L is any upper-case letter and N is any number.
This makes sure that only letters can be entered for the first two and the
last characters and only numbers entered for the middle three.
Common input mask codes

N a digit between 0 and 9 must be entered

# an entry is optional, but it must be a digit between 0 and 9

L a letter between a to z must be entered

? an entry is optional, but it must be a letter from a to z

A a letter or digit must be entered

a an entry is optional, but it must be a letter or digit

How can data be verified?


Typing data into a computer from a data capture form is called transcribing.
If an error is made then it is called a transcription error.

Verification is used to eliminate transcription errors. Data that has been


verified should therefore be totally correct. There are two main methods of
verification:
 Visual checking – data that has been entered is displayed on a
screen or printed out for the user to check it is correct before it is
finally submitted for entry. Data that is not correct is then re-entered
correctly This is often seen when completing an online order or when
transferring funds in online banking. Before the order is confirmed or
the transfer is made, all the details are displayed for the user to check.

 Pasword verification

Double Entry – Double entry is when data is entered twice,


sometimes by two different people. The computer compares the two
versions and does not accept the entry if there is a difference between
them. A familiar example of this type of verification is encountered
when setting a new password: users are usually asked to key the
password in a second time to ensure that they did not make a mistake
the first time. As passwords are not displayed on screen, this check is
very useful in ensuring that the right password has been set.

SUMMARY

 Validation cannot check that data is actually correct, only that it is


reasonable or sensible data.
 The process of checking that data is actually correct is called
verification.

You might also like