DBMS - Bba Uni 3


UNIT-3

Data warehousing
 A data warehouse is a subject-oriented, integrated, time-variant, and
non-volatile collection of data. This data helps analysts make
informed decisions in an organization.
 It stores a huge amount of data, typically collected from multiple
heterogeneous sources such as files, DBMSs, etc.
 A data warehouse is a relational database designed for query and
analysis rather than for transaction processing.
 It can be loosely defined as any centralized data repository that can
be queried for business benefit.
 It provides data that is already transformed and summarized, which
makes it an appropriate environment for an efficient DSS (decision
support system).

Data warehouses are prepared as-


A data warehouse helps business executives to organize, analyse, and use
their data for decision making.
Data warehouses are widely used in the following fields −

 Financial services
 Banking services
 Consumer goods
 Retail sectors
 Controlled manufacturing

Functions of Data Warehouse Tools and Utilities

 Data Extraction − Involves gathering data from multiple heterogeneous
sources.
 Data Cleaning − Involves finding and correcting the errors in data.
 Data Transformation − Involves converting the data from legacy
format to warehouse format.
 Data Loading − Involves sorting, summarizing, consolidating, checking
integrity, and building indices and partitions.
 Refreshing − Involves updating from data sources to warehouse.
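
The functions listed above can be sketched as a small pipeline. This is only a minimal illustration, not how any particular warehouse tool works: the two sample sources, the `sales` table, and all function names are assumptions made for the example, and an in-memory SQLite database stands in for the warehouse. Refreshing would amount to re-running the same steps on new source data and is not shown.

```python
import sqlite3

# Hypothetical rows from two heterogeneous sources (a file and a DBMS export).
FILE_ROWS = [("2024-01-05", "north", "120.50"), ("2024-01-06", "NORTH", "bad")]
DBMS_ROWS = [{"day": "2024-01-05", "region": "South", "amount": 80.0}]


def extract():
    """Data extraction: gather records from both sources into one list of dicts."""
    rows = [{"day": d, "region": r, "amount": a} for d, r, a in FILE_ROWS]
    return rows + DBMS_ROWS


def clean(rows):
    """Data cleaning: drop records whose amount cannot be parsed as a number."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append({**row, "amount": float(row["amount"])})
        except (TypeError, ValueError):
            continue  # a real tool might log or repair the record instead
    return cleaned


def transform(rows):
    """Data transformation: convert to the warehouse format (lower-case regions)."""
    return [(r["day"], r["region"].lower(), r["amount"]) for r in rows]


def load(conn, rows):
    """Data loading: store sorted rows, build an index, and summarize per region."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (day TEXT, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", sorted(rows))
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sales_region ON sales (region)")
    conn.commit()
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()


conn = sqlite3.connect(":memory:")
summary = load(conn, transform(clean(extract())))
print(summary)  # [('north', 120.5), ('south', 80.0)]
```

The record with the unparseable amount is removed in the cleaning step, so it never reaches the warehouse table.
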
Advantages of Data Warehouses:

 Integrate data from multiple sources into a single database and data model.
 Maintain data history, even if the source transaction systems do not.
 Improve data quality, by providing consistent codes and descriptions,
flagging or even fixing bad data.
 Present the organization's information consistently.
 Provide a single common data model for all data of interest regardless
of the data's source.
 Restructure the data so that it makes sense to the business users.
 Restructure the data so that it delivers excellent query performance,
even for complex analytic queries, without impacting the operational
systems.
 Add value to operational business applications, notably
customer relationship management (CRM) systems.
 Make decision-support queries easier to write.
 Organize and disambiguate repetitive data.

Data Mining:

 Data Mining is the nontrivial extraction of implicit, previously unknown,
and potentially useful information from data.
 Data mining is the search for relationships and global patterns that exist
in large databases but are ‘hidden’ among the vast amount of data, such
as a relationship between patient data and their medical diagnosis.
 Data Mining refers to “using a variety of techniques to identify nuggets
of information or decision-making knowledge in bodies of data, and
extracting these in such a way that they can be put to use in the areas
such as decision support, prediction, forecasting and estimation.”
 Data mining analysis tends to work from the data up, and the best
techniques are those developed with an orientation towards large
volumes of data, making use of as much of the collected data as possible
to arrive at reliable conclusions and decisions.

Data mining helps business organisations in the following ways:

 For businesses, data mining is used to discover patterns and
relationships in the data in order to help make better business decisions.
 Data mining can help spot sales trends, develop smarter
marketing campaigns, and accurately predict customer loyalty.
 Data mining techniques help companies to get knowledge-based
information.
 Data mining helps organizations to make profitable adjustments in
operation and production.
 Data mining is a cost-effective and efficient solution compared to
other statistical data applications.
 Data mining helps with the decision-making process.
 It facilitates automated prediction of trends and behaviours as well as
automated discovery of hidden patterns.
 It can be implemented in new systems as well as existing platforms.
 It is a speedy process, which makes it easy for users to analyse
huge amounts of data in less time.

Data mining process (phases/stages)

Selection- Selecting or segmenting the data according to some criteria, e.g.
all those people who own a car. In this way subsets of the data can be
determined.

Pre-processing- This is the data cleansing stage, where information that is
deemed unnecessary and may slow down queries is removed. For example, it is
unnecessary to note the sex of a patient when studying pregnancy. The data is
also reconfigured to ensure a consistent format, because data drawn from
several sources may be formatted inconsistently, e.g. sex may have been
recorded as F or M and also as 1 or 0.

Transformation- The data is not merely transferred across but transformed, in
that overlays may be added, such as the demographic overlays commonly used in
market research. The data is made usable and navigable.

Data Mining- This stage is concerned with the extraction of patterns from
the data. A pattern can be defined as follows: given a set of facts (data) F,
a language L, and some measure of certainty C, a pattern is a statement S in L
that describes relationships among a subset Fs of F with a certainty c, such
that S is simpler in some sense than the enumeration of all the facts in Fs.

Interpretation and Evaluation- The patterns identified by the system are
interpreted into knowledge, which can then be used to support human
decision-making, e.g. prediction and classification tasks, summarizing the
contents of a database, or explaining observed phenomena.
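
The stages above can be illustrated with a toy example. The sketch below is an assumption-laden illustration rather than a real mining algorithm: the records, the field names, and the mapping between the F/M and 1/0 encodings are all invented for the example, and the "pattern" is just the fraction of selected records that satisfy one condition, used as the certainty c.

```python
# Hypothetical customer records drawn from two sources with inconsistent encodings.
RECORDS = [
    {"id": 1, "sex": "F", "owns_car": True, "has_insurance": True},
    {"id": 2, "sex": 1, "owns_car": True, "has_insurance": True},
    {"id": 3, "sex": "M", "owns_car": True, "has_insurance": False},
    {"id": 4, "sex": 0, "owns_car": False, "has_insurance": False},
]

# Assumed mapping between the two encodings; a real project must confirm it per source.
SEX_CODES = {"F": "F", "M": "M", 1: "F", 0: "M"}


def select(records):
    """Selection: keep only the subset of interest (people who own a car)."""
    return [r for r in records if r["owns_car"]]


def preprocess(records):
    """Pre-processing: bring the 'sex' field into one consistent format."""
    return [{**r, "sex": SEX_CODES[r["sex"]]} for r in records]


def mine(records, target):
    """Data mining: certainty c that a selected record satisfies `target`."""
    if not records:
        return 0.0
    return sum(1 for r in records if r[target]) / len(records)


subset = preprocess(select(RECORDS))
certainty = mine(subset, "has_insurance")
# Interpretation: the statement S summarizes the subset Fs instead of listing it.
print(f"S: car owners have insurance (c = {certainty:.2f})")
```

Here S is one short statement about three records, which is simpler than enumerating the records themselves.
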

Mobile Database: Mobile databases are separate from the main database and
can easily be transported to various places. Even though they are not
permanently connected to the main database, they can still communicate with
it to share and exchange data.
 A mobile database is a database that a mobile computing device can
connect to over a mobile network.
 The client and server have wireless connections.
 A cache is maintained to hold frequent data and transactions so that
they are not lost due to connection failure.
 Mobile databases allow the development and deployment of database
applications for handheld devices, thus enabling relational database
applications in the hands of mobile workers.
 The database technology allows employees using handheld devices to link
to their corporate networks, download data, work offline, and then connect
to the network again to synchronize with the corporate database.

Components of a mobile database:

 The main system database that stores all the data and is linked to the
mobile database.
 The mobile database that allows users to view information even while
on the move. It shares information with the main database.
 The device that uses the mobile database to access data. This device can
be a mobile phone, laptop, etc.
 A communication link that allows the transfer of data between the
mobile database and the main database.
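
As a rough sketch of how these components interact, the example below models a main database, a mobile database with a local cache, and a communication link that may be down. The class and method names are hypothetical, and the synchronization shown simply overwrites the main copy; real systems need conflict handling, which is not covered here.

```python
class MainDatabase:
    """Stands in for the main system database on a fixed host."""

    def __init__(self):
        self.rows = {}


class MobileDatabase:
    """Local store on the device; caches data and queues updates while offline."""

    def __init__(self, main):
        self.main = main
        self.cache = {}      # frequently used data kept on the device
        self.pending = []    # updates made offline, not yet sent
        self.connected = False

    def download(self, keys):
        """Copy selected rows from the main database into the local cache."""
        if not self.connected:
            raise ConnectionError("no communication link")
        for key in keys:
            self.cache[key] = self.main.rows[key]

    def write(self, key, value):
        """Apply an update locally and remember it for later synchronization."""
        self.cache[key] = value
        self.pending.append((key, value))

    def synchronize(self):
        """Send queued updates to the main database once the link is back."""
        if not self.connected:
            return 0
        sent = len(self.pending)
        for key, value in self.pending:
            self.main.rows[key] = value
        self.pending.clear()
        return sent


main = MainDatabase()
main.rows["order-1"] = "open"
mobile = MobileDatabase(main)

mobile.connected = True
mobile.download(["order-1"])
mobile.connected = False              # the worker goes offline
mobile.write("order-1", "delivered")  # the update is kept locally

mobile.connected = True               # the link is restored
mobile.synchronize()
print(main.rows["order-1"])           # delivered
```
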
Advantages of Mobile Databases
 The data in a database can be accessed from anywhere using a
mobile database. It provides wireless database access.
 The database systems are synchronized using mobile databases, and
multiple users can access the data with a seamless delivery process.
 Mobile databases require very little support and maintenance.
 The mobile database can be synchronized with multiple devices
such as mobiles, computer devices, laptops, etc.
Disadvantages of Mobile Databases
 The mobile data is less secure than data that is stored in a
conventional stationary database. This presents a security hazard.
 The mobile unit that houses a mobile database may frequently lose
power because of limited battery. This should not lead to loss of data in
the database.
Data management issues in mobile database
Mobile database design – Because mobile units shut down frequently and
queries must still be handled, the global name resolution problem is
compounded.

Security – Data left at a fixed location is more secure than mobile data.
Mobile data is also more volatile, and techniques must be able to
compensate for its loss.

Replication issues – The cost of updates and signalling increases as the
number of replicas increases. Mobile hosts can move anywhere and at any time.

Recovery and fault tolerance – Fault tolerance is the ability of a system to
perform its function correctly even in the presence of internal faults. The
mobile database environment must deal with site, transaction, media, and
communication failures. Limited battery power can cause a site failure at
the mobile unit (MU).

Query processing – Because of the mobility and rapid resource changes of
mobile units, query optimization becomes more complicated. That is, query
processing is affected when mobility is considered.

Architecture

Mobile databases typically involve three parties: fixed hosts, mobile units,
and base stations.

 Fixed hosts perform the transaction and data management functions
with the help of database servers.
 Mobile units are portable computers that move around a geographical
region that includes the cellular network (or "cells") that these units use
to communicate to base stations.
 Base stations are two-way radios, installations in fixed locations that
pass communications with the mobile units to and from the fixed hosts.
They are typically low-power devices such as mobile phones, portable
phones, or wireless routers.

Internet/Web databases

A Web database is a database application designed to be managed and
accessed through the Internet.
 Website operators can manage this collection of data and present
analytical results based on the data in the Web database application.
 The Web provides cheap, ubiquitous networking. It has an existing user
base with standardized web browser software that runs on a variety of
ordinary computers.
 Examples of web database applications include news services that
provide access to large data repositories, e-commerce applications
such as online stores, and business-to-business (B2B) support products.

Advantages

 Simplicity: They are easy to develop, as HTML, upon which they are
based, is easy to learn and use.
 Cross-platform support: Being web-based, they are not confined to
any particular OS platform, as they are accessible via web browsers.
 Standardization: With HTML being a standard on all browsers,
HTML documents can be read from any machine in the world.

Disadvantages
 Reliability: The Internet can be an unreliable and slow
communication medium. At times servers are down, and delivery of a
message may be delayed.
 Security: Security is of great concern, especially when the organisation
makes its databases accessible on the web.
 Cost: The cost of meeting the demands and expectations of customers is high.
 Limited functionality of HTML: Some highly interactive database
applications may not be converted easily to web-based applications.

A Typical Web Database Application Configuration


Dynamic web content is generally generated by a front-end web server and a
back-end database. The contents of the site are stored in the database. The
application logic provides access to that content. The client sends an HTTP
request to the web server containing the appropriate URL and some
parameters. The web server causes the application logic to be executed.
Generally, the application logic issues a number of queries to the database and
formats the results as an HTML page. The web server finally returns this
page as an HTTP response to the client.
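
The request flow described above can be sketched without a running server. In this minimal illustration, an in-memory SQLite database plays the back-end database and a plain function plays the application logic; the `products` table, the `category` parameter, and the function names are assumptions made for the example.

```python
import html
import sqlite3
from urllib.parse import parse_qs, urlparse


def build_database():
    """Back-end database holding the site content (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, category TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?, ?)",
        [("Pen", "stationery", 1.5), ("Notebook", "stationery", 3.0), ("Mug", "kitchen", 6.0)],
    )
    return conn


def handle_request(conn, url):
    """Application logic: read URL parameters, query the database, format HTML."""
    params = parse_qs(urlparse(url).query)
    category = params.get("category", [""])[0]
    rows = conn.execute(
        "SELECT name, price FROM products WHERE category = ? ORDER BY name",
        (category,),  # parameterized query, so user input is never pasted into SQL
    ).fetchall()
    items = "".join(f"<li>{html.escape(name)}: {price:.2f}</li>" for name, price in rows)
    return f"<html><body><ul>{items}</ul></body></html>"


conn = build_database()
page = handle_request(conn, "/products?category=stationery")
print(page)
```

In a deployed application the web server would call something like `handle_request` for each incoming HTTP request and send the returned page back as the response body.
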
Web Database Tools
1. Common Gateway Interface (CGI) - It is a widely used tool for creating
web databases. CGI is a standard for interfacing external applications with
information servers, such as HTTP (web) servers. A CGI program executes
in real time, so it can output dynamic information. A CGI program can be
written in any language that can be executed on the system, such as C, C++
or Perl.
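
A CGI program can be very small. The sketch below assumes the server passes the query string in the `QUERY_STRING` environment variable, as the CGI standard specifies, and expects the program to print a header block, a blank line, and then the document. The `name` parameter and the function name are made up for illustration.

```python
import html
import os
from urllib.parse import parse_qs


def cgi_response(environ):
    """Build the text a CGI program writes to standard output for one request."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]
    body = f"<html><body><p>Hello, {html.escape(name)}!</p></body></html>"
    # A CGI response is a header block, one blank line, then the document body.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # The web server passes request data through environment variables.
    print(cgi_response(os.environ), end="")
```

Because the output is generated each time the program runs, the page can differ from request to request.
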

2. Extensible Markup Language (XML) – It is a meta-language for describing
mark-up languages for documents containing structured information. XML
provides a facility to define tags and the structural relationships between them.
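
As a small illustration of user-defined tags, the sketch below builds and re-reads an XML document with Python's standard `xml.etree.ElementTree` module. The tag names (`library`, `book`, `title`, `year`) and the sample values are invented for the example.

```python
import xml.etree.ElementTree as ET

# Tag names below are made up for illustration; XML lets the author define them.
library = ET.Element("library")
book = ET.SubElement(library, "book", attrib={"id": "b1"})
ET.SubElement(book, "title").text = "Database Systems"
ET.SubElement(book, "year").text = "2020"

document = ET.tostring(library, encoding="unicode")
print(document)

# Reading the document back recovers the structure the tags describe.
parsed = ET.fromstring(document)
titles = [b.findtext("title") for b in parsed.findall("book")]
print(titles)  # ['Database Systems']
```

The nesting of `title` and `year` inside `book` is the structural relationship between tags that the text refers to.
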

Digital Libraries:
 A digital library is a combination of traditional and media collections, so
it encompasses both paper and electronic materials.
 A digital library in a broad sense is a computerized system that
allows users to obtain a coherent means of access to an organized,
electronically stored repository of information and data.

Objectives of Digital Libraries:

 To collect, store, organize and access information in digital form.
 To meet the requirements of patrons by providing better services.
 To provide personalized and retrospective services in an efficient way.
 To preserve valuable and rare documents.
 To save time of library staff by avoiding routine jobs.
 To provide a coherent view of all information in any format.
 To minimize massive storage and space problems of large libraries.
 To reduce cost involved in various library activities.

Components of Digital Libraries:

 Geographically distributed digital information collections
 Geographically distributed users
 Information represented by a variety of digital objects
 Large and diverse collections
 Seamless access
Features of digital library

 No physical boundary - The user of a digital library need not go
to the library physically; people from all over the world can gain
access to the same information, as long as an Internet connection is
available.
 Round-the-clock availability - A major advantage of digital libraries
is that people can gain access to the information 24/7.
 Multiple access - The same resources can be used simultaneously
by a number of institutions and patrons.
 Information retrieval - The user is able to use any search term
(word, phrase, title, name, or subject) to search the entire collection.
Digital libraries can provide very user-friendly interfaces, giving
clickable access to their resources.
 Preservation and conservation - Digitization is not a long-term
preservation solution for physical collections, but it does succeed in
providing access copies for materials that would otherwise fall to
degradation from repeated use.
 Space - Whereas traditional libraries are limited by storage space,
digital libraries have the potential to store much more information,
simply because digital information requires very little physical space
to contain it.
 Added value - Certain characteristics of objects, primarily the quality
of images, may be improved. Digitization can enhance legibility and
remove visible flaws such as stains and discoloration.
 Easily accessible.

Spatial Databases:

 A spatial database is a database that is optimized for storing and
querying data that represents objects defined in a geometric space.
Most spatial databases allow the representation of simple geometric
objects such as points, lines and polygons.
 Spatial data is associated with geographic locations such as cities,
towns, etc.
 It is a collection of spatially referenced data that acts as a model of
reality, where model of reality means that the database represents a
selected set or approximation of phenomena.
 It offers spatial data types in its data model and query language.
 It also implements spatial data types and provides spatial indexing
and algorithms for spatial join.
Example
A road map is a visualization of geographic information. A road map is a
2-dimensional object which contains points, lines, and polygons that can
represent cities, roads, and political boundaries such as states or provinces.
Spatial data can be of two types:

 Vector data: This data is represented as discrete points, lines and
polygons.
 Raster data: This data is represented as a matrix of square cells.

Features

 Spatial Measurements: Computes line length, polygon area, the
distance between geometries, etc.
 Spatial Functions: Modify existing features to create new ones, for
example by providing a buffer around them, intersecting features, etc.
 Spatial Predicates: Allows true/false queries about spatial
relationships between geometries.
 Geometry Constructors: Creates new geometries, usually by
specifying the vertices (points or nodes) which define the shape.
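
To make these features concrete, the sketch below implements two measurements and one predicate on vector data in plain Python. It is an illustration under simple assumptions (planar coordinates, simple polygons given as vertex lists) and not the implementation used by any spatial database; the function names are hypothetical, and points lying exactly on a polygon edge are not treated specially.

```python
import math


def line_length(points):
    """Spatial measurement: total length of a line given as (x, y) vertices."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def polygon_area(vertices):
    """Spatial measurement: polygon area by the shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def contains(vertices, point):
    """Spatial predicate: True if the point lies inside the polygon (ray casting)."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        crosses = (y1 > y) != (y2 > y)
        if crosses and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
            inside = not inside
    return inside


# Geometry constructor: a shape is defined by listing its vertices.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
road = [(0, 0), (3, 4), (3, 10)]

print(line_length(road))         # 11.0
print(polygon_area(square))      # 16.0
print(contains(square, (1, 1)))  # True
print(contains(square, (5, 1)))  # False
```
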
Advantages of Spatial Databases:

 It offloads complicated tasks, such as organization and indexing, to
the database server.
 Client applications do not have to re-implement spatial operators and
functions, because the spatial database offloads these complicated tasks
to the database server.
 The development time of client applications is significantly reduced
when using spatial databases.
 Spatial databases use simple SQL expressions to determine spatial
relationships such as distance, adjacency and containment.
 Spatial databases use simple SQL expressions to perform
spatial operations such as area, length, union, intersection and buffer.
Disadvantages of Spatial Databases
 Cost to implement can be high
 Some inflexibility
 Incompatibilities with some GIS software
 Slower than local, specialized data structures
 User/managerial inexperience and caution
Uses of spatial database
 Customer location
 Store locations
 Transportation tracking
 Weather Information
 Land holdings
 Natural resources
 City Planning

Multimedia Databases:
 A multimedia database is a database, like any other, that contains
multimedia collections.
 Multimedia is defined as the combination of more than one medium.
 A multimedia database contains text, image, animation, video, audio,
movie sound, etc.

Types –

 Static media - Text, graphics, and images are categorized as static media.
 Dynamic media - Objects like animation, music, audio, speech, and videos
are categorized as dynamic media.

Need of Multimedia Databases:

 It helps to develop multimedia applications in various
fields like teaching, medical sciences and libraries.
 It helps to preserve decaying photographs, maps, and films that have
historical or national importance.
 Using a multimedia database, we can develop excellent
teaching packages.
 It supports multi-user operations.

Challenges of Multimedia Database

 Multimedia databases contain data in a wide variety of formats, such as
.txt (text), .jpg (images), .swf (videos), .mp3 (audio), etc. It is difficult to
convert one type of data format to another.
 A multimedia database requires a lot of storage, as multimedia
data is quite large and needs to be stored successfully in the database.
 It takes a lot of time to process multimedia data, so a multimedia
database is slow.
 Storage of a multimedia database on any standard disk presents the
problems of representation, compression, mapping to device
hierarchies, archiving, and buffering during input-output operations.
 For multimedia data like images, video, and audio, accessing data through
a query opens up many issues, such as efficient query formulation, query
execution, and optimization, which need to be worked upon.

Content of Multimedia Database management system:

 Media data – The actual data representing an object.
 Media format data – Information such as sampling rate, resolution,
encoding scheme, etc. about the format of the media data after it goes
through the acquisition, processing and encoding phases.
 Media keyword data – Keyword descriptions relating to the
generation of data. It is also known as content descriptive data.
Example: date, time and place of recording.
 Media feature data – Content-dependent data such as the distribution
of colours, kinds of texture and different shapes present in data.
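
One way to picture these four kinds of content is as fields of a single record. The sketch below is a hypothetical in-memory model, not the storage layout of any real multimedia DBMS; the class name, field names, and sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MediaObject:
    """One stored item, split into the four kinds of data listed above."""

    media_data: bytes                                 # the actual encoded object
    format_data: dict = field(default_factory=dict)   # e.g. resolution, encoding
    keyword_data: dict = field(default_factory=dict)  # e.g. date, time, place
    feature_data: dict = field(default_factory=dict)  # e.g. colour distribution


def find_by_keyword(objects, key, value):
    """Query on media keyword data instead of scanning the raw media bytes."""
    return [o for o in objects if o.keyword_data.get(key) == value]


photo = MediaObject(
    media_data=b"placeholder-bytes",  # placeholder, not a real image
    format_data={"encoding": "jpeg", "resolution": (640, 480)},
    keyword_data={"place": "Delhi", "date": "2024-01-05"},
    feature_data={"dominant_colour": "blue"},
)
clip = MediaObject(media_data=b"placeholder", keyword_data={"place": "Pune"})

matches = find_by_keyword([photo, clip], "place", "Delhi")
print(len(matches), matches[0].format_data["encoding"])  # 1 jpeg
```

Keeping the descriptive data separate from the media bytes is what lets a query such as `find_by_keyword` run without decoding the media itself.
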

Areas where multimedia database is applied are:

 Documents and record management: Industries and businesses
keep detailed records and a variety of documents. Example: Insurance
claim records.
 Knowledge dissemination: A multimedia database is a very effective
tool for knowledge dissemination in terms of providing several
resources. Example: Electronic books.
 Education and training: Computer-aided learning materials can be
designed using multimedia sources, which are nowadays very
popular sources of learning. Example: Digital libraries.
 Marketing, advertising, retailing, entertainment and travel. Example:
a virtual tour of cities.
 Real-time control and monitoring: Coupled with active database
technology, multimedia presentation of information can be a very
effective means for monitoring and controlling complex tasks.
Example: Manufacturing operation control.
