Notes TYBBA DBA Data Mining

DNYANSAGAR ARTS AND COMMERCE COLLEGE, BALEWADI, PUNE – 45

Subject: Database Administration and Data Mining Class: TYBBA


____________________________________________________

****INDEX****

Unit-1. Introduction to DBMS.

Unit-2. Database Administrator (DBA).

Unit-3. Data warehousing.

Unit-4. Data Analytics.

Prof: A.A.Kulkarni www.dacc.edu.in



Unit No – 1


Unit – I
Introduction to Database Management System

• Introduction
• Objectives
• DBMS Concept
• Purpose of Database System
• Advantages & disadvantages of Database System

Introduction

There are two popular words in the computer dictionary which are Hardware and
Software.
Hardware – Collection of mechanical, electrical parts
Software – Collection of programs

Why was the computer invented?

The basic need behind the invention of the computer was processing and storing data.

There are two types of files in a computer system:

a) Programs b) Data Files
a) Programs – written to instruct the machine to perform a certain task or
operation. Programs can be:
Application Programs – e.g. MS Office
System Programs – operating systems like Windows XP

b) Data Files – Data entered by the user is stored in these files.

Difference between Data and Information –


Data – User input or collection of numerical facts and figures is called data.
Information – Processed data or manipulated data as per user’s requirement is
called information.
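
A minimal sketch of this distinction in Python. The student names and marks are made-up values, not taken from these notes: the raw marks play the role of data, and the per-student average computed from them is the information.

```python
# Raw data: marks exactly as a user might enter them (hypothetical values).
marks = {"Amit": [70, 80, 90], "Neha": [60, 70, 50]}


def summarize(raw):
    """Process raw marks into information: the average mark per student."""
    return {name: sum(scores) / len(scores) for name, scores in raw.items()}


report = summarize(marks)
print(report)
```

The dictionary `marks` on its own answers no question; `report` is the processed form a user would actually ask for.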

Objectives of DBMS

An important concern while programming is storing data and being able to access it
at any moment.
e.g. – If we use the ‘C’ programming language to write a program, the data storage
facility is provided by the O.S.
But there are many restrictions on storing data this way, as these are plain text files.
Manipulation becomes difficult with large amounts of data, and so do storing and
searching.

The purpose of a database is to help your business stay organized and keep
information easily accessible, so that you can use it. But it isn't a magic solution to
all your data concerns. First, you need to collect and input the data into a database.

Databases are structured to facilitate the storage, retrieval, modification, and
deletion of data in conjunction with various data-processing operations. A
database management system (DBMS) extracts information from the database
in response to queries.

Concept of DBMS

• A database management system is a set of interrelated data and a set of programs
to operate on it.
• The collection of interrelated data is called a database.
• Database management is the process of collecting, storing, organizing,
maintaining and analyzing data.
• A DBMS enables organizations to manage databases effectively.
• A database management system (DBMS) is a software package designed to
define, manipulate, retrieve and manage data in a database.

Purpose of DBMS

• The purpose of a database is to help your business stay organized and
keep information easily accessible, so that you can use it.
• But it isn't a magic solution to all your data concerns. First, you need to collect and
input the data into a database.
• Databases are structured to facilitate the storage, retrieval, modification, and
deletion of data in conjunction with various data-processing operations.
• A database management system (DBMS) extracts information from the database
in response to queries.

Advantages of Database System

• Better data transferring
• Better data security
• Better data integration
• Minimized data inconsistency
• Faster data access
• Better decision making
• Increased end-user productivity
• Simple


Better Data Transferring - Database management creates a place where users have the
advantage of more and better-managed data. This makes it possible for end users
to get a quick view of the data and respond fast to any changes in their environment.

Better Data Security - As the number of users increases, the rate of data transfer and
data sharing also increases, and with it the risk to data security. A DBMS is widely
used in the corporate world, where companies invest large amounts of money, time and
effort to ensure data is secure and used properly. A Database Management System
(DBMS) provides a better platform for data privacy and security policies, thus helping
companies to improve data security.

Data integration - The data in a file system is stored in separate files. It is very
difficult to access data stored in separate and independent files. An important
objective of databases is to solve this problem. The data in a database may be
located on physically different computers, but it is connected through data
communication links. In this way, the data appears logically centralized.

Data integrity - Data integrity means the reliability and accuracy of data. Integrity
rules are designed to keep the data consistent and correct. These rules act like a check
on the incoming data. It is very important that a database maintains the quality of the
data stored in it. A DBMS provides several methods to enforce the integrity of the data
in a database. Enforcing data integrity ensures the quality of data in the database. For
example, if an employee ID is entered as “123”, this value should not be entered again.
The same ID should not be assigned to two or more employees.
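
The employee ID rule above can be sketched with Python's built-in `sqlite3` module. SQLite and the table and column names (`employee`, `emp_id`, `name`) are assumptions chosen for illustration; any relational DBMS offers an equivalent primary-key or unique constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# PRIMARY KEY is the integrity rule: no two rows may share an emp_id.
conn.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO employee (emp_id, name) VALUES (123, 'Amit')")
conn.commit()


def try_insert(emp_id, name):
    """Return True if the row was accepted, False if an integrity rule rejected it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO employee (emp_id, name) VALUES (?, ?)", (emp_id, name)
            )
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_accepted = try_insert(123, "Neha")  # same ID as Amit: rejected
new_accepted = try_insert(124, "Neha")        # unused ID: accepted
row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

The check lives in the database itself, so every application that writes to the table gets the same protection without repeating the rule in its own code.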

Minimized Data Inconsistency - Data inconsistency occurs between files when
different versions of the same data appear in different places. For example, data
inconsistency occurs when a student's name is saved as “Amit J. Patil” on the main
computer of a school but as “Amit Patil” on the teachers' registration system, or when
the price of a product is Rs. 86.95 in the local system of a company while its
national sales office system shows the same product's price as Rs. 84.95.
If a database is properly designed, data inconsistency can be greatly reduced.
Faster data Access - The database management system (DBMS) helps to produce
quick answers to database queries, thus making data access faster and more
accurate. For example, end users dealing with large amounts of sales data
will have enhanced access to the data, enabling a faster sales cycle.
Some queries may be like:
What is the increase in sales in the last three months?
What is the bonus given to each of the salespeople in the last five months?
How many customers have a credit score of 850 or more?
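
As a rough illustration, the last question above could be put to a database as a single query. The sketch below uses Python's `sqlite3` module; the `customer` table, its columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, credit_score INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Amit", 860), ("Neha", 790), ("Ravi", 850), ("Sara", 700)],
)
conn.commit()

# "How many customers have a credit score of 850 or more?"
high_score_count = conn.execute(
    "SELECT COUNT(*) FROM customer WHERE credit_score >= 850"
).fetchone()[0]
print(high_score_count)
```

With plain data files the same answer would need a hand-written program to open the file, parse each record and count; here the DBMS does that work from one declarative statement.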
Better decision making - Because a DBMS gives us better-managed data and improved
data access, we can generate better-quality information, and on this basis
better decisions can be made. Better data quality improves the accuracy and
validity of the data and reduces the time it takes to read it. A DBMS does not
guarantee data quality, but it provides a framework that makes it easier to
improve data quality.
Increased end-user productivity - The data that is available, together with a
combination of tools which transform data into useful information, helps end users
to make quick, informed and better decisions that can make the difference between
success and failure in the global economy.
Simple - A database management system (DBMS) gives a simple and clear logical
view of data. Many operations like insertion, deletion or creation of files or data are
easy to implement.
Data independence - The database approach provides the facility of data
independence. It means that the data and the application programs are separate
from each other. The user can change data storage structures and operations
without changing the application programs. The user can also modify programs
without reorganizing the data.
Disadvantages of Database System

• Complexity: The provision of the functionality that is expected of a good DBMS
makes the DBMS an extremely complex piece of software. Database designers,
developers, database administrators and end users must understand this
functionality to take full advantage of it. Failure to understand the system can lead
to bad design decisions, which can have serious consequences for an organization.
The complexity and breadth of functionality make the DBMS an extremely large
piece of software, occupying many megabytes of disk space and requiring
substantial amounts of memory to run efficiently.

• Performance: Typically, a file-based system is written for a specific
application, such as invoicing. As a result, performance is generally very good.
However, the DBMS is written to be more general, to cater for many applications
rather than just one. The effect is that some applications may not run as fast as they
used to.

• Cost of Conversion: In some situations, the cost of the DBMS and extra
hardware may be insignificant compared with the cost of converting existing
applications to run on the new DBMS and hardware. This cost also includes the
cost of training staff to use these new systems and possibly the employment of
specialist staff to help with conversion and running of the system. This cost is
one of the main reasons why some organizations feel tied to their current systems
and cannot switch to modern database technology.


Unit No – 2


Unit – II

Database Administration
Database Administrator

• A Database Administrator (DBA) is the person responsible for controlling,
maintaining, coordinating, and operating a database management system. The
role also covers configuration, database design, migration, security,
troubleshooting, backup, and data recovery.

• A Database Administrator (DBA) in a Database Management System is an IT
professional who works on creating, maintaining, querying, and tuning the
database of the organization. They are also responsible for maintaining data
security and integrity. This role requires good knowledge of and experience
with the particular RDBMS that the company uses.

Types of Database Administrator

Based on the requirements of the company, there are various types of DBAs,
including:

Administrative DBA: They maintain and run the databases and servers of the
organization. They are mainly concerned with security patches, replication,
and backup of data.

Development DBA: They work on developing SQL queries and stored
procedures to meet the requirements of the business. They specialize in
database development.
Data Architect: They build data structures, table indexes, and
relationships. They are mainly responsible for building a structure that
meets the business requirements in a specific area.
Data Warehouse DBA: They merge data from numerous data sources and
store them in a data warehouse.

Purpose of DBA
• Database administration is more of an operational or technical-level
function responsible for physical database design, security enforcement, and
database performance.
• Tasks include maintaining the data dictionary, monitoring performance,
and enforcing organizational standards and security.
• A database administrator's (DBA) primary job is to ensure that data is
available, protected from loss and corruption, and easily accessible as
needed. Below are some of the chief responsibilities that make up the
day-to-day work of a DBA.
Software Installation and Maintenance
• A DBA often collaborates on the initial installation and configuration
of a new database such as Oracle or SQL Server. The system administrator sets
up the hardware and deploys the operating system for the database server,
then the DBA installs the database software and configures it for use. As
updates and patches are required, the DBA handles this ongoing
maintenance.
• If a new server is needed, the DBA handles the transfer of data
from the existing system to the new platform.

Data Extraction, Transformation, and Loading

• Known as ETL, data extraction, transformation, and loading refers
to efficiently importing large volumes of data that have been extracted
from multiple systems into a data warehouse environment.
• This external data is cleaned up and transformed to fit the
desired format so that it can be imported into a central repository.
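
The three ETL steps can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the two source systems, their record formats, and the use of an in-memory SQLite database as the central repository.

```python
import sqlite3

# Extract: rows as they might arrive from two hypothetical source systems,
# each with its own format (dicts with string amounts vs. tuples with floats).
store_rows = [{"product": " Pen ", "amount": "10.50"}, {"product": "Book", "amount": "120"}]
web_rows = [("PEN", 5.25), ("book", 80.0)]


def transform(product, amount):
    """Clean one record: trim and lower-case the name, convert the amount to float."""
    return product.strip().lower(), float(amount)


cleaned = [transform(r["product"], r["amount"]) for r in store_rows]
cleaned += [transform(product, amount) for product, amount in web_rows]

# Load: write the cleaned rows into one central table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (product TEXT, amount REAL)")
warehouse.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)
warehouse.commit()

totals = dict(warehouse.execute("SELECT product, SUM(amount) FROM sales GROUP BY product"))
print(totals)
```

Only after the transform step do " Pen " and "PEN" count as the same product, which is why cleaning comes before loading.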
Specialized Data Handling
• Today’s databases can be massive and may contain unstructured
data types such as images, documents, or sound and video files.
Managing a very large database (VLDB) may require higher-level skills
and additional monitoring and tuning to maintain efficiency.
Database Backup and Recovery
• DBAs create backup and recovery plans and procedures based on
industry best practices, then make sure that the necessary steps are
followed. Backups cost time and money, so the DBA may have to
persuade management to take the necessary precautions to preserve data.
• System admins or other personnel may actually create the backups,
but it is the DBA’s responsibility to make sure that everything is done on
schedule.
• In the case of a server failure or other form of data loss, the DBA
will use existing backups to restore lost information to the system.
Different types of failures may require different recovery strategies, and
the DBA must be prepared for any eventuality. As technology changes,
it is becoming ever more typical for a DBA to back up databases to the
cloud, for example Oracle Cloud for Oracle databases and MS Azure for
SQL Server.
Security
• A DBA needs to know the potential weaknesses of the database
software and the company’s overall system and work to minimize
risks. No system is one hundred per cent immune to attacks, but
implementing best practices can minimize risks.
• In the case of a security breach or irregularity, the DBA can
consult audit logs to see who has done what to the data. Audit trails
are also important when working with regulated data.
Authentication
• Setting up employee access is an important aspect of database
security. DBAs control who has access and what type of access they
are allowed.
• For instance, a user may have permission to see only certain pieces
of information, or they may be denied the ability to make changes to the
system.
Capacity Planning
• The DBA needs to know how large the database currently is and
how fast it is growing in order to make predictions about future needs.
Storage refers to how much room the database takes up in server and
backup space. Capacity refers to usage level.
• If the company is growing quickly and adding many new users,
the DBA will have to create the capacity to handle the extra
workload.
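
A back-of-the-envelope storage projection might look like the sketch below. It assumes growth is linear (a fixed number of gigabytes per month), which real systems rarely follow exactly; the function name and the figures are hypothetical.

```python
def months_until_full(current_gb, monthly_growth_gb, capacity_gb):
    """Whole months of headroom left, assuming linear growth.

    Returns None when the database is not growing, because it never fills up.
    """
    if monthly_growth_gb <= 0:
        return None
    remaining_gb = capacity_gb - current_gb
    return max(0, remaining_gb // monthly_growth_gb)


# A 400 GB database growing by 25 GB per month on a 1000 GB volume.
headroom = months_until_full(400, 25, 1000)
print(headroom)
```

A DBA would typically re-run such an estimate with measured growth figures and order storage well before the headroom reaches zero.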
Performance Monitoring
• Monitoring databases for performance issues is part of the ongoing
system maintenance a DBA performs. If some part of the
system is slowing down processing, the DBA may need to make
configuration changes to the software or add additional hardware
capacity.
• Many types of monitoring tools are available, and part of the
DBA’s job is to understand what they need to track to improve the
system. Third-party organizations can be a good option for outsourcing
this aspect, provided they offer up-to-date DBA support.
Database Tuning
• Performance monitoring shows where the database should be
tweaked to operate as efficiently as possible. The physical
configuration, the way the database is indexed, and how queries are
handled can all have a dramatic effect on database performance.
• With effective monitoring, it is possible to proactively tune a
system based on application and usage instead of waiting until a
problem develops.
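
The effect of indexing on how a query is handled can be observed directly. The sketch below uses SQLite through Python's `sqlite3` module, which is an assumption made for illustration; the `sales` table and the index name are hypothetical. SQLite's `EXPLAIN QUERY PLAN` statement reports whether a query scans the whole table or searches through an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 10.0), ("south", 20.0), ("north", 30.0)],
)
conn.commit()


def plan(sql):
    """Return SQLite's query-plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)  # the last column holds the detail text


query = "SELECT * FROM sales WHERE region = 'north'"
plan_before = plan(query)  # no index on region yet: full table scan
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
plan_after = plan(query)   # the planner can now search via the index

print(plan_before)
print(plan_after)
```

The exact wording of the plan text differs between SQLite versions, so the sketch only relies on the index name appearing after the index is created. Other DBMSs expose similar information through their own `EXPLAIN` facilities.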
Troubleshooting
• DBAs are on call for troubleshooting in case of any problems.
Whether they need to quickly restore lost data or correct an issue to
minimise damage, a DBA needs to quickly understand and respond to
problems when they occur.


What is a Database Transaction?

• A database transaction is a logical unit of processing in a DBMS which
entails one or more database access operations. In a nutshell, database
transactions represent real-world events of any enterprise.

• All database access operations held between the begin-transaction and
end-transaction statements are considered a single logical transaction
in a DBMS.

• During the transaction, the database may be temporarily inconsistent.

• Only once the transaction is committed does the database change from one
consistent state to another.

Facts about Database Transactions

A transaction is a program unit whose execution may or may not
change the contents of a database.

A transaction in a DBMS is executed as a single unit.

If the database operations do not update the database but only retrieve
data, this type of transaction is called a read-only transaction.

A successful transaction changes the database from one
consistent state to another.

DBMS transactions must be atomic, consistent, isolated and durable.

If the database were in an inconsistent state before a transaction, it
would remain in the inconsistent state after the transaction.


The various states of a transaction in a DBMS are listed below:

• Active State
A transaction enters the active state when its execution
begins. During this state, read or write operations can be
performed.
• Partially Committed
A transaction goes into the partially committed state after its
final operation has been executed.
• Committed State
When a transaction is in the committed state, it has already
completed its execution successfully, and all of its changes
are recorded in the database permanently.
• Failed State
A transaction is considered failed when any one of the checks fails or
if the transaction is aborted while it is in the active state.
• Terminated State
A transaction reaches the terminated state when it leaves the system
and cannot be restarted.
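
The states listed above can be pictured as a small state machine. The transition table below is an assumption based on the descriptions in this list and on the usual textbook diagram (which often adds a separate "aborted" state that these notes do not name); it is a sketch, not a definition from any particular DBMS.

```python
# Allowed moves between the five states named in the notes (assumed transitions).
TRANSITIONS = {
    "active": {"partially_committed", "failed"},
    "partially_committed": {"committed", "failed"},
    "committed": {"terminated"},
    "failed": {"terminated"},
    "terminated": set(),
}


def is_valid_path(states):
    """Check that each consecutive pair of states is an allowed transition."""
    return all(nxt in TRANSITIONS[cur] for cur, nxt in zip(states, states[1:]))


ok_path = is_valid_path(["active", "partially_committed", "committed", "terminated"])
failed_path = is_valid_path(["active", "failed", "terminated"])
bad_path = is_valid_path(["active", "committed"])  # skips the partially committed state
```

A transaction cannot jump straight from active to committed: it first becomes partially committed, and only then are its changes made permanent.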

What are ACID Properties?

ACID properties are used for maintaining the integrity of the database
during transaction processing. ACID in DBMS stands for:

• Atomicity
• Consistency
• Isolation
• Durability


• Atomicity - A transaction is a single unit of operation. You either execute it
entirely or do not execute it at all. There cannot be partial execution.

• Consistency - Once the transaction is executed, the database should
move from one consistent state to another.

• Isolation - Transactions should be executed in isolation from other
transactions. During concurrent transaction execution,
intermediate results from simultaneously executing transactions
should not be made available to each other.

• Durability - After successful completion of a transaction, the changes in the
database should persist, even in the case of system failures.
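
Atomicity and consistency can be seen in a small money-transfer sketch. It uses Python's `sqlite3` module; the `account` table, the `CHECK (balance >= 0)` rule and the helper names are assumptions made for illustration, not part of these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Consistency rule (assumed for this example): a balance may never go negative.
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(src, dst, amount):
    """Move money between two accounts as one transaction; undo everything on failure."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    """Return the current balances as a dict."""
    return dict(conn.execute("SELECT name, balance FROM account"))


first = transfer("A", "B", 30)    # both updates succeed and are committed together
second = transfer("A", "B", 500)  # the debit breaks the CHECK rule, so the credit is undone too
final = balances()
print(first, second, final)
```

In the failed transfer the credit to B had already been applied when the debit from A was rejected; the rollback removes it, so the database never shows half a transfer.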

Unit No – 3
Data Warehousing

Warehouse – A large building where raw materials or manufactured goods may
be stored prior to their distribution for sale.

The main function of a warehouse is to store products or goods before moving
them to another location.

Data Warehouse -

Data warehousing is the storage of information over time by a business or other
organization. New data is periodically added by people in various key departments
such as marketing and sales.

A database is designed to supply real-time information.

A data warehouse is designed as an archive of historical information.

Data warehousing (DW) is a process for collecting and managing data from varied
sources to provide meaningful business insights.
A data warehouse is typically used to connect and analyze business data from
heterogeneous sources. The data warehouse is the core of the BI system, which is
built for data analysis and reporting.

A data warehouse is a type of data management system that is designed to enable
and support business intelligence (BI) activities, especially analytics.

Data warehouses are solely intended to perform queries and analysis and often
contain large amounts of historical data.

A data warehouse typically uses a relational database to store and manage data.

For example, data warehousing makes data mining possible, which assists
businesses in looking for data patterns that can lead to higher sales and profits.

Database and Data Warehouse

A database is any collection of data organized for storage, accessibility and
retrieval.

A data warehouse is a type of database that integrates copies of transaction data
from disparate source systems and provisions them for analytical use.

Data warehousing is a process of transforming data into information and making it
available to users in a timely manner to make a difference.

A data warehouse system is also known by the following names:

• Decision Support System (DSS)
• Executive Information System
• Management Information System
• Business Intelligence Solution
• Analytic Application
• Data Warehouse
The decision support database (data warehouse) is maintained separately from the
organization’s operational database. However, the data warehouse is not a product
but an environment. It is an architectural construct of an information system which
provides users with current and historical decision support information which is
difficult to access or present in the traditional operational data store.

As you may know, the database for an inventory system may have many tables related to
each other. For example, a report on current inventory information can include
more than 12 join conditions. This can quickly slow down the response time of
the query and report. A data warehouse provides a new design which can help to
reduce the response time and enhance the performance of queries for
reports and analytics.
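
One common way a warehouse design shortens report queries is to store joined and aggregated results ahead of time. The sketch below shows the idea with Python's `sqlite3` module; the tables (`product`, `stock`, `inventory_summary`) and the data are hypothetical, and a real inventory schema would involve many more tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Operational tables: normalized, so a report has to join them.
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE stock (product_id INTEGER, location TEXT, qty INTEGER);
INSERT INTO product VALUES (1, 'pen'), (2, 'book');
INSERT INTO stock VALUES (1, 'Pune', 40), (1, 'Mumbai', 60), (2, 'Pune', 15);

-- Warehouse-style summary table: the join and the aggregation run once, at load time.
CREATE TABLE inventory_summary AS
    SELECT p.name AS name, SUM(s.qty) AS total_qty
    FROM product p JOIN stock s ON s.product_id = p.product_id
    GROUP BY p.name;
""")

# The report now reads one table and needs no join.
summary = dict(conn.execute("SELECT name, total_qty FROM inventory_summary"))
print(summary)
```

The trade-off is freshness: the summary reflects the data as of the last load, which suits historical reporting better than real-time transaction processing.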

Why is a data warehouse needed?

A data warehouse allows users to access critical data from a number of sources in a
single place. It therefore saves users the time of retrieving data from multiple
sources. A data warehouse stores a large amount of historical data. This helps users
to analyze different time periods and trends to make future predictions.

How Does Data Warehousing Work?

A data warehouse essentially combines information from several sources into one
comprehensive database. Data is extracted from individual sources and redundant
data/outliers are removed. Next, the data is reorganized into a consistent format
(e.g. tables, columns, charts) that can be queried.

Popular Data Warehousing Tools

Just as there are several different ways to establish a data warehouse, there are
numerous data warehousing tools that businesses can use to upload and analyze
their data. Some of the most popular data warehouse tools include:

• Amazon Redshift
• Microsoft Azure
• Google BigQuery
• Snowflake
• Micro Focus Vertica
• Teradata
• Amazon DynamoDB
• PostgreSQL

History of Data Warehouse

The data warehouse helps users to understand and enhance their organization’s
performance. The need to warehouse data evolved as computer systems became
more complex and needed to handle increasing amounts of information. However,
data warehousing is not a new thing.

Here are some key events in the evolution of the data warehouse:

• 1960 - Dartmouth and General Mills, in a joint research project, develop the
terms dimensions and facts.
• 1970 - ACNielsen and IRI introduce dimensional data marts for retail sales.
• 1983 - Teradata Corporation introduces a database management system
which is specifically designed for decision support.
• Data warehousing started in the late 1980s, when IBM workers Paul Murphy
and Barry Devlin developed the Business Data Warehouse.
• However, the real concept was given by Bill Inmon. He is considered the
father of the data warehouse. He has written about a variety of topics on the
building, usage and maintenance of the warehouse and the Corporate
Information Factory.

How does a data warehouse work?

A data warehouse works as a central repository (storehouse) where information
arrives from one or more data sources. Data flows into a data warehouse from
transactional systems and other relational databases.

Data may be:

1. Structured
2. Semi-structured
3. Unstructured data

The data is processed, transformed, and ingested (consumed) so that users can
access the processed data in the Data Warehouse through Business Intelligence
tools, SQL clients, and spreadsheets. A data warehouse merges information
coming from different sources into one comprehensive database.

By merging all of this information in one place, an organization can analyze its
customers more holistically. This helps to ensure that it has considered all the
information available. Data warehousing makes data mining possible. Data mining
is looking for patterns in the data that may lead to higher sales and profits.

Types of Data Warehouse

Three main types of Data Warehouses (DWH) are:

1. Enterprise Data Warehouse (EDW):

An Enterprise Data Warehouse (EDW) is a centralized warehouse. It provides decision
support services across the enterprise. It offers a unified approach for organizing
and representing data. It also provides the ability to classify data according to
subject and give access according to those divisions.

2. Operational Data Store:

An Operational Data Store (ODS) is a data store required when neither the data
warehouse nor the OLTP (Online Transaction Processing) systems support an
organization's reporting needs. In an ODS, the data store is refreshed in real
time. Hence, it is widely preferred for routine activities like storing records
of employees.

3. Data Mart:

A data mart is a subset of the data warehouse. It is specially designed for a particular
line of business, such as sales or finance. In an independent data mart, data can
be collected directly from sources.
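
The idea of a data mart as a subset of the warehouse can be sketched with a database view. This is only one possible way to build a dependent data mart, chosen here for brevity; the table `warehouse_facts`, the view `sales_mart` and the rows are hypothetical, and the example uses Python's `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The warehouse holds facts for every line of business.
CREATE TABLE warehouse_facts (department TEXT, item TEXT, amount REAL);
INSERT INTO warehouse_facts VALUES
    ('sales', 'pen', 10.0),
    ('finance', 'audit fee', 500.0),
    ('sales', 'book', 120.0);

-- The sales data mart exposes only the rows relevant to the sales team.
CREATE VIEW sales_mart AS
    SELECT item, amount FROM warehouse_facts WHERE department = 'sales';
""")

sales_rows = conn.execute("SELECT item, amount FROM sales_mart ORDER BY item").fetchall()
print(sales_rows)
```

An independent data mart, by contrast, would be loaded straight from the source systems rather than derived from the warehouse.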

General stages of Data Warehouse

Early on, organizations made relatively simple use of data warehousing. Over
time, more sophisticated uses of data warehousing began.

The following are general stages of use of the data warehouse (DWH):

Offline Operational Database:

In this stage, data is just copied from an operational system to another server. In
this way, loading, processing, and reporting of the copied data do not impact the
operational system’s performance.

Offline Data Warehouse:

In this stage, data in the data warehouse is regularly updated from the operational
database. The data in the data warehouse is mapped and transformed to meet the data
warehouse objectives.

Real-time Data Warehouse:

In this stage, the data warehouse is updated whenever a transaction takes place in
the operational database, for example in an airline or railway booking system.

Integrated Data Warehouse:

In this stage, the data warehouse is updated continuously as the operational
system performs transactions. The data warehouse then generates transactions
which are passed back to the operational system.

Components of Data warehouse

Four components of Data Warehouses are:

Load manager: The load manager is also called the front component. It performs all
the operations associated with the extraction and loading of data into the
warehouse. These operations include transformations to prepare the data for
entering into the data warehouse.

Warehouse Manager: The warehouse manager performs operations associated with the
management of the data in the warehouse. These include analysis of data to ensure
consistency, creation of indexes and views, aggregation, transformation and
merging of source data, and archiving and backing up data.

Query Manager: The query manager is also known as the backend component. It
performs all the operations related to the management of user queries. This
component directs queries to the appropriate tables and schedules the execution
of queries.
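
The division of work between the warehouse manager and the query manager can be sketched as follows. This is a simplified illustration, not how any particular product implements these components; the function names, table names and sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (sale_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [
        ("2021-01-05", "West", 100.0),
        ("2021-01-20", "West", 150.0),
        ("2021-02-11", "East", 200.0),
    ],
)

def warehouse_manager(conn):
    """Create an index and a pre-computed monthly aggregate table."""
    conn.execute("CREATE INDEX idx_sales_region ON fact_sales (region)")
    conn.execute(
        """
        CREATE TABLE agg_monthly_sales AS
        SELECT substr(sale_date, 1, 7) AS month, region, SUM(amount) AS total
        FROM fact_sales
        GROUP BY substr(sale_date, 1, 7), region
        """
    )

def query_manager(conn, level):
    """Direct a request to the summary table or to the detail table."""
    if level == "monthly":
        sql = "SELECT month, region, total FROM agg_monthly_sales ORDER BY month"
    else:
        sql = "SELECT sale_date, region, amount FROM fact_sales ORDER BY sale_date"
    return conn.execute(sql).fetchall()

warehouse_manager(conn)
monthly = query_manager(conn, "monthly")
detail = query_manager(conn, "detail")
print(monthly)
```

Answering the monthly request from the small aggregate table instead of the full detail table is the kind of routing decision the query manager makes.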

End-user access tools:

These are categorized into five groups:

1. Data reporting tools
2. Query tools
3. Application development tools
4. EIS (Executive Information System) tools
5. OLAP tools and data mining tools

Who needs Data warehouse?

A data warehouse is needed by all types of users, such as:

• Decision makers who rely on massive amounts of data.
• Users who use customized, complex processes to obtain information from
multiple data sources.
• People who want simple technology to access the data.
• People who want a systematic approach for making decisions.
• Users who want fast performance on huge amounts of data, which is a
necessity for reports, grids, or charts.
• Users who want to discover 'hidden patterns' of data flows and groupings;
for them, a data warehouse is the first step.

What Is a Data Warehouse Used For?

Here are the most common sectors where data warehouses are used:

Airline:

In the airline sector, it is used for operational purposes such as crew assignment,
analysis of route profitability, and frequent flyer program promotions.

Banking:

It is widely used in the banking sector to manage the available resources
effectively. A few banks also use it for market research and for performance
analysis of products and operations.

Healthcare:

The healthcare sector also uses data warehouses to prepare strategies and predict
outcomes, generate patients' treatment reports, and share data with tie-in insurance
companies, medical aid services, etc.

Public sector:

In the public sector, a data warehouse is used for intelligence gathering. It helps
government agencies maintain and analyze tax records and health policy records
for every individual.

Investment and Insurance sector:

In this sector, the warehouses are primarily used to analyze data patterns, customer
trends, and to track market movements.

Retail chain:

In retail chains, a data warehouse is widely used for distribution and marketing. It
also helps to track items, customer buying patterns, and promotions, and is used for
determining pricing policy.

Telecommunication:

A data warehouse is used in this sector for product promotions, sales decisions and
to make distribution decisions.

Hospitality Industry:

This industry utilizes warehouse services to design as well as estimate their
advertising and promotion campaigns where they want to target clients based on
their feedback and travel patterns.

Steps to Implement Data Warehouse

The best way to address the business risk associated with a data warehouse
implementation is to employ a three-part strategy, as below:

1. Enterprise strategy: Here we identify the technical aspects, including the
current architecture and tools. We also identify facts, dimensions, and
attributes. Data mapping and transformation are also addressed.
2. Phased delivery: Data warehouse implementation should be phased based
on subject areas. Related business entities like booking and billing should be
implemented first and then integrated with each other.
3. Iterative prototyping: Rather than a big bang approach to implementation,
the data warehouse should be developed and tested iteratively.

Best practices to implement a Data Warehouse

• Decide on a plan to test the consistency, accuracy, and integrity of the data.
• The data warehouse must be well integrated, well defined, and time stamped.
• While designing a data warehouse, make sure you use the right tools, stick to the
life cycle, take care of data conflicts, and are ready to learn from your mistakes.
• Never replace operational systems and reports.
• Don't spend too much time on extracting, cleaning, and loading data.
• Ensure that all stakeholders, including business personnel, are involved in the
data warehouse implementation process. Establish that data warehousing is
a joint/team project. You don't want to create a data warehouse that is not
useful to the end users.
• Prepare a training plan for the end users.
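
The first practice in the list, testing the consistency, accuracy, and integrity of the data, could be sketched as a handful of automated checks. The record layout (`id`, `order_id`, `customer_id`, `amount`) and the three rules below are assumptions chosen for illustration; a real plan would use the organization's own rules.

```python
def check_quality(customers, orders):
    """Return a list of human-readable problems found in the two tables."""
    problems = []
    ids = [c["id"] for c in customers]

    # Integrity: customer ids (the primary key) must be unique.
    if len(ids) != len(set(ids)):
        problems.append("duplicate customer id")

    # Accuracy: every order amount must be present and non-negative.
    for order in orders:
        if order["amount"] is None or order["amount"] < 0:
            problems.append(f"bad amount in order {order['order_id']}")

    # Consistency: every order must refer to a known customer.
    known = set(ids)
    for order in orders:
        if order["customer_id"] not in known:
            problems.append(
                f"order {order['order_id']} has unknown customer {order['customer_id']}"
            )
    return problems

# Hypothetical sample data containing one problem of each kind.
customers = [{"id": 1}, {"id": 2}, {"id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 250.0},
    {"order_id": 11, "customer_id": 3, "amount": -5.0},
]
problems = check_quality(customers, orders)
for problem in problems:
    print(problem)
```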

Advantages & Disadvantages

Advantages of Data Warehouse (DWH):

• A data warehouse allows business users to quickly access critical data from
several sources, all in one place. This saves users the time of retrieving data
from multiple sources.
• A data warehouse provides consistent information on various cross-functional
activities. It also supports reporting and querying.
• A data warehouse helps to integrate many sources of data to reduce stress on
the production system.
• A data warehouse helps to reduce total turnaround time for analysis and
reporting.
• Restructuring and integration make the data easier for the user to use for
reporting and analysis.
• A data warehouse stores a large amount of historical data. This helps users to
analyze different time periods and trends to make future predictions.

Disadvantages of Data Warehouse:

• It is not an ideal option for unstructured data.
• Creating and implementing a data warehouse is a time-consuming affair.
• A data warehouse can become outdated relatively quickly.
• It is difficult to make changes in data types and ranges, data source schema,
indexes, and queries.
• A data warehouse may seem easy, but it is actually too complex for the
average user.
• Despite best efforts at project management, the scope of a data warehousing
project tends to keep increasing.
• Sometimes warehouse users will develop different business rules.
• Organizations need to spend a lot of their resources on training and
implementation.

Goals of Data Warehousing

• To help reporting as well as analysis
• Maintain the organization's historical information
• Be the foundation for decision making.

The Future of Data Warehousing

• Changes in regulatory constraints may limit the ability to combine disparate
sources of data. These disparate sources may include unstructured data, which
is difficult to store.
• As the size of databases grows, the estimates of what constitutes a very
large database continue to grow. It is complex to build and run data
warehouse systems which are always increasing in size. The hardware and
software resources available today do not allow a large amount of data to be
kept online.
• Multimedia data cannot be manipulated as easily as text data, whereas textual
information can be retrieved by the relational software available today. This
could be a research subject.

Data Warehouse Tools

There are many data warehousing tools available in the market. Here are
some of the most prominent ones:

1. MarkLogic:

MarkLogic is a useful data warehousing solution that makes data integration easier
and faster using an array of enterprise features. This tool helps to perform very
complex search operations. It can query different types of data like documents,
relationships, and metadata.

2. Oracle:

Oracle is an industry-leading database. It offers a wide range of data warehouse
solutions, both on-premises and in the cloud. It helps to optimize customer
experiences by increasing operational efficiency.

3. Amazon Redshift:

Amazon Redshift is a data warehouse tool. It is a simple and cost-effective tool to
analyze all types of data using standard SQL and existing BI tools. It also allows
running complex queries against petabytes of structured data, using the technique
of query optimization.

Unit No – 4


Data Analytics
Data Analysis -

In simple words, data analysis is the process of collecting and organizing data in
order to draw helpful conclusions from it. The process of data analysis uses
analytical and logical reasoning to gain information from the data.

Data Analysis is the process of systematically applying statistical and/or logical
techniques to describe and illustrate, condense and recap, and evaluate data.
Indeed, researchers generally analyze for patterns in observations through the
entire data collection phase.

Data analytics can be defined as a process of examining data sets with the help of
various software or specialized systems to draw conclusions from them.
Nowadays, data analytics has become one of the most crucial parts of commercial
industries. Data analytics enables organizations to make more-informed business
decisions based on scientific data and research.

Data analytics (DA) is the process of examining data sets in order to find trends
and draw conclusions about the information they contain. Increasingly, data
analytics is done with the aid of specialized systems and software. Data analytics
technologies and techniques are widely used in commercial industries to enable
organizations to make more-informed business decisions. Scientists and
researchers also use analytics tools to verify or disprove scientific models, theories
and hypotheses.

Data analytics is important because it helps businesses optimize their
performance. Implementing it into the business model means companies can
reduce costs by identifying more efficient ways of doing business and by storing
large amounts of data.

Today, many data analytics techniques use specialized systems and software that
integrate machine learning algorithms, automation and other capabilities.

Data Scientists and Analysts use data analytics techniques in their research, and
businesses also use it to inform their decisions. Data analysis can help companies
better understand their customers, evaluate their ad campaigns, personalize
content, create content strategies and develop products. Ultimately, businesses can
use data analytics to boost business performance and improve their bottom line.

For businesses, the data they use may include historical data or new information
they collect for a particular initiative. They may also collect it first-hand from their
customers and site visitors or purchase it from other organizations. Data a company
collects about its own customers is called first-party data, data a company obtains
from a known organization that collected it is called second-party data, and
aggregated data a company buys from a marketplace is called third-party data. The
data a company uses may include information about an audience’s demographics,
their interests, behaviors and more.

Ways to Use Data Analytics

Now that you have looked at what data analytics is, let’s understand how we can
use data analytics.

1. Improved Decision Making: Data analytics eliminates guesswork and manual
tasks, be it choosing the right content, planning marketing campaigns, or
developing products. Organizations can use the insights they gain from data
analytics to make informed decisions, leading to better outcomes and
customer satisfaction.

2. Better Customer Service: Data analytics allows you to tailor customer service
according to their needs. It also provides personalization and builds stronger
relationships with customers. Analyzed data can reveal information about
customers’ interests, concerns, and more. It helps you give better recommendations
for products and services.

3. Efficient Operations: With the help of data analytics, you can streamline your
processes, save money, and boost production. With an improved understanding of
what your audience wants, you spend less time creating ads and content that are
not in line with your audience's interests.

4. Effective Marketing: Data analytics gives you valuable insights into how your
campaigns are performing. This helps in fine-tuning them for optimal outcomes.
Additionally, you can also find potential customers who are most likely to interact
with a campaign and convert into leads.

Types of Data Analytics

There are four types of data analytics: descriptive analytics, diagnostic analytics,
predictive analytics, and prescriptive analytics. Here we will look at each of
the four types.

1. Descriptive Analytics

Descriptive analytics answers the question of what happened. It transforms
raw data from numerous sources to give valuable insight into the past.
However, these findings merely signal that something is right or wrong,
without explaining why.

2. Diagnostic Analytics

At this stage, historical data can be measured against other data to answer
the question of why something happened. Diagnostic analytics provides
in-depth insight into a specific problem.

3. Predictive Analytics

As the name hints, predictive analytics tells what is likely to happen. It uses
the findings of descriptive and diagnostic analytics to detect clusters and
exceptions and to predict future trends, which makes it a valuable tool
for forecasting.

Predictive analytics belongs to the advanced analytics types and brings
many advantages, such as sophisticated analysis based on machine or deep
learning and the proactive approach that predictions enable.

Nevertheless, a forecast is only an estimate, the accuracy of which depends
heavily on data quality and the stability of the situation, so it requires
careful treatment and continuous optimization.

4. Prescriptive Analytics

The purpose of prescriptive analytics is to prescribe what action to take to
eliminate a future problem or take full advantage of a promising trend.
Prescriptive analytics uses advanced tools and technologies, such as
machine learning, business rules, and algorithms, which makes it
sophisticated to implement and manage.

Also, this state-of-the-art type of data analytics requires not only historical
internal data but also external information, because of the nature of the
algorithms it is based on.
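
The four types can be compared side by side on one small data set. The sketch below is only an illustration: the sales figures, the regional split, the stock level and the simple reorder rule are all hypothetical, and the forecast uses an ordinary least-squares straight line rather than any machine learning method.

```python
from statistics import mean

# Hypothetical units sold in months 1 to 5, in total and by region.
monthly_sales = [100, 120, 140, 160, 180]
sales_by_region = {
    "West": [60, 75, 90, 105, 120],
    "East": [40, 45, 50, 55, 60],
}

# 1. Descriptive: what happened?
average_sales = mean(monthly_sales)

# 2. Diagnostic: why did it happen? Compare the growth of each region.
growth_by_region = {region: v[-1] - v[0] for region, v in sales_by_region.items()}
main_driver = max(growth_by_region, key=growth_by_region.get)

# 3. Predictive: what is likely to happen? Fit a straight line by least squares.
def fit_line(ys):
    xs = range(1, len(ys) + 1)
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar

slope, intercept = fit_line(monthly_sales)
forecast_month_6 = slope * 6 + intercept

# 4. Prescriptive: what should we do? A simple business rule on top of the forecast.
stock_on_hand = 150
reorder_quantity = max(0, round(forecast_month_6) - stock_on_hand)

print(average_sales, main_driver, forecast_month_6, reorder_quantity)
```

Each step builds on the one before it, which mirrors how the four types rise in complexity.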

Data Mining

Data mining is the process of discovering actionable information from large sets of
data. Data mining uses mathematical analysis to derive patterns and trends that
exist in data.

Data mining, or knowledge discovery from data (KDD), is the process of
uncovering trends, common themes or patterns in “big data”. For example, an early
form of data mining was used by companies to analyze huge amounts of scanner
data from supermarkets.

Data mining is a process used by companies to turn raw data into useful
information. By using software to look for patterns in large batches of data,
businesses can learn more about their customers to develop more effective
marketing strategies, increase sales and decrease costs.

Data mining is the process of searching large sets of data for patterns
and trends that can't be found using simple analysis techniques. Data mining has
several types, including pictorial data mining, text mining, social media mining,
web mining, and audio and video mining, among others.

The main purpose of data mining

Data mining is the process of uncovering patterns and finding anomalies and
relationships in large datasets that can be used to make predictions about future
trends. The main purpose of data mining is to extract valuable information from
available data.
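
One very simple way of "finding anomalies" is to flag values that lie far from the average. The sketch below uses the z-score (distance from the mean measured in standard deviations); the transaction counts and the threshold of 2.0 are hypothetical choices for illustration, and real data mining tools use more robust methods.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical number of card transactions per day for one customer.
daily_transactions = [102, 98, 101, 99, 100, 103, 97, 250]
anomalies = find_anomalies(daily_transactions)
print(anomalies)
```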

Some data mining tools used in the industry are RapidMiner, Oracle Data Mining,
IBM SPSS Modeler, KNIME, Orange (Python), Kaggle, Rattle, Weka, and
Teradata.

DATA MINING USES
Data mining is used in various fields like research, business, marketing, sales,
product development, education, and healthcare. When used appropriately, data
mining provides a significant advantage over competitors by providing more
information about customers, and it helps to develop better and more effective
marketing strategies, which raise revenue and lower costs. In order to achieve
excellent results from data mining, a number of tools and techniques are
required. Some of the most common data mining concepts are:
• Data cleansing and preparation: in this step, data is transformed into a form
suitable for further processing and analysis, for example by identifying and
removing errors and handling missing data.
• Artificial intelligence (AI): these systems perform analytical activities that
are associated with human intelligence, such as reasoning, planning, learning,
and problem-solving.
• Association rule learning: also known as market basket analysis, these tools
look for relationships between variables in the dataset, such as determining
which products customers purchase together.
• Clustering: a process in which the dataset is partitioned into groups of
similar items called clusters, which helps users understand the structure or
natural groupings in the data.
• Classification: this technique assigns the items in the dataset to classes,
with the goal of predicting the target class for each case in the data.
• Data analytics: the process of evaluating digital information and converting
it into information useful for business.
• Data warehousing: a large collection of data used for decision making in
organizations; it is the foundational component of most large-scale data
mining efforts.
• Machine learning: a programming technique that uses statistical
probabilities to give computers the capacity to 'learn' without being
explicitly programmed.
• Regression: a technique used to predict a range of numeric values, such as
sales, stock prices, or temperatures, based on a particular dataset.
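
Of the concepts listed above, association rule learning is easy to show on a small example. The sketch below counts how often pairs of items are bought together and reports two standard measures: support (the share of baskets containing both items) and confidence (the share of baskets with the first item that also contain the second). The baskets and the minimum support of 0.4 are hypothetical, and only rules between pairs of items are considered.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

def pair_rules(baskets, min_support=0.4):
    """Return {(lhs, rhs): (support, confidence)} for frequent item pairs."""
    n = len(baskets)
    item_count = Counter(item for basket in baskets for item in basket)
    pair_count = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue  # the pair is bought together too rarely to matter
        for lhs, rhs in ((a, b), (b, a)):
            rules[(lhs, rhs)] = (support, count / item_count[lhs])
    return rules

rules = pair_rules(baskets)
for (lhs, rhs), (support, confidence) in sorted(rules.items()):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

A rule such as "milk -> butter" with high confidence suggests that the two products could be placed or promoted together.
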
ADVANTAGES OF DATA MINING
Since we live and work in a data-centric world, it’s essential to get as many
advantages as possible. Data mining provides us with the means of resolving
problems and issues in this challenging information age. Data mining benefits
include:

• It helps companies gather reliable information

• It’s an efficient, cost-effective solution compared to other data applications

• It helps businesses make profitable production and operational adjustments

• Data mining uses both new and legacy systems

• It helps businesses make informed decisions

• It helps detect credit risks and fraud

• It helps data scientists easily analyze enormous amounts of data quickly

• Data scientists can use the information to detect fraud, build risk models, and
improve product safety

• It helps data scientists quickly initiate automated predictions of behaviors and
trends and discover hidden patterns

Disadvantages of Data Mining

Nothing’s perfect, including data mining. These are the major issues in data
mining:

• Many data analytics tools are complex and challenging to use. Data scientists
need the right training to use the tools effectively.

• Speaking of the tools, different ones work with varying types of data mining,
depending on the algorithms they employ. Thus, data analysts must be sure to
choose the correct tools.

• Data mining techniques are not infallible, so there’s always the risk that the
information isn’t entirely accurate. This obstacle is especially relevant if there’s
a lack of diversity in the dataset.

• Companies can potentially sell the customer data they have gleaned to other
businesses and organizations, raising privacy concerns.

• Data mining requires large databases, making the process hard to manage.

WHY IS DATA MINING IMPORTANT?

Data mining is important in marketing for exploring ever-growing databases and
improving market segmentation. Valuable insight can be gained by extracting and
combining data from various sources. Data mining is thus designed to explore
data.

WHAT IS THE HISTORY OF DATA MINING?

The concept of data mining has been used by businesses for more than a century. It
is not a new invention of the digital age. It became a great public focus in the
1930s, and it first came into the limelight in 1936, when Alan Turing introduced
the concept of a universal machine able to perform computations similar to those
of present-day computers.
Many improvements have taken place since then. To improve several aspects of
their work, including sales processes and the interpretation of financials for
investment purposes, businesses have adopted machine learning and are making use
of data mining. As a result, data scientists all over the world have become vital
to business establishments, enabling them to achieve greater business goals with
data science.
HOW DATA MINING WORKS
Data mining involves exploring and analyzing large quantities of information to
derive relevant patterns and trends. Data mining has many uses, such as
credit risk management, database marketing, spam email filtering, fraud detection,
and gauging the opinion and sentiment of users.

The data mining process is divided into five steps.

a) First, data is collected and loaded into the data warehouse.
b) Then the data is stored and managed, either in the cloud or on in-house
servers.
c) Management teams, business analysts, and information technology
professionals assess the data and determine how to organize it.
d) Application software then sorts the data based on the users' results.
e) Finally, the end user presents the data in a format that is easy to share,
such as a graph or a table.
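
The five steps above can be pictured as a small pipeline. The sketch below is a toy illustration only: the sales records are hypothetical, a Python list stands in for the data warehouse, and "organizing" is reduced to totalling units per product.

```python
# Hypothetical raw records: (date, product, units sold).
raw_sales = [
    ("2021-03-01", "tea", 3),
    ("2021-03-01", "coffee", 5),
    ("2021-03-02", "tea", 4),
    ("2021-03-02", "sugar", 1),
]

def collect(records):
    """Steps a and b: collect the data and keep it in a store (here, a list)."""
    return list(records)

def organize(records):
    """Step c: organize the data, here as total units per product."""
    totals = {}
    for _date, product, units in records:
        totals[product] = totals.get(product, 0) + units
    return totals

def sort_results(totals):
    """Step d: sort the organized data, best-selling product first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

def present(rows):
    """Step e: present the result as a plain-text table that is easy to share."""
    lines = [f"{'Product':<10}{'Units':>5}"]
    lines += [f"{product:<10}{units:>5}" for product, units in rows]
    return "\n".join(lines)

warehouse = collect(raw_sales)
ranked = sort_results(organize(warehouse))
print(present(ranked))
```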

Data Mining Applications

Data mining is a useful and versatile tool for today’s competitive businesses. Here
are some data mining examples, showing a broad range of applications.

Banks

Data mining helps banks work with credit ratings and anti-fraud systems,
analyzing customer financial data, purchasing transactions, and card transactions.
Data mining also helps banks better understand their customers’ online habits and
preferences, which helps when designing a new marketing campaign.

Healthcare

Data mining helps doctors create more accurate diagnoses by bringing together
every patient’s medical history, physical examination results, medications, and
treatment patterns. Mining also helps fight fraud and waste and bring about a more
cost-effective health resource management strategy.

Marketing

If there was ever an application that benefitted from data mining, it’s marketing!
After all, marketing’s heart and soul is all about targeting customers effectively for
maximum results. Of course, the best way to target your audience is to know as
much about them as possible. Data mining helps bring together data on age,
gender, tastes, income level, location, and spending habits to create more effective
personalized loyalty campaigns. Data mining can even predict which customers
are more likely to unsubscribe from a mailing list or other related service. Armed with
that information, companies can take steps to retain those customers before they
get the chance to leave!

Retail

The world of retail and marketing go hand-in-hand, but the former still warrants its
separate listing. Retail stores and supermarkets can use purchasing patterns to
narrow down product associations and determine which items should be stocked in
the store and where they should go. Data mining also pinpoints which campaigns
get the most response.

Cloud Computing

Cloud computing is the practice of using a network of remote servers hosted on the
internet to store, manage, and process data, rather than a local server or a
personal computer.

Simply put, cloud computing is the delivery of computing services (including
servers, storage, databases, networking, software, analytics, and intelligence)
over the Internet (“the cloud”) to offer faster innovation, flexible resources, and
economies of scale.

In the simplest terms, cloud computing means storing and accessing data and
programs over the internet instead of your computer's hard drive.

Cloud computing is named as such because the information being accessed is
found remotely in the cloud or a virtual space. Companies that provide cloud
services enable users to store files and applications on remote servers and then
access all the data via the Internet.

Dropbox - a file storage and sharing system.

Microsoft Azure - offers backup and disaster recovery services, hosting, and more.
Rackspace - offers data, security, and infrastructure services.

Facebook is one of the biggest tech companies that does not use AWS or
Azure. In fact, Facebook does not use any public cloud to store its data.
Facebook runs its own infrastructure to meet its needs, as Facebook already had
a very large number of users at a time when AWS was still developing, back in
2009.

Google Cloud is a suite of cloud computing services that runs on the same
infrastructure that Google uses internally for their own consumer products, such as
Google Search, Gmail, and YouTube.

Who owns the data in cloud computing?

The short answer is that you own the data you create, but the cloud service
provider has ultimate control over it. This is reflected in many providers' terms of
service, which state that they can hold on to the data to comply with legal
regulations.

Purpose of Cloud Computing


The goal of cloud computing is to allow users to benefit from all of these
technologies without the need for deep knowledge of, or expertise with, each
one of them. The cloud aims to cut costs and help users focus on their core
business instead of being impeded by IT obstacles.

The main enabling technology for cloud computing is virtualization. Virtualization
software separates a physical computing device into one or more "virtual" devices,
each of which can be easily used and managed to perform computing tasks.
With operating system–level virtualization essentially creating a scalable system of
multiple independent computing devices, idle computing resources can be
allocated and used more efficiently. Virtualization provides the agility required to
speed up IT operations and reduces cost by increasing infrastructure utilization.
Autonomic computing automates the process through which the user can provision
resources on-demand. By minimizing user involvement, automation speeds up the
process, reduces labor costs and reduces the possibility of human errors.
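
The idea of splitting one physical machine into several virtual machines, and
provisioning them on demand, can be illustrated with a small Python sketch. This
is only a toy model, not how a real hypervisor works: the class PhysicalHost, its
methods, and the core counts are all assumed names and numbers chosen for
illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalHost:
    """Toy model of one physical machine whose CPU cores are shared by VMs."""
    name: str
    total_cores: int
    vms: dict = field(default_factory=dict)  # VM name -> cores assigned

    def free_cores(self) -> int:
        """Cores that are still idle and can be given to a new VM."""
        return self.total_cores - sum(self.vms.values())

    def provision(self, vm_name: str, cores: int) -> bool:
        """Create a VM on demand, but only if enough idle cores remain."""
        if cores <= 0 or vm_name in self.vms or cores > self.free_cores():
            return False
        self.vms[vm_name] = cores
        return True

    def release(self, vm_name: str) -> None:
        """Remove a VM and return its cores to the idle pool."""
        self.vms.pop(vm_name, None)

    def utilization(self) -> float:
        """Fraction of the host's cores currently assigned to VMs."""
        return sum(self.vms.values()) / self.total_cores


# Hypothetical example: a 16-core host shared by two virtual machines.
host = PhysicalHost("host-1", total_cores=16)
host.provision("web", 4)
host.provision("db", 8)
print(host.utilization())  # 0.75
```

Without virtualization, the "web" and "db" workloads would each need their own
machine; here they share one host, so fewer cores sit idle.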

Cloud computing uses concepts from utility computing to provide metrics for the
services used. Cloud computing attempts to address QoS (quality of service)
and reliability problems of other grid computing models.
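
Metering of services, as in utility computing, can be sketched as follows. The
metric names and the prices in RATES are assumptions made up for this example;
real providers publish their own price lists and measure many more metrics.

```python
# Hypothetical prices per unit of usage (illustrative numbers only).
RATES = {
    "vm_hours": 5.0,          # price per hour a virtual machine runs
    "storage_gb_month": 2.0,  # price per GB stored for one month
    "network_gb": 1.0,        # price per GB transferred
}


def monthly_bill(usage: dict) -> float:
    """Charge only for what was actually used, metric by metric."""
    total = 0.0
    for metric, quantity in usage.items():
        if metric not in RATES:
            raise KeyError(f"no rate defined for metric: {metric}")
        total += RATES[metric] * quantity
    return total


# Hypothetical usage recorded for one customer in one month.
usage = {"vm_hours": 100, "storage_gb_month": 50, "network_gb": 20}
print(monthly_bill(usage))  # 620.0
```

A customer who uses nothing in a month is billed nothing, which is the same
principle as an electricity or water meter.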

Need for Cloud Computing


All you need is an internet connection and you can access your files from any
device, anywhere. There are several types of cloud storage available including
block, file and object storage. Each fits different use cases, from shared
filesystems to block-based volumes and backup and archiving systems.
Some of the most common reasons to use the cloud are:

• File storage: You can store all types of information in the cloud, including files and
email.
• File sharing: The cloud makes it easy to share files with several people at the same
time.
• Backing up data: You can also use the cloud to protect your files.

Advantages of Cloud Computing

1. Cost Savings
If you are worried about the price tag that would come with making the switch to
cloud computing, you aren't alone: 20% of organizations are concerned about the
initial cost of implementing a cloud-based server. But those who are attempting to
weigh the advantages and disadvantages of using the cloud need to consider more
factors than just the initial price; they also need to consider return on investment (ROI).

Once you're on the cloud, easy access to your company's data will save time and
money in project startups. And, for those who are worried that they'll end up
paying for features that they neither need nor want, most cloud-computing services
are pay-as-you-go. This means that if you don't take advantage of what the cloud
has to offer, you at least won't be paying for it.
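
ROI can be worked out with a simple formula: ROI = (total benefit - total cost) /
total cost. The Python sketch below applies it to a cloud migration. The function
names and all of the figures are hypothetical and only show how the calculation
is done.

```python
def roi(total_benefit: float, total_cost: float) -> float:
    """Return on investment as a fraction: (benefit - cost) / cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_benefit - total_cost) / total_cost


def payback_years(initial_cost: float, yearly_saving: float) -> float:
    """Years needed for the yearly savings to cover the initial cost."""
    if yearly_saving <= 0:
        raise ValueError("yearly_saving must be positive")
    return initial_cost / yearly_saving


# Hypothetical figures: a migration costing 200000 that saves 100000 per year.
initial_cost = 200000
yearly_saving = 100000
three_year_benefit = yearly_saving * 3

print(roi(three_year_benefit, initial_cost))       # 0.5, i.e. 50% over 3 years
print(payback_years(initial_cost, yearly_saving))  # 2.0
```

In this made-up case the initial cost looks high, but it is recovered in two
years, which is why the initial price alone is not enough to judge the decision.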

2. Security
Many organisations have security concerns when it comes to adopting a cloud-
computing solution.

However, a cloud host's full-time job is to carefully monitor security, which is
significantly more efficient than a conventional in-house system, where an
organisation must divide its efforts among countless IT concerns, with
security being only one of them. And while most businesses don't like to openly
consider the possibility of internal data theft, the truth is that a shockingly high
percentage of data thefts occur internally and are perpetrated by employees. When
this is the case, it can actually be much safer to keep sensitive information offsite.

3. Flexibility
Your business has only a finite amount of focus to divide between all of its
responsibilities. If your current IT solutions are forcing you to commit too much of
your attention to computer and data-storage issues, then you aren't going to be able
to concentrate on reaching business goals and satisfying customers.

The cloud offers businesses more flexibility overall versus hosting on a local
server. And, if you need extra bandwidth, a cloud-based service can meet that
demand instantly, rather than undergoing a complex (and expensive) update to
your IT infrastructure. This improved freedom and flexibility can make a
significant difference to the overall efficiency of your organisation.

4. Mobility
Cloud computing allows mobile access to corporate data via smartphones and
devices, which, considering over 2.6 billion smartphones are being used
globally today, is a great way to ensure that no one is ever left out of the loop. Staff
with busy schedules, or who live a long way away from the corporate office, can
use this feature to keep instantly up to date with clients and co-workers.

Through the cloud, you can offer conveniently accessible information to sales staff
who travel, freelance employees, or remote employees, for better work-life
balance.

5. Insight
As we move further into the digital age, it is becoming ever clearer that
the old adage “knowledge is power” has taken on the more modern and accurate
form: “Data is money.” Hidden within the millions of bits of data that surround
your customer transactions and business processes are nuggets of invaluable,
actionable information just waiting to be identified and acted upon. Of course,
sifting through that data to find these kernels can be very difficult, unless you have
access to the right cloud-computing solution. Many cloud-based storage solutions
offer integrated cloud analytics for a bird's-eye view of your data. With your
information stored in the cloud, you can easily implement tracking mechanisms
and build customised reports to analyse information organisation-wide. From those
insights, you can increase efficiencies and build action plans to meet organisational
goals.

6. Increased Collaboration
If your business has two employees or more, then you should be making
collaboration a top priority. After all, there isn't much point to having a team if it is
unable to work like a team. Cloud computing makes collaboration a simple
process. Team members can view and share information easily and securely across
a cloud-based platform. Some cloud-based services even provide collaborative
social spaces to connect employees across your organisation, therefore increasing
interest and engagement. Collaboration may be possible without a cloud-
computing solution, but it will never be as easy, nor as effective.

7. Quality Control
There are few things as detrimental to the success of a business as poor-quality and
inconsistent reporting. In a cloud-based system, all documents are stored in one
place and in a single format. With everyone accessing the same information, you
can maintain consistency in data, avoid human error, and have a clear record of
any revisions or updates. Conversely, managing information in silos can lead to
employees accidentally saving different versions of documents, which leads to
confusion and diluted data.

8. Disaster Recovery
One of the factors that contributes to the success of a business is control.
Unfortunately, no matter how in control your organisation may be when it comes
to its own processes, there will always be things that are completely out of your
control, and in today's market, even a small amount of unproductive downtime can
have a resoundingly negative effect. Downtime in your services leads to lost
productivity, revenue, and brand reputation.

But while there may be no way for you to prevent or even anticipate the disasters
that could potentially harm your organisation, there is something you can do to
help speed your recovery. Cloud-based services provide quick data recovery for all
kinds of emergency scenarios, from natural disasters to power outages.

9. Loss Prevention
If your organisation isn't investing in a cloud-computing solution, then all of your
valuable data is inseparably tied to the office computers it resides in. This may not
seem like a problem, but the reality is that if your local hardware experiences a
problem, you might end up permanently losing your data. This is a more common
problem than you might realise: computers can malfunction for many reasons, from
virus infections, to age-related hardware deterioration, to simple user error.

If you aren't on the cloud, you're at risk of losing all the information you had saved
locally. With a cloud-based server, however, all the information you've uploaded to
the cloud remains safe and easily accessible from any computer with an internet
connection, even if the computer you regularly use isn't working.
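
The following Python sketch illustrates the idea of keeping a second copy away
from the office computer and restoring it after a loss. It is an assumption-laden
toy: a temporary folder named "cloud" stands in for the remote storage, and the
functions back_up and restore are hypothetical helpers, not any provider's API.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Checksum used to confirm that two files have identical contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def back_up(source: Path, remote_dir: Path) -> Path:
    """Copy a file to the stand-in remote folder and verify the copy."""
    remote_dir.mkdir(parents=True, exist_ok=True)
    copy = remote_dir / source.name
    shutil.copy2(source, copy)
    if sha256_of(copy) != sha256_of(source):
        raise OSError("backup copy does not match the original")
    return copy


def restore(copy: Path, target: Path) -> None:
    """Bring the file back from the stand-in remote folder."""
    shutil.copy2(copy, target)


with tempfile.TemporaryDirectory() as tmp:
    # A file on the (simulated) office computer.
    local = Path(tmp) / "office_pc" / "sales.csv"
    local.parent.mkdir()
    local.write_text("month,amount\nJan,1200\n")

    copy = back_up(local, Path(tmp) / "cloud")

    local.unlink()         # simulate losing the local file
    restore(copy, local)   # recover it from the backup copy
    recovered = local.read_text()

print(recovered)
```

The checksum comparison is a design choice for the sketch: a backup is only
useful if the stored copy is known to match the original.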

10. Automatic Software Updates


For those who have a lot to get done, there isn't anything more irritating than
having to wait for system updates to be installed. Cloud-based applications
automatically refresh and update themselves, instead of forcing an IT department
to perform a manual organisation-wide update. This saves valuable IT staff time
and money spent on outside IT consultation.

11. Competitive Edge


While cloud computing is increasing in popularity, there are still those who prefer
to keep everything local. That's their choice, but doing so places them at a distinct
disadvantage when competing with those who have the benefits of the cloud at
their fingertips. If you implement a cloud-based solution before your competitors,
you'll be further along the learning curve by the time they catch up.

12. Sustainability
Given the current state of the environment, it's no longer enough for organisations
to place a recycling bin in the breakroom and claim that they're doing their part to
help the planet. Real sustainability requires solutions that address wastefulness at
every level of a business. Hosting on the cloud is more environmentally friendly
and results in a smaller carbon footprint.

Cloud infrastructures support environmental proactivity by powering virtual services
rather than physical products and hardware, cutting down on paper waste,
improving energy efficiency, and reducing commuter-related emissions.
