Difference Between Data Warehousing and Data Mining: Data Warehouse Architecture Three-Tier Data Warehouse Architecture
sources is stored under a single schema. It is then used for reporting and
historical data derived from transaction data. While a Data Warehouse is built
data mining can be carried out with any traditional database, but since a data
warehouse contains quality data, it is good to have data mining over the data
Let us understand the difference between data warehousing and data mining
in detail.
Key Features:
1. Data Warehouse:
2. Cost reduction
Data Mining:
insurance claims, cellular phone calls or credit card purchases are likely
to be fraudulent.
Virtual Warehouse
Data mart
Enterprise Warehouse
Virtual Warehouse
A view over an operational data warehouse is known as a virtual warehouse. A virtual
warehouse is easy to build, but building one requires excess capacity on the
operational database servers.
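The idea of a virtual warehouse as a view over operational data can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the `orders` table, its columns, and the view name `v_monthly_sales` are hypothetical and do not come from the text above.

```python
import sqlite3

# In-memory database standing in for an operational system (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    order_id   INTEGER PRIMARY KEY,
    customer   TEXT,
    amount     REAL,
    order_date TEXT
);
INSERT INTO orders VALUES
    (1, 'alice', 120.0, '2024-01-05'),
    (2, 'bob',    80.0, '2024-01-09'),
    (3, 'alice',  50.0, '2024-02-02');

-- The "virtual warehouse": no data is copied, the view is computed
-- from the operational table each time it is queried.
CREATE VIEW v_monthly_sales AS
SELECT substr(order_date, 1, 7) AS month,
       SUM(amount)              AS total_sales
FROM orders
GROUP BY substr(order_date, 1, 7);
""")

monthly_sales = conn.execute(
    "SELECT month, total_sales FROM v_monthly_sales ORDER BY month"
).fetchall()
print(monthly_sales)
```

Because every query against the view runs on the operational server, reporting load competes with transaction processing, which is why spare capacity on those servers is needed.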
Data Mart
A data mart contains a subset of organization-wide data. This subset of data is valuable
to specific groups of an organization.
In other words, a data mart contains data specific to a particular group.
For example, the marketing data mart may contain data related to items, customers,
and sales. Data marts are confined to subjects.
Points to remember about data marts −
Windows-based or Unix/Linux-based servers are used to implement data marts. They are
implemented on low-cost servers.
The implementation cycle of a data mart is measured in short periods of time, i.e., in weeks
rather than months or years.
The life cycle of a data mart may become complex in the long run if its planning and design
are not organization-wide.
Data marts are small in size.
Data marts are customized by department.
The source of a data mart is a departmentally structured data warehouse.
Data marts are flexible.
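As a rough sketch of the points above, a marketing data mart can be built by copying only the subject tables that the marketing group needs out of the warehouse. The table names follow the items, customers, and sales example; the `payroll` table, the sample rows, and the `marketing_mart` database name are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")           # plays the role of the warehouse
conn.execute("ATTACH DATABASE ':memory:' AS marketing_mart")  # the data mart

conn.executescript("""
CREATE TABLE items     (item_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales     (sale_id INTEGER PRIMARY KEY, item_id INTEGER,
                        customer_id INTEGER, amount REAL);
CREATE TABLE payroll   (employee_id INTEGER PRIMARY KEY, salary REAL);
INSERT INTO items     VALUES (1, 'kettle');
INSERT INTO customers VALUES (1, 'alice');
INSERT INTO sales     VALUES (1, 1, 1, 25.0);
INSERT INTO payroll   VALUES (1, 4000.0);
""")

# Only the subjects relevant to marketing are copied; payroll stays behind.
MARKETING_SUBJECTS = ("items", "customers", "sales")
for table in MARKETING_SUBJECTS:
    conn.execute(
        f"CREATE TABLE marketing_mart.{table} AS SELECT * FROM main.{table}"
    )

mart_tables = sorted(
    row[0]
    for row in conn.execute(
        "SELECT name FROM marketing_mart.sqlite_master WHERE type = 'table'"
    )
)
print(mart_tables)
```

The mart stays small and department-specific because it holds a subset of the warehouse rather than a second copy of everything.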
Enterprise Warehouse
An enterprise warehouse collects all the information and subjects spanning an entire
organization.
It provides enterprise-wide data integration.
The data is integrated from operational systems and external information providers.
This information can vary from a few gigabytes to hundreds of gigabytes, terabytes or
beyond.
Load Manager
This component performs the operations required for the extract and load process.
The size and complexity of the load manager vary between specific solutions, from
one data warehouse to another.
Fast Load
To minimize the total load window, the data needs to be loaded into the warehouse in
the fastest possible time.
Transformations affect the speed of data processing.
It is more effective to load the data into a relational database before applying
transformations and checks.
Gateway technology is not suitable, since it tends not to perform well when large
data volumes are involved.
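The "load first, transform afterwards" approach described above can be sketched as a two-step load through a staging table. This is a minimal illustration using sqlite3; the table names `staging_sales` and `sales`, the sample rows, and the validity check are assumptions, not part of the original material.

```python
import sqlite3

# Raw rows as they might arrive from a source system (hypothetical data).
raw_rows = [
    ("1001", " 19.99 ", "2024-03-01"),
    ("1002", "5.00",    "2024-03-01"),
    ("1003", "n/a",     "2024-03-02"),   # fails the amount check below
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging_sales (txn_id TEXT, amount TEXT, sale_date TEXT)")
conn.execute("CREATE TABLE sales (txn_id INTEGER PRIMARY KEY, amount REAL, sale_date TEXT)")

# Step 1: load the data as-is, with no per-row transformation, so the
# load window is as short as possible.
conn.executemany("INSERT INTO staging_sales VALUES (?, ?, ?)", raw_rows)

# Step 2: apply transformations and checks inside the database, as one
# set-based statement over the staged rows.
conn.execute("""
INSERT INTO sales (txn_id, amount, sale_date)
SELECT CAST(txn_id AS INTEGER), CAST(TRIM(amount) AS REAL), sale_date
FROM staging_sales
WHERE TRIM(amount) GLOB '[0-9]*'
""")
conn.commit()

loaded = conn.execute("SELECT txn_id, amount FROM sales ORDER BY txn_id").fetchall()
print(loaded)
```

Rows that fail the check remain in the staging table, where they can be inspected or corrected without delaying the load itself.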
Simple Transformations
While loading, it may be required to perform simple transformations. Once these have been
completed, we are in a position to do the complex checks. Suppose we are loading
EPOS sales transactions; we need to perform the following checks:
Strip out all the columns that are not required within the warehouse.
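Stripping unneeded columns from an incoming record might look like the sketch below. The column names in `REQUIRED_COLUMNS` and the sample EPOS record are hypothetical; a real feed would define its own layout.

```python
# Columns the warehouse is assumed to keep (hypothetical names).
REQUIRED_COLUMNS = ("txn_id", "store_id", "item_id", "quantity", "amount")


def strip_unneeded_columns(record, required=REQUIRED_COLUMNS):
    """Return a copy of the record containing only the required columns."""
    return {name: record[name] for name in required}


# A sample EPOS sales transaction with extra source-system fields.
epos_record = {
    "txn_id": 1001,
    "store_id": 7,
    "item_id": 42,
    "quantity": 2,
    "amount": 19.98,
    "cashier_name": "J. Doe",    # not needed in the warehouse
    "till_firmware": "v2.1",     # not needed in the warehouse
}

stripped = strip_unneeded_columns(epos_record)
print(stripped)
```

A record that lacks one of the required columns raises a `KeyError` here, which surfaces malformed input early instead of loading it silently.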
Backup/Recovery tool
SQL Scripts
Query Manager
The query manager is responsible for directing queries to the suitable tables.
By directing queries to the appropriate tables, the speed of querying and response
generation can be increased.
The query manager is also responsible for scheduling the execution of the queries posed by the user.
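One way to picture how a query manager directs queries to suitable tables is a small router that picks the smallest table able to answer a request. This is a sketch under assumptions: the catalog entries, column names, and row counts are invented for illustration, and a real query manager would consider far more than column coverage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    name: str
    columns: frozenset
    row_count: int


# Hypothetical catalog: one detail table plus two pre-aggregated summaries.
CATALOG = [
    TableInfo("sales_detail",
              frozenset({"sale_date", "store_id", "item_id", "amount"}),
              50_000_000),
    TableInfo("sales_by_store_month",
              frozenset({"month", "store_id", "amount"}),
              12_000),
    TableInfo("sales_by_month",
              frozenset({"month", "amount"}),
              36),
]


def route_query(needed_columns, catalog=CATALOG):
    """Pick the smallest table that contains every column the query needs."""
    needed = set(needed_columns)
    candidates = [t for t in catalog if needed <= t.columns]
    if not candidates:
        raise LookupError(f"no table provides columns {sorted(needed)}")
    return min(candidates, key=lambda t: t.row_count).name


print(route_query({"month", "amount"}))
print(route_query({"month", "store_id", "amount"}))
print(route_query({"item_id", "amount"}))
```

Answering a monthly total from the small summary table instead of scanning the detail table is the kind of speed-up the text refers to.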
Stored procedures
Note − If detailed information is held offline to minimize disk storage, we should make
sure that the data has been extracted, cleaned up, and transformed into a starflake
schema before it is archived.