Data Warehouse Components
Overall architecture
The data sourcing, cleanup, extract, transformation and migration tools have to deal with some significant issues, as follows:
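Database heterogeneity: source DBMSs differ in their data models, data access languages, and data navigation.
Data heterogeneity: the same data is defined and used differently across source systems, for example under different names, codes, or units.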
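Data heterogeneity can be illustrated with a minimal sketch in which two source systems describe the same customer differently. The record layouts, field names, and code values below are assumptions made up for the example; they do not come from any particular tool.

```python
from datetime import date, datetime

# Hypothetical rows from two source systems that describe the same customer.
# Field names, gender codes, and date formats are illustrative assumptions.
crm_row = {"cust_id": "0042", "gender": "F", "signup": "2021-03-15"}
billing_row = {"CustomerNo": 42, "Sex": "female", "SignupDate": "15/03/2021"}

# One shared code table maps every source spelling to a single warehouse code.
GENDER_CODES = {"f": "F", "female": "F", "m": "M", "male": "M"}


def normalize_crm(row):
    """Map a CRM-style row onto the common warehouse layout."""
    return {
        "customer_id": int(row["cust_id"]),
        "gender": GENDER_CODES[row["gender"].lower()],
        "signup_date": date.fromisoformat(row["signup"]),
    }


def normalize_billing(row):
    """Map a billing-style row onto the same warehouse layout."""
    return {
        "customer_id": int(row["CustomerNo"]),
        "gender": GENDER_CODES[row["Sex"].lower()],
        "signup_date": datetime.strptime(row["SignupDate"], "%d/%m/%Y").date(),
    }


# After transformation, both sources agree on names, codes, and types.
unified_crm = normalize_crm(crm_row)
unified_billing = normalize_billing(billing_row)
```

Once both rows share one layout, they can be compared or merged directly, which is the job the cleanup and transformation tools perform at scale.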
Metadata
Metadata is data about data that describes the data warehouse. It is used for building, maintaining, managing, and using the data warehouse. Metadata can be classified into the following:
Technical metadata
Business metadata
Data warehouse operational information such as data history (snapshots, versions), ownership, extract audit trail, usage data
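The three categories above could be held in a structure like the following sketch. The specific fields inside each class (for example source_system or subject_area) are assumptions chosen for illustration; only the three categories and the operational items (history, ownership, audit trail) come from the text.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class TechnicalMetadata:
    # Assumed fields: where the data comes from and how it is stored.
    source_system: str
    target_table: str
    column_types: dict


@dataclass
class BusinessMetadata:
    # Assumed fields: how a business user would recognize the data.
    subject_area: str
    description: str


@dataclass
class OperationalMetadata:
    owner: str
    versions: list = field(default_factory=list)             # data history
    extract_audit_trail: list = field(default_factory=list)  # one entry per load


@dataclass
class TableMetadata:
    technical: TechnicalMetadata
    business: BusinessMetadata
    operational: OperationalMetadata


def record_extract(meta, rows_loaded, when=None):
    """Append one entry to the extract audit trail of a table."""
    when = when or datetime.now(timezone.utc)
    meta.operational.extract_audit_trail.append(
        {"loaded_at": when.isoformat(), "rows": rows_loaded}
    )


sales_meta = TableMetadata(
    technical=TechnicalMetadata("orders_db", "fact_sales", {"amount": "REAL"}),
    business=BusinessMetadata("Sales", "One row per completed sale"),
    operational=OperationalMetadata(owner="sales-data-team"),
)
record_extract(sales_meta, rows_loaded=1200)
```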
Access Tools
The principal purpose of a data warehouse is to provide information to business users for strategic decision making. These users interact with the data warehouse using front-end tools. Many of these tools require an information specialist, a domain expert who can analyze the information and interact with the data warehousing environment in order to reach meaningful conclusions. This is especially true for data mining tools when defining the problem, configuring the tool, and analyzing the results.
Tool Taxonomy
The end-user tools area spans a number of components. For example, all end-user tools use metadata definitions to obtain access to data stored in the warehouse, and some of these tools may employ additional or intermediary data stores. These tools can be divided into five main groups:
Data query and reporting tools
Application development tools
Executive information system (EIS) tools
Online analytical processing (OLAP) tools
Data mining tools
The strategic value of data mining is time-sensitive, especially in the retail, marketing, and finance sectors. Using data mining to build predictive models in decision making has several benefits.
A model should explain why a particular decision was made. Adjusting a model based on feedback from future decisions will lead to experience accumulation and true organizational learning. Finally, a predictive model can be used to automate a decision step in a larger process.
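The three benefits (explanation, adjustment from feedback, and automation of a decision step) can be shown with a toy sketch. The scoring formula, feature names, and threshold rule below are purely hypothetical and far simpler than what a real data mining tool would produce.

```python
from dataclasses import dataclass


@dataclass
class OfferModel:
    """Toy model that decides whether to send a customer an offer."""
    threshold: float = 0.5
    step: float = 0.05

    def score(self, customer):
        # Hypothetical weighted score built from two made-up features.
        return (0.6 * customer["recent_purchases"] / 10
                + 0.4 * customer["email_opens"] / 20)

    def decide(self, customer):
        """Return the decision together with a human-readable reason."""
        s = self.score(customer)
        send = s >= self.threshold
        relation = ">=" if send else "<"
        reason = f"score {s:.2f} {relation} threshold {self.threshold:.2f}"
        return send, reason

    def feedback(self, sent, responded):
        """Adjust the threshold from the observed outcome of a decision."""
        if sent and not responded:
            # A wasted offer: become stricter.
            self.threshold = min(1.0, self.threshold + self.step)
        elif not sent and responded:
            # A missed opportunity: become more lenient.
            self.threshold = max(0.0, self.threshold - self.step)


model = OfferModel()
active = {"recent_purchases": 8, "email_opens": 10}
inactive = {"recent_purchases": 1, "email_opens": 2}

send_active, why_active = model.decide(active)        # automated decision step
send_inactive, why_inactive = model.decide(inactive)
model.feedback(sent=send_active, responded=False)     # learn from the outcome
```

The returned reason string stands in for the explanation, and the feedback call is the point where experience accumulates over repeated decisions.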
Data Marts
The concept of the data mart is causing a lot of excitement and attracting much attention in the data warehouse industry. In general, data marts are being presented as an inexpensive alternative to a data warehouse, taking significantly less time and money to build. The data mart is directed at a partition of data (often called a subject area) that is created for the use of a dedicated group of users. Unfortunately, the misleading statements about the simplicity and low cost of data marts sometimes result in organizations or vendors incorrectly positioning them as an alternative to the data warehouse. In summary, data marts present two problems: the problem of scalability in situations where an initial small data mart grows quickly in multiple dimensions, and the problem of data integration.
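As a minimal sketch of a data mart built as a partition of warehouse data for one group of users, the following uses an in-memory SQLite database. The table names, columns, and sample rows are assumptions invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical warehouse tables covering two subject areas.
    CREATE TABLE warehouse_sales (
        sale_id INTEGER PRIMARY KEY,
        region  TEXT,
        product TEXT,
        amount  REAL
    );
    CREATE TABLE warehouse_inventory (product TEXT, on_hand INTEGER);

    INSERT INTO warehouse_sales (region, product, amount) VALUES
        ('East', 'widget', 100.0),
        ('East', 'gadget', 250.0),
        ('West', 'widget', 75.0);
    INSERT INTO warehouse_inventory VALUES ('widget', 40), ('gadget', 12);

    -- The data mart: a summarized copy of the sales subject area only,
    -- intended for a dedicated group of users.
    CREATE TABLE mart_sales_by_region AS
        SELECT region, SUM(amount) AS total_amount
        FROM warehouse_sales
        GROUP BY region;
    """
)

mart_rows = conn.execute(
    "SELECT region, total_amount FROM mart_sales_by_region ORDER BY region"
).fetchall()
```

Because the mart here is a separate copy, it does not change when the warehouse tables do, which hints at the data integration problem: each independently built mart has to be kept consistent with the others and with its sources.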
XML
XML stands for eXtensible Markup Language. By design, XML should be:
Easy to use over the Internet
Compatible with SGML
Capable of being processed by easy-to-write programs
Legible and reasonably clear to users
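The goal of being processed by easy-to-write programs can be illustrated with a short sketch that uses Python's standard xml.etree.ElementTree module. The document content and element names are made up for the example.

```python
import xml.etree.ElementTree as ET

# A small, hypothetical XML document; element and attribute names are assumptions.
doc = """<customers>
  <customer id="42"><name>Ada</name><region>East</region></customer>
  <customer id="43"><name>Grace</name><region>West</region></customer>
</customers>"""

root = ET.fromstring(doc)

# Turn each <customer> element into a plain dictionary.
customers = [
    {
        "id": c.get("id"),
        "name": c.findtext("name"),
        "region": c.findtext("region"),
    }
    for c in root.findall("customer")
]
```

The markup is readable on its own, and a few lines of code are enough to extract its contents.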
In addition to the XML standard, several auxiliary standards are needed to complete the functionality of XML. For example, XSL, XLink, and XPointer are among the proposed standards that provide XML support for style sheets, hyperlinks, and other features.