SAP Data Quality


Developing a Data Quality and Integration Strategy


Jonathan G. Geiger
Intelligent Solutions, Inc.
April 28, 2010
Sponsor
Topics

• Complexities
• Expectations Setting
• Assessment
• Improvement
• Strategy

Copyright © 2010 Intelligent Solutions, Inc., All Rights Reserved
Topics

• Complexities
  - Typical Complexities
  - Why They Exist
• Expectations Setting
• Assessment
• Improvement
• Strategy

Data Integration

[Figure: Internal Data and External Data (Operational Data) feed the Operational Data Store and the Data Warehouse; the step in between is captioned ". . . and a miracle happens here . . ."]

Data integration is very complex!

Data Integration - ETL Process

• Capture data from source-of-record
• Filter operational detail
• Filter only changed data
• Ensure data quality
• Consolidate data
• Translate to target semantics
• Calculate derived and summary data
• Transport to target platform
• Load target data warehouse

[Figure: Internal Data and External Data (Operational Data) flow into the Operational Data Store and Data Warehouse through stages labeled Capture, Transport, Persistent Staging Area, Validate, Integrate, Transform, and Load]

Metadata
• Audits and control metrics
• Source of record mapping
• Quality metrics
• Performance metrics

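The ETL steps listed on the slide above can be illustrated with a minimal Python sketch. Nothing here comes from the presentation itself: the row layout, the COUNTRY_CODES lookup, the rule that every row needs a customer_id, and the run_etl name are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical mapping from source codes to target semantics.
COUNTRY_CODES = {"USA": "US", "U.S.": "US", "CAN": "CA"}


@dataclass
class EtlMetrics:
    """Audit and control counts collected during one run."""
    captured: int = 0
    rejected: int = 0
    loaded: int = 0


def run_etl(source_rows, last_run, target):
    """Capture changed rows, validate and transform them, then load the target."""
    metrics = EtlMetrics()
    for row in source_rows:
        # Capture: keep only data changed since the previous run.
        if row["updated_at"] <= last_run:
            continue
        metrics.captured += 1
        # Validate: assumed rule that every row needs a customer id.
        if not row.get("customer_id"):
            metrics.rejected += 1
            continue
        # Transform: translate source values to target semantics.
        record = {
            "customer_id": row["customer_id"],
            "country": COUNTRY_CODES.get(row["country"], "UNKNOWN"),
            "updated_at": row["updated_at"],
        }
        # Load: insert or overwrite by key in the target store.
        target[record["customer_id"]] = record
        metrics.loaded += 1
    return metrics
```

The returned counts stand in for the audit and control metrics named under Metadata; a real pipeline would persist them alongside the source-of-record mapping.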
Capture Complexities

• Data requirements
• Best source of the data
• Business rules for capturing the data
• Data meaning
• External data requirements
• History requirements
• Currency requirements
• Privacy and security requirements
• Audit and control requirements
• Metadata
Cleansing, Transformation & Integration Complexities

• Data quality
• Data integration
• Data transformation
• Data enrichment
• Error handling
• Privacy and security requirements
• Audit and control requirements
• Metadata

Load Complexities

• Currency
• Privacy and security requirements
• Audit and control requirements
• Metadata

Why Complexities Exist

• Problem Recognition: Data deficiencies are often not recognized
• Responsibility: Overlaps and gaps
• Discipline: Data is called an asset but not managed as such
• Benefit Recognition: People are getting work done

Topics

• Complexities
• Expectations Setting
  - Quality Definition
  - Governance and Stewardship
  - Prioritization
• Assessment
• Improvement
• Strategy
Quality - Definition

• Quality is conformance to requirements
• Quality is not (necessarily) zero defects

[Chart: Defect Rate plotted against Time, with a Target level marked]
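The idea that quality means meeting a target rather than reaching zero defects can be written down as a small check. This is a sketch only: the defect_rate and conforms names, the defect test, and the 5% target are illustrative assumptions, not figures from the presentation.

```python
def defect_rate(records, is_defective):
    """Return the fraction of records that fail the given defect test."""
    records = list(records)
    if not records:
        return 0.0
    return sum(1 for r in records if is_defective(r)) / len(records)


def conforms(rate, target):
    """Quality is conformance to requirements: the rate must not exceed the target."""
    return rate <= target


# Hypothetical requirement: at most 5% of phone numbers may be missing.
phones = ["555-0100", "", "555-0101", "555-0102"]
rate = defect_rate(phones, lambda p: p == "")
meets_target = conforms(rate, target=0.05)
```

With one missing value in four, the rate is 0.25 and the assumed 5% target is not met, even though a looser requirement could accept the same data.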

Quality - Definition

• Quality is conformance to requirements
• Conformance to what?
• Whose requirements?
• How are requirements set?
• What degree of conformance?

Stewardship

• A steward is one who is called upon to exercise responsible care over possessions entrusted to him or her
  - (adapted from Webster’s dictionary)
• The steward does not own the possessions
• The steward has a responsibility affecting the processes that impact the possessions

Data Steward

• Exercise responsible care over the data resources of the enterprise
  - The steward does not own the data
  - The steward impacts processes that affect the data and its use
    - Acquisition
    - Management, maintenance and storage
    - Dissemination
    - Disposal

Data Planning Roles

• Stewardship responsibilities
  - Provide input to the subject area model
  - Provide input to the business data model (business rules, definitions, etc.)
  - Establish metadata management strategy
• Custodianship responsibilities
  - Develop the subject area model
  - Develop and maintain the business data model
  - Establish metadata management strategy

Data Acquisition Roles

• Stewardship responsibilities
  - Establish the quality expectations
  - Define business processes for acquiring data
  - Establish the authority levels for creating, updating, and deleting data
  - Establish the validation rules for ensuring that the data meets quality expectations
• Custodianship responsibilities
  - Understand the quality of existing data
  - Provide input to the quality expectations

Data Management Roles

• Stewardship responsibilities
  - Establish and monitor data demographic expectations
  - Establish archival and disaster recovery rules
  - Provide metadata content
• Custodianship responsibilities
  - Transform the business data model into system and technology models
  - Establish (technical) data naming standards
  - Manage metadata
  - Manage data storage (design, reliability, security, recoverability, archival and restoration, etc.)

Data Dissemination Roles

• Stewardship responsibilities
  - Establish privacy and security policies
  - Define standard query and reporting requirements
  - Establish capability requirements
  - Establish quality expectations
  - Establish policies and guidelines for information use
  - Provide metadata content
• Custodianship responsibilities
  - Ensure adherence to privacy and security policies
  - Provide input to the quality expectations
  - Manage and provide metadata
Data Disposal Roles

• Stewardship responsibilities
  - Establish retention rules
  - Establish erasure rules
• Custodianship responsibilities
  - Provide input to retention rules
  - Provide input to erasure rules

Making It Real

• Two ways to approach stewardship
  - Data stewards are assigned to a specific data subject area – customer, product, order, etc.
  - Data stewards are assigned to a particular function – sales, marketing, finance, etc.
• There are benefits and drawbacks to each approach
• In either case, good communication is mandatory

Executive Oversight

• Cross-functional committee
• Provides authority to the data stewards
• Provides resources for data stewardship and information management
• Resolves conflicts

Prioritization

• Too many data elements to do at once
• Need to categorize data
  - Criticality
  - Visibility
  - Usage
  - Sanctioned Projects

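One way to turn the four prioritization categories into a working order is a simple additive score. The sketch below is an assumption for illustration only: the 1 to 5 scales, the equal weights, the bonus for sanctioned projects, and the element names are not from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    name: str
    criticality: int  # assumed scale: 1 (low) to 5 (high)
    visibility: int   # assumed scale: 1 (low) to 5 (high)
    usage: int        # assumed scale: 1 (low) to 5 (high)
    in_sanctioned_project: bool


def priority_score(element):
    """Sum the category ratings, with an assumed bonus for sanctioned projects."""
    bonus = 2 if element.in_sanctioned_project else 0
    return element.criticality + element.visibility + element.usage + bonus


def prioritize(elements):
    """Return the elements ordered from highest to lowest priority."""
    return sorted(elements, key=priority_score, reverse=True)


elements = [
    DataElement("fax_number", 1, 1, 1, False),
    DataElement("customer_id", 5, 4, 5, True),
    DataElement("order_total", 4, 5, 3, False),
]
ranked = [e.name for e in prioritize(elements)]
```

In practice the weights would be agreed by the data stewards and the oversight committee rather than fixed in code.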
Topics

• Complexities
• Expectations Setting
• Assessment
  - Continuous Improvement
  - Data Profiling
  - Symptoms vs. Root Causes
• Improvement
• Strategy
Continuous Improvement Process

[Figure: Plan-Do-Check-Act cycle, annotated with typical starting points]
• PLAN: Proactive programs start here
• DO: Some companies start here, following existing processes
• CHECK: Data profiling typically starts here
• ACT: Reactive actions typically start here

Data Profiling Framework

[Figure: data profiling framework laid out as a PLAN, DO, CHECK, ACT cycle, with a step labeled Data Cleanup and Business Process Adjustment]

Framework courtesy of SAP
Data Profiling in Context

• Diagnostic step to understand data meaning and quality
• Priorities dictate scope
• Business data model provides business rules
• Quality expectations provide perspective
• Data profiling reveals conditions
• Analysis determines actions
  - Expectations adjustment
  - Corrective actions
  - Preventive actions
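As a rough illustration of what "data profiling reveals conditions" can mean for a single column, the sketch below counts rows, missing values, and distinct values. It is a minimal example under stated assumptions: the profile_column name, the treatment of empty strings as missing, and the sample values are all hypothetical, and commercial profiling tools do considerably more.

```python
from collections import Counter


def profile_column(values):
    """Summarize one column: row count, missing values, and value frequencies."""
    values = list(values)
    # Assumption: None and the empty string both count as missing.
    present = [v for v in values if v is not None and v != ""]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }


state_profile = profile_column(["NY", "ny", "", None, "NY"])
```

Here the profile shows two missing values and two spellings of the same state, conditions that the analysis step would then classify as needing expectation adjustment, correction, or prevention.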
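Root Cause Analysis is Performed

[Figure: cause-and-effect (fishbone) diagram with four Major Cause branches leading to a Characteristic, alongside a Pareto chart showing count (#) and percent (%) for causes B, A, D, C, and OTHERS]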
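The Pareto chart in the figure ranks causes by frequency and tracks the cumulative percentage. A minimal sketch of that tabulation follows; the pareto function name and the sample cause labels and counts are assumptions for illustration, not data from the presentation.

```python
from collections import Counter


def pareto(causes):
    """Return (cause, count, cumulative percent) rows, most frequent cause first."""
    counts = Counter(causes)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for cause, n in counts.most_common():
        cumulative += n
        rows.append((cause, n, round(100 * cumulative / total, 1)))
    return rows


# Hypothetical defect log: one entry per defect, labeled with its assigned cause.
defect_causes = ["B"] * 5 + ["A"] * 3 + ["D"] * 2
table = pareto(defect_causes)
```

Reading the cumulative column from the top shows how few causes account for most defects, which is where root cause work usually starts.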

Topics

• Complexities
• Expectations Setting
• Assessment
• Improvement
  - Data warehouse implications
  - Upstream implications
• Strategy

Data Warehouse Implications

• Data handling options
  - Accept
  - Reject
  - Fix
  - Adopt default value
• Error handling options
  - Suspend data awaiting correction
  - Transmit correction to source
  - Transmit need for corrections
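The four data handling options can be sketched as a decision applied to each incoming value. The example below is hypothetical: the country-code field, the KNOWN_CODES set, the "UNKNOWN" default, and the order in which the options are tried are assumptions, not rules from the presentation.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    FIX = "fix"
    DEFAULT = "adopt default value"


# Hypothetical set of valid country codes.
KNOWN_CODES = {"US", "CA"}


def handle_country(value):
    """Return the action taken and the value to load (None when rejected)."""
    if value in KNOWN_CODES:
        return Action.ACCEPT, value
    if value is None or value.strip() == "":
        # Missing value: adopt an assumed default.
        return Action.DEFAULT, "UNKNOWN"
    cleaned = value.strip().upper()
    if cleaned in KNOWN_CODES:
        # Fixable: only case or whitespace differed.
        return Action.FIX, cleaned
    return Action.REJECT, None
```

A rejected value would then go down one of the error handling paths: suspended until corrected, or reported back to the source system.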
Continuous Improvement Process

[Figure: Plan-Do-Check-Act cycle, annotated with typical starting points]
• PLAN: Proactive programs start here
• DO: Some companies start here, following existing processes
• CHECK: Data profiling typically starts here
• ACT: Reactive actions typically start here

Topics

• Complexities
• Expectations Setting
• Assessment
• Improvement
• Strategy
  - Major components

Framework & Business Drivers

• Relate to Enterprise Quality Management Approach
  - Formal or informal
  - Goals
• Understand business drivers and needs
  - Business intelligence / operational systems
  - Strategic / tactical / operational
• Declare strategy
  - Mission statement
  - Guiding principles
Methodology

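Data Models

[Figure: hierarchy of data models. The Subject Area Model leads to the Business Data Model, which leads to the Operational System Model and the Data Warehouse System Model, which in turn lead to the Technology Models]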
Continuous Improvement Process

[Figure: Plan-Do-Check-Act cycle, annotated with typical starting points]
• PLAN: Proactive programs start here
• DO: Some companies start here, following existing processes
• CHECK: Data profiling typically starts here
• ACT: Reactive actions typically start here

Data Profiling Framework

Data Cleanup and Business Process Adjustment


Framework courtesy of SAP
Roles and Responsibilities

• Executive Oversight
• Data Stewardship
• Data Custodianship
• Data Providers
• Data Users
Tools and Technology

• Database management system
• Data modeling
• Data profiling
• Metadata management

Data Quality Metrics

• Data usage
• Data quality improvement
• Benefits attained

Strategy Communication

[Figure: the label DATA QUALITY STRATEGY, shown twice]

Topics

• Complexities
• Expectations Setting
• Assessment
• Improvement
• Strategy

About Intelligent Solutions

• Founded in 1992 by Claudia Imhoff
• Received outstanding recognition for client satisfaction (Dun and Bradstreet survey of our clients)
• Internationally recognized industry expertise
• Full line of Corporate Information Factory and CRM courses
• BI and CRM Consulting services
  - Mentoring
  - Assessment and Planning
  - Management
  - Design and Implementation
• International client base in all industry verticals

SAP Solution Overview
Enterprise Information Management

Kristin McMahon
Director, Enterprise Information Management
SAP
April 28, 2010
Poorly Managed Information Leads to Inefficiency and Risk

“Over 51% of organizations estimate data-related issues cost their company over $5 million.”
Forbes Insight

“90% of all businesses still do not have sufficient policies in place to meet data governance regulations.”
IT Policy Compliance Group
Build an Information Driven Organization

• Improve business insight and decision making: Provide all users with data that is complete, accurate and accessible
• Increase operational efficiency and reduce costs: Provide high quality data to all business processes
• Meet compliance and regulatory requirements: Enhance information governance via policy-based data management
SAP Provides Best-In-Class EIM Solutions
Deliver Information That Is Complete, Accurate, and Accessible

Data Integration & Quality Management:
• SAP BusinessObjects Data Services
• SAP BusinessObjects Data Federator
• SAP BusinessObjects Text Analysis
• SAP BusinessObjects Data Insight
• SAP Data Migration services

Master Data Management:
• SAP NetWeaver Master Data Management
• SAP Master Data Governance for Financials
• SAP Data Maintenance by Vistex

Content & Information Lifecycle Management:
• SAP NetWeaver Information Lifecycle Management
• SAP Extended ECM by Open Text
• SAP Document Access by Open Text
• SAP Archiving by Open Text

Enterprise Data Warehousing:
• SAP NetWeaver Business Warehouse
• SAP NetWeaver Business Warehouse Accelerator
• SAP BusinessObjects Rapid Marts
• SAP BusinessObjects Metadata Management

© SAP 2007 / Page 49


What Are the Sources of Bad Data Problems?

[Figure: sources of bad data surrounding Enterprise Information]
• Employee Data Entry
• Purchased or Rented External Data
• Customer Self-Service
• IT Application Updates
• Data Migration Projects

The Data Quality Framework

[Figure: data quality cycle around YOUR DATA with the steps MEASURE, ANALYZE, PARSE, STANDARDIZE, CORRECT, ENHANCE, MATCH, CONSOLIDATE, and CONTINUOUS MONITORING, grouped under the labels Data Assessment, Data Cleansing, Enhance, Match & Consolidate, and Continuous Monitoring]
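To make the standardize, match, and consolidate steps concrete, here is a deliberately simple Python sketch. It is not how any SAP product works: the standardize and match_and_consolidate names, the record layout, and the rule that two records match when their standardized names are identical are all assumptions. Real matching tools typically use fuzzy comparison rather than exact keys.

```python
import re


def standardize(name):
    """Lowercase, strip punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(cleaned.split())


def match_and_consolidate(records):
    """Group records with identical standardized names and merge each group."""
    merged = {}
    for rec in records:
        key = standardize(rec["name"])
        # The first record seen for a key supplies the surviving name.
        survivor = merged.setdefault(key, {"name": rec["name"], "sources": []})
        survivor["sources"].append(rec["source"])
    return list(merged.values())


customers = [
    {"name": "Acme, Inc.", "source": "crm"},
    {"name": "ACME  Inc", "source": "erp"},
    {"name": "Globex", "source": "crm"},
]
consolidated = match_and_consolidate(customers)
```

The two Acme spellings collapse into one surviving record that remembers both source systems, while Globex stays separate.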



Data Quality Approach
The Three Rules of Data Quality

Rule 1: Analyze your data
Profile, query, extract and in every other way become intimately familiar with data content at a detail level. If you take a high-level approach to data quality, you will waste time discussing what the data might look like.
• What is the definition of “clean” data?
• Who defines “clean”?
• Who owns it over time?
• Which entities have the most issues?
• Where are the issues originating from?

Rule 2: Define your scope
All data quality projects uncover hidden issues. Be very clear about what is, and is not, relevant to your current effort.
• Which business processes are affected?
• What business benefit can be achieved?
• How clean does it need to be?
• People, process, and tools?

Rule 3: Cleanse your data and track your results
Data quality is not a one-time process. It is an ongoing process of monitoring and correcting your data. You should know that: 1) new quality needs are being met and 2) new business processes are being monitored.
• Define stakeholders to analyze and clean
• Define processes to clean, monitor and maintain cleanliness
• Acquire necessary tools to assist

Business and IT collaboration through visualizing information governance metrics

IT can easily share data quality metrics with business users and involve them in owning the data problem

Business users can easily see how their information measures up against information governance rules and standards



Building a Roadmap for Enterprise Information Management is Key for Success

[Figure: chart of Value against People & Process Maturity in four stages; each stage adds a capability to the ones before it]

1. Data READINESS (Understand)
   Understand what data assets you have and how they are being used
2. Data INTEGRATION & CLEANSING (Understand, Cleanse)
   Deliver trusted information repeatably and reliably in the right form, to the right place at the right time
3. Data CONSOLIDATION (Understand, Cleanse, Consolidate)
   Consolidate diverse information & master data landscapes and increase trust and reliability in information
4. Data GOVERNANCE (Understand, Cleanse, Consolidate, Govern)
   Technology enabling people to implement a repeatable process to manage the use, quality and lifecycle of [...]


Data Quality Provides Value Throughout
Portfolio maps to the People and Process Maturity

[Figure: chart of Value against People & Process Maturity showing the products used at each stage, rising to Full Enterprise COVERAGE and END-TO-END Data Management]

1. Data READINESS: Data Quality
2. Data INTEGRATION & CLEANSING: Data Integrator, Data Quality
3. Data CONSOLIDATION: MDM, Data Integrator, Data Quality
4. Data GOVERNANCE: MDG, MDM, Data Integrator, Data Quality



Why SAP?
The Best Choice for EIM

• Time to Value: Fast and cost-effective integration with existing SAP and non-SAP systems
• Proven Customer Value: Mature offering and large install base of customers supporting critical business scenarios
• Market Leadership: Analyst recognition and customer implementation success
• Comprehensive Solutions for EIM Strategy: One-stop for end-to-end information governance and management
Questions?

Contact Information
If you have further questions or comments:

Jonathan G. Geiger, Intelligent Solutions
jggeiger@earthlink.net

Kristin McMahon, SAP
kristin.mcmahon@sap.com
