Informatica Error Classification
Informatica Error Classification
Informatica Error Classification
You have to deal with different type of errors in the ETL Job. When you run a session, the
PowerCenter Integration Service can encounter fatal or non-fatal errors. Typical error handling
includes:
o
User Defined Exceptions : Data issues critical to the data quality, which might get
loaded to the database unless explicitly checked for quality. For example, a credit
card transaction with a future transaction data can get loaded into the database
unless the transaction date of every record is checked.
Fatal
Exceptions :
Errors
such
as
database
connection
errors,
which
Business users define the user defined user defined exception, which is critical to the data
quality. We can setup the user defined error handling using;
1. Error Handling Functions.
2. User Defined Error Tables.
ERROR() : This function Causes the PowerCenter Integration Service to skip a row and issue an
error message, which you define. The error message displays in the session log or written to the
error log tables based on the error logging type configuration in the session.
You
can
use ERROR in
Expression
transformations
to
validate
data.
Generally,
you
use ERROR within an IIF or DECODE function to set rules for skipping rows.
Eg : IIF(TRANS_DATA > SYSDATE,ERROR('Invalid Transaction Date'))
Above expression raises an error and drops any record whose transaction data is greater than the
current date from the ETL process and the target table.
ABORT() : Stops the session, and issues a specified error message to the session log file or written
to the error log tables based on the error logging type configuration in the session. When the
PowerCenter Integration Service encounters an ABORT function, it stops transforming data at that
row. It processes any rows read before the session aborts.
You can use ABORT in Expression transformations to validate data.
Eg : IIF(ISNULL(LTRIM(RTRIM(CREDIT_CARD_NB))),ABORT('Empty Credit Card Number'))
Above expression aborts the session if any one of the transaction records are coming with out a
credit card number.
Note :- You need to use these two functions in a mapping along with a session configuration for row
error logging to capture the error data from the source system. Depending on the session
configuration, source data will be collected into Informatica predefined PMERR error tables or files.
Please refer the article "User Defined Error Handling in Informatica PowerCenter" for more detailed
level implementation information on user defined error handling.
Typical ETL Design will read error data from the error table along with the source data. During the
data transformation, data quality will be checked and any record violating the quality check will be
moved to error tables. Record flags will be used to identify the reprocessed and records which are
fixed for reprocessing.
Non-fatal exception causes the records to be dropped out in the ETL process, which is
critical to quality. You can handle non-fatal exceptions using;
1. Default Port Value Setting.
2. Row Error Logging.
3. Error Handling Settings.
Input ports : Use default values if you do not want the Integration Service to treat null values
as NULL.
Output ports : Use default values if you do not want to skip the row due to transformation
error or if you want to write a specific message with the skipped row to the session log.
Input/output ports : Use default values if you do not want the Integration Service to treat
null values as NULL. But no user-defined default values for output transformation errors in an
input/output port.
Use Case 2
Below setting uses the default expression to convert the date if the incoming value is not in a valid
date format.
Please refer the article Error Handling Made Easy Using Informatica Row Error Logging for more
details.
Stop On Errors : Indicates how many non-fatal errors the Integration Service can encounter
before it stops the session.
On Stored Procedure Error : If you select Stop Session, the Integration Service stops the
session on errors executing a pre-session or post-session stored procedure.
On Pre-Session Command Task Error : If you select Stop Session, the Integration Service
stops the session on errors executing pre-session shell commands.
Pre-Post SQL Error : If you select Stop Session, the Integration Service stops the session
errors executing pre-session or post-session SQL.
2. Workflow Recovery
Workflow recovery allows you to continue processing the workflow and workflow tasks from the point
of interruption. During the workflow recovery process Integration Service access the workflow state,
which is stored in memory or on disk based on the recovery configuration. The workflow state of
operation includes the status of tasks in the workflow and workflow variable values.
Please refer the article "Informatica Workflow Recovery with High Availability for Auto Restartable
Jobs" for more details on workflow recovery.
Error Log Type :- Specifies the type of error log to create. It can be relational database or
flat file.
Error Log Table Name Prefix :- Specifies the table name prefix for relational logs.
Log Row Data :- Specifies whether or not to log transformation row data.
Log Source Row Data :- Specifies whether or not to log source row data.
Data Column Delimiter :- Data will be delimited by the specified character in DB column.
With this configuration we specified, Informatica PowerCenter will create four different tables for error
logging and the table details as below.
ETL_PMERR_DATA :- Stores data about a transformation row error and its corresponding
source row.
ETL_PMERR_TRANS :- Stores metadata about the source and transformation ports, when
error occurs.
With this configuration, we are done with the setting required to capture any error during the session
execution. Now lets see how do we retrieve the data from the error log tables and report it to
Business Users or IT Department.
as
'Error
MSG'
from
ETL_PMERR_SESS
sess
left
outer
join
sess.SESS_INST_ID
msg.WORKFLOW_RUN_ID
sess.SESS_INST_ID
where
left
outer
join
sess.WORKFLOW_RUN_ID
sess.FOLDER_NAME
ETL_PMERR_MSG
and
<Project
msg
on
msg.SESS_INST_ID
Folder
Name>
and
Pros.
1. Out of the box Solution Provided by Informatica.
2. Captures all the Error, Including the unthought error.
3. Less Coding and Testing efforts required by the development team.
Cons.
1. If an error occurs after an active transformation, Integration service will not be able to capture
source data into the erro table.
2. In sessions with multiple non-pass through partitions, Integration service can not capture
source data into error table.
3. Added overhead to the Session performance (Which is expected and acceptable).
Please leave us a comment below, if you have any difficulties implementing this error handling
approach. We will be more than happy to help you.
ERROR() : This function Causes the PowerCenter Integration Service to skip a row and issue an
error message, which you define. The error message displays in the session log or written to the
error log tables based on the error logging type configuration in the session.
ABORT() : Stops the session, and issues a specified error message to the session log file or written
to the error log tables based on the error logging type configuration in the session. When the
PowerCenter Integration Service encounters an ABORT function, it stops transforming data at that
row. It processes any rows read before the session aborts.
Note : Use the ERROR, ABORT function for both input and output port default values. You might
use these functions for input ports to keep null values from passing into a transformation and use for
output ports to handle any kind of transformation error.
Informatica Implementation
For the demonstration lets consider a workflow which loads daily credit card transactions and below
two user defined data validation checks
1. Should not load any transaction with 0 (zero) amount, but capture such transactions into
error tables
2. Should not process any transactions with out credit card number and Stop the workflow.
Add below expression for both ports. First expression will take care of the user defined data
validation check No 1 and second expression will take care of user defined data validation check No
2.
TEST_CREDIT_CARD_NB :- IIF(ISNULL(LTRIM(RTRIM(CREDIT_CARD_ND))),ABORT('Em
pty Credit Card Number'))
Now insert this transformation in the mapping where you need the data validation and complete the
mapping.
Hint : This Expression can be converted into a Reusable transformation, So that any mapping
needs this data validation can reuse this transformation.
With this configuration we specified, Informatica PowerCenter will create four different tables for error
logging and the table details as below.
ETL_PMERR_DATA :- Stores data about a transformation row error and its corresponding
source row.
ETL_PMERR_TRANS :- Stores metadata about the source and transformation ports, when
error occurs.
With this, we are done with the setting required to capture user defined errors. Any data records
which violates our data validation check will be captured into PMERR tables mentioned above.
as
'Error
MSG'
from
ETL_PMERR_SESS
left
outer
join
sess
ETL_PMERR_DATA
data
outer
join
sess.SESS_INST_ID
ETL_PMERR_MSG
msg
sess.SESS_INST_ID
where
sess.FOLDER_NAME
sess.WORKFLOW_NAME
<Project
=
Folder
<Workflow
Name>
Name>
Pros.
and
and
Cons.
1. Added overhead to the Session performance, which is expected and acceptable.
Hope you enjoyed this article. Please leave us a comment below, if you have any
difficulties implementing this error handling approach. We will be more than happy to help you.