Informatica Vs Odi
Informatica Vs Odi
Informatica Vs Odi
ETL
ETL stands for extract, transform and load. ETL is typically performed by an external tool such
as Informatica, DataStage or by PLSQL procedures. When tools like Informatica are used, most
of the data transformation is done on the ETL server, which made the ETL server to be well built
to support the data transformation process. The advantage of this was less impact on the source
and target systems.
ELT
ELT stands for extract, load, and transform. The majority of the data transformation is done after
the data is loaded into the target database. This will reduce the load on the ETL server and in turn
the cost of the server. Since most of the transformation is done on the database, it is also faster,
which in turn increases the performance and decreases the duration of the data movement
window. This utilizes the capabilities of the target database to increase the performance.
Similarities
As both Informatica and ODI are tools used to move data from source-to-target with
transformations, there are some similarities. I am listing some of those below.
Key Features
Restart Ability from the Point of Failure
Restart from the point of failure is a key feature in any data movement process. This reduces the
reprocessing of the same data, which in turn reduces the overall load window. Both Informatica
and ODI have restart ability (ODI from 11.1.1.5 onwards). Both offer different levels of restart
ability.
In a workflow/load plan, we can configure the restart from the exact point of failure or restart
from the beginning of a group based on the failure (earlier versions of ODI did not have load
plan and therefore no restart from point of failure was not possible).
Load Dependency
Making sure that the different data sets are loaded in a proper sequence depending on logical
dependency is important to the success of any data warehouse implementation. For example, if
we have 10 dimensions and 3 facts, the dimensions used in the facts should be completed before
the facts are loaded. This can be set by a single level dependency, let all the 10 dimensions
complete first and then loaded the 3 facts as shown below in image 1.
This satisfies the dependencies. If fact1 is only dependent on dimensions 1,2,3 and 4... fact2 is
dependent only on dimensions 3,4,5 and 6... and fact3 is dependent only on dimensions 7,8, 9
and 10, the setup in image 1 still satisfies the dependencies, but it has additional dependencies
which are not needed. Fact1 is waiting on dimensions 5 to 10 to complete which is not a
requirement. This is a bit inefficient and can be refined by adjusting the dependencies as shown
in image 2 below.
In ODI, procedures can be defined by the pre and post session tasks. These tasks can be
embedded in the load plan to run before and after the table load step is executed as required.
Truncate Target
Truncating the target table (can be a stage or dimension and fact table) through the automated
process is most in any ETL/ELT tool. Both ODI and Informatica have this option available to be
configured. ODI has it at the interface level and Informatica has it at the session level.
Unit Testing
Unit testing the independent block of code is required in any data warehouse ETL/ELT
implementations. Informatica only allows it at the workflow level as that is the executable
(separate workflow has to be built to test each mapping), but ODI allows it at the
interface/procedure level as a scenario can be generated at both levels. This saves some time, in
the build phase of the implementation.
Reusability of Mapping/Interface in Multiple Session/Scenario
Reusability of a mapping /interface is an important benefit in defining full and incremental loads.
This will eliminate the need to maintain two separate mappings/interfaces for full and
incremental loads. Informatica provides the ability to define two different sessions on the same
mapping with different options like truncate table. This is a built in feature. ODI does not have a
built in feature to define the truncate option at the scenario level as this is set at the interface
level through the knowledge module. A custom knowledge can be developed to make the
truncate option dynamic with the help of some external tables (this has to be developed and
tested in house).
Partitioning Option
When the data volumes are high, target tables are partitioned to improve both the ETL/ELT load
performance and also the data retrieving performance (reporting performance). Informatica has
additional capability to utilize the partitioning in the load process of the table to improve the load
time. ODI currently does not have an option but in the process of building a new knowledge
module to utilize the partitioning.