Component Reference
Ab Initio Components
====================
Section #1
==========
Component Groups
----------------
1) Compress
2) Continuous
3) Database
4) Datasets
5) Departition
6) Deprecated
7) Examples
8) FTP
9) Miscellaneous
10) My Components
11) Partition
12) Sort
13) Transform
14) Translate
15) Validate
Section #2
==========
Component Listing
-----------------
1) Compress
1) Deflate
2) Inflate
3) Compress
4) Uncompress
2) Continuous
Will do later
3) Database
1) Call Stored Procedure
2) Input Table
3) Join with DB
4) Multi Update Table
5) Output Table
6) Run SQL
7) Truncate Table
8) Update Table
https://sites.google.com/site/jibjabmonitor/abinitio/component-reference?tmpl=%2Fsystem%2Fapp%2Ftemplates%2Fprint%2F&showPrintDialog… 1/19
8/18/22, 3:19 PM https://sites.google.com/site/jibjabmonitor/abinitio/component-reference?tmpl=%2Fsystem%2Fapp%2Ftemplates%2Fprint%2…
4) Datasets
1) Input File
2) Input Table
3) Intermediate File
4) Lookup File
5) Output File
6) Output Table
7) Read Multiple Files
8) Read Shared
9) Write Multiple Files
5) Departition
1) Concatenate
2) Gather
3) Interleave
4) Merge
6) FTP
1) FTP From
2) FTP To
3) SFTP From
4) SFTP To
7) Miscellaneous
1) Assign Keys
2) Buffered Copy
3) Documentation
4) Gather Logs
5) Leading Records
6) Meta Pivot
7) Redefine Format
8) Replicate
9) Run Program
10) Throttle
11) Recirculate and Compute Closure
12) Trash
8) Partition
1) Broadcast
2) Partition by Expression
3) Partition by Key
4) Partition by Percentage
5) Partition by Range
6) Partition by Round-Robin
7) Partition with Load Balance
9) Sort
1) Checkpointed Sort
10) Transform
1) Aggregate
2) Dedup Sorted
3) Denormalize Sorted
4) Filter By Expression
5) Fuse
6) Join
7) Match Sorted
8) Multi Reformat
9) Normalize
10) Reformat
11) Rollup
12) Scan
13) Scan With Rollup
11) Translate
Some components (for example decoder and encoder) are not included.
1) Read Separated Values
2) Read Tagged Values
3) Read XML
4) Read XML Transform
5) Write XML
6) Write XML Transform
7) XML Reformat
12) Validate
1) Check Order
2) Compare Checksums
3) Compare Records
4) Compute Checksum
5) Generate Random Bytes
6) Generate Records
7) Validate Records
Section #3
==========
Component Description
---------------------
1) Aggregate ( Transform )
2) Assign Keys ( Miscellaneous )
It has a natural key field (against which the surrogate key will be generated)
It has a surrogate key.
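The natural key / surrogate key relationship can be sketched in Python (this is not Ab Initio DML). The function name assign_keys and the starting value 1 are assumptions made for illustration only: each new natural key gets the next unused surrogate key, and a repeated natural key gets the key it was given earlier.

```python
def assign_keys(natural_keys, first_key=1):
    """Pair each natural key with a surrogate key, reusing keys for repeats."""
    key_map = {}          # natural key -> surrogate key already handed out
    next_key = first_key  # hypothetical starting value for new surrogate keys
    assigned = []
    for natural in natural_keys:
        if natural not in key_map:
            key_map[natural] = next_key
            next_key += 1
        assigned.append((natural, key_map[natural]))
    return assigned


pairs = assign_keys(["C42", "A17", "C42"])
```

Here the second "C42" receives the same surrogate key as the first one.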
1. buffer-bytes
2. buffer_records
Recommendations
---------------
a) Use the Update Table component to execute a stored procedure which takes only input
parameters.
b) Use Run SQL to execute a stored procedure which does not take any parameters.
c) Use Join With DB to execute stored procedures which have both input and output
parameters.
If it finds an unordered record, it writes a single-line error message to the OUT
port.
If the number of errors is greater than the LIMIT parameter, the component stops
the graph.
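A rough Python sketch of the order check described above, under stated assumptions: check_order, OrderLimitExceeded and the message wording are made-up names, a record is compared with the record just before it, and raising an exception stands in for stopping the graph.

```python
class OrderLimitExceeded(Exception):
    """Stands in for 'the component stops the graph'."""


def check_order(records, key, limit=0):
    """Return one error message per out-of-order record; raise past the limit."""
    errors = []
    previous = None
    for position, record in enumerate(records):
        current = key(record)
        if previous is not None and current < previous:
            errors.append(
                f"record {position}: key {current!r} is out of order after {previous!r}"
            )
            if len(errors) > limit:
                raise OrderLimitExceeded(errors[-1])
        previous = current
    return errors


messages = check_order([1, 3, 2, 5, 4], key=lambda r: r, limit=2)
```

With limit=2 both unordered records are reported and the run continues; with limit=1 the second error raises.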
7) Checkpointed Sort ( Sort )
This component sorts and merges records, inserting a checkpoint between the sorting and
merging phases.
The component reads records from the in port until the max-core memory is full.
It then sorts all the records in memory and writes the sorted records
to a temporary file.
Once all the records have been read, it takes a checkpoint. (This is very inexpensive since
all the data is already written to temp files.)
The merge phase then merges all the temp files, keeping the sort order.
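The two phases can be sketched in Python as an external merge sort. This is only an illustration, not the component's implementation: checkpointed_sort and max_in_memory are hypothetical names, records are assumed to be integers, and the checkpoint is just a comment marking where it would fall.

```python
import heapq
import os
import tempfile


def checkpointed_sort(records, max_in_memory):
    """Sort integers by spilling sorted runs to temp files, then merging them."""
    run_paths = []
    buffer = []

    def spill():
        # Sort what is in memory and write it out as one sorted run.
        buffer.sort()
        fd, path = tempfile.mkstemp(suffix=".run", text=True)
        with os.fdopen(fd, "w") as run:
            run.writelines(f"{value}\n" for value in buffer)
        run_paths.append(path)
        buffer.clear()

    # Sorting phase: fill memory, sort, spill, repeat.
    for record in records:
        buffer.append(record)
        if len(buffer) >= max_in_memory:
            spill()
    if buffer:
        spill()

    # A checkpoint would be taken here: every record is already on disk.

    # Merging phase: merge the sorted runs while keeping the sort order.
    runs = [open(path) for path in run_paths]
    try:
        return [int(line) for line in heapq.merge(*runs, key=int)]
    finally:
        for run in runs:
            run.close()
        for path in run_paths:
            os.remove(path)


sorted_values = checkpointed_sort([5, 3, 9, 1, 7, 2], max_in_memory=2)
```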
If the checksums are equal, the component exits with status code '0'.
Otherwise it exits with status '-1' and stops the graph.
Compresses data.
Always use the Inflate component to reverse the action before processing the records.
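The compress-then-reverse round trip can be illustrated with Python's zlib module. Whether the Ab Initio components use the same data format as zlib is not stated in these notes, so treat this only as a sketch of the idea; the sample record bytes are made up.

```python
import zlib

# Made-up sample data: 100 identical delimited records.
records = b"id=1|name=alpha\n" * 100

deflated = zlib.compress(records)    # compressed bytes, not usable as records
inflated = zlib.decompress(deflated) # must be reversed before processing records
```

The compressed bytes have to be decompressed again before any record-level processing, which is the point of the note above.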
For denormalize:
a) Define element_type.
This element_type is referenced in the denormalize function as the variable 'elt'.
(Important: This component has an output_select function which can be used to filter the
records coming from the finalize function.)
If the deselected records are NOT needed, it is better to use the select method in the
component to do this filtering. This gives better performance.
The component reads all the records, sorts them and finds the splitters, which can be
sent to the 'split' port of
so on ....
22) Gather Logs ( Miscellaneous ) Should use 'HANDLE_LOGS' for new development
Collects records from the out port of components
23) Generate Random Bytes ( Validate ) Should use CREATE DATA in new graphs
Generates specified number of records with specified number of random bytes
24) Generate Records ( Validate ) Should use CREATE DATA in new graphs
Generates specified number of records.
Using 'command_line' we can modify this data. Alternatively, we can use a transform
component to modify it.
25) Hadoop Parallel Read
Initiates a Hadoop MapReduce job and reads the results through one or more
TCP connections.
The parent graph and a subgraph can have different components with the same error_group
name. This way, error escalation is possible from the subgraph to the parent graph.
For error escalation, the following functions are used in the Handle Errors component:
1) make_error()
In this function, depending on the rules, use force_error to escalate the selected
errors.
2) log_error()
The Handle Errors component should be in the same phase or a higher phase. It cannot
be in a lower phase.
If it is in a higher phase, data will be lost if the graph fails.
27) Handle Logs
Works in a similar way to Handle Errors. The only difference is that it does not
have the escalation option, which is not needed.
The INFLATE and DEFLATE components can compress and uncompress data better than
the COMPRESS and UNCOMPRESS components.
INFLATE and DEFLATE can accept multiple input flows (implicit gather), but
COMPRESS and UNCOMPRESS cannot.
INFLATE and DEFLATE can work in a continuous graph, but COMPRESS and UNCOMPRESS
cannot.
INFLATE and DEFLATE cannot work on files compressed with the unix pack utility, but
COMPRESS and UNCOMPRESS can. This is the only situation in which we should use COMPRESS
and UNCOMPRESS.
This is dangerous because the number of data files can vary, which can affect the graph
or even cause it to fail.
If ABLOCAL() is used in the SQL, it will be replaced with the value of the
ab_local_exp parameter.
If ABLOCAL(table name) is used, the table name will be used as the driving table to
control the parallel load.
If ABLOCAL_SUBPART() is used, it will be replaced with the subpartition (partition
name) clause in the SQL, with each partitioned query using its own partition name.
Since the database table partitioning (layout) can change due to database maintenance,
it is not advisable to use the default database layout for Input Table. If we do use it,
place a partition component after it so that changes in the table layout will not affect
downstream components.
Or use an MFS as the layout for the Input Table, or use propagation from a neighbor.
Combines blocks of records from multiple flows in round-robin fashion. This
reverses the effect of 'Partition by Round-Robin'.
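The "reverses the effect" claim can be checked with a small Python sketch. Both function names and the block_size parameter are assumptions made for illustration; lists stand in for flows.

```python
def partition_round_robin(records, n_partitions, block_size=1):
    """Deal consecutive blocks of records to the partitions in turn."""
    partitions = [[] for _ in range(n_partitions)]
    for start in range(0, len(records), block_size):
        block_number = start // block_size
        partitions[block_number % n_partitions].extend(
            records[start:start + block_size]
        )
    return partitions


def interleave(partitions, block_size=1):
    """Take one block from each partition in turn until all are drained."""
    out = []
    offsets = [0] * len(partitions)
    while any(off < len(part) for off, part in zip(offsets, partitions)):
        for i, part in enumerate(partitions):
            block = part[offsets[i]:offsets[i] + block_size]
            out.extend(block)
            offsets[i] += len(block)
    return out


original = list(range(10))
parts = partition_round_robin(original, n_partitions=3, block_size=2)
restored = interleave(parts, block_size=2)
```

As long as the same block size is used on both sides, the original record order comes back.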
Join types are: 1) Inner Join, 2) Outer Join and 3) Explicit Join.
The 'driving' property specifies the driving port. (Used when the input records are not
sorted. The Join component will read all the other ports into memory.) The driving port
must be the biggest port.
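Why the driving port should be the biggest one can be seen in a Python sketch of an in-memory join: everything except the driving input is held in memory, and the driving input is only streamed. The function name and the keep_unmatched_driving flag are hypothetical, and only two inputs are shown.

```python
from collections import defaultdict


def in_memory_join(driving, other, key, keep_unmatched_driving=False):
    """Index the non-driving input in memory, then stream the driving input."""
    index = defaultdict(list)
    for record in other:              # the smaller input is fully loaded
        index[record[key]].append(record)

    for left in driving:              # the driving input is read once, in order
        matches = index.get(left[key], [])
        if matches:
            for right in matches:
                yield {**right, **left}
        elif keep_unmatched_driving:  # keeps driving records with no match
            yield dict(left)


orders = [
    {"cust_id": 1, "amount": 10},
    {"cust_id": 2, "amount": 20},
    {"cust_id": 9, "amount": 5},
]
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]

inner = list(in_memory_join(orders, customers, "cust_id"))
with_unmatched = list(
    in_memory_join(orders, customers, "cust_id", keep_unmatched_driving=True)
)
```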
An optional 'compute_key' function derives the keys for the SQL query from the data in
the in record. If provided, compute_key takes the input record and computes a temporary
value of type 'key_type', which is used for the bind variables in the SQL query.
33) Lookup File ( Datasets ) (Do a sample graph for interval keys)
Represents one or more serial files or a multifile. It holds keys and their associated
data. The keys are indexed and the data
is kept in memory for easy retrieval.
We should use a Lookup File only when the data is small. If it is large, we should use
the Join component.
Use DML functions like 'lookup', 'lookup_count' and 'lookup_next' to retrieve the
data.
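The idea of an indexed, in-memory lookup can be sketched in Python. The class below only mimics what the three function names suggest (first match, number of matches, next match); the real DML signatures and return values are not reproduced here, and LookupTable is a made-up name.

```python
from collections import defaultdict


class LookupTable:
    """Keeps all records in memory, indexed by one key field."""

    def __init__(self, records, key):
        self._index = defaultdict(list)
        for record in records:
            self._index[record[key]].append(record)
        self._cursor = {}  # key value -> position of the next unread match

    def lookup(self, value):
        """Return the first matching record, or None."""
        matches = self._index.get(value, [])
        self._cursor[value] = 1
        return matches[0] if matches else None

    def lookup_count(self, value):
        """Return how many records match the key value."""
        return len(self._index.get(value, []))

    def lookup_next(self, value):
        """Return the next matching record after the last one read, or None."""
        matches = self._index.get(value, [])
        position = self._cursor.get(value, 0)
        if position >= len(matches):
            return None
        self._cursor[value] = position + 1
        return matches[position]


rates = [
    {"code": "A", "rate": 1},
    {"code": "B", "rate": 2},
    {"code": "A", "rate": 3},
]
table = LookupTable(rates, key="code")
```

Because the whole table sits in memory, this approach only suits small data, which matches the recommendation above.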
This component can be used just in front of a custom component to avoid deadlock.
Specify the number of output records using the 'length' method, or implement a while-do
loop using the 'finished' method. (It will keep calling normalize until the finished
function returns 1 for the record.)
a) input_select
b) initialize
c) length
d) normalize
e) finalize
f) output_select
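How 'length' and 'normalize' work together can be sketched in Python. The driver run_normalize is a hypothetical stand-in for the component, only four of the functions listed above are modelled, and the assumption is that normalize is called once per index from 0 up to length minus 1.

```python
def run_normalize(records, length, normalize, input_select=None, output_select=None):
    """Call normalize(record, index) length(record) times for each input record."""
    out = []
    for record in records:
        if input_select is not None and not input_select(record):
            continue                        # record filtered out before processing
        for index in range(length(record)):
            result = normalize(record, index)
            if output_select is None or output_select(result):
                out.append(result)
    return out


customers = [
    {"id": 1, "phones": ["111", "222"]},
    {"id": 2, "phones": []},
    {"id": 3, "phones": ["333"]},
]

rows = run_normalize(
    customers,
    length=lambda rec: len(rec["phones"]),
    normalize=lambda rec, i: {"id": rec["id"], "phone": rec["phones"][i]},
)
```

A record whose length is 0 produces no output at all.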
direct
indexing_mode
If there are indexes on columns other than the primary key, loading the data serially
(rather than in parallel) may provide better performance. This is because otherwise
multiple partitions will compete to acquire locks on the same objects.
Oracle loads can be sped up by specifying the 'unrecoverable' option. This means the
process will not write to the redo log files.
This is applicable when we are loading a table completely.
block-size parameter.
Partitions the data in such a way that it copies data based on the speed at which
data is consumed.
Optionally we can call a reformat function to reformat the records and write the
reformatted records to the out port.
For different components in the same graph, similar feature is done through
'REPLICATE' component.
a) prepare_document
b) create_output
This has the method 'output_index', which decides which transform function is called
and to which out port the record is written. It also has 'output_indexes' for when one
record needs to go to multiple out ports.
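The routing done by 'output_index' can be sketched in Python. The driver run_reformat is a hypothetical name; each list in the result stands for one out port, and the transform with the same position is applied to records routed there.

```python
def run_reformat(records, transforms, output_index):
    """Route each record to one transform and its matching out port."""
    out_ports = [[] for _ in transforms]
    for record in records:
        port = output_index(record)               # picks transform and out port
        out_ports[port].append(transforms[port](record))
    return out_ports


ports = run_reformat(
    [1, 2, 3, 4],
    transforms=[lambda r: r * 10, lambda r: -r],  # transform 0 and transform 1
    output_index=lambda r: r % 2,                 # even -> port 0, odd -> port 1
)
```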
60) Replicate
Arbitrarily combines records from its in port and writes a copy of this data to
each of its out ports.
61) Rollup
Evaluates a group of input records and generates a record which summarizes the
group.
Two Modes
1) Template Mode
If the rollup is to calculate basic aggregates like sum, maximum, minimum etc., use
template mode along with the aggregation functions. At runtime, the Co-Op system will
expand this into different functions to perform the operation.
This mode provides no control over how the summary records are derived.
2) Expanded Mode
This mode gives maximum control.
It provides
a) a Temporary Type
b) initialize method
c) rollup method
d) finalize method
input_select
key_change
The 'accumulation' function can be used in the rollup() template method to create a
vector for a field.
eg : out.amounts :: accumulation(in.amnt) ;
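The expanded-mode pieces (temporary value, initialize, rollup, finalize, input_select) can be sketched in Python. run_rollup is a hypothetical driver, not the component itself; the temporary value here holds both a running total and a list of amounts, the latter playing the role of the accumulated vector.

```python
from itertools import groupby


def run_rollup(records, key, initialize, rollup, finalize, input_select=None):
    """Produce one summary record per key group."""
    selected = [r for r in records if input_select is None or input_select(r)]
    selected.sort(key=key)                  # groupby needs grouped input
    out = []
    for _, group in groupby(selected, key=key):
        group = list(group)
        temp = initialize(group[0])         # temporary value for this group
        for record in group:
            temp = rollup(temp, record)     # fold each record into the temp
        out.append(finalize(temp, group[-1]))
    return out


sales = [
    {"cust": "a", "amnt": 10},
    {"cust": "b", "amnt": 5},
    {"cust": "a", "amnt": 7},
]

summary = run_rollup(
    sales,
    key=lambda r: r["cust"],
    initialize=lambda r: {"total": 0, "amounts": []},
    rollup=lambda t, r: {
        "total": t["total"] + r["amnt"],
        "amounts": t["amounts"] + [r["amnt"]],
    },
    finalize=lambda t, r: {"cust": r["cust"], **t},
)
```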
Only one command is allowed. If multiple commands are needed, wrap the
commands in a shell script.
65) Sample
Retrieves specified number of records from in port
66) Scan
For every in record, generates one record which includes a *running cumulative* value
for the group the record belongs to.
input_select
initialize
scan
finalize
output_select
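The difference from Rollup (one output per input record instead of one per group) can be sketched in Python. run_scan is a hypothetical driver; the input is assumed to be already grouped by key, and input_select and output_select are left out for brevity.

```python
def run_scan(records, key, initialize, scan, finalize):
    """Emit one output per input record, carrying a running value per group."""
    out = []
    current_key = object()   # sentinel that never equals a real key
    temp = None
    for record in records:
        record_key = key(record)
        if record_key != current_key:
            temp = initialize(record)       # a new group starts here
            current_key = record_key
        temp = scan(temp, record)           # update the running value
        out.append(finalize(temp, record))
    return out


txns = [("a", 10), ("a", 5), ("b", 7)]

running = run_scan(
    txns,
    key=lambda r: r[0],
    initialize=lambda r: 0,
    scan=lambda total, r: total + r[1],
    finalize=lambda total, r: (r[0], r[1], total),
)
```

The running total restarts when the key changes from "a" to "b".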
77) Sort
Sorts records
Records with NULL key values are listed first (in ascending order).
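The NULL-first ordering can be reproduced in Python, where None plays the role of NULL. sort_nulls_first is a made-up helper name; plain sorted() would fail on a mix of None and numbers, so the sort key puts a flag in front of the value.

```python
def sort_nulls_first(records, key):
    """Ascending sort in which records with a None key come first."""
    def sort_key(record):
        value = key(record)
        # False sorts before True, so None keys lead; 0 is a placeholder
        # that is only ever compared with other placeholders.
        return (value is not None, 0 if value is None else value)
    return sorted(records, key=sort_key)


ordered = sort_nulls_first(
    [{"id": 3}, {"id": None}, {"id": 1}],
    key=lambda r: r["id"],
)
```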
79) SPLIT
Split is the reverse of the COMBINE component. It normalizes a complex record (DML with
vectors) into flattened or multiple output flows,
or selects a subset of the input data.
There is no transform function. Operations are done based on the DML defined on the
in and out ports.
79) Throttle
Copies records from IN port to OUT port at a rate we specify
80) Trash
Ends a flow by accepting all records and discarding them.
The insert SQL is executed only when the update SQL fails.
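The update-then-insert order can be sketched with Python's sqlite3 module. The table accounts and its columns are made up, and "the update fails" is assumed here to mean that the UPDATE changed no rows; the notes do not say exactly how the component detects failure.

```python
import sqlite3


def update_else_insert(conn, cust_id, balance):
    """Try the update first; run the insert only if no row was updated."""
    cursor = conn.execute(
        "UPDATE accounts SET balance = ? WHERE cust_id = ?", (balance, cust_id)
    )
    if cursor.rowcount == 0:   # assumption: zero rows updated counts as failure
        conn.execute(
            "INSERT INTO accounts (cust_id, balance) VALUES (?, ?)",
            (cust_id, balance),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (cust_id INTEGER PRIMARY KEY, balance INTEGER)")

update_else_insert(conn, 1, 100)   # no row yet, so the insert runs
update_else_insert(conn, 1, 250)   # row exists, so only the update runs

rows = conn.execute("SELECT cust_id, balance FROM accounts").fetchall()
```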
==============================================================================