Sr. Talend ETL Big Data Developer Resume

Professional Summary:
· 7+ years of IT industry experience in all aspects of Analysis, Design, Testing, Development of Data Warehousing
Systems and Data Marts in various domains.
· Experience working with Data Architects and Data Modelers to perform source data profiling, identify business rule gaps, capture metadata, refine transformation mappings, and develop technical design documents and process flow diagrams.
· Extensive experience in developing integration jobs with Talend Data Services Platform and Talend Big-Data
Platform.
· Experience in data profiling and in building analysis reports using Talend Data Quality Portal.
· Experience in creating generic schemas and creating context groups and variables to run jobs against different
environments like Dev, Test and Prod. 
· Extensively worked on transformations such as Joiner, Filter, Router, Expression, Lookup, Aggregator, Sorter,
Normalizer, Sequence Generator etc.
· Strong background in performance tuning, building jobs in Talend Studio, and debugging existing ETL processes.
· Experience in publishing the jobs from the Talend Studio to Nexus artifact repository and deploying the jobs to TAC
from the Artifact Repository.
· Experience in migrating code between dev, test, and prod environments; jobs are migrated as snapshots to dev and test, and as releases to prod.
· Experience in creating execution plans, monitoring and scheduling using Job Conductor (Talend Admin Console).
· Experience in creating Airflow DAGs for scheduling and monitoring jobs in Apache Airflow.
· Experience in working with Talend reference projects for metadata and repository context.
· Strong experience in working with AWS services like S3, Athena, Glue etc.
· Experience in working with views, stored procedures, functions using SSMS.
· Skilled in documenting standards, best practices, test results, etc.
· Strong communication skills; experienced in working with both small and large teams.
· Experience working in an Agile/Scrum environment and ability to work well under pressure and meet deadlines.

Technical Skills:
Talend Platforms: Talend Data Services Platform 7.1, Talend Data Services Platform 6.2,
Talend Big Data Platform 6.3, Talend Big Data Platform 7.0, Talend Big Data Platform 7.2
Databases: Oracle, SQL Server, Netezza, Snowflake, Postgres, AWS Aurora MySQL, Sybase.
Other tools: Confluence, JIRA, MS Teams, Slack, ServiceNow, Nexus Artifact Repository, Box, SharePoint, Postman, Notepad++, Putty, WinSCP, DBeaver.
Programming Languages: Python, Scala, Bash, Java.

Talend Components Worked on:

Database Components: DB Commons - tDBInput, tDBOutput, tDBRow, tDBConnection, tDBClose, tDBCommit, tDBSP; DB Specifics - MS SQL Server, Oracle, Netezza, Snowflake, PostgreSQL, MySQL
ELT Components: tELTMSSqlInput, tELTMSSqlMap, tELTMSSqlOutput
File Components: tFileInputDelimited, tFileInputExcel, tFileOutputDelimited, tFileOutputExcel, tFileArchive, tFileCompare, tFileCopy, tFileDelete, tFileExist, tFileList, tFileUnarchive, tAdvancedFileOutputXML, tFileInputParquet, tFileOutputParquet
Internet Components: tFTPConnection, tFTPClose, tFTPDelete, tFTPFileExist, tFTPFileList, tFTPGet, tFTPPut, tFTPRename, tREST, tPOP
Logs & Errors Components: tDie, tFlowMeter, tFlowMeterCatcher, tLogCatcher, tLogRow, tStatCatcher, tWarn
Orchestration Components: tFileList, tFlowToIterate, tForeach, tParallelize, tPrejob, tPostjob, tReplicate, tRunJob, tSleep, tUnite
System Components: tRunJob, tSSH, tSystem
Custom Code Components: tJava, tJavaRow, tSetGlobalVar
Business Components: Salesforce Components - tSalesforceBulkExec, tSalesforceConnection, tSalesforceGetDeleted, tSalesforceInput, tSalesforceOutput, tSalesforceOutputBulk, tSalesforceOutputBulkExec
Cloud Components: Snowflake Components - tSnowflakeConnection, tSnowflakeInput, tSnowflakeOutput, tSnowflakeRow, tSnowflakeClose; Amazon S3 Components - tS3Connection, tS3Get, tS3Put
Data Quality Components: tMelissaDataAddress, tUniqRow, tVerifyEmail
Storage Components: tHDFSConfiguration, tS3Configuration

Certifications: Talend Data Integration v7 Certified Developer

Experience:
MassMutual, Springfield, MA | Sr Talend ETL Big Data Developer Feb 2020 - Present
Description:
The project involves loading data into the data mart and providing data from the data mart to outbound vendors. Part of the extracts team, which provides data to the vendors.

Responsibilities:
· Designing data integration jobs to extract data from the warehouse and provide it to the outbound teams.
· Coordinating with the Data Modeler and the business team to review the details for the extract files.
· Reviewing the view schemas created by the data modeler against the mapping specs to make sure all the required fields are present in the DB objects, and validating the data types and column lengths of the database views against the mapping specs.
· Designing the ETL data integration jobs sourcing from the view based on the Mapping specs.
· Applying the transformations in the ETL as per the specifications given in the Mapping specs.
· Creating Talend jobs using the metadata and the repository context from the reference project.
· Updating the reference project with all the metadata connections and any new context variables that need to be added or updated.
· Documenting the standards and practices to be followed for designing the Extract jobs.
· Validating the files in the AWS S3 buckets that serve as the target location for the ETL extracts.
· Creating Talend jobs to read files from AWS S3 and write them to AWS S3 using tS3 components.
· Designing an audit control process used in all the extract jobs to track the start time, end time, job status, and record count of each extract file.
· The audit process comprises an initiation job, a reusable joblet that can be used in all the extract jobs, and a closing job; all three modules are developed as reusable components and can be used for all the extract processes (a rough sketch of this bookkeeping follows this list).
· Using bash scripting to create a process that extracts table dumps from a legacy Sybase database using the bcp export utility.
· Designing control tables for the above process so the extraction can be scheduled as a batch process.
· Creating a Python script to normalize the records in a file, where one record is split into multiple records based on a repeating group of columns.
· Because building this in a Talend job would have been complex, a Python utility was created to perform the same operation (see the normalization sketch after this list).
· Working on big data jobs to read from parquet files, perform transformations and load into target parquet files.
· Creating Glue crawlers to catalog files from AWS S3 so they can be queried through AWS Athena (see the Glue crawler sketch after this list).
· Validating the parquet files on AWS S3 by reading the files through AWS Athena.
· Creating sample parquet files using Scala in the dev and test environments to make sure the jobs are performing as expected.
· Using Scala to create a dataframe from files on HDFS and S3, build a sample record set for the validations that need to be performed, and load the sample file back to S3 or HDFS.
· Using the S3 and HDFS utilities to perform basic file operations such as copy, move, and delete.
· Coordinating with the MFT team to create a file transfer job that moves files from the S3 buckets to the vendor outbound location.
· Performing unit testing for the extract processes, documenting the results, and uploading them to Box for tracking purposes.
· Updating the documentation on Confluence pages with the latest development details.
· Updating the Airflow DAGs to include the latest extracts in the orchestration jobs on Airflow (see the DAG sketch after this list).
· Monitoring the jobs scheduled on Airflow as a part of the on-call production support.
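
The audit control process above is built entirely from Talend components (an initiation job, a shared joblet, and a closing job); purely as an illustration of the bookkeeping it performs, here is a rough Python sketch against a hypothetical audit_control table, with invented column and job names:

    import sqlite3
    from datetime import datetime

    # Hypothetical audit table; the real process writes to a warehouse table through a Talend joblet.
    conn = sqlite3.connect("audit_demo.db")
    conn.execute("""CREATE TABLE IF NOT EXISTS audit_control (
        job_name TEXT, start_time TEXT, end_time TEXT, status TEXT, record_count INTEGER)""")

    def audit_start(job_name):
        # Initiation step: record the start time and mark the run as RUNNING.
        conn.execute("INSERT INTO audit_control VALUES (?, ?, NULL, 'RUNNING', NULL)",
                     (job_name, datetime.now().isoformat()))
        conn.commit()

    def audit_close(job_name, status, record_count):
        # Closing step: stamp the end time, the final status, and the extract record count.
        conn.execute("""UPDATE audit_control SET end_time = ?, status = ?, record_count = ?
                        WHERE job_name = ? AND status = 'RUNNING'""",
                     (datetime.now().isoformat(), status, record_count, job_name))
        conn.commit()

    audit_start("member_extract")        # invented job name
    # ... extract runs here ...
    audit_close("member_extract", "SUCCESS", record_count=1250)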
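
For the record-normalization utility, the actual script is not included here; the following is a minimal Python sketch of the general idea, assuming a delimited file whose rows carry a repeating group of columns (hypothetical phone1-phone3 fields) that must be split into one output row per populated value:

    import csv

    # Hypothetical layout: id, name, phone1, phone2, phone3 -> one output row per non-empty phone.
    with open("records_in.csv", newline="") as src, open("records_out.csv", "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.writer(dst)
        writer.writerow(["id", "name", "phone"])
        for row in reader:
            for col in ("phone1", "phone2", "phone3"):
                if row.get(col):                       # emit a record for each populated repeating column
                    writer.writerow([row["id"], row["name"], row[col]])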
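
For the Glue crawlers, a minimal boto3 sketch; the crawler, IAM role, catalog database, and S3 path names are placeholders. The crawler catalogs the S3 parquet files so they can then be queried through Athena:

    import boto3

    glue = boto3.client("glue", region_name="us-east-1")

    # Placeholder names throughout; the real crawler points at the extract output prefix on S3.
    glue.create_crawler(
        Name="extracts_parquet_crawler",
        Role="arn:aws:iam::123456789012:role/GlueCrawlerRole",
        DatabaseName="extracts_db",
        Targets={"S3Targets": [{"Path": "s3://example-extracts-bucket/outbound/"}]},
    )
    glue.start_crawler(Name="extracts_parquet_crawler")
    # Once the crawler finishes, the resulting catalog table can be queried from Athena to validate the files.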
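
For the Airflow orchestration, a minimal DAG sketch assuming Airflow 2.x and that each published Talend extract is launched through a shell wrapper; the DAG id, task names, schedule, and script paths are placeholders:

    from datetime import datetime
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="outbound_extracts",           # placeholder DAG id
        start_date=datetime(2021, 1, 1),
        schedule_interval="0 6 * * *",        # illustrative daily schedule
        catchup=False,
    ) as dag:
        member_extract = BashOperator(
            task_id="member_extract",
            bash_command="/opt/talend/jobs/member_extract/member_extract_run.sh ",
        )
        audit_close = BashOperator(
            task_id="audit_close",
            bash_command="/opt/talend/jobs/audit_close/audit_close_run.sh ",
        )
        member_extract >> audit_close         # close the audit record only after the extract completes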

Yale University, New Haven, CT | Talend ETL Developer Dec 2018 - Feb 2020
Description:
The project involves a Talend platform upgrade from Talend Data Services Platform 6.2 to Talend Data Services Platform 7.1, along with the migration of all database servers from AWS to Microsoft Azure. The existing code needs to be migrated to accommodate the Talend upgrade and database migration. In addition, this project also involves data mart enhancements per the internal clients' requests.

Responsibilities:
· Designed, developed, tested and implemented data warehouse and data marts enhancements.
· Worked alongside data analysts and gathered requirements from clients.
· Built ETL jobs in compliance with established standards and patterns, including error processing and integrated scheduling.
· Performed unit testing of ETL components, documented test results, and designed and built an end-to-end orchestration process (Talend TAC execution plan) to schedule, trigger, and run the ETL process.
· Completed code reviews for ETL components and design documentation, and supported all testing, including Development, Integration Testing, System Testing, User Acceptance Testing, End-to-End Testing and Performance Testing.
· Performed troubleshooting on ETL components as needed, responded to assigned defect tickets, worked with users to resolve data load defects, and prepared documentation and production migration scripts.
· Involved in the Migration of ETL code for Talend Upgrade from Data Services Platform 6.2 to 7.1.
· Involved in the migration of the SQL server databases from AWS to Azure.
· Participated in meetings with the team and the Talend support to report issues/bugs as a part of the Talend
upgrade.
· Converted the SVN project to a new Git project. Maintained version control by creating branches for code development and tags for release management.
· Enhanced the data warehouse involving the addition of new attributes as per the client requests.
· Created sub-jobs in parallel to maximize performance and reduce overall job execution time, using the parallelize component in TOS and Multithreaded Executions in Talend Studio.
· Created an alternative solution to the legacy Oracle Forms, working with the ServiceNow team to create a new form and a new ETL process.
· Created a Talend job to read an attachment from the inbox of a shared mailbox using the tPOP component.
· Due to security concerns, the tPOP-based job was not implemented; as an alternative, a Talend job was created that makes a REST call to the ServiceNow API.
· Created jobs to pull data from Workday integrations using REST calls and save the results as XML files on the Talend server (see the REST pull sketch after this list).
· Created metadata for DB connections, delimited files, and XML files, and used the metadata in the Talend jobs.
· Developed jobs to stage the data from XML files, delimited files, and client-provided views into the SQL Server database.
· Worked with stored procedures, table-valued functions, cursors, and views within SQL Server, and used the stored procedures and views in the Talend jobs.
· Created new views and enhanced existing views in SSMS as per the requirements.
· Created utility jobs to archive historic stage files based on date, move archived files from the stage directory to the archive directory, and copy files from the MFT server to the Talend job server (see the archiving sketch after this list).
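
The Workday pulls above were built with Talend REST components; as a hedged illustration of the same pattern, a short Python sketch using the requests package, with a placeholder report URL and credentials, saving the response as XML for the downstream staging job:

    import requests

    # Placeholder endpoint and credentials; the real call targets a Workday integration/report URL.
    url = "https://example.workday.com/ccx/service/customreport2/tenant/report_name?format=xml"
    resp = requests.get(url, auth=("integration_user", "integration_password"), timeout=60)
    resp.raise_for_status()

    # Save the XML payload where the staging job expects to pick it up.
    with open("/data/talend/inbound/workday_report.xml", "wb") as out:
        out.write(resp.content)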
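
For the archiving utility, the actual jobs were built in Talend; this is only a small Python sketch of the date-based archive step, with hypothetical stage and archive directories and an assumed 30-day cutoff:

    import os
    import shutil
    import time

    STAGE_DIR = "/data/talend/stage"        # hypothetical directories
    ARCHIVE_DIR = "/data/talend/archive"
    CUTOFF_DAYS = 30                        # assumed retention window for stage files

    cutoff = time.time() - CUTOFF_DAYS * 86400
    for name in os.listdir(STAGE_DIR):
        path = os.path.join(STAGE_DIR, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            shutil.move(path, os.path.join(ARCHIVE_DIR, name))   # move historic stage files to the archive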

Evariant Healthcare, Farmington, CT | Talend ETL Developer Jan 2016 - Dec 2018
Description:
This project involved loading various clients' patient data, provided from heterogeneous sources, into their respective Salesforce orgs through an ETL process. The dedupe process prevents the insertion of possible duplicate patient records into the Salesforce platform. Duplicates are identified by comparing the demographics of the patients and passed into the Salesforce Merge function, where the master record is retained and the duplicate record is deleted. The CRM process handles the insert and update of the patients' demographic, encounter, and clinical code information sourced from the files provided by various clients.

Responsibilities:
· Participated in all phases of development life cycle with extensive involvement in the definition and design
meetings, functional and technical walkthroughs.
· Developed ETL code for Dedupe and CRM processes for various clients using Talend Real-Time Big Data Platform.
· Created jobs to load data from sources such as flat files (delimited), Salesforce, Oracle, and Netezza databases into Oracle, Netezza, and Salesforce targets as per the mappings provided in the requirement docs.
· Developed jobs with validations for different date formats, names, email addresses, physical addresses, phone numbers, etc., using the various transformations provided within Talend tMap.
· Created a custom routine for email validation and used it in the Talend jobs (a comparable-logic sketch follows this list).
· Developed new scripts or updated existing scripts to perform source file validation prior to staging the files in the stage database.
· Developed orchestration jobs with both parallel and serial sub-jobs and made sure the job parameters were accurate.
· Developed jobs per the client standards, included stats and logs wherever required, and made sure all jobs had error handling included.
· Used tStatCatcher, tDie, and tLogRow to create a generic joblet that stores processing stats in a database table to record job history.
· Created utility process-control jobs to make sure jobs are not rerun even if triggered by mistake.
· Implemented a delta process with current and previous tables to ensure that any redundant data present in the source files is eliminated.
· Created Talend jobs to copy the files from one server to another and utilized Talend FTP components.
· Published the jobs from the Talend studio to Nexus artifact repository and deployed the jobs to TAC.
· Used ETL methodologies and best practices to create Talend ETL jobs.
· Created many complex ETL jobs for data exchange from and to Database Server and various other systems including
delimited, CSV, Flat file structures.
· Developed Linux scripts to validate source files prior to the load, archive files after the load process and consolidate
all historic files into a single file.
· Worked on migrating the ETL code from Netezza to Snowflake using Talend Big Data Platform 7.0, replacing all the Netezza components in the ETL jobs with Snowflake components.
· Redesigned all the SQL statements used in the ETL to make them compatible with Snowflake.
· Worked on the near-real-time Snowflake project to read HL7 messages in JSON format and process them into Salesforce.
· Created Snowflake views on top of the raw tables, where the HL7 messages are loaded via Snowpipe, to read from the semi-structured JSON objects (see the Snowflake view sketch after this list).
· Designed a near-real-time ETL process that reads the HL7 JSON messages from the views and loads them into Salesforce.
· Worked with the Snowflake CLI to execute SQL statements and perform DML and DDL operations.
· Involved in production deployment activities, creation of the deployment guide for migration of the code to
production.
· Involved in production support and monitored scheduled jobs. Well versed in reading the error logs from TAC and handling errors, either by restarting the job when there is a connection issue or by reporting it to the team when there is anything that can't be fixed.
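
The email-validation routine above was written as a Talend custom routine (Java); below is only a comparable-logic sketch in Python, using a simple regular expression rather than the routine's actual rules:

    import re

    # Simplified pattern; the real routine may apply stricter or additional checks.
    EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

    def is_valid_email(value):
        """Return True when the value looks like a well-formed email address."""
        return bool(value) and EMAIL_RE.match(value.strip()) is not None

    print(is_valid_email("jane.doe@example.com"))   # True
    print(is_valid_email("not-an-email"))           # False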
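
For the Snowflake views over the Snowpipe-loaded HL7 JSON, a sketch using the snowflake-connector-python package; the connection parameters, table, VARIANT column, and JSON paths are assumptions, while the colon-path and ::cast syntax is Snowflake's standard semi-structured access:

    import snowflake.connector

    # Placeholder connection parameters.
    conn = snowflake.connector.connect(
        account="xy12345", user="etl_user", password="***",
        warehouse="ETL_WH", database="CLINICAL", schema="RAW",
    )

    # Flatten the semi-structured HL7 JSON (loaded by Snowpipe into a VARIANT column named MSG)
    # into a typed view that the near-real-time Salesforce load reads from.
    conn.cursor().execute("""
        CREATE OR REPLACE VIEW CLINICAL.CURATED.V_HL7_ENCOUNTERS AS
        SELECT
            msg:patient:id::string                 AS patient_id,
            msg:encounter:id::string               AS encounter_id,
            msg:encounter:admit_ts::timestamp_ntz  AS admit_ts
        FROM CLINICAL.RAW.RAW_HL7
    """)
    conn.close()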

ZocDoc, New York, NY (Offshore) | Jr. Talend ETL Developer Oct 2013 - Dec 2015
Description:
ZocDoc is a healthcare organization that provides real-time access to providers. ZocDoc offers patients the ability to book their appointments at times that are most convenient for them. This project involved loading the appointment information into the data warehouse using an ETL process. The patient data was loaded from the flat files to the stage and then to the warehouse.

Responsibilities: 
· Analyzed business and functional requirements, and developed and used test plans, test cases, and test scripts for both positive and negative tests.
· Designed and developed end-to-end ETL process from various source systems to staging area, from staging to data
marts.
· Communicated and collaborated with partner teams across the company to assess data needs and prioritize
accordingly.
· Developed jobs in Talend Enterprise edition from stage to source, intermediate, conversion and target.
· Implemented complex business rules by creating re-usable transformations and robust mappings.
· Wrote expressions within tMap as per the business needs.
· Extracted data from flat files and databases, applied business logic, and loaded the data into the staging database as well as flat files.
· Involved in writing SQL Queries and used Joins to access data from Oracle and Postgres.
· Developed Talend jobs to populate the claims data to data warehouse - star schema.
· Developed mappings to load fact and dimension tables, including SCD Type 2 dimensions and incremental loading, and unit tested the mappings (see the SCD Type 2 sketch after this list).
· Implemented FTP operations using Talend Studio and its FTP components to transfer files between network folders as well as to the FTP server, and created complex mappings.
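
The SCD Type 2 loads above were implemented as Talend mappings; as an illustration of the versioning rule itself, a small Python sketch over a hypothetical customer dimension keyed by a natural key with effective/end dates and a current-row flag:

    from datetime import date

    # Hypothetical dimension rows: each natural key can have many versions, exactly one of them current.
    dim_rows = [
        {"cust_id": "CUST001", "city": "Hartford", "eff_date": date(2014, 1, 1),
         "end_date": None, "is_current": True},
    ]

    def apply_scd2(rows, cust_id, new_city, load_date):
        """SCD Type 2: expire the current version when an attribute changes, then insert a new version."""
        current = next((r for r in rows if r["cust_id"] == cust_id and r["is_current"]), None)
        if current and current["city"] == new_city:
            return                                   # no change -> keep the existing current version
        if current:
            current["end_date"] = load_date          # close out the old version
            current["is_current"] = False
        rows.append({"cust_id": cust_id, "city": new_city, "eff_date": load_date,
                     "end_date": None, "is_current": True})

    apply_scd2(dim_rows, "CUST001", "Stamford", date(2015, 6, 1))   # adds a second version
    print(len(dim_rows))                                            # 2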
