ETL Vs DB Testing
ETL Vs DB Testing
Software Testing has a variety of areas to be concentrated. Major varieties are functional and
non-functional testing. Functional Testing is the procedural way to ensure so that the
functionality developed works as expected. Non-functional testing is the approach by which the
non-functional aspects like enhanced or performance at an acceptable level can be ensured.
There is another flavor of testing called DB testing. Data is organized in the database in the form
of tables. For business, there can be flows where the data from the multiple tables can be merged
or processed on to a single table and vice versa.
ETL Testing is one another kind of testing that is preferred in the business case where a kind of
reporting need is sought by the clients. The reporting is sought in order to analyze the demands,
needs and the supply so that clients, business and the end-users are very well served and
benefited.
What will you learn in this tutorial?
In this tutorial, you will learn what is Database Testing, what is ETL Testing, a difference
between DB Testing and ETL Testing, and more details about ETL testing need, process, and
planning with real examples.
We have also covered ETL Testing in more detail on the below page. Also, have a look at it.
DB Testing:
DB Testing is usually used extensively in the business flows where there are multiple data flows
occurring in the application from multiple data sources on to a single table. The data source can
be a table, flat file, application or anything else that can yield some output data.
In turn, the output data obtained can still be used as input for the sequential business flow. Hence
when we perform DB testing the most important thing that has to be captured is the way the data
can get transformed from the source along with how it gets saved in the destination location.
Synchronization is one major and the essential thing that has to be considered when performing
the DB Testing. Due to the positioning of the application in the architectural flow, there might be
few issues with the data or DB synchronization. Hence while performing the testing, this has to
be taken care of as this can overcome the potential invalid defects or bugs.
Example #1:
Project “A” has integrated architecture where the particular application makes use of data
from several other heterogeneous data sources. Hence the integrity of these data with the
destination location has to be done along with the validations for the following:
Primary foreign key validation
Column values integrity
Null values for any columns
What is ETL Testing?
ETL Testing is a special type of testing that the client wants to have it done for their forecasting
and analysis of their business. This is mostly used for reporting purposes. For instance, if the
clients need to have reported on the customers who use or go for their product based on the day
they purchase, they have to make use of the ETL reports.
Post analysis and reporting, this data is data warehoused to a data warehouse where the old
historical business data has to be moved.
This is a multiple level testing as the data from the source is transformed into multiple
environments before it reaches the final destined location.
Example #2:
We will consider a group “A” doing retail customer business through a shopping market
where the customer can purchase any household items required for their day to day survival.
Here all the customers visiting are provided with a unique membership id with which they can
gain points every time they come to purchase things from the shopping market.
The regulations provided by the group say that the points gained expire every year. And
depending upon their usage, the membership can be either upgraded to a higher grade
member or downgraded to a lower grade member comparatively to the current grade.
After 5 years of shopping market establishment now management is looking for scaling up
their business along with revenue.
Hence they required few business reports so that they can promote their customers.
Following is a tabular form that describes the basic behavior of both the testing formats.
Applicable In the functional system where the business External to the business flow environment. in
place flow occurs the historical business data
Business Severe impacts can lead as it is the integrated Potential impacts as in when the clients wants
impact architecture of the business flows the forecasting and analysis to be done
Data Nature Normalized data is being used here Denormalized data is being used here
So in reality, the clients want to analyze the historical data and come up with some reports
strategically.
Most of the time the data is captured from the source databases are not directly. All the source
data will have the tables’ view from where the data can be used.
Example: Following is an example of how the mappings can be provided. The two columns
VIEW_NAME and TABLE_NAME can be used to represent the views for reading data from the
source and the table in the ETL environment respectively.
It is advisable to maintain the naming convention that can help us while planning for automation.
Generic notation that can be used is just prefixing the name of the environment.
The most significant thing in ETL is about identifying the essential data and the tables from the
source. The next essential step is the mapping of tables from the source to the ETL environment.
Following is an example of how the mapping between the tables from the various environments
can be related to the ETL purpose.
The above mapping assumes the data from the source table to the staging table. And from then
on to the tables in EDW and then to OLAP which is the final reporting environment. Hence at
any point in time, data synchronization is very important for the ETL’s sake.
Critical ETL Needs
As we understand ETL is the need for forecasting, reporting and analyzing the business in order
to capture the customer needs in a more successive manner. This will enable the business to have
higher demands than in the past.
Here are few of the critical needs without which ETL testing cannot be achieved:
1. Data and tables identification: This is important as there can be many other irrelevant
and unnecessary data that can be of least importance when forecasting and analyze the
customer needs. Hence the relevant data and the tables have to be selected before starting
up the ETL works.
2. Mapping sheet: This is one of the critical needs while doing ETL works. Mapping of the
right table from the source to the destination is mandatory and any problems or incorrect
data in this sheet might impact the whole ETL deliverable.
3. Table designs and data, column type: This is the next major step when considering the
mapping of source tables into the destined tables. The column type has to match with the
tables at both the places etc.
4. Database access: The main thing is access to the database where ETL goes on. Any
restrictions on the access will have an equivalent impact.
Example #3:
A company which manufactures silk fabric wanted to analyze their annual sales. On review of
their annual sales, they found during the month of August and September there was a
tremendous fall in sales with the use of the report they generated.
Hence they decided to roll out the promotional offer like the exchange, discounts, etc., that
enhanced their sales.
Basic Issues In ETL Testing
There can be a number of issues while performing ETL testing like the following:
Either the access to the source tables or the views will not be valid.
The column name and the data type from the source to the next layer might not match.
A number of records from the source table to the destined tabled might not match.
And there might be much more.
Following is a sample of mapping sheet where there are columns like VIEW_NAME,
COLUMN_NAME, DATA_TYPE, TABLE_NAME, COLUMN_NAME, DATA_TYPE, and
TRANSFORMATION LOGIC present.
The first 3 columns represent the details of the source database and the next 3 are the details for
the immediate preceding database. The last column is very important. Transformation logic is the
way the data from the source is read and stored in the destined database. This depends on the
business and ETL needs.
Following are a few of the points to be taken care while ETL Test Planning and Execution:
#1) Data is being extracted from the heterogeneous data sources
#2) ETL process handling in the integrated environment that have different:
DBMS
OS
Hardware
Communication protocols
#3) Necessity in having a logical data mapping sheet before the physical data can be transformed
#4) Understanding and examining of the data sources
#5) Initial load and the incremental load
#6) Audit columns
#7) Loading the facts and the dimensions
ETL Tools And Their Significant Usage
ETL tools are basically used to build and convert the transformation logic by taking data from
the source into another applying the transformation logic. You can also map the schemas from
the source to the destination which occurs in unique ways, transform and clean up data before it
can be moved to the destination, along with loading at the destination in an efficient manner.
This can significantly reduce the manual efforts as the mapping can be done that is used for
almost all of the ETL validation and verification.
ETL tools:
1. Informatica – PowerCenter – is one of the popular ETL tools that is introduced by the
Informatica Corporation. This has a very good customer base covering wide areas. The
major components of the tool are its tools for clients and the repository tools and the
servers. To know more about the tool please click here
2. IBM – Infosphere Information Server – IBM who is the market leader in terms of
Computer technology has developed the Infosphere Information server that is used for
Information Integration and Management in the year 2008. To know more about the tool
please click here
3. Oracle – Data Integrator – Oracle Corporation has developed its ETL tool in the name
of Oracle – Data Integrator. Their increasing customer support has made them update
their ETL tools in various versions. To know more about the tool please click here
More examples of the usage of ETL testing:
Considering some Airlines which want to roll out promotions and offers to attract the customers
strategically. Firstly they will try to understand the demands and needs of the customer’s
specifications. In order to achieve this, they will require the historical data preferably the
previous 2 years’ data. Using the data they will analyze and prepare some reports that will be
helpful in understanding the customers’ needs.
Analyzing these reports will help the clients in identifying the kind of promotions and offers that
will benefit the customers and at the same time can benefit businesses where this can become a
Win-Win situation. This can be easily achieved by ETL testing and reports.
In parallel, the IT segment faces a serious DB issue that has been noticed that has stopped
multiple services, in turn, has the potential to cause impacts in the business. On investigation, it
was identified that some invalid data has corrupted a few databases that needed to be corrected
manually.
In the former case, it is ETL reports and testing that will be required.
Whereas the latter case is where the DB testing has to be done properly to overcome issues with
invalid data.
Conclusion
Hope the above tutorial has provided a simple and clear overview of what ETL testing is and
why it has to be done along with the business impacts or benefits they yield. This does not stop
here, but it can extend to set foresight in growth in business.
About the author: This tutorial is written by Nagarajan. He is a Test Lead with over 6 years of
Software Testing experience in various functional areas like Banking, Airlines, and Telecom in
terms of both manual and automation.
Please let us know your thoughts/questions in the comments below.
Recommended Reading
ETL Testing Interview Questions and Answers
ETL Testing Data Warehouse Testing Tutorial (A Complete Guide)
10 Best ETL Testing Tools in 2021 [TOP SELECTIVE]
How to Perform ETL Testing Using Informatica PowerCenter Tool
31 Top Database Testing Interview Questions and Answers
40+ Best Database Testing Tools - Popular Data Testing Solutions
Database Testing Complete Guide (Why, What, and How to Test Data)
Selenium Database Testing (Using WebDriver and JDBC API)
CategoriesDatabase TestingPost navigation
How to Write an Effective Test Summary Report [Sample Report Download]
How to Test Smarter: Explore More, Document Less
21 thoughts on “ETL vs. DB Testing – A Closer Look at ETL Testing
Need, Planning and ETL Tools”
1. Surbhi
nice explanation. do you have ETL testing training as well?
Reply
2. Sathish
Nice article of ETL Testing.
Reply
3. Kunjal Gandhi
Awesome explanation!!
Reply
4. Venkateswara Rao Kotta
Very Nice Article
Reply
5. Njuneki Jayne
I have enjoyed reading this article, very educative.
Reply
6. rinku
thanku for the valuable information…
Reply
7. Nagarajan STH Author
@Surbhi, Sathish, Venkateswara Rao Kotta, Njuneki Jayne
Thanks for your comments. You can always get back to us regarding any clarifications from
Software Testing.
-Nagarajan
Reply
8. Nagarajan STH Author
@Surbhi,
We are Planning to conduct training on ETL testing as well. Will keep posted on our
training’s calendar.. Please keep checking!!
-Nagarajan
Reply
9. Boxfish
Useful Article, Thank you for Sharing.
I think your website have a problem with html link. i didn’t see any images on this page, fyi.
Reply
10. Nandhini
Very nicely and neatly explained. Thanks
Reply
11. Manander Singh
Nice post!!!
But after reading you article I thought its a kind of ETL testing.
Can you explain a little….
Reply
12. Tester
Hi, I have one suggestion for your web site. If one left clicks on any image on web page it
gets open in the entire page and we need to go back to the page. eg. Check page
– https://www.softwaretestinghelp.com/etl-testing-vs-db-testing/. It would be much better if
your image will not respond to left click and rather provide this option in right click.
Reply
13. Neelakanta
Nice Explanation
Reply
14. Piyush Dubey
Thanks for such a good exlplanation. Are you conducting ETL testing also. If yes then plz
update me on piyushdubey5@gmail.com.
Reply
15. Vasu
Which one is best of these two etl and db testing. Just suggest ne….
Reply
16. Saritha
This article provided required info for further progress in ETL Testing in initial stage
Reply
17. mallesh
Fully clarified about ETL testing and Data Warehouse thank you very much for your efforts
to help others…..
Reply
18. James
About the Automation tool for DB Testing & ETL testing as you suggested:
– I completely agree: automation tools (for ETL testing) are Informatica, QuerySurge,
COGNOS, …
– But in the automation tools (for Database testing): I don’t think QTP, Selenium are the
automation tool for Database testing. Because if you research well about the nature of QTP
or Selenium, you’ll find out that the main function of these tools focus to automation GUI
(rather than communicating directly to the database).
**Example: We have built an Framework X to test a website with the data shown on website
are getting from a specific Database (SQL Server). The Framework X use Selenium
WebDriver API for interacting on GUI (such as: click, mouse move, assert text, …). Besides
for the communicating directly to the database, the Framework X use the SqlClient API.
==> As you see, for the testing a value shown on website need the combination both
SqlClient API & Selenium WebDriver API. The SqlClient API to help you get expected data
from Database, the WebDriver API help you to get the current value on GUI for testing as
an actual result. Base on the expected & actual result, you’ll make a comparison via two
ways: 1-use assert function of Selenium WebDriver API for comparison; 2-use nature assert
function of Framework X for comparison.
==> Let’s get to the core of the discussion: I think that for the automation test Database.
– We can’t focus to 1 specific Automation GUI API for Database testing.
– We should focus to many things, such as: Automation GUI API, Communication database
API, etc … instead of only focusing on Automation GUI API.
Many thanks,
Reply
19. Mukesh
Nice post for freshers
Reply
20. Gopi
Hi Nagaraj, do you conduct online training on ETL. If yes please share me the details to my
email @ mailme.gopi@yahoo.com. I am very much impressed the way you explained on
the difference between ETL and DB testing.
Reply
21. swetha
if ETL testing has been conducted or planned to conduct please keep me informed.
QuerySurge i am looking.
Reply
Leave a Comment