Data Normalization in Data Mining
INTRODUCTION:
Data normalization is a technique used in data mining to transform the values of a
dataset into a common scale. This is important because many machine learning
algorithms are sensitive to the scale of the input features and can produce better
results when the data is normalized.
There are several different normalization techniques that can be used in data
mining, including:
1. Min-Max normalization: This technique scales the values of a feature to a range
between 0 and 1. This is done by subtracting the minimum value of the feature
from each value, and then dividing by the range of the feature.
2. Z-score normalization: This technique scales the values of a feature to have a
mean of 0 and a standard deviation of 1. This is done by subtracting the mean of
the feature from each value, and then dividing by the standard deviation.
3. Decimal scaling: This technique scales the values of a feature by dividing them by a power of 10.
4. Logarithmic transformation: This technique applies a logarithmic transformation
to the values of a feature. This can be useful for data with a wide range of values,
as it can help to reduce the impact of outliers.
5. Root transformation: This technique applies a square root transformation to the
values of a feature. This can be useful for data with a wide range of values, as it
can help to reduce the impact of outliers.
It's important to note that normalization should be applied only to the input features, not the target variable, and that different normalization techniques may work better for different types of data and models.
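To make the logarithmic and root transformations above concrete, here is a minimal sketch in Python using NumPy; the feature values are made up purely for illustration. (Min-Max, Z-score, and decimal scaling are sketched in their respective sections below.)

```python
import numpy as np

# A made-up feature with a wide range of values.
x = np.array([1.0, 10.0, 100.0, 1000.0, 10000.0])

# Logarithmic transformation: log1p(x) = log(1 + x) compresses large values,
# which reduces the impact of outliers (and keeps zero values valid).
log_transformed = np.log1p(x)

# Root (square root) transformation: a milder compression than the logarithm.
root_transformed = np.sqrt(x)

print(log_transformed)   # [0.693 2.398 4.615 6.909 9.21 ] approximately
print(root_transformed)  # [  1.      3.162  10.     31.623 100.   ]
```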
In conclusion, normalization is an important step in data mining, as it can help to improve the performance of machine learning algorithms by scaling the input features to a common scale. This can help to reduce the impact of outliers and improve the accuracy of the model.
Normalization is used to scale the data of an attribute so that it falls in a smaller
range, such as -1.0 to 1.0 or 0.0 to 1.0. It is generally useful for classification
algorithms.
Need of Normalization –
Normalization is generally required when we are dealing with attributes on different scales; otherwise, an equally important attribute (on a lower scale) may be diluted in effectiveness because another attribute has values on a larger scale. In simple words, when multiple attributes are present but have values on different scales, this may lead to poor data models while performing data mining operations, so the attributes are normalized to bring them all onto the same scale.
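As a small illustration of this point, the following sketch (with made-up attribute names, ranges, and values) shows how a Euclidean distance is dominated by the attribute on the larger scale until both attributes are rescaled:

```python
import numpy as np

# Two made-up records with attributes on very different scales:
# age (years) and annual income.
a = np.array([25, 500000])   # age 25, income 500,000
b = np.array([55, 510000])   # age 55, income 510,000

# Without normalization, the distance is dominated by the income attribute,
# even though the 30-year age gap is arguably the more significant difference.
print(np.linalg.norm(a - b))                # ~10000.04

# After min-max scaling each attribute to [0, 1] (assuming age lies in
# [20, 60] and income in [100000, 1000000]), both contribute comparably.
a_scaled = np.array([(25 - 20) / 40, (500000 - 100000) / 900000])
b_scaled = np.array([(55 - 20) / 40, (510000 - 100000) / 900000])
print(np.linalg.norm(a_scaled - b_scaled))  # ~0.75
```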
Methods of Data Normalization –
Decimal Scaling
Min-Max Normalization
z-Score Normalization (zero-mean Normalization)
Decimal Scaling Method For Normalization –
It normalizes the data by moving the decimal point of its values. To normalize the data by this technique, we divide each value by a power of 10 large enough that the largest absolute value falls below 1. A data value, vi, is normalized to vi' by using the formula below –

vi' = vi / 10^j

where j is the smallest integer such that max(|vi'|) < 1. Example –
Let the input data be: -10, 201, 301, -401, 501, 601, 701.
Step 1: The maximum absolute value in the given data (m) is 701.
Step 2: Divide the given data by 1000 (i.e. j = 3).
Result: The normalized data is: -0.01, 0.201, 0.301, -0.401, 0.501, 0.601, 0.701.
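A minimal sketch of decimal scaling in Python (NumPy assumed; the helper function decimal_scaling is hypothetical, written only to mirror the formula above):

```python
import numpy as np

def decimal_scaling(values):
    """Divide each value by 10^j, where j is the smallest integer such that
    the maximum absolute normalized value is less than 1."""
    values = np.asarray(values, dtype=float)
    max_abs = np.max(np.abs(values))
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return values / (10 ** j)

data = [-10, 201, 301, -401, 501, 601, 701]
print(decimal_scaling(data))
# [-0.01   0.201  0.301 -0.401  0.501  0.601  0.701]
```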
Min-Max Normalization –
In this technique of data normalization, a linear transformation is performed on the original data. The minimum and maximum values of the data are fetched, and each value is replaced according to the following formula.
v' = ((v - Min(A)) / (Max(A) - Min(A))) * (new_max(A) - new_min(A)) + new_min(A)
Where A is the attribute data, Min(A) and Max(A) are the minimum and maximum values of A respectively, v' is the new value of each entry in the data, v is the old value of each entry in the data, and new_max(A), new_min(A) are the maximum and minimum values of the new range (i.e. the boundary values of the required range) respectively.
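A minimal sketch of min-max normalization (NumPy assumed; the sample values and the default [0, 1] target range are made up for illustration):

```python
import numpy as np

def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values from [Min(A), Max(A)] to [new_min, new_max]."""
    values = np.asarray(values, dtype=float)
    old_min, old_max = values.min(), values.max()
    return (values - old_min) / (old_max - old_min) * (new_max - new_min) + new_min

marks = [8.0, 10.0, 15.0, 20.0]              # made-up attribute values
print(min_max_normalize(marks))              # [0.    0.167 0.583 1.   ]
print(min_max_normalize(marks, 0.0, 100.0))  # [  0.    16.667  58.333 100.   ]
```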
Z-score normalization –
In this technique, values are normalized based on the mean and standard deviation of the data A. The formula used is:
v' = (v - Ā) / σ_A

where v' and v are the new and old values of each entry in the data respectively, and σ_A and Ā are the standard deviation and mean of A respectively.
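A minimal sketch of z-score normalization (NumPy assumed; the sample values are made up):

```python
import numpy as np

def z_score_normalize(values):
    """Rescale values so the attribute has mean 0 and standard deviation 1."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

data = [10.0, 20.0, 30.0, 40.0, 50.0]   # made-up attribute values
normalized = z_score_normalize(data)
print(normalized)           # [-1.414 -0.707  0.     0.707  1.414]
print(normalized.mean())    # ~0.0
print(normalized.std())     # ~1.0
```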
ADVANTAGES OR DISADVANTAGES:
Data normalization in data mining can have a number of advantages and
disadvantages.
Advantages:
1. Improved performance of machine learning algorithms: Normalization can help to improve the performance of machine learning algorithms by scaling the input features to a common scale. This can help to reduce the impact of outliers and improve the accuracy of the model.
2. Better handling of outliers: Normalization can help to reduce the impact of
outliers by scaling the data to a common scale, which can make the outliers less
influential.
3. Improved interpretability of results: Normalization can make it easier to interpret
the results of a machine learning model, as the inputs will be on a common scale.
4. Better generalization: Normalization can help to improve the generalization of a
model, by reducing the impact of outliers and by making the model less sensitive
to the scale of the inputs.
Disadvantages:
1. Loss of information: Normalization can result in a loss of information if the original scale of the input features is important.
2. Impact on outliers: Normalization can make it harder to detect outliers as they will
be scaled along with the rest of the data.
3. Impact on interpretability: Normalization can make it harder to interpret the
results of a machine learning model, as the inputs will be on a common scale,
which may not align with the original scale of the data.
4. Additional computational costs: Normalization can add additional computational
costs to the data mining process, as it requires additional processing time to scale
the data.
In conclusion, data normalization can have both advantages and disadvantages. It can improve the performance of machine learning algorithms and make it easier to interpret the results. However, it can also result in a loss of information and make it harder to detect outliers. It's important to weigh the pros and cons of data normalization and carefully assess the risks and benefits before implementing it.