1.4 Data Mining in A Programming Language

Uploaded by

Manuel Barrera

1. Data Analysis
1.4 Data mining in a programming language

Coordinación de Tecnologías
para la Educación
Contemporary Developments - Edgar Olivares (2024)
INTRODUCTION
Hello everyone,

In this presentation we are going to talk about data analysis and data mining, and how we
can use a programming language to take advantage of them. As we all know, data is
extremely important for companies that want to understand their performance and make
decisions, as well as for technology, especially artificial intelligence and machine
learning. We will cover what data analysis and data mining are, some of their tools, how
to analyze data, and how this relates to technology.

Remember that my door is always open in case you have any questions.

DATA ANALYSIS INTRODUCTION

Nowadays, data analysis has become a fundamental tool for decision-making in
companies. It is a discipline that allows you to analyze large amounts of information
and extract valuable knowledge to improve the performance of companies.

For example, manufacturing companies often record the run time, idle time, and work
queue of various machines, then analyze those records to better plan workloads and
keep machines running closer to their maximum capacity.
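
As a rough illustration of the manufacturing example, the sketch below computes how close each machine runs to its capacity from recorded run time and idle time. The machine names and hour values are invented sample data, and `utilization` is a hypothetical helper written for this slide, not part of any particular tool.

```python
# Hypothetical log: hours each machine spent running vs. idle in one week.
machine_log = {
    "press_a": {"run_hours": 30.0, "idle_hours": 10.0},
    "lathe_b": {"run_hours": 18.0, "idle_hours": 22.0},
    "mill_c": {"run_hours": 36.0, "idle_hours": 4.0},
}


def utilization(run_hours, idle_hours):
    """Share of available time the machine was actually running."""
    total = run_hours + idle_hours
    return run_hours / total if total else 0.0


# Rank machines from least to most utilized to see where work could be shifted.
report = sorted(
    ((name, utilization(**hours)) for name, hours in machine_log.items()),
    key=lambda item: item[1],
)

for name, share in report:
    print(f"{name}: {share:.0%} utilized")
```

A planner could read the top of this list as the machines with spare capacity and move queued work toward them.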

WHAT IS DATA ANALYSIS

Data analysis is the process of examining, cleaning, transforming, and modeling
data with the goal of discovering useful information, reaching conclusions, and
supporting decision-making. This process involves the application of various
techniques and methods to extract meaningful patterns, trends, correlations, and
insights from data sets. The information obtained can be used to optimize processes
and increase the overall efficiency of a business or system.

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist
facts to suit theories, instead of theories to suit facts,” Sherlock Holmes proclaims in
Sir Arthur Conan Doyle's A Scandal in Bohemia.

DATA ANALYSIS TYPES
There are four main types of data analysis:

● Descriptive
● Diagnostic
● Predictive
● Prescriptive
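
The slide only names the four types, so the sketch below leans on the usual reading of each one: descriptive asks what happened, diagnostic asks why, predictive asks what is likely to happen, and prescriptive asks what to do about it. That reading is an assumption here. The sales figures, the target of 180 units, and the helper names `pearson` and `fit_line` are all invented for illustration.

```python
from statistics import mean

# Invented monthly figures: advertising spend and units sold.
ad_spend = [10, 12, 14, 16, 18, 20]
units_sold = [100, 110, 125, 135, 150, 160]

# Descriptive: what happened? Summarize the past.
average_units = mean(units_sold)


# Diagnostic: why did it happen? Measure how closely sales follow ad spend
# with the Pearson correlation coefficient.
def pearson(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


correlation = pearson(ad_spend, units_sold)


# Predictive: what is likely to happen? Fit a least-squares line.
def fit_line(xs, ys):
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


slope, intercept = fit_line(ad_spend, units_sold)
forecast = slope * 22 + intercept  # expected units at a spend of 22

# Prescriptive: what should we do? Pick the smallest spend in the range
# considered that is forecast to reach the target.
target = 180
recommended_spend = next(
    spend for spend in range(10, 41) if slope * spend + intercept >= target
)

print(average_units, round(correlation, 3), round(forecast, 1), recommended_spend)
```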

DATA ANALYSIS PROCESS

As the data available to companies continues to grow both in amount and complexity,
so too does the need for an effective and efficient process by which to harness the
value of that data. The data analysis process typically moves through several iterative
phases. Let’s take a closer look at each.

Identify the business question you’d like to answer. What problem is the company
trying to solve? What do you need to measure, and how will you measure it?

Collect the raw data sets you’ll need to help you answer the identified question.
Data collection might come from internal sources, like a company’s customer
relationship management (CRM) software, or from secondary sources, like
government records or social media application programming interfaces (APIs).

DATA ANALYSIS PROCESS
Clean the data to prepare it for analysis. This often involves purging duplicate and
anomalous data, reconciling inconsistencies, standardizing data structure and
format, and dealing with stray whitespace and other syntax errors.

Analyze the data. By manipulating the data using various data analysis techniques
and tools, you can begin to find trends, correlations, outliers, and variations that
tell a story. During this stage, you might use data mining to discover patterns
within databases or data visualization software to help transform data into an
easy-to-understand graphical format.

Interpret the results of your analysis to see how well the data answered your
original question. What recommendations can you make based on the data?
What are the limitations to your conclusions?
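
The cleaning and analysis phases can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions: the records are invented, `MAX_PLAUSIBLE_TOTAL` is an assumed business rule for spotting anomalous values, and `clean` is a hypothetical helper. Real projects would usually rely on a dedicated data library.

```python
# Invented raw records with typical problems: stray whitespace, inconsistent
# casing, a duplicate row, a missing value, and an implausible amount.
raw_rows = [
    {"customer": " Ana ", "city": "monterrey", "order_total": "120.50"},
    {"customer": "Ana", "city": "Monterrey", "order_total": "120.50"},
    {"customer": "Luis", "city": "PUEBLA ", "order_total": "80"},
    {"customer": "Marta", "city": "Puebla", "order_total": ""},
    {"customer": "Raul", "city": "Toluca", "order_total": "999999"},
]

MAX_PLAUSIBLE_TOTAL = 10_000  # assumed business rule for this sketch


def clean(rows):
    seen = set()
    cleaned = []
    for row in rows:
        # Standardize format: trim whitespace and normalize casing.
        customer = row["customer"].strip()
        city = row["city"].strip().title()
        total_text = row["order_total"].strip()
        if not total_text:
            continue  # drop rows with a missing amount
        total = float(total_text)
        if total > MAX_PLAUSIBLE_TOTAL:
            continue  # drop anomalous values
        key = (customer, city, total)
        if key in seen:
            continue  # drop duplicates
        seen.add(key)
        cleaned.append({"customer": customer, "city": city, "order_total": total})
    return cleaned


clean_rows = clean(raw_rows)

# Analyze: total sales per city.
sales_by_city = {}
for row in clean_rows:
    sales_by_city[row["city"]] = sales_by_city.get(row["city"], 0.0) + row["order_total"]

print(sales_by_city)
```

Only two of the five raw rows survive cleaning, which is a useful reminder to report how much data was discarded when interpreting the results.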

DATA ANALYSIS PROCESS IN EXCEL

Name: Master Data Analysis on Excel in Just 10 Minutes
Duration: 11:31
Account: Kenji Explains

DATA ANALYSIS (DATA MINING)
Now that we know about data analysis, we need to introduce another concept for taking
advantage of information: data mining.

Data mining is the process of searching and analyzing a large batch of raw data in
order to identify patterns and extract useful information.

The difference between the two is that data analysis cleans the information and presents
it in a way that makes decisions easier, whereas data mining works through the
information to extract specific information.

DATA ANALYSIS (DATA MINING)
Data mining involves exploring and analyzing large blocks of information to
glean meaningful patterns and trends. The data mining process breaks down
into four steps:

1. Data is collected and loaded into data warehouses on site or on a cloud service.
2. Business analysts, management teams, and information technology
professionals access the data and determine how they want to organize it.
3. Custom application software sorts and organizes the data.
4. The end user presents the data in an easy-to-share format, such as a graph
or table.
DATA ANALYSIS (Data Mining Techniques)
Data mining uses algorithms and various other techniques to convert large collections
of data into useful output.

● Association rules
● Classification
● Clustering
● Decision trees
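
To make the first technique in the list more concrete, here is a minimal sketch of how an association rule such as "customers who buy bread also buy milk" can be scored. Support and confidence are the standard measures for such rules, but the slide does not define them, so treat the choice as an assumption. The shopping baskets are invented.

```python
# Invented shopping baskets, one set of items per transaction.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


def confidence(antecedent, consequent):
    """Of the baskets containing antecedent, the share that also contain consequent."""
    return support(antecedent | consequent) / support(antecedent)


rule_support = support({"bread", "milk"})
rule_confidence = confidence({"bread"}, {"milk"})

print(f"bread -> milk: support {rule_support:.2f}, confidence {rule_confidence:.2f}")
```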

DATA ANALYSIS (DATA MINING TECHNIQUES)

● K-Nearest Neighbor
● Neural networks
● Predictive analysis
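
K-Nearest Neighbor is simple enough to sketch in full: a new point receives the label that is most common among the k training points closest to it. The customer data, the feature names, and the default of k = 3 below are invented for illustration, and `knn_predict` is a hypothetical helper rather than a library function.

```python
from collections import Counter
from math import dist

# Invented training data: (monthly_visits, average_purchase) -> customer segment.
training = [
    ((2, 20), "occasional"),
    ((3, 25), "occasional"),
    ((4, 22), "occasional"),
    ((10, 80), "loyal"),
    ((12, 90), "loyal"),
    ((11, 85), "loyal"),
]


def knn_predict(point, k=3):
    # Sort training samples by Euclidean distance to the new point.
    nearest = sorted(training, key=lambda sample: dist(sample[0], point))[:k]
    # Majority vote among the k closest labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


prediction = knn_predict((9, 70))
print(prediction)
```

In practice the features would usually be rescaled first, since a feature with a large numeric range (here the purchase amount) otherwise dominates the distance.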

DATA ANALYSIS (DATA MINING)

Name: What is Data Mining
Duration: 06:52
Account: IBM Technology

DATA ANALYSIS (DATA MINING - DECISION TREES)
A decision tree is a non-parametric supervised learning algorithm that is used for
both classification and regression tasks. It has a hierarchical tree structure
consisting of a root node, branches, internal nodes, and leaf nodes.

A decision tree starts with a root node, which does not have any incoming branches.
The outgoing branches from the root node then feed into the internal nodes, also
known as decision nodes. Based on the available features, both node types conduct
evaluations to form homogeneous subsets, which are denoted by leaf nodes, or
terminal nodes. The leaf nodes represent all the possible outcomes within the dataset.
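
One way to picture how a node "conducts an evaluation" is to score each candidate feature by how homogeneous the resulting subsets are. The sketch below uses Gini impurity for that score. This is one common criterion and an assumption here, since the slide does not name a specific measure. The observations and feature names are invented.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 means the subset is perfectly homogeneous."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def split_impurity(rows, feature):
    """Weighted impurity of the two subsets produced by a yes/no feature."""
    yes = [row["label"] for row in rows if row[feature]]
    no = [row["label"] for row in rows if not row[feature]]
    total = len(rows)
    return len(yes) / total * gini(yes) + len(no) / total * gini(no)


# Invented observations; the feature names are hypothetical.
rows = [
    {"windy": True, "weekend": True, "label": "stay"},
    {"windy": True, "weekend": False, "label": "stay"},
    {"windy": False, "weekend": True, "label": "go"},
    {"windy": False, "weekend": False, "label": "go"},
]

# The feature with the lowest weighted impurity gives the cleanest split.
best_feature = min(["windy", "weekend"], key=lambda name: split_impurity(rows, name))
print(best_feature)
```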
DATA ANALYSIS (DATA MINING - DECISION TREES)
As an example, imagine that you are trying to decide whether or not you should go
surfing. You might use the following decision rules to make a choice:

[Diagram: decision tree for the surfing decision]
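
The slide's diagram is not part of this text, so the rules below are hypothetical stand-ins chosen only to show the mechanics: a root node, one internal decision node, and leaf nodes holding the outcomes. The `Node` class and the feature names `good_swell` and `offshore_wind` are assumptions made for this sketch.

```python
class Node:
    """A decision node carries a question; a leaf node carries an outcome."""

    def __init__(self, question=None, yes=None, no=None, outcome=None):
        self.question = question  # feature tested at a decision node
        self.yes = yes            # branch taken when the answer is True
        self.no = no              # branch taken when the answer is False
        self.outcome = outcome    # set only on leaf nodes


# Hypothetical rules: root node -> internal node -> leaf nodes.
tree = Node(
    question="good_swell",
    yes=Node(
        question="offshore_wind",
        yes=Node(outcome="surf"),
        no=Node(outcome="stay home"),
    ),
    no=Node(outcome="stay home"),
)


def decide(node, conditions):
    # Walk from the root down the matching branches until a leaf is reached.
    while node.outcome is None:
        node = node.yes if conditions[node.question] else node.no
    return node.outcome


choice = decide(tree, {"good_swell": True, "offshore_wind": True})
print(choice)
```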

DATA ANALYSIS (DATA MINING - DECISION TREES)
In a business setting, an end node might read something like "earnings are expected to
increase by $5 million." Because the events indicated by end nodes are speculative in
nature, chance nodes also specify the probability of a specific projection coming to
fruition.
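
Once each branch of a chance node has a probability, the options can be compared by their expected value, meaning the probability-weighted sum of the outcomes. In the sketch below only the $5 million figure comes from the slide. The probabilities, the loss scenario, and the option names are invented for illustration.

```python
# Each branch of a chance node: (probability, change in earnings in $ millions).
# Only the 5.0 figure comes from the slide; everything else is assumed.
launch_product = [(0.6, 5.0), (0.4, -2.0)]
do_nothing = [(1.0, 0.0)]


def expected_value(branches):
    total_probability = sum(probability for probability, _ in branches)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    return sum(probability * value for probability, value in branches)


options = {"launch product": launch_product, "do nothing": do_nothing}

# Choose the option whose chance node has the highest expected value.
best_option = max(options, key=lambda name: expected_value(options[name]))

for name, branches in options.items():
    print(f"{name}: expected change {expected_value(branches):+.1f} million")
```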

DATA ANALYSIS (DATA MINING - DECISION TREES)

Name: How To create a Decision Tree
Duration: 05:31
Account: Wondershare Edraw

WHAT IS BIG DATA
It is important to remember what Big Data is, since advanced data analytics depends on it.

When we talk about Big Data, we refer to data sets or combinations of data sets whose
size (volume), complexity (variability), and speed of growth (velocity) make them
difficult to capture, manage, process, or analyze using conventional technologies and
tools, such as relational databases and conventional statistical or visualization
packages, within the time necessary for them to be useful.

Although the size used to determine whether a given data set is considered Big Data is
not firmly defined and continues to change over time, most analysts and professionals
currently refer to data sets ranging from 30-50 Terabytes to several Petabytes.

WHAT IS DATA ANALYTICS
Advanced analytics is a comprehensive set of analytical techniques and methods that
draws on Big Data, artificial intelligence (AI), machine learning, and related
technologies.

These techniques allow for better predictive analysis and provide insight into
technological change as it occurs. This broader view enables organizations to develop
better responses and to act on more accurate forecasts and processes.

DIFFERENCE BETWEEN DATA ANALYSIS AND DATA ANALYTICS

Data analytics is the broader, general discipline that enterprises use to make
data-driven decisions.

Data analysis is a more specialized activity within analytics, used in businesses to
evaluate data and gain insights.

DATA ANALYTICS TYPES
Within advanced data analytics we can differentiate between four main types:

● Descriptive analytics
● Diagnostic analytics
● Predictive analytics
● Prescriptive analytics

ADVANCED ANALYTICS

Name: Tech Explainer | What is Advanced Analytics?
Duration: 01:36
Account: IMDA Singapore

ADVANCED ANALYTICS TOOLS

Name: Data Analytics - The 9 Essential Tools! (2024)
Duration: 04:07
Account: CareerFoundry

WHAT IS THE ROLE OF AI IN DATA ANALYTICS

CONCLUSION
In conclusion, data can be used in different ways: to review what happened in the past,
to understand what is happening in the present, and to anticipate what will come in the
future. It does not stop there, since intelligent machines can also use data, cleaning
and analyzing it to make decisions such as the exact time to do maintenance, or what to
produce, what not to produce, and when to produce it. What other technological advances
will the use of data bring us in the future?

ADVANCED ANALYTICS TOOLS

Name: A Beginners Guide To The Data Analysis Process
Duration: 10:19
Account: CareerFoundry

All open educational resources produced by Universidad Anáhuac México and its
teaching staff are provided under the Creative Commons
Attribution-NonCommercial-NoDerivatives (CC BY-NC-ND) license.
http://creativecommons.org/licenses/by-nc-nd/4.0/
