BIL1
Theory:
Data Exploration:
Data exploration is the initial step in data analysis, in which data analysts use data
visualization and statistical techniques to describe the characteristics of a dataset, such as its
size, quantity, and accuracy, in order to better understand the nature of the data.
Data exploration techniques include both manual analysis and automated data exploration
software solutions that visually explore and identify relationships between different data
variables, the structure of the dataset, the presence of outliers, and the distribution of data values
in order to reveal patterns and points of interest, enabling data analysts to gain greater insight
into the raw data. Data is often gathered in large, unstructured volumes from various sources, so
data analysts must first understand the data and develop a comprehensive view of it before
extracting the relevant parts for further analysis, such as univariate, bivariate, multivariate, and
principal components analysis.
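The ideas above (dataset size, outliers, and relationships between variables) can be illustrated with a minimal sketch in Python. The toy dataset, the 1.5 x IQR outlier rule, and the helper names `iqr_outliers` and `pearson` are assumptions made for illustration only; they are not part of the practical and are not how Weka computes anything.

```python
import math
import statistics

# Hypothetical toy dataset: each row is (age, income in thousands).
rows = [(23, 31), (25, 35), (31, 48), (35, 52), (40, 61), (44, 66), (52, 300)]

ages = [r[0] for r in rows]
incomes = [r[1] for r in rows]


def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


print("rows:", len(rows), "attributes:", len(rows[0]))
print("income outliers:", iqr_outliers(incomes))
print("age/income correlation:", round(pearson(ages, incomes), 3))
```

A univariate check such as `iqr_outliers` looks at one attribute at a time, while `pearson` is a simple bivariate measure; both are usually run before any further modelling.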
Types of Attributes:
Nominal Attributes: Nominal means “relating to names.” The values of a nominal attribute are
symbols or names of things. Each value represents some kind of category, code, or state, and so
nominal attributes are also referred to as categorical. The values do not have any meaningful
order. In computer science, the values are also known as enumerations.
Binary Attributes: A binary attribute is a nominal attribute with only two categories or states: 0
or 1, where 0 typically means that the attribute is absent and 1 means that it is present. Binary
attributes are referred to as Boolean if the two states correspond to true and false.
Ordinal Attributes: An ordinal attribute is an attribute whose possible values have a
meaningful order or ranking among them, but the magnitude of the difference between successive
values is not known.
Numeric Attributes: A numeric attribute is quantitative; that is, it is a measurable quantity,
represented in integer or real values. Numeric attributes can be interval-scaled or ratio-scaled.
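The four attribute types can be sketched in Python as follows. This is only an illustrative heuristic: the names `AttributeType` and `guess_type` and the `ordered_levels` parameter are hypothetical, and the order of an ordinal attribute has to be supplied by the analyst because it cannot be inferred from the values alone.

```python
from enum import Enum


class AttributeType(Enum):
    NOMINAL = "nominal"
    BINARY = "binary"
    ORDINAL = "ordinal"
    NUMERIC = "numeric"


def guess_type(values, ordered_levels=None):
    """Heuristically guess the attribute type of a non-empty list of values.

    ordered_levels is an assumed, caller-supplied ranking (lowest first);
    without it an ordinal attribute looks the same as a nominal one.
    """
    if not values:
        raise ValueError("cannot guess the type of an empty attribute")
    distinct = set(values)
    if ordered_levels is not None and distinct <= set(ordered_levels):
        return AttributeType.ORDINAL
    if distinct <= {0, 1}:  # also matches True/False, i.e. Boolean
        return AttributeType.BINARY
    if all(isinstance(v, (int, float)) for v in values):
        return AttributeType.NUMERIC
    return AttributeType.NOMINAL


sizes = ["small", "large", "medium"]
levels = ["small", "medium", "large"]

print(guess_type(["red", "green", "blue"]))        # nominal: names only
print(guess_type([0, 1, 1, 0]))                    # binary: absent/present
print(guess_type(sizes, ordered_levels=levels))    # ordinal: ranked names
print(guess_type([1.5, 2.0, 3.7]))                 # numeric: measurable

# Ordinal values can be ranked, but the gap between ranks is not meaningful.
print(sorted(sizes, key=levels.index))
```

Note that a numeric column that happens to contain only 0 and 1 would be reported as binary here, which is one reason such guesses should always be checked by hand.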
What is Weka?
Weka (Waikato Environment for Knowledge Analysis) is a popular suite of machine learning
software developed at the University of Waikato, New Zealand. It is a collection of machine
learning algorithms for solving real-world data mining problems. It is written in Java and runs
on almost any platform.
Practical:
Step 1: Download Weka
https://sourceforge.net/projects/weka/
Step 2: Install Weka
Sr No | Attribute | Type | Distinct Vals | Unique values | Min | Max | Val | Std Dev | Label | Count | Desc
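The per-attribute statistics recorded in the table above can also be computed by hand. The following is a minimal Python sketch, not Weka's own code: the function name `summarize` and the sample data are hypothetical. It assumes that "Distinct" means the number of different values, that "Unique" means the number of values occurring exactly once, and that the standard deviation is the sample standard deviation.

```python
import statistics
from collections import Counter


def summarize(values):
    """Summarize one attribute: distinct/unique counts plus type-specific stats."""
    counts = Counter(values)
    summary = {
        "distinct": len(counts),                                  # different values
        "unique": sum(1 for c in counts.values() if c == 1),     # seen exactly once
    }
    if all(isinstance(v, (int, float)) for v in values):
        # Numeric attribute: range and sample standard deviation.
        summary["min"] = min(values)
        summary["max"] = max(values)
        summary["std_dev"] = statistics.stdev(values)
    else:
        # Nominal attribute: how often each label occurs.
        summary["label_counts"] = dict(counts)
    return summary


# Hypothetical sample data: one numeric and one nominal attribute.
length = [5.1, 4.9, 4.7, 5.1]
label = ["a", "a", "b", "c"]

print(summarize(length))
print(summarize(label))
```

Each call fills one row of the table: the numeric attribute gets Min, Max and Std Dev, and the nominal attribute gets one Label and Count pair per category.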
Conclusion:
We have learned how to use the Weka software for data exploration and seen how it makes it
easy to classify and analyse the different attributes of a dataset.