WEKA Lab Questions Answers

1. Perform data preprocessing tasks using labor data set in WEKA

Aim:

To perform data preprocessing on the labor dataset in WEKA.

Procedure:

1. Open WEKA and load the 'labor.arff' dataset.

2. Go to the Preprocess tab and explore the dataset for missing values and irrelevant attributes.

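3. Use WEKA's 'ReplaceMissingValues' filter to handle missing data; it fills numeric attributes with the mean and nominal attributes with the mode.

4. Normalize numerical attributes using the 'Normalize' filter, which rescales them to the range [0, 1] by default.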

5. Save the preprocessed dataset for further analysis.

Result:

The labor dataset is preprocessed and ready for machine learning tasks.
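The two filters in the procedure above can be illustrated outside WEKA. The following is a minimal plain-Python sketch, not WEKA's own code: it assumes mean/mode imputation and min-max scaling to [0, 1], and the attribute names and values are made up for illustration rather than taken from 'labor.arff'.

```python
from collections import Counter

# Toy rows with labor-style attributes; all values are made up.
# None marks a missing value (written as '?' in an ARFF file).
rows = [
    {"duration": 1.0, "wage_increase": 5.0, "vacation": "average"},
    {"duration": 2.0, "wage_increase": None, "vacation": "generous"},
    {"duration": None, "wage_increase": 3.0, "vacation": None},
    {"duration": 3.0, "wage_increase": 4.0, "vacation": "generous"},
]
numeric = ["duration", "wage_increase"]
nominal = ["vacation"]

def replace_missing(rows, numeric, nominal):
    """Fill numeric gaps with the column mean and nominal gaps with the mode."""
    filled = [dict(r) for r in rows]
    for name in numeric:
        present = [r[name] for r in rows if r[name] is not None]
        mean = sum(present) / len(present)
        for r in filled:
            if r[name] is None:
                r[name] = mean
    for name in nominal:
        present = [r[name] for r in rows if r[name] is not None]
        mode = Counter(present).most_common(1)[0][0]
        for r in filled:
            if r[name] is None:
                r[name] = mode
    return filled

def normalize(rows, numeric):
    """Rescale each numeric column to the range [0, 1] (min-max scaling)."""
    scaled = [dict(r) for r in rows]
    for name in numeric:
        lo = min(r[name] for r in rows)
        hi = max(r[name] for r in rows)
        span = hi - lo
        for r in scaled:
            r[name] = (r[name] - lo) / span if span else 0.0
    return scaled

clean = normalize(replace_missing(rows, numeric, nominal), numeric)
for row in clean:
    print(row)
```

Imputation runs before scaling so that the filled-in means take part in the min-max computation, which matches the order of steps 3 and 4.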

2. Create scatterplots and histograms using visualize option to detect outliers in WEKA

Aim:

To create scatterplots and histograms in WEKA to detect outliers.

Procedure:

1. Load the dataset into WEKA.

2. Navigate to the 'Visualize' tab.

3. Create scatterplots by selecting two attributes for the X and Y axes.

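4. Use histograms to view the frequency distribution of individual attributes. In the Explorer these appear in the 'Preprocess' tab: select an attribute, or click 'Visualize All'.

5. Identify outliers as points that lie far from the main cluster in a scatterplot, or that fall in isolated bins at the tails of a histogram.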


Result:

Outliers in the dataset are visually identified using scatterplots and histograms.
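The visual check described above can be backed by a simple numeric rule. The sketch below is an assumption-laden illustration, not part of WEKA: it builds an equal-width histogram and flags values outside the 1.5 x IQR fences, a common rule of thumb that the lab text itself does not prescribe. The readings are made up.

```python
import statistics

# Made-up readings; the last value is deliberately extreme.
values = [12, 13, 13, 14, 15, 15, 16, 17, 18, 95]

def histogram(values, bins):
    """Count how many values fall into each of `bins` equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0   # avoid dividing by zero if all values match
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts

def iqr_outliers(values):
    """Return values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in values if v < low or v > high]

counts = histogram(values, bins=5)
outliers = iqr_outliers(values)
for i, c in enumerate(counts):
    print(f"bin {i}: {'#' * c}")
print("outliers:", outliers)
```

An isolated bar at the far end of the text histogram corresponds to the kind of lone point one would look for in WEKA's plots.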

3. Demonstrate how to identify and address conflicting data when merging multiple datasets

Aim:

To identify and resolve conflicting data when merging datasets.

Procedure:

1. Load multiple datasets into WEKA.

2. Check for conflicts between the datasets, such as inconsistent data formats, mismatched attribute values for the same record, or missing values.

3. Resolve conflicts by standardizing attribute formats, filling missing values, or removing duplicates.

Result:

The datasets are merged and conflicts are resolved, ensuring consistency in the data.
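The resolution steps above (standardize formats, fill missing values, remove duplicates) can be sketched in plain Python. This is an illustrative sketch only: the record layout, the 'id' key, and the two branch lists are hypothetical, and real merges usually need domain rules for deciding which conflicting value wins.

```python
def standardize(record):
    """Trim and lower-case strings; treat '' and '?' as missing (None)."""
    out = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip().lower()
            if value in ("", "?"):
                value = None
        out[key] = value
    return out

def merge(first, second, key):
    """Merge two record lists on `key`; return (merged_records, conflicts)."""
    merged, conflicts = {}, []
    for record in map(standardize, first + second):
        k = record[key]
        if k not in merged:
            merged[k] = record
            continue
        kept = merged[k]                      # duplicate key: reconcile fields
        for name, value in record.items():
            if kept.get(name) is None:
                kept[name] = value            # fill a gap from the other source
            elif value is not None and value != kept[name]:
                conflicts.append((k, name, kept[name], value))
    return list(merged.values()), conflicts

# Hypothetical records from two sources describing the same entities.
branch_a = [{"id": 1, "city": "Pune ", "grade": "A"},
            {"id": 2, "city": "Delhi", "grade": "?"}]
branch_b = [{"id": 2, "city": "delhi", "grade": "B"},
            {"id": 1, "city": "Mumbai", "grade": "a"},
            {"id": 3, "city": "Goa", "grade": "C"}]

merged, conflicts = merge(branch_a, branch_b, key="id")
print(merged)
print(conflicts)
```

Here the sketch keeps the first value seen and only reports the disagreement, so a person can decide how to resolve it.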

4. Implement data preprocessing using ARFF format, CSV format, and C4.5 format in WEKA

Aim:

To preprocess datasets in ARFF, CSV, and C4.5 formats using WEKA.

Procedure:

1. Import the datasets in each format (ARFF, CSV, C4.5) into WEKA.

2. Apply preprocessing steps like attribute selection and normalization.

3. Save the processed data in the desired format.

Result:

Preprocessed datasets in multiple formats are ready for analysis.
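To make the difference between the formats concrete, here is a minimal sketch that converts a small ARFF text into CSV. It assumes a simple ARFF file with unquoted names and values, and it is not a full ARFF parser; inside WEKA the same conversion is done by loading the file and saving it in another format. The sample rows are illustrative.

```python
import csv
import io

ARFF_TEXT = """\
% toy file in the shape of WEKA's weather data
@relation weather
@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute play {yes, no}
@data
sunny,85,no
overcast,83,yes
rainy,?,yes
"""

def parse_arff(text):
    """Return (attribute_names, rows) from a simple, unquoted ARFF string."""
    names, rows, in_data = [], [], False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):      # skip blanks and comments
            continue
        lower = line.lower()
        if in_data:
            rows.append([cell.strip() for cell in line.split(",")])
        elif lower.startswith("@attribute"):
            names.append(line.split()[1])          # second token is the name
        elif lower.startswith("@data"):
            in_data = True
    return names, rows

def to_csv(names, rows):
    """Write a header row followed by the data rows; '?' is kept as-is."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(names)
    writer.writerows(rows)
    return buffer.getvalue()

names, rows = parse_arff(ARFF_TEXT)
csv_text = to_csv(names, rows)
print(csv_text)
```

The ARFF header carries attribute types that CSV cannot express, which is why WEKA has to infer types when it reads a CSV file.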


5. Perform data preprocessing tasks using weather database in WEKA. Demonstrate how to remove attributes

Aim:

To preprocess the weather database in WEKA and demonstrate attribute removal.

Procedure:

1. Load the 'weather.arff' dataset in WEKA.

2. Go to the Preprocess tab and select the 'Remove' filter.

3. Choose the attributes to remove, such as irrelevant or redundant ones.

4. Save the updated dataset.

Result:

Irrelevant attributes are removed, and the dataset is optimized for further analysis.
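What the 'Remove' filter does to the data can be sketched in a few lines. The function below is a hypothetical helper, not WEKA code; it follows WEKA's convention of 1-based attribute indices, and the rows are toy data in the shape of the weather dataset.

```python
def remove_attributes(names, rows, indices):
    """Drop the columns whose 1-based positions are listed in `indices`."""
    drop = {i - 1 for i in indices}
    kept_names = [n for i, n in enumerate(names) if i not in drop]
    kept_rows = [[v for i, v in enumerate(row) if i not in drop] for row in rows]
    return kept_names, kept_rows

names = ["outlook", "temperature", "humidity", "windy", "play"]
rows = [
    ["sunny", "85", "85", "FALSE", "no"],
    ["overcast", "83", "86", "FALSE", "yes"],
]

# Remove the 2nd and 3rd attributes (temperature and humidity).
kept_names, kept_rows = remove_attributes(names, rows, [2, 3])
print(kept_names)
print(kept_rows)
```

The class attribute ('play' here) is normally kept, since later classification steps depend on it.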

6. Demonstrate the usage of filters in WEKA

Aim:

To demonstrate the use of filters in WEKA.

Procedure:

1. Load a dataset into WEKA.

2. Apply filters like 'RemoveUseless' or 'Discretize' to process the data.

3. Observe changes in the dataset.

Result:

The dataset is processed using filters, improving its quality for analysis.
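The effect of the two filters named above can be approximated in plain Python. This sketch assumes equal-width binning for 'Discretize' and treats 'RemoveUseless' as dropping attributes that never vary; WEKA's filters have more options than shown here, and the column names and values are made up.

```python
def discretize(values, bins=3):
    """Map each number to the index of its equal-width bin (0 .. bins-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0   # fall back to 1.0 if all values are equal
    return [min(int((v - lo) / width), bins - 1) for v in values]

def remove_useless(columns):
    """Keep only the columns that take more than one distinct value."""
    return {name: vals for name, vals in columns.items() if len(set(vals)) > 1}

# Hypothetical column-oriented dataset.
columns = {
    "temperature": [64, 65, 70, 75, 80, 85],
    "station": ["a", "a", "a", "a", "a", "a"],   # constant, so uninformative
}

useful = remove_useless(columns)
binned = discretize(useful["temperature"], bins=3)
print(sorted(useful))
print(binned)
```

Comparing the attribute list and value ranges before and after each call mirrors step 3, where the changes are observed in the Preprocess tab.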

7. Create noise monitoring dataset using Apriori algorithm


Aim:

To create a noise monitoring dataset and mine it for frequent patterns using the Apriori algorithm.

Procedure:

1. Collect noise level data and organize it into transactions.

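2. Load the dataset into WEKA, go to the 'Associate' tab, and select the Apriori algorithm. Apriori works on nominal attributes, so discretize any numeric noise readings first.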

3. Configure parameters like minimum support and confidence.

4. Run the algorithm to find patterns.

Result:

Frequent patterns in the noise data are identified using the Apriori algorithm.
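The pattern search described above can be illustrated with a compact Apriori sketch. This is a teaching version under stated assumptions, not WEKA's implementation: the transactions are invented noise-monitoring records with nominal items, and the minimum support of 0.6 is an arbitrary choice.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)
    sets = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(itemset <= t for t in sets) / n

    current = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while current:
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        k += 1
        # Join step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]

# Made-up noise-monitoring transactions.
transactions = [
    {"zone=industrial", "time=day", "noise=high"},
    {"zone=industrial", "time=night", "noise=high"},
    {"zone=residential", "time=day", "noise=low"},
    {"zone=residential", "time=night", "noise=low"},
    {"zone=industrial", "time=day", "noise=high"},
]

frequent = apriori(transactions, min_support=0.6)
rule_conf = confidence(frequent, {"zone=industrial"}, {"noise=high"})
for itemset, s in frequent.items():
    print(sorted(itemset), s)
print("confidence(industrial -> high):", rule_conf)
```

Minimum support filters the itemsets, and minimum confidence would then filter the rules built from them, which are the two parameters set in step 3.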

8. Design multi-dimensional data models (Star, Snowflake, Fact Constellation) for Banking

Aim:

To design multidimensional data models for banking applications.

Procedure:

1. Create a fact table for transactions with measures like loan amount and interest rate.

2. Design dimension tables such as Customer, Time, and Branch.

3. For the Snowflake schema, normalize the dimension tables into sub-dimension tables; for the Fact Constellation, add further fact tables that share the same dimensions.

Result:

Star, Snowflake, and Fact Constellation schemas for banking are designed.
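As one possible realization of the Star schema above, the sketch below builds the fact and dimension tables in an in-memory SQLite database and runs a roll-up query. The table names, column names, and rows are assumptions made for illustration; the lab text fixes only the measures (loan amount, interest rate) and the dimensions (Customer, Time, Branch).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_time     (time_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
CREATE TABLE dim_branch   (branch_id INTEGER PRIMARY KEY, city TEXT);
-- Central fact table: one foreign key per dimension plus the measures.
CREATE TABLE fact_loan (
    customer_id   INTEGER REFERENCES dim_customer(customer_id),
    time_id       INTEGER REFERENCES dim_time(time_id),
    branch_id     INTEGER REFERENCES dim_branch(branch_id),
    loan_amount   REAL,
    interest_rate REAL
);
""")

# Made-up sample rows.
conn.executemany("INSERT INTO dim_customer VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)", [(1, 2024, 1), (2, 2024, 2)])
conn.executemany("INSERT INTO dim_branch VALUES (?, ?)", [(1, "Pune"), (2, "Delhi")])
conn.executemany(
    "INSERT INTO fact_loan VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 1, 1000.0, 8.5), (2, 1, 2, 2500.0, 9.0), (1, 2, 1, 500.0, 8.0)],
)

# Roll up the loan_amount measure along the Branch dimension.
totals = conn.execute("""
    SELECT b.city, SUM(f.loan_amount)
    FROM fact_loan AS f JOIN dim_branch AS b USING (branch_id)
    GROUP BY b.city ORDER BY b.city
""").fetchall()
print(totals)
```

A Fact Constellation would add a second fact table (for example, one for deposits) that reuses the same dimension tables.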

9. Design multi-dimensional data models for Healthcare applications


Aim:

To design multidimensional data models for healthcare applications.

Procedure:

1. Identify dimensions like Patient, Doctor, Time, and Treatment.

2. Create a central fact table for treatments with measures like cost and duration.

3. Design schemas (Star, Snowflake, Fact Constellation) based on requirements.

Result:

Data models for healthcare applications are created successfully.
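A Snowflake variant of the healthcare model can be sketched in the same way. In this illustration the Doctor dimension is normalized into a separate Department table, which is a hypothetical sub-dimension not named in the lab text; the Time and Treatment dimensions are left out to keep the sketch short, and all rows are made up.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
-- Snowflaking: Doctor points to Department instead of storing its name.
CREATE TABLE dim_department (department_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_doctor (
    doctor_id     INTEGER PRIMARY KEY,
    name          TEXT,
    department_id INTEGER REFERENCES dim_department(department_id)
);
CREATE TABLE dim_patient (patient_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_treatment (
    patient_id    INTEGER REFERENCES dim_patient(patient_id),
    doctor_id     INTEGER REFERENCES dim_doctor(doctor_id),
    cost          REAL,
    duration_days INTEGER
);
""")

db.executemany("INSERT INTO dim_department VALUES (?, ?)",
               [(1, "Cardiology"), (2, "Orthopedics")])
db.executemany("INSERT INTO dim_doctor VALUES (?, ?, ?)",
               [(1, "Dr. Rao", 1), (2, "Dr. Sen", 2), (3, "Dr. Iyer", 1)])
db.executemany("INSERT INTO dim_patient VALUES (?, ?)", [(1, "P1"), (2, "P2")])
db.executemany("INSERT INTO fact_treatment VALUES (?, ?, ?, ?)",
               [(1, 1, 200.0, 3), (2, 2, 150.0, 1), (1, 3, 100.0, 2)])

# Reaching the department takes two joins, the usual cost of a snowflake design.
by_department = db.execute("""
    SELECT dep.name, SUM(f.cost), SUM(f.duration_days)
    FROM fact_treatment AS f
    JOIN dim_doctor AS doc ON doc.doctor_id = f.doctor_id
    JOIN dim_department AS dep ON dep.department_id = doc.department_id
    GROUP BY dep.name ORDER BY dep.name
""").fetchall()
print(by_department)
```

Keeping the department name in one place avoids repeating it for every doctor, at the price of the extra join in each query.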

10. Implement classification of data using K-Nearest Neighbor in WEKA

Aim:

To classify data using the K-Nearest Neighbor (KNN) algorithm in WEKA.

Procedure:

1. Load a dataset like 'iris.arff' into WEKA.

2. Go to the 'Classify' tab and select 'IBk', WEKA's K-Nearest Neighbor classifier (listed under 'lazy').

3. Set parameters like the number of neighbors (K=3).

4. Evaluate the classifier using cross-validation (the Explorer defaults to 10 folds).

Result:

The dataset is classified with results such as accuracy and confusion matrix.
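The classification and evaluation steps above can be mirrored by a small K-Nearest Neighbor sketch. It assumes Euclidean distance, majority voting with K=3, and a simple interleaved 3-fold split rather than WEKA's stratified folds; the twelve points are hand-typed iris-like measurements for two classes, not rows read from 'iris.arff'. It needs Python 3.8 or later for math.dist.

```python
from collections import Counter
import math

def knn_predict(train, point, k=3):
    """Majority label among the k training rows closest to `point`."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def cross_validate(data, folds=3, k=3):
    """Return (accuracy, confusion) where confusion maps (actual, predicted) to a count."""
    correct = 0
    confusion = Counter()
    for f in range(folds):
        test = data[f::folds]                                   # every folds-th row
        train = [row for i, row in enumerate(data) if i % folds != f]
        for features, label in test:
            predicted = knn_predict(train, features, k)
            confusion[(label, predicted)] += 1
            correct += predicted == label
    return correct / len(data), confusion

# (petal length, petal width) pairs; values are illustrative.
data = [
    ((1.4, 0.2), "setosa"), ((1.3, 0.2), "setosa"), ((1.5, 0.2), "setosa"),
    ((1.4, 0.3), "setosa"), ((1.6, 0.2), "setosa"), ((1.5, 0.4), "setosa"),
    ((5.5, 2.0), "virginica"), ((6.0, 2.5), "virginica"), ((5.1, 1.9), "virginica"),
    ((5.9, 2.1), "virginica"), ((5.6, 1.8), "virginica"), ((5.8, 2.2), "virginica"),
]

accuracy, confusion = cross_validate(data, folds=3, k=3)
print("accuracy:", accuracy)
for (actual, predicted), count in sorted(confusion.items()):
    print(f"{actual} -> {predicted}: {count}")
```

The (actual, predicted) counts play the role of the confusion matrix that WEKA prints in the classifier output panel.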
