paper 3 -- OnLineNewClassificationUsingMachineLearning
ABSTRACT
The paper addresses the growing demand for automatically organizing large volumes of
unstructured online data, particularly news articles. It uses supervised machine learning to sort
these articles into categories such as politics, sports, and entertainment. Several classifiers were
tested on a dataset of 75,000 articles, and the Naive Bayes classifier stood out, achieving 93%
accuracy, which indicates that it is effective for this task.
INTRODUCTION
The paper highlights the rapid growth of digital content and the difficulty of organizing
unstructured online data efficiently. It explains that automatic text classification is essential for
applications such as search engines, content summarization, and question-answering systems. The
authors used supervised learning to cope with the variety of sources, writing styles, and
vocabularies found in news articles. Their goal was to personalize content for users by sorting
articles into categories such as crime, sports, politics, and entertainment.
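To make the idea of supervised news classification concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written in plain Python. It is not the paper's implementation: the class name `NaiveBayes`, the whitespace tokenizer, and the six toy training sentences are all assumptions made for illustration, and the paper's own 75,000-article dataset and feature pipeline are not reproduced here.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase + whitespace split; a real pipeline would do more.
    return text.lower().split()


class NaiveBayes:
    """Illustrative multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        # Number of training documents per category (for the prior).
        self.class_counts = Counter(labels)
        # Word frequencies per category (for the likelihood).
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Work in log space to avoid underflow on long documents.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data; the categories mirror those named in the text.
train_texts = [
    "the team won the match in the final over",
    "the striker scored two goals in the league game",
    "parliament passed the budget bill after a long debate",
    "the minister announced new election reforms",
    "the actor stars in a new film released this week",
    "the singer released a new album and music video",
]
train_labels = [
    "sports", "sports",
    "politics", "politics",
    "entertainment", "entertainment",
]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("the coach praised the team after the match")
print(prediction)
```

With so few training sentences the predictions are fragile, so this only shows the mechanics: count words per category, then pick the category with the highest smoothed log-probability.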
TECHNIQUES USED
MODELS USED
RESULTS
CONCLUSION
The study highlights the effectiveness of Naive Bayes for classifying news articles, as well as
the importance of text preprocessing and dataset quality in achieving high classification
accuracy. Future improvements may include extending the work to regional languages and
experimenting with more sophisticated algorithms.
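The notes do not say which preprocessing steps the paper applied, so the following is only a sketch of what typical text preprocessing might look like before classification: lowercasing, dropping punctuation and digits, and removing stop words. The function name `preprocess` and the tiny `STOP_WORDS` set are hypothetical.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "in", "of", "on", "to", "is", "for"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


cleaned = preprocess("The Minister announced 3 NEW reforms, in Parliament!")
print(cleaned)
```

The `[a-z]+` pattern only matches unaccented Latin letters, so extending the work to regional languages, as the conclusion suggests, would need a different tokenizer and stop-word list.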