SNA Project Presentation
Group Members:
Anupam Raj (IIT2020034)
Fake news spreader detection is a process of
identifying individuals or organizations that
intentionally propagate misinformation,
disinformation, or propaganda to mislead the public.
Fake news has become a prevalent issue in today's
society, particularly with the rise of social media and
the internet. It can have serious consequences,
including the spread of hate speech, the erosion of
N-grams
For all four models, we ran an
extensive grid search combined with
five-fold cross-validation to find the
optimal text preparation method,
vectorization technique and
modeling parameters.
These are the hyperparameters we tried with five-fold cross-validation to find the best
set of hyperparameters for the four models.
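The search described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the code used in the project: the grid keys (`cleaning`, `ngram_max`), the helper names, and the scoring function are all hypothetical, and the scores it produces are synthetic placeholders rather than results.

```python
from itertools import product


def k_fold_indices(n_samples, k=5):
    """Split range(n_samples) into k contiguous (train, validation) index pairs."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, val))
        start += size
    return folds


def grid_search_cv(param_grid, fit_and_score, n_samples, k=5):
    """Return (best_params, best_mean_score) over every combination in param_grid."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        scores = [
            fit_and_score(params, train, val)
            for train, val in k_fold_indices(n_samples, k)
        ]
        mean_score = sum(scores) / len(scores)
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


# Hypothetical grid: the actual values tried are not listed in this text.
hypothetical_grid = {"cleaning": ["M1", "M2"], "ngram_max": [1, 2, 3]}


def toy_fit_and_score(params, train_idx, val_idx):
    # Placeholder scorer with synthetic values. A real version would clean and
    # vectorize the training tweets, fit a model, and return validation accuracy.
    return 0.5 + 0.1 * (params["cleaning"] == "M2") + 0.05 * params["ngram_max"]


best_params, best_score = grid_search_cv(hypothetical_grid, toy_fit_and_score, n_samples=10)
print(best_params)
```

Each parameter combination is scored on all five folds and the mean decides the winner, so a single lucky split cannot select the hyperparameters.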
Contd:
1. We investigated two text cleaning methods for all models. The first method
(M1) removed all non-alphanumeric characters (except #) from the text, while the
second method (M2) removed most non-alphanumeric characters (except #) but
kept emoticons and emojis. Both methods converted the text to lowercase.
3. The accuracy of our model on the test set was about 4 percentage points (roughly 5%
in relative terms) lower than the cross-validation result (70% vs. 74% for the English dataset).
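The two cleaning methods, M1 and M2, can be sketched as below. This is an illustration under assumptions, not the project's code: the function names are hypothetical, the emoticon pattern is a deliberately small example (the text does not say which emoticons were kept), and emojis are approximated as characters in the Unicode category "So" (Symbol, other).

```python
import re
import unicodedata

# Hypothetical, intentionally small emoticon pattern such as :) ;-) :D =P
EMOTICON_RE = re.compile(r"[:;=8][\-o*']?[)\](\[dDpP/\\|]")


def clean_m1(text):
    """M1: lowercase, then keep only ASCII letters, digits, '#', and whitespace."""
    text = re.sub(r"[^a-z0-9#\s]", " ", text.lower())
    return " ".join(text.split())


def clean_m2(text):
    """M2: like M1, but keep whole-token emoticons and emoji characters."""
    kept = []
    for token in text.split():
        if EMOTICON_RE.fullmatch(token):
            kept.append(token)  # keep the emoticon untouched, e.g. ":D" stays ":D"
            continue
        chars = [
            ch
            if (ch.isascii() and ch.isalnum())
            or ch == "#"
            or unicodedata.category(ch) == "So"  # assumption: emoji approximated by "So"
            else " "
            for ch in token.lower()
        ]
        kept.extend("".join(chars).split())
    return " ".join(kept)


sample = "Hello, World! #News :) \U0001F600"
print(clean_m1(sample))
print(clean_m2(sample))
```

Handling emoticons per token before lowercasing means ":D" is not turned into ":d"; emoticons glued to a word without whitespace are not recognised by this sketch.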
Stacking-ensemble
● We refitted the four submodels with the cross-validated
hyperparameters five times on different chunks of the
original training data (each consisting of tweets from 240
users). This was done to prevent the ensemble from
overfitting to the training data.
● We used these constructed training and test sets to find the
best ensemble from the following three methods:
1. majority voting
2. linear regression of predicted probabilities
3. logistic regression model (best result: an accuracy
of 70% for the English dataset)
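Two of the three ensembling methods listed above, majority voting and a logistic regression over the submodels' predicted probabilities, can be sketched in plain Python as follows. This is a minimal illustration and not the project's implementation: the function names, the learning rate, the number of epochs, and the tiny training matrix are all assumptions made up for the example.

```python
import math


def majority_vote(probabilities, threshold=0.5):
    """Return 1 if more than half of the submodels predict above the threshold.

    With four submodels, a 2-2 tie resolves to 0 in this sketch.
    """
    votes = sum(p > threshold for p in probabilities)
    return int(votes > len(probabilities) / 2)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic_stacker(X, y, lr=0.5, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log-loss.

    X holds one row per user with one predicted probability per submodel.
    """
    n_features = len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for row, label in zip(X, y):
            error = sigmoid(bias + sum(w * x for w, x in zip(weights, row))) - label
            grad_b += error
            for j, x in enumerate(row):
                grad_w[j] += error * x
        bias -= lr * grad_b / len(X)
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
    return weights, bias


def predict_stacker(weights, bias, row):
    """Return the ensemble's 0/1 label for one row of submodel probabilities."""
    return int(sigmoid(bias + sum(w * x for w, x in zip(weights, row))) > 0.5)


# Made-up submodel probabilities for four users (1 = spreader, 0 = non-spreader).
toy_X = [[0.9, 0.8, 0.7, 0.6], [0.8, 0.9, 0.6, 0.7],
         [0.2, 0.1, 0.3, 0.4], [0.1, 0.3, 0.2, 0.3]]
toy_y = [1, 1, 0, 0]
weights, bias = fit_logistic_stacker(toy_X, toy_y)
print([predict_stacker(weights, bias, row) for row in toy_X])
```

Unlike majority voting, the logistic stacker learns how much to trust each submodel. That is why it should be trained on predictions for users the submodels did not see during fitting.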
● We experimented with the English
language only.
● https://zenodo.org/record/4039435#.ZFVGAnZBy3B
● The average accuracy of the Support Vector
Machine model was 55.66%.
Future Work
using some sort of method.