Instagram Spam Detection Using AI
Have you ever received a notification that someone commented on your Instagram
post? You excitedly pick up your phone and open the app only to find that it’s a bot
promoting some knockoff brand of shoes. The comment section of many Instagram
posts is filled with bots. They can range from annoying to dangerous, depending on
the type of call to action they require from you.
You can build a spam detection model using AI techniques to identify the difference
between spam and legitimate comments.
You might not be able to find a ready-made dataset of Instagram spam comments, but
you can collect the data for this analysis yourself, either by scraping the web or
by accessing the Instagram API with Python to get unlabelled comments.
Because the collected comments are unlabelled, you can use a different, labelled
set of data for training, like Kaggle’s YouTube spam collection dataset. Then, use
it to identify keywords that commonly appear in spam comments.
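The keyword step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the CONTENT and CLASS column names are assumed to match the labelled training file, the sample rows are invented, and spam_keywords is a hypothetical helper.

```python
import csv
import io
from collections import Counter

# Tiny in-memory stand-in for a labelled training file. The column names
# (CONTENT, CLASS) and every row below are illustrative assumptions.
SAMPLE_CSV = """CONTENT,CLASS
check out my channel for free shoes,1
free followers click the link,1
click here to win free shoes,1
great photo love the colors,0
where was this photo taken,0
love this so much,0
"""

def spam_keywords(csv_text, top_n=3):
    """Return the top_n words seen in spam rows but never in legitimate rows."""
    spam, ham = Counter(), Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        target = spam if row["CLASS"] == "1" else ham
        target.update(row["CONTENT"].lower().split())
    # Keep only words that never occur in legitimate comments.
    candidates = {word: count for word, count in spam.items() if word not in ham}
    # Sort by count (descending), then alphabetically so the result is stable.
    ranked = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]

keywords = spam_keywords(SAMPLE_CSV)
print(keywords)
```

Requiring that a keyword never appears in legitimate comments is a strict rule chosen to keep the sketch short; on real data a frequency ratio would likely be more forgiving.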
Use a technique like n-grams to assign weights to words and phrases that tend to
appear in spam comments, then compare them with each comment scraped from the
web. Another approach you can take is a similarity-based measure like cosine
similarity. How well either approach performs depends largely on the
pre-processing you apply.
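Both ideas can be sketched with the standard library alone. The weighting scheme here (the share of an n-gram's occurrences that fall in spam) is one simple choice made for illustration, not necessarily the one the project uses, and all function names and example comments are hypothetical.

```python
import math
from collections import Counter

def ngrams(text, n=2):
    """Split text into lowercase word n-grams (bigrams by default)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def ngram_weights(spam_comments, ham_comments, n=2):
    """Weight each n-gram by the fraction of its occurrences found in spam."""
    spam = Counter(g for c in spam_comments for g in ngrams(c, n))
    ham = Counter(g for c in ham_comments for g in ngrams(c, n))
    return {g: spam[g] / (spam[g] + ham[g]) for g in set(spam) | set(ham)}

def spam_score(comment, weights, n=2):
    """Average weight of the comment's n-grams; unseen n-grams count as 0."""
    grams = ngrams(comment, n)
    if not grams:
        return 0.0
    return sum(weights.get(g, 0.0) for g in grams) / len(grams)

def cosine_similarity(a, b):
    """Cosine similarity between the word-count vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(v * v for v in va.values()))
    norm_b = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Invented example comments standing in for labelled training data.
spam_examples = ["click the link for free shoes", "free shoes click here"]
ham_examples = ["love the shoes in this photo", "great photo"]
weights = ngram_weights(spam_examples, ham_examples)

scraped = "get free shoes today"
print(spam_score(scraped, weights))
print(cosine_similarity(scraped, spam_examples[1]))
```

A comment could then be flagged when either score crosses a threshold tuned on labelled data; the sketch deliberately leaves that threshold out.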
If you remove stop words, extra whitespace, and punctuation, and otherwise clean
the data correctly, you will find that the algorithm performs better because it
can match similar words with each other.
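A minimal cleaning step might look like the sketch below. The stop-word list is a deliberately tiny, assumed set for illustration, and clean_comment is a hypothetical helper name.

```python
import string

# A very small stop-word list for illustration only (an assumption, not
# taken from the project); a real pipeline would use a fuller list.
STOP_WORDS = {"a", "an", "the", "is", "to", "for", "of", "and", "in", "on", "my", "this"}

def clean_comment(text):
    """Lowercase, strip ASCII punctuation, collapse whitespace, drop stop words."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    words = text.split()  # split() with no arguments also collapses repeated whitespace
    return " ".join(word for word in words if word not in STOP_WORDS)

print(clean_comment("  Click   the LINK for FREE shoes!!! "))
```

After cleaning, "Free shoes." and "free   shoes" reduce to the same string, which is what lets a matching algorithm treat them as equal. Note that string.punctuation covers ASCII characters only, so emoji and curly quotes would need separate handling.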
While distance- or weight-based matching algorithms work well at finding similar
words, they are unable to grasp the context of a sentence. For better results, you
can also use a pre-trained language model like ALBERT, which takes context into
account.
Team Members:
R. Dharani
P. A. Bharathi
B. M. Nithyashri