Fake Reviews Detector Project (132,133)
2. Dataset Description
The dataset used for this project contains two columns: 'review' and 'label'. The 'review' column
contains the text of the review, while the 'label' column indicates whether the review is real (0) or
fake (1).
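As an illustration of this layout, the sketch below reads a small in-memory CSV with the same two columns into pandas. The three rows are hypothetical placeholders, not records from the project's dataset, and the `csv_text` string stands in for the actual data file, whose name is not given here.

```python
import io

import pandas as pd

# Hypothetical rows for illustration only; not taken from the project's dataset.
csv_text = """review,label
"Great phone, battery easily lasts two days.",0
"Best product ever!!! Buy now, amazing amazing amazing!",1
"Arrived late but works as described.",0
"""

# Two-column layout described above: 'review' (text) and 'label' (0 = real, 1 = fake).
df = pd.read_csv(io.StringIO(csv_text))

print(df.columns.tolist())
print(df['label'].value_counts().to_dict())
```

Checking the label counts in this way is a quick check on whether the real and fake classes are balanced before training.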
3. Code Overview
The code is structured into several sections: data preprocessing, feature extraction, model
training, evaluation, and prediction.
4. Code Snippet
import pandas as pd
import numpy as np
import re
import nltk
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
nltk.download('stopwords')
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer
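def preprocess_review(review):
    # Keep letters only, then lowercase and split on whitespace
    review = re.sub('[^a-zA-Z]', ' ', review)
    review = review.lower()
    review = review.split()
    ps = PorterStemmer()
    # Build the stopword set once instead of once per word
    stop_words = set(stopwords.words('english'))
    review = [ps.stem(word) for word in review if word not in stop_words]
    return ' '.join(review)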
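To make the cleaning steps concrete, the sketch below traces the regex substitution, lowercasing, tokenization, and stopword filtering on one hypothetical review. The three-word `stop_words` set is a stand-in assumption for NLTK's English stopword list, and stemming with PorterStemmer is left out so the sketch needs no NLTK download.

```python
import re

# Hypothetical review text, used only to trace the steps.
sample = "This product is AMAZING!!! 10/10, would buy again."

# Stand-in stopword set (assumption); the function above uses NLTK's English list.
stop_words = {"this", "is", "would"}

letters_only = re.sub('[^a-zA-Z]', ' ', sample)  # digits and punctuation become spaces
tokens = letters_only.lower().split()            # lowercase, then split on whitespace
kept = [word for word in tokens if word not in stop_words]

print(tokens)
print(kept)
```

Note that the regex discards digits and punctuation entirely, so signals such as repeated exclamation marks or ratings like "10/10" never reach the classifier.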
Enrollment Numbers : 12202080601132 , 12202080601133
Names : Patel Vaibhavi , Rana Vairagi
A D PATEL INSTITUTE OF TECHNOLOGY
DEPARTMENT OF INFORMATION TECHNOLOGY
SUBJECT NAME & COURSE CODE: ARTIFICIAL INTELLIGENCE (202044503)
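# Preprocess reviews (df is assumed to be a pandas DataFrame with 'review' and 'label' columns)
df['processed_review'] = df['review'].apply(preprocess_review)
# Feature extraction
tfidf = TfidfVectorizer(max_features=5000)
X = tfidf.fit_transform(df['processed_review']).toarray()
y = df['label']
# Split dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Model training (default LogisticRegression settings assumed)
model = LogisticRegression()
model.fit(X_train, y_train)
# Predictions
y_pred = model.predict(X_test)
# Evaluation
print('Accuracy:', accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))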
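Classifying reviews the model has not seen requires reusing the fitted vectorizer. The following is a minimal sketch under stated assumptions: the four training reviews, their labels, and the two new reviews are all hypothetical, and the `preprocess_review` step is skipped to keep the example self-contained. In the full project, new reviews would presumably pass through the same preprocessing first.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical training data for illustration only (0 = real, 1 = fake).
train_reviews = [
    "battery lasts two days and the screen is sharp",
    "arrived on time and works as described",
    "best product ever buy now amazing amazing",
    "amazing deal best ever buy buy buy",
]
train_labels = [0, 0, 1, 1]

tfidf = TfidfVectorizer(max_features=5000)
X_train = tfidf.fit_transform(train_reviews)
model = LogisticRegression()
model.fit(X_train, train_labels)

# New reviews go through transform(), not fit_transform(),
# so they are mapped onto the vocabulary learned during training.
new_reviews = ["works as described", "best ever buy now"]
X_new = tfidf.transform(new_reviews)
predictions = model.predict(X_new)

for text, label in zip(new_reviews, predictions):
    print(text, '->', 'fake' if label == 1 else 'real')
```

Calling `fit_transform` on the new reviews instead would build a different vocabulary, and the resulting feature columns would no longer line up with what the model was trained on.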
5. Expected Outputs
The expected outputs include:
- Accuracy of the model.
- A classification report detailing precision, recall, and F1-score.
- A confusion matrix indicating the number of true and false predictions.
- Predictions for sample reviews indicating whether they are real or fake.
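To show what these metrics measure, the sketch below computes them by hand in plain Python for the fake class (label 1). The six true labels and predictions are hypothetical values chosen for illustration, not results from the project.

```python
# Hypothetical true labels and predictions (0 = real, 1 = fake), for illustration only.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

pairs = list(zip(y_true, y_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fake reviews correctly flagged
tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # real reviews correctly passed
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # real reviews wrongly flagged as fake
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fake reviews missed

accuracy = (tp + tn) / len(pairs)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# Same layout scikit-learn uses: rows are true labels, columns are predicted labels.
confusion = [[tn, fp], [fn, tp]]

print(confusion)
print(round(accuracy, 3), round(precision, 3), round(recall, 3), round(f1, 3))
```

For a fake review detector, precision reflects how often a flagged review is actually fake, while recall reflects how many of the fake reviews were caught.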
6. Conclusion
The Fake Reviews Detector project showcases the application of machine learning and natural
language processing to tackle the problem of fake reviews. By using this model, businesses can
better identify and mitigate the impact of deceptive reviews.