
Text Classification

Besat Kassaie
2 Outline

 Introduction
 Text classification definition
 Naive Bayes
 Vector Space Classification
 Rocchio Classification
 KNN Classification
 New Applications
3 Spam Detection
4 Authorship identification

 Was Mark Twain the writer of the “Quintus Curtius Snodgrass” (QCS) letters?
 Mark Twain's role in the Civil War has been a subject of dispute for years.
 The evidence of Twain’s possible military connection in New Orleans came
from the content of ten letters published in the New Orleans Daily Crescent.
 In these letters, which have been credited to Twain and were signed
“Quintus Curtius Snodgrass” (QCS), the writer described his military
adventures.
 Brinegar (1963) applied statistical tests to the QCS letters.
 Brinegar used word frequency distributions to characterize Mark Twain’s
writing style.
 Conclusion: Mark Twain was not the author of the disputed letters!
5 Gender Identification

 Determining if an author is male or female


 Men and women tend to use different language styles.
 Samples:
 A large number of determiners (a, the, that, these) and quantifiers (one, two,
more, some) is a male indicator.
 Frequent use of pronouns (I, you, she, her, their, myself, yourself,
herself) is a strong female indicator.

 Applications:
 Marketing, personalization, legal investigation
6 Sentiment Analysis

 Is the attitude of this text positive or negative?


 Rank the attitude of this text from 1 to 5
 Sample Applications:
 Movie: is this review positive or negative?
 Products: what do people think about the new iPhone?
 Public sentiment: how is consumer confidence? Is despair
increasing?
 Politics: what do people think about this candidate or issue?
 Prediction: predict election outcomes or market trends from
sentiment
7 Subject Assignment

 What is the subject of this article:


 Antagonists and inhibitors
 Blood Supply
 Chemistry
 Drug Therapy
 Embryology
 ….
8 Text Classification Definition

Input:
a document d
a fixed set of classes C={c1,c2,…,cj}

Output: a predicted class c ∈ C


9 Approaches to Text Classification

 Manual
 Many classification tasks have traditionally been solved manually.
 e.g. Books in a library are assigned Library of Congress categories by a librarian
 manual classification is expensive to scale

 Hand-crafted rules
 A rule captures a certain combination of keywords that indicates a class.
 e.g. (multicore OR multi-core) AND (chip OR processor OR microprocessor)
 good scaling properties
 creating and maintaining them over time is labor intensive

 Machine learning based methods


10 Supervised Learning

 Input:
 a document d
 a fixed set of classes C={c1,c2,…,cj}
 a training set of m labeled documents (d1,c1), (d2,c2),…,(dm,cm)

 Output: a learned classifier γ :d → c


11 Supervised Learning

 We can use any kind of classifiers:


 Naïve Bayes
 kNN
 Rocchio
 SVM
 Logistic Regression
 ….
12

Naïve Bayes Classification


13 Naïve Bayes Intuition

 It is a simple classification technique based on Bayes’ rule.


 Documents are represented with the “bag of words” model.
 The bag of words model:
 A document d is represented by its set of words together with their weights,
determined by term frequency (TF).
 TF is the number of occurrences of each term in the document.
 The document “Mary is quicker than John” is, in this view, identical to
the document “John is quicker than Mary”.
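As a minimal sketch (not from the slides), the bag of words view can be computed with a simple term-frequency counter; the two example sentences above produce identical representations:

```python
from collections import Counter

def bag_of_words(document: str) -> Counter:
    # Term frequency (TF): number of occurrences of each term; word order is discarded.
    return Counter(document.lower().split())

d1 = bag_of_words("Mary is quicker than John")
d2 = bag_of_words("John is quicker than Mary")
assert d1 == d2  # identical under the bag of words view
```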
14 The bag of words representation

γ(d) = c

Example document d:
“I love this movie! It's sweet, but with satirical humor. The dialogue is great
and the adventure scenes are fun… It manages to be whimsical and romantic while
laughing at the conventions of the fairy tale genre. I would recommend it to
just about anyone. I've seen it several times, and I'm always happy to see it
again whenever I have a friend who hasn't seen it yet.”
15 The bag of words representation

γ(d) = c, where d is now reduced to its bag of word counts.
16 Features

 Supervised learning classifiers can use any sort of feature


 URL, email address, punctuation, capitalization, social network features
and so on
 In the bag of words view of documents
 We use only word features
17 Bayes’ Rule Applied to Documents and Classes
 For a document d and a class c

$P(c \mid d) = \dfrac{P(d \mid c)\,P(c)}{P(d)}$
18 Naïve Bayes Classifier

$c_{MAP} = \arg\max_{c \in C} P(c \mid d)$        (MAP = “maximum a posteriori”, i.e. the most likely class)

$\;\;\;\;\;\; = \arg\max_{c \in C} \dfrac{P(d \mid c)\,P(c)}{P(d)}$        (Bayes’ rule)

$\;\;\;\;\;\; = \arg\max_{c \in C} P(d \mid c)\,P(c)$        (dropping the denominator, which is the same for every class)

$\;\;\;\;\;\; = \arg\max_{c \in C} P(x_1, x_2, \ldots, x_n \mid c)\,P(c)$        (document d represented as features $x_1, \ldots, x_n$)
19 Multinomial NB VS Bernoulli NB

 There are two different approaches for setting up an NB classifier.


 multinomial model: generates one term from the vocabulary in each
position of the document
 multivariate Bernoulli model: generates an indicator for each term of the
vocabulary, either 1 indicating presence of the term in the document or 0
indicating absence

 Difference in estimation strategies


20 Document Features

$c_{MAP} = \arg\max_{c \in C} P(d \mid c)\,P(c)$

 Multinomial:
 $c_{MAP} = \arg\max_{c \in C} P(w_1, w_2, \ldots, w_n \mid c)\,P(c)$
 $w_1, w_2, \ldots, w_n$ is the sequence of terms as it occurs in d.

 Bernoulli:
 $c_{MAP} = \arg\max_{c \in C} P(e_1, e_2, \ldots, e_V \mid c)\,P(c)$
 $e_1, e_2, \ldots, e_V$ is a binary vector of dimensionality V that indicates, for each
vocabulary term, whether it occurs in d or not.
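A minimal sketch of the two representations, using a hypothetical toy vocabulary and document (illustrative only):

```python
vocabulary = ["love", "great", "boring", "movie", "plot"]   # hypothetical toy vocabulary
document = "great movie great plot"
tokens = document.split()

# Multinomial model: the sequence of terms w1, ..., wn as it occurs in d.
multinomial_repr = tokens                                   # ['great', 'movie', 'great', 'plot']

# Bernoulli model: a binary vector e1, ..., eV over the vocabulary (presence/absence).
bernoulli_repr = [1 if term in set(tokens) else 0 for term in vocabulary]   # [0, 1, 0, 1, 1]
print(multinomial_repr, bernoulli_repr)
```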
21 ASSUMPTIONS: Conditional independence

 $P(x_1, x_2, \ldots, x_n \mid c) = P(x_1 \mid c) \cdot P(x_2 \mid c) \cdots P(x_n \mid c)$
 Multinomial:
 $c_{MAP} = \arg\max_{c \in C} P(w_1, w_2, \ldots, w_n \mid c)\,P(c)$
 $c_{NB} = \arg\max_{c \in C} P(c) \prod_{1 \le k \le n} P(X_k = w_k \mid c)$

 $X_k$ is the random variable for position k in the document and takes
as values terms from the vocabulary.
 $P(X_k = w_k \mid c)$ is the probability that in a document of class c the term
$w_k$ will occur in position k.
22 ASSUMPTIONS: Conditional independence

 Bernoulli:
 $c_{MAP} = \arg\max_{c \in C} P(e_1, e_2, \ldots, e_V \mid c)\,P(c)$
 $c_{NB} = \arg\max_{c \in C} P(c) \prod_{1 \le i \le V} P(U_i = e_i \mid c)$

 $U_i$ is the random variable for vocabulary term i and takes as values
0 (absence) and 1 (presence).
 $P(U_i = 1 \mid c)$ is the probability that in a document of class c the term
$w_i$ will occur.
23 ASSUMPTIONS: Bag of words
assumption
 if we assumed different term distributions for each position k, we would have to
estimate a different set of parameters for each k

 positional independence: The conditional probabilities for a term are the same,
independent of position in the document.
 E.g. P(Xk1 = t|c) = P(Xk2 = t|c)

 we have a single distribution of terms that is valid for all positions k

$c_{MAP} = \arg\max_{c \in C} P(w_1, w_2, \ldots, w_n \mid c)\,P(c)$

$c_{NB} = \arg\max_{c \in C} P(c) \prod_{1 \le k \le n} P(w_k \mid c)$
24 Learning Parameters for Multinomial
Naïve Bayes
 Based on maximum likelihood estimation (MLE):

$\hat{P}(c_j) = \dfrac{Count(C = c_j)}{N_{doc}}$        (the fraction of documents that are labeled with class $c_j$)

$\hat{P}(w_i \mid c_j) = \dfrac{Count(w_i, c_j)}{\sum_{w \in V} Count(w, c_j)}$        (the fraction of times word $w_i$ appears among all word occurrences in documents of class $c_j$)

 In practice: create a “mega document” for class $c_j$ by concatenating all documents
of that class, and use the frequency of $w_i$ in the mega document.
25 Laplace (add-1) smoothing for Naïve
Bayes
 Solving the problem of zero probabilities

$\hat{P}(w_i \mid c_j) = \dfrac{Count(w_i, c_j) + 1}{\sum_{w \in V}\left(Count(w, c_j) + 1\right)}
= \dfrac{Count(w_i, c_j) + 1}{\left(\sum_{w \in V} Count(w, c_j)\right) + |V|}$
26 Laplace (add-1) smoothing: unknown
words
 Add one extra word to the vocabulary, the “unknown word” , wu

$\hat{P}(w_u \mid c_j) = \dfrac{Count(w_u, c_j) + 1}{\left(\sum_{w \in V} Count(w, c_j)\right) + |V| + 1}
= \dfrac{1}{\left(\sum_{w \in V} Count(w, c_j)\right) + |V| + 1}$

This single probability is used for all unknown words.
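Putting slides 24–26 together, here is a minimal sketch of multinomial Naïve Bayes training with add-1 smoothing and an unknown-word entry. The classifier sums log probabilities, a standard implementation detail (not shown on the slides) that avoids floating-point underflow when multiplying many small probabilities. Function and variable names are illustrative.

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(labeled_docs):
    """labeled_docs: list of (tokens, class_label) pairs, e.g. (['great', 'movie'], 'pos')."""
    n_docs = len(labeled_docs)
    docs_per_class = Counter(c for _, c in labeled_docs)
    word_counts = defaultdict(Counter)                 # word_counts[c][w] = Count(w, c)
    vocabulary = set()
    for tokens, c in labeled_docs:
        word_counts[c].update(tokens)                  # the "mega document" counts for class c
        vocabulary.update(tokens)

    prior = {c: docs_per_class[c] / n_docs for c in docs_per_class}   # P(c)

    cond_prob = {}
    denom_extra = len(vocabulary) + 1                  # |V| + 1: vocabulary plus the unknown word
    for c in docs_per_class:
        total = sum(word_counts[c].values())
        cond_prob[c] = {w: (word_counts[c][w] + 1) / (total + denom_extra) for w in vocabulary}
        cond_prob[c]["<UNK>"] = 1 / (total + denom_extra)             # shared by all unknown words
    return prior, cond_prob

def classify(tokens, prior, cond_prob):
    """Return argmax_c [ log P(c) + sum_k log P(w_k | c) ]."""
    best_class, best_score = None, float("-inf")
    for c in prior:
        score = math.log(prior[c])
        for w in tokens:
            score += math.log(cond_prob[c].get(w, cond_prob[c]["<UNK>"]))
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```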


27 Example
28

Vector Space Classification


29 Vector Space Representation

 The document representation in Naive Bayes is a sequence of terms or a binary
vector.
 A different representation for text classification: the vector space model.
 Each document is represented as a vector with one real-valued component,
usually a tf-idf weight, for each term.

 Each vector has a component for every term that occurs in any document of the
collection.

 Terms are the axes of the space.

 The space is therefore very high-dimensional.
30 Vector Space Classification

 Vector space classification is based on the contiguity hypothesis:


 Objects in the same class form a contiguous region, and regions of different
classes do not overlap

 Classification amounts to computing the boundaries in the space that separate the
classes: the decision boundaries.

 Two methods will be discussed:


Rocchio and kNN
31

Vector Space Classification


Rocchio
32 Rocchio Classification

 Basic idea: Rocchio forms a simple representative for each class by using
the centroids.
 To compute the centroid of each class:

$\vec{\mu}(c) = \dfrac{1}{|D_c|} \sum_{d \in D_c} \vec{v}(d)$

 $D_c$ is the set of all documents that belong to class c.
 $\vec{v}(d)$ is the vector space representation of d.


33 Rocchio Classification

 The boundary between two classes in Rocchio classification is the set of points
with equal distance from the two centroids.

 To classify a new object, determine which centroid it is closest to and assign it
to the corresponding class.
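A minimal sketch of Rocchio classification, assuming documents have already been converted to (e.g. tf-idf) vectors; names are illustrative:

```python
import numpy as np

def train_rocchio(vectors, labels):
    """Centroid mu(c) = (1/|D_c|) * sum of v(d) over the documents d of class c."""
    centroids = {}
    for c in set(labels):
        class_vectors = np.array([v for v, y in zip(vectors, labels) if y == c])
        centroids[c] = class_vectors.mean(axis=0)
    return centroids

def rocchio_classify(vector, centroids):
    # Assign the document to the class whose centroid is closest.
    return min(centroids, key=lambda c: np.linalg.norm(centroids[c] - vector))
```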
34 Problems with Rocchio Classification

 Implicitly assumes that classes are spheres with similar radii.

 Ignores details of the distribution of points within a class and only considers
centroid distances.

 In the figure, the red cube is assigned to the Kenya class, but it is a better fit
for the UK class, since UK is more scattered than Kenya.

 Not requiring similar radii leads to a modified decision rule:
assign d to class c iff $|\vec{\mu}(c) - \vec{v}(d)| \le |\vec{\mu}(c') - \vec{v}(d)| - b$,
where c' is the competing class and b is a positive constant.
35 Problems with Rocchio Classification

 Does not work well for classes that cannot be accurately represented by a
single “center”.

 Class “a” consists of two different clusters.


 Rocchio classification will misclassify “o” as “a” because
it is closer to the centroid A of the “a” class than to the
centroid B of the “b” class.
36

Vector Space Classification


K Nearest Neighbor
37 K Nearest Neighbor Classification

 For kNN we assign each document to the majority class of its k closest
neighbors where k is a parameter.

 The rationale of kNN classification is that, based on the contiguity


hypothesis, we expect a test document d to have the same label as the
training documents located in the local region surrounding d.

 Learning: just store the labeled training examples D


38 K Nearest Neighbor Classification

 Determining the class of a new document:

 For k = 1: Assign each object to the class of its closest neighbor.


 For k > 1: Assign each object to the majority class among its k closest
neighbors.
 The parameter k must be specified in advance. k = 3 and k = 5 are
common choices, but much larger values between 50 and 100 are also
used
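A minimal kNN sketch under the same assumption of precomputed document vectors; Euclidean distance is used here for simplicity, although cosine similarity is common for text:

```python
import numpy as np
from collections import Counter

def knn_classify(query, train_vectors, train_labels, k=3):
    """Assign the query to the majority class among its k closest training documents."""
    distances = [np.linalg.norm(query - v) for v in train_vectors]
    nearest = np.argsort(distances)[:k]                       # indices of the k nearest neighbours
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```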
39 kNN decision boundary

 The decision boundary defined by the Voronoi tessellation


 Assuming k = 1: for a given set of objects in the space, each object defines
a cell consisting of all points that are closer to that object than to other
objects.
 Results in a set of convex polygons;
so-called Voronoi cells.

 Decomposing a space into such cells


gives us the so-called Voronoi tessellation.
 In the case of k > 1, the Voronoi cells
are given by the regions in the space for which
the set of k nearest neighbors is the same.
40

 Decision boundary for 1NN (double line): defined along the regions of
Voronoi cells for the objects in each class. Shows the non-linearity of kNN
41 kNN: Probabilistic version

 There is a probabilistic version of kNN classification algorithm.


 We can estimate the probability of membership in class c as the proportion of
the k nearest neighbors in c.

 P(circle class | star) = 1/3
 P(X class | star) = 2/3
 P(diamond class | star) = 0

 The 3NN estimate: P(circle class | star) = 1/3
 The 1NN estimate: P(circle class | star) = 1
 So 3NN prefers the X class and 1NN prefers the circle class.
42 kNN: distance weighted version

 We can weight the “votes” of the k nearest neighbors by their cosine


similarity:

$score(c, d) = \sum_{d' \in S_k(d)} I_c(d') \cos(\vec{v}(d'), \vec{v}(d))$

 Sk(d) is the set of d’s k nearest neighbors


 Ic(d′) = 1 iff d′ is in class c and 0 otherwise.
 Assign the document to the class with the highest score.
 Weighting by similarities is often more accurate than simple voting.
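A minimal sketch of the cosine-weighted variant (illustrative names, precomputed vectors assumed):

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def weighted_knn_classify(query, train_vectors, train_labels, k=3):
    """score(c, d) = sum over the k nearest neighbours d' in class c of cos(v(d'), v(d))."""
    sims = [cosine(query, v) for v in train_vectors]
    nearest = np.argsort(sims)[-k:]                           # S_k(d): the k most similar documents
    scores = {}
    for i in nearest:
        scores[train_labels[i]] = scores.get(train_labels[i], 0.0) + sims[i]
    return max(scores, key=scores.get)                        # class with the highest score
```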
43 kNN Properties

 No training necessary
 Scales well with large number of classes
 Don’t need to train n classifiers for n classes
 May be expensive at test time
 In most cases it’s more accurate than NB or Rocchio
44 Some Recent Applications of Text
Classification
 Social networks such as Twitter are a rich source of text data.
 Many new applications are emerging based on these sources.
 We will discuss three recent works based on Twitter data:
 “Learning Similarity Functions for Topic Detection in Online Reputation
Monitoring”, Damiano Spina et al., SIGIR 2014
 “Target-dependent Churn Classification in Microblogs”, Hadi Amiri et al.,
AAAI 2015
 “Fine-Grained Location Extraction from Tweets with Temporal Awareness”,
Chenliang Li et al., SIGIR 2014
45 Learning Similarity Functions for Topic
Detection in Online Reputation Monitoring

 What are people saying about a given entity (company, brand,
organization, personality, …)?
 Are there any issues that may damage the entity’s reputation?
 To answer such questions, reputation experts have to monitor social networks
such as Twitter daily.
 The paper aims to solve this problem automatically, as a topic detection task.
 Reputation alerts must be detected early, before they explode, so only a small
number of related tweets is available.
 Probabilistic generative approaches are less appropriate here because of
data sparsity.
46 Learning Similarity Functions…

 Paper proposes a hybrid approach: supervised + unsupervised


 Tweets need to be clustered based on their topics
 Clusters change over time based on data stream
 Tweets with similar topics belong to the same cluster
 For clustering we need a similarity function.
 Classification (SVM) is used to learn the similarity function.
 A pair of tweets is classified as either “similar” or “not similar”.

Pipeline: SVM (learned similarity function) → Hierarchical Agglomerative Clustering


47 Learning Similarity Functions…

 The dataset comprises 142,527 manually annotated tweets.


 Features used for classification:
 Term Features:
 Taking into account the similarity between terms in the tweets.
 Tweets sharing a high percentage of vocabulary are likely about the same topic.

 Semantic features:
 Useful for tweets that do not have words in common.
 Used Wikipedia as a knowledge base to find semantically related words, e.g. mexico,
mexicanas

 Metadata features:
 Such as author, URLs, hashtags,…

 Time-aware features:
 Time stamp of tweets
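As an illustration only (this is not the paper’s actual feature set), a single term feature for a tweet pair could be computed as the overlap of their vocabularies; such pairwise features feed the “similar / not similar” classifier:

```python
def term_overlap(tweet_a: str, tweet_b: str) -> float:
    """Jaccard overlap of the two tweets' term sets (hypothetical term feature)."""
    terms_a, terms_b = set(tweet_a.lower().split()), set(tweet_b.lower().split())
    if not terms_a or not terms_b:
        return 0.0
    return len(terms_a & terms_b) / len(terms_a | terms_b)

# Tweets sharing a high percentage of vocabulary are likely about the same topic.
print(term_overlap("new phone launch event today", "phone launch today in town"))
```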
48 Fine-Grained Location Extraction from
Tweets with Temporal Awareness
 Through tweets, users often casually or implicitly reveal their current
locations and short-term plans for where to visit next, at a fine granularity.

 Such information enables tremendous opportunities for personalization and


location-based services/marketing

 This paper is about extracting fine-grained locations mentioned in tweets


with temporal awareness.
49 Fine-Grained Location Extraction…

 Dataset: a sample of 4000 manually annotated tweets


 Classifier: Conditional Random Fields
 Some features:
 Lexical Features: the word itself, its lowercase form, bag-of-words of the context window,
 Grammatical Features: POS tags (for verb tense), word groups (shudd, shuld,
shoud)
 …
 If a user mentions a point-of-interest (POI) (e.g., restaurant, shopping
mall,…) in her tweet:
 The name of the POI is extracted
 The tweet is assigned to one of these classes: the POI was visited, is currently
being visited, or will soon be visited (temporal awareness).
50 Target-dependent Churn Classification
in Microblogs
 The problem of classifying micro-posts as churny or non-churny with respect
to a given brand (telecom companies) using Twitter data.
 Whether a tweet is an indicator that a user is going to cancel a service
 Sample tweets:
 “One of my main goals for 2013 is to leave BrandName”
 “I cant take it anymore, the unlimited data isn’t even worth it”
 “My days with BrandName are numbered”
51 Target-dependent Churn Classification
in Microblogs
 Dataset : 4800 tweets about three telecommunication brands
 Classification method: linear regression, SVM, logistic regression
 Features:
 Demographic Churn Indicators:
 Activity ratio (if a user is not active with respect to any competitor, then he is less likely
to post churny content about a target brand),
 # followers and friends, …

 Content Churn Indicators:


 Sentiment features, Tense of tweet,…


52

THANK YOU
53 References

https://web.stanford.edu/class/cs124/lec/sentiment.pptx
https://web.stanford.edu/class/cs124/lec/naivebayes.pdf
An Introduction to Information Retrieval, Christopher D. Manning, Prabhakar
Raghavan, Hinrich Schütze, 2009
Learning Similarity Functions for Topic Detection in Online Reputation
Monitoring, Damiano Spina et al., SIGIR 2014
Target-dependent Churn Classification in Microblogs, Hadi Amiri et al., AAAI 2015
Fine-Grained Location Extraction from Tweets with Temporal Awareness,
Chenliang Li et al., SIGIR 2014
