NLP and the Web – WS 2024/2025
Lecture 2
Foundations of Text Classification
Dr. Thomas Arnold
Hovhannes Tamoyan
Kexin Wang
Ubiquitous Knowledge Processing Lab
Technische Universität Darmstadt
Syllabus (tentative)
Nr. Lecture
01 Introduction / NLP basics
02 Foundations of Text Classification
03 IR – Introduction, Evaluation
04 IR – Word Representation, Data Collection
05 IR – Re-Ranking Methods
06 IR – Language Domain Shifts, Dense / Sparse Retrieval
07 LLM – Language Modeling Foundations
08 LLM – Neural LLM, Tokenization
09 LLM – Transformers, Self-Attention
10 LLM – Adaptation, LoRA, Prompting
11 LLM – Alignment, Instruction Tuning
12 LLM – Long Contexts, RAG
13 LLM – Scaling, Computation Cost
14 Review & Preparation for the Exam
WS24/25 | Computer Science Department | UKP - Prof. Dr. Iryna Gurevych | 2
Last Lecture – Linguistic Analysis Levels
Phonetics and Phonology
Segmentation
Morphology
Syntax
Semantics
Pragmatics and Discourse
Today's lecture
Text Classification: Introduction
Algorithms
Naive Bayes
Hidden Markov Models
What is Text Classification?
Input Text → Classification Model → Output tags / classes
Examples – Spam Detection
From: Danke (archdigest@news.condenast.com)
Examples – Sentiment Analysis
Positive or negative movie review?
▪ This is it. This is the one. This is the worst movie ever made. Ever. It
beats everything. I have never seen worse.
▪ Expertly scripted and perfectly delivered, this searing parody leaves you
literally rolling with laughter.
▪ While watching this film I started to come up with things I would rather be
doing, including drinking bleach, rubbing sand in my eyes, and tax
returns.
▪ Just finished watching this movie for maybe the 7th or 8th time
More Examples
▪ Topic Labeling
▪ Age/Gender Identification
▪ Authorship Identification
▪ Language Identification
Example text: "I've seen George Foreman shadow boxing and the shadow won."
Approaches
Rule-based
Handcrafted linguistic rules
Human comprehensible
Pro: Precision can be high
Con: Very expensive to build and maintain
IF "basketball" THEN
return top_sports
ELSEIF…
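As a minimal sketch, a rule of this kind could look as follows in Python. The keyword lists and label names are hypothetical; real rule-based systems maintain far larger, hand-crafted rule sets:

```python
# Sketch of a rule-based topic classifier with hand-written keyword rules.
# Keyword lists and labels are made up for illustration.
def classify_topic(text: str) -> str:
    text = text.lower()
    if "basketball" in text or "football" in text:
        return "top_sports"
    elif "parliament" in text or "election" in text:
        return "top_politics"
    return "top_other"

print(classify_topic("The basketball season starts today"))  # top_sports
```

High precision is possible because each rule is hand-checked, but every new topic or phrasing requires a new rule, which is why maintenance is expensive.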
Approaches
Supervised Machine Learning
Step 1: Training
(Diagram: Input Text + Tags → Feature Extractor → Features → train Classification Model)
Approaches
Supervised Machine Learning
Step 2: Prediction
(Diagram: Input Text → Feature Extractor → Features → Classification Model → Tags)
Approaches
Supervised Machine Learning
▪ Classification model based on training data
▪ Human comprehensible? Dependent on the model!
▪ Pro: Easier to maintain, usually more accurate
▪ Con: Needs training data
Comparison to Neural Models
Kuzman T, Mozetič I, Ljubešić N. Automatic Genre Identification for Robust Enrichment of Massive Text Collections: Investigation of
Classification Methods in the Era of Large Language Models. Machine Learning and Knowledge Extraction. 2023; 5(3):1149-1175
Today's lecture
Text Classification: Introduction
Algorithms
Naive Bayes
Hidden Markov Models
Naïve Bayes Classifier: Background
▪ Before starting, let's review related concepts:
▪ Conditional Probability
▪ Bayes' Rule.
Conditional Probability
▪ Conditional probability in plain English:
▪ What is the probability that something will happen, given that something else has already happened?
▪ Assume we have some Outcome O and some Evidence E.
▪ P(O, E): the probability of having both the Outcome O and Evidence E is the product of two probabilities:
▪ P(O): probability of O occurring
▪ P(E|O): probability of E given that O happened
P(O, E) = P(O) × P(E|O)
Conditional Probability: Example
▪ Let's say we have a group of students
▪ Students could be either awake or asleep
▪ Students are also either bachelor or master students
▪ If we select one student randomly: what is the probability that this person
is a sleeping bachelor student?
▪ Conditional Probability can help us answer that:
▪ P(Asleep, Bachelor) = P(Asleep) × P(Bachelor | Asleep)
▪ We could compute the exact same thing, the reverse way
▪ P(Asleep, Bachelor) = P(Bachelor) × P(Asleep | Bachelor)
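A quick numeric check of this identity, using hypothetical counts for the student example (the numbers are made up for illustration):

```python
# Toy counts for the student example (hypothetical numbers), checking that
# P(Asleep, Bachelor) = P(Asleep) * P(Bachelor|Asleep)
#                     = P(Bachelor) * P(Asleep|Bachelor).
counts = {("asleep", "bachelor"): 10, ("asleep", "master"): 5,
          ("awake", "bachelor"): 50, ("awake", "master"): 35}
total = sum(counts.values())  # 100 students

n_ab = counts[("asleep", "bachelor")]
n_asleep = n_ab + counts[("asleep", "master")]
n_bachelor = n_ab + counts[("awake", "bachelor")]

p_joint = n_ab / total                       # P(Asleep, Bachelor) = 0.1
p_asleep, p_bachelor = n_asleep / total, n_bachelor / total
p_b_given_a = n_ab / n_asleep                # P(Bachelor | Asleep)
p_a_given_b = n_ab / n_bachelor              # P(Asleep | Bachelor)

# Both factorizations give the same joint probability:
assert abs(p_joint - p_asleep * p_b_given_a) < 1e-12
assert abs(p_joint - p_bachelor * p_a_given_b) < 1e-12
```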
Bayes Rule
▪ Conceptually, this is a way to go from P(Evidence | Outcome) to P(Outcome | Evidence)
▪ Let's see how to do that
▪ We have:
▪ P(O, E) = P(O) × P(E|O) and P(O, E) = P(E) × P(O|E)
▪ → P(O) × P(E|O) = P(E) × P(O|E)
P(O|E) = P(E|O) × P(O) / P(E)
Bayes' Rule
Getting to Naïve Bayes
▪ So far, we have talked only about one piece of evidence.
▪ In reality, we have to predict an outcome given multiple pieces of evidence
▪ the math gets very complicated :(
▪ Naive Bayes' solution:
▪ treat each piece of evidence as independent -> simpler math :)
▪ This assumption is why this is called naive Bayes
▪ Suppose we have multiple pieces of evidence E1, …, En and an outcome O
P(O | E1, …, En) = P(E1|O) × P(E2|O) × … × P(En|O) × P(O) / P(E1, E2, …, En)
Getting to Naïve Bayes
P(O | E1, …, En) = P(E1|O) × P(E2|O) × … × P(En|O) × P(O) / P(E1, E2, …, En)
Many people choose to remember this as:
P(outcome | evidence) = P(Likelihood of Evidence) × Prior probability of outcome / P(Evidence)
▪ Notes:
• If P(evidence|outcome) is 1, then we are just multiplying by 1.
• If P(some particular evidence|outcome) is 0, then the whole probability becomes 0.
• Since we divide everything by P(Evidence), we can even get away without calculating it.
• The intuition behind multiplying by the prior is to give high probability to more common outcomes and low probability to unlikely outcomes.
• These are also called base rates; they are a way to scale our predicted probabilities.
Naïve Bayes: Example
▪ Let's say that we have data on 1000 pieces of fruit: Banana, Orange or
some Other Fruit
▪ We know 3 characteristics about each fruit
▪ Whether it is Long
▪ Whether it is Sweet and
▪ If its color is Yellow
▪ Our training set
Type Long Not Long Sweet Not Sweet Yellow Not Yellow Total
Banana 400 100 350 150 450 50 500
Orange 0 300 150 150 300 0 300
Other Fruit 100 100 150 50 50 150 200
Total 500 500 650 350 800 200 1000
Naïve Bayes: Example cont.
▪ “Prior” probabilities
▪ P(Banana) = 500/1000 = 0.5, P(Orange) = 0.3, P(Other Fruit) = 0.2
▪ Probability of "Evidence"
▪ P(Long) = 500/1000 = 0.5, P(Sweet) = 0.65, P(Yellow) = 0.8
▪ Probability of "Likelihood"
▪ P(Long|Banana) = 0.8, P(Long|Orange) = 0
▪ P(Yellow|Other Fruit) = 50/200 = 0.25, P(Not Yellow|Other Fruit) = 0.75
Naïve Bayes: Example cont.
▪ "Prior" probabilities
▪ P(Banana) = 0.5 (500/1000), P(Orange) = 0.3, P(Other Fruit) = 0.2
▪ Probability of "Evidence"
▪ P(Long) = 500/1000 = 0.5, P(Sweet) = 0.65, P(Yellow) = 0.8
▪ Probability of "Likelihood"
▪ P(Long|Banana) = 0.8, P(Long|Orange) = 0
▪ P(Yellow|Other Fruit) = 50/200 = 0.25, P(Not Yellow|Other Fruit) = 0.75
Given an unknown fruit which is long, sweet and yellow, is it Banana, Orange or Other Fruit?
▪ P(Banana | Long, Sweet, Yellow) = P(Long|Banana) × P(Sweet|Banana) × P(Yellow|Banana) × P(Banana) / (P(Long) × P(Sweet) × P(Yellow))
= 0.8 × 0.7 × 0.9 × 0.5 / P(evidence) = 0.252 / P(evidence)
▪ P(Orange | Long, Sweet, Yellow) = 0, why?
▪ P(Other Fruit | Long, Sweet, Yellow) = P(Long|Other Fruit) × P(Sweet|Other Fruit) × P(Yellow|Other Fruit) × P(Other Fruit) / (P(Long) × P(Sweet) × P(Yellow))
= (100/200 × 150/200 × 50/200 × 200/1000) / P(evidence) = 0.01875 / P(evidence)
0.252 >> 0.01875 => the unknown fruit is most likely a banana
Is it clear now why we don't need to calculate P(evidence)?
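The whole fruit example can be reproduced in a few lines. Since P(evidence) is identical for every class, the code only compares the unnormalized scores P(E1|O) × … × P(En|O) × P(O):

```python
# Naive Bayes over the fruit table: compare unnormalized scores
# P(Long|fruit) * P(Sweet|fruit) * P(Yellow|fruit) * P(fruit);
# the shared denominator P(evidence) is dropped.
counts = {  # from the training-set table
    "Banana":      {"Long": 400, "Sweet": 350, "Yellow": 450, "Total": 500},
    "Orange":      {"Long": 0,   "Sweet": 150, "Yellow": 300, "Total": 300},
    "Other Fruit": {"Long": 100, "Sweet": 150, "Yellow": 50,  "Total": 200},
}
n_total = 1000

def score(fruit, features=("Long", "Sweet", "Yellow")):
    c = counts[fruit]
    prior = c["Total"] / n_total
    likelihood = 1.0
    for f in features:
        likelihood *= c[f] / c["Total"]
    return likelihood * prior

for fruit in counts:
    print(fruit, score(fruit))
# Banana ≈ 0.252, Orange = 0.0, Other Fruit ≈ 0.01875
best = max(counts, key=score)  # "Banana"
```

Note how a single zero count (no long oranges) drives the Orange score to exactly 0, as discussed in the notes above.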
Naïve Bayes for Text Classification based on
Words as Features
▪ Multinomial Naïve Bayes Model: The probability of a document d belonging to a class c is proportional to the product of the probabilities of its terms t_k given the class c, and to the class prior P(c):
P(c|d) ∝ P(c) · ∏(1 ≤ k ≤ n_d) P(t_k | c)
▪ The best class c for a document d is found by selecting the class for which the maximum a posteriori (MAP) probability is maximal:
c_map = argmax_c P(c) · ∏(1 ≤ k ≤ n_d) P(t_k | c)
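A minimal sketch of a multinomial Naïve Bayes text classifier, assuming a tiny made-up training set. Two standard tricks not shown on the slide: log probabilities avoid numeric underflow, and add-one (Laplace) smoothing avoids zero term probabilities:

```python
import math
from collections import Counter

# Tiny hypothetical training set: (document, class) pairs.
train = [("good great fun", "pos"), ("great acting", "pos"),
         ("boring bad plot", "neg"), ("bad bad acting", "neg")]

class_docs = Counter(label for _, label in train)
word_counts = {c: Counter() for c in class_docs}
for text, label in train:
    word_counts[label].update(text.split())
vocab = {w for c in word_counts for w in word_counts[c]}

def predict(text):
    scores = {}
    for c in class_docs:
        # log P(c) + sum over tokens of log P(t_k | c), with add-one smoothing
        log_p = math.log(class_docs[c] / len(train))
        total = sum(word_counts[c].values())
        for w in text.split():
            log_p += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        scores[c] = log_p
    return max(scores, key=scores.get)

print(predict("great fun"))   # pos
print(predict("bad plot"))    # neg
```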
Summary on Naïve Bayes
▪ Bayesian methods provide the basis for probabilistic learning methods
▪ Bayesian methods can be used to determine the most probable
hypothesis given the data
▪ No training, just probability calculation
▪ Binary, numeric and nominal features can be mixed
▪ Naïve Bayes fails if the independence assumption is violated too much
▪ Especially identical or highly overlapping features pose a problem that has to be
addressed with proper feature selection
Today's lecture
Text Classification: Introduction
Algorithms
Naive Bayes
Hidden Markov Models
Limitations of Standard classification
▪ Standard classification problem assumes
▪ individual cases are disconnected and independent
▪ Many NLP problems do not satisfy this assumption
▪ involve making many connected decisions, each resolving a different ambiguity,
but which are mutually dependent
▪ More sophisticated learning and inference techniques are needed
Sequence Labeling
▪ Many NLP problems can be viewed as sequence labeling
▪ Each token in a sequence is assigned a label
▪ Labels of tokens are dependent on the labels of other tokens in the
sequence, particularly their neighbors
▪ Examples:
▪ Part Of Speech Tagging
In: John saw the saw and decided to take it to the table.
Out: PN V Det N Conj V Part V Pro Prep Det N
▪ Named entity recognition: people organizations places
Michael Dell is the CEO of Dell Computer Corporation and lives in Austin Texas.
Use case: Part of Speech (POS) Tagging
▪ Given a sentence X, predict its part of speech sequence Y
▪ A type of “structured” prediction
▪ How can we do this?
Use case: Part of Speech (POS) Tagging
Possible Answers
▪ Sequence labeling as classification:
▪ Pointwise prediction: predict each word individually with a classifier
▪ Generative sequence models: e.g. Hidden Markov Model (HMM)
▪ Later in the lecture: Neural Sequence Models (RNN / LSTM)
Sequence Labeling as Classification
▪ Classify each token independently, but use information about the surrounding tokens as input features (sliding window).
John saw the saw and decided to take it to the table.
classifier
PN
Slides from: Raymond J. Mooney
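A sketch of how a sliding window turns each token into one classification instance. The feature names are made up for illustration; each resulting dictionary would be fed to any standard classifier:

```python
# Sliding-window feature extraction for pointwise sequence labeling:
# each token becomes one instance described by itself and its neighbors.
def window_features(tokens, i, size=1):
    feats = {"word": tokens[i].lower()}
    for d in range(1, size + 1):
        # sentence-boundary placeholders when the window leaves the sentence
        feats[f"word-{d}"] = tokens[i - d].lower() if i - d >= 0 else "<s>"
        feats[f"word+{d}"] = tokens[i + d].lower() if i + d < len(tokens) else "</s>"
    return feats

tokens = "John saw the saw".split()
print(window_features(tokens, 3))
# {'word': 'saw', 'word-1': 'the', 'word+1': '</s>'}
```

The classifier sees "the" to the left of the second "saw", which is exactly the kind of evidence that disambiguates it as a noun.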
Sequence Labeling as Classification
Using Outputs as Inputs
▪ Better input features are usually the categories of the surrounding tokens, but these are not available yet.
▪ We can use the category of either the preceding or the succeeding tokens by going forward or backward through the sequence and reusing the previous outputs.
Forward Classification
John saw the saw and decided to take it to the table.
classifier
Forward Classification
PN V Det N Conj V Part V Pro Prep Det
John saw the saw and decided to take it to the table.
classifier
Backward Classification
▪ Disambiguating “to” in this case would be even easier backward.
John saw the saw and decided to take it to the table.
classifier
Backward Classification
▪ Disambiguating “to” in this case would be even easier backward.
▪ Preposition? ("I am heading to the store")
▪ Particle? (in front of infinitive verbs, like "I want to eat. I am going to leave.")
V Pro Prep Det N
John saw the saw and decided to take it to the table.
classifier
Part
Backward Classification
▪ Disambiguating “to” in this case would be even easier backward.
V Det V Conj V Part V Pro Prep Det N
John saw the saw and decided to take it to the table.
classifier
PN
Sequence Labeling as Classification
Problems
▪ Not easy to integrate information from category of tokens on both sides
▪ Difficult to propagate uncertainty between decisions and “collectively”
determine the most likely joint assignment of categories to all of the
tokens in a sequence.
Probabilistic Sequence Models
▪ Probabilistic sequence models allow
▪ integrating uncertainty over multiple, interdependent classifications
▪ and collectively determine the most likely global assignment.
▪ Generative sequence models: e.g. Hidden Markov Model (HMM)
▪ Later in the lecture: Neural Sequence Models (RNN / LSTM)
Intuition of HMM decoding: PoS Tagging
▪ Choose the tag sequence t_1 … t_n that is most probable given the observation sequence of n words w_1 … w_n:
t̂_(1:n) = argmax_(t_1…t_n) P(t_1…t_n | w_1…w_n)
▪ Bayes' Rule:
t̂_(1:n) = argmax_(t_1…t_n) P(w_1…w_n | t_1…t_n) · P(t_1…t_n) / P(w_1…w_n)
▪ Drop the denominator:
t̂_(1:n) = argmax_(t_1…t_n) P(w_1…w_n | t_1…t_n) · P(t_1…t_n)
▪ Assumption 1: a word's appearance depends only on its own tag: P(w_1…w_n | t_1…t_n) ≈ ∏_i P(w_i | t_i)
▪ Assumption 2: the probability of a tag depends only on the previous tag: P(t_1…t_n) ≈ ∏_i P(t_i | t_(i−1))
Hidden Markov Model
A Hidden Markov Model is a statistical model of hidden, stochastic state transitions
with observable, stochastic output. Key features:
▪ A fixed set of states
▪ At each "time", the hidden Markov model is in exactly one of these states
▪ State transition probabilities
▪ The starting state can be fixed or probabilistic
▪ A fixed set of possible outputs
▪ For each state: a distribution of probabilities for every possible output
▪ Also called emission probabilities
Task:
For an observed output sequence, what is the (hidden) state sequence that has the highest probability of producing this output?
Hidden Markov Model - Example
Every day, Darth Vader is in one of three moods: Good, Neutral or Bad
But, because he wears his mask, we cannot observe it!
(image from pixabay.com, CC0 Creative Commons licence)
Hidden Markov Model - Example
Somehow, you know the odds how his mood changes from day to day:
(Diagram: transition probabilities between the three moods)
Hidden Markov Model - Example
You also know the chances for his mood on day 1:
(Diagram: probabilities for the mood on day 1, plus the mood-transition probabilities)
Hidden Markov Model - Example
What we CAN observe is if Darth Vader destroys a planet or not, which depends on
his mood!
(Diagram: the transition graph extended with per-mood emission probabilities for destroying / not destroying a planet)
Hidden Markov Model - Example
We observe that he does not destroy a planet on the first day, but destroys a planet on each of the second and third days:
Question:
What is the most probable sequence of his mood on these three days?
Hidden Markov Model - Example
What is the probability for this mood
sequence and this observation:
Hidden Markov Model - Example
What is the probability for this mood
sequence and this observation:
First day:
Transition probability (from start state S): 0.3
Emission probability: 0.1
Joint probability: 0.3 * 0.1 = 0.03
Hidden Markov Model - Example
What is the probability for this mood
sequence and this observation:
Second day:
Transition probability: 0.5
Emission probability: 0.7
Joint probability: 0.5 * 0.7 = 0.35
Probability for sequence: 0.03 (day one) * 0.35 (day two) = 0.0105
Hidden Markov Model - Example
What is the probability for this mood
sequence and this observation:
Third day:
Transition probability: 0.4
Emission probability: 0.9
Joint probability: 0.4 * 0.9 = 0.36
Probability for sequence: 0.03 * 0.35 * 0.36 = 0.00378
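The three-day computation can be checked in code: the joint probability of a state sequence and an observation sequence is the product of per-step transition × emission probabilities. The mood labels below are an assumption consistent with the slide's numbers (start 0.3, emissions 0.1 / 0.7 / 0.9, transitions 0.5 and 0.4):

```python
# Joint probability of a state sequence and an observation sequence
# in an HMM: product over steps of transition * emission probability.
def joint_probability(states, observations, start_p, trans_p, emit_p):
    p = start_p[states[0]] * emit_p[states[0]][observations[0]]
    for prev, cur, obs in zip(states, states[1:], observations[1:]):
        p *= trans_p[prev][cur] * emit_p[cur][obs]
    return p

# Per-step numbers from the slides; the mood names are an assumption.
start_p = {"Bad": 0.3}
trans_p = {"Bad": {"Neutral": 0.5}, "Neutral": {"Bad": 0.4}}
emit_p = {"Bad": {"no destroy": 0.1, "destroy": 0.9},
          "Neutral": {"destroy": 0.7}}

p = joint_probability(["Bad", "Neutral", "Bad"],
                      ["no destroy", "destroy", "destroy"],
                      start_p, trans_p, emit_p)
print(p)  # ≈ 0.00378, matching 0.03 * 0.35 * 0.36
```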
Hidden Markov Model – Application to NLP
In our POS tagging example, we know the sequence of words, and we want to
know the sequence of POS tags!
▪ (hidden) States: POS tags t1, t2, t3, t4
▪ (observable) Outputs: tokens w1, w2, w3, w4
▪ We also need
▪ Transition probabilities between states
▪ Also: initial probabilities = probabilities for the tag of the first token of a sentence
▪ Emission probabilities for every state
How would our graph from the example look like?
Hidden Markov Model – Application to NLP
(Diagram: an HMM with states N, V, DT, transition probabilities between them, and per-state emission probabilities for the words "Tom", "saw", "the")
How do we get to the transition and emission probabilities?
Hidden Markov Model – Application to NLP:
Estimating the Probabilities
▪ The probabilities are estimated just by counting on a tagged training
corpus
▪ Transition probability: how often the first tag is followed by the second, divided by the number of times the first tag was seen in the labeled corpus
▪ Emission probability: the number of times the word was associated with the tag in the labeled corpus, divided by the number of times the tag was seen in the labeled corpus
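These counts are straightforward to compute; a sketch on a tiny hypothetical tagged corpus:

```python
from collections import Counter

# Estimating HMM probabilities by counting on a tagged corpus:
#   transition P(t2 | t1) = count(t1 t2) / count(t1)
#   emission   P(w | t)   = count(w tagged t) / count(t)
# The one-sentence corpus below is made up for illustration.
corpus = [[("the", "DT"), ("man", "N"), ("saw", "V"), ("the", "DT"), ("saw", "N")]]

tag_count = Counter()
bigram_count = Counter()
emit_count = Counter()
for sent in corpus:
    tags = [t for _, t in sent]
    tag_count.update(tags)
    bigram_count.update(zip(tags, tags[1:]))   # adjacent tag pairs
    emit_count.update(sent)                    # (word, tag) pairs

def transition(t1, t2):
    return bigram_count[(t1, t2)] / tag_count[t1]

def emission(word, tag):
    return emit_count[(word, tag)] / tag_count[tag]

print(transition("N", "V"))   # N followed by V once, N seen twice -> 0.5
print(emission("saw", "N"))   # "saw" tagged N once, N seen twice -> 0.5
```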
Example
Input sentence: Janet will back the bill
Correct PoS tags: Janet/NNP will/MD back/VB the/DT bill/NN
Transition probabilities based on WSJ corpus
Rows are labeled with the conditioning event; thus P(VB|MD) is 0.7968.
Given the observation (output) likelihoods
Hidden Markov Model – Complexity
Question: What is the most likely state sequence given an output sequence?
▪ Naïve solution:
▪ brute force search by enumerating all possible sequences of states
▪ Complexity O(s^m)
▪ where m is the length of the input and s is the number of states in the model.
▪ Better solution: Dynamic Programming!
▪ Standard procedure is called the Viterbi algorithm
▪ Running time is O(m·s²),
▪ where m is the length of the input and s is the number of states in the model.
Viterbi algorithm: Motivation
▪ Let o_1, o_2, …, o_T be some input sentence
▪ Let s_1, s_2, …, s_T be a sequence of tags
▪ Goal: the best sequence of tags given the input sequence
▪ argmax over s_1, s_2, …, s_T of P(o_1, …, o_T, s_1, s_2, …, s_T)
▪ Example
▪ Input sentence: The, man, saw, the, saw
▪ Tag set = {Det, N, V}
▪ Possible tag sequences:
Det Det Det Det Det -> 0.000003
Det Det Det Det N -> 0.000008
Det Det Det Det V -> 0.000009
…
Det N V Det N -> 0.012
we have 3^5 (= 243) sequences: sentence length = 5, #tags = 3
Viterbi algorithm: Basic idea
Let us say we have only two possible states, A and B, and some observation o
What is the best possible state sequence of length 5 for this observation o?
It is either:
▪ the best possible sequence of length 4 that ends with A, followed by A
▪ the best possible sequence of length 4 that ends with A, followed by B
▪ the best possible sequence of length 4 that ends with B, followed by A
▪ the best possible sequence of length 4 that ends with B, followed by B
This is only true because the next state only depends on the state directly before!
So, what is the best possible sequence of length 4 that ends with A? It is either:
▪ the best possible sequence of length 3 that ends with A, followed by A
▪ the best possible sequence of length 3 that ends with B, followed by A
▪ …
Viterbi algorithm: Example
Emission probabilities Transition probabilities
P (the | DT) = 0.5 P (DT | DT)= 0.1 P (DT | N) = 0.2
P (man | V) = 0.1 P (V | DT) = 0.3 P (V | N) = 0.6
P (man | N) = 0.3 P (N | DT) = 0.6 P (N | N) = 0.2
P (saw | V) = 0.2 P (DT | V) = 0.5 P (DT | q0) = 0.6
P (saw | N) = 0.2 P (V | V) = 0.2 P (V | q0) = 0.3
P (N | V) = 0.3 P (N | q0) = 0.1
The man saw the saw $$
DT
Viterbi algorithm: Example
The man saw the saw $$
DT 0.6 * 0.5 = 0.3
Viterbi algorithm: Example
The man saw the saw $$
DT 0.3
V 0
N 0
Viterbi algorithm: Example
The man
DT 0.3
V 0 → ? = max {
P(man | V) * P(V | DT) * 0.3,
P(man | V) * P(V | V) * 0,
P(man | V) * P(V | N) * 0
}
N 0
Substituting the probabilities:
V = max {
    0.1 * 0.3 * 0.3,
    0.1 * 0.2 * 0,
    0.1 * 0.6 * 0
} = 0.009
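The recursion cell above can be computed directly. A sketch with hypothetical variable names, using the numbers from the slide:

```python
# One recursion cell of Viterbi: the value of state V at token "man",
#   delta_2(V) = max over s' of P(man | V) * P(V | s') * delta_1(s')
p_man_given_V = 0.1                              # P(man | V)
p_V_given = {"DT": 0.3, "V": 0.2, "N": 0.6}      # P(V | s')
delta_prev = {"DT": 0.3, "V": 0.0, "N": 0.0}     # trellis column for "The"

candidates = {s_prev: p_man_given_V * p_V_given[s_prev] * delta_prev[s_prev]
              for s_prev in delta_prev}
best_prev = max(candidates, key=candidates.get)  # becomes the backpointer
delta_man_V = candidates[best_prev]
print(best_prev)    # DT
print(delta_man_V)  # ≈ 0.009
```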
The maximum comes from predecessor DT, so the cell stores the value together with a backpointer:
V = 0.009, backpointer DT (predecessor of the most probable path)
The complete trellis column for "man":
DT = 0    V = 0.009 (backpointer DT)    N = 0.054 (backpointer DT)
Recursion for "saw", state V:
V = max {
    P(saw | V) * P(V | DT) * 0,
    P(saw | V) * P(V | V) * 0.009,
    P(saw | V) * P(V | N) * 0.054
}
Substituting the probabilities:
V = max {
    0.2 * 0.3 * 0        ( = 0),
    0.2 * 0.2 * 0.009    ( = 0.00036),
    0.2 * 0.6 * 0.054    ( = 0.00648)
} = 0.00648 = 6.48 * 10^-3, backpointer N
Filling in the remaining columns gives the completed trellis (value, backpointer in parentheses):

       The    man           saw               the                saw
DT     0.3    0             0                 1.62 * 10^-3 (V)   0
V      0      0.009 (DT)    6.48 * 10^-3 (N)  0                  9.72 * 10^-5 (DT)
N      0      0.054 (DT)    2.16 * 10^-3 (N)  0                  1.94 * 10^-4 (DT)
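The whole trellis can be reproduced with a compact implementation. This is a sketch (function and variable names are my own); the probabilities are the emission and transition tables from the slides:

```python
# Viterbi with backpointers, reproducing the trellis of the example.
def viterbi(words, states, start_p, trans_p, emit_p):
    # Initialization: delta_1(s) = P(s | q0) * P(w1 | s)
    trellis = [{s: (start_p[s] * emit_p.get((s, words[0]), 0.0), None)
                for s in states}]
    # Recursion: delta_t(s) = max_s' P(w_t | s) * P(s | s') * delta_{t-1}(s')
    for word in words[1:]:
        prev, column = trellis[-1], {}
        for s in states:
            best_prev, best_val = None, 0.0
            for s_prev in states:
                val = (emit_p.get((s, word), 0.0)
                       * trans_p.get((s_prev, s), 0.0)
                       * prev[s_prev][0])
                if val > best_val:
                    best_prev, best_val = s_prev, val
            column[s] = (best_val, best_prev)
        trellis.append(column)
    # Termination: best final state, then follow backpointers to the start
    last = max(states, key=lambda s: trellis[-1][s][0])
    path = [last]
    for column in reversed(trellis[1:]):
        path.append(column[path[-1]][1])
    return list(reversed(path)), trellis[-1][last][0]

emit_p = {("DT", "the"): 0.5, ("V", "man"): 0.1, ("N", "man"): 0.3,
          ("V", "saw"): 0.2, ("N", "saw"): 0.2}
trans_p = {("DT", "DT"): 0.1, ("DT", "V"): 0.3, ("DT", "N"): 0.6,
           ("V", "DT"): 0.5,  ("V", "V"): 0.2,  ("V", "N"): 0.3,
           ("N", "DT"): 0.2,  ("N", "V"): 0.6,  ("N", "N"): 0.2}
start_p = {"DT": 0.6, "V": 0.3, "N": 0.1}

path, prob = viterbi(["the", "man", "saw", "the", "saw"],
                     ["DT", "V", "N"], start_p, trans_p, emit_p)
print(path)  # ['DT', 'N', 'V', 'DT', 'N']
print(prob)  # ≈ 1.94 * 10^-4
```

Note that real implementations sum log probabilities instead of multiplying raw probabilities, since the products underflow quickly on longer sequences.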
Backtracing starts at the best cell in the final column: state N with 1.94 * 10^-4, so the last "saw" is tagged N.
That cell's backpointer is DT, so "the" is tagged DT.
The DT cell for "the" points back to V, so the first "saw" is tagged V.
Following the backpointers to the start yields the most probable tag sequence:
The/DT man/N saw/V the/DT saw/N
Remember the complexity of the Viterbi algorithm?
Running time is O(m * s^2),
▪ where m is the length of the input and s is the number of states in the model.
Now we see why: for every token (m) we have to evaluate every POS tag (s) in combination with every possible predecessor POS tag (s), so we have m * s * s = m * s^2 operations.
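The operation count can be checked by instrumenting the loops. A sketch: the recursion scores s * s (state, predecessor) pairs for each of the m - 1 tokens after the first (initialization adds another s operations), which is O(m * s^2) overall:

```python
# Counting the core scoring operations of the Viterbi recursion.
m, s = 5, 3                            # 5 tokens, 3 POS states (DT, V, N)
operations = 0
for token in range(1, m):              # recursion over tokens 2..m
    for state in range(s):             # every candidate state ...
        for prev_state in range(s):    # ... against every predecessor
            operations += 1
print(operations)  # (m - 1) * s**2 = 36
```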
Summary
▪ Sequence Labeling:
▪ Input and output are sequences of signals
▪ Signals are not classified individually; instead, a joint classification minimizes an overall cost
▪ Hidden Markov Models
▪ Emissions can be observed
▪ States are hidden
▪ Goal: Find most probable state sequence for a given emission sequence
▪ Solve via Viterbi (dynamic programming)
Next Lecture
Information Retrieval
Introduction