Few-Shot Learning Tutorial
Zero-shot abilities of modern LLMs are truly inspiring and make us feel that AGI is pretty close.
However, they require large networks pre-trained on huge amounts of data. And still, it’s not
enough: to tackle actual business problems with acceptable accuracy, you need to fine-tune a
model specifically for your case. What makes a difference here is how few examples you need to
achieve reasonable results. In our team, we developed a zero-shot text classification model that,
with just 8 examples per label, can achieve up to 90% accuracy and beat huge LLMs fine-tuned
on thousands of examples. In this tutorial, we will show you how to achieve the same results
with our open-source zero-shot text classification model. First, let’s look at the libraries we will need:
datasets: unified interface for managing and accessing diverse machine learning datasets.
transformers: Hugging Face library offering pre-trained models and tools for natural
language processing tasks.
accelerate: library that enables the same PyTorch code to be run across any distributed
configuration by adding just four lines of code.
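If these libraries are not installed yet, they can be added with pip (setfit, evaluate, and scikit-learn are also used later in this tutorial):

pip install datasets transformers accelerate setfit evaluate scikit-learn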
Okay, now we need to download a dataset. We will use the “emotion” dataset, which contains 6
classes of emotions describing a text. Then we will split the dataset into test and train, and from
the train split we will randomly select 48 examples, on average 8 examples per label.
from datasets import load_dataset

# load the "emotion" dataset
emotion_dataset = load_dataset("dair-ai/emotion")
test_dataset = emotion_dataset['test']
classes = test_dataset.features["label"].names

# sample N examples per label from the train split (see the sketch below)
N = 8
train_dataset = get_train_dataset(emotion_dataset, N)
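The helper get_train_dataset is not defined in the snippet above; here is a minimal sketch of what it could look like, assuming it simply shuffles the train split and keeps up to N examples per label (the function body is illustrative, not the exact implementation):

from collections import defaultdict

def get_train_dataset(dataset_dict, N):
    # shuffle the train split and keep at most N examples per label
    train = dataset_dict['train'].shuffle(seed=42)
    ids_per_label = defaultdict(list)
    for idx, label in enumerate(train['label']):
        if len(ids_per_label[label]) < N:
            ids_per_label[label].append(idx)
    selected = [i for ids in ids_per_label.values() for i in ids]
    return train.select(selected)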
SetFit
Firstly, we will see what results we can achieve with SetFit, an alternative few-shot learning
approach that uses text embeddings for classification. SetFit is the latest breakthrough in this
field: an open-source framework for few-shot fine-tuning of Sentence Transformers. Its creators
claim that with just 8 labeled examples per class on the Customer Reviews (CR) sentiment
dataset, SetFit surpasses RoBERTa Large fine-tuned on the full training set of 3k examples.
Then, we’ll run the same task with our approach and compare the results (ours are much
better).
from setfit import SetFitModel, Trainer, TrainingArguments
from sklearn.metrics import classification_report

model = SetFitModel.from_pretrained("BAAI/bge-base-en-v1.5")

args = TrainingArguments(
    batch_size=32,
    num_epochs=1,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
)
trainer.train()

preds = model.predict(test_dataset['text'])
print(classification_report(test_dataset['label'], preds,
                            target_names=classes, digits=4))
SetFit results on emotion dataset
With SetFit, we got results that are even slightly worse than it demonstrates in a zero-shot
setting. One of the reasons is that the uniform label distribution in the training set does not
reflect the real distribution, and the SetFit approach usually requires more examples to
separate different classes in the embedding space. Our method is more universal, and
fine-tuning the model does not require training an additional classification head.
Comprehend-it method
Let’s try our approach now. First, you need to initialize the model and tokenizer:

from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = 'knowledgator/comprehend_it-base'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
Our approach is based on a text classification model that was trained to distinguish whether two
statements entail each other, contradict each other, or are neutral (a natural language inference setup).
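As a quick illustration, each candidate label is turned into a hypothesis and scored against the input text. The snippet below is a sketch: the hypothesis template is an assumption used for illustration, not necessarily the exact one used during training.

import torch

text = "i feel like i finally accomplished something great"
hypothesis = "This example is joy."  # assumed template: "This example is {}."

inputs = tokenizer(text, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
# probabilities over the model's three NLI classes;
# check model.config.id2label for their ordering
print(torch.softmax(logits, dim=-1))

To fine-tune the model on our few-shot examples, we first convert the classification dataset into such premise/hypothesis pairs: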
import evaluate
import numpy as np
from datasets import Dataset

accuracy = evaluate.load("accuracy")

# The function below turns a classification dataset into premise/hypothesis pairs.
# The function signature, hypothesis template, and new_dataset initialisation were
# missing from the original snippet and are reconstructed here as assumptions.
template = 'This example is {}.'

def prepare_nli_dataset(dataset):
    new_dataset = {'sources': [], 'targets': [], 'labels': []}
    texts = dataset['text']
    labels = dataset['label']
    # estimate the label distribution to sample negative classes proportionally
    label2count = {}
    for label in labels:
        if label not in label2count:
            label2count[label] = 1
        else:
            label2count[label] += 1
    count = len(labels)
    label2prob = {label: lc / count for label, lc in label2count.items()}
    unique_labels = list(label2prob)
    probs = list(label2prob.values())
    for text, label_id in zip(texts, labels):
        label = classes[label_id]
        # positive pairs with the true label
        for i in range(len(classes) - 1):
            new_dataset['sources'].append(text)
            new_dataset['targets'].append(template.format(label))
            new_dataset['labels'].append(1.)
        # negative pairs with randomly sampled wrong labels
        for i in range(len(classes) - 1):
            neg_class_ = label
            while neg_class_ == label:
                neg_lbl = np.random.choice(unique_labels, p=probs)
                neg_class_ = classes[neg_lbl]
            new_dataset['sources'].append(text)
            new_dataset['targets'].append(template.format(neg_class_))
            new_dataset['labels'].append(-1.)
    return Dataset.from_dict(new_dataset)

dataset = prepare_nli_dataset(train_dataset)
def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = np.argmax(predictions, axis=1)  # take the argmax over logits
    return accuracy.compute(predictions=predictions, references=labels)
def tokenize_and_align_label(example):
    hypothesis = example['targets']
    premise = example["sources"]
    # encode premise and hypothesis as a sentence pair
    # (the tokenizer call was omitted in the original snippet)
    tokenized_input = tokenizer(premise, hypothesis, truncation=True)
    label = example['labels']
    # map the soft labels from the prepared dataset to the model's class ids
    if label == 1.0:
        label = torch.tensor(1)
    elif label == 0.0:
        label = torch.tensor(2)
    else:
        label = torch.tensor(0)
    tokenized_input['label'] = label
    return tokenized_input
from transformers import TrainingArguments, Trainer, DataCollatorWithPadding

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)
# remove raw text columns so only the tokenized features and 'label' reach the Trainer
tokenized_dataset = dataset.map(tokenize_and_align_label, remove_columns=['sources', 'targets', 'labels'])
tokenized_dataset = tokenized_dataset.train_test_split(test_size=0.1)
training_args = TrainingArguments(
    output_dir='comprehendo',
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
    evaluation_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset['test'],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,
)
trainer.train()
trainer.save_model('comprehender')
To use our model for inference, we can utilise the Hugging Face pipeline for zero-shot
classification:
from transformers import pipeline

device = 0 if torch.cuda.is_available() else -1
classifier = pipeline("zero-shot-classification",
                      model='comprehender', tokenizer=tokenizer, device=device)
# classify each test text and take the top-ranked candidate label
label2idx = {label: idx for idx, label in enumerate(classes)}
preds = []
for text in test_dataset['text']:
    preds.append(label2idx[classifier(text, classes)['labels'][0]])
print(classification_report(test_dataset['label'], preds,
                            target_names=classes, digits=4))
We got impressive results, considering that not all labels were present in our few-shot training
set, and the micro F1 score is 8% higher than what our model achieves in the zero-shot setting.