Murenei - Natural Language Processing With Python and NLTK

This document provides a cheat sheet on natural language processing with Python and the nltk library. It covers topics like text handling, tokenization, part-of-speech tagging, parsing, named entity recognition, and using regular expressions with Pandas.

Uploaded by

Sony Asampalli


Natural Language Processing with Python & nltk Cheat Sheet

by RJ Murray (murenei) via cheatography.com/58736/cs/15485/

Handling Text

text='Some words'                 Assign string
list(text)                        Split text into character tokens
set(text)                         Unique character tokens
len(text)                         Number of characters

Part of Speech (POS) Tagging

nltk.help.upenn_tagset('MD')      Look up definition for a POS tag
nltk.pos_tag(words)               nltk in-built POS tagger
<use an alternative tagger to illustrate ambiguity>
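One way to read the "alternative tagger" note in the POS tagging entries is to tag each word without looking at its context. The sketch below assumes `nltk.tag.RegexpTagger` as that alternative (the cheat sheet does not name one), and the suffix rules and the sample sentence are hypothetical. Because each word is tagged in isolation, both uses of "refuse" receive the same tag, which is the ambiguity a context-aware tagger such as `nltk.pos_tag` is meant to resolve.

```python
from nltk.tag import RegexpTagger

# Sample sentence in which "refuse" and "permit" each occur as verb and noun.
words = "They refuse to permit us to obtain the refuse permit".split()

# Hypothetical rules, tried in order; the last one is a catch-all noun tag.
patterns = [
    (r"^(the|a|an)$", "DT"),
    (r"^to$", "TO"),
    (r".*ing$", "VBG"),
    (r".*ed$", "VBD"),
    (r".*s$", "NNS"),
    (r".*", "NN"),
]

regexp_tagger = RegexpTagger(patterns)
tagged = regexp_tagger.tag(words)

# Each word is tagged in isolation, so both "refuse" tokens get the same tag.
for word, tag in tagged:
    print(word, tag)
```

This tagger needs no downloaded model, which makes it convenient as a baseline to compare against `nltk.pos_tag(words)`.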

Accessing corpora and lexical resources

from nltk.corpus import brown             Import CorpusReader object
brown.words(text_id)                      Returns pretokenised document as list of words
brown.fileids()                           Lists docs in Brown corpus
brown.categories()                        Lists categories in Brown corpus

Sentence Parsing

g=nltk.data.load('grammar.cfg')           Load a grammar from a file
g=nltk.CFG.fromstring("""...""")          Manually define grammar
parser=nltk.ChartParser(g)                Create a parser out of the grammar
trees=parser.parse_all(text)
for tree in trees: print(tree)
from nltk.corpus import treebank
treebank.parsed_sents('wsj_0001.mrg')     Treebank parsed sentences

Tokenization

text.split(" ")                           Split by space
nltk.word_tokenize(text)                  nltk in-built word tokenizer
nltk.sent_tokenize(doc)                   nltk in-built sentence tokenizer
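The difference between splitting on spaces and using a tokenizer is easiest to see on text with punctuation. This is a minimal sketch that swaps in `nltk.tokenize.wordpunct_tokenize`, a purely regex-based tokenizer, in place of `nltk.word_tokenize`, because the latter typically needs a downloaded Punkt model. The sample sentence is an assumption.

```python
from nltk.tokenize import wordpunct_tokenize

text = "Hello, world! Tokenizers split punctuation."

# Splitting on spaces leaves punctuation attached to the words.
by_space = text.split(" ")

# A regex-based tokenizer separates runs of word characters from punctuation.
by_regex = wordpunct_tokenize(text)

print(by_space)  # ['Hello,', 'world!', 'Tokenizers', 'split', 'punctuation.']
print(by_regex)  # ['Hello', ',', 'world', '!', 'Tokenizers', 'split', 'punctuation', '.']
```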

Lemmatization & Stemming

input="List listed lists listing listings"    Different suffixes
words=input.lower().split(' ')                Normalize (lowercase) words
porter=nltk.PorterStemmer()                   Initialise Stemmer
[porter.stem(t) for t in words]               Create list of stems
WNL=nltk.WordNetLemmatizer()                  Initialise WordNet lemmatizer
[WNL.lemmatize(t) for t in words]             Use the lemmatizer

Text Classification

from sklearn.feature_extraction.text import CountVectorizer
vect=CountVectorizer().fit(X_train)           Fit bag of words
vect.get_feature_names_out()                  Get features (get_feature_names() was removed in scikit-learn 1.2)
vect.transform(X_train)                       Convert to document-term matrix

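The stemming entries under Lemmatization & Stemming run as a short script. This sketch covers only the Porter stemmer, since the WordNet lemmatizer additionally needs the WordNet corpus to be downloaded. The variable is named `sentence` here (an assumption) to avoid shadowing Python's built-in `input`.

```python
from nltk.stem import PorterStemmer

sentence = "List listed lists listing listings"

# Normalize to lowercase and split into words.
words = sentence.lower().split(" ")

# The stemmer is a class and must be instantiated before use.
porter = PorterStemmer()
stems = [porter.stem(t) for t in words]

print(stems)  # ['list', 'list', 'list', 'list', 'list']
```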
By RJ Murray (murenei), cheatography.com/murenei/, tutify.com.au
Published 28th May, 2018. Last updated 29th May, 2018.
Sponsored by Readable.com (https://readable.com)

Entity Recognition (Chunking/Chinking)

g="NP: {<DT>?<JJ>*<NN>}"                      Regex chunk grammar
cp=nltk.RegexpParser(g)                       Parse grammar
ch=cp.parse(pos_sent)                         Parse tagged sent. using grammar
print(ch)                                     Show chunks
ch.draw()                                     Show chunks in IOB tree
cp.evaluate(test_sents)                       Evaluate against test doc
sents=nltk.corpus.treebank.tagged_sents()
print(nltk.ne_chunk(sent))                    Print chunk tree
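The chunking entries can be tried without a tagger model by supplying a hand-tagged sentence. In the sketch below the tagged sentence is an assumption, and the grammar is the noun-phrase pattern from the entries above: an optional determiner, any number of adjectives, then a noun.

```python
import nltk

# Hand-tagged sentence (hypothetical) so that no tagger download is needed.
pos_sent = [("the", "DT"), ("little", "JJ"), ("dog", "NN"),
            ("barked", "VBD"), ("at", "IN"), ("the", "DT"), ("cat", "NN")]

# NP chunk: optional determiner, any adjectives, then a noun.
g = "NP: {<DT>?<JJ>*<NN>}"
cp = nltk.RegexpParser(g)
ch = cp.parse(pos_sent)

# Collect the words of every NP subtree in the chunk tree.
noun_phrases = [" ".join(word for word, tag in subtree.leaves())
                for subtree in ch.subtrees()
                if subtree.label() == "NP"]

print(ch)
print(noun_phrases)  # ['the little dog', 'the cat']
```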

RegEx with Pandas & Named Groups

df=pd.DataFrame(time_sents, columns=['text'])
df['text'].str.split().str.len()
df['text'].str.contains('word')
df['text'].str.count(r'\d')
df['text'].str.findall(r'\d')
df['text'].str.replace(r'\w+day\b', '???', regex=True)
df['text'].str.replace(r'(\w)', lambda x: x.groups()[0][:3], regex=True)
df['text'].str.extract(r'(\d?\d):(\d\d)')
df['text'].str.extractall(r'((\d?\d):(\d\d) ?([ap]m))')
df['text'].str.extractall(r'(?P<digits>\d)')
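A short run of the pandas string methods on sample data shows what each call returns. The two sentences in `time_sents` and the group names `time`, `hour`, `minute` and `period` are assumptions for illustration. `regex=True` is passed explicitly because recent pandas versions treat the pattern in `str.replace` as a literal string by default.

```python
import pandas as pd

# Hypothetical sample data.
time_sents = ["Monday: The doctor's appointment is at 2:45pm.",
              "Tuesday: The dentist's appointment is at 11:30 am."]
df = pd.DataFrame(time_sents, columns=["text"])

# Count digits per row.
digit_counts = df["text"].str.count(r"\d")

# Replace weekday names; the pattern is only a regex when regex=True.
replaced = df["text"].str.replace(r"\w+day\b", "???", regex=True)

# extract returns one column per capture group (first match only).
times = df["text"].str.extract(r"(\d?\d):(\d\d)")

# Named groups become column names; extractall returns every match,
# indexed by (row, match number).
named = df["text"].str.extractall(
    r"(?P<time>(?P<hour>\d?\d):(?P<minute>\d\d) ?(?P<period>[ap]m))"
)

print(digit_counts.tolist())  # [3, 4]
print(replaced[0])
print(times)
print(named)
```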

