POS TAGGING
AIM:
To perform Part-of-Speech (POS) Tagging on a given text using Natural Language Processing (NLP)
in Python with the NLTK library.
REQUIREMENTS:
Python (version 3.x recommended)
NLTK library (pip install nltk)
PROCEDURE:
1. Install Required Libraries:
o Ensure that the NLTK library is installed using pip install nltk.
2. Import Necessary Modules:
o Import nltk for natural language processing tasks.
3. Download Required NLTK Data:
o Use nltk.download('punkt_tab') for tokenization.
o Use nltk.download('averaged_perceptron_tagger_eng') for POS tagging.
4. Input Sample Text:
o Define a sample sentence for testing.
5. Tokenize the Text:
o Use nltk.word_tokenize(text) to split the text into words.
6. Perform POS Tagging:
o Use nltk.pos_tag(words) to assign part-of-speech tags to each word.
7. Display the Tagged Output:
o Print the words along with their respective POS tags.
PROGRAM:
import nltk
nltk.download('punkt_tab')
nltk.download('averaged_perceptron_tagger_eng')
# Sample text
text = "John is learning Natural Language Processing."
# Tokenization
words = nltk.word_tokenize(text)
# POS Tagging
pos_tags = nltk.pos_tag(words)
# Display POS tags
print("POS Tagged Output:")
for word, tag in pos_tags:
    print(f"{word} -> {tag}")
POS Tagged Output:
John -> NNP
is -> VBZ
learning -> VBG
Natural -> NNP
Language -> NNP
Processing -> NNP
. -> .
(NNP = proper noun; VBZ = verb, 3rd person singular present; VBG = verb, gerund/present participle; etc.)
RESULT:
Thus the program to perform Part-of-Speech (POS) Tagging on a given text using Natural Language
Processing (NLP) in Python with the NLTK library was successfully executed and the output verified.
STEMMING AND CHUNKING IN NLP
AIM:
To implement Stemming and Chunking in Natural Language Processing (NLP) using Python and the
NLTK library.
PROCEDURE:
Step 1: Install and Import Required Libraries
Step 2: Download Required Resources
Step 3: Initialize the Stemmer
Step 4: Define a sentence that includes verbs, nouns, and named entities for both stemming and
chunking.
Step 5: Convert the text into individual words using word_tokenize().
Step 6: Apply stemming to each word in the tokenized text using PorterStemmer.
Step 7: Apply Chunking (Named Entity Recognition - NER).
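The stemming in Step 6 depends on which stemmer is chosen: NLTK ships several, and they can disagree on the same word. A small comparison sketch (needs no downloaded resources; the word list is illustrative):

```python
from nltk.stem import PorterStemmer, LancasterStemmer

porter = PorterStemmer()
lancaster = LancasterStemmer()

# Porter is conservative; Lancaster is more aggressive,
# so the two stemmers can produce different stems for the same word.
for word in ["running", "headquartered", "happily", "organization"]:
    print(f"{word}: porter={porter.stem(word)}, lancaster={lancaster.stem(word)}")
```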
PROGRAM:
import nltk
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize
from nltk import pos_tag, ne_chunk
# Download required resources
nltk.download('punkt_tab')
nltk.download('averaged_perceptron_tagger_eng')  # needed by pos_tag()
nltk.download('maxent_ne_chunker')
nltk.download('words')
nltk.download('maxent_ne_chunker_tab')
# Initialize Stemmer
stemmer = PorterStemmer()
# Sample text for both Stemming and Chunking
text = "Elon Musk is running Tesla, which is headquartered in California."
# Tokenization
words = word_tokenize(text)
# **STEMMING PROCESS**
stemmed_words = [stemmer.stem(word) for word in words]
print("\n--- Stemming Output ---")
for word, stemmed in zip(words, stemmed_words):
    print(f"{word} -> {stemmed}")
# **CHUNKING PROCESS**
# Perform POS tagging
pos_tags = pos_tag(words)
# Apply Named Entity Recognition (NER) Chunking
chunked_output = ne_chunk(pos_tags)
print("\n--- Chunking Output ---")
print(chunked_output)
Output:
--- Stemming Output ---
Elon -> elon
Musk -> musk
is -> is
running -> run
Tesla -> tesla
, -> ,
which -> which
is -> is
headquartered -> headquart
in -> in
California -> california
. -> .
--- Chunking Output ---
(S
(PERSON Elon/NNP)
(PERSON Musk/NNP)
is/VBZ
running/VBG
(ORGANIZATION Tesla/NNP)
,/,
which/WDT
is/VBZ
headquartered/VBN
in/IN
(GPE California/NNP)
./.)
RESULT:
Thus the program to implement Stemming and Chunking in Natural Language Processing (NLP)
using Python and the NLTK library was successfully implemented and the output verified.
MORPHOLOGICAL ANALYSIS IN NLP
AIM:
To perform Morphological Analysis using Python and the NLTK library to analyze the structure of
words, including root words, prefixes, and suffixes.
PROCEDURE
1. Install and Import Libraries
o Install nltk using pip install nltk.
o Import necessary modules like PorterStemmer, WordNetLemmatizer, and
word_tokenize.
2. Tokenization
o Split the given text into words using word_tokenize().
3. Stemming and Lemmatization
o Apply Stemming using PorterStemmer to get root forms.
o Apply Lemmatization using WordNetLemmatizer for meaningful base forms.
4. Display the Results
o Print the original words along with their stemmed and lemmatized versions.
PROGRAM:
import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer, WordNetLemmatizer
# Download necessary resources
nltk.download('punkt_tab')
nltk.download('wordnet')
# Initialize Stemmer and Lemmatizer
stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()
# Sample text
text = "The cats are running faster than the mice. The children played happily."
# Tokenization
words = word_tokenize(text)
# Apply Stemming and Lemmatization
print("\n--- Morphological Analysis Output ---")
print("Word\t\tStemming\tLemmatization")
print("-" * 40)
for word in words:
    stemmed_word = stemmer.stem(word)
    lemmatized_word = lemmatizer.lemmatize(word)
    print(f"{word}\t\t{stemmed_word}\t\t{lemmatized_word}")
Output:
--- Morphological Analysis Output ---
Word Stemming Lemmatization
----------------------------------------
The the the
cats cat cat
are are are
running run running
faster faster faster
than than than
the the the
mice mice mouse
. . .
The the the
children child child
played play played
happily happili happily
RESULT:
Thus the program to perform Morphological Analysis using Python and the NLTK library to analyze the
structure of words, including root words, prefixes, and suffixes, was successfully implemented and the
output verified.