
NLP Lab

The document outlines procedures for performing Part-of-Speech (POS) tagging, stemming, chunking, and morphological analysis using Python's NLTK library. It includes installation instructions, sample code, and expected outputs for each NLP task. Each section demonstrates how to process text data and analyze its structure and components effectively.

POS TAGGING

AIM:

To perform Part-of-Speech (POS) tagging on a given text using Natural Language Processing (NLP) in Python with the NLTK library.

REQUIREMENTS:

• Python (version 3.x recommended)

• NLTK library (pip install nltk)

PROCEDURE:

1. Install Required Libraries:

o Ensure that the NLTK library is installed using pip install nltk.

2. Import Necessary Modules:

o Import nltk for natural language processing tasks.

3. Download Required NLTK Data:

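o Use nltk.download('punkt_tab') for tokenization (older NLTK releases use 'punkt' instead).

o Use nltk.download('averaged_perceptron_tagger_eng') for POS tagging (older NLTK releases use 'averaged_perceptron_tagger' instead).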

4. Input Sample Text:

o Define a sample sentence for testing.

5. Tokenize the Text:

o Use nltk.word_tokenize(text) to split the text into words.

6. Perform POS Tagging:

o Use nltk.pos_tag(words) to assign part-of-speech tags to each word.

7. Display the Tagged Output:

o Print the words along with their respective POS tags.

PROGRAM:

import nltk

nltk.download('punkt_tab')
nltk.download('averaged_perceptron_tagger_eng')

# Sample text

text = "John is learning Natural Language Processing."

# Tokenization

words = nltk.word_tokenize(text)

# POS Tagging

pos_tags = nltk.pos_tag(words)

# Display POS tags

print("POS Tagged Output:")

for word, tag in pos_tags:
    print(f"{word} -> {tag}")

Output:

POS Tagged Output:

John -> NNP

is -> VBZ

learning -> VBG

Natural -> NNP

Language -> NNP

Processing -> NNP

. -> .

(NNP = proper noun, singular; VBZ = verb, third-person singular present; VBG = verb, gerund or present participle.)
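
The tag legend can also be kept as a small lookup table. The sketch below is only an illustration: it hard-codes the tag sequence from the output above instead of calling NLTK, and the names TAG_MEANINGS and describe are hypothetical helpers introduced for this example. The descriptions follow the Penn Treebank tag set.

```python
from collections import Counter

# Penn Treebank tags that appear in the output above, with their usual meanings.
TAG_MEANINGS = {
    "NNP": "proper noun, singular",
    "VBZ": "verb, third-person singular present",
    "VBG": "verb, gerund or present participle",
    ".": "sentence-final punctuation",
}

# Tag sequence copied from the output above (hard-coded; NLTK is not called here).
tags = ["NNP", "VBZ", "VBG", "NNP", "NNP", "NNP", "."]


def describe(tag):
    """Return a readable description, or a fallback for tags not in the table."""
    return TAG_MEANINGS.get(tag, "unknown tag")


# Count how often each tag occurs and print it with its description.
tag_counts = Counter(tags)
for tag, count in tag_counts.most_common():
    print(f"{tag:<4} x{count}  {describe(tag)}")
```

A table like this only covers the tags listed in it; any other tag falls through to the "unknown tag" fallback.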

RESULT:
Thus, Part-of-Speech (POS) tagging was performed on a given text using Natural Language Processing (NLP) in Python with the NLTK library, and the output was verified.
STEMMING AND CHUNKING IN NLP

AIM:

To implement stemming and chunking in Natural Language Processing (NLP) using Python and the NLTK library.

PROCEDURE:

Step 1: Install and Import Required Libraries

Step 2: Download Required Resources

Step 3: Initialize the Stemmer

Step 4: Define a sentence that includes verbs, nouns, and named entities for both stemming and chunking.

Step 5: Convert the text into individual words using word_tokenize().

Step 6: Apply stemming to each word in the tokenized text using PorterStemmer.

Step 7: POS-tag the tokens with pos_tag(), then apply chunking (Named Entity Recognition, NER) with ne_chunk().

PROGRAM:

import nltk

from nltk.stem import PorterStemmer

from nltk.tokenize import word_tokenize

from nltk import pos_tag, ne_chunk

# Download required resources

nltk.download('punkt_tab')

nltk.download('averaged_perceptron_tagger_eng')  # needed by pos_tag() below

nltk.download('maxent_ne_chunker')

nltk.download('words')

nltk.download('maxent_ne_chunker_tab')

# Initialize Stemmer

stemmer = PorterStemmer()

# Sample text for both Stemming and Chunking

text = "Elon Musk is running Tesla, which is headquartered in California."

# Tokenization
words = word_tokenize(text)

# **STEMMING PROCESS**

stemmed_words = [stemmer.stem(word) for word in words]

print("\n--- Stemming Output ---")

for word, stemmed in zip(words, stemmed_words):
    print(f"{word} -> {stemmed}")

# **CHUNKING PROCESS**

# Perform POS tagging

pos_tags = pos_tag(words)

# Apply Named Entity Recognition (NER) Chunking

chunked_output = ne_chunk(pos_tags)

print("\n--- Chunking Output ---")

print(chunked_output)

Output:

--- Stemming Output ---
Elon -> elon
Musk -> musk
is -> is
running -> run
Tesla -> tesla
, -> ,
which -> which
is -> is
headquartered -> headquart
in -> in
California -> california
. -> .

--- Chunking Output ---

(S
  (PERSON Elon/NNP)
  (PERSON Musk/NNP)
  is/VBZ
  running/VBG
  (ORGANIZATION Tesla/NNP)
  ,/,
  which/WDT
  is/VBZ
  headquartered/VBN
  in/IN
  (GPE California/NNP)
  ./.)
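
To list only the named entities, the labelled chunks can be pulled out of the printed tree. The sketch below is a minimal illustration that works on the text of the output above, hard-coded as a string, rather than on a live NLTK tree; ENTITY_PATTERN and extract_entities are hypothetical names introduced here. When the tree object itself is available, walking its subtrees is usually the more robust route.

```python
import re

# Printed chunk tree copied from the output above (plain text, not a live NLTK Tree).
chunk_text = """(S
  (PERSON Elon/NNP)
  (PERSON Musk/NNP)
  is/VBZ
  running/VBG
  (ORGANIZATION Tesla/NNP)
  ,/,
  which/WDT
  is/VBZ
  headquartered/VBN
  in/IN
  (GPE California/NNP)
  ./.)"""

# An entity chunk looks like "(LABEL word/TAG word/TAG ...)" with no nested parentheses.
ENTITY_PATTERN = re.compile(r"\(([A-Z]+) ([^()]+)\)")


def extract_entities(text):
    """Return (label, phrase) pairs for every labelled chunk in the printed tree."""
    entities = []
    for label, body in ENTITY_PATTERN.findall(text):
        # Drop the "/TAG" part of each token and rejoin the words.
        words = [token.rsplit("/", 1)[0] for token in body.split()]
        entities.append((label, " ".join(words)))
    return entities


entities = extract_entities(chunk_text)
for label, phrase in entities:
    print(f"{label}: {phrase}")
```

The pattern skips the outer (S ...) node because it only matches chunks that contain no nested parentheses.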

RESULT:
Thus, stemming and chunking in Natural Language Processing (NLP) were implemented using Python and the NLTK library, and the output was verified.

MORPHOLOGICAL ANALYSIS IN NLP


AIM:

To perform Morphological Analysis using Python and the NLTK library to analyze the structure of
words, including root words, prefixes, and suffixes.

PROCEDURE:

1. Install and Import Libraries

o Install nltk using pip install nltk.

o Import the necessary modules: PorterStemmer, WordNetLemmatizer, and word_tokenize.

2. Tokenization
o Split the given text into words using word_tokenize().

3. Stemming and Lemmatization

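o Apply stemming using PorterStemmer to strip suffixes; the resulting stems are not always dictionary words.

o Apply lemmatization using WordNetLemmatizer to get dictionary base forms. lemmatize() treats each word as a noun unless a pos argument is given, so verbs such as "running" are left unchanged by default.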

4. Display the Results

o Print the original words along with their stemmed and lemmatized versions.

PROGRAM:

import nltk

from nltk.tokenize import word_tokenize

from nltk.stem import PorterStemmer, WordNetLemmatizer

# Download necessary resources

nltk.download('punkt_tab')

nltk.download('wordnet')

# Initialize Stemmer and Lemmatizer

stemmer = PorterStemmer()

lemmatizer = WordNetLemmatizer()

# Sample text

text = "The cats are running faster than the mice. The children played happily."

# Tokenization

words = word_tokenize(text)

# Apply Stemming and Lemmatization

print("\n--- Morphological Analysis Output ---")

print("Word\t\tStemming\tLemmatization")

print("-" * 40)

for word in words:
    stemmed_word = stemmer.stem(word)
    lemmatized_word = lemmatizer.lemmatize(word)
    print(f"{word}\t\t{stemmed_word}\t\t{lemmatized_word}")

Output:

--- Morphological Analysis Output ---

Word        Stemming    Lemmatization
----------------------------------------
The         the         The
cats        cat         cat
are         are         are
running     run         running
faster      faster      faster
than        than        than
the         the         the
mice        mice        mouse
.           .           .
The         the         The
children    children    child
played      play        played
happily     happili     happily
.           .           .
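
The difference between the two columns comes from how each form is produced: a stemmer applies suffix rules, while a lemmatizer looks words up in a dictionary. The sketch below is a toy illustration of that contrast only. It is not the Porter algorithm and does not use WordNet; SUFFIXES, LEMMA_EXCEPTIONS, toy_stem, and toy_lemma are hypothetical names with deliberately tiny rule sets.

```python
# Toy illustration only: this is NOT the Porter algorithm and NOT WordNet.
SUFFIXES = ["ing", "ed", "ly", "s"]  # hypothetical, tiny rule list
LEMMA_EXCEPTIONS = {"mice": "mouse", "children": "child"}  # hypothetical lookup table


def toy_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def toy_lemma(word):
    """Use the exception table first; otherwise strip only a plural 's'."""
    lower = word.lower()
    if lower in LEMMA_EXCEPTIONS:
        return LEMMA_EXCEPTIONS[lower]
    if lower.endswith("s") and len(lower) > 3:
        return lower[:-1]
    return lower


# Print each word with its rule-based stem and its lookup-based lemma.
for word in ["cats", "mice", "children", "played", "happily"]:
    print(f"{word:<10} {toy_stem(word):<10} {toy_lemma(word)}")
```

Rule-based stripping cannot recover irregular forms such as "mice", and it can produce non-words such as "happi", whereas the lookup table handles irregular plurals but only for words it already contains.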

RESULT:
Thus, morphological analysis was implemented using Python and the NLTK library to analyze the structure of words, including root words, prefixes, and suffixes, and the output was verified.
