Mapping Words To Properties Using Python Dictionaries

The document discusses the use of Python dictionaries for mapping words to part-of-speech tags, highlighting their efficiency in storing and retrieving associations. It explains how to create, update, and utilize dictionaries, including the use of default dictionaries for handling non-existent keys. Additionally, it covers advanced topics such as inverting dictionaries for reverse lookups and provides examples of practical applications in language processing tasks.

Uploaded by

Karthik S


Mapping Words to Properties Using Python Dictionaries

S. Karthik, Assistant Professor,
Department of Information Technology,
Sri Ramakrishna College of Arts & Science, Coimbatore

▪ A tagged word of the form (word, tag) is an association between a word and a part-of-speech tag.

▪ Once we start doing part-of-speech tagging, we will be creating programs that assign a tag to a word, the tag which is most likely in a given context.

▪ We can think of this process as mapping from words to tags.

▪ The most natural way to store mappings in Python uses the so-called dictionary data type.

➢ Indexing Lists Versus Dictionaries

▪ A text is treated in Python as a list of words.

▪ An important property of lists is that we can “look up” a particular item by giving its index, e.g., text1[100].

▪ We specify a number and get back a word.

▪ We can think of a list as a simple kind of table, as shown in the figure below.

▪ Contrast this with frequency distributions, where we specify a word and get back a number, e.g., fdist['monstrous'], which tells us the number of times a given word has occurred in a text.

▪ Lookup using words is familiar to anyone who has used a dictionary. Some more examples are shown in the figure below.

▪ In the case of a phonebook, we look up an entry using a name and get back a number.

▪ When we type a domain name in a web browser, the computer looks this up to get back an IP address.

▪ A word frequency table allows us to look up a word and find its frequency in a text collection.

▪ In all these cases, we are mapping from names to numbers, rather than the other way around as with a list.

▪ In general, we would like to be able to map between arbitrary types of information.
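
The contrast between the two kinds of lookup can be sketched as follows. The variable names and all of the values are made up for illustration and do not come from any real corpus.

```python
# A list maps integer positions to items: we give a number and get back a word.
words = ['Call', 'me', 'Ishmael', '.']
third_word = words[2]

# A dictionary maps arbitrary keys (here, strings) to values.
phonebook = {'alice': '555-0100', 'bob': '555-0199'}   # hypothetical entries
freq = {'monstrous': 10, 'whale': 906}                  # hypothetical counts

number = phonebook['alice']      # we give a name and get back a number
count = freq['monstrous']        # we give a word and get back a frequency
print(third_word, number, count)
```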

▪ The table below lists a variety of linguistic objects, along with what they map.

▪ Most often, we are mapping from a “word” to some structured object.

▪ For example, a document index maps from a word to a list of pages.

➢ Dictionaries in Python

▪ Python provides a dictionary data type that can be used for mapping between arbitrary types.

▪ It is like a conventional dictionary, in that it gives you an efficient way to look things up.

▪ However, Python dictionaries have a much wider range of uses.

▪ To illustrate, we define pos to be an empty dictionary and then add four entries to it, specifying the part-of-speech of some words.

▪ We add entries to a dictionary using the familiar square bracket notation.

➢ Example.
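
A minimal sketch of the steps described on the previous slide. The words ideas and furiously and their tags are assumed here for illustration; only colorless and sleep are named in the slide text.

```python
pos = {}                      # start with an empty dictionary
pos['colorless'] = 'ADJ'      # the key 'colorless' is assigned the value 'ADJ'
pos['ideas'] = 'N'            # assumed entry
pos['sleep'] = 'V'
pos['furiously'] = 'ADV'      # assumed entry
print(pos)                    # shows the key-value pairs
```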

▪ So, for example, one entry says that the part-of-speech of colorless is adjective, or more specifically, that the key 'colorless' is assigned the value 'ADJ' in the dictionary pos.

▪ When we inspect the value of pos we see a set of key-value pairs.

▪ Once we have populated the dictionary in this way, we can employ the keys to retrieve values.

▪ We might accidentally use a key that hasn’t been assigned a value, in which case Python raises a KeyError.
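
A short sketch of retrieval and of what happens with an unassigned key. The dictionary contents and the fallback string 'UNKNOWN' are assumptions chosen for illustration.

```python
pos = {'colorless': 'ADJ', 'ideas': 'N', 'sleep': 'V', 'furiously': 'ADV'}

tag = pos['ideas']            # retrieve a value by its key
print(tag)

# 'green' has not been assigned a value, so indexing raises KeyError.
try:
    pos['green']
    missing_key_error = False
except KeyError:
    missing_key_error = True
print(missing_key_error)

# Two ways to avoid the exception: test membership, or use get() with a fallback.
has_green = 'green' in pos
fallback = pos.get('green', 'UNKNOWN')    # 'UNKNOWN' is an arbitrary fallback
```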

▪ Unlike lists and strings, where we can use len() to work out which integers will be legal indexes, how do we work out the legal keys for a dictionary?

▪ If the dictionary is not too big, we can simply inspect its contents by evaluating the variable pos; this gives us the key-value pairs.

▪ In versions of Python before 3.7, the pairs may not appear in the order in which they were entered; this is because dictionaries are not sequences but mappings, and the keys are not inherently ordered. From Python 3.7 onward, dictionaries preserve insertion order, although entries still cannot be accessed by position.

▪ Alternatively, to just find the keys, we can either convert the dictionary to a list or use the dictionary in a context where a list is expected, such as the argument of sorted() or in a for loop.
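
The three options can be sketched like this, assuming the same illustrative pos dictionary as before:

```python
pos = {'colorless': 'ADJ', 'ideas': 'N', 'sleep': 'V', 'furiously': 'ADV'}

keys_as_list = list(pos)      # converting the dictionary to a list gives its keys
keys_sorted = sorted(pos)     # sorted() also sees only the keys

# Iterating over a dictionary (here in a list comprehension) yields its keys.
ending_in_s = [word for word in pos if word.endswith('s')]
print(keys_as_list, keys_sorted, ending_in_s)
```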

▪ As well as iterating over all keys in the dictionary with a for loop, we can use the for loop as we did for printing lists.
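
A small sketch of printing each entry with a for loop. The dictionary contents are illustrative, and the list named lines is only there so the output can be checked.

```python
pos = {'colorless': 'ADJ', 'ideas': 'N', 'sleep': 'V', 'furiously': 'ADV'}

lines = []
for word in sorted(pos):              # loop over the keys in alphabetical order
    line = word + ': ' + pos[word]    # look up the value for each key
    lines.append(line)
    print(line)
```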

▪ Finally, the dictionary methods keys(), values(), and items() allow us to access the keys, values, and key-value pairs separately. In Python 3 these return view objects, which can be passed to list() when an actual list is needed.

▪ We can even sort tuples, which orders them according to their first element.
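
A sketch of the three methods and of sorting the resulting tuples, again with an illustrative pos dictionary:

```python
pos = {'colorless': 'ADJ', 'ideas': 'N', 'sleep': 'V', 'furiously': 'ADV'}

keys = list(pos.keys())        # view of the keys, converted to a list
values = list(pos.values())    # view of the values, converted to a list
pairs = list(pos.items())      # list of (key, value) tuples

# Sorting tuples compares their first elements, so this orders by word.
sorted_pairs = sorted(pos.items())
for key, val in sorted_pairs:
    print(key + ':', val)
```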

▪ We want to be sure that when we look something up in a dictionary, we get only one value for each key.

▪ Now suppose we try to use a dictionary to store the fact that the word sleep can be used as both a verb and a noun.

▪ Initially, pos['sleep'] is given the value 'V'.

▪ But this is immediately overwritten with the new value, 'N'.

▪ In other words, there can be only one entry in the dictionary for 'sleep'.

▪ However, there is a way of storing multiple values in that entry: we use a list value, e.g., pos['sleep'] = ['N', 'V'].
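
The sequence just described can be traced step by step. The variable names single and both are only there to record the intermediate values.

```python
pos = {}
pos['sleep'] = 'V'            # first assignment
pos['sleep'] = 'N'            # overwrites 'V'; a key has only one entry
single = pos['sleep']

pos['sleep'] = ['N', 'V']     # a list value lets one entry hold both tags
both = pos['sleep']
print(single, both)
```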

➢ Defining Dictionaries

▪ We can use the same key-value pair format to create a dictionary.

▪ There are a couple of ways to do this, and we will normally use the first.

▪ The dictionary keys must be immutable types, such as strings and tuples. If we try to define a dictionary using a mutable key, we get a TypeError.
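
A sketch of two ways to define a dictionary, assuming they are the literal form and the dict() constructor, followed by the TypeError for a mutable key. The tuple-keyed entry is a hypothetical example.

```python
# First way: a literal with key: value pairs.
pos = {'colorless': 'ADJ', 'ideas': 'N', 'sleep': 'V', 'furiously': 'ADV'}

# Second way: keyword arguments to dict() (keys must be valid identifiers).
pos_alt = dict(colorless='ADJ', ideas='N', sleep='V', furiously='ADV')
same = (pos == pos_alt)

# Tuples are immutable, so they are acceptable keys.
pair_tags = {('the', 'dog'): ('DET', 'N')}    # hypothetical entry

# Lists are mutable, so using one as a key raises TypeError.
try:
    bad = {['ideas', 'idea']: 'N'}
    error_message = None
except TypeError as err:
    error_message = str(err)
print(same, error_message)
```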

➢ Default Dictionaries

▪ If we try to access a key that is not in a dictionary, we get a KeyError.

▪ However, it’s often useful if a dictionary can automatically create an entry for this new key and give it a default value, such as zero or the empty list.

▪ Since Python 2.5, a special kind of dictionary called a defaultdict has been available in the collections module.

▪ In order to use it, we have to supply a parameter which can be used to create the default value, e.g., int, float, str, list, dict, tuple.

➢ Example.
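
A minimal sketch of defaultdict with int and list as the default factories. The words, counts, and tags are illustrative assumptions.

```python
from collections import defaultdict

frequency = defaultdict(int)       # missing keys default to int(), i.e. 0
frequency['colorless'] = 4
ideas_count = frequency['ideas']   # never assigned, so an entry with 0 is created

pos = defaultdict(list)            # missing keys default to list(), i.e. []
pos['sleep'] = ['NOUN', 'VERB']
ideas_tags = pos['ideas']          # never assigned, so an empty list is created
print(ideas_count, ideas_tags)
```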

▪ The preceding examples specified the default value of a dictionary entry to be the default value of a particular data type.

▪ However, we can specify any default value we like, simply by providing the name of a function that can be called with no arguments to create the required value.

▪ Let’s return to our part-of-speech example, and create a dictionary whose default value for any entry is 'N'.

▪ When we access a non-existent entry, it is automatically added to the dictionary.
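
One way to sketch this uses a lambda expression as the zero-argument function; a named function would work equally well. The word 'blog' is just an arbitrary unseen key.

```python
from collections import defaultdict

pos = defaultdict(lambda: 'N')     # any missing key gets the value 'N'
pos['colorless'] = 'ADJ'

blog_tag = pos['blog']             # not assigned above, so 'N' is returned...
entries = sorted(pos.items())      # ...and the entry is added to the dictionary
print(blog_tag, entries)
```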

▪ Let’s see how default dictionaries could be used in a more substantial language processing task.

▪ Many language processing tasks, including tagging, struggle to correctly process the hapaxes of a text (words that occur only once).

▪ They can perform better with a fixed vocabulary and a guarantee that no new words will appear.

▪ We can preprocess a text to replace low-frequency words with a special “out of vocabulary” token, UNK, with the help of a default dictionary.

▪ We need to create a default dictionary that maps each word to its replacement.

▪ The most frequent n words will be mapped to themselves.

▪ Everything else will be mapped to UNK.

➢ Example.
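
A self-contained sketch of the UNK preprocessing described above. It substitutes a toy sentence and collections.Counter for a real corpus and frequency distribution, and n = 3 is an arbitrary choice; none of these come from the slides.

```python
from collections import Counter, defaultdict

# Toy text standing in for a real corpus (an assumption for illustration).
text = 'the cat sat on the mat and the dog sat on the log'.split()
n = 3                                   # keep only the n most frequent words

vocab = Counter(text)
top_n = [word for word, _ in vocab.most_common(n)]

mapping = defaultdict(lambda: 'UNK')    # everything else maps to 'UNK'
for word in top_n:
    mapping[word] = word                # frequent words map to themselves

text2 = [mapping[word] for word in text]
print(text2)
```

After this step the vocabulary of text2 is fixed: it contains the n kept words plus UNK.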

➢ Incrementally Updating a Dictionary

▪ We can employ dictionaries to count occurrences, emulating the method for tallying words.

▪ We begin by initializing an empty defaultdict, then process each part-of-speech tag in the text.

▪ If the tag hasn’t been seen before, it will have a zero count by default.

▪ Each time we encounter a tag, we increment its count using the += operator.

➢ Example.
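
A sketch of tag counting followed by sorting on the counts. The tagged words are a hand-made toy list, not real corpus data, and the tag names are assumptions.

```python
from collections import defaultdict
from operator import itemgetter

# Hand-made (word, tag) pairs standing in for a tagged corpus.
tagged = [('the', 'DET'), ('dog', 'NOUN'), ('chased', 'VERB'), ('the', 'DET'),
          ('cat', 'NOUN'), ('and', 'CONJ'), ('the', 'DET'), ('cat', 'NOUN'),
          ('chased', 'VERB'), ('mice', 'NOUN')]

counts = defaultdict(int)
for word, tag in tagged:
    counts[tag] += 1          # unseen tags start at zero by default

# Sort the (tag, count) pairs by count, largest first.
ranked = sorted(counts.items(), key=itemgetter(1), reverse=True)
tags_by_frequency = [tag for tag, count in ranked]
print(ranked)
```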

▪ The listing in the previous example illustrates an important idiom for sorting a dictionary by its values, to show words in decreasing order of frequency.

▪ The first parameter of sorted() is the items to sort, which is a list of tuples consisting of a POS tag and a frequency.

▪ The second parameter specifies the sort key using a function itemgetter().

▪ In general, itemgetter(n) returns a function that can be called on some other sequence object to obtain the element at index n.

▪ The last parameter of sorted() specifies that the items should be returned in reverse order, i.e., decreasing values of frequency.

▪ There’s a second useful programming idiom at the beginning of the previous example, where we initialize a defaultdict and then use a for loop to update its values. Here’s a schematic version.
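
One concrete reading of that schematic pattern, with each step labeled. Grouping words by length is an arbitrary choice made for illustration, as are the names sequence, by_length, and item_key.

```python
from collections import defaultdict

sequence = ['a', 'tagged', 'word', 'is', 'an', 'association']   # any iterable

by_length = defaultdict(list)           # 1. supply a function that creates the default
for item in sequence:                   # 2. loop over the items
    item_key = len(item)                # 3. derive a key from the item
    by_length[item_key].append(item)    # 4. update the entry for that key
print(dict(by_length))
```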

▪ Here’s another instance of this pattern, where we index words according to their last two letters.
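
A sketch of this index over a small hand-picked word list (an assumption; a real application would loop over a full word list):

```python
from collections import defaultdict

words = ['truly', 'daily', 'apply', 'cooking', 'singing', 'tagging']

last_letters = defaultdict(list)
for word in words:
    key = word[-2:]                   # the last two letters of the word
    last_letters[key].append(word)
print(last_letters['ly'], last_letters['ng'])
```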

▪ The following example uses the same pattern to create an anagram dictionary.

▪ Since accumulating words like this is such a common task, NLTK provides a more convenient way of creating a defaultdict(list), in the form of nltk.Index().
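
A sketch of an anagram dictionary built with a plain defaultdict(list) rather than nltk.Index(), so that it runs without NLTK installed. The word list is made up for illustration.

```python
from collections import defaultdict

words = ['enlist', 'listen', 'silent', 'tinsel', 'inlets', 'google', 'banana']

anagrams = defaultdict(list)
for word in words:
    key = ''.join(sorted(word))       # anagrams share the same sorted letters
    anagrams[key].append(word)

listen_group = anagrams[''.join(sorted('listen'))]
print(listen_group)
```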

▪ We can use default dictionaries with complex keys and values.

▪ Let’s study the range of possible tags for a word, given the word itself and the tag of the previous word.

▪ We will see how this information can be used by a POS tagger.

▪ The example uses a dictionary whose default value for an entry is a dictionary (whose default value is int(), i.e., zero).

▪ Notice how we iterated over the bigrams of the tagged corpus, processing a pair of word-tag pairs for each iteration.

▪ Each time through the loop we updated our pos dictionary’s entry for (t1, w2), a tag and its following word.

▪ When we look up an item in pos we must specify a compound key, and we get back a dictionary object.

▪ A POS tagger could use such information to decide that the word right, when preceded by a determiner, should be tagged as ADJ.
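
A sketch of the nested default dictionary just described. It uses a tiny hand-made tagged sequence instead of a real corpus, and builds the bigrams with zip() rather than an NLTK helper; both are substitutions made for illustration.

```python
from collections import defaultdict

# Hand-made (word, tag) pairs standing in for a tagged corpus.
tagged = [('the', 'DET'), ('right', 'ADJ'), ('answer', 'NOUN'), ('is', 'VERB'),
          ('on', 'ADP'), ('the', 'DET'), ('right', 'NOUN'), ('.', '.'),
          ('the', 'DET'), ('right', 'ADJ'), ('side', 'NOUN')]

# Outer default: a new inner dictionary; inner default: int(), i.e. zero.
pos = defaultdict(lambda: defaultdict(int))

# Each bigram is a pair of (word, tag) pairs.
for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
    pos[(t1, w2)][t2] += 1            # key: previous tag and current word

right_after_det = dict(pos[('DET', 'right')])
best_tag = max(right_after_det, key=right_after_det.get)
print(right_after_det, best_tag)
```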

➢ Inverting a Dictionary

▪ Dictionaries support efficient lookup, so long as you want to get the value for any key.

▪ If d is a dictionary and k is a key, we type d[k] and immediately obtain the value.

▪ Finding a key given a value is slower and more cumbersome.

▪ If we expect to do this kind of “reverse lookup” often, it helps to construct a dictionary that maps values to keys.

▪ In the case that no two keys have the same value, this is an easy thing to do.

▪ We just get all the key-value pairs in the dictionary, and create a new dictionary of value-key pairs.

▪ The example below also illustrates another way of initializing a dictionary pos with key-value pairs.
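
A sketch contrasting the slow scan with an inverted dictionary. The contents of pos are illustrative, and a dictionary comprehension is used here as one of several equivalent ways to build the inverse.

```python
# Initialize pos directly from key-value pairs.
pos = {'colorless': 'ADJ', 'ideas': 'N', 'sleep': 'V', 'furiously': 'ADV'}

# Reverse lookup without an inverted dictionary: scan every pair.
words_tagged_v = [word for word, tag in pos.items() if tag == 'V']

# Inverted dictionary: safe here because no two keys share a value.
pos2 = {tag: word for word, tag in pos.items()}
noun = pos2['N']
print(words_tagged_v, noun)
```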

▪ Let’s first make our part-of-speech dictionary a bit more realistic and add some more words to pos using the dictionary update() method, to create the situation where multiple keys have the same value.

▪ Then the technique just shown for reverse lookup will no longer work.

▪ Instead, we have to use append() to accumulate the words for each part-of-speech, as follows.
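
A sketch of this approach. The extra words passed to update() are assumptions chosen so that each tag has two words.

```python
from collections import defaultdict

pos = {'colorless': 'ADJ', 'ideas': 'N', 'sleep': 'V', 'furiously': 'ADV'}
pos.update({'cats': 'N', 'scratch': 'V', 'peacefully': 'ADV', 'old': 'ADJ'})

pos2 = defaultdict(list)
for word, tag in pos.items():
    pos2[tag].append(word)            # accumulate every word with this tag

adverbs = pos2['ADV']
print(adverbs)
```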

▪ Now we have inverted the pos dictionary, and can look up any part-of-speech and find all words having that part-of-speech.

▪ We can do the same thing even more simply using NLTK’s support for indexing, as follows.

▪ A summary of Python’s dictionary methods is given in the next table.
