George Mason University
Python: Generators & Regular Expressions
▪ Iterators, Iterables, Generators
▪ Regular Expressions (re Python library)
▪ Final Exam format
Iterable Objects, Iterators, Generators
❑ What are iterable objects? Iterators? Generators?
❑ How do iterators differ from generators?
❑ How to use them in Python – manually and automatically
❑ Why use generators in Digital Forensics?
Regular Expressions
❑ What is a regular expression?
❑ How can we search for specific items or patterns?
▪ Iterable objects are like containers (in memory) that you
can step through to access their contents.
▪ Lists, Sets, Tuples, Dictionaries are all iterable objects
▪ Passing an iterable object to the built-in iter() function returns an iterator
Lists/Tuples Dictionaries
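A minimal sketch of the bullets above. The variable names and values (days, ports) are illustrative assumptions, not taken from the course files. It shows that iter() returns an iterator for a list and for a dictionary, and that iterating a dictionary steps through its keys.

```python
# Illustrative data; names and values are assumptions, not course material.
days = ["Mon", "Tue", "Wed"]          # list (a tuple behaves the same way)
ports = {"ssh": 22, "http": 80}       # dictionary

list_iterator = iter(days)            # iter() returns an iterator over the list
dict_iterator = iter(ports)           # iterating a dict steps through its keys

first_day = next(list_iterator)       # first element of the list
first_key = next(dict_iterator)       # first key of the dictionary

# A for loop (or comprehension) calls iter() for you behind the scenes.
collected = [day for day in days]
print(first_day, first_key, collected)
```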
▪ Iterators remember the state of the last item accessed in a
memory container (iterable object)
▪ Iterators use the next() function to get the next item in the
container
▪ Once the iterator has returned the last element, the following
call to next() raises the StopIteration exception – indicating
there are no more elements in the container
StopIteration Exception
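The behaviour described above can be sketched as follows. The list contents are hypothetical. The iterator keeps its position between calls, and StopIteration is raised only when next() is called after the last element has already been returned.

```python
evidence_ids = ["E001", "E002"]       # hypothetical values for illustration
it = iter(evidence_ids)

seen = [next(it), next(it)]           # the iterator remembers where it left off

try:
    next(it)                          # nothing left, so StopIteration is raised
    exhausted = False
except StopIteration:
    exhausted = True

# next() also accepts a default value that is returned instead of raising.
fallback = next(it, None)
print(seen, exhausted, fallback)
```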
http://nvie.com/posts/iterators-vs-generators/
‘for’ loop
What are:
─ listWeek15?
─ iteratorList?
─ week15item?
─ listWeek15a?
iter() method
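The slide's code did not survive in the text, so the following is only a sketch. It reuses three of the names above with assumed contents, to compare stepping through a list automatically (a for loop) and manually (iter() plus next()).

```python
listWeek15 = ["iterables", "iterators", "generators"]   # assumed contents

# Automatic: the for loop obtains an iterator and calls next() for you.
auto_items = []
for week15item in listWeek15:
    auto_items.append(week15item)

# Manual: roughly what the for loop does behind the scenes.
manual_items = []
iteratorList = iter(listWeek15)
while True:
    try:
        week15item = next(iteratorList)
    except StopIteration:
        break                         # no more elements, leave the loop
    manual_items.append(week15item)

print(auto_items == manual_items)
```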
▪ Generators – special functions that produce a special iterator
called a generator
▪ They can generate new data values each time next() is called
▪ Can be written as a single-line generator expression or as a function
▪ Use the yield keyword (a statement, not a method), which pauses the
function and holds the generator’s state until the next time next() is called
▪ Best to use when dealing with data that takes up vast amounts of
memory
Why would this be useful in digital forensics?
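One possible answer, sketched under assumptions: a disk image or memory dump may be far larger than available RAM, and a generator lets you process it one chunk at a time. The function name read_chunks and the tiny chunk size are hypothetical, and io.BytesIO stands in for a real evidence file.

```python
import io

def read_chunks(stream, chunk_size=4):
    """Yield successive chunks from a binary stream, one at a time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return        # ends the generator; a further next() raises StopIteration
        yield chunk       # pause here and keep state until next() is called again

# io.BytesIO stands in for a large image opened with open(path, "rb").
fake_image = io.BytesIO(b"ABCDEFGHIJ")
gen = read_chunks(fake_image)
first = next(gen)         # only the first chunk has been read so far
rest = list(gen)          # pull the remaining chunks

# Single-line form: a generator expression (parentheses, not brackets).
squares = (n * n for n in range(4))
total = sum(squares)
print(first, rest, total)
```

Only one chunk is held in memory at a time, which is the point of lazy evaluation; a list of all chunks would require the whole file to fit in memory.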
RECALL
▪ Iterables
▪ Memory containers where you can access elements within memory
▪ Call iter() on them to get an iterator object
▪ Iterators
▪ Object that holds internal state; allows you to retrieve next item in
our iterable container by calling next() method
▪ Generators
▪ Special functions that produce special iterators (for lazy evaluation)
▪ Lazy evaluation – only done when specifically called
▪ Used to generate data or process large volumes of data
▪ If you see yield you are working with a generator
ITERABLES, ITERATORS & GENERATORS
From BB → Course Content → Class15… → Exercises
Download:
❑ iterGen (.ipynb or .py)
Run through the cells and note any questions/issues/comments you
have for each step.
Complete the QUICK CHECK with your group.
QUICK CHECK 1
Match each term to its description:
_ Generator
_ Iterable Object
_ Iterator
a. Calls iter() method to get an iterator object
b. Object that holds internal state; allows you to retrieve next item in our
iterable container by calling next() method
c. Special functions that produce special iterators (for lazy evaluation)
QUICK CHECK 1
x12=[‘this’, ‘is’, ‘the’]
• This is the _________, which calls ____( ) to get the ___________.
• That object then calls ____( ) to return ‘this’, ‘is’, ‘the’ one at a time.
• Generators are special functions that produce _____________________
and use the _________ keyword, which holds the generator’s state until
_________ is called.
• What are generators best used for?
re library
▪ Regular expressions are used to search for patterns in datasets.
What information might we want to quickly triage for in an
investigation?
▪ Python re library
▪ Allows the user to search Unicode strings (str) or 8-bit strings (bytes)
Useful re methods:
▪ re.compile(pattern, flags=0) – compiles a pattern into a regular expression object; searches can then be performed directly on that object
▪ re.match(pattern, string, flags=0) – returns a match object* if pattern is found at the beginning of string, otherwise None
▪ re.search(pattern, string, flags=0) – returns a match object for the first location where pattern is found in string, otherwise None
▪ re.findall(pattern, string, flags=0) – returns a list of all non-overlapping matches of pattern in string
▪ re.finditer(pattern, string, flags=0) – returns an iterator yielding match objects over all non-overlapping matches of pattern in string
* match object = object containing information about the search and the result.
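A short sketch of the five functions listed above. The log line and the simplified IPv4-like pattern are made up for illustration; the pattern does not validate octet ranges and is not a recommended production regex.

```python
import re

# Made-up log line for illustration only.
log_line = "login from 10.0.0.5 failed; retry from 192.168.1.20 ok"

# Compile once, then call methods directly on the compiled object.
ip_pattern = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")   # simplified IPv4-like pattern

at_start = re.match(r"login", log_line)        # match object: pattern is at the start
not_at_start = re.match(r"failed", log_line)   # None: 'failed' is not at the start
anywhere = re.search(r"failed", log_line)      # match object: found later in the string

all_ips = ip_pattern.findall(log_line)         # list of matched strings
ip_spans = [m.span() for m in ip_pattern.finditer(log_line)]  # one match object per hit
print(all_ips, ip_spans)
```

Note that re.finditer returns an iterator, so it pairs naturally with the lazy-evaluation idea from the first half of the lecture: matches are produced one at a time rather than collected into a list up front.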
▪ Match object:
▪ .span() – returns a tuple containing the start and end positions of
the match
▪ .string – the string passed into the function (an attribute, so no parentheses)
▪ .group() – returns the part of the string where there was a match
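A minimal sketch of reading a match object, using a hypothetical string and pattern. Because re.search returns None when nothing matches, it is worth checking the result before calling its methods.

```python
import re

text = "case 2022-0415 opened"            # hypothetical string
m = re.search(r"\d{4}-\d{4}", text)

if m is not None:                         # re.search returns None on no match
    start_end = m.span()                  # (start, end) positions of the match
    matched = m.group()                   # the text that matched
    original = m.string                   # attribute: the string that was searched
    print(start_end, matched, original)

no_hit = re.search(r"xyz", text)          # no match, so this is None
```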
Source: https://www.w3schools.com/python/python_regex.asp
REGULAR EXPRESSIONS
From BB → Course Content → Class15… → Exercises
Download:
❑ reExamples (.ipynb or .py)
Run through the cells and note any questions/issues/comments you
have for each step.
Complete the QUICK CHECK with your group.
QUICK CHECK 2
1. What information can one triage data for with regular expressions?
2. Which re statement is used to find if a pattern is at the beginning of
a sequence?
3. What information does the Match Object return?
4. Would the following statement return a Match Object or None:
pat2find = "MU"
str2Search = "Mason Patriots at GMU"
re.search(pat2find, str2Search)
a. If a Match Object is returned, what does span() return?
▪ 12-May: Final Exam
▪ Test will be available starting at 12AM (EST) on 12-May and open all day
▪ Timed exam (90 minutes)
▪ Covers the 2nd half of the semester
▪ Similar format to Midterm exam (e.g. multiple choice/answer, short answer, etc.)
▪ Be sure you have examined (statically and dynamically)
DFOR510_Spring22_FINAL
▪ Password will be available tonight (10PM EST) for you to access the file
▪ Conduct your analysis prior to final
▪ The final WILL contain questions about the file
Iterables, Iterators, Generators
https://nvie.com/posts/iterators-vs-generators/
https://realpython.com/introduction-to-python-generators/
https://realpython.com/python-for-loop/#the-python-for-loop
https://www.dataquest.io/blog/python-generators-tutorial/
https://towardsdatascience.com/pythons-list-generators-what-when-how-and-why-2a560abd3879
Regular Expressions
https://docs.python.org/3/library/re.html
https://docs.python.org/3/howto/regex.html#regex-howto
https://www.regular-expressions.info/tutorial.html
https://www.w3schools.com/python/python_regex.asp