Progcomp Training Week 4


Graph Algorithms - Part 1

Viney Kumar
11/09/2019

Implementing Graphs
Earlier, we learned about sets and dictionaries (also known as hash tables), and how the use of a hash table
or set (instead of an ordinary array) can drastically improve the efficiency of an algorithm. In addition, we
also learned how to implement these data structures in Python, and how to use their associated methods to
solve some actual programming competition problems.
This week, we're going to build on that knowledge to implement a new data structure
called a "graph" or "network". We're also going to discuss methods for visiting the nodes of a graph in an
efficient manner.

Friend List- AIO Starter Problem:


Let’s start with a problem:
http://orac.amt.edu.au/cgi-bin/train/problem.pl?set=simple3&problemid=416

Problem Summary:

Given a list of "undirected" friendships (i.e. if A is friends with B, then B is friends with A), we must
output all people with the maximum number of friendships, in increasing order of id.

Strategy:

Store all the information in a graph. Then compute the "degree" of each vertex (i.e. the number of edges
connected to it), find the maximum degree, and select all vertices that attain it.

Implementation:

We implement a graph as a dictionary of sets (or a dictionary of lists). We hash each id to the set of its
neighbours. Note that the degree of a vertex is simply the length of Graph[vertex]. Hence, we use the length
operation on Graph[id] to find the degree of each vertex, and take the maximum over all vertices.
By looping through the graph in the same order as we appended the items, we ensure that the list of ids
with maximum degree is in increasing order. This solution works in linear time.
A smarter alternative would be to hash each id to a count of how many friends they have, using a simple
dictionary. This would use a lot less memory, but the time complexity would be the same.
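The counting alternative can be sketched as follows. Note the assumptions: the function name most_popular is made up for illustration, and the input format here (a plain Python list of friendship pairs) stands in for the problem's real input; this sketch also sorts the ids explicitly rather than relying on insertion order.

```python
#A sketch of the counting alternative: hash each id to its friend count.
#Assumes friendships is a list of (a, b) pairs; the real problem reads
#these from input.
def most_popular(friendships):
    degree={}
    for a,b in friendships:
        degree[a]=degree.get(a,0)+1
        degree[b]=degree.get(b,0)+1
    best=max(degree.values())
    #Sorting guarantees the ids come out in increasing order.
    return sorted(i for i in degree if degree[i]==best)

print(most_popular([(1,2),(1,3),(2,3),(2,4)]))

## [2]
```

Here person 2 has three friends (1, 3 and 4), more than anyone else, so only id 2 is printed.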
Another practice problem requiring a similar sort of graph implementation will be given for the practice
session.

Traversal
Graphs are a powerful mental (and mathematical) model of structure in general; if you can formulate a
problem as one dealing with graphs, even if it doesn’t look like a graph problem, you are probably one step
closer to solving it. It just so happens that there is a highly useful mental model for graph algorithms. That
skeleton key is traversal: discovering, and later visiting, all the nodes in a graph.
Consider the problem of visiting every vertex in the graph. Clearly, such a graph may have many “connected
components”. A connected component is a subgraph where any two vertices are connected by paths, and
there are no additional vertices in the subgraph.
Hence, visiting all the vertices in an undirected graph boils down to “traversing” each connected component.

Traversing a Connected Component:

Let's consider the problem of ordering the nodes of a connected graph v1, v2, ..., vn so that each subgraph
v1, ..., vi is connected. This problem looks hard, but is actually quite easy! Imagine that we've already found
an ordering of the first i-1 vertices v1, ..., v(i-1), and we just need to add another
vertex vi that keeps the graph connected. How do we do that? Recall that in a connected component, any two
vertices are connected by a path. Hence, if the whole graph v1, v2, ..., vn is connected, there must be
some path between a vertex that we have already connected and a vertex that isn't connected yet!
Let's be more precise about this. Consider an arbitrary vertex u among the first i-1 nodes that we have connected
and a vertex v in the remainder. Since there is a path from u to v, consider the last node x on that path
that is in the connected portion we've built, and the first node y on the path that is
outside it. Then x and y must be adjacent, and adding y via the edge between x and y keeps the graph connected! Why? Since the subgraph
of the first i-1 vertices was connected, and there is an edge between x and y, there must be
a path from y to every one of the other i-1 vertices in the subgraph.

Moral of the Story:

While that proof was a little long-winded, it shows that we can traverse a connected component by simply
starting at an arbitrary vertex and then repeatedly visiting unvisited nodes that are one edge
away from the vertices we've already visited.
To implement this procedure, we need to somehow keep track of these “fringe” or “frontier” nodes that are
just one edge away. If we start with a single node, the frontier will simply be its neighbors. As we start
exploring, the neighbors of newly visited nodes will form the new fringe, while those nodes we visit now fall
inside it. In other words, we need to maintain the fringe as a collection of some sort, where we can remove
the nodes we visit and add their neighbors (unless they’re already on the list or we’ve already visited them).
It becomes a sort of to-do list of nodes we want to visit but haven’t gotten around to yet. You can think of
the ones we have visited as being checked off.
An interesting point is that as long as we keep connecting new nodes to our component in this way, we’re
building a tree. This tree is called a traversal tree and is a “spanning tree” of the component we’re traversing
(more on this later).

A General Traversal Function for Connected Components:

# Find all connected components


def components(G):
    #Collect a list of components
    comp=[]
    #Collect vertices we've seen before so we don't
    #traverse a component twice.
    seen=set()
    for u in G:
        #Skip if we've already seen that vertex
        if u in seen:
            continue
        #If we haven't seen it, traverse its connected component using the walk function
        C=walk(G,u)
        #Mark all those vertices in the component as seen
        seen.update(C)
        #Update the list of connected components.
        comp.append(C)
    return comp

#Find the connected component with s as the source


#s=source, G is a graph in adjacency set representation
def walk(G,s):
    #P is a dictionary that stores the ordering of the nodes.
    #It maps each node to its predecessor (hence the name P)
    P={}
    #Q is the "to-do" queue of nodes to visit (hence the name Q)
    Q=set()
    #The source has no predecessor
    P[s]=None
    #Add the source to the set of nodes we want to visit.
    Q.add(s)
    #While there are nodes to visit
    while Q:
        #Arbitrarily select one.
        u=Q.pop()
        #For each neighbour of the node u in the "fringe"
        for v in G[u]:
            #If we haven't visited the vertex already
            if v not in P:
                #Add it to the queue
                Q.add(v)
                #Mark it as visited and set its predecessor to u.
                P[v]=u
    #We could return P itself, or just its keys; it depends on the application.
    return list(P)

def main():
    #This graph G is structured as the edges of a square (a,b,c,d)
    #and an extra connected component e and f.
    G={"a":{"b","d"},"b":{"a","c"},"c":{"b","d"},"d":{"a","c"},"e":{"f"},"f":{"e"}}

    print(components(G))

main()

## [['a', 'd', 'b', 'c'], ['e', 'f']]

Well-known Traversal Strategies: BFS and DFS.


Our general traversal function works quite well! It traverses each connected component of the graph and
outputs all the vertices in each component. Its only shortcoming is that we have no control over
the order in which the vertices are visited.
For example, suppose we wanted to find the shortest path from a vertex s to another vertex t in an
unweighted graph. In this case we would want to first visit all nodes that are one edge away from s, followed by
those that are two edges away, and so forth. This sort of traversal is known as breadth-first search
(BFS), and it is easy to implement with a simple modification of our general traversal function. All we need
to do is change the type of "queue" we're using from a set to a "double-ended queue" or "deque". That
way we don't just pop off any old node, but always the first node in the queue. The deque data structure
ensures that we can do this in O(1) time.

from collections import deque


#deque is a double-ended queue, allowing popleft and appendleft in constant time.
def bfs(G,s):
    P={}
    P[s]=None
    Q=deque([s])

    while Q:
        #Pop the leftmost element in the queue.
        u=Q.popleft()
        for v in G[u]:
            if v not in P:
                P[v]=u
                Q.append(v)
    return P

def main():
    #This graph G is structured as the edges of a square
    #and an extra connected component e and f.
    G={"a":{"b","d"},"b":{"a","c"},"c":{"b","d"},"d":{"a","c"},"e":{"f"},"f":{"e"}}

    print(bfs(G,"a"))
    print(bfs(G,"e"))

main()

## {'a': None, 'd': 'a', 'b': 'a', 'c': 'd'}
## {'e': None, 'f': 'e'}
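To recover an actual shortest path from the predecessor map P, we can walk backwards from t to s and reverse the result. The helper shortest_path below is not from the notes; it is a sketch (with bfs repeated so the snippet is self-contained).

```python
from collections import deque

#bfs as above, returning the predecessor map P.
def bfs(G,s):
    P={s:None}
    Q=deque([s])
    while Q:
        u=Q.popleft()
        for v in G[u]:
            if v not in P:
                P[v]=u
                Q.append(v)
    return P

#Because BFS visits nodes in order of distance from s, following
#predecessors backwards from t gives a shortest path.
def shortest_path(G,s,t):
    P=bfs(G,s)
    if t not in P:
        return None  #t is unreachable from s
    path=[t]
    while path[-1]!=s:
        path.append(P[path[-1]])
    return path[::-1]

G={"a":{"b","d"},"b":{"a","c"},"c":{"b","d"},"d":{"a","c"},"e":{"f"},"f":{"e"}}
print(shortest_path(G,"a","c"))  #['a','b','c'] or ['a','d','c'], depending on set order
print(shortest_path(G,"a","e"))  #None: e is in a different component
```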
For tree-like structures, we might want to go "deep" before we widen out our search. In this case, we would
use DFS, or Depth First Search. The underlying idea is exactly the same, except that we can use a plain list
(acting as a stack) for our "queue" instead of a deque.

def iter_dfs(G, s):
    S=set()
    Q=[]
    Q.append(s)
    while Q:
        u=Q.pop()
        if u in S:
            continue
        S.add(u)
        Q.extend(G[u])
        print(u)

def main():
    #This graph G is structured as the edges of a square
    #and an extra connected component e and f.
    G={"a":{"b","d"},"b":{"a","c"},"c":{"b","d"},"d":{"a","c"},"e":{"f"},"f":{"e"}}

    iter_dfs(G,"a")
    iter_dfs(G,"e")

main()

## a
## b
## c
## d
## e
## f
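DFS is also commonly written recursively, with the call stack playing the role of our explicit list. This recursive variant (rec_dfs) is not from the notes; it's a standard alternative sketch, returning the set of visited vertices instead of printing them.

```python
#Recursive DFS: the call stack replaces the explicit list of nodes to visit.
def rec_dfs(G,s,S=None):
    if S is None:
        S=set()
    #Visit s, then recurse into each unvisited neighbour.
    S.add(s)
    for v in G[s]:
        if v not in S:
            rec_dfs(G,v,S)
    return S

G={"a":{"b","d"},"b":{"a","c"},"c":{"b","d"},"d":{"a","c"},"e":{"f"},"f":{"e"}}
print(rec_dfs(G,"a"))  #the component {a, b, c, d}, in some set order
```

For very deep graphs the iterative version is safer in Python, since recursion depth is limited.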

Homework and Practice Problems:


1. Friend list: http://orac.amt.edu.au/cgi-bin/train/problem.pl?set=simple3&problemid=416
2. Cartography: http://orac.amt.edu.au/cgi-bin/train/problem.pl?problemid=15 - Hint: This is a connected component problem - where are the connected components? What is the graph?
3. Cartography 3: Another connected component problem: http://orac.amt.edu.au/cgi-bin/train/problem.pl?set=aic01&problemid=6
4. Cherries Mesh: A harder problem from Google's Kickstart Competition: https://codingcompetitions.withgoogle.com/kickstart/round/0000000000050edb/0000000000170721
