Progcomp Training Week 4
Progcomp Training Week 4
Progcomp Training Week 4
Viney Kumar
11/09/2019
Implementing Graphs
Earlier, we learned about sets and dictionaries (also known as hash tables), and how the use of a hash table
or set (instead of an ordinary array) can drastically improve the efficiency of an algorithm. In addition, we
also learned how to implement these data structures in Python, and how to use their associated methods to
solve some actual programming competition problems.
This week, we’re going to build on the knowledge we learned last week to implement a new data structure
called a “Graph” or “Network”. We’re also going to discuss methods for visiting the nodes in the graph in an
efficient manner.
Problem Summary:
Given a list of “undirected” friendships (i.e A is friends with B implies that B is friends with A), we must
output all people with the maximum number of friendships in increasing order of id.
Strategy:
Store all the information in a graph. Then compute the “degree” of each vertex (i.e the number of edges
connected to each vertex) and find the maximum degree. Select all vertices with maximum degree.
Implementation:
We implement a graph as a dictionary of sets (or dictionary of lists). We hash each id to a set of its vertices.
Note that the maximum degree of a vertex is the length of Graph[vertex]. Hence, we simply use the length
operation on Graph[id] in order to find the maximum degree of that vertex.
By looping through the graph in the same order as we appended the items, we ensure that the list of id’s
with maximum degree is in increasing order. This solution works in linear time.
A smarter alternative would be to, hash each id to a count of how many friends they have using a simple
dictionary. This would use a lot less memory, but the time complexity would be the same.
Another practice problem requiring a similar sort of graph implementation will be given for the practice
session.
1
Traversal
Graphs are a powerful mental (and mathematical) model of structure in general; if you can formulate a
problem as one dealing with graphs, even if it doesn’t look like a graph problem, you are probably one step
closer to solving it. It just so happens that there is a highly useful mental model for graph algorithms. That
skeleton key is traversal: discovering, and later visiting, all the nodes in a graph.
Consider the problem of visiting every vertex in the graph. Clearly, such a graph may have many “connected
components”. A connected component is a subgraph where any two vertices are connected by paths, and
there are no additional vertices in the subgraph.
Hence, visiting all the vertices in an undirected graph boils down to “traversing” each connected component.
Let’s consider the problem of ordering the nodes of a connected graph v1 , v2 . . . .vn , so that the subgraph
v1 . . . .vi is connected. This problem looks hard, but is actually quite easy! Imagine that we’ve already found
an ordering of the subgraph consisting of v1 . . . .up to the (i-1)th vertex, and we just need to add another
vertex vi to keep the graph connected. How do we do that? Recall that in a connected component, any two
vertices are connected by paths. Hence, if the graph consisting of v1 , v2 . . . .vn is connected, there must be
some path between a vertex that we have connected and a vertex that isn’t connected yet!
Let’s be more precise about this. Consider an arbitrary vertex u in the first i-1 nodes that we have connected
and a vertex v in the remainder. Since there is a path from u to v, consider the last node x on the path
from u to v that is in the connected portion we’ve built, and consider the first node y on the path that is
outside it. Then connecting x to y always keeps the graph connected! Why? Since the original subgraph
of (i-1) vertices in the graph was connected, and we can guarantee there is an edge between x and y, we can
see that there must be a path from y to all the other (i-1) vertices in the subgraph.
While that proof was a little longwinded it shows that we can traverse a connected component by simply
starting at a random vertex and then always connecting nodes that we haven’t visited, and are one vertex
away from the vertices we’ve “just” visited.
To implement this procedure, we need to somehow keep track of these “fringe” or “frontier” nodes that are
just one edge away. If we start with a single node, the frontier will simply be its neighbors. As we start
exploring, the neighbors of newly visited nodes will form the new fringe, while those nodes we visit now fall
inside it. In other words, we need to maintain the fringe as a collection of some sort, where we can remove
the nodes we visit and add their neighbors (unless they’re already on the list or we’ve already visited them).
It becomes a sort of to-do list of nodes we want to visit but haven’t gotten around to yet. You can think of
the ones we have visited as being checked off.
An interesting point is that as long as we keep connecting new nodes to our component in this way, we’re
building a tree. This tree is called a traversal tree and is a “spanning tree” of the component we’re traversing
(more on this later).
2
#Collect a list of components
comp=[]
#Collect vertices we've seen before so we don't
#traverse a component twice.
seen=set()
for u in G:
#Keep going if we've already seen that vertex
if u in seen:
continue
#If we haven't seen it- traverse it's connected component using the walk function
C=walk(G,u)
#Mark all those vertices in the component as seen
seen.update(C)
#Update the list of connected components.
comp.append(C)
return comp
def main():
#This graph G is structured as the edges of a square (a,b,c,d)
#and an extra connected component e and f.
3
G={"a":{"b","d"},"b":{"a","c"},"c":{"b","d"},"d":{"a","c"},"e":{"f"},"f":{"e"}}
print(components(G))
main()
while Q:
#Pops the leftmost element in the queue.
u=Q.popleft()
for v in G[u]:
if v not in P:
P[v]=u
Q.append(v)
return P
def main():
#This graph G is structured as the edge of a square.
#and an extra connected component e and f.
G={"a":{"b","d"},"b":{"a","c"},"c":{"b","d"},"d":{"a","c"},"e":{"f"},"f":{"e"}}
print(bfs(G,"a"))
print(bfs(G,"e"))
main()
4
our “queue” instead of a deque.
def main():
#This graph G is structured as the edge of a square.
#and an extra connected component e and f.
G={"a":{"b","d"},"b":{"a","c"},"c":{"b","d"},"d":{"a","c"},"e":{"f"},"f":{"e"}}
iter_dfs(G,"a")
iter_dfs(G,"e")
main()
## a
## b
## c
## d
## e
## f