

pythonds3 - 8. Graphs and Graph Algorithms


Graphs

4 basic ways to represent a graph in memory:

  • objects and pointers
  • adjacency matrix
  • adjacency list
  • adjacency map

Vocabulary and Definitions

Vertex/node

  • a fundamental part of a graph.
  • It can have a name, “key.”
  • A vertex may also have additional information, “payload.”

Edge/“arc”

  • another fundamental part of a graph.
  • An edge connects two vertices to show that there is a relationship between them.
  • Edges may be one-way or two-way.
    • If the edges in a graph are all one-way, the graph is a directed graph, or digraph.
      • e.g., in a prerequisite graph, you must take some steps before others.

Weight

  • Edges may be weighted to show that there is a cost to go from one vertex to another.
  • For example
    • graph of roads that connect one city to another,
    • the weight on the edge might represent the distance between the two cities.

With those definitions in hand we can formally define a graph.

  • A graph can be represented by 𝐺
  • 𝐺=(𝑉,𝐸).
  • 𝑉 is a set of vertices
  • 𝐸 is a set of edges.
    • Each edge is a tuple (𝑣,𝑤), where 𝑣,𝑤 ∈ 𝑉.
    • We can add a third component to the edge tuple to represent a weight.
    • A subgraph 𝑠 is a set of edges 𝑒 and vertices 𝑣 such that 𝑒⊂𝐸 and 𝑣⊂𝑉.

Path

  • A path in a graph is a sequence of vertices that are connected by edges.
  • Formally we would define a path as 𝑤1,𝑤2,...,𝑤𝑛 such that (𝑤𝑖,𝑤𝑖+1)∈𝐸 for all 1 ≤ 𝑖 ≤ 𝑛−1.
  • The unweighted path length is the number of edges in the path, specifically 𝑛−1.
  • The weighted path length is the sum of the weights of all the edges in the path.
    • For example
    • the path from 𝑉3 to 𝑉1 is the sequence of vertices (𝑉3,𝑉4,𝑉0,𝑉1).
    • The edges are {(𝑣3,𝑣4,7),(𝑣4,𝑣0,1),(𝑣0,𝑣1,5)}.

Cycle

  • A cycle in a directed graph is a path that starts and ends at the same vertex.
    • For example
    • the path (𝑉5,𝑉2,𝑉3,𝑉5) is a cycle.
    • A graph with no cycles is called an acyclic graph.
    • A directed graph with no cycles is called a directed acyclic graph / DAG
    • We will see that we can solve several important problems if the problem can be represented as a DAG.

digraph

  • example of a simple weighted digraph.
  • represent this graph as the set of six vertices: 𝑉={𝑉0,𝑉1,𝑉2,𝑉3,𝑉4,𝑉5}
  • and the set of nine edges: 𝐸={(𝑣0,𝑣1,5),(𝑣1,𝑣2,4),(𝑣2,𝑣3,9),(𝑣3,𝑣4,7),(𝑣4,𝑣0,1),(𝑣0,𝑣5,2),(𝑣5,𝑣4,8),(𝑣3,𝑣5,3),(𝑣5,𝑣2,1)}

The Graph Abstract Data Type

The graph abstract data type (ADT) is defined as follows:

  • Graph()
    • creates a new, empty graph.
  • addVertex(vert)
    • adds an instance of Vertex to the graph.
  • addEdge(fromVert, toVert)
    • Adds a new, directed edge to the graph that connects two vertices.
  • addEdge(fromVert, toVert, weight)
    • Adds a new, weighted, directed edge to the graph that connects two vertices.
  • getVertex(vertKey)
    • finds the vertex in the graph named vertKey.
  • getVertices()
    • returns the list of all vertices in the graph.
  • in
    • returns True for a statement of the form vertex in graph, if the given vertex is in the graph,
    • False otherwise.

there are several ways we can implement the graph ADT in Python.

  • there are trade-offs in using different representations to implement the ADT described above.
  • There are two well-known implementations of a graph,
    • the adjacency matrix
    • the adjacency list.

An Adjacency Matrix

class GraphyAM:
    def __init__(self) -> None:
        self.vertList = {}      # placeholder: will map each vertex id to its Vertex object
        self.numVertices = 0

One of the easiest ways to implement a graph is to use a two-dimensional matrix.

  • each of the rows and columns represent a vertex in the graph.
  • The value that is stored in the cell at the intersection of row 𝑣 and column 𝑤 indicates if there is an edge from vertex 𝑣 to vertex 𝑤.
  • When two vertices are connected by an edge, we say that they are adjacent.
  • A value in a cell represents the weight of the edge from vertex 𝑣 to vertex 𝑤.
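
As a minimal sketch of this idea (my own code, not from the book), the six-vertex digraph above can be stored as a two-dimensional list, where the cell at row 𝑣, column 𝑤 holds the weight of the edge from 𝑣 to 𝑤, or None when there is no edge:

# Adjacency-matrix sketch for the six-vertex digraph above.
# None means "no edge"; a number is the weight of the edge v -> w.
num_vertices = 6
adj_matrix = [[None] * num_vertices for _ in range(num_vertices)]

edges = [(0, 1, 5), (1, 2, 4), (2, 3, 9), (3, 4, 7), (4, 0, 1),
         (0, 5, 2), (5, 4, 8), (3, 5, 3), (5, 2, 1)]
for v, w, weight in edges:
    adj_matrix[v][w] = weight

print(adj_matrix[0][1])   # 5    -> there is an edge from V0 to V1 with weight 5
print(adj_matrix[1][0])   # None -> there is no edge from V1 to V0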

adjMat

advantage

  • simple, for small graphs it is easy to see which nodes are connected to other nodes.
  • good implementation for a graph when the number of edges is large.

  • However, it is not a very efficient way to store sparse data.
    • most of the cells in the matrix are empty -> the matrix is “sparse.”

How many edges would be needed to fill the matrix?

  • Since there is one row and one column for every vertex in the graph, the number of edges required to fill the matrix is |𝑉|^2.
  • A matrix is full when every vertex is connected to every other vertex.
  • There are few real problems that approach this sort of connectivity.

An Adjacency List

  • more space-efficient way to implement a sparsely connected graph is to use an adjacency list.
  • keep a master list of all the vertices in the Graph object
  • and then each vertex object in the graph maintains a list of the other vertices that it is connected to.
  • In our implementation of the Vertex class we will use a dictionary rather than a list where the dictionary keys are the vertices, and the values are the weights.

advantage

  • it allows us to compactly represent a sparse graph.
  • easily find all the links that are directly connected to a particular vertex.

adjlist


Implementation

Using dictionaries, it is easy to implement the adjacency list in Python.

create two classes,

  • Graph, which holds the master list of vertices,
  • Vertex, which will represent each vertex in the graph.
    • Each Vertex uses a dictionary to keep track of the vertices to which it is connected, and the weight of each edge.
    • This dictionary is called connectedTo.
class Vertex:
    def __init__(self,key):
        self.id = key
        self.connectedTo = {}

    def addNeighbor(self,nbr,weight=0):
      # add a connection from this vertex to another
      self.connectedTo[nbr] = weight

    def __str__(self):
        return str(self.id) + ' connectedTo: ' + str([x.id for x in self.connectedTo])

    def getConnections(self):
      # returns all of the vertices in the adjacency list, as represented by the connectedTo instance variable.
      return self.connectedTo.keys()

    def getId(self): return self.id

    def getWeight(self,nbr):
      # returns the weight of the edge from this vertex to the vertex passed as a parameter.
      return self.connectedTo[nbr]

class Graph:
    def __init__(self):
        self.vertList = {}
        self.numVertices = 0

    def addVertex(self,key):
      # adding vertices to a graph
      self.numVertices = self.numVertices + 1
      newVertex = Vertex(key)
      self.vertList[key] = newVertex
      return newVertex

    def getVertex(self,n):
      if n in self.vertList: return self.vertList[n]
      else: return None

    def getVertices(self):
      # returns the names of all of the vertices in the graph
        return self.vertList.keys()

    def __contains__(self,n):
        return n in self.vertList

    def addEdge(self,f,t,weight=0):
        if f not in self.vertList: nv = self.addVertex(f)
        if t not in self.vertList: nv = self.addVertex(t)
        self.vertList[f].addNeighbor(self.vertList[t], weight)

    def __iter__(self):
        return iter(self.vertList.values())


g = Graph()
for i in range(6):
  g.addVertex(i)

# display the vertex dictionary.
g.vertList
# {0: <adjGraph.Vertex instance at 0x41e18>,
#  1: <adjGraph.Vertex instance at 0x7f2b0>,
#  2: <adjGraph.Vertex instance at 0x7f288>,
#  3: <adjGraph.Vertex instance at 0x7f350>,
#  4: <adjGraph.Vertex instance at 0x7f328>,
#  5: <adjGraph.Vertex instance at 0x7f300>}

# add the edges that connect the vertices together.
g.addEdge(0,1,5)
g.addEdge(0,5,2)
g.addEdge(1,2,4)
g.addEdge(2,3,9)
g.addEdge(3,4,7)
g.addEdge(3,5,3)
g.addEdge(4,0,1)
g.addEdge(5,4,8)
g.addEdge(5,2,1)

# a nested loop
# verifies that each edge in the graph is properly stored.
for v in g:
    for w in v.getConnections():
        print("( %s , %s )" % (v.getId(), w.getId()))
# ( 0 , 5 )
# ( 0 , 1 )
# ( 1 , 2 )
# ( 2 , 3 )
# ( 3 , 4 )
# ( 3 , 5 )
# ( 4 , 0 )
# ( 5 , 4 )
# ( 5 , 2 )

example

The Word Ladder Problem

Transform the word “FOOL” into the word “SAGE”.

  • must make the change occur gradually by changing one letter at a time.
  • At each step you must transform one word into another word,
  • not allowed to transform a word into a non-word.
  • The word ladder puzzle was invented in 1878 by Lewis Carroll, the author of Alice in Wonderland.
  • The following sequence of words shows one possible solution to the problem posed above.
FOOL
POOL
POLL
POLE
PALE
SALE
SAGE

we can solve this problem using a graph algorithm.

  • Represent the relationships between the words as a graph.
  • Use the graph algorithm known as breadth first search to find an efficient path from the starting word to the ending word.

implement

wordgraph

  • have an edge from one word to another if the two words are only different by a single letter.
  • If we can create such a graph, then any path from one word to another is a solution to the word ladder puzzle.

could use several different approaches to create the graph

  • start with the assumption that we have a list of words that are all the same length.
  • create a vertex in the graph for every word in the list.
  • to connect the words,
    • compare each word in the list with every other.
      • see how many letters are different.
      • If different by only one letter, create an edge between them
    • For a small set of words that approach would work fine;
      • however let’s suppose we have a list of 5,110 words.
      • Roughly speaking, comparing one word to every other word on the list is an 𝑂(𝑛^2) algorithm.
      • For 5,110 words, 𝑛^2 is more than 26 million comparisons.
    • We can do much better by using the following approach.
      • Suppose that we have a huge number of buckets, each of them with a four-letter word on the outside, except that one of the letters in the label has been replaced by an underscore.
      • For example
      • have a bucket labeled “pop_.”
      • As we process each word in our list we compare the word with each bucket, using the ‘_’ as a wildcard, so both “pope” and “pops” would match “pop_.”
      • Every time we find a matching bucket, we put our word in that bucket.
      • Once we have all the words in the appropriate buckets we know that all the words in the bucket must be connected.

      • implement the scheme using a dictionary.
        • The labels on the buckets are the keys in the dictionary.
        • The value stored for that key is a list of words.
        • Once we have the dictionary built we can create the graph.
        • We start our graph by
          • creating a vertex for each word in the graph.
          • create edges between all the vertices under the same key in the dictionary.
from pythonds.graphs import Graph

def buildGraph(wordFile):
    g = Graph()
    d = {}

    # create buckets of words that differ by one letter
    wfile = open(wordFile,'r')
    for line in wfile:
        word = line[:-1]
        for i in range(len(word)):
            bucket = word[:i] + '_' + word[i+1:]
            if bucket in d:
                d[bucket].append(word)
            else:
                d[bucket] = [word]

    # add vertices and edges for words in the same bucket
    for bucket in d.keys():
        for word1 in d[bucket]:
            for word2 in d[bucket]:
                if word1 != word2:
                    g.addEdge(word1,word2)
    return g

how sparse is the graph?

  • The list of four-letter words we have for this problem is 5,110 words long.
  • adjacency matrix,
    • the matrix would have 5,110 * 5,110 = 26,112,100 cells.
  • adjacency list,
    • The graph constructed by the buildGraph function has exactly 53,286 edges,
    • only 0.20% of the cells filled!
    • That is a very sparse matrix indeed.

Implement: Breadth First Search (BFS)

find the shortest solution to the word ladder problem.

  • use the graph algorithm known as “breadth first search.”

Breadth first search (BFS)

  • one of the easiest algorithms for searching a graph.
  • It also serves as a prototype for several other important graph algorithms that we will study later.

Aside: topological sort using Depth First Search (covered in more detail later)

  • Step 1: Create a temporary stack.
  • Step 2: Recursively call topological sorting for all of a vertex’s adjacent vertices, then push the vertex onto the stack (once all of its adjacent vertices are on the stack). Note this step is the same as a recursive Depth First Search.
  • Step 3: Finally, print the contents of the stack.
  • Note: A vertex is pushed onto the stack only when all of its adjacent vertices (and their adjacent vertices, and so on) are already on the stack.

Given a graph 𝐺 and a starting vertex 𝑠,

  • a breadth first search proceeds by exploring edges in the graph to find all the vertices in 𝐺 for which there is a path from 𝑠.
  • The remarkable thing about a breadth first search is that it finds all the vertices that are a distance 𝑘 from 𝑠 before it finds any vertices that are a distance 𝑘+1.
  • One good way to visualize what the breadth first search algorithm does is to imagine that it is building a tree, one level of the tree at a time. A breadth first search adds all children of the starting vertex before it begins to discover any of the grandchildren.

The breadth first search algorithm

  • uses the adjacency list graph representation
  • uses a Queue to decide which vertex to explore next.

In addition the BFS algorithm uses an extended version of the Vertex class.

  • This new vertex class adds three new instance variables: distance, predecessor, and color.
  • Each of these instance variables also has the appropriate getter and setter methods.
  • The code for this expanded Vertex class is included in the pythonds package
from pythonds.graphs import Graph, Vertex
from pythonds.basic import Queue

def bfs(g, start):
  start.setDistance(0)
  start.setPred(None)
  vertQueue = Queue()
  vertQueue.enqueue(start)
  while (vertQueue.size() > 0):
    currentVert = vertQueue.dequeue()
    for nbr in currentVert.getConnections():
      if (nbr.getColor() == 'white'):
        nbr.setColor('gray')
        nbr.setDistance(currentVert.getDistance() + 1)
        nbr.setPred(currentVert)
        vertQueue.enqueue(nbr)
    currentVert.setColor('black')

BFS begins at the starting vertex s and colors start gray to show that it is currently being explored.

  • Two other values, the distance and the predecessor, are initialized to 0 and None respectively for the starting vertex.
  • Finally, start is placed on a Queue.
  • The next step is to begin to systematically explore vertices at the front of the queue.
  • We explore each new node at the front of the queue by iterating over its adjacency list.
  • As each node on the adjacency list is examined its color is checked.
  • If it is white, the vertex is unexplored, and four things happen:
    • The new, unexplored vertex nbr, is colored gray.
    • The predecessor of nbr is set to the current node currentVert
    • The distance to nbr is set to the distance to currentVert + 1
    • nbr is added to the end of a queue.
      • Adding nbr to the end of the queue effectively schedules this node for further exploration, but not until all the other vertices on the adjacency list of currentVert have been explored.
from pythonds.graphs import Graph, Vertex
from pythonds.basic import Queue

def bfs(g,start):
  start.setDistance(0)
  start.setPred(None)
  vertQueue = Queue()
  vertQueue.enqueue(start)

  while (vertQueue.size() > 0):
    # if not empty
    currentVert = vertQueue.dequeue()
    # for every child
    for nbr in currentVert.getConnections():
      # if its not been seen
      if (nbr.getColor() == 'white'):
        # action
        nbr.setColor('gray')
        nbr.setDistance(currentVert.getDistance() + 1)
        nbr.setPred(currentVert)
        vertQueue.enqueue(nbr)
    currentVert.setColor('black')

Step:

  1. Starting from fool we take all nodes that are adjacent to fool and add them to the tree.
    1. The adjacent nodes include pool, foil, foul, and cool.
    2. Each of these nodes are added to the queue of new nodes to expand.

bfs1

  2. bfs removes the next node (pool) from the front of the queue and repeats the process for all of its adjacent nodes.
    1. when bfs examines the node cool, it finds that the color is gray.
    2. This indicates that there is a shorter path to cool and that cool is already on the queue for further expansion.
    3. The only new node added to the queue while examining pool is poll.

bfs2

  3. The next vertex on the queue is foil.
    1. The only new node that foil can add to the tree is fail.
    2. As bfs continues to process the queue, neither of the next two nodes add anything new to the queue or the tree.

bfs3

bfsDone

You should continue to work through the algorithm on your own so that you are comfortable with how it works. Figure 6 shows the final breadth first search tree after all the vertices in Figure 3 have been expanded.

The amazing thing about the breadth first search solution is

  • not only solved the FOOL–SAGE problem, but we have solved many other problems along the way.
  • We can start at any vertex in the breadth first search tree and follow the predecessor arrows back to the root to find the shortest word ladder from any word back to fool.

The following function follows the predecessor links to print out the word ladder.

def traverse(y):
    x = y
    while (x.getPred()):
      print(x.getId())
      x = x.getPred()
    print(x.getId())

traverse(g.getVertex('sage'))

Analysis

  • the while loop is executed at most once for each vertex in the graph, |𝑉|.
    • a vertex must be white before it can be examined and added to the queue.
    • This gives us 𝑂(𝑉) for the while loop.
  • The for loop is executed at most once for each edge in the graph, |𝐸|.
    • every vertex is dequeued at most once and we examine an edge from node 𝑢 to node 𝑣 only when node 𝑢 is dequeued.
    • This gives us 𝑂(𝐸) for the for loop.
  • combining the two loops gives us 𝑂(𝑉+𝐸).

Of course doing the breadth first search is only part of the task. Following the links from the starting node to the goal node is the other part of the task.

  • The worst case for this would be if the graph was a single long chain. In this case traversing through all of the vertices would be 𝑂(𝑉). The normal case is going to be some fraction of |𝑉| but we would still write 𝑂(𝑉).

The Knight’s Tour Problem

The knight’s tour puzzle is played on a chess board with a single chess piece, the knight.

  • find a sequence of moves that allow the knight to visit every square on the board exactly once.
  • One such sequence is called a “tour.”
    • The knight’s tour puzzle has fascinated chess players, mathematicians and computer scientists alike for many years.
  • The upper bound on the number of possible legal tours for an eight-by-eight chessboard is known to be 1.305×10^35;
  • however, there are even more possible dead ends.
  • Clearly this is a problem that requires some real brains, some real computing power, or both.

Although researchers have studied many different algorithms to solve the knight’s tour problem, a graph search is one of the easiest to understand and program.

Implement

solve the problem using two main steps:

  1. Represent the legal moves of a knight on a chessboard as a graph.
  2. Use a graph algorithm to find a path of length 𝑟𝑜𝑤𝑠×𝑐𝑜𝑙𝑢𝑚𝑛𝑠 −1 where every vertex on the graph is visited exactly once.

following two ideas:

  • Each square on the chessboard can be represented as a node in the graph.
  • Each legal move by the knight can be represented as an edge in the graph.

knightmoves

bigknight

  • the complete graph of possible moves on an eight-by-eight board.
  • There are exactly 336 edges in the graph.
  • the vertices corresponding to the edges of the board have fewer connections (legal moves) than the vertices in the middle of the board.
  • Once again we can see how sparse the graph is.
    • If the graph was fully connected there would be 4,096 edges.
    • Since there are only 336 edges,
    • the adjacency matrix would be only 8.2 percent full.

completeTour

  • what a complete tour around an eight-by-eight board looks like.
  • There are many possible tours; some are symmetric.
  • With some modification you can make circular tours that start and end at the same square.

Build the full graph for an n-by-n board

from pythonds.graphs import Graph


# The knightGraph function makes one pass over the entire board.
def knightGraph(bdSize):
    ktGraph = Graph()
    for row in range(bdSize):
       for col in range(bdSize):
          # At each square on the board the knightGraph function calls a helper, genLegalMoves,
          # to create a list of legal moves for that position on the board.
          # All legal moves are then converted into edges in the graph.
          newPositions = genLegalMoves(row,col,bdSize)
          nodeId = posToNodeId(row,col,bdSize)
          for e in newPositions:
            nid = posToNodeId(e[0],e[1],bdSize)
            ktGraph.addEdge(nodeId,nid)
    return ktGraph

# Another helper function, posToNodeId, converts a location on the board (a row and a column) into a linear vertex number.
def posToNodeId(row, column, board_size):
    return (row * board_size) + column


# The genLegalMoves takes the position of the knight on the board and generates each of the eight possible moves.
def genLegalMoves(x,y,bdSize):
    newMoves = []
    moveOffsets = [(-1,-2),(-1,2),(-2,-1),(-2,1),
                   ( 1,-2),( 1,2),( 2,-1),( 2,1)]
    for i in moveOffsets:
        newX = x + i[0]
        newY = y + i[1]
        if legalCoord(newX,bdSize) and legalCoord(newY,bdSize):
            newMoves.append((newX,newY))
    return newMoves

# The legalCoord helper function makes sure that a particular move that is generated is still on the board.
def legalCoord(x,bdSize):
    if x >= 0 and x < bdSize: return True
    else: return False

Implement: Depth first search (DFS)

  • the breadth first search algorithm builds a search tree one level at a time,
  • depth first search creates a search tree by exploring one branch of the tree as deeply as possible.

The depth first exploration of the graph is exactly what we need in order to find a path that has exactly 63 edges.

  • We will see that when the depth first search algorithm finds a dead end (a place in the graph where there are no more moves possible) it backs up the tree to the next deepest vertex that allows it to make a legal move.
from pythonds.graphs import Graph, Vertex

def knightTour(n,path,u,limit):
  # n, the current depth in the search tree;
  # path, a list of vertices visited up to this point;
  # u, the vertex in the graph we wish to explore;
  # limit, the number of nodes in the path (the goal).
  u.setColor('gray')
  path.append(u)

  # checks the base case condition.
  if n < limit:

   # If the path is not long enough
   # continue to explore one level deeper by choosing a new vertex to explore
   # and calling knightTour recursively for that vertex.
    nbrList = list(u.getConnections())
    i = 0
    done = False

    while i < len(nbrList) and not done:
      # DFS also uses colors to keep track of which vertices in the graph have been visited.
      # Unvisited vertices are colored white,
      # visited vertices are colored gray.
      if nbrList[i].getColor() == 'white':
        done = knightTour( n+1, path, nbrList[i], limit)
      i = i + 1
    # backtrack
    # If all neighbors of a particular vertex have been explored and we have not yet reached our goal length of 64 vertices, we have reached a dead end.
    # When we reach a dead end we must backtrack.
    # Backtracking happens when we return from knightTour with a status of False.
    if not done:
      path.pop()
      u.setColor('white')

  # If we have a path that contains 64 vertices
  # return from knightTour with a status of True,
  # indicating that we have found a successful tour.
  else:
      done = True
  return done

simple example of knightTour in action.

  • assume that the call to the getConnections method (inside knightTour) orders the nodes in alphabetical order.
  • We begin by calling knightTour(0,path,A,6)

ktdfsa

knightTour starts with node A

  • The nodes adjacent to A are B and D.
  • Since B is before D alphabetically, DFS selects B to expand next

ktdfsb

Exploring B happens when knightTour is called recursively.

  • B is adjacent to C and D, so knightTour elects to explore C next.

ktdfsc

  • However, node C is a dead end with no adjacent white nodes.
  • At this point we change the color of node C back to white.
  • The call to knightTour returns a value of False.
  • The return from the recursive call effectively backtracks the search to vertex B

ktdfsd

The next vertex on the list to explore is vertex D

  • knightTour makes a recursive call moving to node D
  • From vertex D on, knightTour can continue to make recursive calls until we get to node C again

ktdfse

ktdfsf

ktdfsg

ktdfsh

However, this time when we get to node C the test n < limit fails

  • so we know that we have exhausted all the nodes in the graph.
  • At this point we can return True to indicate that we have made a successful tour of the graph.
  • When we return the list, path has the values [A,B,D,E,F,C], which is the order we need to traverse the graph to visit each node exactly once.

Analysis

knightTour is very sensitive to the method used to select the next vertex to visit.

  • For example
    • a five-by-five board you can produce a path in about 1.5 seconds on a reasonably fast computer.
    • But for an eight-by-eight board
      • depending on the speed of your computer, you may have to wait up to a half hour to get the results!

8arrayTree

The reason for this is that the knight’s tour problem as we have implemented is an exponential algorithm of size 𝑂(𝑘^𝑁),

  • where N is the number of squares on the chess board, and k is a small constant.
  • The root of the tree represents the starting point of the search.
  • From there the algorithm generates and checks each of the possible moves the knight can make.
  • the number of moves possible depends on the position of the knight on the board.
  • In the corners there are only two legal moves, on the squares adjacent to the corners there are three and in the middle of the board there are eight.
  • The figure (moveCount) shows the number of moves possible for each position on the board.
  • At the next level of the tree there are once again between 2 and 8 possible next moves from the position we are currently exploring.
  • The number of possible positions to examine corresponds to the number of nodes in the search tree.

the number of nodes

  • the number of nodes in a binary tree of height N is 2^(𝑁+1)−1.
  • For a tree with nodes that may have up to eight children instead of two the number of nodes is much larger.
    • Because the branching factor of each node is variable,
    • estimate the number of nodes using an average branching factor.
  • this algorithm is exponential: 𝑘^(𝑁+1)−1,
    • where 𝑘 is the average branching factor for the board.
    • how rapidly this grows!
    • For a board that is 5x5
      • the tree will be 25 levels deep,
      • or N = 24 counting the first level as level 0.
      • The average branching factor is 𝑘=3.8, so the number of nodes in the search tree is 3.8^25−1, or about 3.12×10^14.
    • For a 6x6 board, 𝑘=4.4, there are 1.5×10^23 nodes,
    • for a 8x8 chess board, 𝑘=5.25, there are 1.3×10^46.
  • Of course, since there are multiple solutions to the problem we won’t have to explore every single node, but the fractional part of the nodes we do have to explore is just a constant multiplier which does not change the exponential nature of the problem.
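
As a quick sanity check on these estimates (my own sketch, not from the book), you can evaluate 𝑘^(𝑁+1) directly:

# Rough size of the knight's tour search tree, k**(N + 1), for the
# average branching factors quoted above.
for board, k in [(5, 3.8), (6, 4.4), (8, 5.25)]:
    n = board * board - 1       # depth of the tree, counting the first level as 0
    nodes = k ** (n + 1)
    print(f"{board}x{board}: about {nodes:.2e} nodes")
# 5x5: about 3.12e+14 nodes
# 6x6: about 1.46e+23 nodes
# 8x8: about 1.23e+46 nodes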

Improve: heuristic search: Warnsdorff’s algorithm

Luckily there is a way to speed up the eight-by-eight case so that it runs in under one second.

# In the listing below we show the code that speeds up the knightTour. This function will be used in place of the call to u.getConnections in the code previously shown above.
def orderByAvail(n):
    resList = []
    for v in n.getConnections():
        if v.getColor() == 'white':
            c = 0
            for w in v.getConnections():
                if w.getColor() == 'white':
                    c = c + 1
            resList.append((c,v))
    resList.sort(key=lambda x: x[0])
    return [y[1] for y in resList]

The critical line ensures that we select as the next vertex the one that has the fewest available moves.

  • why not select the node that has the most available moves? You can try that approach easily by running the program yourself and inserting the line resList.reverse() right after the sort.

The problem with using the vertex with the most available moves as your next vertex on the path is that

  • it tends to have the knight visit the middle squares early on in the tour.
  • When this happens it is easy for the knight to get stranded on one side of the board where it cannot reach unvisited squares on the other side of the board.

  • On the other hand, visiting the squares with the fewest available moves first pushes the knight to visit the squares around the edges of the board first. This ensures that the knight will visit the hard-to-reach corners early and can use the middle squares to hop across the board only when necessary.
  • Utilizing this kind of knowledge to speed up an algorithm is called a heuristic.
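
For concreteness, here is a sketch of how orderByAvail plugs into the knightTour function shown earlier; the only real change is the line that builds nbrList (the name knightTourWarnsdorff is mine, not the book's):

# Same depth first search as knightTour, but neighbors are tried in order
# of fewest onward moves (Warnsdorff's heuristic), using orderByAvail above.
def knightTourWarnsdorff(n, path, u, limit):
    u.setColor('gray')
    path.append(u)
    if n < limit:
        nbrList = orderByAvail(u)          # instead of list(u.getConnections())
        i = 0
        done = False
        while i < len(nbrList) and not done:
            if nbrList[i].getColor() == 'white':
                done = knightTourWarnsdorff(n + 1, path, nbrList[i], limit)
            i = i + 1
        if not done:                       # dead end: backtrack
            path.pop()
            u.setColor('white')
    else:
        done = True
    return done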

heuristic

  • Humans use heuristics every day to help make decisions,
  • heuristic searches are often used in the field of artificial intelligence.
  • This particular heuristic is called Warnsdorff’s algorithm, named after H. C. Warnsdorff who published his idea in 1823.

Graph Algorithms

dfsbfs

BFS notes:

  • level order (BFS, using queue)
  • time complexity: O(n)
  • space complexity:
    • best: O(1),
    • worst: O(n/2)=O(n)

DFS notes:

  • time complexity: O(n)
  • space complexity:
    • best: O(log n) - avg. height of tree
    • worst: O(n)
  • inorder (DFS: left, self, right)
  • postorder (DFS: left, right, self)
  • preorder (DFS: self, left, right)

Breadth First Search O(|V| + |E|)

Breadth First Search

  • a graph traversal algorithm which explores the neighbor nodes first, before moving to the next level neighbors

Time Complexity: O(|V| + |E|)


Depth First Search O(|V| + |E|)

The knight’s tour is a special case of a depth first search where the goal is to create the deepest depth first tree, without any branches.

The more general depth first search is actually easier.

  • Its goal is to search as deeply as possible, connecting as many nodes in the graph as possible and branching where necessary.

Depth First Search

  • a graph traversal algorithm
  • which explores as far as possible along each branch before backtracking

Time Complexity: O(|V| + |E|)

depth first forest

  • It is even possible that a depth first search will create more than one tree.
  • When the depth first search algorithm creates a group of trees we call this a depth first forest

depth first search

  • As with the breadth first search, depth first search makes use of predecessor links to construct the tree.
  • In addition, the depth first search will make use of two additional instance variables in the Vertex class.
  • The new instance variables are the discovery and finish times.
    • The discovery time tracks the number of steps in the algorithm before a vertex is first encountered.
    • The finish time is the number of steps in the algorithm before a vertex is colored black.
    • As we will see after looking at the algorithm, the discovery and finish times of the nodes provide some interesting properties we can use in later algorithms.

The code for our depth first search is shown in Listing 5. Since the two functions dfs and its helper dfsvisit use a variable to keep track of the time across calls to dfsvisit we chose to implement the code as methods of a class that inherits from the Graph class.

  • This implementation extends the graph class by adding a time instance variable and the two methods dfs and dfsvisit.
  • The dfs method iterates over all of the vertices in the graph, calling dfsvisit on the nodes that are white.
  • The reason we iterate over all the nodes, rather than simply searching from a chosen starting node, is to make sure that all nodes in the graph are considered and that no vertices are left out of the depth first forest.
  • It may look unusual to see the statement for aVertex in self, but remember that in this case self is an instance of the DFSGraph class, and iterating over all the vertices in an instance of a graph is a natural thing to do.
from pythonds.graphs import Graph
class DFSGraph(Graph):
    def __init__(self):
        super().__init__()
        self.time = 0

    def dfs(self):
        # iterates over all of the vertices in the graph
        # calling dfsvisit on the nodes that are white.
        for aVertex in self:
            aVertex.setColor('white')
            aVertex.setPred(-1)
        for aVertex in self:
            if aVertex.getColor() == 'white':
                self.dfsvisit(aVertex)

    def dfsvisit(self,startVertex):
        startVertex.setColor('gray')
        self.time += 1
        startVertex.setDiscovery(self.time)

        for nextVertex in startVertex.getConnections():
          # explores all of the neighboring white vertices as deeply as possible.
          if nextVertex.getColor() == 'white':
            nextVertex.setPred(startVertex)
            self.dfsvisit(nextVertex)
        startVertex.setColor('black')
        self.time += 1
        startVertex.setFinish(self.time)

compare it to breadth first search

  • the dfsvisit algorithm is almost identical to bfs
  • except that on the last line of the inner for loop,
    • dfsvisit calls itself recursively to continue the search at a deeper level,
    • whereas bfs adds the node to a queue for later exploration.
  • bfs uses a queue; dfsvisit uses a stack (implicitly, through recursion).

example

The following sequence of figures illustrates the depth first search algorithm in action for a small graph.

  • In these figures, the dotted lines indicate edges that are checked, but the node at the other end of the edge has already been added to the depth first tree. In the code this test is done by checking that the color of the other node is non-white.
  1. The search begins at vertex A of the graph
    • Since all of the vertices are white at the beginning of the search the algorithm visits vertex A.
    • The first step in visiting a vertex is to set the color to gray, which indicates that the vertex is being explored and the discovery time is set to 1.
    • Since vertex A has two adjacent vertices (B, D) each of those need to be visited as well.
    • We’ll make the arbitrary decision that we will visit the adjacent vertices in alphabetical order.

gendfsa

  2. Vertex B is visited next
    • so its color is set to gray and its discovery time is set to 2.
    • Vertex B is also adjacent to two other nodes (C, D)
    • so we will follow the alphabetical order and visit node C next.

gendfsb

  3. Visiting vertex C
    • brings us to the end of one branch of the tree.
    • After coloring the node gray and setting its discovery time to 3,
    • the algorithm also determines that there are no adjacent vertices to C.
    • This means that we are done exploring node C and so we can color the vertex black, and set the finish time to 4.

gendfsc

gendfsd

  4. Since vertex C was the end of one branch we now return to vertex B and continue exploring the nodes adjacent to B.
    • The only additional vertex to explore from B is D, so we can now visit D
    • and continue our search from vertex D.

gendfse

  5. Vertex D quickly leads us to vertex E

gendfsf

  6. Vertex E has two adjacent vertices, B and F.
    • explore these adjacent vertices alphabetically,
    • but since B is already colored gray
      • the algorithm recognizes that it should not visit B since doing so would put the algorithm in a loop!
    • So exploration continues with the next vertex in the list, namely F

gendfsg

  7. Vertex F has only one adjacent vertex, C,
    • but since C is colored black there is nothing else to explore, and the algorithm has reached the end of another branch.
    • From here on, the algorithm works its way back to the first node, setting finish times and coloring vertices black.

gendfsh

gendfsi

gendfsj

gendfsk

gendfsl

The starting and finishing times for each node display a property called the parenthesis property.

  • This property means that all the children of a particular node in the depth first tree have a later discovery time and an earlier finish time than their parent.

dfstree


Depth First Search Analysis

The general running time for depth first search is as follows.

  • The loops in dfs both run in 𝑂(𝑉), not counting what happens in dfsvisit, since they are executed once for each vertex in the graph.
  • In dfsvisit the loop is executed once for each edge in the adjacency list of the current vertex.
    • Since dfsvisit is only called recursively if the vertex is white, the loop will execute a maximum of once for every edge in the graph or 𝑂(𝐸).
    • So, the total time for depth first search is 𝑂(𝑉+𝐸).

Topological Sorting using Depth First Search (DFS)

To keep track of its progress,

  • the search (BFS above, and the DFS-based sort here) colors each of the vertices white, gray, or black.
  • All the vertices are initialized to white when they are constructed. A white vertex is an undiscovered vertex.
  • When a vertex is initially discovered it is colored gray,
  • and when the search has completely explored a vertex it is colored black.
  • This means that once a vertex is colored black, it has no white vertices adjacent to it.
  • A gray vertex may have some white vertices adjacent to it, indicating that there are still additional vertices to explore.
from pythonds.basic import Stack
from pythonds.graphs import Graph

def dfs_topo(g, vertextargetid):
    # Iterative DFS starting from one vertex; a vertex is pushed onto the
    # output stack only when it is *finished* (all descendants done), so
    # popping the output stack prints the vertices in topological order.
    vertextarget = g.getVertex(vertextargetid)

    stack_list = Stack()
    stack_list.push(vertextarget)
    output = Stack()
    while not stack_list.isEmpty():
        currentVert = stack_list.peek()
        if currentVert.getColor() == 'white':
            # first visit: mark it and stack its unvisited children
            currentVert.setColor('gray')
            for vertexs in currentVert.getConnections():
                if vertexs.getColor() == 'white':
                    stack_list.push(vertexs)
        else:
            # seen again: every child pushed above it has been finished
            stack_list.pop()
            if currentVert.getColor() == 'gray':
                currentVert.setColor('black')
                output.push(currentVert)
    for i in range(output.size()):
        print(output.pop().getId())


g = Graph()
g.addEdge('1', '2', 10)
g.addEdge('2', '3', 7)
g.addEdge('3', '4', 7)
g.addEdge('4', '5', 7)
g.addEdge('5', '6', 13)

dfs_topo(g, '1')

Topological Sorting O(|V| + |E|)

A simple but useful adaptation of a depth first search.

Topological Sort

  • the linear ordering of a directed graph’s nodes
    • takes a directed acyclic graph
    • produces a linear ordering of all its vertices
    • Intuitively, it “flattens” the graph so that, in the flattened picture, all the arrows point in the same direction.
    • Clearly, if a directed graph contains a cycle it cannot be topologically sorted, since there is no way to make all the arrows point the same way;
    • conversely, if a graph is a directed acyclic graph, a topological sort is always possible.
  • such that for every edge from node u to node v, u comes before v in the ordering
    • if the graph 𝐺 contains an edge (𝑣,𝑤)
    • then the vertex 𝑣 comes before the vertex 𝑤 in the ordering.
  • Time Complexity: O(|V| + |E|)

  • Directed acyclic graphs are used in many applications to indicate the precedence of events.
    • examples:
    • Making pancakes
    • software project schedules,
    • precedence charts for optimizing database queries,
    • and multiplying matrices.

stirring up a batch of pancakes:


  • The recipe is really quite simple: 1 egg, 1 cup of pancake mix, 1 tablespoon oil, and 3/4 cup of milk.
  • To make pancakes you must heat the griddle, mix all the ingredients together and spoon the mix onto a hot griddle.
  • When the pancakes start to bubble you turn them over and let them cook until they are golden brown on the bottom.
  • Before you eat your pancakes you are going to want to heat up some syrup.
  • The figure below illustrates this process as a graph.

pancakes

The problem is knowing what to do first.

  • To help us decide the precise order in which we should do each of the steps required to make our pancakes,
  • we turn to a graph algorithm called the topological sort.

The algorithm for the topological sort is as follows:

  • Call dfs(g) for some graph g. The main reason we want to call depth first search is to compute the finish times for each of the vertices.
  • Store the vertices in a list in decreasing order of finish time.
  • Return the ordered list as the result of the topological sort.
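
A minimal sketch of this recipe (my own helper, not from the book), building on the DFSGraph class shown earlier and the extended Vertex class’s getFinish method:

# Hypothetical helper: run dfs(), then return vertex ids in decreasing
# order of finish time, i.e. a topological order.
class TopoGraph(DFSGraph):
    def topoSort(self):
        self.dfs()                     # computes discovery and finish times
        verts = list(self)             # all vertices in the graph
        verts.sort(key=lambda v: v.getFinish(), reverse=True)
        return [v.getId() for v in verts]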

pancakesDFS

the results of applying the topological sort algorithm to our graph.

  • Now all the ambiguity has been removed and we know exactly the order in which to perform the pancake making steps.

pancakesTS


Strongly Connected Components

the graphs produced by the connections between hosts on the Internet and the links between web pages.

strongly connected components algorithm (SCC)

for web pages

Search engines like Google and Bing exploit the fact that the pages on the web form a very large directed graph.

  • To transform the World Wide Web into a graph, we will treat a page as a vertex, and the hyperlinks on the page as edges connecting one vertex to another.
  • The figure (cshome) shows a very small part of the graph produced by following the links from one page to the next, beginning at Luther College’s Computer Science home page. Of course, this graph could be huge, so we have limited it to web sites that are no more than 10 links away from the CS home page.

cshome

strongly connected components algorithm (SCC)

  • An algorithm that helps find clusters of highly interconnected vertices in a graph is the strongly connected components algorithm (SCC).
  • define a strongly connected component, 𝐶, of a graph 𝐺,
    • as the largest subset of vertices 𝐶⊂𝑉
    • such that for every pair of vertices 𝑣,𝑤∈𝐶 we have a path from 𝑣 to 𝑤 and a path from 𝑤 to 𝑣.
  • Figure 27 shows a simple graph with three strongly connected components. The strongly connected components are identified by the different shaded areas.

scc1

Once the strongly connected components have been identified we can show a simplified view of the graph by combining all the vertices in one strongly connected component into a single larger vertex.

The simplified version of the graph:

scc2

we can create a very powerful and efficient algorithm by making use of a depth first search.

  • Before we tackle the main SCC algorithm we must look at one other definition.
    • The transposition of a graph 𝐺 is defined as the graph 𝐺^𝑇 where all the edges in the graph have been reversed.
  • That is, if there is a directed edge from node A to node B in the original graph then 𝐺^𝑇 will contain an edge from node B to node A.

transpose1

transpose2

  • the graph G has two strongly connected components.
  • the graph 𝐺^𝑇 has the same two strongly connected components.

describe the algorithm to compute the strongly connected components for a graph.

  1. Call dfs for the graph 𝐺 to compute the finish times for each vertex.
  2. Compute 𝐺^𝑇.
  3. Call dfs for the graph 𝐺^𝑇 but in the main loop of DFS explore each vertex in decreasing order of finish time.
  4. Each tree in the forest computed in step 3 is a strongly connected component.
    • Output the vertex ids for each vertex in each tree in the forest to identify the component.
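
Step 2 is easy to express with the Graph class from earlier; a minimal sketch (the helper name transpose is mine):

from pythonds.graphs import Graph

def transpose(g):
    # Build G^T: same vertices, with every edge (v, w) reversed to (w, v).
    gt = Graph()
    for v in g:
        gt.addVertex(v.getId())
    for v in g:
        for w in v.getConnections():
            gt.addEdge(w.getId(), v.getId(), v.getWeight(w))
    return gt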

example

Let’s trace the operation of the steps described above on the example graph

the starting and finishing times computed for the original graph by the DFS algorithm.

scc1a

the starting and finishing times computed by running DFS on the transposed graph.

scc1b

Finally, we get the forest of three trees produced in step 3 of the strongly connected components algorithm.

sccforest


Shortest Path Problems

for host

Each router on the Internet is connected to one or more other routers. So if you run the traceroute command at different times of the day, you are likely to see that your information flows through different routers at different times. This is because there is a cost associated with each connection between a pair of routers that depends on the volume of traffic, the time of day, and many other factors.

represent the network of routers as a graph with weighted edges.

routeGraph


Dijkstra’s Algorithm O(|V|^2)

dijkstra

Dijkstra’s Algorithm

  • to determine the shortest path
  • an iterative algorithm that provides the shortest path from one particular starting node to all other nodes in the graph.
  • similar to the results of a breadth first search.
from pythonds.graphs import PriorityQueue, Graph, Vertex

def dijkstra(aGraph,start):
    pq = PriorityQueue()
    start.setDistance(0)
    pq.buildHeap([(v.getDistance(),v) for v in aGraph])
    while not pq.isEmpty():
        currentVert = pq.delMin()
        for nextVert in currentVert.getConnections():
            newDist = currentVert.getDistance() + currentVert.getWeight(nextVert)
            if newDist < nextVert.getDistance():
                nextVert.setDistance(newDist)
                nextVert.setPred(currentVert)
                pq.decreaseKey(nextVert,newDist)

To keep track of the total cost from the start node to each destination we will make use of the dist instance variable in the Vertex class.

  • The dist instance variable contains the current total weight of the smallest weight path from the start to the vertex in question.
  • The algorithm iterates once for every vertex in the graph; however, the order in which we iterate over the vertices is controlled by a priority queue.
  • The value that is used to determine the order of the objects in the priority queue is dist.
    • When a vertex is first created, dist is set to a very large number (effectively infinity).

Dijkstra’s algorithm uses a priority queue.

  • priority queue is based on the heap
  • There are a couple of differences between that simple implementation and the implementation we use for Dijkstra’s algorithm.

step

  • First, the PriorityQueue class stores tuples of key, value pairs.
    • the key in the priority queue must match the key of the vertex in the graph.
  • Secondly the value is used for deciding the priority, and thus the position of the key in the priority queue.
    • In this implementation we use the distance to the vertex as the priority because as we will see when we are exploring the next vertex, we always want to explore the vertex that has the smallest distance.
    • The second difference is the addition of the decreaseKey method.
      • used when the distance to a vertex that is already in the queue is reduced, and thus moves that vertex toward the front of the queue.
  1. We begin with the vertex 𝑢.
    • The three vertices adjacent to 𝑢 are 𝑣, 𝑤, 𝑥.
    • the initial distances to 𝑣,𝑤,𝑥 are all initialized to sys.maxsize,
    • the new costs to get to them through the start node are all their direct costs.
    • update the costs to each of these three nodes.
    • set the predecessor for each node to 𝑢 and we add each node to the priority queue.
    • We use the distance as the key for the priority queue.

dijkstraa

  2. In the next iteration of the while loop, examine the vertices that are adjacent to 𝑥.
    • The vertex 𝑥 is next because it has the lowest overall cost and therefore bubbled its way to the beginning of the priority queue.
    • its neighbors 𝑢,𝑣,𝑤 and 𝑦.
      • For each neighboring vertex,
      • check if the distance to that vertex through 𝑥 is smaller than the previously known distance.
    • this is the case for 𝑦 since its distance was sys.maxsize.
    • It is not the case for 𝑢 or 𝑣 since their distances are 0 and 2 respectively.
    • However, we now learn that the distance to 𝑤 is smaller if we go through 𝑥 than from 𝑢 directly to 𝑤.
    • update 𝑤 with a new distance and change the predecessor for 𝑤 from 𝑢 to 𝑥.

dijkstrab

  3. The next step is to look at the vertices neighboring 𝑣
    • This step results in no changes to the graph,

dijkstrac

  4. So we move on to node 𝑦.
    • At node 𝑦 we discover that it is cheaper to get to both 𝑤 and 𝑧,
    • so we adjust the distances and predecessor links accordingly.

dijkstrad

dijkstrae

dijkstraf

  5. Finally we check nodes 𝑤 and 𝑧
    • However, no additional changes are found
    • so the priority queue is empty
    • and Dijkstra’s algorithm exits.

note that Dijkstra’s algorithm works only when the weights are all positive.

  • You should convince yourself that if you introduced a negative weight on one of the edges to the graph that the algorithm would never exit.

We will note that to route messages through the Internet, other algorithms are used for finding the shortest path.

One of the problems with using Dijkstra’s algorithm on the Internet is that you must have a complete representation of the graph in order for the algorithm to run.

The implication of this is that every router has a complete map of all the routers in the Internet.

In practice this is not the case and other variations of the algorithm allow each router to discover the graph as they go.

One such algorithm that you may want to read about is called the “distance vector” routing algorithm.


Analysis of Dijkstra’s Algorithm
  • building the priority queue takes 𝑂(𝑉) time since we initially add every vertex in the graph to the priority queue.

  • Once the queue is constructed the while loop is executed once for every vertex, since vertices are all added at the beginning and only removed after that. Within that loop each call to delMin, takes 𝑂(log𝑉) time.

  • Taken together that part of the loop and the calls to delMin take 𝑂(𝑉log(𝑉)).

  • The for loop is executed once for each edge in the graph, and within the for loop each call to decreaseKey takes 𝑂(log𝑉) time, so the loop as a whole takes 𝑂(𝐸log(𝑉)).

  • So the combined running time is 𝑂((𝑉+𝐸)log(𝑉)).


broadcast host

bcast1

the broadcast host has some information that the listeners all need to receive.

The simplest solution
  • for the broadcast host to keep a list of all of the listeners and send individual messages to each.
  • four copies of every message would be sent.
  • Assuming that the least cost path is used
    • All messages from the broadcaster go through router A, so A sees all four copies of every message.
    • Router C sees only one copy of each message for its listener.
    • routers B and D would see three copies of every message since routers B and D are on the cheapest path for listeners 1, 2, and 3.
    • When you consider that the broadcast host must send hundreds of messages each second for a radio broadcast, that is a lot of extra traffic.
uncontrolled flooding
  • A brute force solution
  • for the broadcast host to send a single copy of the broadcast message and let the routers sort things out.
  • In this case, the easiest solution is a strategy called uncontrolled flooding.

  • The flooding strategy works as follows.
    • Each message starts with a time to live (ttl) value set to some number greater than or equal to the number of edges between the broadcast host and its most distant listener.
    • Each router gets a copy of the message and passes the message on to all of its neighboring routers.
    • When the message is passed on the ttl is decreased.
    • Each router continues to send copies of the message to all its neighbors until the ttl value reaches 0.
  • uncontrolled flooding generates many more unnecessary messages than our first strategy.

spanning tree

The solution to this problem lies in the construction of a minimum weight spanning tree.

  • minimum spanning tree 𝑇 for a graph 𝐺=(𝑉,𝐸):
    • 𝑇 is an acyclic subset of 𝐸 that connects all the vertices in 𝑉.
    • The sum of the weights of the edges in T is minimized.

bcast1

mst1

  • simplified version of the broadcast graph

  • highlights the edges that form a minimum spanning tree for the graph.

broadcast problem:

  • the broadcast host simply sends a single copy of the broadcast message into the network.
  • Each router forwards the message to any neighbor that is part of the spanning tree, excluding the neighbor that just sent it the message.
  • In this example
    • A forwards the message to B.
    • B forwards the message to D and C.
    • D forwards the message to E, which forwards it to F, which forwards it to G.
  • No router sees more than one copy of any message, and all the listeners that are interested see a copy of the message.

Prim’s algorithm O(|V|^2)

greedy algorithms: at each step we will choose the cheapest next step.

  • a greedy algorithm that finds a minimum spanning tree for a weighted undirected graph.
    • follow the edge with the lowest weight.
  • Prim’s finds a subset of edges that forms a tree that includes every vertex in the graph

The basic idea in constructing a spanning tree is as follows:

While T is not yet a spanning tree
   Find an edge that is safe to add to the tree
   Add the new edge to T

safe edge

  • any edge that connects a vertex that is in the spanning tree to a vertex that is not in the spanning tree.
  • This ensures that the tree will always remain a tree and therefore have no cycles.
import sys
from pythonds.graphs import PriorityQueue, Graph, Vertex

def prim(G,start):
    pq = PriorityQueue()
    for v in G:
        v.setDistance(sys.maxsize)
        v.setPred(None)
    start.setDistance(0)
    pq.buildHeap([(v.getDistance(),v) for v in G])
    while not pq.isEmpty():
        currentVert = pq.delMin()
        for nextVert in currentVert.getConnections():
          newCost = currentVert.getWeight(nextVert)
          if nextVert in pq and newCost<nextVert.getDistance():
              nextVert.setPred(currentVert)
              nextVert.setDistance(newCost)
              pq.decreaseKey(nextVert,newCost)

The following sequence of figures (Figure 11 through Figure 17) shows the algorithm in operation on our sample tree.

  1. We begin with the starting vertex as A.
    • The distances to all the other vertices are initialized to infinity.
    • Looking at the neighbors of A we can update distances to two of the additional vertices B and C
      • because the distances to B and C through A are less than infinite.
    • This moves B and C to the front of the priority queue.
    • Update the predecessor links for B and C by setting them to point to A.
    • It is important to note that we have not formally added B or C to the spanning tree yet.
    • A node is not considered to be part of the spanning tree until it is removed from the priority queue.
  2. Since B has the smallest distance we look at B next.
    • Examining B’s neighbors we see that D and E can be updated.
    • Both D and E get new distance values and their predecessor links are updated.
    • Moving on to the next node in the priority queue we find C.
  3. The only node C is adjacent to that is still in the priority queue is F,
    • thus we can update the distance to F
    • and adjust F’s position in the priority queue.
  4. Now we examine the vertices adjacent to node D.
    • We find that we can update E and reduce the distance to E from 6 to 4.
    • When we do this we change the predecessor link on E to point back to D, thus preparing it to be grafted into the spanning tree but in a different location.
    • The rest of the algorithm proceeds as you would expect, adding each new node to the tree.

Tarjan’s algorithm

Articulation point: a vertex in a connected undirected graph such that removing it (and its edges) disconnects the graph.


Step: for each vertex u, the DFS keeps a visited (discovery) time and a low value (see the sketch after this list).

  • visit u: increment the visit counter and record it as both u’s visited time and u’s low value
  • for each neighbor v of u:
    • if v is the vertex we came from (u’s DFS parent), skip it
    • if v has already been seen, update low(u) = min(low(u), visited(v))
    • if v has not been seen, recurse on v, then update low(u) = min(low(u), low(v))
      • if low(v) >= visited(u) and u is not the root of the DFS, u is an articulation point
  • the root of the DFS is an articulation point if it has more than one child in the DFS tree
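
A minimal recursive sketch of the idea above (my own code, using a plain adjacency-list dict rather than the Graph class):

def articulation_points(adj):
    # adj: dict mapping each vertex to a list of neighbors (undirected graph)
    visited = {}          # discovery time of each vertex
    low = {}              # lowest discovery time reachable from the subtree
    cut_vertices = set()
    counter = [0]

    def dfs(u, parent):
        counter[0] += 1
        visited[u] = low[u] = counter[0]
        children = 0
        for v in adj[u]:
            if v == parent:                      # the vertex we came from
                continue
            if v in visited:                     # already seen: back edge
                low[u] = min(low[u], visited[v])
            else:                                # tree edge: recurse
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if parent is not None and low[v] >= visited[u]:
                    cut_vertices.add(u)
        if parent is None and children > 1:      # rule for the DFS root
            cut_vertices.add(u)

    for u in adj:
        if u not in visited:
            dfs(u, None)
    return cut_vertices


# Example: removing 'C' disconnects 'D' from the rest of the graph.
adj = {'A': ['B', 'C'], 'B': ['A', 'C'], 'C': ['A', 'B', 'D'], 'D': ['C']}
print(articulation_points(adj))   # {'C'}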

Greedy Algorithms

  • algorithms that make locally optimal choices at each step in the hope of eventually reaching the globally optimal solution
  • Problems must exhibit two properties in order to implement a Greedy solution:
    • Optimal Substructure
      • An optimal solution to the problem contains optimal solutions to the given problem’s subproblems.
    • The Greedy Property
      • An optimal solution is reached by “greedily” choosing the locally optimal choice without ever reconsidering previous choices.
  • Example: Coin Change
  • Given a target amount V cents and a list of denominations of n coins, i.e. we have coinValue[i] (in cents) for coin types i from [0…n - 1], what is the minimum number of coins that we must use to represent amount V?
  • Assume that we have an unlimited supply of coins of any type Coins - Penny (1 cent), Nickel (5 cents), Dime (10 cents), Quarter (25 cents)
  • Assume V = 41. We can use the Greedy algorithm of continuously selecting the largest coin denomination less than or equal to V, subtract that coin’s value from V, and repeat.
V = 41 | 0 coins used
V = 16 | 1 coin used (41 - 25 = 16)
V = 6 | 2 coins used (16 - 10 = 6)
V = 1 | 3 coins used (6 - 5 = 1)
V = 0 | 4 coins used (1 - 1 = 0)

Using this algorithm, we arrive at a total of 4 coins which is optimal
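
A short sketch of this greedy procedure in code (the function name is mine):

def greedy_coin_change(V, denominations):
    # Repeatedly take the largest denomination that still fits into V.
    coins_used = []
    for coin in sorted(denominations, reverse=True):
        while V >= coin:
            V -= coin
            coins_used.append(coin)
    return coins_used

print(greedy_coin_change(41, [1, 5, 10, 25]))   # [25, 10, 5, 1] -> 4 coins

(Note that this greedy choice is only guaranteed to be optimal for certain denomination systems, such as the US coins used here.)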

Bitmasks

  • Bitmasking is a technique used to perform operations at the bit level.
  • Leveraging bitmasks often leads to faster runtime complexity and helps limit memory usage
  • Test kth bit: s & (1 << k);
  • Set kth bit: s |= (1 << k);
  • Turn off kth bit: s &= ~(1 << k);
  • Toggle kth bit: s ^= (1 << k);
  • Multiply by 2^n: s << n;
  • Divide by 2^n: s >> n;
  • Intersection: s & t;
  • Union: s | t;
  • Set Subtraction: s & ~t;
  • Extract lowest set bit: s & (-s);
  • Extract lowest unset bit: ~s & (s + 1);
  • Swap values: x ^= y; y ^= x; x ^= y;
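
A few of the operations above in Python, as a quick sanity check (bit positions counted from 0):

s = 0b10110          # the set {1, 2, 4}
k = 3

print(bool(s & (1 << k)))   # False: bit 3 is not set
s |= (1 << k)               # set bit 3        -> 0b11110
s &= ~(1 << 1)              # turn off bit 1   -> 0b11100
s ^= (1 << 0)               # toggle bit 0     -> 0b11101
print(bin(s))               # 0b11101
print(bin(s & (-s)))        # 0b1: the lowest set bit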

other Algorithm

Bellman-Ford Algorithm

bellman-ford

Bellman-Ford Algorithm

  • an algorithm that computes the shortest paths from a single source node to all other nodes in a weighted graph

  • slower than Dijkstra’s
  • but more versatile, as it is capable of handling graphs in which some of the edge weights are negative numbers

Time Complexity:

  • Best Case: O(E)
  • Worst Case: O(V E)
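
A compact sketch of Bellman-Ford (my own function, working on an edge list rather than the Graph class):

def bellman_ford(num_vertices, edges, source):
    # edges: list of (u, v, weight) tuples; vertices are 0..num_vertices-1.
    INF = float('inf')
    dist = [INF] * num_vertices
    dist[source] = 0
    # Relax every edge |V| - 1 times.
    for _ in range(num_vertices - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # One more pass: any further improvement means a negative-weight cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("graph contains a negative-weight cycle")
    return dist

edges = [(0, 1, 4), (0, 2, 5), (1, 2, -3), (2, 3, 4)]
print(bellman_ford(4, edges, 0))   # [0, 4, 1, 5]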

Floyd-Warshall Algorithm

Floyd-Warshall Algorithm

  • an algorithm for finding the shortest paths in a weighted graph with positive or negative edge weights,
  • but no negative cycles
  • A single execution of the algorithm will find the lengths (summed weights) of the shortest paths between all pairs of nodes

Time Complexity:

  • Best Case: O(V^3)
  • Worst Case: O(V^3)
  • Average Case: O(V^3)
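
A minimal Floyd-Warshall sketch (my own code), working directly on a matrix of edge weights:

def floyd_warshall(dist):
    # dist: n x n matrix; dist[i][j] is the weight of edge i -> j,
    # float('inf') if there is no edge, and 0 on the diagonal.
    n = len(dist)
    d = [row[:] for row in dist]           # do not modify the input
    for k in range(n):                     # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

INF = float('inf')
graph = [[0, 3, INF], [INF, 0, 1], [2, INF, 0]]
print(floyd_warshall(graph))   # [[0, 3, 4], [3, 0, 1], [2, 5, 0]]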

Kruskal’s Algorithm O(|E|log|V|)

Kruskal’s Algorithm

  • a greedy algorithm that finds a minimum spanning tree in a graph.
  • However, in Kruskal’s, the graph does not have to be connected.

Time Complexity: O(E log V)
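
A compact Kruskal’s sketch (my own code), using a simple union-find structure to detect cycles:

def kruskal(num_vertices, edges):
    # edges: list of (weight, u, v); vertices are 0..num_vertices-1.
    parent = list(range(num_vertices))

    def find(x):                      # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):     # consider edges cheapest first
        ru, rv = find(u), find(v)
        if ru != rv:                  # different components: safe to add
            parent[ru] = rv
            mst.append((u, v, w))
    return mst

edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (7, 2, 3)]
print(kruskal(4, edges))   # [(0, 1, 1), (1, 2, 2), (2, 3, 7)]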


This post is licensed under CC BY 4.0 by the author.
