Anyone or anything that can search effectively will eventually make a major impact on its environment. For a prime example just look at Google -- from a simple search engine came one of the largest companies in the world of IT. So here, in this article, we will discuss a good search algorithm for graphs. What importance may it yield in the future?
Contributed by Gabor Bernat Rating: / 10 May 12, 2009
Last time I presented the breadth-first search and as I told you then, I am continuing this series of articles about graphs with the second main search algorithm: the depth-first search. Our objective is to visit all points of the graph by starting from a root item and traveling through the edges.
During this article I will focus just as much on the application of the depth-first search algorithm as on the technique itself, so beyond the search itself we are going to solve issues like determining if there is a circle in the graph, the basic circle system of a graph, and categorizing the edges according the depth-first search.
We will sort the tree into a topological order; during a future article, we will find the strong connected components of a graph, and the disconnecting set. These are those edges that, if we remove them, we will increase the number of strongly connected components in the graph (articulation edges). In addition, of course, we will look at the bond, which is the same, but for a set of vertexes (articulation vertexes).
The depth first search's short description is as follows. Visit the root item. Visit the child of that item, and follow on with the child of that item and so on until we find the "bottom" of the graph. Now step back and visit the second child of the previous item, and so on.
The main difference between this and the breadth-first search is that, while the breadth-first search will first visit all the neighbors of the root item and continue only after doing this, the DFS will visit a vertex and immediately step ahead to the child of that item if it is possible. Otherwise, it steps back and tries to do the same with the second vertex, after that with the third (if it exists of course) until it has visited all of the neighbor edges. When we arrive at a child item, the same plan is applied.
This method reminds us of recursive iteration throughout the graph. As opposed to the BFS, the DFS is harder to comprehend, but easier to write. Again, for the best performance, the right way to store the graph inside of memory is with the adjacency list. The search can be wrapped up in just the following few lines:
void DFS(int vertex)
{
printf(" %d " , vertex);
pListIt p;
color[vertex] = GRAY; // visited
for (p = adjacencyList[vertex]; p != NULL; p = p -> p_next)
if (!color[p -> value])
{
DFS(p -> value);
}
color[vertex] = BLACK;
}
For those of you who skipped my article about the breadth first search, I need to explain that to better simulate the state of the vertex I made a definition that a vertex is white if it has not yet been visited, gray if it is currently being visited, and black if it has been visited. Therefore, all we need to do is call the search for a vertex; if we find a neighbor that has not yet been visited, call the DFS for that vertex also. Let the recursion handle the rest.
You may observe that the items are visited in the FIFO (First in First Out) order that is simulated by a stack. Yet we do not need to write the stack data structure ourselves, as the recursion will also implement this feature. This is why it is a little simpler to write a DFS than a BFS, where we need to manually simulate the queue. Nevertheless, you may write a non-recursive DFS (implemented with loops) where you need to use the stack also.
The order in which we drop by the vertexes will build the DFS search of the graph. As for the categorization of the edges, first we should know what kind of edges exist, so let us look at a little citation from my article about the BFS:
"A tree edge means that the edge will become a part of the search tree (we traveled ahead on it inside the graph or found another vertex through it). Back-pointing edges are the ones that will point to a node already black (it points to an earlier generation vertex). Cross-pointing edges are gray, and their distance from the root is less than the distance from the root of the currently visited node."
The DFS puts each edge in one of these four categories when it first observes its existence during the search. As we visit only to the edges that we have not yet visited (remember, visited edges are black), it is logical that our search will build a tree. This has all the edges through which we found another vertex. Additionally, we can classify the other edges as I explain in the next paragraph.
Let us be at a vertex u. Moreover, we visit all of its neighbors, which we mark here with letter v. If v is white, the edge is a tree edge; if it is gray, it will be a back-pointing edge. If it is black, there are two cases. We measure the distance of the vertexes from the root, and if the distance of vertex v is larger than of vertex u we have an ahead-pointing edge, otherwise it will be a cross-pointing edge.
The DFS search should be called first on the root item, and further on with all of the non-visited vertexes, in this way assuring that the entire tree will be traveled. This happens when you have a graph that is not connected, like a forest (two individual trees, for example). Here is the code snippet that makes all this happen in our program; we just need to extend the DFS a little:
void DFS(int vertex)
{
printf(" %d ", vertex);
pListIt p;
color[vertex] = GRAY; // visited
for (p = adjacencyList[vertex]; p != NULL; p = p -> p_next)
if (!color[p -> value])
{
ancient[p->value] = vertex;
distance[p->value] = distance[vertex] +1;
p->type = TREE_EDGE;
DFS(p -> value);
}
else
{
if( color[p->value] == GRAY)
{
p->type = BACK_EDGE;
}
else
if (distance[p->value] > distance[vertex])
{
p->type = AHEAD_EDGE ;
}
else
p->type = CROSS_EDGE ;
}
color[vertex] = BLACK;
}
Here is a picture of a graph during the DFS. The search just arrived at node 11. The edges that will form the tree are marked with a solid line; the edges that are back-pointing are dashed lines; the cross edges are marked with dotted lines; and finally, the ahead-pointing edges show a dash-and-dot line.
-->Image Courtesy of Kátai Zoltán- Introduction to Graphs<--
The topological sorting is all about constructing an order of the vertexes which will make sure that, by drawing the vertexes from left to right, all of the edges will be directed in the right direction. Now from this we can conclude that this kind of order is possible only out of a directed and circle-free graph.
This order is in fact the inverse order of the DFS tree, or in other words the inverse order of how the vertexes become black. What assures that this will be correct ? Whenever we step forward from a node and mark it as visited fully (painted black), all of its children are already visited; otherwise, we would not leave that point.
Therefore, if we insert it now inside the list, all of the vertexes that stand at the other end of the out-going edges are already visited, and are on the left side of this item in the DFS order. From this we conclude that, in the inverse of this order, they will be on the right side.
In code snippet terms, we only need to add a couple of lines after we mark an item as black (fully visited). We use a global variable to count the number of already-inserted items into the order; nevertheless, we could also inverse the order after the search:
// add to the topological order
++atPos;
topological_order[n-atPos] = vertex;
-->Image Courtesy of Kátai Zoltán- Introduction to Graphs<--
The execution of the application will print for us similar result to the upper image:
For all this and something extra, I've provided a little C file you can study. Inside it you will find the solution for the determination of the strong connecting components, the base circle system and the articulation edges and vertexes as a bonus to what we've learned so far.
If you do not understand any of this, do not worry, as we will look over those items in detail during the next couple of articles related to graphs. Feel free to explore them until next time, when I will be back for another lesson on graphs.
Thank you for investing the time to read my article and taking the effort to improve yourself. I want to encourage you to rate my article accordingly and to post your ideas and questions related to your experience with graphs here on the blog, or over on our friendly forum running under the name of DevHardware. Until next time, remember to Live With Passion!