namespace tf {

/** @page graphtraversal Graph Traversal

We study the graph traversal problem by visiting the vertices of a graph in parallel
while respecting the dependencies given by its edges.
Graph traversal is a fundamental building block of many graph applications,
especially large-scale graph analytics.

@tableofcontents

@section GraphTraversalProblemFormulation Problem Formulation

Given a directed acyclic graph (DAG), i.e., a directed graph that has no cycles,
we would like to visit every vertex without violating the dependency constraints defined by the edges.
The following figure shows a graph of six vertices and seven edges.
Each vertex represents a particular task and each edge represents a dependency between two tasks.

@dotfile images/task-level-parallelism.dot

When traversing the above graph in parallel, the maximum parallelism we can achieve is three.
When Task1 finishes, we can run Task2, Task3, and Task4 in parallel.

@section GraphTraversalGraphRepresentation Graph Representation

We define the data structure of our graph.
The graph is represented by an array of nodes of the following structure:

@code{.cpp}
struct Node {
  std::string name;
  size_t idx;                          // index of the node in the array
  bool visited {false};

  std::atomic<size_t> dependents {0};  // number of incoming edges
  std::vector<Node*> successors;       // nodes reached by outgoing edges

  // add an edge from this node to n
  void precede(Node& n) {
    successors.emplace_back(&n);
    n.dependents ++;
  }
};
@endcode

Based on the data structure, we randomly generate a DAG using ordered edges.
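To make the role of the `dependents` counter concrete, here is a minimal sequential sketch that does not use Taskflow.
It assumes a simplified, index-based node type; `SimpleNode`, `add_edge`, and `sequential_order` are hypothetical names introduced only for this illustration.
A node becomes ready exactly when its counter of unfinished predecessors drops to zero.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical index-based node that mirrors the counters of Node above.
struct SimpleNode {
  std::size_t dependents {0};           // number of incoming edges
  std::vector<std::size_t> successors;  // indices reached by outgoing edges
};

// Add an edge from -> to and bump the in-degree of the target.
inline void add_edge(std::vector<SimpleNode>& g, std::size_t from, std::size_t to) {
  assert(from < g.size() && to < g.size());
  g[from].successors.push_back(to);
  g[to].dependents++;
}

// Visit nodes in dependency order. The graph is taken by value so the
// caller's counters stay intact. The result contains every node only if
// the graph is acyclic.
inline std::vector<std::size_t> sequential_order(std::vector<SimpleNode> g) {
  std::vector<std::size_t> order;
  std::queue<std::size_t> ready;

  // source nodes have no incoming edges and are ready immediately
  for (std::size_t i = 0; i < g.size(); ++i) {
    if (g[i].dependents == 0) ready.push(i);
  }

  while (!ready.empty()) {
    std::size_t u = ready.front();
    ready.pop();
    order.push_back(u);
    // finishing u releases one dependency of each successor
    for (std::size_t s : g[u].successors) {
      if (--g[s].dependents == 0) ready.push(s);
    }
  }
  return order;
}
```

The parallel versions on this page follow the same counting idea, but they let the runtime decide which ready node runs on which core.
The generator for the random DAG used in the rest of this page follows.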
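@code{.cpp}
std::unique_ptr<Node[]> make_dag(size_t num_nodes, size_t max_degree) {

  std::unique_ptr<Node[]> nodes(new Node[num_nodes]);

  // Make sure nodes are in clean state
  for(size_t i=0; i<num_nodes; i++) {
    nodes[i].idx = i;
    nodes[i].name = std::to_string(i);
  }

  // Assumption: an "ordered" edge goes from a lower index to a higher index,
  // which keeps the graph acyclic; each candidate edge is kept at random
  // until the node has max_degree outgoing edges
  for(size_t i=0; i<num_nodes; i++) {
    size_t degree {0};
    for(size_t j=i+1; j<num_nodes && degree < max_degree; j++) {
      if(std::rand() % 2 == 1) {
        nodes[i].precede(nodes[j]);
        degree++;
      }
    }
  }

  return nodes;
}
@endcode

@section GraphTraversalStaticTraversal Static Traversal

We create one task per node and mirror the edges of the graph as task dependencies.

@code{.cpp}
tf::Taskflow taskflow;
tf::Executor executor;

const size_t num_nodes = 100000;
std::unique_ptr<Node[]> nodes = make_dag(num_nodes, 4);
std::vector<tf::Task> tasks;

// create the traversal task for each node
for(size_t i=0; i<num_nodes; ++i) {
  tf::Task task = taskflow.emplace([v=&(nodes[i])](){
    v->visited = true;
    for(size_t j=0; j<v->successors.size(); ++j) {
      v->successors[j]->dependents.fetch_sub(1);
    }
  }).name(nodes[i].name);
  tasks.push_back(task);
}

// create the dependency between nodes on top of the graph structure
for(size_t i=0; i<num_nodes; ++i) {
  for(size_t j=0; j<nodes[i].successors.size(); ++j) {
    tasks[i].precede(tasks[nodes[i].successors[j]->idx]);
  }
}

executor.run(taskflow).wait();

// after the graph is traversed, all nodes must be visited with no dependents
for(size_t i=0; i<num_nodes; ++i) {
  assert(nodes[i].visited);
  assert(nodes[i].dependents == 0);
}
@endcode

@dotfile images/graph_traversal_2.dot

With task parallelism, computation flows naturally along the graph structure.
The runtime autonomously distributes tasks across processor cores to obtain maximum task parallelism.
You do not need to worry about the details of scheduling.

@section GraphTraversalDynamicTraversal Dynamic Traversal

We can traverse the graph dynamically using tf::Subflow (see @ref SubflowTasking).
We start from the source nodes, which have no incoming edges,
and recursively spawn subflow tasks whenever all dependencies of a node are met.
Since we are creating tasks from the execution context of another task,
we need to store the task callable in advance.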
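The two ingredients of the dynamic version can be sketched without Taskflow.
The following is a single-threaded illustration under stated assumptions: `Adjacency` and `recursive_order` are hypothetical names, and a direct recursive call stands in for spawning a subflow task.
It shows why the callable is declared as a `std::function` before it is assigned (the lambda has to refer to itself), and how `fetch_sub(1) == 1` identifies the predecessor that releases the last dependency of a node, because `fetch_sub` returns the value held before the subtraction.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical index-based graph: successors[u] lists the nodes reached from u.
using Adjacency = std::vector<std::vector<std::size_t>>;

inline std::vector<std::size_t> recursive_order(const Adjacency& successors) {
  const std::size_t n = successors.size();

  // count incoming edges of every node
  std::vector<std::atomic<std::size_t>> dependents(n);
  for (auto& d : dependents) d.store(0);
  for (const auto& out : successors) {
    for (std::size_t s : out) dependents[s].fetch_add(1);
  }

  // collect the source nodes before the traversal starts changing counters
  std::vector<std::size_t> sources;
  for (std::size_t i = 0; i < n; ++i) {
    if (dependents[i].load() == 0) sources.push_back(i);
  }

  std::vector<std::size_t> order;

  // declared first so that the lambda below can capture and call itself
  std::function<void(std::size_t)> traverse;
  traverse = [&](std::size_t u) {
    order.push_back(u);
    for (std::size_t s : successors[u]) {
      // the previous value is 1 only for the last predecessor to finish,
      // so each node is continued exactly once
      if (dependents[s].fetch_sub(1) == 1) {
        traverse(s);  // stands in for spawning a subflow task
      }
    }
  };

  for (std::size_t s : sources) traverse(s);
  return order;
}
```

This sketch recurses on the call stack, so it is only suitable for small graphs.
The Taskflow version that follows replaces the direct call with a task spawned through tf::Subflow.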
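@code{.cpp}
tf::Taskflow taskflow;
tf::Executor executor;

// task callable of traversing a node using subflow
std::function<void(Node*, tf::Subflow&)> traverse;

traverse = [&] (Node* n, tf::Subflow& subflow) {
  assert(!n->visited);
  n->visited = true;
  for(size_t i=0; i<n->successors.size(); i++) {
    // only the last finishing predecessor observes the value 1
    if(n->successors[i]->dependents.fetch_sub(1) == 1) {
      subflow.emplace([s=n->successors[i], &traverse](tf::Subflow &subflow){
        traverse(s, subflow);
      }).name(n->successors[i]->name);
    }
  }
};

// create a graph
const size_t num_nodes = 100000;
std::unique_ptr<Node[]> nodes = make_dag(num_nodes, 4);

// find the source nodes (no incoming edges)
std::vector<Node*> src;
for(size_t i=0; i<num_nodes; i++) {
  if(nodes[i].dependents == 0) {
    src.emplace_back(&(nodes[i]));
  }
}

// Assumption: one task is created per source node, and the run is
// validated in the same way as the static version
for(size_t i=0; i<src.size(); i++) {
  taskflow.emplace([s=src[i], &traverse](tf::Subflow& subflow){
    traverse(s, subflow);
  }).name(src[i]->name);
}

executor.run(taskflow).wait();

// after the graph is traversed, all nodes must be visited with no dependents
for(size_t i=0; i<num_nodes; i++) {
  assert(nodes[i].visited);
  assert(nodes[i].dependents == 0);
}
@endcode

@dotfile images/graph_traversal_1.dot

In general, the dynamic version of graph traversal is slower than the static version
due to the overhead incurred by spawning subflows.
However, it can be useful when the graph structure is not known up front
and is only discovered as the traversal proceeds.

*/

}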