namespace tf {
/** @page DependentAsyncTasking Asynchronous Tasking with Dependencies
This chapter discusses how to create a task graph dynamically
using asynchronous tasks,
which is beneficial for workloads that need to
(1) explore task graph parallelism out of dynamic control flow
or
(2) overlap task graph creation time with individual task execution time.
We recommend that you first read @ref AsyncTasking before reading this chapter.
@tableofcontents
@section CreateADynamicTaskGraph Create a Dynamic Task Graph
When the construct-and-run model of a task graph is not possible in your application,
you can use tf::Executor::dependent_async and tf::Executor::silent_dependent_async
to create a task graph dynamically.
This type of parallelism is also known as on-the-fly task graph parallelism,
which lets you add tasks and their dependencies as your program discovers them at runtime.
The example below dynamically creates a task graph of
four dependent async tasks, @c A, @c B, @c C, and @c D, where @c A runs before @c B and @c C
and @c D runs after @c B and @c C:
@dotfile images/simple.dot
@code{.cpp}
tf::Executor executor;
tf::AsyncTask A = executor.silent_dependent_async([](){ printf("A\n"); });
tf::AsyncTask B = executor.silent_dependent_async([](){ printf("B\n"); }, A);
tf::AsyncTask C = executor.silent_dependent_async([](){ printf("C\n"); }, A);
auto [D, fuD] = executor.dependent_async([](){ printf("D\n"); }, B, C);
fuD.get(); // wait for D to finish, which in turn means A, B, C have finished
@endcode
Both tf::Executor::dependent_async and tf::Executor::silent_dependent_async
create a task of type tf::AsyncTask to run the given function asynchronously.
Additionally, tf::Executor::dependent_async returns a @std_future
that eventually holds the result of the execution.
When either call returns, the executor has scheduled a worker
to run the task as soon as its dependencies are met.
That is, task execution happens @em simultaneously
with the creation of the task graph, which is different from constructing a %Taskflow
and running it from an executor, illustrated in the figure below:
@image html images/dependent_async_execution_diagram.png
Because a task can only depend on tasks that have already been created,
you must create the tasks in a valid topological order of the graph.
In our example, there are only two possible topological orderings,
either @c ABCD or @c ACBD.
The code below shows another feasible order of expressing this
dynamic task graph parallelism:
@code{.cpp}
tf::Executor executor;
tf::AsyncTask A = executor.silent_dependent_async([](){ printf("A\n"); });
tf::AsyncTask C = executor.silent_dependent_async([](){ printf("C\n"); }, A);
tf::AsyncTask B = executor.silent_dependent_async([](){ printf("B\n"); }, A);
auto [D, fuD] = executor.dependent_async([](){ printf("D\n"); }, B, C);
fuD.get(); // wait for D to finish, which in turn means A, B, C have finished
@endcode
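To make the ordering requirement concrete, the sketch below checks whether a proposed creation order
places every task after all of its predecessors.
It uses only the C++ standard library and is not part of the tf::Executor interface.
The names @c is_valid_creation_order and @c example_edges are hypothetical,
and the check assumes each task appears exactly once in the order:

```cpp
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Returns true if every task in `order` appears after all of its
// predecessors. `edges` holds (predecessor, successor) pairs.
bool is_valid_creation_order(
  const std::string& order,
  const std::vector<std::pair<char, char>>& edges
) {
  std::unordered_set<char> created;
  for (char task : order) {
    for (const auto& edge : edges) {
      // A task cannot name a dependency that has not been created yet.
      if (edge.second == task && created.count(edge.first) == 0) {
        return false;
      }
    }
    created.insert(task);
  }
  return true;
}

// Dependencies of the example graph: A -> B, A -> C, B -> D, C -> D.
const std::vector<std::pair<char, char>> example_edges = {
  {'A', 'B'}, {'A', 'C'}, {'B', 'D'}, {'C', 'D'}
};
```

For the example graph, this check accepts @c ABCD and @c ACBD and rejects orders such as @c ABDC, where @c D would be created before @c C.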
In addition to using @std_future to synchronize the execution,
you can use tf::Executor::wait_for_all to wait for all scheduled tasks
to finish:
@code{.cpp}
tf::Executor executor;
tf::AsyncTask A = executor.silent_dependent_async([](){ printf("A\n"); });
tf::AsyncTask B = executor.silent_dependent_async([](){ printf("B\n"); }, A);
tf::AsyncTask C = executor.silent_dependent_async([](){ printf("C\n"); }, A);
tf::AsyncTask D = executor.silent_dependent_async([](){ printf("D\n"); }, B, C);
executor.wait_for_all();
@endcode
@section SpecifyARagneOfDependentAsyncTasks Specify a Range of Dependent Async Tasks
Both tf::Executor::dependent_async(F&& func, Tasks&&... tasks) and
tf::Executor::silent_dependent_async(F&& func, Tasks&&... tasks)
accept an arbitrary number of tasks in the dependency list.
If the number of dependent tasks is not known until runtime,
for example when it depends on a runtime variable,
you can use the following two overloads
to specify dependent tasks in an iterable range [first, last):
+ tf::Executor::dependent_async(F&& func, I first, I last)
+ tf::Executor::silent_dependent_async(F&& func, I first, I last)
The code below creates an asynchronous task that depends on
@c N previously created asynchronous tasks stored in a vector,
where @c N is a runtime variable:
@code{.cpp}
tf::Executor executor;
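std::vector<tf::AsyncTask> dependents;
for(size_t i=0; i<N; i++) {
  dependents.push_back(executor.silent_dependent_async([](){}));
}
executor.silent_dependent_async([](){}, dependents.begin(), dependents.end());
executor.wait_for_all();
@endcode
A dependent-async task can also create further dependent-async tasks from within its own body.
The example below computes the 11th Fibonacci number recursively:
each call creates two tasks, queries their completion with tf::AsyncTask::is_done,
and calls tf::Executor::corun_until so that the calling worker keeps running other tasks instead of blocking:
@code{.cpp}
tf::Executor executor;
std::function<int(int)> fibonacci;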
// calculate the Fibonacci sequence: 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89
fibonacci = [&](int N){
  if (N < 2) {
    return N;
  }
  auto [t1, fu1] = executor.dependent_async(std::bind(fibonacci, N-1));
  auto [t2, fu2] = executor.dependent_async(std::bind(fibonacci, N-2));
  // keep this worker running other tasks until both subtasks have finished
  executor.corun_until([&](){ return t1.is_done() && t2.is_done(); });
  return fu1.get() + fu2.get();
};
auto [task, fib11] = executor.dependent_async(std::bind(fibonacci, 11));
assert(fib11.get() == 89);  // the 11-th Fibonacci number is 89
@endcode
*/
}