Executor

After you create a task dependency graph, you need to submit it to threads for execution. This chapter shows how to execute a task dependency graph.

Create an Executor

To execute a taskflow, you need to create an executor of type tf::Executor. An executor is a thread-safe object that manages a set of worker threads and executes tasks through an efficient work-stealing algorithm. Issuing a call to run a taskflow creates a topology, a data structure that keeps track of the execution status of a running graph. The constructor of tf::Executor takes an unsigned integer N and creates N worker threads. The default value is std::thread::hardware_concurrency.

    tf::Executor executor1;     // create an executor with the number of workers
                                // equal to std::thread::hardware_concurrency
    tf::Executor executor2(4);  // create an executor of 4 worker threads

An executor can be reused to execute multiple taskflows. In most workloads, a single executor is enough to run multiple taskflows, where each taskflow represents a part of a parallel decomposition.

Execute a Taskflow

tf::Executor provides a set of run_* methods, tf::Executor::run, tf::Executor::run_n, and tf::Executor::run_until, to run a taskflow once, multiple times, or until a given predicate evaluates to true. All methods accept an optional callback to invoke after the execution completes, and return a tf::Future for users to access the execution status. The code below shows several ways to run a taskflow.
     1: // Declare an executor and a taskflow
     2: tf::Executor executor;
     3: tf::Taskflow taskflow;
     4:
     5: // Add three tasks into the taskflow
     6: tf::Task A = taskflow.emplace([] () { std::cout << "This is TaskA\n"; });
     7: tf::Task B = taskflow.emplace([] () { std::cout << "This is TaskB\n"; });
     8: tf::Task C = taskflow.emplace([] () { std::cout << "This is TaskC\n"; });
     9:
    10: // Build precedence between tasks
    11: A.precede(B, C);
    12:
    13: tf::Future<void> fu = executor.run(taskflow);
    14: fu.wait();  // block until the execution completes
    15:
    16: executor.run(taskflow, [] () { std::cout << "end of 1 run"; }).wait();
    17: executor.run_n(taskflow, 4);
    18: executor.wait_for_all();  // block until all associated executions finish
    19: executor.run_n(taskflow, 4, [] () { std::cout << "end of 4 runs"; }).wait();
    20: executor.run_until(taskflow, [cnt=0] () mutable { return ++cnt == 10; });

Debrief:

Lines 6-8 create a taskflow of three tasks A, B, and C
Lines 13-14 run the taskflow once and wait for completion
Line 16 runs the taskflow once with a callback to invoke when the execution finishes
Lines 17-18 run the taskflow four times and use tf::Executor::wait_for_all to wait for completion
Line 19 runs the taskflow four times and invokes a callback at the end of the fourth execution
Line 20 keeps running the taskflow until the predicate returns true

Issuing multiple runs on the same taskflow automatically synchronizes them into a sequential chain of executions, in the order of the run calls.

    executor.run(taskflow);        // execution 1
    executor.run_n(taskflow, 10);  // execution 2
    executor.run(taskflow);        // execution 3
    executor.wait_for_all();       // execution 1 -> execution 2 -> execution 3

A running taskflow must remain alive during its execution. It is your responsibility to ensure that a taskflow is not destroyed while it is running. For example, the code below can result in undefined behavior.

    tf::Executor executor;  // create an executor

    // create a taskflow whose lifetime is restricted by the scope
    {
      tf::Taskflow taskflow;

      // add tasks to the taskflow
      // ...

      // run the taskflow
      executor.run(taskflow);

    }  // leaving the scope will destroy taskflow while it is running,
       // resulting in undefined behavior

Similarly, you should avoid touching a taskflow while it is running.

    tf::Taskflow taskflow;

    // Add tasks into the taskflow
    // ...

    // Declare an executor
    tf::Executor executor;

    tf::Future<void> future = executor.run(taskflow);  // non-blocking return

    // altering the taskflow while it is running leads to undefined behavior
    taskflow.emplace([] () { std::cout << "Add a new task\n"; });

You must always keep a taskflow alive and must not modify it while it is running on an executor.

Execute a Taskflow with Transferred Ownership

You can transfer the ownership of a taskflow to an executor and run it without having to manage the lifetime of that taskflow yourself. Each run_* method discussed in the previous section comes with an overload that takes a moved taskflow object.

    tf::Taskflow taskflow;
    tf::Executor executor;

    taskflow.emplace([] () {});

    // let the executor manage the lifetime of the submitted taskflow
    executor.run(std::move(taskflow));

    // now taskflow has no tasks
    assert(taskflow.num_tasks() == 0);

However, you should avoid moving a running taskflow, which can result in undefined behavior.

    tf::Taskflow taskflow;
    tf::Executor executor;

    taskflow.emplace([] () {});

    // executor does not manage the lifetime of taskflow
    executor.run(taskflow);

    // error! you cannot move a taskflow while it is running
    executor.run(std::move(taskflow));

The correct way to submit a taskflow with moved ownership to an executor is to ensure all previous runs have completed. The executor will automatically release the resources of a moved taskflow right after its execution completes.

    // submit the taskflow and wait until it completes
    executor.run(taskflow).wait();

    // now it's safe to move the taskflow to the executor and run it
    executor.run(std::move(taskflow));

Likewise, you cannot move a taskflow into another taskflow object while it is running on an executor. You must wait until all previously issued runs of that taskflow complete before moving it.
    // submit the taskflow and wait until it completes
    executor.run(taskflow).wait();

    // now it's safe to move the taskflow to another
    tf::Taskflow moved_taskflow(std::move(taskflow));

Execute a Taskflow from an Internal Worker

Each run variant of tf::Executor returns a tf::Future object that allows you to wait for the execution to complete. When calling tf::Future::wait, the caller blocks without doing anything until the associated shared state becomes ready. This design, however, can cause deadlock, especially when you need to run multiple taskflows from the internal workers of an executor. For example, the code below creates a taskflow of 1000 tasks, with each task running a taskflow of 500 tasks in a blocking fashion:

    tf::Executor executor(2);
    tf::Taskflow taskflow;
    std::array<tf::Taskflow, 1000> others;

    std::atomic<size_t> counter{0};

    for (size_t n = 0; n < 1000; n++) {
      for (size_t i = 0; i < 500; i++) {
        others[n].emplace([&] () { counter++; });
      }
      taskflow.emplace([&executor, &tf = others[n]] () {
        // blocking the worker can introduce deadlock where
        // all workers are waiting for their taskflows to finish
        executor.run(tf).wait();
      });
    }
    executor.run(taskflow).wait();

To avoid this problem, the executor has a method, tf::Executor::corun, to execute a taskflow from a worker of that executor. The worker will not block but co-run the taskflow with other tasks in its work-stealing loop.

    tf::Executor executor(2);
    tf::Taskflow taskflow;
    std::array<tf::Taskflow, 1000> others;

    std::atomic<size_t> counter{0};

    for (size_t n = 0; n < 1000; n++) {
      for (size_t i = 0; i < 500; i++) {
        others[n].emplace([&] () { counter++; });
      }
      taskflow.emplace([&executor, &tf = others[n]] () {
        // the caller worker will not block but corun these
        // taskflows through its work-stealing loop
        executor.corun(tf);
      });
    }
    executor.run(taskflow).wait();

Similar to tf::Executor::corun, the method tf::Executor::corun_until is another variant that keeps the calling worker in the work-stealing loop until the given predicate becomes true.
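A predicate used in this way is evaluated repeatedly, so it should return immediately rather than block. The following is a minimal, standard-library-only sketch of such a non-blocking readiness check on a std::future. The names is_ready and spin_until are hypothetical and not part of Taskflow; spin_until only imitates the shape of a loop that keeps doing work until a predicate holds, and is not how the executor is implemented.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <future>

// Hypothetical helper: returns true if the future's shared state is ready.
// A zero timeout makes wait_for return immediately instead of blocking.
// Note: a deferred future (std::launch::deferred) never reports ready here.
template <typename T>
bool is_ready(const std::future<T>& fu) {
  return fu.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}

// Hypothetical stand-in for a "keep working until the predicate is true" loop.
// It calls work() once per iteration and returns the number of iterations.
template <typename Pred, typename Work>
std::size_t spin_until(Pred&& stop, Work&& work) {
  std::size_t iterations = 0;
  while (!stop()) {
    work();
    ++iterations;
  }
  return iterations;
}
```

The same zero-timeout check can serve as the body of a predicate passed to tf::Executor::corun_until, provided the future is captured by reference and was launched with std::launch::async.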
You can use tf::Executor::corun_until to keep a worker doing useful work instead of blocking, for example while it waits for an outstanding operation it has submitted (e.g., a GPU operation).

    taskflow.emplace([&] () {
      // launch asynchronously; a deferred future would never become ready below
      auto fu = std::async(std::launch::async, [] () {
        std::this_thread::sleep_for(std::chrono::seconds(100));
      });
      // capture fu by reference so the predicate can poll it without blocking
      executor.corun_until([&fu] () {
        return fu.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
      });
    });

You must call tf::Executor::corun_until and tf::Executor::corun from a worker of the calling executor or an exception will be thrown.

Touch an Executor from Multiple Threads

All run_* methods are thread-safe. You can have multiple threads call these methods on an executor to run different taskflows. However, the order in which the taskflows run is non-deterministic and is up to the runtime.

    tf::Executor executor;
    std::array<tf::Taskflow, 10> taskflows;  // one taskflow per thread
    std::vector<std::thread> threads;

    for (int i = 0; i < 10; ++i) {
      threads.emplace_back([&, i] () {
        // ... modify my taskflow at i
        executor.run(taskflows[i]);  // run my taskflow at i
      });
    }

    // join the submitting threads first so that every run has been issued,
    // then block until all submitted executions finish
    for (auto& t : threads) t.join();
    executor.wait_for_all();

Query the Worker ID

Each worker in an executor has a unique integer identifier in the range [0, N) that can be queried by the caller thread using tf::Executor::this_worker_id. If the caller thread is not a worker in the executor, -1 is returned. This method is convenient for maintaining a one-to-one mapping between a worker and its application data structure.

    std::vector<int> worker_vectors[8];       // one vector per worker

    tf::Taskflow taskflow;
    tf::Executor executor(8);                 // an executor of eight workers

    assert(executor.this_worker_id() == -1);  // master thread is not a worker

    taskflow.emplace([&] () {
      int id = executor.this_worker_id();     // in the range [0, 8)
      auto& vec = worker_vectors[id];
      // ...
    });

Observe Thread Activities

You can observe when a worker thread in an executor starts and finishes executing a task by using tf::ObserverInterface, an interface class that provides a set of methods for you to define what to do when a thread enters and leaves the execution context of a task.
    class ObserverInterface {
      virtual ~ObserverInterface() = default;
      virtual void set_up(size_t num_workers) = 0;
      virtual void on_entry(tf::WorkerView worker_view, tf::TaskView task_view) = 0;
      virtual void on_exit(tf::WorkerView worker_view, tf::TaskView task_view) = 0;
    };

There are three methods you must define in your derived class: tf::ObserverInterface::set_up, tf::ObserverInterface::on_entry, and tf::ObserverInterface::on_exit. The method tf::ObserverInterface::set_up is a constructor-like method that will be called by the executor when the observer is constructed. It receives the number of workers in the executor as its argument. You may use it to preallocate or initialize data storage, e.g., an independent vector for each worker. The methods tf::ObserverInterface::on_entry and tf::ObserverInterface::on_exit are called by a worker thread before and after the execution context of a task, respectively. Both methods provide immutable access to the underlying worker and the running task using tf::WorkerView and tf::TaskView. You may use them to record timepoints and calculate the elapsed time of a task.

You can associate an executor with one or multiple observers (though one is common) using tf::Executor::make_observer. We use std::shared_ptr to manage the ownership of an observer. The executor loops through each observer and invokes the corresponding methods accordingly.
    #include <taskflow/taskflow.hpp>
    #include <iostream>
    #include <memory>
    #include <sstream>
    #include <string>

    struct MyObserver : public tf::ObserverInterface {

      MyObserver(const std::string& name) {
        std::cout << "constructing observer " << name << '\n';
      }

      void set_up(size_t num_workers) override final {
        std::cout << "setting up observer with " << num_workers << " workers\n";
      }

      // build each message in a string stream and write it with one call
      void on_entry(tf::WorkerView w, tf::TaskView tv) override final {
        std::ostringstream oss;
        oss << "worker " << w.id() << " ready to run " << tv.name() << '\n';
        std::cout << oss.str();
      }

      void on_exit(tf::WorkerView w, tf::TaskView tv) override final {
        std::ostringstream oss;
        oss << "worker " << w.id() << " finished running " << tv.name() << '\n';
        std::cout << oss.str();
      }
    };

    int main() {

      tf::Executor executor(4);

      // Create a taskflow of eight tasks
      tf::Taskflow taskflow;

      auto A = taskflow.emplace([] () { std::cout << "1\n"; }).name("A");
      auto B = taskflow.emplace([] () { std::cout << "2\n"; }).name("B");
      auto C = taskflow.emplace([] () { std::cout << "3\n"; }).name("C");
      auto D = taskflow.emplace([] () { std::cout << "4\n"; }).name("D");
      auto E = taskflow.emplace([] () { std::cout << "5\n"; }).name("E");
      auto F = taskflow.emplace([] () { std::cout << "6\n"; }).name("F");
      auto G = taskflow.emplace([] () { std::cout << "7\n"; }).name("G");
      auto H = taskflow.emplace([] () { std::cout << "8\n"; }).name("H");

      // create an observer
      std::shared_ptr<MyObserver> observer = executor.make_observer<MyObserver>(
        "MyObserver"
      );

      // run the taskflow
      executor.run(taskflow).get();

      // remove the observer (optional)
      executor.remove_observer(std::move(observer));

      return 0;
    }

The above code produces the following output:

    constructing observer MyObserver
    setting up observer with 4 workers
    worker 2 ready to run A
    1
    worker 2 finished running A
    worker 2 ready to run B
    2
    worker 1 ready to run C
    worker 2 finished running B
    3
    worker 2 ready to run D
    worker 3 ready to run E
    worker 1 finished running C
    4
    5
    worker 1 ready to run F
    worker 2 finished running D
    worker 3 finished running E
    6
    worker 2 ready to run G
    worker 3 ready to run H
    worker 1 finished running F
    7
    8
    worker 2 finished running G
    worker 3 finished running H

The lines written to std::cout are expected to interleave with each other because four workers participate in task scheduling.
However, for each task, the ready message always appears before the task's own output (the number), which in turn appears before the finished message.
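The observer callbacks can also be used to record timepoints and compute elapsed time per task, as noted earlier. The following is a minimal sketch of the bookkeeping only, written against the standard library so it stands alone. The class WorkerTimer and its member names are hypothetical and not part of Taskflow. It assumes that worker identifiers are unique integers in [0, N) and that entry and exit calls for a given worker do not nest, so each worker touches only its own slot and no locking is needed.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Taskflow class): one timing slot per worker.
class WorkerTimer {
 public:
  using clock_type = std::chrono::steady_clock;

  // Plays the role of set_up: allocate one zeroed slot per worker.
  void set_up(std::size_t num_workers) { slots_.assign(num_workers, Slot{}); }

  // Plays the role of on_entry: remember when this worker started a task.
  void on_entry(std::size_t worker_id) {
    slots_[worker_id].begin = clock_type::now();
  }

  // Plays the role of on_exit: add the task's elapsed time to the worker's total.
  void on_exit(std::size_t worker_id) {
    Slot& s = slots_[worker_id];
    s.total += clock_type::now() - s.begin;
    ++s.num_tasks;
  }

  // Number of tasks this worker has finished so far.
  std::size_t num_tasks(std::size_t worker_id) const {
    return slots_[worker_id].num_tasks;
  }

  // Accumulated time this worker has spent inside tasks.
  clock_type::duration total(std::size_t worker_id) const {
    return slots_[worker_id].total;
  }

 private:
  struct Slot {
    clock_type::time_point begin{};
    clock_type::duration total{0};
    std::size_t num_tasks{0};
  };
  std::vector<Slot> slots_;
};
```

In a class derived from tf::ObserverInterface, set_up could forward num_workers to such a helper, and on_entry and on_exit could forward the value of w.id(). Read the totals only after the run has completed, since the slots are not synchronized for concurrent readers.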