Release 3.0.0 (2021/01/01)

Taskflow 3.0.0 is the first release in the 3.x line! This release includes several changes, such as CPU-GPU tasking, an algorithm collection, an enhanced web-based profiler, documentation, and unit tests. Starting from v3, we have migrated the codebase to the C++17 standard to improve the expressivity and efficiency of the codebase.

Download

Taskflow 3.0.0 can be downloaded from here.

System Requirements

To use Taskflow v3.0.0, you need a compiler that supports C++17:

- GNU C++ Compiler at least v7.0 with -std=c++17
- Clang C++ Compiler at least v6.0 with -std=c++17
- Microsoft Visual Studio at least v19.27 with /std:c++17
- AppleClang Xcode Version at least v12.0 with -std=c++17
- Nvidia CUDA Toolkit and Compiler (nvcc) at least v11.1 with -std=c++17
- Intel C++ Compiler at least v19.0.1 with -std=c++17

Taskflow works on Linux, Windows, and Mac OS X.
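As a quick way to confirm that a toolchain meets the C++17 requirement above, one could compile a small probe such as the sketch below with the flag listed for the compiler (for example -std=c++17 or /std:c++17). It uses only the standard library. The names probe_counter, half_if_even, describe, and run_cxx17_probe are hypothetical and are not part of Taskflow.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical helper names for illustration; none of this is Taskflow API.

// C++17 inline variable: a single definition shared by all translation units.
inline int probe_counter = 0;

// Returns half of x when x is even, and an empty std::optional otherwise.
std::optional<int> half_if_even(int x) {
  if (x % 2 != 0) return std::nullopt;
  return x / 2;
}

// Names the alternative currently held by the variant.
std::string describe(const std::variant<int, std::string>& v) {
  return std::visit([](const auto& held) -> std::string {
    using T = std::decay_t<decltype(held)>;
    if constexpr (std::is_same_v<T, int>) {
      return "int";
    } else {
      return "string";
    }
  }, v);
}

// Exercises several C++17 features; aborts via assert if any check fails.
void run_cxx17_probe() {
  ++probe_counter;

  // Structured bindings plus class template argument deduction.
  auto [quotient, remainder] = std::pair{7 / 2, 7 % 2};
  assert(quotient == 3 && remainder == 1);

  // std::optional
  assert(half_if_even(4).value_or(-1) == 2);
  assert(!half_if_even(3).has_value());

  // std::variant with std::visit and if constexpr
  assert(describe(42) == "int");
  assert(describe(std::string("tf")) == "string");
}
```

Calling run_cxx17_probe() from main and building with the C++17 flag is enough to exercise the probe; a compile error here points to a compiler or flag that does not yet provide C++17.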

Working Items

- enhancing the taskflow profiler (TFProf)
- adding methods for updating tf::cudaFlow (with unit tests)
- adding support for cuBLAS
- adding support for cuDNN
- adding support for SYCL (ComputeCpp and DPC++)

New Features

Taskflow Core

- replaced all non-standard libraries with C++17 STL (e.g., std::optional, std::variant)
- added tf::WorkerView for users to observe the running works of tasks
- added asynchronous tasking (see Asynchronous Tasking)
- modified tf::ObserverInterface::on_entry and tf::ObserverInterface::on_exit to take tf::WorkerView
- added a custom graph interface to support dynamic polymorphism for tf::cudaGraph
- supported separate compilations between Taskflow and CUDA (see Compile Taskflow with CUDA)
- added tf::Semaphore and tf::CriticalSection to limit the maximum concurrency
- added tf::Future to support cancellation of submitted tasks (see Request Cancellation)

cudaFlow

- added tf::cudaFlowCapturer for building a cudaFlow through stream capture (see GPU Tasking (cudaFlowCapturer))
- added tf::cudaFlowCapturerBase for creating custom capturers
- added tf::cudaFlow::capture for capturing a cudaFlow within a parent cudaFlow
- added tf::Taskflow::emplace_on to place a cudaFlow on a GPU
- added tf::cudaFlow::dump and tf::cudaFlowCapturer::dump to visualize cudaFlow
- added tf::cudaFlow::offload and update methods to run and update a cudaFlow explicitly
- supported standalone cudaFlow
- supported standalone cudaFlowCapturer
- added tf::cublasFlowCapturer to support cuBLAS (see Linear Algebra (cublasFlowCapturer))

Utilities

- added utility functions to grab the cuda device properties (see cuda_device.hpp)
- added utility functions to control cuda memory (see cuda_memory.hpp)
- added utility functions for common mathematics operations
- added serializer and deserializer libraries to support tfprof
- added per-thread pool for CUDA streams to improve performance

Taskflow Profiler (TFProf)

- added visualization for asynchronous tasks
- added server-based profiler to support large profiling data (see Profile Taskflow Programs)

New Algorithms

CPU Algorithms

- added parallel sort (see Parallel Sort)

GPU Algorithms

- added single task (see Single Task)
- added parallel iterations (see Parallel Iterations)
- added parallel transforms
- added parallel reduction

Bug Fixes

- fixed the bug in stream capturing (need to use ThreadLocal mode)
- fixed the bug in reporting wrong worker ids when compiling a shared library due to the use of thread_local (now with C++17 inline variable)

Breaking Changes

- changed the returned values of asynchronous tasks to be std::optional in order to support cancellation (see Asynchronous Tasking and Request Cancellation)

Deprecated and Removed Items

- removed tf::cudaFlow::device; users may call tf::Taskflow::emplace_on to associate a cudaFlow with a GPU device
- removed tf::cudaFlow::join; use tf::cudaFlow::offload instead
- removed the legacy tf::Framework
- removed external mutable use of tf::TaskView

Documentation

- added Compile Taskflow with CUDA
- added Benchmark Taskflow
- added Limit the Maximum Concurrency
- added Asynchronous Tasking
- added GPU Tasking (cudaFlowCapturer)
- added Request Cancellation
- added Profile Taskflow Programs
- added cudaFlow Algorithms
  - Single Task to run a kernel function in just a single thread
  - Parallel Iterations to perform parallel iterations over a range of items
  - Parallel Transforms to perform parallel transforms over a range of items
- added Governance
  - Rules
  - Team
  - Code of Conduct
- added Contributing
  - Guidelines
  - Contributors
- revised Conditional Tasking
- revised documentation pages for files

Miscellaneous Items

We have presented Taskflow in the following C++ venues with recorded videos:

- 2020 CppCon Taskflow Talk
- 2020 MUC++ Taskflow Talk

We have published Taskflow in the following conferences and journals:

- Tsung-Wei Huang, "A General-purpose Parallel and Heterogeneous Task Programming System for VLSI CAD," IEEE/ACM International Conference on Computer-aided Design (ICCAD), CA, 2020
- Chun-Xun Lin, Tsung-Wei Huang, and Martin Wong, "An Efficient Work-Stealing Scheduler for Task Dependency Graph," IEEE International Conference on Parallel and Distributed Systems (ICPADS), Hong Kong, 2020
- Tsung-Wei Huang, Dian-Lun Lin, Yibo Lin, and Chun-Xun Lin, "Cpp-Taskflow: A General-purpose Parallel Task Programming System at Scale," IEEE Transactions on Computer-aided Design of Integrated Circuits and Systems (TCAD), to appear, 2020