Release 3.3.0 (2022/01/03)

Taskflow 3.3.0 is the 4th release in the 3.x line! This release includes several new changes, such as data-race sanitization, pipeline parallelism, documentation, and unit tests.

We highly recommend adopting Taskflow v3.3 in your projects if possible. This release has resolved pretty much all the potential data-race issues induced by incorrect memory order.

Download

Taskflow 3.3.0 can be downloaded from here.

System Requirements

To use Taskflow v3.3.0, you need a compiler that supports C++17:

- GNU C++ Compiler at least v8.4 with -std=c++17
- Clang C++ Compiler at least v6.0 with -std=c++17
- Microsoft Visual Studio at least v19.27 with /std:c++17
- AppleClang Xcode Version at least v12.0 with -std=c++17
- Nvidia CUDA Toolkit and Compiler (nvcc) at least v11.1 with -std=c++17
- Intel C++ Compiler at least v19.0.1 with -std=c++17
- Intel DPC++ Clang Compiler at least v13.0.0 with -std=c++17 and SYCL20

Taskflow works on Linux, Windows, and Mac OS X.
Release Summary

This release has resolved data race issues reported by tsan and has incorporated essential sanitizers into the continuous integration workflows for detecting data races, illegal memory accesses, and memory leaks in the Taskflow codebase.

This release has introduced a new pipeline interface (tf::Pipeline) that allows users to create a pipeline scheduling framework for implementing pipeline algorithms.

This release has introduced a new thread-id mapping algorithm to resolve unexpected thread-local storage (TLS) errors when building Taskflow projects in a shared library environment.

New Features

Taskflow Core

- Changed all lambda operators in parallel algorithms to copy by default
- Cleaned up data race errors in tsan caused by incorrect memory order
- Enhanced scheduling performance by caching tasks in the invoke loop
- Added tf::Task::data to allow associating a task with user-level data
- Added tf::Executor::named_async to allow associating an asynchronous task with a name
- Added tf::Executor::named_silent_async to allow associating a silent asynchronous task with a name
- Added tf::Subflow::named_async to allow associating an asynchronous task with a name
- Added tf::Subflow::named_silent_async to allow associating a silent asynchronous task with a name
- Added multi-conditional tasking to allow a task to jump to multiple successors
- Added tf::Runtime tasking interface to enable in-task scheduling control
- Added tf::Taskflow::transform to perform parallel-transform algorithms
- Added tf::Graph interface to allow users to create custom module tasks
- Added tf::FlowBuilder::erase to remove a task from the associated graph

cudaFlow

Starting from v3.3, using tf::cudaFlow requires including the header taskflow/cuda/cudaflow.hpp. See Breaking Changes.

syclFlow

This release does not have any update on syclFlow.

Utilities

- Added tf::SmallVector to the documentation
- Added relax_cpu call to optimize the work-stealing loop

Taskflow Profiler (TFProf)

This release does not have any update on the profiler.
Bug Fixes

- Fixed incorrect static TLS access when building Taskflow in a shared lib
- Fixed memory leak in updating tf::cudaFlowCapturer of undestroyed graph
- Fixed data race in the object-pool when accessing the heap pointer
- Fixed invalid lambda capture by reference in tf::Taskflow::sort
- Fixed invalid lambda capture by reference in tf::Taskflow::reduce
- Fixed invalid lambda capture by reference in tf::Taskflow::transform_reduce
- Fixed invalid lambda capture by reference in tf::Taskflow::for_each
- Fixed invalid lambda capture by reference in tf::Taskflow::for_each_index

If you encounter any potential bugs, please submit an issue at issue tracker.

Breaking Changes

To improve compilation speed, you now need to separately include the following files to use specific features and algorithms:

- taskflow/algorithm/reduce.hpp for creating a parallel-reduction task
- taskflow/algorithm/sort.hpp for creating a parallel-sort task
- taskflow/algorithm/transform.hpp for creating a parallel-transform task
- taskflow/algorithm/pipeline.hpp for creating a parallel-pipeline task
- taskflow/cuda/cudaflow.hpp for creating tf::cudaFlow and tf::cudaFlowCapturer tasks
- taskflow/cuda/algorithm/for_each.hpp for creating a single-threaded task on a CUDA GPU
- taskflow/cuda/algorithm/for_each.hpp for creating a parallel-iteration task on a CUDA GPU
- taskflow/cuda/algorithm/transform.hpp for creating a parallel-transform task on a CUDA GPU
- taskflow/cuda/algorithm/reduce.hpp for creating a parallel-reduce task on a CUDA GPU
- taskflow/cuda/algorithm/scan.hpp for creating a parallel-scan task on a CUDA GPU
- taskflow/cuda/algorithm/merge.hpp for creating a parallel-merge task on a CUDA GPU
- taskflow/cuda/algorithm/sort.hpp for creating a parallel-sort task on a CUDA GPU
- taskflow/cuda/algorithm/find.hpp for creating a parallel-find task on a CUDA GPU

Deprecated and Removed Items

This release does not have any deprecated or removed items.
Documentation

- Revised Building and Installing
  - Build Sanitizers
- Revised Static Tasking
  - Attach User Data to a Task
- Revised Composable Tasking
  - Create a Custom Composable Graph
- Revised Conditional Tasking
  - Create a Multi-condition Task
- Revised GPU Tasking (cudaFlow)
- Revised GPU Tasking (cudaFlowCapturer)
- Revised Limit the Maximum Concurrency
  - Define a Conflict Graph
- Revised Parallel Sort to add header-include information
- Revised Parallel Reduction to add header-include information
- Revised cudaFlow Algorithms to add header-include information
- Revised CUDA Standard Algorithms to add header-include information
- Added Interact with the Runtime
- Added Parallel Transforms
- Added Task-parallel Pipeline

Miscellaneous Items

We have published Taskflow in the following venues:

- Tsung-Wei Huang, Dian-Lun Lin, Chun-Xun Lin, and Yibo Lin, "Taskflow: A Lightweight Parallel and Heterogeneous Task Graph Computing System," IEEE Transactions on Parallel and Distributed Systems (TPDS), vol. 33, no. 6, pp. 1303-1320, June 2022
- Tsung-Wei Huang, "TFProf: Profiling Large Taskflow Programs with Modern D3 and C++," IEEE International Workshop on Programming and Performance Visualization Tools (ProTools), St. Louis, Missouri, 2021

Please do not hesitate to contact Dr. Tsung-Wei Huang if you intend to collaborate with us on using Taskflow in your scientific computing projects.