ParallelTransformsCUDA Parallel Transforms Include the Header ParallelTransformsCUDA_1CUDAParallelTransformsIncludeTheHeader Transform a Range of Items ParallelTransformsCUDA_1cudaFlowTransformARangeOfItems Transform Two Ranges of Items ParallelTransformsCUDA_1cudaFlowTransformTwoRangesOfItems Miscellaneous Items ParallelTransformsCUDA_1ParallelTransformCUDAMiscellaneousItems tf::cudaFlow provides template methods for transforming ranges of items to different outputs. Include the Header You need to include the header file, taskflow/cuda/algorithm/transform.hpp, for creating a parallel-transform task. #include<taskflow/cuda/algorithm/transform.hpp> Transform a Range of Items Iterator-based parallel-transform applies the given transform function to a range of items and store the result in another range specified by two iterators, first and last. The task created by tf::cudaFlow::transform(I first, I last, O output, C op) represents a parallel execution for the following loop: while(first!=last){ *output++=op(*first++); } The following example creates a transform kernel that transforms an input range of N items to an output range by multiplying each item by 10. //output[i]=input[i]*10 cudaflow.transform( input,input+N,output,[]__device__(intx){returnx*10;} ); Each iteration is independent of each other and is assigned one kernel thread to run the callable. Since the callable runs on GPU, it must be declared with a __device__ specifier. Transform Two Ranges of Items You can transform two ranges of items to an output range through a binary operator. The task created by tf::cudaFlow::transform(I1 first1, I1 last1, I2 first2, O output, C op) represents a parallel execution for the following loop: while(first1!=last1){ *output++=op(*first1++,*first2++); } The following example creates a transform kernel that transforms two input ranges of N items to an output range by summing each pair of items in the input ranges. //output[i]=input1[i]+inpu2[i] cudaflow.transform( input1,input1+N,input2,output,[]__device__(inta,intb){returna+b;} ); Miscellaneous Items The parallel-transform algorithms are also available in tf::cudaFlowCapturer.