<!DOCTYPE html>
|
|
<html lang="en">
|
|
<head>
|
|
<meta charset="UTF-8" />
|
|
<title>tf::cudaExecutionPolicy class | Taskflow QuickStart</title>
|
|
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Source+Sans+Pro:400,400i,600,600i%7CSource+Code+Pro:400,400i,600" />
|
|
<link rel="stylesheet" href="m-dark+documentation.compiled.css" />
|
|
<link rel="icon" href="favicon.ico" type="image/vnd.microsoft.icon" />
|
|
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
|
|
<meta name="theme-color" content="#22272e" />
|
|
</head>
|
|
<body>
|
|
<header><nav id="navigation">
|
|
<div class="m-container">
|
|
<div class="m-row">
|
|
<span id="m-navbar-brand" class="m-col-t-8 m-col-m-none m-left-m">
|
|
<a href="https://taskflow.github.io"><img src="taskflow_logo.png" alt="" />Taskflow</a> <span class="m-breadcrumb">|</span> <a href="index.html" class="m-thin">QuickStart</a>
|
|
</span>
|
|
<div class="m-col-t-4 m-hide-m m-text-right m-nopadr">
|
|
<a href="#search" class="m-doc-search-icon" title="Search" onclick="return showSearch()"><svg style="height: 0.9rem;" viewBox="0 0 16 16">
|
|
<path id="m-doc-search-icon-path" d="m6 0c-3.31 0-6 2.69-6 6 0 3.31 2.69 6 6 6 1.49 0 2.85-0.541 3.89-1.44-0.0164 0.338 0.147 0.759 0.5 1.15l3.22 3.79c0.552 0.614 1.45 0.665 2 0.115 0.55-0.55 0.499-1.45-0.115-2l-3.79-3.22c-0.392-0.353-0.812-0.515-1.15-0.5 0.895-1.05 1.44-2.41 1.44-3.89 0-3.31-2.69-6-6-6zm0 1.56a4.44 4.44 0 0 1 4.44 4.44 4.44 4.44 0 0 1-4.44 4.44 4.44 4.44 0 0 1-4.44-4.44 4.44 4.44 0 0 1 4.44-4.44z"/>
|
|
</svg></a>
|
|
<a id="m-navbar-show" href="#navigation" title="Show navigation"></a>
|
|
<a id="m-navbar-hide" href="#" title="Hide navigation"></a>
|
|
</div>
|
|
<div id="m-navbar-collapse" class="m-col-t-12 m-show-m m-col-m-none m-right-m">
|
|
<div class="m-row">
|
|
<ol class="m-col-t-6 m-col-m-none">
|
|
<li><a href="pages.html">Handbook</a></li>
|
|
<li><a href="namespaces.html">Namespaces</a></li>
|
|
</ol>
|
|
<ol class="m-col-t-6 m-col-m-none" start="3">
|
|
<li><a href="annotated.html">Classes</a></li>
|
|
<li><a href="files.html">Files</a></li>
|
|
<li class="m-show-m"><a href="#search" class="m-doc-search-icon" title="Search" onclick="return showSearch()"><svg style="height: 0.9rem;" viewBox="0 0 16 16">
|
|
<use href="#m-doc-search-icon-path" />
|
|
</svg></a></li>
|
|
</ol>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
</nav></header>
|
|
<main><article>
|
|
<div class="m-container m-container-inflatable">
|
|
<div class="m-row">
|
|
<div class="m-col-l-10 m-push-l-1">
|
|
<h1>
|
|
<div class="m-doc-include m-code m-inverted m-right-m m-text-right"><span class="cp">#include</span> <a class="cpf" href="cuda__execution__policy_8hpp.html">&lt;taskflow/cuda/cuda_execution_policy.hpp&gt;</a></div>
|
|
<div class="m-doc-template">template&lt;unsigned NT, unsigned VT&gt;</div>
|
|
<span class="m-breadcrumb"><a href="namespacetf.html">tf</a>::<wbr/></span>cudaExecutionPolicy <span class="m-thin">class</span>
|
|
</h1>
|
|
<p>class to define the execution policy for CUDA standard algorithms</p>
|
|
<table class="m-table m-fullwidth m-flat">
|
|
<thead>
|
|
<tr><th colspan="2">Template parameters</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td style="width: 1%">NT</td>
|
|
<td>number of threads per block</td>
|
|
</tr>
|
|
<tr>
|
|
<td>VT</td>
|
|
<td>number of work units per thread</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
<nav class="m-block m-default">
|
|
<h3>Contents</h3>
|
|
<ul>
|
|
<li>
|
|
Reference
|
|
<ul>
|
|
<li><a href="#pub-static-attribs">Public static variables</a></li>
|
|
<li><a href="#pub-static-methods">Public static functions</a></li>
|
|
<li><a href="#typeless-methods">Constructors, destructors, conversion operators</a></li>
|
|
<li><a href="#pub-methods">Public functions</a></li>
|
|
</ul>
|
|
</li>
|
|
</ul>
|
|
</nav>
|
|
<p>An execution policy configures the kernel execution parameters of CUDA algorithms. The first template argument, <code>NT</code>, is the number of threads per block and should always be a power of two. The second template argument, <code>VT</code>, is the number of work units per thread; an odd number is recommended to avoid bank conflicts.</p><p>For details, see <a href="CUDASTDExecutionPolicy.html" class="m-doc">Execution Policy</a>.</p>
|
|
<section id="pub-static-attribs">
|
|
<h2><a href="#pub-static-attribs">Public static variables</a></h2>
|
|
<dl class="m-doc">
|
|
<dt id="abb1050526f45873c967976a99e9a370d">
|
|
static const unsigned <a href="#abb1050526f45873c967976a99e9a370d" class="m-doc-self">nt</a>
|
|
</dt>
|
|
<dd>static constant for getting the number of threads per block</dd>
|
|
<dt id="a9410f1b3a5cb9a3cc5e8d640bc7d3990">
|
|
static const unsigned <a href="#a9410f1b3a5cb9a3cc5e8d640bc7d3990" class="m-doc-self">vt</a>
|
|
</dt>
|
|
<dd>static constant for getting the number of work units per thread</dd>
|
|
<dt id="a92ac5a32147584738f32a720ea08e3f4">
|
|
static const unsigned <a href="#a92ac5a32147584738f32a720ea08e3f4" class="m-doc-self">nv</a>
|
|
</dt>
|
|
<dd>static constant for getting the number of elements to process per block</dd>
|
|
</dl>
|
|
</section>
|
|
<section id="pub-static-methods">
|
|
<h2><a href="#pub-static-methods">Public static functions</a></h2>
|
|
<dl class="m-doc">
|
|
<dt id="ab96c478964fcba935aa99efe91a64e5c">
|
|
<span class="m-doc-wrap-bumper">static auto <a href="#ab96c478964fcba935aa99efe91a64e5c" class="m-doc-self">num_blocks</a>(</span><span class="m-doc-wrap">unsigned N) -> unsigned</span>
|
|
</dt>
|
|
<dd>queries the number of blocks to accommodate N elements</dd>
|
|
<dt>
|
|
<div class="m-doc-template">template&lt;typename T&gt;</div>
|
|
<span class="m-doc-wrap-bumper">static auto <a href="#a446cee95bb839ee180052059e2ad7fd6" class="m-doc">reduce_bufsz</a>(</span><span class="m-doc-wrap">unsigned count) -> unsigned</span>
|
|
</dt>
|
|
<dd>queries the buffer size in bytes needed to call reduce kernels</dd>
|
|
<dt>
|
|
<div class="m-doc-template">template&lt;typename T&gt;</div>
|
|
<span class="m-doc-wrap-bumper">static auto <a href="#abcafb001cd68c1135392f4bcda5a2a05" class="m-doc">min_element_bufsz</a>(</span><span class="m-doc-wrap">unsigned count) -> unsigned</span>
|
|
</dt>
|
|
<dd>queries the buffer size in bytes needed to call <a href="namespacetf.html#a572c13198191c46765264f8afabe2e9f" class="m-doc">tf::<wbr />cuda_min_element</a></dd>
|
|
<dt>
|
|
<div class="m-doc-template">template&lt;typename T&gt;</div>
|
|
<span class="m-doc-wrap-bumper">static auto <a href="#a31fe75c4b0765df3035e12be49af88aa" class="m-doc">max_element_bufsz</a>(</span><span class="m-doc-wrap">unsigned count) -> unsigned</span>
|
|
</dt>
|
|
<dd>queries the buffer size in bytes needed to call <a href="namespacetf.html#a3fc577fd0a8f127770bcf68bc56c073e" class="m-doc">tf::<wbr />cuda_max_element</a></dd>
|
|
<dt>
|
|
<div class="m-doc-template">template&lt;typename T&gt;</div>
|
|
<span class="m-doc-wrap-bumper">static auto <a href="#af25648b3269902b333cfcd58665005e8" class="m-doc">scan_bufsz</a>(</span><span class="m-doc-wrap">unsigned count) -> unsigned</span>
|
|
</dt>
|
|
<dd>queries the buffer size in bytes needed to call scan kernels</dd>
|
|
<dt>
|
|
<span class="m-doc-wrap-bumper">static auto <a href="#a1febbe549d9cbe4502a5b66167ab9553" class="m-doc">merge_bufsz</a>(</span><span class="m-doc-wrap">unsigned a_count,
|
|
unsigned b_count) -> unsigned</span>
|
|
</dt>
|
|
<dd>queries the buffer size in bytes needed for CUDA merge algorithms</dd>
|
|
</dl>
|
|
</section>
|
|
<section id="typeless-methods">
|
|
<h2><a href="#typeless-methods">Constructors, destructors, conversion operators</a></h2>
|
|
<dl class="m-doc">
|
|
<dt id="aea3b671f778bfb9eca5d7113636f63bf">
|
|
<span class="m-doc-wrap-bumper"><a href="#aea3b671f778bfb9eca5d7113636f63bf" class="m-doc-self">cudaExecutionPolicy</a>(</span><span class="m-doc-wrap">) <span class="m-label m-flat m-info">defaulted</span></span>
|
|
</dt>
|
|
<dd>constructs an execution policy object with default stream</dd>
|
|
<dt id="ac1c7784472394d4abcb6f6a2a80cc019">
|
|
<span class="m-doc-wrap-bumper"><a href="#ac1c7784472394d4abcb6f6a2a80cc019" class="m-doc-self">cudaExecutionPolicy</a>(</span><span class="m-doc-wrap">cudaStream_t s) <span class="m-label m-flat m-info">explicit</span> </span>
|
|
</dt>
|
|
<dd>constructs an execution policy object with the given stream</dd>
|
|
</dl>
|
|
</section>
|
|
<section id="pub-methods">
|
|
<h2><a href="#pub-methods">Public functions</a></h2>
|
|
<dl class="m-doc">
|
|
<dt id="a5be1b273985800ab886665d28663c29b">
|
|
<span class="m-doc-wrap-bumper">auto <a href="#a5be1b273985800ab886665d28663c29b" class="m-doc-self">stream</a>(</span><span class="m-doc-wrap">) -> cudaStream_t <span class="m-label m-flat m-success">noexcept</span></span>
|
|
</dt>
|
|
<dd>queries the associated stream</dd>
|
|
<dt id="a5f2a4d6b35af49403756ee2291264758">
|
|
<span class="m-doc-wrap-bumper">void <a href="#a5f2a4d6b35af49403756ee2291264758" class="m-doc-self">stream</a>(</span><span class="m-doc-wrap">cudaStream_t stream) <span class="m-label m-flat m-success">noexcept</span></span>
|
|
</dt>
|
|
<dd>assigns a stream</dd>
|
|
</dl>
|
|
</section>
|
|
<section>
|
|
<h2>Function documentation</h2>
|
|
<section class="m-doc-details" id="a446cee95bb839ee180052059e2ad7fd6"><div>
|
|
<h3>
|
|
<div class="m-doc-template">
|
|
template&lt;unsigned NT, unsigned VT&gt;
|
|
template&lt;typename T&gt;
|
|
</div>
|
|
<span class="m-doc-wrap-bumper">static unsigned tf::<wbr />cudaExecutionPolicy&lt;NT, VT&gt;::<wbr /></span><span class="m-doc-wrap"><span class="m-doc-wrap-bumper"><a href="#a446cee95bb839ee180052059e2ad7fd6" class="m-doc-self">reduce_bufsz</a>(</span><span class="m-doc-wrap">unsigned count)</span></span>
|
|
</h3>
|
|
<p>queries the buffer size in bytes needed to call reduce kernels</p>
|
|
<table class="m-table m-fullwidth m-flat">
|
|
<thead>
|
|
<tr><th colspan="2">Template parameters</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td style="width: 1%">T</td>
|
|
<td>value type</td>
|
|
</tr>
|
|
</tbody>
|
|
<thead>
|
|
<tr><th colspan="2">Parameters</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td>count</td>
|
|
<td>number of elements to reduce</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
<p>The function is used to allocate a buffer for calling <a href="namespacetf.html#a8a872d2a0ac73a676713cb5be5aa688c" class="m-doc">tf::<wbr />cuda_reduce</a>, <a href="namespacetf.html#a492e8410db032a0273a99dd905486161" class="m-doc">tf::<wbr />cuda_uninitialized_reduce</a>, <a href="namespacetf.html#a4463d06240d608bc31d8b3546a851e4e" class="m-doc">tf::<wbr />cuda_transform_reduce</a>, and <a href="namespacetf.html#aa451668b7a0a3abf385cf2abebed8962" class="m-doc">tf::<wbr />cuda_uninitialized_transform_reduce</a>.</p>
|
|
</div></section>
|
|
<section class="m-doc-details" id="abcafb001cd68c1135392f4bcda5a2a05"><div>
|
|
<h3>
|
|
<div class="m-doc-template">
|
|
template&lt;unsigned NT, unsigned VT&gt;
|
|
template&lt;typename T&gt;
|
|
</div>
|
|
<span class="m-doc-wrap-bumper">static unsigned tf::<wbr />cudaExecutionPolicy&lt;NT, VT&gt;::<wbr /></span><span class="m-doc-wrap"><span class="m-doc-wrap-bumper"><a href="#abcafb001cd68c1135392f4bcda5a2a05" class="m-doc-self">min_element_bufsz</a>(</span><span class="m-doc-wrap">unsigned count)</span></span>
|
|
</h3>
|
|
<p>queries the buffer size in bytes needed to call <a href="namespacetf.html#a572c13198191c46765264f8afabe2e9f" class="m-doc">tf::<wbr />cuda_min_element</a></p>
|
|
<table class="m-table m-fullwidth m-flat">
|
|
<thead>
|
|
<tr><th colspan="2">Template parameters</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td style="width: 1%">T</td>
|
|
<td>value type</td>
|
|
</tr>
|
|
</tbody>
|
|
<thead>
|
|
<tr><th colspan="2">Parameters</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td>count</td>
|
|
<td>number of elements to search</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
<p>The function is used to decide the buffer size in bytes for calling <a href="namespacetf.html#a572c13198191c46765264f8afabe2e9f" class="m-doc">tf::<wbr />cuda_min_element</a>.</p>
|
|
</div></section>
|
|
<section class="m-doc-details" id="a31fe75c4b0765df3035e12be49af88aa"><div>
|
|
<h3>
|
|
<div class="m-doc-template">
|
|
template&lt;unsigned NT, unsigned VT&gt;
|
|
template&lt;typename T&gt;
|
|
</div>
|
|
<span class="m-doc-wrap-bumper">static unsigned tf::<wbr />cudaExecutionPolicy&lt;NT, VT&gt;::<wbr /></span><span class="m-doc-wrap"><span class="m-doc-wrap-bumper"><a href="#a31fe75c4b0765df3035e12be49af88aa" class="m-doc-self">max_element_bufsz</a>(</span><span class="m-doc-wrap">unsigned count)</span></span>
|
|
</h3>
|
|
<p>queries the buffer size in bytes needed to call <a href="namespacetf.html#a3fc577fd0a8f127770bcf68bc56c073e" class="m-doc">tf::<wbr />cuda_max_element</a></p>
|
|
<table class="m-table m-fullwidth m-flat">
|
|
<thead>
|
|
<tr><th colspan="2">Template parameters</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td style="width: 1%">T</td>
|
|
<td>value type</td>
|
|
</tr>
|
|
</tbody>
|
|
<thead>
|
|
<tr><th colspan="2">Parameters</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td>count</td>
|
|
<td>number of elements to search</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
<p>The function is used to decide the buffer size in bytes for calling <a href="namespacetf.html#a3fc577fd0a8f127770bcf68bc56c073e" class="m-doc">tf::<wbr />cuda_max_element</a>.</p>
|
|
</div></section>
|
|
<section class="m-doc-details" id="af25648b3269902b333cfcd58665005e8"><div>
|
|
<h3>
|
|
<div class="m-doc-template">
|
|
template&lt;unsigned NT, unsigned VT&gt;
|
|
template&lt;typename T&gt;
|
|
</div>
|
|
<span class="m-doc-wrap-bumper">static unsigned tf::<wbr />cudaExecutionPolicy&lt;NT, VT&gt;::<wbr /></span><span class="m-doc-wrap"><span class="m-doc-wrap-bumper"><a href="#af25648b3269902b333cfcd58665005e8" class="m-doc-self">scan_bufsz</a>(</span><span class="m-doc-wrap">unsigned count)</span></span>
|
|
</h3>
|
|
<p>queries the buffer size in bytes needed to call scan kernels</p>
|
|
<table class="m-table m-fullwidth m-flat">
|
|
<thead>
|
|
<tr><th colspan="2">Template parameters</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td style="width: 1%">T</td>
|
|
<td>value type</td>
|
|
</tr>
|
|
</tbody>
|
|
<thead>
|
|
<tr><th colspan="2">Parameters</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td>count</td>
|
|
<td>number of elements to scan</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
<p>The function is used to allocate a buffer for calling <a href="namespacetf.html#a2e1b44c84a09e0a8495a611cb9a7ea40" class="m-doc">tf::<wbr />cuda_inclusive_scan</a>, <a href="namespacetf.html#aeb391c40120844318fd715b8c3a716bb" class="m-doc">tf::<wbr />cuda_exclusive_scan</a>, <a href="namespacetf.html#afa4aa760ddb6efbda1b9bab505ad5baf" class="m-doc">tf::<wbr />cuda_transform_inclusive_scan</a>, and <a href="namespacetf.html#a2e739895c1c73538967af060ca714366" class="m-doc">tf::<wbr />cuda_transform_exclusive_scan</a>.</p>
|
|
</div></section>
|
|
<section class="m-doc-details" id="a1febbe549d9cbe4502a5b66167ab9553"><div>
|
|
<h3>
|
|
<div class="m-doc-template">
|
|
template&lt;unsigned NT, unsigned VT&gt;
|
|
</div>
|
|
<span class="m-doc-wrap-bumper">static unsigned tf::<wbr />cudaExecutionPolicy&lt;NT, VT&gt;::<wbr /></span><span class="m-doc-wrap"><span class="m-doc-wrap-bumper"><a href="#a1febbe549d9cbe4502a5b66167ab9553" class="m-doc-self">merge_bufsz</a>(</span><span class="m-doc-wrap">unsigned a_count,
|
|
unsigned b_count)</span></span>
|
|
</h3>
|
|
<p>queries the buffer size in bytes needed for CUDA merge algorithms</p>
|
|
<table class="m-table m-fullwidth m-flat">
|
|
<thead>
|
|
<tr><th colspan="2">Parameters</th></tr>
|
|
</thead>
|
|
<tbody>
|
|
<tr>
|
|
<td style="width: 1%">a_count</td>
|
|
<td>number of elements in the first vector to merge</td>
|
|
</tr>
|
|
<tr>
|
|
<td>b_count</td>
|
|
<td>number of elements in the second vector to merge</td>
|
|
</tr>
|
|
</tbody>
|
|
</table>
|
|
<p>The buffer size of the merge algorithm does not depend on the data type. The buffer is used only to store temporary indices (of type <code>unsigned</code>) required during the merge process.</p><p>The function is used to allocate a buffer for calling <a href="namespacetf.html#a37ec481149c2f01669353033d75ed72a" class="m-doc">tf::<wbr />cuda_merge</a> and <a href="namespacetf.html#aa84d4c68d2cbe9f6efc4a1eb1a115458" class="m-doc">tf::<wbr />cuda_merge_by_key</a>.</p>
|
|
</div></section>
|
|
</section>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
</article></main>
|
|
<div class="m-doc-search" id="search">
|
|
<a href="#!" onclick="return hideSearch()"></a>
|
|
<div class="m-container">
|
|
<div class="m-row">
|
|
<div class="m-col-m-8 m-push-m-2">
|
|
<div class="m-doc-search-header m-text m-small">
|
|
<div><span class="m-label m-default">Tab</span> / <span class="m-label m-default">T</span> to search, <span class="m-label m-default">Esc</span> to close</div>
|
|
<div id="search-symbolcount">…</div>
|
|
</div>
|
|
<div class="m-doc-search-content">
|
|
<form>
|
|
<input type="search" name="q" id="search-input" placeholder="Loading …" disabled="disabled" autofocus="autofocus" autocomplete="off" spellcheck="false" />
|
|
</form>
|
|
<noscript class="m-text m-danger m-text-center">Unlike everything else in the docs, the search functionality <em>requires</em> JavaScript.</noscript>
|
|
<div id="search-help" class="m-text m-dim m-text-center">
|
|
<p class="m-noindent">Search for symbols, directories, files, pages or
|
|
modules. You can omit any prefix from the symbol or file path; adding a
|
|
<code>:</code> or <code>/</code> suffix lists all members of given symbol or
|
|
directory.</p>
|
|
<p class="m-noindent">Use <span class="m-label m-dim">↓</span>
|
|
/ <span class="m-label m-dim">↑</span> to navigate through the list,
|
|
<span class="m-label m-dim">Enter</span> to go.
|
|
<span class="m-label m-dim">Tab</span> autocompletes common prefix, you can
|
|
copy a link to the result using <span class="m-label m-dim">⌘</span>
|
|
<span class="m-label m-dim">L</span> while <span class="m-label m-dim">⌘</span>
|
|
<span class="m-label m-dim">M</span> produces a Markdown link.</p>
|
|
</div>
|
|
<div id="search-notfound" class="m-text m-warning m-text-center">Sorry, nothing was found.</div>
|
|
<ul id="search-results"></ul>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
<script src="search-v2.js"></script>
|
|
<script src="searchdata-v2.js" async="async"></script>
|
|
<footer><nav>
|
|
<div class="m-container">
|
|
<div class="m-row">
|
|
<div class="m-col-l-10 m-push-l-1">
|
|
<p>Taskflow handbook is part of the <a href="https://taskflow.github.io">Taskflow project</a>, copyright © <a href="https://tsung-wei-huang.github.io/">Dr. Tsung-Wei Huang</a>, 2018–2024.<br />Generated by <a href="https://doxygen.org/">Doxygen</a> 1.9.1 and <a href="https://mcss.mosra.cz/">m.css</a>.</p>
|
|
</div>
|
|
</div>
|
|
</div>
|
|
</nav></footer>
|
|
</body>
|
|
</html>
|