<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<title>tf::cudaFlowRoundRobinOptimizer class | Taskflow QuickStart</title>
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Source+Sans+Pro:400,400i,600,600i%7CSource+Code+Pro:400,400i,600" />
<link rel="stylesheet" href="m-dark+documentation.compiled.css" />
<link rel="icon" href="favicon.ico" type="image/vnd.microsoft.icon" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="theme-color" content="#22272e" />
</head>
<body>
<header><nav id="navigation">
<div class="m-container">
<div class="m-row">
<span id="m-navbar-brand" class="m-col-t-8 m-col-m-none m-left-m">
<a href="https://taskflow.github.io"><img src="taskflow_logo.png" alt="" />Taskflow</a> <span class="m-breadcrumb">|</span> <a href="index.html" class="m-thin">QuickStart</a>
</span>
<div class="m-col-t-4 m-hide-m m-text-right m-nopadr">
<a href="#search" class="m-doc-search-icon" title="Search" onclick="return showSearch()"><svg style="height: 0.9rem;" viewBox="0 0 16 16">
<path id="m-doc-search-icon-path" d="m6 0c-3.31 0-6 2.69-6 6 0 3.31 2.69 6 6 6 1.49 0 2.85-0.541 3.89-1.44-0.0164 0.338 0.147 0.759 0.5 1.15l3.22 3.79c0.552 0.614 1.45 0.665 2 0.115 0.55-0.55 0.499-1.45-0.115-2l-3.79-3.22c-0.392-0.353-0.812-0.515-1.15-0.5 0.895-1.05 1.44-2.41 1.44-3.89 0-3.31-2.69-6-6-6zm0 1.56a4.44 4.44 0 0 1 4.44 4.44 4.44 4.44 0 0 1-4.44 4.44 4.44 4.44 0 0 1-4.44-4.44 4.44 4.44 0 0 1 4.44-4.44z"/>
</svg></a>
<a id="m-navbar-show" href="#navigation" title="Show navigation"></a>
<a id="m-navbar-hide" href="#" title="Hide navigation"></a>
</div>
<div id="m-navbar-collapse" class="m-col-t-12 m-show-m m-col-m-none m-right-m">
<div class="m-row">
<ol class="m-col-t-6 m-col-m-none">
<li><a href="pages.html">Handbook</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
</ol>
<ol class="m-col-t-6 m-col-m-none" start="3">
<li><a href="annotated.html">Classes</a></li>
<li><a href="files.html">Files</a></li>
<li class="m-show-m"><a href="#search" class="m-doc-search-icon" title="Search" onclick="return showSearch()"><svg style="height: 0.9rem;" viewBox="0 0 16 16">
<use href="#m-doc-search-icon-path" />
</svg></a></li>
</ol>
</div>
</div>
</div>
</div>
</nav></header>
<main><article>
<div class="m-container m-container-inflatable">
<div class="m-row">
<div class="m-col-l-10 m-push-l-1">
<h1>
<span class="m-breadcrumb"><a href="namespacetf.html">tf</a>::<wbr/></span>cudaFlowRoundRobinOptimizer <span class="m-thin">class</span>
<div class="m-doc-include m-code m-inverted m-text-right"><span class="cp">#include</span> <a class="cpf" href="cuda__optimizer_8hpp.html">&lt;taskflow/cuda/cuda_optimizer.hpp&gt;</a></div>
</h1>
<p>class to capture a CUDA graph using a round-robin algorithm</p>
<nav class="m-block m-default">
<h3>Contents</h3>
<ul>
<li>
Reference
<ul>
<li><a href="#typeless-methods">Constructors, destructors, conversion operators</a></li>
<li><a href="#pub-methods">Public functions</a></li>
</ul>
</li>
</ul>
</nav>
<p>A round-robin capturing algorithm levelizes the user-described graph and assigns streams to nodes in a round-robin order, level by level. The algorithm is based on the following paper published in Euro-Par 2021:</p><ul><li>Dian-Lun Lin and Tsung-Wei Huang, &quot;Efficient GPU Computation using Task <a href="classtf_1_1Graph.html" class="m-doc">Graph</a> Parallelism,&quot; <em>European Conference on Parallel and Distributed Computing (Euro-Par)</em>, 2021</li></ul><p>The round-robin optimization algorithm is best suited for large cudaFlow graphs that comprise hundreds or thousands of GPU operations (e.g., kernels and memory copies), many of which can run in parallel. You can configure the optimizer's number of streams to adjust the maximum kernel concurrency in the captured CUDA graph.</p>
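<p>The levelize-then-assign idea described above can be sketched in plain C++ without any CUDA dependency. This is a simplified illustration, not Taskflow's implementation: the function name <code>assign_streams_round_robin</code> and the adjacency-list input are hypothetical, the level of a node is assumed to be the length of its longest path from a source, and the round-robin counter is assumed to restart at every level. The cross-stream event synchronization that a real capture needs is left out.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical helper, not part of Taskflow.
// adj[u] lists the successors of node u in a DAG.
// Returns, for every node, the index of the stream assigned to it.
std::vector<std::size_t> assign_streams_round_robin(
    const std::vector<std::vector<std::size_t>>& adj,
    std::size_t num_streams) {
  assert(num_streams > 0);
  const std::size_t n = adj.size();

  // Count incoming edges so that source nodes can be found.
  std::vector<std::size_t> in_degree(n, 0);
  for (const auto& succs : adj) {
    for (std::size_t v : succs) ++in_degree[v];
  }

  // Levelize with a topological sweep (assumption: level = longest
  // path length from any source node).
  std::vector<std::size_t> level(n, 0);
  std::queue<std::size_t> ready;
  for (std::size_t u = 0; u < n; ++u) {
    if (in_degree[u] == 0) ready.push(u);
  }

  std::size_t max_level = 0;
  while (!ready.empty()) {
    const std::size_t u = ready.front();
    ready.pop();
    max_level = std::max(max_level, level[u]);
    for (std::size_t v : adj[u]) {
      level[v] = std::max(level[v], level[u] + 1);
      if (--in_degree[v] == 0) ready.push(v);
    }
  }

  // Walk the nodes in index order and hand out streams round-robin
  // within each level (assumption: the counter restarts per level).
  std::vector<std::size_t> next_slot(max_level + 1, 0);
  std::vector<std::size_t> stream(n, 0);
  for (std::size_t u = 0; u < n; ++u) {
    stream[u] = next_slot[level[u]]++ % num_streams;
  }
  return stream;
}
```

<p>In this sketch, a graph in which node 0 fans out to nodes 1, 2 and 3, which all join at node 4, has three levels. With two streams the three middle nodes alternate between stream 0 and stream 1, so at most two of them can overlap. With four streams each middle node gets its own stream.</p>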
<section id="typeless-methods">
<h2><a href="#typeless-methods">Constructors, destructors, conversion operators</a></h2>
<dl class="m-doc">
<dt id="aef646675174ffcab6135fbfb7f0eecfe">
<span class="m-doc-wrap-bumper"><a href="#aef646675174ffcab6135fbfb7f0eecfe" class="m-doc-self">cudaFlowRoundRobinOptimizer</a>(</span><span class="m-doc-wrap">) <span class="m-label m-flat m-info">defaulted</span></span>
</dt>
<dd>constructs a round-robin optimizer with 4 streams by default</dd>
<dt id="ab293c8613773baf87ff740d2cec14149">
<span class="m-doc-wrap-bumper"><a href="#ab293c8613773baf87ff740d2cec14149" class="m-doc-self">cudaFlowRoundRobinOptimizer</a>(</span><span class="m-doc-wrap">size_t num_streams) <span class="m-label m-flat m-info">explicit</span> </span>
</dt>
<dd>constructs a round-robin optimizer with the given number of streams</dd>
</dl>
</section>
<section id="pub-methods">
<h2><a href="#pub-methods">Public functions</a></h2>
<dl class="m-doc">
<dt id="a22fb9667ce393c31d908c3cc4f0ba650">
<span class="m-doc-wrap-bumper">auto <a href="#a22fb9667ce393c31d908c3cc4f0ba650" class="m-doc-self">num_streams</a>(</span><span class="m-doc-wrap">) const -&gt; size_t</span>
</dt>
<dd>queries the number of streams used by the optimizer</dd>
<dt id="acbd190f22ecc606a8b888953649a5be6">
<span class="m-doc-wrap-bumper">void <a href="#acbd190f22ecc606a8b888953649a5be6" class="m-doc-self">num_streams</a>(</span><span class="m-doc-wrap">size_t n)</span>
</dt>
<dd>sets the number of streams used by the optimizer</dd>
</dl>
</section>
</div>
</div>
</div>
</article></main>
<div class="m-doc-search" id="search">
<a href="#!" onclick="return hideSearch()"></a>
<div class="m-container">
<div class="m-row">
<div class="m-col-m-8 m-push-m-2">
<div class="m-doc-search-header m-text m-small">
<div><span class="m-label m-default">Tab</span> / <span class="m-label m-default">T</span> to search, <span class="m-label m-default">Esc</span> to close</div>
<div id="search-symbolcount">&hellip;</div>
</div>
<div class="m-doc-search-content">
<form>
<input type="search" name="q" id="search-input" placeholder="Loading &hellip;" disabled="disabled" autofocus="autofocus" autocomplete="off" spellcheck="false" />
</form>
<noscript class="m-text m-danger m-text-center">Unlike everything else in the docs, the search functionality <em>requires</em> JavaScript.</noscript>
<div id="search-help" class="m-text m-dim m-text-center">
<p class="m-noindent">Search for symbols, directories, files, pages or
modules. You can omit any prefix from the symbol or file path; adding a
<code>:</code> or <code>/</code> suffix lists all members of given symbol or
directory.</p>
<p class="m-noindent">Use <span class="m-label m-dim">&darr;</span>
/ <span class="m-label m-dim">&uarr;</span> to navigate through the list,
<span class="m-label m-dim">Enter</span> to go.
<span class="m-label m-dim">Tab</span> autocompletes common prefix, you can
copy a link to the result using <span class="m-label m-dim">⌘</span>
<span class="m-label m-dim">L</span> while <span class="m-label m-dim">⌘</span>
<span class="m-label m-dim">M</span> produces a Markdown link.</p>
</div>
<div id="search-notfound" class="m-text m-warning m-text-center">Sorry, nothing was found.</div>
<ul id="search-results"></ul>
</div>
</div>
</div>
</div>
</div>
<script src="search-v2.js"></script>
<script src="searchdata-v2.js" async="async"></script>
<footer><nav>
<div class="m-container">
<div class="m-row">
<div class="m-col-l-10 m-push-l-1">
<p>Taskflow handbook is part of the <a href="https://taskflow.github.io">Taskflow project</a>, copyright © <a href="https://tsung-wei-huang.github.io/">Dr. Tsung-Wei Huang</a>, 2018&ndash;2024.<br />Generated by <a href="https://doxygen.org/">Doxygen</a> 1.9.1 and <a href="https://mcss.mosra.cz/">m.css</a>.</p>
</div>
</div>
</div>
</nav></footer>
</body>
</html>