docs/stable/rpc/distributed_autograd.html (+11 −11)
@@ -334,12 +334,12 @@
 <span id="id1"></span><h1>Distributed Autograd Design<a class="headerlink" href="#distributed-autograd-design" title="Permalink to this headline">¶</a></h1>
 <p>This note will present the detailed design for distributed autograd and walk
 through the internals of the same. Make sure you’re familiar with
-<a class="reference internal" href="../notes/autograd.html#autograd-mechanics"><span class="std std-ref">Autograd mechanics</span></a> and the <a class="reference internal" href="rpc.html#distributed-rpc-framework"><span class="std std-ref">Distributed RPC Framework</span></a> before
+<a class="reference internal" href="../notes/autograd.html#autograd-mechanics"><span class="std std-ref">Autograd mechanics</span></a> and the <a class="reference internal" href="../rpc.html#distributed-rpc-framework"><span class="std std-ref">Distributed RPC Framework</span></a> before
 proceeding.</p>
 <div class="section" id="background">
 <h2>Background<a class="headerlink" href="#background" title="Permalink to this headline">¶</a></h2>
 <p>Let’s say you have two nodes and a very simple model partitioned across two
-nodes. This can be implemented using <a class="reference internal" href="rpc.html#module-torch.distributed.rpc" title="torch.distributed.rpc"><code class="xref py py-mod docutils literal notranslate"><span class="pre">torch.distributed.rpc</span></code></a> as follows:</p>
+nodes. This can be implemented using <a class="reference internal" href="../rpc.html#module-torch.distributed.rpc" title="torch.distributed.rpc"><code class="xref py py-mod docutils literal notranslate"><span class="pre">torch.distributed.rpc</span></code></a> as follows:</p>
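The patched paragraph above leads into a code listing that this diff does not display. For orientation only, a minimal sketch of such a two-worker partitioned forward pass might look like the following; the worker names, tensor shapes, and init_rpc bootstrapping are illustrative assumptions, not content of this patch:

import torch
import torch.distributed.rpc as rpc

# Assumes a peer process has called rpc.init_rpc("worker1", rank=1, world_size=2)
# and that MASTER_ADDR/MASTER_PORT are set for both processes.
rpc.init_rpc("worker0", rank=0, world_size=2)

t1 = torch.rand((3, 3), requires_grad=True)
t2 = torch.rand((3, 3), requires_grad=True)

# Run the first part of the model remotely on worker1 over RPC ...
t3 = rpc.rpc_sync("worker1", torch.add, args=(t1, t2))

# ... and finish the computation locally on worker0.
t4 = torch.rand((3, 3), requires_grad=True)
loss = torch.mul(t3, t4).sum()

rpc.shutdown()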
⋮
 <h2>Distributed Autograd Context<a class="headerlink" href="#distributed-autograd-context" title="Permalink to this headline">¶</a></h2>
 <p>Each forward and backward pass that uses distributed autograd is assigned a
-unique <a class="reference internal" href="rpc.html#torch.distributed.autograd.context" title="torch.distributed.autograd.context"><code class="xref py py-class docutils literal notranslate"><span class="pre">torch.distributed.autograd.context</span></code></a> and this context has a
+unique <a class="reference internal" href="../rpc.html#torch.distributed.autograd.context" title="torch.distributed.autograd.context"><code class="xref py py-class docutils literal notranslate"><span class="pre">torch.distributed.autograd.context</span></code></a> and this context has a
 globally unique <code class="docutils literal notranslate"><span class="pre">autograd_context_id</span></code>. This context is created on each node
⋮
 before we have the opportunity to run the optimizer. This is similar to
 calling <a class="reference internal" href="../autograd.html#torch.autograd.backward" title="torch.autograd.backward"><code class="xref py py-meth docutils literal notranslate"><span class="pre">torch.autograd.backward()</span></code></a> multiple times locally. In order to
 provide a way of separating out the gradients for each backward pass, the
-gradients are accumulated in the <a class="reference internal" href="rpc.html#torch.distributed.autograd.context" title="torch.distributed.autograd.context"><code class="xref py py-class docutils literal notranslate"><span class="pre">torch.distributed.autograd.context</span></code></a>
+gradients are accumulated in the <a class="reference internal" href="../rpc.html#torch.distributed.autograd.context" title="torch.distributed.autograd.context"><code class="xref py py-class docutils literal notranslate"><span class="pre">torch.distributed.autograd.context</span></code></a>
 for each backward pass.</p></li>
 <li><p>During the forward pass we store the <code class="docutils literal notranslate"><span class="pre">send</span></code> and <code class="docutils literal notranslate"><span class="pre">recv</span></code> functions for
 each autograd pass in this context. This ensures we hold references to the
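As a hedged illustration of what this hunk describes (not shown in the diff itself): a backward pass is scoped to a torch.distributed.autograd context roughly as follows, assuming an RPC group with a peer named "worker1" has been initialized as in the earlier sketch:

import torch
import torch.distributed.autograd as dist_autograd
import torch.distributed.rpc as rpc

# Assumes rpc.init_rpc(...) has already been run and a peer "worker1" exists.
# Each `with` block opens a fresh context with its own autograd_context_id,
# so gradients from separate backward passes are kept apart.
with dist_autograd.context() as context_id:
    t1 = torch.rand((3, 3), requires_grad=True)
    t2 = torch.rand((3, 3), requires_grad=True)
    # The send/recv autograd functions recorded for this RPC are stored in
    # the context identified by context_id.
    loss = rpc.rpc_sync("worker1", torch.add, args=(t1, t2)).sum()
    # Gradients are accumulated in this context, not on the Tensors' .grad.
    dist_autograd.backward(context_id, [loss])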
@@ -524,7 +524,7 @@ <h3>Computing dependencies<a class="headerlink" href="#computing-dependencies" t
 <a class="reference internal" href="#distributed-autograd-context">Distributed Autograd Context</a>. The gradients are stored in a
 <code class="docutils literal notranslate"><span class="pre">Dict[Tensor,</span> <span class="pre">Tensor]</span></code>, which is basically a map from Tensor to its
 associated gradient and this map can be retrieved using the
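The method name is truncated by the diff above; in current PyTorch this map is returned by torch.distributed.autograd.get_gradients(context_id) (library knowledge, not part of this patch). A small self-contained sketch, using a single-process RPC group purely so it can run on its own:

import os
import torch
import torch.distributed.autograd as dist_autograd
import torch.distributed.rpc as rpc

# Single-worker RPC group so the example runs in one process.
os.environ.setdefault("MASTER_ADDR", "localhost")
os.environ.setdefault("MASTER_PORT", "29500")
rpc.init_rpc("worker0", rank=0, world_size=1)

t = torch.rand((3, 3), requires_grad=True)
with dist_autograd.context() as context_id:
    loss = (t * 2).sum()
    dist_autograd.backward(context_id, [loss])
    # Dict[Tensor, Tensor]: maps each Tensor that received a gradient to the
    # gradient accumulated for this particular context (this backward pass).
    grads = dist_autograd.get_gradients(context_id)
    print(grads[t])  # the gradient lives in the context rather than in t.grad

rpc.shutdown()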
⋮
-<li><p>Takes a list of remote parameters (<a class="reference internal" href="rpc.html#torch.distributed.rpc.RRef" title="torch.distributed.rpc.RRef"><code class="xref py py-class docutils literal notranslate"><span class="pre">RRef</span></code></a>) to
+<li><p>Takes a list of remote parameters (<a class="reference internal" href="../rpc.html#torch.distributed.rpc.RRef" title="torch.distributed.rpc.RRef"><code class="xref py py-class docutils literal notranslate"><span class="pre">RRef</span></code></a>) to
 optimize. These could also be local parameters wrapped within a local
⋮
 <li><p>Takes a <a class="reference internal" href="../optim.html#torch.optim.Optimizer" title="torch.optim.Optimizer"><code class="xref py py-class docutils literal notranslate"><span class="pre">Optimizer</span></code></a> class as the local
 optimizer to run on all distinct <code class="docutils literal notranslate"><span class="pre">RRef</span></code> owners.</p></li>
 <li><p>The distributed optimizer creates an instance of the local <code class="docutils literal notranslate"><span class="pre">Optimizer</span></code> on
 each of the worker nodes and holds an <code class="docutils literal notranslate"><span class="pre">RRef</span></code> to them.</p></li>
⋮
 the distributed optimizer uses RPC to remotely execute all the local
 optimizers on the appropriate remote workers. A distributed autograd
 <code class="docutils literal notranslate"><span class="pre">context_id</span></code> must be provided as input to
-<a class="reference internal" href="rpc.html#torch.distributed.optim.DistributedOptimizer.step" title="torch.distributed.optim.DistributedOptimizer.step"><code class="xref py py-meth docutils literal notranslate"><span class="pre">torch.distributed.optim.DistributedOptimizer.step()</span></code></a>. This is used
+<a class="reference internal" href="../rpc.html#torch.distributed.optim.DistributedOptimizer.step" title="torch.distributed.optim.DistributedOptimizer.step"><code class="xref py py-meth docutils literal notranslate"><span class="pre">torch.distributed.optim.DistributedOptimizer.step()</span></code></a>. This is used
 by local optimizers to apply gradients stored in the corresponding
 context.</p></li>
 <li><p>If multiple concurrent distributed optimizers are updating the same
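Putting the items in this hunk together, a hedged end-to-end sketch (not part of the patch): parameters are passed as RRefs (here local parameters wrapped in local RRefs, which the text above notes is also allowed), a local optimizer class is supplied, and step() receives the context_id:

import os
import torch
import torch.distributed.autograd as dist_autograd
import torch.distributed.rpc as rpc
import torch.optim as optim
from torch.distributed.optim import DistributedOptimizer

# Single-worker RPC group so the sketch runs in one process; in a real model
# these RRefs would point at parameters owned by remote workers.
os.environ.setdefault("MASTER_ADDR", "localhost")
os.environ.setdefault("MASTER_PORT", "29500")
rpc.init_rpc("worker0", rank=0, world_size=1)

w = torch.rand((3, 3), requires_grad=True)
b = torch.rand(3, requires_grad=True)
param_rrefs = [rpc.RRef(w), rpc.RRef(b)]

# Instantiates optim.SGD on each distinct RRef owner and keeps an RRef to
# every one of those local optimizers.
dist_optim = DistributedOptimizer(optim.SGD, param_rrefs, lr=0.05)

with dist_autograd.context() as context_id:
    loss = (torch.rand((3, 3)).mm(w) + b).sum()
    dist_autograd.backward(context_id, [loss])
    # step() needs the context_id so each local optimizer can look up the
    # gradients accumulated in the corresponding context.
    dist_optim.step(context_id)

rpc.shutdown()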