1 file changed: +2 -2 lines changed
@@ -239,8 +239,8 @@ are to conserve variance of the activation as well as variance of back-propagate
This allows information to flow well upward and downward in the network and
reduces discrepancies between layers.
Under some assumptions, a compromise between these two constraints leads to the following
- initialization: :math:`uniform[-\frac{6} {\sqrt{fan_{in}+fan_{out}}},\frac{6 }{\sqrt{fan_{in}+fan_{out}}}]`
- for tanh and :math:`uniform[-4*\frac{6} {\sqrt{fan_{in}+fan_{out}}},4*\frac{6 }{\sqrt{fan_{in}+fan_{out}}}]`
+ initialization: :math:`uniform[-\frac{\sqrt{6}} {\sqrt{fan_{in}+fan_{out}}},\frac{\sqrt{6} }{\sqrt{fan_{in}+fan_{out}}}]`
+ for tanh and :math:`uniform[-4*\frac{\sqrt{6}} {\sqrt{fan_{in}+fan_{out}}},4*\frac{\sqrt{6} }{\sqrt{fan_{in}+fan_{out}}}]`
for sigmoid. Where :math:`fan_{in}` is the number of inputs and :math:`fan_{out}` the number of hidden units.
For mathematical considerations please refer to [Xavier10]_.
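
To make the corrected bound concrete, here is a minimal sketch of the initialization described by the ``+`` lines above. It is not the tutorial's own code: the function names ``glorot_uniform_bound`` and ``init_weights`` and the ``activation`` and ``seed`` parameters are hypothetical, chosen only for illustration, and it uses the Python standard library rather than any particular array library.

```python
import math
import random


def glorot_uniform_bound(fan_in, fan_out, activation="tanh"):
    """Return r such that weights are drawn from uniform[-r, r].

    Hypothetical helper: r = sqrt(6) / sqrt(fan_in + fan_out) for tanh,
    and 4 times that value for sigmoid, as stated in the patched text.
    """
    if fan_in <= 0 or fan_out <= 0:
        raise ValueError("fan_in and fan_out must be positive")
    bound = math.sqrt(6.0) / math.sqrt(fan_in + fan_out)
    if activation == "sigmoid":
        bound *= 4.0
    elif activation != "tanh":
        raise ValueError("only 'tanh' and 'sigmoid' are covered here")
    return bound


def init_weights(fan_in, fan_out, activation="tanh", seed=0):
    """Return a fan_in x fan_out nested list of weights in [-r, r]."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    r = glorot_uniform_bound(fan_in, fan_out, activation)
    return [[rng.uniform(-r, r) for _ in range(fan_out)]
            for _ in range(fan_in)]
```

With ``fan_in=2`` and ``fan_out=4`` the tanh bound is ``sqrt(6)/sqrt(6) = 1``, whereas the pre-patch formula would have given ``6/sqrt(6)``, roughly 2.45, which is the discrepancy this change removes.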