Skip to content

Commit 64b8728

Browse files
committed
fix proofs, improve figures
1 parent 6eac566 commit 64b8728

File tree

3 files changed

+37
-27
lines changed

3 files changed

+37
-27
lines changed

docs/db.tex

Lines changed: 37 additions & 27 deletions
Original file line numberDiff line numberDiff line change
@@ -130,7 +130,7 @@ \section{$\partial\mathbb{B}$ nets}\label{sec:db-nets}
130130
\begin{equation*}
131131
\operatorname{harden}(f({\bf x})) = g(\operatorname{harden}({\bf x}))
132132
\end{equation*}
133-
for all ${\bf x} \in [0,1]^{n}$. For shorthand write $f \btright g$.
133+
for all ${\bf x} \in \{(x_{1}, \dots, x_{n}) ~|~ x_{i} \in [0,1] \setminus \{1/2\}\}$. For shorthand write $f \btright g$.
134134
\end{definition}
135135

136136
Neural networks are typically composed of nonlinear activation functions (for representational generality) that are strictly monotonic (so gradients always exist that link changes in inputs to outputs without local minima) and differentiable (so gradients reliably represent the local loss surface). However, activation functions that are monotonic but not strictly (so some gradients are zero) and differentiable almost everywhere (so some gradients are undefined) can also work, e.g. RELU \citep{10.5555/3104322.3104425}. $\partial \mathbb{B}$ nets are composed from `activation' functions that also satisfy these properties plus the additional property of hard-equivalence to a boolean function (and natural generalisations).
@@ -155,6 +155,12 @@ \subsection{Learning to negate}
155155

156156
There are many kinds of differentiable fuzzy logic operators (see \cite{VANKRIEKEN2022103602} for a review). So why this functional form? Product logics, where $f(x,y) = x y$ is as a soft version of $x \wedge y$, although hard-equivalent at extreme values, e.g. $f(1,1)=1$ and $f(0,1)=0$, are not hard-equivalent at intermediate values, e.g. $f(0.6, 0.6) = 0.36$, which hardens to $\operatorname{False}$ not $\operatorname{True}$. G\"{o}del-style $\operatorname{min}$ and $\operatorname{max}$ functions, although hard-equivalent over the entire soft-bit range, i.e. $\operatorname{min}(x,y) \btright x \wedge y$ and $\operatorname{max}(x,y) \btright x \vee y$, are gradient-sparse in the sense that their outputs are not always a function of all their inputs, e.g. $\frac{\partial}{\partial x} \operatorname{max}(x,y) = 0$ when $(x,y)=(0.1, 0.9)$. So although the composite function $\operatorname{max}(\operatorname{min}(w, x), \operatorname{min}(1-w, 1-x))$ is differentiable and $\btright \neg(x \oplus w)$ it does not always backpropagate error to its inputs. In contrast, $\partial_{\neg}$ always backpropagates error to its inputs because it is a gradient-rich function (see figure \ref{fig:gradient-rich}).
157157

158+
\begin{definition}[Gradient-rich]
159+
A function, $f: [0,1]^n \rightarrow [0,1]^m$, is {\em gradient-rich} if $\frac{\partial f({\bf x})}{\partial x_{i}} \neq {\bf 0}$ for all ${\bf x} \in \{(x_{1}, \dots, x_{n}) ~|~ x_{i} \in [0,1] \setminus \{1/2\}\}$.
160+
\end{definition}
161+
162+
TODO: explain
163+
158164
\subsection{Margin packing}
159165

160166
Say we aim to construct a differentiable analogue of $x \wedge y$. Note that $\operatorname{min}(x,y)$ essentially selects one of $x$ or $y$ as a representative soft-bit that is guaranteed hard-equivalent to $x \wedge y$. However, by selecting only one of $x$ or $y$ then $\operatorname{min}$ is also guaranteed to be gradient-sparse. We define a `margin packing' method to solve this dilemma.
@@ -611,13 +617,14 @@ \section{Proofs}
611617
Table \ref{not-table} is the truth table of the boolean function $\neg (x \oplus w)$, where $h(x) = \operatorname{harden}(x)$.
612618
\begin{table}[h!]
613619
\begin{center}
614-
\begin{tabular}{cccccc}
615-
\multicolumn{1}{c}{$x$} &\multicolumn{1}{c}{$y$} &\multicolumn{1}{c}{$h(x)$} &\multicolumn{1}{c}{$h(y)$} &\multicolumn{1}{c}{$\partial_{\neg}(h(x), h(y))$} &\multicolumn{1}{c}{$h(\partial_{\neg}(h(x), h(y)))$}
620+
\begin{tabular}{ccccccc}
621+
\multicolumn{1}{c}{$x$} &\multicolumn{1}{c}{$y$} &\multicolumn{1}{c}{$h(x)$} &\multicolumn{1}{c}{$h(y)$} &\multicolumn{1}{c}{$\partial_{\neg}(x, y)$} &\multicolumn{1}{c}{$h(\partial_{\neg}(x, y))$}
622+
&\multicolumn{1}{c}{$\neg (h(y) \oplus h(x))$}
616623
\\ \hline \\
617-
$\left[0, \frac{1}{2}\right]$ & $\left[0, \frac{1}{2}\right]$ & 0 & 0 & 1 & 1\\[0.1cm]
618-
$\left(\frac{1}{2}, 1\right]$ & $\left[0, \frac{1}{2}\right]$ &1 & 0 & 0 & 0\\[0.1cm]
619-
$\left[0, \frac{1}{2}\right]$ & $\left(\frac{1}{2}, 1\right]$ &0 & 1 & 0 & 0\\[0.1cm]
620-
$\left(\frac{1}{2}, 1\right]$ & $\left(\frac{1}{2}, 1\right]$ &1 & 1 & 1 & 1\\[0.1cm]
624+
$\left[0, \frac{1}{2}\right)$ & $\left[0, \frac{1}{2}\right)$ & 0 & 0 & $\left(\frac{1}{2},1\right]$ & 1 & 1\\[0.1cm]
625+
$\left(\frac{1}{2}, 1\right]$ & $\left[0, \frac{1}{2}\right)$ &1 & 0 & $\left[0, \frac{1}{2}\right)$ & 0 & 0\\[0.1cm]
626+
$\left[0, \frac{1}{2}\right)$ & $\left(\frac{1}{2}, 1\right]$ &0 & 1 & $\left[0, \frac{1}{2}\right)$ & 0 & 0\\[0.1cm]
627+
$\left(\frac{1}{2}, 1\right]$ & $\left(\frac{1}{2}, 1\right]$ &1 & 1 & $\left(\frac{1}{2}, 1\right]$ & 1 & 1\\[0.1cm]
621628
\end{tabular}
622629
\end{center}
623630
\caption{$\partial_{\neg}(x,y) \btright \neg (y \oplus x)$.}\label{not-table}
@@ -653,13 +660,14 @@ \section{Proofs}
653660
Table \ref{and-table} is the truth table of the boolean function $x \wedge y$, where $h(x) = \operatorname{harden}(x)$..
654661
\begin{table}[h!]
655662
\begin{center}
656-
\begin{tabular}{cccccc}
657-
\multicolumn{1}{c}{$x$} &\multicolumn{1}{c}{$y$} &\multicolumn{1}{c}{$h(x)$} &\multicolumn{1}{c}{$h(y)$} &\multicolumn{1}{c}{$\partial_{\wedge}(h(x), h(y))$} &\multicolumn{1}{c}{$h(\partial_{\wedge}(h(x), h(y)))$}
663+
\begin{tabular}{ccccccc}
664+
\multicolumn{1}{c}{$x$} &\multicolumn{1}{c}{$y$} &\multicolumn{1}{c}{$h(x)$} &\multicolumn{1}{c}{$h(y)$} &\multicolumn{1}{c}{$\partial_{\wedge}(x, y)$} &\multicolumn{1}{c}{$h(\partial_{\wedge}(x, y))$}
665+
&\multicolumn{1}{c}{$h(x) \wedge h(y)$}
658666
\\ \hline \\
659-
$\left[0, \frac{1}{2}\right]$ & $\left[0, \frac{1}{2}\right]$ & 0 & 0 & 0 & 0\\[0.1cm]
660-
$\left(\frac{1}{2}, 1\right]$ & $\left[0, \frac{1}{2}\right]$ &1 & 0 & $\frac{1}{4}$ & 0\\[0.1cm]
661-
$\left[0, \frac{1}{2}\right]$ & $\left(\frac{1}{2}, 1\right]$ &0 & 1 & $\frac{1}{4}$ & 0\\[0.1cm]
662-
$\left(\frac{1}{2}, 1\right]$ & $\left(\frac{1}{2}, 1\right]$ &1 & 1 & 1 & 1\\[0.1cm]
667+
$\left[0, \frac{1}{2}\right)$ & $\left[0, \frac{1}{2}\right)$ & 0 & 0 & $\left[0, \frac{1}{2}\right)$ & 0 & 0\\[0.1cm]
668+
$\left(\frac{1}{2}, 1\right]$ & $\left[0, \frac{1}{2}\right)$ &1 & 0 & $\left(\frac{1}{4}, \frac{1}{2}\right)$ & 0 & 0\\[0.1cm]
669+
$\left[0, \frac{1}{2}\right)$ & $\left(\frac{1}{2}, 1\right]$ &0 & 1 & $\left(\frac{1}{4}, \frac{1}{2}\right)$ & 0 & 0\\[0.1cm]
670+
$\left(\frac{1}{2}, 1\right]$ & $\left(\frac{1}{2}, 1\right]$ &1 & 1 & $\left(\frac{1}{2}, 1\right]$ & 1 & 1\\[0.1cm]
663671
\end{tabular}
664672
\end{center}
665673
\caption{$\partial_{\wedge}(x,y) \btright x \wedge y$.}\label{and-table}
@@ -673,13 +681,14 @@ \section{Proofs}
673681
Table \ref{or-table} is the truth table of the boolean function $x \vee y$, where $h(x) = \operatorname{harden}(x)$..
674682
\begin{table}[h!]
675683
\begin{center}
676-
\begin{tabular}{cccccc}
677-
\multicolumn{1}{c}{$x$} &\multicolumn{1}{c}{$y$} &\multicolumn{1}{c}{$h(x)$} &\multicolumn{1}{c}{$h(y)$} &\multicolumn{1}{c}{$\partial_{\vee}(h(x), h(y))$} &\multicolumn{1}{c}{$h(\partial_{\vee}(h(x), h(y)))$}
684+
\begin{tabular}{ccccccc}
685+
\multicolumn{1}{c}{$x$} &\multicolumn{1}{c}{$y$} &\multicolumn{1}{c}{$h(x)$} &\multicolumn{1}{c}{$h(y)$} &\multicolumn{1}{c}{$\partial_{\vee}(x, y)$} &\multicolumn{1}{c}{$h(\partial_{\vee}(x, y))$}
686+
&\multicolumn{1}{c}{$h(x) \vee h(y)$}
678687
\\ \hline \\
679-
$\left[0, \frac{1}{2}\right]$ & $\left[0, \frac{1}{2}\right]$ & 0 & 0 & 0 & 0\\[0.1cm]
680-
$\left(\frac{1}{2}, 1\right]$ & $\left[0, \frac{1}{2}\right]$ &1 & 0 & $\frac{3}{4}$ & 1\\[0.1cm]
681-
$\left[0, \frac{1}{2}\right]$ & $\left(\frac{1}{2}, 1\right]$ &0 & 1 & $\frac{3}{4}$ & 1\\[0.1cm]
682-
$\left(\frac{1}{2}, 1\right]$ & $\left(\frac{1}{2}, 1\right]$ &1 & 1 & 1 & 1\\[0.1cm]
688+
$\left[0, \frac{1}{2}\right)$ & $\left[0, \frac{1}{2}\right)$ & 0 & 0 & $\left[0,\frac{1}{2}\right)$ & 0 & 0\\[0.1cm]
689+
$\left(\frac{1}{2}, 1\right]$ & $\left[0, \frac{1}{2}\right)$ &1 & 0 & $\left(\frac{1}{2},1\right]$ & 1 & 1\\[0.1cm]
690+
$\left[0, \frac{1}{2}\right)$ & $\left(\frac{1}{2}, 1\right]$ &0 & 1 & $\left(\frac{1}{2},1\right]$ & 1 & 1\\[0.1cm]
691+
$\left(\frac{1}{2}, 1\right]$ & $\left(\frac{1}{2}, 1\right]$ &1 & 1 & $\left(\frac{1}{2},1\right]$ & 1 & 1\\[0.1cm]
683692
\end{tabular}
684693
\end{center}
685694
\caption{$\partial_{\vee}(x,y) \btright x \vee y$.}\label{or-table}
@@ -693,16 +702,17 @@ \section{Proofs}
693702
Table \ref{implies-table} is the truth table of the boolean function $x \Rightarrow y$, where $h(x) = \operatorname{harden}(x)$..
694703
\begin{table}[h!]
695704
\begin{center}
696-
\begin{tabular}{cccccc}
697-
\multicolumn{1}{c}{$x$} &\multicolumn{1}{c}{$y$} &\multicolumn{1}{c}{$h(x)$} &\multicolumn{1}{c}{$h(y)$} &\multicolumn{1}{c}{$\partial \Rightarrow(h(x), h(y))$} &\multicolumn{1}{c}{$h(\partial \Rightarrow(h(x), h(y)))$}
705+
\begin{tabular}{ccccccc}
706+
\multicolumn{1}{c}{$x$} &\multicolumn{1}{c}{$y$} &\multicolumn{1}{c}{$h(x)$} &\multicolumn{1}{c}{$h(y)$} &\multicolumn{1}{c}{$\partial_{\Rightarrow}(x, y)$} &\multicolumn{1}{c}{$h(\partial_{\Rightarrow}(x, y))$}
707+
&\multicolumn{1}{c}{$h(x) \Rightarrow h(y)$}
698708
\\ \hline \\
699-
$\left[0, \frac{1}{2}\right]$ & $\left[0, \frac{1}{2}\right]$ & 0 & 0 & $\frac{3}{4}$ & 1\\[0.1cm]
700-
$\left(\frac{1}{2}, 1\right]$ & $\left[0, \frac{1}{2}\right]$ &1 & 0 & 0 & 0\\[0.1cm]
701-
$\left[0, \frac{1}{2}\right]$ & $\left(\frac{1}{2}, 1\right]$ &0 & 1 & 1 & 1\\[0.1cm]
702-
$\left(\frac{1}{2}, 1\right]$ & $\left(\frac{1}{2}, 1\right]$ &1 & 1 & $\frac{3}{4}$ & 1\\[0.1cm]
709+
$\left[0, \frac{1}{2}\right)$ & $\left[0, \frac{1}{2}\right)$ & 0 & 0 & $\left(\frac{1}{2}, 1\right]$ & 1 & 0\\[0.1cm]
710+
$\left(\frac{1}{2}, 1\right]$ & $\left[0, \frac{1}{2}\right)$ &1 & 0 & $\left[0, \frac{1}{2}\right)$ & 0 & 0\\[0.1cm]
711+
$\left[0, \frac{1}{2}\right)$ & $\left(\frac{1}{2}, 1\right]$ &0 & 1 & $\left(\frac{1}{2},1\right]$ & 1 & 0\\[0.1cm]
712+
$\left(\frac{1}{2}, 1\right]$ & $\left(\frac{1}{2}, 1\right]$ &1 & 1 & $\left(\frac{1}{2}, \frac{7}{8}\right)$ & 1 & 0\\[0.1cm]
703713
\end{tabular}
704714
\end{center}
705-
\caption{$\partial \Rightarrow(x,y) \btright x \Rightarrow y$.}\label{implies-table}
715+
\caption{$\partial_{\Rightarrow}(x,y) \btright x \Rightarrow y$.}\label{implies-table}
706716
\end{table}
707717
\end{proof}
708718
\end{prop}
@@ -722,7 +732,7 @@ \section{Proofs}
722732
\begin{theorem}\label{prop:majority}
723733
$\partial\!\operatorname{Maj} \btright \operatorname{Maj}$.
724734
\begin{proof}
725-
$\partial\!\operatorname{Maj}$ augments the representative bit $x_{i} = \operatorname{sort}({\bf x})[\operatorname{majority-index}({\bf x})]$. By lemma \ref{lem:maj}, the representative bit is $\btright\!\operatorname{Maj}(\operatorname{harden}({\bf x}))$.
735+
$\partial\!\operatorname{Maj}$ augments the representative bit $x_{i} = \operatorname{sort}({\bf x})[\operatorname{majority-index}({\bf x})]$. By lemma \ref{lem:maj} the representative bit is $\btright \operatorname{Maj}(\operatorname{harden}({\bf x}))$.
726736
By lemma \ref{prop:augmented}, the augmented bit, $\operatorname{augmented-bit}(\operatorname{sort}({\bf x}), \operatorname{majority-index}({\bf x}))$, is also $\btright\!\operatorname{Maj}(\operatorname{harden}({\bf x}))$. Hence $\partial\!\operatorname{Maj} \btright\!\operatorname{Maj}$.
727737
\end{proof}
728738
\end{theorem}

docs/logic-gates.png

-6.06 KB
Loading

docs/majority-gates.png

-33.1 KB
Loading

0 commit comments

Comments
 (0)