Broadcast and Weight: An Integrated Network for Scalable Photonic Spike Processing
Alexander N. Tait, Student Member, OSA, Mitchell A. Nahmias, Bhavin J. Shastri, Member, IEEE,
and Paul R. Prucnal, Fellow, IEEE, Fellow, OSA

Abstract—We propose an on-chip optical architecture to support massive parallel communication among high-performance spiking laser neurons. Designs for a network protocol, computational element, and waveguide medium are described, and novel methods are considered in relation to prior research in optical on-chip networking, neural networking, and computing. Broadcast-and-weight is a new approach for combining neuromorphic processing and optoelectronic physics, a pairing that is found to yield a variety of advantageous features. We discuss properties and design considerations for architectures for scalable wavelength reuse and biologically relevant organizational capabilities, in addition to aspects of practical feasibility. Given recent developments in commercial photonic systems integration and neuromorphic computing, we suggest that a novel approach to photonic spike processing represents a promising opportunity in unconventional computing.

Index Terms—Asynchronous circuits, network topology, neuromorphics, optical computing, optical interconnects, photonic integrated circuits, spiking neural networks, system analysis and design, WDM networks.

Manuscript received February 14, 2014; revised May 27, 2014; accepted July 22, 2014. Date of publication August 5, 2014; date of current version September 17, 2014. This work was supported by the National Science Foundation (NSF) Graduate Research Fellowship Program (GRFP).
The authors are with the Lightwave Communications Laboratory, Department of Electrical Engineering, Princeton University, Princeton, NJ 08544 USA (e-mail: atait@princeton.edu; mnahmias@princeton.edu; shastri@ieee.org; prucnal@princeton.edu).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JLT.2014.2345652

I. INTRODUCTION

NEUROMORPHIC processing offers many opportunities and challenges distinct from those of traditional von Neumann computing. It seeks to engineer scalable and cost-effective hardware systems that take inspiration from abstract principles of biological processing, such as parallelism and sparsity. Neuromorphic architectures promise potent advantages (efficiency, fault tolerance, adaptability) over von Neumann architectures for tasks involving pattern analysis, decision making, optimization, learning, and real-time control of multi-sensor, multi-actuator systems. Unconventional hardware has a long history of massive parallelism, but a more recently recognized point of neural inspiration is a sparse coding scheme called spiking [1].

Spike processing, while inspired by neuroscience, has firm code-theoretic justifications. Spike codes—digital in amplitude, but analog and sparse in pulse arrival time—can reconcile the expressiveness and efficiency of analog processing with the robustness of digital communication, and recurrent networks of spiking primitives possess rich algorithmic capabilities [2], [3]. Each spiking primitive handles inputs from multiple sources by temporally integrating their weighted sum and firing a single spike when the integration state variable crosses a threshold. This distributed, asynchronous model processes information using both space and time [4]–[6], and is amenable to distributed, unsupervised adaptation [7], [8]. The use of sparse coding principles promises extreme improvements to computational power efficiency in particular [9].

Spike processing is at the heart of a modern generation of neuromorphic electronics, although no single hardware approach has emerged as the clear ideal. Spiking primitives have been built in CMOS analog circuits [10], digital “neurosynaptic cores” [11], and non-CMOS devices [12]. Many architectures that interconnect large numbers of primitives have been proposed or built, notably Neurogrid [13], TrueNorth [14], SpiNNaker [15], and FACETS [16]. The use of physics for analog dynamical processing represents an important step towards attaining the efficiency and functionality exhibited by biophysical information processors, yet electronic interconnects are incapable of the density and fan-in needed to support scalable architectures that represent spikes as physical pulses. Despite a wide variety of approaches in neuromorphic microelectronics, all proposed architectures employ some form of address-event representation (AER) of spikes. AER is a digital packet routing scheme, which incurs significant time and energy overhead for signal encoding/decoding and network coordination, but is well-suited for slow timescale (milliseconds) neuromorphic systems [15].

Integrated photonic platforms represent an alternative to microelectronic approaches. The communication potentials of optical interconnects (bandwidth, energy use, electrical isolation) have received attention for neural networking in the past; however, attempts to realize holographic or matrix-vector multiplication systems have encountered practical barriers, largely because they cannot be integrated, let alone with effective nonlinear processing units. Techniques in silicon photonic integrated circuit (PIC) fabrication are driven by a tremendous demand for optical interconnects within conventional digital computing systems [17]. The first platforms for systems integration of active photonics are becoming commercial reality [18], [19], and promise to bring the economies of integrated circuit manufacturing to optical systems. Using a device set designed for digital communication (waveguides, filters, detectors, etc.), some have realized PICs for analog signal processing [20]. The potential of modern PIC platforms to enable large-scale all-optical systems for unconventional and/or analog computing has not yet been investigated.

Recent years have seen the emergence of a new class of optical devices that exploit a dynamical isomorphism between semiconductor photocarriers and neuron biophysics. The difference in physical timescales allows these “photonic neurons” to exhibit spiking behavior on picosecond (instead of millisecond) timescales [21]–[24]. Spiking is closely related to a dynamical system property called excitability, which is shared by certain kinds of laser devices. Excitable laser systems have been studied in the context of spike processing with the tools of bifurcation theory by [25]–[27] and experimentally by [28]–[30]. Some are specifically designed for compatibility with silicon photonic PIC platforms [31], [32]. A network of photonic neurons could open computational domains that demand unprecedented temporal precision, power efficiency, and functional complexity, potentially including applications in wideband radio frequency (RF) processing, adaptive control of multi-antenna systems, and high-performance scientific computing. Although the ultrafast spiking dynamics of laser neurons show potential in this respect, most analysis of them has so far been limited to one or two devices with minimal regard for a compatible network architecture.

We propose an on-chip networking architecture called “broadcast-and-weight” that could support massively parallel interconnection between photonic spiking neurons [33]. It has similarities with the fiber networking technique broadcast-and-select, which channelizes usable bandwidth using wavelength division multiplexing (WDM); however, the protocol flattens the traditional layered hierarchy of optical networks, accomplishing physical, logical, and processing tasks in a compact computational primitive. Although the proposed processing circuits are unconventional, the required device set is compatible with mainstream PIC platforms in silicon, which make heavy use of WDM techniques.

This paper is organized to first give background on optical networks on chip (NoC), computing, and neural networks. We will describe the WDM broadcast-and-weight protocol, then a primitive node for processing and networking, and a waveguide loop medium. Architectures consisting of multiple broadcast cells will be proposed and discussed with respect to topology and scalability. We have found that the implications of spike processing (time as information) combined with WDM (wavelength as identity) are accompanied by novel spatial freedoms that make this architecture uniquely suited, among artificial systems, to emulate and explore certain biologically-relevant organizational topologies (e.g., “small-worldness”). This pairing also yields key features of practical feasibility (robustness, cascadability, scalability), the lack of which has foiled some large-scale optical processors in the past. We claim that various favorable and generalizable properties of the proposed architecture make it a viable candidate to support a new generation of scalable high-performance spike processing in photonics.

II. PRIOR WORK

A. Optical Networks On-Chip

Optical networks on-chip (NoCs) have been proposed as an alternative to electronic networks to support the demanding throughput and efficiency requirements of future multi-core system on-chip architectures. Although the proposed interconnect is adapted for a considerably different signalling model (spiking), some of the networking techniques presented in this paper have been investigated in a conventional computing context. Optical ring networks with WDM channelization have been proposed as a means to obtain collision-free multicast networks, notably ATAC [34] and optical ring NoC (ORNoC) [35]. Psota et al. have also identified lightpath splitting as an efficient method for multicast routing on chip. The layout flexibility of the ring has been exploited to accommodate a tiled processor layout, and Le Beux et al. have proposed using multiple independent rings for spectrum reuse; however, interfacing these ORNoC subnetworks into a single system would require specialized switching nodes incorporating arbitration control, unlike the proposed architecture (see Section III-C).

WDM techniques significantly increase the effective throughput-density of a physical link; however, the requirement of a modulator and detector for each channel can negate the area and energy savings in some circumstances [36]. To obtain contention-free behavior, ATAC and ORNoC stipulate at least one dedicated receiver (i.e., detector, A/D converter, deserializer, and buffer) per channel per node, potentially creating a buffering bottleneck [35]. In contrast, the photonic spike processing architecture sums multiple inputs in a single detector and requires neither active electronic receivers nor distinct optical modulators (see Section III-B).

B. Optical Computing

Motivated by the properties that have made optics superior for communications (e.g., usable bandwidth and energy efficiency), optical devices and architectures have long been investigated for computing. Optical logic gates have been implemented by myriad techniques, including self-phase modulation in microring cavities [37], quantum dot saturable absorption [38], and many others; however, a scalable all-optical computer has so far proven elusive. Analyses of the daunting scope of fundamental challenges to digital optical computing are performed by Keyes [39] and Miller [40]. A comparison of these references reveals strikingly similar themes, which belie the progress of photonic technology in the intervening decades – not to mention the birth and maturation of the telecom industry. Many of the fundamental requirements of digital optical computing remain difficult to meet simultaneously in a simple device.

For this reason, many attempts to leverage the capabilities of optics avoid a digital electronic computing paradigm altogether and instead target specialized tasks, including A/D conversion [41], amoeba-inspired processing in quantum dots [42], and reservoir computing [43], [44]. The utility of an overspecialized optical “hardware accelerator” or “coprocessor” has so far been outweighed by the cost of commercial platform development, although many unconventional approaches succeed in exploring new and interesting intersections of computing and physics [45]. The proposed architecture avoids overspecialization with its many configuration freedoms—both in design layout and in field-tunable interconnection parameters. A particular interconnect configuration, which determines the behavior and function of a distributed processing system, is very different from a procedural program, where operations are represented by a stack of instructions interpretable by a Turing machine.
This absence of procedural programmability is a challenge
for the analysis and design of all neuron-inspired architectures,
but also one of their biggest advantages. Processors that re-
linquish a framework of immutable execution could exhibit
enhanced aptitude to self-organize and adapt to uncertain en-
vironments without programmer input [46], [47]. We believe
that the architecture proposed here exhibits important proper-
ties of a computing system potentially capable of sophisticated
and widely applicable large-scale information processing (see
Section IV-B), but classify it as a “scalable photonic proces-
sor” to emphasize the fact that it does not pursue a symbolic
instruction model of general purpose computation. Among un-
conventional optical processing paradigms, neural networking
is perhaps the most commonly examined class of models.

C. Optical Neural Networks

Optical technologies for interconnection have long been recognized as potential media for artificial neural network architectures, which rely on parallel communication performance as much as—if not more than—parallel operation of computational gates (a.k.a. neurons). Attempts to realize the throughput, dissipation, and cross-talk advantages of optics in a neurocomputing context, while promising in many cases, have so far encountered barriers in reliability, scalability, and cost. A review of optical neural networks (ONNs) is contained in [1].

For the most part, approaches to ONN interconnection have focused on spatial multiplexing techniques, including configurable spatial light modulation [48], matrix grating holograms [49], and volume holograms [50], [51]. Although they are dense techniques for all-to-all interconnection, free-space and holographic devices are difficult to integrate and also require precise alignment. Systems that are non-integrable or that require exotic integration processes have extreme difficulty matching CMOS systems in cost or practical scalability.

Coherent interference effects in many-to-one coupling [52] are particularly relevant to neural networks with large fan-in. Phase-sensitive designs of spiking optical neurons, such as [26], [27], must introduce methods to control the relative phases of signals originating from distinct computational primitives. Semiconductor optical devices that implement a Hopfield (non-spiking) model have used WDM to avoid mutual interference [48], [53]. Using WDM as a non-spatial multiplexing technique, broadcast-and-weight is compatible with commercial PIC integration and can exploit this spatial indeterminism to bestow a distributed architecture with structural features not possible with holographic or free-space systems, as discussed in Section IV-A.

Fig. 1. Functional model of a spiking neural network, depicting four neurons. Each neuron has one output signal, which is sent to multiple other neurons. Input signals are independently weighted by an analog coefficient (represented by grayscale value) before summation. The summed signal drives a dynamical processing model, such as spiking leaky integrate-and-fire (represented by the phase portrait of an excitable system).

III. SYSTEM ARCHITECTURE

The proposed architecture for photonic spike processing consists of three aspects: a protocol, a node that abides by that protocol, and a network medium that supports multiple connections between these units. Broadcast-and-weight is a WDM protocol in which many signals can coexist in a single waveguide and all participant units have access to all the signals. The processing-network node (PNN) is a primitive unit that performs the physical and logical functions required for broadcast-and-weight networking and neuromorphic processing, respectively. The broadcast loop (BL) defines the medium in which a broadcast network exists and physically links a group of PNNs to one another. Although the authors have made every attempt to present these aspects in a linear fashion, they are logically intertwined; a more thorough discussion of design justifications is deferred until after the aspects are presented.

In every neural network model, each node receives signals from many other nodes, performs some process, and transmits copies of a single output signal to multiple receiver neurons (see Fig. 1). Each input is modulated independently by a constant multiplier (a.k.a. weight), which can be positive, negative, or zero. After weighting, all inputs to the neuron are summed, before modulating a nonlinear dynamical element: in this case, a laser neuron device. The configuration of the system is determined by its weight matrix, where element $w_{ij}$ signifies the strength of the connection from neuron i to neuron j. A single transmission device cannot alter the polarity of a signal represented as optical power, so effective neural weighting requires two optical filters per channel dropping power into a balanced push-pull photodetector in order to implement both positive and negative weights. A processor can exhibit a large variety of behaviors through reconfiguration of the weight matrix, although this weight tuning happens on timescales much slower than spiking dynamics. The problem of neural networking contains prominent one-to-many (multicast) and many-to-one (fan-in) components. In the case of spiking networks, communication signals are pulses: binary in amplitude and asynchronous in time. For interconnecting signals with spikes represented as physical pulses (as opposed to digital packets as in AER), temporal multiplexing and switch-based routing techniques are not viable strategies because spike timing is an informatic dimension unavailable for multiplexing. The goal of our network design will be to support a large number of parallel, asynchronous, and reconfigurable connections between a distributed group of photonic processing primitives, in a way that is compatible with spikes represented physically as optical pulses.
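
The neuron model described above (independent weighting, summation, then a thresholding dynamical element) can be sketched in a few lines of Python. This is a minimal discrete-time leaky integrate-and-fire illustration, not the laser dynamics of the proposed PNN; the function name `lif_step` and the `leak` and `threshold` parameters are assumptions introduced here for illustration. The weight indexing follows the text: `weights[i][j]` is the connection strength from neuron i to neuron j.

```python
def lif_step(state, spikes_in, weights, leak=0.9, threshold=1.0):
    """Advance N leaky integrate-and-fire neurons by one time step.

    state[j]      : integration variable of neuron j before this step
    spikes_in[i]  : 1 if neuron i emitted a spike on the previous step, else 0
    weights[i][j] : strength of the connection from neuron i to neuron j
                    (positive, negative, or zero)
    leak, threshold are hypothetical parameters chosen for illustration.
    """
    n = len(state)
    new_state, spikes_out = [], []
    for j in range(n):
        # Weighted sum of all inputs arriving at neuron j (many-to-one fan-in).
        drive = sum(weights[i][j] * spikes_in[i] for i in range(n))
        s = leak * state[j] + drive
        if s >= threshold:
            spikes_out.append(1)   # fire a single spike
            new_state.append(0.0)  # reset the integration variable
        else:
            spikes_out.append(0)
            new_state.append(s)
    return new_state, spikes_out


weights = [[0.0, 0.6, -0.5],
           [0.6, 0.0, 0.0],
           [0.0, 0.0, 0.0]]
state, spikes = lif_step([0.0, 0.5, 0.2], [1, 0, 0], weights)
print(spikes)
```

In this toy run, the spike from neuron 0 pushes neuron 1 over threshold through an excitatory weight and pulls neuron 2 down through an inhibitory one. Reconfiguring `weights` changes the behavior without touching the neuron model, which mirrors the role of the weight matrix in the text.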

A. Broadcast-and-Weight

WDM channelization of the spectrum is one way to efficiently use the full capacity of a waveguide, which can have usable transmission windows up to 50 nm wide (>1 THz bandwidth) [54]. In fiber communication networks, a WDM protocol called broadcast-and-select can create many potential connections between nodes: the active connection is selected, not by altering the intervening medium, but rather by tuning a filter at the receiver to drop the desired wavelength [55]. We present a similar protocol for a spike processing network and call it “broadcast-and-weight.” It differs by allowing multiple inputs to be dropped simultaneously and with intermediate strengths between 0% and 100%.

Broadcast-and-weight consists of a group of nodes sharing a common medium in which the output of every node is assigned a unique transmission wavelength and made available to every other node (see Fig. 2). Each node has a tunable spectral filter bank at its front-end. By tuning continuously between 0% and 100% drop states, each filter drops a portion of its corresponding wavelength channel, thereby applying a coefficient of transmission analogous to a neural weight. The filters of a given receiver operate in parallel, allowing it to receive multiple inputs simultaneously. An interconnectivity pattern is determined by the local states of filters and not a state of the transmission medium between nodes. Routing in this network is transparent, parallel, and switchless, making it ideal to support asynchronous signals of a neural character.

Fig. 2. Optical broadcast-and-weight network showing parallels with the neural network model of Fig. 1. An array of source lasers outputs distinct wavelengths (represented by solid color). These channels are wavelength multiplexed (WDM) in a single waveguide (multicolor). Independent weighting functions are realized by spectral filters (represented by gray color wheel masks) at the input of each unit. Demultiplexing does not occur in the network. Instead, the total optical power of each spectrally weighted signal is detected, yielding the sum of the input channels. The electronic signal directly drives a laser processing device, such as the excitable laser proposed in [32].

The ability to control each connection, each weight, independently is critical for creating differentiation amongst the processing elements. A great variety of possible weight profiles allows a group of functionally similar units to compute a tremendous variety of functions despite sharing a common set of available input signals. Reconfiguration of the filters’ drop states, corresponding to weight adaptation or learning, intentionally occurs on timescales much slower (μs or ms) than spike signaling (ps). A reconfigurable filter could, for example, be implemented by a microring resonator whose resonance is tuned thermally or electronically. In a group of N nodes with N wavelengths, each node needs a dedicated weighting filter for all (N − 1) possible inputs plus one filter at its own wavelength to add its output to the broadcast medium. The total number of filters in the system would thus scale quadratically, as $N^2$. A filter design example is given in Section III-D.

B. Processing-Network Node

In a biological neural network, the complicated structure of physical wires (i.e., axons) connecting neurons largely determines the network interconnectivity pattern, so the role of neurons is predominantly computational (weighted addition, integration, thresholding). The contrasting all-to-all nature of optical broadcast saddles the photonic neuron primitive units with additional responsibilities of network control (routing, wavelength conversion, WDM signal generation, etc.).

The proposed design of a PNN can perform all of these necessary functions, achieving compactness by flattening the dual roles of processing and networking into a single set of devices. It attains rich computational capabilities by leveraging analog physics offered by optoelectronics. Overall, the PNN is an unconventional repurposing of conventional optoelectronic devices, thereby appearing as a strikingly simple circuit with potential to generalize to existing—and prospective—photonic platforms. One possible implementation of a PNN is depicted in Fig. 3, while the dual purposes of its devices are summarized in Table I.

The PNN interacts with a WDM waveguide via two tunable filter banks. One filter bank represents the weights of excitatory (positive) input connections while the other controls inhibitory (negative) inputs. These weight profiles could be stored in local co-integrated or off-chip CMOS memory. The two weighted (i.e., spectrally filtered) subsets of the broadcast channels are dropped—without demultiplexing—to a balanced photodiode pair. Photodetectors output a current that represents total optical power, thus computing the weighted sum of WDM inputs in the process of transducing them to an electronic signal, which is capable of modulating a laser device. The balanced photodiode configuration enables inhibitory weighting, which is an essential capability of any neural network.
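
As a rough illustration of the weighting and summation path described above, the following Python sketch maps filter detuning to a drop fraction and subtracts the two photocurrents of a balanced pair. It borrows the Lorentzian drop response that the paper gives later in Eq. (1); the function names, the `responsivity` parameter, and the assumption that both filter banks see the same channel powers are simplifications introduced here, not part of the proposed design.

```python
def drop_fraction(detuning):
    """Lorentzian drop response T(delta) = 1 / (1 + delta^2).

    detuning is the filter offset from its channel, in linewidths.
    0 linewidths gives a 100% drop; larger detuning approaches 0%.
    """
    return 1.0 / (1.0 + detuning ** 2)


def balanced_photocurrent(channel_power, exc_detuning, inh_detuning,
                          responsivity=1.0):
    """Net current from a balanced photodiode pair (hypothetical helper).

    Each photodiode sums total optical power across wavelengths, so the
    output is a weighted sum with net weight (w_exc - w_inh) per channel.
    Assumption of this sketch: both banks see the same input powers.
    """
    total = 0.0
    for p, d_exc, d_inh in zip(channel_power, exc_detuning, inh_detuning):
        total += (drop_fraction(d_exc) - drop_fraction(d_inh)) * p
    return responsivity * total


# Three WDM channels: one excitatory, one inhibitory, one cancelled out.
powers = [1.0, 2.0, 0.5]
current = balanced_photocurrent(powers,
                                exc_detuning=[0.0, 4.4, 1.0],
                                inh_detuning=[4.4, 0.0, 1.0])
print(current)
```

With a maximum detuning of 4.4 linewidths (the tuning range quoted later in the paper), a fully detuned filter still drops about 5% of its channel, so net weights in this sketch span roughly ±0.95 rather than ±1.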


TABLE I

Element                Process Function        Network Function
Adaptive filter bank   Weight multiplication   WDM drop-and-continue
Photodetector          Addition/subtraction    Multiwavelength fan-in
Gain medium            Temporal integration    Laser modulation
Excitable laser        Threshold detection     Clean pulse generation
Output coupler         -                       WDM add

Fig. 3. Processing-network node (PNN) coupled to a broadcast waveguide. The front-end consists of two banks of continuously tunable microring drop filters that partially drop WDM channels that are present. Two waveguide integrated photodetectors (PDs) convert the optical signal to an electronic current and perform summation operations on the weighted excitatory and inhibitory inputs. A short wire subtracts these photocurrents and modulates current injection into an excitable laser neuron, which performs threshold detection and pulse formation in an optical cavity. The output of the laser is coupled back into the broadcast waveguide and sent to other PNNs. Insets represent example spectrograms of the waveguides. (a) Broadcast waveguide with 6 WDM channels; (b) three of these channels are shown partially dropped into the excitatory PD, and (c) two other channels are shown partially dropped into the inhibitory PD. The channel subsets that are dropped are determined by the tuning state of each filter (driving circuitry not shown).

Total optical power detection of a still multiplexed signal is a relatively rare technique because it irreversibly strips WDM signals of any trace of their identifying wavelength. This property has been exploited in several applications including subcarrier optical multiplexing [56], a multi-input OR function [57], and analog RF photonic signal processing [58]; nevertheless, it is counterproductive in the majority of situations. Information about a signal’s origin is desirable in multiwavelength communication systems and is maintained by demultiplexing prior to detection. In the neurocomputing context, however, this destruction of channel information corresponds precisely to the summation function. A photodiode can therefore be viewed in this sense of dual purpose, not just as a transducer, but also as an additive computational element capable of many-to-one wavelength fan-in.

The PNN front-end is not subject to well-known optical-electronic-optical (O/E/O) conversion overhead. The cost, energy, and complexity typically involved in O/E/O are due not, in fact, to the physical transduction itself but instead to the electronic receiver stages (i.e., amplification, sampling, and quantization) that normally follow detection in fiber communication links [40]. The “receiver-less” pathway connecting photodiodes to the laser neuron is not significantly affected by dispersion or electromagnetic interference (EMI) in this case because it can be made very short (∼20 μm) regardless of fan-in degree.

The electronic signal from the balanced photodetector pair modulates a laser processor, which performs some dynamical and strongly nonlinear process, described in more detail in [31], [32]. The modulated laser gain medium is an active optical semiconductor, which acts as a subthreshold temporal integrator with a time constant equal to the carrier recombination lifetime. The laser system itself acts as a threshold detector, rapidly dumping energy stored in the gain medium into the optical mode when the net gain of the cavity crosses unity, much like a passively Q-switched laser biased below threshold. In this way, it emulates one of the most critical dynamical properties of a spiking neuron—excitability—on picosecond timescales. Although the possibility of WDM was not explicitly discussed in prior work, the lasing wavelengths of an array of excitable distributed feedback lasers could be tailored by altering the pitch of their gratings [59].

By generating clean, stereotyped pulses at a single wavelength, the laser provides the optical signal necessary for broadcast-and-weight networking. All light can be generated and detected on-chip. In addition, excitable lasers effectively provide gain, since large pulse responses can be triggered by weak input pulses. If excitable gain is sufficient to counteract insertion and fan-out losses, this means that, in principle, active optoelectronics would not be necessary outside of the PNN module.

Finally, an output coupler adds the generated signal to the broadcast waveguide. Other wavelengths are nominally unaffected by this coupler, but any incoming signals at the PNN’s assigned wavelength will be completely dropped and terminated, avoiding collision with the newly generated output.

C. Broadcast Loop

The final aspect of the proposed networking architecture is the physical medium that transports WDM optical signals between the output couplers and input spectral filter banks of a group of PNNs. Since routing is already performed by the PNN filters, the broadcast medium must simply implement an all-to-all interconnection, supporting all $N^2$ potential—not necessarily actual—connections between participating units. This role can be performed by a single integrated waveguide with ring topology, which we refer to as a BL. A broadcast-and-weight cell thus consists of several PNN primitives coupled to a BL medium, as illustrated in Fig. 4. Its ring shape is reminiscent of metropolitan fiber networks, though the neuromorphic processing implications of the BL are worthy of further consideration.

The BL waveguide is fully multiplexed at all points along its length. Most signal power is allowed to continue through a PNN, even if a portion of it is dropped. This technique, called drop-and-continue, is an instance of lightpath splitting, where the information carried by an optical channel can be copied passively and instantaneously, albeit with a reduction in power [60]. The weight-dependent signal power distribution of drop-and-continue does create an undesirable interdependency between
lasing over a 45 nm band (1525–1570 nm) [61]. The trans-

fer function of a resonator drop filter is approximated by a
Lorentzian function:
1 Q
T (δ) = 2
, where δ = (ω − ω0 ) (1)
1+δ ω0
in which δ is linewidth-normalized frequency, Q is quality fac-
tor (Q ≈ 10, 300), and ω0 is peak center frequency. The ex-
tinction ratio of a single filter is R = Tm ax [dB] − Tm in [dB] =
T (0)[dB] − T (ωtun )[dB], where ωtun is the maximum tuning
range in linewidths. For broadcast-and-weight, analysis of cross-
talk must also take into account that the resonant frequency of
each filter is shifted in order to control the weight applied to its
channel. The worst case cross-talk Xij (defined as in [54]) is not
identical between upper and lower neighbors because the filter
resonance moves towards longer wavelength channels when detuned from center. For the jth neighbor: $X_{j0} = T(j\Delta\omega)[\mathrm{dB}] - T(0)[\mathrm{dB}]$ and $X_{0j} = T(\omega_{\mathrm{tun}} - j\Delta\omega)[\mathrm{dB}] - T(0)[\mathrm{dB}]$, where $\Delta\omega$ is channel spacing. Insertion loss on the ith channel is $I = 1 - R^{-1} \cdot \prod_{j < i}(1 - X_{ij}) \cdot \prod_{j > i}(1 - X_{ji})$, where I, R, and X are in linear units.

For a specified tolerable performance of R > 13 dB, $\{X_{j0}, X_{0j}\}$ < −13 dB, and I < 0.35 dB, we find WDM parameters of $\omega_{\mathrm{tun}}$ = 4.4 (0.66 nm) and $\Delta\omega$ = 8.8 (1.3 nm) meet this specification. With the 45 nm gain band of hybrid III-V/SOI lasers, these parameters lead to a channel number N = 34 per BL. The approximate footprint of a single filter bank in this case is 34 × 16 ≈ 540 μm², compared to ∼4,000 μm² for the active devices in a single PNN. The corresponding BL footprint is 34 × 4,540 ≈ 0.15 mm². The BL waveguide must be at least 34² × 4 μm ≈ 4.6 mm long to physically accommodate this number of filters, contributing a minimum power penalty of about 0.4 dB, given SOI waveguide loss [62]. We have made the simplifying assumptions that every connection has a dedicated tunable MRR filter, that these filters are all critically coupled to the bus waveguide, and that they are single-pole (i.e., single-MRR). Further investigations that depart from these simplifying assumptions could likely improve performance and maximum number of channels (for example, using double-pole MRRs for steeper filter rolloff [54]).

Power budget is also a very important design consideration; however, the analysis of noise and signal power in conventional digital interconnects cannot be mapped trivially to the present system. Although similar noise mechanisms are present (e.g., ASE, cross-talk, etc.), the relationship between SNR and spike error rate in an optical spiking link requires further investigation. For a full system design, the tolerance of overall system function for communication errors must also be specified. This tolerance is application-dependent, but likely relaxed compared to digital systems, due to the statistical and intrinsically noisy nature of neuromorphic algorithms.

Fig. 4. Conceptual diagram of a broadcast loop (BL). The loop waveguide carries WDM channels from all participating PNNs, so each PNN can detect a configurable subset of all channels. The PNN laser then outputs its signal, a function of those inputs, on its unique wavelength channel. Once a signal traverses the BL, it is completely dropped and terminated by its originating unit to avoid interference between different parts of a channel. Filter banks and inhibitory pathways not shown.

filter weights at different neurons, which could present a control problem in adaptive systems. Drop-and-continue is a physical solution to optical multicasting that can radically reduce network traffic for a given virtual interconnect density [34]. In the BL, this technique reaches its maximum potential, supporting N² independent interconnections in a waveguide with only N channels.

An example of a folded layout for tight packing is shown in Fig. 5. Multiple BLs integrated on the same chip could interact by simply designating interfacial PNNs: nodes that receive inputs from one BL and transmit into another (bottom of Fig. 5). In this way, a unified processing system consisting of multiple BLs can be created without any additional arbitration, routing, or device technology. BLs interacting via interfacial PNNs constitute distinct broadcast media and can thus reuse the same optical spectrum, much like a cellular telephone network reuses spectrum geographically. Unlike a cellular phone network, however, the operation of these broadcast media is dissociated from their exact geometry, as long as the loop topology is present. The associated spatial freedoms will be seen to yield a promising variety of multi-BL architectures (see Section IV-A).

D. Design Example

The design of tunable filter banks for WDM weighting can proceed similarly to that of wavelength demultiplexers based on microring resonators (MRRs) in conventional digital interconnects. In [54], the FSR-limited maximum wavelength count for a silicon WDM link was found to be N = 62 for a transmission window of 50 nm and channel spacing of $\Delta\lambda_0$ = 5.3 linewidths (0.8 nm). Heterogeneous integration platforms incorporating III/V active sections and passive silicon-on-insulator (SOI) waveguides have demonstrated broadband photodetector responsivity and, with proper design, single-mode

The broadcast-and-weight protocol is a novel approach for combining neuromorphic processing and optical networking, based on deep-seated correspondences between the chosen models of processing and networking. This combination gives rise to novel properties that are native neither to optics nor to neuroscience. Optical WDM in a waveguide gives the architecture
special spatial freedoms, which are not observed in other hard-
ware neuromorphic systems. These freedoms will be discussed
with respect to their practical consequences to layout and organi-
zational flexibility. The spiking paradigm has reciprocal effects
on optics as an information processing substrate. We find that
many of the common challenges of robustness, cascadability,
and scalability faced by conventional optical logic architectures
can be addressed, a possibility largely attributable to the uncon-
ventional paradigm.
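As a numerical aside on the design example of Section III-D, the following minimal sketch re-evaluates the quoted figures from the Lorentzian model of Equation (1). The per-filter area (16 μm²) and per-filter pitch (4 μm) are assumptions inferred from the arithmetic in the text, and the variable names are illustrative only.

```python
import math


def lorentzian(delta):
    """Drop-port power transmission T(delta) = 1 / (1 + delta^2), as in Eq. (1)."""
    return 1.0 / (1.0 + delta ** 2)


def to_db(ratio):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(ratio)


# Values quoted in the design example.
OMEGA_TUN = 4.4             # maximum tuning range, in linewidths
DELTA_OMEGA = 8.8           # channel spacing, in linewidths
GAIN_BAND_NM = 45.0         # hybrid III-V/SOI laser gain band
CHANNEL_SPACING_NM = 1.3    # channel spacing in wavelength units
PNN_ACTIVE_AREA_UM2 = 4000.0

# Assumed per-filter figures, inferred from the "34 x 16" and "34^2 x 4 um" arithmetic.
FILTER_AREA_UM2 = 16.0
FILTER_PITCH_UM = 4.0

# Extinction ratio R = T(0)[dB] - T(omega_tun)[dB].
extinction_db = to_db(lorentzian(0.0)) - to_db(lorentzian(OMEGA_TUN))

# Nearest-neighbor (j = 1) cross-talk for the two detuning directions.
x_j0_db = to_db(lorentzian(1 * DELTA_OMEGA)) - to_db(lorentzian(0.0))
x_0j_db = to_db(lorentzian(OMEGA_TUN - 1 * DELTA_OMEGA)) - to_db(lorentzian(0.0))

# Channel count and footprint estimates.
n_channels = math.floor(GAIN_BAND_NM / CHANNEL_SPACING_NM)
filter_bank_area_um2 = n_channels * FILTER_AREA_UM2
bl_footprint_mm2 = n_channels * (PNN_ACTIVE_AREA_UM2 + filter_bank_area_um2) / 1e6
waveguide_length_mm = n_channels ** 2 * FILTER_PITCH_UM / 1e3

print(f"R = {extinction_db:.2f} dB, X_j0 = {x_j0_db:.2f} dB, X_0j = {x_0j_db:.2f} dB")
print(f"N = {n_channels}, filter bank = {filter_bank_area_um2:.0f} um^2")
print(f"BL footprint = {bl_footprint_mm2:.3f} mm^2, waveguide = {waveguide_length_mm:.2f} mm")
```

Under these assumptions, the neighbor toward which a filter is detuned sets the worst case (about −13.1 dB, versus about −18.9 dB on the other side), which matches the asymmetry between upper and lower neighbors noted in the design example.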

A. Multi-BL System Layout and Organization

A means to interface different BLs was initially introduced in Fig. 5 to reuse spectrum on-chip. Although PNNs in different loops can interact indirectly via interfacial PNNs, a multi-BL system does not exhibit the same all-to-all potential interconnection observed in a single BL. This could cause informatic fragmentation and bottlenecks between different parts of a system with many interfaced BLs, effectively neutralizing the computational usefulness of scaling the node count. We argue that interconnect sparsity resulting from spectral reuse is not necessarily detrimental to overall computational complexity, provided design can follow appropriate principles. When determining structural constraints in distributed processing networks, communication and computation become fundamentally intertwined, so design rules for organizing multi-BL architectures must shift to invoke concepts outside of the field of communication networks. We find that the ability to incorporate these distributed processing principles in an optical system is made possible by a special topological property of broadcast-and-weight, which we call spatial layout freedom.

Fig. 5. Example folded layout of a broadcast-and-weight cell showing 5 PNNs (delimited by green areas) and two interfacial PNNs (blue areas) coupled to a contained BL (tan area). The lightpath of one channel (magenta) is shown traversing the BL waveguide and branching into multiple filter banks. Originating and terminating in the leftmost PNN, this signal can be partially dropped into any of the PNNs around the BL. Each processing node must transmit on a unique wavelength channel, except the outgoing interfacial node (lower right), which transmits into a different loop. Each node’s filter bank drops a linear superposition of the present channels, except the incoming interfacial node (lower left), whose inputs are derived from another loop. Inhibitory pathways not shown.

1) Spatial Layout Freedom: A BL waveguide can manifest arbitrary shape in order to accommodate any layout of a group of PNNs; this stipulation contrasts with nearly all other approaches to physical neuromorphic architectures (e.g., cross-bar arrays or holographic matrix-vector multipliers), where the layout of computational primitives follows from the particular parallel networking approach. In a situation where signals are distinguished based on their position, wire, or wavevector, physical layout inherits the geometrical constraints of the interconnect, which can give rise to tangible limitations to interconnect structure (e.g., Rent’s rule [63]). Biology can avoid multiplexing altogether by using dedicated wires (i.e., axons) for every connection. However, this 3-D approach is not possible with state-of-the-art quasi-2-D fabrication techniques. While the exact implications of this dimensional disparity are beyond the current scope, one can assert provisionally that any conservation of spatial degrees of freedom could be supremely important in neuromorphic engineering.

In the broadcast-and-select protocol, spatial degrees of freedom are essentially undetermined: node identity is distinguishable based on wavelength alone. In addition, the large bandwidth-distance product of optical waveguides means the corrupting role of dispersion remains small over a range of spatial scales, compared to electrical transmission lines [64]. Although WDM and bandwidth-distance properties of optics have been used for decades in communication networks, distributed processing consequences of spatial indeterminacy have not been explored. This is not a matter of oversight, but rather context. Fiber telecom networks transport signals between geographic locations, a purpose intrinsically tied to space. On the other hand, processing networks transport signals between a group of computational nodes; it makes no essential difference where its nodes or its signals are located. At any spatial scale, BL implementation relies on an identical device repertoire (i.e., filters, photodetectors, and excitable lasers), with the exception perhaps of bus waveguide amplifiers that are needed to counter distance-dependent loss in a silicon waveguide. Spatial invariance in multiplexing protocol, signal transmission, and device technology, in the context of distributed processing, results in the possibility to implement interesting and important structures in multi-BL architectures.

Fig. 6 illustrates a multi-BL structure, demonstrating key features of hierarchical organization. Each BL reuses the same spectrum and WDM channelization, but can represent different hierarchical levels of organization. A level-1 BL interfaces with other level-1 BLs (via “lateral” PNNs) and a level-2 BL (via “uplink” and “downlink” PNNs). Interfacial PNNs can be thought of as regular PNNs whose input spectral weight bank receives the broadcast signals of a different BL (Fig. 5). While similar in some ways to routing interfaces in conventional optical communication networks (which can also have hierarchical organizations), the PNN interfaces are spike processors that intrinsically transform information while transporting it. As a
result of the processing done in PNN interfaces, network nodes in a given BL cannot directly send their outputs to nodes in other BLs, and multi-BL systems can no longer implement all-to-all interconnects. Instead of attempting to faithfully transfer any one signal from one BL to another, the PNN interfaces create mutual informatic relationships that extend beyond BL boundaries. At the same time, PNN interfaces do not experience additional buffering or wavelength allocation constraints, and the BL communication load is constant across different levels of the hierarchy instead of growing exponentially as in pure communication networks.

Fig. 6. Hierarchical organization of the waveguide broadcast architecture showing a scalable modular structure. Colored rectangles represent PNNs. Green PNNs indicate input and output coupling to the same broadcast loop. Blue PNNs interface between distinct BLs and are classified as “uplink,” “downlink,” or “lateral” varieties based on their position in the hierarchy. Each transmitting PNN has a unique output wavelength within its given broadcast space, but spectrum is reused between different BLs.

Fig. 7 shows a layout that corresponds to the network diagram of Fig. 6. The lowest level is a tightly packed group of computational primitives connected by a folded loop (see Fig. 7(c)). Some computational primitives can interface with other loops, either directly with nearby first level loops, or with a second level loop that connects physically distant components on the chip-scale. The second level loop (see Fig. 7(d)) has a similar functionality compared to the first level, but it occupies a much larger area and represents a more complex dynamical processing network. Although the chip scale corresponds to just the second level in this example, intermediate levels on chip are entirely possible. Continuing in this direction of hierarchical levels, a multi-chip system based on fiber loops (see Fig. 7(e)) could be considered. Interfacing multiple optoelectronic chips all-optically through a transparent fiber-waveguide path represents an interesting possibility for further investigation.

Spatial layout freedom can be viewed as a powerful tool to combat the sparse interconnection constraints inherent in multi-BL spectral reuse and allow a wide potential variety of system organization. However, determining particular multi-BL organizations and the number of PNNs allocated at each interface represent significant design challenges. Design parameters that impact network structure fundamentally exceed pure communication theory and must invoke theories of distributed computation, such as complex network topology and cortical

2) Organization Principles for Multi-BL Architectures: Developments in complex network theory have recently been applied to understand aspects of structure, organization, and collective dynamics in cortical networks [65], and insights from this field could be used to guide multi-BL system design. Complex network theory describes relationships between interconnection patterns (i.e., graph topology) and dynamic functionality in distributed systems, which contrasts with the study of information capacity in static states or isolated communication channels. While the goal of neuron-inspired processing should not be perfect emulations of biological networks, the study of cortical connectomics (i.e., biological neural network structure) also provides examples of the types of topological features that may be relevant for processing tasks in neuron-inspired systems. Important aspects can be judged with the tools of complex network science and connectomics, which enable the abstraction of relevant metrics of informatic and computational complexity in distributed systems.

For example, a complex network metric called “small-worldness” describes some networks that lie between an ordered and random interconnectivity pattern. “Small-worldness” is engendered by both high clustering coefficient (i.e., cliquishness) and short average path length (i.e., sparse long-range connections) [66]. In complex systems, small-world networks have been associated with dynamical complexity [67] and information integration over multiple spatial scales [68]. Small-world characteristics are also observed in anatomical networks, ranging from the simplest animal nervous system (C. elegans), to mammalian cortex, which has a consistently modular and hierarchical organization throughout [69].

These biological and mathematical insights could provide evidence to guide organizational design principles of neuromorphic processing systems. Spatial layout freedom means a BL can fully interconnect a tightly packed group of processing nodes, or it can run over an entire chip area. This coexistence of large fan-in and long-range connections is a physical correlate of the simultaneous clustering and short path lengths that typify small-world networks.

In order to realize small-world topological properties in an artificial neural network, an interconnect implementation must support connections over a range of spatial scales. Electrical wires exhibit a bandwidth-distance-energy tradeoff that impedes this goal [64]. Systems based on spatial multiplexing in holograms or cross-bar arrays cannot be easily detached from a characteristic length (e.g., diffraction length) and have very little flexibility or potential to scale hierarchically. Spatial layout freedom as described above could grant the flexibility required to meet these goals, making broadcast-and-weight architectures uniquely suited, among artificial hardware systems, to explore computationally efficient and biologically-relevant network topologies.
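To make the two small-world ingredients concrete, the following is a minimal sketch that computes the average clustering coefficient and mean shortest path length for a toy multi-BL graph. The graph model is an assumption made purely for illustration: each BL is treated as an undirected all-to-all clique, lateral interfacial PNNs are modeled as single edges between neighboring loops arranged in a ring, and an optional level-2 BL is modeled as a clique among one "uplink" node per loop. The function names are hypothetical and the model ignores the directed, weighted nature of real PNN connections.

```python
from collections import deque
from itertools import combinations


def add_edge(adj, u, v):
    """Add an undirected edge between nodes u and v."""
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def add_clique(adj, nodes):
    """Fully interconnect a group of nodes (an idealized single BL)."""
    for u, v in combinations(nodes, 2):
        add_edge(adj, u, v)


def build_multi_bl(n_loops, loop_size, with_level2):
    """Ring of level-1 BLs (loop_size >= 3), optionally joined by a level-2 BL."""
    adj = {}
    for b in range(n_loops):
        add_clique(adj, [(b, i) for i in range(loop_size)])
        # Lateral link: last node of loop b connects to first node of the next loop.
        add_edge(adj, (b, loop_size - 1), ((b + 1) % n_loops, 0))
    if with_level2:
        # Node 1 of every loop acts as an uplink/downlink into a shared level-2 BL.
        add_clique(adj, [(b, 1) for b in range(n_loops)])
    return adj


def average_clustering(adj):
    """Mean fraction of each node's neighbor pairs that are themselves connected."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def mean_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


flat = build_multi_bl(n_loops=8, loop_size=6, with_level2=False)
hier = build_multi_bl(n_loops=8, loop_size=6, with_level2=True)
for name, graph in (("lateral only", flat), ("with level-2 BL", hier)):
    print(name, average_clustering(graph), mean_path_length(graph))
```

In this toy model, adding the level-2 loop shortens the mean path length while clustering stays high, which is the qualitative combination that the small-world description calls for. Whether real multi-BL systems behave this way depends on design rules that remain open.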
Based on qualitative similarities between organizational abil-
ities of multi-BL systems and principles of complex and cortical
networks, we have hypothesized that the proposed architecture is
capable of enacting salient processing structures. Further inquiry
into multi-BL architecture design must incorporate principles of
complex network theory, likely including, but not limited to, the
idea of small-worldness. Concretization of the corresponding
design rules represents a formidable research problem, which
lies in the intersection of linear lightwave networks and complex
system science.

B. Feasibility of Photonic Processing With Spikes

In this section, we will briefly consider how three aspects
of practical feasibility (cascadability, robustness, and scalabil-
ity) in photonic processing are impacted by adopting a spike
processing paradigm. Cascadability is the ability of a computa-
tional element to drive multiple stages of similar devices with
fidelity in the presence of noise. Robustness refers to a system’s
potential to mitigate the effect of device defects—inevitable in
large-scale integration—on overall functionality. Scalability is an architecture’s capacity to increase in size and complexity, which requires a system format able to accommodate modular expansion without performance degradation.

Fig. 7. An example layout strategy for a hierarchical network demonstrating the scale-independent nature of a waveguide BL. Computational primitives are classified as (a) interfacial PNN, whose output is coupled into a different BL waveguide than its inputs and (b) non-interfacial PNN, which transmits and receives in the same BL. (c) A broadcast-and-weight network constitutes the first level of hierarchy and consists of a group of potentially all-to-all connected computational primitives. In this case, it takes a folded shape for the sake of packing efficiency. (d) A chip-scale second-level broadcast network interconnects the interfacial PNNs from many first-level BLs. First-level BLs can also interface directly via lateral interfacial PNNs (purple dotted lines). (e) A multi-chip third-level network illustrating a compatibility with fiber implementations of a BL. The broadcast-and-weight network is conceptually the same as in other levels, but the BL waveguide consists of coupled fibers and integrated waveguides.

1) Cascadability: In digital electronic design, a logic gate needs power gain to fan-out to multiple other gates, and it must have logic-level restoring behavior to suppress noise. These conditions usually imply cascadability in electronics, yet a more multifaceted notion of cascadability applies to an optical device due to the extra dimension of wavelength (or phase). This extra degree of freedom can be a major boon to functionality in an optical system (e.g., WDM) but can introduce vulnerabilities to new sources of uncertainty (e.g., wavelength drift). In particular, systems that exploit WDM can suffer from a need for wavelength conversion.

The proposed PNN co-integrates the complementary physics of optics and analog electronics in order to address cascadability issues in WDM. The PNN curtails propagation of phase/wavelength noise from one stage to another by interleaving optical representations with an analog electronic part of the primary signal pathway. The photodiode-laser setup “converts” information from multiwavelength inputs onto a single wavelength output, physically capable of driving other PNNs. However, total power detection for wavelength fan-in is inseparable from an analog summation function. While this effect would corrupt channel information in digital signals, the summation corresponds precisely to the weighted summation in models of neuromorphic processing.

All-or-nothing output quantization is critical in spiking paradigms because the significant amount of analog processing is vulnerable to amplitude noise. The excitable laser employs cavity-mediated optoelectronic interactions to realize spiking dynamics at ultrafast timescales, which allow it to perform hybrid analog-digital information transformations in a small footprint [31]. These dynamics, shared by spiking biological and analog CMOS neurons, prevent noise generated in analog portions of the pathway from propagating through the system and eventually corrupting signal integrity. Fan-out can pose a problem to optical processors because splitting is accompanied by an N-fold reduction in signal power. This loss could be counterbalanced by laser excitable gain, in that small input pulses can trigger the release of a much larger quantity of stored energy, or with additional waveguide amplifiers in the BL.

Spikes carry information predominantly in their timing, so time skew has the potential to corrupt signals. The authors of [70] noted that differences in electronic and optical signal transmission can cause timing requirements that make the leap from combinatorial logic to sequential logic highly nontrivial. However, since synchrony is not a critical aspect of the spike
processing paradigm, strict conformity of timing parameters is not necessary. The asynchronous nature of broadcast-and-weight provides a mechanism to perhaps even exploit heterogeneity in spike timings in order to implement advanced spatiotemporal algorithms, such as [71]. On the other hand, the effect of noise on pulse timing (i.e., jitter) is relevant in determining spike precision and channel capacity.

Fig. 8. System failure rate as a function of the number of nodes comparing a conventional hard-wired circuit (blue dash-dot line, Equation (2)) to broadcast-and-weight systems with varying amounts of hardware overhead (7%: green circle, 9%: red triangle, 11%: cyan square, and 13%: magenta cross). The exact failure rate of the BLs (markers, Equation (4)) differs from the approximate error function curves (solid lines, Equation (6)) due largely to integer rounding. Even though the hard-wired system is shown here with nodes of 10 times higher reliability ($5 \cdot 10^{-3}$ versus $5 \cdot 10^{-2}$ failure rate), the systemic reliability of a BL can be much greater than the hard-wired system and even the reliability of a single element (black dotted line).

2) Robustness: Suppose a given distributed processing task requires n computational primitives. Each device has some fixed reliability: the probability that it will work successfully, $p_{\mathrm{succ}}$. Since the system requires all devices working, its failure rate is given by

$$P_{\mathrm{fail}} = 1 - p_{\mathrm{succ}}^{n}. \qquad (2)$$

Systemic failure rapidly approaches certainty as the system size (i.e., node count) increases. This unreliability is particularly important for large-scale integrated systems since a defective transistor or laser device cannot simply be replaced after the fact. Robustness can be improved by increasing device yield, a strategy that is not always practicable, or by incorporating hardware redundancy called overhead. It is impossible to know ahead of time which devices will fail, so overhead must cover every possible failure, even though each is highly unlikely. If each primary device is given a backup device (100% overhead), the majority of overhead hardware will remain unused, and a joint failure of both primary and backup devices could still disable the system. More sophisticated ways of incorporating redundancy based on coding theory can be applied in special cases, but no general code theoretic approach to robustness in Boolean systems has yet been identified [72].

The broadcast-and-weight network can easily incorporate small amounts of hardware overhead. Since all PNNs have access to all signals in a single BL, they can be swapped interchangeably in the event of device defect or death. The PNNs are functionally similar, so any unused PNN can virtually swap its interconnection relationships with any defective PNN by exchanging filter bank weights. Overhead PNNs therefore do not back up a single primary PNN, but rather cover all possible failures in the BL. Virtual swapping through reconfiguration can react to specific failures that occur either during fabrication or in the field. Programming a reconfiguration to avoid defects can be very energy and computation hungry in some systems (e.g., field-programmable gate arrays) due to the intensive problems of placement and routing associated with mesh networks [73]. In contrast, a broadcast network has no corresponding constraint in mapping automata to devices, trivializing the hardware optimization problem.

The ability to easily swap the role of every hardware primitive means that system success now requires any n processors to work out of a total of m = (1 + a)n PNNs in the BL, where a is overhead ratio. If the number of working PNNs $k \in \{0, 1, \ldots, m\}$ is a binomial random variable with distribution

$$P[k] = \binom{m}{k} p_{\mathrm{succ}}^{k} (1 - p_{\mathrm{succ}})^{m-k} \qquad (3)$$

then the system fails when fewer than n PNNs work:

$$P_{\mathrm{fail}} = \sum_{k=0}^{n-1} P[k]. \qquad (4)$$

For large values of n, this failure rate can be approximated

$$P(k) \approx \mathrm{Norm}\left(k;\, m p_{\mathrm{succ}},\, m(1 - p_{\mathrm{succ}})\right) \qquad (5)$$

$$P_{\mathrm{fail}} \approx \frac{1}{2}\,\mathrm{erfc}\left(\frac{m p_{\mathrm{succ}} - n}{\sqrt{2m(1 - p_{\mathrm{succ}})}}\right) \qquad (6)$$

where $\mathrm{Norm}(k; \mu, \sigma^2)$ is a Gaussian function with mean $\mu$ and variance $\sigma^2$ (the variance $m(1 - p_{\mathrm{succ}})$ used in (5) approximates the binomial variance $m p_{\mathrm{succ}}(1 - p_{\mathrm{succ}})$ when $p_{\mathrm{succ}}$ is close to 1) and $\mathrm{erfc}(\cdot)$ is the complementary error function. System failure rate as a function of network size is plotted in Fig. 8, comparing the robustness of hard-wired systems to broadcast-and-weight systems with varying amounts of hardware overhead. The system with swappable nodes inverts the conventional trend, exhibiting a failure rate that decreases exponentially with the nominal node count. Surprisingly, systemic reliability can even be better (in some cases by orders of magnitude) than the reliability of a single node.

This mechanism of robustness through swapping could be very useful in other on-chip photonic networks; however, it does not extend arbitrarily to computational models outside of neuromorphic processing. Only processing elements invariant to input ordering (e.g., addition, NAND, etc.) allow for swapping of nodes. In most other processing models (e.g., Fredkin gates, CPU cores, etc.), the sequence of different inputs must remain distinguishable to a processor. This invariance to input sequence in a summation corresponds to a photodetector destroying wavelength information, which is a key compatibility between the photonic physics and neuromorphic function of the PNN.
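The failure-rate expressions in Equations (2), (4), and (6) can be checked numerically. The sketch below is a minimal illustration, not the authors' plotting code: the function names are hypothetical, the rounding of m = (1 + a)n to an integer uses Python's built-in `round` as an assumed convention, and the binomial terms are summed in log space (an implementation choice) to avoid overflow for large m. It assumes 0 < p_succ < 1.

```python
import math


def p_fail_hardwired(n, p_succ):
    """Eq. (2): every one of the n devices must work."""
    return 1.0 - p_succ ** n


def p_fail_broadcast_exact(n, overhead, p_succ):
    """Eq. (4): fewer than n of the m = (1 + a) n swappable PNNs work."""
    m = round((1.0 + overhead) * n)  # integer rounding convention is an assumption
    log_p, log_q = math.log(p_succ), math.log(1.0 - p_succ)
    total = 0.0
    for k in range(n):  # k = 0 .. n - 1 working PNNs
        # log of the binomial term of Eq. (3), evaluated with log-gamma
        log_term = (math.lgamma(m + 1) - math.lgamma(k + 1) - math.lgamma(m - k + 1)
                    + k * log_p + (m - k) * log_q)
        total += math.exp(log_term)
    return total


def p_fail_broadcast_approx(n, overhead, p_succ):
    """Eq. (6): Gaussian (erfc) approximation with variance m (1 - p_succ)."""
    m = (1.0 + overhead) * n
    return 0.5 * math.erfc((m * p_succ - n) / math.sqrt(2.0 * m * (1.0 - p_succ)))


# Node failure rates quoted in the Fig. 8 caption: 5e-3 (hard-wired) vs 5e-2 (BL).
for n in (50, 100, 200):
    print(n,
          p_fail_hardwired(n, 1.0 - 5e-3),
          p_fail_broadcast_exact(n, 0.13, 1.0 - 5e-2),
          p_fail_broadcast_approx(n, 0.13, 1.0 - 5e-2))
```

With these inputs the hard-wired failure rate grows with n, whereas the swappable system's failure rate shrinks, in line with the trend described for Fig. 8. The exact and approximate values agree only roughly at small n, partly because of the integer rounding of m.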

Alexander N. Tait (S’11) received the B.Sci.Eng. (Hons.) in electrical engineering from Princeton University, Princeton, NJ, USA, in 2012, where he is currently working toward the Ph.D. degree in electrical engineering in the Lightwave Communications Group, Department of Electrical Engineering.
He was a Research Intern for the summers of 2008–2010 at the Laboratory for Laser Energetics, University of Rochester, Rochester, NY, USA, and an Undergraduate Researcher for the summers of 2011–2012 at the MIRTHE Center, Princeton University, Princeton, NJ, USA. His research interests include photonic devices for nonlinear signal processing, integrated systems, neuromorphic engineering, and hybrid analog–digital signal processing and computing.
Mr. Tait is a Student Member of the IEEE Photonics Society and the Optical Society of America. He received the National Science Foundation Graduate Research Fellowship. He received the Optical Engineering Award of Excellence from the Department of Electrical Engineering, Princeton University. He has coauthored six journal papers and one Springer book chapter.

Mitchell A. Nahmias (S’11) graduated (Hons.) from Princeton University, Princeton, NJ, USA, with the B.S. degree in electrical engineering and a certificate in engineering physics. He is currently working toward the Ph.D. degree in electrical engineering under Prof. P. Prucnal to continue his undergraduate work on his excitable, photonic neuron.
Mr. Nahmias received the John Ogden Bigelow Jr. Prize in Electrical Engineering and was a cowinner of the “Best Engineering Physics Independent Work Award” for his senior thesis. He received the National Science Foundation Graduate Research Fellowship.

Bhavin J. Shastri (S’03–M’11) received the B.Eng. (Hons. with distinction), M.Eng., and Ph.D. degrees in electrical engineering from McGill University, Montreal, QC, Canada, in 2005, 2007, and 2011, respectively.
He is currently a Postdoctoral Research Fellow at the Lightwave Communications Laboratory, Princeton University, Princeton, NJ, USA. His research interests include ultrafast cognitive computing (neuromorphic engineering with photonic neurons), high-speed burst-mode clock and data recovery circuits, optoelectronic-VLSI systems, optical access networks, machine learning, and computer vision.
Dr. Shastri has received the following research awards: 2012 D. W. Ambridge Prize for the top graduating Ph.D. student, nomination for the 2012 Canadian Governor General Gold Medal, IEEE Photonics Society 2011 Graduate Student Fellowship, 2011 Postdoctoral Fellowship from the Natural Sciences and Engineering Research Council of Canada (NSERC), 2011 SPIE Scholarship in Optics and Photonics, a Lorne Trottier Engineering Graduate Fellowship, and a 2008 Alexander Graham Bell Canada Graduate Scholarship from NSERC. He received the Best Student Paper Award at the 2010 IEEE Midwest Symposium on Circuits and Systems, was the corecipient of the Silver Leaf Certificate at the 2008 IEEE Microsystems and Nanoelectronics Conference, and received the 2004 IEEE Computer Society Lance Stafford Larson Outstanding Student Award and the 2003 IEEE Canada Life Member Award. He was the President/Cofounder of the McGill OSA Student Chapter.

Paul R. Prucnal (S’75–M’79–SM’90–F’92) received the A.B. degree from Bowdoin College (summa cum laude), with Highest Honors in math and physics, where he was elected to Phi Beta Kappa. He received the M.S., M.Phil., and Ph.D. degrees from Columbia University, where he was elected to the Sigma Xi Honor Society. He is currently a Professor of Electrical Engineering, Princeton University, Princeton, NJ, USA, where he has also served as the Founding Director of the Center for Photonics and Optoelectronic Materials. He has held visiting faculty positions at the University of Tokyo and University of Parma.
Prof. Prucnal was an Area Editor of the IEEE TRANSACTIONS ON COMMUNICATIONS for optical networks, and was Technical Chair and General Chair of the IEEE Topical Meeting on Photonics in Switching in 1997 and 1999, respectively. He is a Fellow of IEEE with reference to his work on optical networks and photonic switching, a Fellow of the OSA, and a recipient of the Rudolf Kingslake Medal from the SPIE, cited for his seminal paper on photonic switching. In 2006, he received the Gold Medal from the Faculty of Physics, Mathematics and Optics from Comenius University in Slovakia, for his contributions to research in photonics. He has received Princeton Engineering Council Awards for Excellence in Teaching, the University Graduate Mentoring Award, and the Walter Curtis Johnson Prize for Teaching Excellence in Electrical Engineering, as well as the Distinguished Teacher Award from Princeton’s School of Engineering and Applied Science. He is Editor of the book, Optical Code Division Multiple Access: Fundamentals and Applications.

