Differential Phase-Shift Keying for High Spectral Efficiency Optical Transmissions
Chris Xu, Xiang Liu, Member, IEEE, and Xing Wei

Invited Paper

Abstract—Differential phase-shift keying (DPSK) has attracted significant attention in research and development during the last several years. An overview of DPSK for high spectral efficiency optical transmission is presented in this paper. The advantages of DPSK in terms of receiver sensitivity and tolerance to fiber nonlinearity will be discussed in detail. A simplified method for estimating the performance of phase-shift keying in numerical simulations is explained. Results of experimental and numerical investigations of several phase-shift keying formats, including polarization division multiplexing and multilevel encoding, will be reviewed. Finally, methods for further enhancing performance of phase-shift keying in long-haul transmissions, specifically the compensation of nonlinear phase jitter, are presented.
Index Terms—Communication systems, modulation coding,
nonlinear optics, optical fiber communication, phase modulation,
phase shift keying.


PHASE-SHIFT KEYING (PSK) for fiber-optic data transmission first attracted significant attention around 1990. Most of these early experiments were focused on coherent optical communications [1]–[5], with the main emphasis being the receiver sensitivity. For practical applications, however, PSK requires precise alignment of the transmitter and demodulator center frequencies, which was difficult to achieve at the low data rates in the early 1990s. With the advent and deployment of erbium-doped fiber amplifiers (EDFAs), the interest in PSK for optical transmission decreased noticeably, especially after the realization that nonlinear phase jitter limits the unregenerated transmission distance of a single channel PSK system and PSK was unlikely to outperform amplitude-shift keying (ASK) [6]. Since then, research and development efforts have been mostly focused on ASK transmission format, particularly on on–off keying (OOK). There were relatively few reports of PSK studies, either experimental or numerical, between the mid-1990s and 2001 (Fig. 1).

Fig. 1. Number of published journal articles on optical phase shift keying versus year of publication. Data obtained from ISI Web of Science database. For 2003, the data cover only January–September.

The capacity of fiber-optic transmission, on the other hand, increased dramatically during the same time period. The introduction of wavelength-division multiplexing (WDM) for data transmission provided a new direction for increasing the system capacity, in addition to increasing the data rate per wavelength channel. For example, the channel data rate of commercially deployed systems has improved from 2.5 Gb/s in the mid-1990s to 40 Gb/s today, and the number of wavelength channels has reached , enabling a multiterabit system in a single fiber. As a result, the spectral efficiency (SE) of fiber-optic communications has improved significantly, from a very low SE of single channel transmission around 1990 to 0.4 bit/s/Hz for the current commercial system, and even higher in research experiments. Concurrent with the rapid expansion of fiber capacity, the unregenerated reach of fiber-optic transmission has also increased dramatically, mainly driven by the desire to achieve a transparent all-optical network and ultimately reduce the cost of data transmission.
The advances in channel data rate, system SE, and reach dra-
matically altered the relative merits in performance and prac-
caused by fiber nonlinearity increase with increasing SE and reach. Thus, effects caused by fiber nonlinearity are vitally important for high-SE long-haul optical transmissions. Although the received optical signal-to-noise ratio (OSNR) increases linearly with the increase of the signal power at the input of the system, optical nonlinearity typically increases superlinearly. Consequently, the received OSNR is severely limited by fiber nonlinearities. In fact, receiver sensitivity and tolerance to fiber nonlinearity are the most important considerations in a high-SE optical system. As we shall discuss in greater detail, PSK systems are not only superior in receiver sensitivity but also more tolerant to fiber nonlinearity, especially at a system SE of 0.4 or greater. There were research efforts on PSK between 2000 and early 2002 [8]–[13]; however, the performance did not surpass that of conventional OOK, and in some cases either single-ended, instead of balanced, receivers were used or amplified spontaneous emission (ASE) noise was neglected in modeling. The conclusive evidence of PSK advantages in high-SE long-haul optical transmissions came in March 2002 [14], when a 4000-km-reach 2.56-Tb/s system (64 channels at 40 Gb/s) was experimentally demonstrated, easily doubling the reach of a conventional OOK system. Many experimental and numerical investigations on PSK transmissions followed in the next 18 months, including 40-Gb/s transmission with ultra-long-haul transmission distance of 10 000 km [15], ultrahigh capacity of 6.4 Tb/s on a single fiber [16], and a record SE of 1.6 bits/s/Hz [17]. These performances far exceeded what is achievable with conventional OOK. As indicated by the number of publications in Fig. 1, there is clearly renewed excitement in PSK after a relatively quiet period of nearly ten years.

Significant effort was devoted to coherent detections (i.e., heterodyne and homodyne detections) of PSK in the early 1990s; however, direct detection of differential PSK (DPSK) with a Mach–Zehnder delay interferometer (MZDI) is much more practical to implement at high data rate and only suffers a small penalty in receiver sensitivity. In addition, it was also realized that the return-to-zero (RZ) format further improved the practical implementation and nonlinearity tolerance of PSK [18]. Judging from the published literature in the last two years, RZ-DPSK is the focus of current research and development efforts. Thus, although some of the discussions presented in this paper are applicable to PSK in general, we will limit our scope in this paper to RZ-DPSK transmission format for high-SE transmission. In particular, we will include binary RZ-DPSK, polarization multiplexed and/or interleaved RZ-DPSK, and multilevel encoding. Methods for further enhancing DPSK for long-haul transmissions, specifically the compensation for nonlinear phase jitter, are presented.

A generic binary RZ-DPSK system is schematically shown in Fig. 2. Instead of encoding data on the pulse amplitude (OOK), the data information is stored in the phase transitions between adjacent bits in a DPSK format. Compared with the RZ-OOK system, there is no major difference in the design of the transmitter and the fiber spans. For direct detection, an MZDI with a 1-bit delay converts the incoming RZ-DPSK signal into intensity-modulated signals at its two output ports. These signals are differentially detected by balanced photodiodes, and the bit error rate (BER) is then measured on the regenerated data. It is perhaps most instructive to graphically examine the different modulation formats in the complex plane of the E-field (i.e., the phasor diagram). Each data pulse is represented by a single point (or a vector connecting the origin to that point) in the phasor diagram, in which the radial direction represents the E-field amplitude and the angular direction is the E-field phase [Fig. 2(b)]. In contrast to conventional OOK, where the data are represented by either “0” or “1,” binary DPSK essentially encodes the data as “1” or “−1” (i.e., 0 or π phase shift). Thus, a differential balanced receiver and a completely periodic intensity pattern are two major differences between DPSK and OOK. Not surprisingly, the advantages of DPSK are mostly derived from these two characteristics.

Fig. 2. (a) Schematic illustration of a binary DPSK system with MZDI. (b) Comparison of conventional OOK and DPSK (assuming the same peak power of “1s” in the two formats). For the phasor diagram, the horizontal axes are the real part of E-field, and the vertical axes are the imaginary part of E-field.

A. Advantage of Receiver Sensitivity Using DPSK Transmission Format

The receiver sensitivity of DPSK was studied extensively in the past [19]. Recently, we revisited this issue and used a simple model to calculate the BER for DPSK [20]. This simple model did not use advanced mathematics such as the chi square distribution or Marcum Q-function, and yet confirmed analytically that, in a linear channel dominated by ASE noise, DPSK requires approximately 3-dB lower OSNR than OOK does for the same BER.

In a linear optical transmission system where ASE noise dominates, the field of RZ pulses at the end of the transmission can be expressed as

(1)

where is the angular frequency of the optical carrier, is the

bit period, is the envelope function of the RZ pulse
in the th timeslot, is the (complex) amplitude of the th
pulse, and represents a classical additive Gaussian noise
(the optical noise with a different polarization will be neglected
for simplicity).
To optimize the receiver performance, in front of the receiver
we use a matched optical filter with an impulse response


where is the energy per bit.

The filtered signal is a convolution of and . Near the
center of the th timeslot, the filtered signal is
(3)

(5)

The filtered noise amplitude consists of a real part and an imaginary part , i.e.,

and and are independent zero-mean Gaussian-distributed quantities with the same variance

It can be easily proven that is equal to the power spectral density of the unfiltered white noise (single polarization).

The filtered DPSK signal is decoded with an MZDI as shown in Fig. 2. The optical output of the MZDI is a constructive interference or a destructive interference of the adjacent bits, depending on the relative phase between and . The signals measured by the two photodiodes are, respectively,

(9)

A subtraction between and is then performed by a differential amplifier, and the balanced output is

(10)

Using (4) and (6), we find

(11)

We note that the above expression represents the “inner product” of two vectors associated with the two complex quantities and . Depending on the relative sign of and is either around (for “1”) or around (for “0”), and the decision level is thus at zero. The BER is the probability for to have a wrong sign. Using the Gaussian probability density function (pdf) of and , one can calculate the BER analytically [20]. The result of the calculation is

BER (12)

which agrees with earlier results for an ideal DPSK system [2], [21].

The BER as a function of signal-to-noise ratio for OOK can be calculated with essentially the same model based on matched optical filtering. Note that with the same average signal power, the energies for a 0-bit and 1-bit in OOK are 0 and 2 , respectively, assuming equal probabilities for “0” and “1.” Using the same noise field (i.e., additive and Gaussian with respect to the signal field, and with the same variance of ), we obtain for conventional OOK

BER (13)

Neglecting the unimportant prefactors in front of the exponential functions, (12) and (13) show that an approximately 3-dB advantage in receiver sensitivity is obtained using DPSK format. Fig. 3 shows a comparison between direct detection OOK and DPSK with an MZDI and a balanced receiver. Here the signal-to-noise ratios per bit (i.e., ) in (12) and (13) have been converted to the more commonly used OSNR in optical communications (the bit rate is 40 Gb/s and the noise bandwidth is 0.1 nm). The sensitivity advantage of DPSK over OOK is 2.8 dB at a BER of 10 . It reduces slightly to 2.7 dB at a BER of 10 .

Fig. 3. BERs versus OSNR for DPSK with a MZDI and a balanced receiver and OOK in a linear channel. The BERs are calculated with (12) and (13), assuming 40 Gb/s transmission.

A simple explanation of the 3-dB advantage of DPSK is that the separation between the two transmitted symbols in DPSK (2 ) is larger than in OOK 2 when the same average signal power is used. We note that the 3-dB advantage of DPSK in a linear channel disappears if we use only one of the two outputs of the delay interferometer and ignore the other. Thus, balanced detection is essential to achieve the 3-dB receiver sensitivity advantage in DPSK.
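The balanced-detection model described in this subsection can be checked numerically with a short Monte Carlo sketch. The Python below is illustrative only and is not the authors' model. It assumes unit-amplitude symbols after matched filtering, a hypothetical mapping in which bit 1 is a π phase flip relative to the previous symbol, and independent zero-mean Gaussian noise of equal variance on the real and imaginary parts of each sample. The SNR per bit is taken as the symbol energy divided by the total (two-quadrature) noise variance, for which the textbook ideal-DPSK result is BER = 0.5·exp(−SNR per bit). The function name `simulate_dpsk_ber` and its parameters are invented for this example.

```python
import math
import random


def simulate_dpsk_ber(snr_per_bit, n_bits=200_000, seed=1):
    """Count errors for binary DPSK with a 1-bit delay interferometer
    and balanced detection (assumed model, see the text above)."""
    rng = random.Random(seed)
    # Per-quadrature noise std for unit symbol energy:
    # total noise variance = 2 * sigma**2 = 1 / snr_per_bit.
    sigma = math.sqrt(1.0 / (2.0 * snr_per_bit))

    prev_symbol = 1.0  # reference symbol sent before the data
    prev_rx = complex(prev_symbol + rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

    errors = 0
    for _ in range(n_bits):
        bit = rng.getrandbits(1)
        # Differential encoding: bit 1 flips the phase by pi (assumed mapping).
        symbol = -prev_symbol if bit else prev_symbol
        rx = complex(symbol + rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

        # Intensities at the two interferometer output ports.
        constructive = abs(rx + prev_rx) ** 2 / 4.0
        destructive = abs(rx - prev_rx) ** 2 / 4.0
        # Balanced output equals Re(rx * conj(prev_rx)): an inner product
        # of the two received field vectors, with decision level at zero.
        balanced = constructive - destructive

        decided = 1 if balanced < 0.0 else 0
        errors += decided != bit
        prev_symbol, prev_rx = symbol, rx

    return errors / n_bits


if __name__ == "__main__":
    rho = 4.0  # SNR per bit on a linear scale, about 6 dB
    print("simulated BER:", simulate_dpsk_ber(rho))
    print("0.5*exp(-rho):", 0.5 * math.exp(-rho))
```

At an SNR per bit of 4, the counted BER should fall close to 0.5·exp(−4) ≈ 0.0092. A low SNR is chosen deliberately so that enough errors are counted in a fraction of a second; direct counting at practical BER levels would need far more bits.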

Theoretically, the quantum limit (with an optically preamplified receiver) for achieving a BER of 10⁻⁹ (a BER level at which receiver sensitivity is typically defined) is 38 photons/bit for an OOK signal and 20 photons/bit for DPSK using balanced detection [19]. In practice, receiver sensitivities of 60 and 30 photons/bit have been achieved for 10-Gb/s OOK and DPSK, respectively [22]. At 42.7 Gb/s, a sensitivity of 45 photons/bit has been reported for RZ-DPSK, which is also 3 dB better than OOK.
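The quoted quantum limits can be related to the exponential BER forms with a few lines of arithmetic. The sketch below is an illustration under stated assumptions, not the calculation of [19]: it uses the textbook ideal-DPSK form BER = 0.5·exp(−ρ), an assumed exponent-only OOK form BER = 0.5·exp(−ρ/2) whose prefactor is a simplification, the customary reference BER of 10⁻⁹, and the identification of the SNR per bit ρ with photons per bit, which holds for an ideal high-gain optical preamplifier with a single noise polarization. The function names are invented for this example.

```python
import math


def dpsk_required_snr_per_bit(ber):
    """Invert BER = 0.5*exp(-rho) for rho (ideal balanced DPSK)."""
    return math.log(1.0 / (2.0 * ber))


def ook_required_snr_per_bit(ber):
    """Invert the assumed exponent-only OOK form BER = 0.5*exp(-rho/2)."""
    return 2.0 * math.log(1.0 / (2.0 * ber))


def sensitivity_gap_db(ber):
    """DPSK advantage over OOK in dB under the two forms above."""
    ratio = ook_required_snr_per_bit(ber) / dpsk_required_snr_per_bit(ber)
    return 10.0 * math.log10(ratio)


if __name__ == "__main__":
    target = 1e-9
    print(f"DPSK: {dpsk_required_snr_per_bit(target):.1f} photons/bit")
    print(f"OOK (simplified): {ook_required_snr_per_bit(target):.1f} photons/bit")
    print(f"gap: {sensitivity_gap_db(target):.2f} dB")
```

With these forms the DPSK requirement is about 20.0 photons/bit, in line with the quoted quantum limit, while the simplified OOK form gives about 40.1 photons/bit and a gap of exactly 3.01 dB. The quoted OOK figure of 38 photons/bit is slightly lower, which is consistent with the prefactors mattering at the level of a few tenths of a decibel.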
We further note here that simulations have demonstrated that
RZ-DPSK is more robust than OOK in terms of narrow-band op-
tical filtering [24] and polarization-mode dispersion [25], espe-
cially when a balanced receiver is employed. Both issues are of
practical importance in high-bit-rate and high-SE transmissions.

B. Comparison of Nonlinearity Tolerance Between DPSK and OOK

Besides its 3-dB receiver sensitivity advantage over OOK, DPSK improves the tolerance to fiber nonlinearity in a high-SE system. DPSK significantly reduces the cross-phase modulation (XPM) effects in optical transmission. It is perhaps most instructive to first discuss the XPM effects of DPSK and OOK in a dispersion-managed soliton (DMS) system [26] without ASE noise. In a dense WDM DMS system, solitons in different channels overtake each other. Every time the bits pass each other, a soliton collision occurs. Because of XPM-induced frequency shift and dispersion, each collision advances or delays the soliton. In an OOK DWDM DMS system, each WDM channel can have very different bit sequences, causing the solitons to experience very different collision patterns. This causes nonuniform timing shifts in the soliton arrival and therefore, timing jitters [27], [28]. Thus, it is not soliton collision itself but soliton collision with a random bitstream that results in timing jitter. Because XPM depends only on the intensity profile of the pulses and is independent of the phases of the pulses [in contrast to four-wave mixing (FWM)], one should be able to eliminate the impairment caused by XPM by phase coding only. The collisions between solitons in different WDM channels still occur. However, since each WDM channel has a uniform intensity pattern, the collisions are the same for all solitons. The net effects of the collisions are uniform shifts in soliton arrival. Thus, no timing jitter is introduced. Fig. 4 shows numerical simulation of DPSK and OOK DMS transmission without ASE noise. While a large amount of timing jitter is present in the OOK transmission, DPSK completely eliminated such a pattern-dependent XPM penalty.

Fig. 4. Transmitted eye diagrams of (top) DPSK-DMS and (bottom) OOK-DMS without ASE noise at 7500 km.

It is more complicated when ASE noise is included for a practical system. Because the nonlinear transmission penalties differ considerably depending on the channel data rate, we will discuss the 10- and 40-Gb/s transmissions separately.

Fig. 5. Schematic illustrations of the Gordon–Mollenauer effect in RZ-DPSK.

In addition to the linear phase noise due to ASE, the nonlinear phase noise in a 10-Gb/s DPSK system results mainly from two nonlinear effects. The first one is the well-known Gordon–Mollenauer effect [6] in which amplitude fluctuations induced by ASE noise are converted to phase fluctuations through self-phase modulation (SPM) (Fig. 5). The second effect occurs in WDM transmission: nonlinear phase jitter is added to a given wavelength by the amplitude fluctuations in its adjacent WDM channels through interchannel XPM [29], [30]. In principle, the SPM mediated Gordon–Mollenauer effect is a single-channel transmission penalty and is independent of WDM transmission, and thus independent of SE. The XPM mediated effect is a concern in dense WDM. Nonetheless, its effect is relatively small [30], [31]. As we have already discussed, the main nonlinear penalty in DMS-OOK transmission is interchannel XPM-induced timing jitter [27], [28], obviously a multichannel effect. Thus, the comparison between OOK and DPSK in terms of nonlinear transmission penalty at 10 Gb/s involves a tradeoff between single-channel SPM (in DPSK) and multichannel XPM (in OOK), and we expect the relative transmission performance of DPSK when compared with OOK to improve as the SE of the system increases. Furthermore, the improvement in receiver sensitivity for DPSK with balanced detection allows error-free transmission at significantly lower
signal powers (even lower pulse energies), leading to additional reduction in nonlinear transmission penalty, which becomes more important in a high-SE system.

Fig. 6 compares low-SE DPSK and OOK transmissions [32]. OOK clearly outperformed DPSK in such an extreme case. In fact, even better “single” channel OOK performance can be obtained by increasing the signal power. The results for WDM transmission, however, are markedly different. Fig. 7 shows transmission of DPSK and OOK at 10 Gb/s with a channel separation of 50 GHz. Comparing with the data shown in Fig. 6, WDM DPSK performance is essentially the same as that of the single channel, while the performance of WDM OOK is significantly degraded. Clearly, the relative performance of DPSK improves at higher SE transmission. Indeed, DPSK is found to outperform OOK in 10-Gb/s dense WDM systems with 0.4 SE by both numerical simulations [32] and experiments [33].

Fig. 6. Single-channel RZ-OOK and RZ-DPSK transmission at 10 Gb/s. The path-averaged powers are −12.9 and −10.5 dBm for DPSK and OOK, respectively.

Fig. 7. Dense WDM (10 Gb/s at 50-GHz channel separation) RZ-OOK and RZ-DPSK transmission. The path-averaged powers are indicated.

In contrast to 10-Gb/s transmissions, strong pulse overlap usually occurs in 40-Gb/s transmissions due to its four-times-reduced signal bit period (or 16-times-stronger dispersive effect). The main nonlinear effects are intrachannel FWM and XPM [34]. Thus, the comparison between OOK and DPSK in terms of nonlinear transmission penalty in a 40-G system is mostly within a single channel. It is known that the intrachannel XPM- and FWM-induced timing and amplitude jitters of the “ones” can be greatly reduced in a “symmetric” dispersion-managed link, and the intrachannel FWM-induced ghost pulse generation on “zeros” remains as a major nonlinear penalty source in OOK transmission [35]. DPSK is found numerically to suffer less overall nonlinear penalty in pulse-overlapped 40-Gb/s transmissions [36]. Due to the strong dispersive effect in 40-Gb/s transmissions, the Gordon–Mollenauer effect is much reduced. Thus, the major nonlinear penalty in 40-Gb/s DPSK is intrachannel FWM-induced nonlinear phase noise [36]. It is further found, through theoretical analysis and numerical simulations, that DPSK suffers less penalty from intrachannel FWM than OOK with the same average power due to the lower peak power of DPSK and a correlation between the nonlinear phase shifts experienced by any two adjacent bits [37].

Fig. 8 shows the simulated phasor diagrams of RZ-OOK and RZ-DPSK (with the same average power) after pulse-overlapped 40-Gb/s nonlinear transmission with a maximum accumulated dispersion of 850 ps/nm [37]. Evidently, ghost-pulse generation in OOK causes severe eye-closure penalty, while DPSK suffers relatively smaller eye closure in the phase domain [see Fig. 8(b)], due to reduced signal peak power in DPSK. Interestingly, the variance of the differential phase is almost the same as (instead of twice as large as) that of the absolute phase, showing the strong correlation between the phase shifts of two adjacent bits induced by intrachannel FWM. Because the data information resides in the relative phase difference between adjacent bits, correlated phase shifts have no direct impact on the transmission performance. Thus, besides its 3-dB receiver sensitivity advantage over OOK, DPSK improves the tolerance to fiber nonlinearity in a strong pulse-overlapped transmission system by reducing the impairments of both intrachannel XPM and FWM.

It is worth mentioning that with a newly developed dispersion compensation scheme based on the periodic-group-delay dispersion compensators [38]–[40], interchannel XPM can be much reduced, and this favors OOK more than DPSK. Due to the practically limited ratio between useful channel bandwidth and channel spacing (normally 0.5) of these new dispersion compensation devices, however, DPSK remains favorable for high-SE transmission.

III. MODELING DPSK PERFORMANCE IN NUMERICAL SIMULATIONS

One challenge in numerical simulations is to provide a reliable estimate of the BER. Due to limited computation time, a typical simulation program uses only hundreds of bits, and therefore, the BER is usually not counted directly but estimated by evaluating the statistical fluctuation in the received signal. In simulations of OOK, such fluctuations are often characterized by a Q-factor defined as

$$Q = \frac{|I_1 - I_0|}{\sigma_1 + \sigma_0} \qquad (14)$$

where $|I_1 - I_0|$ denotes the separation between the intensity levels of “1” and “0” and $\sigma_1 + \sigma_0$ is the sum of the standard deviations of the intensities around the levels of “1” and “0.” It is a standard practice in numerical simulation of OOK to relate Q to the BER through the complementary error function

$$\mathrm{BER} = \frac{1}{2}\,\mathrm{erfc}\!\left(\frac{Q}{\sqrt{2}}\right) \qquad (15)$$
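The conventional Q-factor procedure for OOK can be sketched in a few lines. The Python below is a minimal illustration, assuming the customary forms Q = |I1 − I0| / (σ1 + σ0) and BER = 0.5·erfc(Q/√2) with Gaussian statistics on both rails; the function names and the use of the population standard deviation are choices made for this example, not details taken from the paper.

```python
import math
import statistics


def q_factor(ones, zeros):
    """Q from sampled intensities at the decision instant on the
    "1" and "0" rails: level separation over summed std deviations."""
    separation = abs(statistics.mean(ones) - statistics.mean(zeros))
    spread = statistics.pstdev(ones) + statistics.pstdev(zeros)
    return separation / spread


def ber_from_q(q):
    """Gaussian-approximation BER at the optimal threshold."""
    return 0.5 * math.erfc(q / math.sqrt(2.0))


if __name__ == "__main__":
    # Toy samples (arbitrary units), not simulation data.
    ones = [0.9, 1.1, 1.0, 0.95, 1.05]
    zeros = [0.0, 0.1, 0.05, 0.02, 0.08]
    q = q_factor(ones, zeros)
    print(f"Q = {q:.2f}, estimated BER = {ber_from_q(q):.2e}")
    print(f"Q = 6 gives BER = {ber_from_q(6.0):.2e}")
```

A Q of 6 maps to a BER just under 10⁻⁹, which is why a few hundred simulated bits can stand in for a direct error count that would otherwise need billions of bits. The estimate is only as good as the Gaussian assumption behind it.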

Fig. 8. Simulated phasor diagrams showing the amplitude and absolute phase at the centers of signal bits in (a) RZ-OOK and (b) RZ-DPSK, and (c) the mean amplitude of adjacent bits and differential phase between them in RZ-DPSK after pulse-overlapped nonlinear transmission. The pseudorandom bit sequence length is 2 1.

We emphasize here that (15) is based on the assumption of Gaussian intensity distribution of the received signal, which is invalid in a beat-noise limited receiver. Equation (15) has worked well in OOK modeling only because it coincidentally gives a fairly good prediction of the BER at the optimal threshold, even though the noise distribution in the intensity domain is not Gaussian [19]. For DPSK with balanced detection, however, such a coincidence no longer occurs, and following the same procedure in OOK modeling, i.e., using (14) and (15), produces erroneous estimations of the BER. The details are explained below.

For DPSK transmission, (11) shows that the output of the balanced receiver is , corresponding to . Since and are independent, we find that the standard deviation of is in the small noise limit (i.e., the signal-noise beat dominates the amplitude fluctuations). Thus, the Q-factor as defined in (14) gives

(16)

Substituting the above Q-value into (15), we can once again relate BER to the signal-to-noise ratio

BER (17)

Comparing (17) to (12), which we obtained earlier using an analytical method, the predicted BER in (17) is 3-dB worse. Thus, direct applications of (14) and (15) in simulations of DPSK lead to erroneous predictions of the BER, and the Q-factor obtained by measuring the eye-diagram of the balanced receiver [Fig. 9(a)] cannot be used to predict the BER of DPSK transmission. We emphasize here that the inability of the conventional Q-factor to predict DPSK performance is fundamentally due to the non-Gaussian nature of the noise distribution in the output signal of a DPSK balanced receiver. Thus, a new approach must be taken in order to provide reliable estimates of BER in numerical modeling of DPSK [20].

Fig. 9. Illustrations of eye diagram of (a) the balanced receiver and (b) the differential phase (in unit of π).

We introduced a Q-factor that is defined for the optical phase of the received signal [20], [41]. Fig. 9(b) shows a “differential phase” eye diagram of the received RZ-DPSK signal. The “differential phase” is the phase difference between two sampling points separated by one bit period mapped to the range of −π/2 to 3π/2. We then assume a Gaussian distribution for the noise at the center of each bit slot and, similar to (14) for OOK, define a differential-phase-Q for DPSK as

$$Q_{\Delta\phi} = \frac{\pi}{\sigma_{\Delta\phi,0} + \sigma_{\Delta\phi,\pi}} \qquad (18)$$

where $\sigma_{\Delta\phi,0}$ and $\sigma_{\Delta\phi,\pi}$ represent the standard deviations of the differential phase on the zero and π rails, respectively.

To verify the validity of the differential-phase-Q, we have compared the BER estimates obtained by the differential-phase-Q to direct error counting in numerical simulations [41]. The comparison was limited to BERs of 10 to 10 so that good error statistics can be obtained within a reasonable amount of computation time. Fig. 10 shows the BER as a function of the launch power. The solid square symbols are

obtained from direct error counting in the simulation and the open circles in Fig. 10 are obtained from the expression

$$\mathrm{BER} = \mathrm{erfc}\!\left(\frac{Q_{\Delta\phi}}{\sqrt{2}}\right) \qquad (19)$$

which can be derived from the Gaussian approximation for the differential-phase noise. We find that (19) qualitatively reproduces the result obtained from the direct error counting method, and in particular, (19) correctly takes into account nonlinear transmission penalties at high launch power and predicts the optimum launch power accurately. On the other hand, the conventional Q-factor (14) completely failed to predict the SPM penalty, as shown by the open triangles in Fig. 10. The conventional Q-factor underestimates the system performance in the linear regime and overestimates the performance in the highly nonlinear regime. There is a relatively small yet noticeable discrepancy between the BER values obtained from the direct error counting and the differential-phase-Q methods. This is understandable because the probability distribution function of the differential-phase noise in the received signal is not exactly Gaussian [42].

Fig. 10. BERs obtained from different methods: (a) direct error counting, (b) estimation using the conventional Q-factor method, and (c) estimation using the differential-phase-Q method.

The differential-phase-Q in general works satisfactorily in DPSK modeling. However, we note that in a linear channel the differential-phase-Q slightly overestimates the BER performance because amplitude noise also has to be taken into account. Thus, an alternative Q that measures both phase and amplitude fluctuations provides a more accurate estimate of the BER [20]. We have used the alternative Q in our modeling of DPSK transmissions, and our simulation results are confirmed by system experiments. Compared with the direct error counting method, this alternative Q-method provides a good estimate for DPSK performance and yet reduces the computation time tremendously. Thus, it is of great practical value for numerical modeling of DPSK.

The advantages of RZ-DPSK for high-SE long-haul transmission have been experimentally verified in the last two years. In this section, we will first review some of the state-of-the-art DPSK experiments. We will then discuss the combination of polarization-division multiplexing and/or multilevel coding with DPSK to further enhance the system capacity and reach.

A. DPSK Transmission Experiments

As we have already discussed, the performance of DPSK is inferior to OOK at low SE, such as in a single channel transmission, and achieves parity with OOK at an SE of 0.2. Thus, most DPSK experiments were performed at an SE of 0.4 or higher, where the performance of DPSK is superior to that of OOK.

Using conventional OOK signals, 40-Gb/s WDM transmission systems have achieved record distances due to such innovations as forward error correction, distributed Raman amplification, and new transmission fibers. In 2001, for example, using 100-km fiber spans, 77-channel [43] and 125-channel [44] transmission over 1200 km has been demonstrated, while transmission over 2000 km has been accomplished with 40 channels.

The first terabit/second RZ-DPSK experiment was reported in early 2002 [14], where 64 channels at 40 Gb/s were transmitted over 4000 km, doubling the previous record. In this landmark experiment, the advantages of RZ-DPSK with a balanced receiver for long-haul high-SE optical transmission were unequivocally demonstrated. Within a year, a number of RZ-DPSK experiments were reported. For 10-Gb/s systems, these included 185 channels over 8370 km [46] (with only C-band EDFA) and 373 channels over 11 000 km [47]. For 40-Gb/s systems, a 10 000-km reach was achieved using a 100-km terrestrial span length [15] and an SE of 0.8 was reported [48]. Although there have been reports of 40-Gb/s transmission with OOK at 0.8 SE, this is the first experiment of 40-G transmission at 0.8 SE with all copolarized channel launch, thus testing the worst case in nonlinear transmission penalties. The reach and capacity of DPSK systems were advanced further judging from the latest reports in ECOC2003 [49], [50]. Even higher SEs (SE greater than one) can be obtained with polarization-division multiplexing and multilevel coding, which we will discuss in more detail below.

B. High-SE Transmissions With DPSK and Polarization-Division Multiplexing and/or Polarization-Bit Interleaving

Polarization-division multiplexing (PDM) has been proposed in the past to increase the total system capacity. However, it was soon realized that nonlinear polarization rotation due to polarization-mode dispersion (PMD) and interchannel XPM causes rapid and random polarization fluctuations in the WDM channels, making error-free polarization demultiplexing difficult in OOK [51]. We show here that DPSK is ideally suited for PDM to increase the SE of transmission.

It is instructive to discuss the physics of nonlinear polarization rotation in DPSK and OOK systems. In a PDM dense WDM system, there are two data channels polarization-multiplexed at each WDM wavelength. Thus, the system capacity is increased by a factor of two without reducing the WDM channel spacing. At the transmitter, the two channels have orthogonal polarizations, and if the degree of polarization is maintained throughout the transmission, polarization demultiplexing can be performed at the receiving end. In a dense WDM OOK system, XPM causes nonlinear polarization rotation. Although nonlinear polarization rotation induced by XPM does not occur if the polarizations of the two bits are either parallel or

perpendicular to each other, the presence of PMD in any fiber system quickly destroys the initial polarization alignment of different wavelength channels, making nonlinear polarization rotation inevitable in any dense WDM system. In fact, it has been shown that the polarization fluctuation in an OOK dense WDM system is random and rapid (on a bit-to-bit time scale) [51], resulting in the dramatic reduction of the degree of polarization within a single channel. The combined effects of PMD and nonlinear polarization rotation significantly limit the applicability of PDM in a long-haul dense WDM OOK system.

On the other hand, XPM-induced nonlinear polarization rotation can be largely eliminated in a DPSK system due to its uniform intensity pattern (i.e., the net effects of the collisions are uniform rotation of polarization states). Thus, the degree of polarization is well maintained and permits polarization demultiplexing at the receiver. PDM DPSK transmission can be used to double system capacity with only a small sacrifice in transmission distance [52]. Fig. 11 (top) shows the eye diagrams of the center wavelength channel (typically the worst channel) at 4000-km transmission distances for a 10-Gb/s

Fig. 11. Transmitted spectra (top rows, in dB) and eye diagrams of OOK-DMS (top) and DPSK-DMS (bottom) with ASE noise and PDM at 10 Gb/s. Middle rows: optical eye diagrams; bottom rows: electrical eye diagrams.

(bottom). The reduction in nonlinear polarization rotation penalty is dramatic, resulting in wide-open eyes. Once again, by eliminating the pattern-dependent XPM effects through phase coding, significant improvement in system performance is achieved.

Analogous to the discussion presented above, DPSK is well suited for polarization bit-interleaving (i.e., the adjacent bits within the same channel have orthogonal polarization) to reduce the intrachannel nonlinear penalty [53]. Because of the reduction of XPM-induced nonlinear polarization rotation, the degree of polarization of the bits is largely maintained in DPSK. This allows the orthogonality of the polarization states of adjacent bits to be kept over long distance, effectively suppressing the intrachannel FWM and XPM along the entire transmission link. Although polarization bit-interleaving has been shown to reduce the intrachannel nonlinear penalty in 40-Gb/s OOK transmissions [54], the combination of polarization bit-interleaving with DPSK is much more effective, especially for high-SE long-haul transmissions. Fig. 12 shows DPSK transmission at 40 Gb/s using polarization bit interleaving [53]. The nonlinear polarization rotation is largely suppressed in DPSK, significantly improving the system reach.

Fig. 12. Scatter plot of the ends of the Stokes vectors of individual bits at 4000 km, looking directly at the average vector, on the Poincaré sphere. The standard deviation of the angles between these vectors and the mean vector is 8.4°, corresponding to a polarization extinction ratio of 23 dB, which is sufficient to suppress the crosstalk between the orthogonal states at the PDM.

C. High-SE Transmission Using Multilevel DPSK

The SE of a binary encoded system obviously can never exceed 1 bit/s/Hz per polarization. With necessary guard bands, the highest efficiency for binary coding will probably be 0.8. On the other hand, multilevel coding can significantly enhance the spectral efficiency. The effective data rate of M-level coding is log₂ M times the symbol rate. Thus, the SE of an M-level system is improved by a factor of log₂ M when compared to a binary coding system at the same symbol rate. Although re-
PDM OOK-DMS system. Evidence of nonlinear polarization search in multilevel coding has a strong emphasis for achieving
rotation is clearly visible. At 4000 kms, the eye is significantly an SE of greater than 1/bit/s/Hz, there are other advantages by
closed. In comparison, we show results for a PDM DPSK using multilevel coding even at SE of 0.8 or less. Because the
DMS system with the same path-averaged power in Fig. 11 high SE is achieved through multilevel coding, there will not be

increases in sensitivity to chromatic and polarization-mode dispersion,
adverse side effects typically associated with increasing the channel data
rate. For example, a 40-Gb/s channel can now be carried at 20 Gsymbol/s,
possibly eliminating the need for the PMD compensator and the tunable
dispersion compensator in long-haul transmission, and significantly increasing
the compatibility of 40-Gb/s transmission with many existing fiber plants.

Two multilevel schemes have attracted attention lately: differential
quadrature phase-shift keying (DQPSK) and quaternary DPSK-ASK (or QPASK). As
illustrated in Fig. 13, the four symbols in DQPSK all have the same amplitude
but are separated by $\pi/2$ in phase, while in QPASK, the four symbols have
two amplitude levels and two phase levels. DQPSK was demonstrated to improve
system SE by a factor of two [55]. An SE of 1.0 bit/s/Hz and a reach of
1000 km are obtained at 25 Gb/s with 25-GHz channel spacing [55]. There was a
recent report on combining PDM and DQPSK to further enhance the SE. A record
SE of 1.6 bit/s/Hz is achieved [17].

Fig. 13. Phasor diagrams of DQPSK and QPASK. The horizontal axes are the real
part of E-field, and the vertical axes are the imaginary part of E-field.

The main difficulty of implementing DQPSK is that a complicated transmitter
and receiver are required, and the frequency offset between the transmitter
and receiver needs to be much smaller than for binary DPSK [7]. As an
alternative, QPASK is proposed [56] and experimentally demonstrated [57],
[58]. In this format, two sets of data, one in DPSK and the other in ASK with
a finite extinction ratio, are modulated on a single optical carrier and
simultaneously transmitted. At the receiver, a simple splitter divides the
optical power, and the ASK and DPSK channels are independently received by
direct detection. Thus, QPASK can essentially be accomplished using existing
transmitters and receivers that are already employed in binary DPSK and OOK
systems, simplifying the practical deployment of multilevel systems. Fig. 14
shows the back-to-back eye diagrams of the two tributaries of a 20-Gb/s RZ
QPASK signal [72]. Theoretically, the OSNR requirement for QPASK is only 1 dB
higher than that for OOK with the same overall data rate.

Fig. 14. Back-to-back eye diagrams of (a) the DPSK tributary and (b) the ASK
tributary of a 20-Gb/s DPASK signal.

ASK obviously works with DPSK overmodulation, but it may come as somewhat of a
surprise that DPSK works with ASK. The following is a brief mathematical proof
to show that the information carried by DPSK can be unambiguously detected
when a balanced receiver is used. Let $E_n$ and $E_{n-1}$ represent the
E-field of the $n$th and $(n-1)$th bit at the input of the MZDI, respectively.
The electrical output signal of the balanced detector can be calculated as

$$ I_{\mathrm{out}} \propto |E_n + E_{n-1}|^2 - |E_n - E_{n-1}|^2 \tag{20} $$

where the two terms in (20) represent the intensities at the two output arms
of the delay interferometer. Equation (20) can be reduced to

$$ I_{\mathrm{out}} \propto 4\,|E_n|\,|E_{n-1}|\cos(\Delta\phi) \tag{21} $$

where $\Delta\phi$ (in DPSK, $0$ or $\pi$) is the differential phase between
two adjacent bits. It is clear from (21) that the differential phase, where
information in DPSK transmission resides, can be unambiguously determined by
measuring the sign of the balanced receiver output. This statement is true for
any nonzero $|E_n|$ and $|E_{n-1}|$. We emphasize that balanced detection for
DPSK is essential in QPASK. QPASK fails if single-ended detection for DPSK is
used. This exceptional performance of balanced detection in the presence of
large intensity fluctuations is well known in a variety of fields. In fact,
some early DPSK experiments were entirely motivated by this particular
advantage [59],

It is interesting to note that simultaneous phase and amplitude modulation,
such as 16-QAM, is routinely used in radio-frequency (RF) transmissions to
enhance the system capacity. Nonetheless, such a system has only recently been
demonstrated in optical networking, largely propelled by the dramatic
advancements in DPSK. Obviously, amplitude and phase modulation formats can
intrinsically go far beyond four levels. Indeed, there was a recent proposal
for an eight-level (3 bits/symbol) system by combining binary ASK with DQPSK
[61]. Such a “two-dimensional” modulation format using both the amplitude and
phase of the optical carrier is perhaps the most promising approach to push
for the limits in SE and [...]

Nonlinear phase jitter caused by amplitude fluctuations and SPM poses new
limitations on DPSK systems. As we have discussed before, a single-channel
PSK system is fundamentally limited by ASE- and SPM-induced nonlinear phase
jitter [6]. It is important here to emphasize the difference between the
Gordon–Mollenauer effect and random phase noise. The Gordon–Mollenauer effect
is a deterministic effect: one can calculate the amount of Gordon–Mollenauer
effect by knowing the pulse intensity and transmission distance. Thus,
nonlinear phase jitter can be mostly eliminated through a nonlinearity
management scheme. Methods of nonlinearity management have been proposed in
the past [62]. For example, fibers with alternating positive and negative
nonlinear refractive indexes $n_2$ can be used to effectively cancel the
nonlinear phase shift resulting from SPM. However, transmission fibers with
negative $n_2$ simply do not exist. There are materials with intrinsic
negative $n_2$; however, the practicalities of using these materials in
optical fiber transmissions are yet to be explored. Another method to reduce
the nonlinear phase noise is in-line phase conjugation [63]; however, its
potential in dense WDM transmission needs to be assessed. Thus, there is
currently no practical means of deploying distributed nonlinearity management
in a dense WDM optical transmission.

Fig. 15. Schematic drawings of a DPSK system with postnonlinearity
compensation.
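
As a rough illustration of the cancellation idea behind nonlinearity management, the sketch below accumulates the SPM phase span by span using the textbook estimate of nonlinear coefficient times launch power times effective length. Every name and number in it (`gamma`, the 0.2-dB/km loss, 80-km spans, 1-mW launch power, 50 spans) is an illustrative assumption rather than a value from this paper, and the negative-coefficient spans are purely hypothetical since, as noted, transmission fibers with negative $n_2$ do not exist.

```python
import math


def effective_length_km(span_km, alpha_db_per_km):
    """Effective nonlinear length L_eff = (1 - exp(-a L)) / a of one span."""
    a = alpha_db_per_km * math.log(10) / 10.0  # power loss, dB/km -> 1/km
    return (1.0 - math.exp(-a * span_km)) / a


def spm_phase_rad(gammas_per_w_km, power_w, span_km=80.0, alpha_db_per_km=0.2):
    """Total SPM phase: sum over spans of gamma_i * P * L_eff (assumed model)."""
    l_eff = effective_length_km(span_km, alpha_db_per_km)
    return sum(g * power_w * l_eff for g in gammas_per_w_km)


gamma = 1.3        # 1/(W km), assumed typical fiber value
power_w = 1e-3     # 1 mW launch power per span, assumed
num_spans = 50     # assumed link length in spans

# Conventional link: every span has the same positive coefficient.
conventional = spm_phase_rad([gamma] * num_spans, power_w)

# Hypothetical managed link: the sign of the coefficient alternates span by span.
alternating = spm_phase_rad([gamma, -gamma] * (num_spans // 2), power_w)

# A power kick changes the conventional phase proportionally (deterministic
# intensity-to-phase conversion) but leaves the managed link at zero.
kicked = spm_phase_rad([gamma] * num_spans, 1.1 * power_w)

print(f"L_eff per span        : {effective_length_km(80.0, 0.2):.2f} km")
print(f"conventional SPM phase: {conventional:.3f} rad")
print(f"alternating-sign phase: {alternating:.3f} rad")
print(f"phase after +10% power: {kicked:.3f} rad")
```

Because the phase is proportional to power, any amplitude fluctuation maps directly into a phase fluctuation in the conventional link, which is the intensity-to-phase conversion that a management or compensation scheme tries to undo.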

Fig. 16. Phasor diagrams of E-field at a transmission distance of 6000 km
(a) without and (b) with postnonlinearity compensation. Horizontal axes: real
part of E-field. Vertical axes: imaginary part of E-field.

It has been realized, however, that lumped compensation of nonlinear phase
shift after the transmission link can effectively reduce the nonlinear phase
jitter in PSK systems [31]. Because such a technique essentially parallels the
concept of postdispersion compensation, we named it postnonlinearity
compensation (PNC). To understand how PNC works, recall that the growth of
phase jitter, which is driven by noise-induced power kicks, is similar to the
growth of Gordon–Haus timing jitter [64], which is driven by noise-induced
frequency kicks. Analogous to the reduction in timing jitter by applying the
right amount of dispersion compensation after the transmission system [65],
[66], the nonlinear phase jitter can also be reduced by applying the right
amount of PNC. Because the power kicks imparted by different amplifiers are
statistically independent, nonlinear phase jitter cannot be eliminated
completely by a lumped compensation (again similar to Gordon–Haus jitter).
Optimal PNC is achieved when the compensating phase shift is related to the
total nonlinear phase shift accumulated during transmission in the following
way: Let $\phi_{\mathrm{NL}}$ denote the accumulated nonlinear phase shift and
$\phi_{\mathrm{C}}$ denote the compensating phase shift. Analysis shows that,
in the absence of direct phase noise from ASE, one can reduce the variance of
phase jitter by a factor of four when PNC is applied after transmission with
$\phi_{\mathrm{C}} = -\phi_{\mathrm{NL}}/2$ [31], [67]. This can be
intuitively understood by assuming that all the ASE noise is added in the
middle of the transmission to produce the nonlinear phase jitter.
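
The factor of four can be checked with a short calculation. The sketch below is a simplified model, not the analysis of [31], [67]: it assumes equal, independent power kicks at evenly spaced amplifiers, a nonlinear phase that grows linearly with distance, and no direct ASE phase noise. The function name and the parameter `comp_fraction` (the compensating phase as a fraction of the total nonlinear phase) are hypothetical.

```python
def residual_jitter_variance(num_amps, comp_fraction):
    """Relative variance of nonlinear phase jitter after lumped compensation.

    A power kick added at normalized position x along the link is converted
    into phase over the remaining fraction (1 - x) of the link. The lumped
    compensator acts on the received power, so it removes the same share
    comp_fraction of the full-link phase for every kick, wherever it arose.
    """
    total = 0.0
    for k in range(1, num_amps + 1):
        x = (k - 0.5) / num_amps              # midpoint position of kick k
        total += (1.0 - x - comp_fraction) ** 2
    return total / num_amps


num_amps = 100
uncompensated = residual_jitter_variance(num_amps, 0.0)
half_compensated = residual_jitter_variance(num_amps, 0.5)
ratio = uncompensated / half_compensated

# Scan the compensation fraction on a 1% grid to locate the optimum.
best_fraction = min(
    (i / 100 for i in range(101)),
    key=lambda f: residual_jitter_variance(num_amps, f),
)

print(f"variance without PNC  : {uncompensated:.4f}")
print(f"variance with half PNC: {half_compensated:.4f}")
print(f"reduction factor      : {ratio:.3f}")
print(f"best fraction on grid : {best_fraction:.2f}")
```

In the continuous limit the relative variance is the integral of $(1 - x - f)^2$ over the link, which equals 1/3 without compensation and 1/12 at $f = 1/2$. The residual jitter at the optimum is what a single lumped device cannot remove, because kicks from different amplifiers would each need a different amount of compensation.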

Fig. 17. Plots of Q-factors [defined in (19)] versus transmission distance.
Error-free Q-value of 15.6 dB is also indicated.

We have proposed and studied a particular implementation scheme [68], [69]
where a phase modulator is used to modulate the phase of the data pulses in
front of the receiver (Fig. 15). The magnitude of the phase modulation is
directly proportional to the detected pulse intensity, and the sign is
opposite to the nonlinear phase shift caused by SPM. Thus, effects of negative
$n_2$ are generated and the nonlinear phase jitter is partially compensated.
As an example, we show by numerical simulations a DPSK DMS system at 10 Gb/s
with such a PNC scheme [70]. Fig. 16(a) shows the phasor diagram of the
E-field at a transmission distance of 6000 km with a model system. Note the
asymmetry in the E-field distribution. The system performance is clearly
dominated by nonlinear phase jitter. Quantitatively, we show the Q-factor (19)
as a function of transmission distance in Fig. 17. The requirement of a
Q-factor of at least 15.6 dB (i.e., a BER of $10^{-9}$ or lower) limits the
error-free transmission distance of DMS-DPSK to 4000 km. Fig. 16(b) shows the
phasor diagram of the E-field of the same transmission system at a
transmission distance of 6000 km after PNC. The fluctuations in the phase are
significantly reduced, and Fig. 17 also shows the Q-factor as a function of
transmission distance after PNC. An error-free reach of 6000 km is now
achieved. In fact, an improvement of 3 dB in Q-factor is obtained for all
transmission distances greater than 3500 km.

An important distinction between nonlinearity management using a phase
modulator and other methods using true optical negative $n_2$ is in their
response times. Optical negative $n_2$ generally has an ultrafast response
time (1 ps), while the phase modulator has a much slower response (100 ps).
However, it was
shown that the bandwidth of line rate components is more than sufficient for
PNC [69]. Thus, PNC can be readily implemented in 40-Gb/s DPSK systems and
beyond. For practical applications, polarization-independent devices or
polarization diversity schemes must be used to make the PNC polarization
insensitive.

The amplitude and phase have to remain independent of each other for
successful QPASK transmission. Although the condition for independent phase
and amplitude is satisfied in a linear system by proper dispersion management,
the amplitude and phase become coupled because of the Gordon–Mollenauer effect
[6]. Unlike in binary DPSK, large amplitude differences intrinsically exist in
QPASK because of the ASK modulation, resulting in a large phase difference
between a “1” bit and a “0” bit in ASK. Thus, while PNC is optional for
further improving the reach of DPSK systems, compensation of the nonlinear
phase jitter is essential in long-haul QPASK transmission.

The PNC scheme shown in Fig. 15 can obviously be applied to significantly
enhance the reach of the QPASK system. Because the PNC device here compensates
the Gordon–Mollenauer effects caused by both ASK and ASE, the enhancement in
transmission distance should be greater than what we predicted previously,
where the PNC device only compensates the Gordon–Mollenauer effects induced by
ASE noise. An interesting variation of the PNC scheme is the prenonlinearity
compensation [...] the ASE noise. Thus, we expect prenonlinearity compensation
to have lower performance than postnonlinearity compensation. Nonetheless, the
simplicity of precompensation may override its performance deficiencies in
practical deployments.

Fig. 18. Prenonlinearity compensation at the transmitter for long-haul DPASK
transmission. PC: pulse carver. IM: intensity modulator. PM: phase modulator.

Differential phase-shift keying has received tremendous attention recently
because of its superior performance in high-SE long-haul optical
transmissions. When compared to OOK, DPSK has extended the reach by
approximately a factor of two in a 40-Gb/s system at 0.4 SE. Its impact at
higher SEs, such as 0.8, is even greater. The ability to combine DPSK with
polarization bit interleaving and polarization-division multiplexing further
increases the system capacity and reach. Multilevel DPSK and multilevel
amplitude and phase-shift keying are available means to push the limit of
system SEs. The main nonlinear penalty of DPSK transmission is the nonlinear
phase jitter induced by ASE noise and SPM, which can be compensated by a
nonlinearity management scheme. For practical implementations, a simple
postnonlinearity compensation device can significantly enhance the system
reach.

Other than the MZDI at the receiver, the components and the fiber link for
DPSK transmission are similar to those of conventional OOK systems. Thus,
commercial deployment of DPSK systems is feasible without major overhauls of
the existing fiber infrastructure and manufacturing base. With its significant
performance advantages at high-SE transmission, DPSK clearly stands as a very
attractive candidate for the next-generation high-bit-rate high-SE long-haul
optical transmission system.

ACKNOWLEDGMENT

The authors acknowledge valuable discussions with and assistance from
Dr. L. F. Mollenauer, Dr. A. Chraplyvy, Dr. R. Slusher, Dr. R. Giles,
Dr. J. Gordon, Dr. S. Hunsche, Dr. A. Gnauck, Dr. A. Grant, Dr. C. Mckenstrie,
Dr. C. Doerr, and Dr. D. Fishman.