On Partial Transmit Sequences for PAR Reduction
in OFDM Systems ∗
Trung Thanh Nguyen and Lutz Lampe†
Department of Electrical and Computer Engineering
University of British Columbia, Vancouver, British Columbia, Canada
Email: {trungn,lampe}@ece.ubc.ca
Abstract — Partial transmit sequences (PTS) is a popular technique to reduce the peak-to-average
power ratio (PAR) in orthogonal frequency division multiplexing (OFDM) systems. PTS is highly successful
in PAR reduction and efficient redundancy utilization, but the considerable computational complexity
for the required search through a high-dimensional vector space and the necessary transmission of side
information (SI) to the receiver are potential problems for a practical implementation. In this paper,
we revisit PTS for PAR reduction and tackle these two problems. To address the complexity issue, we
formulate the search problem of PTS as a combinatorial optimization (CO) problem. This enables us
to (i) unify various search strategies proposed earlier in the PTS literature and (ii) adapt efficient search
algorithms known from the CO literature to PTS. We also propose a modified PTS objective function,
which reduces the number of multiplications required for PTS. Numerical results show that, perhaps
surprisingly, simple random search yields the best performance-complexity tradeoff for moderate PAR
reduction, whereas two novel CO-based methods excel if close-to-optimum PAR reduction is desired. The
SI transmission problem is solved by a simple preprocessing of the data stream before PAR reduction. This
preprocessing introduces the minimal possible redundancy and allows SI embedding without affecting the
PAR reduction capability of PTS or causing peak regrowth.
Index terms: Orthogonal frequency division multiplexing (OFDM), peak-to-average power ratio (PAR)
reduction, partial transmit sequences (PTS), combinatorial optimization, simulated annealing, tabu search,
trellis shaping, side information.
∗
This work has been accepted in part for presentation at the 2006 IEEE Global Communications Conference
(Globecom).
†
Corresponding author
Revised as Paper for publication in the IEEE Transactions on Wireless Communications, November, 2006.
1 Introduction
Orthogonal frequency division multiplexing (OFDM) is a widely used modulation technique for wireless
communication over frequency-selective channels. One of its main drawbacks is the large peak-to-average
power ratio (PAR) of the time-domain transmit signal. If not processed, high peaks in the OFDM signal
can lead to unwanted saturation in the power amplifier. As a result, expensive linear power amplifiers need
to be employed or nonlinear amplifiers must be operated power-inefficiently to avoid error-rate performance
degradation and out-of-band radiation.
To alleviate this problem, many PAR reduction techniques have been proposed in the literature, cf.
e.g. [1] for an overview. One of the classical and most popular techniques is known as partial transmit
sequences (PTS) [2, 3]. In PTS, non-overlapping subsets of OFDM subcarriers are formed, rotated
independently, and combined again. Since the signal representations corresponding to different rotations
exhibit different PARs, selecting the representation with the minimum PAR leads to PAR reduction. PTS
is known to achieve a high performance and redundancy utilization, but implementation problems arise
from (i) a relatively high computational complexity for searching the optimum sequence of rotation factors
and (ii) the need to transmit side information (SI) about the selected sequence to the receiver to undo
the rotation of OFDM subcarriers.
In this paper, we take a fresh look at PTS for PAR reduction and propose solutions for both the
above-mentioned problems. To tackle the complexity issue of PTS, we formulate the sequence search
of PTS as a particular combinatorial optimization (CO) problem. This new perspective enables us to (i)
unify various heuristics for reducing the computational complexity of PTS presented previously in [4, 5]
under the CO framework, and (ii) present two new efficient search strategies for PTS based on well-known CO algorithms. We also propose a modified PTS objective function based on an approximation
that dramatically reduces the number of multiplications required for the peak-amplitude search.
The embedding of SI for PTS has been addressed repeatedly in the literature. Efficient schemes with explicit
SI transmission have been devised in, e.g., [6, 7]. SI embedding, in which the SI is not explicitly transmitted via
dedicated OFDM subcarriers, has been proposed in the form of differential encoding [3, 8], marking of
subcarriers [9, 10], and choosing PTS vectors from special codebooks [11]. Some of the disadvantages
of these methods are peak regrowth [9, 10], increased redundancy [3, 8], inapplicability to general search
algorithms [9, 10, 11], and increased detection complexity [9, 10, 11]. We propose a new method to embed
the PTS SI into the OFDM transmit signal, which is inspired by the trellis-shaping technique considered
in [12, 13] for PAR reduction. It requires only minimal additional redundancy and complexity, does not
affect the PAR reduction algorithm, avoids peak regrowth, and practically does not degrade the overall
error-rate performance of the OFDM system.
We would like to point out that the different solutions devised in this paper are applicable to PTS in
a modular fashion, i.e., the CO-based search algorithms, the modified objective function, and the new SI
embedding scheme can be applied individually or in combination to any PTS scheme.
Organization: Section 2 briefly introduces the PAR of an OFDM signal and reviews the PTS technique
for PAR reduction. In Section 3, the sequence search of PTS is formulated as a CO problem
and efficient solutions, i.e., search strategies, are presented. The modified objective function is devised
in Section 4, and the new SI embedding scheme is developed in Section 5. Numerical performance and
complexity results are presented in Section 6. Section 7 concludes this paper.
Notation: $(\cdot)^T$, $\oplus$, $E\{\cdot\}$, and $\|\cdot\|_\infty$ denote transpose, modulo-2 addition, statistical average, and
the max norm (or $l_\infty$ norm), respectively. $0_K$, $1_K$, and $I_K$ denote, respectively, the all-zero and all-one
vectors of length $K$, and the $K \times K$ identity matrix. $\Re\{\cdot\}$ and $\Im\{\cdot\}$ are the real and imaginary parts of a
complex number or vector. Finally, the $K$-dimensional vector $x = \mathrm{IDFT}_{K \times P}(X)$ is the inverse discrete
Fourier transform (IDFT) of the $P$-dimensional vector $X$, $P \le K$, after appropriate zero-padding of $X$.
2 Preliminaries
In this section, we briefly review the PAR of OFDM signals and the PTS technique for PAR reduction.
2.1 PAR of OFDM Signal
We consider OFDM transmission with N subcarriers and subcarrier spacing ∆f = 1/NT , where T is the
modulation interval and NT is the duration of the OFDM symbol excluding the guard interval. M-ary
data symbols Xk taken from a quadrature-amplitude modulation (QAM) or phase-shift keying (PSK)
constellation are assigned to OFDM subcarriers at frequencies fk = k∆f , 0 ≤ k ≤ N − 1. As common
in the literature (cf. e.g. [4, 14, 5, 13, 1]), we model the complex envelope of the OFDM signal by
$$x(t) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X_k e^{j 2\pi f_k t}, \qquad 0 \le t < NT, \tag{1}$$
where an idealized rectangular time-domain window is assumed. We also consider a cyclic prefix extension
of x(t), which does not alter the PAR. Then, the PAR of the continuous-time signal x(t) is defined as
$$\zeta^c(x(t)) \triangleq \frac{\max_{0 \le t < NT} \{|x(t)|^2\}}{E\{|x(t)|^2\}}. \tag{2}$$
In PTS, PAR reduction is based on the sampled, discrete-time signal
$$x_n \triangleq x(nT/L) = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} X_k e^{j 2\pi k n/(LN)}, \qquad 0 \le n < LN, \tag{3}$$
where $L$ is the oversampling factor.
Accordingly, let us define the vectors
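$X \triangleq [X_0 \ldots X_{N-1}]$ and $x \triangleq [x_0 \ldots x_{LN-1}] = \mathrm{IDFT}_{LN \times N}(X)$. The corresponding PAR approximation follows as (with $\sigma_x^2 \triangleq E\{|x_n|^2\}$)
$$\zeta(x) \triangleq \frac{\max_{0 \le n < LN} \{|x_n|^2\}}{E\{|x_n|^2\}} = \frac{\|x\|_\infty^2}{\sigma_x^2}. \tag{4}$$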
We note that, since $E\{|x(t)|^2\} = E\{|x_n|^2\}$, the continuous-time PAR is not smaller than the discrete-time PAR, i.e., $\zeta(x) \le \zeta^c(x(t))$. According to the results in [15, 16, Chapter 3], a good approximation
$\zeta(x) \approx \zeta^c(x(t))$ is achieved for $L \ge 4$. We therefore adopt $L = 4$ when showing our numerical results in
Section 6.
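As a minimal sketch of (3) and (4), the following Python fragment evaluates the oversampled IDFT by direct summation and computes the discrete-time PAR of one randomly drawn symbol. The function names, the QPSK test data, and the small block length N = 32 are illustrative assumptions and are not taken from the paper; unit-energy symbols are assumed so that the average power is one.

```python
import cmath
import math
import random


def idft_oversampled(X, L):
    """Samples x_n of Eq. (3): (1/sqrt(N)) * sum_k X_k exp(j*2*pi*k*n/(L*N))."""
    N = len(X)
    return [
        sum(Xk * cmath.exp(2j * math.pi * k * n / (L * N)) for k, Xk in enumerate(X))
        / math.sqrt(N)
        for n in range(L * N)
    ]


def par_db(x, sigma2=1.0):
    """PAR of Eq. (4) in dB: peak power divided by the average power sigma2."""
    return 10 * math.log10(max(abs(s) ** 2 for s in x) / sigma2)


rng = random.Random(0)  # fixed seed, only to make the sketch reproducible
N = 32                  # illustrative block length (assumption)
X = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2) for _ in range(N)]

par_nyquist = par_db(idft_oversampled(X, 1))      # L = 1
par_oversampled = par_db(idft_oversampled(X, 4))  # L = 4
print(f"PAR (L=1): {par_nyquist:.2f} dB, PAR (L=4): {par_oversampled:.2f} dB")
```

Because the L = 1 samples are a subset of the L = 4 samples, the second value can never be smaller than the first.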
Finally, we introduce the complementary cumulative distribution function (CCDF) of the PAR
$$F_\zeta(\zeta_0) \triangleq \Pr\{\zeta(x) > \zeta_0\}, \tag{5}$$
which is commonly used to quantify the efficacy of a PAR reduction scheme.
2.2 PTS for PAR Reduction
In PTS [2, 3], the data symbols in $X$ are partitioned into $V$ disjoint subblocks $X^{(v)} \triangleq [X_0^{(v)} \ldots X_{N-1}^{(v)}]$ with $X_k^{(v)} = X_k$ or $0$, $0 \le v \le V-1$, such that
$$X = \sum_{v=0}^{V-1} X^{(v)}. \tag{6}$$
The number of non-zero components of $X^{(v)}$ is denoted by $n_v$, which in general is different for different
v. Three partitioning strategies have been proposed in [3, 8], and we apply the random partitioning
strategy, which typically yields the best PAR reduction performance, for the numerical results presented
in Section 6.
The subblocks $X^{(v)}$ are transformed into $V$ time-domain partial transmit sequences
$$x^{(v)} \triangleq [x_0^{(v)} \ldots x_{LN-1}^{(v)}] = \mathrm{IDFT}_{LN \times N}(X^{(v)}). \tag{7}$$
These sequences are independently rotated by phase factors $b_v \triangleq e^{j\phi_v}$ and then combined to produce
the time-domain OFDM signal
$$x = \sum_{v=0}^{V-1} b_v x^{(v)}. \tag{8}$$
Assuming that $\phi_v$, and thus $b_v$, can attain $W$ different values and taking into account that $b_0 = 1$
can be fixed without loss in PAR reduction, there are $W^{V-1}$ alternative representations for an OFDM
symbol. These representations correspond to all possible vectors $b \triangleq [b_1 \ldots b_{V-1}]$. PTS selects a vector
$\hat{b}$, as described in detail in Section 3 below, such that the PAR $\zeta(\hat{x})$ of the corresponding transmit
sequence $\hat{x}$ is the lowest among all examined sequences. The SI that needs to be transmitted to inform
the receiver about the selected vector is $R = (V-1)\log_2(W)$ bits. Since PTS with binary weighting
factors $b_v \in \{\pm 1\}$, i.e., $W = 2$ and $R = V - 1$, attains a favorable performance-redundancy tradeoff (cf.
e.g. [8]), we concentrate on this choice in the following.
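The construction in (6) to (8) can be sketched in a few lines of Python. The helper names, the small sizes, and the way the random partition is drawn (each subcarrier assigned independently to one of the V subblocks) are assumptions made for illustration only. The last lines check numerically that combining the partial transmit sequences with weights $b_v$ gives the same signal as rotating the subcarriers in the frequency domain.

```python
import cmath
import math
import random


def idft_oversampled(X, L):
    """Samples x_n of Eq. (3) by direct summation (adequate for small N)."""
    N = len(X)
    return [
        sum(Xk * cmath.exp(2j * math.pi * k * n / (L * N)) for k, Xk in enumerate(X))
        / math.sqrt(N)
        for n in range(L * N)
    ]


def partial_transmit_sequences(X, assign, V, L):
    """x^{(v)} of Eq. (7): IDFT of X with all subcarriers outside subblock v zeroed."""
    return [
        idft_oversampled([Xk if assign[k] == v else 0 for k, Xk in enumerate(X)], L)
        for v in range(V)
    ]


def combine(parts, b):
    """x of Eq. (8): sum_v b_v x^{(v)}; b lists all V weights, with b[0] = 1."""
    return [sum(bv * p[n] for bv, p in zip(b, parts)) for n in range(len(parts[0]))]


rng = random.Random(1)
N, V, L = 16, 4, 4  # small illustrative sizes (assumption)
X = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2) for _ in range(N)]
assign = [rng.randrange(V) for _ in range(N)]  # random partitioning; n_v may differ
parts = partial_transmit_sequences(X, assign, V, L)

b = [1, -1, 1, -1]  # one of the W**(V-1) = 8 candidates for W = 2
x = combine(parts, b)
# Same signal, obtained by rotating the subcarriers before a single IDFT.
x_freq = idft_oversampled([b[assign[k]] * X[k] for k in range(N)], L)
print(max(abs(s - t) for s, t in zip(x, x_freq)))  # numerically zero
```

The point of working with the sequences $x^{(v)}$ is that the V IDFTs are computed once per OFDM symbol, after which every candidate vector only costs additions.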
3 Search Strategies for PTS
In this section, we first formulate the sequence search of PTS as a binary CO problem and identify various
known implementations for this search as so-called descent heuristics. We then adapt two search strategies
from the CO literature to PTS and optimize their respective parameters.
3.1 Sequence Search as a CO Problem
Using the notation introduced in Section 2, we can state the optimization problem of PTS, i.e., to find
the vector b that yields the transmit signal with the minimum PAR, as the following binary CO problem:
$$\underset{b}{\text{minimize}} \quad f(b) \qquad \text{subject to: } b \in \{\pm 1\}^{V-1}, \tag{9}$$
with the objective function (see (4), (8))
$$f(b) = \|x\|_\infty^2 = \Big\| \sum_{v=0}^{V-1} b_v x^{(v)} \Big\|_\infty^2. \tag{10}$$
Here we have exploited the fact that the phase rotations in (8) do not change the average signal
power $\sigma_x^2$.
Finding the exact solution of (9) requires full enumeration of all $2^{V-1}$ possible phase vectors. Each
evaluation of the objective function involves (i) combining the partial transmit sequences $x^{(v)}$ and (ii) computing
the peak power of the corresponding sequence $x$. Assuming that complex numbers are stored as pairs
of real and imaginary parts and since $b_v \in \{\pm 1\}$, calculating the transmit sequence involves $2NL(V-1)$
additions and computing the peak power requires $2NL$ real multiplications per trial vector. Especially the
large number of multiplications renders full enumeration computationally prohibitive even for moderate
values of V . Hence, the development of nonexact (in the sense that the chosen solution cannot be
guaranteed to be optimal with respect to (9)) PTS algorithms performing a reduced number of searches
for the best phase vector becomes attractive. We refer to such algorithms as heuristics [17] and from
the CO literature there is a rich set of computationally efficient heuristics at our disposal (for a detailed
discussion we refer the reader to [17]). In the following, we (re-)introduce nonexact PTS algorithms which
fall into two classes, so-called descent heuristics (Section 3.2) and metaheuristics (Section 3.3).
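For reference, the exact solution of (9) by full enumeration can be sketched as follows. The helper `make_parts` is hypothetical: it draws one random unit-energy QPSK symbol, partitions it at random, and returns its partial transmit sequences. Sizes are kept small (V = 5, i.e., 16 candidate vectors) so that the pure-Python loop stays fast; none of these values come from the paper.

```python
import cmath
import itertools
import math
import random


def make_parts(N, V, L, rng):
    """Hypothetical helper: partial transmit sequences of one random QPSK symbol."""
    X = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2) for _ in range(N)]
    assign = [rng.randrange(V) for _ in range(N)]
    return [
        [
            sum(X[k] * cmath.exp(2j * math.pi * k * n / (L * N))
                for k in range(N) if assign[k] == v) / math.sqrt(N)
            for n in range(L * N)
        ]
        for v in range(V)
    ]


def f(parts, b):
    """Objective (10) for b = (b_1, ..., b_{V-1}); b_0 = 1 is fixed."""
    weights = (1,) + tuple(b)
    return max(
        abs(sum(w * p[n] for w, p in zip(weights, parts))) ** 2
        for n in range(len(parts[0]))
    )


def full_enumeration(parts):
    """Exact solution of (9): evaluate f for all 2**(V-1) sign vectors."""
    V = len(parts)
    return min(itertools.product((1, -1), repeat=V - 1), key=lambda b: f(parts, b))


parts = make_parts(N=16, V=5, L=4, rng=random.Random(2))
b_opt = full_enumeration(parts)
f_plain = f(parts, (1, 1, 1, 1))  # unrotated symbol
gain_db = 10 * math.log10(f_plain / f(parts, b_opt))
print(f"best vector {b_opt}, gain over the unrotated symbol: {gain_db:.2f} dB")
```

The number of objective evaluations doubles with every additional subblock, which is what motivates the heuristics discussed next.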
In passing, it is worth noting that following similar steps as in [16, Section 4.3], the CO problem
(9) could be written as a mixed integer (MI) linear program (LP) if x(v) are real-valued and as a MI
quadratically constrained quadratic program (QCQP) if x(v) are complex-valued, respectively. However,
relaxation of the integrality restrictions to arrive at a computationally simpler LP and QCQP, respectively,
will often return the all-zero vector as solution. Hence, the relaxation approach frequently applied to CO
problems is not useful in the case of PTS.
3.2 Descent Heuristics for PTS
In the description of heuristics in this paper, b̃ is the current solution and b̂ is the best-so-far solution.
A move is an attempt to replace the current solution b̃ by a trial solution b. In descent heuristics, only
downhill moves are accepted, i.e., b̃ is replaced by b only if f (b) < f (b̃). Hence, b̃ is also the best-so-far solution b̂. Interestingly, a number of suboptimal PTS algorithms presented in the literature can
be categorized as descent heuristics. In particular, the bit-flipping search proposed in [4] is known as
a construction greedy algorithm and the PTS method devised in [5] is a particular implementation of a
local search algorithm, where the local neighborhood of the vector b̃ is defined in terms of the Hamming
distance from b̃. In this paper, we extend the bit-flipping algorithm from [4] in that bit flipping can be
continued in a cyclic fashion and the local-search algorithm from [5] in that we maintain a list of all vectors
that have been searched in previous iterations, which allows early termination if no more downhill moves
can be proposed. Another descent heuristic that is considered in this paper is “simple” random search,
where a trial solution b is randomly generated. Random search was already mentioned in [4].
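Two of the descent heuristics described above, random search and cyclic bit flipping, can be sketched as follows. This is an illustration under stated assumptions rather than the algorithms of [4]: the budget I counts every objective evaluation including the initial vector, bit flipping starts from the all-ones vector, and it stops after one full cycle without a downhill move. `make_parts` is a hypothetical helper that generates test data.

```python
import cmath
import math
import random


def make_parts(N, V, L, rng):
    """Hypothetical helper: partial transmit sequences of one random QPSK symbol."""
    X = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2) for _ in range(N)]
    assign = [rng.randrange(V) for _ in range(N)]
    return [[sum(X[k] * cmath.exp(2j * math.pi * k * n / (L * N))
                 for k in range(N) if assign[k] == v) / math.sqrt(N)
             for n in range(L * N)] for v in range(V)]


def f(parts, b):
    """Objective (10) for b = (b_1, ..., b_{V-1}); b_0 = 1 is fixed."""
    w = (1,) + tuple(b)
    return max(abs(sum(wv * p[n] for wv, p in zip(w, parts))) ** 2
               for n in range(len(parts[0])))


def random_search(parts, I, rng):
    """Draw trial vectors at random; keep a trial only if it lowers f."""
    V = len(parts)
    cur = tuple(rng.choice((1, -1)) for _ in range(V - 1))
    cur_f = f(parts, cur)
    for _ in range(I - 1):  # I objective evaluations in total
        trial = tuple(rng.choice((1, -1)) for _ in range(V - 1))
        trial_f = f(parts, trial)
        if trial_f < cur_f:
            cur, cur_f = trial, trial_f
    return cur, cur_f


def cyclic_bit_flipping(parts, I):
    """Flip one sign at a time in cyclic order; accept downhill moves only."""
    V = len(parts)
    cur = [1] * (V - 1)
    cur_f = f(parts, cur)
    evaluations, failures, pos = 1, 0, 0
    while evaluations < I and failures < V - 1:
        trial = list(cur)
        trial[pos] = -trial[pos]
        trial_f = f(parts, trial)
        evaluations += 1
        if trial_f < cur_f:
            cur, cur_f, failures = trial, trial_f, 0
        else:
            failures += 1  # V - 1 failures in a row: no single flip improves
        pos = (pos + 1) % (V - 1)
    return tuple(cur), cur_f


parts = make_parts(N=16, V=5, L=4, rng=random.Random(3))
b_rs, f_rs = random_search(parts, I=8, rng=random.Random(4))
b_bf, f_bf = cyclic_bit_flipping(parts, I=1000)
print("random search:", b_rs, round(f_rs, 3))
print("cyclic bit flipping:", b_bf, round(f_bf, 3))
```

With a generous budget, the bit-flipping result is a local optimum with respect to single sign flips, which is exactly the trap that the metaheuristics of the next subsection try to escape.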
3.3 Metaheuristics for PTS
Descent heuristics usually do not perform very well when dealing with hard CO problems. The reason is
that the solution is always updated in the direction of improvements and hence is quickly trapped in a
local optimum [17]. Metaheuristics, on the other hand, have mechanisms to accept uphill moves to avoid
early convergence to local optima. We consider the application of two well-known metaheuristics to PTS.
3.3.1 Simulated Annealing
The simulated annealing metaheuristic [18] is derived from the physical annealing process of solid materials. In simulated annealing, a solution vector $b$ represents a system state and the value $f(b)$ represents
the energy of the system in that state. A move from the current state $\tilde{b}$ to another state $b$ is accepted
or rejected based on the Metropolis criterion [19]:
If $\delta \le 0$, accept the move,
If $\delta > 0$, accept the move with probability $e^{-\delta/T}$,
where $\delta \triangleq f(b) - f(\tilde{b})$ and $T$ is the temperature. At fixed $T$, the system is likely to approach thermal equilibrium, where the probability of a state $\tilde{b}$ is proportional to $e^{-f(\tilde{b})/T}$. If the system is cooled
down sufficiently slowly, simulated annealing returns, to a certain approximation, the global minimum. Unfortunately, such cooling schedules are normally too slow to be practical. Instead, simulated annealing
implementations aim at obtaining near-optimum solutions within a limited time.
The simulated annealing implementation that we propose for PTS sequence search is summarized in
Fig. 1 (upper part). The trial solution b is derived from b̃ in the same procedure as in cyclic bit-flipping.
This cyclic order avoids leaving any solution in the neighborhood of b̃ unvisited for a long time. The
range of the temperature T and how it is cooled down are important factors in simulated annealing. We
consider the following cooling schedules.
• Geometric cooling: The temperature $T$ is decreased to $\alpha T$, $0 < \alpha < 1$, after each iteration. If
the starting temperature is $T_s$, the final temperature is $T_f$, and the number of iterations is $I$, then
$\alpha = (T_f/T_s)^{1/I}$. Experiments are required to find a suitable range for the temperature. In Section 6
we present annealing curves [20] to determine good values for $T_s$ and $T_f$.
• Gaussian-like cooling: This cooling schedule from [21] allows a more flexible control of the temperature decrement. The temperature at the $i$-th iteration is
$$T_i = T_s\, a^{-(i/(cI))^b}, \tag{11}$$
where $b \triangleq \log[\log(T_s/T_f)/\log(a)]/\log(1/c)$ to achieve $T_I = T_f$. The two parameters $a$ and $c$ can
be selected so that the cooling curve passes through any temperature $T_s > T_i > T_f$ at a given
iteration $i$.
• Constant temperature: At too high a temperature, the annealing process becomes completely random,
whereas at too low a temperature, it becomes a descent search. Since we are limited by the maximum
number of iterations I, it may be advantageous to spend all the time at a fixed temperature in the
middle. Such a temperature should be high enough to allow escape from local minima and low enough
to ensure a thorough search in the regions near good solutions.
The performances of simulated annealing with the three cooling schedules will be compared in Section 6.
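The following sketch combines the Metropolis criterion, cyclic bit flipping as the move generator, and the three cooling schedules. It is not a transcription of Fig. 1: starting from the all-ones vector, the helper `make_parts`, and all numerical values in the demo are assumptions. Eq. (11) is implemented as written, which requires a > 1 for the inner logarithm to be defined, so the demo uses a = 2. Since the test symbols have unit energy, $\sigma_x^2 = 1$ and the temperatures are already normalized.

```python
import cmath
import math
import random


def make_parts(N, V, L, rng):
    """Hypothetical helper: partial transmit sequences of one random QPSK symbol."""
    X = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2) for _ in range(N)]
    assign = [rng.randrange(V) for _ in range(N)]
    return [[sum(X[k] * cmath.exp(2j * math.pi * k * n / (L * N))
                 for k in range(N) if assign[k] == v) / math.sqrt(N)
             for n in range(L * N)] for v in range(V)]


def f(parts, b):
    """Objective (10) for b = (b_1, ..., b_{V-1}); b_0 = 1 is fixed."""
    w = (1,) + tuple(b)
    return max(abs(sum(wv * p[n] for wv, p in zip(w, parts))) ** 2
               for n in range(len(parts[0])))


def geometric(Ts, Tf, I):
    """T_i = Ts * alpha**i with alpha = (Tf/Ts)**(1/I)."""
    alpha = (Tf / Ts) ** (1.0 / I)
    return lambda i: Ts * alpha ** i


def gaussian_like(Ts, Tf, I, a, c):
    """Eq. (11) as written; needs a > 1 so that the inner logarithm is defined."""
    b = math.log(math.log(Ts / Tf) / math.log(a)) / math.log(1.0 / c)
    return lambda i: Ts * a ** (-((i / (c * I)) ** b))


def constant(T):
    return lambda i: T


def simulated_annealing(parts, I, schedule, rng):
    """Return the best-so-far vector and its objective value after I moves."""
    V = len(parts)
    cur = [1] * (V - 1)  # assumed starting point: the unrotated symbol
    cur_f = f(parts, cur)
    best, best_f = tuple(cur), cur_f
    for i in range(I):
        trial = list(cur)
        trial[i % (V - 1)] *= -1  # cyclic bit flipping
        trial_f = f(parts, trial)
        delta = trial_f - cur_f
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / schedule(i)):
            cur, cur_f = trial, trial_f
            if cur_f < best_f:
                best, best_f = tuple(cur), cur_f
    return best, best_f


parts = make_parts(N=16, V=6, L=4, rng=random.Random(5))
I = 60
schedules = {
    "geometric": geometric(1.0, 0.2, I),
    "gaussian-like": gaussian_like(1.0, 0.2, I, a=2.0, c=0.4),
    "constant": constant(0.5),
}
results = {name: simulated_annealing(parts, I, s, random.Random(6))
           for name, s in schedules.items()}
for name, (b, fb) in results.items():
    print(name, b, round(fb, 3))
```

In this parametrization the Gaussian-like curve passes through $T_s/a$ at iteration $i = cI$, which is one way to read the role of the two parameters.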
3.3.2 Tabu Search
The tabu search metaheuristic [22, 23] uses information from the search history to drive the search into
regions that might have better solutions. Two highly important elements of tabu search are intensification
and diversification strategies. The former encourage more thorough search near good visited solutions,
whereas the latter take the search to unexplored areas. Tabu search implementations realize these strategies by employing, for example, tabu lists and aspiration criteria. Tabu lists prohibit moves which would
be ineffective. Aspiration criteria, on the other hand, overrule tabu lists by allowing tabu moves that are
deemed beneficial to the search.
In this paper we apply to PTS a simple tabu search implementation that uses only short-term history
to avoid ineffective cyclic move sequences. The algorithm, presented in Fig. 1 (lower part), is based on
local search. The neighborhood N (b̃) of b̃ is the set of vectors that have Hamming distance dH = 1 from
b̃, i.e., |N (b̃)| = V − 1. After each iteration, the new solution differs from the last solution at exactly
one bit position. That position is recorded in the tabu list L and it will not be flipped in the next B
iterations. Since both uphill and downhill moves are allowed, the role of this tabu list is to avoid possible
cycles of visited solutions. The search starts with a randomly generated initial solution and an empty tabu
list. The only tunable parameter of the algorithm is the tabu tenure B. Again, we refer to Section 6 for
numerical results.
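A short-term-memory tabu search of the kind described above might look as follows. Since Fig. 1 is not reproduced here, the details are assumptions: in each iteration all non-tabu neighbours at Hamming distance one are evaluated and the best of them is taken, even if it is an uphill move, and no aspiration criterion is used. The function also returns the sequence of flipped positions so that the tabu rule can be inspected. `make_parts` is a hypothetical helper for test data.

```python
import cmath
import math
import random


def make_parts(N, V, L, rng):
    """Hypothetical helper: partial transmit sequences of one random QPSK symbol."""
    X = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2) for _ in range(N)]
    assign = [rng.randrange(V) for _ in range(N)]
    return [[sum(X[k] * cmath.exp(2j * math.pi * k * n / (L * N))
                 for k in range(N) if assign[k] == v) / math.sqrt(N)
             for n in range(L * N)] for v in range(V)]


def f(parts, b):
    """Objective (10) for b = (b_1, ..., b_{V-1}); b_0 = 1 is fixed."""
    w = (1,) + tuple(b)
    return max(abs(sum(wv * p[n] for wv, p in zip(w, parts))) ** 2
               for n in range(len(parts[0])))


def tabu_search(parts, iterations, B, rng):
    """Local search over Hamming-distance-1 neighbours with tabu tenure B."""
    V = len(parts)
    cur = [rng.choice((1, -1)) for _ in range(V - 1)]  # random initial solution
    best, best_f = tuple(cur), f(parts, cur)
    tabu_until = [0] * (V - 1)  # position p may not be flipped while it < tabu_until[p]
    moves = []
    for it in range(iterations):
        candidates = []
        for p in range(V - 1):
            if it < tabu_until[p]:
                continue
            trial = list(cur)
            trial[p] = -trial[p]
            candidates.append((f(parts, trial), p))
        if not candidates:  # every position is tabu (only possible if B >= V - 1)
            break
        trial_f, p = min(candidates)
        cur[p] = -cur[p]  # best admissible neighbour, uphill or downhill
        tabu_until[p] = it + 1 + B  # blocked for the next B iterations
        moves.append(p)
        if trial_f < best_f:
            best, best_f = tuple(cur), trial_f
    return best, best_f, moves


parts = make_parts(N=16, V=6, L=4, rng=random.Random(7))
B = 2
best, best_f, moves = tabu_search(parts, iterations=30, B=B, rng=random.Random(8))
print(best, round(best_f, 3), moves)
```

Note that each iteration of this variant costs up to V − 1 objective evaluations, so iteration counts are not directly comparable with the search counts of the other heuristics.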
4 Objective Function Approximation for PTS
The heuristics reviewed and proposed in the previous section aim at reducing the number of sequence
searches, i.e., the number of evaluations of the cost function f (b) required for PTS. In this section we
consider the cost function itself. In particular, we propose an approximation for f (b) that is simpler to
compute than the exact peak power while retaining the PAR reduction capability of PTS.
4.1 CO Problem with Approximate Objective Function
The peak power is the square of the radius of the circle that encloses the selected PTS transmit sequence
x̂ in the complex plane. Following an approach from [24], we approximate this circle by a symmetrical
polygon of P sides, where we require P to be a multiple of four. This approximation is illustrated in Fig. 2
for different values of P .
Applying the polygon approximation, we can replace the PTS CO problem (9) by
$$\underset{b}{\text{minimize}} \quad g(b) \qquad \text{subject to: } b \in \{\pm 1\}^{V-1}, \tag{12}$$
with the new objective function
$$g(b) \triangleq \max\{\Re\{x e^{j 2\pi p/P}\},\ p = 0, \ldots, P-1\}. \tag{13}$$
The selected sequence x̂ is now enclosed by the P -polygon. For efficient computation of g(b), we generate
the real-valued sequences
$$y^{(v)} \triangleq \big[\Re\{x^{(v)}\}\ \ \Im\{x^{(v)}\}\ \ \ldots\ \ \Re\{x^{(v)} e^{j 2\pi (P/4-1)/P}\}\ \ \Im\{x^{(v)} e^{j 2\pi (P/4-1)/P}\}\big] \tag{14}$$
of length NLP/2, 0 ≤ v ≤ V − 1. Then, for different vectors b, the combined sequence
$$y \triangleq \sum_{v=0}^{V-1} b_v y^{(v)}, \tag{15}$$
is formed and the value of g(b) is obtained by a simple maximum search,
$$g(b) = \|y\|_\infty = \max\{y, -y\}. \tag{16}$$
We observe that all algorithms introduced in Section 3 can directly be applied to the new CO problem
(12). It is also clear that the solution of (12) approaches that of (9) as P increases in the sense that the
corresponding PARs approach identical values.
Finally, we note that the approximation of a circle by a symmetrical polygon has also been used in [25]
in the context of PAR reduction, but there a different optimization problem which arises in PAR reduction
via tone reservation has been considered.
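The equivalence between the definition (13) and the fast evaluation (14) to (16) can be checked numerically. The sketch below uses random complex sequences as stand-ins for the partial transmit sequences, since the identity holds for any complex data; function names and sizes are assumptions. It also illustrates that g(b) approximates the peak amplitude from below, with a gap of at most a factor $1/\cos(\pi/P)$, because the nearest of the P projection directions is at most $\pi/P$ away from the phase of any sample.

```python
import cmath
import math
import random


def build_y(xv, P):
    """Eq. (14): stack Re and Im of x^{(v)} rotated by 2*pi*q/P, q = 0..P/4-1."""
    y = []
    for q in range(P // 4):
        rot = cmath.exp(2j * math.pi * q / P)
        y += [(s * rot).real for s in xv]
        y += [(s * rot).imag for s in xv]
    return y


def g_fast(ys, b):
    """Eqs. (15)-(16): signed sum of the real sequences, then a maximum search."""
    return max(abs(sum(bv * y[n] for bv, y in zip(b, ys))) for n in range(len(ys[0])))


def g_direct(parts, b, P):
    """Eq. (13) evaluated literally on x = sum_v b_v x^{(v)}."""
    x = [sum(bv * p[n] for bv, p in zip(b, parts)) for n in range(len(parts[0]))]
    return max((s * cmath.exp(2j * math.pi * p / P)).real for s in x for p in range(P))


rng = random.Random(9)
# Stand-in data: 4 random complex sequences of length 32 (not OFDM signals).
parts = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(32)] for _ in range(4)]
b = (1, -1, -1, 1)  # all V weights, including b_0 = 1
peak = max(abs(sum(bv * p[n] for bv, p in zip(b, parts))) for n in range(32))

for P in (4, 8, 16):
    ys = [build_y(p, P) for p in parts]  # computed once, reused for every b
    print(P, round(g_fast(ys, b), 4), round(peak, 4))
```

Only `build_y` involves multiplications, and it is executed once per OFDM symbol; every subsequent candidate vector is evaluated with additions and comparisons.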
4.2 Computational Complexity
It is insightful to compare the number of operations involved in the evaluation of f (b) and g(b). Assuming
again that complex numbers are stored in Cartesian form and that I is the total number of searches, we have:
objective function | # of multiplications | # of additions
$f(b)$ | $2NLI$ (to find the peak value) | $2NL(V-1)I$ (to generate $x$)
$g(b)$ | $4NLV(P/4-1)$ (to generate $y^{(v)}$) | $(P/2)NL(V-1)I$ (to generate $y$) $+\ 2NLV(P/4-1)$ (to generate $y^{(v)}$)
Although the number of additions is increased by about a factor of P/4 for the new objective function
compared to that for f (b), the number of multiplications is reduced by a factor of I/[2V (P/4 − 1)].
Considering multiplications as computationally most expensive, using the objective function approximation
g(b) is advantageous when more than 2V (P/4 − 1) searches are performed. The numerical results in
Section 6 show that this is generally the case for good PAR reduction with PTS.
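The operation counts above are easy to tabulate. The sketch below encodes them directly; the function names are hypothetical, N, L, and V follow the parameters used for the numerical results, and P = 8 is an assumption chosen only for illustration.

```python
def cost_exact(N, L, V, I):
    """Operation counts for the exact objective f(b) over I searches."""
    return {"mult": 2 * N * L * I, "add": 2 * N * L * (V - 1) * I}


def cost_polygon(N, L, V, I, P):
    """Operation counts for g(b) over I searches; P must be a multiple of four."""
    setup = V * (P // 4 - 1)  # rotations needed to generate the y^{(v)}
    return {
        "mult": 4 * N * L * setup,
        "add": (P // 2) * N * L * (V - 1) * I + 2 * N * L * setup,
    }


def break_even_searches(V, P):
    """Number of searches above which g(b) needs fewer multiplications than f(b)."""
    return 2 * V * (P // 4 - 1)


N, L, V, P = 256, 4, 16, 8  # P = 8 is an illustrative assumption
print("break-even:", break_even_searches(V, P), "searches")
for I in (16, 64, 1024):
    print(I, cost_exact(N, L, V, I), cost_polygon(N, L, V, I, P))
```

For P = 4 the setup term vanishes and g(b) requires no multiplications at all, at the price of the coarsest polygon approximation.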
5 Novel Side Information (SI) Embedding Scheme for PTS
The transmission and detection of the SI b is critical for PAR reduction with PTS. In this section, we
present a new SI embedding technique that has the following attractive features. (i) The PTS PAR
reduction algorithm is not affected and no peak regrowth is caused. (ii) The added redundancy is the
minimum value of R = V − 1 bits (considering W = 2), i.e., one bit per subblock. (iii) It can be applied
to general QAM/PSK constellations and combined with any PTS search algorithm. (iv) SI detection has
very low complexity, and the performance degradation due to erroneously detected SI is minimal.
The only assumption we make concerns the mapping of binary data to M-ary data symbols Xk . In
particular, we apply a sign-bit mapping where inverting the most significant bit (MSB) inverts the sign of
the signal point. The less significant bits (LSBs) may be mapped according to e.g. Gray labeling.
5.1 Implementation of the SI Scheme
For the sake of a clear exposition, we think of the multiplication with bv (see (8)) as being executed in the
frequency domain. Then, let us define $u^{(v)}$ as the $n_v$-dimensional vector of the MSBs of the $n_v$ non-zero
components $X_k$ in the subblock $X^{(v)}$, and let $s^{(v)}$ be this vector for the rotated signal $b_v X^{(v)}$. We also
refer to $u^{(v)}$ and $s^{(v)}$ as unshaped and shaped MSBs, respectively. Let us further define
$$c^{(v)} \triangleq \begin{cases} 1_{n_v}, & \text{if } b_v = -1 \\ 0_{n_v}, & \text{if } b_v = 1. \end{cases} \tag{17}$$
Taking the sign-bit mapping into account, the PTS PAR reduction can then be formulated as
$$s^{(v)} = u^{(v)} \oplus c^{(v)}. \tag{18}$$
Since $c^{(v)} \in \{1_{n_v}, 0_{n_v}\}$ are the two codewords of the $(n_v, 1, n_v)$ repetition code, the PAR reduction
operation (18) can be made transparent for data transmission by obtaining the $n_v$ unshaped MSBs from
$$u^{(v)} = d^{(v)} H^{-T} \tag{19}$$
at the transmitter and applying the “inverse” operation
$$s^{(v)} H^T = (d^{(v)} H^{-T} \oplus c^{(v)}) H^T = d^{(v)} \tag{20}$$
to the shaped MSBs at the receiver. $H$ is the $(n_v - 1) \times n_v$ parity-check matrix of the repetition code and $H^{-T}$
the left inverse of $H^T$. We note that any $(n_v - 1) \times n_v$ matrix $H$ with $n_v - 1$ independent rows which
satisfies $1_{n_v} H^T = 0_{n_v - 1}$ is a valid parity-check matrix of the code. For any such matrix, there exists at
least one left inverse. Since $d^{(v)}$ contains $n_v - 1$ bits, one redundancy bit per block is implicitly embedded
as PTS SI.
The block diagram of the proposed SI embedding scheme is shown in Fig. 3. It is interesting to note
that this implementation of PTS can be regarded as a special case of trellis-shaping for PAR reduction
recently presented in [12, 13] with a constellation mapping referred to as Type-I in [13]. The complexity
for SI embedding and detection is extremely low as $H$ and $H^{-T}$ can be made sparse, e.g.,
$$H_1^T = \begin{bmatrix} I_{n_v-1} \\ 1_{n_v-1} \end{bmatrix}, \qquad H_1^{-T} = \begin{bmatrix} I_{n_v-1} & 0_{n_v-1}^T \end{bmatrix} \tag{21}$$
are valid matrices, and operations are in the Galois field GF(2).
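The transmitter and receiver operations (17) to (20) can be sketched with plain bit lists, using the sparse matrices of (21). Function names and the example data are assumptions. The sketch shows that the receiver recovers the data bits without knowing whether the subblock was rotated.

```python
def gf2_vecmat(v, M):
    """Row vector v times matrix M (given as a list of rows) over GF(2)."""
    return [sum(vi & mij for vi, mij in zip(v, col)) % 2 for col in zip(*M)]


def h1_transposed(nv):
    """H_1^T of (21): identity I_{nv-1} stacked on top of an all-ones row."""
    eye = [[int(i == j) for j in range(nv - 1)] for i in range(nv - 1)]
    return eye + [[1] * (nv - 1)]


def h1_left_inverse(nv):
    """H_1^{-T} of (21): identity I_{nv-1} followed by an all-zero column."""
    return [[int(i == j) for j in range(nv)] for i in range(nv - 1)]


def embed(d, bv, H_inv_T):
    """Transmitter: u = d H^{-T} as in (19), then s = u XOR c as in (17), (18)."""
    u = gf2_vecmat(d, H_inv_T)
    c = 1 if bv == -1 else 0  # all-ones codeword for b_v = -1, all-zero otherwise
    return [ui ^ c for ui in u]


def recover(s, H_T):
    """Receiver: d = s H^T as in (20); b_v does not need to be known."""
    return gf2_vecmat(s, H_T)


nv = 6
d = [1, 0, 1, 1, 0]  # nv - 1 data bits carried by the MSBs of one subblock
for bv in (1, -1):
    s = embed(d, bv, h1_left_inverse(nv))
    print(bv, s, recover(s, h1_transposed(nv)))
```

With these particular matrices the receiver simply adds the last shaped MSB to each of the other shaped MSBs, which is why the detection complexity is negligible.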
5.2 Effect on Error-rate Performance
While the particular choice of $H$ does not have any impact on SI embedding, the distribution of errors
in $\hat{d}^{(v)}$ in case of erroneous MSBs $\hat{s}^{(v)}$ depends on $H$. For example, consider $H_1^T$ in (21) and
$$H_2^T = \begin{bmatrix} I_{n_v-1} \\ 0_{n_v-1} \end{bmatrix} \oplus \begin{bmatrix} 0_{n_v-1} \\ I_{n_v-1} \end{bmatrix}. \tag{22}$$
Whereas a single error in the $n_v$th position of $s^{(v)}$ will cause a burst of $n_v - 1$ errors (all bits of $\hat{d}^{(v)}$) if $H_1$ is applied, single
errors in the $n_v - 2$ center positions of $s^{(v)}$ will lead to double-adjacent errors for $H_2$. Hence, in case
of coded transmission using block codes and hard-decision decoding, separate error control coding for
LSBs and MSBs could be advisable, and the code protecting d(v) should take the different error patterns
depending on H into account (cf. e.g. [26, Table 5.1] for a list of double-adjacent-error-correcting codes).
If convolutional codes or more powerful concatenated codes in combination with iterative decoding are
applied, the syndrome forming (20) should be incorporated in the iterative decoding and demapping
process, cf. e.g. [27].
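The different error patterns can be made concrete by feeding every possible single error in $s^{(v)}$ through the syndrome former. By linearity, an error vector $e$ in the shaped MSBs produces the error $e H^T$ in the recovered data. The sketch below is illustrative only; function names and the block size $n_v = 6$ are assumptions.

```python
def gf2_vecmat(v, M):
    """Row vector v times matrix M (given as a list of rows) over GF(2)."""
    return [sum(vi & mij for vi, mij in zip(v, col)) % 2 for col in zip(*M)]


def h1_transposed(nv):
    """H_1^T of (21): identity I_{nv-1} stacked on top of an all-ones row."""
    eye = [[int(i == j) for j in range(nv - 1)] for i in range(nv - 1)]
    return eye + [[1] * (nv - 1)]


def h2_transposed(nv):
    """H_2^T of (22): row i has ones in columns i-1 and i, where they exist."""
    return [[int(j in (i - 1, i)) for j in range(nv - 1)] for i in range(nv)]


def error_patterns(H_T):
    """Data errors caused by each possible single error in the shaped MSBs."""
    nv = len(H_T)
    return [gf2_vecmat([int(i == pos) for i in range(nv)], H_T) for pos in range(nv)]


nv = 6
for name, H_T in (("H1", h1_transposed(nv)), ("H2", h2_transposed(nv))):
    weights = [sum(e) for e in error_patterns(H_T)]
    print(name, weights, sum(weights))
```

For $n_v = 6$ this prints the weights [1, 1, 1, 1, 1, 5] for $H_1$ and [1, 2, 2, 2, 2, 1] for $H_2$. Both sum to $2n_v - 2 = 10$, so the two matrices distribute the same total number of data-bit errors differently.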
For the case of uncoded transmission let us consider M-ary QAM transmission and the bit-error rate
(BER) after demapping, BERk , for sufficiently large signal-to-noise ratio (SNR) γk in subcarrier k. For
the (m − 1) LSBs per QAM symbol (m , log2 (M)) we find [28, Section 5.2.9]
$$\mathrm{BER}_{k,\mathrm{LSB}} \approx \frac{4\sqrt{M} - 4}{\sqrt{M}\,(m-1)}\, Q\!\left(\sqrt{\frac{3}{M-1}\gamma_k}\right), \tag{23}$$
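where $Q(x) \triangleq \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-t^2/2}\, dt$ is the Gaussian Q-function. Considering the matrices $H_1^T$ in (21) and
$H_2^T$ in (22) as an example, simple counting of the cases of error propagation leads to the approximation
$$\mathrm{BER}_{k,\mathrm{MSB}} \approx \frac{2}{\sqrt{M}}\, \frac{2n_v - 2}{n_v}\, Q\!\left(\sqrt{\frac{3}{M-1}\gamma_k}\right) \tag{24}$$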
for the data bits d(v) representing MSBs in both cases. Assuming further that nv ≫ 1, an approximation
for the overall error rate is given by
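$$\mathrm{BER}_k = \frac{1}{m}\left((m-1)\,\mathrm{BER}_{k,\mathrm{LSB}} + \frac{n_v - 1}{n_v}\,\mathrm{BER}_{k,\mathrm{MSB}}\right) \approx \frac{4}{m}\, Q\!\left(\sqrt{\frac{3}{M-1}\gamma_k}\right). \tag{25}$$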
We note that this approximation is independent of $n_v$ (as long as $n_v \gg 1$) and identical to that for
uncoded Gray-labelled $M$-QAM transmission without shaping for large $\gamma_k$. We therefore conclude that the
asymptotic error rate is not affected by the proposed SI embedding scheme.
To obtain an expression for the average BER, we assume transmission over a Rayleigh fading channel.
We generalize this model in that we consider D-fold diversity with maximal ratio combining achieved either
through the use of multiple receive branches or spreading across subcarriers for time-dispersive channels.
Using (25) and [28, Section 14.4.1] the average BER is approximated by
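$$\mathrm{BER} \approx \frac{4 p^D}{m} \sum_{i=0}^{D-1} \binom{D-1+i}{i} (1-p)^i, \qquad p = \frac{1}{2}\left(1 - \sqrt{\frac{\bar{\gamma}}{\beta + \bar{\gamma}}}\right), \tag{26}$$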
where $\beta \triangleq 2(M-1)/3$ and $\bar{\gamma} \triangleq E\{\gamma_k\}/D = (\bar{E}_s/N_0)/D$, and $\bar{E}_s$ and $N_0$ denote the average received
energy per symbol and the equivalent complex baseband noise power spectral density, respectively.
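The approximations (25) and (26) are straightforward to evaluate numerically. The sketch below uses the identity $Q(x) = \tfrac{1}{2}\,\mathrm{erfc}(x/\sqrt{2})$; the function names and the SNR points in the demo are assumptions, and `es_n0` denotes the linear (not dB) ratio $\bar{E}_s/N_0$.

```python
import math


def q_function(x):
    """Gaussian Q-function, expressed through the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def ber_awgn(gamma_k, M):
    """Approximation (25): (4/m) * Q(sqrt(3 * gamma_k / (M - 1)))."""
    m = math.log2(M)
    return 4 / m * q_function(math.sqrt(3 * gamma_k / (M - 1)))


def ber_rayleigh(es_n0, M, D):
    """Approximation (26) for D-fold diversity with maximal ratio combining."""
    m = math.log2(M)
    beta = 2 * (M - 1) / 3
    gamma_bar = es_n0 / D
    p = 0.5 * (1 - math.sqrt(gamma_bar / (beta + gamma_bar)))
    return 4 * p ** D / m * sum(
        math.comb(D - 1 + i, i) * (1 - p) ** i for i in range(D)
    )


for snr_db in (10, 20, 30):
    es_n0 = 10 ** (snr_db / 10)
    print(snr_db, ber_awgn(es_n0, 16),
          ber_rayleigh(es_n0, 16, D=1), ber_rayleigh(es_n0, 16, D=2))
```

Since both expressions are high-SNR approximations, the values at low SNR should be read with corresponding caution.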
6 Results and Discussion
In this section, we present performance results for PAR reduction using PTS with the techniques introduced
in the previous sections. We first discuss the PAR reduction performance and PAR-reduction vs. complexity
tradeoff for the various search strategies devised in Section 3. Then, we compare PTS with the exact and
the approximate objective function developed in Section 4. Finally, BER results when applying the new SI
embedding scheme from Section 5 are shown.
As relevant and typical system parameters, we choose 16QAM modulation and N = 256 subcarriers,
which are divided into V = 16 subblocks employing random partition [3, 8].
6.1 Performance of PTS with Different Heuristics
Structure of the CO problem: First, it is insightful to illustrate the structure of the CO problem (9),
especially the dependency of the objective function f (b) on b. For this purpose, we employ the CDF
$G_b(\zeta_0)$ of the PAR $\zeta(x) = f(b)/\sigma_x^2$, treating the rotation vector $b$ as a random variable uniformly
distributed over $\{-1, +1\}^{V-1}$ for a fixed data vector $X$. $G_b(\zeta_0)$ is plotted in Fig. 4 for 10 different
realizations of X. We observe that a fairly large fraction of the vectors b leads to a moderately low PAR. For example, a
PAR of less than 8 dB is achieved for 20% to 50% of all vectors b. However, the flatness of the left end
of the graphs shows that only very few vectors lead to a very low PAR close to the minimum PAR value.
We therefore conclude that it is hard to achieve close-to-optimum PAR reduction with an unstructured
search strategy, i.e., random search (see Section 3.2).
PTS with random search: Fig. 5 shows the CCDF of the PAR when random search with I = [2, 4, 16,
64, 256, 1024, 4096] searches is applied. Also included are the CCDFs for OFDM without PAR reduction
and for optimal PTS with full enumeration of all $I = 2^{V-1} = 32768$ vectors $b$. In good agreement with
the conclusions from Fig. 4, it can be seen that (i) considerable improvements in PAR are already achieved
with relatively few random searches, e.g., for I ≤ 64, however (ii) closing the remaining gap of about 1 dB
to optimum PTS appears quite difficult to achieve with random search unless the number of searches and
thus complexity are increased and become comparable to those of PTS with full search.
PTS with simulated annealing: In order to apply simulated annealing, we need to determine a reasonable
range of the temperature parameter T (see Section 3.3.1 and Fig. 1). For this purpose, we consider
the annealing curve E{f(b̃)}/σ_x^2, which is the value of f(b̃)/σ_x^2 averaged with respect to b̃ at a fixed
temperature T vs. T/σ_x^2 [20]. The annealing curves for three different data vectors X are plotted in
Fig. 6. We observe that at about T/σ_x^2 > 10 almost all proposed moves will be accepted, i.e., r < e^{−δ/T}
(see Fig. 1), and the value of E{f(b̃)} is close to the average over all possible solutions b. The annealing
process becomes random and no improvement over random search is expected. On the other hand,
hardly any uphill moves will be accepted for T/σ_x^2 < 0.1, i.e., b̃ is replaced by a new trial vector b only
if the corresponding peak value f(b) is smaller than f(b̃). In this case, simulated annealing reduces to
bit-flipping search. These results suggest that temperatures in the range of 0.1 ≤ T/σ_x^2 < 10 should be
applied for simulated annealing for PTS. After some further experimentation, we found T_s/σ_x^2 = 1 and
T_f/σ_x^2 = 0.2 for the geometric and Gaussian-like cooling, with a = 0.5 and c = 0.4 for the latter, and
T/σ_x^2 = 0.5 for constant temperature annealing as appropriate values.
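The effect of the temperature on the acceptance rule r < e^{−δ/T} can be illustrated numerically. The sketch below is only an illustration: the uphill step δ = σ_x^2 is a hypothetical value, and the geometric schedule is written in one common form (a constant ratio between successive temperatures), which is an assumption and not necessarily the exact schedule of Section 3.3.1.

```python
import math

def acceptance_probability(delta, T):
    """Metropolis rule: downhill moves are always accepted, uphill ones with e^(-delta/T)."""
    return 1.0 if delta < 0 else math.exp(-delta / T)

def geometric_schedule(T_s, T_f, I):
    """I temperatures decaying from T_s to T_f with a constant ratio (assumed form)."""
    ratio = (T_f / T_s) ** (1.0 / (I - 1))
    return [T_s * ratio ** i for i in range(I)]

delta = 1.0  # hypothetical uphill step, in units of sigma_x^2
for T in (10.0, 1.0, 0.1):  # temperatures in units of sigma_x^2
    print(T, acceptance_probability(delta, T))

temps = geometric_schedule(1.0, 0.2, 1024)  # T_s and T_f as quoted in the text
print(temps[0], temps[-1])
```

For this step size the uphill move is accepted with probability above 0.9 at T/σ_x^2 = 10 and below 10^{−4} at T/σ_x^2 = 0.1, which matches the two limiting regimes described above.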
Revised as Paper for publication in the IEEE Transactions on Wireless Communications, November, 2006.
In order to compare the performances for the three cooling schedules, we plot the PAR of the
best-so-far solution b̂ as a function of the number of iterations in Fig. 7 for three different and representatively
chosen data vectors X and I = 1024. For the first data vector (graphs at the top) geometric cooling
yields the best results for simulated annealing with more than 50 iterations (i.e., searches). For example, it
outperforms constant temperature and Gaussian-like cooling by 0.8 dB and 0.6 dB, respectively. However,
the relative performances are switched for the second and third data vector, where, respectively, constant
temperature and Gaussian-like cooling perform best. Since the data X shapes the cost function of the
CO problem, the optimum cooling schedule for simulated annealing also depends on X, and as seen in
Fig. 7, no single cooling schedule is optimal in all cases. In fact, considering average performance, e.g.,
the CCDF F_ζ(ζ_0), we found that all three cooling schedules yield very similar PAR reduction performance.
We therefore only show average PAR results for simulated annealing with constant temperature in the
following.
In Fig. 8, simulated annealing is compared with random search for I = [16, 64, 256, 1024, 4096]
searches. Except for the case of I = 16 and F_ζ(ζ_0) < 10^{−2}, simulated annealing provides consistent
improvements over random search with some significant gains in the low PAR range. In particular,
simulated annealing needs to perform only about one-fourth of the number of the searches required for
random search to achieve PAR reduction close to that of optimal PTS with full search.
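Comparisons of this kind rest on an empirical estimate of the CCDF F_ζ(ζ_0) = Pr{ζ > ζ_0}. A minimal sketch of such an estimate for OFDM without PAR reduction is given below. The toy sizes (N = 16 subcarriers, oversampling L = 4, 200 symbols), the zero-padded IDFT, and the function names are assumptions for illustration; far more symbols are needed to resolve probabilities as low as 10^{−3}.

```python
import cmath
import math
import random

def par_db(X, L=4):
    """PAR in dB of one 16QAM OFDM symbol, using an L-times oversampled IDFT."""
    N, M = len(X), len(X) * L
    x = [sum(X[n] * cmath.exp(2j * cmath.pi * n * k / M) for n in range(N)) / N
         for k in range(M)]
    sigma2 = 10.0 / N  # average power for 16QAM levels {-3,-1,1,3} and 1/N scaling
    return 10 * math.log10(max(abs(s) ** 2 for s in x) / sigma2)

def empirical_ccdf(values, thresholds):
    """Fraction of values strictly exceeding each threshold."""
    return [sum(v > t for v in values) / len(values) for t in thresholds]

rng = random.Random(0)
levels = (-3, -1, 1, 3)
pars = [par_db([complex(rng.choice(levels), rng.choice(levels)) for _ in range(16)])
        for _ in range(200)]
thresholds = [6, 7, 8, 9, 10, 11]  # PAR thresholds in dB
for t, p in zip(thresholds, empirical_ccdf(pars, thresholds)):
    print(t, p)
```

The same estimator applies to the PAR values obtained after any of the PTS search strategies, so curves for different heuristics can be compared on equal terms.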
Comparison of PTS with different heuristics: Finally, we compare all the search strategies introduced
in Section 3. For the case of tabu search, only the tabu tenure B needs to be optimized (see Fig. 1).
Numerical results, which are omitted for the sake of brevity, showed that a relatively wide range of values
for B is optimum with respect to PAR reduction performance. Therefore, B = 9 is chosen for all the
following results.
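A runnable sketch of the tabu search of Fig. 1 is given below for illustration. Several details are assumptions, since they are not fixed by the text here: the neighborhood N(b̃) is taken as all vectors at Hamming distance 1, the tabu list stores the last B flipped positions, subblocks are interleaved, and the sizes (N = 32, V = 8, B = 3) are toy values rather than the V = 16, B = 9 setting of the paper. All function names are hypothetical.

```python
import cmath
import random
from collections import deque

def partial_sequences(X, V, L=4):
    """Oversampled IDFT of each of V interleaved subblocks of X (assumed partitioning)."""
    N, M = len(X), len(X) * L
    return [[sum(X[n] * cmath.exp(2j * cmath.pi * n * k / M) for n in range(v, N, V)) / N
             for k in range(M)] for v in range(V)]

def peak_power(b, seqs):
    """f(b): peak power of the combined signal; the first phase factor is fixed to +1."""
    signs = [1] + list(b)
    return max(abs(sum(s * x[k] for s, x in zip(signs, seqs))) ** 2
               for k in range(len(seqs[0])))

def tabu_search(seqs, K, B):
    """Tabu search with tenure B; positions flipped in the last B moves are forbidden."""
    n_bits = len(seqs) - 1
    if B >= n_bits:
        raise ValueError("tabu tenure must leave at least one admissible move")
    cur = [1] * n_bits
    best_b, best_f = cur[:], peak_power(cur, seqs)
    tabu = deque(maxlen=B)  # oldest entry drops out automatically
    for _ in range(K - 1):
        moves = []
        for pos in range(n_bits):  # acceptance set: neighbors that are not tabu
            if pos not in tabu:
                b = cur[:]
                b[pos] = -b[pos]
                moves.append((peak_power(b, seqs), pos))
        f_cur, pos = min(moves)  # best admissible neighbor, even if it is worse than cur
        cur[pos] = -cur[pos]
        if f_cur < best_f:
            best_b, best_f = cur[:], f_cur
        tabu.append(pos)
    return best_b, best_f

rng = random.Random(1)
X = [complex(rng.choice((-3, -1, 1, 3)), rng.choice((-3, -1, 1, 3))) for _ in range(32)]
seqs = partial_sequences(X, V=8)
best_b, best_f = tabu_search(seqs, K=20, B=3)
print(best_b, best_f / peak_power([1] * 7, seqs))
```

Note that in this sketch each of the K − 1 iterations evaluates f(b) for every admissible neighbor, so the number of searches is larger than the number of iterations.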
A fair comparison of the performance-complexity tradeoffs for the different PTS search strategies
is provided in Fig. 9, where the numbers of searches required to achieve F_ζ(ζ_0) = 10^{−3} are plotted as
a function of ζ_0. These numbers are averages for the extended bit-flipping search and local search, since
these algorithms terminate early if no further improvement in peak-power reduction is achieved. It is
quite remarkable that the “trivial” random search heuristic performs at least as well as, and often
outperforms, the bit-flipping and local search heuristics, originally proposed in [4] and [5], respectively.
Only the application of the simulated annealing and tabu search metaheuristics to solve the CO problem
of PTS yields an enhanced tradeoff in the low-PAR range (e.g., at 7 dB), where about four times fewer
searches are required than for random search to achieve the same PAR reduction. It should be noted
that the curves in Fig. 9 depend on the target value of F_ζ(ζ_0), and a somewhat different comparison may
result for values other than F_ζ(ζ_0) = 10^{−3} (for example, see the curves for I = 16 in Fig. 8).
6.2 Objective Function Approximation for PTS
Results for PTS when applying the objective function g(b), introduced in Section 4 as an approximation of
the true peak power f(b), are presented in Fig. 10. PTS using simulated annealing with I = 1024 and g(b)
with P = 4, 8, 16 (see Eq. (13)) is considered. For P = 4, no multiplications are needed to set up and solve
the approximate CO problem, compared to 2NLI multiplications required for the original CO problem, and
only 0.6 dB in PAR reduction is lost at F_ζ(ζ_0) = 10^{−3}. When P = 8, 32NL multiplications are required
to set up the approximate CO problem, and PAR reduction is still improved compared to optimizing f(b)
in 64 searches, which involve 128NL multiplications, i.e., four times as many. With P = 16 we see
that the approximation yields essentially the same performance as the original objective function, but the
number of multiplications is reduced by a factor of 10.
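The idea of replacing the circle |x| = const by a P-sided polygon (Fig. 2) can be sketched numerically. The measure below, the largest projection of x onto P equally spaced directions, is one common polygon-based magnitude approximation. It is an assumption chosen for illustration and is not claimed to be identical to g(b) in Eq. (13); the function name is hypothetical.

```python
import cmath
import math
import random

def polygon_magnitude(x, P):
    """Largest projection of x onto P equally spaced directions (polygon-based magnitude)."""
    return max((x * cmath.exp(-2j * cmath.pi * p / P)).real for p in range(P))

rng = random.Random(0)
samples = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(1000)]
for P in (4, 8, 16):
    # Worst-case ratio to the true magnitude; it cannot fall below cos(pi/P).
    worst = min(polygon_magnitude(x, P) / abs(x) for x in samples)
    print(P, worst, math.cos(math.pi / P))
```

For P = 4 this measure reduces to max(|Re x|, |Im x|), which can be evaluated with comparisons only. This is in line with the statement above that no multiplications are needed in that case. As P grows, the lower bound cos(π/P) approaches 1 and the approximation tightens.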
6.3 SI Embedding Scheme for PTS
In order to verify the analytical approximations for the BER if the new SI embedding scheme is applied
for PTS, Fig. 11 shows the BER performances for uncoded OFDM transmission with (i) conventional
Gray labelled 16QAM (i.e., either perfect explicit SI is available or PAR reduction is not applied) and (ii)
16QAM with sign-bit mapping and the novel SI embedding scheme. Unfaded additive white Gaussian noise
(AWGN) and fading AWGN channels with D-fold diversity are considered, and simulation and numerical
results using Eqs. (25) and (26) are plotted. It can be seen that (i) numerical and simulation results match
very well and (ii) the loss in BER performance compared to OFDM without SI embedding is very small
(about a factor of 1.2 in BER). This, together with the other features outlined in Section 5, renders the new
SI embedding scheme an interesting solution for OFDM systems employing PTS-based PAR reduction.
7 Conclusion
In this paper, we revisited PTS for PAR reduction in OFDM systems and tackled the two main challenges
of PTS, which are computational complexity and transmission of SI. We formulated the sequence search
of PTS as a particular CO problem, which allowed us (i) to unify various heuristics for finding the optimal
phase vector proposed in the PTS literature and (ii) to adapt efficient search algorithms known from the
CO literature to PTS. A comparison of the different search strategies led to the interesting conclusion
that simple random search performs best for relatively small numbers of sequence searches. The newly
proposed simulated annealing and tabu search methods showed their strengths if close-to-optimum PAR
reduction is desired. We also proposed a modified PTS objective function, which reduces the number
of multiplications required for PTS, while yielding excellent PAR reduction results. Finally, we presented
a novel SI embedding scheme, which is transparent to PAR reduction, requires only minimal additional
bandwidth and complexity, and practically does not degrade the BER performance.
References
[1] S. Han and J. Lee, “An overview of peak-to-average power ratio reduction techniques for multicarrier
transmission,” IEEE Trans. Wireless Commun., vol. 12, no. 2, pp. 56–65, Apr. 2005.
[2] S.H. Müller and J.B. Huber, “OFDM with reduced peak-to-average power ratio by optimum combining of partial transmit sequences,” Electron. Lett., vol. 33, no. 5, pp. 368–369, Feb. 1997.
[3] ——, “A novel peak power reduction scheme for OFDM,” in Proc. of Intern. Symp. on Personal,
Indoor and Mobile Radio Communications (PIMRC), 1997, pp. 1090–1094.
[4] L. J. Cimini, Jr and N. R. Sollenberger, “Peak-to-average power ratio reduction of an OFDM signal
using partial transmit sequences,” IEEE Commun. Lett., vol. 4, no. 3, pp. 86–88, Mar. 2000.
[5] S. H. Han and J. H. Lee, “PAPR reduction of OFDM signals using a reduced complexity PTS
technique,” IEEE Sig. Processing Lett., vol. 11, no. 11, pp. 887–890, Nov. 2004.
[6] C.-C. Feng, C.-Y. Wang, C.-Y. Lin, and Y.-H. Hung, “Protection and transmission of side information
for peak-to-average power ratio reduction of an OFDM signal using partial transmit sequences,” in
Proc. of Veh. Technol. Conf. (VTC), vol. 4, Oct. 2003, pp. 2461–2465.
[7] A. D. S. Jayalath and C. Tellambura, “Side information in PAR reduced PTS-OFDM signals,” in
Proc. of Intern. Symp. on Personal, Indoor and Mobile Radio Communications (PIMRC), vol. 1,
Sept. 2003, pp. 226–230.
[8] S. H. Müller-Weinfurtner, OFDM for wireless communications: Nyquist windowing, peak-power reduction, and synchronization. Aachen: Shaker Verlag, 2000.
[9] L. J. Cimini, Jr. and N. R. Sollenberger, “Peak-to-average power ratio reduction of an OFDM
signal using partial transmit sequences with embedded side information,” in Proc. of IEEE Global
Telecommun. Conf. (GLOBECOM), vol. 2, Nov.-Dec. 2000, pp. 746–750.
[10] C.-C. Feng, Y.-T. Wu, and C.-Y. Chi, “Embedding and detection of side information for
peak-to-average power ratio reduction of an OFDM signal using partial transmit sequences,” in Proc. of Veh.
Technol. Conf. (VTC), vol. 2, Oct. 2003, pp. 1354–1358.
[11] A. D. S. Jayalath and C. Tellambura, “SLM and PTS peak-power reduction of OFDM signals without
side information,” IEEE Trans. Wireless Commun., vol. 4, pp. 2006–2013, Sept. 2005.
[12] W. Henkel and B. Wagner, “Another application for trellis shaping: PAR reduction for DMT
(OFDM),” IEEE Trans. Commun., vol. 48, no. 9, pp. 1471–1476, Sept. 2000.
[13] H. Ochiai, “A novel trellis-shaping design with both peak and average power reduction for OFDM
systems,” IEEE Trans. Commun., vol. 52, no. 11, pp. 1916–1926, Nov. 2004.
[14] C. Tellambura, “Improved phase factor computation for the PAR reduction of an OFDM signal using
PTS,” IEEE Commun. Lett., vol. 5, no. 4, pp. 135–137, Apr. 2001.
[15] ——, “Computation of the continuous-time PAR of an OFDM signal with BPSK subcarriers,” IEEE
Commun. Lett., vol. 5, no. 5, pp. 185–187, May 2001.
[16] J. Tellado, Multicarrier modulation with low PAR: applications to DSL and wireless. Norwell, MA,
USA: Kluwer Academic Publishers, 2000.
[17] C. R. Reeves, Ed., Modern heuristic techniques for combinatorial problems. London: Blackwell
Scientific Publications, 1993.
[18] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, “Optimization by simulated annealing,” Science, vol.
220, no. 4598, pp. 671–680, 1983.
[19] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, and A. H. Teller, “Equation of state calculations
by fast computing machines,” Journal of Chemical Physics, vol. 21, no. 6, pp. 1087–1092, June 1953.
[20] S. R. White, “Concepts of scale in simulated annealing,” in Proc. IEEE Int. Conference on Computer
Design, 1984, pp. 646–651.
[21] M. M. Atiqullah, “An efficient simple cooling schedule for simulated annealing,” in Proc. Int. Conf.
Computational Science and Its Applications (ICCSA), Assisi, Italy, 2004, pp. 396–404.
[22] F. Glover, “Future paths for integer programming and links to artificial intelligence,” Comput. Oper.
Res., vol. 13, no. 5, pp. 533–549, 1986.
[23] F. Glover and M. Laguna, Tabu Search. Norwell, MA, USA: Kluwer Academic Publishers, 1997.
[24] X. Chen and T. Parks, “Design of FIR filters in the complex domain,” IEEE Trans. Acoust., Speech,
Signal Processing, vol. 35, no. 2, pp. 144–153, Feb. 1987.
[25] B.S. Krongold and D.L. Jones, “An active-set approach for OFDM PAR reduction via tone reservation,” IEEE Trans. Signal Processing, vol. 52, no. 2, pp. 495–509, Feb. 2004.
[26] R. E. Blahut, Theory and Practice of Error Control Codes. Reading, Massachusetts: Addison–
Wesley, 1983.
[27] C. Lee, S. Ng, L. Piazzo, and L. Hanzo, “TCM, TTCM, BICM and iterative BICM assisted
OFDM-based digital video broadcasting to mobile receivers,” in Proc. of Veh. Technol. Conf. (VTC), Rhodes,
Greece, May 2001, pp. 732–736.
[28] J. Proakis, Digital Communications, 4th ed. New York: McGraw–Hill, 2001.
// Simulated Annealing for PTS
Input: temperature T
Select initial vector b̃, best solution b̂ = b̃
for (i = 1, ..., I − 1)
    Select next trial vector b from b̃ by cyclic bit-flipping
    Calculate δ = f(b) − f(b̃)
    Generate a uniform random number r in (0, 1)
    if (δ < 0 or r < e^{−δ/T})
        b̃ = b
    end if
    if (f(b) < f(b̂))
        b̂ = b
    end if
    Update T according to cooling schedule
end for
Return b̂

// Tabu Search for PTS
Input: tabu tenure B
Initial tabu list L(B) = ∅
Select initial vector b̃, best solution b̂ = b̃
for (k = 1, ..., K − 1)
    Create acceptance set A = N(b̃) \ L(B)
    Select new solution b̃ = argmin_{b∈A} {f(b)}
    if (f(b̃) < f(b̂))
        b̂ = b̃
    end if
    Update tabu list L(B)
end for
Return b̂

Figure 1: Pseudocode for metaheuristics proposed for PTS. Top: simulated annealing, bottom: tabu
search.
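For illustration, the simulated annealing pseudocode of Fig. 1 can be turned into a runnable sketch as follows. The interleaved subblock partitioning, the zero-padded oversampled IDFT, the toy sizes (N = 32, V = 8, L = 4), the all-ones initial vector, the theoretical value used for σ_x^2, and all function names are assumptions made for this example only. The default `cooling` argument keeps T constant; another schedule can be passed as a function.

```python
import cmath
import math
import random

def partial_sequences(X, V, L=4):
    """Oversampled IDFT of each of V interleaved subblocks of X (assumed partitioning)."""
    N, M = len(X), len(X) * L
    return [[sum(X[n] * cmath.exp(2j * cmath.pi * n * k / M) for n in range(v, N, V)) / N
             for k in range(M)] for v in range(V)]

def peak_power(b, seqs):
    """f(b): peak power of the combined signal; the first phase factor is fixed to +1."""
    signs = [1] + list(b)
    return max(abs(sum(s * x[k] for s, x in zip(signs, seqs))) ** 2
               for k in range(len(seqs[0])))

def simulated_annealing(seqs, I, T, rng, cooling=lambda T: T):
    """Simulated annealing with cyclic bit-flipping; returns the best-so-far vector and f."""
    n_bits = len(seqs) - 1
    cur = [1] * n_bits
    f_cur = peak_power(cur, seqs)
    best_b, best_f = cur[:], f_cur
    for i in range(1, I):
        b = cur[:]
        pos = (i - 1) % n_bits  # cyclic bit-flipping: one position per iteration
        b[pos] = -b[pos]
        f_b = peak_power(b, seqs)
        delta = f_b - f_cur
        if delta < 0 or rng.random() < math.exp(-delta / T):
            cur, f_cur = b, f_b  # accept the move (possibly uphill)
        if f_b < best_f:
            best_b, best_f = b, f_b
        T = cooling(T)
    return best_b, best_f

rng = random.Random(1)
N, V = 32, 8
X = [complex(rng.choice((-3, -1, 1, 3)), rng.choice((-3, -1, 1, 3))) for _ in range(N)]
seqs = partial_sequences(X, V)
sigma2 = 10.0 / N  # average signal power for 16QAM levels {-3,-1,1,3} and 1/N scaling
f0 = peak_power([1] * (V - 1), seqs)
best_b, best_f = simulated_annealing(seqs, I=256, T=0.5 * sigma2, rng=random.Random(3))
print(10 * math.log10(f0 / sigma2), 10 * math.log10(best_f / sigma2))  # PAR in dB before/after
```

The test `delta < 0` is evaluated first, so the exponential is only computed for non-negative δ and cannot overflow.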
Figure 2: Approximation of a circle with a P-sided symmetrical polygon (shown for P = 4, 8, 12, 16).
[Block diagram. Transmitter: blocks H^{−T} and Mapping (MSB and LSB inputs); signals: data, c^(v) (from
PTS algorithm), d^(v), u^(v), s^(v), X^(v). Receiver: blocks Demapping (MSB and LSB outputs) and H^T;
signals: X̂^(v), ŝ^(v), d̂^(v).]
Figure 3: Block diagram of SI embedding scheme for one subblock. Top: transmitter side, bottom:
receiver side.
[Plot. Vertical axis: G_b(ζ_0), from 0 to 1. Horizontal axis: 10 log10(ζ_0) [dB], from 6 to 12. One curve
per realization of X.]
Figure 4: CDF G_b(ζ_0) of ζ(x) = f(b)/σ_x^2, considering b as a random variable uniformly distributed in
{−1, +1}^{V−1} and fixed X. Plot shows G_b(ζ_0) for 10 different realizations of X.
[Plot. Vertical axis: F_ζ(ζ_0), logarithmic, from 10^0 down to 10^{−3}. Horizontal axis: 10 log10(ζ_0) [dB],
from 6 to 11. Curves: original OFDM, RS with [2, 4, 16, 64, 256, 1024, 4096] searches, optimal PTS.]
Figure 5: CCDF of PAR for random search (RS) with different numbers of searches. Also shown:
OFDM without PAR reduction and optimal PTS with 2^15 searches.
[Plot. Vertical axis: E{f(b̃)}/σ_x^2, from 3.5 to 7.5. Horizontal axis: T/σ_x^2, logarithmic, from 10^{−2}
to 10^2. One curve per realization of X.]
Figure 6: Normalized average of f(b̃) with respect to b̃ for simulated annealing at fixed T and three
(arbitrarily chosen and) different data vectors X as function of T/σ_x^2.
[Plot with three panels: 1st, 2nd, and 3rd data vector. Vertical axis: f(b̂)/σ_x^2, from 4 to 6. Horizontal
axis: iteration i, from 100 to 1000. Curves: geometric cooling, Gaussian-like cooling, constant temperature.]
Figure 7: PAR f(b̂)/σ_x^2 of best-so-far solution b̂ for simulated annealing with three different cooling
schedules vs. number of iterations i. Results are plotted for three (representatively chosen and)
different data vectors X.
[Plot. Vertical axis: F_ζ(ζ_0), logarithmic, from 10^0 down to 10^{−3}. Horizontal axis: 10 log10(ζ_0) [dB],
from 6 to 8.5. Curves: RS and SA with 16, 64, 256, 1024, and 4096 searches, optimal PTS.]
Figure 8: CCDF of PAR for simulated annealing (SA) with constant temperature and random search
(RS) for different numbers of searches. Also shown: optimal PTS with 2^15 searches.
[Plot. Vertical axis: (average) number of searches required for F_ζ(ζ_0) = 10^{−3}, logarithmic, from 10^1
to 10^3. Horizontal axis: 10 log10(ζ_0) [dB], from 6.8 to 8.8. Curves: RS, SA, TS, BFS, LS d_H = 1,
LS d_H = 2.]
Figure 9: (Average) number of searches required to achieve F_ζ(ζ_0) = 10^{−3} for PTS with different
search strategies (random search (RS), simulated annealing (SA), tabu search (TS), bit-flipping
search (BFS), local search (LS) with Hamming distances d_H = 1 and d_H = 2).
[Plot. Vertical axis: F_ζ(ζ_0), logarithmic, from 10^0 down to 10^{−3}. Horizontal axis: 10 log10(ζ_0) [dB],
from 6 to 11.5. Curves: original OFDM; PTS with true peak power as objective function (SA with 64
searches, SA with 1024 searches, optimal PTS); PTS with approximated peak power as objective function
(P = 4, P = 8, P = 16).]
Figure 10: CCDF of PAR for simulated annealing (SA) with I = 1024 searches, using approximation
of the peak power with P = 4, 8, 16. For comparison: SA with 64 searches, optimal PTS, and OFDM
without PAR reduction.
[Plot. Vertical axis: BER, logarithmic, from 10^0 down to 10^{−6}. Horizontal axis: 10 log10(Ē_s/N_0) [dB],
from 0 to 30. Curves: SI embedding (numerical), SI embedding (simulation), no SI embedding; shown for
AWGN and for D = 1, D = 2, D = 4.]
Figure 11: BER vs. 10 log10(Ē_s/N_0) for OFDM with and without SI embedding. AWGN and Rayleigh
fading with D-fold diversity reception.