Adaptive Lossless Image Coding

Download as pdf or txt
Download as pdf or txt
You are on page 1of 8

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 45, NO.

4, APRIL 1997

437

Context-Based, Adaptive, Lossless Image Coding


Xiaolin Wu, Senior Member, IEEE, and Nasir Memon, Member, IEEE
AbstractWe propose a context-based, adaptive, lossless image
codec (CALIC). The codec obtains higher lossless compression of
continuous-tone images than other lossless image coding techniques in the literature. This high coding efficiency is accomplished with relatively low time and space complexities. CALIC
puts heavy emphasis on image data modeling. A unique feature
of CALIC is the use of a large number of modeling contexts
(states) to condition a nonlinear predictor and adapt the predictor
to varying source statistics. The nonlinear predictor can correct
itself via an error feedback mechanism by learning from its
mistakes under a given context in the past. In this learning
process, CALIC estimates only the expectation of prediction
errors conditioned on a large number of different contexts rather
than estimating a large number of conditional error probabilities.
The former estimation technique can afford a large number of
modeling contexts without suffering from the context dilution
problem of insufficient counting statistics as in the latter approach, nor from excessive memory use. The low time and space
complexities are also attributed to efficient techniques for forming
and quantizing modeling contexts.
Index TermsAdaptive prediction, entropy coding, image compression, statistical context modeling.

I. INTRODUCTION

ECENT years have seen an increased level of research


in image compression. Most of the effort, however, has
focused on the development of lossy compression techniques.
Certain applications, such as medical imaging, image archiving, and remote sensing, require or desire lossless compression.
Furthermore, lossless compression is often a necessary last
step in many lossy image compression systems. Hence, the
development of efficient and effective lossless image compression is an important research topic that has many potential
applications.
Statistical modeling of the source being compressed plays
a central role in any data compression system. Fitting a given
source with statistical models is a difficult endeavor pursued
by researchers in a wide range of disciplines including source
coding, machine learning, game theory, and estimation. It is
not the intention of this paper to give this subject a thorough
theoretical treatment (the reader is referred to [6] and [9] for
concepts and background on universal statistical data modeling). Instead, we study practical algorithmic techniques to use
the paradigm of statistical modeling efficiently and effectively
for the purpose of lossless compression of continuous-tone
Paper approved by M. R. Civanlar, the Editor for Image Processing of
the IEEE Communications Society. Manuscript received February 20, 1996;
revised July 8, 1996. This work was supported in part by the Natural Sciences
and Engineering Research Council of Canada.
X. Wu is with the Department of Computer Science, University of Western
Ontario, London, Ont., N6A 5B7 Canada.
N. Memon is with the Department of Computer Science, Northern Illinois
University, De Kalb, IL 60115 USA.
Publisher Item Identifier S 0090-6778(97)02784-0.

images. Such a study on the computational aspect of lossless


image coding is warranted because the problem has some
unique operational difficulties [4], [9]. In this paper, we
illustrate these difficulties, and provide practical solutions that
were incorporated in the design of an efficient and effective
context-based lossless image coding schemeCALIC.
CALIC was designed in response to the ISO/IEC JTC
1/SC 29/WG 1 (JPEG) call soliciting proposals for a new
international standard for lossless compression of continuoustone images. In the initial evaluation of the nine proposals
submitted at the JPEG meeting in Epernay, France, July 1995,
CALIC had the lowest lossless bit rates in six of seven
image classes: medical, aerial, prepress, scanned, video, and
compound document, and the third lowest bit rate in the
class of computer-generated images. CALIC gave an average
lossless bit rate of 2.99 b/pixel on the 18 8-b test images
selected by JPEG for proposal evaluation, compared with an
average bit rate of 3.98 b/pixel for lossless JPEG1 on the
same set of test images. Even more encouragingly, CALIC
obtained a 2% lower bit rate than the recent UCM (Universal
Context Modeling) method developed by Weinberger et al. [9].
The latter is a principled but complex context-based image
coding technique that is considered to be indicative of the
lower bound on lossless bit rates achievable by other more
practical methods.
Unlike UCM, CALIC is intended to be a practical lossless
image codec. In order to attain this goal of practicality,
new efficient algorithmic techniques have been developed
for context formation, quantization, and modeling. Although
conceptually more elaborate than many existing lossless image
coders, CALIC is algorithmically quite simple, involving
mostly integer arithmetic and simple logic. The prediction
and modeling components of the encoder and decoder require
only 1.52 CPU s on 512 512 continuous-tone images and
2.28 CPU s on 720 576 continuous-tone images when being
executed on a SUN SPARC10 workstation. Both the encoding
and decoding algorithms are suitable for parallel and pipelined
hardware implementation while supporting sequential buildup. Furthermore, CALIC is symmetric, meaning that the
encoder and decoder have similar time and space complexities.
The binary executables of CALIC are available for common
hardware platforms by anonymous ftp.2
This paper is structured as follows. In the next section,
we give a schematic description of CALIC to provide an
overview of the entire coding system. Then, in Sections
IIIVIII, we elaborate on the individual components of the
CALIC coding system, and present the algorithms involved
1 Implementations for lossless JPEG on
available.
2 ftp://ftp.csd.uwo.edu/pub/from

0090-6778/97/$10.00 1997 IEEE

wu.

>8

b images are not publicly

438

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 45, NO. 4, APRIL 1997

Fig. 1. Schematic description of CALICs encoder.

and the rationales and design decisions behind the algorithms.


Finally, in Sections IX and X, we present entropy results and
actual bit rates that CALIC obtained on the ISO test images,
and comparisons between CALIC and some popular existing
lossless image coders.
II. SYSTEM OVERVIEW
CALIC is a sequential coding scheme that encodes and
decodes in raster scan order with a single pass through the
image. The coding process uses prediction and context templates that involve only the two previous scan lines of coded
pixels. Consequently, the encoding and decoding algorithms
require a simple two-line buffer that holds the two rows of
pixels that immediately precede the current pixel. A schematic
description of CALICs encoder is given in Fig. 1. Decoding
is just the reverse process.
CALIC operates in two modes: binary and continuous tone.
Binary mode is for situations in which the current locality of
the input image has no more than two distinct intensity values,
hence it is designed for a more general class of images than the
traditional class of black and white images. The system selects
one of the two modes on the fly during the coding process,
depending on the context of the current pixel. Mode selection
is automatic, and completely transparent to the user. In binary
mode, a context-based adaptive ternary arithmetic coder is
used to code three symbols, including an escape symbol which
triggers an exit from binary mode.
In continuous-tone mode, the system has four major integrated components: gradient-adjusted prediction (GAP), context selection and quantization, context modeling of prediction
errors, and entropy coding of prediction errors. First, the
gradient of the intensity function at the current pixel
is
estimated. A gradient-adjusted prediction of is made. A
detailed description of GAP is presented in Section III. As
illustrated by Fig. 1, the predicted value is further adjusted
via an error feedback loop of one-step delay. This results in
an adaptive, context-based, nonlinear prediction , as we will
explain shortly.
The residue of the predictor is entropy-coded based on
eight estimated conditional probabilities in eight different contexts. These eight contexts, called coding contexts, are formed
by quantizing an error energy estimator , a random variable,
into eight bins. These quantizer bins partition prediction error

terms into eight classes by the expected error magnitude. The


described procedures in relation to the system are identified by
the blocks labeled Context Quantization and Conditional
Probabilities Estimation in Fig. 1. Details of this context
quantization scheme in association with entropy coding are
given in Section IV.
Performance of the GAP predictor can be significantly
improved via context modeling. Hence, to further improve
coding efficiency, we combine 144 texture contexts with four
error energy contexts
to form a total of 576 compound
contexts. Texture contexts are formed by a quantization of a
local neighborhood of pixel values to a binary vector. The
expectation of prediction error conditioned on each compound
context is estimated. This estimate , which is the sample
mean conditioned on the current compound context, is added
to correct its bias in the given
to the GAP predictor
compound context, generating an improved context-sensitive
prediction
. In Fig. 1, these steps are conceptually
related to the whole system by the blocks labeled Context
Quantization and Error Modeling and the error feedback
loop. In addition, the sign of
is utilized to sharpen the
estimated conditional probabilities (hence reduce the underlying conditional entropies) that drive the entropy coder via a
technique of sign flipping [10]. This corresponds to the block
of Coding Histogram Sharpening in Fig. 1. The modeling
of prediction errors conditioned on compound contexts is
discussed in Section V.
The sign-flipped prediction error is then encoded using
the eight coding contexts
described above. In the binary
mode, we use a simple ternary adaptive arithmetic coder as
described in Section VI. In the continuous-tone mode, we use
an adaptive -ary arithmetic coder. In order to improve coding
and space efficiency, some preprocessing of errors is done
prior to entropy coding. Details are given in Section VII.
III. GAPGRADIENT-ADJUSTED PREDICTION
The simplest and also most common form of statistical
redundancy in image data is the smoothness of the intensity
function. A universal statistical model, while being able to
learn this feature of smoothness, may require more parameters
than a prediction scheme that utilizes a priori knowledge of
image smoothness. Moreover, this learning process may take
time, causing a delay in model adaptation. On the other hand,
a simple and heuristic data fitting or prediction technique such
as DPCM can decorrelate image data in smooth areas with
very few fixed coefficients (no learning involved) and at very
low computational cost. Therefore, prediction can be viewed
as a context modeling technique of very low model cost
that is highly effective under an assumption of smoothness.
Hence, in CALIC, we chose to employ predictive coding in
the continuous-tone mode.
The GAP predictor employed by CALIC is a simple,
adaptive, nonlinear one that can adapt itself to the intensity
gradients near the predicted pixel. Hence, it is more robust than
traditional DPCM-like linear predictors, particularly in areas
of strong edges. GAP differs from existing linear predictors
in that it weights the neighboring pixels of
according
to the estimated gradients of the image. For simple denotation

WU AND MEMON: CALIC

439

Fig. 2. Labeling of neighboring pixels used in prediction and modeling.

in the sequel, we let

(1)
meaning north, west, northeast, northwest, northnorth,
westwest, and northnortheast, respectively. The locations
of these pixels with respect to
are given in Fig. 2.
We estimate the gradient of the intensity function at the
current pixel
by the following quantities3:
(2)
Clearly,
and
are estimates, within a scaling factor,
of the gradients of the intensity function near pixel
in the horizontal and vertical directions. The values of
and
are used to detect the magnitude and orientation of
edges in the input image, and make necessary adjustments
in the prediction in order to get improved performance in
the presence of local edges. Three absolute differences are
used for in each direction. This gave the best compression
results in our experiments. Two or one absolute differences
can be used here for lower complexity with a small loss
in performance. The reason for using absolute rather than
algebraic differences in (2) is to prevent cancellation of values
of opposite signs. Efficient incremental and/or parallel schemes
for evaluating
and
are straightforward. For instance, to
avoid unnecessary repeated computations, one can store the
values of
associated with preceding pixels for
3 For the sake of clarity, we only consider sequential coding of interior
pixels. Many possible treatments of boundary pixels are possible, and they do
not make a significant difference in the final compression ratio.

IF
ELSE IF
ELSE
IF
ELSE IF
ELSE IF
ELSE IF

future reference. This only requires an array of the size ,


the width of the image.
and , a gradient-adjusted prediction
Having computed
of
is made by the procedure, shown at the
(GAP)
vottom of the page..
The predictor coefficients and thresholds given above were
empirically chosen. A major criterion in choosing these coefficients is ease of computation. For instance, most coefficients
are powers of 2 so that multiplications/divisions can be performed by shifting. The thresholds given above are for 8-b
data, and are adapted on the fly for higher intensity resolution
images as explained in Section VIII. It is possible, albeit
quite expensive, to optimize the coefficients and thresholds
for an image or a class of images, so that a norm of the
is minimized. We do not
expected prediction error
recommend that such an optimization process be carried out
on an image-by-image basis. However, it is important to point
out that the coefficients and thresholds in computing
can be set by a user who knows optimal or nearly optimal
coefficients for target images.
IV. CODING CONTEXT SELECTION

AND

QUANTIZATION

The prediction step does not completely remove the statistical redundancy in the image. It is observed that the variance
strongly correlates to the
of prediction errors
.
smoothness of the image around the predicted pixel
To model this correlation at a small computational cost, we
define an error energy estimator
(3)
and
are defined in (2) and
is the previous prediction error.
is included in
because large errors tend to occur consecutively. One can,
in . This
of course, include
makes a slightly better error energy estimator, but requires
buffering of an extra row of errors. We decided not to use
. Note that (3) has only two independent parameters since
can be scaled arbitrarily. The coefficients
, and can
be determined in an off-line design process by standard linear
is the least-squares estimator
regression technique so that
based on
. For algorithm
of the error strength
and
in (3).
efficiency, we recommend
By conditioning the error distribution on , we can separate the prediction errors into classes of different variances.
Thus, entropy coding of errors using estimated conditional
improves coding efficiency over using
.
probability
to
levels.
For time and space efficiency, we quantize
where

sharp horizontal edge


sharp vertical edge
;
horizontal edge
weak horizontal edge
vertical edge
weak vertical edge

440

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 45, NO. 4, APRIL 1997

Fig. 3. Convexity relationship between GAP predictor I^ and the neighboring pixels.

In practice,
is found to be sufficient. Denote the
quantizer by , i.e.,
. The quantization
criterion is to minimize the conditional entropy of the errors. In
pairs
an off-line design process, we get a training set of
from test images, and use dynamic programming to choose
to partition
into
intervals such that
(4)
quantizer using
is minimized. We designed an optimal
a training set consisting of all ISO test images and with
and
to minimize (4). This quantizer, whose
bins are

the specifics of applying these principles are different. In


addition to refining some of the techniques in [10], we have
also kept in mind the complexity of software and hardware
implementations. We describe below the details by which we
form and quantize contexts and arrive at an adaptive, contextbased, nonlinear prediction.
A. Formation and Quantization of Contexts
In CALIC, contexts for error modeling are formed by
combining 144 texture contexts with four error energy contexts
by uniform quantization scheme
to form a total of 576
compound contexts. Texture context
is formed by a local
neighborhood of pixel values

(5)
worked almost as well as the optimal individual-imagedependent quantizer. Quantizing the error energy estimator
or the activity level in error modeling was used earlier by
many authors, for example, in [1], [8]. But we optimized the
quantizer under a conditional entropy criterion.
Estimating
conditional error probabilities
requires only a modest amount of memory during entropy
coding. Furthermore, the small number of conditional error
probabilities involved means that even small images will
provide enough samples for context modeling to learn
quickly in adaptive entropy coding.
V. CONTEXT MODELING

OF

PREDICTION ERRORS

Gradients alone cannot adequately characterize some of the


more complex relationships between the predicted pixel
and its surroundings. Context modeling of the prediction error
can exploit higher order structures such as texture
patterns and local activity in the image for further compression
gains. However, the large number of possible contexts can
lead to the sparse context or high model cost problem. In
recent work [10], Wu has proposed a novel and effective
within
solution to this problem. Instead of estimating
each context , he estimates only the conditional expectations
using the corresponding sample means
within
each context. These estimates are then used to further refine
the prediction prior to entropy coding by an error feedback
mechanism. In CALIC, we adopt this approach. Although
the general principles that we have adopted are from [10],

(6)
is then quantized to an -b binary number
using the prediction value as the threshold, namely,
if
if

(7)

captures the texture patterns in the modeling


Clearly,
context which are indicative of the behavior of . Also
note that an event
in a modeling context need not be a
neighboring pixel to
. It can be a function of some
neighboring pixels.
and
, for example, represent the
events whether the prediction value
forms a convex or
concave waveform with respect to the neighboring pixels in the
vertical and horizontal directions, as depicted by Fig. 3. These
convexity indicators are useful in the context modeling of .
The fact that
and
bring new information
to the modeling context even though
, and
are
already present is only because of the context quantization.
Since the variability of neighboring pixels also influences
the error distribution, we combine the quantized error energy
with the quantized texture pattern
to form compound modeling contexts, denoted by
. This scheme can be viewed as a product quantization
of two independently treated image features: spatial texture
patterns and the energy of prediction errors.
B. Counting the Number of Contexts
At a glance, we would seemingly use
compound contexts since

different
and

WU AND MEMON: CALIC

441

. However, not all


binary codewords
quantizer defined by (7) are possible. If the prediction
of
lies between
and
, then the condition
value
is either always true or always false. In
in
other words, the outcome of the test
and
,
(7) is uniquely determined if
and
. This should be apparent
or if
from Fig. 3. Also, by referring to (6) and (7), we see that
is fixed if
and
are different from each other.
in
is fixed if
and
are different from each
Likewise,
, and
are not independent, they can
other. Because
combinations. The same holds for
only yield
, and . Thus, represents only
rather than
256 valid texture patterns. Therefore, the total number of valid
.
compound contexts in error modeling is only
C. Estimation of Error Magnitude Within a Context
within each comConditional expectations
pound context are estimated by using the corresponding samfor different compound contexts. Computing
ple means
involves only accumulating the error terms in each
compound context and maintaining a count on the occurrence
of each context. For each compound context
, we count the number
of occurrences
. We only use 1 byte to store the occurrence count.
of
, we scale
Whenever the count reaches some value
intact. We
down the count, but leave the corresponding
. In order to update
, we use
recommend
of
, and compute
an accumulator
(8)
occurs. We use a 16whenever a compound context
whose range is more than
b signed integer to store
sufficient in practice. It takes more than 128 consecutive
or
to violate the bounds
, but we have
. Once
reaches 128, we rescale the variables by setting
(9)
and 2 bytes
The modeling system needs 1 byte for
for each of 576 compound contexts. Thus, the
for
total size of the memory required for context modeling of
prediction errors is 1728 bytes. Besides being a technique
to reduce the memory use, rescaling also has the widely
recognized beneficial side effect of aging the observed data.
It is an inexpensive way of adapting the context error model
to time-varying sources. Indeed, the scaling technique slightly
improved compression in our experiments.
D. Error Feedback and Sign Flipping
is the most likely predicSince the conditional mean
, we correct the
tion error in a given compound context
to get
.
bias in prediction by a feedback
In order not to overadjust the GAP predictor, we actually
rather than
consider the new prediction error
in context-based error modeling described above.
Context modeling of leads to an improved predictor for
:
, where
is the sample mean of

Fig. 4. Two-stage adaptive prediction scheme via context modeling of errors


and error feedback.

conditioned on compound context


. Conceptually,
we ultimately have a two-stage adaptive prediction scheme
via context modeling of errors and error feedback. A block
diagram of the scheme is given in Fig. 4.
Besides aiding in obtaining an improved prediction, the
within each context is also
estimated prediction error
utilized to sharpen the estimated conditional probabilities
(hence reduce the underlying conditional entropy) that drive
the entropy coder via a novel technique of sign flipping as
reported in [10]. This technique works as follows. Suppose
the current context is
. Before encoding
, the
encoder checks whether
. If yes,
; otherwise,
is encoded. Since the decoder also knows
and
, it can reverse the sign, if necessary, to reconstruct
. Coding efficiency can be improved by the sign flipping
technique because the error modeling based on compound
captures some of the statistical redundancy
contexts
in the sign of . In essence, by preserving or reversing the sign
of , we make a prediction on the sign of .
VI. BINARY MODE
The above adaptive, predictive coding scheme via context
modeling of prediction errors is designed for compressing
natural continuous-tone images. But predictive coding can
have poor performance on images or subimages which contain
very few widely separated gray levels compared with a coder
based on explicit Markov modeling. This is because the
image data no longer satisfy the assumption of smoothness
for predictive coding. However, for a discrete type of images
and subimages, direct context modeling of source symbols
becomes feasible precisely because of the very small size of
the symbol set. In this situation, it is more effective to code
pixel values directly rather than prediction errors.
In order to effectively compress uniform or nearly uniform
image areas, graphics, rasterized documents, and any combination of natural images with one or more of these types, we
propose a binary coder as a separate module of the compression system. The system now has two modes: continuous tone

442

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 45, NO. 4, APRIL 1997

and binary. Before


is to be coded, the algorithm first
checks six neighboring pixels:
.
If these six pixels have no more than two different values,
the binary mode is implicitly and automatically triggered;
otherwise, the system remains in the continuous-tone mode.
In the binary mode, let
and let the other value, if
any, be . The encoder describes
in one of the three
states, using a ternary code
if
if
otherwise.

(10)

, the encoder switches from the


In the escape case of
binary mode to the continuous-tone mode. Encoding in the
binary mode is conditioned on a context of six events:

(11)
which are quantized to get a 6-b binary number
if
(12)
if
represents the texture
As in the continous-tone mode,
pattern around
. Note that, because
can be ignored from the context . Thus, there are only
different modeling contexts in the binary mode.
Consequently, we use 32 conditional probabilities
to
drive an adaptive arithmetic coder of three symbols. For the
context modeling in the binary mode, only
frequency counts need to be maintained. Of course, we can
use higher order contexts in binary mode like in JBIG for
slightly higher compression. For compound-type images of
large portions of binary texts, the JBIGs order-10 contexts can
yield up to 2% more compression than our order-6 contexts.
But for other images, high-order binary contexts can hurt
compression because they make the condition for entering
the binary mode more restrictive. To balance out the two
conflicting situations and for easy implementation, we let the
modeling context be the same as the template in which the
binary mode is determined.
The incorporation of the binary mode into the compression
system adds little to the system complexity, while greatly enhancing the coding performance on bilevel images, compound
images, and images with large uniform areas. An important
feature of our design is that the switch between continuoustone and binary modes is context-dependent, automatic, and
completely transparent to the user.
VII. ENTROPY CODING

OF

PREDICTION ERRORS

An advantage of CALIC is the clean separation between


context-based prediction and modeling of image data and
entropy coding of prediction errors. Any entropy coder, be
it Huffman or arithmetic, static or adaptive, binary or ary, can be easily interfaced with the CALIC system. In this
section, we describe the entropy coding techniques used in the
version of CALIC that was submitted to ISO as a candidate
for standardization.

The entropy coder in the binary mode has an alphabet size


of three. Given this small alphabet size, we use a simple
ternary adaptive arithmetic coder. Coding is done under 32
different contexts, for each of which we maintain a frequency
table consisting of three symbols. We use a 1 byte integer for
each count and rescale counts when the cumulative frequency
exceeds 2 . We begin with a table in which all counts have
been set to 1. For fast adaptation, we set the initial increment
to 2 , and this increment is halved every time the counts
are rescaled. This halving process stops when the increment
reaches a value of 1.
In the continuous-tone mode, we used an adaptive -ary
arithmetic coder, based on the CACM++ package that was
developed and made publicly available by Carpinelli and
Salamonsen.4 The software is based on the work by Moffat,
et al. [5]. In order to improve coding and space efficiency,
the following techniques were employed to process the errors
prior to entropy coding.
A. Remapping Errors
Although prediction errors can potentially take on
possible values that range from
to
, they
can be mapped into the range 0 to
using the constraint that the value
must fall into the interval
, where is the number of bits in intensity
resolution. If
, we rearrange the possible prediction
errors
in the order
. In symmetry
to the case
, a similar mapping is performed for the
case
. Besides reducing the alphabet size for entropy
coding, error remapping also orders error values in decreasing
probability of occurrence. We use this monotonicity property
to reduce the alphabet size prior to entropy coding.
B. Histogram Tail Truncation
Even after error remapping, an alphabet size of 2 is
still unnecessarily large. Large errors occur with diminishing
frequency or not at all. But they still occupy spots in code
space. For instance, for
, over 99% of error population
is within the range
of the error histogram. If an error
histogram of size
is used, we have to set the frequency
counts of all empty cells to 1 in practice to guard against the
occurrence of an event of small probability, no matter how
small. These forced counts of 1 in the error histograms can
significantly distort the underlying error statistics, and hence
reduce coding efficiency.
An ad hoc and yet effective remedy of the problem is to
truncate the tails of the error histogram that is used to estimate
, and use an escape mechanism to code the errors beyond
the truncated code range, if they occur. Specifically, we limit
the size of each conditional error histogram to some value
, such that a large majority of errors to be coded
under the coding context falls into the range of the
largest
entries of the th error histogram. For a given coding context ,
a symbol (remapped prediction error)
is encoded first
to be
, followed by the codeword for the new symbol
4 ftp://ftp.cpsc.ucalgary.ca/pub/projects/arithmetic.coding.

WU AND MEMON: CALIC

443

, but being encoded under the coding context


. We experimented with a large range of frequency table
. Too small a value of
causes a large number of
sizes
does
escape symbols to be coded, whereas a too large
not cure the zero frequency problem. A good balance between
the two conflicting factors is required. We empirically set the
frequency table sizes for the eight coding contexts to be
(13)
Note that the user could be given the flexibility to set the
values differently from the default values suggested.
C. Dynamic Bit Shifting
When the dynamic range of prediction errors is very large,
the relatively small histogram sizes used in the proposed
system can lead to a large number of escape symbols, resulting
in poor coding efficiency. This is more likely to happen with
b/pixel). Therefore, we track for
high-resolution images (
each coding histogram the average of error magnitudes coded.
,
If the average magnitude is significantly higher than
then we decompose the remapped prediction error into the
bits,
and the least significant bits,
most significant
.
is then encoded in the normal manner, and
can be
coded using either a separate set of contexts or more simply
sent as is. The value of is selected such that the current
is less than
.
average of
The ad hoc technique given above does not add much to the
overall complexity of the system. As we pointed earlier in the
previous section, computing the average of error magnitudes
is simple and fast. However, the gains obtained can be very
significant for images that have a very large dynamic range
of prediction errors. Also, since the bit decomposition is done
adaptively, it leads to improved performance with many 8 bit
images by more efficiently coding certain active areas of the
image that incur large prediction errors.
VIII. VARIABLE IMAGE RESOLUTIONS
The parameters and thresholds of the predictors and the
context quantizers were designed and optimized for images of
8-b resolution. For higher resolutions, we devised a very simand
that incurs a negligible
ple scaling mechanism for
and ,
loss of compression performance. Having scaled
we proceed in a manner identical to 8-b resolution images.
We do this as it is far more involved to adjust parameters
and thresholds for the best compression performance for each
intensity resolution.
Given an image with intensity resolution of bits,
, the simplest scaling factor for
and
is
. But
in our experiments, we found that this simple scaling factor can
incur quite heavy loss in coding efficiency if the input images
are very smooth relatively to the dynamic range. To solve
and
according to the standard
this problem, we scale
of the errors (the average of error magnitudes)
deviation
in the previous row. Computationally, the value of for the
previous row is easy to update during the raster scan, requiring

only an accumulator. Based on , we use an empirical formula


(14)
and . Note that scaling
to compute the scaling factor for
and
are scaled by
requires only binary shifting. After
, the same GAP predictor as described in Section III, and
the same error feedback mechanism to compute , using the
same coefficients in
and the same
quantizer as given in
Section IV, are used to code images of more than 8-b intensity
resolution.
IX. COMPRESSION PERFORMANCE
In order to demonstrate the compression performance of
CALIC, we compare it with some representative lossless image
compression techniques. Table I lists compression results
achieved with the set of ISO test images that were made
available to the proposers. Column 2 lists the actual bit rates
obtained by CALIC with the CACM++ implementation of
arithmetic coding. Results with a static Huffman code are
on an average 3.5% worse. Columns 3 and 4 list actual
bit rates obtained by two publicly available lossless image
compression codecs: the lossless JPEG implementation of
Cornell University (LJPEG) and FELICS, a particularly simple
lossless technique developed by Howard and Vitter [2]. The
reported results of lossless JPEG were obtained by using
the best of the eight JPEG predictors followed by two-pass
Huffman coding of the prediction errors. FELICS uses contextbased prediction and RiceGolomb coding of prediction errors.
LJPEG and FELICS results on images with more than 8-b
per pixel are not given as public domain implementations of
these two only handle 8-b/pixel image data. Also, note that the
images are mostly color images and have more than one color
band. Bit rates were obtained by appropriately averaging over
the color bands weighted by size.
Both the current lossless JPEG and FELICS are rather
simple techniques that require minimal memory and computation resources. CALIC, on the other hand, is more complex
and does require more resources, although the increase in
memory and computation resources is modest. Hence, we
also compare CALIC with two other arithmetic-coding-based
techniques, ALCM and CLARA, that were proposed to ISO
by the University of California at Santa Cruz and Mitsubishi,
respectively. Both had memory requirements similar to CALIC
and were of comparable complexity. On 23 ISO test images,
the average compression ratio for CALIC is 26% better than
that of Huffman-coded lossless JPEG and 12% better than that
of arithmetic-coded lossless JPEG.
X. COMPUTATIONAL COMPLEXITY
CALIC was designed to be practical, and suitable for both
software and hardware implementations. Compared with preceding context-based lossless image compression algorithms
[1], [7], [9], CALIC is more efficient, and also simpler in algorithm structure. Even more significantly, the low complexity
and algorithm simplicity are achieved with compression performance superior to all existing methods known to us. Since
the entropy coding step is independent of source modeling in
CALIC, and it is required by all lossless image codecs, in

444

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 45, NO. 4, APRIL 1997

TABLE I
BIT RATES (BITS PER PIXEL) OF CALC ON ISO TEST
SET AND COMPARISON WITH A FEW SELECTED SCHEMES

not require the support of complex data structures (only onedimensional arrays are used), their hardware implementations
should not be difficult. The space complexity of CALIC was
already discussed at the end of Section V-C.
REFERENCES

TABLE II
COUNTS OF DIFFERENT OPERATIONS PER PIXEL INVOLVED IN CONTEXT
ERROR MODELING AND CONTEXT-BASED ADAPTIVE PREDICTION

what follows, we will analyze the time complexity only of


the context-based prediction and modeling components of the
CALIC system, which determines the difference in execution
time between different algorithms.
In CALIC, four major steps are taken by the both encoder
and decoder toward context error modeling and context-based
adaptive prediction.
Step 1: Computing
as in (2).
Step 2: Computing GAP as in Section III.
Step 3: Context quantization as in (3) and (7).
Step 4: Computing the context-based, adaptive prediction
as in Section VIII.
The arithmetic and logic operations per pixel involved in
these steps are counted and tabulated in Table II.
Note that the complexity analysis above only applies to
the continuous-tone mode. The binary mode is much simpler.
Steps 1 and 3, the two most expensive ones, can be easily
executed in parallel and by simple hardware, as self-evident
in (2) and (7). For instance,
and
can be computed
simultaneously by two identical circuitries, cutting the computation time in half; in context quantization, all s in a
texture pattern of (7) in the continuous-tone mode, or of (12)
in the binary mode, can be set simultaneously by a simple
circuitry. Because the modeling and prediction algorithms do

[1] P. Howard and J. S. Vitter, Error modeling for hierarchical lossless


image compression, in Proc. Data Compression Conf., J. A. Storer and
M. C. Cohn, Eds.,1992, pp. 269278.
[2] , Fast and efficient lossless image compression, in Proc. Data
Compression Conf., J. A. Storer and M. C. Cohn, Eds., 1993, pp.
351360.
[3] G. G. Langdon, A. Gulati, and E. Seiler, On the JPEG model for
lossless image compression, in Proc. Data Compression Conf., J. A.
Storer and M. C. Cohn, Eds., 1992, pp. 172180.
[4] N. D. Memon, K. Sayood, and S. S. Magliveras, Lossless image
compression with a codebook of block scans, IEEE J. Select. Areas
Commun., vol. 13, pp. 2431, Jan. 1995.
[5] A. Moffat, R. Neal, and I. Witten, Arithmetic coding revisited, in Proc.
Data Compression Conf., J. A. Storer and M. C. Cohn, Eds., 1995, pp.
202211.
[6] J. J. Rissanen, Universal coding, information, prediction and estimation, IEEE Trans. Inform. Theory, vol. IT-30, pp. 629636, 1984.
[7] M. J. Slyz and D. L. Neuhoff, A nonlinear VQ-based predictive lossless
image coder, in Proc. Data Compression Conf., J. A. Storer and M. C.
Cohn, Eds., 1994, pp. 304310.
[8] S. Todd, G. G. Langdon, and J. J. Rissanen, Parameter reduction and
context selection for compression of gray scale images, IBM J. Res.
Develop., vol. 29, no. 2, pp. 188193, 1985.
[9] M. J. Weinberger, J. J. Rissanen, and R. B. Arps, Applications
of universal context modeling to lossless compression of gray-scale
images, IEEE Trans. Image Processing, vol. 5, pp. 575586, Apr. 1996.
[10] X. Wu, Lossless compression of continuous-tone images via context
selection, quantization, and modeling, IEEE Trans. Image Processing,
vol. 6, May 1997.

Xiaolin Wu (SM96) was born in 1958 in Chengdu,


Sichuan, China. He received the B.Sc. degree from
Wuhan Cehui Technical University, China, in 1982,
and Ph.D. degree from the University of Calgary,
Calgary, Alta., Canada, in 1988, both in computer
science.
He is currently an Associate Professor with the
Department of Computer Science, the University
of Western Ontario, London, Ont., Canada. His
main research interests are visual computing and
communications, source coding, and algorithms. He
has published frequently in IEEE, ACM, and other international journals on
various topics in image coding, computer graphics, quantization theory and
algorithms, and image processing. He actively participated in and contributed
to the recent development of the new JPEG standard for lossless image coding.
Dr. Wu served as a program committee member of the IEEE Data
Compression Conferences in 1994 and 1995.

Nasir Memon (S91M92) received the B.E. degree in chemical engineering and the M.Sc. degree
in mathematics from the Birla Institute of Technology, Pilani, India, in 1981 and 1982 respectively.
He received his M.S. and Ph.D. degrees from the
University of Nebraska, Lincoln, both in Computer
Science, in 1989 and 1992 respectively.
He was an Assistant Professor in the Department
of Computer Science, Mathematics and Physics at
Arkansas State University from 1992 to July 1994.
He joined the Computer Science Department at
Northern Illinois University, De Kalb, in August 1994, where he is currently
an Assistant Professor. His research interests include data compression, data
encryption and communication networks.

You might also like