Parallel CRC Generator Whitepaper PDF
Parallel CRC Generator Whitepaper PDF
Parallel CRC Generator Whitepaper PDF
existing tools that can generate the code, and a lot of M-bit Data
input
examples for popular CRC polynomials. However, it’s
often beneficial to understand the underlying principles
in order to implement a customized circuit or make Figure 2—This is a parallel CRC block. The next state CRC output is
optimizations to an existing one. This is a subject every a function of the current state CRC and the data.
@
Circuit Cellar design contest two is the right shift. Modulo-2 poly-
nomial division is realized the same
entrants have received
way as long division over integers.
thousands of valuable Cyclic left and right shifts are multipli-
development tools and cation and division by (2 mod 2n – 1).
product samples. Because of
their contest participation,
these engineers receive I’ll start the discussion with a Ver-
ilog module that generates parallel USB
advance e-mail notice from
CRC5 with 4-bit data (see Listing 1).
Circuit Cellar as soon as new A synthesis tool will do its magic
samples become available. and produce a circuit depending on
Now you too can benefit from the target FPGA or ASIC technology.
However, the purpose of this article
this early notification.
is to explain how to get a parallel
CRC circuit using XOR gates and
flip-flops.
Next I’ll describe a practical
method that I use to generate parallel
CRC in a number of projects. It works
Designer's Notification Network on any polynomial and data size,
January 2010 – Issue 234
Nin[0] 0 0 1 0 1
Nin[1] 0 1 0 1 0
Nin[2] 1 0 1 0 0
Nin[3] 0 1 1 0 1
Table 1—This is the matrix H1 for USB CRC5 with N = 4.
bit data width. The method—which takes advantage of representation. Table 2 shows the matrix H2 values for
the theory described in a paper by Guiseppe Campobello USB CRC5 with N = 4.
et al titled “Parallel CRC Realization,” as well as in a In Step 6, you’re ready to construct the parallel CRC
paper by G. Albertango and R. Sisto titled “Parallel CRC equations. Each set bit j in column i of the matrix H1—
Generation”—leverages a simple serial CRC generator and that’s the critical part of the method—participates
and the linear properties of the CRC to build a parallel in the parallel CRC equation of the bit MOUT[i] as NIN[j].
CRC circuit. Likewise, each set bit j in column i of the matrix H2
In Step 1, denote N = data width and M = CRC polyno- participates in the parallel CRC equation of the bit
mial width. For parallel USB CRC5 with a 4-bit data- MOUT[i] as MIN[j].
path, N = 4 and M = 5. All participating inputs MIN [j] and NIN [j] that form
In Step 2, implement a serial CRC generator routine MOUT[i] are XORed together. For USB CRC5 with N = 4,
for a given polynomial. It’s a straightforward process and the parallel CRC equations are as follows:
can be done using different programming languages or
scripts (e.g., C, Java, Verilog, or Perl). You can use the M OUT ;0= = M IN ;1= ^ M IN ;4= ^ M IN ;0= ^ M IN ;3=
Verilog function crc5_serial in Listing 2 for the serial M OUT ;1= = M IN ;2= ^ N IN ;1=
USB CRC5. Denote this routine as CRCSERIAL. You can M OUT ;2= = M IN ;1= ^ M IN ;3= ^ M IN ;4= ^ N IN ;0= ^ N IN ;2= ^ N IN ;3=
also build a routine CRCparallel(Nin, Min) that sim- M OUT ;3= = M IN ;2= ^ M IN ;4= ^ N IN ;1= ^ N IN ;3=
ply calls CRCSERIAL N times (the number of data bits) and
M OUT ;4= = M IN ;0= ^ M IN ;3= ^ N IN ;2=
returns MOUT. The pseudocode in Listing 3 is an example
of CRCPARALLEL. MOUT is the parallel CRC implementation. I used Table 1
In Step 3, parallel CRC implementation is a function and Table 2 to derive the equations.
of N-bit data input and M-bit current CRC state, as The reason this method works is in the way we con-
shown in the Figure 2. We’re going to build two matri- structed matrices H1 and H2, where rows are linearly
ces. Matrix H1 describes MOUT (next CRC state) as a independent. We also used the fact that CRC is a linear
function of NIN (input data) when MIN = 0. Thus, MOUT = operation:
CRCPARALLEL (NIN, MIN = 0), and H1 matrix is the size CRC A + B = CRC A + CRC B
[NxM]. Matrix H2 describes MOUT (next CRC state) as a
function of MIN (current CRC state) when NIN = 0. Thus, The resulting Verilog module generates parallel USB
MOUT = CRCPARALLEL (NIN = 0, MIN), and H2 matrix is the CRC5 with 4-bit data (see Listing 4).
size [MxM].
In Step 4, build the matrix H1. Using the CRCPARALLEL
routine from step 2, calculate the CRC for the N values There are many other methods for parallel CRC gener-
of NIN when MIN = 0. The values are one-hot encoded— ation. Each method has advantages and drawbacks. Some
that is, each of the NIN values has only one bit set. For N are more suitable for high-speed designs where logic area
= 4, the values are 0x1, 0x2, 0x4, 0x8 in hex representa- is less of an issue. Others offer the most compact
tion. Table 1 shows matrix H1 values for USB CRC5 designs, but for lower speed. As with almost everything
with N = 4. else in engineering, you have to make trade-offs to bring
In Step 5, build the matrix H2. Using the CRCPARALLEL your designs to completion.
routine from Step 2, calculate CRC for the M values of Let’s review the most notable methods. One method
MIN when NIN = 0. The values are one-hot encoded. For derives a recursive formula for parallel CRC directly
M = 5, MIN values are 0x1, 0x2, 0x4, 0x8, 0x10 in hex from a serial implementation. The idea is to represent an
LFSR for serial CRC as a dis-
crete-time linear system:
Nin = 0 Mout[4] Mout[3] Mout[2] Mout[1] Mout[0]
X i + 1 = FX i + U i
January 2010 – Issue 234
Min[0] 1 0 0 0 0
Min[1] 0 0 1 0 1 Vector X(i) is the current LFSR
Min[2] 0 1 0 1 0 output. X(i + 1) is the output in
Min[3] 1 0 1 0 0 the next clock. Vector U(i) is
Min[4] 0 1 1 0 1
the ith of the input sequence. F
Table 2—This is the matrix H2 for USB CRC5 with N = 4. is a matrix chosen according
decreases because it requires more Evgeni Stavinov (evgeni@outputlogic.com) is a system design engineer for Xilinx
combinational logic levels to synthe- who holds an MSEE from USC and a BSEE from The Technion — Israel Institute of
size CRC output logic given the wider Technology. He has more than 10 years of design experience in the areas of FPGA
data and polynomial inputs. logic design, embedded software, and networking. Evgeni worked for CATC, LeCroy,
I used free Xilinx WebPACK tools to and SerialTek designing test and measurement tools for USB, Wireless USB, PCI
simulate and synthesize parallel CRC Express, Bluetooth, SAS, and SATA protocols. He also created OutputLogic.com—a
circuits for USB CRC5 and the popular web portal that offers online tools for FPGA and ASIC designers—and serves as its
Ethernet CRC32. You can explore the main developer.
results in the available Verilog code
and project files.
Xilinx’s Virtex5 LX30 is the target
FPGA. Table 3a shows USB CRC5
with 4-bit data using “for loop” Verilog
P
To download the code, go to ftp://ftp.circuitcellar.com/pub/Circuit_Cellar/
2009/234.
implementation. Table 3b shows USB
CRC5 with 4-bit data using “XOR”
Verilog implementation. Table 3c
shows CRC32 with 32-bit data using
“for loop” Verilog implementation.
R G. Albertango and R. Sisto, “Parallel CRC Generation,” IEEE Micro,
Vol. 10, No. 5, 1990.
Table 3d shows CRC32 with 32-bit
data using “XOR” Verilog implementa-
G. Campobello, G. Patane, M. Russo, “Parallel CRC Realization,”
tion. Note that a single Xilinx Virtex5
http://ai.unime.it/~gp/publications/full/tccrc.pdf.
Slice contains four FFs and four LUTs.
As expected, the number of FFs is
R. J. Glaise, “A Two-Step Computation of Cyclic Redundancy Code
five and 32 for CRC5 and CRC32. For
CRC-32 for ATM Networks,” IBM Journal of Research and Develop-
a small CRC5 circuit, there is no dif-
ment, Vol. 41, Issue 6, 1997.
ference in the logic utilization. Howev-
January 2010 – Issue 234