Floating Point Ds335

Floating-Point Operator v3.0

DS335 September 28, 2006
Product Specification
Introduction
The Xilinx Floating-Point core provides designers with the means to perform floating-point arithmetic on an FPGA. The core can be customized to allow optimization for operation, wordlength, latency, and interface.

Features
- Available for Virtex-II, Virtex-II Pro, Virtex-4, Virtex-5, Spartan-II, Spartan-3, and Spartan-3E FPGA family members
- Supported operators:
  - multiply
  - add/subtract
  - divide
  - square-root
  - compare
  - conversion from floating-point to fixed-point
  - conversion from fixed-point to floating-point
  - conversion between floating-point types
- Compliance with IEEE-754 Standard (with only minor documented deviations)
- Support for DSP48 on Virtex-4 FPGAs and DSP48E on Virtex-5 FPGAs
- Parameterized fraction and exponent wordlengths
- Optimizations for speed and latency
- Fully synchronous design using a single clock
- For use with the CORE Generator, which is available in the Xilinx ISE v8.2i

Acknowledgement
This Xilinx core is based on IP originally licensed from QinetiQ Ltd.
Overview
The Xilinx Floating-Point core allows a range of floating-point arithmetic operations to be performed on FPGAs. The operation is specified when the core is generated, and each variant has a common interface. This interface is shown in Figure 1. When a user selects an operation that requires only one operand, the B input is omitted.
Figure 1: Block Diagram of Generic Floating-Point Binary Operator Core (Result = A op B; ports shown on the left: A, B, OPERATION, OPERATION_ND, OPERATION_RFD, SCLR, CE, CLK; ports shown on the right: RESULT, UNDERFLOW, OVERFLOW, INVALID_OPERATION, DIVIDE_BY_ZERO, RDY)
2006 Xilinx, Inc. All rights reserved. XILINX, the Xilinx logo, and other designated brands included herein are trademarks of Xilinx, Inc. The QinetiQ logo is a trademark of QinetiQ Ltd. All other trademarks are the property of their respective owners. Xilinx is providing this design, code, or information "as is." By providing the design, code, or information as one possible implementation of this feature, application, or standard, Xilinx makes no representation that this implementation is free from any claims of infringement. You are responsible for obtaining any rights you may require for your implementation. Xilinx expressly disclaims any warranty whatsoever with respect to the adequacy of the implementation, including but not limited to any warranties or representations that this implementation is free from claims of infringement and any implied warranties of merchantability or fitness for a particular purpose.
www.xilinx.com
Functional Description
The floating-point and fixed-point representations employed by the core are described in "Floating-Point Number Representation" and "Fixed-Point Number Representation".
Floating-Point Number Representation

The core employs a floating-point representation that is a generalization of the IEEE-754 Standard to allow for non-standard sizes [1]. When standard sizes are chosen, the format and special values employed are identical to those described by the IEEE-754 Standard.

Two parameters have been adopted for the purposes of generalizing the format processed by the Floating-Point core. These specify the total format width and the width of the fractional part. For standard single precision types, the format width is 32 bits and the fraction width is 24 bits. In the following description, these widths are abbreviated to $w$ and $w_f$, respectively.

A floating-point number is represented using a sign, exponent, and fraction (which are denoted as $s$, $E$, and $b_0.b_1 b_2 \cdots b_{w_f-1}$, respectively). The value of a floating-point number is given by:

$$v = (-1)^s \, 2^E \, b_0.b_1 b_2 \cdots b_{w_f-1}$$

The binary bits, $b_i$, have weighting $2^{-i}$, where the most significant bit $b_0$ is a constant 1. As such, the combination is bounded such that $1 \le b_0.b_1 b_2 \cdots b_{w_f-1} < 2$ and the number is said to be normalized. To provide increased dynamic range, this quantity is scaled by a positive or negative power of 2 (denoted here as $E$). The sign bit provides a value that is negative when $s = 1$, and positive when $s = 0$.

The binary representation of a floating-point number contains three fields as shown in Figure 2.
Figure 2: Bit Fields Within the Floating-Point Representation (sign bit s at bit position w-1; exponent field e, of width w_e, at bit positions w-2 down to w_f-1; fraction field f, of width w_f-1, at bit positions w_f-2 down to 0)
As $b_0$ is a constant, only the fractional part is retained, that is, $f = b_1 \cdots b_{w_f-1}$. This requires only $w_f - 1$ bits. Of the remaining bits, one bit is used to represent the sign, and $w_e = w - w_f$ bits represent the exponent.

The exponent field, $e$, employs a biased unsigned integer representation, whose value is given by:

$$e = \sum_{i=0}^{w_e-1} e_i \, 2^i$$

The index, $i$, of each bit within the exponent field is given in Figure 2. The value of the exponent, $E$, is obtained by removing the bias, that is, $E = e - (2^{w_e-1} - 1)$.
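The field layout and the value formula above can be checked with a small software model. The following Python sketch is illustrative only: the function names `decode_fields` and `normalized_value` are hypothetical and are not part of the core or its deliverables, and the value function assumes a normalized number (exponent field neither all zeros nor all ones).

```python
def decode_fields(word: int, w: int, wf: int):
    """Split a w-bit word into (s, e, f) following the layout of Figure 2."""
    we = w - wf                                # exponent field width, w_e = w - w_f
    f = word & ((1 << (wf - 1)) - 1)           # fraction field holds w_f - 1 stored bits
    e = (word >> (wf - 1)) & ((1 << we) - 1)   # biased exponent field
    s = (word >> (w - 1)) & 1                  # sign bit at position w - 1
    return s, e, f


def normalized_value(word: int, w: int, wf: int) -> float:
    """Value of a normalized number: v = (-1)^s * 2^E * b0.b1...b(wf-1)."""
    s, e, f = decode_fields(word, w, wf)
    bias = (1 << (w - wf - 1)) - 1             # 2^(w_e - 1) - 1
    significand = 1 + f / (1 << (wf - 1))      # hidden bit b0 = 1
    return (-1) ** s * 2.0 ** (e - bias) * significand
```

For instance, with the single precision parameters (w = 32, w_f = 24), the word 0x3FC00000 splits into s = 0, e = 127, and f = 0x400000, which evaluates to 1.5.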
In reality, $w_f$ is not the wordlength of the fraction, but the fraction with the hidden bit, $b_0$, included. This terminology has been adopted to provide commonality with that used to describe fixed-point parameters (as employed by Xilinx System Generator™ for DSP).

Special Values
A number of values for $s$, $e$ and $f$ have been reserved for representing special numbers, such as Not a Number (NaN), Infinity (∞), zero (0), and de-normalized numbers. These special values are summarized in Table 1.
Table 1: Special Values

| Symbol for Special Value | s field | e field | f field |
|---|---|---|---|
| NaN | don't care | $2^{w_e} - 1$ (that is, e = 11...11) | most significant bit of fraction set (that is, f = 10...00) |
| ±∞ | sign of ∞ | $2^{w_e} - 1$ (that is, e = 11...11) | zero (that is, f = 00...00) |
| ±0 | sign of 0 | 0 | zero (that is, f = 00...00) |
| denormalized | sign of number | 0 | any non-zero field |
Note that in Table 1 the sign bit is undefined when a result is a NaN. Also, infinity and zero are signed. Where possible, the sign is handled in the same way as finite non-zero numbers. For example, -0 + (-0) = -0, -0 + 0 = 0 and -∞ + (-∞) = -∞. In contrast, a meaningless operation such as -∞ + ∞ will raise an invalid operation exception and produce a NaN as a result.
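The classification in Table 1 can be expressed as a short software check. This Python sketch is an illustration under stated assumptions: the function name `classify` is hypothetical, and any non-zero fraction with an all-ones exponent is treated as a NaN on input (the usual IEEE-754 reading), whereas Table 1 lists the specific NaN pattern f = 10...00.

```python
def classify(word: int, w: int, wf: int) -> str:
    """Classify a w-bit floating-point word using the e and f fields of Table 1."""
    we = w - wf
    f = word & ((1 << (wf - 1)) - 1)
    e = (word >> (wf - 1)) & ((1 << we) - 1)
    if e == (1 << we) - 1:          # e = 11...11
        return "infinity" if f == 0 else "NaN"
    if e == 0:                      # e = 00...00
        return "zero" if f == 0 else "denormalized"
    return "normalized"
```

The sign field is deliberately ignored here, since Table 1 distinguishes the special values by their e and f fields only.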
IEEE-754 Support
The Xilinx Floating-Point core complies with much of the IEEE-754 Standard. The deviations generally provide better trade-off of resources against functionality. Specifically, the core deviates in the following ways:
- Non-Standard Wordlengths
- Denormalized Numbers
- Rounding Modes
- Signalling and Quiet NaNs

Non-Standard Wordlengths
The Xilinx Floating-Point core supports a greater range of fraction and exponent wordlength than defined in the IEEE-754 Standard.

Standard formats commonly implemented by programmable processors:
- Single Format - uses 32 bits, with a 24-bit fraction and 8-bit exponent.
- Double Format - uses 64 bits, with 53-bit fraction and 11-bit exponent.

Less commonly implemented standard formats are:
- Single Extended - wordlength extensions of 43 bits and above
- Double Extended - wordlength extensions of 79 bits and above
The Xilinx core supports formats with fraction and exponent wordlengths outside of these standard wordlengths.

Denormalized Numbers
Denormalized numbers are not supported by the Xilinx Floating-Point core. To provide robustness, the core treats denormalized numbers as zero (which takes on the sign of the denormalized number).

Denormalized numbers are those where $b_0$ is 0. As such, $b_0.b_1 b_2 \cdots b_{w_f-1} < 1$, which for a given exponent wordlength allows numbers to be represented that are smaller than otherwise possible. But note that as the value becomes smaller, it is represented with fewer bits and the relative rounding error introduced by each operation increases.

An alternative way of increasing dynamic range, which uses fewer resources, is to increase the wordlength of the exponent. The wordlength of the format can be maintained by increasing the exponent wordlength at the expense of the fraction.

Note: The support for denormalized numbers cannot be switched off on some processors. Therefore, there may be very small differences between values generated by the Floating-Point core and a program running on a conventional processor when numbers are very small. If such differences must be avoided, the arithmetic model on the conventional processor should include a simple check for denormalized numbers. This check should set the output of an operation to zero when denormalized numbers are detected to correctly reflect what happens in the FPGA implementation.

Rounding Modes
Currently, only the default rounding mode, Round to Nearest, as defined by the IEEE-754 Standard, is supported.

Signalling and Quiet NaNs
The IEEE-754 Specification requires provision of Signalling and Quiet NaNs. However, the Xilinx Floating-Point core treats all NaNs as Quiet NaNs. When any NaN is supplied as one of the operands to the core, the result will be a Quiet NaN, and an invalid operation exception will not be raised (as would be the case for signalling NaNs).

The exception to this rule is floating-point to fixed-point conversion. For detailed information, see the behavior of INVALID_OP.
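Returning to the denormalized-number check recommended in the note above, a processor-side arithmetic model might apply it as in the following Python sketch. This is one possible form of the check, not a definitive implementation: the function name `flush_denormal_single` is hypothetical, and the sketch assumes single precision (w = 32, w_f = 24) with values passed as Python floats.

```python
import struct


def flush_denormal_single(x: float) -> float:
    """Round x to single precision, then replace a denormalized value by a signed zero."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]   # single-precision bit pattern
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0 and fraction != 0:                   # denormalized: e = 0, f != 0
        return -0.0 if bits >> 31 else 0.0                # zero keeps the operand's sign
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

Applying the function to the output of each modelled operation keeps the software model in step with the behavior described for the core when values become very small.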
Fixed-Point Number Representation

A fixed-point representation is adopted that is consistent with the signed integer type used by Xilinx System Generator for DSP. Fixed-point values are represented using a two's complement number weighted by a fixed power of 2. The overall wordlength is specified by the format width, with the level of scaling specified by the fraction width. The binary representation of a fixed-point number can be considered to contain three fields as shown in Figure 3 (although it is simply a two's complement number).
Figure 3: Bit Fields within the Fixed-point Representation (fields s, int, and frac; sign bit s at bit position w-1; fraction field at bit positions w_f-1 down to 0)
In Figure 3, the bit position has been labelled with an index $i$. Based upon this, the value of a fixed-point number is given by:

$$v = -s \cdot 2^{w-1-w_f} + b_{w-2} \cdots b_{w_f}.b_{w_f-1} \cdots b_1 b_0 = -b_{w-1} \, 2^{w-1-w_f} + \sum_{i=0}^{w-2} 2^{i-w_f} \, b_i$$
For example, a 32-bit signed integer representation is obtained when a width of 32 and a fraction width of 0 are specified. Round to Nearest is employed within the conversion operations.

Note: To provide for the sign bit, the width of the integer field must be at least 1, requiring that the fraction width be no larger than width-1.
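As a worked illustration of the fixed-point representation, the following Python sketch interprets a w-bit word as a two's complement number scaled by 2^(-w_f). The function name `fixed_to_value` is hypothetical and chosen for this example only.

```python
def fixed_to_value(word: int, w: int, wf: int) -> float:
    """Interpret a w-bit two's complement word with wf fraction bits."""
    word &= (1 << w) - 1            # keep only the w-bit field
    sign = word >> (w - 1)          # bit w-1 carries weight -2^(w-1-wf)
    signed = word - (sign << w)     # two's complement to signed integer
    return signed / (1 << wf)       # apply the fixed scaling of 2^(-wf)
```

With a width of 32 and a fraction width of 0 the function reduces to a plain 32-bit signed integer read, matching the example in the text.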
Port Description
The ports employed by the core are shown in Figure 1. They are described in more detail in Table 2. All control signals are active high.
Table 2: Core Ports

| Name | Width | Direction | Description |
|---|---|---|---|
| A (1) | w | INPUT | Operand A |
| B (1) | w | INPUT | Operand B: Only present on binary operation. |
| OPERATION (1) | 6 | INPUT | Operation: Specifies the operation to be performed. Implemented when the core is configured for both add and subtract operations, or as a programmable comparator. |
| OPERATION_ND | 1 | INPUT | New Data: Must be set high to indicate that operand A, operand B and OPERATION, when required, are valid. |
| OPERATION_RFD | 1 | OUTPUT | Ready For Data: Set high by core to indicate that it is ready for new operands. |
| SCLR | 1 | INPUT | Synchronous Reset (optional). |
| CE | 1 | INPUT | Clock Enable (optional). |
| CLK | 1 | INPUT | Clock |
| RESULT | w | OUTPUT | Result Output: Result of operation. |
| UNDERFLOW | 1 | OUTPUT | Underflow: Set high by core when underflow occurs. Supplied in synchronism with associated RESULT. |
| OVERFLOW | 1 | OUTPUT | Overflow: Set high by core when overflow occurs. Supplied in synchronism with associated RESULT. |
| INVALID_OPERATION | 1 | OUTPUT | Invalid Operation: Set high by core when operands cause an invalid operation. Supplied in synchronism with associated RESULT. |
| DIVIDE_BY_ZERO | 1 | OUTPUT | Divide By Zero: Set high by a divide operation to indicate that a division by zero was performed. Supplied in synchronism with associated RESULT. |
| RDY | 1 | OUTPUT | Output Ready: Set high by core when RESULT is valid. |

1. A, B and OPERATION are not registered on the input to the core. Should this be required, registers can be added to these inputs externally to the core.
A
Operand A input.

B
Operand B input.

CLK
All signals are synchronous to the CLK input.

CE
When CE is deasserted, the clock is disabled, and the state of the core and its outputs are maintained.

SCLR
When SCLR is asserted, the core control is synchronously set to its initial state. Any incomplete results are discarded, and RDY will not be generated for them. While SCLR is asserted, both OPERATION_RFD and RDY are deasserted. The core is ready for new input one cycle after SCLR is deasserted, at which point OPERATION_RFD is asserted.

OPERATION
OPERATION is present when add and subtract operations are selected together, or when a programmable comparator is selected. The operations are binary encoded as specified in Table 3.
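Table 3: Encoding of OPERATION

| FP operation | OPERATION (5 downto 0) |
|---|---|
| Add | 000000 |
| Subtract | 000001 |
| Compare (Programmable): Unordered | 000100 |
| Compare (Programmable): Less Than | 001100 |
| Compare (Programmable): Equal | 010100 |
| Compare (Programmable): Less Than or Equal | 011100 |
| Compare (Programmable): Greater Than | 100100 |
| Compare (Programmable): Not Equal | 101100 |
| Compare (Programmable): Greater Than or Equal | 110100 |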
OPERATION_ND
OPERATION_ND should be asserted when operands are valid on inputs A and B and the FP operation is valid on OPERATION (should it be required). Deasserting OPERATION_ND will prevent the initiation of new operations and the subsequent assertion of RDY.

Note: OPERATION_ND is required to synchronize operations when the core is configured to perform a multi-cycle divide or square root.

OPERATION_RFD
OPERATION_RFD is asserted by the core to indicate that it is ready to accept new operands on inputs A, B, and OPERATION. A new operation will be initiated by the core when both OPERATION_ND and OPERATION_RFD are asserted together.

RESULT
If the operation is compare, then the valid bits within the result depend upon the compare operation selected. If the operation is one of those listed in Table 3, then only the least significant bit of the result indicates whether the comparison is true or false. If the operation is condition code, then 4 bits provide the results of the comparison using the encoding summarized in Table 4. See IEEE-754 Standard for a more complete listing of the meanings of all the valid comparison results.
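Table 4: Condition Code Summary

| Compare Operation | Result(3) | Result(2) | Result(1) | Result(0) | Result |
|---|---|---|---|---|---|
| Programmable | - | - | - | 0 | A OP B = False |
| Programmable | - | - | - | 1 | A OP B = True |

| Compare Operation | Result(3): Unordered | Result(2): > | Result(1): < | Result(0): EQ | Meaning |
|---|---|---|---|---|---|
| Condition Code | 0 | 0 | 0 | 1 | A = B |
| Condition Code | 0 | 0 | 1 | 0 | A < B |
| Condition Code | 0 | 0 | 1 | 1 | A <= B |
| Condition Code | 0 | 1 | 0 | 0 | A > B |
| Condition Code | 0 | 1 | 0 | 1 | A >= B |
| Condition Code | 0 | 1 | 1 | 0 | A <> B |
| Condition Code | 1 | 0 | 0 | 0 | A, B or both are NaN. |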
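As a reading aid for Table 4, the following Python sketch unpacks the four condition-code bits of RESULT into named flags. The function name `decode_condition_code` and the dictionary keys are hypothetical; the bit assignment (bit 3 unordered, bit 2 greater than, bit 1 less than, bit 0 equal) is taken from the column order of Table 4.

```python
def decode_condition_code(result: int) -> dict:
    """Unpack RESULT(3 downto 0) of a condition-code compare into named flags."""
    return {
        "unordered": bool(result & 0b1000),   # A, B or both are NaN
        "greater": bool(result & 0b0100),     # A > B
        "less": bool(result & 0b0010),        # A < B
        "equal": bool(result & 0b0001),       # A = B
    }
```

Composite relations can then be formed by combining flags, for example "less or equal" as `flags["less"] or flags["equal"]`.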
The following signals provide exception information. Additional detail on their behavior can be found in the IEEE-754 Standard.

UNDERFLOW
Underflow is signalled when the operation generates a non-zero result which is too small to be represented with the chosen precision. The result is set to zero. Underflow is detected after rounding.

Note: A number that becomes de-normalized before rounding will be set to zero and underflow signalled.
OVERFLOW
Overflow is signalled when the operation generates a result that is too large to be represented with the chosen precision. The output is set to a correctly signed ∞.

INVALID_OP
Invalid operation is signalled when the operation performed is invalid. According to the IEEE-754 Standard, the following are invalid operations:
1. Any operation on a signalling NaN. (Note that this is not relevant here.)
2. Addition or subtraction of infinite values where the sign of the result cannot be determined. For example, magnitude subtraction of infinities such as (+∞) + (-∞).
3. Multiplication where 0 × ∞.
4. Division where 0 / 0 or ∞ / ∞.
5. Square root if the operand is less than zero.
6. When the input of a conversion cannot be correctly signalled by the result (for example NaN or infinity).

When an invalid operation occurs, the associated result is a Quiet NaN. In the case of floating-point to fixed-point conversion, NaN and infinity raise an invalid operation exception. If the operand is out of range, or an infinity, then an overflow exception is raised. By analyzing the two exception signals it is possible to determine which of the three types of operand were converted. (See Table 5.)
Table 5: Invalid Operation Summary

| Operand | Invalid Operation | Overflow | Result |
|---|---|---|---|
| + Out of Range | 0 | 1 | 011...11 |
| - Out of Range | 0 | 1 | 100...00 |
| + Infinity | 1 | 1 | 011...11 |
| - Infinity | 1 | 1 | 100...00 |
| NaN | 1 | 0 | 100...00 |
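Table 5 can be mirrored in a software model of floating-point to fixed-point conversion. The Python sketch below is an illustration under stated assumptions, not the core's implementation: the function name `float_to_fixed_flags` is hypothetical, the operand is passed as a Python float, and Python's built-in `round` (round half to even) stands in for Round to Nearest.

```python
import math


def float_to_fixed_flags(x: float, w: int, wf: int):
    """Return (result, invalid_operation, overflow) for a w-bit result with wf fraction bits."""
    most_pos = (1 << (w - 1)) - 1          # bit pattern 011...11
    most_neg = 1 << (w - 1)                # bit pattern 100...00
    if math.isnan(x):
        return most_neg, 1, 0              # NaN: invalid operation only
    if math.isinf(x):
        return (most_pos if x > 0 else most_neg), 1, 1   # infinity: both flags
    scaled = round(x * (1 << wf))          # scale by 2^wf, then round to nearest
    if scaled > most_pos:
        return most_pos, 0, 1              # + out of range: saturate, overflow
    if scaled < -(1 << (w - 1)):
        return most_neg, 0, 1              # - out of range: saturate, overflow
    return scaled & ((1 << w) - 1), 0, 0   # in range: two's complement bit pattern
```

The returned result is a raw bit pattern, so an in-range negative value such as -1 with w = 8 appears as 0xFF.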
When the operand is a NaN, the result is set to the most negative representable number. When the operand is infinity or an out-of-range floating-point number, the result is saturated to the most positive or most negative number, depending upon the sign of the operand.

Note: Floating-point to fixed-point conversion does not treat a NaN as a Quiet NaN, because NaN is not representable within the resulting fixed-point format, and so can only be indicated through an invalid operation exception.

DIVIDE_BY_ZERO
Division of a number by zero is signalled when a divide operation is performed where the divisor is zero and the dividend is a finite non-zero number. The result in this circumstance is a correctly signed ∞.

RDY
RDY is asserted by the core to indicate that RESULT is valid. RDY can be used to qualify the result of a multi-cycle operation (i.e., divide or square root operations with rate greater than 1).
Example Timing
An example of signal timing is given in Figure 4 for square-root with latency 4 and rate 3. The result is provided four cycles after an active OPERATION_ND. In this example, new inputs are applied every three cycles, in accordance with the maximum rate and OPERATION_RFD output. (Data could be applied less frequently, in which case OPERATION_RFD would stay High until OPERATION_ND was asserted with the new input.) The RDY output indicates when RESULT, and any exception flags, are valid. In this example, an invalid operation exception has been generated with result R2.
Figure 4: Example Timing Diagram (signals shown: CLK, CE, A, OPERATION_ND, OPERATION_RFD, RDY, RESULT, INVALID_OP; operands A1, A2, A3; results R1, R2)
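The relationship between rate, latency, and the cycles on which results appear can be sketched numerically. The Python function below is a simplified model with assumed conventions: the name `schedule` is hypothetical, cycle 0 is the cycle of the first active OPERATION_ND, CE is held high, and each new operand is applied as soon as the rate allows.

```python
def schedule(num_operands: int, latency: int, rate: int):
    """Return (input_cycle, result_cycle) pairs for operands applied at the maximum rate."""
    return [(k * rate, k * rate + latency) for k in range(num_operands)]
```

For the square-root example with latency 4 and rate 3, `schedule(3, 4, 3)` places the three operands on cycles 0, 3, and 6, with the corresponding results four cycles later in each case.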
Customizing the Core

Floating-Point core customization options are provided to optimize the core for specific requirements. The core can be customized in either of two ways:
- Using the CORE Generator Graphical User Interface (GUI)
- Instancing the core with appropriate generic values in VHDL and synthesizing the core using XST
Using the CORE Generator GUI

The Floating-Point core GUI can be used to configure:
- Core performance
- Implementation optimizations, including wordlength
- Core operation
- Inclusion or exclusion of optional pins

Floating-Point Customization Options
From the CORE Generator, the Floating-Point core is located in the Math Functions category. Up to five customization screens are provided, depending on the options you select. All five screens share common navigation options:
- Next and Back: Moves forward or backward one screen at a time, respectively.
- Finish: Generates the core with the currently configured parameters.
- Cancel: Cancels generation of the core and returns to the first screen of the GUI.
- View Data Sheet: Displays this document, a PDF file of the core product specification.

Opening the Floating-Point Main Screen
1. Start the CORE Generator.
2. Select Floating-Point 3.0 from the Math Functions category at the left side of the application window.
3. Do one of the following to display the Floating-Point main configuration screen:
   - Double-click Floating-Point 3.0 in the Math Functions category.
   - Select Floating-Point in the Math Functions category; then click Customize at the right side of the CORE Generator window.

Main Configuration Screen
The main configuration screen allows the following parameters to be specified:
- Component Name
- Operation Type
Component Name
The component name is used as the base name of the output files generated for the core. Names must start with a letter and be composed using the following characters: a to z, 0 to 9, and _.
Operation Type
The floating-point operation may be one of the following:
- Add/Subtract
- Multiply
- Divide
- Square-root
- Compare
- Fixed-to-float
- Float-to-fixed
- Float-to-float

When Add/Subtract is selected, it is possible for the core to perform both operations, or just add or subtract. When both are selected, the operation performed on a particular set of operands is controlled by the OPERATION input (with encoding defined earlier in Table 3).

When Add/Subtract or Multiply is selected, the level of embedded multiplier usage can be specified as described in the Penultimate Configuration Screen section.

When Compare is selected, the compare operation may be programmable or fixed. If programmable, then the compare operation performed should be supplied via the OPERATION input (with encoding defined earlier in Table 3). If a fixed operation is required, then the operation type should be selected.

When Float-to-float conversion is selected, and exponent and fraction widths of the input and result are the same, the core provides a means to condition numbers, i.e., convert denormalized numbers to zero, and signalling NaNs to quiet NaNs.
Second and Third Configuration Screens
Depending on the configuration you select from the first screen, the second and third configuration screens let you specify the precision of the operand and result.
Precision of the Operand and Results
This parameter defines the number of bits used to represent quantities. The type of the operands and results depends on the operation requested. For fixed-point conversion operations, either the operand or result is fixed-point. For all other operations, the output is specified as a floating-point type.

Note: For compare, depending upon operation selected, RESULT(3 down to 0) is used to indicate the result of the comparison operation.

Table 6 defines the general limits of the format widths.
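Table 6: General Limits of Width and Fraction Width

| Format Type | Fraction Width Min | Fraction Width Max | Exponent/Integer Width (Width-Fraction Width) Min | Exponent/Integer Width (Width-Fraction Width) Max | Width Min | Width Max |
|---|---|---|---|---|---|---|
| Floating-Point | 4 | 64 | 4 | 16 | 4 | 64 |
| Fixed-Point | 0 | 63 | 1 | 64 | 4 | 64 |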
There are also a number of further limits for specific cases:
- The exponent width (i.e., width - fraction width) should be chosen to support normalization of the fractional part. This can be calculated using:
  exponent width = ceil[log2(fraction width + 3)] + 1
  For example, a 24-bit fractional part requires an exponent of at least 6 bits (that is, ceil[log2(27)] + 1). The GUI enforces these limits.
- For the logic-assisted multiplier (that is, when multiplier usage is medium), only the precisions noted in Table 8 are supported (single and double precision on Virtex-4 FPGAs, single precision on Virtex-5 FPGAs).
- For conversion operations, the exponent width of the floating-point input or output can be calculated using:
  exponent width = ceil[log2(width + 3)] + 1
  For example, a 32-bit integer will require a minimum exponent of 7 bits.

A summary of the width limits imposed by exponent width is provided in Table 7.
Table 7: Summary of Exponent Width Limits

| Exponent Width | Floating-Point Fraction Width or Fixed-Point Total Width |
|---|---|
| 4 | 4 to 5 |
| 5 | 6 to 13 |
| 6 | 14 to 29 |
| 7 | 30 to 61 |
| 8 | 62 to 64 |
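The exponent-width rule stated before Table 7 can be evaluated with integer arithmetic, which avoids any floating-point rounding in the logarithm. In this Python sketch the function name `min_exponent_width` is hypothetical; it relies on the identity ceil(log2(m)) = (m - 1).bit_length() for integers m >= 1.

```python
def min_exponent_width(n: int) -> int:
    """Minimum exponent width: ceil(log2(n + 3)) + 1.

    n is the fraction width of a floating-point format, or the total
    width of a fixed-point format for conversion operations.
    """
    return (n + 2).bit_length() + 1   # (n + 3 - 1).bit_length() == ceil(log2(n + 3))
```

Evaluating the function for a 24-bit fraction gives 6 and for a 32-bit integer gives 7, in agreement with the two worked examples in the text.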
Penultimate Configuration Screen
The penultimate configuration screen lets you specify the following:
- Architecture Optimizations
- Family Optimizations
- Cycles Per Operation (Rate)
Architecture Optimizations
For addition/subtraction on Virtex-5 FPGAs, it is possible to specify a latency optimized architecture, or speed optimized architecture. The latency optimized architecture offers reduced latency at the expense of increased resources.
Family Optimizations
Multiplier Usage: Allows the type and level of embedded multiplier usage to be specified.
Multiplier Usage
The level of embedded multiplier usage can be specified. The level and type of multiplier usage also depend upon the operation and FPGA family. Table 8 summarizes these options for multiplication.
Table 8: Impact of Family and Multiplier Usage on the Implementation of the Multiplier

| Multiplier Usage | Virtex-II, Virtex-II Pro, Spartan-3E | Virtex-4 | Virtex-5 |
|---|---|---|---|
| No usage | Logic | Logic | Logic |
| Medium usage | Not supported | DSP48+logic (1) in multiplier body | DSP48E+logic (1) in multiplier body |
| Full usage | MULT18X18 | DSP48 used in multiplier body | DSP48E used in multiplier body |
| Max usage | Not supported | DSP48 multiplier body and rounder | DSP48E multiplier body and rounder |

1. Logic-assisted multiplier variant is only available for single and double precision in Virtex-4 FPGAs and single precision in Virtex-5 FPGAs.
Table 9 summarizes these options for addition/subtraction.

Table 9: Impact of Family, Precision, and Multiplier Usage on the Implementation of the Adder/Subtractor

| Family | Precision | No usage | Full usage |
|---|---|---|---|
| Virtex-II, Virtex-II Pro, Spartan-3E | Any | Logic | Not supported |
| Virtex-4 | Other | Logic | Not supported |
| Virtex-4 | Single | Logic | 4 DSP48 |
| Virtex-4 | Double | Logic | 3 DSP48 |
| Virtex-5 | Other | Logic | Not supported |
| Virtex-5 | Single | Logic | 2 DSP48E |
| Virtex-5 | Double | Logic | 3 DSP48E |

Only valid values of multiplier usage are listed.
Latency
This parameter describes the number of cycles between an operand input and result output. The latency of all operators (apart from the logic-assisted, double-precision multipliers on Virtex-4 devices) can be set between 0 and a maximum value that is dependent upon the parameters chosen. The maximum latency of the Floating-Point core is tabulated for a range of width and operation types in Table 10 through Table 19.

The maximum latency of the divide and square root operations is fraction width + 4, and for the compare operation it is three cycles. The maximum latency of the float-to-float conversion operation is three cycles when either mantissa or exponent width is being reduced; otherwise it is two cycles. Note that it is two cycles even when the input and result widths are the same, as the core provides conditioning in this situation (see Operation Type for further details).

Note: The maximum latency of certain operations has been increased over Floating Point Operator v2.0 to increase maximum clock frequency. If the previous maximum latency value is specified, then an equivalent implementation will be obtained.
Table 10: Latency of Floating-Point Multiplication using Logic Only

| Fraction Width | Maximum Latency (clock cycles) |
|---|---|
| 4 to 5 | 5 |
| 6 to 11 | 6 |
| 12 to 23 | 7 |
| 24 to 47 (single) | 8 |
| 48 to 64 (double) | 9 |

Table 11: Latency of Floating-Point Multiplication using MULT18X18S

| Fraction Width | Maximum Latency (clock cycles) |
|---|---|
| 4 to 17 | 4 |
| 18 to 34 (single) | 6 |
| 35 to 51 | 7 |
| 52 to 64 (double) | 8 |

Table 12: Latency of Floating-Point Multiplication using DSP48

| Fraction Width | Maximum Latency (clock cycles): Medium Usage | Full Usage | Max Usage |
|---|---|---|---|
| 4 to 17 | - | 6 | 8 |
| 18 to 34 (single) | 9 (1) | 10 | 11 |
| 35 to 51 | - | 15 | 16 |
| 52 to 64 (double) | 17 (2) | 22 | 23 |

1. Single precision only
2. Double precision only
Table 13: Latency of Floating-Point Multiplication using DSP48E

| Fraction Width | Maximum Latency (clock cycles): Medium Usage | Full Usage | Max Usage |
|---|---|---|---|
| 4 to 17 | - | 6 | 8 |
| 18 to 24 (single) | 8 (1) | 8 | 9 |
| 25 to 34 | - | 10 | 11 |
| 35 to 41 | - | 12 | 13 |
| 42 to 51 | - | 15 | 16 |
| 52 to 58 (double) | - | 18 | 19 |
| 59 to 64 | - | 22 | 23 |

1. Single precision only

Table 14: Latency of Floating-Point Addition using Medium Usage and DSP48/DSP48E

| Fraction Width | Maximum Latency (clock cycles): DSP48 | DSP48E |
|---|---|---|
| 24 (single) | 16 | 16 |
| 53 (double) | 12 | 15 |

Table 15: Latency of Floating-Point Addition using Logic on Families Other than Virtex-5 FPGAs

| Fraction Width | Maximum Latency (clock cycles): Virtex-II, Virtex-II Pro, Spartan-3E, Virtex-4 |
|---|---|
| 4, 5 | 9 |
| 6 to 14 | 10 |
| 15 | 11 |
| 16, 17 | 12 |
| 18 to 29 | 13 |
| 30 to 62 | 14 |
| 63, 64 | 15 |

Table 16: Latency of Floating-Point Addition using Logic and Low-Latency Optimization on Virtex-5 FPGAs

| Fraction Width | Maximum Latency (clock cycles): Virtex-5 |
|---|---|
| single | 9 |
| double | 9 |
Table 17: Latency of Floating-Point Addition using Logic and Speed Optimization on Virtex-5 FPGAs

| Fraction Width | Maximum Latency (clock cycles): Virtex-5 |
|---|---|
| 4 to 13 | 8 |
| 14 | 9 |
| 15 | 10 |
| 16, 17 | 11 |
| 18 to 61 (single, double) | 12 |
| 62 to 64 | 13 |

Table 18: Latency of Fixed-point to Floating-Point Conversion

| Operand Width | Maximum Latency (Cycles) |
|---|---|
| 4 to 8 | 5 |
| 9 to 32 | 6 |
| 33 to 64 | 7 |

Table 19: Latency of Floating-Point to Fixed-point Conversion

| Maximum of (A Fraction Width+1) and Result Width | Maximum Latency (Cycles) |
|---|---|
| 4 | 4 |
| 5 to 16 | 5 |
| 17 to 64 | 6 |

Cycles Per Operation (Rate)
This parameter describes the minimum number of cycles that must elapse between inputs. This rate can be specified. A value of 1 allows operands to be applied on every clock cycle, and results in a fully-parallel circuit. A value greater than 1 enables hardware reuse. The number of slices consumed by the core reduces as the number of cycles per operation is increased. A value of 2 approximately halves the number of slices used. A fully sequential implementation is obtained when the value is equal to fraction width+1 for the square-root operation, and fraction width+2 for the divide operation.

Final Configuration Screen
The final configuration screen lets you specify the Optional Control and Exception Pins.
Optional Control and Exception Pins
Pins for the following signals are optional:
- Control Signals: OPERATION_ND, OPERATION_RFD, RDY, CE and SCLR control signals are optional.
- Exception Signals: UNDERFLOW, OVERFLOW, INVALID_OPERATION and DIVIDE_BY_ZERO signals are optional. The DIVIDE_BY_ZERO signal is only available when the divide operation is selected.
VHDL Interface
The Floating-Point core can be generated directly from a component instantiation in VHDL using XST. A component declaration has been provided in xilinxcorelib (as employed by XST). Also, a package of constant definitions has been provided to allow parameter values to be used that have meaningful names.

Note: If the generics are out of range, then the core will fail to synthesize. If this happens, either simulate the VHDL behavioral model (see VHDL Simulation) to obtain the reason for failure, or use the GUI to identify a suitable set of generics.

The valid generics, with their limits and default values, are listed in Table 20.
Table 20: Parameter File Information

| VHDL Generic | Valid Values | Default Value |
|---|---|---|
| C_FAMILY | virtex2, virtex2p, virtex4, virtex5, spartan3. Use one of these families for derivatives (such as spartan3e). | virtex2 |
| C_HAS_ADD | FLT_PT_TRUE, FLT_PT_FALSE. Can be used with C_HAS_SUBTRACT to obtain add and subtract capability. When both are selected the OPERATION port is used to specify required operation as defined in Table 3. | FLT_PT_FALSE |
| C_HAS_SUBTRACT | FLT_PT_TRUE, FLT_PT_FALSE. Can be used with C_HAS_ADD to obtain add and subtract capability. (See further comments under C_HAS_ADD.) | FLT_PT_FALSE |
| C_HAS_MULTIPLY | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_DIVIDE | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_SQRT | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_COMPARE | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_FIX_TO_FLT | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_FLT_TO_FIX | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_FLT_TO_FLT | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_A_WIDTH | Integer with range dependent upon other parameters as defined in Precision of the Operand and Results. | 32 |
| C_A_FRACTION_WIDTH | Integer with range dependent upon other parameters as defined in Precision of the Operand and Results. | 24 |
| C_B_WIDTH | Must be same as C_A_WIDTH. | 32 |
| C_B_FRACTION_WIDTH | Must be same as C_A_FRACTION_WIDTH. | 24 |
| C_RESULT_WIDTH | Must be same as C_A_WIDTH for all operations, other than conversion. | 32 |
| C_RESULT_FRACTION_WIDTH | Must be same as C_A_FRACTION_WIDTH for all operations, other than conversion. | 24 |
| C_COMPARE_OPERATION | FLT_PT_PROGRAMMABLE, FLT_PT_LESS_THAN, FLT_PT_EQUAL, FLT_PT_LESS_THAN_OR_EQUAL, FLT_PT_GREATER_THAN, FLT_PT_NOT_EQUAL, FLT_PT_GREATER_THAN_OR_EQUAL, FLT_PT_UNORDERED, FLT_PT_CONDITION_CODE. Specify when operation is compare. | FLT_PT_LESS_THAN |
| C_LATENCY | Integer with range dependent upon other parameters as defined in Latency. | FLT_PT_MAX_LATENCY |
| C_OPTIMIZATION | FLT_PT_SPEED_OPTIMIZED, FLT_PT_LOW_LATENCY. | FLT_PT_SPEED_OPTIMIZED |
| C_MULT_USAGE | FLT_PT_NO_USAGE, FLT_PT_MEDIUM, FLT_PT_FULL_USAGE, FLT_PT_MAX_USAGE. Use as described in Multiplier Usage. | FLT_PT_FULL_USAGE |
| C_HAS_CE | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_SCLR | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_OPERATION_ND | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_OPERATION_RFD | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_RDY | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_UNDERFLOW | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_OVERFLOW | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_INVALID_OPERATION | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
| C_HAS_DIVIDE_BY_ZERO | FLT_PT_TRUE, FLT_PT_FALSE | FLT_PT_FALSE |
Examples
The following is an example of an instantiation in VHDL that generates a single-precision adder with a full set of exception and control signals.
library xilinxcorelib;                              -- XST version
use xilinxcorelib.floating_point_v3_0_consts.all;   -- constants package
use xilinxcorelib.floating_point_v3_0_comp.all;     -- component declaration
.....
fp_add_single: floating_point_v3_0
  generic map (
    C_FAMILY                => "virtex4",
    C_HAS_ADD               => FLT_PT_TRUE,
    C_A_WIDTH               => 32,
    C_A_FRACTION_WIDTH      => 24,
    C_B_WIDTH               => 32,
    C_B_FRACTION_WIDTH      => 24,
    C_RESULT_WIDTH          => 32,
    C_RESULT_FRACTION_WIDTH => 24,
    C_HAS_SCLR              => FLT_PT_TRUE,
    C_HAS_OPERATION_ND      => FLT_PT_TRUE,
    C_HAS_OPERATION_RFD     => FLT_PT_TRUE,
    C_HAS_RDY               => FLT_PT_TRUE,
    C_HAS_UNDERFLOW         => FLT_PT_TRUE,
    C_HAS_OVERFLOW          => FLT_PT_TRUE,
    C_HAS_INVALID_OP        => FLT_PT_TRUE
  )
  port map (
    A             => a,
    B             => b,
    OPERATION_ND  => operation_nd,
    OPERATION_RFD => operation_rfd,
    CLK           => clk,
    SCLR          => sclr,
    RESULT        => result,
    UNDERFLOW     => underflow,
    OVERFLOW      => overflow,
    INVALID_OP    => invalid_op,
    RDY           => rdy
  );
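The same generics can request a different operator. The following is a sketch only, assuming a Virtex-4 target and double-precision widths (64-bit wordlength, 53-bit fraction). The wrapper entity fp_mult_double and its port names are hypothetical and are not supplied with the core. Generics that are not listed keep their default values, and ports that are not listed are left unconnected.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library xilinxcorelib;
use xilinxcorelib.floating_point_v3_0_consts.all;   -- constants package
use xilinxcorelib.floating_point_v3_0_comp.all;     -- component declaration

-- Hypothetical wrapper (name and ports are assumptions for illustration).
entity fp_mult_double is
  port (
    clk    : in  std_logic;
    a      : in  std_logic_vector(63 downto 0);
    b      : in  std_logic_vector(63 downto 0);
    result : out std_logic_vector(63 downto 0);
    rdy    : out std_logic
  );
end entity fp_mult_double;

architecture structural of fp_mult_double is
begin
  -- Double-precision multiplier using the maximum number of DSP48 blocks.
  fp_mult: floating_point_v3_0
    generic map (
      C_FAMILY                => "virtex4",
      C_HAS_MULTIPLY          => FLT_PT_TRUE,
      C_A_WIDTH               => 64,
      C_A_FRACTION_WIDTH      => 53,
      C_B_WIDTH               => 64,
      C_B_FRACTION_WIDTH      => 53,
      C_RESULT_WIDTH          => 64,
      C_RESULT_FRACTION_WIDTH => 53,
      C_MULT_USAGE            => FLT_PT_MAX_USAGE,
      C_HAS_RDY               => FLT_PT_TRUE
    )
    port map (
      A      => a,
      B      => b,
      CLK    => clk,
      RESULT => result,
      RDY    => rdy
    );
end architecture structural;
```

Because C_LATENCY is not set here, it takes the default FLT_PT_MAX_LATENCY listed in Table 20.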
VHDL Component Declaration
This component declaration is provided within xilinxcorelib in package floating_point_v3_0_comp. It is included below for reference. Note that there are a number of generics in addition to those defined in Table 20. The values of these additional generics should not be changed from their defaults, and they can be left unspecified when the component is instantiated (as done in the example). There are also a number of ports in addition to those listed in Table 2. These should be left unconnected; the inputs default to suitable values.
component floating_point_v3_0 is
  generic (
    C_FAMILY                : string  := C_FAMILY_DEFAULT;
    C_HAS_ADD               : integer := C_HAS_ADD_DEFAULT;
    C_HAS_MULTIPLY          : integer := C_HAS_MULTIPLY_DEFAULT;
    C_HAS_DIVIDE            : integer := C_HAS_DIVIDE_DEFAULT;
    C_HAS_SQRT              : integer := C_HAS_SQRT_DEFAULT;
    C_HAS_COMPARE           : integer := C_HAS_COMPARE_DEFAULT;
    C_HAS_FIX_TO_FLT        : integer := C_HAS_FIX_TO_FLT_DEFAULT;
    C_HAS_FLT_TO_FIX        : integer := C_HAS_FLT_TO_FIX_DEFAULT;
    C_HAS_FLT_TO_FLT        : integer := C_HAS_FLT_TO_FLT_DEFAULT;
    C_A_WIDTH               : integer := C_A_WIDTH_DEFAULT;
    C_A_FRACTION_WIDTH      : integer := C_A_FRACTION_WIDTH_DEFAULT;
    C_B_WIDTH               : integer := C_B_WIDTH_DEFAULT;
    C_B_FRACTION_WIDTH      : integer := C_B_FRACTION_WIDTH_DEFAULT;
    C_RESULT_WIDTH          : integer := C_RESULT_WIDTH_DEFAULT;
    C_RESULT_FRACTION_WIDTH : integer := C_RESULT_FRACTION_WIDTH_DEFAULT;
    C_COMPARE_OPERATION     : integer := C_COMPARE_OPERATION_DEFAULT;
    C_LATENCY               : integer := C_LATENCY_DEFAULT;
    C_OPTIMIZATION          : integer := C_OPTIMIZATION_DEFAULT;
    C_MULT_USAGE            : integer := C_MULT_USAGE_DEFAULT;
    C_RATE                  : integer := C_RATE_DEFAULT;
    C_HAS_ACLR              : integer := C_HAS_ACLR_DEFAULT;
    C_HAS_CE                : integer := C_HAS_CE_DEFAULT;
    C_HAS_SCLR              : integer := C_HAS_SCLR_DEFAULT;
    C_HAS_A_NEGATE          : integer := C_HAS_A_NEGATE_DEFAULT;
    C_HAS_B_NEGATE          : integer := C_HAS_B_NEGATE_DEFAULT;
    C_HAS_A_ND              : integer := C_HAS_A_ND_DEFAULT;
    C_HAS_A_RFD             : integer := C_HAS_A_RFD_DEFAULT;
    C_HAS_B_ND              : integer := C_HAS_B_ND_DEFAULT;
    C_HAS_B_RFD             : integer := C_HAS_B_RFD_DEFAULT;
    C_HAS_OPERATION_ND      : integer := C_HAS_OPERATION_ND_DEFAULT;
    C_HAS_OPERATION_RFD     : integer := C_HAS_OPERATION_RFD_DEFAULT;
    C_HAS_RDY               : integer := C_HAS_RDY_DEFAULT;
    C_HAS_CTS               : integer := C_HAS_CTS_DEFAULT;
    C_HAS_UNDERFLOW         : integer := C_HAS_UNDERFLOW_DEFAULT;
    C_HAS_OVERFLOW          : integer := C_HAS_OVERFLOW_DEFAULT;
    C_HAS_INVALID_OP        : integer := C_HAS_INVALID_OP_DEFAULT;
    C_HAS_INEXACT           : integer := C_HAS_INEXACT_DEFAULT;
    C_HAS_DIVIDE_BY_ZERO    : integer := C_HAS_DIVIDE_BY_ZERO_DEFAULT;
    C_HAS_STATUS            : integer := C_HAS_STATUS_DEFAULT;
    C_HAS_EXCEPTION         : integer := C_HAS_EXCEPTION_DEFAULT;
    C_STATUS_EARLY          : integer := C_STATUS_EARLY_DEFAULT
  );
  port (
    A              : in  std_logic_vector(C_A_WIDTH-1 downto 0);
    B              : in  std_logic_vector(C_B_WIDTH-1 downto 0) := (others => '0');
    A_NEGATE       : in  std_logic := '0';
    B_NEGATE       : in  std_logic := '0';
    OPERATION      : in  std_logic_vector(5 downto 0) := (others => '0');
    A_ND           : in  std_logic := '1';
    A_RFD          : out std_logic;
    B_ND           : in  std_logic := '1';
    B_RFD          : out std_logic;
    OPERATION_ND   : in  std_logic := '1';
    OPERATION_RFD  : out std_logic;
    CLK            : in  std_logic;
    SCLR           : in  std_logic := '0';
    ACLR           : in  std_logic := '0';
    CE             : in  std_logic := '0';
    RESULT         : out std_logic_vector(C_RESULT_WIDTH-1 downto 0);
    STATUS         : out std_logic_vector(2 downto 0);
    EXCEPTION      : out std_logic;
    UNDERFLOW      : out std_logic;
    OVERFLOW       : out std_logic;
    INVALID_OP     : out std_logic;
    INEXACT        : out std_logic;
    DIVIDE_BY_ZERO : out std_logic;
    RDY            : out std_logic;
    CTS            : in  std_logic := '1'
  );
end component;
VHDL Constants Package
Default generic values and constants for the valid values of the generics are provided within xilinxcorelib as the package floating_point_v3_0_consts. Some useful constants and the default generics are as follows:
constant FLT_PT_TRUE                     : integer := 1;
constant FLT_PT_FALSE                    : integer := 0;
constant FLT_PT_SPEED_OPTIMIZED          : integer := 1;
constant FLT_PT_NO_USAGE                 : integer := 0;
constant FLT_PT_MEDIUM_USAGE             : integer := 1;
constant FLT_PT_FULL_USAGE               : integer := 2;
constant FLT_PT_MAX_USAGE                : integer := 3;
constant FLT_PT_MAX_LATENCY              : integer := 1000;

-- Compare operation values
constant FLT_PT_UNORDERED                : integer := 0;
constant FLT_PT_LESS_THAN                : integer := 1;
constant FLT_PT_EQUAL                    : integer := 2;
constant FLT_PT_LESS_THAN_OR_EQUAL       : integer := 3;
constant FLT_PT_GREATER_THAN             : integer := 4;
constant FLT_PT_NOT_EQUAL                : integer := 5;
constant FLT_PT_GREATER_THAN_OR_EQUAL    : integer := 6;
constant FLT_PT_CONDITION_CODE           : integer := 7;
constant FLT_PT_PROGRAMMABLE             : integer := 8;
constant FLT_PT_OPERATION_WIDTH          : integer := 6;

-- defaults for generics
constant C_FAMILY_DEFAULT                : string  := "virtex2";
constant C_HAS_ADD_DEFAULT               : integer := FLT_PT_FALSE;
constant C_HAS_MULTIPLY_DEFAULT          : integer := FLT_PT_FALSE;
constant C_HAS_DIVIDE_DEFAULT            : integer := FLT_PT_FALSE;
constant C_HAS_SQRT_DEFAULT              : integer := FLT_PT_FALSE;
constant C_HAS_COMPARE_DEFAULT           : integer := FLT_PT_FALSE;
constant C_HAS_FIX_TO_FLT_DEFAULT        : integer := FLT_PT_FALSE;
constant C_HAS_FLT_TO_FIX_DEFAULT        : integer := FLT_PT_FALSE;
constant C_A_WIDTH_DEFAULT               : integer := 32;
constant C_A_FRACTION_WIDTH_DEFAULT      : integer := 24;
constant C_B_WIDTH_DEFAULT               : integer := 32;
constant C_B_FRACTION_WIDTH_DEFAULT      : integer := 24;
constant C_RESULT_WIDTH_DEFAULT          : integer := 32;
constant C_RESULT_FRACTION_WIDTH_DEFAULT : integer := 24;
constant C_COMPARE_OPERATION_DEFAULT     : integer := FLT_PT_LESS_THAN;
constant C_LATENCY_DEFAULT               : integer := FLT_PT_MAX_LATENCY;
constant C_OPTIMIZATION_DEFAULT          : integer := FLT_PT_SPEED_OPTIMIZED;
constant C_MULT_USAGE_DEFAULT            : integer := FLT_PT_FULL_USAGE;
constant C_RATE_DEFAULT                  : integer := 1;
constant C_HAS_ACLR_DEFAULT              : integer := FLT_PT_FALSE;
constant C_HAS_CE_DEFAULT                : integer := FLT_PT_FALSE;
constant C_HAS_SCLR_DEFAULT              : integer := FLT_PT_FALSE;
constant C_HAS_A_NEGATE_DEFAULT          : integer := FLT_PT_FALSE;
constant C_HAS_B_NEGATE_DEFAULT          : integer := FLT_PT_FALSE;
constant C_HAS_A_ND_DEFAULT              : integer := FLT_PT_FALSE;
constant C_HAS_A_RFD_DEFAULT             : integer := FLT_PT_FALSE;
constant C_HAS_B_ND_DEFAULT              : integer := FLT_PT_FALSE;
constant C_HAS_B_RFD_DEFAULT             : integer := FLT_PT_FALSE;
constant C_HAS_OPERATION_ND_DEFAULT      : integer := FLT_PT_FALSE;
constant C_HAS_OPERATION_RFD_DEFAULT     : integer := FLT_PT_FALSE;
constant C_HAS_RDY_DEFAULT               : integer := FLT_PT_FALSE;
constant C_HAS_CTS_DEFAULT               : integer := FLT_PT_FALSE;
constant C_HAS_UNDERFLOW_DEFAULT         : integer := FLT_PT_FALSE;
constant C_HAS_OVERFLOW_DEFAULT          : integer := FLT_PT_FALSE;
constant C_HAS_INVALID_OP_DEFAULT        : integer := FLT_PT_FALSE;
constant C_HAS_INEXACT_DEFAULT           : integer := FLT_PT_FALSE;
constant C_HAS_DIVIDE_BY_ZERO_DEFAULT    : integer := FLT_PT_FALSE;
constant C_HAS_STATUS_DEFAULT            : integer := FLT_PT_FALSE;
Simulation
VHDL Simulation
A cycle-accurate, bit-true VHDL simulation model exists for the Xilinx Floating-Point core within the xilinxcorelib library. For multi-cycle divide or square-root (in which case RATE>1), the model RESULT and exception flags may differ from the core in between valid outputs. RDY indicates when the RESULT and exception flags are valid and can be used to qualify these outputs from the model. Also note that the sign of a NaN is undefined, and the model and core may differ in this respect.

The xilinxcorelib library can be compiled using COMPXLIB for your particular simulator. See the ISE documentation for further details.

If the core has been directly instantiated within VHDL, then the xilinxcorelib library will already have been referenced in the code, and making the xilinxcorelib library available to the simulator will result in the behavioral model being used to simulate the core.

Note: When direct instantiation is used, XST will employ its own version of xilinxcorelib. The XST library is made available to XST when the IP download is installed.

Alternatively, a simulation wrapper file can be generated by CORE Generator. Within the wrapper file, the generics on the component instance are set to the same values used to generate the core. For further details on how to generate a simulation wrapper file, see the VHDL Design Flow within the CORE Generator documentation.
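The use of RDY to qualify RESULT and the exception flags during simulation can be sketched as follows. This is a minimal illustration and not part of the core or its simulation library: the entity rdy_capture, its generic RESULT_WIDTH, and the captured_* port names are all assumptions made for the example. Only the RDY, RESULT, and INVALID_OP signals correspond to core ports.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper: registers RESULT and one exception flag only on
-- cycles where RDY is high, so that values seen between valid outputs
-- (where model and core may differ) are never passed downstream.
entity rdy_capture is
  generic (
    RESULT_WIDTH : integer := 32   -- assumed; set to match C_RESULT_WIDTH
  );
  port (
    clk            : in  std_logic;
    rdy            : in  std_logic;   -- RDY output of the core
    result         : in  std_logic_vector(RESULT_WIDTH-1 downto 0);
    invalid_op     : in  std_logic;   -- INVALID_OP output of the core
    captured       : out std_logic_vector(RESULT_WIDTH-1 downto 0);
    captured_inv   : out std_logic;
    captured_valid : out std_logic
  );
end entity rdy_capture;

architecture rtl of rdy_capture is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Flag whether the registered outputs were updated on this cycle.
      captured_valid <= rdy;
      if rdy = '1' then
        captured     <= result;
        captured_inv <= invalid_op;
      end if;
    end if;
  end process;
end architecture rtl;
```

Comparing only the captured outputs, rather than the raw RESULT, keeps a testbench insensitive to the differences between valid outputs noted above. The sign of a NaN result would still need to be masked separately.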
Verilog Simulation
A Verilog model of the Xilinx Floating-Point core is not supplied within the Verilog version of xilinxcorelib. However, a Verilog structural simulation model for a specific core can be generated by CORE Generator. See the Generation Panel under Project Options for the controls to enable this. For further details, see Verilog Design Flow within the CORE Generator documentation.
Resource Utilization and Performance

The resource requirements and maximum clock rates achievable on Spartan-3E, Virtex-4, and Virtex-5 FPGAs are summarized as follows for the case of maximum latency.

Note: LUT and FF usage and maximum frequency all decrease as latency is reduced. Minimizing latency will therefore minimize resources.

Custom Format: 17-Bit Fraction and 24-Bit Total Wordlength
The resource requirements and maximum clock rates achievable with 17-bit fraction and 24-bit total wordlength on Spartan-3E FPGAs are summarized in Table 21, on Virtex-4 in Table 22, and on Virtex-5 in Table 23.
Table 21: Characterization of 17-Bit Fraction and 24-Bit Total Wordlength on Spartan-3E FPGA
(Maximum frequency is for Spartan-3E, speed grade -4.)

Operation       Type                     Embedded  Fabric  Fabric  Max Freq
                                         Number    LUTs    FFs     (MHz)(1)
Multiply        MULT18X18S (full usage)  1         92      110     167
Multiply        Logic (no usage)         0         339     401     172
Add/Subtract    Logic (no usage)         -         438     430     195
Fixed to float  Int24 input              -         166     176     214
Float to fixed  Int24 result             -         195     190     217
Float to float  Single to 24-17 format   -         66      89      183
Float to float  24-17 to single          -         13      52      252
Compare         Programmable             -         82      24      223
Divide          C_RATE=1                 -         460     751     203
Divide          C_RATE=19                -         172     186     173
Sqrt            C_RATE=1                 -         308     447     209
Sqrt            C_RATE=18                -         160     161     186

1. Maximum frequency obtained with map switches -ol high and -cm speed, and par switches -pl high and -rl high.
Table 22: Characterization of 17-Bit Fraction and 24-Bit Total Wordlength on Virtex-4 FPGA
(Maximum frequency is for Virtex-4, speed grade -10.)

Operation       Type                     Embedded  Fabric  Fabric  Max Freq
                                         Number    LUTs    FFs     (MHz)(1)
Multiply        DSP48 (max usage)        2         89      133     396
Multiply        DSP48 (full usage)       1         103     129     354
Multiply        Logic (no usage)         0         345     401     267
Add/Subtract    Logic (no usage)         -         388     430     324
Fixed to float  Int24 input              -         166     178     349
Float to fixed  Int24 result             -         217     188     346
Float to float  Single to 24-17 format   -         87      89      322
Float to float  24-17 to single          -         33      52      497
Compare         Programmable             -         79      24      338
Divide          C_RATE=1                 -         460     751     313
Divide          C_RATE=19                -         200     188     290
Sqrt            C_RATE=1                 -         308     447     325
Sqrt            C_RATE=18                -         159     158     304
Table 23: Characterization of 17-Bit Fraction and 24-Bit Total Wordlength on Virtex-5 FPGA
(Maximum frequency is for Virtex-5, speed grade -1.)

Operation       Type                     Embedded  Fabric  Fabric  Max Freq
                                         Number    LUTs    FFs     (MHz)(1)
Multiply        DSP48E (max usage)       2         74      133     450
Multiply        DSP48E (full usage)      1         93      129     406
Multiply        Logic (no usage)         0         340     401     328
Add/Subtract    Logic (no usage)         -         341     408     399
Fixed to float  Int24 input              -         139     137     355
Float to fixed  Int24 result             -         166     188     396
Float to float  Single to 24-17 format   -         74      89      418
Float to float  24-17 to single          -         35      52      478
Compare         Programmable             -         66      24      394
Divide          C_RATE=1                 -         457     751     394
Divide          C_RATE=19                -         179     189     343
Sqrt            C_RATE=1                 -         332     447     430
Sqrt            C_RATE=18                -         135     158     411
Single-Precision Format
The resource requirements and maximum clock rates achievable with single-precision format on Spartan-3E FPGAs are summarized in Table 24, on Virtex-4 in Table 25, and on Virtex-5 in Table 26.
Table 24: Characterization of Single-Precision Format on Spartan-3E FPGA
(Maximum frequency is for Spartan-3E, speed grade -4.)

Operation       Type                     Embedded  Fabric  Fabric  Max Freq
                                         Number    LUTs    FFs     (MHz)(1)
Multiply        MULT18X18S               4         185     275     162
Multiply        Logic (no usage)         0         630     696     171
Add/Subtract    Logic (no usage)         0         580     591     230
Fixed to float  Int32 input              -         221     227     198
Float to fixed  Int32 result             -         251     237     195
Float to float  Single to double         -         15      101     254
Compare         Programmable             -         100     24      213
Divide          C_RATE=1                 -         824     1,370   186
Divide          C_RATE=26                -         234     229     159
Sqrt            C_RATE=1                 -         513     787     189
Sqrt            C_RATE=25                -         214     206     181
Table 25: Characterization of Single-Precision Format on Virtex-4 FPGA
(Maximum frequency is for Virtex-4, speed grade -10.)

Operation       Type                     Embedded  Fabric  Fabric  Max Freq
                                         Number    LUTs    FFs     (MHz)(1)
Multiply        DSP48 (max usage)        5         116     235     391
Multiply        DSP48 (full usage)       4         139     259     353
Multiply        DSP48 (medium usage)     1         509     562     279
Multiply        Logic (no usage)         0         641     698     274
Add/Subtract    DSP48 (full usage)       4         372     466     382
Add/Subtract    Logic (no usage)         0         578     594     368
Fixed to float  Int32 input              -         226     233     318
Float to fixed  Int32 result             -         282     238     305
Float to float  Single to double         -         43      101     474
Compare         Programmable             -         97      24      337
Divide          C_RATE=1                 -         824     1,370   278
Divide          C_RATE=26                -         262     229     260
Sqrt            C_RATE=1                 -         513     787     305
Sqrt            C_RATE=25                -         213     206     280
Table 26: Characterization of Single-Precision Format on Virtex-5 FPGA
(Maximum frequency is for Virtex-5, speed grade -1.)

Operation       Type                                  Embedded  Fabric  Fabric  Max Freq
                                                      Number    LUTs    FFs     (MHz)(1)
Multiply        DSP48E (max usage)                    3         88      177     450
Multiply        DSP48E (full usage)                   2         126     209     429
Multiply        DSP48E (medium usage)                 1         294     390     375
Multiply        Logic                                 0         641     698     357
Add/Subtract    DSP48E (speed optimized, full usage)  2         267     375     410
Add/Subtract    Logic (speed optimized, no usage)     0         429     561     395
Add/Subtract    Logic (low latency)                   0         536     625     372
Fixed to float  Int32 input                           -         181     226     398
Float to fixed  Int32 result                          -         218     237     373
Float to float  Single to double                      -         44      101     466
Compare         Programmable                          -         80      24      393
Divide          C_RATE=1                              -         788     1,370   365
Divide          C_RATE=26                             -         227     233     316
Sqrt            C_RATE=1                              -         542     787     398
Sqrt            C_RATE=25                             -         175     204     388
Double-Precision Format
The resource requirements and maximum clock rates achievable with double-precision format on Spartan-3E FPGAs are summarized in Table 27, on Virtex-4 in Table 28, and on Virtex-5 in Table 29.
Table 27: Characterization of Double-Precision Format on Spartan-3E FPGA
(Maximum frequency is for Spartan-3E, speed grade -4.)

Operation       Type                     Embedded  Fabric  Fabric  Max Freq
                                         Number    LUTs    FFs     (MHz)(1)
Multiply        MULT18X18S (full usage)  16        681     1,075   104
Multiply        Logic (no usage)         0         2,296   2,451   124
Add/Subtract    Logic (no usage)         0         1,272   1,171   192
Fixed to float  Int64 input              -         563     506     103
Float to fixed  Int64 result             -         523     446     145
Float to float  Double to single         -         95      113     168
Compare         Programmable             -         164     24      172
Divide          C_RATE=1                 -         3,335   6,002   126
Divide          C_RATE=26                -         437     403     120
Sqrt            C_RATE=1                 -         1,904   3,234   132
Sqrt            C_RATE=25                -         444     393     131
Table 28: Characterization of Double-Precision Format on Virtex-4 FPGA

Operation       Type                     Embedded  Fabric  Fabric  Max Freq
                                         Number    LUTs    FFs     (MHz)(1)
Multiply        DSP48 (max usage)        17        551     759     381
Multiply        DSP48 (full usage)       16        550     774     303
Multiply        DSP48 (medium usage)     9         1,332   1,658   229
Multiply        Logic (no usage)         0         2,311   2,457   185
Add/Subtract    DSP48 (full usage)       3         1,220   1,139   324
Add/Subtract    Logic (no usage)         0         1,274   1,139   284
Fixed to float  Int64 input              -         565     506     219
Float to fixed  Int64 result             -         523     447     214
Float to float  Double to single         -         121     113     310
Compare         Programmable             -         161     24      246
Divide          C_RATE=1                 -         3,335   6,002   192
Divide          C_RATE=26                -         437     401     182
Sqrt            C_RATE=1                 -         1,904   3,234   205
Sqrt            C_RATE=25                -         445     392     201
Table 29: Characterization of Double-Precision Format on Virtex-5 FPGA

Operation       Type                                  Embedded  Fabric  Fabric  Max Freq
                                                      Number    LUTs    FFs     (MHz)(1)
Multiply        DSP48E (max usage)                    13        416     654     431
Multiply        DSP48E (full usage)                   12        424     669     369
Multiply        Logic                                 0         2,309   2,457   237
Add/Subtract    DSP48E (speed optimized, full usage)  3         821     976     356
Add/Subtract    Logic (speed optimized, no usage)     0         804     1,060   316
Add/Subtract    Logic (low latency, no usage)         0         1,045   1,185   291
Fixed to float  Int64 input                           -         402     504     306
Float to fixed  Int64 result                          -         396     446     297
Float to float  Double to single                      -         107     131     438
Compare         Programmable                          -         142     24      346
Divide          C_RATE=1                              -         3,228   6,002   254
Divide          C_RATE=26                             -         354     399     252
Sqrt            C_RATE=1                              -         1,940   3,234   284
Sqrt            C_RATE=25                             -         355     391     278
Ordering Information
This core may be downloaded from the Xilinx IP Center for use with the Xilinx CORE Generator v8.2i. The Xilinx CORE Generator is bundled with the ISE Foundation Series Development software at no additional charge. Information about additional Xilinx LogiCORE modules is available on the Xilinx IP Center or by contacting your local Xilinx sales representative.
Support
Support is provided by Xilinx, Inc. at www.xilinx.com/support.
References
1. ANSI/IEEE, IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Standard 754-1985.
Revision History
This table shows the revision history of this document.

Date      Version  Revision
04/28/05  1.0      Initial Xilinx release.
07/27/05  1.1      Document modified to include minor corrections and section on simulation.
01/18/06  2.0      Updated to version 2.0 of core, Xilinx tools v8.1i.
09/28/06  3.0      Updated to version 3.0 of core, Xilinx tools v8.2i.
