Commit fa10260

author
coder0xff
committed
Updated readme file.
1 parent 9133f9b commit fa10260

2 files changed

+63
-51
lines changed

QPFloat/QPFloat.vcxproj

Lines changed: 0 additions & 2 deletions
@@ -97,9 +97,7 @@
97 97
</PropertyGroup>
98 98
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|x64'">
99 99
<LinkIncremental>true</LinkIncremental>
100-
<IncludePath />
101 100
<LibraryPath />
102-
<ExecutablePath />
103 101
</PropertyGroup>
104 102
<PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Test|Win32'">
105 103
<LinkIncremental>true</LinkIncremental>

README.txt

Lines changed: 63 additions & 49 deletions
@@ -1,49 +1,63 @@
1-
QPFloat 0.3 beta
2-
Release under the GPL 3.0 License. See COPYING.txt
3-
4-
***** FEATURES *****
5-
6-
This library emulates the quadruple precision floating-point (IEEE-754-2008 binary128). It includes:
7-
Primitive operations such as addition, subtraction, multiplication, and division
8-
Higher operations such as natural logarithm, arbitrary base logarithm, exp, and pow
9-
Cieling, Floor, Round, Truncate, and Fraction (Fraction returns just the fractional portion)
10-
Sin, Cos, Tan, ASin, ACos, ATan, ATan2 implemented using Maclaurin series
11-
ToString function
12-
Hard coded constants such as Pi and e to full quadruple precision
13-
Implemented on little-endian, may or may not work on big-endian
14-
Follows the IEEE in-memory format on little-endian machines (transferable to and from hardware)
15-
arithmetic on sub-normals
16-
round-to-even
17-
correct propogation Inf, -Inf, and NaN
18-
enablable exception mechanisms
19-
20-
This library contains both an unmanaged implementation and a managed implementation (atop the unmanaged).
21-
22-
***** USING *****
23-
24-
VB.Net and C# users:
25-
Add a reference to Release/QPFloat.dll (x64/Release/QPFloat.dll for x64)
26-
create new variables using the System.Quadruple type (C# can do "using System;" so that you can just type Quadruple)
27-
28-
VC++/CLI users:
29-
Add a reference to Release/QPFloat.dll (x64/Release/QPFloat.dll for x64)
30-
create new variables using System::Quadruple
31-
32-
C++ users:
33-
Add a reference to Release/QPFloat.dll (x64/Release/QPFloat.dll for x64)
34-
#include "__float128.h"
35-
create new variables using __float128
36-
37-
***** COMPILING *****
38-
39-
Microsoft Visual C++ users:
40-
This code base can be compiled with or without /clr (Managed C++) to the compiler.
41-
If /clr IS used, then this library can be used by C# and VB.Net via the type Quadruple, the same way as Double
42-
If /clr IS NOT used, #ifdef _MANAGED has been used to automatically exclude the managed implementation.
43-
Since extension methods can't be used to add static functions to System.Math, Operations like Abs, Sin, etc are static methods of Quadruple
44-
__float128 is the unmanaged (faster) implementation, which is still present even when /clr is used.
45-
46-
Other compiler users:
47-
This code uses #ifdef to remove Microsoft specific functionality automatically (#ifdef _MANAGED)
48-
Though I've not tried, it should be relatively easy to build using other compilers.
49-
following existing conventions, __float128 is the type proffered by this library.
1+
# QPFloat (GPL 3.0) #
2+
3+
For high-precision mathematics, the Quadruple-Precision Floating Point library (QPFloat) emulates the IEEE 754-2008 binary128 format on x86 and x64 (and probably any other little-endian platform) using integer arithmetic and bit manipulation. It contains a native C++ and assembler implementation and a .Net C++/CLI implementation.
4+
5+
## Features ##
6+
7+
Much effort has been put into supporting a full feature set, including optimized transcendental functions to full precision.
8+
9+
### Standard operations ###
10+
* addition, subtraction, multiplication, division
11+
* Min, Max, Abs, Ceiling, Floor, Round, Truncate, Fraction
12+
* ToString and FromString
13+
* Cast operators to and from native numeric data types
14+
15+
### Numeric information ###
16+
* IsZero
17+
* IsNaN
18+
* IsInfinite
19+
* IsSigned
20+
* IsSubNormal
21+
22+
### Transcendental functions ###
23+
* natural logarithm (Ln), arbitrary-base logarithm (Log)
24+
* exponentiation (Exp)
25+
* power function (Pow)
26+
* Sin, Cos, Tan
27+
* ASin, ACos, ATan, ATan2
28+
29+
### Miscellaneous features ###
30+
* Fast! Optimized low level bit manipulation
31+
* Hard-coded constants Pi and E to full quadruple precision
32+
* Implemented on little-endian architecture; may or may not work on big-endian
33+
* Strictly follows IEEE specifications to the extent available on Wikipedia.
34+
* Multiple guard bits
35+
* Arithmetic on sub-normals is fully supported.
36+
* Inf, -Inf, and NaN are fully supported.
37+
* round-to-even
38+
* emulates FPU exceptions with enable/disable flags (default: all disabled)
39+
40+
## Using ##
41+
42+
### VB.Net, C#, and C++/CLI users ###
43+
1. Add a reference to Release/QPFloat.dll (x64/Release/QPFloat.dll for x64)
44+
2. Create new variables using System.Quadruple
45+
46+
### C++ users ###
47+
1. Add a reference to QPFloat.dll (x64/Release/QPFloat.dll for x64)
48+
2. #include "__float128.h"
49+
3. Create new variables using __float128
50+
51+
## Compiling ##
52+
53+
### Microsoft Visual C++ ###
54+
* This code base can be compiled with or without the /clr (Managed C++) compiler option.
55+
* If /clr IS used, then this library can be used by C#, VB.Net, and C++/CLI via the type System.Quadruple.
56+
* If /clr IS NOT used, #ifdef _MANAGED has been used to automatically exclude the managed implementation.
57+
* Since extension methods can't be used to add static functions to System.Math, operations like Abs, Sin, etc. are static methods of Quadruple.
58+
* __float128 is the unmanaged (faster) implementation, which is still present even when /clr is used.
59+
60+
### Other compilers ###
61+
* This code uses #ifdef to remove Microsoft-specific functionality automatically (#ifdef _MANAGED)
62+
* Though I've not tried, it should be relatively easy to build using other compilers.
63+
* Following existing conventions, __float128 is the type proffered by this library.
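
The README describes emulating binary128 "using integer arithmetic and bit manipulation" and lists numeric-information queries (IsZero, IsNaN, IsInfinite, IsSigned, IsSubNormal). As a rough illustration of what such queries involve, here is a minimal C++ sketch that classifies a raw binary128 bit pattern (1 sign bit, 15 exponent bits, 112 fraction bits). The names `Binary128Bits`, `Kind`, and `classify` are hypothetical and are not QPFloat's actual API; the sketch does not use the library at all.

```cpp
#include <cstdint>

// Hypothetical container for the 128 raw bits; NOT QPFloat's __float128 type.
struct Binary128Bits {
    std::uint64_t lo; // low 64 bits of the 112-bit fraction
    std::uint64_t hi; // sign (bit 63), exponent (bits 62..48), high 48 fraction bits
};

enum class Kind { Zero, SubNormal, Normal, Infinite, NaN };

// The sign is the most significant bit of the high word.
bool isSigned(const Binary128Bits& v) {
    return (v.hi >> 63) != 0;
}

// The 15-bit biased exponent sits directly below the sign bit.
unsigned biasedExponent(const Binary128Bits& v) {
    return static_cast<unsigned>((v.hi >> 48) & 0x7FFFu);
}

// True when all 112 fraction bits are zero.
bool fractionIsZero(const Binary128Bits& v) {
    return (v.hi & 0x0000FFFFFFFFFFFFULL) == 0 && v.lo == 0;
}

// Classification follows the IEEE 754 encoding rules:
//   exponent all ones  -> infinity (fraction == 0) or NaN (fraction != 0)
//   exponent all zeros -> zero (fraction == 0) or subnormal (fraction != 0)
//   anything else      -> normal number
Kind classify(const Binary128Bits& v) {
    const unsigned e = biasedExponent(v);
    if (e == 0x7FFFu) return fractionIsZero(v) ? Kind::Infinite : Kind::NaN;
    if (e == 0u)      return fractionIsZero(v) ? Kind::Zero : Kind::SubNormal;
    return Kind::Normal;
}
```

For example, 1.0 has a biased exponent of 0x3FFF and a zero fraction, so `classify(Binary128Bits{0, 0x3FFF000000000000ULL})` yields `Kind::Normal`, while a zero exponent with any nonzero fraction bit is a subnormal.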
