Initial f16 support #2413
Conversation
Why is there a "|" here?
I was thinking of making a table but decided against it. Fixed the message.
Looks great. I assume you are still adding more tests, for all the functions for which half support has been enabled.
/// \returns true if the \p device supports half precision operations,
/// false otherwise
/// \ingroup device_func_half
AFAPI bool isHalfAvailable(const int device);
Could you add more comments giving some examples of which devices will return true and which will return false?
I am not sure that would help users in a general way. Perhaps a new page in our documentation that lists known devices with half support, or a hyperlink to a page with this information, would be better.
On my side, I am more interested in knowing whether the device can handle half natively, with better performance than f32 (most of the time meaning Volta/Turing via Tensor Cores, and apparently the P100).
Isn't that method going to return true on older GPUs (GTX 10x0), where half is perhaps "supported" but performance is disastrous?
Half support for the 10x0 cards is implemented through floats.

Even on the Volta cards we need to improve the way we handle things to get better compute performance. To fully utilize the GPU's half compute throughput we need to use half2. That will require a major shift in some of the internal kernels, so I pushed the change off to later. We can talk about which functions are important to you and improve those down the line. All the library calls for half (e.g. matmul) will use half2, so you will get full compute performance there. For kernels that are bandwidth bound, moving to half2 serves no purpose because they will still be bandwidth bound.
@@ -44,7 +44,7 @@ class Join : public ::testing::Test {

 // create a list of types to be tested
 typedef ::testing::Types<float, double, cfloat, cdouble, int, unsigned int,
-                         intl, uintl, char, unsigned char, short, ushort>
+                         intl, uintl, char, unsigned char, short, ushort, af_half>
We should use the C++ typedef af::half here, to be consistent with cfloat and cdouble. This applies to all the tests where half is enabled.
I am worried we will have conflicts with the half data type. I want to discourage users from using this type because of its inefficiency. I am not sure this is the best way to handle this sort of thing.
@@ -537,6 +576,10 @@ const af::cfloat &operator+(const af::cfloat &val) { return val; }

 const af::cdouble &operator+(const af::cdouble &val) { return val; }

 const af_half& operator+(const af_half& val) {
af::half, to be consistent with the other af::* types.
    return lhs.data_ == rhs.data_;
}

std::ostream &operator<<(std::ostream &os, const af_half &val) {
Do we use half_float::half from extern/half/include/half.hpp only in tests?
Yes. Our half type is not fully functional; it would require us to reimplement the entire half library in order to implement it correctly. I don't think we should directly include other libraries in our API. Perhaps we can discuss it in a separate issue.
* Add half support for gemm
* Add half support for JIT
* Set the C++ standard to C++14 internally
* Add support functions for half support
* Add half support to reductions
* Add support for join
Can you open an issue to fix half testing?
Everything else looking good! Like the random cleanups. 👍
    ASSERT_ARRAYS_EQ(gold, out);
}

// TODO(umar): HalfMax
TODO?
EXPECT_TRUE(one.isreal());
EXPECT_FALSE(one.iscomplex());
EXPECT_FALSE(one.isbool());
EXPECT_FALSE(one.ishalf());
EXPECT_TRUE?
Initial support for fp16:

- [ ] convolution
- [ ] anything else