Hash Functions For GPU Rendering: (A) LCG (Nested) (B) Trig (C) Iqint3 (D) Xxhash32
Hash Functions For GPU Rendering: (A) LCG (Nested) (B) Trig (C) Iqint3 (D) Xxhash32
Hash Functions For GPU Rendering: (A) LCG (Nested) (B) Trig (C) Iqint3 (D) Xxhash32
org
Figure 1. 2D plot of hash results, providing visual evidence of certain types of errors. (a)
LCG and (b) trig are obviously bad, with visible banding, linear artifacts, and repeated
patterns. (c) iqint3 and (d) xxhash32 are better, though this plot does not show that
xxhash32 is significantly better quality. (e) tea3 and (f) tea4 have 2D output, and (g)
hashwithoutsine33 and (h) pcg3d have 3D output, all four shown as color.
Abstract
In many graphics applications, a deterministic random hash provides the best source
of random numbers. We evaluate a range of existing hash functions for random num-
ber quality using the TestU01 test suite, and GPU execution speed through bench-
marking. We analyze the hash functions on the Pareto frontier to make recommenda-
tions on which hash functions offer the best quality/speed trade-off for the range of
needs, from high-performance/low-quality to high-quality/low-performance. We also
present a new class of hash tuned for multidimensional input and output that performs
well at the high-quality end of this spectrum. We provide a supplemental document
with test results and code for all hashes.
20 ISSN 2331-7418
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
1. Introduction
Random number generation comes up in a variety of contexts in 3D rendering. Re-
search into pseudo-random number generators (PRNGs) is almost as old as computer
science [Knuth 1978]. The goal of PRNGs is to provide a deterministic sequence of
numbers that are indistinguishable from a true uniform random process. It is impor-
tant that the generation of sequential random numbers be as fast as possible and that
it be possible to seed the generator to produce the same deterministic sequence on
subsequent runs of a program. It generally is not important that traditional PRNG
results appear random given adjacent seeds and most do not. It is also not particularly
important to be able to access a number from the sequence out of its sequential order;
some PRNGs allow this while others do not.
Unfortunately, both of these last two conditions are requirements for many graph-
ics applications, due to their need for both spatial locality and parallel execution.
For example, producing a random value per pixel in a GPU shader requires both that
nearby pixels be random relative to each other, and that each pixel should be able
to determine its value without first evaluating the PRNG for every other pixel with
a lower pixel index. A sequential program might be able to loop over all the lower-
indexed pixels to use a PRNG, but a GPU needs to be able to evaluate all the pix-
els independently, with each pixel remaining sufficiently random with respect to the
others.
For many graphics uses, a random hash is a better match than a PRNG. Perlin
noise uses a hash of several grid points close to the query point [Perlin 1985]. Unlike
PRNGs, random numbers based on location or other characteristics of the geometric
data are independent of the number and distribution of threads or order of evaluation.
Other applications use a hash of position, particle or object ID, or UV coordinate
[Wyman and McGuire 2017; Phillips et al. 2011; Wei 2004]. Random hashes need
to execute quickly on the GPU, need to be able to execute in parallel, need to be
sufficiently random when invoked with adjacent seeds, and often need to handle mul-
tidimensional inputs and multidimensional outputs. Also, some graphics applications
need high-quality random numbers, while others such as Perlin noise achieve accept-
able visual results even with lower-quality random values. As such, it is important to
understand the performance and quality needs of the application, and the performance
and quality characteristics of available hash functions.
Cryptographic hashes have many of the desired characteristics. If a cryptographic
hash of sequential values deviates in any detectable way from independent, uniformly
distributed random data, the correlation between values could be used as the basis
for an attack. Several researchers have used cryptographic hashes as a source of
random numbers for GPU and graphics code [Tzeng and Wei 2008; Zafar et al. 2010].
Unfortunately, though their quality is excellent, as a protection against brute-force
attacks cryptographic hashes are designed to be slow.
21
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
Hashes for data structures are designed to be fast, since the insertion time is di-
rectly affected by the evaluation speed. However, even distribution in a hash table
is not the same goal as having a uniform random distribution. As our results show,
hashes designed for data-structure use tend not to produce high-quality random num-
bers when using adjacent seeds.
Many commonly used hash functions only accept one integer as input and output
another integer. We refer to these as one-dimensional or (1 → 1) hashing algorithms.
Many graphics applications require a hash function with vector input and/or vector
output, for example a 2D pixel-coordinate input with a 3D random-vector output. A
hashing algorithm with N -dimensions of input and M -dimensions of output would
be (N → M ).
To know if a hash is a good fit for real-time GPU random-number generation,
it is necessary to measure both its quality and evaluation speed. Others have per-
formed these tests using the Diehard, SMHasher, or TestU01 test suites for quality
comparison [Zafar et al. 2010; Collet 2012; O’Neill 2014a]. However, many graphics
uses require multidimensional inputs to the hash (typically 1D–4D), and multidimen-
sional outputs (1D–4D). The best hash for use with 1D input and 1D output may
not be the best for 3D input and 3D output. In addition, many of the GPU hashes
in use today have only appeared in blog posts or shadertoy code, and have never
been formally evaluated or compared to other alternatives. This leaves anyone look-
ing for a good GPU hash with little guidance to decide which to use. We evaluate
over 100 hash variants, in terms of quality and speed for random number genera-
tion, and propose a class of new hashes with better quality for 3D and 4D use cases.
Based on our data, we make recommendations for hash use across the speed/quality
spectrum.
2. Background
Tests to compare the quality of random number generators typically perform exper-
iments using the generated values and evaluate whether differences from the known
expected result could be due to random chance. A first set of four tests was proposed
by Kendall and Smith [1938]. Since then, several more tests have been developed
and released in test suites such as Diehard, Statistical Test Suite (STS), Dieharder,
TestU01, and PracRand [Marsaglia 1995; Bassham et al. 2010; Brown 2017; L’Ecuyer
and Simard 2007; Appleby 2008; Doty-Humphrey 2019]. These test suites are used
to compare the quality of sets of random numbers, and thus can be used to compare
the randomness quality of hashing functions. For example, O’Neill [2014a] used the
TestU01 suite to compare random number generators and hash functions.
SMHasher [Appleby 2008] performs similar tests for data-structure hashes, eval-
uating the distribution of the hash across a hash table and its probability of collisions.
22
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
A less rigorous test often used for graphics purposes draws the random values as
grayscale (or extracted single bits) into a 2D image [Zafar et al. 2010; Reed 2013].
This can show visual repetition and patterns, though not deeper statistical anomalies
detected by the more sophisticated tests (See Figure 1).
When performing a multidimensional optimization, the Pareto frontier contains
solutions where improving one dimension will make the other(s) worse. For hashes
where we care about both performance and quality, if a hash is on the Pareto frontier,
the only faster hashes will have worse quality, and the only higher-quality hashes
will be slower. For hashes off of the frontier, there is another hash in the set that will
improve at least one of the criteria without making the other criteria worse. Therefore,
we are primarily interested in identifying hashes that define the speed/quality Pareto
frontier.
The list of base hash functions we compare is given in Table 1. We also compare
several derived hashes created by applying the transformations in Sections 4.1–5.
3. Methodology
23
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
Table 1. Hashes tested, short names used in plots, and the reference for each.
For multidimensional input, a Morton (or Z) ordering [Morton 1966] was used
to translate the single integer-sequence order of TestU01 into a vector of integers for
hash input. Since Morton ordering interleaves the bits of the vector dimensions, the
TestU01 translational repetition testing will catch repetitions in any of the dimensions.
24
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
ing time appears alone in the timestamp-query-based UE4 GPU Profile performance-
measurement tool without needing to make any modifications to the core engine code.
In order to get measurable results, we executed each hash 10,000 times, feeding
the output of each iteration as the input to the next, so the GPU and compiler would
not optimize out any instructions. All timings reported here are the median of five
runs, divided by 10,000. Some additional overhead beyond the actual hash call could
be counted in the total per call, but only 10−5 of that overhead per call, so timings
should still be ordered the same as a single call to the hash function.
All timing tests were done on a Windows 10 computer with an NVIDIA GeForce
GTX 1080 graphics card with no other applications running. Given the simple, purely
computational nature of the code generated for any of these hash functions, the rela-
tive performance of our results should still be valid for other current and future GPUs,
though we expect there might be some change in the relative performance between
160
lcg
Pareto Frontier
Other Hashes
lcg linear2
lcg nested2
trig
120
iqint1
BigCrush Tests Failed
iqint3
80
40
pcg
xxhash32
xxhash32 multibyte2 pcg4d
pcg nested2 pcg3d
0
0 5 10 15
Time (µs)
Figure 2. Quality vs. performance for all hashes. Labeled hashes on the black line are in the
Pareto frontier, meaning they are the fastest hashes for their quality. Please see the supple-
mentary document for full data and hash code.
25
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
integer and floating-point hashes on newer GPUs with faster 32-bit integer opera-
tions. As our best performing hashes were already integer, this would not significantly
change our results.
Hashing algorithms not originally implemented in HLSL were converted to HLSL.
In those cases, though C++ code could have been used for the TestU01 testing, the
HLSL code was used for both UE4 and TestU01 for consistency.
Figure 2 shows an overall plot of quality vs. performance of the hashes we tested.
Only hashes on the Pareto frontier are labeled, though detailed results as well as code
for all hashes are available in the supplementary materials. For a visual comparison
of a low-quality and high-quality hash from this set, see Figure 1(a) as compared to
(c), and (d).
4. Converting a (1 →) Hash to (N →)
Since many hash functions accept just one dimension of input, several methods have
been used in the literature to convert such hash functions to higher-dimensional input.
160
bbs
city
esgtsa
120
BigCrush Tests Failed
iqint1
lcg
murmur3
80 pcg
ranlim32
superfast
wang
40 xorshift32
xxhash32
0
0 5 10 15 20
Time (µs)
Figure 3. Using linear combination to increase the input dimensions of (1 →) hashes. The
first point of each line is the normal 1D hash. Each subsequent point increases the number of
dimensions of the input combined linearly. This graph shows that linear combination destroys
the hash quality for all hashes.
26
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
hash(aX + bY + cZ + d).
While this method is fast, Figure 3 shows that it greatly diminishes the quality of
the hash. This method generates repeating patterns in the hash, though how close they
are to each other is dependent on the choice of constants (see Figure 6(a)).
We only recommend this method when speed is the overriding concern and hash
quality is not important. The fastest multidimensional hash in our testing set is a linear
combination applied to lcg, though it fails 90% of the BigCrush tests.
Harnett [2018] used a variant of linear combination with xor instead of addition:
hash(aX ∧ bY ∧ cZ).
We tested this method with several (1 → 1) hashes and found this method improved
low-quality hashes for N = 2, but for higher-quality hashes and with a higher number
of dimensions, the results only got worse (see Figure 4).
160
esgtsa
iqint1
lcg
120
BigCrush Tests Failed
80
40
0
0 2 4 6 8
Time (µs)
Figure 4. Combining (1 → 1) hashes with XOR. The first point of each line is the normal
(1 → 1) hash. Each subsequent point increases the number of input dimensions.
4.2. Nested
Perlin [1985] converted a (1 → 1) hash into (N → 1) for Perlin noise by nesting the
hash:
hash(X + hash(Y + hash(Z))).
27
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
160
bbs
city
esgtsa
120
BigCrush Tests Failed
iqint1
lcg
murmur3
80 pcg
ranlim32
superfast
wang
40 xorshift32
xxhash32
0
0 10 20 30 40 50
Time (µs)
Since this method uses the hash multiple times, it avoids the repetition and quality
problems of the linear-combination method. Unfortunately, with the increased quality
comes increased computation time, since an N -dimensional input will use the hash
N times (see Figure 5). When considering multiple inputs, pcg is a good default
choice for an (N → 1) hash in terms of quality per speed. Using iqint1 is only
slightly faster than pcg for middle-range quality. While lcg is the fastest hash with
this method, one should consider using the linear-combination method instead of the
nesting method if quality is not a concern (and consider a different hash than lcg if
quality is a concern).
Figure 6. Comparison of the Wang hash using (a) linear combination, and (b) 2D nested
methods. Notice there are repeating patterns in (a), highlighted in the white squares.
28
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
Figure 6 shows a visual comparison of linear combination and nesting on the same
base hash. The linear-combination method introduces repeating patterns in the hash,
while nesting does not.
4.3. Multi-byte
Some hash functions are designed to operate on variable-length input. This includes
hash functions designed to operate on strings or for file checksum operations. Since
these hashes can handle multi-byte input, they can be used as a (N →) hash by
feeding all N input coordinates as input. In order to handle the variable length input,
these hashes typically loop over the input in multi-byte chunks performing a set of
mixing operations at each iteration, with the output of one iteration as one input for
the next iteration. After the iterator completes, these hashes then typically execute
a final scrambling on output. In our test set, the following hashes are multi-byte,
city, murmur3, superfast, xxhash32. Figure 7 shows a comparison of these
multi-byte hash functions.
160
city
md5
murmur3
120
BigCrush Tests Failed
superfast
xxhash32
80
40
0
0 5 10 15 20
Time (µs)
Figure 7. Multi-byte hashes. The first point of each line is the normal (1 → 1) hash. Each
subsequent point increases the number of input dimensions; md5 appears off the right edge of
the chart.
5. Converting a (→ 1) Hash to (→ N )
There are many methods to convert a 1D-output hash into an ND-output hash. The
most popular is to use a single hash evaluation and apply a single additional lcg
step, ah + b, with different constants to generate each subsequent output. The most
29
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
common variants of these use just addition (with a = 1), or just multiplication (with
b = 0).
A common method is one where the hash input is offset by an arbitrary number,
such as a prime, for each desired output beyond the first [Ebert et al. 2002]:
We call this method translated because, when used in a Perlin noise function, this
idiom appears as a translation of the input point.
120
BigCrush Tests Failed
80
40 pcg3d16
xxhash32 multibyte4
pcg3d
xxhash32 multibyte3 pcg4d
0
0 5 10 15
Time (µs)
Figure 8. Results for (3 →) and (4 →) hashes. The hashes that fall along the black line
(labeled) are on the Pareto frontier, meaning they are the fastest hashes for their quality.
30
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
that all components of the output change even if only one component of the input
was changed. This means the hash can also be used with fewer dimensions, holding
the unneeded input components constant. So, we can use a (3 → 3) hash function as
(2 → 3) or (1 → 3), without sacrificing quality, though possibly at a higher execution
cost than a lower-dimensional hash.
Figure 8 compares (3 →) and (4 →) hashes with any size output. Figure 9
compares only (3 → 3) and (4 → 4) hashes with 3D or 4D output. In both cases,
pcg3d and pcg4d fall on the Pareto Frontier and are a good default choice for
multidimensional high-quality hash functions; xxhash32 would be a good default
choice for (3 → 1).
Multidimensional hashes also have the advantage of being able to use fewer di-
mensions than the hash outputs. Due to dead-expression elimination by the compiler,
this may actually decrease the computation time of the hash.
160
xorshift128 Pareto Frontier
(3 →) hashes
(4 →) hashes
120
BigCrush Tests Failed
80
40 pcg3d16
pcg3d
pcg4d
0
0 5 10 15
Time (µs)
Figure 9. Results for (3 → 3) and (4 → 4) hashes. The hashes that fall along the black line
(labeled) are among the Pareto frontier, meaning they are the fastest hashes for their quality.
31
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
uint3 pcg3d(uint3 v)
{
v = v * 1664525u + 1013904223u;
v.x += v.y*v.z; v.y += v.z*v.x; v.z += v.x*v.y;
v ˆ= v >> 16u;
v.x += v.y*v.z; v.y += v.z*v.x; v.z += v.x*v.y;
return v;
}
uint4 pcg4d(uint4 v)
{
v = v * 1664525u + 1013904223u;
v.x += v.y*v.w; v.y += v.z*v.x; v.z += v.x*v.y; v.w += v.y*v.z;
v ˆ= v >> 16u;
v.x += v.y*v.w; v.y += v.z*v.x; v.z += v.x*v.y; v.w += v.y*v.z;
return v;
}
32
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
7. Conclusion
As shown in the results, no single hashing algorithm can fit every use case. Rather,
each algorithm on the Pareto frontier lies on a spectrum from the fastest algorithms to
the algorithms that yield the best results. Many of the hashes that fall along the Pareto
frontier were designed to be executed on the GPU with neighboring seeds. Chained
PRNGs with a state, such as city, murmur3, and ranlim32, may be great PRNGs
[Pike and Alakuijala 2011; Appleby 2008; Press et al. 2007] when used sequentially,
but when used in parallel with neighboring seeds, they fall short.
We conclude by providing a spanning set of exemplar algorithms for each usage
category, ordered from fastest to highest quality.
• (N → N ): pcg3d, pcg4d
In all the test cases, the fastest hashes are lcg, nested lcg, and xorshift128.
However, these hashes have such poor quality they are not practical without ad-
ditional scrambling operations. Because of this, we did not recommend lcg and
xorshift128 above.
The methods and results presented here will hopefully serve as a benchmark for
hashing algorithms in terms of quality and run-time performance.
8. Future Work
PractRand [Doty-Humphrey 2019] is a newer test suite that is more thorough than
TestU01 BigCrush [L’Ecuyer and Simard 2007]. We found better differentiation of
(fast) low-quality hashes using TestU01, but if several significantly higher-quality and
higher-performance hashes are discovered, PractRand would most likely be a better
differentiator.
It would be worthwhile to create a multidimensional hash test suite rather than
trying to borrow existing test suites that were designed for linear sequences. Such a
test suite could be more likely to notice problems in the domain than running a set of
one-dimensional experiments on the Morton order.
Given that most hashes tested in this work were not designed to simultaneously
optimize for both randomness and speed, there is certainly room for further develop-
ment of new fast and high-quality random hashes. One approach that has been suc-
cessfully used for hash development is random genetic-algorithm mutation of code.
33
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
This could be successful for GPU or multidimensional hash development with an ap-
propriate choice of objective function with a balance of both factors. In addition,
there are very few hashes natively designed for multidimensional input and/or output.
Future work that provided options in the mid-range of Figure 8 could be particularly
useful.
9. Acknowledgments
Work on this project was funded in part by a gift from Epic Games. The hardware used in the
computational studies is part of the UMBC High Performance Computing Facility (HPCF).
The facility is supported by the U.S. National Science Foundation through the MRI program
(grant nos. CNS-0821258, CNS-1228778, and OAC-1726023) and the SCREMS program
(grant no. DMS-0821311), with additional substantial support from the University of Mary-
land, Baltimore County (UMBC). See hpcf.umbc.edu for more information on HPCF and the
projects using its resources.
References
A PPLEBY, A., 2008. SMHasher. [Online; accessed June-2018]. URL: https://github.
com/aappleby/smhasher/. 22, 24, 33
BASSHAM , L., RUKHIN , A., S OTO , J., N ECHVATAL , J., S MID , M., BARKER , E., L EIGH ,
S., L EVENSON , M., VANGEL , M., BANKS , D., H ECKERT, A., D RAY, J., AND VO ,
S., 2010. A statistical test suite for random and pseudorandom number generators
for cryptographic applications. URL: https://csrc.nist.gov/publications/
detail/sp/800-22/rev-1a/final. 22
B LUM , L., B LUM , M., AND S HUB , M. 1986. A simple unpredictable pseudo-random
number generator. SIAM Journal on Computing 15, 2, 364–383. URL: https://doi.
org/10.1137/0215025. 24
B ROWN , R. G., 2017. Dieharder: A random number test suite. URL: http://webhome.
phy.duke.edu/˜rgb/General/dieharder.php. 22
C OLLET, Y., 2012. xxHash: Extremely fast hash algorithm. [Online; accessed June-2018].
URL: https://github.com/Cyan4973/xxHash. 22, 24
D OTY-H UMPHREY, C., 2019. PractRand pre0.95, October. [Online; accessed March 2020].
URL: https://sourceforge.net/projects/pracrand. 22, 23, 33
E BERT, D. S., M USGRAVE , F. K., P EACHEY, D., P ERLIN , K., AND W ORLEY, S. 2002.
Texturing and Modeling: A Procedural Approach, 3rd ed. Morgan Kaufmann Publishers
Inc., San Francisco, CA, USA, ch. 2, 91–93. 30
E PIC G AMES, 2014. Unreal Engine 4.1, Nov. URL: https://github.com/
EpicGames/UnrealEngine. 24
E PIC G AMES, 2016. Unreal Engine 4.13, Sept. URL: https://github.com/
EpicGames/UnrealEngine. 24, 32
34
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
H ARNETT, J., 2018. Toolbox of noisey goodness. [Online; accessed April-2019]. URL:
https://www.shadertoy.com/view/4dVBzz. 27
H OANG , V. T., AND ROGAWAY, P. 2010. On generalized Feistel networks. In CRYPTO 2010:
Advances in Cryptology, Springer, 613–630. URL: https://link.springer.com/
chapter/10.1007/978-3-642-14623-7_33. 32
H OSKINS , D., 2014. Hash without sine. [Online; accessed June-2018]. URL: https:
//www.shadertoy.com/view/4djSRW. 24, 27
H OWES , L., AND T HOMAS , D. 2007. Efficient random number generation and application
using CUDA. In GPU Gems 3, H. Nguyen, Ed., first ed. Addison-Wesley Professional,
Boston, MA, USA, ch. 37. URL: https://developer.nvidia.com/gpugems/
GPUGems3/gpugems3_ch37.html. 24
H SIEH , P., 2004. Hash functions. [Online; accessed June-2018]. URL: http://www.
azillionmonkeys.com/qed/hash.html. 24
J IMENEZ , J. 2014. Next generation post processing in Call of Duty: Advanced War-
fare. In SIGGRAPH Courses: Advances in Real-Time Rendering. ACM, New York, NY,
USA. URL: http://advances.realtimerendering.com/s2014/index.
html. 24
J ONES , D., 2010. Good practice in (pseudo) random number generation for bioinformatics
applications. [Online; accessed June-2018]. URL: http://www0.cs.ucl.ac.uk/
staff/d.jones/GoodPracticeRNG.pdf. 24
K ENDALL , M. G., AND S MITH , B. B. 1938. Randomness and random sampling numbers.
Journal of the Royal Statistical Society 101, 1, 147–166. URL: http://www.jstor.
org/stable/2980655. 22
K NUTH , D. E. 1978. The Art of Computer Programming, 2nd ed. Addison-Wesley Longman
Publishing Co., Inc., Boston, MA, USA. 21
L’E CUYER , P., AND S IMARD , R. 2007. TestU01: A C library for empirical testing of
random number generators. ACM Trans. Math. Softw. 33, 4 (Aug.), 22:1–22:40. URL:
https://doi.org/10.1145/1268776.1268777. 22, 23, 33
L EHMER , D. H. 1949. Mathematical methods in large-scale computing units. In Proceedings
of a Second Symposium on Large Scale Digital Calculating Machinery. Harvard Univer-
sity, Cambridge, MA, USA, 141–146. URL: https://archive.org/details/
proceedings_of_a_second_symposium_on_large-scale_. 24
M ARSAGLIA , G., 1995. The Marsaglia random number CDROM including the
diehard battery of tests of randomness. URL: https://web.archive.org/web/
20160125103112/http://stat.fsu.edu/pub/diehard/. 22
M ARSAGLIA , G. 2003. Xorshift RNGs. Journal of Statistical Software 8, 14, 1–6. URL:
https://www.jstatsoft.org/v008/i14. 24
M ORTON , G. M. 1966. A computer oriented geodetic data base; and a new
technique in file sequencing. Tech. rep., IBM Ltd., Ottawa, Canada. URL:
https://domino.research.ibm.com/library/cyberdig.nsf/papers/
0DABF9473B9C86D48525779800566A39/$File/Morton1966.pdf. 24
35
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
36
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
T ZENG , S., AND W EI , L.-Y. 2008. Parallel white noise generation on a GPU via crypto-
graphic hash. In Proceedings of the Symposium on Interactive 3D Graphics and Games,
I3D ’08. ACM, New York, NY, USA, 79–87. URL: https://doi.org/10.1145/
1342250.1342263. 21, 24
WANG , T., 2007. Integer hash function. [Online; accessed June-2018]. URL: https:
//web.archive.org/web/20070108113114/http://www.cris.com:
80/˜Ttwang/tech/inthash.htm. 24
W EI , L.-Y. 2004. Tile-based texture mapping on graphics hardware. In Proceedings of the
ACM SIGGRAPH/EUROGRAPHICS Conference on Graphics Hardware, GH 2004. ACM,
New York, NY, USA, 55–63. 21
W HEELER , D. J., AND N EEDHAM , R. M. 1995. TEA, a tiny encryption algorithm. In
Fast Software Encryption, Springer Berlin Heidelberg, Berlin, Heidelberg, B. Preneel, Ed.,
International Association for Cryptologic Research, 363–366. 24
W YMAN , C., AND M C G UIRE , M. 2017. Hashed alpha testing. In Proceedings of the 21st
ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, I3D ’17. ACM,
New York, NY, USA, 7:1–7:9. URL: https://doi.org/10.1145/3023368.
3023370. 21
Z AFAR , F., O LANO , M., AND C URTIS , A. 2010. GPU random numbers via the tiny
encryption algorithm. In Proceedings of the Conference on High Performance Graph-
ics, HPG ’10. Eurographics, Aire-la-Ville, Switzerland, Switzerland, 133–141. URL:
http://dl.acm.org/citation.cfm?id=1921479.1921500. 21, 22, 23
Mark Jarzynski and Marc Olano, Hash Functions for GPU Rendering, Journal of Computer
Graphics Techniques (JCGT), vol. 9, no. 3, 20–38, 2020
http://jcgt.org/published/0008/02/??/
37
Journal of Computer Graphics Techniques Vol. 9, No. 3, 2020
Hash Functions for GPU Rendering http://jcgt.org
Received: 2019-12-13
Recommended: 2020-01-31 Corresponding Editor: Eric Haines
Published: 2020-10-17 Editor-in-Chief: Marc Olano
38