Sorting
Sorting
22b
The Ultimate 3D Coding Tutorial (C) Ica /Hubris 1996,1997,1998
Over 150k of pure sh...er, 3d coding power !
4. Sorting
Without any kind of sorting we get a real mess when drawing more than
one polygon at a time; the polygons are drawn all the time in the same order, so
in some angles the engine draws the closest polygons first and the fartherst last,
which results to weird ‘transparency’ effects. So some kind of sorting method
would be nice.
4.1 Z-Sorting
Z-sort is really not a perfect sorting technique but it’s enough for many
purposes (for example many asm97-demos used z-sort). Anyway, I suggest
learning also at least Z-buffer and maybe BSP-tree and S-buffer if you have time
and interest.
4.2 Z-buffer
Z-buffer is the easiest (but by no means the fastest) way of doing objects
which can intersect each other. Contrary to BSP-tree, Z-buffer doesn’t require
splitting polygons when they intersect. BSP-tree is unbeatable for stable scenes,
though. As the name indicates, we use a buffer (a table), into which we save the
z and color values of the points. The size of this table should be the same as the
size of the screen, for example in the MCGA mode 64000 at least 16bit
elements. Z-buffer looks like this in C :
float zbuffer[64000];
Before we start drawing anything to the z-buffer, we need to set the buffer
to the maximum value it can reach. Then, when drawing polygons, we
interpolate also z. In the inner loop we perform a compare, and if the current z is
less than the z in the z-buffer, we set the current z as the z-buffer value and draw
the pixel. If not, we continue with the next pixel.
This check slows the routine down dramatically, but luckily we can
interpolate z just like x or anything else : only one add per per pixel (plus the
check).
Finally, when all the points of all the polygons have been checked, we
draw the z-buffer into the screen.
About optimizing : you don’t need to reset the z-buffer so often (never I’d
say) if you interpolate 1/z and perform all the operations above but change the
compare from ‘<’ to ‘>’
How to implement it, you ask. Easy : 1/z is always at the range 0..1, so if
we substract 1 from each value of the z-buffer, it’s ‘reset’ (every possible value
of 1/z is greater than this value which is at the range -1..0). The best way is to
use this trick in the inner loop :
1st frame :
if (new_1/z_value + 1) > zbuffer[].z
[]
2nd frame:
if (new_1/z_value + 2) > zbuffer[].z
[].
etc.
4.3 BSP-Tree
We start with the line 1. We check on which side of it the rest of the lines
are, and make a binary tree in which we put the lines depending on the side
they’re located. If some lines are partly at the left, partly at the right side of the
line 1, they’re split, and the two pieces are placed one to the left, one to the right
side. Then we take the line 2 and perform the same thing with the lines 3, 4, 5,
and 6, and go on like this until all the lines have been arranged. If we want, we
can stop in some certain point and sort the remaining lines with some else
sorting technique.
When we’re ready to draw the lines, we perform the next operation : We
check, on which side of line 1 the camera is located. If it’s on the left, we check
if there are lines at right and if there are some, we go down the right branch.
When the right branch is ready, we draw the line one and go down the left
branch. REMEMBER to draw the line 1 in the place mentioned.
The drawing order of the lines in respect to the camera point in the picture
is so the following :
1) We’re on the right side of 1. 2a is drawn first, then 1, then we go
down the right branch.
2) We’re also on the right side of 2b, so we go down the left branch.
3) We’re on the left of 3a, so we draw first 3a, and then go down the
left branch (no lines on the right).
4) We’re on the right of 4. So the order is 5c, 4, 5b.
5) We return back to 2b and draw it, and go down the right branch
towards 3b.
6) We’re on the left of 3b so we draw 6a and 3b and go down the right
branch.
7) Related to 5a we’re at left, so we draw 6b and finally 5a.
Neat thing that, but how to implement it ? Here I’ll describe the use of all
the needed formulas.
If all the vertices of a polygon aren’t at the same side of a plane, the plane
and the polygon intersect each other. In this kind of situation we need to
calculate the intersecting points and with the help of those form two new
polygons which are on different sides of the plane. To do this, the equation of a
line in 3-space is required :
X = X1 + (X2-X1)t
Y = Y1 + (Y2-Y1)t
Z = Z1 + (Z2-Z1)t
This line goes through the points (X1,Y1,Z1) and (X2,Y2,Z2) t getting all
real values. When we place these X, Y, and Z to the places of X, Y, and Z in the
plane equation, we’ve get the coordinates of the intersection point:
Nx*(X1+(X2-X1)t-ax) +
Ny*(Y1+(Y2-Y1)t-ay) +
Nz*(Z1+(Z2-Z1)t-az) = 0
We solve t :
Nx*(X2-X1)t + Ny*(Y2-Y1)t + Nz*(Z2-Z1)t =
Nx*(X1-ax) + Ny*(Y1-ay) + Nz*(Z1-az)
<=>
t * ( Nx*(X2-X1) + Ny*(Y2-Y1) + Nz*(Z2-Z1) ) =
Nx*(ax-X1) + Ny*(ay-Y1) + Nz*(az-Z1)
<=>
Nx*(ax-X1) + Ny*(ay-Y1) + Nz*(az-Z1)
t = ----------------------------------------------------
Nx*(X2-X1) + Ny*(Y2-Y1) + Nz*(Z2-Z1)
4.3.3 Hints
2) The easiest way is seldom the fastest. It would maybe be better to find
the alternative with the least number of polygon splits -- the number of polygons
can easily be reduced to a half which speeds up the thing dramatically.
4.4 S-buffer
So we have a table or a linked list into which we save the data of every
hline call. We compare them to each other, act accordingly, and finally draw the
visible scanlines (or parts of them). For example the following type of table
works (for texturemapping, pmode) :
typedef struct
{
short xb,xe,zb,ze; // x_begin, x_end, ...
byte ub,ue,vb,ve;
long kz,ku,kv; // kz = (ze-zb)*65536/(xe-xb) etc
unsigned char used; // if 0 -> not used
} sbuf_t;
sbuftype sbuf[200][100]; // 200 lines in mode 320x200,
// max. 100 segments / line
Note ! The segments are ordered by the size of the x coordinates from left to
right. This kind of table consumes memory 200*100*28=560000 bytes (due to
alignment), and the more complicated the world is, the greater should the
maximum number of segments be. I recommend using a linked list instead of a
table; there’s no maximum number of segments, and adding a segment between
two other segments is far easier.
Segments on the same horizontal line can be located in many ways related
to each other, so they must be compared and act accordingly. There are six ways
in which segments can be located related to each other (old line = v, new line =
u):
(1)
vvvvvvv
uuuuuuuu
(2)
vvvvvvvv
uuuuuuuu
(3)
vvvvvvvv
uuuuuuuuuuu
(4)
vvvvvvvv
uuuuuuuuuuu
(5)
vvvv
uuuuuuuuuuu
(6)
vvvvvvvvvvv
uuuu
With these six cases we can present all the possible locations related to
each other, also special cases (like one pixel segments and segments located to
equal (x,y) coordinates). Humm... Paul mentioned in #coders that his own s-
buffer requires only two or three compares in the hline ! I would be interested to
see the implementation. Paul has written a document describing his technique
but he hasn’t released it yet (so as not to annoy his employees in some weird
way...) So everyone flood his email until he gives up and finally releases the doc
! ) [addition : the technique is quite probably this : he uses BSP-tree for stabile
parts of the scene, then he inserts these pre-sorted parts into a s-buffer front-
>back, then uses normal s-buffer for moving objects and moving parts of the
scene. The bsp-tree->s-buffer insertion routine can be optimized ultra fast; a
scanline is added if and only if there’s no segment in the s-buffer already in
those coordinates.]
In the case 1 we just add the segment into the buffer before the next
segment and return to the main polygon routine (all the segments to the left have
already been checked; the list or table is ordered left -> right).
In the case 2 we try the next old segment if it exists. If not, we add the
segment into the buffer as the last one.
In the case 3 we split the segment at the end of the old segment
(remember to calculate the new values of z, u, and v), compare the left part to
the old segment, and add it into the buffer if needed. Then we continue with the
right part and the next old segment.
In the case 4 we split the segment at the beginning of the old segment, add
the left part into the buffer before the old segment, and act with the right part
like above with the left part.
In the case 5 we split the segment at both ends of the old scanline and act
like in the cases 3 and 4.
In the case 6 we act directly like with the left part in the case 3.
Here are routines to empty and flip a s-buffer using the sbuf table above.
- tmap is 256x256-sized bitmap
function flip_sbuf
integer i,j,k
integer du,dv
for i=0 -> 199
j=0
while sbuf[i][j].used<>0 and j<100
du = 0
dv = 0
for k=sbuf[i][j].xb -> sbuf[i][j].xe
putpixel(k,i,tmap[sbuf[i][j].ub+du/65536+
(sbuf[i][j].vb+dv/65536)*256])
du = du + sbuf[i][j].ku
dv = dv + sbuf[i][j].kv
endfor
j=j+1
endwhile
endfor
endf
function clear_sbuf()
integer i,j
for i=0 -> 199
j=0
while sbuf[i][j].used<>0 and j<200
sbuf[i][j].used=0
j=j+1
endwhile
endfor
endf