
JOURNAL OF COMPUTING, VOLUME 2, ISSUE 9, SEPTEMBER 2010, ISSN 2151-9617


Hilbert Curve Based Bucket Ordering for Global Illumination

Yusuf Yavuz, Bulent Tugrul, and Suleyman Tosun
Department of Computer Engineering, Ankara University, Besevler, Ankara 06500

Abstract—The process of rendering a photorealistic image is an extremely compute-intensive task, and high-quality images require a great amount of time to create. In this paper, we evaluate a bucket rendering technique that uses the Hilbert curve approximation to sort and group the most correlated buckets in the frame buffer in an effort to decrease the rendering time. To increase the visual quality of the output images, we also add a multilevel dynamic anti-aliasing algorithm to our renderer to smooth the jagged edges caused by bucket rendering. We implemented the proposed bucket ordering method on a system that uses the photon mapping algorithm. We tested the effects of our method on both a single processor and on parallel processors with three-dimensional scene files to measure and analyze its performance.

Index Terms—Measurement, evaluation, modeling, simulation of multiple-processor systems, image-based rendering, distributed/network graphics, color, shading, shadowing, and texture, fractals, ray tracing.

1 INTRODUCTION

Photorealistic image rendering is one of the most compute-intensive tasks in computer graphics [1, 2]. The process uses almost all system resources to generate the final image on the screen. Resolution and scene complexity have dramatic effects on rendering time: higher resolution means longer rendering times. It is hard to load and process the entire screen with limited CPU, memory, and bandwidth [3, 4]. To overcome this problem, the frame buffer should be organized to maximize the use of spatial coherence between specially divided screen units, the so-called buckets.

The bucket rendering technique is based on subdividing the frame buffer into spatially coherent regions that are rendered independently. The advantage of bucket rendering is that the algorithm stores only a tile of intermediate results rather than the full screen data. Intermediate results help process multiple regions in parallel. Processing multiple regions becomes more important on multi-core and parallel systems, since these systems require more data transfer than single-CPU systems [7, 8].

In particular, small tiles can be transferred more efficiently over a communication platform such as a system bus or a network link. Typical rendering systems require random access to the frame buffer to get the entire image data at once, whereas the bucket rendering technique fetches a single, relatively small tile at a time. A smaller tile size works better on systems with limited memory capacity. Additionally, tiles allow the system to work with scenes that are bigger than the available memory. Another important advantage is that the system can render multiple tiles at a time. This feature is crucial for multi-core systems: tiles can be assigned to different cores and rendered independently. In this case, synchronization and load balancing between tiles become an important issue [15].

Bucket rendering can be performed in various ways and with various parameters. We can render the entire scene with a fixed bucket order such as top-to-bottom or left-to-right, meaning that buckets are processed starting from the top-left section of the screen and ending at the bottom-right section. Fixed-order bucket rendering does not consider the coherence of neighboring buckets, except for the most recently rendered bucket. To overcome this limitation, we dynamically order the buckets according to the edge correlation of neighboring buckets. When the first bucket has been rendered, the next bucket is selected according to the Hilbert curve. The Hilbert curve maximizes cache usage and data correlation and minimizes the redundant computations done by the renderer. Instead of repeating the same calculations for similar buckets, edges are analyzed to reduce the number of calculations.

In this paper, we present a bucket rendering method for a photon mapping based global illumination renderer. Our bucket rendering technique is based on the Hilbert curve approximation and on exploiting frame buffer cache coherence. This architecture provides a flexible and easily extensible infrastructure for parallel and multi-core rendering systems [6, 9]. To increase the visual quality of the output images, we also present a multilevel dynamic anti-aliasing algorithm to smooth the jagged edges caused by bucket rendering.

The remainder of the paper is organized as follows. In Section 2, we discuss bucket rendering techniques and compare their weak and strong points. In Section 3, we discuss the Hilbert curve in detail. In Section 4, we compare anti-aliasing techniques.
In Section 5, we present our bucket rendering implementation on single-processor and parallel-processor systems. In Section 6, we illustrate the experimental results. Finally, we conclude the paper in Section 7.

2 BUCKET RENDERING TECHNIQUES

The bucket rendering concept can be used in large-scale offline and online global illumination based visualization systems. Global illumination requires complex calculations to achieve correct lighting on various objects and material properties. It is almost impossible to load the complete set of information into main memory at once. Thus, the frame buffer should be specially divided to optimize data coherence and cache usage and to reduce the redundant calculations made by the renderer. In our rendering system, we used a photon mapping [5] based global illumination renderer to achieve the most realistic effects in the image. Additionally, photon mapping makes it possible to work with complex effects such as caustics.

Bucket organization plays an important role in the rendering operation. Buckets can be fetched from the frame buffer in special, predefined orders such as spiral, column, row, diagonal, or random, as shown in Fig. 1. However, none of these techniques considers the data correlation among cached buckets. Therefore, we cannot consider any of them individually as an optimum bucket selection method for global illumination renderers.

Fig. 1. Examples of static bucket orders.

Buckets are cached in the frame buffer before being processed by the CPU. Caching makes it possible to identify correlated tile units and take them as the next bucket to render. Correlated bucket pairs prevent doing the same calculations more than once; instead, previously rendered bucket information is used to render the next bucket. In fact, rendering a bucket starts from the previous bucket because of the correlation between neighboring buckets.

Choosing the next bucket to render for the best coherence opportunity is a challenging problem. The renderer should make its decision quickly and choose the next bucket to render; otherwise, the processor will be blocked and the entire system performance will suffer. Thus, the next-bucket decision should be optimal for system performance. Furthermore, to reduce the calculations, cached data should be used extensively. Our system uses the Hilbert space-filling curve (also known as Hilbert order) to overcome this problem. The Hilbert curve organizes the tiles in the frame buffer to achieve the most optimized results.

3 HILBERT CURVE

The Hilbert curve is a space-filling curve that minimizes data transfer over the communication medium and considers the bucket correlation to reduce redundant calculations. The general idea of our approach is similar to Warnock's recursive algorithm [16] or Greene's quad-tree algorithm [17], except that we order our visits to the children of each block in Hilbert-curve fashion. The advantage of the Hilbert curve is that its strict spatial locality allows us to incrementally evaluate all edge equations. The result is a relatively simple algorithm that requires minimal storage and implementation complexity and has excellent spatial coherence properties.

Space-filling curves are generally used to increase coherence and improve the overall rendering performance of ray tracing based systems [1, 12]. The Hilbert order visits all pixel locations according to the tile sizes. Fig. 2 shows the generation of the Hilbert space-filling curve for scenes with different dimensions. After visiting all pixel locations, the system moves to the next neighboring unprocessed tile. The system caches the processed tiles as well as the unprocessed tiles to exploit the coherence between them. It should be noted that the tile at hand should share some level of similarity with previously processed tiles. In order to use the cache coherence efficiently, blocks should be cache aligned and ordered in memory before processing. Another major advantage of the Hilbert curve is that its cache coherence and performance benefits are available at all scales. No extra work is needed if the number of pixels in a cache block changes or if different color representations store a different number of pixels in the cache.

Fig. 2. Generation of the Hilbert space-filling curve.

The Hilbert curve implementation divides the screen into a set of unit shapes. Each unit shape represents a sequential order of atomic actions that generate the final, complete curve pattern. The number of atomic unit shapes increases with the source image resolution. The system automatically adapts the number, size, and sequence of the unit shapes with respect to the changing image parameters, and the main principle remains the same.
The Hilbert curve progresses by changing the transformation and rotation, as well as the start and end positions of the atomic unit shapes, so that the tiles are traversed in the correct position and the correct order. Atomic unit shapes are transformed using start and end reference points, as in Fig. 3. The start value is 0 when the start point is in the lower-left corner of the atomic unit shape and becomes 1 when the start point is in the upper-right corner. Similarly, the end value is 0 when the end point is in the lower-right corner of the unit shape and becomes 1 when the end point is in the upper-left corner. The method places the unit shapes using a recursive procedure. The recursive implementation changes the source parameters and approximates the set of points that generates the Hilbert curve.

Fig. 3. Coding system to describe the start and end points of the unit shape.
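The paper's implementation builds the curve recursively from transformed unit shapes; purely as a rough, non-authoritative illustration of the same ordering, the sketch below uses the common iterative bit-manipulation conversion from a one-dimensional Hilbert index to bucket coordinates on an n x n grid (all names are ours, and n is assumed to be a power of two).

```java
/** Minimal sketch (not the authors' recursive unit-shape code): map a Hilbert
 *  index d in [0, n*n) to the (x, y) coordinates of a bucket on an n x n grid. */
final class HilbertOrder {
    static int[] indexToBucket(int n, int d) {
        int x = 0, y = 0, t = d;
        for (int s = 1; s < n; s *= 2) {
            int rx = 1 & (t / 2);
            int ry = 1 & (t ^ rx);
            if (ry == 0) {                 // rotate/flip the quadrant so sub-curves connect
                if (rx == 1) {
                    x = s - 1 - x;
                    y = s - 1 - y;
                }
                int tmp = x; x = y; y = tmp;
            }
            x += s * rx;                   // step into the selected quadrant
            y += s * ry;
            t /= 4;
        }
        return new int[] { x, y };
    }
}
```

Visiting buckets for d = 0, 1, 2, ... then yields the neighbor-preserving traversal described above: consecutive indices always map to adjacent buckets.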
4 ADAPTIVE ANTI-ALIASING

Bucket rendering basically takes each tile and renders it individually in a separate thread. Rendering small tiles reduces memory consumption and increases speed. However, it also causes the problem of jagged edges, an effect formally known as aliasing. Aliasing occurs because the image on the screen is only a pixelated sample of the original 3D information the graphics card has calculated. At increasingly higher resolutions, as more pixels are used, the sample becomes more similar to the original source and the image displays less aliasing. However, higher resolutions can degrade performance. To overcome the aliasing problem, anti-aliasing techniques have been used in many studies. Anti-aliasing basically takes samples from each pixel and interpolates the color values corresponding to that pixel. This lets us spread the interpolated color value over the entire pixel instead of using just one color value. We used a dynamic smoothing algorithm to eliminate aliasing effects while sacrificing as little performance as possible.

The smoothing algorithm can be applied in two different ways: fixed-level (static) smoothing and dynamic smoothing. Let us first explain what we mean by the word level.

Smoothing techniques are pixel based, and they have four different levels in our system. In level-0, 4 pixels (2 by 2) are grouped and treated as one single pixel. This assumption is tolerable since most neighboring pixels have the same color. In level-1, a pixel is taken as itself and is neither grouped nor divided into sub-pixels. This level is useful when there is only one polygon contributing to the pixel. In level-2 and level-3, a single pixel is divided into 4 and 16 sub-pixels, respectively. The color of a pixel is found based on the contributions of the different colors. For example, if a polygon contributes 10% to a pixel, the whole pixel gets only 10% of the color of the contributing polygon. The levels can be chosen by considering the complexity of a scene: the more complex a scene is, the higher the probability that several polygons contribute to its pixels.

Fig. 4 shows the effect of the smoothing algorithm on pixels. The source pixels represent the geometrical view of the part of the scene sent to the renderer. The result without any smoothing is shown in the middle of Fig. 4, and the figure on the right of Fig. 4 shows the pixels smoothed by anti-aliasing; here the color of a specific pixel is proportional to the percentage contribution of the polygon to that pixel. By applying this smoothing technique, we can eliminate the staircase effect in the final image.

As mentioned above, we have fixed-level and dynamic smoothing methods. In the fixed-level method, one selects a level based on the scene and applies it to the entire scene without discriminating among different parts of the scene. One disadvantage of fixed-level smoothing is precisely that it applies the same level over the entire image, while images have different complexities in different parts. If we apply the highest level everywhere, the rendering time will be very high, since this level divides each pixel into many small sub-pixels, which affects the rendering time tremendously. For the parts of a scene that have high complexity, we may need a high level of smoothing, while we may apply the lowest level to parts that have only one color.

In the dynamic smoothing (anti-aliasing) method, the level is not fixed and may change based on the contributions to a pixel. In this method, only the range of levels is given. If a pixel is dominated by one color (i.e., the percentage of one color is greater than a threshold value), there is no need for smoothing on that pixel, so the lowest level is selected. However, if several colors contribute to a pixel, a different level must be selected based on the contribution percentages. In Fig. 5, we demonstrate the dynamic anti-aliasing method. On the left of this figure, we see a single pixel and the color contributions. On the right, we see the pixel divisions based on the color contributions. First, the pixel is divided into four parts, since there is a need for smoothing, as the first pixel shows. Then, the sub-pixels are analyzed. Finally, the level of smoothing is decided by checking the contributions to the sub-pixels.
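As a minimal sketch of this decision (our own simplification, with illustrative thresholds rather than the authors' exact values), the level for one pixel can be chosen from the largest fractional color contribution, clamped to the allowed level range:

```java
/** Hypothetical level selection for dynamic anti-aliasing. dominantShare is the
 *  largest fraction contributed by any single color/polygon to the pixel;
 *  minLevel and maxLevel bound the allowed range, as described in the text. */
final class DynamicSmoothing {
    static int chooseLevel(double dominantShare, int minLevel, int maxLevel) {
        int level;
        if (dominantShare > 0.95) {
            level = 0;          // one color dominates: 2x2 pixels may be grouped
        } else if (dominantShare > 0.80) {
            level = 1;          // keep the pixel as it is
        } else if (dominantShare > 0.50) {
            level = 2;          // split the pixel into 4 sub-pixels
        } else {
            level = 3;          // split the pixel into 16 sub-pixels
        }
        return Math.max(minLevel, Math.min(maxLevel, level));
    }
}
```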
Fig. 4. The effect of smoothing. The left image shows the object's contribution to the pixel. The middle image and the right image show the final pixel colors without and with anti-aliasing, respectively.

Fig. 5. Setting the dynamic smoothing level. The right image shows the selection of the smoothing level based on the object's contribution to the pixels.

Dynamic smoothing also adds some extra time to the rendering phase. However, this extra time is considerably small compared to fixed-level smoothing. We conducted experiments on the effects of smoothing on rendering time.
5 IMPLEMENTATION

In this section, we present the implementation of the photon mapping and Hilbert curve based bucket rendering algorithm on a single processor and on parallel processors. In our system, we first implemented the photon mapping algorithm with Hilbert curve based bucket rendering on a single processor. Then, we parallelized our implementation using the JGrid structure.

5.1. Single Processor Implementation

The photon mapping algorithm mainly consists of two stages: photon tracing and rendering. The photon tracing stage is responsible for tracing the photons through the scene. The outputs of this stage are the reflected radiance at surfaces and the out-scattered radiance in participating media. The rendering stage uses the result of the photon tracing pass and solves the rendering equation.

The photon tracing pass starts with creating a number of photon structures. Our system contains a module to explicitly define the appropriate photon count. The number of photons should be selected carefully; otherwise, the resulting image may not look realistic. Thus, we have to make a choice between speed and image accuracy. After deciding on the number of photons, we trace them into the scene. First, we have to consider the photon trajectories. If we use too many photons, this preprocessing step may take too long; if we send fewer photons than needed, we can lose effects such as caustics. This is an important user-oriented decision. Another important decision is where we should send the photons. If we trace the photons randomly, the resulting image may look noisy. To overcome this problem, we used projection maps [6]. Projection maps optimize photon tracing by directing photons towards important objects. Special effects such as caustics and multiple reflections require more focused photons, and a photon tracing strategy has to consider these effects. Otherwise, we may not get realistic results for caustics and for objects that require multiple reflections.

After tracing the photons into the scene, we store the radiance value of each photon to use in density estimation. Choosing a fast, memory-efficient, and well-scaled data structure is a crucial issue; the data structure is also an important factor for parallelization and speedup. In our system, we used the kd-tree data structure as in [5]. The kd-tree is a multi-dimensional binary search tree in which each node partitions one of the dimensions. It is possible to locate one photon in a kd-tree containing n photons in O(log n) time. On average, the time to locate the k nearest neighbors is on the order of O(k + log n), which makes the kd-tree a good choice for storing the photon map. The kd-tree is also very good at handling non-uniform distributions of photons. As the kd-tree is independent of geometry, photon mapping does not suffer from the usual meshing artifacts and can efficiently be used for highly complex scenes.
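The rendering pass described below turns the k nearest photons returned by such a kd-tree query into a radiance estimate. A minimal sketch of the standard photon-map density estimate (our own types and names, not the authors' code) looks like this:

```java
import java.util.List;

/** Sketch of the usual photon-map density estimate: sum the flux of the k nearest
 *  photons and divide by the area of the disc that encloses them. The kd-tree
 *  query that produces the photon list is assumed to exist elsewhere. */
final class RadianceEstimate {
    static final class Photon {
        final float[] position = new float[3];
        final float[] power = new float[3];   // stored flux per color channel

        double squaredDistanceTo(float[] p) {
            double dx = position[0] - p[0], dy = position[1] - p[1], dz = position[2] - p[2];
            return dx * dx + dy * dy + dz * dz;
        }
    }

    /** nearest: the k photons closest to hitPoint, as returned by the kd-tree lookup. */
    static float[] estimate(List<Photon> nearest, float[] hitPoint) {
        double r2 = 0.0;
        float[] flux = new float[3];
        for (Photon p : nearest) {
            r2 = Math.max(r2, p.squaredDistanceTo(hitPoint));
            for (int c = 0; c < 3; c++) flux[c] += p.power[c];
        }
        if (nearest.isEmpty() || r2 == 0.0) return flux;   // nothing to estimate from
        double norm = 1.0 / (Math.PI * r2);                // divide by projected disc area, pi * r^2
        return new float[] {
            (float) (flux[0] * norm), (float) (flux[1] * norm), (float) (flux[2] * norm)
        };
    }
}
```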
The second pass is the rendering pass. Rendering is almost the same as in ray tracing [13, 14]. The most important difference arises when a ray intersects a surface: when a ray hits an object or a surface, a radiance estimate is made for each pixel using the photon map data structure. Then, the rendering equation is used to calculate the surface radiance at the intersection point.

The calculated color values are stored in the frame buffer to be visualized on the screen. Rendering is done using the Hilbert curve based bucket rendering approach. Each bucket is fetched from the frame buffer, and the screen is traversed following the Hilbert curve structure, which minimizes data transfer over the communication medium. Our single-CPU approach supports multiple CPU cores as well: each bucket can be assigned to an individual core. Multi-core CPU architectures are used to simulate parallel CPU behavior, and each bucket rendering thread works on its own and contributes to the resulting image.
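A minimal sketch of this per-bucket threading (our own structure, not the authors' renderer) submits one task per bucket, in Hilbert order, to a fixed pool of worker threads:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical multi-core bucket loop: buckets are assumed to be pre-sorted in
 *  Hilbert order, so tasks that run close together in time also touch neighboring
 *  screen regions and their cached edge data. */
final class MultiCoreBucketLoop {
    interface Bucket {
        void render();   // renders one tile and writes its pixels to the frame buffer
    }

    static void renderAll(Bucket[] bucketsInHilbertOrder) throws InterruptedException {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        for (Bucket bucket : bucketsInHilbertOrder) {
            pool.submit(bucket::render);
        }
        pool.shutdown();
        pool.awaitTermination(Long.MAX_VALUE, TimeUnit.SECONDS);
    }
}
```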
5.2. Parallelization of the Algorithm

The photon mapping algorithm on a single CPU is a compute-intensive task, and several approaches have been proposed to speed it up. Today's CPUs are highly optimized; however, we still need much more computation power for tasks such as global illumination, and especially for photon mapping. Using a supercomputer is not a cost-efficient solution. Instead of using one very powerful computer, we can use several small computers to achieve better computation power. As a result, parallelizing the overall process is a good choice, since the rendering pass of the photon mapping algorithm is suitable for parallelization.

Although parallelization seems like an ultimate solution for our case, it also has weak points. Using too many processing elements may reduce the overall performance due to communication overhead such as network latency, idling, etc. We cannot conceive of a parallel system without the collaboration of hardware and software resources. To overcome these problems, the parallelization process should be well designed and highly scalable, and it should work on several platforms and hardware resources.
Our parallelization method is based on the JGrid project [11]. The aim of the JGrid project is to create a next-generation grid infrastructure based on Jini Technology. JGrid focuses on dynamic and service-oriented grid architectures, with the goal of developing an easy-to-use, reliable, and fault-tolerant global grid infrastructure. The features and properties of Jini Technology, such as spontaneous networking and service discovery, leasing, distributed events and transactions, security, and the service-oriented programming model, make it a very suitable base for creating dynamic and reliable grid systems.

In our system, every component is either a service or a consumer of a service. Jini provides a Service Oriented Architecture (SOA) whose three main components are the Service, the Client, and the Registry (Lookup Service), as shown in Fig. 6. When a Service starts up, it registers its proxy object with the Lookup Service. The proxy object hides all communication details of the service backend and makes the system Jini-protocol independent. After registering with the Lookup Service, clients can look up a suitable service based on its interface description. Moreover, clients can subscribe to remote events that are triggered when a required service appears or disappears.

Fig. 6. Service oriented architecture of the system.
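As an illustration only (the interface name and methods below are ours, not JGrid's published API), the compute service contract that the master would discover through the Registry could look like this:

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

/** Hypothetical Jini-style service contract: slaves expose it, register a proxy with
 *  the Lookup Service, and the master looks it up by this interface description. */
public interface BucketRenderService extends Remote {
    /** Renders one screen-space bucket of the named scene and returns its pixels. */
    int[] renderBucket(String sceneId, int x, int y, int width, int height)
            throws RemoteException;

    /** Rough load/speed indicator, used when the master picks a lightly loaded node. */
    double currentLoad() throws RemoteException;
}
```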
The network topology of the system is shown in Fig. 7. It consists of two main components: a master computer and the slave computers. The master computer is responsible for distributing the jobs to the slaves and recollecting the finished jobs. After all jobs finish their execution, the master computer combines the distributed parts and shows the resulting complete image. The slave computers are responsible for the computations; in other words, each of them takes part in the rendering pass of the photon mapping algorithm.

Fig. 7. Network topology of the system.

One very important part of parallelization is job distribution. Without an efficient job distribution strategy, we cannot achieve high rendering speeds. Before distributing the jobs, we have to define the smallest job unit. In our system, the smallest job unit is a bucket. Bucket rendering divides the scene into screen-space tiles and renders each tile independently. This strategy has advantages such as a smaller memory requirement and good compatibility with parallel rendering algorithms. Bucket size is also an important problem: a small bucket size requires less memory and is better for load balancing, but it can be more expensive to send and receive many buckets to and from the slaves because of the network communication overhead. We chose a 32x32 pixel bucket size for our system. The screen is divided into equally sized regions, and these regions are sent to the slave computers, which perform the computations for each bucket. Selecting an optimum bucket size is on our agenda for future research.
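A small sketch of this job unit (names are ours; only the 32x32 bucket size comes from the text) that splits the frame into bucket jobs for the master to distribute:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical job unit: one 32x32 screen-space bucket. The last row and column
 *  are clamped when the resolution is not a multiple of the bucket size. */
final class BucketJob {
    static final int BUCKET_SIZE = 32;

    final int x, y, width, height;

    BucketJob(int x, int y, int width, int height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    /** Splits the whole frame into bucket jobs, row by row. */
    static List<BucketJob> split(int imageWidth, int imageHeight) {
        List<BucketJob> jobs = new ArrayList<>();
        for (int y = 0; y < imageHeight; y += BUCKET_SIZE) {
            for (int x = 0; x < imageWidth; x += BUCKET_SIZE) {
                jobs.add(new BucketJob(x, y,
                        Math.min(BUCKET_SIZE, imageWidth - x),
                        Math.min(BUCKET_SIZE, imageHeight - y)));
            }
        }
        return jobs;
    }
}
```

The resulting list can then be reordered along the Hilbert curve of Section 3 before being handed out to the slaves.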
Our messaging strategy also plays an important role in our system. Without an efficient messaging strategy, jobs cannot be distributed efficiently. Messages are small commands that are sent over the network to the responsible computers. In our system, there are two groups of messages: initialization messages and system messages. Initialization messages set up the environment and make the adjustments needed before rendering starts. With the help of these messages, services can register with the lookup service, compute units can register with the lookup service, the number of compute units can be determined, and the connections between services can be checked.

System messages are used for the rendering part and can be categorized into two groups. The first group is used to send the buckets from the master computer to the slave computers. Before sending a bucket, the system checks the CPU speed of each compute unit; after determining the most lightly loaded system, the bucket is sent to that component. If the computers have the same CPU load, the closest compute service is selected and the bucket is sent to it. The second message type collects the rendered buckets from the compute services. When a compute service finishes rendering its current bucket, it sends the finished job to the master computer; before sending the rendered bucket, the client checks the master computer's workload. This process repeats for every bucket in the scene. When all compute units finish their jobs, they send an idle message to the master computer, so the master computer knows which computers are idle.
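Purely for illustration (these tags are ours, not the actual JGrid message set), the two message groups could be modeled as follows:

```java
/** Hypothetical message tags mirroring the two groups described above. */
enum MessageType {
    // initialization messages: sent before rendering starts
    REGISTER_SERVICE,      // a service registers with the lookup service
    REGISTER_COMPUTE_UNIT, // a compute unit registers with the lookup service
    COUNT_COMPUTE_UNITS,   // the number of available compute units is determined
    CHECK_CONNECTION,      // connections between services are verified

    // system messages: sent during rendering
    RENDER_BUCKET,         // master -> slave: a bucket job, dispatched to a lightly loaded node
    BUCKET_FINISHED,       // slave -> master: the rendered bucket is returned
    IDLE                   // slave -> master: this compute unit has no work left
}
```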
Data distribution is handled through local copies. Each computing unit holds a local copy in its cache; if the local copy does not exist, the slave downloads the required texture and model files from an HTTP server, which shares the data files with the computing units. The reason for distributing the data files to each computer is to increase speed.

In order not to have a checkerboard effect on the final image, we apply a k-nearest neighbor search that also considers the eight neighbor buckets of the current bucket [10]. This method is illustrated in Fig. 8. Finally, the master computer collects and displays the rendered buckets without edge problems.

Fig. 8. Bucket structure of the rendering system. Pixel values of the red bucket (darkest bucket in grayscale) are calculated using its eight neighbor buckets.
6 EXPERIMENTAL RESULTS

To evaluate and test the proposed system, we conducted two sets of experiments. In the first set, we tested the parallel performance of the bucket ordering techniques on various scene files and numbers of CPUs. The scene files we use in our experiments have three complexity levels: scene-1 is the least complex, scene-2 has medium complexity, and scene-3 is the most complex. In order to analyze the parallel performance gain, we tested each bucket ordering technique on a setup consisting of up to 8 computers. Each computer has a 2.2 GHz Intel Core 2 Duo CPU, 2 GB of memory, and an nVidia GeForce 8500GT video card.

In the first set of experiments, we evaluate the impact of different bucket orderings on the scene rendering time. The bucket orderings in our experiments are random bucket ordering, row-based bucket ordering, and Hilbert bucket ordering. We tested the three bucket ordering techniques by rendering three scene files on different numbers of CPUs. The rendering times show the performance gain of each bucket ordering technique. We illustrate our results in Fig. 9. As seen from the three bar charts for the three scene files, the Hilbert order based bucket rendering method always gives the best rendering times, since it makes the most use of data coherence. In row-based bucket ordering, we can only use the coherence from the most recently rendered bucket. For scene-1, we achieve the highest improvement of Hilbert order over random order with 8 parallel computers, at 67%; the improvement for the same scene over row-based order is 41%. We can also observe the effect of increasing the number of computers on the rendering time. When we double the number of computers in the system, on average, we obtain a 25% performance increase for random-order bucket rendering and a 37% performance increase for both the Hilbert and the row-based bucket rendering methods. When we increase the number of computers from one to eight, we observe up to an 85% performance increase, as can be seen in the bar chart for scene-3.

In our second set of experiments, we tested the effects of dynamic anti-aliasing on our three scene files in terms of visual quality and rendering performance. For this experiment, we applied different levels of dynamic anti-aliasing to the scenes. We present the effect of the different dynamic anti-aliasing settings on rendering time in Fig. 10. In this figure, Level-0-0 means that we use only one level, level 0, while Level-0-3 means that we apply all the levels dynamically. The bar charts in this figure show the rendering times for scene-1, scene-2, and scene-3 under the different settings. As can be seen from these results, increasing the anti-aliasing levels proportionally increases the rendering times.
Fig. 9. Rendering time comparison of the three bucket ordering methods.

Fig. 10. Rendering time comparison of four levels of dynamic anti-aliasing.

In Fig. 11, we show the impact of the different anti-aliasing levels on visual quality. We used level-0-0, level-0-1, level-0-2, and level-0-3 for images A, B, C, and D, respectively. The quality gain at each level increase can clearly be seen on the edges in the figures.

Fig. 11. Scenes rendered with four different levels of anti-aliasing. Outputs A, B, C, and D use level-0-0, level-0-1, level-0-2, and level-0-3, respectively.

7 CONCLUSIONS

In this study, we presented a parallelization technique using the Hilbert order for the photon mapping algorithm to reduce image rendering time by distributing jobs over computing units. To increase the visual quality of the output images, we also presented a multilevel dynamic anti-aliasing algorithm to smooth the jagged edges caused by bucket rendering.

From our experiments, we observed that the Hilbert order shows the best performance compared to the other bucket ordering techniques. The Hilbert curve maximizes the use of correlation between neighboring buckets and minimizes the system overhead. Reducing system overhead, such as communication time and buffer usage, increases the overall performance of the system. The performance difference becomes clearer when we use more complex scenes: complex scenes require many more system resources to process, and in this case the performance gain of Hilbert curve based bucket ordering becomes more evident.

Due to a lack of hardware support, the most complex scene we used in our experiments was scene-3. We believe that the performance gap between the Hilbert order and the other selection methods would be even more visible on more complex scenes.

8 ACKNOWLEDGMENTS

This work was supported in part by Ankara University under grant 09B4343009.
REFERENCES

[1] I. Wald and P. Slusallek, "State-of-the-Art in Interactive Ray Tracing," EUROGRAPHICS 2001, pp. 21-42, 2001.
[2] M.F. Cohen, S.E. Chen, J.R. Wallace, and D.P. Greenberg, "A Progressive Refinement Approach to Fast Radiosity Image Generation," SIGGRAPH Comput. Graph., vol. 22, no. 4, pp. 75-84, Aug. 1988.
[3] G.S. Fishman, Monte Carlo: Concepts, Algorithms, and Applications, Springer-Verlag, New York, NY, 1996.
[4] C. Benthin, I. Wald, and P. Slusallek, "A Scalable Approach to Interactive Global Illumination," Proc. of Eurographics, 2003.
[5] H.W. Jensen, "Global Illumination Using Photon Maps," in Rendering Techniques, 1996.
[6] H.W. Jensen, Realistic Image Synthesis Using Photon Mapping, A. K. Peters, Natick, MA, 2001.
[7] M. Nijasure, S. Pattanaik, and V. Goel, "Realtime Global Illumination on GPUs," Journal of Graphics, GPU, and Game Tools, vol. 10, no. 2, pp. 55-71, 2005.
[8] T.J. Purcell, C. Donner, M. Cammarano, H.W. Jensen, and P. Hanrahan, "Photon Mapping on Programmable Graphics Hardware," Proc. of the Conference on Graphics Hardware, 2003.
[9] S. Singh and P. Faloutsos, "SIMD Packet Techniques for Photon Mapping," RT '07: IEEE Symposium on Interactive Ray Tracing, 2007.
[10] J. McNames, "A Fast Nearest-Neighbor Algorithm Based on a Principal Axis Search Tree," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 9, pp. 964-976, 2001.
[11] JGrid: A Jini-based Universal Service Grid, http://www.irt.vein.hu/jgrid.
[12] M. Wan, A. Kaufman, and S. Bryson, "High Performance Presence-Accelerated Ray Casting," Proc. of the Conference on Visualization, 1999.
[13] P.H. Christensen, J. Fong, D.M. Laur, and D. Batali, "Ray Tracing for the Movie 'Cars'," Proc. of the IEEE Symposium on Interactive Ray Tracing, 2006.
[14] E. Reinhard, B.E. Smits, and C. Hansen, "Dynamic Acceleration Structures for Interactive Ray Tracing," Proc. of the Eurographics Workshop on Rendering Techniques, 2000.
[15] S. Gibson and R.J. Hubbold, "A Perceptually-Driven Parallel Algorithm for Efficient Radiosity Simulation," IEEE Transactions on Visualization and Computer Graphics, vol. 6, no. 3, pp. 220-235, 2000.
[16] J.E. Warnock, A Hidden Surface Algorithm for Computer Generated Halftone Pictures, Doctoral Thesis, UMI Order Number: AAI6919002, The University of Utah, 1969.
[17] M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf, Computational Geometry, Springer-Verlag, pp. 291-306, 2000.

Yusuf Yavuz received his Master of Science degree from the Computer Engineering Department at Ankara University in 2009. His research interests are computer graphics and parallel programming.

Bulent Tugrul is an instructor in the Department of Computer Engineering at Ankara University. He received his Master of Science degree from the Computer Science Department at Syracuse University in 2001. His research interests are database management, data mining, and computer graphics.

Suleyman Tosun is an assistant professor in the Department of Computer Engineering at Ankara University. He received his master's and PhD degrees from the Computer Engineering Department at Syracuse University in 2001 and 2005, respectively. His research interests are electronic design automation, multiprocessor systems, networks on chip, and computer graphics.
