Zach Arend Final Project for Intro to Computer Graphics (CPE 471)

For my final project, I did program 1 again. This time, I achieved a 2x speedup by parallelizing it with CUDA, the parallel programming framework for NVIDIA graphics cards. The biggest challenge I faced was allocating memory on the host (CPU) for the pixel array and copying the results back to the host. The biggest lesson learned is that memory is expensive.

A big challenge in parallelizing a rasterizer is dealing with the z-buffer. While many threads have the capability to write to the same locations at the same time, managing this memory can be tricky. To get around this, I sorted the triangles by depth, then used the atomicMax operation to fill in the z-buffer with the highest triangle at each point. Because the triangles are sorted, the largest triangle number written to a pixel identifies the topmost triangle there. atomicMax is a function that comes with CUDA; it atomically replaces an array element with the maximum of its current value and a new value.

Although I thought the atomic operations would be expensive, their cost paled in comparison to the overhead of memory allocation. My parallel rasterizer spends about half of its time allocating pinned memory on the host for the pixel array. I chose to use pinned memory because data can be copied across the bus to it much faster. This speedup outweighs the time required to allocate the memory.

Copy mesh and indices from host to device
Populate array of triangles with vertices and indices
Calculate the depth of each triangle by averaging its three points
Create a mapping from pixel to triangle number
Atomic max with its triangle number at pixel location in z-buffer
Get the triangle from z-buffer and rasterize it

Serial bunny is on the left and the parallel bunny is on the right.
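
To make the sort-then-atomic-max idea concrete, here is a minimal CPU-side sketch in C++. It is not the CUDA code from this project: std::thread stands in for GPU threads, and the hypothetical helper atomicMaxInt stands in for CUDA's atomicMax. The names Triangle, covers, and buildTriangleBuffer are also made up for illustration, and the sketch assumes that a larger z means closer to the viewer.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

struct Vec3 { float x, y, z; };
struct Triangle { Vec3 a, b, c; };

// CPU stand-in for CUDA's atomicMax: raise `cell` to `value` if `value` is larger.
inline void atomicMaxInt(std::atomic<int>& cell, int value) {
    int seen = cell.load();
    // A failed exchange refreshes `seen`, so the loop re-checks the new value.
    while (seen < value && !cell.compare_exchange_weak(seen, value)) {
    }
}

// Depth of a triangle as the average z of its three points.
inline float averageDepth(const Triangle& t) {
    return (t.a.z + t.b.z + t.c.z) / 3.0f;
}

// Edge function: its sign tells which side of the edge p->q the point (x, y) is on.
inline float edge(const Vec3& p, const Vec3& q, float x, float y) {
    return (q.x - p.x) * (y - p.y) - (q.y - p.y) * (x - p.x);
}

// True if (x, y) lies inside or on the triangle, for either winding order.
inline bool covers(const Triangle& t, float x, float y) {
    float e0 = edge(t.a, t.b, x, y);
    float e1 = edge(t.b, t.c, x, y);
    float e2 = edge(t.c, t.a, x, y);
    return (e0 >= 0 && e1 >= 0 && e2 >= 0) || (e0 <= 0 && e1 <= 0 && e2 <= 0);
}

// Sorts `tris` far-to-near, then returns for each pixel the index of the
// topmost triangle in the sorted order, or -1 where no triangle covers it.
std::vector<int> buildTriangleBuffer(std::vector<Triangle>& tris, int width, int height) {
    // Assumption: larger z is closer, so the nearest triangle gets the largest index.
    std::sort(tris.begin(), tris.end(), [](const Triangle& l, const Triangle& r) {
        return averageDepth(l) < averageDepth(r);
    });

    std::vector<std::atomic<int>> zbuf(static_cast<std::size_t>(width) * height);
    for (auto& cell : zbuf) cell.store(-1);

    // One worker per triangle, for illustration only; a GPU would launch
    // many lightweight threads instead.
    std::vector<std::thread> workers;
    for (int i = 0; i < static_cast<int>(tris.size()); ++i) {
        workers.emplace_back([&, i] {
            for (int y = 0; y < height; ++y)
                for (int x = 0; x < width; ++x)
                    if (covers(tris[i], x + 0.5f, y + 0.5f))  // sample at pixel center
                        atomicMaxInt(zbuf[y * width + x], i);
        });
    }
    for (auto& w : workers) w.join();

    std::vector<int> result(zbuf.size());
    for (std::size_t p = 0; p < zbuf.size(); ++p) result[p] = zbuf[p].load();
    return result;
}
```

Calling buildTriangleBuffer(tris, width, height) leaves tris sorted far-to-near, so each entry of the returned buffer can be used directly as an index into tris when shading that pixel. Since the maximum does not depend on the order in which threads arrive, the result is the same no matter how the writes interleave.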