**Ray Tracing Acceleration Structures**
Acceleration
=====
1. Why accelerate?
1. Naive #intersections = rays * objects
2. _Primary rays_
* $512 \times 512 \approx 1/4\;\mathrm{M}$
* $1\;\mathrm{K} \times 1\;\mathrm{K} = 1\;\mathrm{M}$
* $\mathrm{HD} = 1920 \times 1080 \approx 2\;\mathrm{K} \times 1\;\mathrm{K} = 2\;\mathrm{M}$
* $\mathrm{4K} = 3840 \times 2160 \approx 4\;\mathrm{K} \times 2\;\mathrm{K} = 8\;\mathrm{M}$
* With AA or others, multiply by sample count
3. _Secondary rays_ / Ray tree
* For each node: ray that got there + $L$ light rays = $1+L$ rays
* Reflection only:
* $D$ nodes for tree depth $D$
* Rays: $(1+L)D$
* Reflection & Refraction:
* $2^D-1$ nodes for tree depth $D$
* Rays: $(1+L)(2^D-1)$
2. Fast intersections
* Precompute everything you can once rather than per intersection
* Worth optimizing intersection algorithm
* Compile in **Release** (aka with compiler optimization) when you can
3. Caching & coherence
* Compact arrays rather than pointers
* Ray packets (intersections with cluster of rays)
* Bin by start location and direction
4. Parallelization
1. Ray tracing is _embarrassingly parallel_
* Thread per ray
* Thread per pixel
* Thread per set of pixels (block, line)
2. Threads
* Multi-core processor, shared memory
* Share read-only
* Be sure to copy thread-local variables
* OK to have threads write to dedicated location (avoid false sharing)
* Atomic for truly shared data (e.g. counters)
3. Processes
* _Render Farm_ = collection of computers to speed rendering
* Consider scene distribution cost
4. Acceleration structure
* Gate expensive work behind cheap work (bounds on convex poly)
* Early ray termination
* Hierarchy unlocks O() improvement
* Don't forget to count data structure build-time
Bounding Volume Hierarchy (BVH)
=====
1. Overall Concepts
1. Bounding Volume
1. Only intersect stuff inside volume if ray hits volume
2. Cheaper to intersect than stuff inside
2. Hierarchy = tree to unlock O(log(N)) instead of O(N)
3. Implementation
1. Make bounding shape another object type
2. Intersection function recurses into children
4. Building
1. Levels contain geometry below, not necessarily bounding objects below
2. Bottom-up (clustering) or Top-down (splitting)
3. Longest axis split
4. Surface Area Heuristic (SAH)
1. Chance of hitting bound is roughly proportional to $(\mathit{child\ surface\ area}) / (\mathit{parent\ surface\ area})$
2. Cost is proportional to the number of objects in the child
3. Build algorithms usually sweep split through objects
2. Bounding Spheres
1. Fast to intersect
2. Very poor fit for flat things (some fit a collection of spheres)
3. Minimal bounding sphere is hard
1. Exact is NP worst case (though may be better average case)
2. Heuristics (e.g. center of bounding box, extent)
3. AABB
1. Pretty fast to intersect
* $t = (\vec{n}\cdot(p_0-e))/(\vec{n} \cdot \vec{d}) = (\mathit{xmin}-e_x)/d_x$
* $\mathit{xmin} \le e_x + t * d_x \le \mathit{xmax}$
2. Good fit to axis-aligned planes, but less to diagonal
3. Easy to fit: find min and max in each dimension
3. Probably the most popular
4. OBB
1. OK to intersect, can reuse some computations for parallel planes
2. Exact fit algorithms are OK, but tricky. Fit using point covariances is easier.
3. Reasonably good fit for anything boxy.
5. K-DOP
1. Fixed plane directions (like AABB, but with more fixed directions)
* Lots of reused computation even between parent and child K-DOPs
2. Popular for collision detection, less so for ray tracing
Spatial Partitioning
=====
1. Overall Concepts
1. Subdivide space into cells or regions rather than objects
2. Traverse in order until hit
3. Choice to put objects in interior nodes or only in leaves
1. Objects in interior nodes: one copy only, but increased intersection cost
2. Objects in leaves
1. Object may span several leaves
2. Include multiple references
3. If intersection falls outside current cell, ignore until you're processing that cell
2. Uniform grid
1. Easiest to traverse
2. Not hierarchical, so speedup depends on good choice of cell size
3. Teapot in stadium problem
3. Octrees
1. Easy to build, traversal isn't to bad
2. Lots of stuff can land between cells & end up at upper nodes
4. K-d trees (popular)
1. Still axis aligned planes, but relax placement constraint
2. Optimization to decide which axis to split and where
5. BSP trees
1. Relax axis alignment constraint
2. Optimal build NP complete, must use heuristic