Performance Indicator:
Realtime: 30 FPS
Interactive: 10 FPS
Offline Render: < 10 FPS
Out-of-core Rendering: dedicated server with super large memory
Challenge of Game Rendering:
Different rendering techniques: water, post-processing, skin, ... each requires a different shader
Compatibility across different CPU and GPU models
Games require a stable frame rate regardless of resolution
Limited CPU bandwidth and memory footprint: rendering, game logic, networking, animation, physics, and AI all need CPU and memory, so rendering is allowed only about 10% of the CPU.
Profiling: a system that automatically checks a game's CPU usage on game platforms at release time.
Cartoon rendering, 2D rendering, subsurface scattering, and hair/fur are not covered in this course.
SIMD: single instruction, multiple data; common in CPUs. Data is packed into vectors and one instruction operates on the whole vector.
SIMT: single instruction, multiple threads. Multiple cores execute the same instruction, each on different data.
GPC: Graphics Processing Cluster, a dedicated hardware block on the graphics card for graphics work.
For more GPU Architecture related info, please look at CUDA.md
Von Neumann architecture: storage is separated from computation. This makes hardware design easier but reduces speed, because data must be moved between memory and the processor.
Application speed/latency is limited by:
Memory bound (memory access speed)
ALU bound (arithmetic speed of the GPU)
TMU (Texture Mapping Unit) bound
BW (Bandwidth) bound
Different Hardware:
PC: generic computer
Game console: all memory is shared
Phone:
Instead of storing the vertices of every triangle directly, we store indices into a shared vertex list, as an .obj file does. This reduces storage by about 6x. There are several ways to lay out the data:
Triangle strip: store only the vertices in sequence and assume consecutive vertices are connected as an ordered strip.
Triangle list: assume every 3 consecutive vertices form an individual triangle.
Note that we need a normal for each vertex, because one vertex can be shared by many triangles.
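The indexed storage and the strip/list layouts can be sketched in a few lines. This is a minimal illustration, not an engine implementation: the helper names `build_index_buffer` and `strip_to_list` are hypothetical, and vertices are plain position tuples without normals or UVs.

```python
def build_index_buffer(triangle_vertices):
    """Deduplicate raw per-triangle vertices into (vertex list, index list)."""
    vertices = []   # unique vertices, in first-seen order
    lookup = {}     # vertex -> its position in `vertices`
    indices = []    # one index per original vertex
    for v in triangle_vertices:
        if v not in lookup:
            lookup[v] = len(vertices)
            vertices.append(v)
        indices.append(lookup[v])
    return vertices, indices


def strip_to_list(strip):
    """Expand a triangle strip of indices into a list of index triples."""
    triangles = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        if i % 2 == 1:
            # Swap two indices on every other triangle so that all
            # triangles keep the same winding order.
            a, b = b, a
        triangles.append((a, b, c))
    return triangles


# A quad stored as a raw triangle list: six vertices, two of them duplicated.
quad = [(0, 0, 0), (1, 0, 0), (0, 1, 0),
        (1, 0, 0), (1, 1, 0), (0, 1, 0)]
vertices, indices = build_index_buffer(quad)

# The same quad as a strip needs only four indices.
quad_strip_triangles = strip_to_list([0, 1, 2, 3])
```

With indexing, the quad keeps four unique vertices plus six small integers instead of six full vertices; the saving grows as more triangles share each vertex.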
A shader is classified as data in a game engine, although it is itself code.
When a mesh corresponds to more than one shader, we split it into several submeshes that are bound to the same set of UVs and vertex positions, each with its own offset, shader, and texture.
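One possible way to picture the submesh layout is as a shared vertex and index buffer plus a list of ranges. The sketch below is an assumption for illustration only: the class names, field names, and the shader/texture strings are hypothetical, not taken from any particular engine.

```python
from dataclasses import dataclass


@dataclass
class SubMesh:
    index_offset: int  # first index of this submesh in the shared index buffer
    index_count: int   # how many indices this submesh draws
    shader: str        # hypothetical shader name
    texture: str       # hypothetical texture name


@dataclass
class Mesh:
    positions: list    # shared vertex positions
    uvs: list          # shared UV coordinates
    indices: list      # shared index buffer
    submeshes: list    # SubMesh entries referencing ranges of `indices`


def draw_calls(mesh):
    """Return one (shader, texture, index range) tuple per submesh."""
    calls = []
    for sm in mesh.submeshes:
        span = mesh.indices[sm.index_offset:sm.index_offset + sm.index_count]
        calls.append((sm.shader, sm.texture, span))
    return calls


# A quad whose two triangles use different shaders and textures.
mesh = Mesh(
    positions=[(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)],
    uvs=[(0, 0), (1, 0), (0, 1), (1, 1)],
    indices=[0, 1, 2, 1, 3, 2],
    submeshes=[
        SubMesh(index_offset=0, index_count=3, shader="skin", texture="face.png"),
        SubMesh(index_offset=3, index_count=3, shader="cloth", texture="shirt.png"),
    ],
)
calls = draw_calls(mesh)
```

Each submesh only adds an offset and a count, so the vertex data is stored once no matter how many shaders the mesh uses.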
Culling: discarding renderable objects that lie outside the view frustum, to reduce the number of things the GPU has to draw
There are different approaches to bounding volumes:
Bounding sphere: fastest
Axis-Aligned Bounding Box (AABB): fast while accurate
Oriented Bounding Box (OBB)
k-DOP (discrete oriented polytope, e.g. 8-DOP): convex bounding volume typically used in physics
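To make the speed ranking concrete, here is a minimal sketch of the overlap tests for the two cheapest volumes. The function names are hypothetical; points are plain 3-tuples, and an AABB is assumed to be given as its minimum and maximum corners.

```python
def spheres_overlap(c1, r1, c2, r2):
    """Two spheres overlap if the centre distance is at most r1 + r2."""
    # Compare squared values to avoid a square root.
    dist_sq = sum((a - b) ** 2 for a, b in zip(c1, c2))
    return dist_sq <= (r1 + r2) ** 2


def aabbs_overlap(min1, max1, min2, max2):
    """Two AABBs overlap if their intervals overlap on every axis."""
    return all(
        lo1 <= hi2 and lo2 <= hi1
        for lo1, hi1, lo2, hi2 in zip(min1, max1, min2, max2)
    )


near = spheres_overlap((0, 0, 0), 1.0, (1.5, 0, 0), 1.0)
far = spheres_overlap((0, 0, 0), 1.0, (3.0, 0, 0), 1.0)
boxes_touch = aabbs_overlap((0, 0, 0), (1, 1, 1), (0.5, 0.5, 0.5), (2, 2, 2))
boxes_apart = aabbs_overlap((0, 0, 0), (1, 1, 1), (2, 0, 0), (3, 1, 1))
```

The sphere test is a single distance comparison, while the AABB test needs one interval check per axis but usually fits elongated objects more tightly.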
BVH tree culling example: we construct the tree by merging bounding spheres bottom-up. Then, to find which objects need to be drawn, we traverse from the root toward the leaves and skip any subtree whose bounding sphere is not visible.
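The build-and-traverse idea can be sketched as follows. This is a simplified illustration under stated assumptions: nodes are merged pairwise in the order given (a real engine would group by spatial proximity), the view frustum is represented as a list of planes `(normal, d)` whose inside satisfies `dot(normal, p) + d >= 0`, and all names are hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    center: tuple
    radius: float
    name: str = ""                       # set on leaves only
    children: list = field(default_factory=list)


def merge(a, b):
    """Return a parent node whose sphere encloses both child spheres."""
    d = math.dist(a.center, b.center)
    if d + b.radius <= a.radius:         # b lies inside a
        return Node(a.center, a.radius, children=[a, b])
    if d + a.radius <= b.radius:         # a lies inside b
        return Node(b.center, b.radius, children=[a, b])
    r = (d + a.radius + b.radius) / 2
    t = (r - a.radius) / d               # slide from a's centre toward b's
    c = tuple(pa + (pb - pa) * t for pa, pb in zip(a.center, b.center))
    return Node(c, r, children=[a, b])


def build_bvh(nodes):
    """Merge nodes pairwise until one root remains (assumes a non-empty list)."""
    while len(nodes) > 1:
        merged = [merge(nodes[i], nodes[i + 1]) for i in range(0, len(nodes) - 1, 2)]
        if len(nodes) % 2 == 1:
            merged.append(nodes[-1])     # odd node carries over to the next level
        nodes = merged
    return nodes[0]


def outside(node, planes):
    """True if the sphere lies entirely behind at least one plane."""
    return any(
        sum(n * c for n, c in zip(normal, node.center)) + d < -node.radius
        for normal, d in planes
    )


def cull(node, planes, visible):
    """Collect the names of leaves that survive the plane tests."""
    if outside(node, planes):
        return                           # prune the whole subtree
    if not node.children:
        visible.append(node.name)
        return
    for child in node.children:
        cull(child, planes, visible)


leaves = [Node((5, 0, 0), 1, "a"), Node((6, 0, 0), 1, "b"),
          Node((-5, 0, 0), 1, "c"), Node((-8, 0, 0), 1, "d")]
root = build_bvh(leaves)
visible = []
cull(root, [((1, 0, 0), 0.0)], visible)  # one plane: keep the half-space x >= 0
```

In this example the subtree holding "c" and "d" is rejected with a single test on its parent sphere, which is where the hierarchy saves work.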
PVS (Potentially Visible Set): a way to do culling and resource preloading. Each room can see only a subset of the other rooms through portals (in red). Automatic space-partitioning algorithms are complex. While almost no game engine uses PVS directly, the idea remains.
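At run time the PVS idea reduces to a table lookup. The sketch below assumes the per-room visibility sets have already been precomputed offline; the room names, object names, and the function `objects_to_draw` are hypothetical.

```python
# Precomputed offline (assumed): for each room, the rooms visible through portals.
pvs = {
    "hall": {"hall", "kitchen", "corridor"},
    "kitchen": {"kitchen", "hall"},
    "corridor": {"corridor", "hall", "bedroom"},
    "bedroom": {"bedroom", "corridor"},
}

objects_by_room = {
    "hall": ["sofa", "lamp"],
    "kitchen": ["stove"],
    "corridor": ["painting"],
    "bedroom": ["bed"],
}


def objects_to_draw(current_room, pvs, objects_by_room):
    """Return the objects in every room potentially visible from current_room."""
    result = []
    for room in sorted(pvs[current_room]):   # sorted for a stable order
        result.extend(objects_by_room.get(room, []))
    return result


from_kitchen = objects_to_draw("kitchen", pvs, objects_by_room)
```

The same lookup can drive resource preloading: assets for rooms in the current set are loaded, and everything else can stay on disk.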
GPU culling: use the GPU to compute culling for each object. The GPU returns only a true/false result. The speed is comparable to PVS and many other culling methods. It is widely adopted in industry because the GPU computes quickly.