The Mathematics of Single Eye Perspective Projections
Matrix Representation
Visible surface determination and other operations that require depth information in images are typically performed after the perspective projection. If we create the matrix exactly as shown on the previous page, we lose the ability to perform these important tasks, because all relative depth information is lost. (Recall that all points wind up with z = zpp.)
We therefore need to do something a little more sophisticated with the third row of the matrix so that we can preserve relative depth. Rather than allowing all projected points to have z = zpp, we will compute a z′ for each which preserves relative depth relationships among all the points in the scene.
So what should this third row look like? That is, on what should z′ depend? It should be clear that z′ cannot be a function of the x and y coordinates of a point. Two points with different x and y coordinates but identical z coordinates before the transformation must be mapped to points with the same z′ coordinate. (You should take a moment and convince yourself that this is true.)
This means the first two elements of the third row must be zero, and hence the general form of the matrix will be as indicated above. Our challenge is to compute α and β so that relative depth is preserved. Recognizing that the perspective divide will result in the z coordinate of the affine point being z′ = α + β/z, we can determine appropriate values for α and β by applying the window-viewport limits in the z direction.
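As a short worked step, the expression for z′ can be written out from the last two rows of the matrix. This sketch assumes the fourth row is (0, 0, 1, 0), so that the homogeneous coordinate w equals z; that assumption is not restated in the text here, but it is the form consistent with z′ = α + β/z. The symbol z_h (the third homogeneous coordinate before the divide) is introduced only for illustration.

```latex
\begin{aligned}
z_h &= 0\cdot x + 0\cdot y + \alpha z + \beta \cdot 1 = \alpha z + \beta \\
w   &= 0\cdot x + 0\cdot y + 1\cdot z + 0\cdot 1 = z \\
z'  &= \frac{z_h}{w} = \frac{\alpha z + \beta}{z} = \alpha + \frac{\beta}{z}
\end{aligned}
```

Because neither row involves x or y, two points with the same z always receive the same z′.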
Different graphics APIs map the z direction differently. Some map zmin ≤ zeye ≤ zmax to a normalized 0 ≤ z′ ≤ 1 range. This would give us two equations in two unknowns from which α and β are easily determined:
α + β/zmin = 0
α + β/zmax = 1
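The 0-to-1 convention can be sketched in Python as follows. This is a minimal illustration, assuming the two boundary conditions are z′ = 0 at zmin and z′ = 1 at zmax; the function names `depth_coefficients_unit_range` and `z_prime` are hypothetical and do not come from any graphics API.

```python
def depth_coefficients_unit_range(z_min, z_max):
    """Solve alpha + beta/z_min = 0 and alpha + beta/z_max = 1.

    Subtracting the first equation from the second gives
    beta * (1/z_max - 1/z_min) = 1, from which beta follows;
    alpha then comes from the first equation.
    """
    if z_min == 0 or z_max == 0 or z_min == z_max:
        raise ValueError("z_min and z_max must be nonzero and distinct")
    beta = z_min * z_max / (z_min - z_max)
    alpha = z_max / (z_max - z_min)
    return alpha, beta


def z_prime(z, alpha, beta):
    """Depth after the perspective divide: z' = alpha + beta / z."""
    return alpha + beta / z


if __name__ == "__main__":
    # Example eye-space limits (illustrative values only).
    z_min, z_max = -10.0, -1.0
    alpha, beta = depth_coefficients_unit_range(z_min, z_max)
    for z in (z_min, -5.0, -2.0, z_max):
        print(z, z_prime(z, alpha, beta))
```

The intermediate values are not evenly spaced, because z′ varies with 1/z rather than with z; only the ordering of depths is preserved.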
OpenGL uses an internal left-handed normalized range in which points on the near plane are mapped to -1, and points on the far plane are mapped to +1. This convention gives us the following two equations in our two unknowns:
α + β/(-n) = -1
α + β/(-f) = 1
('n' and 'f' stand for the distances to the 'near' and 'far' planes, respectively, as passed to the OpenGL routine glFrustum. They are negated above because we need to convert them from distances to actual eye z coordinates when writing the equations. That is, positive values for 'n' and 'f' correspond to distances along the line of sight, which is the negative z direction of the eye coordinate system.)
Solving for α and β, we get: α=(f+n)/(f-n) and β=2fn/(f-n).
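The solution can be checked numerically. The sketch below uses the α and β just given and evaluates z′ = α + β/z at eye coordinates between the near and far planes; the function names and the sample values n = 1 and f = 100 are assumptions chosen for illustration, not part of OpenGL.

```python
def opengl_depth_coefficients(n, f):
    """Return (alpha, beta) for near distance n and far distance f.

    These satisfy alpha + beta/(-n) = -1 and alpha + beta/(-f) = +1.
    """
    if not (0 < n < f):
        raise ValueError("expected 0 < n < f")
    alpha = (f + n) / (f - n)
    beta = 2.0 * f * n / (f - n)
    return alpha, beta


def z_prime(z_eye, alpha, beta):
    """Depth after the perspective divide: z' = alpha + beta / z_eye."""
    return alpha + beta / z_eye


if __name__ == "__main__":
    n, f = 1.0, 100.0  # illustrative distances along the line of sight
    alpha, beta = opengl_depth_coefficients(n, f)
    # Eye-space z is negative along the line of sight, so the near
    # plane is at z = -n and the far plane is at z = -f.
    for z_eye in (-n, -2.0, -10.0, -50.0, -f):
        print(z_eye, z_prime(z_eye, alpha, beta))
```

Points farther from the eye receive larger z′ values, so relative depth order survives the projection, although the spacing is strongly nonlinear: most of the -1 to +1 range is used up close to the near plane.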
Optional
We developed our matrix as shown here because the derivation seemed somewhat simpler (no need to introduce window-viewport maps) and slightly more intuitive (avoiding the need to introduce minus signs). Section VII of the technical report on graphical transformations includes the complete derivation of the OpenGL matrix shown in Appendix F of the OpenGL Programming Guide. If you wish to generate actual stereo projection matrices for use in OpenGL programs, you can either negate every element of the matrices shown here and then add window-viewport mapping accommodations in the x and y directions, or adapt the derivations given in the technical report (which include relevant window-viewport mappings) using the added stereo separation concepts described here.