Simon’s Graphics Blog

Representing Rotations in Quaternion Arithmetic


Quaternions crop up a lot in game development, since they are an efficient way to store rotations in 3-space. This article attempts to serve as a mathematical introduction to quaternions, and explains how and why we choose to use them to represent 3D rotations.

Basic Quaternion Arithmetic

Quaternions form what is called a non-commutative division ring. This space can be visualised in a similar manner to complex numbers. Just as the complex numbers can be represented by a two-dimensional co-ordinate on the complex plane, quaternions can be represented by a co-ordinate in 4-D space. This is written:

$$ q=w+x\boldsymbol{i}+y\boldsymbol{j}+z\boldsymbol{k} \text{ for } w,x,y,z \in \mathbb{R} $$

The numbers $\boldsymbol{i}$, $\boldsymbol{j}$ and $\boldsymbol{k}$ are related to the real numbers and each other by the following equalities:

$$\begin{array}{c} \boldsymbol{i}^2=\boldsymbol{j}^2=\boldsymbol{k}^2=-1 \\ \boldsymbol{i}\boldsymbol{j}=\boldsymbol{k} \end{array}$$

Quaternions can be viewed as a vector space over the real numbers. In this way they are given the notation:

$$\begin{array}{c} q=(w,x,y,z) \text{ for } w,x,y,z \in \mathbb{R} \\ \text{or} \\ q=[w,\boldsymbol{v}] \text{ where } \boldsymbol{v}=(x,y,z) \end{array}$$

When adding or subtracting quaternions you perform the same operation as for vectors and matrices: add and subtract component-wise. As with many other vector spaces, the quaternions come equipped with a norm, which is a function (usually regarded as a “length”) that returns a single real number given a quaternion. It is usually defined as:

$$\Vert q\Vert=\sqrt{w^2 + x^2 + y^2 + z^2}$$

A non-commutative division ring has more structure than a vector space, namely a non-commutative multiplication operator. Without worrying too much about the exact definitions of these words (those interested can pick up the nearest undergraduate text on algebra) this basically means they form a special kind of number system. The word “non-commutative” means that multiplication is not symmetric, in other words:

$$q_1 q_2 \neq q_2 q_1 \text{ in general}$$

We can define quaternion multiplication using the second vector notation above and the normal vector dot and cross product as:

$$q_1 q_2 = [w_1 w_2 - v_1 \cdot v_2, w_1 v_2 + w_2 v_1 + v_1 \wedge v_2]$$

This is entirely equivalent to multiplying out the various components of i, j and k using the very first notation given earlier. Try it if you don’t believe me! :)
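If you would rather let a computer do the multiplying out, here is a minimal Python sketch of the product formula above. The function name `qmul` and the choice to store a quaternion as a plain `(w, x, y, z)` tuple are my own assumptions for illustration, not part of any particular library.

```python
def qmul(q1, q2):
    """Product of two quaternions stored as (w, x, y, z) tuples."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return (
        w1 * w2 - (x1 * x2 + y1 * y2 + z1 * z2),    # w1 w2 - v1 . v2
        w1 * x2 + w2 * x1 + (y1 * z2 - z1 * y2),    # x part of w1 v2 + w2 v1 + v1 ^ v2
        w1 * y2 + w2 * y1 + (z1 * x2 - x1 * z2),    # y part
        w1 * z2 + w2 * z1 + (x1 * y2 - y1 * x2),    # z part
    )

# The three imaginary units in (w, x, y, z) form.
i = (0.0, 1.0, 0.0, 0.0)
j = (0.0, 0.0, 1.0, 0.0)
k = (0.0, 0.0, 0.0, 1.0)

ij = qmul(i, j)   # equals k
ji = qmul(j, i)   # equals -k, so the order of multiplication matters
ii = qmul(i, i)   # equals (-1, 0, 0, 0)
```

The `ij` and `ji` results differ only in sign, which is the non-commutativity from earlier showing up in the cross product term.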

Conjugates And Inversion

Now that we can add, subtract and multiply quaternions it would be useful to also be able to divide them. As a direct extension of the complex conjugate in complex arithmetic, the quaternion conjugate is defined as:

$$q^\ast=(w,-x,-y,-z)$$

By multiplying this out using the definition of multiplication above, it can be shown that:

$$q^\ast{q}=qq^\ast=\Vert q\Vert^2$$

This equality allows us to construct the inverse of any non-zero quaternion:

$$q^{-1}=\frac{q^\ast}{\Vert q\Vert^2}$$

We can use the above definitions to prove that:

$$q^{-1}q=qq^{-1}=1$$
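These identities are easy to check numerically. The sketch below assumes quaternions are stored as `(w, x, y, z)` tuples; the helper names `qmul`, `conj`, `norm` and `inverse` are hypothetical and chosen just for this example.

```python
import math

def qmul(q1, q2):
    """Product of two quaternions stored as (w, x, y, z) tuples."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return (
        w1 * w2 - (x1 * x2 + y1 * y2 + z1 * z2),
        w1 * x2 + w2 * x1 + (y1 * z2 - z1 * y2),
        w1 * y2 + w2 * y1 + (z1 * x2 - x1 * z2),
        w1 * z2 + w2 * z1 + (x1 * y2 - y1 * x2),
    )

def conj(q):
    """Conjugate: negate the vector part."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def norm(q):
    """Euclidean length of the four components."""
    return math.sqrt(sum(c * c for c in q))

def inverse(q):
    """Conjugate divided by the squared norm; undefined for the zero quaternion."""
    n2 = sum(c * c for c in q)
    if n2 == 0.0:
        raise ZeroDivisionError("the zero quaternion has no inverse")
    return tuple(c / n2 for c in conj(q))

q = (1.0, 2.0, 3.0, 4.0)
q_conj_q = qmul(conj(q), q)      # real part is the squared norm, vector part is zero
identity = qmul(inverse(q), q)   # approximately (1, 0, 0, 0)
```

Note that for a unit quaternion the division in `inverse` does nothing, so the conjugate alone is the inverse.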

Exponentials, Logarithms And Powers

Since quaternions are really just numbers you would expect there to be a natural extension of our idea of exponentials and logarithms for real numbers. Well it just so happens that there is, and it is derived in a similar way to the extension for complex numbers. This section is aimed at the mathematically fearless and can be skipped if need be without causing confusion in later sections.

By considering the series expansion of the exponential, it can be shown that:

$$\begin{array}{c} \text{for } q=(w,x,y,z)=[w,\boldsymbol{v}], \text{ let } \boldsymbol{v}=s\boldsymbol{n} \text{ such that } |\boldsymbol{n}|=1 \\ \text{then } e^q=e^w[\cos{s},\boldsymbol{n}\sin{s}] \end{array}$$

This leads to a very useful factorisation of a quaternion:

$$q=r[\cos{\theta},\boldsymbol{u}\sin{\theta}] \text{ where } r=\Vert q\Vert \text{ and } |\boldsymbol{u}|=1$$

Using the identity for exponential above, we can write this as:

$$q=r[\cos{\theta},\boldsymbol{u}\sin{\theta}]=re^{[0,\theta\boldsymbol{u}]}$$

Note how similar this is to the $r, \theta$ factorisation of complex numbers. Once we have this factorisation, we can trivially define the quaternion logarithm as:

$$\begin{array}{c} \text{for } q=re^{[0,\theta\boldsymbol{u}]} \\ \log{q}=\log{r}+[0,\theta\boldsymbol{u}]=[\log{r},\theta\boldsymbol{u}] \end{array}$$

We can then define quaternion powers in terms of the logarithm and exponential, namely:

$$q^k=e^{k\log{q}} \text{ for } k\in\mathbb{R}$$
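For the mathematically fearless who also like code, here is one way the three definitions above could be sketched in Python. The names `qexp`, `qlog` and `qpow` are hypothetical, quaternions are assumed to be `(w, x, y, z)` tuples, and the `1e-12` cut-off for a vanishing vector part is an arbitrary choice of mine. The logarithm is multi-valued in general; this sketch returns the value with $\theta$ in $[0,\pi]$ and does not treat negative real quaternions specially.

```python
import math

def qexp(q):
    """e^q = e^w [cos s, n sin s], where the vector part is s n with |n| = 1."""
    w, x, y, z = q
    s = math.sqrt(x * x + y * y + z * z)
    ew = math.exp(w)
    if s < 1e-12:
        # sin(s) / s tends to 1, so the vector part tends to e^w v.
        return (ew, ew * x, ew * y, ew * z)
    f = ew * math.sin(s) / s
    return (ew * math.cos(s), f * x, f * y, f * z)

def qlog(q):
    """log q = [log r, theta u] for q = r [cos theta, u sin theta], q non-zero."""
    w, x, y, z = q
    r = math.sqrt(w * w + x * x + y * y + z * z)
    s = math.sqrt(x * x + y * y + z * z)   # equals r sin(theta)
    if s < 1e-12:
        # Assumes w > 0 here: a positive real number has a purely real log.
        return (math.log(r), 0.0, 0.0, 0.0)
    theta = math.atan2(s, w)
    f = theta / s
    return (math.log(r), f * x, f * y, f * z)

def qpow(q, k):
    """q^k = e^(k log q) for real k."""
    return qexp(tuple(k * c for c in qlog(q)))

c = math.cos(math.pi / 4)
s = math.sin(math.pi / 4)
q = (c, 0.0, 0.0, s)        # theta = pi/4, u = (0, 0, 1)
squared = qpow(q, 2.0)      # theta doubles to pi/2: approximately (0, 0, 0, 1)
root = qpow(q, 0.5)         # theta halves to pi/8
round_trip = qexp(qlog((1.0, 2.0, 3.0, 4.0)))   # approximately (1, 2, 3, 4)
```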

A Quaternion Rotation

For a rotation of $\phi$ about a unit axis $\boldsymbol{w}$, we will define the representing quaternion $q$ to be:

$$q=[\cos{\frac\phi2},\boldsymbol{w}\sin{\frac\phi2}]$$

Note that under this scheme the norm of every rotation quaternion is 1 (i.e. they are all unit quaternions). The converse is also true: any unit quaternion can be decomposed into this form (just as we did above for logarithms) and can therefore represent a rotation. This makes it simple to correct numerical drift: just renormalise the quaternion. This takes far fewer operations than the matrix orthonormalisation needed to correct drift in rotation matrices.

To rotate a vector $\boldsymbol{v}$ by an angle of $\phi$ about an arbitrary unit axis $\boldsymbol{w}$, you can use the formula:

$$\boldsymbol{v}^\prime=(\boldsymbol{v}\cdot\boldsymbol{w})\boldsymbol{w}+(\boldsymbol{v}-(\boldsymbol{v}\cdot\boldsymbol{w})\boldsymbol{w})\cos\phi+\boldsymbol{w}\wedge\boldsymbol{v}\sin\phi$$

Representing the same rotation by a quaternion $q$, you can prove (go on, try it!) that this is equivalent to:

$$[0,\boldsymbol{v}^\prime]=q[0,\boldsymbol{v}]q^\ast$$

This formula uses quaternion multiplication and conjugation as defined earlier. It is one of the main reasons for encoding rotations as quaternions, since it greatly simplifies the mathematics of composing them. To see this, consider the product of two rotation quaternions $p$ and $q$:

$$\begin{array}{c} \text{let } r=pq \\ \text{then } r[0,\boldsymbol{v}]r^\ast=pq[0,\boldsymbol{v}](pq)^\ast=pq[0,\boldsymbol{v}]q^\ast{p}^\ast=p(q[0,\boldsymbol{v}]q^\ast)p^\ast \end{array}$$

This shows that multiplication of quaternions is equivalent to composition of rotations. Just like composing rotations, we need to be careful about the order in which we multiply. In this notation, rotations are applied from right to left: $r=pq$ applies $q$ first and then $p$.

Similarly, inverting a quaternion (note that for unit quaternions this is just the conjugate directly) produces the inverse rotation. Later, we will show how to use quaternion powers to do partial rotations, or blend between them.
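To make the rotation, composition and inversion rules concrete, here is a small Python sketch. It assumes `(w, x, y, z)` tuples for quaternions and 3-tuples for vectors; `from_axis_angle` and `rotate` are hypothetical helper names, and `from_axis_angle` trusts the caller to pass a unit axis.

```python
import math

def qmul(q1, q2):
    """Product of two quaternions stored as (w, x, y, z) tuples."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return (
        w1 * w2 - (x1 * x2 + y1 * y2 + z1 * z2),
        w1 * x2 + w2 * x1 + (y1 * z2 - z1 * y2),
        w1 * y2 + w2 * y1 + (z1 * x2 - x1 * z2),
        w1 * z2 + w2 * z1 + (x1 * y2 - y1 * x2),
    )

def conj(q):
    w, x, y, z = q
    return (w, -x, -y, -z)

def from_axis_angle(axis, phi):
    """q = [cos(phi/2), w sin(phi/2)]; axis is assumed to be unit length."""
    s = math.sin(phi / 2.0)
    return (math.cos(phi / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def rotate(q, v):
    """[0, v'] = q [0, v] q*, returning only the vector part v'."""
    _, x, y, z = qmul(qmul(q, (0.0, *v)), conj(q))
    return (x, y, z)

quarter_z = from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
quarter_x = from_axis_angle((1.0, 0.0, 0.0), math.pi / 2)
v = (1.0, 0.0, 0.0)

once = rotate(quarter_z, v)                 # x axis turns onto the y axis
r = qmul(quarter_x, quarter_z)              # right to left: quarter_z first, then quarter_x
composed = rotate(r, v)                     # same result as rotating in two steps
stepwise = rotate(quarter_x, rotate(quarter_z, v))
undone = rotate(conj(quarter_z), once)      # the conjugate of a unit quaternion undoes it
```

Swapping the two arguments of `qmul` when building `r` gives a different rotation, which is the order sensitivity mentioned above.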

Quaternion To Matrix Conversion

Although I hate to admit it, it is often necessary to have a quaternion rotation in matrix form, e.g. to load onto graphics hardware for hardware vertex transformations. In this case it is useful to have a formula for converting a quaternion quickly into a rotation matrix. Using the same right-to-left convention as above, so that the matrix pre-multiplies a column vector, one such formula for a unit quaternion is:

$$ \text{let }q=(w,x,y,z) \\ R=\begin{pmatrix} 1-2y^2-2z^2 & 2xy-2wz & 2zx+2wy \\ 2xy+2wz & 1-2x^2-2z^2 & 2yz-2wx \\ 2zx-2wy & 2yz+2wx & 1-2x^2-2y^2 \end{pmatrix}$$

As was recently pointed out to me, applying a quaternion rotation to a vector directly takes a couple of operations more than computing the matrix and applying the matrix directly (41 vs 39 operations). The direct quaternion rotation uses fewer registers for the computation, but once the matrix has been computed only 15 operations are needed to apply it to another vector, so for rotating more than one vector it is much more efficient to use the above matrix form.
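The matrix above transcribes almost directly into code. This is only a sketch: `to_matrix` and `apply` are hypothetical names, the matrix is stored as a list of three rows, and the input is assumed to be a unit quaternion in `(w, x, y, z)` order.

```python
import math

def to_matrix(q):
    """3x3 rotation matrix (list of rows) for a unit quaternion (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * y * y - 2 * z * z, 2 * x * y - 2 * w * z,     2 * z * x + 2 * w * y],
        [2 * x * y + 2 * w * z,     1 - 2 * x * x - 2 * z * z, 2 * y * z - 2 * w * x],
        [2 * z * x - 2 * w * y,     2 * y * z + 2 * w * x,     1 - 2 * x * x - 2 * y * y],
    ]

def apply(m, v):
    """Matrix times column vector: 9 multiplies and 6 adds per vector."""
    return tuple(m[r][0] * v[0] + m[r][1] * v[1] + m[r][2] * v[2] for r in range(3))

# A quarter turn about the z axis, i.e. [cos(pi/4), (0, 0, 1) sin(pi/4)].
quarter_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
m = to_matrix(quarter_z)
rotated_x = apply(m, (1.0, 0.0, 0.0))   # approximately (0, 1, 0)
rotated_y = apply(m, (0.0, 1.0, 0.0))   # approximately (-1, 0, 0)
```

Building `m` once and reusing it for every vertex is where the saving for multiple vectors comes from.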

SLERP

SLERP stands for spherical-linear interpolation and is one method of interpolating smoothly between two rotations. It is called spherical-linear since the two quaternion rotations are interpolated at a uniform rate along a geodesic on the surface of the 3-sphere.

The general idea behind interpolation is to have a variable t that ranges from 0 to 1, and use this to parameterise a smooth blend between quaternions $a$ and $b$. Spherical linear interpolation uses the following equation:

$$q = (ba^{-1})^{t}a$$

Clearly $q$ is $a$ when $t$ is 0, and $b$ when $t$ is 1. Note how we have been careful with our order of multiplication: quaternions are applied from right to left. General quaternion powers were described earlier, but since $t$ is a real number, and all rotation quaternions are of unit length, we can greatly simplify the power equation:

$$\text{if } q=[\cos\theta,\boldsymbol{u}\sin\theta] \text{ then } q^t=[\cos{t\theta},\boldsymbol{u}\sin{t\theta}]$$

This nicely fits with our intuition that linearly interpolating between the identity transform and a rotation should be a linear change of angle.
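Putting the SLERP equation and the simplified power together gives a very short implementation. The sketch below is a direct transcription under some assumptions of mine: quaternions are unit length and stored as `(w, x, y, z)` tuples, the names `unit_pow` and `slerp` are hypothetical, and the `1e-12` cut-off is arbitrary. It does not negate one input to force the shorter of the two possible arcs, which production code often does.

```python
import math

def qmul(q1, q2):
    """Product of two quaternions stored as (w, x, y, z) tuples."""
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return (
        w1 * w2 - (x1 * x2 + y1 * y2 + z1 * z2),
        w1 * x2 + w2 * x1 + (y1 * z2 - z1 * y2),
        w1 * y2 + w2 * y1 + (z1 * x2 - x1 * z2),
        w1 * z2 + w2 * z1 + (x1 * y2 - y1 * x2),
    )

def conj(q):
    w, x, y, z = q
    return (w, -x, -y, -z)

def unit_pow(q, t):
    """For unit q = [cos theta, u sin theta], return [cos(t theta), u sin(t theta)]."""
    w, x, y, z = q
    s = math.sqrt(x * x + y * y + z * z)    # sin(theta) for a unit quaternion
    if s < 1e-12:
        # No usable axis; treated here as the identity rotation.
        return (1.0, 0.0, 0.0, 0.0)
    theta = math.atan2(s, w)
    f = math.sin(t * theta) / s
    return (math.cos(t * theta), f * x, f * y, f * z)

def slerp(a, b, t):
    """q = (b a^-1)^t a, using the conjugate as the inverse of a unit quaternion."""
    return qmul(unit_pow(qmul(b, conj(a)), t), a)

identity = (1.0, 0.0, 0.0, 0.0)
quarter_z = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
halfway = slerp(identity, quarter_z, 0.5)   # an eighth of a turn about z
```

Here `halfway` has half the rotation angle of `quarter_z`, which is the linear change of angle described above.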
