Metal by Tutorials

Fourth Edition · macOS 14, iOS 17 · Swift 5.9 · Xcode 15

6. Coordinate Spaces
Written by Marius Horga & Caroline Begbie

To easily find a point on a grid, you need a coordinate system. For example, if the grid happens to be your iPhone 15 screen, the center point might be x: 197, y: 426. However, that point may be different depending on what space it’s in.

In the previous chapter, you learned about matrices. By multiplying a vertex’s position by a particular matrix, you can convert the vertex position to a different coordinate space. There are typically six spaces a vertex travels through as it makes its way through the pipeline:

  • Object
  • World
  • Camera
  • Clip
  • NDC (Normalized Device Coordinate)
  • Screen

Since this is starting to read like a description of Voyager leaving our solar system, let’s have a quick conceptual look at each coordinate space before attempting the conversions.

Object Space

If you’re familiar with the Cartesian coordinate system, you know that it uses a pair of coordinates, x and y, to map a point’s location on a 2D grid. The following image shows a 2D grid with the possible vertices of the dog mapped using Cartesian coordinates.

Vertices in object space

The positions of the vertices are in relation to the dog’s origin, which is located at (0, 0). The vertices in this image are located in object space (or local or model space). In the previous chapter, Triangle held an array of vertices in object space, describing the position of each vertex of the triangle.

World Space

In the following image, the direction arrows mark the world’s origin at (0, 0, 0). So, in world space, the dog is at (1, 0, 1) and the cat is at (-1, 0, -2).

Vertices in world space

Camera Space

Enough about the cat. Let’s move on to the dog. For him, the center of the universe is the person holding the camera. So, in camera space (or view space), the camera is at (0, 0, 0) and the dog is approximately at (-3, -2, 7). When the camera moves, it stays at (0, 0, 0), but the positions of the dog and cat move relative to the camera.

Clip Space

The main reason for doing all this math is to project with perspective. In other words, you want to take a three-dimensional scene into a two-dimensional space. Clip space is a distorted cube that’s ready for flattening.

Clip space

NDC (Normalized Device Coordinate) Space

Projection into clip space creates a half cube of w size. During rasterization, the GPU divides each position’s x, y and z by its w to produce normalized coordinates between -1 and 1 on the x- and y-axes and between 0 and 1 on the z-axis.

Screen Space

Now that the GPU has a normalized cube, it flattens that cube into two dimensions and converts everything into screen coordinates, ready to display on the device’s screen.

Final render

Converting Between Spaces

To convert from one space to another, you can use transformation matrices. In the following image, the vertex on the dog’s ear is (-1, 4, 0) in object space. But in world space, the origin is different, so the vertex — judging from the image — is at about (0.75, 1.5, 1).
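
As a rough illustration of that conversion, here is a minimal Swift sketch that multiplies the ear vertex by a model matrix. The Matrix4 type and its translation helper are hypothetical stand-ins written for this example, and the matrix only translates by the dog’s world position of (1, 0, 1). The matrix behind the image presumably does more than translate, so the result differs from the estimate above.

```swift
// Hypothetical column-major 4x4 matrix, written out by hand so the
// sketch does not depend on any particular math library.
struct Matrix4 {
  var columns: [SIMD4<Float>]

  // A matrix that moves points by `t`.
  static func translation(_ t: SIMD3<Float>) -> Matrix4 {
    Matrix4(columns: [
      SIMD4<Float>(1, 0, 0, 0),
      SIMD4<Float>(0, 1, 0, 0),
      SIMD4<Float>(0, 0, 1, 0),
      SIMD4<Float>(t.x, t.y, t.z, 1)
    ])
  }

  // Matrix times column vector.
  static func * (m: Matrix4, v: SIMD4<Float>) -> SIMD4<Float> {
    let xy = m.columns[0] * v.x + m.columns[1] * v.y
    let zw = m.columns[2] * v.z + m.columns[3] * v.w
    return xy + zw
  }
}

// The ear vertex in object space, with w = 1 because it is a position.
let earInObjectSpace = SIMD4<Float>(-1, 4, 0, 1)

// Assumption: the model matrix is a pure translation to (1, 0, 1).
let modelMatrix = Matrix4.translation([1, 0, 1])

let earInWorldSpace = modelMatrix * earInObjectSpace
print(earInWorldSpace)
```

The vertex keeps its offset from the dog’s origin, but it is now measured from the world’s origin instead.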

Converting object to world

The three transformation matrices

Coordinate Systems

Different graphics APIs use different coordinate systems. You already found out that Metal’s NDC (Normalized Device Coordinates) uses 0 to 1 on the z-axis. You also may already be familiar with OpenGL, which uses -1 to 1 on the z-axis.
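
If you ever need to move a depth value from one convention to the other, the remapping is a simple scale and offset. The following is a small sketch, not code from the starter project, and metalDepth(fromOpenGLDepth:) is a hypothetical name chosen for this example.

```swift
// Remap a depth value from an OpenGL-style NDC range of -1...1
// to the 0...1 range that Metal uses.
func metalDepth(fromOpenGLDepth z: Float) -> Float {
  (z + 1) / 2
}

let nearDepth = metalDepth(fromOpenGLDepth: -1)
let middleDepth = metalDepth(fromOpenGLDepth: 0)
let farDepth = metalDepth(fromOpenGLDepth: 1)
print(nearDepth, middleDepth, farDepth)
```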

Coordinate systems

The Starter Project

With a better understanding of coordinate systems and spaces, you’re ready to start creating matrices.

Starter project

Uniforms

Constant values that are the same across all vertices or fragments are generally referred to as uniforms. The first step is to create a uniform structure to hold the conversion matrices. After that, you’ll apply the uniforms to every vertex.

Setting up the bridging header

#import <simd/simd.h>
typedef struct {
  matrix_float4x4 modelMatrix;
  matrix_float4x4 viewMatrix;
  matrix_float4x4 projectionMatrix;
} Uniforms;

The Model Matrix

Your train vertices are currently in object space. To convert these vertices to world space, you’ll use modelMatrix. By changing modelMatrix, you’ll be able to translate, scale and rotate your train.
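
When you combine translation and rotation into one model matrix, the order of multiplication matters. The sketch below uses a hypothetical hand-written Matrix4 type rather than the project’s float4x4 extensions, and it assumes a counterclockwise rotation about the z-axis. It applies the two matrices to a single vertex in both orders so you can compare the results.

```swift
import Foundation

// Hypothetical column-major 4x4 matrix for illustration only.
struct Matrix4 {
  var columns: [SIMD4<Float>]

  static func translation(_ t: SIMD3<Float>) -> Matrix4 {
    Matrix4(columns: [
      SIMD4<Float>(1, 0, 0, 0),
      SIMD4<Float>(0, 1, 0, 0),
      SIMD4<Float>(0, 0, 1, 0),
      SIMD4<Float>(t.x, t.y, t.z, 1)
    ])
  }

  // Counterclockwise rotation about the z-axis by `angle` radians.
  static func rotationZ(_ angle: Float) -> Matrix4 {
    let c = cos(angle)
    let s = sin(angle)
    return Matrix4(columns: [
      SIMD4<Float>(c, s, 0, 0),
      SIMD4<Float>(-s, c, 0, 0),
      SIMD4<Float>(0, 0, 1, 0),
      SIMD4<Float>(0, 0, 0, 1)
    ])
  }

  // Matrix times column vector.
  static func * (m: Matrix4, v: SIMD4<Float>) -> SIMD4<Float> {
    let xy = m.columns[0] * v.x + m.columns[1] * v.y
    let zw = m.columns[2] * v.z + m.columns[3] * v.w
    return xy + zw
  }
}

let vertex = SIMD4<Float>(1, 0, 0, 1)
let translation = Matrix4.translation([0.5, -0.4, 0])
let rotation = Matrix4.rotationZ(.pi / 2)

// Equivalent to (translation * rotation) * vertex:
// the vertex rotates about the origin first, then moves.
let rotateThenTranslate = translation * (rotation * vertex)

// Equivalent to (rotation * translation) * vertex:
// the vertex moves first, then swings around the origin.
let translateThenRotate = rotation * (translation * vertex)

print(rotateThenTranslate, translateThenRotate)
```

Because the matrix closest to the vertex applies first, translation * rotation rotates the model about its own origin and then places it in the world.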

// Swift (Renderer): build the model matrix and send the uniforms
// to the vertex function at buffer index 11.
var uniforms = Uniforms()

let translation = float4x4(translation: [0.5, -0.4, 0])
let rotation =
  float4x4(rotation: [0, 0, Float(45).degreesToRadians])
uniforms.modelMatrix = translation * rotation

renderEncoder.setVertexBytes(
  &uniforms,
  length: MemoryLayout<Uniforms>.stride,
  index: 11)

// Metal shader: receive the uniforms and transform each vertex
// from object space to world space.
#import "Common.h"

vertex VertexOut vertex_main(
  VertexIn in [[stage_in]],
  constant Uniforms &uniforms [[buffer(11)]])
{
  float4 position = uniforms.modelMatrix * in.position;
  VertexOut out {
    .position = position
  };
  return out;
}
Train in world space

View Matrix

To convert between world space and camera space, you set a view matrix. Depending on how you want to move the camera in your world, you can construct the view matrix appropriately. The view matrix you’ll create here is a simple one, best for FPS (First Person Shooter) style games.
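
To build some intuition for why the view matrix is an inverse, consider a camera that is only translated. Inverting a translation means translating by the opposite vector, so moving the camera right makes everything else appear to move left. The sketch below assumes a translation-only camera, and cameraSpacePosition(ofWorldPosition:cameraPosition:) is a hypothetical helper rather than part of the project.

```swift
// For a camera that is only translated, converting a world position
// to camera space is a subtraction of the camera's position.
func cameraSpacePosition(
  ofWorldPosition worldPosition: SIMD3<Float>,
  cameraPosition: SIMD3<Float>
) -> SIMD3<Float> {
  worldPosition - cameraPosition
}

// The camera sits 0.8 units to the right of the world's origin.
let cameraPosition = SIMD3<Float>(0.8, 0, 0)

// Assumption: the train sits at the world's origin.
let trainInWorldSpace = SIMD3<Float>(0, 0, 0)

// In camera space, the train ends up 0.8 units to the left.
let trainInCameraSpace = cameraSpacePosition(
  ofWorldPosition: trainInWorldSpace,
  cameraPosition: cameraPosition)
print(trainInCameraSpace)
```

A camera that also rotates needs a full matrix inverse, which is what the inverse property on the matrix gives you.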

uniforms.viewMatrix = float4x4(translation: [0.8, 0, 0]).inverse

// Metal shader, before:
float4 position = uniforms.modelMatrix * in.position;
// Metal shader, after:
float4 position = uniforms.viewMatrix * uniforms.modelMatrix
                    * in.position;
Train in camera space

renderEncoder.setVertexBytes(
  &uniforms,
  length: MemoryLayout<Uniforms>.stride,
  index: 11)
timer += 0.005
uniforms.viewMatrix = float4x4.identity
let translationMatrix = float4x4(translation: [0, -0.6, 0])
let rotationMatrix = float4x4(rotationY: sin(timer))
uniforms.modelMatrix = translationMatrix * rotationMatrix
A clipped train

NDC clipping

Projection

It’s time to apply some perspective to your render to give your scene some depth.

Projection of a scene

Projection Matrix

➤ Open Renderer.swift, and add this code to mtkView(_:drawableSizeWillChange:):

let aspect =
  Float(view.bounds.width) / Float(view.bounds.height)
let projectionMatrix =
  float4x4(
    projectionFov: Float(45).degreesToRadians,
    near: 0.1,
    far: 100,
    aspect: aspect)
uniforms.projectionMatrix = projectionMatrix

mtkView(
  metalView,
  drawableSizeWillChange: metalView.drawableSize)

// Metal shader: apply the projection matrix last.
float4 position =
  uniforms.projectionMatrix * uniforms.viewMatrix
  * uniforms.modelMatrix * in.position;
Zoomed in

uniforms.viewMatrix = float4x4.identity
uniforms.viewMatrix = float4x4(translation: [0, 0, -3]).inverse
Camera moved back

A greater field of view

    renderEncoder.setTriangleFillMode(.lines)
The train positioned in a scene

Perspective Divide

Now that you’ve converted your vertices from object space through world space, camera space and clip space, the GPU takes over to convert to NDC (that’s -1 to 1 in the x and y directions and 0 to 1 in the z direction). The ultimate aim is to scale all the vertices from clip space into NDC space, and by using the fourth w component, that task gets a lot easier.
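
The GPU performs this divide for you, but you can reproduce it on the CPU to see the effect. The sketch below is one common way to write a left-handed perspective matrix that maps depth to 0 through 1. It is an assumption for illustration, not necessarily the exact matrix the project’s float4x4 initializer builds, and Matrix4 and ndcPosition(fromClip:) are hypothetical names.

```swift
import Foundation

// Hypothetical column-major 4x4 matrix for illustration only.
struct Matrix4 {
  var columns: [SIMD4<Float>]

  // Left-handed perspective projection with depth mapped to 0...1.
  static func perspective(
    fovY: Float, near: Float, far: Float, aspect: Float
  ) -> Matrix4 {
    let y = 1 / tan(fovY * 0.5)
    let x = y / aspect
    let z = far / (far - near)
    return Matrix4(columns: [
      SIMD4<Float>(x, 0, 0, 0),
      SIMD4<Float>(0, y, 0, 0),
      SIMD4<Float>(0, 0, z, 1),           // copies camera-space z into w
      SIMD4<Float>(0, 0, -z * near, 0)
    ])
  }

  // Matrix times column vector.
  static func * (m: Matrix4, v: SIMD4<Float>) -> SIMD4<Float> {
    let xy = m.columns[0] * v.x + m.columns[1] * v.y
    let zw = m.columns[2] * v.z + m.columns[3] * v.w
    return xy + zw
  }
}

// The perspective divide: clip space to NDC.
func ndcPosition(fromClip clip: SIMD4<Float>) -> SIMD3<Float> {
  SIMD3<Float>(clip.x, clip.y, clip.z) / clip.w
}

let projection = Matrix4.perspective(
  fovY: 45 * .pi / 180, near: 0.1, far: 100, aspect: 1)

// Points on the near and far planes land at the ends of the depth range.
let nearNDC = ndcPosition(fromClip: projection * SIMD4<Float>(0, 0, 0.1, 1))
let farNDC = ndcPosition(fromClip: projection * SIMD4<Float>(0, 0, 100, 1))

// Two points with the same x offset, one twice as far from the camera.
let closerNDC = ndcPosition(fromClip: projection * SIMD4<Float>(0.5, 0, 2, 1))
let fartherNDC = ndcPosition(fromClip: projection * SIMD4<Float>(0.5, 0, 4, 1))

print(nearNDC.z, farNDC.z, closerNDC.x, fartherNDC.x)
```

Because w holds the camera-space depth, dividing by it shrinks x and y for distant points, which is what makes far objects look smaller.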

The dog should appear smaller.

NDC to Screen

Finally, the GPU converts from normalized coordinates to whatever the device screen size is. You may already have done something like this at some time in your career when converting between normalized coordinates and screen coordinates.

converted.x = point.x * screenWidth/2  + screenWidth/2
converted.y = point.y * screenHeight/2 + screenHeight/2
converted = matrix * point
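
Written as a Swift function, the first two lines of that conversion might look like the following sketch. The function name and the 400 by 800 screen size are assumptions for this example. It follows the formula as written, so it doesn’t attempt the y-axis flip that a real viewport transform may also apply.

```swift
// Scale normalized x and y from -1...1 to 0...width and 0...height.
func screenPosition(
  fromNDC point: SIMD2<Float>,
  screenWidth: Float,
  screenHeight: Float
) -> SIMD2<Float> {
  SIMD2<Float>(
    point.x * screenWidth / 2 + screenWidth / 2,
    point.y * screenHeight / 2 + screenHeight / 2)
}

// The NDC origin maps to the middle of the screen.
let center = screenPosition(
  fromNDC: [0, 0], screenWidth: 400, screenHeight: 800)

// The NDC extremes map to opposite corners of the screen.
let minCorner = screenPosition(
  fromNDC: [-1, -1], screenWidth: 400, screenHeight: 800)
let maxCorner = screenPosition(
  fromNDC: [1, 1], screenWidth: 400, screenHeight: 800)

print(center, minCorner, maxCorner)
```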

Refactoring the Model Matrix

Currently, you set all the matrices in Renderer. Later, you’ll create a Camera structure to calculate the view and projection matrices.

struct Transform {
  var position: float3 = [0, 0, 0]
  var rotation: float3 = [0, 0, 0]
  var scale: Float = 1
}
extension Transform {
  var modelMatrix: matrix_float4x4 {
    let translation = float4x4(translation: position)
    let rotation = float4x4(rotation: rotation)
    let scale = float4x4(scaling: scale)
    let modelMatrix = translation * rotation * scale
    return modelMatrix
  }
}
protocol Transformable {
  var transform: Transform { get set }
}
extension Transformable {
  var position: float3 {
    get { transform.position }
    set { transform.position = newValue }
  }
  var rotation: float3 {
    get { transform.rotation }
    set { transform.rotation = newValue }
  }
  var scale: Float {
    get { transform.scale }
    set { transform.scale = newValue }
  }
}
class Model: Transformable {
var transform = Transform()

let translation = float4x4(translation: [0.5, -0.4, 0])
let rotation =
  float4x4(rotation: [0, 0, Float(45).degreesToRadians])
uniforms.modelMatrix = translation * rotation
uniforms.viewMatrix = float4x4(translation: [0.8, 0, 0]).inverse

let translationMatrix = float4x4(translation: [0, -0.6, 0])
let rotationMatrix = float4x4(rotationY: sin(timer))
uniforms.modelMatrix = translationMatrix * rotationMatrix

model.position.y = -0.6
model.rotation.y = sin(timer)
uniforms.modelMatrix = model.transform.modelMatrix
Using a transform in Model

Key Points

  • Coordinate spaces map different coordinate systems. To convert from one space to another, you can use matrix multiplication.
  • Model vertices start off in object space. These are generally held in the file that comes from your 3D app, such as Blender, but you can procedurally generate them too.
  • The model matrix converts object space vertices to world space. These are the positions that the vertices hold in the scene’s world. The origin at [0, 0, 0] is the center of the scene.
  • The view matrix moves vertices into camera space. Generally, your matrix will be the inverse of the camera’s transform in world space.
  • The projection matrix applies three-dimensional perspective to your vertices.

Where to Go From Here?

You’ve covered a lot of mathematical concepts in this chapter without diving too far into the underlying mathematical principles. To get started in computer graphics, you can fill your transform matrices and continue multiplying them at the usual times, but to be sufficiently creative, you’ll need to understand some linear algebra. A great place to start is Grant Sanderson’s Essence of Linear Algebra at https://bit.ly/3iYnkN1. This video treats vectors and matrices visually. You’ll also find some additional references in references.markdown in the resources folder for this chapter.

© 2025 Kodeco Inc.