Metal by Tutorials

Fourth Edition · macOS 14, iOS 17 · Swift 5.9 · Xcode 15

9. Navigating a 3D Scene
Written by Marius Horga & Caroline Begbie


A scene can consist of one or more cameras, lights and models. Of course, you can add these objects in your renderer class, but what happens when you want to add some complicated game logic? Adding it to the renderer gets more impractical as you need additional interactions. Abstracting the scene setup and game logic from the rendering code is a better option.

Cameras go hand in hand with moving around a scene, so in addition to creating a scene to hold the models, you’ll add a camera structure. Ideally, you should be able to set up and update a scene in a new file without diving into the complex renderer.

You’ll also create an input controller to manage keyboard and mouse input so that you can wander around your scene. Game engines typically include features such as input controllers, physics engines and sound.

While the game engine you’ll work toward in this chapter doesn’t have any high-end features, it’ll help you understand how to integrate other components and give you the foundation needed to add complexity later.

The Starter Project

Aside from some helpful comments and the unconstrained size of the view, the starter project for this chapter is the same as the challenge project for the previous chapter.

Scenes

A scene holds models, cameras and lighting. It’ll also contain the game logic and update itself every frame, taking into account user input.

import MetalKit

struct GameScene {
  lazy var house: Model = {
    let house = Model(name: "lowpoly-house.usdz")
    house.setTexture(name: "barn-color", type: BaseColor)
    return house
  }()
  lazy var ground: Model = {
    let ground = Model(name: "ground", primitiveType: .plane)
    ground.setTexture(name: "barn-ground", type: BaseColor)
    ground.tiling = 16
    ground.transform.scale = 40
    ground.transform.rotation.z = Float(90).degreesToRadians
    return ground
  }()
  lazy var models: [Model] = [ground, house]

  // Game logic: called once per frame.
  mutating func update(deltaTime: Float) {
    ground.rotation.y = sin(deltaTime)
    house.rotation.y = sin(deltaTime)
  }
}

// Renderer property: the renderer owns the scene.
lazy var scene = GameScene()

// When drawing a frame: update the scene, then render its models.
scene.update(deltaTime: timer)
for model in scene.models {
  model.render(
    encoder: renderEncoder,
    uniforms: uniforms,
    params: params)
}
The initial scene

Cameras

Instead of creating view and projection matrices in the renderer, you can abstract the construction and calculation away from the rendering code to a Camera structure. Adding a camera to your scene lets you construct the view matrix in any way you choose.

import CoreGraphics

protocol Camera: Transformable {
  var projectionMatrix: float4x4 { get }
  var viewMatrix: float4x4 { get }
  mutating func update(size: CGSize)
  mutating func update(deltaTime: Float)
}
struct FPCamera: Camera {
  var transform = Transform()

  var aspect: Float = 1.0
  var fov = Float(70).degreesToRadians
  var near: Float = 0.1
  var far: Float = 100

  var projectionMatrix: float4x4 {
    float4x4(
      projectionFov: fov,
      near: near,
      far: far,
      aspect: aspect)
  }

  // The view matrix is the inverse of the camera's world transform.
  var viewMatrix: float4x4 {
    (float4x4(rotation: rotation) *
    float4x4(translation: position)).inverse
  }

  // Keep the aspect ratio in sync with the view size.
  mutating func update(size: CGSize) {
    aspect = Float(size.width / size.height)
  }

  mutating func update(deltaTime: Float) {
  }
}
var camera = FPCamera()

init() {
  camera.position = [0, 1.4, -4.0]
}

mutating func update(size: CGSize) {
  camera.update(size: size)
}

scene.update(size: size)

uniforms.viewMatrix = scene.camera.viewMatrix
uniforms.projectionMatrix = scene.camera.projectionMatrix

uniforms.viewMatrix =
  float4x4(translation: [0, 1.4, -4.0]).inverse

ground.rotation.y = sin(deltaTime)
house.rotation.y = sin(deltaTime)

camera.rotation.y = sin(deltaTime)
The camera rotating

var viewMatrix: float4x4 {
  (float4x4(translation: position) *
  float4x4(rotation: rotation)).inverse
}
The camera rotating around its center
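
The two view matrices above differ only in the order of translation and rotation, and a quick numeric check shows why that matters. The following is a minimal sketch in plain Swift that doesn't use the project's float4x4 helpers: Vec3 and rotateY are hypothetical stand-ins, and the sign of the rotation is an assumption. It compares where the camera ends up in world space under each ordering.

```swift
import Foundation

// Hypothetical stand-in for float3, so the sketch runs without simd.
struct Vec3 {
  var x: Float
  var y: Float
  var z: Float
}

func + (a: Vec3, b: Vec3) -> Vec3 {
  Vec3(x: a.x + b.x, y: a.y + b.y, z: a.z + b.z)
}

// Rotates a point about the world y axis (sign convention assumed).
func rotateY(_ p: Vec3, by angle: Float) -> Vec3 {
  Vec3(
    x: p.x * cos(angle) + p.z * sin(angle),
    y: p.y,
    z: -p.x * sin(angle) + p.z * cos(angle))
}

let cameraPosition = Vec3(x: 0, y: 1.4, z: -4)
let angle = Float.pi / 2
let localOrigin = Vec3(x: 0, y: 0, z: 0)

// translation * rotation: rotate first, then translate.
// The camera turns in place and stays at its position.
let turnedInPlace = rotateY(localOrigin, by: angle) + cameraPosition

// rotation * translation: translate first, then rotate.
// The camera's position is swung around the world origin.
let swungAround = rotateY(localOrigin + cameraPosition, by: angle)

print(turnedInPlace) // x and z remain 0 and -4
print(swungAround)   // x is close to -4 and z is close to 0 with this sign convention
```

With the second ordering, a quarter turn moves the camera from behind the scene to its side, which is the orbiting effect you see when rotation comes first in the matrix product.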

Input

There are various forms of input, such as game controllers, keyboards, mice and trackpads. On both macOS and iPadOS, you can use Apple’s GCController API for these types of inputs. With this API, you can respond to input events as they happen, or record the input state and poll it on every frame.

import GameController

class InputController {
  static let shared = InputController()

  // Keys that are currently held down.
  var keysPressed: Set<GCKeyCode> = []

  private init() {
    let center = NotificationCenter.default
    center.addObserver(
      forName: .GCKeyboardDidConnect,
      object: nil,
      queue: nil) { notification in
        let keyboard = notification.object as? GCKeyboard
        // Track each key as it's pressed and released.
        keyboard?.keyboardInput?.keyChangedHandler
          = { _, _, keyCode, pressed in
          if pressed {
            self.keysPressed.insert(keyCode)
          } else {
            self.keysPressed.remove(keyCode)
          }
        }
    }
  }
}

if InputController.shared.keysPressed.contains(.keyH) {
  print("H key pressed")
}
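
The code above combines two styles of input handling: an event handler records key changes as they happen, and the game loop polls the recorded state each frame. Here is a minimal sketch of that idea that runs without the GameController framework. Key and KeyState are hypothetical names used only for illustration.

```swift
// Hypothetical key type standing in for GCKeyCode.
enum Key: Hashable {
  case keyW, keyH, leftArrow
}

// Records which keys are currently held down.
final class KeyState {
  private(set) var keysPressed: Set<Key> = []

  // Event side: call this whenever a key changes state.
  func keyChanged(_ key: Key, pressed: Bool) {
    if pressed {
      keysPressed.insert(key)
    } else {
      keysPressed.remove(key)
    }
  }

  // Polling side: ask at any time whether a key is held.
  func isPressed(_ key: Key) -> Bool {
    keysPressed.contains(key)
  }
}

let keys = KeyState()
keys.keyChanged(.keyH, pressed: true)   // key down event
let heldDuringFrame = keys.isPressed(.keyH)
keys.keyChanged(.keyH, pressed: false)  // key up event
let heldAfterRelease = keys.isPressed(.keyH)
print(heldDuringFrame, heldAfterRelease) // true false
```

Storing the keys in a set means that several keys can be held at once, which matters when a player moves and turns at the same time.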

#if os(macOS)
  NSEvent.addLocalMonitorForEvents(
    matching: [.keyUp, .keyDown]) { _ in nil }
#endif

Delta Time

First, you’ll set up the left and right arrows on the keyboard to control the camera’s rotation.

var lastTime: Double = CFAbsoluteTimeGetCurrent()
timer += 0.005
let currentTime = CFAbsoluteTimeGetCurrent()
let deltaTime = Float(currentTime - lastTime)
lastTime = currentTime
scene.update(deltaTime: deltaTime)
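
To see why scaling by delta time matters, the following sketch simulates one second of rotation at two frame rates. It's a simplified model in plain Swift: the fixed frame durations are assumptions for illustration, whereas the app measures the elapsed time with CFAbsoluteTimeGetCurrent(). The speed matches the rotationSpeed value used in this chapter's Settings.

```swift
// Simulates holding a rotation key for a number of fixed-length frames.
// Returns the total rotation in radians.
func totalRotation(frames: Int, deltaTime: Float, speed: Float) -> Float {
  var rotation: Float = 0
  for _ in 0..<frames {
    // Scaling by deltaTime makes each step smaller at higher frame rates.
    rotation += speed * deltaTime
  }
  return rotation
}

let rotationSpeed: Float = 2.0  // radians per second

// One second of input at 60 fps and at 30 fps.
let at60 = totalRotation(frames: 60, deltaTime: 1.0 / 60.0, speed: rotationSpeed)
let at30 = totalRotation(frames: 30, deltaTime: 1.0 / 30.0, speed: rotationSpeed)

// Both totals are close to 2 radians, so the camera turns at the same
// rate regardless of how often the frame updates.
print(at60, at30)
```

Without the deltaTime factor, the 60 fps run would rotate twice as far as the 30 fps run in the same second.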

Camera Rotation

➤ Open GameScene.swift. In update(deltaTime:), replace:

camera.rotation.y = sin(deltaTime)

with:

camera.update(deltaTime: deltaTime)

enum Settings {
  static var rotationSpeed: Float { 2.0 }
  static var translationSpeed: Float { 3.0 }
  static var mouseScrollSensitivity: Float { 0.1 }
  static var mousePanSensitivity: Float { 0.008 }
}

protocol Movement where Self: Transformable {
}

extension Movement {
  // Returns the change in transform for this frame, based on the keys held down.
  func updateInput(deltaTime: Float) -> Transform {
    var transform = Transform()
    let rotationAmount = deltaTime * Settings.rotationSpeed
    let input = InputController.shared
    if input.keysPressed.contains(.leftArrow) {
      transform.rotation.y -= rotationAmount
    }
    if input.keysPressed.contains(.rightArrow) {
      transform.rotation.y += rotationAmount
    }
    return transform
  }
}

extension FPCamera: Movement { }

let transform = updateInput(deltaTime: deltaTime)
rotation += transform.rotation
Using arrow keys to rotate the camera

Camera Movement

You can implement movement the same way using the standard WASD keys: W moves forward, S moves backward, A moves left and D moves right.

var forwardVector: float3 {
  normalize([sin(rotation.y), 0, cos(rotation.y)])
}
Forward vectors

var rightVector: float3 {
  [forwardVector.z, forwardVector.y, -forwardVector.x]
}

var direction: float3 = .zero
if input.keysPressed.contains(.keyW) {
  direction.z += 1
}
if input.keysPressed.contains(.keyS) {
  direction.z -= 1
}
if input.keysPressed.contains(.keyA) {
  direction.x -= 1
}
if input.keysPressed.contains(.keyD) {
  direction.x += 1
}

let translationAmount = deltaTime * Settings.translationSpeed
if direction != .zero {
  direction = normalize(direction)
  transform.position += (direction.z * forwardVector
    + direction.x * rightVector) * translationAmount
}

position += transform.position
Moving around the scene using the keyboard
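
To get a feel for the forward and right vectors, it helps to evaluate them for a couple of rotations. This is a minimal sketch in plain Swift: Vec3 is a hypothetical stand-in for float3, and the free functions mirror the computed properties above rather than reproduce the project's code.

```swift
import Foundation

// Hypothetical stand-in for float3.
struct Vec3 {
  var x: Float
  var y: Float
  var z: Float
}

// Forward direction on the ground plane for a given yaw (rotation about y).
func forwardVector(yaw: Float) -> Vec3 {
  Vec3(x: sin(yaw), y: 0, z: cos(yaw))
}

// Right direction: the forward vector turned a quarter turn about y.
func rightVector(yaw: Float) -> Vec3 {
  let f = forwardVector(yaw: yaw)
  return Vec3(x: f.z, y: f.y, z: -f.x)
}

// Facing straight down the z axis.
let f0 = forwardVector(yaw: 0)         // (0, 0, 1)
let r0 = rightVector(yaw: 0)           // (1, 0, 0)

// After a quarter turn, forward points along x instead.
let f90 = forwardVector(yaw: .pi / 2)  // approximately (1, 0, 0)
let r90 = rightVector(yaw: .pi / 2)    // approximately (0, 0, -1)

// Holding W and D together gives the direction (1, 0, 1), whose length
// is the square root of 2. Normalizing it keeps diagonal movement from
// being faster than straight movement.
let diagonalLength = Float(2).squareRoot()
print(f0, r0, f90, r90, diagonalLength)
```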

Mouse and Trackpad Input

Players of macOS games generally use mouse or trackpad movement to look around the scene rather than arrow keys. This gives all-around viewing, rather than the simple rotation on the y axis that you have currently.

struct Point {
  var x: Float
  var y: Float
  static let zero = Point(x: 0, y: 0)
}

var leftMouseDown = false
var mouseDelta = Point.zero
var mouseScroll = Point.zero

center.addObserver(
  forName: .GCMouseDidConnect,
  object: nil,
  queue: nil) { notification in
    let mouse = notification.object as? GCMouse
    // 1: Record whether the left button is held down.
    mouse?.mouseInput?.leftButton.pressedChangedHandler = { _, _, pressed in
      self.leftMouseDown = pressed
    }
    // 2: Record the mouse movement since the last event.
    mouse?.mouseInput?.mouseMovedHandler = { _, deltaX, deltaY in
      self.mouseDelta = Point(x: deltaX, y: deltaY)
    }
    // 3: Record the scroll amount on both axes.
    mouse?.mouseInput?.scroll.valueChangedHandler = { _, xValue, yValue in
      self.mouseScroll.x = xValue
      self.mouseScroll.y = yValue
    }
}

Arcball Camera

In many apps, the camera rotates about a particular point. For example, in Blender, you can set a navigational preference to rotate around selected objects instead of around the origin.

struct ArcballCamera: Camera {
var camera = ArcballCamera()

Orbiting a Point

The camera needs a track to rotate about a point:

Orbiting a point

let minDistance: Float = 0.0
let maxDistance: Float = 20
var target: float3 = [0, 0, 0]
var distance: Float = 2.5

let input = InputController.shared
let scrollSensitivity = Settings.mouseScrollSensitivity
distance -= (input.mouseScroll.x + input.mouseScroll.y)
  * scrollSensitivity
distance = min(maxDistance, distance)
distance = max(minDistance, distance)
input.mouseScroll = .zero

if input.leftMouseDown {
  let sensitivity = Settings.mousePanSensitivity
  rotation.x += input.mouseDelta.y * sensitivity
  rotation.y += input.mouseDelta.x * sensitivity
  rotation.x = max(-.pi / 2, min(rotation.x, .pi / 2))
  input.mouseDelta = .zero
}

let rotateMatrix = float4x4(
  rotationYXZ: [-rotation.x, rotation.y, 0])
let distanceVector = float4(0, 0, -distance, 0)
let rotatedVector = rotateMatrix * distanceVector
position = target + rotatedVector.xyz
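
The last few lines place the camera on a sphere around the target. The sketch below works through the same idea with explicit trigonometry in plain Swift. Vec3, clamp and orbitPosition are hypothetical names, and the rotation sign conventions are assumptions that may differ from those in MathLibrary.swift.

```swift
import Foundation

// Hypothetical stand-in for float3.
struct Vec3 { var x, y, z: Float }

// Limits a value to the range lower...upper.
func clamp(_ value: Float, lower: Float, upper: Float) -> Float {
  max(lower, min(value, upper))
}

// Position of a camera orbiting `target` at `distance`, given
// pitch (rotation about x) and yaw (rotation about y).
func orbitPosition(target: Vec3, distance: Float, pitch: Float, yaw: Float) -> Vec3 {
  // Start with the offset (0, 0, -distance) and pitch it about x.
  let y1 = distance * sin(pitch)
  let z1 = -distance * cos(pitch)
  // Then swing the pitched offset about y.
  let x2 = z1 * sin(yaw)
  let z2 = z1 * cos(yaw)
  return Vec3(x: target.x + x2, y: target.y + y1, z: target.z + z2)
}

let target = Vec3(x: 0, y: 1.2, z: 0)

// A requested pitch of 2 radians is limited to a quarter turn,
// so the camera can't flip over the top of the target.
let pitch = clamp(2.0, lower: -Float.pi / 2, upper: Float.pi / 2)
let position = orbitPosition(target: target, distance: 2.5, pitch: pitch, yaw: 0.7)

// However the camera is rotated, it stays `distance` away from the target.
let dx = position.x - target.x
let dy = position.y - target.y
let dz = position.z - target.z
let measured = (dx * dx + dy * dy + dz * dz).squareRoot()
print(measured) // close to 2.5
```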

The lookAt Matrix

A lookAt matrix rotates the camera so it always points at a target. In MathLibrary.swift, you’ll find a float4x4 initializer, init(eye:center:up:). You pass it the camera’s current world position, the target and the camera’s up vector. In this app, the camera’s up vector is always [0, 1, 0].
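
A lookAt matrix is built from three perpendicular camera axes. The following sketch derives those axes in plain Swift. It's an illustration rather than the contents of MathLibrary.swift: Vec3 and lookAtAxes are hypothetical names, and a left-handed coordinate system is assumed.

```swift
// Hypothetical stand-in for float3.
struct Vec3 { var x, y, z: Float }

func - (a: Vec3, b: Vec3) -> Vec3 {
  Vec3(x: a.x - b.x, y: a.y - b.y, z: a.z - b.z)
}

func dot(_ a: Vec3, _ b: Vec3) -> Float {
  a.x * b.x + a.y * b.y + a.z * b.z
}

func cross(_ a: Vec3, _ b: Vec3) -> Vec3 {
  Vec3(x: a.y * b.z - a.z * b.y,
       y: a.z * b.x - a.x * b.z,
       z: a.x * b.y - a.y * b.x)
}

func normalize(_ v: Vec3) -> Vec3 {
  let length = dot(v, v).squareRoot()
  return Vec3(x: v.x / length, y: v.y / length, z: v.z / length)
}

// Camera axes for a lookAt transform (left-handed, assumed).
// Returns nil when eye and center coincide, because there is no view
// direction to derive. This sketch doesn't handle a view direction
// that is parallel to `up`.
func lookAtAxes(eye: Vec3, center: Vec3, up: Vec3)
  -> (right: Vec3, up: Vec3, forward: Vec3)? {
  let toTarget = center - eye
  if dot(toTarget, toTarget) == 0 { return nil }
  let forward = normalize(toTarget)
  let right = normalize(cross(up, forward))
  let cameraUp = cross(forward, right)
  return (right, cameraUp, forward)
}

let eye = Vec3(x: 0, y: 1.4, z: -4)
let center = Vec3(x: 0, y: 1.2, z: 0)
let worldUp = Vec3(x: 0, y: 1, z: 0)

let axes = lookAtAxes(eye: eye, center: center, up: worldUp)
let degenerate = lookAtAxes(eye: center, center: center, up: worldUp)

if let axes = axes {
  print(axes.right, axes.up, axes.forward)
}
print(degenerate == nil) // true
```

The nil case is the reason a view matrix that uses lookAt needs a fallback for when the camera’s position and target are the same point.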

var viewMatrix: float4x4 {
  let matrix: float4x4
  if target == position {
    matrix = (float4x4(translation: target) * float4x4(rotationYXZ: rotation)).inverse
  } else {
    matrix = float4x4(eye: position, center: target, up: [0, 1, 0])
  }
  return matrix
}
camera.distance = length(camera.position)
camera.target = [0, 1.2, 0]
Inside the barn

Orthographic Projection

So far, you’ve created cameras with perspective so that objects further back in your 3D scene appear smaller than the ones closer to the camera. Orthographic projection flattens three dimensions to two dimensions without any perspective distortion.

Orthographic projection
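
The difference between the two kinds of projection comes down to whether depth affects where a point lands on screen. The sketch below contrasts the two with deliberately simplified one-dimensional mappings. Both functions are hypothetical, and the scale parameter is an assumed stand-in for the field-of-view term of a full perspective matrix.

```swift
// Maps a view-space x coordinate to normalized device coordinates
// for an orthographic view volume spanning left...right.
// Depth plays no part in the result.
func orthographicX(_ x: Float, left: Float, right: Float) -> Float {
  2 * (x - left) / (right - left) - 1
}

// A simplified perspective mapping: x is divided by depth,
// so distant points move toward the center of the screen.
func perspectiveX(_ x: Float, z: Float, scale: Float) -> Float {
  scale * x / z
}

// A point two units to the right, in a view volume ten units wide.
let ortho = orthographicX(2, left: -5, right: 5)

// The same point at two different depths under perspective.
let perspectiveNear = perspectiveX(2, z: 2, scale: 1)
let perspectiveFar = perspectiveX(2, z: 8, scale: 1)

print(ortho)            // about 0.4 at any depth
print(perspectiveNear)  // 1.0
print(perspectiveFar)   // 0.25
```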

struct OrthographicCamera: Camera, Movement {
  var transform = Transform()
  var aspect: CGFloat = 1
  var viewSize: CGFloat = 10
  var near: Float = 0.1
  var far: Float = 100

  var viewMatrix: float4x4 {
    (float4x4(translation: position) *
    float4x4(rotation: rotation)).inverse
  }

  // The view volume is a box centered on the camera,
  // viewSize units high and viewSize * aspect units wide.
  var projectionMatrix: float4x4 {
    let rect = CGRect(
      x: -viewSize * aspect * 0.5,
      y: viewSize * 0.5,
      width: viewSize * aspect,
      height: viewSize)
    return float4x4(orthographic: rect, near: near, far: far)
  }
}
The orthographic projection frustum

mutating func update(size: CGSize) {
  aspect = size.width / size.height
}

mutating func update(deltaTime: Float) {
  let transform = updateInput(deltaTime: deltaTime)
  position += transform.position
  let input = InputController.shared
  let zoom = input.mouseScroll.x + input.mouseScroll.y
  viewSize -= CGFloat(zoom)
  input.mouseScroll = .zero
}

var camera = OrthographicCamera()

camera.position = [3, 2, 0]
camera.rotation.y = -.pi / 2
Orthographic viewing from the front

camera.position = [0, 2, 0]
camera.rotation.x = .pi / 2
Orthographic viewing from the top

Challenge

For your challenge, combine FPCamera and ArcballCamera into one PlayerCamera. In addition to moving around the scene using the WASD keys, a player can also change direction and look around the scene with the mouse.

var viewMatrix: float4x4 {
  let rotateMatrix = float4x4(
    rotationYXZ: [-rotation.x, rotation.y, 0])
  return (float4x4(translation: position) * rotateMatrix).inverse
}
Moving around the scene

Key Points

  • Scenes abstract game code and scene setup away from the rendering code.
  • Camera structures let you calculate the view and projection matrices separately from rendering the models.
  • On macOS and iPadOS, use Apple’s GCController API to process input from game controllers, keyboards and mice.
  • On iOS, GCVirtualController gives you onscreen D-pad controls.
  • For a first-person camera, calculate position and rotation from the player’s perspective.
  • An arcball camera orbits a target point.
  • An orthographic camera renders without perspective so that all vertices rendered to the 2D screen appear at the same distance from the camera.
© 2025 Kodeco Inc.