Spans or withSpan

In lesson 2, your tracing implementation returned spans and left developers fully responsible for ending each span, setting an error status when an error is thrown and, most importantly, setting TracingContext.$activeSpan with TaskLocal operations.

It’s very likely that an engineer will forget to end a span, set an error, or set the active span, so the child spans end up disconnected. You might be surprised how easily this can happen, especially when the code contains a guard that returns in its else branch.
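
To see how easily this happens, here is a minimal sketch. DemoSpan and loadItem are hypothetical stand-ins made up for illustration, not the OpenTelemetry types. The guard exits before the span is ended:

```swift
// Stand-in span for illustration only; not the OpenTelemetry type.
final class DemoSpan {
  let name: String
  private(set) var isEnded = false

  init(name: String) { self.name = name }

  func end() { isEnded = true }
}

// Manual approach: the caller owns the span's life cycle.
func loadItem(id: Int?, span: DemoSpan) -> String? {
  guard let id = id else {
    return nil  // Early exit: span.end() below is never reached.
  }
  let result = "Item \(id)"
  span.end()
  return result
}

let leakedSpan = DemoSpan(name: "loadItem")
_ = loadItem(id: nil, span: leakedSpan)
print(leakedSpan.isEnded)  // false

let closedSpan = DemoSpan(name: "loadItem")
_ = loadItem(id: 7, span: closedSpan)
print(closedSpan.isEnded)  // true
```

Nothing warns you about the first case: the code compiles and runs, and the span simply never closes.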

What if you designed your APIs to take ownership of all that? Developers would only pass the operation they want to execute as a closure, and your library would be responsible for configuring the span, connecting it to other spans, executing the operation, handling errors, automatically attaching logs to the span, and finally closing it when the operation is done.

Not only that, but you can take it further by attaching the file name and function name to each span to facilitate debugging!

The decision for this shift in the design of your library isn’t technical. It emphasizes developer experience and where the effort of the developers using your library goes: how much work your library requires of them to use it properly, and how many things can break if they don’t.

The idea for the APIs you’ll implement in this segment comes from Apple’s Swift Distributed Tracing library. The API simply looks like this:

withSpan { span in
  // Do stuff...
}

Integrating your library takes almost no effort and barely adds complexity to your code. The span parameter is there in case developers want to add their own attributes or other information to the span throughout the code. They can also ignore it completely:

withSpan { _ in
  // Do stuff...
}

A simple signature for this function looks like this:

public func withSpan<T>(_ operation: (SpanType) throws -> T) rethrows -> T {
}

You can add this directly as a global function. No classes, no objects to call this on.

But what about the rest of the properties? The operation name to set on the span, and the file name and function name of the call site, sound fun too. How does this work?
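
The mechanism behind the file and function names is default arguments: when #function, #fileID and #line appear as default parameter values, Swift evaluates them where the function is called. A minimal sketch, using the hypothetical names CallSite and captureCallSite:

```swift
// Hypothetical helper type for illustration.
struct CallSite {
  let function: String
  let file: String
  let line: UInt
}

// The defaults are filled in where this function is *called*, not where it's declared.
func captureCallSite(function: String = #function,
                     file: String = #fileID,
                     line: UInt = #line) -> CallSite {
  CallSite(function: function, file: file, line: line)
}

func fetchObjects() -> CallSite {
  captureCallSite()  // No arguments: the compiler supplies this function's details.
}

let site = fetchObjects()
print(site.function)  // fetchObjects()
print(site.file, site.line)
```

The caller never passes these values explicitly, which is what keeps the call as short as withSpan("name") { ... }.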

Open OTelSpans.swift and add this function at the end of the file, after the OTelSpans class and not inside it:

public func withSpan<T>(_ operationName: String,  // 1
                        scopeName: String = "OTelSpans",
                        ofKind kind: SpanKind = .internal,
                        function: String = #function,
                        file: String = #fileID,
                        line: UInt = #line,
                        _ operation: (SpanType) async throws -> T
) async rethrows -> T {
  let tracer = OTelSpans.tracer(scopeName: scopeName)   // 2
  let spanBuilder = tracer.spanBuilder(spanName: operationName)   // 3
  let span = spanBuilder.startSpan()  // 4
  
  defer {   // 5
    span.end()
  }
  
  return try await TracingContext.$activeSpan.withValue(span) {   // 6
    try await operation(span)   // 7
  }
}
  1. Let’s start with the function signature:
    • It takes the operation name, a string, as a required parameter. This means a call looks like withSpan("Operation_Name") { // do stuff }. You can give the operation name a default value to simplify the call, but the spans will then be harder to tell apart in Grafana. You can decide on the tradeoff. :]
    • The second parameter, the scope name, is also a string, with a default value of “OTelSpans” so you don’t need to pass it.
    • The span kind, with a default value of .internal.
    • The function name and file name are strings, and the line number is a UInt. Their default values are #function, #fileID and #line respectively. Those three are macros that are evaluated at the call site. You can check their documentation for details.
    • The async operation, or closure, to wrap the span around. It takes the span as a parameter and returns a generic type T. The whole withSpan function returns whatever this operation returns.
  2. Create a tracer from OTelSpans with the scope name.
  3. Create a span builder.
  4. Start a span from the builder.
  5. It’s best to use defer to end a span, so that a guard statement or a thrown error can’t skip it by preventing the rest of the function from executing. This is more common than you might think, and defer is perfect for it.
  6. Use the TaskLocal feature to set the span as the active span.
  7. Execute the operation in the TaskLocal’s scope.

You haven’t yet set the attributes for kind, function, file and line on the span. Do that now.

Add this enum and extension at the end of the file:

public enum OtelSemanticAttributes: String {
  case sourceFunction = "source.function"
  case sourceLine = "source.line"
  case sourceFile = "source.file"
}

extension SpanBuilderBase {
  public func setAttribute(key: OtelSemanticAttributes, value: String) -> Self {
    setAttribute(key: key.rawValue, value: .string(value))
  }
}

It’s a good habit to define constants as enums to keep code organized. The extension lets you set attributes with keys from the enum instead of raw strings. Consider it nitpicking. :]
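
As a rough illustration of what the typed keys buy you, here is a self-contained sketch. DemoBuilder is a hypothetical stand-in for the real span builder and only stores the attributes in a dictionary:

```swift
// Keys from the lesson; DemoBuilder is a hypothetical stand-in for the real span builder.
enum OtelSemanticAttributes: String {
  case sourceFunction = "source.function"
  case sourceLine = "source.line"
  case sourceFile = "source.file"
}

final class DemoBuilder {
  private(set) var attributes: [String: String] = [:]

  // String-keyed setter: any typo in the key goes unnoticed.
  @discardableResult
  func setAttribute(key: String, value: String) -> Self {
    attributes[key] = value
    return self
  }

  // Typed overload: a misspelled key becomes a compile-time error.
  @discardableResult
  func setAttribute(key: OtelSemanticAttributes, value: String) -> Self {
    setAttribute(key: key.rawValue, value: value)
  }
}

let builder = DemoBuilder()
  .setAttribute(key: .sourceFunction, value: "fetchObjects()")
  .setAttribute(key: .sourceLine, value: String(42))

print(builder.attributes["source.function"] ?? "missing")  // fetchObjects()
print(builder.attributes["source.line"] ?? "missing")      // 42
```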

Go back to your withSpan function and replace the creation of the builder with the following:

let spanBuilder = tracer.spanBuilder(spanName: operationName)
  .setSpanKind(spanKind: kind)
  .setAttribute(key: .sourceFunction, value: function)
  .setAttribute(key: .sourceFile, value: file)
  .setAttribute(key: .sourceLine, value: String(line))

if let parentSpan = TracingContext.activeSpan {
  spanBuilder.setParent(parentSpan.context)
}

To use your new function, go to TheMetStore.swift and wrap the whole function body in a call to withSpan:

func fetchObjects(for queryTerm: String) async throws {
  try await withSpan("fetchObjects") { mainSpan in
    // existing function body
  }
}

In TheMetService.swift do the same for getObject(from:) and getObjectIDs(from:):

func getObjectIDs(from queryTerm: String) async throws -> ObjectIDs? {
  try await withSpan("getObjectIDs") { span in
    // existing body
  }
}

// ...

func getObject(from objectID: Int) async throws -> Object? {
  try await withSpan("getObject") { span in
    // existing body
  }
}

Build and run the app. Open Grafana Tracing. These are the details of the trace:

Grafana Tracing showing the details of the trace

One last thing you need to take care of inside your new function… Errors!

Wrap the TaskLocal call, along with the passed operation, in a do-catch block:

do {
  return try await TracingContext.$activeSpan.withValue(span) {
    try await operation(span)
  }
} catch {
  OTelLogs.sendLog(scope: operationName + "-ErrorLogging",
                   message: error.localizedDescription,
                   span: span)
  span.status = .error(description: "\(error)")
  throw error
}

If there is an error, you create a log entry on the span, set the span’s status to .error, then rethrow the error so engineers can still handle it. Forgetting this last step would hide the failure from the callers.
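
The interplay between defer, the catch block and the rethrow can be sketched in isolation. This is a simplified, synchronous version that assumes the hypothetical stand-in types DemoSpan, DemoStatus and DemoError, and it leaves out the logging call:

```swift
// Stand-in types for illustration only; not the OpenTelemetry API.
enum DemoStatus: Equatable {
  case unset
  case error(description: String)
}

final class DemoSpan {
  var status: DemoStatus = .unset
  private(set) var isEnded = false
  func end() { isEnded = true }
}

struct DemoError: Error {}

// Record the failure, rethrow it, and always end the span.
func withDemoSpan<T>(_ span: DemoSpan, _ operation: (DemoSpan) throws -> T) rethrows -> T {
  defer { span.end() }  // Runs on both the success and the failure path.
  do {
    return try operation(span)
  } catch {
    span.status = .error(description: "\(error)")
    throw error  // The caller still gets to handle the failure.
  }
}

func parse(_ text: String) throws -> Int {
  guard let value = Int(text) else { throw DemoError() }
  return value
}

let span = DemoSpan()
var callerSawError = false
do {
  _ = try withDemoSpan(span) { _ in try parse("abc") }
} catch {
  callerSawError = true
}

print(callerSawError)          // true
print(span.isEnded)            // true
print(span.status != .unset)   // true
```

The span ends with an error status, and the caller’s own catch block still runs.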

There is one small inconvenience in your APIs. Every span will be connected to another, except for the very first one. It’s not possible to define a span that isn’t connected to anything as long as a value exists in TracingContext.activeSpan. You can easily remedy this by adding a Boolean parameter to skip connecting the span.
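
One way such a flag could look is sketched below. The parameter name detached, and the DemoSpan and DemoContext types, are assumptions made for this illustration and not part of the lesson’s project:

```swift
// Hypothetical stand-ins for illustration; a span here only records its parent's name.
struct DemoSpan: Sendable {
  let name: String
  let parentName: String?
}

enum DemoContext {
  @TaskLocal static var activeSpan: DemoSpan?
}

// `detached` is an assumed parameter name: when true, the active span is ignored.
func withDemoSpan<T>(_ name: String,
                     detached: Bool = false,
                     _ operation: (DemoSpan) throws -> T) rethrows -> T {
  let parent = detached ? nil : DemoContext.activeSpan
  let span = DemoSpan(name: name, parentName: parent?.name)
  return try DemoContext.$activeSpan.withValue(span) {
    try operation(span)
  }
}

let (child, root) = withDemoSpan("fetchObjects") { _ -> (DemoSpan, DemoSpan) in
  let child = withDemoSpan("getObjectIDs") { $0 }
  let root = withDemoSpan("backgroundSync", detached: true) { $0 }
  return (child, root)
}

print(child.parentName ?? "none")  // fetchObjects
print(root.parentName ?? "none")   // none
```

The default of false keeps the common case unchanged, so existing calls continue to connect to the active span.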

OpenTelemetry approaches this a little differently. The framework includes OpenTelemetryConcurrency, which provides out-of-the-box TaskLocal support in almost the same way as your function.

Its span builder has a withStartedSpan method that takes the operation to execute. It’s similar to what you implemented already, except for the function and file names. There is another method called withActiveSpan that doesn’t connect the span to a parent. The decision behind having a dedicated method is to make the code more readable.

Enabling Swift concurrency support is very simple:

import OpenTelemetryConcurrency

public typealias SpanType = any SpanBase
public typealias TracerType = TracerWrapper

public class OTelSpans {
  typealias OpenTelemetry = OpenTelemetryConcurrency.OpenTelemetry

Then inside the initializer, after you register the tracer provider, execute this line:

OpenTelemetry.registerDefaultConcurrencyContextManager()

This will automatically enable span propagation through Swift concurrency with TaskLocal.

The type aliases for SpanType and TracerType help you reduce your code changes when you want to enable or disable concurrency:

  • Without concurrency, spans are of type any Span, and tracers are any Tracer.
  • With concurrency, they’re any SpanBase and TracerWrapper.

In the final project in this lesson’s materials, you’ll find OTelSpans.swift, which already has the implementation of OpenTelemetry Concurrency and the global functions withSpan and withActiveSpan, so your APIs are as easy to use as before. Replace the contents of OTelSpans.swift in your project with this file.

Build and run the app to make sure nothing has changed.

The main benefit of using this framework is that any span created with the old approach of returning the span directly is automatically connected to a parent span. Using withSpan isn’t necessary, as this is handled deep inside the OpenTelemetry framework.
