Implement Distributed Tracing in Go

In modern microservice architectures, understanding the flow of requests across different services is crucial for debugging, performance monitoring, and identifying bottlenecks. Distributed tracing provides a way to track a request as it propagates through a system. This challenge asks you to implement a basic distributed tracing system in Go.

Problem Description

Your task is to implement a simplified distributed tracing mechanism. You will create a Tracer interface and at least one concrete implementation that allows you to:

  1. Start a span: A span represents a single operation or unit of work within a trace. Each span should have a unique ID, the ID of the trace it belongs to, a parent ID (if it's not the root span), a name, and start and end timestamps.
  2. End a span: This marks the completion of an operation and records the end timestamp.
  3. Record span details: Spans should be collected and made available for analysis.

The tracing system must work across multiple Go functions, simulating a distributed environment in which a single request triggers calls to several functions or (simulated) services.

Key Requirements:

  • Define a Tracer interface.
  • Implement at least one concrete Tracer that collects spans in memory.
  • Each span must have:
    • TraceID (unique identifier for the entire trace)
    • SpanID (unique identifier for this specific span)
    • ParentID (SpanID of the parent span; empty for root spans, or 0 if you use numeric IDs)
    • Name (descriptive name of the operation)
    • StartTime (timestamp when the span started)
    • EndTime (timestamp when the span ended)
    • Duration (calculated as EndTime - StartTime)
  • The Tracer should manage the generation of unique IDs for traces and spans.
  • The tracing context (specifically TraceID, SpanID, and ParentID) must be propagated correctly across function calls.
  • The implemented Tracer should provide a way to retrieve all recorded spans.
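
One possible shape for these types is sketched below. It is only an illustration, not a required design: the names ActiveSpan, Spans, finish, and demoSpan are assumptions that do not appear in the problem statement, and the StartSpan signature is inferred from the calls in the examples.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Span mirrors the fields listed in the requirements above.
type Span struct {
	TraceID   string // shared by every span in one trace
	SpanID    string // unique to this span
	ParentID  string // empty for a root span
	Name      string
	StartTime time.Time
	EndTime   time.Time
	Duration  time.Duration // EndTime.Sub(StartTime), filled in when the span ends
}

// ActiveSpan is a hypothetical handle for a span that is still running.
type ActiveSpan interface {
	End()
}

// Tracer is one possible interface shape. The method set is an assumption
// chosen to match the StartSpan calls used in the examples.
type Tracer interface {
	StartSpan(ctx context.Context, name string) (context.Context, ActiveSpan)
	Spans() []Span
}

// finish returns a copy of s with EndTime and Duration filled in.
func finish(s Span, end time.Time) Span {
	s.EndTime = end
	s.Duration = end.Sub(s.StartTime)
	return s
}

// demoSpan builds a root span that lasts exactly 50 ms (fixed timestamps).
func demoSpan() Span {
	start := time.Date(2024, time.January, 1, 0, 0, 0, 0, time.UTC)
	root := Span{TraceID: "trace-1", SpanID: "span-1", Name: "main_operation", StartTime: start}
	return finish(root, start.Add(50*time.Millisecond))
}

func main() {
	s := demoSpan()
	fmt.Println(s.Name, s.Duration) // prints "main_operation 50ms"
}
```

Storing Duration as a field matches the expected output in the examples. Deriving it from a method on Span would work equally well and would avoid the two values drifting apart.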

Expected Behavior:

When a trace is initiated, a root span is created. Subsequent operations that are traced will create child spans, linking them to their parent span via ParentID. All spans belonging to the same trace will share the same TraceID.

Edge Cases:

  • Handling the root span (no parent).
  • Ensuring unique IDs are generated.
  • Passing the context through a called function that does not start a span of its own, so that the trace information is preserved and no extra span is recorded.
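
The parent/child linking and the edge cases above can be sketched with context.Context alone. This is a minimal illustration under stated assumptions: spanContext, ctxKey, startSpanContext, passThrough, and demo are hypothetical names, and the sequential counter is a toy ID source that is not safe for concurrent use.

```go
package main

import (
	"context"
	"fmt"
	"strconv"
)

// spanContext is the tracing information carried inside a context.Context.
type spanContext struct {
	TraceID, SpanID, ParentID string
}

// ctxKey is an unexported key type, so other packages cannot collide with it.
type ctxKey struct{}

// nextID is a toy sequential ID source, for illustration only.
var nextID int

func newID(prefix string) string {
	nextID++
	return prefix + strconv.Itoa(nextID)
}

// startSpanContext derives the identifiers for a new span from ctx.
// With no span in ctx it starts a new trace (root span, empty ParentID).
// Otherwise it reuses the TraceID and takes the current SpanID as ParentID.
func startSpanContext(ctx context.Context) (context.Context, spanContext) {
	sc := spanContext{SpanID: newID("span-")}
	if parent, ok := ctx.Value(ctxKey{}).(spanContext); ok {
		sc.TraceID = parent.TraceID
		sc.ParentID = parent.SpanID
	} else {
		sc.TraceID = newID("trace-")
	}
	return context.WithValue(ctx, ctxKey{}, sc), sc
}

// passThrough does no tracing itself; it only forwards ctx to a traced callee.
func passThrough(ctx context.Context) spanContext {
	_, sc := startSpanContext(ctx) // stands in for a traced callee
	return sc
}

// demo builds root -> child -> (untraced function) -> grandchild.
func demo() (root, child, grandchild spanContext) {
	ctx := context.Background()
	ctx, root = startSpanContext(ctx)
	ctx, child = startSpanContext(ctx)
	grandchild = passThrough(ctx)
	return root, child, grandchild
}

func main() {
	root, child, grandchild := demo()
	fmt.Printf("root:       %+v\n", root)
	fmt.Printf("child:      %+v\n", child)
	fmt.Printf("grandchild: %+v\n", grandchild)
}
```

Because the untraced function simply forwards the context it received, the span started below it is linked to the nearest traced ancestor and keeps the same TraceID.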

Examples

Example 1: Simple Sequential Tracing

// Assume a Tracer implementation named 'myTracer' is initialized.

import (
    "context"
    "time"
)

func main() {
    ctx := context.Background()
    tracedCtx, span := myTracer.StartSpan(ctx, "main_operation")
    defer span.End()

    // Simulate work
    time.Sleep(50 * time.Millisecond)

    // Discard the returned context so that the next span is also a direct child of main_operation.
    _, childSpan := myTracer.StartSpan(tracedCtx, "sub_operation_1")
    time.Sleep(30 * time.Millisecond)
    childSpan.End() // end explicitly so the span covers only its own work

    _, anotherChildSpan := myTracer.StartSpan(tracedCtx, "sub_operation_2")
    time.Sleep(20 * time.Millisecond)
    anotherChildSpan.End()
}

// Expected output (simplified representation of collected spans):
// [
//   {TraceID: "...", SpanID: "...", ParentID: "", Name: "main_operation", StartTime: ..., EndTime: ..., Duration: ...},
//   {TraceID: "...", SpanID: "...", ParentID: "...", Name: "sub_operation_1", StartTime: ..., EndTime: ..., Duration: ...},
//   {TraceID: "...", SpanID: "...", ParentID: "...", Name: "sub_operation_2", StartTime: ..., EndTime: ..., Duration: ...}
// ]
// Note: TraceID will be the same for all spans. SpanIDs will be unique. ParentID of sub_operation_1 and sub_operation_2 will be the SpanID of main_operation.

Example 2: Nested Tracing

// Assume a Tracer implementation named 'myTracer' is initialized.

import (
    "context"
    "time"
)

func processData(ctx context.Context) {
    ctx, span := myTracer.StartSpan(ctx, "process_data")
    defer span.End()
    time.Sleep(40 * time.Millisecond)
    fetchRecords(ctx)
}

func fetchRecords(ctx context.Context) {
    ctx, span := myTracer.StartSpan(ctx, "fetch_records")
    defer span.End()
    time.Sleep(25 * time.Millisecond)
}

func main() {
    ctx := context.Background()
    ctx, rootSpan := myTracer.StartSpan(ctx, "root_request")
    defer rootSpan.End()

    processData(ctx)
}

// Expected output (simplified representation of collected spans):
// [
//   {TraceID: "...", SpanID: "...", ParentID: "", Name: "root_request", StartTime: ..., EndTime: ..., Duration: ...},
//   {TraceID: "...", SpanID: "...", ParentID: "...", Name: "process_data", StartTime: ..., EndTime: ..., Duration: ...},
//   {TraceID: "...", SpanID: "...", ParentID: "...", Name: "fetch_records", StartTime: ..., EndTime: ..., Duration: ...}
// ]
// Note: TraceID is the same for all spans. ParentID of 'process_data' is the SpanID of 'root_request'. ParentID of 'fetch_records' is the SpanID of 'process_data'.

Example 3: Un-traced Function Call

// Assume a Tracer implementation named 'myTracer' is initialized.

import (
    "context"
    "fmt"
    "time"
)

func externalApiCall(ctx context.Context) {
    // This function is NOT using myTracer.StartSpan
    fmt.Println("Simulating an external API call...")
    time.Sleep(15 * time.Millisecond)
}

func main() {
    ctx := context.Background()
    ctx, span := myTracer.StartSpan(ctx, "main_task")
    defer span.End()

    time.Sleep(30 * time.Millisecond)
    externalApiCall(ctx) // This call does not start a new traced span
    time.Sleep(10 * time.Millisecond)
}

// Expected output (simplified representation of collected spans):
// [
//   {TraceID: "...", SpanID: "...", ParentID: "", Name: "main_task", StartTime: ..., EndTime: ..., Duration: ...}
// ]
// Explanation: Even though externalApiCall is called, it doesn't initiate a new span. The context is passed, but no new tracing operation begins within it.

Constraints

  • Span and Trace IDs can be represented as strings.
  • Timestamps should be time.Time objects.
  • The in-memory storage for spans should be safe for concurrent access. This is not strictly required for a basic implementation, but it is good practice.
  • The number of spans collected in a single trace should not exceed 1000.
  • The depth of nested spans should not exceed 50.
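
The concurrent-access constraint could be met with a mutex around the span slice. The sketch below is one way to do it, not the only one: spanStore, Add, Spans, and recordConcurrently are hypothetical names, and Span is trimmed to two fields to keep the example short.

```go
package main

import (
	"fmt"
	"sync"
)

// Span is trimmed to two fields to keep the sketch short.
type Span struct {
	SpanID string
	Name   string
}

// spanStore is a hypothetical in-memory collector guarded by a mutex.
type spanStore struct {
	mu    sync.Mutex
	spans []Span
}

// Add appends one finished span; safe to call from many goroutines.
func (s *spanStore) Add(sp Span) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.spans = append(s.spans, sp)
}

// Spans returns a copy, so callers cannot modify the stored slice.
func (s *spanStore) Spans() []Span {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make([]Span, len(s.spans))
	copy(out, s.spans)
	return out
}

// recordConcurrently adds n spans from n goroutines and returns the store.
func recordConcurrently(n int) *spanStore {
	store := &spanStore{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			store.Add(Span{SpanID: fmt.Sprint(i), Name: "op"})
		}(i)
	}
	wg.Wait()
	return store
}

func main() {
	store := recordConcurrently(100)
	got := store.Spans()
	got[0].Name = "changed" // modifies only the caller's copy
	fmt.Println(len(got), store.Spans()[0].Name) // prints "100 op"
}
```

Returning a copy from Spans costs one allocation per call, but it keeps the stored spans out of reach of callers. Running the program with the race detector (go run -race) is a quick way to check the locking.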

Notes

  • Consider using context.Context to propagate tracing information. You'll need to store and retrieve trace/span context from the context.
  • For generating unique IDs, you can use libraries like github.com/google/uuid or simpler timestamp-based approaches if you ensure uniqueness.
  • A span should be treated as immutable once it has ended.
  • Think about how to handle the absence of tracing information in the context when StartSpan is called.
  • For calculating duration, EndTime.Sub(StartTime) will be useful.
  • When retrieving spans, you should get a copy or a read-only view to prevent external modification.