3. Comprehensive Guide to Tracing

Tracing is the foundation of the framework. It’s how you gain visibility into your chatbot’s internal workings. You can use the tracing functionality on its own, even without running the full test suite.

The @trace Decorator

The primary interface for tracing is the @tracer.trace() decorator. You apply it to any function or method you want to monitor.

@self.tracer.trace(step_name="authorize_user")
def _authorize(user_id: str, token: str):
    # ... logic ...
    return {"status": "ok"}

When this function is called, the decorator automatically captures:

  • name: The step_name you provided (“authorize_user”).

  • inputs: The arguments passed to the function ({"args": ["user123"], "kwargs": {"token": "xyz"}}).

  • outputs: The value returned by the function ({"status": "ok"}).

  • status: “success” if it completes, “error” if it raises an exception.

  • start_time / end_time: Timestamps for latency calculation.

  • run_id: The unique ID for the entire interaction.

The Tracer and Recorder Relationship

These two components work together. The Tracer is initialized for each unique request and linked to a Recorder.

  1. Your App Receives a Request: The API endpoint gets a session_id.

  2. Initialize Recorder: You create an instance of a Recorder class (e.g., LocalJsonRecorder).

  3. Initialize Tracer: You create a Tracer, passing it the recorder instance and the session_id (as run_id).

  4. Execute Logic: You call your business logic, which uses the @trace decorator.

  5. Record Data: The decorator sends the captured trace data to the recorder.record() method.

Here is the logic from our quick start example, annotated:

# In your /invoke endpoint
def invoke():
    # ... get session_id from request ...
    
    # 1. The framework passes the recorder config in the request body
    trace_config = data.get('trace_config', {})
    recorder_settings = trace_config.get('settings', {})
    
    # 2. Initialize the correct recorder based on the config
    recorder = LocalJsonRecorder(recorder_settings)
    
    # 3. Initialize the tracer for this specific run
    tracer = Tracer(recorder, run_id=session_id)
    
    # 4. Pass the tracer to your application logic
    bot = MockBot(tracer)
    
    # 5. When these methods are called, they will record data via the tracer
    agent = bot.route_request(question=question)
    result = bot.execute_agent(agent=agent)
    
    return jsonify({"final_answer": result['response']})

Advanced Tracing: Injecting Custom Metadata

Often, you want to record dynamic data that isn’t part of the function’s direct inputs or outputs. This is perfect for things like model confidence scores, tool parameters, or which LLM was used for a specific step.

You can do this by passing a special _extra_metadata dictionary when you call a traced function.

# In your chatbot logic
@self.tracer.trace(step_name="synthesize_response")
def _synthesize(agent_response: dict):
    # ... logic to call an LLM ...
    return {"final_answer": "Some generated text."}

# When you call the function
final_result = bot.synthesize_response(
    agent_response=agent_result,
    # This dictionary will be merged into the root of the trace data for this step
    _extra_metadata={
        "synthesis_details": {
            "model_id": "gpt-4o",
            "temperature": 0.2,
        },
        "confidence_score": 0.97
    }
)

The resulting trace data for this step will now include synthesis_details and confidence_score at the top level, making them easy to query.

Recorders In-Depth

Recorders are the pluggable storage backends for your trace data.

LocalJsonRecorder

This is the simplest recorder, perfect for local development and debugging.

  • How it works: It appends all trace data to a single JSON file, organized by run_id.

  • Configuration:

    tracing:
      recorder:
        type: "local_json"
        settings:
          filepath: "results/my_traces.json" # Path to the output file
    
  • Pros: No setup, easy to inspect the output.

  • Cons: Not suitable for production or concurrent writes at scale.

DynamoDBRecorder and Custom Schemas

This is the recommended recorder for scalable, cloud-based deployments.

  • How it works: It stores trace data in an AWS DynamoDB table. By default, it appends each trace step to a list within an item identified by the run_id.

  • Configuration:

    tracing:
      recorder:
        type: "dynamodb"
        settings:
          table_name: "my-chatbot-traces"
          region: "us-east-1"
          run_id_key: "sessionId" # The primary key of your table
    
  • Advanced Feature: schema_mapping This is an extremely powerful feature for making your trace data queryable. You can define a schema that maps values from your trace data (including custom metadata) to top-level attributes in your DynamoDB item. This allows you to create Global Secondary Indexes (GSIs) on these attributes for efficient lookups.

    1. Define the schema in your application code:

      # In your chatbot's app.py
      MY_CUSTOM_SCHEMA = {
          # DynamoDB Attribute Name : Path in trace_data dictionary (using dot notation)
          "step_status": "status",
          "final_agent_response": "outputs.final_answer",
          
          # Map the custom metadata we injected earlier!
          "synthesis_model": "synthesis_details.model_id",
          "routing_confidence": "confidence_score",
      
          # 'latency' is a special key that calculates the step's duration
          "latency_seconds": "latency", 
      }
      
    2. Pass the schema when initializing the recorder:

      # In your /invoke endpoint
      recorder = DynamoDBRecorder(
          settings=recorder_settings,
          schema_mapping=MY_CUSTOM_SCHEMA  # <-- Pass the schema map here
      )
      tracer = Tracer(recorder=recorder, run_id=session_id)
      

    Now, your DynamoDB items will have top-level attributes like step_status and routing_confidence, which you can index and query efficiently.

Tracing Integration Patterns

Pattern 1: Standard Application

This is the pattern we’ve used so far. You encapsulate your logic in a class, pass a tracer instance to it, and use the @tracer.trace decorator on its methods.

Pattern 2: LangGraph

LangGraph’s structure requires a slightly different approach. The key is to create the graph nodes inside a factory function that has access to the tracer.

# From examples/langgraph_app.py
from langgraph.graph import StateGraph, END

def create_graph_components(tracer):
    # Define nodes as functions and apply the decorator
    @tracer.trace(step_name="web_search_tool")
    def web_search(state: AgentState):
        state["documents"] = ["LangGraph is a library..."]
        return state

    # For LangGraph, you can pass metadata via the state dictionary
    @tracer.trace(step_name="generate_final_answer")
    def generate(state: AgentState):
        state['current_metadata'] = {"confidence_score": 0.98}
        state["generation"] = f"Based on research: {state['documents'][0]}"
        return state
    
    return web_search, generate

# In your main app logic:
# 1. Initialize the tracer
tracer = Tracer(recorder=recorder, run_id=session_id)
# 2. Create the graph nodes using the factory
web_search_node, generate_node = create_graph_components(tracer)
# 3. Build the graph with the traced nodes
workflow = StateGraph(AgentState)
workflow.add_node("web_search", web_search_node)
# ... etc ...

Pattern 3: LlamaIndex & Other Libraries

When using a library where the core logic is inside a pre-built object (like a LlamaIndex QueryEngine), the cleanest pattern is to wrap the call in your own traced function.

# From examples/llamaindex_app.py

# Assume 'query_engine' is an initialized LlamaIndex QueryEngine
def run_traced_query(engine, question, tracer):
    """Wraps the core LlamaIndex logic so we can trace it."""
    
    @tracer.trace(step_name="rag_query_pipeline")
    def _run_query(q: str):
        # This is the actual call to the library
        return engine.query(q)
    
    # Prepare metadata to inject into the trace
    metadata_to_inject = {"llm_used": "MockLLM", "retrieval_top_k": 2}

    # Call your decorated wrapper function
    response = _run_query(question, _extra_metadata=metadata_to_inject)
    return response

# In your main app logic:
tracer = Tracer(recorder=recorder, run_id=session_id)
rag_response = run_traced_query(query_engine, question, tracer)