Patterns

Extract structured data from unstructured LLM output

Overview

Patterns define how to extract structured segments (tool calls, reasoning, responses) from raw LLM text. Aegeantic ships several built-in pattern sets and also supports custom patterns, so you can adapt to different LLM output formats.

Pattern Types

TOOL

Extracts tool invocations from LLM output. Supports JSON and line-based formats.

<tool>{"name": "search", "arguments": {"query": "Python"}}</tool>

REASONING

Captures internal thought processes and reasoning steps.

<reasoning>I need to search for information first.</reasoning>

RESPONSE

Extracts the final response to the user.

<response>Here's what I found about Python...</response>
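To make the idea concrete, here is a minimal sketch of tag-based extraction using only Python's standard `re` module. It is not Aegeantic's implementation; the `TAGS` table and `extract_segments` function are hypothetical names used for illustration.

```python
import re

# Hypothetical tag table for illustration; not Aegeantic's internal representation.
TAGS = {
    "tool": ("<tool>", "</tool>"),
    "reasoning": ("<reasoning>", "</reasoning>"),
    "response": ("<response>", "</response>"),
}


def extract_segments(text: str) -> dict[str, list[str]]:
    """Return the stripped contents of every tagged span, grouped by segment name."""
    segments: dict[str, list[str]] = {}
    for name, (start, end) in TAGS.items():
        # re.escape keeps tag characters literal; DOTALL lets a span cross newlines.
        pattern = re.escape(start) + r"(.*?)" + re.escape(end)
        found = re.findall(pattern, text, flags=re.DOTALL)
        segments[name] = [item.strip() for item in found]
    return segments


raw = (
    "<reasoning>I need to search first.</reasoning>\n"
    '<tool>{"name": "search", "arguments": {"query": "Python"}}</tool>'
)
segments = extract_segments(raw)
print(segments["reasoning"])
print(segments["tool"])
```

Segment types with no match come back as empty lists, so callers can check each type without handling missing keys.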

Built-in Pattern Sets

Default Pattern Set

Uses XML-style tags for all segment types:

from agentic import create_default_pattern_set

patterns = create_default_pattern_set()
# Uses: <tool>, <reasoning>, <response>

JSON Tools Pattern Set

Uses fenced JSON code blocks for tools and XML-style tags for the other segment types:

from agentic import create_json_tools_pattern_set

patterns = create_json_tools_pattern_set()
# Tools: ```json...```
# Others: <reasoning>, <response>

XML Tools Pattern Set

Uses an alternative set of XML-style tags for every segment type:

from agentic import create_xml_tools_pattern_set

patterns = create_xml_tools_pattern_set()
# <tool_call>, <thinking>, <answer>

Backtick Tools Pattern Set

Uses triple-backtick blocks tagged tool for tool calls:

from agentic import create_backtick_tools_pattern_set

patterns = create_backtick_tools_pattern_set()
# Tools: ```tool...```

Custom Pattern Sets

Create your own pattern set to match your LLM's output format:

from agentic import PatternSet, Pattern, SegmentType, PatternRegistry, RocksDBStorage

storage = RocksDBStorage("./data")
pattern_registry = PatternRegistry(storage)

custom_patterns = PatternSet(
    name="custom",
    patterns=[
        Pattern(
            name="tool",
            start_tag="[TOOL:",
            end_tag=":TOOL]",
            segment_type=SegmentType.TOOL,
            expected_format="json"
        ),
        Pattern(
            name="thinking",
            start_tag="[THINK:",
            end_tag=":THINK]",
            segment_type=SegmentType.REASONING
        ),
        Pattern(
            name="answer",
            start_tag="[ANSWER:",
            end_tag=":ANSWER]",
            segment_type=SegmentType.RESPONSE
        )
    ],
    default_response_behavior="all_remaining"
)

pattern_registry.register_pattern_set(custom_patterns)
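Custom tags such as `[TOOL:` contain characters that are special in regular expressions. If you ever prototype matching for such tags outside the library, escape them first. The sketch below uses only the standard `re` module; `compile_tag_pattern` is a hypothetical helper, not part of Aegeantic.

```python
import re


def compile_tag_pattern(start_tag: str, end_tag: str) -> re.Pattern:
    """Build a non-greedy regex that captures the text between two literal tags."""
    # Without re.escape, "[TOOL:" would be read as an unterminated character set.
    return re.compile(re.escape(start_tag) + r"(.*?)" + re.escape(end_tag), re.DOTALL)


tool_re = compile_tag_pattern("[TOOL:", ":TOOL]")
text = 'Working on it. [TOOL:{"name": "search"}:TOOL] [ANSWER:Done.:ANSWER]'
matches = tool_re.findall(text)
print(matches)
```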

Pattern Options

Greedy vs Non-Greedy

Control whether a pattern stops at the first end tag it finds (non-greedy) or extends to the last one (greedy):

Pattern(
    name="tool",
    start_tag="<tool>",
    end_tag="</tool>",
    segment_type=SegmentType.TOOL,
    greedy=False  # Non-greedy: stops at first </tool>
)
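The difference is easiest to see with two tool blocks in one output. The sketch below illustrates the general greedy versus non-greedy distinction with Python's standard `re` module; it assumes Aegeantic's `greedy` flag follows the same convention and does not show its internals.

```python
import re

text = "<tool>first</tool> some prose <tool>second</tool>"

# Non-greedy: each match stops at the nearest closing tag.
non_greedy = re.findall(r"<tool>(.*?)</tool>", text, flags=re.DOTALL)

# Greedy: the match runs to the last closing tag, swallowing everything between.
greedy = re.findall(r"<tool>(.*)</tool>", text, flags=re.DOTALL)

print(non_greedy)
print(greedy)
```

Non-greedy matching yields two separate tool segments, while greedy matching yields a single segment that includes the prose and inner tags.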

Expected Format

For tool patterns, specify the expected content format:

Pattern(
    name="tool",
    start_tag="```json",
    end_tag="```",
    segment_type=SegmentType.TOOL,
    expected_format="json"  # Enforce JSON format
)

Default Response Behavior

Control how text that falls outside every pattern is handled:

PatternSet(
    name="strict",
    patterns=[...],
    default_response_behavior="explicit_only"
)
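The sketch below shows one plausible reading of the two values that appear on this page, "all_remaining" and "explicit_only". The semantics are an assumption inferred from the names, and `resolve_response` is a hypothetical standalone function, so check the library's actual behavior before relying on it.

```python
import re

# Assumed tag names, taken from the default pattern set.
NON_RESPONSE_SPAN = r"<(tool|reasoning)>.*?</\1>"


def resolve_response(text: str, behavior: str) -> str:
    """Illustrative only: derive the response text under an assumed behavior."""
    if behavior == "explicit_only":
        # Assumption: only text inside <response> tags counts as the response.
        parts = re.findall(r"<response>(.*?)</response>", text, flags=re.DOTALL)
        return "\n".join(part.strip() for part in parts)
    if behavior == "all_remaining":
        # Assumption: everything outside tool and reasoning spans is the response.
        leftover = re.sub(NON_RESPONSE_SPAN, "", text, flags=re.DOTALL)
        leftover = re.sub(r"</?response>", "", leftover)
        return leftover.strip()
    raise ValueError(f"unknown behavior: {behavior!r}")


sample = "<reasoning>Check the docs.</reasoning>Python is a language."
print(repr(resolve_response(sample, "explicit_only")))
print(repr(resolve_response(sample, "all_remaining")))
```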

Streaming Pattern Extraction

Patterns are detected in real time while the LLM output streams:

async for event in runner.step_stream(user_input):
    if event.type == "pattern_start":
        print(f"Detected: {event.pattern_name}")

    elif event.type == "pattern_content":
        # Partial content (if stream_pattern_content=True)
        print(event.content, end="")

    elif event.type == "pattern_end":
        # Complete content after end tag
        print(f"\nComplete: {event.full_content}")
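Streaming detection has to cope with tags that are split across chunks. The following is a minimal sketch of that idea, not the library's detector: `scan_stream` is a hypothetical generator that buffers text and yields simplified `(event_type, payload)` tuples modelled on the event names above.

```python
from typing import Iterable, Iterator, Tuple


def scan_stream(
    chunks: Iterable[str],
    name: str = "tool",
    start_tag: str = "<tool>",
    end_tag: str = "</tool>",
) -> Iterator[Tuple[str, str]]:
    """Yield pattern_start / pattern_end events, even when tags span chunks."""
    buffer = ""
    inside = False
    keep = len(start_tag) - 1  # longest tail that could be a partial start tag
    for chunk in chunks:
        buffer += chunk
        while True:
            if not inside:
                index = buffer.find(start_tag)
                if index == -1:
                    # Discard plain text but keep a tail that may begin a tag.
                    buffer = buffer[-keep:] if keep else ""
                    break
                buffer = buffer[index + len(start_tag):]
                inside = True
                yield ("pattern_start", name)
            else:
                index = buffer.find(end_tag)
                if index == -1:
                    break  # end tag not seen yet; wait for more text
                yield ("pattern_end", buffer[:index])
                buffer = buffer[index + len(end_tag):]
                inside = False


chunks = ["Let me check. <to", 'ol>{"name": "sea', 'rch"}</to', "ol> done"]
events = list(scan_stream(chunks))
print(events)
```

The sketch tracks a single pattern and emits no partial-content events; a fuller detector would handle several tag pairs at once.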

Tool Parsing Formats

JSON Format

<tool>
{
  "name": "calculate",
  "arguments": {
    "a": 5,
    "b": 3,
    "operation": "add"
  },
  "call_id": "calc_001"
}
</tool>
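Once the tags are stripped, a payload like the one above can be parsed with the standard `json` module. This is a sketch under stated assumptions: `parse_json_tool_call` is a hypothetical helper, and treating `call_id` as optional is a guess based on the line format omitting it.

```python
import json


def parse_json_tool_call(payload: str) -> dict:
    """Parse a JSON tool payload and check the fields shown in the example above."""
    data = json.loads(payload)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict) or not isinstance(data.get("name"), str):
        raise ValueError("tool call must be a JSON object with a string 'name'")
    arguments = data.get("arguments", {})
    if not isinstance(arguments, dict):
        raise ValueError("'arguments' must be a JSON object")
    # call_id is treated as optional here; that is an assumption of this sketch.
    return {"name": data["name"], "arguments": arguments, "call_id": data.get("call_id")}


call = parse_json_tool_call(
    '{"name": "calculate", "arguments": {"a": 5, "b": 3, "operation": "add"}, '
    '"call_id": "calc_001"}'
)
print(call["name"], call["arguments"])
```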

Line Format

<tool>
Name: calculate
Arguments:
{
  "a": 5,
  "b": 3
}
</tool>
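The line format can be read with a small amount of string handling plus `json` for the arguments block. The sketch below assumes the layout is exactly a `Name:` line followed by `Arguments:` and a JSON object, as in the example above; `parse_line_tool_call` is a hypothetical helper, not the library's parser.

```python
import json


def parse_line_tool_call(payload: str) -> dict:
    """Parse 'Name: ...' followed by 'Arguments:' and a JSON object."""
    header, separator, args_text = payload.partition("Arguments:")
    if not separator:
        raise ValueError("missing 'Arguments:' section")
    name = None
    for line in header.splitlines():
        if line.strip().startswith("Name:"):
            name = line.split(":", 1)[1].strip()
    if not name:
        raise ValueError("missing 'Name:' line")
    # An empty arguments block is treated as no arguments (an assumption).
    arguments = json.loads(args_text) if args_text.strip() else {}
    return {"name": name, "arguments": arguments}


payload = """
Name: calculate
Arguments:
{
  "a": 5,
  "b": 3
}
"""
call = parse_line_tool_call(payload)
print(call)
```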

Error Handling

Patterns handle malformed content gracefully; parse errors and incomplete patterns are exposed on the step result:

result = runner.step(user_input)

# Check for parse errors
if result.segments.parse_errors:
    for key, error_text in result.segments.parse_errors.items():
        print(f"Parse error in {key}: {error_text}")

# Check for malformed patterns (streaming only)
if result.partial_malformed_patterns:
    for pattern_name, content in result.partial_malformed_patterns.items():
        print(f"Incomplete {pattern_name}: {content}")
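An incomplete pattern is typically a start tag whose end tag never arrived, for example when a stream is cut off. The sketch below shows one way to detect that case in plain Python; `find_unterminated` is a hypothetical helper and says nothing about how the library populates `partial_malformed_patterns`.

```python
def find_unterminated(text: str, tags: dict[str, tuple[str, str]]) -> dict[str, str]:
    """Return {name: trailing_content} for start tags with no later end tag."""
    incomplete: dict[str, str] = {}
    for name, (start, end) in tags.items():
        last_start = text.rfind(start)
        if last_start == -1:
            continue  # this pattern never opened
        body_start = last_start + len(start)
        if text.find(end, body_start) == -1:
            # The pattern opened but never closed: report what was captured so far.
            incomplete[name] = text[body_start:]
    return incomplete


tags = {
    "tool": ("<tool>", "</tool>"),
    "reasoning": ("<reasoning>", "</reasoning>"),
}
truncated = '<reasoning>Done thinking.</reasoning><tool>{"name": "sea'
incomplete = find_unterminated(truncated, tags)
print(incomplete)
```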

Best Practices

Tip: Include pattern examples in your system prompt to guide the LLM's output format.
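One way to follow this tip is to generate the format instructions from the same examples you expect the model to produce. The helper below is a hypothetical sketch in plain Python; the wording of the instructions is illustrative and should be tuned for your model.

```python
def build_format_instructions(examples: dict[str, str]) -> str:
    """Render one example line per segment type for use in a system prompt."""
    lines = ["Format every reply using these tags:"]
    for label, example in examples.items():
        lines.append(f"- {label}: {example}")
    return "\n".join(lines)


system_prompt = build_format_instructions(
    {
        "tool call": '<tool>{"name": "search", "arguments": {"query": "..."}}</tool>',
        "reasoning": "<reasoning>Why this step is needed.</reasoning>",
        "final answer": "<response>The answer for the user.</response>",
    }
)
print(system_prompt)
```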

Next Steps