Patterns
Extract structured data from unstructured LLM output
Overview
Patterns define how to extract structured segments (tools, reasoning, responses) from raw LLM text. Aegeantic supports custom patterns, allowing you to adapt to different LLM output formats.
Pattern Types
TOOL
Extracts tool invocations from LLM output. Supports JSON and line-based formats.
<tool>{"name": "search", "arguments": {"query": "Python"}}</tool>
REASONING
Captures internal thought processes and reasoning steps.
<reasoning>I need to search for information first.</reasoning>
RESPONSE
Extracts the final response to the user.
<response>Here's what I found about Python...</response>
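To make the three segment types concrete, the sketch below pulls each of them out of a single raw output with plain regular expressions. It is a standalone illustration, not Aegeantic's implementation: `extract_segments` and `TAGS` are hypothetical names, and the tag table simply mirrors the XML-style examples above.

```python
import json
import re

# Hypothetical tag table mirroring the XML-style tags shown above.
TAGS = {"tool": "tool", "reasoning": "reasoning", "response": "response"}


def extract_segments(raw: str) -> dict:
    """Return every tagged span in `raw`, grouped by segment type."""
    segments = {}
    for segment_type, tag in TAGS.items():
        # Non-greedy match so each span ends at its nearest closing tag.
        pattern = re.compile(rf"<{tag}>(.*?)</{tag}>", re.DOTALL)
        segments[segment_type] = [m.strip() for m in pattern.findall(raw)]
    return segments


raw_output = (
    "<reasoning>I need to search for information first.</reasoning>\n"
    '<tool>{"name": "search", "arguments": {"query": "Python"}}</tool>\n'
    "<response>Here's what I found about Python...</response>"
)
segments = extract_segments(raw_output)
tool_call = json.loads(segments["tool"][0])
print(tool_call["name"])  # search
```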
Built-in Pattern Sets
Default Pattern Set
Uses XML-style tags for all segment types:
from agentic import create_default_pattern_set
patterns = create_default_pattern_set()
# Uses: <tool>, <reasoning>, <response>
JSON Tools Pattern Set
Uses JSON code blocks for tools, XML for others:
from agentic import create_json_tools_pattern_set
patterns = create_json_tools_pattern_set()
# Tools: ```json...```
# Others: <reasoning>, <response>
XML Tools Pattern Set
Alternative XML tags throughout:
from agentic import create_xml_tools_pattern_set
patterns = create_xml_tools_pattern_set()
# <tool_call>, <thinking>, <answer>
Backtick Tools Pattern Set
Uses triple backticks for tools:
from agentic import create_backtick_tools_pattern_set
patterns = create_backtick_tools_pattern_set()
# Tools: ```tool...```
Custom Pattern Sets
Create your own pattern set to match your LLM's output format:
from agentic import PatternSet, Pattern, SegmentType, PatternRegistry, RocksDBStorage
storage = RocksDBStorage("./data")
pattern_registry = PatternRegistry(storage)
custom_patterns = PatternSet(
    name="custom",
    patterns=[
        Pattern(
            name="tool",
            start_tag="[TOOL:",
            end_tag=":TOOL]",
            segment_type=SegmentType.TOOL,
            expected_format="json"
        ),
        Pattern(
            name="thinking",
            start_tag="[THINK:",
            end_tag=":THINK]",
            segment_type=SegmentType.REASONING
        ),
        Pattern(
            name="answer",
            start_tag="[ANSWER:",
            end_tag=":ANSWER]",
            segment_type=SegmentType.RESPONSE
        )
    ],
    default_response_behavior="all_remaining"
)
pattern_registry.register_pattern_set(custom_patterns)
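Custom tags such as `[TOOL:` contain characters that are special in regular expressions, so a quick way to sanity-check a tag pair before registering it is to test it against sample output. The following is a minimal sketch using only the standard library; `build_matcher` is a hypothetical helper and says nothing about how the library matches tags internally.

```python
import re


def build_matcher(start_tag: str, end_tag: str) -> "re.Pattern[str]":
    """Compile a non-greedy matcher for an arbitrary start/end tag pair."""
    # re.escape keeps characters like '[' and ']' from being read as regex syntax.
    return re.compile(
        re.escape(start_tag) + r"(.*?)" + re.escape(end_tag), re.DOTALL
    )


matcher = build_matcher("[TOOL:", ":TOOL]")
raw = '[THINK:Look it up.:THINK][TOOL:{"name": "search"}:TOOL]'
print(matcher.findall(raw))  # ['{"name": "search"}']
```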
Pattern Options
Greedy vs Non-Greedy
Control how patterns match content:
Pattern(
    name="tool",
    start_tag="<tool>",
    end_tag="</tool>",
    segment_type=SegmentType.TOOL,
    greedy=False  # Non-greedy: stops at first </tool>
)
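The difference matters most when the same tag appears more than once. The sketch below shows it with standard regular expressions, assuming the `greedy` option behaves like a greedy versus non-greedy regex quantifier; the source only states the non-greedy behavior, so the greedy half is an assumption.

```python
import re

text = "<tool>first</tool> some text <tool>second</tool>"

# Non-greedy: each match stops at the nearest closing tag.
non_greedy = re.findall(r"<tool>(.*?)</tool>", text, re.DOTALL)

# Greedy: a single match runs to the last closing tag in the text.
greedy = re.findall(r"<tool>(.*)</tool>", text, re.DOTALL)

print(non_greedy)  # ['first', 'second']
print(greedy)      # ['first</tool> some text <tool>second']
```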
Expected Format
For tool patterns, specify expected content format:
- "json" - Expect JSON; error if invalid
- "line" - Expect line format (Name: x, Arguments: {...})
- "auto" - Try both formats silently
- None - Skip tool parsing
Pattern(
    name="tool",
    start_tag="```json",
    end_tag="```",
    segment_type=SegmentType.TOOL,
    expected_format="json"  # Enforce JSON format
)
Default Response Behavior
Control how unmatched text is handled:
- "all_remaining" - Text outside patterns becomes the response (default)
- "explicit_only" - Only explicitly tagged responses are extracted
PatternSet(
    name="strict",
    patterns=[...],
    default_response_behavior="explicit_only"
)
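The two behaviors can be compared on a reply that contains no response tag at all. This is a standalone sketch with a hypothetical `extract_response` helper; in particular, appending untagged text after any explicitly tagged responses is an assumption of the sketch, since the source does not say how the library combines them.

```python
import re

# Matches any of the default XML-style tagged spans.
TAGGED = re.compile(r"<(tool|reasoning|response)>(.*?)</\1>", re.DOTALL)


def extract_response(raw: str, behavior: str = "all_remaining") -> str:
    """Illustrate the two default_response_behavior settings."""
    explicit = [
        m.group(2).strip() for m in TAGGED.finditer(raw) if m.group(1) == "response"
    ]
    if behavior == "explicit_only":
        return "\n".join(explicit)
    if behavior != "all_remaining":
        raise ValueError(f"unknown behavior: {behavior!r}")
    # Assumption: text outside every tagged span also counts as response text.
    leftover = TAGGED.sub("", raw).strip()
    return "\n".join(explicit + ([leftover] if leftover else []))


raw = "<reasoning>Check the docs.</reasoning>The answer is 42."
print(extract_response(raw))                        # The answer is 42.
print(repr(extract_response(raw, "explicit_only"))) # ''
```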
Streaming Pattern Extraction
Patterns are detected in real-time during LLM streaming:
async for event in runner.step_stream(user_input):
    if event.type == "pattern_start":
        print(f"Detected: {event.pattern_name}")
    elif event.type == "pattern_content":
        # Partial content (if stream_pattern_content=True)
        print(event.content, end="")
    elif event.type == "pattern_end":
        # Complete content after end tag
        print(f"\nComplete: {event.full_content}")
Tool Parsing Formats
JSON Format
<tool>
{
"name": "calculate",
"arguments": {
"a": 5,
"b": 3,
"operation": "add"
},
"call_id": "calc_001"
}
</tool>
Line Format
<tool>
Name: calculate
Arguments:
{
"a": 5,
"b": 3
}
</tool>
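The two formats above, together with the `expected_format` values, can be sketched as a small dispatcher. This is an illustration under stated assumptions rather than the library's parser: `parse_tool`, `parse_json_tool`, `parse_line_tool`, and the returned dictionary shape are all hypothetical, and the line-format grammar is inferred from the single example shown.

```python
import json
import re
from typing import Any, Dict, Optional

# Inferred from the example: "Name: <name>" followed by "Arguments:" and a JSON object.
LINE_FORMAT = re.compile(
    r"^\s*Name:\s*(?P<name>\S+)\s*Arguments:\s*(?P<args>\{.*\})\s*$", re.DOTALL
)


def parse_json_tool(content: str) -> Dict[str, Any]:
    data = json.loads(content)  # raises json.JSONDecodeError (a ValueError)
    if not isinstance(data, dict) or "name" not in data:
        raise ValueError("tool JSON must be an object with a 'name' key")
    return {
        "name": data["name"],
        "arguments": data.get("arguments", {}),
        "call_id": data.get("call_id"),
    }


def parse_line_tool(content: str) -> Dict[str, Any]:
    match = LINE_FORMAT.match(content)
    if match is None:
        raise ValueError("content is not in line format")
    return {
        "name": match["name"],
        "arguments": json.loads(match["args"]),
        "call_id": None,
    }


def parse_tool(content: str, expected_format: Optional[str] = "auto"):
    """Dispatch on expected_format: 'json', 'line', 'auto', or None."""
    if expected_format is None:
        return None  # skip tool parsing entirely
    if expected_format == "json":
        return parse_json_tool(content)
    if expected_format == "line":
        return parse_line_tool(content)
    if expected_format == "auto":
        try:
            return parse_json_tool(content)
        except ValueError:
            return parse_line_tool(content)
    raise ValueError(f"unknown expected_format: {expected_format!r}")


line_text = 'Name: calculate\nArguments:\n{\n"a": 5,\n"b": 3\n}\n'
print(parse_tool(line_text))
```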
Error Handling
Patterns handle malformed content gracefully:
- Incomplete patterns: Missing end tags are detected and reported as ErrorEvent with malformed_pattern type
- Parse errors: Invalid JSON or format errors are stored in segments.parse_errors
- Multiple instances: All occurrences of a pattern are extracted
result = runner.step(user_input)

# Check for parse errors
if result.segments.parse_errors:
    for key, error_text in result.segments.parse_errors.items():
        print(f"Parse error in {key}: {error_text}")

# Check for malformed patterns (streaming only)
if result.partial_malformed_patterns:
    for pattern_name, content in result.partial_malformed_patterns.items():
        print(f"Incomplete {pattern_name}: {content}")
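To show what an incomplete pattern looks like in practice, the sketch below scans raw text for a start tag that is never closed and returns the dangling content. `find_unclosed` is a hypothetical helper written for illustration; how the library actually detects and reports these cases is not described in the source beyond the bullets above.

```python
from typing import Optional


def find_unclosed(raw: str, start_tag: str, end_tag: str) -> Optional[str]:
    """Return the content after a start tag that never closes, else None."""
    position = 0
    while True:
        start = raw.find(start_tag, position)
        if start == -1:
            return None  # every start tag seen so far was closed
        content_start = start + len(start_tag)
        end = raw.find(end_tag, content_start)
        if end == -1:
            return raw[content_start:]  # truncated output: no end tag
        position = end + len(end_tag)


truncated = '<tool>{"name": "a"}</tool><tool>{"name": "search"'
print(find_unclosed(truncated, "<tool>", "</tool>"))  # {"name": "search"
```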
Best Practices
- Choose pattern tags that your LLM can reliably generate
- Use descriptive pattern names for debugging
- Test patterns with your LLM to ensure consistent output
- Use "auto" format for flexibility during development
- Set expected_format to enforce strict formats in production
- Enable stream_pattern_content for real-time UIs
- Consider using explicit_only for strict response extraction
Next Steps
- Tools - How tools are parsed from patterns
- Agent System - Configure pattern sets per agent
- Events - Pattern detection events