Programmatic Execution: Execute Complex Workflows at Once

Problem: The LLM Round-Trip Tax

When orchestrating complex multi-step workflows with traditional tool calling, you pay a tax on every round-trip:

Scenario: Get team expenses and find budget overages

Traditional Approach (with parallel function calling):
  Round 1: "Get team members"
     LLM calls get_team_members()
     Result: 20 members (5KB)

  Round 2: "Get Q3 expenses for all members"
     LLM calls get_expenses() [PARALLEL for all 20]
     Result: 100 expense records (100KB)
     LLM must now see and reason about 100+ records

  Round 3: "Find who exceeded budget"
     LLM analyzes the 100 records in its context window
     Returns result

Cost Breakdown:
   - 3 LLM API calls
   - ~100KB context (all expense data stays in LLM context)
   - ~2 seconds (3 round-trips)
   - ~$0.03 cost
   - ⚠️  LLM wasted tokens on data analysis instead of orchestration logic

The issue isn't just API calls—it's context bloat. The LLM receives all raw data and must reason through it, wasting tokens on manual analysis.
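
To make the bloat concrete, here is a toy measurement with stubbed data (the record shapes and sizes are invented for illustration, not taken from a real API):

```python
import json

# Stub data standing in for real tool results
team = [{"id": i, "name": f"member-{i}"} for i in range(20)]
expenses = [{"member_id": i % 20, "amount": 500 + i} for i in range(100)]

# By round 3 the conversation carries every raw record
messages = [
    {"role": "user", "content": "Who exceeded their Q3 budget?"},
    {"role": "tool", "content": json.dumps(team)},      # round 1 result
    {"role": "tool", "content": json.dumps(expenses)},  # round 2 result
]

# All of this is re-sent to the LLM on the next call
print(sum(len(m["content"]) for m in messages), "bytes of context")
```

Every subsequent round re-sends this growing transcript, so the cost compounds as the workflow gets longer.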

Solution: Programmatic Execution

Let the LLM generate code ONCE that orchestrates everything:

Programmatic Approach:

1. LLM Call (single):
   "Given the tools: get_team_members, get_expenses, write Python code to:
    1. Get all team members
    2. Fetch their Q3 expenses (in parallel)
    3. Filter for those over $10k budget
    4. Return the list"

   LLM generates:
   ```python
   import asyncio

   # Top-level await/return are assumed to be legal because the
   # sandbox wraps this snippet in a function before running it
   team = await get_team_members()

   # Fetch every member's Q3 expenses concurrently in one batch
   expenses = await asyncio.gather(*[
       get_expenses(m["id"], "Q3") for m in team
   ])

   # Filter locally; the raw records never reach the LLM
   exceeded = [
       m for m, exp in zip(team, expenses)
       if sum(e["amount"] for e in exp) > 10000
   ]

   return {"exceeded_count": len(exceeded), "list": exceeded}
   ```

2. Code Execution (in ToolWeaver sandbox):
   - Fetches all 20 members (parallel)
   - Calls get_expenses 20x (parallel in one batch)
   - Filters locally (NO context sent to LLM)
   - Returns only summary (2KB)
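
How can generated code use top-level await and return? A minimal sketch of one way a sandbox could support this (an assumption for illustration, not necessarily ToolWeaver's actual mechanism) is to wrap the snippet in an async function and inject the tools by name:

```python
import asyncio
import textwrap

async def run_generated(code: str, tools: dict):
    # Wrap the snippet so top-level await/return become legal Python
    wrapped = "async def __workflow__():\n" + textwrap.indent(code, "    ")
    namespace = dict(tools)  # expose tools to the snippet by name
    exec(wrapped, namespace)
    return await namespace["__workflow__"]()

# Demo with a stub tool:
async def get_team_members():
    return [{"id": 1}, {"id": 2}]

snippet = "team = await get_team_members()\nreturn {'count': len(team)}"
print(asyncio.run(run_generated(snippet, {"get_team_members": get_team_members})))
# -> {'count': 2}
```

A production sandbox would add isolation, timeouts, and import restrictions on top of this.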

Cost Breakdown:
   - 1 LLM API call
   - ~2KB context (only the final summary)
   - <1 second (all parallel)
   - ~$0.01 cost
   - LLM never saw the raw data; it only generated orchestration logic

Comparison

| Aspect | Traditional | Programmatic | Improvement |
|--------|-------------|--------------|-------------|
| LLM API calls | 3 | 1 | 67% fewer |
| Context size | 100KB | 2KB | 98% smaller |
| Latency | ~2 seconds | <1 second | 2x faster |
| Cost | $0.03 | $0.01 | 67% savings |
| LLM reasoning | Over raw data | Over generated code | Eliminates context bloat |
| Scalability | 1000 items = +1000KB context | 1000 items = +1KB context | Near-constant cost |

Why It Matters

1. Context Efficiency

Traditional tool calling forces LLMs to see and reason about all intermediate data. This is expensive.

Traditional: the LLM sees
  [20 team members] + [100 expense records] + "analyze this manually"
  → tokens wasted on reading raw data

Programmatic: the LLM generates code that does the filtering
  → it only sees the summary result
  → context stays minimal
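
The summary payload is small enough to measure directly; the values below are illustrative, mirroring the earlier example:

```python
import json

# The only thing returned to the LLM is the compact summary
summary = {
    "exceeded_count": 2,
    "list": [{"id": 4, "total": 12500}, {"id": 9, "total": 11800}],
}
print(len(json.dumps(summary)), "bytes")  # well under 1KB, not 100KB
```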

2. Token Efficiency

LLM tokens are consumed by reasoning over and analyzing data, not just by input and output.

Traditional: the LLM must analyze
  100 expense records in context to find overages
  → tokens spent on data analysis that could be done locally

Programmatic: the LLM generates code once
  the code performs all analysis locally in the sandbox
  → the LLM only generates orchestration logic (a higher-value use of tokens)

3. Scalability

Traditional approaches don't scale: more data means more context, and more context means more cost.

Processing 100 items:
  Traditional: +100KB context × 3 calls = expensive
  Programmatic: +0.1KB result × 1 call = cheap

Processing 1000 items:
  Traditional: +1000KB context × 3 calls = prohibitively expensive
  Programmatic: +1KB result × 1 call = same cost
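
The same point as arithmetic; the constants are illustrative (roughly 1KB per raw record, a few bytes per summary row), not measurements:

```python
def traditional_context_kb(n_items, kb_per_item=1.0):
    # every raw record enters the LLM context
    return n_items * kb_per_item

def programmatic_context_kb(n_items, kb_per_row=0.001):
    # only the filtered summary comes back
    return n_items * kb_per_row

for n in (100, 1000):
    print(n, traditional_context_kb(n), programmatic_context_kb(n))
# 100  -> 100.0 vs 0.1
# 1000 -> 1000.0 vs 1.0
```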

When to Use Programmatic Execution

Use programmatic execution when:

- Processing batches of items (10+)
- Multi-step workflows with filtering/aggregation
- Parallel operations are needed
- Intermediate data is large but the final result is small
- Complex logic (loops, conditionals, transformations)

Avoid programmatic execution when:

- A single simple tool call is enough
- The LLM must iterate based on intermediate results
- Complex reasoning is needed on the intermediate data

Example: Batch Processing

Scenario: Process 500 leads through qualification workflow

```python
# Traditional: the LLM must see and reason about all 500 leads
#  → 500 × "is this a qualified lead?" rounds
#  → 500KB+ of context
#  → very expensive

# Programmatic: the LLM generates code that handles every lead
leads_code = """
leads = await get_leads(query="unqualified")
results = []

for lead in leads:
    score = await score_lead(lead)
    if score > 0.8:
        company = await get_company(lead.company_id)
        results.append({
            "lead": lead,
            "score": score,
            "company": company
        })

return results
"""

# The LLM generates this code ONCE
# The code processes all 500 leads locally in the sandbox
# Only qualified leads (maybe 50) are returned to the LLM
# Cost: 1 LLM call vs 500
```
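
The loop above scores leads one at a time. If the scoring tool tolerates concurrency, the generated code could batch it with a semaphore to cap in-flight calls; a sketch with a stub score_lead (the real tool's signature is assumed from the example above):

```python
import asyncio
import random

async def score_lead(lead):  # stub standing in for the real tool
    await asyncio.sleep(0.01)
    return random.random()

async def score_all(leads, limit=20):
    # Cap concurrency so 500 simultaneous calls don't overwhelm the API
    sem = asyncio.Semaphore(limit)

    async def one(lead):
        async with sem:
            return await score_lead(lead)

    return await asyncio.gather(*(one(lead) for lead in leads))

scores = asyncio.run(score_all([{"id": i} for i in range(500)]))
print(len(scores))  # 500
```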

Example: Multi-Step Workflow

Scenario: "Find the highest-paying job offer from tech companies"

```python
# Traditional: multiple rounds
#  Round 1: the LLM calls get_job_offers()
#  Rounds 2..N: the LLM analyzes each offer, queries company details
#  Result: an expensive round-trip dance

# Programmatic: generate the code ONCE
code = """
import asyncio, json

# Get all offers
offers = await get_job_offers()

# Get company details for all offers (parallel)
company_tasks = [get_company_details(o.company_id) for o in offers]
companies = await asyncio.gather(*company_tasks)

# Keep (offer, company) pairs from tech companies
tech_offers = [
    (o, c) for o, c in zip(offers, companies)
    if 'tech' in c.industry.lower()
]

# Find the highest-paying offer
best_offer, best_company = max(tech_offers, key=lambda pair: pair[0].salary)

# Return a structured result
print(json.dumps({
    "offer": best_offer.to_dict(),
    "company": best_company.to_dict(),
    "salary": best_offer.salary
}))
"""

# The LLM generates this ONCE
# All logic runs in the sandbox (parallel fetch of company details)
# Only the best offer is returned
# Cost: 1 LLM call + parallel execution vs 3-5 round-trips
```
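
If the sandbox exposes a wrapper like the run_generated sketch from earlier, dispatching this workflow is a single execution call (the tool implementations are assumed to exist):

```python
import asyncio

# The snippet prints its JSON result from inside the sandbox
asyncio.run(run_generated(code, {
    "get_job_offers": get_job_offers,
    "get_company_details": get_company_details,
}))
```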

Getting Started

See the Orchestrate with Code guide for implementation details and security considerations.

Key Takeaways

| Traditional | Programmatic |
|-------------|--------------|
| Multi-round reasoning | Single code generation |
| LLM sees all data | LLM generates orchestration logic |
| Expensive at scale | Near-constant cost |
| Better for simple tasks | Better for complex workflows |

Programmatic execution is most powerful when you have:

1. Multiple items to process (batch operations)
2. Parallel fetches needed (asyncio.gather)
3. Local filtering/aggregation (the LLM doesn't see raw data)
4. Cost sensitivity (fewer API calls, less context)