Programmatic Execution: Execute Complex Workflows at Once

Problem: The LLM Round-Trip Tax

When orchestrating complex multi-step workflows with traditional tool calling, you pay a tax on every round-trip:

Scenario: Get team expenses and find budget overages

Traditional Approach (with parallel function calling):
  Round 1: "Get team members"
     LLM calls get_team_members()
     Result: 20 members (5KB)

  Round 2: "Get Q3 expenses for all members"
     LLM calls get_expenses() [PARALLEL for all 20]
     Result: 100 expense records (100KB)
     LLM must now see and reason about 100+ records

  Round 3: "Find who exceeded budget"
     LLM analyzes the 100 records in its context window
     Returns result

Cost Breakdown:
   - 3 LLM API calls
   - ~100KB context (all expense data stays in LLM context)
   - ~2 seconds (3 round-trips)
   - ~$0.03 cost
   - ⚠️  LLM wasted tokens on data analysis instead of orchestration logic

The issue isn't just API calls—it's context bloat. The LLM receives all raw data and must reason through it, wasting tokens on manual analysis.
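
To make the bloat concrete, here is a toy measurement with stubbed data (the record shapes and sizes are invented for illustration, not taken from a real API):

```python
import json

# Stub data standing in for real tool results
team = [{"id": i, "name": f"member-{i}"} for i in range(20)]
expenses = [{"member_id": i % 20, "amount": 500 + i} for i in range(100)]

# By round 3 the conversation carries every raw record
messages = [
    {"role": "user", "content": "Who exceeded their Q3 budget?"},
    {"role": "tool", "content": json.dumps(team)},      # round 1 result
    {"role": "tool", "content": json.dumps(expenses)},  # round 2 result
]

# All of this is re-sent to the LLM on the next call
print(sum(len(m["content"]) for m in messages), "bytes of context")
```

Every subsequent round re-sends this growing transcript, so the cost compounds as the workflow gets longer.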

Solution: Programmatic Execution

Let the LLM generate code ONCE that orchestrates everything:

Programmatic Approach:

1. LLM Call (single):
   "Given the tools: get_team_members, get_expenses, write Python code to:
    1. Get all team members
    2. Fetch their Q3 expenses (in parallel)
    3. Filter for those over $10k budget
    4. Return the list"

   LLM generates:
   ```python
   import asyncio

   # Top-level await/return are assumed to be legal because the
   # sandbox wraps this snippet in a function before running it
   team = await get_team_members()

   # Fetch every member's Q3 expenses concurrently in one batch
   expenses = await asyncio.gather(*[
       get_expenses(m["id"], "Q3") for m in team
   ])

   # Filter locally; the raw records never reach the LLM
   exceeded = [
       m for m, exp in zip(team, expenses)
       if sum(e["amount"] for e in exp) > 10000
   ]

   return {"exceeded_count": len(exceeded), "list": exceeded}
   ```

2. Code Execution (in ToolWeaver sandbox):
   - Fetches all 20 members (parallel)
   - Calls get_expenses 20x (parallel in one batch)
   - Filters locally (NO context sent to LLM)
   - Returns only summary (2KB)
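
How can generated code use top-level await and return? A minimal sketch of one way a sandbox could support this (an assumption for illustration, not necessarily ToolWeaver's actual mechanism) is to wrap the snippet in an async function and inject the tools by name:

```python
import asyncio
import textwrap

async def run_generated(code: str, tools: dict):
    # Wrap the snippet so top-level await/return become legal Python
    wrapped = "async def __workflow__():\n" + textwrap.indent(code, "    ")
    namespace = dict(tools)  # expose tools to the snippet by name
    exec(wrapped, namespace)
    return await namespace["__workflow__"]()

# Demo with a stub tool:
async def get_team_members():
    return [{"id": 1}, {"id": 2}]

snippet = "team = await get_team_members()\nreturn {'count': len(team)}"
print(asyncio.run(run_generated(snippet, {"get_team_members": get_team_members})))
# -> {'count': 2}
```

A production sandbox would add isolation, timeouts, and import restrictions on top of this.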

Cost Breakdown:
   - 1 LLM API call
   - ~2KB context (only the final summary)
   - <1 second (all parallel)
   - ~$0.01 cost
   - LLM never saw the raw data; it only generated orchestration logic

Comparison

| Aspect | Traditional | Programmatic | Improvement |
|--------|-------------|--------------|-------------|
| LLM API calls | 3 | 1 | 67% fewer |
| Context size | 100KB | 2KB | 98% smaller |
| Latency | ~2 seconds | <1 second | 2x faster |
| Cost | $0.03 | $0.01 | 67% savings |
| LLM reasoning | Over raw data | Over generated code | Eliminates context bloat |
| Scalability | 1000 items = +1000KB context | 1000 items = +1KB context | Near-constant cost |

Why It Matters

1. Context Efficiency

Traditional tool calling forces LLMs to see and reason about all intermediate data. This is expensive.

Traditional: the LLM sees
  [20 team members] + [100 expense records] + "analyze this manually"
  → tokens wasted on reading raw data

Programmatic: the LLM generates code that does the filtering
  → it only sees the summary result
  → context stays minimal
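
The summary payload is small enough to measure directly; the values below are illustrative, mirroring the earlier example:

```python
import json

# The only thing returned to the LLM is the compact summary
summary = {
    "exceeded_count": 2,
    "list": [{"id": 4, "total": 12500}, {"id": 9, "total": 11800}],
}
print(len(json.dumps(summary)), "bytes")  # well under 1KB, not 100KB
```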

2. Token Efficiency

LLM tokens are consumed by reasoning over and analyzing data, not just by input and output.

Traditional: the LLM must analyze
  100 expense records in context to find overages
  → tokens spent on data analysis that could be done locally

Programmatic: the LLM generates code once
  the code performs all analysis locally in the sandbox
  → the LLM only generates orchestration logic (a higher-value use of tokens)

3. Scalability

Traditional approaches don't scale: more data means more context, and more context means more cost.

Processing 100 items:
  Traditional: +100KB context × 3 calls = expensive
  Programmatic: +0.1KB result × 1 call = cheap

Processing 1000 items:
  Traditional: +1000KB context × 3 calls = prohibitively expensive
  Programmatic: +1KB result × 1 call = same cost
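
The same point as arithmetic; the constants are illustrative (roughly 1KB per raw record, a few bytes per summary row), not measurements:

```python
def traditional_context_kb(n_items, kb_per_item=1.0):
    # every raw record enters the LLM context
    return n_items * kb_per_item

def programmatic_context_kb(n_items, kb_per_row=0.001):
    # only the filtered summary comes back
    return n_items * kb_per_row

for n in (100, 1000):
    print(n, traditional_context_kb(n), programmatic_context_kb(n))
# 100  -> 100.0 vs 0.1
# 1000 -> 1000.0 vs 1.0
```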

When to Use Programmatic Execution

Use programmatic execution when:

- Processing batches of items (10+)
- Multi-step workflows with filtering/aggregation
- Parallel operations are needed
- Intermediate data is large but the final result is small
- Complex logic (loops, conditionals, transformations)

Avoid programmatic execution when:

- A single simple tool call is enough
- The LLM must iterate based on intermediate results
- Complex reasoning is needed on the intermediate data

Example: Batch Processing

Scenario: Process 500 leads through qualification workflow

```python
# Traditional: the LLM must see and reason about all 500 leads
#  → 500 × "is this a qualified lead?" rounds
#  → 500KB+ of context
#  → very expensive

# Programmatic: the LLM generates code that handles every lead
leads_code = """
leads = await get_leads(query="unqualified")
results = []

for lead in leads:
    score = await score_lead(lead)
    if score > 0.8:
        company = await get_company(lead.company_id)
        results.append({
            "lead": lead,
            "score": score,
            "company": company
        })

return results
"""

# The LLM generates this code ONCE
# The code processes all 500 leads locally in the sandbox
# Only qualified leads (maybe 50) are returned to the LLM
# Cost: 1 LLM call vs 500
```
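
The loop above scores leads one at a time. If the scoring tool tolerates concurrency, the generated code could batch it with a semaphore to cap in-flight calls; a sketch with a stub score_lead (the real tool's signature is assumed from the example above):

```python
import asyncio
import random

async def score_lead(lead):  # stub standing in for the real tool
    await asyncio.sleep(0.01)
    return random.random()

async def score_all(leads, limit=20):
    # Cap concurrency so 500 simultaneous calls don't overwhelm the API
    sem = asyncio.Semaphore(limit)

    async def one(lead):
        async with sem:
            return await score_lead(lead)

    return await asyncio.gather(*(one(lead) for lead in leads))

scores = asyncio.run(score_all([{"id": i} for i in range(500)]))
print(len(scores))  # 500
```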

Example: Multi-Step Workflow

Scenario: "Find the highest-paying job offer from tech companies"

```python
# Traditional: multiple rounds
#  Round 1: the LLM calls get_job_offers()
#  Rounds 2..N: the LLM analyzes each offer, queries company details
#  Result: an expensive round-trip dance

# Programmatic: generate the code ONCE
code = """
import asyncio, json

# Get all offers
offers = await get_job_offers()

# Get company details for all offers (parallel)
company_tasks = [get_company_details(o.company_id) for o in offers]
companies = await asyncio.gather(*company_tasks)

# Keep (offer, company) pairs from tech companies
tech_offers = [
    (o, c) for o, c in zip(offers, companies)
    if 'tech' in c.industry.lower()
]

# Find the highest-paying offer
best_offer, best_company = max(tech_offers, key=lambda pair: pair[0].salary)

# Return a structured result
print(json.dumps({
    "offer": best_offer.to_dict(),
    "company": best_company.to_dict(),
    "salary": best_offer.salary
}))
"""

# The LLM generates this ONCE
# All logic runs in the sandbox (parallel fetch of company details)
# Only the best offer is returned
# Cost: 1 LLM call + parallel execution vs 3-5 round-trips
```
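
If the sandbox exposes a wrapper like the run_generated sketch from earlier, dispatching this workflow is a single execution call (the tool implementations are assumed to exist):

```python
import asyncio

# The snippet prints its JSON result from inside the sandbox
asyncio.run(run_generated(code, {
    "get_job_offers": get_job_offers,
    "get_company_details": get_company_details,
}))
```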

Getting Started

See the Orchestrate with Code guide for implementation details and security considerations.

Key Takeaways

| Traditional | Programmatic |
|-------------|--------------|
| Multi-round reasoning | Single code generation |
| LLM sees all data | LLM generates orchestration logic |
| Expensive at scale | Near-constant cost |
| Better for simple tasks | Better for complex workflows |

Programmatic execution is most powerful when you have:

1. Multiple items to process (batch operations)
2. Parallel fetches needed (asyncio.gather)
3. Local filtering/aggregation (the LLM doesn't see raw data)
4. Cost sensitivity (fewer API calls, less context)