Your synthetic workforce is programmatically addressable, self-evolving, and supports fully extensible integration and execution.
Developers who have built production AI agent systems run into a dirty secret: it's unclear what it means to make them "smarter over time".
By default, every conversation starts from zero and every agent rediscovers the same solutions.
And suppose you've armed them with the ability to write arbitrary code to solve problems. That clever workaround your agent figured out for parsing malformed JSON? Gone the moment the session ends. The function composition that finally worked for that complex data pipeline? Lost between stateless instances.
The fundamental issue isn't with the underlying LLMs. They're remarkably capable. The issue is that we've built systems with systemic amnesia.
Suppose we instead go down the path most teams take for memory. Good implementations of flexible, dynamic agent memory try to patch this with conversation histories and RAG systems; some go as far as decomposing conversations into ontologically usable components amenable to graph embedding. These help a great deal with declarative knowledge. But they are band-aids on a structural problem: the work performance of an agentic system is ephemeral, while the underlying work problems in the enterprise are persistent. And declarative and procedural memory call for storing different kinds of things. "Michelle's phone number is (412) 316-3133" is a different kind of thing to know than the discrete set of steps that were derived to solve a problem.
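The distinction is easy to make concrete: a declarative entry is a fact you retrieve, while a procedural entry is a routine you execute. A minimal illustrative sketch (the names here are ours, not Swarm's):

```python
# Declarative memory: facts you look up.
declarative = {"michelle_phone": "(412) 316-3133"}

# Procedural memory: a derived, reusable routine you execute.
def normalize_phone(raw: str) -> str:
    """Strip formatting so numbers compare equal regardless of style."""
    return "".join(ch for ch in raw if ch.isdigit())

# Retrieval answers "what is X?"; execution answers "how do I do X?".
fact = declarative["michelle_phone"]
digits = normalize_phone(fact)
```

Storing the fact and storing the routine are different problems, and most memory systems only address the first.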
In developing Swarm, we explored an approach to building scalable institutional memory for procedural knowledge. Instead of trying to make Synthetics "remember things" in a strictly declarative sense, we make their practical work solutions executable, parameterizable, and shareable as synthetic tribal knowledge. When a synthetic worker solves a problem, its codified solution becomes part of the Scalable Work Executable Library (SWEL). This forms the basis of an enterprise-wide, evolving procedural memory organ.
At its core, a Synthetic in Swarm builds sophisticated routines of behavior to perform work by dynamically recombining 34 parameterizable Motor Building Units (MBUs). Thirty-three of these are fixed capabilities: things like `list_files`, `computer`, `compose_email`, and `calendar_get_availabilty`. These provide the bedrock operations that any worker might need. But the 34th MBU is different. It's the Custom Code MBU, and it's the key to everything.
The Custom Code MBU is a metatool that can become any tool. When a synthetic worker encounters a problem that can't be solved with the existing 33 MBUs, it uses the Custom Code MBU to generate and execute arbitrary Python code to solve it. That code gets parameterized, documented, and added to the SWEL where other workers can discover and use it. This is the basis of scaling institutional work competency between tasks and workers.
```python
# The Custom Code MBU in pseudoaction

# Step 1: Check if a solution already exists in the SWEL
existing_solution = self.search_SWEL(task_context)
if existing_solution:
    params = self.generate_parameters(existing_solution, task_context)
    return existing_solution.execute(**params)

# Step 2: Generate a new solution
code = self.generate_code(task_context)

# Step 3: Execute in a sandboxed environment
result = self.sandbox_execute(code, task_context)

# Step 4: If successful, parameterize and store
if result.success:
    parameterized = self.parameterize_solution(code, task_context, result)
    SWEL.add(code, parameterized)

# Write the success or failure of the operation to Working Memory
# and proceed with task execution
return result.value
```
The SWEL is a dynamic library: a growing repository of parameterized solutions that synthetic workers and humans alike contribute to and draw from. It is shared across workers and across tasks, so it serves as a dimension of institutional memory that accumulates over time, making every worker more capable as the system evolves.
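In spirit, the SWEL behaves like a searchable store of parameterized, executable entries. The sketch below is illustrative only; `SWELEntry`, its fields, and the search logic are our assumptions, not the production schema:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class SWELEntry:
    name: str
    func: Callable       # the parameterized, executable solution
    description: str     # searchable documentation
    tags: set = field(default_factory=set)

class SWEL:
    """Toy institutional library: workers contribute entries, others discover them."""
    def __init__(self):
        self.entries = {}

    def add(self, entry):
        self.entries[entry.name] = entry

    def search(self, keyword):
        kw = keyword.lower()
        return [e for e in self.entries.values()
                if kw in e.description.lower() or kw in e.tags]

lib = SWEL()
lib.add(SWELEntry("parse_currency", lambda s: float(s.strip("$ ")),
                  "Parse a currency string into a float", {"finance"}))

hits = lib.search("currency")     # any worker can discover the solution...
result = hits[0].func("$19.99")   # ...and execute it with new parameters
```

The point is that entries carry both the code and the documentation needed for another worker to find and reuse it without human mediation.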
We wanted to create a system that could evolve and adapt without constant human intervention. The goal was to empower synthetic workers to not only use existing tools but also to create and share new ones as needed. This is a departure from traditional agent systems and even from many modern AI frameworks that rely heavily on static toolsets or human-in-the-loop updates.
Traditional Agent Systems: Each agent has access to a fixed set of tools. If you need new functionality, a human developer writes it, deploys it, and updates all agents. Knowledge is centralized in the development team. Tools are reusable between uses, agents, and tasks.
LangChain-style Systems: Agents can chain tools together, but the tools themselves are static. You can compose them differently, but you can't create new atomic capabilities. The system is only as capable as its predefined tools.
We saw this as a fundamental limitation. Real-world work problems are diverse and often require novel solutions. Relying on a fixed toolset means that many problems remain unsolvable or require significant human intervention to address.
Additionally, static toolsets create a "cold start" problem for new agents or tasks. Every new agent begins with the same limited set of tools, regardless of the accumulated experience of previous agents. This leads to inefficiencies and repeated effort. Ideally, you want an organization where human and agentic colleagues alike can build on the collective experience of the workforce, not start from scratch every time.
Our Approach: Synthetics create new tools on the fly. These tools immediately become available to all other workers. Knowledge accumulates in the infrastructure itself, not in the development team. The tools are documented and parameterized for future reuse, and future use is scored and monitored for performance.
```python
from bs4 import BeautifulSoup

# Day 1: A worker solves a specific problem
def extract_price_from_html(html: str, currency: str = 'USD') -> float:
    """Extract product price from e-commerce HTML."""
    soup = BeautifulSoup(html, 'html.parser')
    price_elem = soup.find('span', {'class': 'price'})
    price_text = price_elem.text.strip()
    return parse_currency(price_text, currency)

# Day 5: Another worker needs something similar but different.
# It finds extract_price_from_html, realizes it's close but not exact,
# and generates a variation:
def extract_structured_data(
    html: str,
    selectors: dict,
    transformers: dict = None
) -> dict:
    """Extract any structured data from HTML using CSS selectors.

    Evolution of: extract_price_from_html
    Generalization: works with any CSS selector and data type
    """
    soup = BeautifulSoup(html, 'html.parser')
    data = {}
    for key, selector in selectors.items():
        elem = soup.select_one(selector)
        if elem:
            value = elem.text.strip()
            if transformers and key in transformers:
                value = transformers[key](value)
            data[key] = value
    return data

# Day 10: The entire workforce can now extract ANY structured data from HTML.
# Not because we programmed it, but because the system evolved it.
```
It's tempting to dismiss this approach as just a function library. But our underlying motivation was to induce a flywheel effect in which each subsequent unit of knowledge becomes more valuable in the presence of all the others.
When a synthetic worker adds a parameterized solution to the SWEL, it adds:
1. The solution itself
2. The ability for other workers to use it
3. The ability for other workers to compose it with existing solutions
4. The ability for other workers to evolve variations of it
5. Metadata about when and how it works best
6. Parameterized inputs and expected outputs
7. A surface for human intervention and editing
What we've seen is that early in a Swarm deployment's lifecycle, most problems require generating new code. But as the SWEL grows, more and more problems can be solved by discovering and composing existing solutions. The system gets faster and more reliable over time through accumulated executable knowledge captured in the SWEL as entries.
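One way to quantify this flywheel is a reuse rate: the fraction of tasks resolved by discovering an existing SWEL entry rather than generating fresh code. The metric below is a hypothetical sketch, not our shipped telemetry:

```python
def reuse_rate(task_log):
    """Fraction of tasks resolved from the SWEL instead of fresh codegen.

    task_log entries are 'reused' or 'generated'.
    """
    if not task_log:
        return 0.0
    return task_log.count("reused") / len(task_log)

# Early deployment: the library is thin, so most tasks need fresh code.
week_1 = ["generated"] * 8 + ["reused"] * 2
# Mature deployment: accumulated entries answer most tasks.
week_12 = ["generated"] * 2 + ["reused"] * 8

early, late = reuse_rate(week_1), reuse_rate(week_12)  # 0.2 vs 0.8
```

A rising reuse rate is the signal that accumulated executable knowledge, rather than raw model capability, is doing more of the work.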
"This will accumulate garbage code."
This one scared us at first. We don't want the SWEL to accumulate junk. Our answer was to have every solution in the SWEL carry metadata about its performance:
```python
{
    'function_name': 'parse_invoice_pdf',
    'success_rate': 0.94,
    'avg_execution_time': 1.3,
    'total_executions': 10847,
    'last_failure': '2024-03-01',
    'failure_modes': {
        'timeout': 0.02,
        'parse_error': 0.03,
        'missing_fields': 0.01
    },
    'forked_from': 'parse_pdf_generic',
    'has_forked_to': ['parse_invoice_pdf_v2', 'parse_receipt_pdf']
}
```
Workers use this metadata to decide whether to use, evolve, or replace solutions. Bad code naturally gets selected out of active use, and stale entries can be routinely pruned from the SWEL by on-platform metaworkers, ensuring that only the most effective solutions remain.
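Concretely, a worker's use/evolve/replace decision can be as simple as a threshold policy over this metadata. The sketch below is illustrative; the function name and thresholds are our assumptions, not Swarm's actual logic:

```python
def triage_solution(meta, min_success=0.90, min_executions=50):
    """Decide whether to 'use', 'evolve', or 'replace' a SWEL entry."""
    if meta["total_executions"] < min_executions:
        return "use"      # too little signal yet to judge; give it a chance
    if meta["success_rate"] >= min_success:
        return "use"      # proven solution
    if meta["success_rate"] >= 0.5:
        return "evolve"   # salvageable: fork it and fix the failure modes
    return "replace"      # selected out of active use

proven = {"total_executions": 10847, "success_rate": 0.94}
shaky = {"total_executions": 5000, "success_rate": 0.70}
failing = {"total_executions": 5000, "success_rate": 0.30}

decisions = [triage_solution(m) for m in (proven, shaky, failing)]
# decisions -> ['use', 'evolve', 'replace']
```

The important property is that selection pressure comes from recorded outcomes, not from anyone manually auditing the library.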
"What about security and sandboxing?"
The Custom Code MBU executes in a sandboxed Python process with resource limits. It can be locked down quite severely, and all generated code can be pre-screened for whitelisted or blacklisted patterns such as network access, file system access, or subprocess execution. The sandbox can be further isolated in a container or VM if needed. In practice, we've found that most useful code doesn't need to do anything fancy: it's mostly data manipulation, API calls using pre-approved SDKs, and business logic.
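As a sketch of what pattern pre-screening can look like, here is a minimal AST walk that flags blacklisted imports before generated code ever reaches the sandbox. The blocklist and function name are illustrative assumptions, not our actual screening rules:

```python
import ast

FORBIDDEN_MODULES = {"socket", "subprocess", "ctypes"}  # illustrative blocklist

def screen_code(source: str) -> list:
    """Return a list of violations found in generated code before execution."""
    violations = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in FORBIDDEN_MODULES:
                    violations.append(f"import of {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in FORBIDDEN_MODULES:
                violations.append(f"import from {node.module}")
    return violations
```

Static screening like this is a first gate, not a guarantee; the resource-limited sandbox remains the real enforcement boundary.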
"How is this different from codegen?"
It is codegen. That's the whole point. It's filling in a few things we wish codegen did better:
1. Persistence: Generated code becomes permanent, reusable capabilities
2. Parameterization: Code is automatically generalized for reuse
3. Discovery: Workers can find and apply relevant code according to past performance metrics
Our goal was to create a system where:
1. Code generation is purposeful: Workers generate code to solve specific problems
2. Solutions accumulate: Every problem solved makes the system more capable
3. Knowledge transfers: Solutions discovered by one worker immediately benefit all workers
4. Evolution is automatic: Workers can reuse code, adapt code, and improve code
This transforms synthetic workers from stateless executors into participants in an evolving system. They're contributing to a growing body of executable knowledge that makes every future task easier to solve.
In a conventional software orchestration environment, you write N functions and you get N capabilities. In Swarm, capabilities scale with contributions: you write N functions, they get parameterized into M variations, composed into M² workflows, and evolved into patterns that may be hard to imagine on Day 1. This only works if you understand how to engineer specifically for the Custom Code MBU.
Your intuition from traditional development will betray you here. You'll want to write comprehensive functions that solve entire workflows end-to-end. Resist this urge. The power of the SWEL comes from recombination, and monolithic functions can't recombine.
Consider the difference:
```python
# WRONG: Monolithic function that does everything
def analyze_customer_and_send_report():
    # 500 lines that:
    # - Fetch customer data from 3 sources
    # - Clean and normalize everything
    # - Run 5 different analyses
    # - Generate visualizations
    # - Create a PDF report
    # - Send via email
    # - Update the CRM
    return {"status": "complete", "were_new_files_created": True}
```
This function is useless to a synth that needs just the analysis logic for a different data source, or just the PDF generation for a different report type. It's a dead end in the evolutionary tree.
```python
# RIGHT: Discretized, composable functions
def fetch_customer_metrics(customer_id: str, source: str) -> dict:
    """Fetch specific metrics from a single source."""
    # 20 lines of focused logic
    return metrics_dict

def normalize_customer_data(raw_data: dict, schema: str = 'standard') -> dict:
    """Normalize data to standard schema."""
    # 30 lines of transformation
    return normalized_data

def calculate_customer_ltv(normalized_data: dict, method: str = 'cohort') -> float:
    """Calculate LTV using specified method."""
    # 25 lines of calculation
    return ltv_value

def generate_metric_visualization(metric_name: str, values: list, style: str = 'line') -> str:
    """Generate a single metric visualization."""
    # 15 lines to create chart
    return chart_path
```
Now synths can compose these in countless ways. A synth analyzing suppliers can use `normalize_customer_data` with a different schema. A synth doing employee analytics can reuse `calculate_customer_ltv` by treating employees as "customers." The visualization function works for any metric, not just customer metrics.
Each discretized function becomes a LEGO brick in an ever-expanding set. The smaller and more focused your bricks, the more structures future synths can build.
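To make the brick metaphor concrete, here is a hedged sketch of a composition a synth might assemble from functions like the ones above. The stub bodies are placeholders standing in for the real logic, and the pipeline itself is hypothetical:

```python
# Stub stand-ins for discretized bricks (real bodies elided for illustration).
def fetch_customer_metrics(customer_id: str, source: str) -> dict:
    return {"revenue": "1200.50", "orders": "14"}   # illustrative payload

def normalize_customer_data(raw: dict, schema: str = "standard") -> dict:
    return {k: float(v) for k, v in raw.items()}    # strings -> numbers

def calculate_customer_ltv(data: dict, method: str = "cohort") -> float:
    return data["revenue"] * 3                      # toy LTV model

# A synth snaps the bricks into a pipeline nobody wrote end-to-end:
def customer_ltv_pipeline(customer_id: str, source: str = "crm") -> float:
    raw = fetch_customer_metrics(customer_id, source)
    clean = normalize_customer_data(raw)
    return calculate_customer_ltv(clean)

ltv = customer_ltv_pipeline("C-1001")
```

Because each brick exposes its own parameters, the same pipeline shape works for suppliers, employees, or any other entity the synth decides to treat as a "customer."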
Synths are masters at determining what parameters to pass based on context. Your job is to expose every decision point as a parameter, not to make those decisions in code.
```python
# WRONG: Hard-coded decisions
def main():
    currency = 'USD'            # Always uses USD
    tax_rate = 0.08             # Always uses 0.08 tax rate
    output_dir = './invoices/'  # Always saves to ./invoices/
    # Process...
    return {"status": "complete", "were_new_files_created": True}
```
```python
# RIGHT: Fully parameterized
def main(
    invoice_data: dict,
    currency: str = 'USD',
    tax_rate: float = None,
    tax_jurisdiction: str = None,
    output_dir: str = './output/',
    validation_level: str = 'standard',
    rounding_precision: int = 2
) -> dict:
    """Process invoice with full configurability.

    Parameters:
    - invoice_data: Raw invoice data
    - currency: ISO currency code
    - tax_rate: Explicit tax rate (overrides jurisdiction)
    - tax_jurisdiction: Auto-determine rate from jurisdiction
    - output_dir: Where to save processed invoice
    - validation_level: 'basic', 'standard', 'strict'
    - rounding_precision: Decimal places for amounts
    """
    if tax_rate is None and tax_jurisdiction:
        tax_rate = lookup_tax_rate(tax_jurisdiction)
    # Process with all parameters
    return {
        "status": "success",
        "were_new_files_created": True,
        "output_path": f"{output_dir}/invoice_{invoice_data['id']}.json"
    }

__return__ = main(**{
    "invoice_data": invoice_data,
    "currency": currency,
    "tax_rate": tax_rate,
    "tax_jurisdiction": tax_jurisdiction,
    "output_dir": output_dir,
    "validation_level": validation_level,
    "rounding_precision": rounding_precision
})
```
The synth understands context you don't have when writing the function. It knows whether it's processing invoices for a European subsidiary (needs EUR), a tax-exempt organization (tax_rate=0), or a high-risk transaction (validation_level='strict'). By parameterizing everything, you let the synth's contextual awareness flow into your code.
Every contribution to the SWEL must follow a specific structural pattern. This isn't a suggestion; it's a requirement.
```python
def main(**kwargs) -> dict:
    """Your function MUST be wrapped in main(), accept **kwargs, and be
    invoked via __return__.

    The synth will pass parameters as keyword arguments based on
    its understanding of the task context.
    """
    # Your actual logic here
    result = perform_operation(
        kwargs.get('input_data'),
        kwargs.get('config', {}),
        kwargs.get('options', {})
    )
    # REQUIRED return structure
    return {
        'status': 'success',                     # Required: 'success', 'failure', or 'partial'
        'were_new_files_created': True,          # Required: boolean
        'files_created': ['output/report.pdf'],  # Optional but recommended if True above
        'result_data': result,                   # Optional: your actual results
        'metrics': {                             # Optional: performance data
            'records_processed': 1000,
            'processing_time': 2.3
        },
        'next_suggested_action': 'validate_output'  # Optional: helps composition
    }

__return__ = main(**{
    "input_data": input_data,
    "config": config,
    "options": options
})
```
The two required keys, `status` and `were_new_files_created`, integrate with MCAI's document control system. When `were_new_files_created` is True, the system automatically indexes and tracks the created files, making them available to other synths and maintaining audit trails.
Here's a complete example:
```python
def main(**kwargs) -> dict:
    """Generate quarterly financial summary from transaction data."""
    # Extract parameters with defaults
    transactions_path = kwargs.get('transactions_path', './data/transactions.csv')
    quarter = kwargs.get('quarter', 'Q1')
    year = kwargs.get('year', 2024)
    output_format = kwargs.get('output_format', 'json')
    include_projections = kwargs.get('include_projections', False)

    try:
        # Load and process data
        transactions = load_transactions(transactions_path)
        quarterly_data = filter_by_quarter(transactions, quarter, year)
        summary = calculate_summary(quarterly_data)

        if include_projections:
            summary['projections'] = generate_projections(summary)

        # Save output
        output_path = f'./output/{year}_{quarter}_summary.{output_format}'
        save_summary(summary, output_path, format=output_format)

        return {
            'status': 'success',
            'were_new_files_created': True,
            'files_created': [output_path],
            'summary': summary,
            'record_count': len(quarterly_data),
            'next_suggested_action': 'generate_visualization'
        }
    except Exception as e:
        return {
            'status': 'failure',
            'were_new_files_created': False,
            'error': str(e),
            'error_type': type(e).__name__
        }
```
The Custom Code MBU operates in a sandboxed environment with a specific file system structure. Assume the synth must supply the resource path for any file you want to work with.
```python
# WRONG: Assuming specific paths or creating arbitrary directories
def process_data():
    os.makedirs('/home/user/my_special_folder/')  # Won't work
    data = pd.read_csv('C:\\Data\\myfile.csv')    # Windows-specific path
    with open('/etc/config.conf') as f:           # System file access
        config = f.read()

# RIGHT: Use paths provided by the synth and assume resource availability
def main(**kwargs) -> dict:
    """Process data using sandbox-safe file operations."""
    # The synth provides paths to available resources
    input_path = kwargs.get('input_path')
    output_dir = kwargs.get('output_dir')
    config_path = kwargs.get('config_path')

    # Read from provided paths
    data = pd.read_csv(input_path)
    with open(config_path) as f:
        config = json.load(f)

    # Process and save under the provided output directory
    processed = transform_data(data, config)
    output_path = os.path.join(output_dir, 'processed_data.csv')
    processed.to_csv(output_path, index=False)

    return {
        'status': 'success',
        'were_new_files_created': True,
        'files_created': [output_path],
        'rows_processed': len(processed)
    }
```
Never assume absolute paths. Always use paths provided by the synth. When in doubt, ask the synth to find the file's resource path first.
This is where the magic happens. Synths don't just execute your code—they contextualize it based on their entire working session. They front-run the execution by populating parameters based on their understanding of the task, previous conversations, and accumulated context.
```python
def main(**kwargs) -> dict:
    """Example: Generate a customer report.

    The synth calling this might have:
    - Just finished a conversation about Q3 performance
    - Loaded customer data from three different sources
    - Been instructed to focus on enterprise clients
    - Learned that the CEO prefers bullet points

    All of this context informs how it populates parameters.
    """
    # These parameters might be auto-populated from context
    customer_segment = kwargs.get('customer_segment')    # Synth knows "enterprise" from conversation
    reporting_period = kwargs.get('reporting_period')    # Synth knows "Q3 2024" from discussion
    output_style = kwargs.get('output_style')            # Synth knows "bullet_points" from CEO preference
    metrics_focus = kwargs.get('metrics_focus', [])      # Synth includes metrics mentioned in conversation
    comparison_period = kwargs.get('comparison_period')  # Synth infers "Q2 2024" as logical comparison
    # ... build the report from whatever context was provided
```
This means you should write functions that are context-aware but not context-dependent:
```python
def main(**kwargs) -> dict:
    """Analyze communication patterns in provided dataset."""
    # Rich set of optional parameters that the synth can populate from context
    dataset_path = kwargs.get('dataset_path')

    # Synth might know these from conversation context
    participant_filter = kwargs.get('participant_filter', None)
    time_range = kwargs.get('time_range', None)
    communication_types = kwargs.get('communication_types', ['email', 'chat', 'video'])

    # Synth might infer these from project context
    analysis_depth = kwargs.get('analysis_depth', 'standard')
    include_sentiment = kwargs.get('include_sentiment', False)
    language = kwargs.get('language', 'en')

    # Synth might know these from previous task results
    baseline_metrics = kwargs.get('baseline_metrics', None)
    comparison_mode = kwargs.get('comparison_mode', None)

    # Your code works with whatever context is provided
    data = load_communications(dataset_path)
    if participant_filter:
        data = filter_participants(data, participant_filter)
    if time_range:
        data = filter_time_range(data, time_range)

    # Proceed with whatever context was available
    patterns = analyze_patterns(
        data,
        types=communication_types,
        depth=analysis_depth
    )
    return {
        'status': 'success',
        'were_new_files_created': False,
        'patterns': patterns
    }
```
The synth's working memory might include:
- Identity and role information ("I'm analyzing data for the marketing team")
- Session history ("Earlier I loaded the Q3 communication logs")
- Task understanding ("The user wants to identify collaboration bottlenecks")
- Learned preferences ("This organization uses 'slack' not 'chat'")
Your code doesn't need to know how the synth knows these things. It just needs to expose the parameters that allow this contextual knowledge to flow in.
When you follow these five practices, something remarkable happens. Your code doesn't just solve one problem—it becomes a building block that synths can use in ways you never imagined:
1. A function you wrote to `extract_table_from_pdf` gets composed with another developer's `validate_financial_data` to create an invoice processing pipeline
2. Your `calculate_trend_deviation` function, parameterized for financial data, gets repurposed by a synth analyzing manufacturing sensor data
3. The `generate_summary_bullets` function you created for emails gets evolved by a synth into `generate_executive_brief` for board presentations
4. Your simple `normalize_timestamps` utility becomes part of a complex temporal analysis system that no single developer envisioned
This is the exponential growth in action. Every properly discretized, parameterized, and structured contribution multiplies the capabilities of the entire system. You're not just writing code—you're contributing genes to an evolving digital organism that gets more capable with every generation.
The workforce isn't programmed. It evolves. And these five practices ensure your code becomes successful DNA in that evolution.