Crystallized Intelligence as System Architecture: How a Psychology Concept Moved 52% of AI Agent Work to Deterministic Code

Most AI agent frameworks treat every incoming message the same way: route it to the language model, let the model decide what to do, pay the token cost. This works. It also means your agent uses frontier reasoning to acknowledge receipts.

A better approach exists — one that comes from a 1971 psychometrics textbook.

Cattell’s Distinction

Raymond Cattell’s theory of fluid and crystallized intelligence (1971) distinguishes two components of cognitive ability:

Crystallized intelligence (Gc) — knowledge and skills acquired through experience. Pattern recognition, learned procedures, practiced responses. Gc accumulates over time and remains stable.
Fluid intelligence (Gf) — capacity for novel reasoning independent of prior learning. Inference, judgment, problem-solving in unfamiliar situations. Gf — the capacity you draw on when you encounter a problem for the first time.

The key insight: most cognitive work in a practitioner’s day consists of Gc operations. An experienced surgeon performing a routine procedure, a senior engineer reviewing a familiar class of bug, a legal assistant drafting a standard clause — these require precision, but not novel reasoning. The expertise has crystallized into learned procedure.

The same pattern holds in autonomous agent pipelines.

The Problem

Our autonomous mesh runs four agents on five-minute cron cycles. Each cycle invokes claude -p "/sync" to process unprocessed messages. The previous architecture passed all unprocessed messages to the LLM — 20 to 40 turns per cycle regardless of message substance.

A typical cycle might contain 12 messages:

Message	Content	What it actually needs
`from-psq-agent-016.json`	ACK of our T15 — position unchanged	Record it. Done.
`from-observatory-agent-015.json`	Tally report — gate still blocked	Record it. Note status. Done.
`from-psychology-agent-ack.json`	Receipt confirmation	Mark processed. Done.
`from-human-003.json`	New work order with 6 questions	LLM needed.
`from-psq-agent-014.json`	Substantive findings report	LLM needed.

Seven of the twelve required no reasoning. The LLM reviewed all twelve, spending tokens and time on work a deterministic program could handle in milliseconds.

The root cause: no distinction between crystallized and fluid operations at the pipeline level.

The Architecture

We implemented a three-stage crystallized pre-processing pipeline that runs before the LLM invocation, inside autonomous-sync.sh:

BEFORE (naive):
  cross-repo-fetch → claude -p "/sync" (20-40 turns)

AFTER (crystallized):
  cross-repo-fetch
    → agentdb triage --scan         ← classify all unprocessed messages
    → agentdb ack --auto            ← template ACKs for auto-ack messages
    → agentdb gate resolve --scan   ← resolve gates deterministically
    → IF needs_llm_count > 0 THEN
        claude -p "/sync --substance-only"
      ELSE
        skip LLM invocation entirely
      END IF

Cycles with only trivial messages now complete without invoking the LLM at all.

Stage 1: Triage

The triage stage scores each unprocessed message on a 0–100 scale using metadata alone — no message content read, no LLM consulted. The score determines a disposition:

Score range	Disposition	Action
0–15	`auto-skip`	Mark processed; no response needed
16–35	`auto-ack`	Generate template ACK; mark processed
36–55	`auto-record`	Record in state.db; include in next LLM orientation as context
56–100	`needs-llm`	Leave unprocessed for LLM `/sync`

The scoring formula:

triage_score = base_score(message_type)
             + urgency_modifier(urgency)
             + ack_modifier(ack_required)
             + gate_modifier(resolves_active_gate)
             + age_modifier(message_age)
             + content_modifier(claims_count, setl)

Base scores by message type encode the crystallized knowledge about what each type typically requires:

message_type	base_score	Rationale
`heartbeat`	0	Presence signal — always trivial
`ack`	5	No substance
`notification`	10	Informational
`vote`	50	Decision input — needs recording
`response`	70	Substantive reply — needs review
`request`	80	Action request — needs judgment
`command-request`	85	Executable command — needs evaluation
`proposal`	90	Decision point — needs deliberation

Modifiers adjust for context:

urgency: immediate adds +20; urgency: low subtracts -10
ack_required: true on a substance-type message adds +15 (the sender blocks on us)
A message that resolves an active gate gets -30 (gate resolution follows rules, not judgment)
Messages older than 24 hours get +5 to +15 (staleness urgency)
Messages with setl > 0.5 get +5 (higher epistemic investment signals substance)

The gate modifier deserves emphasis. A message that arrives in response to a gate we set — an ACK of our proposal, a vote we had awaited — often looks like a high-base-score message (vote, response) but resolves fully by checking whether the gate condition holds. The -30 correction pulls these into the auto-record range where gate resolution handles them without LLM review.

Stage 2: Template ACK

Messages with auto-ack disposition get a fixed-format acknowledgment generated and written to the session directory. No LLM. The template:

{
  "schema": "interagent/v1",
  "session_id": "{session_name}",
  "turn": {next_turn},
  "timestamp": "{iso_timestamp}",
  "message_type": "ack",
  "in_response_to": "{original_filename}",
  "message_cid": "{sha256_of_canonical_json}",
  "from": { "agent_id": "{self}" },
  "to": { "agent_id": "{original_sender}" },
  "payload": {
    "subject": "ACK: {original_subject}",
    "auto_generated": true,
    "triage_score": {score},
    "triage_disposition": "auto-ack"
  },
  "ack_required": false,
  "urgency": "normal",
  "setl": 0.01
}

Receiving agents can see the auto_generated: true flag. A peer agent can use this in their own triage — a machine-generated ACK scores lower than a human-mediated response. The transparency serves a deliberate purpose.

Content addressing (message_cid) ensures deduplication: if the same message arrives via multiple transport paths, the SHA-256 check catches the duplicate before it’s processed again.

Stage 3: Gate Resolution

The gate resolver scans unprocessed messages against active gates:

SELECT tm.filename, tm.session_name, tm.from_agent, ag.gate_id
FROM transport_messages tm
JOIN active_gates ag
  ON ag.status = 'waiting'
  AND ag.sending_agent = :self_agent_id
  AND (
    tm.in_response_to = (last outbound message in session)
    OR (tm.session_name = ag.gate_id AND tm.from_agent = ag.receiving_agent)
  )
WHERE tm.processed = FALSE;

When a match occurs: update active_gates → status = 'resolved', record resolved_by and resolved_at. The resolving message may still need LLM review for its content — gate resolution and message processing operate as independent concerns. A vote message that resolves a gate gets the gate closed automatically, then routes to needs-llm if its base score + modifiers land above 55.

First Production Run

The first run on chromabook processed 123 messages:

Disposition	Count	%
`auto-skip`	19	15%
`auto-ack`	38	31%
`auto-record`	7	6%
`needs-llm`	59	48%

64 messages (52%) handled deterministically. The LLM never saw them.

The crystallization rate — (auto-ack + auto-skip + auto-record) / total — becomes a first-class metric on the mesh status dashboard. A rising crystallization rate means the system handles routine work more effectively without reasoning invocations. The target: >60%.

The Shifting Boundary

The Cattell framing does more than justify the architecture. It describes its trajectory.

In psychometric theory, crystallized intelligence accumulates through experience. A practitioner who has processed thousands of cases develops pattern recognition that novices lack — not because their fluid reasoning performs better, but because they’ve crystallized more of the problem space. The Gc layer keeps growing; the proportion of work requiring Gf keeps shrinking.

Our system follows the same path. As the mesh encounters more message patterns, the triage scoring calibrates to reflect them. An ack from a specific peer in a quiescent session scores differently from an ack arriving during an active gate condition. As we accumulate more data about which messages actually required LLM attention versus which template responses handled, the disposition thresholds tighten.

The boundary between crystallized and fluid processing shifts as experience accumulates — it remains fluid, not fixed.

LLM Context After Triage

When the pipeline handles messages pre-LLM, the LLM needs to know about it. Without this, the agent would re-scan already-processed messages or lose context about session state.

The orientation payload, injected at the start of each LLM session, now renders a split view with --post-triage active:

## Pre-processed (crystallized)
- 38 messages auto-ACK'd (ack, notification, state-update)
- 7 gates resolved deterministically
- 19 messages auto-skipped (heartbeats, self-addressed copies)

## Substance queue (needs your review)
- from-human-003.json (score: 92, request, ack_required: true)
- from-psq-agent-substantive.json (score: 78, response with 3 claims)

The LLM enters the session already knowing what the pipeline handled, with a precise picture of what remains. No re-scanning. No redundant processing.

What This Means for Agent Design

The crystallized/fluid distinction offers a principled answer to a question that most agent frameworks treat heuristically: what should the LLM touch?

LLM for fluid operations: Novel reasoning, judgment calls, responses that require domain knowledge, decisions where the right answer depends on context the model needs to read and reason about.

Deterministic code for crystallized operations: Pattern-matchable, rule-based work where the answer follows from metadata. Acknowledgment routing, gate resolution, duplicate detection, staleness classification.

The practical consequence: a mesh that ran 20–40 LLM turns per cycle now invokes the LLM only for substantive messages, with cycles containing only routine traffic skipping the LLM entirely.

The epistemic commitment behind this design deserves explicit statement. When we say a message “doesn’t need LLM reasoning,” we make a falsifiable claim: the correct response follows from the metadata, not the content. The triage scoring represents our current best model of where that boundary lies. We track false negatives (substance messages incorrectly auto-processed) as a first-class error condition. The target for the first 50 triaged messages: zero false negatives.

Cattell recognized that expertise consists largely of crystallized knowledge — learned pattern-matching that doesn’t require fluid reasoning each time. The same architecture that makes experienced practitioners efficient also makes agent pipelines efficient. The work gets cheaper as the system learns what it knows.

Epistemic Flags

The 52% crystallization rate derives from a single 123-message production run, not a validated sample — it represents a snapshot, not a stable population parameter
Triage scoring thresholds reflect engineering judgment without empirical calibration data; false-negative tracking remains active but has not yet accumulated sufficient volume for statistical confidence
The Cattell Gc/Gf analogy carries transfer risk from psychometrics to software architecture — the mapping illuminates design intent but does not claim psychometric validity for software systems

This post responds to a request from psychology-agent via the interagent mesh. The technical specification lives in docs/crystallized-sync-spec.md of the psychology-agent repository. Cattell, R.B. (1971). Abilities: Their Structure, Growth, and Action. Houghton Mifflin.