Why I Built PURISTA 3.0 Around AI Harness

AI demos are easy.

Owning AI in an enterprise system is not.

That difference is the reason I built AI Harness and why PURISTA 3.0 brings it into the framework.

I have seen the same pattern many times. A team builds a promising prototype. The first demo works. The model answers. The workflow looks impressive. Then the serious questions start:

Where does the data go?
Who can approve an action?
Can we switch providers?
What exactly happened in this run?
Can we test this without spending money on model calls?
Can we explain the result to operations, security, and management?
Can we integrate it with the systems we already own?

Those questions are not bureaucracy. They are the difference between an experiment and a product.

PURISTA was always built around this kind of pressure. The original idea was simple: business capabilities should be explicit, typed, traceable, and independent of the infrastructure that happens to run them. A command should not know whether the application runs as one process, behind HTTP, through a message bridge, or split across services. The business logic should stay understandable.

With AI, that idea became more important, not less.

The Problem I Wanted to Solve

Most enterprise AI work starts outside the real system.

It begins in a notebook, a provider SDK example, an automation tool, a hosted agent builder, or a small script that sits next to the application. That is fine for learning. It is dangerous as the architecture.

The moment an AI workflow becomes part of a business process, it stops being a prompt problem. It becomes a software architecture problem.

It needs boundaries. It needs validation. It needs state. It needs tools with permissions. It needs review gates. It needs observability. It needs a way to fail without destroying trust. It needs a way to change providers without rewriting the whole product. It needs tests that senior developers can run in CI and decision makers can trust as evidence.

That is the gap I wanted PURISTA 3.0 to close.

Not by hiding AI behind magic.

By making AI part of the same service model as the rest of the system.

What Changed in PURISTA 3.0

PURISTA 3.0 adds three pieces that belong together:

Queues for explicit asynchronous work.
Streams for long-running or incremental output.
AI agents through AI Harness, so model-driven work can be typed, tested, observed, and operated inside the application boundary.

This is not just a feature list. It is one operating model.

Modern enterprise software rarely behaves like a simple request and response. A user starts something. The system validates it. Some work can happen immediately. Some work must run in the background. Some results arrive gradually. Some steps need approval. Some steps call external systems. Some steps may involve an LLM, but the responsibility for the workflow still belongs to the application.

PURISTA 3.0 gives me a cleaner way to model that.

Queues: Work That Should Not Block the User

Many important business processes cannot finish inside one HTTP request.

Generating a report, processing a document, checking a policy, syncing with another platform, preparing a customer-specific result, coordinating an AI workflow, or handling a long-running backend task should not force the user to wait until everything is complete.

The usual shortcut is to hide this behind adapter code. A command writes something to Redis, publishes to NATS, or calls a worker directly. It works, but the queue becomes invisible in the service contract. The business capability depends on background work, but the architecture does not say that clearly.

In PURISTA 3.0, queues become first-class service capabilities.

A service can declare that it can enqueue work. A queue worker can be generated and tested as part of the service version. The runtime can use adapters such as NATS JetStream or Redis lists, but the business code does not have to become infrastructure code.

For a CTO, that means the architecture is easier to reason about.

For a senior developer, it means asynchronous work is not scattered through helper functions and hidden client calls.

For a CEO or product owner, it means the system can stay responsive while still doing serious work behind the scenes.

The point is not “PURISTA now has queues”. The point is: background work becomes visible, typed, and owned.
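
To make "visible, typed, and owned" concrete, here is a minimal sketch of the idea. It is not PURISTA's queue API: ReportJob, QueueAdapter, InMemoryQueueAdapter, requestReport, and runWorker are hypothetical names invented for this illustration. The in-memory adapter only stands in for a real broker adapter such as NATS JetStream or Redis lists.

```typescript
// Hypothetical sketch, not PURISTA's API. All names are illustrative.

// The payload the service promises to accept on this queue.
interface ReportJob {
  reportId: string;
  customerId: string;
}

// The only thing business code knows about the queue infrastructure.
interface QueueAdapter<T> {
  enqueue(job: T): void;
  dequeue(): T | undefined;
}

// In-memory stand-in for a real broker adapter.
class InMemoryQueueAdapter<T> implements QueueAdapter<T> {
  private readonly jobs: T[] = [];

  enqueue(job: T): void {
    this.jobs.push(job);
  }

  dequeue(): T | undefined {
    return this.jobs.shift();
  }
}

// Validates the payload at the boundary, so malformed jobs never reach a worker.
function parseReportJob(input: unknown): ReportJob {
  const candidate = input as Partial<ReportJob> | null;
  if (
    !candidate ||
    typeof candidate.reportId !== "string" ||
    typeof candidate.customerId !== "string"
  ) {
    throw new Error("Invalid ReportJob payload");
  }
  return { reportId: candidate.reportId, customerId: candidate.customerId };
}

// Command side: validates, enqueues, and returns immediately.
function requestReport(
  queue: QueueAdapter<ReportJob>,
  input: unknown,
): { accepted: true; reportId: string } {
  const job = parseReportJob(input);
  queue.enqueue(job);
  return { accepted: true, reportId: job.reportId };
}

// Worker side: drains the queue and returns the ids it processed.
function runWorker(
  queue: QueueAdapter<ReportJob>,
  handle: (job: ReportJob) => void,
): string[] {
  const processed: string[] = [];
  let job = queue.dequeue();
  while (job !== undefined) {
    handle(job);
    processed.push(job.reportId);
    job = queue.dequeue();
  }
  return processed;
}
```

In this sketch the command and the worker share one typed contract, and swapping the adapter does not touch either of them. That is the property the section describes, shown here with plain interfaces.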

Streams: Progress Instead of Silence

The second part is streaming.

Silence is bad product behavior. When a system is doing meaningful work, users need feedback. Operators need signals. Developers need a shape they can test.

This is especially true for AI, but not only for AI. Search, analysis, export, document processing, live status updates, and long calculations all benefit from returning progress and partial results before the final answer is ready.

PURISTA 3.0 adds streams as a service artifact. They can have schemas, types, tests, and a place in the service builder next to commands, subscriptions, queues, and agents.

That matters because streaming is often where architecture gets messy. A team starts with clean service contracts, then adds one special endpoint for streaming, then another one, then some custom runtime state, then a different testing pattern. After a while the most important user experience lives outside the framework discipline.

I wanted streams to stay inside the same mental model.

The user sees progress. The developer keeps contracts. The system remains observable.
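
A small sketch of what "a shape they can test" can mean in practice. This is not PURISTA's stream API. ExportChunk, exportRows, and collect are hypothetical names, and the discriminated union stands in for a real schema.

```typescript
// Hypothetical sketch, not PURISTA's stream API. All names are illustrative.

// Every chunk the stream may emit. The union acts as the stream's contract.
type ExportChunk =
  | { kind: "partial"; rows: string[] }
  | { kind: "progress"; percent: number }
  | { kind: "done"; totalRows: number };

// Emits partial results and progress before the final chunk.
async function* exportRows(
  rows: string[],
  batchSize: number,
): AsyncGenerator<ExportChunk> {
  if (batchSize < 1) {
    throw new Error("batchSize must be at least 1");
  }
  for (let offset = 0; offset < rows.length; offset += batchSize) {
    yield { kind: "partial", rows: rows.slice(offset, offset + batchSize) };
    const sent = Math.min(offset + batchSize, rows.length);
    yield { kind: "progress", percent: Math.round((sent / rows.length) * 100) };
  }
  yield { kind: "done", totalRows: rows.length };
}

// Test helper: collects a whole stream so assertions can inspect its shape.
async function collect<T>(stream: AsyncIterable<T>): Promise<T[]> {
  const chunks: T[] = [];
  for await (const chunk of stream) {
    chunks.push(chunk);
  }
  return chunks;
}
```

Because the chunk types are explicit, a test can assert on the order and content of what the user would see, instead of only checking the final result.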

AI Harness: AI That Belongs to the Application

AI Harness is the part I care about most in this release.

I built it because I do not believe enterprise AI should be reduced to prompt calls spread through an application. I also do not believe every company should hand the shape of its AI workflows to a hosted black box before it understands what it is building.

AI Harness is my answer to a practical question:

How do I build AI capabilities that are still normal, owned, reviewable TypeScript application code?

The core ideas are direct:

Agents are typed LLM conversation loops.
Workflows are application-owned orchestration around one or more agents.
Tools are explicit capabilities, not random functions the model can call.
Skills are reusable instruction and domain guidance.
State, traces, run events, sandboxing, and logs belong to the runtime boundary.
Provider adapters keep OpenAI, Anthropic, Amazon Bedrock, and Azure AI Foundry behind a replaceable interface.
Evals and scripted test models let teams test behavior without turning CI into a provider bill.

This is important for enterprise teams because the model is not the product. The business process is the product.

The model may classify, summarize, plan, extract, compare, or propose. But the application still decides what is allowed, what needs review, what gets stored, what gets exposed to users, and what counts as a valid result.

That is why PURISTA 3.0 brings agent authoring into the service builder.

An agent can have input and output schemas. It can declare which model handle it needs. It can expose an endpoint. It can use command tools, child agents, skills, and sandbox policies. It can be tested with a scripted model provider. It can emit run events and traces.

The important shape looks like this:

// Declare the agent on the service builder, next to the service's other artifacts.
export const triageTicketAgentBuilder = supportV1ServiceBuilder
  .getAgentQueueBuilder("triageTicket", "Classifies support tickets by urgency")
  // Typed contracts for what goes in and what comes out.
  .addPayloadSchema(supportV1TriageTicketInputPayloadSchema)
  .addOutputSchema(supportV1TriageTicketOutputPayloadSchema)
  // Declare the model handle this agent needs.
  .addModel("primary", {
    model: "support-triage",
    capabilities: ["object"] as const,
    defaults: { temperature: 0 },
  })
  .setRunFunction(async (context) => {
    // Ask the declared model for a structured object.
    const result = await context.harness.models.primary.object(
      {
        messages: [
          {
            role: "user",
            content: `Classify ticket ${context.payload.ticketId}: ${context.payload.text}`,
          },
        ],
        schema: supportV1TriageTicketJsonSchema,
      },
      context.signal,
    );

    // Validate the model output before it leaves the agent.
    return supportV1TriageTicketOutputPayloadSchema.parse(result.object);
  });

The syntax is not the story. The ownership is.

The agent is part of the service. Its inputs are known. Its output is validated. Its model access is declared. Its behavior can be tested. Its execution can be observed. It is not a hidden AI sidecar.
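
The "can be tested" part is the easiest to sketch in isolation. The following is a simplified illustration of a replaceable provider interface with a scripted test model, not the AI Harness API. ModelProvider, ScriptedModel, and classifyUrgency are hypothetical names, and a real scripted provider will likely look different.

```typescript
// Hypothetical sketch, not the AI Harness API. All names are illustrative.

interface ModelRequest {
  prompt: string;
}

// Application code depends on this interface, never on a vendor SDK.
interface ModelProvider {
  complete(request: ModelRequest): Promise<string>;
}

// Scripted model: replays canned responses in order and records the prompts it saw.
class ScriptedModel implements ModelProvider {
  readonly seenPrompts: string[] = [];
  private readonly responses: string[];
  private nextIndex = 0;

  constructor(responses: string[]) {
    this.responses = responses;
  }

  async complete(request: ModelRequest): Promise<string> {
    this.seenPrompts.push(request.prompt);
    const response = this.responses[this.nextIndex];
    if (response === undefined) {
      throw new Error("ScriptedModel ran out of responses");
    }
    this.nextIndex += 1;
    return response;
  }
}

type Urgency = "low" | "medium" | "high";
const URGENCIES: readonly Urgency[] = ["low", "medium", "high"];

// The application, not the model, decides what counts as a valid result.
async function classifyUrgency(
  model: ModelProvider,
  ticketText: string,
): Promise<Urgency> {
  const reply = await model.complete({
    prompt: `Classify urgency (low, medium, high): ${ticketText}`,
  });
  const normalized = reply.trim().toLowerCase();
  const match = URGENCIES.find((urgency) => urgency === normalized);
  if (match === undefined) {
    throw new Error(`Model returned an invalid urgency: ${normalized}`);
  }
  return match;
}
```

A test passes a ScriptedModel, asserts on the validated result and the recorded prompts, and never makes a network call. A production adapter would implement the same interface on top of a real provider.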

What This Solves for Decision Makers

For enterprise decision makers, the main value is not technical elegance.

The value is control.

If you are a CEO, AI adoption should not mean that every promising workflow becomes a dependency on a different SaaS tool, a different provider, and a different security story. You need speed, but you also need ownership. You need to know that the company can change direction without throwing away the whole implementation.

If you are a CTO, the concern is architecture. You need provider neutrality, clean integration points, observability, testability, and a way to keep teams from building one-off AI islands. You need a model where AI capabilities can be reviewed the same way other production capabilities are reviewed.

If you are a senior developer, the question is more concrete. Where do I put the code? How do I type the input? How do I validate the output? How do I mock the model? How do I prevent tool access from becoming a security problem? How do I debug one failed run without guessing?

PURISTA 3.0 does not remove those responsibilities. It gives them a place.

That is what framework work should do. It should not pretend complexity is gone. It should put complexity where it can be named, tested, and changed.

The Incident Response Example

One example I like is incident response.

Imagine a checkout outage during business hours. The business impact is real. The pressure is high. The wrong automated action can make things worse.

In a weak AI implementation, the system sends logs and alerts to a model and asks for a recommendation. That may produce an interesting answer, but it is not enough.

In a system I would trust, the workflow is more explicit:

deterministic commands load incident evidence
one agent analyzes signals
another agent assesses rollback risk
a coordinator combines the results
command tools retrieve runbooks and store the final brief
sandbox policy controls what the agent can inspect or execute
review gates decide what can happen before a mutation
traces and run events explain what happened
tests can run without live provider calls

That is the difference between an AI demo and an enterprise workflow.

The demo proves the model can answer.

The workflow proves the organization can own the answer.
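
The steps above can be sketched as ordinary application code. This is an illustration under simplifying assumptions, not the AI Harness workflow API. WorkflowDeps, RunEvent, IncidentBrief, and runIncidentWorkflow are hypothetical names, and agents are reduced to async functions so the example runs without a provider.

```typescript
// Hypothetical sketch, not the AI Harness workflow API. All names are illustrative.

interface RunEvent {
  step: string;
  detail: string;
}

interface IncidentBrief {
  signals: string;
  rollbackRisk: "low" | "high";
  proposedAction: "rollback" | "escalate";
  executed: boolean;
}

interface WorkflowDeps {
  loadEvidence: (incidentId: string) => Promise<string>; // deterministic command
  analyzeSignals: (evidence: string) => Promise<string>; // agent
  assessRollbackRisk: (evidence: string) => Promise<"low" | "high">; // agent
  approve: (proposedAction: string) => Promise<boolean>; // review gate
  rollback: () => Promise<void>; // the only mutation in this workflow
}

async function runIncidentWorkflow(
  incidentId: string,
  deps: WorkflowDeps,
): Promise<{ brief: IncidentBrief; events: RunEvent[] }> {
  const events: RunEvent[] = [];
  const record = (step: string, detail: string): void => {
    events.push({ step, detail });
  };

  const evidence = await deps.loadEvidence(incidentId);
  record("loadEvidence", evidence);

  // The two agents are independent, so they can run in parallel.
  const [signals, rollbackRisk] = await Promise.all([
    deps.analyzeSignals(evidence),
    deps.assessRollbackRisk(evidence),
  ]);
  record("analyzeSignals", signals);
  record("assessRollbackRisk", rollbackRisk);

  // Coordinator: application code, not the model, picks the proposal.
  const proposedAction = rollbackRisk === "low" ? "rollback" : "escalate";
  record("coordinate", proposedAction);

  // Review gate: no mutation happens without an explicit approval.
  let executed = false;
  if (proposedAction === "rollback") {
    const approved = await deps.approve(proposedAction);
    record("reviewGate", approved ? "approved" : "rejected");
    if (approved) {
      await deps.rollback();
      executed = true;
      record("rollback", "done");
    }
  }

  return { brief: { signals, rollbackRisk, proposedAction, executed }, events };
}
```

With scripted dependencies, a test can prove that a rejected review never triggers the rollback, and the recorded events show what happened in the run.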

Why I Brought This into PURISTA

I could have kept AI Harness completely separate.

There are reasons to do that. A standalone package is easier to explain. It can move independently. It can be used outside PURISTA.

But for PURISTA itself, I wanted a stronger statement.

AI is becoming part of backend systems. It will not stay in side projects, chat widgets, and automation scripts. It will sit inside business processes: support, operations, compliance, logistics, finance, procurement, engineering, customer service, knowledge work, and internal tooling.

If that is true, then AI needs to fit into the same architecture as the rest of the product.

That means queues when work should run later. Streams when work should be visible while it runs. Agents when model-driven reasoning is useful. Workflows when the application must coordinate multiple steps. Schemas when input and output need contracts. Traces when someone asks what happened. Tests when teams need confidence before deploying.

This is why PURISTA 3.0 matters to me.

It is not a release about chasing AI hype. It is a release about making AI boring enough to operate and explicit enough to trust.

That is where I want enterprise AI to go.

Not more magic.

More ownership.

If you want to see the direction, start with PURISTA, inspect the PURISTA repository, or look at AI Harness as the runtime foundation behind this work.