Your AI Isn't Going Off the Rails. It Never Had Any.

Sun, 31 May 2026 00:00:00 +0000

The most common complaint I hear from people using AI is some version of the same sentence: “It goes off the rails.” It drifts. It forgets what I told it. It invents things. It has no direction.

I understand the frustration, but the phrasing hides the actual problem. Going off the rails implies there were rails to begin with. There weren’t. A model in a blank chat window has no memory of who you are, no rules about how it should behave, and no defined process for doing the work. It is not drifting away from a plan. There was never a plan for it to drift from.

So when people tell me their AI lacks direction, what they are really describing is a missing system. The fix is almost never a better prompt. It is more structure. And the amount of structure you give the AI is the single biggest variable in whether it behaves like a reliable partner or a clever stranger who resets every morning.

The clearest way I have found to explain this is as three tiers. Each one solves a bigger slice of the “no direction” problem than the last.

Tier 1: Claude Chat — The Conversation

This is where almost everyone starts, and where most people stay. You open a chat window, you type, it responds. Each conversation is mostly a blank slate.

The defining trait of this tier is amnesia. A new chat forgets everything. Whatever context you want the model to have, you provide manually, in the prompt, every single time. The direction comes entirely from you. The model cannot touch your files, run anything, or reach your systems. It talks, and that is all it does.

This is genuinely useful. For a quick question, a brainstorm, a first draft, or thinking out loud, a chat window is fast and frictionless. But that same lack of structure is exactly why it feels directionless on anything bigger. Nothing constrains it. There are no rules and no persistent goal beyond your last message. If your prompt is vague, the output is vague. It is a brilliant intern with total amnesia, and you are re-explaining the entire job every morning.

People at this tier blame the model. The model is rarely the issue. The issue is that nothing is holding it on a track, because there is no track.

Tier 2: Claude Code — The Operator

The second tier is a different kind of tool entirely. Claude Code is an agent that lives in your terminal. It does not just talk about work — it does the work. It reads and writes real files, runs commands, searches the web, and operates on your actual environment instead of an imagined one.

Two things change the moment you move here.

First, it gets a working memory of your project. A CLAUDE.md file holds persistent instructions for that codebase or project, so the model arrives already knowing the conventions, the goals, and the rules you have written down. You stop re-explaining the project on every session.

Second, and more importantly, it works in a loop: act, observe the result, correct. It writes a file and sees whether the change worked. It runs a command and reads the actual output. That feedback loop is what kills the drift. The model is not imagining what might happen — it is looking at what did happen and adjusting. Operating on real artifacts instead of guesses is most of the discipline.
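The act, observe, correct loop can be sketched in a few lines of Python. This is only an illustration of the loop's shape, not how Claude Code is implemented. The names `run_loop`, `attempt`, and `check` are hypothetical, and the toy task (producing text that ends with a newline) stands in for writing a file and reading the real result.

```python
from typing import Callable, Optional, Tuple


def run_loop(
    attempt: Callable[[Optional[str]], str],
    check: Callable[[str], Optional[str]],
    max_steps: int = 5,
) -> Tuple[str, int]:
    """Act, observe the actual result, and correct until the check passes.

    `attempt` receives the feedback from the previous observation (None on
    the first try). `check` returns None on success or a message describing
    what went wrong.
    """
    feedback: Optional[str] = None
    for step in range(1, max_steps + 1):
        result = attempt(feedback)   # act
        feedback = check(result)     # observe what actually happened
        if feedback is None:
            return result, step      # nothing left to correct
    raise RuntimeError(f"still failing after {max_steps} steps: {feedback}")


def attempt(feedback: Optional[str]) -> str:
    # The first try forgets the trailing newline; the corrected try adds it.
    return "hello" if feedback is None else "hello\n"


def check(text: str) -> Optional[str]:
    return None if text.endswith("\n") else "missing trailing newline"


result, steps = run_loop(attempt, check)
print(repr(result), steps)
```

The point of the sketch is that the second attempt is driven by an observed failure, not by a guess about what might have gone wrong.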

The honest limitation: this discipline is per-project and largely manual. You set up each project’s instructions yourself. The memory does not follow you from one project to the next, and there is no consistent persona or process spanning everything you do. It is a sharp, capable operator — but one you have to brief fresh for every new job.

Tier 3: AI Infrastructure

The third tier is the one that actually fixes “no direction” at the root, because it stops treating each session as a fresh start.

AI Infrastructure is a system that wraps Claude Code and gives it a permanent identity, a rule set, a knowledge base, and a defined process. The jump from Tier 2 to Tier 3 is the difference between hiring a contractor and building an operations department. This post itself is a small example: it was written inside an AI Infrastructure that already knew my blog’s voice, my formatting conventions, and where the file should be saved, without my having to say any of it.

Three things work together here, and they are what make drift structurally hard.

The first is persistent identity and memory. The infrastructure does not forget who I am, what I work on, or how I want things done. When I correct it, the correction sticks across every future session, not just the current chat. The knowledge lives in files I own, not inside a conversation that disappears when I close the tab.

The second is a defined process. Every non-trivial request gets classified and routed through a structured sequence: understand the request, plan the approach, do the work, verify it against explicit criteria. The model cannot freewheel, because a process governs the response before it starts. That is the literal opposite of going off the rails — the rails are built in.
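A rough sketch of what "classified, routed, and verified against explicit criteria" could look like in code. Everything here is an assumption made for illustration: the `Task` type, the `is_trivial` rule, and the stubbed planning step are hypothetical and do not come from any real infrastructure.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Criterion = Callable[[str], bool]


@dataclass
class Task:
    request: str
    # Explicit, named checks the finished output must pass.
    criteria: Dict[str, Criterion] = field(default_factory=dict)


def is_trivial(request: str) -> bool:
    # Hypothetical rule: a short question skips the full process.
    return len(request.split()) <= 5 and request.strip().endswith("?")


def failed_criteria(output: str, criteria: Dict[str, Criterion]) -> List[str]:
    """Return the names of the criteria that the output does not satisfy."""
    return [name for name, passes in criteria.items() if not passes(output)]


def handle(task: Task, do_work: Callable[[str], str]) -> str:
    """Route a task: trivial ones go straight through, the rest get verified."""
    if is_trivial(task.request):
        return do_work(task.request)
    plan = f"plan: {task.request}"                   # understand and plan (stubbed)
    output = do_work(plan)                           # do the work
    failed = failed_criteria(output, task.criteria)  # verify
    if failed:
        raise ValueError(f"output failed criteria: {failed}")
    return output


task = Task(
    request="Draft release notes for the team wiki",
    criteria={"not_empty": lambda text: bool(text.strip())},
)
result = handle(task, do_work=lambda plan: plan.upper())
print(result)
```

The criteria are data attached to the task, so "done" is decided by named checks rather than by whatever the model happens to produce.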

The third is context routing. Instead of me pasting the right background into every prompt, the system pulls the relevant knowledge automatically based on what I am doing. The model arrives oriented, every time.
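Context routing can be as simple as a keyword lookup. The sketch below assumes a hand-maintained index from topic keywords to knowledge files. The file names and the `route_context` function are hypothetical, and a real system might use something far richer than word matching.

```python
import re
from typing import Dict, List

# Hypothetical index: keyword -> knowledge files worth loading for that topic.
KNOWLEDGE_INDEX: Dict[str, List[str]] = {
    "blog": ["voice.md", "formatting.md"],
    "invoice": ["billing-rules.md"],
}


def route_context(
    request: str, index: Dict[str, List[str]] = KNOWLEDGE_INDEX
) -> List[str]:
    """Return the knowledge files whose keyword appears in the request."""
    words = set(re.findall(r"[a-z]+", request.lower()))
    selected: List[str] = []
    for keyword, files in index.items():
        if keyword in words:
            selected.extend(files)
    return selected


files = route_context("Write a blog post about session hooks.")
print(files)
```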

None of this makes the underlying model smarter. It makes the environment around the model disciplined. That is the whole trick.

The Same Symptom, Mapped to the Fix

When someone describes their AI as directionless, the specific complaint usually tells you exactly which tier they are stuck on and what would move them up.

If it forgets what you told it, you are in a chat window and you need persistent instructions — that is the move to Claude Code.

If it hallucinates instead of using your real data, you need to let it read your actual files — again, the move to an operator that touches your environment.

If you keep re-explaining your preferences across projects, you have outgrown per-project memory and need persistent identity — the move to infrastructure.

And if it has no consistent process from one task to the next, you need a defined algorithm governing how every request gets handled — the same move.

Notice the pattern. Every one of these is solved by adding structure, not by writing a cleverer sentence into the prompt box.

The Real Shift

Direction is not something you nag the AI for inside each prompt. It is something you build into the system once.

That reframing is the entire jump from chatting to infrastructure. A chat window is equally capable on day one and day three hundred, because nothing accumulates. An operator gets more useful per project, as long as you keep briefing it. An infrastructure compounds — every rule you add, every preference it learns, every process you refine makes the next session start further ahead than the last.

If your AI feels like it has no direction, it is not malfunctioning. It is doing exactly what an unstructured system does. The rails were never the model’s job to build. They are yours.

Stop Prompt Engineering. Start Building Infrastructure.

Sun, 19 Apr 2026 00:00:00 +0000

Last week I opened a terminal, typed six words, and watched PAI spend the next three minutes processing a set of handwritten study notes exported from my reMarkable tablet. It converted the file format, extracted key concepts, generated structured review questions, cross-referenced my existing knowledge base, and saved everything to the correct directories in Obsidian — organized by module, tagged correctly, ready to use. I did not write a prompt. I did not explain what certification I was studying. I did not describe the output structure I wanted. I just named the module.

Eighteen months ago, that same task would have started with a paragraph explaining what the reMarkable export format was, what the certification covered, how I organized notes in Obsidian, what level of detail I wanted in the summary, and what format the quiz questions should follow — and another paragraph if I wanted the output saved to a specific location. Every single session. From scratch.

That gap is the entire argument for building an AI harness instead of staying in a chat window.

The Chat Window Tax

Prompt engineering emerged as a discipline because LLMs are stateless by default. Every conversation starts with a blank model. If you want the model to know who you are, what you work on, how you like your outputs formatted, and which approach you prefer for recurring problems — you have to tell it. Every time.

That is a tax. Not a feature. A tax.

The people who got good at prompt engineering got skilled at paying that tax efficiently — writing shorter context dumps, using system prompts in API playgrounds, building prompt libraries they paste from. It helped. But it never made the tax go away. It just made each payment slightly cheaper.

In 2026, paying that tax is a choice. The tools exist to stop paying it entirely.

What a Harness Actually Does

A harness is infrastructure wrapped around your AI runtime. In my case, that is PAI — Personal AI Infrastructure — running on top of Claude Code in the terminal. The architecture has three layers.

Memory is persistent context that survives across sessions. PAI knows my role (HRIS analyst), my platform (Oracle HCM Cloud), my Oracle triage methodology, my blog’s writing conventions, my active projects, and my preferences for output formats. None of that gets re-entered. It gets loaded automatically at session start.
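A minimal sketch of file-based memory, assuming it is stored as a folder of markdown files that get concatenated at session start. The `load_memory` function and the file names are hypothetical and are not PAI's actual schema. The demo uses a throwaway directory so it runs anywhere.

```python
import tempfile
from pathlib import Path


def load_memory(memory_dir: Path) -> str:
    """Concatenate every markdown file in memory_dir into one context string."""
    sections = []
    for path in sorted(memory_dir.glob("*.md")):
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {path.stem}\n{body}")
    return "\n\n".join(sections)


# Demo: a temporary folder stands in for a real memory directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "role.md").write_text("HRIS analyst\n", encoding="utf-8")
    (root / "platform.md").write_text("Oracle HCM Cloud\n", encoding="utf-8")
    context = load_memory(root)

print(context)
```

Because the memory is plain files, updating what the system "knows" is an edit to a file, not a paragraph retyped into a prompt.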

Skills are pre-built, parameterized workflows. When I say “process my study notes,” a skill handles that — reading from the right directory, converting the format, saving to the right Obsidian path, cross-referencing the knowledge base. The skill is the prompt, written once, tested, improved over time. I do not craft it fresh every time.
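To make "the skill is the prompt, written once" concrete, here is a hedged sketch of a parameterized workflow. The class name, directory paths, and step wording are all invented for illustration. The only parameter the caller supplies is the module number, and everything else is baked into the skill.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import List


@dataclass(frozen=True)
class StudyNotesSkill:
    """Hypothetical skill: every path and step name here is illustrative."""

    inbox: PurePosixPath = PurePosixPath("inbox/remarkable")
    vault: PurePosixPath = PurePosixPath("vault/certification")

    def plan(self, module: int) -> List[str]:
        """Expand one parameter into the full, repeatable list of steps."""
        source = self.inbox / f"module-{module}.docx"
        target = self.vault / f"module-{module:02d}"
        return [
            f"convert {source} to markdown",
            f"extract key concepts to {target / 'summary.md'}",
            f"generate review questions to {target / 'quiz.md'}",
            "cross-reference the knowledge base",
        ]


steps = StudyNotesSkill().plan(module=4)
for step in steps:
    print(step)
```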

The Algorithm is a structured execution framework. When the work is complex — multi-step, multi-file, non-trivial — PAI runs through a defined process: observe, think, plan, build, execute, verify, learn. The output is consistent because the process is consistent.
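The seven phases named above can be modeled as an ordered enumeration that every complex request walks through. This is a sketch of the idea only. The `Phase` enum mirrors the phase names in the text, while `run_phases` and its handlers are hypothetical.

```python
from enum import Enum
from typing import Callable, Dict, List


class Phase(Enum):
    OBSERVE = "observe"
    THINK = "think"
    PLAN = "plan"
    BUILD = "build"
    EXECUTE = "execute"
    VERIFY = "verify"
    LEARN = "learn"


def run_phases(
    request: str, handlers: Dict[Phase, Callable[[str], str]]
) -> List[str]:
    """Run every phase in its defined order and return a log of what happened."""
    log: List[str] = []
    for phase in Phase:  # Enum members iterate in definition order
        handler = handlers.get(phase)
        note = handler(request) if handler else "no handler registered"
        log.append(f"{phase.value}: {note}")
    return log


log = run_phases(
    "process module 4",
    {Phase.VERIFY: lambda request: f"checked output of '{request}'"},
)
for line in log:
    print(line)
```

The order is fixed by the enum rather than chosen per request, which is where the consistency comes from.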

Taken together, these three things mean the model is never starting from zero. It arrives at each session already oriented.

The Token Economy Hidden Inside the Infrastructure

There is a practical angle to this that does not get talked about enough: token consumption.

Every message in a chat session burns tokens — your context, the model’s reasoning, the output, and whatever you paste in to re-establish state. The longer and more complex the session, the faster you burn toward usage limits. When you are re-explaining your role, your project, and your preferences at the start of each conversation, you are spending tokens on re-orientation, not on actual work.

A harness changes the math.

PAI loads persistent context at session start through hooks, but that context comes from compact, structured files the runtime injects, not from long prompt blocks I retype and the model has to untangle. The model arrives oriented. The working token budget goes toward the task.

More importantly, PAI externalizes logic that would otherwise live inside the conversation. The skills are pre-written workflows. The Algorithm is a structured execution framework. The session hooks handle routing and context injection. A significant portion of what would normally require the model to think its way through — “what directory does this go in?”, “what format does this certification use?”, “what’s the right next step in this process?” — is already answered in scripts and configuration files that run before the model responds.

That is not just more efficient. It changes your usage ceiling. When the model is not spending context budget on re-orientation or derivable decisions, more of each session goes toward meaningful work. You hit limits later, do more per session, and run longer chains of complex tasks without interruption.
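A back-of-the-envelope version of that math, with loudly hypothetical numbers. None of the figures below are measurements. The sketch assumes that loaded context still counts against the budget but is more compact than a pasted context dump.

```python
SESSION_LIMIT = 100_000    # hypothetical per-session token budget
PASTED_CONTEXT = 3_000     # hypothetical: re-explaining role, project, preferences
LOADED_CONTEXT = 1_200     # hypothetical: compact structured files loaded by a hook
SESSIONS_PER_MONTH = 60    # hypothetical usage


def work_budget(session_limit: int, orientation_overhead: int) -> int:
    """Tokens left for the actual task once orientation is paid for."""
    if orientation_overhead > session_limit:
        raise ValueError("orientation overhead exceeds the session limit")
    return session_limit - orientation_overhead


chat_budget = work_budget(SESSION_LIMIT, PASTED_CONTEXT)
harness_budget = work_budget(SESSION_LIMIT, LOADED_CONTEXT)
monthly_saving = (PASTED_CONTEXT - LOADED_CONTEXT) * SESSIONS_PER_MONTH

print(chat_budget, harness_budget, monthly_saving)
```

Under these assumed figures the per-session gain is modest, and the saving mostly shows up when multiplied across many sessions.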

Prompt engineering optimizes the prompt. Infrastructure optimizes the budget.

CLI vs. Chat: It Is Architecture, Not Preference

This is the part that took me a while to articulate. The preference for CLI over chat window is not aesthetic — it is structural.

A chat window is a conversation interface. Conversations are ephemeral. They have no persistent state, no programmable hooks, no way to inject context at session start, no way to trigger workflows, no way to store outputs in structured memory. The UX is polished. The architecture is a dead end for anything requiring continuity.

A CLI is a programmable runtime. Session start hooks can load context files. Commands can trigger skills. Outputs can write back to memory. Different agents can be spawned with different contexts and run in parallel. The AI operates inside an environment you built, not inside a box you are renting.
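The sketch below shows what "programmable" means here, using a toy runtime with session hooks and named commands. It is not Claude Code's hook configuration or PAI's code. The `Runtime` class, the `session_start` event name, and the `notes` command are hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List


class Runtime:
    """Toy programmable runtime: hooks fire on events, commands trigger skills."""

    def __init__(self) -> None:
        self.context: List[str] = []
        self._hooks: DefaultDict[str, List[Callable[["Runtime"], None]]] = (
            defaultdict(list)
        )
        self._commands: Dict[str, Callable[..., Any]] = {}

    def on(self, event: str) -> Callable:
        """Decorator that registers a hook for the given event."""
        def register(hook: Callable[["Runtime"], None]) -> Callable:
            self._hooks[event].append(hook)
            return hook
        return register

    def command(self, name: str) -> Callable:
        """Decorator that registers a named command."""
        def register(func: Callable[..., Any]) -> Callable:
            self._commands[name] = func
            return func
        return register

    def start_session(self) -> None:
        for hook in self._hooks["session_start"]:
            hook(self)

    def run(self, name: str, *args: Any) -> Any:
        return self._commands[name](self, *args)


runtime = Runtime()


@runtime.on("session_start")
def load_preferences(rt: Runtime) -> None:
    rt.context.append("preferences loaded")


@runtime.command("notes")
def process_notes(rt: Runtime, module: int) -> str:
    return f"processing module {module} with {len(rt.context)} context item(s)"


runtime.start_session()
result = runtime.run("notes", 4)
print(result)
```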

That difference compounds. A chat window is equally capable on day one and day three hundred. A harness gets more capable every time you add a skill, improve the memory, or refine the algorithm.

Before and After: The Same Problem, Two Environments

Chat window, eight months ago:

“I have study notes from a certification I’m working through, exported as a Word document from my tablet. I organize my notes in Obsidian under a folder structure by certification and module number. I need you to convert the content to clean markdown, extract the key concepts as a structured summary, generate quiz questions with answers, and format everything to match my existing note structure. The certification is [name], this is module [N], and here’s an example of how my other notes look: [paste example]…”

Then the session ended. Next time I had notes to process — same context dump, from scratch.

With PAI, today:

“Process my study notes for module 4.”

PAI already knows the certification, the Obsidian directory structure, the naming conventions, the quiz format, and which knowledge base to cross-reference. Processing starts immediately. The notes land in the right place in the right format.

The eight-month gap between those two experiences is not better prompting. It is infrastructure.

2026: Where the Power Users Went

The practitioners who were deep into prompt engineering two years ago have largely moved on — not to better prompts, but to better systems. They are building skills, writing memory schemas, wiring session hooks, running structured execution algorithms on complex work. The prompt engineer persona is being quietly replaced by the AI infrastructure builder.

This is not about being technical. It is about thinking one level up. Instead of asking how to get a better response to this prompt, you ask what a system would need to know to handle this reliably, every time.

Your Knowledge Doesn’t Live in the Model

One of the less obvious benefits of building infrastructure rather than relying on chat conversations: your knowledge is not locked to any LLM.

When everything lives in a chat window, switching models means starting over. Your context, your conversation history, your accumulated session knowledge — gone. The model you were using knew who you were because you kept telling it. A different model knows nothing.

With PAI, the knowledge lives in files you own. The memory is markdown on your machine. The skills are scripts in a directory. The algorithm is a structured process your runtime executes. None of it is stored inside Claude, or any other model. The AI is the engine, not the warehouse.

That distinction matters more than it sounds. LLMs are evolving fast. A model that is the best choice today may not be the best choice in six months. If your entire working context is entangled with one provider’s chat history, migration is painful. If your context lives in a portable, file-based system, switching the underlying model is a configuration change — not a rebuild.
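One way to picture "switching the model is a configuration change" is to keep the memory outside the model and select the backend by a config key. The two backends below are stand-in stubs that only echo their input. They are not real provider clients, and the `Model` protocol, `BACKENDS` table, and `ask` function are hypothetical.

```python
from typing import Dict, Protocol, Type


class Model(Protocol):
    def complete(self, prompt: str) -> str: ...


class StubModelA:
    def complete(self, prompt: str) -> str:
        return f"[model-a] {prompt}"


class StubModelB:
    def complete(self, prompt: str) -> str:
        return f"[model-b] {prompt}"


# Hypothetical registry: the config names a backend, the rest stays the same.
BACKENDS: Dict[str, Type] = {"model-a": StubModelA, "model-b": StubModelB}


def ask(config: Dict[str, str], memory: str, request: str) -> str:
    """Send the same file-based memory to whichever model the config names."""
    model: Model = BACKENDS[config["model"]]()
    return model.complete(f"{memory}\n\n{request}")


memory = "role: HRIS analyst"  # would come from files, not from the model
first = ask({"model": "model-a"}, memory, "Summarize module 4.")
second = ask({"model": "model-b"}, memory, "Summarize module 4.")
print(first)
print(second)
```

Only the config value changes between the two calls, while the memory string is passed through untouched.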

I run PAI on Claude today because it is the best fit for how I work right now. But the memory schema, the skill library, the algorithm — all of it would transfer to a different model without losing a session’s worth of context. That portability is a deliberate design choice, and it is one of the most underappreciated properties of building on open infrastructure rather than inside a walled chat product.

Credit Where It’s Due

PAI did not emerge from a vacuum. A significant part of the thinking behind it, the idea that AI should be augmenting structured, intentional human systems rather than replacing ad-hoc conversations, traces directly to the work of Daniel Miessler.

Daniel has been articulating the case for AI infrastructure thinking longer than most. His Fabric project, his writing on augmented intelligence, and his broader framing of what it means to build systems that extend human capability rather than just answer questions — all of it shaped how PAI was conceived and how it continues to evolve.

The shift from “better prompts” to “better systems” is not a new idea. It just needed enough tooling to become practical. Daniel saw that early.

Where to Start

PAI is open-source. Claude Code is free to start — it is Anthropic’s official CLI, available to any Claude user. The distance between using AI in a chat window and running it inside a harness is smaller than it looks, and the compounding return starts from the first session where PAI remembers something you did not have to re-enter.

If you are still re-explaining yourself every time you open a new tab, that is the problem worth solving.