AI Harnesses and the Pareto Principle

Learning AI and building production applications with it has reinforced for me the idea behind the Pareto principle: systems follow power laws, with a few load-bearing things that never or very seldom change and many other things that change very rapidly.

Many people don't know how to distinguish the things that are transient from the things that are what Nassim Taleb would call "Lindy." Confusing the two gets costly, because you end up chasing your tail over a 1% or 2% improvement while neglecting the foundations and the fundamentals. With AI, for example, there's a lot of discussion around context windows and memory, and less discussion about how to work well with the models themselves.

If you look at where the companies that design these models are investing their time, you'll see that it's not in context window management, although that's part of the problem. And it's not in memory, although that's part of the problem too. What these companies are actually investing in is something called a harness, which is essentially a way of directing the raw power of an LLM. That is a good signal, because where the money flows inside these companies is a good indication of what is going to stay relevant for these models. Think of an LLM as a hammer, a screwdriver, or a raw piece of marble. It's a very powerful tool, but it isn't pointed in any particular direction, so it isn't that useful out of the box.

A harness gives the LLM, or the end user working with it, the structure needed to point the model at useful work. Meanwhile, you'll see people drowning in information about memory upgrades, context window upgrades, and other transient things that will be made obsolete by the next model update.

But what you don't see them focusing on is how to interact properly with an LLM and how to create systems that make the LLM maximally useful. The naive AI user simply expects it to do anything, to be an all-powerful tool, and of course that's not the case. The problem is that the output is close enough to useful in many domains that the casual user doesn't see the gap between what they can get from a prompt window in ChatGPT or Claude and what's possible when they really understand how to harness the model for their own systems and workflows.
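As a rough illustration of what a harness can look like, the sketch below wraps a model call with fixed instructions, an output check, and a retry that feeds the rejection back to the model. Everything in it is hypothetical: run_harness, make_fake_model, and is_integer are names made up for this example, and the fake model stands in for a real API call. Production harnesses are far more elaborate, so treat this as a toy under those assumptions.

```python
from typing import Callable

# Hypothetical stand-in for a real model call: takes a prompt, returns text.
ModelFn = Callable[[str], str]


def run_harness(
    model: ModelFn,
    task: str,
    validate: Callable[[str], bool],
    max_attempts: int = 3,
) -> str:
    """Wrap a raw model call with instructions, a validation check, and retries."""
    instructions = "Answer with a single integer and nothing else."
    feedback = ""
    for _ in range(max_attempts):
        prompt = f"{instructions}\n\nTask: {task}{feedback}"
        output = model(prompt).strip()
        if validate(output):
            return output
        # Tell the model what went wrong so the next attempt can correct it.
        feedback = (
            f"\n\nYour previous answer {output!r} was rejected. "
            "Follow the instructions exactly."
        )
    raise ValueError(f"no valid output after {max_attempts} attempts")


def make_fake_model() -> ModelFn:
    """Toy model (an assumption): vague at first, precise once given feedback."""
    def fake_model(prompt: str) -> str:
        if "rejected" in prompt:
            return "4"
        return "The answer is probably four."
    return fake_model


def is_integer(text: str) -> bool:
    """Accept only an optionally negative run of digits."""
    return text.lstrip("-").isdigit()


result = run_harness(make_fake_model(), "What is 2 + 2?", is_integer)
print(result)
```

The model itself never changes between the first attempt and the second. The only difference is the structure around it, which is the point of a harness.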

If I had to put my money on one reason most people use AI casually and then quit when they don't get the output they want, it's that they have no idea how to communicate with the model or use it effectively. Learning that takes a lot of trial and error and a lot of dead-end paths, but so does learning anything worthwhile.

It always pays to think in fundamentals.

And as I see more and more people experiment and give up, I am reminded that it always pays to think, as Christopher Alexander contended, in terms of the big movers in a system: the constraint that is upstream of all other constraints.