Prototype 1 - Geometry as Code
The floor plate generator worked as intended. Declare the site boundary, setback distances, grid spacing, and target area - the geometry derives automatically. Adjust any input and the output updates. For irregular sites it remains manageable, and the natural extension is obvious: setbacks could be read directly from a zoning API rather than typed in manually, making the model a live response to regulatory data rather than a designer's interpretation of it.
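The declare-inputs, derive-geometry pattern can be sketched in a few lines. This is an illustrative reduction, not the prototype itself: the site is simplified to a rectangle with a uniform setback, and every name and number here is an assumption.

```python
# Hypothetical sketch of "geometry as code": declare the inputs,
# derive the floor plate. Site reduced to a rectangle for brevity.

from dataclasses import dataclass

@dataclass
class SiteInputs:
    width: float        # site boundary, simplified to a rectangle (m)
    depth: float
    setback: float      # uniform setback distance (m)
    grid: float         # structural grid spacing (m)
    target_area: float  # target floor plate area (m^2)

def derive_floor_plate(s: SiteInputs) -> dict:
    """Derive a floor plate from the declared inputs."""
    # Apply setbacks to get the buildable envelope.
    buildable_w = s.width - 2 * s.setback
    buildable_d = s.depth - 2 * s.setback
    # Snap the envelope down to whole structural bays.
    plate_w = int(buildable_w // s.grid) * s.grid
    plate_d = int(buildable_d // s.grid) * s.grid
    area = plate_w * plate_d
    return {
        "width": plate_w,
        "depth": plate_d,
        "area": area,
        "meets_target": area >= s.target_area,
    }

plate = derive_floor_plate(
    SiteInputs(width=60, depth=40, setback=5, grid=7.5, target_area=1000)
)
```

Change any input and the output re-derives; swapping the hard-coded `setback` for a value fetched from a zoning API is exactly the extension described above.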
The harder question is aesthetics. Geometry as code handles programme and constraint well. It doesn't handle architecture. A floor plate derived purely from constraint optimisation tends to produce buildings that all look the same - technically correct, architecturally inert. The tension between constraint satisfaction and design intent doesn't go away; this prototype just makes it more visible. That connects directly to Prototype 4.
Prototype 2 - Declarative Rule Engine
Spatial rules as first-class model properties. Minimum corridor widths, adjacency requirements, egress distances - flagged live rather than caught in a Solibri session weeks after the model was issued.
The genuinely new thing isn't the checking itself. It's what happens when the rule set is structured as plain human-readable data that an LLM can access directly. Current compliance tools require significant investment to author and maintain rules - proprietary formats, specialist configuration. An LLM can read a plain-language rule set and apply it without that overhead. The real constraint shifts from whether you can check something to whether you've been clear about what you're checking for - which is a more useful problem to be solving.
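The rules-as-plain-data idea might look like the sketch below. The rule and element schemas are assumptions for illustration; the point is that the same structure is readable by a human, a trivial checker, or an LLM, with no proprietary rule format in between.

```python
# Sketch of spatial rules held as plain, human-readable data.
# Rule and element schemas here are illustrative assumptions.

RULES = [
    {"id": "corridor-min-width", "applies_to": "corridor",
     "property": "width", "min": 1.2,
     "text": "Corridors must be at least 1.2 m wide."},
    {"id": "room-min-area", "applies_to": "room",
     "property": "area", "min": 7.0,
     "text": "Habitable rooms must be at least 7.0 m2."},
]

def check(elements, rules=RULES):
    """Return a flag for every element that violates a rule."""
    flags = []
    for e in elements:
        for r in rules:
            if e["type"] != r["applies_to"]:
                continue
            value = e.get(r["property"])
            if value is not None and value < r["min"]:
                flags.append({"element": e["id"], "rule": r["id"],
                              "value": value, "rule_text": r["text"]})
    return flags

model = [
    {"id": "C01", "type": "corridor", "width": 1.1},
    {"id": "R12", "type": "room", "area": 9.5},
]
flags = check(model)  # C01 is flagged; R12 passes
```

The `text` field is the human-facing and LLM-facing statement of the rule; the structured fields exist only so a dumb checker can run it cheaply and continuously.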
Prototype 3 - Model as State
The building as structured data accessible via API. Read it, write to it, query its version history. No file exchange.
This worked, and it's useful. It's also largely what BIM 2.0 platforms like Motif and Arcol are already building in their production systems at far higher fidelity. The prototype demonstrates the concept but doesn't add to it.
What it showed clearly is how much of the friction in current BIM coordination is file-handling friction. The time spent exporting, sending, importing, reconciling is overhead, not the actual work. Removing it doesn't solve design problems, but it gets them into focus faster.
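The model-as-state surface described above can be sketched as a store with read, write, and history operations. This is a toy, and the names are assumptions; production platforms do this at far higher fidelity, but the shape of the interface is the point.

```python
# Minimal sketch of "model as state": the building as structured data
# behind read/write/history operations instead of file exchange.

import copy
import time

class ModelStore:
    def __init__(self, initial):
        self._versions = [{"t": time.time(), "author": "init",
                           "state": copy.deepcopy(initial)}]

    def read(self):
        """Read the current state (a copy, so callers can't mutate it)."""
        return copy.deepcopy(self._versions[-1]["state"])

    def write(self, author, update):
        """Apply an attributed update as a new version."""
        state = self.read()
        state.update(update)
        self._versions.append({"t": time.time(), "author": author,
                               "state": state})

    def history(self, key):
        """Query how one property changed, and who changed it."""
        return [(v["author"], v["state"].get(key)) for v in self._versions]

store = ModelStore({"corridor_width": 1.5})
store.write("layout-agent", {"corridor_width": 1.2})
trail = store.history("corridor_width")
```

Every write is attributed and every version is queryable, which is precisely the overhead that export-send-import-reconcile never gives you.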
Prototype 4 - Multi-Agent Loop
This is where things get genuinely complicated, and where two comments from the LinkedIn post deserve a direct response.
Theodoros Galanos's initial take was "multi-agent is a huge bait." His follow-up clarified it: agents and parallel processing can add complexity that most of the time isn't necessary, and the first question should always be whether a task actually benefits from it. He's right.
The prototype runs four agents simultaneously on the same model: layout changes, clash resolution, compliance checking, cost tracking. In an unconstrained model it works. In a constrained model, problems compound quickly.
Change a corridor width - adjacent spaces may compress, dropping below minimum area, which the compliance agent flags.
Which constraint takes priority? That depends on intent, and intent is what agents can't infer reliably without being told.
There's also the visibility problem. If four agents are making changes simultaneously, how do you know what happened and why? The failure mode to avoid is recreating the Revit warning experience at higher velocity: necessary flags, but so many of them that they get dismissed.
The human-in-the-loop question is harder than it sounds. It's not whether humans need to be involved - they do. It's how to design that involvement so it's genuinely useful rather than just another interruption.
A Bigger Question
Rob Asher's comment was the most pointed: the future of BIM might be considerably less BIM. He argues that BIM abstractions create too little value over simpler non-BIM ones - that a building can be described in a standard spatial database with a few domain-specific extensions, and that a platform letting users own their own data is what actually matters.
Rob built and leads Giraffe, a fully-fledged BIM 2.0 platform, and has built his own versions of these prototypes on it.
IFC carries an enormous legacy. It was designed for interoperability between proprietary systems, not as a substrate for machine reasoning. An agent querying an IFC model spends considerable effort navigating a data structure built for human authoring tools. The BIM 2.0 challengers have largely sidestepped this by treating IFC as an import/export format while building their own internal data models - which may well be the right technical answer. But it's worth noting what that also means: each platform is recreating its own proprietary data silo.
The interoperability problem that BIM was supposed to solve hasn't gone away; it's just been pushed one layer up.
The Adoption Gap
Having worked across AEC for a long time, I've seen this pattern clearly enough to state it plainly: architects are, by professional disposition and liability structure, conservative. The technology side of the industry tends toward the opposite - early adoption, tolerance for rough edges, enthusiasm for what something might become. The gap between those two cultures is easy to underestimate if you spend most of your time on the technology side.
Architects don't typically take giant leaps. What tends to happen instead is a series of incremental steps, and it's only in retrospect - looking back over a year or two - that it becomes clear how far a practice has moved. I've watched firms go from scepticism to dependency on tools they once dismissed, but never in one jump. Trying to shortcut that process by presenting a fully autonomous system tends to produce hesitation rather than adoption. The tool that gets used is the one that feels like a natural next step from what was there before.
This connects directly to the visibility problem in Prototype 4. An architect who doesn't understand what an agent is doing can't trust the output. And if they can't trust the output, they won't use it - or worse, they'll use it and then spend time verifying everything it touched, which defeats much of the purpose.
Invoked, Not Automatic
The distinction that keeps coming back to me is between invoked and automatic agents.
An always-on agent loop running continuously against a live model sounds appealing in principle. In practice it creates two problems.
- The first is cost: agents consuming tokens in the background generate usage that's difficult to attribute to any specific value delivered. It's easy to accumulate significant spend with nothing concrete to show for it.
- The second is cognitive load: a continuous stream of flags and suggestions from agents running in the background is different in kind from a response to a deliberate question.
A more natural fit with how architects actually work might look like this: a set of design decisions gets made, and an experienced architect already has a strong intuition about what those decisions mean - what the compliance implications are likely to be, roughly where the cost lands, which constraints are now in tension. That professional judgment is what should determine when to invoke validation. Not a background process, not a continuous feed - but a deliberate call at a moment where the architect has enough context to evaluate the response critically, before a deliverable or at a point of genuine uncertainty.
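In code, the distinction is small but real: validation as a function the architect calls at a moment of their choosing, rather than a scheduler running in the background. The checks and helper below are illustrative assumptions, not a proposed API.

```python
# Sketch of the invoked (not automatic) pattern: checks run once,
# on demand, at a moment the architect chooses. Names are illustrative.

def run_checks(model, checks):
    """Run a named set of checks once, on demand, and return a report."""
    return {name: check(model) for name, check in checks.items()}

checks = {
    "corridor_width": lambda m: m["corridor_width"] >= 1.2,
    "cost_within_budget": lambda m: m["cost"] <= m["budget"],
}

# The architect decides when this line runs - before a deliverable,
# or at a point of genuine uncertainty - not a background scheduler.
report = run_checks(
    {"corridor_width": 1.3, "cost": 980, "budget": 1000}, checks
)
```

The same checks could run in a loop; the design choice is that they don't. Each invocation is attributable to a decision, so the token spend and the cognitive load both map to something the architect actually asked.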
This isn't a limitation of the technology. It's a better fit with how design decisions get made - iteratively, with periods of consolidation, not in a continuous real-time stream. Architects already know roughly what they expect the answer to be. What they want is confirmation, or a flag that something they hadn't considered has changed.
What Early Adoption Actually Looks Like
A stealth AI-based AEC startup I'm invested in recently shared an update on their early users. The majority response from firm leaders was a version of the same thing: "This is immensely valuable if you can prove it works." A small group is using the product regularly; most are waiting for the validation to be complete before they commit.
That isn't scepticism. It's a reasonable professional standard. The trust problem is structural, not technical - and better demos won't solve it. The tool that gets adopted will be the one that first earns that proof, in a constrained domain, with auditable outputs.
These prototypes don't settle any of those questions. They're a way of making them concrete.