Spec-Driven Development for Data and Data Products

AI has made change cheap. And if you are not careful, it might wreck your business.

Executive Summary

AI has made change in data product development dramatically cheaper and faster. But the greatest risk it introduces is not broken code—it is semantic drift: the gradual loss of meaning, intent, and trust as systems evolve at speed.

Spec-driven development addresses this by making intent explicit, durable, and machine-readable. By capturing why a data product exists alongside what it does and how it is implemented, specifications ensure every new change is evaluated against the full history of requirements. With AI now able to maintain and validate specs automatically, organisations can move faster without sacrificing clarity, governance, or coherence—turning every change into an opportunity to build better data products, not just more of them.

Introduction

Data product development with AI agents can deliver 100x speed and efficiency improvements. Spec-driven development doesn’t slow this down. It ensures that this speed doesn’t come at the cost of clarity, coherence, or trust. In fact, it goes further, creating a level of consistency, governance and trust that has never been possible before.

Thanks to AI, we can now create, modify or extend data products tens of times faster than before, with more people making more changes every day. Metis, our data engineering and data product AI agent, can build or edit production-ready data products, from scratch, in minutes. So that’s everything solved, right?

Nope. If you don’t solve one CRITICAL problem, you may end up much worse off than when you started.

The risk is broken semantics. It’s not hard to make your data product “also do X.” Where it gets tricky is when you want it to “also do X while continuing to do everything it has previously been asked to do.”

Spec-driven development exists to close that gap, ensuring that as change accelerates, meaning doesn’t get left behind.

What is spec-driven development?

In any data product, the “What” and the “How” are typically defined as a set of configuration and code (e.g., SQL files for transformations, YAML files for pipeline definitions, tests and infrastructure definitions). In simple terms, the “Spec” defines the “Why”: why the current configuration and code look the way they do. It’s a set of context and history that gives a complete narrative about the “What”, the “How” and, most importantly, the “Why” behind each.
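
To make this concrete, here is a minimal sketch of what one entry in such a Spec might look like if it were stored as structured data alongside the SQL and YAML it describes. The field names and layout are illustrative assumptions, not a prescribed format.

    # A minimal, illustrative sketch of a machine-readable Spec entry.
    # Field names and structure are assumptions, not a real or required format.
    spec_entry = {
        "data_product": "orders",
        "what": "An orders table with one row per customer purchase order",
        "how": "SQL transformation plus a YAML pipeline definition and tests",
        "why": "Finance asked for a single trusted source of order revenue",
        "amendments": [
            {
                "id": 1,
                "requirement": "What the user asked for, in their own words",
                "change": "What was altered in configuration and code, and why",
            },
        ],
    }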

Consider a home renovation.

Without context, giving an AI architect a requirement like “I want more light and a better flow between the kitchen and dining room” might lead to a “What” of “an updated house layout with the wall between the kitchen and dining room removed”. What would be stored going forward is a set of floor plans with no wall between the kitchen and dining room.

Then, a year later, a new requirement comes along: “We want more kitchen storage.” The AI looks at the plans and says, “Easy, if we put a wall between the kitchen and the dining room, we can put cabinets on it. Problem solved.”

But is it? It has addressed the new requirement, but at what cost? It had the output of the previous requirements (the floor plans) but no idea why they were the way they were, and therefore no context with which to preserve the earlier requirements.

What is missing here is the Specification (or Spec). Imagine that we stored not only the floor plan but also a record of:

  • Amendment 1:
    • Requirement: The user wants more light and a better flow between the kitchen and dining room
    • Change: We removed the wall between the kitchen and the dining room

If this record were stored and versioned alongside the floor plan itself, we would have complete context. Then, when the user poses the new question, the Spec is updated to include a new amendment:

  • Amendment 1:
    • Requirement: The user wants more light and a better flow between the kitchen and dining room
    • Change: We removed the wall between the kitchen and the dining room
  • Amendment 2:
    • Requirement: The user wants more kitchen storage
    • Change: <tbd>

Now a properly built AI agent asks itself a different question: “What is the best floor plan that meets ALL the user requirements in the Specification?” Adding the wall back in would address the second requirement but fail the first, so this option is rejected. We’ll come back to why the question the AI agent asks itself is SO critical later.

In the case of software products, and even more importantly data products, the Spec is the context – it explains why everything is the way it is and prevents mistakes down the line.

How does spec-driven development affect process?

A traditional AI-powered development process (how quickly the world moves, that we can already talk about a “traditional” AI-powered development process!) works something like this:

  1. User gives the AI a new requirement, let’s say ABC
  2. AI reads current configuration and code
  3. User and AI go back and forth a bit to refine
  4. AI makes change X to the configuration and code

The spec-driven alternative of this (sketched in code after the list below) is:

  1. User gives the AI a new requirement, let’s say ABC
  2. AI reads current configuration and code AND the Spec
  3. User and AI go back and forth a bit to refine
  4. AI writes the updated Spec
  5. Based on the NEW Spec, the AI determines what change (let’s call it Y) is required to best achieve it
  6. AI makes change Y to the configuration and code
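
To make the difference concrete, here is a minimal sketch of that spec-driven loop in Python. The Spec is assumed to be a simple, versioned list of requirement records, and the two “ai_*” callables stand in for the agent; none of these names are a real API.

    # Minimal sketch of the spec-driven loop described above.
    # The spec is assumed to be a versioned list of requirement records, and the
    # ai_* callables stand in for the agent. Illustrative only, not a real API.
    def handle_new_requirement(raw_requirement, spec, ai_refine, ai_implement):
        # Step 3: user and AI refine the raw requirement together.
        refined = ai_refine(raw_requirement)

        # Step 4: the Spec is updated FIRST and versioned alongside the code.
        new_spec = spec + [refined]

        # Step 5: the change is derived from the WHOLE updated Spec,
        # not just from the latest requirement in isolation.
        change = ai_implement(new_spec)

        # Step 6: only now are the configuration and code actually modified.
        return new_spec, change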
     

Why spec-driven development?

Spec-driven development is often described as a way to improve quality or governance. In practice, its biggest benefits show up somewhere more fundamental: how data products evolve over time.

Spec-driven development achieves this by making intent explicit and durable. Below are three of the most important benefits, and why they matter in real systems.

Preventing semantic drift as teams and AI scale

When systems are small and ownership is clear, meaning lives in people’s heads. As systems scale, that meaning starts to degrade. Code still runs. Tests still pass. Pipelines still succeed. But what the system represents, and why, slowly shifts. A metric changes definition. A dataset starts being used for something it was never designed for. Historical numbers move without anyone touching a dashboard.

This is semantic drift — and it’s one of the most insidious and expensive failure modes in all of data.

Spec-driven development prevents this by preserving intent alongside implementation. A specification captures not just how something is built, but what it is supposed to mean.

Let’s consider a trivial example:

Table: orders

  • customer_id
  • order_timestamp
  • order_amount

Does order_amount represent the number of items ordered, or the total order value? From inspecting the configuration and code alone, it’s not at all clear to a person, or to an AI agent, and this ambiguity is the root of many mistakes. An accompanying Spec would have clearly defined it:

Table: orders
Description: Records individual customer purchase orders used for revenue and order analysis.

  • customer_id – STRING
    Description: Unique identifier for the customer who placed the order.
    Why: Required for joining onto other parts of the dataset. Not explicitly asked for by the user but required for many use cases.

  • order_timestamp – TIMESTAMP
    Description: Date and time when the order was placed, stored in UTC.
    Why: Required for identifying the date and time of the order. Not explicitly asked for by the user but required for many use cases.

  • order_amount – DECIMAL(10,2)
    Description: Total monetary value of the order, expressed in USD.
    Why: Required so that the total order value can be calculated and put on invoices.
 

Now there is no doubt – and it wouldn’t matter if the column was named “order_amount” or “column_14” – there would still be a clear definition of what it means and why it was created.

When a change is requested, like “I want to be able to see the number of items in each order”, any option of reusing or repurposing order_amount is clearly unacceptable.
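
As a sketch of what that looks like in Spec terms, the agent would instead add a new, separately defined column rather than quietly repurposing an existing one. The column name order_item_count and the dictionary layout below are illustrative assumptions only.

    # Illustrative sketch only; the column name and structure are assumptions.
    orders_spec = {"table": "orders", "columns": {}}

    # The new requirement becomes a NEW column with its own definition and "why",
    # instead of silently changing what order_amount means.
    orders_spec["columns"]["order_item_count"] = {
        "type": "INTEGER",
        "description": "Number of individual items included in the order.",
        "why": "Requested so users can see how many items each order contains; "
               "order_amount keeps its meaning as the total order value in USD.",
    }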

This is a very trivial example, but the same problem becomes increasingly important as:

  • more teams contribute to the same data products
  • ownership changes over time
  • AI systems generate and refactor code at high speed

Without specifications, scale creates drift and fragility, erodes understanding and undermines trust. With specifications, scale creates more consistency, more reuse and MUCH less technical debt. Let’s unpack this last one.

Keeping Technical Debt Low by Design

Technical debt rarely comes from one bad decision. It usually comes from many reasonable decisions made in isolation and over a period of time.

Consider: a new requirement ABC arrives. The fastest solution is to add “code for ABC”. Later, another requirement arrives, and someone adds “code for DEF”. Both work. Both ship. But ABC and DEF have a lot in common, and now the system has two overlapping implementations that drift apart over time.

In many cases, “code for ABC” plus “code for DEF” is far worse than “code for ABC + DEF” — but without a unifying specification, that relationship is easy to miss.
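
As a toy illustration of that point (the metric names and data shapes here are invented for the example), compare two transforms written in isolation with the single shared one a unifying spec makes easy to spot:

    # Toy illustration only; column names and data shapes are invented.

    # "Code for ABC" and "code for DEF", written in isolation: two near-identical
    # aggregations that will tend to drift apart over time.
    def revenue_per_customer(orders):                     # requirement ABC
        totals = {}
        for row in orders:
            totals[row["customer_id"]] = totals.get(row["customer_id"], 0) + row["order_amount"]
        return totals

    def revenue_per_region(orders):                       # requirement DEF
        totals = {}
        for row in orders:
            totals[row["region"]] = totals.get(row["region"], 0) + row["order_amount"]
        return totals

    # "Code for ABC + DEF": one implementation that serves both requirements.
    def revenue_by(orders, key):
        totals = {}
        for row in orders:
            totals[row[key]] = totals.get(row[key], 0) + row["order_amount"]
        return totals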

Spec-driven development changes this dynamic. Because new requirements are implemented against an entire specification, implementors (human or AI) can look at the whole problem space, not just the local change. The spec becomes the place where overlap, duplication, and inconsistency are visible and then avoided.

Consider the step above where we said “Based on the NEW Spec, the AI determines what change (let’s call it Y) is required to best achieve it”. The subtle, but critical, distinction here is that the AI is NOT simply trying to write code to achieve DEF; it is asking “what would the best solution be to achieve ABC and DEF?” and then making whatever changes are required to get there.

The result is that code naturally stays:

  • more modular
  • more reusable
  • more DRY

Technical debt doesn’t totally disappear, but becomes far less likely — and is much easier to address when it does appear.

Every time the AI acts on a change, it looks to build the best solution to the overall set of requirements, not just the most recent one.

This has one final, important benefit which we will look at next.

Turning every change into an opportunity to build the best data product possible

In many teams, change is something to get through as quickly as possible. Fix the bug. Add the column. Ship the feature.

Spec-driven development subtly but powerfully shifts that mindset. Because every change is evaluated not as a single, standalone transaction but against an entire, updated specification, it forces the AI to think in terms of the data product as a whole, not just the immediate requirement. The whole implementation is effectively re-evaluated every time:

Best possible data product implementation to meet original Spec

becomes

Best possible data product implementation to meet original Spec + Requirement ABC

becomes

Best possible data product implementation to meet original Spec + Requirement ABC + Requirement DEF

At the most extreme, a relatively simple change could tip the balance in favour of a more material technical decision (it’s time to move from pandas version 1.5 to the newer 2.3) or even a whole new technology stack (e.g. moving some transformations from SQL to Python and data frames, or similar).

Incidentally, this is where mindset changes have to occur. In the past, a junior developer implementing a relatively small change but deciding that a new technology choice would be slightly better would have been metaphorically shot! Why? Because the work required to:

  • make the technology change
  • retrofit the technology change to the previous requirements (and test)
  • use the new technology to meet the new requirement (and test)
  • validate that everything is still working as it should

FAR exceeded the value of the feature. The economics of AI-generated data products, where ALL of the above is fully and comprehensively automated, make this kind of decision a reality.

The example of the pandas library is actually much more significant than it appears – because technology rusts! Stated less dramatically: over time, even if you do nothing to it, your codebase gets old and out of date. The world keeps moving forward, and your state-of-the-art code of yesterday is a bit tired today and an unsupported liability tomorrow. The pandas version that was brand new when it was pinned last year is, at best, probably no longer supported and, at worst, may have security vulnerabilities.

How does spec-driven development help? Every time the AI operates, it is trying to ensure we have the best possible data product implementation to meet the current Spec. Usually we use that to implement changes to the Spec. But what if the Spec hasn’t changed, and it’s the world that has?

Why can’t we simply tell the AI to “produce the best possible data product implementation to meet the current Spec”, even though the Spec hasn’t changed but the rest of the context, the world around us, has? At that point it will design the best possible data product implementation with no old libraries, no out-of-date code and no known vulnerabilities, and then go ahead and implement it.

I see a world in the not-too-distant future where an AI might evaluate every data product every week, asking “Based on my company’s rules, guidelines, requirements, the state of the technology industry, and so on, what is the best implementation of this data product specification?” – essentially data products that are constantly updating themselves.
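
Mechanically, this is just the same loop run on a schedule with an unchanged Spec. A minimal sketch, where (as before) the callables stand in for the agent and the surrounding tooling and are not a real API:

    # Minimal sketch only: re-derive the best implementation on a schedule,
    # even though the Spec itself has not changed. Not a real API.
    def weekly_refresh(spec, ai_implement, open_merge_request):
        # The Spec is the same, but libraries, company rules and the wider
        # technology context may have moved on, so the implementation is
        # re-evaluated against the full Spec.
        change = ai_implement(spec)
        if change is not None:
            open_merge_request(change)  # a human still reviews and clicks to deploy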

Imagine a world where, say, an Enterprise Information Architect creates a new rule or guideline that 10 data product owners with 100 data products need to become compliant with. Today that would be a 6-12 month activity, at major people cost and with time lost that could have gone into creating new company value. In this new world, each data product owner logs in on Monday morning to find 10 Merge Requests, one for each of their data products, showing:

  • The new rule/guideline that was added
  • The changes made to implement it
  • The results of the extensive automated testing and backwards-compatibility testing done to confirm that not only is the new rule/guideline being followed, but the entire Spec (including every previous rule, guideline and requirement) is being honoured as well

One mouse click to deploy live. 100 mouse clicks and the new rule/guideline is live, in production, across every data product, and no-one is on their second cup of coffee.

Why This Matters Now

All of the benefits of spec-driven development existed before agentic AI. So why didn’t we use it? We assumed, or hoped, that people would be able to remember all of this, and the cost of maintaining the Spec manually was high.

With AI, spec-driven development is unavoidable if we want quality, trustable outcomes, but the cost of maintaining the Spec is now essentially zero since it’s all handled by the agent.

In addition, AI dramatically lowers the cost of change. It’s absolutely legitimate to rewrite large swathes of SQL to remove duplication even when implementing a small new requirement. And if your fight-or-flight reflex is prickling at the thought of the risk of all these rewrites, then your test automation isn’t where it needs to be. But don’t worry: your AI Agent will take care of that for you too.