Pre-acquisition code due diligence for AI-built products

AI code quality | Zegaware engineering | 19 June 2026 | 10 min read

Last updated: 19 June 2026

Technical due diligence on an AI-built product should examine whether code that works is also safe to own: committed secrets, missing authorisation, hollow tests, duplicated logic and hallucinated dependencies. It matters now because AI-assisted code ships fast but carries measurably more security defects, and those defects transfer to the buyer as remediation cost and regulatory liability rather than staying with the seller.

A working demo is not a finished product. When most of a young company's code has been generated by an AI assistant, the gap between "it runs" and "it is safe to own" is where the real deal risk sits, and it is the gap most commercial and financial diligence never opens.

Why AI-built products need their own due diligence

The case for AI-specific diligence is no longer opinion; it has been measured. A 2026 Carnegie Mellon University benchmark, SUSVIBES, found that code produced by AI coding agents was about 61% functionally correct but only 10.5% secure, and that more than 80% of the working solutions still contained at least one critical vulnerability [1]. In other words, more than four in five features that pass a demo are carrying a serious flaw underneath.

The pattern holds across independent measurements. Veracode's Spring 2026 analysis found that 45% of AI-generated code contained at least one vulnerability from the OWASP Top 10, the Open Worldwide Application Security Project's list of the most critical web application risks, and that the figure had not improved in two years [2]. Veracode summarised the position bluntly: "The productivity revolution is here. The security revolution isn't" [2].

The defects also cluster in predictable, dangerous places. An analysis by CodeRabbit comparing AI-authored and human-authored changes found that AI changes carried roughly 1.7 times more issues, and were 1.91 times more likely to introduce an insecure direct object reference (IDOR), a flaw where one user can read another user's data simply by changing an identifier in a request [3].

These are three different bodies measuring three different things, and they point the same way: AI raises output volume without raising safety. Diligence on an AI-built product therefore cannot assume that a working feature is a finished feature. We set out the underlying argument in full in "Is AI-generated code safe to ship?".

What a technical reviewer finds in an afternoon

Most of the serious problems in an AI-built codebase surface fast, because they follow recurring patterns. The findings below are the ones we reach for first, and they are documented at length in "What a vibe code audit actually finds".

Secrets committed to the repository

Credentials checked into source control are the quickest critical finding to confirm. GitGuardian's State of Secrets Sprawl 2026 recorded 28.65 million new secrets (API keys, tokens and passwords) pushed to public repositories during 2025, and found that commits made with AI assistance leaked secrets at 3.2%, against a 1.5% baseline for ordinary commits, roughly double the rate [4]. A reviewer reads the full git history, not just the current files, because a key that was deleted last month still sits in an earlier commit and is still live until it is rotated. In our reviews we routinely find working credentials in history that the team believed were long gone.
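
A minimal sketch of such a history scan, assuming git is on the PATH. The function names and the three patterns are illustrative assumptions; a dedicated secrets scanner carries far more rules and should be preferred in a real review.

```python
import re
import subprocess

# Illustrative patterns only; a real scanner ships many more rules.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
    "generic_assignment": re.compile(
        r"(?i)\b(?:api[_-]?key|secret|token|password)\b\s*[:=]\s*['\"][^'\"]{12,}['\"]"
    ),
}


def find_secrets_in_patch(patch_text):
    """Return (commit, rule, line) for every added line that matches a pattern."""
    hits = []
    commit = None
    for line in patch_text.splitlines():
        if line.startswith("commit "):
            # Header line of `git log`: "commit <sha>".
            commit = line.split()[1]
        elif line.startswith("+") and not line.startswith("+++"):
            for rule, pattern in SECRET_PATTERNS.items():
                if pattern.search(line):
                    hits.append((commit, rule, line[1:].strip()))
    return hits


def scan_history(repo_path):
    """Scan the patches of every commit on every ref, not just the current files."""
    patch = subprocess.run(
        ["git", "-C", repo_path, "log", "--all", "-p", "--no-color"],
        check=True,
        capture_output=True,
        text=True,
        errors="replace",
    ).stdout
    return find_secrets_in_patch(patch)
```

Because the scan reads added lines in every patch, a key that was deleted in a later commit is still reported, along with the commit that introduced it. Whether the key is still live has to be confirmed with the provider; the scan only shows that it was exposed.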

Missing or broken authorisation

The most common serious flaw we find is missing authorisation: the code checks who you are (authentication) but not what you are allowed to see (authorisation). The insecure direct object reference described above is the everyday form of this, and CodeRabbit measured it at 1.91 times more likely in AI-authored code [3]. The consequences scale fast. In 2025 a security researcher showed that McDonald's "McHire" recruitment platform exposed up to 64 million job applications through a sequential record identifier with no access control, a textbook IDOR [5]. Where personal data in the United Kingdom is involved, a single flaw of this kind can be a reportable breach under the General Data Protection Regulation (GDPR), enforced by the Information Commissioner's Office (ICO).
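
The difference between the two checks can be sketched without any framework. Everything below is hypothetical and for illustration only: the `Application` record, the in-memory store and both function names are assumptions, not code from any system mentioned here.

```python
from dataclasses import dataclass


class Forbidden(Exception):
    """Raised when a signed-in user asks for a record they do not own."""


@dataclass(frozen=True)
class Application:
    id: int
    owner_id: int
    body: str


# Hypothetical in-memory store with sequential identifiers.
APPLICATIONS = {
    1: Application(id=1, owner_id=100, body="Alice's application"),
    2: Application(id=2, owner_id=200, body="Bob's application"),
}


def get_application_vulnerable(current_user_id, application_id):
    """Authenticated but not authorised: any signed-in user can read any id."""
    return APPLICATIONS[application_id]


def get_application_checked(current_user_id, application_id):
    """Same lookup, plus an ownership check before anything is returned."""
    record = APPLICATIONS.get(application_id)
    if record is None or record.owner_id != current_user_id:
        # One error for "missing" and "not yours" avoids confirming which ids exist.
        raise Forbidden(f"application {application_id} is not available")
    return record
```

In the vulnerable version user 100 can request id 2 and receive another user's record; in the checked version the same request raises `Forbidden`. A reviewer looks for the second shape on every endpoint that takes an identifier from the request.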

Hollow test suites

A green test suite is a claim to be verified, not evidence on its own. AI assistants are good at producing tests that assert the code does whatever it currently does, tests that mock the very component under test, and high coverage numbers sitting on top of almost no meaningful assertions. In our reviews we read the tests before we trust the coverage figure, because a suite that cannot fail tells the buyer nothing about whether the product is correct.

Duplicated and divergent logic

AI-generated codebases tend to re-implement the same operation several different ways, because each prompt starts with a fresh context and no memory of how the last one solved the problem. Pricing rules, input validation and permission checks drift apart across files. The product still works, but every future change has to be made in several places at once, and missing one is how regressions and security holes are reintroduced.
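
A rough way to surface re-implemented logic is to fingerprint each function body with variable names blanked out, so that renamed copies collide. This is a sketch under that assumption, with names of our own choosing; it finds exact structural copies only.

```python
import ast
import copy
import hashlib
from collections import defaultdict


class _BlankNames(ast.NodeTransformer):
    """Replace variable and argument names so renamed copies look identical."""

    def visit_Name(self, node):
        return ast.Name(id="_", ctx=node.ctx)

    def visit_arg(self, node):
        node.arg = "_"
        node.annotation = None
        return node


def _fingerprint(function_node):
    # Work on a copy so the tree being walked is never mutated.
    body = ast.Module(body=copy.deepcopy(function_node.body), type_ignores=[])
    dumped = ast.dump(_BlankNames().visit(body))
    return hashlib.sha256(dumped.encode("utf-8")).hexdigest()[:12]


def duplicate_functions(sources):
    """Group functions by body fingerprint.

    `sources` maps a filename to its source text. Only groups with more
    than one member are returned.
    """
    groups = defaultdict(list)
    for filename, text in sources.items():
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                groups[_fingerprint(node)].append((filename, node.name))
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Two pricing functions that differ only in variable names land in the same group. A third copy whose multiplier has drifted from 1.2 to 1.25 does not, and that divergent copy is the dangerous one, so exact matching is a starting point for reading the code rather than a replacement for it.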

Hallucinated dependencies

AI assistants invent packages that do not exist. Researchers at USENIX Security 2025 found that 19.7% of the packages referenced in AI-generated code samples did not exist, and that 43% of those hallucinated names recurred across repeated prompts [6]. The recurrence is the danger: it lets an attacker register a commonly hallucinated name and wait for it to be installed, an attack now known as slopsquatting. A reviewer confirms that every dependency resolves to a real, maintained package from a legitimate source.
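
The existence half of that check can be sketched for a Python project as follows. It assumes a requirements.txt-style manifest and PyPI's public JSON endpoint, which returns a 404 for unknown projects; the function names are hypothetical, and the lookup is injectable so the logic can be exercised without network access.

```python
import re
import urllib.error
import urllib.request

_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def parse_requirements(text):
    """Extract package names from requirements.txt-style text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or line.startswith("-"):
            continue  # skip blanks, comments and pip options such as -r or -e
        match = _NAME.match(line)
        if match:
            names.append(match.group(1))
    return names


def exists_on_pypi(name):
    """True if PyPI's JSON endpoint knows the project, False on a 404."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise


def unresolved_dependencies(requirements_text, exists=exists_on_pypi):
    """Return the declared packages that the registry does not recognise."""
    return [n for n in parse_requirements(requirements_text) if not exists(n)]
```

Existence is necessary, not sufficient. A slopsquatted package exists by design, so names that do resolve still need a look at publisher, age and maintenance history before they count as legitimate.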

What buyers and investors should ask for

The access a target is willing to grant is itself a signal. To run real diligence rather than a guided tour, ask for the following before you commit:

Read access to the full git repository and its history, not a zipped snapshot of the current source.
The dependency manifests and lockfiles for every part of the system.
A list of third-party services in use and where their credentials are stored.
The continuous integration and continuous delivery (CI/CD) configuration.
Any prior security testing, penetration tests or audit reports.
Time with a named senior engineer on the target's side who can answer questions directly.

In our experience the terms of access predict the result. A founder who hands over full history and answers questions plainly is usually sitting on a healthier codebase than one who offers only a packaged build and a walkthrough.

How findings map to deal risk

Findings only matter to a buyer once they are translated into money, liability and time. The table below maps the common findings to the deal risk they carry.

Finding	What it means for the deal	Evidence
Secrets in git history	Live credentials transfer to the buyer; rotation and incident review needed	AI-assisted commits leak at 3.2% versus 1.5% baseline [4]
Missing authorisation (IDOR)	Personal-data exposure; ICO-reportable breach risk	1.91 times more likely in AI code [3]; McHire reached 64 million records [5]
Hallucinated dependencies	Supply-chain exposure through slopsquatting	19.7% of referenced packages did not exist [6]
Hollow tests	False confidence; the true defect rate is unknown	Zegaware review finding
Duplicated logic	Higher cost and risk on every future change	Zegaware review finding

Remediation cost

Remediation is the figure a buyer prices into the offer. Forrester predicted that 75% of technology decision-makers would see their technical debt rise to a moderate or high level of severity by 2026, and named the rapid development of AI solutions as an accelerant [7]. A good report turns each finding into an estimate of engineering effort, so the cost of getting the product to a safe state is a number on the table rather than a surprise after completion.

Security and regulatory liability

Committed secrets and missing authorisation are not abstract risks once the asset changes hands. Personal-data exposure is an ICO matter, and the McHire case shows how far a single unprotected identifier can reach [5]. This liability moves with the product. A buyer who completes without understanding it has bought the breach as well as the business.

Key-person and maintainability risk

If one founder holds the only working mental model of an AI-assembled codebase, that is key-person risk, and it is acute when the code is duplicated and the tests are hollow. Both raise the cost of every future change and slow the team you are acquiring. In our experience this is the risk most often under-priced, because it does not show up until the original author has left.

What a good diligence report contains

A diligence report earns its place by being decisive. It should contain:

A clear verdict in plain language: safe to own, safe to own with named conditions, or not yet.
Severity-ranked findings (critical, high, medium, low), each one reproducible by the buyer's own engineers.
A rebuild-or-patch call on every material issue, because some flaws are a configuration change and others mean re-architecting a module.
An estimate of remediation effort in engineering weeks.
Named senior-engineer sign-off, so the verdict carries accountability.
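
The ranking and effort elements of that list lend themselves to a simple structure. The sketch below is one possible shape, not a Zegaware format; the field names are assumptions and the three findings and their week estimates are invented purely to show the arithmetic.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass(frozen=True)
class Finding:
    title: str
    severity: str        # one of the keys in SEVERITY_ORDER
    action: str          # "patch" or "rebuild"
    effort_weeks: float  # reviewer's estimate, not a measured value


def summarise(findings):
    """Return findings ranked by severity and the total estimated effort."""
    ranked = sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])
    total_weeks = sum(f.effort_weeks for f in ranked)
    return ranked, total_weeks


# Invented example data, for illustration only.
findings = [
    Finding("Duplicated pricing logic", "medium", "rebuild", 3.0),
    Finding("Live API key in git history", "critical", "patch", 0.5),
    Finding("Order endpoint lacks ownership check", "high", "patch", 1.0),
]
ranked, total_weeks = summarise(findings)
```

With these invented figures the critical finding sorts first and the total comes to 4.5 engineering weeks. The verdict itself is deliberately left out of the code, because it is a judgement a named engineer signs, not a value computed from a table.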

This is the standard we hold our own work to. Zegaware's senior engineers, most holding doctorates, sign off every engagement by name, and the Vibe Code Audit is delivered at a fixed price, because for a buyer on a deadline, certainty about scope and cost is part of the product.

Frequently asked questions

What is different about diligence on an AI-built product?

AI-built products often work in a demo while carrying defects that ordinary diligence misses, because the code was generated quickly and reviewed lightly. Benchmarks show AI code is frequently functionally correct yet insecure [1]. Diligence must read the git history, the test suite and the authorisation logic, not simply run the application and watch it work.

How long does technical due diligence take?

A focused review of a single product usually takes a few days, and the most serious findings (committed secrets, missing authorisation and hallucinated dependencies) tend to surface within the first afternoon. A full report with severity ranking and a rebuild-or-patch call follows after. Timelines scale with the size of the codebase and the access you are given.

We are the founder being acquired. Should we review first?

Yes. Commissioning your own review before a sale lets you fix or disclose findings on your own terms, rather than discover them across the table during a buyer's diligence. It protects valuation and shortens the process. In our experience, founders who review first negotiate from a stronger position and face fewer surprises at completion.

What access does technical due diligence need?

It needs read access to the full git history rather than a snapshot, the dependency lockfiles, the continuous integration configuration, a list of third-party services, and time with a named engineer on the target's side. Full history matters because a deleted secret still sits in past commits and remains live until it is rotated [4].

Book a pre-deal review

If you are buying, investing in, or selling an AI-built software product, a verdict in writing before the money moves is the cheapest insurance in the deal. Zegaware's senior engineers read the codebase, the history and the tests, and tell you what is safe to own and what is not. Commission the Vibe Code Audit ahead of completion and you get a clear verdict, severity-ranked findings, and a rebuild-or-patch call you can take into the negotiation.

Sources

[1] Songwen Zhao et al., "Is Vibe Coding Safe? Benchmarking Vulnerability of Agent-Generated Code in Real-World Tasks" (SUSVIBES benchmark), Carnegie Mellon University, arXiv:2512.03262, 2026. https://arxiv.org/abs/2512.03262
[2] Veracode, Spring 2026 GenAI Code Security Update, 24 March 2026. https://www.veracode.com/blog/spring-2026-genai-code-security/
[3] CodeRabbit, State of AI vs Human Code Generation Report, 17 December 2025. https://www.coderabbit.ai/blog/state-of-ai-vs-human-code-generation-report
[4] GitGuardian, The State of Secrets Sprawl 2026, 17 March 2026. https://blog.gitguardian.com/the-state-of-secrets-sprawl-2026/
[5] Ian Carroll and Sam Curry, write-up of the McDonald's "McHire" IDOR exposing up to 64 million job applications, 2025. https://ian.sh/mcdonalds
[6] Joseph Spracklen et al., "We Have a Package for You! A Comprehensive Analysis of Package Hallucinations by Code Generating LLMs", USENIX Security 2025, arXiv:2406.10279. https://arxiv.org/abs/2406.10279
[7] Forrester, "Predictions 2025: Technology And Security" (press release), 22 October 2024. https://www.forrester.com/press-newsroom/forrester-predictions-2025-tech-security/
