We've audited codebases built by solo founders, early-stage teams, and agencies that call themselves vibe coding shops. The pattern is consistent: the more a codebase relied on unreviewed AI output, the more issues we find.
This isn't an argument against AI-assisted development. We use it on every project we ship. It's an argument for not skipping the audit.
Here's what we consistently find — and how to catch it before it becomes a production incident.
Why AI-Built Code Has a Specific Failure Pattern
AI tools produce plausible code. That's the dangerous part. A function that handles user data will look correct, follow the right patterns, and often work correctly for the happy path. What AI misses — consistently — is the cases it wasn't explicitly asked about.
Authorization is the most common failure. AI will write a route that fetches a user's data by ID. It might not check whether the requesting user has access to that ID. The code works. In testing, you only test with your own account. You ship it. A month later, someone discovers they can access any user's data by changing a URL parameter.
This is not a hypothetical. We've seen it.
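The gap is easiest to see in code. Here's a minimal sketch of the check that goes missing — the types and names (`Session`, `Invoice`, `authorizeInvoiceAccess`) are hypothetical, not from any particular framework:

```typescript
// Sketch of the ownership check AI-generated routes tend to omit.
type Session = { userId: string } | null;
type Invoice = { id: string; ownerId: string; amount: number };

type AuthResult =
  | { status: 200; body: Invoice }
  | { status: 401 | 404 };

function authorizeInvoiceAccess(
  session: Session,
  invoice: Invoice | undefined,
): AuthResult {
  if (!session) return { status: 401 }; // authentication: who are you?
  if (!invoice) return { status: 404 }; // record doesn't exist
  // Authorization: is THIS user allowed to see THIS invoice? This is
  // the line that goes missing when you only test with your own account.
  if (invoice.ownerId !== session.userId) return { status: 404 };
  return { status: 200, body: invoice };
}
```

Returning 404 rather than 403 for the non-owner case is deliberate: it keeps "doesn't exist" and "not yours" indistinguishable from the outside.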
The other consistent gap: input validation. AI validates the types it knows about. It doesn't validate the combinations, the ranges, or the ways users will abuse inputs that weren't in the prompt. Server-side validation that trusts client-side validation is a liability.
What We Check in Every Audit
Authentication and Authorization
These are not the same thing. Authentication is "who are you?" — handled by Clerk, NextAuth, or Supabase Auth. Authorization is "are you allowed to do this?" — typically not handled by any library, and it has to be implemented correctly in every route.
The audit questions we run for every API endpoint:
- Is there a valid session check before any data is returned?
- Does the route verify that the requesting user owns or has access to the requested resource?
- Is there a second-account test? (We always test with two separate accounts, each trying to access the other's data.)
- Does the route handle missing auth gracefully (401) rather than crashing?
A route that returns a 500 on missing auth is hiding a problem — the request shouldn't reach any logic that can fail before authorization has been checked.
Input Validation
Every input that comes from user-controlled sources should be validated server-side. This includes:
- Form fields — expected types, lengths, formats
- URL parameters — are they the right type? Are they validated against existing records before being used in a query?
- File uploads — type, size, and content validation
- Query strings — presence and format
Client-side validation is a UX feature, not a security control. It can be bypassed by anyone with browser devtools or a curl command. The server validates. The server enforces.
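As a sketch of what server-side validation looks like for a hypothetical contact form: in practice a schema library (e.g. zod) does this more concisely, but the shape is the same — take `unknown` input, check every field, reject anything that doesn't match.

```typescript
// Minimal server-side validation sketch for a hypothetical contact form.
// Runs on the server; the client's own validation is never trusted.
type ContactInput = { email: string; message: string };

function validateContact(body: unknown): ContactInput | null {
  if (typeof body !== "object" || body === null) return null;
  const { email, message } = body as Record<string, unknown>;
  if (typeof email !== "string" || email.length > 254) return null;
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) return null; // coarse format check
  if (typeof message !== "string" || message.length === 0 || message.length > 5000)
    return null;
  return { email, message };
}
```

Note the length caps: ranges and sizes are exactly the checks that don't appear unless someone asks for them.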
We check that parameterized queries are used everywhere SQL is involved (Supabase handles this for standard operations; custom SQL is where issues appear). We check that no user input is concatenated into a query string.
Environment Variables and Secrets
Secrets in code happen. They're common enough that we check every project:
- Is `.env` in `.gitignore`? Has it ever been committed to the repository?
- Are there any hardcoded API keys, tokens, or credentials in the codebase?
- Are production secrets stored in Vercel's environment configuration, not in the repository?
- Are client-side environment variables (prefixed `NEXT_PUBLIC_`) actually safe to expose? Do any contain secrets that should be server-only?
We've found service API keys in `package.json` scripts, hardcoded Supabase service role keys in client components, and Stripe secret keys stored in variables that get bundled into the client. Each of these is an immediate stop-everything-and-fix-it finding.
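A coarse first pass over the `NEXT_PUBLIC_` question can be automated. This is a heuristic sketch — the pattern list is our own assumption, and name-based matching is no substitute for reading where each variable is actually used:

```typescript
// Heuristic sketch: flag NEXT_PUBLIC_ variables whose names suggest
// server-only secrets. Anything flagged gets read by a human.
const SECRET_HINTS = /SECRET|SERVICE_ROLE|PRIVATE|PASSWORD|TOKEN/i;

function suspiciousPublicVars(env: Record<string, string>): string[] {
  return Object.keys(env).filter(
    (name) => name.startsWith("NEXT_PUBLIC_") && SECRET_HINTS.test(name),
  );
}
```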
Rate Limiting
API routes without rate limiting are open to abuse. A form submission endpoint with no rate limiting can be used to send thousands of emails. An authentication endpoint with no rate limiting is susceptible to brute force. An LLM API proxy route without rate limiting can burn through your entire API budget in minutes.
We check every public-facing endpoint for rate limiting. The implementation depends on the stack — Upstash Rate Limit is the most common pattern for Vercel deployments. The key point is that rate limiting belongs on the server, applied before any significant work is done.
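To show where the check sits, here's an in-memory fixed-window limiter sketch. This only works on a single long-lived server — on Vercel's serverless runtime you'd reach for a shared store, which is what the Upstash pattern provides — but the placement is the point: the limiter runs first, before any expensive work.

```typescript
// In-memory fixed-window rate limiter sketch (single-instance only).
function makeRateLimiter(limit: number, windowMs: number) {
  const hits = new Map<string, { count: number; windowStart: number }>();
  return function allow(key: string, now: number = Date.now()): boolean {
    const entry = hits.get(key);
    if (!entry || now - entry.windowStart >= windowMs) {
      hits.set(key, { count: 1, windowStart: now }); // start a new window
      return true;
    }
    entry.count += 1;
    return entry.count <= limit; // reject BEFORE any expensive work runs
  };
}
```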
Error Handling and Information Disclosure
Error messages are a surprisingly common security issue. A route that returns a full database error message to the client is leaking schema information. A 500 response that includes a stack trace is giving attackers a map of your internals.
We check that:
- Error messages returned to clients are generic ("something went wrong") rather than descriptive ("column 'user_id' doesn't exist in table 'transactions'")
- Stack traces are logged server-side (Sentry, console) but not returned in API responses
- 404 and 403 responses are consistent — you shouldn't be able to tell the difference between "this record doesn't exist" and "this record exists but you can't access it"
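The fix is a single choke point between internal errors and the client. A sketch — `logError` here stands in for whatever sink you use (Sentry, console):

```typescript
// Sketch: map internal errors to a generic client response while
// keeping the full detail server-side.
function toClientError(
  err: Error,
  logError: (e: Error) => void,
): { status: number; body: { error: string } } {
  logError(err); // full message and stack stay on the server
  return { status: 500, body: { error: "Something went wrong" } };
}
```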
Dependency Audit
`npm audit` is a 30-second check that catches known vulnerabilities in your dependencies. We run it on every project. We also look at what's actually in `package.json` — unused packages, packages that do what a native API already handles, and packages with poor maintenance histories.
AI often adds packages it has training data on without checking whether they're still maintained or whether a simpler alternative exists. It's worth spending 15 minutes looking at what's been pulled in.
The Performance Check
Security gets most of the attention in audits, but performance issues ship too.
The things we check quickly:
N+1 queries: A page that renders a list by fetching each item individually in a loop. Common in AI-generated code that follows the "obvious" implementation without considering database round trips. Check the Supabase dashboard logs after loading each page.
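The shape of the problem, sketched with a stub that counts round trips — `fetchOne` and `fetchMany` are stand-ins for a per-row query in a loop versus a single batched query (e.g. Supabase `.eq()` in a loop versus one `.in()`):

```typescript
// Stub database that counts round trips, to make N+1 visible.
function makeDb(rows: Map<string, string>) {
  let roundTrips = 0;
  return {
    fetchOne: (id: string) => { roundTrips += 1; return rows.get(id); },
    fetchMany: (ids: string[]) => { roundTrips += 1; return ids.map((id) => rows.get(id)); },
    roundTrips: () => roundTrips,
  };
}

// N+1: one query per item — the "obvious" implementation.
function loadEach(db: ReturnType<typeof makeDb>, ids: string[]) {
  return ids.map((id) => db.fetchOne(id));
}

// Batched: one query for all items.
function loadAll(db: ReturnType<typeof makeDb>, ids: string[]) {
  return db.fetchMany(ids);
}
```

Three items is three round trips versus one; a hundred items is a hundred versus one.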
Bundle size: Vercel's build output shows bundle sizes per route. A 500KB JavaScript bundle for a simple page is a red flag. `next/dynamic` imports exist for a reason — heavy components that aren't needed on load should be dynamically imported.
Image optimization: `<img>` tags from AI-generated code won't be optimized. Next.js has `<Image>` for a reason — it handles responsive sizing, lazy loading, and format optimization. Every `<img>` in the codebase should be intentional.
Lighthouse on the main pages: A score below 80 before launch is fixable. After launch with real users and caches it's harder. Run Lighthouse on the key pages during the audit and fix what's easy.
What to Do If You Skipped the Audit
If you're already in production with an AI-built codebase that hasn't been audited:
- Check authorization first — run the two-account test on every data-fetching endpoint immediately
- Rotate any API keys that were ever visible in the codebase (even briefly)
- Check `git log` for any commits that included `.env` files or hardcoded secrets
- Run `npm audit` and fix critical findings
- Add rate limiting to any public-facing endpoints that could be abused
Then do the full audit. The above is damage control, not a substitute.
Why We Audit Every Project, No Exceptions
The audit is a fixed cost at the end of every project. It adds half a day. It has prevented production incidents on roughly one in four projects we've shipped.
That ratio is not surprising. AI-assisted development moves fast and produces a lot of code, and that combination is exactly the condition under which security issues multiply. The audit is the mechanism that keeps speed from becoming a liability.
A bug you find yourself in production is better than a bug a user finds. A bug caught in the audit is better than both.
Frequently Asked Questions
How long does a pre-launch audit take?
For a typical 5-day MVP, we budget half a day — 3 to 4 hours. That covers the security checklist (auth, inputs, secrets, rate limiting), a performance pass, and cross-browser/device testing. Larger codebases or products with complex data models take longer. The audit scope should scale with the product scope.
What are the most critical things to check for an MVP with user data?
In order: authorization on every data-fetching route, server-side input validation on every form and API endpoint, secrets not in the repository, and rate limiting on auth and sensitive endpoints. If you can only audit four things, audit those.
Should I use an automated security scanner instead of a manual audit?
Automated scanners (Snyk, npm audit, GitHub Advanced Security) catch known vulnerabilities in dependencies and some code patterns. They don't catch missing authorization logic — which is the most common and most severe issue we find in AI-built codebases. Automated scanning is a useful complement, not a substitute for reading the code.
Does using Supabase's Row Level Security make the authorization audit unnecessary?
RLS helps significantly and we enable it on every project. But it's not a complete solution: RLS protects direct database access, but API routes that use the Supabase service role key bypass RLS entirely. Those routes need explicit authorization checks in the application code. We check both the RLS policies and the service role key usage in every audit.
What's the most common finding in AI-built codebases?
Missing authorization checks on API routes — specifically, routes that validate the session (you're logged in) but don't validate ownership (you can access this resource). It's the gap between "this user is authenticated" and "this user is allowed to do this specific thing with this specific piece of data." AI consistently writes the former without the latter unless explicitly told otherwise.
Written by
Raj Patel
Lead Engineer, Greta Agency
Raj has shipped over 30 products using AI-assisted development workflows. He audits every codebase before it goes live — no exceptions — and has strong opinions about what 'production-ready' actually means.