When a prospective client asks an AI assistant "who's a good accountant for my limited company in Leeds?", the assistant doesn't browse the way a person does. It fetches pages, reads what it can, and leans on explicit, machine-readable signals to decide whose name to put forward. So we asked a simple, testable question: across a sample of small independent UK firms, are those signals actually there?
We ran our own free Foundations Check, the same tool behind our free Snapshot, across the public websites of these firms. These are small convenience samples, not a census, and we have deliberately kept the report aggregate and anonymised: we name no individual firm with a shortcoming. The numbers below are reported exactly as measured, with every limitation stated. Credibility here comes from transparency.
We've now audited 93 UK accountancy firms across three groups: general practices, IR35/contractor specialists, and e-commerce specialists.
That's three separate convenience samples: 34 general independent practices (the original study, across ten cities), 28 IR35 / contractor specialists, and 31 e-commerce / online-seller specialists, all checked on the same day, 3 June 2026, with the same tool. The general-practice numbers are the foundation of this report and are presented in full further down. First, the cross-niche picture, because the comparison tells a story none of the three samples tells on its own.
The cross-niche picture
The three niches sit at different points on the digital-maturity curve. General high-street independents are the least marketing-led; e-commerce specialists (marketing-fluent firms selling to software-fluent online sellers) are the most. You might expect the more sophisticated firms to be uniformly more AI-ready. On the machine-readable signals, they are. But on the one signal that matters most (can a simple AI crawler read the homepage at all?), the pattern flips.
| Signal | General (n = 34) | IR35 / contractor (n = 28) | E-commerce (n = 31) |
|---|---|---|---|
| Identity schema (readable firms) | 17% (4 / 24) | 19% (4 / 21) | 33% (8 / 24) |
| FAQ schema (readable firms) | 0% (0 / 24) | 24% (5 / 21) | 46% (11 / 24) |
| Blank / unreadable to a simple crawler | 18% (6 / 34) | 25% (7 / 28) | 23% (7 / 31) |
Identity and FAQ schema percentages are over the firms whose homepages we could read (the denominator differs by group, shown beside each figure). The blank/unreadable figure is over the full sample for each group. The e-commerce blank-shell figure is 22.6% (7 of 31), shown rounded to 23%.
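The percentages in the table can be re-derived from the counts. The short Python sketch below does that; the dictionary layout, the function name and the round-half-up convention are our assumptions for illustration, not a description of the Foundations Check itself.

```python
from decimal import Decimal, ROUND_HALF_UP

# Counts copied from the table above: (numerator, denominator) per group.
COUNTS = {
    "identity_schema": {"general": (4, 24), "ir35": (4, 21), "ecommerce": (8, 24)},
    "faq_schema": {"general": (0, 24), "ir35": (5, 21), "ecommerce": (11, 24)},
    "blank_unreadable": {"general": (6, 34), "ir35": (7, 28), "ecommerce": (7, 31)},
}


def pct(numerator: int, denominator: int) -> int:
    """Whole-number percentage, rounding halves up (an assumed convention)."""
    value = Decimal(numerator) * 100 / Decimal(denominator)
    return int(value.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


for signal, groups in COUNTS.items():
    row = ", ".join(f"{g}: {pct(n, d)}% ({n}/{d})" for g, (n, d) in groups.items())
    print(f"{signal}: {row}")
```

Keeping the numerator and denominator together in one place makes it harder to quote a percentage against the wrong base, which matters here because the schema rows and the blank-shell row use different denominators.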
The takeaway: both of the more digitally mature niches showed more of the JavaScript-blank-shell problem than general practices, not less (25% and 23% vs 18%). E-commerce specialists lead on both schema signals: they were about twice as likely as general firms to carry "I am an accounting business" schema (33% vs 17%) and far more likely to use FAQ markup (46% vs 0%). Yet that same modern, JavaScript-heavy website profile is exactly what tends to render as an empty shell to a simple fetch: nearly a quarter of e-commerce specialists (23%) served our crawler nothing readable, broadly in line with the IR35 specialists (25%) and worse than you'd guess from how polished these sites look in a browser.
The lesson isn't "build a worse website." It's that a slick, modern site can still be invisible to a simple AI crawler if the content only appears after JavaScript runs. Identity schema and FAQ markup are worth nothing to an AI that never receives any readable content in the first place. For the most sophisticated firms, server-rendered or pre-rendered content is the single highest-leverage fix.
// Three convenience samples (34 + 28 + 31 = 93 firms), all checked 3 June 2026. Aggregate and anonymised. Foundations / readiness only, not live AI rankings. Our automated reader does not execute JavaScript, so "blank/unreadable" reflects what a simple non-JS crawler sees, not necessarily what a full browser renders.
One honest caveat before the detail: these are three separate convenience samples found via different searches, and the specialist niches skew toward larger national firms with dedicated web teams. So the machine-readable gaps are arguably understated for the smaller end of each niche, and the niches aren't perfectly like-for-like. We treat the comparison as a directional signal, not a precise ranking. The rest of this report sets out the general-practice sample, the largest and most representative of our ideal client, in full.
Why the machine-readable layer matters
AI assistants don't invent firm names. They retrieve them from sources they can read, then describe firms using the structured signals those sources carry. Missing plumbing means an assistant has less to go on, and a competitor's name may be easier for it to use.
Three foundations sit underneath this. First, being readable at all. Many AI crawlers fetch the raw page, so a homepage that only renders after JavaScript runs can look like an empty shell to them. Second, structured data (schema.org / JSON-LD): the explicit, machine-readable tags that say "this is an AccountingService, here's the name, address and phone." Third, consistent name/address/phone (NAP) details and supporting signals (sitemap, robots access). These are plumbing, not magic. Plumbing that's missing is plumbing an AI can't use.
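As a rough illustration of the second foundation, here is what an AccountingService JSON-LD block might look like, generated with a few lines of Python. Every detail of the firm (name, phone, address, URL) is invented for the example, and the set of properties is a minimal sketch, not a complete or required list.

```python
import json

# Hypothetical firm details, invented purely for illustration.
firm = {
    "@context": "https://schema.org",
    "@type": "AccountingService",
    "name": "Example Accountants Ltd",
    "telephone": "+44 113 496 0000",
    "url": "https://www.example.com/",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",
        "addressLocality": "Leeds",
        "postalCode": "LS1 1AA",
        "addressCountry": "GB",
    },
}


def to_jsonld_script(data: dict) -> str:
    """Wrap a dict as the <script> tag that carries JSON-LD in a page's HTML."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


print(to_jsonld_script(firm))
```

The output is a single script tag that sits in the page's HTML. Because it is part of the raw response, a crawler that does not run JavaScript can still read it, provided the tag is served by the server and not injected client-side.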
The general-practice sample, in three honest buckets
Everything from here on describes the 34 general independent practices, the original and largest of our three samples. Before any percentage, the most important split: could we read the site at all? Where our tool could not read a page, we count that separately and never record it as "no schema." Claiming a firm lacks schema when we never saw its page would be false.
| Bucket | Count | What it means |
|---|---|---|
| Homepage readable | 24 / 34 | We retrieved real page content. Schema and NAP findings below are based on these. |
| Homepage unreadable | 6 / 34 | Server returned a "success"-type response (HTTP 202) but zero readable text: a blank/JS-only shell. Not scored for schema/NAP. |
| Could not fetch | 4 / 34 | Repeated connection failures or an origin error (HTTP 520) on two attempts. Not scored. |
The split itself is a finding: 10 of 34 firms (29%) served either nothing readable or nothing at all to a simple automated fetch, exactly the kind of fetch many AI crawlers perform.
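The three buckets can be expressed as a small decision rule. The sketch below is our simplification for illustration: the function name, the status thresholds and the "any non-whitespace text counts as readable" test are assumptions, not the tool's exact logic.

```python
from typing import Optional


def classify_fetch(status: Optional[int], visible_text: str) -> str:
    """Assign one homepage fetch to a bucket.

    status is None when the connection failed outright. The thresholds
    below are assumed for illustration.
    """
    if status is None or status >= 400:
        return "could_not_fetch"   # connection failure or error response
    if visible_text.strip():
        return "readable"          # real page content came back
    return "unreadable"            # success-type response, zero readable text


# Illustrative outcomes only, not rows from the study data.
examples = [
    (200, "Chartered accountants in Leeds"),
    (202, ""),
    (520, ""),
    (None, ""),
]
for status, text in examples:
    print(status, "->", classify_fetch(status, text))
```

The point of keeping "unreadable" and "could_not_fetch" as their own outcomes is that neither is ever converted into a "no schema" result further down the pipeline.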
The findings
All schema and NAP percentages below are calculated only over the 24 firms whose homepages we could actually read (denominator = 24), so they describe firms we genuinely assessed. The llms.txt, sitemap, robots and fetch-outcome percentages are over the full sample of 34.
// Finding 01 · entity schema
Only 4 of 24 readable firms (17%) tell AI "I am an accounting business"
Just 4 of 24 readable firms (17%) carried a LocalBusiness, AccountingService, ProfessionalService or FinancialService schema type, the structured signal that lets an AI confidently classify the firm as a local accounting provider. Most schema we did find was generic (WebSite, WebPage, BreadcrumbList): useful for normal search, but it doesn't say "accountant."
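For readers who want to check their own homepage, the sketch below shows one way to list the schema types a page declares in JSON-LD and test for the four entity types named above. It uses only the Python standard library; the class and function names are ours, and this is an approximation of the idea, not the Foundations Check's actual code.

```python
import json
from html.parser import HTMLParser

ENTITY_TYPES = {"LocalBusiness", "AccountingService",
                "ProfessionalService", "FinancialService"}


class JsonLdCollector(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False


def schema_types(html: str) -> set:
    """Return every @type found at any depth in the page's JSON-LD."""
    collector = JsonLdCollector()
    collector.feed(html)
    found = set()
    for block in collector.blocks:
        try:
            stack = [json.loads(block)]
        except json.JSONDecodeError:
            continue  # unparseable JSON-LD is skipped
        while stack:
            node = stack.pop()
            if isinstance(node, list):
                stack.extend(node)
            elif isinstance(node, dict):
                types = node.get("@type", [])
                if isinstance(types, str):
                    types = [types]
                if isinstance(types, list):
                    found.update(t for t in types if isinstance(t, str))
                stack.extend(node.values())
    return found


# Hypothetical page, for illustration only.
sample = """<html><head><script type="application/ld+json">
{"@context": "https://schema.org", "@graph": [
  {"@type": "WebSite", "name": "Example"},
  {"@type": "AccountingService", "name": "Example Accountants Ltd"}
]}
</script></head><body></body></html>"""

types = schema_types(sample)
print(sorted(types), "entity type present:", bool(types & ENTITY_TYPES))
```

A page that only declares WebSite, WebPage or BreadcrumbList would return a non-empty set here but still fail the entity-type test, which is the distinction this finding draws.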
// Finding 02 · FAQ markup
Not one firm in the sample used FAQ markup
0 of 24 readable firms (0%) had FAQPage schema. FAQ markup is one of the cheapest ways to hand an AI ready-made question-and-answer content it can lift directly into an answer. Nobody in this sample is doing it.
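To show how little is involved, the sketch below builds a FAQPage object from question-and-answer pairs using the schema.org FAQPage, Question and Answer types. The helper name and the sample question are hypothetical; a real page would use the firm's own questions and embed the result in a JSON-LD script tag.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Hypothetical question and answer, for illustration only.
example = faq_jsonld([
    (
        "Do you work with limited companies?",
        "Yes. We prepare accounts and corporation tax returns for limited companies.",
    ),
])
print(json.dumps(example, indent=2))
```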
// Finding 03 · no structured data
A quarter of readable firms had no machine-readable schema at all
6 of 24 readable firms (25%) had no parseable JSON-LD structured data whatsoever. AI assistants get no explicit structured entity data from these sites and must infer everything from prose.
// Finding 04 · blank to a crawler
Six firms served a homepage that read as blank to an automated crawler
6 of 34 firms (18%) returned an HTTP 202 response with no readable text and no title: a JavaScript-only or placeholder shell. Human visitors with a full browser see a normal site; a simple AI crawler that reads the raw response may see nothing. We've flagged this conservatively, as a readability risk rather than a confirmed hard block. It is potentially the highest-impact issue of all: you can't be recommended on the strength of content a crawler never receives.
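The difference between a JavaScript-only shell and a server-rendered page is easy to see from the raw HTML. The sketch below extracts the text a reader that does not run JavaScript would see, using Python's standard HTML parser. Both sample pages and the names used are invented for illustration, and the extraction rule is a simplification of what any real crawler does.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text from raw HTML, skipping script, style and template content."""

    SKIP = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def visible_text(html: str) -> str:
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


# Hypothetical pages: content injected by JavaScript vs. content in the HTML.
js_shell = ('<html><head><title></title><script src="/app.js"></script></head>'
            '<body><div id="root"></div></body></html>')
server_rendered = ('<html><head><title>Example Accountants</title></head>'
                   '<body><h1>Accountants in Leeds</h1></body></html>')

print(repr(visible_text(js_shell)))
print(repr(visible_text(server_rendered)))
```

The first page may look complete in a browser once its script has run, but the raw response carries no text at all, which is the situation this finding describes.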
// Finding 05 · llms.txt, sitemaps, robots
llms.txt is almost universally absent; sitemaps and robots are patchy
llms.txt present: 4 / 34 (12%), an emerging (not yet essential) standard for guiding AI crawlers, and an easy edge almost nobody has taken. Sitemap present: 23 / 34 (68%), so roughly a third have no discoverable sitemap. robots.txt found: 24 / 34 (71%).
// Finding 06 · crawler blocking
Outright AI-crawler blocking is rare; "partial" blocking is more common
2 / 34 (6%) explicitly disallow major AI crawlers (GPTBot, ClaudeBot, Google-Extended, CCBot, Bytespider, Applebot-Extended) in robots.txt. 6 / 34 (18%) have rules that partially restrict crawlers from paths that may include real content. Confirmed WAF / bot-challenge hard blocks: 0 / 34 (0%); one firm had a managed challenge present while still serving content (not a hard block).
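A robots.txt file can be tested against the crawler names listed above with Python's standard `urllib.robotparser`. The sketch below is a minimal illustration: the sample robots.txt and the example.com URLs are hypothetical, and the standard-library parser's matching rules may differ in detail from how each crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "Google-Extended",
               "CCBot", "Bytespider", "Applebot-Extended"]


def blocked_ai_crawlers(robots_txt: str, url: str = "https://www.example.com/") -> list:
    """Return the AI crawlers that robots.txt disallows from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]


# Hypothetical robots.txt: one crawler blocked outright, one path closed to all.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /wp-admin/
"""

print("Blocked from homepage:", blocked_ai_crawlers(sample))
print("Blocked from /wp-admin/:",
      blocked_ai_crawlers(sample, "https://www.example.com/wp-admin/page"))
```

Checking more than one URL is what separates an outright block from a partial one: a crawler may be allowed on the homepage yet shut out of a path that holds real content.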
// Finding 07 · NAP
NAP basics are mostly present, but rarely structured
Of the 24 readable firms: 15 (62.5%) had strong name/address/phone signals (both a phone number and a UK postcode detectable); 9 (37.5%) had only partial signals; 0 had none at all. But only 2 (8%) had NAP inside their JSON-LD in a form we could cross-check. 22 (92%) carried no structured NAP for an AI to read reliably. The address is on the page for humans, but not in the machine-readable layer AI prefers.
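The strong / partial / none grading can be sketched with two pattern checks on the page text. The regular expressions below are deliberately loose approximations of UK phone numbers and postcodes, chosen by us for illustration; they are not the tool's actual patterns and will miss or over-match some real-world formats.

```python
import re

# Loose, assumed patterns: a UK-style phone number and a UK-style postcode.
PHONE_RE = re.compile(r"(?:\+44\s?|\b0)\d(?:[\s-]?\d){8,9}\b")
POSTCODE_RE = re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b", re.IGNORECASE)


def nap_strength(page_text: str) -> str:
    """Grade name/address/phone signals as 'strong', 'partial' or 'none'."""
    has_phone = bool(PHONE_RE.search(page_text))
    has_postcode = bool(POSTCODE_RE.search(page_text))
    if has_phone and has_postcode:
        return "strong"
    if has_phone or has_postcode:
        return "partial"
    return "none"


# Hypothetical page text, for illustration only.
print(nap_strength("Call 0113 496 0000. 1 Example Street, Leeds LS1 1AA"))
print(nap_strength("Call us on 0113 496 0000"))
print(nap_strength("Welcome to our practice"))
```

Pattern matching of this kind only shows that the details appear somewhere in the prose. It says nothing about whether they are also present in JSON-LD, which is the gap the 2-of-24 figure points to.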
What this adds up to
For this convenience sample of small UK independent firms, the pattern is consistent: firms have built websites for humans and for traditional Google, but very few have done the small, specific things that make them legible to AI assistants. The gap isn't "bad websites." It's the missing machine-readable layer: the explicit "I am an accountant" schema (only about 1 in 6 readable firms), FAQ markup (nobody), structured address data an AI can trust (about 1 in 12), and, for nearly a fifth of firms, a homepage a simple crawler can read at all.
These are low-cost, high-leverage fixes. The firms that close this gap first, in their city and specialism, give AI assistants the clearest reason to name them.
Methodology: read this, it's the honest part
// How we ran it
- What we ran: our own Foundations Check tool, which makes a small number of polite, public-page requests (homepage, robots.txt, llms.txt, sitemap.xml, and the /contact and /about pages). It inspects only publicly available information, with a normal browser user-agent, short timeouts, and a handful of pages per site.
- What it checks: can the homepage be read; is schema.org / JSON-LD present (and of what type); is there FAQ markup; is there an llms.txt; is there a sitemap; does robots.txt block AI crawlers; is there a WAF / bot-challenge interstitial; are NAP signals present and internally consistent.
- What it does NOT do: it does not test whether any AI assistant actually names the firm. This report is about foundations / readiness, not live AI rankings.
- Samples: three separate convenience samples, 93 firms in total, all checked on 3 June 2026. General (34): identified via public web searches ("independent chartered accountants [city]") across Leeds, Bristol, Manchester, Nottingham, Glasgow, Cardiff, Birmingham, Norwich, Brighton and Newcastle, biased deliberately toward small independents (our ideal client). IR35 / contractor (28): firms that explicitly position on contractor / IR35 / limited-company work. E-commerce (31): firms that explicitly position on e-commerce / Amazon / Shopify / online-seller work. None is random or representative; the two specialist niches skew toward larger national firms. Treat every percentage as "of the firms we happened to check," not a national statistic.
- Date: all checks run on 3 June 2026. Websites change; these are point-in-time findings.
- Honesty rule: no number here is inflated or invented; percentages are rounded to the nearest whole number. Where the tool could not read a page, we count that separately and never record it as "no schema."
Limitations, stated plainly
// Where this stops
- Three convenience samples (34 + 28 + 31 = 93). Firms were found via public searches and grouped into general, IR35/contractor and e-commerce niches. None is random, representative, or generalisable to all UK accountants, and the three are not perfectly like-for-like (the specialist niches skew larger and more national). With samples this size, each firm is worth roughly 3 to 5 percentage points (about 3 to 4 over the full samples, 4 to 5 over the readable-only denominators), so small differences are noise and the cross-niche comparison is directional, not exact.
- Point-in-time (3 June 2026). Sites change; a firm could fix or break any of this tomorrow.
- Foundations only. We measured readiness, not whether any AI engine actually recommends a firm. No claim here should be read as "AI ignores firm X."
- Automated fetch ≈ a simple crawler. Our reader does not execute JavaScript. Sites that render entirely client-side read as "unreadable" to us, and to similarly simple AI crawlers, even though they look fine in a browser. We count these separately and never as "no schema."
- Fetch failures may be transient or defensive. The 4 unfetchable firms failed twice; we report them as "could not fetch," not as any specific fault.
- Anonymised by design. Method, tool and raw per-firm data are retained internally. This published report is aggregate and anonymised for fairness and data-minimisation: no individual firm is named with a shortcoming.
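The per-firm weight mentioned in the first limitation can be checked directly: one firm moves a percentage by 100 / n points, where n is the denominator in use. The sketch below works that out for the full samples and for the readable-only denominators taken from the cross-niche table; the variable and function names are ours.

```python
# Denominators taken from the report: full samples and readable-only firms.
SAMPLE_SIZES = {"general": 34, "ir35_contractor": 28, "ecommerce": 31}
READABLE_SIZES = {"general": 24, "ir35_contractor": 21, "ecommerce": 24}


def points_per_firm(n: int) -> float:
    """Percentage points contributed by a single firm in a group of n."""
    return round(100 / n, 1)


for group in SAMPLE_SIZES:
    full = points_per_firm(SAMPLE_SIZES[group])
    readable = points_per_firm(READABLE_SIZES[group])
    print(f"{group}: {full} points per firm (full sample), "
          f"{readable} points per firm (readable only)")
```

On these denominators a gap of a few points between two groups can come down to a single firm, which is why we treat the cross-niche comparison as directional.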
A note on AI rankings (clearly labelled, not a core finding)
The solid core of this report is the foundations data above. We did not perform verbatim AI-engine reads for it, and we make no claim about which firms AI assistants name. Any future "who shows up" analysis will be labelled as a separate, proxy measurement with its own caveats. If you want to understand the signals that drive those recommendations, our guide on how AI decides which accountant to recommend walks through them in plain English.