Measure the real document size Google actually processes.
Check the page before Google drops the rest
Google only indexes the first 2 MB of HTML and can stop parsing heavy resources long before a problem becomes visible in Search Console. Test one public URL to see what Googlebot gets, what it skips, and where technical risk begins.
- One pass for HTML, CSS, JS, compression, headers, and response timing.
- Googlebot access checks across robots.txt, meta robots, and X-Robots-Tag.
- Clear results cards built for technical SEO triage, not vanity metrics.
Run a live check
Paste any public URL
Use a product page, landing page, or any deep URL with large bundles or inline data.
Catch oversized bundles and understand total transfer weight.
Check robots, headers, redirects, and crawl-facing status.
Complete the security check to unlock the request.
Running crawl-facing checks
We fetch the document first, then check linked assets and bot accessibility.
Included in every run
What the tool checks
Fast enough for debugging a single URL, detailed enough for handing the issue to engineering.
HTML document size
Google only indexes the first 2 MB of HTML content. Large inline JSON, hydration payloads, or third-party embeds can silently push critical content past the line.
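If you want to reproduce the core measurement yourself, a minimal Node/TypeScript sketch (Node 18+ with the built-in fetch) could look like the following; the URL and the 2 MB threshold are illustrative, not the tool's exact logic:

```ts
// Minimal sketch, not the tool itself: fetch a page and measure the
// decompressed HTML size against a 2 MB indexing threshold.
const LIMIT_BYTES = 2 * 1024 * 1024; // 2 MB, illustrative threshold

async function checkHtmlSize(url: string): Promise<void> {
  // fetch() transparently decompresses gzip/brotli responses, so the
  // measured length approximates the HTML Googlebot would actually parse.
  const res = await fetch(url, { redirect: "follow" });
  const html = await res.text();
  const bytes = Buffer.byteLength(html, "utf8");

  console.log(`${url}: ${(bytes / 1024).toFixed(0)} KiB of HTML`);
  if (bytes > LIMIT_BYTES) {
    console.warn("Content after the first 2 MB may never be indexed.");
  }
}

checkHtmlSize("https://example.com/some-product-page"); // hypothetical URL
```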
CSS and JS resources
Each stylesheet and script is checked individually. Heavy bundles hurt crawl rendering, slow page load, and often point to broader front-end bloat.
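A simplified sketch of the per-asset weighing, using a regex where the tool would use a real HTML parser; URLs and attribute ordering assumptions are illustrative:

```ts
// Rough sketch: pull stylesheet and script URLs out of the HTML and report
// each transfer size. The regexes are deliberately naive.
async function weighAssets(pageUrl: string): Promise<void> {
  const html = await (await fetch(pageUrl)).text();
  const urls = [
    ...html.matchAll(/<link[^>]+rel=["']stylesheet["'][^>]+href=["']([^"']+)["']/gi),
    ...html.matchAll(/<script[^>]+src=["']([^"']+)["']/gi),
  ].map((m) => new URL(m[1], pageUrl).href);

  for (const assetUrl of urls) {
    const res = await fetch(assetUrl);
    const bytes = (await res.arrayBuffer()).byteLength;
    console.log(`${(bytes / 1024).toFixed(0)} KiB  ${assetUrl}`);
  }
}

weighAssets("https://example.com/"); // hypothetical URL
```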
Googlebot accessibility
Robots rules, meta robots, X-Robots-Tag, and Googlebot fetch status are checked together so indexability issues are visible in one place.
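Conceptually, this boils down to fetching with a Googlebot user-agent and reading the status, X-Robots-Tag header, and meta robots tag side by side; the UA string and URL below are assumptions, not the tool's exact values:

```ts
// Illustrative sketch of the "one place" idea for indexability signals.
async function checkIndexability(url: string): Promise<void> {
  const res = await fetch(url, {
    headers: {
      "User-Agent":
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    },
  });
  const html = await res.text();
  const metaRobots =
    html.match(/<meta[^>]+name=["']robots["'][^>]+content=["']([^"']+)["']/i)?.[1];

  console.log("status:       ", res.status);
  console.log("x-robots-tag: ", res.headers.get("x-robots-tag") ?? "(not set)");
  console.log("meta robots:  ", metaRobots ?? "(not set)");
}

checkIndexability("https://example.com/some-page"); // hypothetical URL
```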
Compression and caching
gzip or brotli can shrink transfer size dramatically, and cache headers reduce repeated downloads for users, bots, and synthetic monitors alike.
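You can eyeball this yourself from Node with a single fetch; header handling here is simplified and the URL is illustrative:

```ts
// Illustrative sketch: inspect compression and caching headers for one URL.
// fetch negotiates gzip/brotli automatically and returns the decompressed body.
async function checkCompressionAndCaching(url: string): Promise<void> {
  const res = await fetch(url);
  const body = await res.text();

  console.log("content-encoding:", res.headers.get("content-encoding") ?? "(none)");
  console.log("cache-control:   ", res.headers.get("cache-control") ?? "(not set)");
  console.log("reported length: ", res.headers.get("content-length") ?? "(not reported)");
  console.log("decompressed:    ", Buffer.byteLength(body, "utf8"), "bytes");
}

checkCompressionAndCaching("https://example.com/"); // hypothetical URL
```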
Security headers
HSTS, CSP, X-Frame-Options, and related headers are summarized so SEO, UX, and security checks can happen in the same QA pass.
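The summary amounts to checking a fixed list of response headers; the list below is an illustrative subset, not necessarily the tool's exact set:

```ts
// Sketch of the summary card: report which common security headers are set.
const SECURITY_HEADERS = [
  "strict-transport-security",
  "content-security-policy",
  "x-frame-options",
  "x-content-type-options",
  "referrer-policy",
];

async function summarizeSecurityHeaders(url: string): Promise<void> {
  const res = await fetch(url);
  for (const name of SECURITY_HEADERS) {
    console.log(`${name}: ${res.headers.get(name) ?? "missing"}`);
  }
}

summarizeSecurityHeaders("https://example.com/"); // hypothetical URL
```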
robots.txt and sitemap hints
We inspect robots.txt for bot blocks and sitemap references, then show the parts most likely to affect the exact URL you tested.
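A minimal sketch of that robots.txt pass, ignoring user-agent groups and wildcard rules that a full parser would handle:

```ts
// Simplified robots.txt hints: list Sitemap lines and Disallow rules whose
// path prefix matches the tested URL.
async function robotsHints(pageUrl: string): Promise<void> {
  const { origin, pathname } = new URL(pageUrl);
  const txt = await (await fetch(`${origin}/robots.txt`)).text();

  const disallows = txt.match(/^Disallow:\s*(\S+)/gim) ?? [];
  const sitemaps = txt.match(/^Sitemap:\s*(\S+)/gim) ?? [];

  console.log("sitemaps:", sitemaps);
  console.log(
    "rules touching this URL:",
    disallows.filter((line) => pathname.startsWith(line.split(/\s+/)[1] ?? ""))
  );
}

robotsHints("https://example.com/category/product"); // hypothetical URL
```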
Common questions
FAQ
The most frequent edge cases around Google's size limits and crawl behaviour.
Isn't Google's documented crawl limit higher than 2 MB?
Google documents a higher download ceiling overall, but only the first 2 MB of HTML content is indexed. The limit keeps crawl cost under control at web scale.
Does the limit apply to compressed or uncompressed HTML?
The limit applies to the decompressed content size. A page that looks small over gzip can still expand past the 2 MB limit when Googlebot parses the HTML.
What about content rendered with JavaScript?
Googlebot may not execute all of your JavaScript. If your content depends on client-side rendering, part of the page can become invisible to the crawler. Code splitting and SSR/SSG usually reduce the risk.
How do I enable compression?
- Nginx: gzip on; gzip_types text/css application/javascript;
- Apache: enable mod_deflate and mod_brotli.
- Cloudflare: compression is typically enabled by default.
- Node.js / Express: use the compression middleware; see the sketch after this list.
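As a concrete example of the Express option above, a minimal setup with the compression middleware from npm might look like this; the port and threshold values are illustrative:

```ts
// Express with the `compression` middleware: responses above the threshold
// are compressed when the client (or bot) sends Accept-Encoding.
import express from "express";
import compression from "compression";

const app = express();
app.use(compression({ threshold: 1024 })); // skip tiny responses; value is illustrative
app.use(express.static("public"));

app.listen(3000, () => console.log("listening on :3000"));
```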
Why does the page work in my browser but fail for Googlebot?
Some stacks block or degrade responses specifically for bots through WAF rules, bot protection, CDN products, or UA filtering. The page can load fine for humans while the crawler gets a 403, a 503, or empty HTML.
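You can reproduce the comparison with two fetches that differ only in user-agent; the UA strings below are illustrative:

```ts
// Sketch for spotting bot-specific blocking: request the same URL with a
// browser-like UA and a Googlebot UA, then compare status and body size.
async function compareUserAgents(url: string): Promise<void> {
  const agents = {
    browser: "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    googlebot:
      "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
  };

  for (const [label, ua] of Object.entries(agents)) {
    const res = await fetch(url, { headers: { "User-Agent": ua } });
    const body = await res.text();
    console.log(`${label}: HTTP ${res.status}, ${body.length} chars`);
  }
}

compareUserAgents("https://example.com/some-page"); // hypothetical URL
```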
Is document size itself a ranking factor?
Not directly, but oversized CSS and JS increase render blocking, parse time, and main-thread work. That pushes LCP and TBT in the wrong direction, which often becomes both a ranking and conversion problem.
Need hands-on help?
Need an SEO audit or a custom performance tool?
Ivatech builds fast websites, technical SEO workflows, and custom tooling for teams that need actionable engineering fixes instead of generic reports.