A developer using Cloudflare Pages with Astro static-site generation outlines a workflow of post-deploy checks designed to catch production issues related to publishing configuration and crawlability. For three sites (aiappdex.com, findindiegame.com, and ossfind.com), the developer says they add: (1) a sitemap reachability check that verifies sitemap-index.xml returns HTTP 200 on each domain and that the generated sitemap-0.xml contains a minimum expected number of URLs; (2) an IndexNow submission step that runs after the sitemap checks and posts the live URLs from each sitemap to Bing, Yandex, Naver, and Seznam using site-specific keys, reporting outcomes such as 200 OK; and (3) a weekly Lighthouse spot-check (Monday 04:30 UTC) that measures the homepage and a deep entry page per site for performance, layout stability (CLS), and accessibility trends. The developer notes they do not block deployments based on Lighthouse score changes, treating results as trend monitoring rather than a release gate. Separately, they describe a manual script that verifies affiliate and ad sections render in production when the corresponding Cloudflare Pages environment variables are deployed, by fetching pages identified via sitemaps and checking for expected human-readable strings and ads.txt content.
Developer describes post-deploy checks for Cloudflare Pages sites, covering sitemaps, IndexNow, and Lighthouse
- The developer runs three domains on Cloudflare Pages with Astro SSG: aiappdex.com, findindiegame.com, and ossfind.com.
- Post-deploy sitemap checks verify sitemap-index.xml returns HTTP 200 and that sitemap-0.xml contains an expected minimum URL count.
- After sitemap validation, the workflow submits URLs via IndexNow to Bing, Yandex, Naver, and Seznam using site-specific keys.
- A weekly Lighthouse spot-check measures performance, CLS, and accessibility for a homepage and a deep URL per site, and uploads results for comparison.
- A separate script checks affiliate/ad rendering by fetching HTML for sample detail pages discovered via sitemap indexes and validating specific rendered strings and ads.txt patterns.
I run three directory sites that display affiliate links, AdSense slots, and Amazon blocks, but only when the corresponding environment variables are set in Cloudflare Pages. When the variables aren't deployed, those sections simply don't render. No error. No broken layout. Just missing revenue, invisible unless you check.

This happened twice in the first month. A redeploy would go out without the affiliate env vars being re-applied. The site looked identical to a working version at a glance. Clicking around would eventually reveal the missing CTA, but only if you happened to land on the right page type.

I wrote scripts/check-affiliates.mjs to make that check fast and explicit.

What the script does

The script checks three sites in sequence. For each site, it:

- Fetches /sitemap-index.xml to find the sub-sitemap for detail pages
- Picks one detail URL from that sitemap
- Fetches the raw HTML
- Checks for specific strings that indicate each CTA is rendered

The output is a plain pass/fail report per site:

    → aiappdex.com
      ads.txt: ✓ pub ID set
      sample: https://aiappdex.com/models/qwen2-7b/
      affiliate CTA "Run this model on": ✓ rendered
      AdSense slot: ✓ in HTML
      Amazon block: ✗ hidden (PUBLIC_AMAZON_TAG not deployed)

One line per check, one pass per site, one run to confirm three deployments.

The sitemap crawl

Hardcoding a URL would make the check brittle: detail URLs change when slugs are regenerated, and I'd rather not maintain a separate list. Using the sitemap is more robust because it's the canonical URL source the site already generates.

    async function pickFirstSlug(site, prefix) {
      const sitemapRes = await fetch(`https://${site}/sitemap-index.xml`);
      if (!sitemapRes.ok) return null;
      const idx = await sitemapRes.text();
      const subSitemap = (idx.match(/<loc>([^<]+)<\/loc>/) ?? [])[1];
      if (!subSitemap) return null;
      const subRes = await fetch(subSitemap);
      if (!subRes.ok) return null;
      const sub = await subRes.text();
      const urls = [...sub.matchAll(/<loc>([^<]+)<\/loc>/g)].map((m) => m[1]);
      const detail = urls.find((u) => u.includes(prefix) && !u.endsWith(prefix));
      return detail ?? null;
    }

It parses <loc> tags from the sitemap XML with a regex rather than a full XML parser. That's intentional: DOMParser isn't available in Node by default, and adding a dependency for sitemap parsing felt disproportionate. The regex works because sitemap XML format is structurally consistent; a more complex format would warrant a parser.

One thing to note: the function takes a prefix argument (like /models/ or /games/). That's how I distinguish detail pages from the index pages that also appear in the sitemap. I want a URL like /models/qwen2-7b/, not /models/.

Checking ads.txt and affiliate strings

The ads.txt check is a separate fetch, not the HTML check. It looks for the AdSense publisher ID pattern:

    const adsTxtRes = await fetch(`https://${site}/ads.txt`);
    const adsTxt = await adsTxtRes.text();
    const hasAdsensePub = /pub-\d{10,}/.test(adsTxt);

The HTML checks are string presence checks against the fetched page:

    const hasSection = html.includes(section);
    const hasAdsense = html.includes("adsbygoogle") && html.includes("data-ad-client");
    const hasAmazon = html.includes("Gear up on Amazon") || html.includes("amazon.com/s?k=");

The section variable is site-specific. For aiappdex.com it's "Run this model on"; for findindiegame.com it's "Find on other stores"; for ossfind.com it's "Self-host on". These are heading strings that only appear in the rendered HTML when the relevant env var is set.

I deliberately check strings that are human-readable rather than env var names or data-attributes. If the heading renders, the CTA is live. If the heading is absent, something upstream didn't connect.
The message in that case tells me exactly which env var to check in Cloudflare.

Why this is better than a visual check

The pattern I was relying on before (opening a few pages after a deploy and eyeballing them) has two problems. First, it's slow across three sites with multiple CTA types. Second, it's unreliable at catching conditional rendering: an affiliate block that's absent looks the same as a block I intentionally disabled or that I haven't scrolled to yet.

A script that fetches programmatically, checks presence by string match, and reports pass/fail for each CTA type takes about two seconds and catches the failure unambiguously. The output is readable in a terminal and doesn't require loading a browser.

This connects to the same principle behind the three-tier content quality ladder: checks at different stages catch different things. Post-deploy verification catches deploy-time configuration problems. Pre-commit linting catches content problems. Neither replaces the other.

What I'd add

Right now the script only checks one sample page per site. A more thorough version would check one page from each content type per site (a model page, a compare page, an alternatives page), since some CTAs only render on specific page types. That would require more sitemap traversal but would catch more edge cases.

The output format is human-readable but not machine-parseable. If I wanted to hook this into CI and fail a deploy when a CTA is missing, I'd add a JSON output mode and return a non-zero exit code on any ✗. For now I run it manually after deploys; it takes less than ten seconds and the terminal output is enough.

Part of an ongoing 6-month experiment running three AI-curated directory sites. The technical claims here are real; this article was AI-assisted.
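The JSON output mode and non-zero exit code mentioned under "What I'd add" could look roughly like the sketch below. This is an illustration under assumptions, not the author's script: the result shape (`site`, `checks`) and the `summarize` helper are hypothetical names, and the sample data is made up to mirror the report shown earlier.

```javascript
// Hypothetical result shape: one entry per site, each with named boolean checks.
// None of these names come from the original check-affiliates.mjs.
function summarize(results) {
  const failures = [];
  for (const { site, checks } of results) {
    for (const [name, passed] of Object.entries(checks)) {
      if (!passed) failures.push({ site, check: name });
    }
  }
  return {
    // Machine-parseable report for a CI step to consume.
    json: JSON.stringify({ ok: failures.length === 0, failures, results }, null, 2),
    // 0 when every check passed, 1 when any check failed.
    exitCode: failures.length === 0 ? 0 : 1,
  };
}

// Made-up sample data: one missing Amazon block on one site.
const sampleResults = [
  { site: "aiappdex.com", checks: { adsTxt: true, affiliateCta: true, adsense: true, amazon: false } },
  { site: "ossfind.com", checks: { adsTxt: true, affiliateCta: true, adsense: true, amazon: true } },
];

const report = summarize(sampleResults);
console.log(report.json);
```

In a real script, assigning `report.exitCode` to `process.exitCode` at the end of the run would let a CI job fail on any missing CTA while still printing the full report first.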
4 hours ago

After spending two weeks debugging issues that only showed up in production (a sitemap _redirects rule that was blocking my own sitemap-index.xml, and a Bluesky image upload race against Cloudflare Pages deploy lag), I added three post-deploy checks to my workflow. They're fast and specific to the failure modes I've actually hit, not a full end-to-end test suite.

Three sites (aiappdex.com, findindiegame.com, ossfind.com) on Cloudflare Pages with Astro 5 SSG. Here's what I check.

Check 1: Sitemap reachability

The simplest check and the one I should have had from day one. After a Cloudflare Pages deploy, I verify that sitemap-index.xml is reachable and returning 200 on all three domains:

    for domain in aiappdex.com findindiegame.com ossfind.com; do
      status=$(curl -s -o /dev/null -w "%{http_code}" "https://$domain/sitemap-index.xml")
      echo "$domain/sitemap-index.xml → $status"
      if [ "$status" != "200" ]; then
        echo "FAIL: $domain sitemap unreachable"
      fi
    done

I also check sitemap-0.xml, the actual URL sub-sitemap that @astrojs/sitemap generates, and assert that it contains at least a minimum expected URL count. For aiappdex.com that threshold is 1,000; if it drops below that after a deploy, the ETL data pipeline probably broke silently.

The reason this check exists: I had a _redirects rule rewriting sitemap-index.xml → sitemap-0.xml as an emergency workaround that turned out to be wrong. It was live for five days before I found it. The rule was blocking the real sitemap-index.xml from reaching crawlers while appearing fine in the browser (which followed the redirect). Curl with -o /dev/null -w "%{http_code}" doesn't follow redirects by default, so it would have caught this immediately.

Check 2: IndexNow batch submission

After every successful sitemap check, I run node scripts/indexnow.mjs.
The script reads the live sitemap XML from each domain, collects all URLs, and POSTs them to the IndexNow endpoint for Bing, Yandex, Naver, and Seznam using site-specific keys. Output looks like:

    aiappdex.com: submitted 1179 URLs → 200 OK
    findindiegame.com: submitted 139 URLs → 200 OK
    ossfind.com: submitted 144 URLs → 200 OK

If a site returns 403 from IndexNow it usually means the key verification file (/<key>.txt) wasn't deployed correctly or a _redirects rule is mangling the path. Catching this right after deploy matters because the IndexNow key-verification window isn't instantaneous; letting it sit in a broken state delays indexing. I wrote more about the IndexNow setup in this week's tools post.

I run this manually after deploy rather than inline in the GitHub Actions workflow because the Cloudflare Pages build takes 2-3 minutes, and IndexNow works best with live URLs. Running it as a separate workflow_dispatch trigger after the deployment succeeds means I'm submitting URLs that are actually live rather than ones that might still be deploying.

Check 3: Weekly Lighthouse spot-check

The third check runs on a cron (Monday 04:30 UTC), not after every deploy. It's slower (3-4 minutes per site, nine URLs total), so daily would be wasteful for a static site that doesn't change at runtime.

The workflow uses treosh/lighthouse-ci-action with one homepage and one deep entry page per site:

    matrix:
      site:
        - { domain: aiappdex.com, sample: /models/timm-vit-base-patch16-clip-224-openai/ }
        - { domain: findindiegame.com, sample: /games/dredge-1562430/ }
        - { domain: ossfind.com, sample: /alternatives/ghost/ }

I'm watching for Performance below 80, CLS above 0.1, or accessibility score regression. Astro SSG with no client-side JS should hold steady on all three; if they slip, it means something in the Tailwind v4 config or the ad slot component changed the layout paint behavior. The results upload to temporaryPublicStorage so I can diff before/after on regressions.
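The three things being watched (Performance below 80, CLS above 0.1, accessibility regression) can be expressed as a small comparison between two weekly runs. The sketch below is illustrative only: the `flagLighthouseRun` helper and its field names are hypothetical, scores are assumed to be on a 0-100 scale, and the CLS and accessibility numbers are invented sample values. It only collects warnings and never fails anything.

```javascript
// Thresholds taken from the text: Performance below 80, CLS above 0.1,
// or any accessibility drop compared with the previous run.
// Field names and the 0-100 score scale are assumptions for this sketch.
function flagLighthouseRun(current, previous) {
  const warnings = [];
  if (current.performance < 80) {
    warnings.push(`performance ${current.performance} is below 80`);
  }
  if (current.cls > 0.1) {
    warnings.push(`CLS ${current.cls} is above 0.1`);
  }
  if (previous && current.accessibility < previous.accessibility) {
    warnings.push(`accessibility fell from ${previous.accessibility} to ${current.accessibility}`);
  }
  return warnings; // warnings only: nothing here blocks a deploy
}

// Invented sample runs for one URL.
const lastWeek = { performance: 94, cls: 0.02, accessibility: 100 };
const thisWeek = { performance: 88, cls: 0.14, accessibility: 96 };

const trendWarnings = flagLighthouseRun(thisWeek, lastWeek);
for (const w of trendWarnings) console.log(`WARN: ${w}`);
```

A performance drop from 94 to 88 produces no warning here because it stays above the 80 floor, which matches treating the scores as a trend to watch rather than a pass/fail gate.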

I don't set hard failure thresholds that block deploys. These sites are pre-revenue with essentially zero traffic right now; blocking a deploy because a Lighthouse score dropped from 94 to 88 would be disproportionate. I treat Lighthouse as a trend monitor, not a gate.

What I'm deliberately not checking

No uptime monitoring: I'm relying on Cloudflare's own infrastructure status. No end-to-end user flow tests. No API availability checks: the Turso DB is only queried at build time in SSG mode, so there's nothing to check at runtime.

For a dynamically rendered site, those gaps would matter. For a static CDN deployment where the entire runtime is pre-built HTML, CSS, and a handful of JSON files, the three checks above cover the actual failure surface I've encountered.

The publish pipeline has its own idempotency layer (it reads published_urls from article frontmatter and skips already-distributed posts), so I don't need to verify cross-posting state after each deploy. That's a separate concern.

Part of an ongoing 6-month experiment running three AI-curated directory sites. The technical claims here are real; this article was AI-assisted.
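The minimum-URL assertion from Check 1 and the request body for Check 2 can both be built from the same sitemap parse. The following is a rough sketch, not the contents of scripts/indexnow.mjs: the helper names, the sample XML, and the `example-key` value are placeholders. The payload fields (`host`, `key`, `keyLocation`, `urlList`) follow the IndexNow protocol's JSON format, and the key file is assumed to sit at the site root as described above.

```javascript
// Extract every <loc> value from sitemap XML, using the same regex approach
// the article uses instead of a full XML parser.
function extractLocs(xml) {
  return [...xml.matchAll(/<loc>([^<]+)<\/loc>/g)].map((m) => m[1].trim());
}

// Return a pass/fail object rather than throwing, so the caller decides
// whether a low count should stop the IndexNow step.
function checkMinimumUrls(xml, minimum) {
  const count = extractLocs(xml).length;
  return { count, ok: count >= minimum };
}

// Build the JSON body for an IndexNow bulk submission.
// The key and keyLocation here are placeholders, not real values.
function buildIndexNowPayload(host, key, urls) {
  return {
    host,
    key,
    keyLocation: `https://${host}/${key}.txt`,
    urlList: urls.slice(0, 10000), // the protocol allows up to 10,000 URLs per request
  };
}

// Tiny invented sitemap standing in for a fetched sitemap-0.xml.
const sampleXml = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://ossfind.com/</loc></url>
  <url><loc>https://ossfind.com/alternatives/ghost/</loc></url>
</urlset>`;

const countCheck = checkMinimumUrls(sampleXml, 2);
const payload = buildIndexNowPayload("ossfind.com", "example-key", extractLocs(sampleXml));
console.log(countCheck);
console.log(JSON.stringify(payload, null, 2));
```

Sending the payload would be a POST with a JSON content type to an IndexNow endpoint such as https://api.indexnow.org/indexnow; that network call is left out so the sketch runs offline.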
1 week ago