Trouble submitting sitemap for custom/persistent URLs

I’ve attempted to submit a sitemap in Google Search Console for a site that has a custom domain. When submitting, I receive a "Couldn't fetch" error. However, if I submit the sitemap under the raw or canister ID URL, the submission succeeds. Could this possibly be a service worker issue?

I can confirm that submitting a sitemap.xml to Google Search Console using a custom domain registered with the boundary nodes works.

Do you use your own infrastructure?

Awesome, thank you for confirming.

We’re not using any of our own infrastructure; the property was registered using the custom domains guide. I suspected it could be service worker related because, when navigating to the sitemap in an incognito window, I received the service worker loader. But seeing as you’ve managed to get one accepted, it has to be something else… :thinking:

I don’t think it’s related to the service worker, because the boundary nodes redirect crawlers to raw if needed, as in this case.
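
One way to check what a crawler actually receives is to request the sitemap with a crawler-like User-Agent and look at the status, the final URL, and the content type. The following is only a rough sketch: the `build_crawler_request` and `describe_response` helpers are made up for illustration, the User-Agent is the commonly published Googlebot string (Search Console's fetcher may send something different), and `https://example.com/sitemap.xml` is a placeholder.

```python
import urllib.request

# Commonly published Googlebot User-Agent string (assumption: the real
# Search Console fetcher may identify itself differently).
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_crawler_request(url, user_agent=GOOGLEBOT_UA):
    """Build a GET request that identifies itself with a crawler-like User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def describe_response(url):
    """Fetch the URL with the crawler User-Agent and summarise what came back.

    urlopen follows redirects automatically and raises HTTPError on 4xx/5xx.
    """
    with urllib.request.urlopen(build_crawler_request(url), timeout=10) as resp:
        head = resp.read(200).lstrip()  # first bytes are enough to tell XML from an HTML loader page
        return {
            "status": resp.status,
            "final_url": resp.geturl(),  # differs from `url` if a redirect was followed
            "content_type": resp.headers.get("Content-Type"),
            "looks_like_xml": head.startswith(b"<?xml") or head.startswith(b"<urlset"),
        }


# Usage (performs a real network request; placeholder URL):
# print(describe_response("https://example.com/sitemap.xml"))
```

If `looks_like_xml` comes back false for the custom domain but true for the raw URL, that would point towards the loader page being served to the crawler.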

Do you have the sitemap.xml at the root? If not, is it correctly linked with a meta tag? Have you also added the link to the XML file in robots.txt (not sure it’s needed, but I do)?

Yes, I have the sitemap.xml hosted at the root, and it’s linked in the robots.txt. It’s strange that it works for the canister ID URLs within Search Console, but not the custom URL.
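
For anyone wanting to double-check the robots.txt link, here is a minimal sketch using Python's standard `urllib.robotparser` (its `site_maps()` method needs Python 3.8+). The `sitemaps_in_robots` helper and the robots.txt content are hypothetical and only for illustration.

```python
from urllib.robotparser import RobotFileParser


def sitemaps_in_robots(robots_txt):
    """Return the Sitemap URLs declared in the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present
    return parser.site_maps() or []


# Hypothetical robots.txt content, not the one from the site in question
EXAMPLE_ROBOTS = """\
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
"""

print(sitemaps_in_robots(EXAMPLE_ROBOTS))
```

An empty list would mean the `Sitemap:` line is missing or malformed in the text that was parsed.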

Maybe @rbirkner has an idea?

Hi @Mitch,

It’s a bit hard to debug without working on the concrete example. Could you please share the canister ID and custom domain with us (via DM if that’s better for you)?

Thanks a lot!

I’m bumping this thread as I’m still unable to submit a sitemap in Google Search Console after months of trial and error. Nuance has since been open sourced, if it helps to check out our code:

canister Id: https://exwqn-uaaaa-aaaaf-qaeaa-cai.ic0.app/
custom domain: https://nuance.xyz/

Compared to yours, I noticed the following differences in my sitemap.xml, which Google validates successfully:

  1. I declare more schemas (even though I don’t use those)
  2. Instead of lastmod I use the tags changefreq and priority

Even though all of this information is optional, could it be that Google Search Console somehow requires it?

Your sitemap: https://nuance.xyz/sitemap.xml

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://nuance.xyz/paul-the-dev/8-3tzz7-naaaa-aaaaf-qakha-cai/cycling-in-the-surrey-hills</loc>
<lastmod>2023-10-09</lastmod>
</url>

Mine: https://juno.build/sitemap.xml

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:news="http://www.google.com/schemas/sitemap-news/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns:image="http://www.google.com/schemas/sitemap-image/1.1" xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
<url>
<loc>https://juno.build/blog</loc>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
<url>
<loc>https://juno.build/blog/archive</loc>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
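
For reference, the sitemaps.org protocol only requires `<loc>` inside each `<url>`; `lastmod`, `changefreq` and `priority` are all optional, so both styles above are valid as far as the protocol goes. A small sketch to confirm that both variants parse the same way is below. The `summarize_sitemap` helper is made up for illustration, and the two sample documents are single entries taken from the excerpts above with a closing `</urlset>` added so they are well-formed. It says nothing about what Search Console itself accepts.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
OPTIONAL_TAGS = ("lastmod", "changefreq", "priority")


def summarize_sitemap(xml_text):
    """Parse a sitemap and list each <url> entry with whichever optional tags it has."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{SITEMAP_NS}}}urlset":
        raise ValueError(f"unexpected root element: {root.tag}")
    entries = []
    for url in root.findall(f"{{{SITEMAP_NS}}}url"):
        loc = url.findtext(f"{{{SITEMAP_NS}}}loc")
        if not loc:
            raise ValueError("<url> entry without <loc>")  # <loc> is the only required child
        entry = {"loc": loc.strip()}
        for tag in OPTIONAL_TAGS:
            value = url.findtext(f"{{{SITEMAP_NS}}}{tag}")
            if value is not None:
                entry[tag] = value.strip()
        entries.append(entry)
    return entries


# One entry from each excerpt, closed with </urlset> for illustration
LASTMOD_STYLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://nuance.xyz/paul-the-dev/8-3tzz7-naaaa-aaaaf-qakha-cai/cycling-in-the-surrey-hills</loc>
<lastmod>2023-10-09</lastmod>
</url>
</urlset>"""

CHANGEFREQ_STYLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://juno.build/blog</loc>
<changefreq>weekly</changefreq>
<priority>0.5</priority>
</url>
</urlset>"""

print(summarize_sitemap(LASTMOD_STYLE))
print(summarize_sitemap(CHANGEFREQ_STYLE))
```

If both parse cleanly like this, the difference in optional tags is probably not what Search Console is tripping over, though that is an inference rather than something the protocol guarantees.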