We want crawlers to ignore our dev sites, which are still on the public internet.
If Cro can do this by matching the Host request header, that's preferred. However, if you need some extra runtime config, I can do that on the server for you.
If the request's Host != "raku.org", we need to return the following at the path /robots.txt:
    User-agent: *
    Disallow: /
Otherwise, we can return 404.
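For what it's worth, here's a minimal sketch of how that could look with Cro::HTTP::Router. This is only an assumption about how the app is wired up (the route block, variable names, and port-stripping are illustrative, not taken from the actual raku.org code):

```raku
# Minimal sketch, assuming the site is served from a Cro::HTTP::Router
# route block; $application and the route layout are illustrative.
use Cro::HTTP::Router;

my $application = route {
    get -> 'robots.txt' {
        # Host header as sent by the client, with any ":port" suffix dropped.
        my $host = (request.header('Host') // '').split(':')[0].lc;

        if $host eq 'raku.org' {
            # Production host: keep the current behaviour and return 404.
            not-found;
        }
        else {
            # Any other host (dev sites): tell all crawlers to stay out.
            content 'text/plain', "User-agent: *\nDisallow: /\n";
        }
    }

    # ... existing routes for the rest of the site ...
}
```

If matching the Host header like this turns out to be awkward, the same check could instead compare against a canonical hostname passed in as runtime config, per the note above.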
The docs site already does this correctly, and fwiw it's more important for the docs site, since there's a lot more content there to index.
Compare: https://docs.raku.org/robots.txt with https://dev.docs.raku.org/robots.txt