Explore >> Select a destination


You are here

www.madebydusk.com

www.woorank.com
2.0 parsecs away
Optimize your site's crawling and indexing. Tell search engines exactly where to find your XML sitemap in your robots.txt file.

www.ross.ws
4.2 parsecs away
Michael Ross, freelance writer and web developer

ericlathrop.com
2.5 parsecs away
All sorts of companies are building machine learning models by crawling the web for training data. This is a form of copyright laundering, and the legality is questionable.

tsak.dev
17.9 parsecs away
With the recent news that OpenAI's web crawler respects robots.txt, and the ensuing scramble by seemingly everybody to make sure their robots.txt blocks GPTBot, I wondered whether there was a better solution to help our future AI overlords make sense of the world. I host all my sites on a tiny NUC running nginx and had previously played with its return directive, so I decided to reuse the same trick for visits from GPTBot.