Explore >> Select a destination


You are here

www.jeremiak.com
ericlathrop.com
1.3 parsecs away

All sorts of companies are building machine learning models by crawling the web for training data. This is a form of copyright laundering, and the legality is questionable.
www.andrlik.org
1.1 parsecs away

It is now clear that at least some AI companies are ignoring robots.txt files that forbid them from scraping a site. Robb Knight wrote up a great guide for explicitly blocking those scraping bots via your Nginx config. However, this site is currently served by AWS CloudFront, which means that the content gets served without the request touching the source server. I was sure there had to be a way to do something similar with a CloudFront function, so I set out to try.
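To illustrate the idea in the excerpt above, here is a minimal sketch of the user-agent check such an edge function might perform. CloudFront Functions themselves are written in JavaScript; Python is used here only to show the logic. The event shape mirrors the CloudFront Functions viewer-request event, while the bot tokens and the function name `handle_viewer_request` are assumptions for illustration and are not taken from the linked post.

```python
# Illustrative crawler tokens only (an assumption, not the linked post's list).
BLOCKED_AGENT_TOKENS = ("GPTBot", "CCBot", "ClaudeBot", "PerplexityBot")


def handle_viewer_request(event):
    """Return a 403 response for blocked crawlers, else pass the request through."""
    request = event["request"]
    # Header names are lowercase keys, each mapping to a {"value": ...} object.
    user_agent = request.get("headers", {}).get("user-agent", {}).get("value", "")
    lowered = user_agent.lower()
    # Case-insensitive substring match against each blocked token.
    if any(token.lower() in lowered for token in BLOCKED_AGENT_TOKENS):
        return {"statusCode": 403, "statusDescription": "Forbidden"}
    # Returning the request unchanged lets the CDN serve the content as usual.
    return request
```

Because the check runs at the edge, a blocked crawler would be refused even when the page is served from cache and the request never reaches the origin server.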
lewisdale.dev
1.3 parsecs away
audisto.com
22.0 parsecs away

Use our scalable indexability checker to test how a noindex robots directive, robots.txt directive, canonical link, hreflang, or duplicate content affects the indexability and SEO of your website.