riddle me this: what's stopping AI scrapers from changing their user agent if they already don't respect robots.txt

stupid-ass arms race

@mavica_again Nerd-sniped into searching GitHub for the answer

I think maybe it's a deliberate choice not to make it any more complicated than necessary for actual existing bots: github.com/TecharoHQ/anubis/pu

@madewokherd i get it but like. then what's the point

this was prompted by a forum post pointing out that you can get rid of anubis (which they argue is harmful because it gets in the way of javascript-free browsing) by just setting your user-agent to wget's
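(a minimal sketch of the claimed bypass, assuming the forum post is right that Anubis by default only challenges browser-like user agents; the URL and the exact default rules here are illustrative assumptions, not verified)

```python
import requests

# Hypothetical Anubis-protected page, for illustration only.
url = "https://example.org/some-anubis-protected-page"

# Send a wget-style User-Agent instead of a Mozilla-style one,
# which is what the forum post says skips the challenge.
headers = {"User-Agent": "Wget/1.21.4"}

resp = requests.get(url, headers=headers, timeout=30)
print(resp.status_code)
print(resp.text[:200])
```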

@mavica_again The point is it doesn't take 5 minutes to load Bugzilla anymore. If that changes, then they escalate, I guess.

@madewokherd like i said, arms race

because it's less than trivial for ai scrapers to change their user agent

@mavica_again It's an arms race, but one where no one is currently escalating. Probably because it buys them very little. Yes, an AI scraper could trivially detect Anubis and change the user agent to wget or whatever. And they could scrape the very small portion of the web behind Anubis for maybe a weekend before people start configuring it to block that.

@mavica_again I bet AI companies could also trivially configure their scrapers to not DDoS the Internet, but apparently they don't care enough to do that either. It'd probably be less work for more payoff than participating in this particular arms race.
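(a rough sketch of what "not DDoSing the Internet" could look like: check robots.txt and rate-limit per host; the bot name, delay value, and overall structure are assumptions for illustration, not anything a real scraper is known to use)

```python
import time
import urllib.robotparser
from urllib.parse import urlparse

import requests

USER_AGENT = "ExampleBot/0.1"   # hypothetical bot name
MIN_DELAY = 5.0                 # assumed seconds between hits to the same host

_last_hit: dict[str, float] = {}
_robots: dict[str, urllib.robotparser.RobotFileParser] = {}


def polite_get(url: str) -> requests.Response | None:
    host = urlparse(url).netloc

    # Fetch and cache robots.txt once per host.
    if host not in _robots:
        rp = urllib.robotparser.RobotFileParser()
        rp.set_url(f"https://{host}/robots.txt")
        rp.read()
        _robots[host] = rp

    # Skip disallowed URLs instead of scraping them anyway.
    if not _robots[host].can_fetch(USER_AGENT, url):
        return None

    # Wait out the per-host delay before requesting.
    wait = MIN_DELAY - (time.monotonic() - _last_hit.get(host, 0.0))
    if wait > 0:
        time.sleep(wait)
    _last_hit[host] = time.monotonic()

    return requests.get(url, headers={"User-Agent": USER_AGENT}, timeout=30)
```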
