oh hey just as a PSA, any code any of y'all may have on github might have been scraped by @swheritage, a self-proclaimed "preservation" org which

- just hoovered up vast amounts of data without asking or telling anyone
- insists on deadnaming trans people forever for "integrity" reasons
- used it to build an LLM training data set

See huggingface.co/datasets/bigcod to check whether your code is included, and for opt-out instructions

@jaredwhite @outie Time to sue then. Those "BuT oPt-OuT iS a FoRm Of CoNsEnT!" shitfaces need to learn a valuable lesson, *especially* if what they do wouldn't be possible with opt-in. If @swheritage can't live without it then the organisation deserves to become history.

@Natanox @jaredwhite @outie @swheritage If the projects had an OSS license I don't see what grounds you have for a lawsuit.

@ao @Natanox @jaredwhite @outie @swheritage Open source isn’t public domain. Open source licenses only count if you follow their terms, which an AI doesn’t.

@arborelia @Natanox @jaredwhite @outie @swheritage
I was talking about them publishing data, that is legally okay. Using it in an AI may be legally problematic, but providing a dataset that follows licenses is afaik legally okay. I'm just trying to say that there's no real reason to hope for a successful lawsuit against this particular project.

That said: They force you to agree to this in order to get access to it for AI purposes, which I think is way better than many other datasets with our data in them: softwareheritage.org/2023/10/1

@ao @Natanox @jaredwhite @outie @swheritage their dataset also _doesn’t_ follow licenses. they just took stuff with no license. They openly say this.

On the AI model side, asking users to obey licenses for them (so they don’t have to) sure is a gambit.

@arborelia @Natanox @jaredwhite @outie @swheritage hm yeah I missed that part. yeah guess that's a good point.
