To the best of our knowledge, all files contained in the dataset are licensed with one of the permissive licenses (see list in Licensing information) or no license.
Emphasis mine.
What the cinnamon toast fuck?
@swheritage To find out if they have appropriated your code, you can check "Am I in The Stack?": https://huggingface.co/datasets/bigcode/the-stack-v2
However, _do not believe their supposed opt-out_. I mean, sure, submit an opt-out if you want, but I know how they operate -- they'll just keep doing whatever they want and never process any takedowns unless the law makes them.
Hey, in case their transphobia wasn't enough for you, @swheritage is yoinking all the code on GitHub -- regardless of license -- to train a generative AI that plagiarizes code.
No matter how many times they say "ethical", it isn't.
Part 2 of my attempt to get the Software Heritage Archive to stop deadnaming me: the part where I get angry in French.
https://cohost.org/arborelia/post/5052044-the-software-heritag
The Software Heritage Archive (@swheritage) wants to deadname me forever.
Part 1 of 3-ish.
https://cohost.org/arborelia/post/4968198-the-software-heritag
I'd like to find a new instance. This one's been kinda okay except it's old, so the software is out of date, and also old instances with too many trans people on them get put on blocklists and never taken off.
I want to find a large, queer, well-connected, well-moderated instance. Do those exist? Any recommendations? If they charge some money to pay moderators that's okay.
BREAKING: Politico reports that Senator Cantwell says she will delay hotlining the Kids Online Safety Act (#KOSA) because of ongoing concerns around LGBTQ rights and the need for further changes.
This is an important win for human rights. But we have so much more to do.
Two assertions:
1. There are serious challenges for marginalized groups on the fediverse, including for the groups that are relatively overrepresented here compared to other spaces.
2. We are not going to solve those problems even transiently through "better blocklists."
There are a variety of solutions that we can talk over. I've talked about quite a few of them, as have others, but the quest to "build a better blocklist" is at best a distraction and at worst actively harmful to solving this
-, not about Meta, I do not care about Meta
I've gotten upset, defensive, and a little bit paranoid about the way I see post-Twitter queer communities fighting each other.
And then I ended up part of it. I said something hurtful to someone who didn't deserve it. Over... discourse and general social media shit.
I need to take a break from the fediverse.
a thing i’ve gotten interested in while making word games and NLP tools is the construction of (English) word lists
i’m currently of the conviction that when someone releases a word list, it needs metadata that includes an edit log of all source lists as well as individual word additions and removals at the discretion of the author
I like games that you can play again and they're different the next time, such as randomizers, roguelikes, and gender expression! Twitch stream: https://twitch.tv/arborelia
also at: https://cohost.org/arborelia