Stefan Baack
@sbaack.com
Senior researcher studying data governance and AI training data. Mastodon: @[email protected] he/him
Pinned
sbaack.com
Check in if you're interested in my thoughts about what open source AI should aspire to be in relation to proprietary AI
opensource.org
What should open source AI aspire to be? Watch Stefan Baack and Kasia Odrozek's keynote at OSI Deep Dive: Data Governance conference taking place October 1-3. Register for free at: https://opensource.org/datagovernanceconf
sbaack.com
"The update is yet another signal that payment processors...are currently the ultimate arbiter of what kind of content can be made easily available online, or not."
sbaack.com
The key questions we always should ask when people talk about AI: What is being automated and why? @alexhanna.bsky.social @weizenbauminstitut.bsky.social
sbaack.com
"AI is a labor disciplining device" @alexhanna.bsky.social
Reposted by Stefan Baack
danieldrepper.bsky.social
“The reporter is a man of critical value. No amount of money or effort spent in fitting the right men for this work could possibly be wasted, for the health of society depends upon the quality of the information it receives.” — Walter Lippmann [a century later, I’d swap “man” for “person” though]
sbaack.com
"brainstorming and iteration is...a crucial everyday part of game development...and is not a problem to be solved...I have had many discussions with other game developers who interact with AI engineers and savants who believe our industry pipelines need 'fixing' by them and them alone"
aftermath.site
‘An overwhelmingly negative and demoralizing force’: what it’s like working for a company that’s forcing AI on its developers

aftermath.site/ai-video-game-...
Reposted by Stefan Baack
bildoperationen.bsky.social
«By moving fast and breaking things, DOGE forces a collapse of the system where unanswered questions are met with technological solutions. Shifting the conversation to the technical is a way of locking policymakers and the public out of decisions and shifting that power to the code they write.»
Reposted by Stefan Baack
auschwitzmemorial.bsky.social
Auschwitz was at the end of a long process. It did not start from gas chambers.

This hatred was gradually developed by humans. From ideas, words, stereotypes & prejudice through legal exclusion, dehumanization & escalating violence... to systematic and industrial murder.

Auschwitz took time.
A bird's-eye view of a former Auschwitz II-Birkenau camp showing a wide dirt pathway flanked by parallel rows of barbed-wire fences. Groups of visitors walk along the path, surrounded by the remnants of brick structures and barracks, now reduced to foundations. Green grass contrasts with the somber history of the site, as the path leads toward a guard tower in the distance.
Reposted by Stefan Baack
eryk.bsky.social
“AI is fake and sucks” vs “AI is real and dangerous” is a Twitter argument. In reality I think the debate also has a lot of “AI is real but not for how you’re using it,” to “AI is fake and that is dangerous,” to “things are happening to real people because of AI hype and that should stop.”
sbaack.com
My reading for this week, delivered to me by the great @aschrock.bsky.social themself! Thank you, looking forward to reading :-)
Reposted by Stefan Baack
danieldrepper.bsky.social
This report gives hope!

More and more new, ambitious media outlets have been founded in Germany and Europe. Outlets that aim to provide the public with high-quality information.

@netzwerkrecherche.org surveyed 174 media outlets in 31 countries for the "Journalism Value Report" and can show:
Reposted by Stefan Baack
thandis.bsky.social
“Without facts, you can’t have truth, and without truth, you can’t have trust”. - Maria Ressa, 2021 Nobel Peace Prize
Reposted by Stefan Baack
sbaack.com
It ended well though. He got the job, and still has it. We met recently 😅
sbaack.com
I still remember when a friend asked for advice about getting a job I intended to apply for
sbaack.com
Long term, there should be less reliance on sources like Common Crawl and a bigger emphasis on training generative AI on datasets created and curated by people in equitable and transparent ways (10/10)
sbaack.com
A key issue is that filtered Common Crawl versions are not updated after their original publication to take feedback and criticism into account. Therefore, we need dedicated intermediaries tasked with filtering Common Crawl in transparent and accountable ways that are continuously updated (9/10)
sbaack.com
AI builders should put more effort into filtering Common Crawl and establish industry standards and best practices for end-user products to reduce potential harms when using Common Crawl or similar sources for training data (8/10)
sbaack.com
Both Common Crawl and AI builders can help make generative AI less harmful. Common Crawl should highlight the limitations and biases of its data, be more transparent and inclusive about its governance, and enforce more transparency by requiring AI builders to attribute using Common Crawl (7/10)
sbaack.com
Due to Common Crawl’s deliberate lack of curation, AI builders need to filter it with care, but such care is often lacking. Popular filtered versions like C4 are especially problematic as the filtering techniques used to create them are simplistic and leave lots of harmful content untouched (6/10)