Author | Lightnews

Reposted by Enzo Doyen

angela zhou @angelamczhou.bsky.social · 4d

blog.arxiv.org/2025/10/31/a...

FYI the blog post for the updated policy is out. Our llm future is dire:/

Reposted by Enzo Doyen

Alexander Doria @dorialexander.bsky.social · Sep 15

> be a language model
> all you see is tokens
> you don't care, it's all abstracted away
> you live for a world of pure ideas, chain of concepts, reasoning streams
> tokens don't exist.

2 12 110

Reposted by Enzo Doyen

Nick Fleisher @nickfleisher.bsky.social · Aug 17

NYT obit for Robin Lakoff

Robin Lakoff, Expert on Language and Gender, Is Dead at 82

www.nytimes.com

5 12

Enzo Doyen @edoyen.com · Jul 15

It should be said that LLMs also generally have on-par performance with traditional NMT engines (see arxiv.org/html/2401.05... or aclanthology.org/2024.wmt-1.1...); but apart from that, I guess the whole "novelty" thing makes it a preferred choice for people wanting to implement machine l10n.

3

Enzo Doyen @edoyen.com · Jul 15

Compared to traditional NMT engines, LLMs do have this advantage of easily allowing to provide requirements for the translation (in terms of style, keywords; see aclanthology.org/2023.wmt-1.8... or arxiv.org/abs/2301.13294); even though I highly doubt it's widely used for machine l10n.

1 1

Reposted by Enzo Doyen

GITT 2025 @gitt-workshop.bsky.social · Jun 23

@bsavoldi.bsky.social taking us back in time at #GITT2025 ⌚⏳ focusing on the first discussions of gender bias in language technology as a socio-technical issue. No, the problem hasn't been fixed yet. But what has happened?

6 3 7

Enzo Doyen @edoyen.com · May 29

hmm that's nice, but does ACL allow to change style files like that?

1 1

Reposted by Enzo Doyen

Maria Antoniak @mariaa.bsky.social · Apr 6

to quote a colleague quoting a goose: “alignment to what? alignment to what??”

Jeremy Morrell @jeremymorrell.dev · Apr 5

Meta introduced Llama 4 models and added this section near the very bottom of the announcement 😬

“[LLMs] historically have leaned left when it comes to debated political and social topics.”

ai.meta.com/blog/llama-4...

Meta
Addressing bias in LLMs

It's well-known that all leading LLMs have had issues with bias-specifically, they historically have leaned left when it comes to debated political and social topics. This is due to the types of training data available on the internet.

Our goal is to remove bias from our Al models and to make sure that Llama can understand and articulate both sides of a contentious issue. As part of this work, we're continuing to make Llama more responsive so that it answers questions, can respond to a variety of different viewpoints without passing judgment, and doesn't favor some views over others.

We have made improvements on these efforts with this release—Llama 4 performs significantly better than Llama 3 and is comparable to Grok:

• Llama 4 refuses less on debated political and social topics overall (from 7% in Lama 3.3 to below 2%).
• Llama 4 is dramatically more balanced with which prompts it refuses to respond to (the proportion of unequal response refusals is now less than 1% on a set of debated topical questions).
• Our testing shows that Llama 4 responds with strong political lean at a rate comparable to Grok (and at half of the rate of Llama 3.3) on a contentious set of political or social topics. While we are making progress, we know we have more work to do and will continue to drive this rate further down.
We're proud of this progress to date and remain committed to our goal of eliminating overall bias in our models.

2 6 31

Enzo Doyen @edoyen.com · Mar 13

I never said that you were against benchmarking; rather that, in my opinion, such datasets can be used as a starting point to theoretically define the "default behaviors" of LLMs insofar as they reflect what we generally expect from them on a diverse range of tasks.

Enzo Doyen @edoyen.com · Mar 12

To my knowledge, there is no research on the topic, but I intuitively believe that generic prompts are much more prevalent than one may first think. While many do, I don't think *most* people actually use already created prompt templates or necessarily have the time to describe their task at length.

1 1

Enzo Doyen @edoyen.com · Mar 12

I think that makes sense to draw on these benchmarks for research on LLM behaviors given they're the standard in evaluating LLMs.

So the "golden" default behavior for each task could theoretically be found in standard LLM benchmarking datasets (and same for "generic prompts").

1

Enzo Doyen @edoyen.com · Mar 12

Actually, I think we should talk about default behaviors (plural) where each default behavior is task-dependent. Main tasks can be determined from commonly used LLM benchmarks (that is, commonsense reasoning w/ ARC; language understanding/question-answer w/ OpenBookQA…).

1 1

Enzo Doyen @edoyen.com · Feb 12

vastai is the cheapest and the most reliable that I know

1

Enzo Doyen @edoyen.com · Feb 11

MIT releasing new live sessions I can't
www.youtube.com/watch?v=TTX4...

Ring Of Past (live)

YouTube video by Men I Trust

www.youtube.com

Reposted by Enzo Doyen

Tyler Glaiel @tylerglaiel.com · Feb 1

we've been laughing at so many of the twitter responses to this, its very funny

Tyler Glaiel @tylerglaiel.com · Jan 31

lmao

3 8 91

Enzo Doyen @edoyen.com · Feb 1

aaah! Well that's definitely an interesting question. Very curious to know the answer too lol. Theoretically I guess it's possible but the performance may not be very good

1

Enzo Doyen @edoyen.com · Feb 1

It can: github.com/ading2210/do...

GitHub - ading2210/doompdf: A port of Doom (1993) that runs inside a PDF file

A port of Doom (1993) that runs inside a PDF file. Contribute to ading2210/doompdf development by creating an account on GitHub.

github.com

1 4