Lightnews — Scholar-powered news

Martin Koch

@martinkoch.bsky.social

110 followers 700 following 19 posts

LLMs will eat the world.
CPO @ aqua-cloud.io
Opinions are my own.


Martin Koch

@martinkoch.bsky.social

- Employs an LLM judge to assess compliance, focusing on rule adherence rather than correctness alone.

Results
PromptPex achieved 5.5% higher non-compliance rates compared to baseline test generators, indicating its effectiveness in identifying prompt weaknesses.

Paper: arxiv.org/abs/2503.05070v1

PromptPex: Automatic Test Generation for Language Model Prompts

Large language models (LLMs) are being used in many applications and prompts for these models are integrated into software applications as code-like artifacts. These prompts behave much like tradition...

arxiv.org

March 13, 2025 at 4:57 PM

Martin Koch

@martinkoch.bsky.social

How It Works:

- Extracts Input Specifications (IS) and Output Rules (OR) directly from prompts using LLMs.

- Generates targeted tests based on IS and OR to validate prompt compliance.

- Creates challenging "inverse" tests from OR rules to evaluate model limits.

🧵3/n

March 13, 2025 at 4:57 PM
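
The steps described in this thread can be sketched in a few lines of Python. This is a minimal illustration of the idea as summarized in the posts, not the paper's implementation: every name here (`Spec`, `extract_spec`, `invert_rules`, `generate_tests`, `non_compliance_rate`) and every instruction string is hypothetical, and the LLM, the model under test, and the judge are passed in as plain callables so that no particular API is assumed.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical interface: an LLM is any function from instruction text to reply text.
LLM = Callable[[str], str]


@dataclass
class Spec:
    input_spec: List[str]    # IS: constraints on valid inputs
    output_rules: List[str]  # OR: rules the output must follow


def _lines(text: str) -> List[str]:
    """Split a reply into non-empty, stripped lines."""
    return [ln.strip() for ln in text.splitlines() if ln.strip()]


def extract_spec(prompt: str, llm: LLM) -> Spec:
    """Ask the LLM to read IS and OR out of the prompt itself."""
    return Spec(
        input_spec=_lines(llm(f"List the input constraints of this prompt:\n{prompt}")),
        output_rules=_lines(llm(f"List the output rules of this prompt:\n{prompt}")),
    )


def invert_rules(rules: List[str], llm: LLM) -> List[str]:
    """Produce an 'inverse' rule for each output rule."""
    return [llm(f"Rewrite this rule as its opposite:\n{r}").strip() for r in rules]


def generate_tests(spec: Spec, rules: List[str], llm: LLM) -> List[str]:
    """Generate one test input per rule, constrained by the input spec."""
    is_text = "; ".join(spec.input_spec)
    return [
        llm(f"Write one input that satisfies [{is_text}] and targets the rule:\n{r}").strip()
        for r in rules
    ]


def non_compliance_rate(
    prompt: str,
    tests: List[str],
    rules: List[str],
    model: Callable[[str, str], str],  # (prompt, test input) -> output
    judge: LLM,                        # replies "PASS" or "FAIL"
) -> float:
    """Fraction of tests whose output the judge marks as breaking the rules."""
    if not tests:
        return 0.0
    rule_text = "\n".join(rules)
    failures = 0
    for test_input in tests:
        output = model(prompt, test_input)
        verdict = judge(f"Rules:\n{rule_text}\nOutput:\n{output}\nAnswer PASS or FAIL.")
        if verdict.strip().upper().startswith("FAIL"):
            failures += 1
    return failures / len(tests)
```

In this sketch, tests generated from the inverse rules are judged against the original output rules, which is one reading of "evaluate model limits" in the post above; the paper may handle this differently.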

Martin Koch

@martinkoch.bsky.social

Key Features:

✅ Specification Extraction: Provides insights into prompt behavior, beyond basic black-box testing

✅ Inverse Rule-Based Testing: Uncovers edge cases to enhance prompt robustness

✅ Automated Compliance Checks: Facilitates prompt portability and informed model selection

🧵2/n

March 13, 2025 at 4:57 PM

Martin Koch

@martinkoch.bsky.social

Zuck folds 🧵6/6
Follows in Elon's footsteps.

January 7, 2025 at 2:53 PM

Martin Koch

@martinkoch.bsky.social

Zuck folds 🧵5/6

January 7, 2025 at 2:53 PM

Martin Koch

@martinkoch.bsky.social

Zuck folds 🧵5/6
Will push back against European censorship regulations with the help of the US government.

January 7, 2025 at 2:53 PM

Martin Koch

@martinkoch.bsky.social

Zuck folds 🧵3/6
Will move the content review team from California to Texas.

January 7, 2025 at 2:48 PM

Martin Koch

@martinkoch.bsky.social

Zuck folds 🧵2/6

People wanted less political content in their feeds because it stressed them out, so Meta toned it down.
But "it feels like we are in a new era now," so Meta will fill your feed with political content again.

January 7, 2025 at 2:48 PM
