Mike Samuel 🟣
@mvsamuel.bsky.social
I solve large software systems problems with programming language techniques.

Previously, I was the first frontend engineer on Google Calendar, and was a security engineer who worked on the industrial-strength Mad Libs undergirding Gmail.
Pew's 2019 numbers (7.5%) seem roughly in line with Dr. Chelwa's paper's 2018 numbers (6.4%) which is linked later in this thread.
November 28, 2025 at 8:55 PM
Updoc, erights.org/elang/tools/... , asserts that some subset of good tests are also good documentation and tries to bridge the two with a REPL syntax.
November 28, 2025 at 5:43 PM
The Practice of Programming is old, but chapter 6 talks about properties: tests that pass are quiet. Test suites cover conservation properties and corner cases, check equivalence to simple-but-slow implementations, and cover inputs that are unlikely when used in good faith.
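To make the equivalence-to-a-slow-implementation idea concrete, here's a hedged sketch (my example, not from the book): a fast algorithm checked against an obviously-correct slow reference over random inputs. The problem choice (max subarray sum) is illustrative only.

```python
# Sketch: test a fast implementation against a simple-but-slow reference.
# Passing tests are quiet; a failure prints the offending input.
import random

def slow_max_subarray(xs):
    # Obviously-correct O(n^2) reference: try every contiguous subarray.
    best = 0
    for i in range(len(xs)):
        for j in range(i, len(xs)):
            best = max(best, sum(xs[i:j + 1]))
    return best

def fast_max_subarray(xs):
    # Kadane's algorithm, O(n); the implementation under test.
    best = cur = 0
    for x in xs:
        cur = max(0, cur + x)
        best = max(best, cur)
    return best

random.seed(0)
for _ in range(1000):
    xs = [random.randint(-10, 10) for _ in range(random.randrange(12))]
    assert fast_max_subarray(xs) == slow_max_subarray(xs), xs
```

The reference can be as naive as you like; its job is to be easy to believe, not fast.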
November 28, 2025 at 5:41 PM
Yeah, without that context from the précis (that the stat is measured when performing a high-risk task), the 40% number is meaningless.

E.g., high risk: generating a database query from untrusted input, which can lead to SQL injection.
Low risk: pure math, like computing compound interest.

arxiv.org/pdf/2108.09293
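A hedged illustration of that high-risk vs. low-risk split (my example, not taken from the paper): building SQL by concatenation vs. a parameterized query, next to a pure-math function where there's no injection surface at all.

```python
# High risk vs. low risk, side by side.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

def find_user_unsafe(name):
    # High risk: concatenating untrusted input lets a "name" like
    # "' OR '1'='1" rewrite the query's meaning (SQL injection).
    return conn.execute(
        "SELECT name FROM users WHERE name = '" + name + "'").fetchall()

def find_user_safe(name):
    # Parameterized form: untrusted input stays data, never SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)).fetchall()

def compound_interest(principal, rate, years):
    # Low risk: pure math over trusted numeric inputs.
    return principal * (1 + rate) ** years

print(find_user_unsafe("' OR '1'='1"))  # matches every row
print(find_user_safe("' OR '1'='1"))    # matches nothing
```

Generated code for the first kind of task can fail in exploitable ways; the third can only be wrong, not exploited.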
November 25, 2025 at 8:37 PM
What even is this? Boolean reverse implication? No, it's just the first input.

wtftw
November 25, 2025 at 8:02 PM
actuators: where values go to die
November 25, 2025 at 7:41 PM
Investing $1B in AI over 30 years would produce crops of postdocs who do great work.

Trying to jam $10B through a funnel in a fraction of the time won't.

Talented people get co-opted into doing mediocre work instead of laying a solid foundation, without, as you say, critical review.
November 25, 2025 at 7:23 PM
We software practitioners need to hash out what constitutes responsible AI use, and the sources for these statistics are important.

But as you say, these bags of stats do not lead us towards clarity.
November 25, 2025 at 7:18 PM
I tracked down one of those the other day: 45% of generated code has exploitable flaws.

The authors were choosing prompts for coding tasks known to entail risk, so "use this input to query the database" is within their methodology because it risks SQL injection, but "implement compound interest" is not.
November 25, 2025 at 7:16 PM
SBI ≟ simulation based inference
November 25, 2025 at 7:02 PM
<voice trumpian>
Wales. I don't like Wales. Old Wales they call it.
Big but not bigly.
I prefer my Wales new and south.
November 25, 2025 at 4:41 PM
best of luck
November 25, 2025 at 3:24 PM
I vaguely remember a biography of Helen Keller talking about how learning language at the age of 7 revealed internal vistas.

I can't vouch for the below but it matches my recollections.
November 25, 2025 at 3:20 PM
But yeah, you don't want bottlenecks either. There's some optimal level of distribution of knowledge, and docs and tests are important ways to encode that.

These are things mgmt needs to know to value.
November 24, 2025 at 6:00 PM
I was floored when a director casually said about an HR application suite, "I'll probably spin up a new team to do a full rewrite in a few years."

As if the existing codebase and tests didn't embody lots of hard-learned lessons, and as if there weren't valuable oral culture in the existing team.
November 24, 2025 at 5:58 PM
Noise marine has entered the chat.
November 24, 2025 at 5:21 PM
Sure, a tool isn't good or bad *per se*, but a use can be judged by its effect on people, and tools that are overwhelmingly turned to uses bad for people are bad in the ways that matter.

The distinction between tools/means and ends/moral standing is pretty explicit in Kant's second formulation.
November 24, 2025 at 5:19 PM
Yeah, I poke fun, but Scala did some really impressive things to shoehorn a nice language onto the JVM.

Bridging nominals and structurals is never going to be pretty, but it's great for DX.

bsky.app/profile/mvsa...
Nominal typing is when a PL identifies types by name.
A type like X is declared in some source file, A.
Other files, B and C, can depend on it.
And if yet another file depends on B and C, it can use an X value from one interchangeably with the other.
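A hedged sketch of the idea above, using Python class identity as a stand-in for nominal typing (the files A, B, and C from the post are simulated inline, and `LooksLikeX` is my hypothetical foil):

```python
# Nominal typing: what matters is the declared name/identity of X,
# not its shape.

class X:  # declared once, in "file A"
    def __init__(self, v):
        self.v = v

def make_x(v):   # "file B" produces X values
    return X(v)

def use_x(x):    # "file C" consumes X values
    assert isinstance(x, X)  # checks the declared type, by identity
    return x.v

class LooksLikeX:  # structurally identical, but a different name
    def __init__(self, v):
        self.v = v

print(use_x(make_x(42)))   # fine: B and C agree on the one declared X
# use_x(LooksLikeX(42))    # would fail: same shape, different name
```

A structural type system would accept `LooksLikeX` wherever `X` is expected; a nominal one does not, which is the bridging problem the post alludes to.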
November 24, 2025 at 5:02 PM
There's existing study of snowy conditions, so this is not a first-rodeo situation, but things will get weird initially.

bsky.app/profile/mvsa...
November 21, 2025 at 7:06 PM
Surprising level of purity in your fall from Grace, but I forget whom I'm talking to.
November 21, 2025 at 6:26 PM
Do you have any kind of scoped/structured transactions so that a Grace program can be explicit about which prompts inherit state from prior ones?
November 21, 2025 at 6:01 PM
Nice!

We have all this new machinery for producing semi-structured outputs, but I can imagine the orchestration sauce for assembling those outputs into a structured form is still missing.
November 21, 2025 at 6:01 PM
So you can have forms driven by types, which give pieces you use to compose a prompt from a template; then there's additional promptery so that the model application produces a structured result fitting the output type passed to prompt?
November 21, 2025 at 5:45 PM
Speedrunning soteriology
November 21, 2025 at 3:43 AM