Wesley Pasfield
@wesleypasfield.bsky.social
Writing at https://open.substack.com/pub/wesleypasfield | Previously Lark Health, AWS/Amazon, GoPro, Nielsen | Currently Emerging Tech Fellow at US Census and Adjunct Professor in the University of San Diego Applied AI MS program
Not to mention the domestic challenges from AI-driven inequality caused by our current incentive structure.
February 11, 2025 at 6:01 PM
Yeah, I’d love to see the equivalent from ChatGPT. While this data is awesome, I think it’s very important to emphasize that this is adoption by field, not potential capabilities.
February 11, 2025 at 5:24 PM
Certainly not training compute, which is what current regulation specifies. The primary point is that the compute required for specific capabilities will be a moving target, exemplified by the efficiencies DeepSeek demonstrated recently.
February 7, 2025 at 5:39 PM
Thank you for sharing this perspective broadly! This is very much in line with my paper at the NeurIPS RegML Workshop. Data + evaluation needs to be our focus: wesleypasfield.com/pasfield_neu...
"Powering LLM Regulation through Data: Bridging the Gap from Compute Thresholds to Customer Experiences"
wesleypasfield.com
January 30, 2025 at 5:24 PM
I keep hearing that AI will automate tasks and let people focus on higher-leverage work…but clearly the intention is for AI to move up the chain to those higher-leverage tasks too. I think we are hand-waving away what our intention is for humans in this agent-driven future.
January 15, 2025 at 4:26 AM
Like AGI, depends who you ask!
January 12, 2025 at 5:00 PM
I don’t have a great answer, but I think “test time” is especially bad because it carries the connotation that it only applies to test sets (not live applications). I think that’s especially confusing for non-technical folks.
January 8, 2025 at 2:17 AM
It’s easy to lead the witness. It’s important to be as neutral as possible and ask for analysis/alternatives, to ensure you’re not just getting agreeable answers back from the models.
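A minimal sketch of the idea, assuming the Anthropic Python SDK; the contrasting prompts and the model id below are my own illustrative choices, not anything from the post:

```python
# Illustrative only: compare a leading prompt with a neutral one to check
# whether the model is simply agreeing with the framing it was given.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

LEADING = "Our plan to migrate everything to microservices is clearly the right call, isn't it?"
NEUTRAL = (
    "We are considering migrating a monolith to microservices. "
    "List the strongest arguments for and against, and describe at least "
    "one alternative approach we may be overlooking."
)

def ask(prompt: str) -> str:
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # assumed model id
        max_tokens=500,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

# Reading the two answers side by side makes agreeableness easier to spot.
print("Leading prompt:\n", ask(LEADING))
print("\nNeutral prompt:\n", ask(NEUTRAL))
```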
January 4, 2025 at 9:25 PM
I’ve found the newsletter personally useful for keeping up with research more broadly than before. I’m using Claude for paper identification and summarization, and everything runs serverless on AWS, so it’s quite cheap. I hope folks enjoy it!
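To make the setup concrete, here is a rough sketch of the summarization step, assuming the Anthropic Python SDK inside an AWS Lambda handler; the event shape, prompt, and model id are illustrative assumptions of mine, since the post doesn’t spell out the actual pipeline:

```python
# Rough sketch: a Lambda handler that asks Claude for a short newsletter-style
# summary of a paper abstract. Event shape and prompt are assumptions.
import json
import anthropic

client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the Lambda environment

def lambda_handler(event, context):
    # Assumed input: {"title": "...", "abstract": "..."}
    title = event["title"]
    abstract = event["abstract"]

    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # assumed model id
        max_tokens=300,
        messages=[{
            "role": "user",
            "content": (
                f"Summarize the following paper for a weekly ML research "
                f"newsletter in 3-4 sentences.\n\nTitle: {title}\n\n"
                f"Abstract: {abstract}"
            ),
        }],
    )

    return {
        "statusCode": 200,
        "body": json.dumps({"title": title, "summary": response.content[0].text}),
    }
```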
January 3, 2025 at 4:49 PM
LLM hallucinations are a feature AND a bug
December 24, 2024 at 3:12 PM
Paper contents on using data for domain-specific evaluation to enable more logical LLM regulation.
December 10, 2024 at 9:50 PM
I think this extends to regulatory efforts as well. Better benchmarks / means of evaluation will lead to more logical regulation. That is the core principle of the paper I will present at NeurIPS later this week.
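Not the paper’s method, just a toy sketch of what a domain-specific evaluation loop could look like: a handful of labeled prompts for one domain, scored against a model’s answers. The example questions, keyword scoring, and model id are illustrative assumptions.

```python
# Toy domain-specific evaluation: run a model over labeled prompts for a
# single domain and report accuracy. Dataset and scoring are illustrative.
import anthropic

client = anthropic.Anthropic()

# Hypothetical domain benchmark: (prompt, substring the answer should contain).
HEALTHCARE_EVAL = [
    ("Which federal law governs the privacy of US health records?", "HIPAA"),
    ("What does 'EHR' stand for in a clinical setting?", "electronic health record"),
]

def answer(prompt: str) -> str:
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",  # assumed model id
        max_tokens=200,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

correct = 0
for prompt, expected in HEALTHCARE_EVAL:
    if expected.lower() in answer(prompt).lower():
        correct += 1

print(f"Domain accuracy: {correct}/{len(HEALTHCARE_EVAL)}")
```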
December 10, 2024 at 9:50 PM
They say the ideal use case is “narrow sets of complex tasks led by experts”. Any thoughts on whether that means a singular outcome with a lot of complicated steps, or perhaps a wider set of outcomes within a very defined problem space (or something else)? I’m having a hard time interpreting it on the surface.
December 6, 2024 at 9:24 PM