Lightnews — Scholar-powered news

Reposted by Wayne

Chris Donahue

@chrisdonahue.com

Inaugurating new acct to share work from my PhD student!

Wayne et al have been running a live eval platform Copilot Arena - a VSCode extension serving code completions from AI systems to real developers. See 🧵 for findings and preprint

Excited to be evaluating human-AI *workflows* holistically!

Wayne @waynechi.bsky.social · Mar 5

What do developers 𝘳𝘦𝘢𝘭𝘭𝘺 think of AI coding assistants?

In October, we launched Copilot Arena to collect user preferences on real dev workflows. After months of live service, we’re here to share our findings in our recent preprint.

Here's what we have learned /🧵

March 5, 2025 at 5:01 PM

Wayne

@waynechi.bsky.social

What do developers 𝘳𝘦𝘢𝘭𝘭𝘺 think of AI coding assistants?

In October, we launched Copilot Arena to collect user preferences on real dev workflows. After months of live service, we’re here to share our findings in our recent preprint.

Here's what we have learned /🧵

March 5, 2025 at 4:49 PM

Wayne

@waynechi.bsky.social

Got to test out InceptionAILabs' newest model, Mercury Coder Mini, on Copilot Arena!

Mercury Coder Mini is blazing fast and overtakes Codestral as the fastest coding model out there (0.24s end-to-end latency) while boasting similar performance (joint #2).

Congrats to InceptionAILabs! 📸

February 26, 2025 at 11:51 PM

Wayne

@waynechi.bsky.social

I had the same problem. I only use cursor for newer, small projects. I use Copilot Arena's edit feature for projects in VSCode (but obviously I'm biased)

Kyle Lo @kylelo.bsky.social · Jan 5

tried switching to cursor and having extreme difficulty getting all my vscode extensions to work properly ☹️ doesn’t seem worth

January 5, 2025 at 8:11 AM

Wayne

@waynechi.bsky.social

Deepseek v3 (FiM) is now available in Copilot Arena for free!

Download at lmarena.ai/copilot

December 31, 2024 at 9:12 PM

Wayne

@waynechi.bsky.social

These lists are better than most "2024's best games" lists

HDKirin @hdkirin.bsky.social · Dec 25

This week's Famitsu had a lot of Japanese gaming industry folks give their personal Game of the Year lists. I'll update this thread periodically since there's a lot of them.

December 27, 2024 at 4:16 AM

Wayne

@waynechi.bsky.social

Copilot Arena's leaderboard is now live on lmarena.ai/leaderboard!

We've collected over 15k votes on 11 models (2 new models since our last blogpost release). Congrats @deepseek.bsky.social🥇and @anthropic.com🥇!

Chatbot Arena (formerly LMSYS): Free AI Chat to Compare & Test Best AI Chatbots

lmarena.ai

December 23, 2024 at 9:41 PM

Wayne

@waynechi.bsky.social

I'm not physically at NeurIPS, but my good friend @naveenraman.bsky.social will be presenting in my stead.

In this work, we found that UI element ordering significantly affected GUI agent performance. Come check out the poster (and quiz Naveen) at the Workshop on Open-World Agents (OWA-2024)!

December 13, 2024 at 7:32 AM

Wayne

@waynechi.bsky.social

Bruh what... 💀

December 10, 2024 at 5:51 PM

Wayne

@waynechi.bsky.social

We've open-sourced Copilot Arena's server code!

Check out how we handle code completions and share your ideas for new system prompts!

GitHub: github.com/lmarena/copi...
Technical details in the blog: blog.lmarena.ai/blog/2024/co...

Download Copilot Arena now at: lmarena.ai/copilot

December 5, 2024 at 7:44 PM

Wayne

@waynechi.bsky.social

Trying out Bluesky. Will mostly be posting about Copilot Arena!

November 20, 2024 at 6:59 AM
