Vicente Ordonez
@vicenteor.bsky.social
2.8K followers 1.7K following 61 posts
Rice University, Associate Professor of Computer Science. Computer Vision, Multimodal AI, Deep Learning. Houston, Texas. Check our work at https://vislang.ai/
Reposted by Vicente Ordonez
iccv.bsky.social
⏰The ICCV 2025 discussion phase will close soon! ⏰

As a reviewer, it's your job to:
-📖 Read author rebuttals
-🗣️ Engage in discussions
-✅ Submit your final rating & justification

Your responsible engagement is critical for a fair review process!

Deadline: May 27!
vicenteor.bsky.social
My solidarity with Harvard colleagues and my respect for maintaining dignity during troubled times. Their leadership makes their standing on the global stage well deserved.
vicenteor.bsky.social
Cappuccino served at the 50th year celebration of the School of Engineering and Computing at Rice.
An owl, the mascot of Rice, is formed in the foam of a cappuccino.
vicenteor.bsky.social
Group picture with my PhD students #studio-ghibli
vicenteor.bsky.social
I wish more authors of papers I have missed citing would email me. At the same time, I have only done this sparingly, mostly with people I already know firsthand or have interacted with in the past, and in no case do I believe the omission was deliberate. It is just hard to keep up sometimes.
vicenteor.bsky.social
Thanks CVPR for bringing the conversation here
vicenteor.bsky.social
We just have to believe
vicenteor.bsky.social
I remember when Twitter was this small and familiar. I had/have some followers on Twitter who are now low-key celebrities but probably followed me way back, before they became low-key celebrities.
cvprconference.bsky.social
By popular demand, we are extending #CVPR2025 coverage to Bluesky. Stay tuned!
vicenteor.bsky.social
But if it does happen in the open, my hope is that it’s a concerted effort that includes academics as much as a larger coalition of people who agree this issue is important.
vicenteor.bsky.social
There should be public pushback, but it should not be only self-serving. If people only express concerns when it directly affects them, that’s a sad state of things.
vicenteor.bsky.social
Indeed I don’t feel that way. I think there should be pushback but I also think the general public should be as concerned. Complaining and pushing back will happen but maybe not in the open.
vicenteor.bsky.social
That said, there's no reason not to keep using our institutional channels to champion science and education.
vicenteor.bsky.social
One thing we can do going forward is to work on convincing the general community that funding science and having strong research institutions is a good thing. This time we might have to learn from our mistakes.
vicenteor.bsky.social
There are so many more things to be outraged about at the moment than federal funding for science. Especially when coming from academics, complaining about it is a bit too self-serving at this moment. Yes, it is bad and the effects will be long-lasting, but so will a dozen other things.
vicenteor.bsky.social
Great times for innovation ahead!
vicenteor.bsky.social
These are models that can run perfectly well on most cheap hardware, unlike the full large R1. But I wouldn’t be surprised if we see R1 quality running on more accessible hardware.
vicenteor.bsky.social
Everyone concentrates on o1 and R1 but even the base 7B or 1.5B models seem better than the very first public version of ChatGPT (3.5-turbo) that took the world by surprise.
vicenteor.bsky.social
I think it’s exciting to see LLMs that are open source and on par with the top models accessible only through APIs. DeepSeek and before that Llama-3.
vicenteor.bsky.social
Check this recent work by my PhD student Moayed. He has been doing amazing work on Generative AI for images, video and audio. We introduce AV-Link ♾️, a unified approach for audio-video generation. Our generated audio is the best in terms of synchronization with video actions. Check thread below.
moayedha.bsky.social
Can pretrained diffusion models be connected for cross-modal generation?

📢 Introducing AV-Link ♾️

Bridging unimodal diffusion models in one self-contained framework to enable:
📽️ ➡️ 🔊 Video-to-Audio generation.
🔊 ➡️ 📽️ Audio-to-Video generation.

🌐: snap-research.github.io/AVLink/

⤵️ Results
vicenteor.bsky.social
I still don’t feel much more productive in the era of LLMs. There are a few things I can do better, but it is far from what I hear in anecdotes. I wonder what would be the one low-hanging fruit I should be delegating to LLMs.