arxiv.org/abs/2412.13303
🚀Our answer is Yes -- Excited to introduce our latest work: World-consistent Video Diffusion (WVD) with Explicit 3D Modeling!
arxiv.org/abs/2412.01821
Delighted to share AIMv2, a family of strong, scalable, and open vision encoders that excel at multimodal understanding, recognition, and grounding 🧵
paper: arxiv.org/abs/2411.14402
code: github.com/apple/ml-aim
HF: huggingface.co/collections/...
With PLUM, a pipeline for teaching LLMs to remember prior user conversations, we aim to enable your future personalization research! Joint work with @maartjeterhoeve.bsky.social, Katherine Metcalf and Yizhe Zhang from my internship at Apple.
🧵