Lightnews — Scholar-powered news

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

From 8 -> 5 -> 4 exocentric cameras, all visualized with @rerundotio. I'm dropping the number of cameras used and collecting my own data to make sure I'm not overfitting to open-source datasets.

October 10, 2025 at 8:25 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

It's finally done, I've finished ripping out my full-body pipeline and replaced it with a hands-only version. Critical to make it work in a lot more scenarios! I've visualized the final predictions with @rerundotio!

September 29, 2025 at 1:00 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

If you're not labeling your own data, you're NGMI. I take this seriously, so I finished building the first version of my hand-tracking annotation app using rerun.io and gradio.app

September 19, 2025 at 5:00 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

Unfortunately, I've identified some serious issues with the hand tracker that necessitate manual intervention. So I decided the next best course of action is to build a labeling app with @rerun.io and
@gradio-hf.bsky.social, as I transition from my full-body solution to a hands-only solution.

September 15, 2025 at 5:01 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

One of the failure modes for my pipeline is the full-body tracker. This is a huge issue for ego views, where it falls apart. I've been looking for a solution that generalizes to ego/exo, and I think I found it!

August 29, 2025 at 4:43 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

Go check out ethz-vlg. github. io/mvtracker/it's a really clever idea for multiview tracking, and they made an awesome @rerun.io‬ integration on their project page!

August 29, 2025 at 1:44 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

This is a big one! I'm almost done with my MVP, built with @rerun.io‬ and @gradio-hf.bsky.social‬!

TLDR:
Inputs
- 2-N Synchronized Videos in a zip file

Outputs
- Multiview Consistent Depthmaps + Pointclouds (Thanks VGGT + MogeV2!)
- Depth Confidence Values

August 22, 2025 at 4:38 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

✨ Massive Pipeline Refactor → One Framework for Ego + Exo Datasets, Visualized with @rerun.io 🚀

After a refactoring, my entire egocentric/exocentric pipeline is now modular. One codebase handles different sensor layouts and outputs a unified, multimodal timeseries file that you can open in Rerun.

June 26, 2025 at 1:40 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

Streaming iPhone data in real-time directly to @rerun.io 🚀

May 22, 2025 at 4:54 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

MVP of Multiview Video → Camera parameters + 3D keypoints. Visualized with @rerun.io

May 9, 2025 at 3:09 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

First signs of life on my self-collected data =]

I still need to add synchronization between the exo/ego videos, but all of the work the past few months is coming together

This shows a synchronization app I'm building with @rerun.io and @gradio-hf.bsky.social . More to come

May 5, 2025 at 8:47 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

Trying to wrap my head around fwd/bwd kinematics for imitation learning, so I built a fully‑differentiable kinematic hand skeleton in JAX and visualized it with @rerun.io new callback system in a Jupyter Notebook. This shows each joint angle and how it impacts the kinematic skeleton.

May 2, 2025 at 8:59 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

I added export settings to get the images/masks/camera parameters in NerfStudio format. I can chuck these into brush from @arthurperpixel.bsky.social and get a splat on my M4.

Right now, such sparse input doesn't produce great results, but the depths are not being used, which should be able to help

April 25, 2025 at 4:36 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

@rerun.io v0.23 is finally out! 🎉 I’ve extended my @gradio-hf.bsky.social annotation pipeline to support multiview videos using the callback system introduced in 0.23.

April 24, 2025 at 2:20 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

Visualized with @rerun.io, I’ve integrated video‑based depth estimation into my robot‑training pipeline to make data collection as accessible as possible—without requiring specialized hardware.

April 17, 2025 at 8:08 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

I extended my previous @rerun.io and @gradio-hf.bsky.social annotation pipeline for multiple views. You can see how powerful this is when using Meta's Segment Anything and multi-view geometry. Only annotating 2 views, I can triangulate the other 6 views and get masks extremely quickly!

April 10, 2025 at 5:00 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

Here’s a sneak peek using @rerun.io and @gradio-hf.bsky.social for data annotation. It uses Video Depth Anything and Segment Anything 2 under the hood to generate segmentation masks and depth maps/point clouds. More to share next week.

April 1, 2025 at 7:13 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

Continuing with my robot training collection pipeline, one of the challenges has been obtaining calibrated cameras from sparse multi-view inputs. I previously worked with Dust3r but found its accuracy insufficient for generating reliable camera parameters.

March 25, 2025 at 5:57 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

More progress towards building a straightforward method to collect first-person (ego) and third-person (exo) data for robotic training in @rerun.io. I’ve been using the HO-cap dataset to establish a baseline, and here are some updates I’ve made (code at the end)

March 18, 2025 at 3:32 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

Finally finished porting mast3r-slam to @rerun.io and adding a @gradio-hf.bsky.social interface. Really cool to see it running on any video I throw at it, I've included the code at the end

March 7, 2025 at 9:52 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

I'm working towards an easy method to collect a combined third-person and first-person pose dataset starting Assembly101 from Meta, with near real-time performance via @rerun.io visualization. The end goal is robot imitation learning with Hugging Face LeRobot

February 24, 2025 at 4:02 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

Following up on my prompt depth anything post, I'm starting a bit of a miniseries where I'm going through the tutorials of
Lerobot to understand better how I can get a real robot to work on my custom dataset. Using @rerun.io to visualize
code: github.com/rerun-io/pi0...

February 11, 2025 at 5:09 PM

pablovelagomez.bsky.social

@pablovelagomez.bsky.social

Recently, I've been playing with my iPhone ToF sensor, but the problem has always been the abysmal resolution (256x192). The team behind DepthAnything released PromptDepthAnything that fixes this. Using @rerun.io to visualize. Links at the end of the thread

February 3, 2025 at 1:18 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news