Hirokatsu Kataoka | 片岡裕雄
@hirokatukataoka.bsky.social
Chief Scientist @ AIST | Academic Visitor @ Oxford VGG | PI @ cvpaper.challenge | 3D ResNet (Top 0.5% in 5-yr CVPR) | FDSL (ACCV20 Award/BMVC23 Award Finalist)
Slides from my #BMVC2025 talk are now available!
hirokatsukataoka.net/temp/presen/...

This includes the following papers:
- Industrial Synthetic Segment Pre-training arxiv.org/abs/2505.13099
- S3OD: Towards Generalizable Salient Object Detection with Synthetic Data arxiv.org/abs/2510.21605
December 5, 2025 at 11:26 AM
Released HanDyVQA, an egocentric QA benchmark for fine-grained hand-object interaction, with 11.1K QAs and 10.3K segmentation masks across 112 domains.

Even Gemini-2.5-Pro reaches only 73% against a 97% human score, revealing key issues in spatio-temporal tasks.

Project: masatate.github.io/HanDyVQA-pro...
December 5, 2025 at 11:13 AM
We have publicly shared "PowerCLIP," a method that aligns powersets of image sub-regions with textual structures for precise image-text recognition.

Outperforms several SotA in zero-shot classification, retrieval, robustness, and compositional tasks!

arxiv.org/abs/2511.23170
December 4, 2025 at 4:20 AM
[ #NeurIPS2025 Spotlight ] We're very excited to share "Domain Unlearning," a collaboration between Irie Lab (TUS) and AIST that selectively removes domain-specific knowledge from trained models.

- Project: kodaikawamura.github.io/Domain_Unlea...
- Paper: arxiv.org/abs/2510.08132
December 4, 2025 at 4:13 AM
We’ve released the ICCV 2025 Report!
hirokatsukataoka.net/temp/presen/...

Compiled during ICCV in collaboration with LIMIT.Lab, cvpaper.challenge, and Visual Geometry Group (VGG), this report offers meta insights into the trends and tendencies observed at this year’s conference.

#ICCV2025
October 31, 2025 at 5:46 PM
I’m planning to attend ICCV 2025 in person!

Here are my accepted papers and roles at this year’s #ICCV2025 / @iccv.bsky.social .

Please check out the threads below:
October 16, 2025 at 2:14 AM
We organized the "Cambridge Computer Vision Workshop" at the University of Cambridge together with Elliott Wu, Yoshihiro Fukuhara, and LIMIT.Lab! It was a fantastic workshop featuring presentations, networking, and discussions.
cambridgecv-workshop-2025sep.limitlab.xyz
October 2, 2025 at 12:15 PM
Finally, the accepted papers at the #ICCV2025 / @iccv.bsky.social LIMIT Workshop have been publicly released!
--
- OpenReview: openreview.net/group?id=the...
- Website: iccv2025-limit-workshop.limitlab.xyz
October 2, 2025 at 12:06 PM
At ICCV 2025, I am organizing two workshops: the LIMIT Workshop and the FOUND Workshop.

◆ LIMIT Workshop (19 Oct, PM): iccv2025-limit-workshop.limitlab.xyz
◆ FOUND Workshop (19 Oct, AM): iccv2025-found-workshop.limitlab.xyz

We warmly invite you to attend these workshops at ICCV 2025 in Hawaii!
September 17, 2025 at 3:42 PM
I’m thrilled to announce my invited talk at the BMVC 2025 workshop "Smart Cameras for Smarter Autonomous Vehicles and Robots"!

supercamerai.github.io
September 2, 2025 at 2:35 PM
Our AnimalClue has been accepted to #ICCV2025 as a highlight🎉🎉🎉 We also released an official press release from AIST!! This is a collaboration between AIST and Oxford VGG.

Project page: dahlian00.github.io/AnimalCluePa...
Dataset: huggingface.co/risashinoda
Press: www.aist.go.jp/aist_j/press...
August 3, 2025 at 9:52 PM
Our AgroBench has been accepted to #ICCV2025 🎉🎉🎉 We released project page, paper, code, and dataset!!

Project page: dahlian00.github.io/AgroBenchPage/
Paper: arxiv.org/abs/2507.20519
Code: github.com/dahlian00/Ag...
Dataset: huggingface.co/datasets/ris...
August 3, 2025 at 9:49 PM
We’ve released the CVPR 2025 Report!
hirokatsukataoka.net/temp/presen/...

Compiled during CVPR in collaboration with LIMIT.Lab, cvpaper.challenge, and Visual Geometry Group (VGG), this report offers meta insights into the trends and tendencies observed at this year’s conference.

#CVPR2025
June 17, 2025 at 11:10 AM
[LIMIT.Lab Launched]
limitlab.xyz

We’ve established "LIMIT.Lab," a collaboration hub for building multimodal AI models (covering images, videos, 3D, and text) when any resource, such as compute, data, or labels, is constrained.
June 6, 2025 at 10:03 AM
“Industrial Synthetic Segment Pre-training” on arXiv!

Formula-driven supervised learning (FDSL) has surpassed the vision foundation model "SAM" on industrial data. It delivers strong transfer performance to industry while minimizing IP-related concerns.

arxiv.org/abs/2505.13099
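The core idea behind FDSL is that pre-training images and their labels are rendered from mathematical formulas rather than collected from the web, so there are no privacy or IP concerns. As a rough illustration of that idea (not the paper's actual generator), here is a minimal sketch in the spirit of FractalDB: each randomly sampled iterated function system (IFS) defines one synthetic class, and the rendered fractal is a training image for that class. All function and parameter names here are hypothetical.

```python
import numpy as np

def render_fractal(num_transforms=4, num_points=20000, size=64, seed=0):
    """Render one synthetic 'category' image from a random IFS.

    In formula-driven supervised learning (FDSL), the sampled formula
    (here, a set of contractive affine maps) itself defines the class
    label, so supervision comes for free without real data or annotators.
    """
    rng = np.random.default_rng(seed)
    # Random affine maps x' = A x + b; rescale A so each map is a
    # contraction (Frobenius norm 0.7), keeping the orbit bounded.
    A = rng.uniform(-1, 1, size=(num_transforms, 2, 2))
    A = A / (np.linalg.norm(A, axis=(1, 2), keepdims=True) + 1e-6) * 0.7
    b = rng.uniform(-1, 1, size=(num_transforms, 2))

    # Chaos-game iteration: repeatedly apply a randomly chosen map.
    x = np.zeros(2)
    pts = np.empty((num_points, 2))
    for i in range(num_points):
        k = rng.integers(num_transforms)
        x = A[k] @ x + b[k]
        pts[i] = x

    # Rasterize the visited points into a binary image.
    pts -= pts.min(axis=0)
    span = pts.max(axis=0)
    span[span == 0] = 1.0
    ij = np.clip((pts / span * (size - 1)).astype(int), 0, size - 1)
    img = np.zeros((size, size), dtype=np.uint8)
    img[ij[:, 1], ij[:, 0]] = 1
    return img

img = render_fractal()
```

Varying the random seed yields a new formula, i.e., a new class, which is how FDSL datasets scale to many categories without any human labeling.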
May 21, 2025 at 10:28 AM
I’m honored to serve as an Area Chair for CVPR 2025 for the second time. Thank you so much for the support!!

cvpr.thecvf.com/Conferences/...
2025 Program Committee
May 12, 2025 at 8:05 AM
Reposted by Hirokatsu Kataoka | 片岡裕雄
where I apparently asked you to present your poster on 3D ResNet *in 2 minutes* in #CVPR2018...
7 years later, I am very grateful for your *50 minutes* talk and full day visit to my group...
Thanks for the personal touch ☺️
2/2
April 29, 2025 at 6:43 PM
Reposted by Hirokatsu Kataoka | 片岡裕雄
Many thanks @hirokatukataoka.bsky.social for visiting @bristoluni.bsky.social #MaVi group during your stay @oxford-vgg.bsky.social
This was a very motivating talk on training from limited synthetic data.
Thanks for bringing up our first encounter #CVPR2018
....
1/2
April 29, 2025 at 6:42 PM
Very excited to announce that our Formula-Driven Supervised Learning (FDSL) series now includes audio modality 🎉🎉🎉
--
Formula-Supervised Sound Event Detection: Pre-Training Without Real Data, ICASSP 2025.
- Paper: arxiv.org/abs/2504.04428
- Project: yutoshibata07.github.io/Formula-SED/
April 11, 2025 at 1:53 PM
Our paper has been published ( lnkd.in/gyuPEWSA ) and AIST has issued a press release (JPN: lnkd.in/gyifZauS ) on applying FDSL pre-training to microfossil recognition, showcasing an example of image recognition technology in AI for Science! 🎉🎉🎉

The shared images are from our paper.
March 20, 2025 at 2:52 PM
Reposted by Hirokatsu Kataoka | 片岡裕雄
Introducing VGGT (CVPR'25), a feedforward Transformer that directly infers all key 3D attributes from one, a few, or hundreds of images, in seconds!

Project Page: vgg-t.github.io
Code & Weights: github.com/facebookrese...
March 17, 2025 at 2:08 AM
[ Reached 5,000 Citations! 🎉🎉🎉 ]

My research has reached 5,000 citations on Google Scholar! This wouldn’t have been possible without the support of my co-authors, colleagues, mentors, and the entire research community.

Looking forward to the next phase of collaboration! 🙌
February 7, 2025 at 3:33 PM
[New pre-training / augmentation dataset] MoireDB – a formula-generated interference-fringe image dataset for synthetic pre-training and data augmentation 🎉🎉🎉

Paper: arxiv.org/abs/2502.01490
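Moiré interference fringes arise when two periodic gratings with slightly different frequencies or orientations are superimposed. As a minimal illustration of formula-generated fringe images (assumed for illustration only; the actual MoireDB generation formula is in the paper, and all names here are hypothetical):

```python
import numpy as np

def moire_fringes(size=128, freq1=0.30, freq2=0.33, angle_deg=5.0):
    """Superimpose two sinusoidal gratings to create moire fringes.

    The slight frequency/orientation mismatch between the gratings
    produces low-frequency interference bands, all from a closed-form
    formula with no real-world data involved.
    """
    y, x = np.mgrid[0:size, 0:size].astype(float)
    g1 = np.sin(2 * np.pi * freq1 * x)
    theta = np.deg2rad(angle_deg)
    xr = x * np.cos(theta) + y * np.sin(theta)  # rotated coordinate
    g2 = np.sin(2 * np.pi * freq2 * xr)
    img = (g1 + g2) / 2.0  # values in [-1, 1]
    return ((img + 1) / 2 * 255).astype(np.uint8)  # 8-bit grayscale

img = moire_fringes()
```

Sampling the frequencies and rotation angle at random gives an endless stream of distinct fringe images, which is what makes formula-generated data attractive for pre-training and augmentation.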
February 4, 2025 at 3:56 PM
Reposted by Hirokatsu Kataoka | 片岡裕雄
Demis Hassabis, James Manyika, and I wrote up an overview of the AI research work & advances across Google in 2024 (Gemini, NotebookLM, robotics, ML for science, & advances in responsible AI+more). 🎊

Give it a read, or paste it into NotebookLM to listen, if you prefer!

blog.google/technology/a...
2024: A year of extraordinary progress and advancement in AI
As we move into 2025, we’re looking back at the astonishing progress in AI in 2024.
January 24, 2025 at 12:46 AM
[ Research Paper Award🏅]
Our paper, "Efficient Load Interference Detection with Limited Labeled Data," has won the SICE International Young Authors Award (SIYA) 2025🎉🎉🎉 This work belongs to the FDSL pre-training family and demonstrates its real-world application in advanced logistics using forklifts.
January 25, 2025 at 4:55 AM