Van Nguyen Nguyen

Email: vanngn dot nguyen at gmail dot com

I am a Research Scientist at United Imaging Intelligence (UII America) in Greater Boston, MA, USA.

I completed my PhD in the IMAGINE team at École des Ponts ParisTech, under the supervision of Prof. Vincent Lepetit.

My research focuses on 3D computer vision and robotics.

Our team is always looking for self-motivated research interns. Please drop me a line if you are interested!

News

02/2026: CIF and MedGRPO accepted to CVPR 2026.
01/2026: Universal Beta Splatting accepted to ICLR 2026.
06/2025: Joined United Imaging Intelligence in Burlington, MA as a research scientist.
06/2025: BOP challenge 2024 report received CV4MR Best Paper Award at CVPR 2025.
04/2025: BOP challenge 2024 report and GoTrack accepted to CV4MR workshop at CVPR 2025.
12/2024: PhD defended!
06/2024: BOP challenge 2024 opened!
05/2024: Joined Meta Reality Labs as a research intern with Tomas Hodan.
04/2024: Accepted to CVPR 2024 Doctoral Consortium.
04/2024: BOP challenge 2023 report accepted to CVPRW 2024.
02/2024: GigaPose, NOPE, OSV5M accepted to CVPR 2024.
10/2023: CNOS accepted to R6D workshop at ICCV 2023. Awarded best 2D detection method for unseen objects at BOP challenge 2023.
08/2022: PIZZA accepted (as Oral) to 3DV 2022.
05/2022: Joined Meta Reality Labs as a research intern, working with Pierre Moulon.
03/2022: Template-pose accepted to CVPR 2022.
10/2020: Started PhD at IMAGINE team, advised by Prof. Vincent Lepetit.

PhD thesis: Pose Estimation of Novel Rigid Objects

Supervisor: Prof. Vincent Lepetit
Reviewers: Prof. Markus Vincze, Dr. Benjamin Busam
Examiners: Prof. Josef Sivic, Prof. Dima Damen, Dr. Slobodan Ilic

Publications

CIF: Consistent Instance Field for Dynamic Scene Understanding

Junyi Wu, Van Nguyen Nguyen, Benjamin Planche, Jiachen Tao, Changchang Sun, Zhongpai Gao, Zhenghao Zhao, Anwesa Choudhuri, Gengyu Zhang, Meng Zheng, Feiran Wang, Terrence Chen, Yan Yan, Ziyan Wu

CVPR 2026

A continuous probabilistic spatio-temporal representation for dynamic scene understanding that disentangles visibility from persistent object identity. Our approach employs instance-embedded deformable 3D Gaussians that encode both radiance and semantic information, enabling novel-view panoptic segmentation and open-vocabulary 4D querying tasks.

arXiv project

MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding

Yuhao Su, Anwesa Choudhuri, Zhongpai Gao, Benjamin Planche, Van Nguyen Nguyen, Meng Zheng, Yuhan Shen, Arun Innanje, Terrence Chen, Ehsan Elhamifar, Ziyan Wu

CVPR 2026

We introduce MedVidBench, a large-scale benchmark of 531K video-instruction pairs across 8 medical sources, and MedGRPO, an RL framework with cross-dataset reward normalization and a medical LLM judge for balanced multi-dataset training. Fine-tuning Qwen2.5-VL-7B on MedVidBench substantially outperforms GPT-4.1 and Gemini-2.5-Flash, while MedGRPO further improves grounding and captioning tasks.

arXiv project

Universal Beta Splatting

Rong Liu, Zhongpai Gao, Benjamin Planche, Meida Chen, Van Nguyen Nguyen, Meng Zheng, Anwesa Choudhuri, Terrence Chen, Yue Wang, Andrew Feng, Ziyan Wu

ICLR 2026

A unified framework that generalizes 3D Gaussian Splatting to N-dimensional anisotropic Beta kernels for explicit radiance field rendering. Beta kernels naturally decompose scene properties into interpretable components (surface vs. texture, diffuse vs. specular, static vs. dynamic) without explicit supervision.

arXiv project code

BOP Challenge 2024 on Model-free, Model-based Detection, and Pose Estimation of Unseen Rigid Objects

Van Nguyen Nguyen, Stephen Tyree, Andrew Guo, Médéric Fourmy, Anas Gouda, Taeyeop Lee, Sungphill Moon, Hyeontae Son, Lukas Ranftl, Jonathan Tremblay, Eric Brachmann, Bertram Drost, Vincent Lepetit, Carsten Rother, Stan Birchfield, Jiri Matas, Yann Labbé, Martin Sundermeyer, Tomáš Hodaň

CVPRW 2025 (CV4MR Best Paper Award)

The report of BOP Challenge 2024 on model-based and model-free 2D/6D object detection on BOP-Classic and new BOP-H3 datasets (HOT3D, HOPEv2, HANDAL).

arXiv project

GoTrack: Generic 6DoF Object Pose Refinement and Tracking

Van Nguyen Nguyen, Christian Forster, Sindi Shkodrani, Bugra Tekin, Vincent Lepetit, Cem Keskin, Tomáš Hodaň

CVPRW 2025

An efficient and accurate CAD-based method for 6DoF pose refinement and tracking of unseen objects. Given a CAD model of an object, an RGB image with known intrinsics that shows the object in an unknown pose, and an initial object pose, GoTrack refines the object pose so that the 2D projection of the model aligns closely with the object’s appearance in the image.

arXiv code

BOP Challenge 2023 on Detection, Segmentation and Pose Estimation of Seen and Unseen Rigid Objects

Tomas Hodan, Martin Sundermeyer, Yann Labbé, Van Nguyen Nguyen, Gu Wang, Eric Brachmann, Bertram Drost, Vincent Lepetit, Carsten Rother, Jiri Matas

CVPRW 2024

The report of BOP Challenge 2023 on state-of-the-art methods for seen and unseen object pose estimation.

arXiv project

GigaPose: Fast and Robust Novel Object Pose Estimation via One Correspondence

Van Nguyen Nguyen, Thibault Groueix, Mathieu Salzmann, Vincent Lepetit

CVPR 2024

A fast, robust, and more accurate “hybrid” template-patch correspondence approach to estimating the 6D pose of novel objects in RGB images. GigaPose predicts the 6D object pose from a single 2D-to-2D correspondence.

arXiv code

NOPE: Novel Object Pose Estimation from a Single Image

Van Nguyen Nguyen, Thibault Groueix, Georgy Ponimatkin, Yinlin Hu, Renaud Marlet, Mathieu Salzmann, Vincent Lepetit

CVPR 2024

A method that can estimate the relative pose of unseen objects given only a single reference image. It also predicts a 3D pose distribution, which can be used to address pose ambiguities due to symmetries.

arXiv code

OpenStreetView-5M: The Many Roads to Global Visual Geolocation

Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis, Constantin Aronssohn, Nacim Bouia, Stephanie Fu, Romain Loiseau, Van Nguyen Nguyen, Charles Raude, Elliot Vincent, Lintao Xu, Hongyu Zhou, Loic Landrieu

CVPR 2024

A new benchmark for visual geolocation (similar to GeoGuessr).

arXiv code

CNOS: A Strong Baseline for CAD-based Novel Object Segmentation

Van Nguyen Nguyen, Thibault Groueix, Georgy Ponimatkin, Vincent Lepetit, Tomáš Hodaň

ICCVW 2023 (R6D Best Method Award for 2D detection of unseen objects)

A method that can segment novel objects in a given RGB image from only their CAD models. Based on Segment Anything and DINOv2, CNOS is a strong baseline for Tasks 5 and 6 in the BOP challenge 2023.

arXiv code

PIZZA: A Powerful Image-only Zero-Shot Zero-CAD Approach to 6DoF Tracking

Van Nguyen Nguyen⁺, Yuming Du⁺, Yang Xiao, Michaël Ramamonjisoa, Vincent Lepetit

3DV 2022 (Oral)

A method for tracking the 6D motion of objects in RGB video sequences when neither training images nor even the 3D geometry of the objects is available.

arXiv code

Templates for 3D Object Pose Estimation Revisited: Generalization to New Objects and Robustness to Occlusions

Van Nguyen Nguyen, Yinlin Hu, Yang Xiao, Mathieu Salzmann, Vincent Lepetit

CVPR 2022

A method that can recognize objects and estimate their 3D pose in color images even under partial occlusions. Our method requires neither a training phase on these objects nor real images depicting them, only their CAD models.

arXiv code