Sai Kumar Dwivedi

Recent News:

  • Jun 2025 🏆 InteractVLM won Human Contact Challenge at CVPR 2025

  • Jun 2025 🏆 Received the Outstanding Reviewers Award for CVPR 2025

  • Jun 2025 🎉 2 papers accepted in CVPR 2025 - InteractVLM (first author), PICO (co-author)

  • Jan 2025 💼 Accepted a 6-month internship offer at Meta Zurich

  • Oct 2024 🏆 Received the Outstanding Reviewers Award for ECCV 2024

  • Jun 2024 🎉 2 papers accepted in CVPR 2024 - TokenHMR (first author), ChatPose (co-author)

  • May 2024 📰 POCO was featured in RSIP Vision Magazine

  • Mar 2024 🎉 1 paper accepted in 3DV 2024 (Oral) - POCO (first author)

  • Jun 2023 🎉 1 paper accepted in CVPR 2023 - HOT (co-author)

  • Oct 2021 🎉 1 paper accepted in ICCV 2021 - DSR (first author)

  • Oct 2021 🎓 Started PhD at MPI-IS

  • Jun 2020 🏛️ Joined MPI-IS as a Research Assistant

  • Oct 2019 🎉 1 paper accepted in ICCV Workshop 2019 - ProtoGAN (first author)

  • Sep 2019 🎉 1 paper accepted in 3DV 2019 (Oral) - GPM (co-author)

  • Jun 2019 🎉 1 paper accepted in CVPR 2019 - OD-GZSL (co-author)

  • Sep 2017 💼 Joined Mercedes-Benz R&D India as a Computer Vision Researcher

  • Jan 2016 💼 Joined Intel as a Machine Learning Engineer

  • Sep 2014 🏛️ Research Exchange Student at Tallinn University of Technology

  • Sep 2014 🎖️ Received the Erasmus Mundus Fellowship

  • Jan 2016 🏛️ Research Internship at IIT-GN

  • Sep 2014 🎖️ Received the SRIP Summer Research Fellowship

  • Jul 2011 🎓 Started undergraduate studies in Computer Science at NIT Rourkela

I am a PhD candidate at the Max Planck Institute for Intelligent Systems (MPI-IS) in Germany, supervised by Dr. Michael Black and Dr. Dimitris Tzionas since October 2021. I also collaborate closely with Dr. Cordelia Schmid. My PhD research initially focused on estimating human pose and shape from monocular images. Currently, I am exploring the perception and estimation of human-object interactions from a single RGB image.

Before starting my PhD, I worked as a Computer Vision Researcher at Mercedes-Benz R&D India, where I collaborated with Dr. Arjun Jain on human pose estimation for Intelligent Interior. This work was deployed as the Side Mirror Selection feature (see video) in the Mercedes EQS 2021 and the Rear Sunblind Control feature (see video) in the Mercedes S-Class 2021. Prior to that, I worked at Intel, India, developing deep learning algorithms for edge devices. I earned my Master's in Computer Science from NIT Rourkela.

Publications

For a complete list of publications, visit my Google Scholar profile.

2025

InteractVLM Image

InteractVLM is a novel method to estimate 3D contact points on human bodies and objects from single in-the-wild images, enabling accurate joint reconstruction by leveraging large foundation models.

PICO Image

PICO introduces PICO-db, a dataset of natural images with dense 3D human-object contact annotations, and PICO-fit, an optimization method that uses these annotations to jointly fit 3D body and object meshes to images.

2024

TokenHMR Image

TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation

Sai Kumar Dwivedi*, Yu Sun*, Priyanka Patel, Yao Feng, Michael J. Black

CVPR, 2024

Integrated into Meshcapade's commercial solution (see here)

TokenHMR addresses the paradox of declining 3D accuracy of HPS methods with increasing 2D precision by introducing a Threshold-Adaptive Loss Scaling (TALS) loss and reformulating the problem as token prediction.

ChatPose Image

ChatPose integrates Large Language Models to comprehend and reason about 3D human poses from images or textual descriptions, leveraging world knowledge and body language understanding to unify pose estimation and generation tasks.

POCO Image

POCO: 3D Pose and Shape Estimation using Confidence

Sai Kumar Dwivedi, Cordelia Schmid, Hongwei Yi, Michael J. Black, Dimitrios Tzionas

3DV, 2024 (Oral)

Featured in RSIP Vision Magazine (see here)

POCO is a novel framework that can be applied to common human pose and shape regressors, extending them to estimate the method's confidence in the result without any downside.

2023

HOT Image

HOT introduces a novel dataset and detector to identify human-object contact in images, enhancing human-centered AI by addressing the absence of reliable detection methods.

2021

DSR Image

DSR introduces a novel Differentiable Semantic Rendering (DSR) loss that utilizes semantic clothing information to improve 3D human body estimation, surpassing prior state-of-the-art methods.

2019

ProtoGAN Image

ProtoGAN addresses the challenge of few-shot learning for action recognition by synthesizing additional examples for novel categories using class prototype vectors, improving generalization towards novel classes.

GZSL-OD Image

While addressing the challenges of generalized zero-shot action recognition, our novel framework incorporates an out-of-distribution detector to distinguish between seen and unseen action categories, achieving significant improvements over existing methods.

GPM Image

Our simple yet effective multi-task learning framework addresses the issue of online and early gesture detection by modelling gesture progression along with frame-level recognition.

Experience

Research Assistant
Jul 2020 - Aug 2021
Max Planck Institute for Intelligent Systems
Advisor: Dr. Michael J. Black

Computer Vision Researcher
Sep 2017 - Feb 2020
Mercedes-Benz R&D India
Major Focus: Human Pose Estimation for Intelligent Interiors

Machine Learning Engineer
Dec 2015 - Aug 2017
Intel Corporation
Major Focus: Deep Learning algorithms for edge devices

Research Exchange Student
Aug 2014 - Jun 2015
Tallinn University of Technology
Major Focus: Machine Learning and Digital System Design

Education

Bachelor's and Master's Degrees
(2011 - 2016)
Computer Science and Engineering
National Institute of Technology, Rourkela
CGPA - 8.51/10
Courses - Machine Learning, Principles of Artificial Intelligence, Computer Graphics

Awards

[2022] IMPRS-IS Scholar - Max Planck Institute for Intelligent Systems
[2021] RA Fellowship - Max Planck Institute for Intelligent Systems
[2018] Outstanding Performer Award - Mercedes-Benz R&D India
[2017] Employee Recognition - Intel Corporation
[2015] Student Fellowship - Erasmus Mundus European Union Program
[2014] Student Fellowship - IIT Gandhinagar