Sai Kumar Dwivedi

CV | Google Scholar | MPI Webpage | GitHub | LinkedIn | Email

For a complete list of publications, visit my Google Scholar profile.

TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation
Sai Kumar Dwivedi*, Yu Sun*, Priyanka Patel, Yao Feng, Michael J. Black
Computer Vision and Pattern Recognition (CVPR 2024)
(*Equal Contribution)

Summary | Project | arXiv | Paper | Video | Code | Poster

TokenHMR addresses a paradox in human pose and shape (HPS) estimation, where 3D accuracy declines as 2D precision increases, by introducing Threshold-Adaptive Loss Scaling (TALS) and reformulating the problem as token prediction.

ChatPose: Chatting about 3D Human Pose
Yao Feng, Jing Lin, Sai Kumar Dwivedi, Yu Sun, Priyanka Patel, Michael J. Black
Computer Vision and Pattern Recognition (CVPR 2024)

Summary | Project | arXiv | Paper | Code | Video

ChatPose integrates large language models to comprehend and reason about 3D human poses from images or textual descriptions, leveraging world knowledge and body language.

POCO: 3D Pose and Shape Estimation using Confidence
Sai Kumar Dwivedi, Cordelia Schmid, Hongwei Yi, Michael J. Black, Dimitrios Tzionas
International Conference on 3D Vision (3DV 2024)
(Oral Presentation)

Summary | Project | arXiv | Paper | Code | Video | Poster | Talk

POCO is a novel framework that can be applied to common human pose and shape regressors, extending them to also estimate their confidence in the result without any downside.

Detecting Human-Object Contact in Images
Yixin Chen, Sai Kumar Dwivedi, Michael J. Black, Dimitrios Tzionas
Computer Vision and Pattern Recognition (CVPR 2023)

Summary | Project | arXiv | Paper | Code | Video

HOT tackles the lack of a reliable approach for detecting 2D human-object contact in images by introducing a dataset of 2D contacts and developing a contact detector guided by part-attention, which surpasses all baseline methods.

Learning to Regress Bodies from Images using Differentiable Semantic Rendering
Sai Kumar Dwivedi, Nikos Athanasiou, Muhammed Kocabas, Michael J. Black
International Conference on Computer Vision (ICCV 2021)

Summary | Project | arXiv | Paper | Code | Video | Poster

DSR introduces a novel Differentiable Semantic Rendering (DSR) loss that uses semantic clothing information to improve 3D human body estimation, thereby surpassing prior state-of-the-art methods.

ProtoGAN: Towards Few Shot Learning for Action Recognition
Sai Kumar Dwivedi, Vikram Gupta, Rahul Mitra, Shuaib Ahmed, Arjun Jain
International Conference on Computer Vision Workshop (ICCVw 2019)

Summary | arXiv | Paper | Data

The ProtoGAN framework addresses the challenge of few-shot learning (FSL) and generalized FSL for action recognition by synthesizing additional examples for novel categories using class prototype vectors.

Out-Of-Distribution Detection for Generalized Zero-Shot Action Recognition
Devraj Mandal, Sanath Narayan, Sai Kumar Dwivedi, Vikram Gupta, Shuaib Ahmed, Fahad Shahbaz Khan, Ling Shao
Computer Vision and Pattern Recognition (CVPR 2019)

Summary | arXiv | Paper | Code

To address the challenges of generalized zero-shot action recognition, our novel framework incorporates an out-of-distribution detector that distinguishes between seen and unseen action categories, achieving significant improvements over existing methods.

Progression Modelling for Online and Early Gesture Detection
Vikram Gupta, Sai Kumar Dwivedi, Rishabh Dabral, Arjun Jain
International Conference on 3D Vision (3DV 2019)
(Oral Presentation)

Summary | arXiv | Paper | Data

Our simple yet effective multi-task learning framework addresses online and early gesture detection by modelling gesture progression alongside frame-level recognition.