TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation
Sai Kumar Dwivedi*, Yu Sun*, Priyanka Patel, Yao Feng, Michael J. Black
Computer Vision and Pattern Recognition (CVPR 2024)
(*Equal Contribution)
TokenHMR addresses the paradox that the 3D accuracy of human pose and shape (HPS) methods declines as their 2D precision increases, by introducing Threshold-Adaptive Loss Scaling (TALS) and reformulating the problem as token prediction.
project | arXiv | paper | video | code | poster
ChatPose: Chatting about 3D Human Pose
Yao Feng, Jing Lin, Sai Kumar Dwivedi, Yu Sun, Priyanka Patel, Michael J. Black
Computer Vision and Pattern Recognition (CVPR 2024)
ChatPose integrates large language models to comprehend and reason about 3D human poses from images or textual descriptions, leveraging world knowledge and body language.
project | arXiv | paper | code | video
POCO: 3D Pose and Shape Estimation using Confidence
Sai Kumar Dwivedi, Cordelia Schmid, Hongwei Yi, Michael J. Black, Dimitrios Tzionas
International Conference on 3D Vision (3DV 2024)
(Oral Presentation)
POCO is a novel framework that can be applied to common human pose and shape regressors, extending them to estimate the method’s confidence in the result without any downside.
project | arXiv | paper | code | video | poster | talk
Detecting Human-Object Contact in Images
Yixin Chen, Sai Kumar Dwivedi, Michael J. Black, Dimitrios Tzionas
Computer Vision and Pattern Recognition (CVPR 2023)
HOT tackles the lack of a reliable approach for detecting human-object 2D contact in images by introducing a dataset of 2D contacts and developing a contact detector guided by part-attention, which surpasses all baseline methods.
project | arXiv | paper | code | video
Learning to Regress Bodies from Images using Differentiable Semantic Rendering
Sai Kumar Dwivedi, Nikos Athanasiou, Muhammed Kocabas, Michael J. Black
International Conference on Computer Vision (ICCV 2021)
DSR introduces a novel Differentiable Semantic Rendering (DSR) loss that uses semantic clothing information to improve 3D human body estimation, thereby surpassing prior state-of-the-art methods.
project | arXiv | paper | code | video | poster
ProtoGAN: Towards Few Shot Learning for Action Recognition
Sai Kumar Dwivedi, Vikram Gupta, Rahul Mitra, Shuaib Ahmed, Arjun Jain
International Conference on Computer Vision Workshop (ICCVw 2019)
The ProtoGAN framework addresses the challenge of few-shot learning (FSL) and generalized FSL for action recognition by synthesizing additional examples for novel categories using class prototype vectors.
arXiv | paper | data
Out-Of-Distribution Detection for Generalized Zero-Shot Action Recognition
Devraj Mandal, Sanath Narayan, Sai Kumar Dwivedi, Vikram Gupta, Shuaib Ahmed, Fahad Shahbaz Khan, Ling Shao
Computer Vision and Pattern Recognition (CVPR 2019)
To address the challenges of generalized zero-shot action recognition, our novel framework incorporates an out-of-distribution detector that distinguishes between seen and unseen action categories, achieving significant improvements over existing methods.
arXiv | paper | code
Progression Modelling for Online and Early Gesture Detection
Vikram Gupta, Sai Kumar Dwivedi, Rishabh Dabral, Arjun Jain
International Conference on 3D Vision (3DV 2019)
(Oral Presentation)
Our simple yet effective multi-task learning framework addresses online and early gesture detection by modelling gesture progression alongside frame-level recognition.
arXiv | paper | data