Shiry Ginosar*, Amir Bar*, Gefen Kohavi, Caroline Chan, Andrew Owens, Jitendra Malik
In CVPR 2019
[Code] [Data]

Speech audio-to-gesture translation. From the bottom upward: the input audio, predicted arm and hand motion, and synthesized video frames.

Abstract
Human speech is often accompanied by hand and arm gestures. Given audio speech input, we generate plausible gestures to go along with the sound. Specifically, we perform cross-modal translation from "in-the-wild" monologue speech of a single speaker to their hand and arm motion. We train on unlabeled videos for which we only have noisy pseudo ground truth from an automatic pose detection system. Our proposed model significantly outperforms baseline methods in a quantitative comparison. To support research toward obtaining a computational understanding of the relationship between gesture and speech, we release a large video dataset of person-specific gestures.

Paper
Learning Individual Styles of Conversational Gesture...

3 mentions: @_amirbar, @luiscosio, @soycurd1
Date: 2019/06/13 09:48

Referring Tweets

@_amirbar (2/2) We also release our full dataset and will make the code available. Joint work with @shiryginosar, Gefen Kohavi, Caroline Chan, Andrew Owens and Jitendra Malik. For more details see project page: https://t.co/oH0mca7B3x. @berkeley_ai @ZebraMedVision
@luiscosio Learning Individual Styles of Conversational Gesture - https://t.co/mkdhLfnsrV #AI #ML
@soycurd1 Seems like a paper that stands on the strength of its idea / 1 comment https://t.co/KS3IMQWoQK “Speech2Gesture” https://t.co/uUK1CSt241

Bookmark Comments

id:soy-curd Seems like a paper that stands on the strength of its idea