[1909.04101] Neural Naturalist: Generating Fine-Grained Image Comparisons

We introduce the new Birds-to-Words dataset of 41k sentences describing fine-grained differences between photographs of birds. The language collected is highly detailed, while remaining understandable to the everyday observer (e.g., "heart-shaped face," "squat body"). Paragraph-length descriptions naturally adapt to varying levels of taxonomic and visual distance (drawn from a novel stratified sampling approach) with the appropriate level of detail. We propose a new model called Neural Naturalist that uses a joint image encoding and comparative module to generate comparative language, and evaluate the results with humans who must use the descriptions to distinguish real images. Our results indicate promising potential for neural models to explain differences in visual embedding space using natural language, as well as a concrete path for machine learning to aid citizen scientists in their effort to preserve biodiversity.

2 mentions: @ReaderMeter, @tkym1220
Date: 2019/09/12 12:50

Referring Tweets

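@tkym1220 Neural Naturalist: Generating Fine-Grained Image Comparisons, EMNLP 2019 https://t.co/B8Hxvvf4TT Builds and releases a dataset of bird image pairs with sentences describing the differences between the images. Also proposes the "Neural Naturalist" model, which generates sentences describing the differences between an input image pair. https://t.co/aqqch14Exa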
@ReaderMeter The Birds-to-Words dataset: 40,000+ sentences describing fine-grained differences between photographs of birds, and “a model (Neural Naturalist) that uses a joint image encoding and comparative module to generate comparative language.” https://t.co/MHo2lrp8Q1 https://t.co/tI3Jke35NC
