Language, trees, and geometry in neural networks

Part I of a series of expository notes accompanying this paper, by Andy Coenen, Emily Reif, Ann Yuan, Been Kim, Adam Pearce, Fernanda Viégas, and Martin Wattenberg. These notes are designed as an expository walk through some of the main results. Please see the paper for full references and details.

Language is made of discrete structures, yet neural networks operate on continuous data: vectors in high-dimensional space. A successful language-processing network must translate this symbolic information into some kind of geometric representation, but in what form? Word embeddings provide two well-known examples: distance encodes semantic similarity, while certain directions correspond to polarities (e.g. male vs. female).

A recent, fascinating discovery points to an entirely new type of representation. One of the key pieces of linguistic information about a sentence is its syntactic structure. This structure can be represented as a tree whose nodes correspond to words of the sentence. He...
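The two word-embedding properties mentioned above (distance as similarity, directions as polarities) can be illustrated with a minimal sketch. The vectors below are made-up toy values, not output from any trained model; the point is only the geometry: subtracting a "man" vector and adding a "woman" vector moves along a polarity direction, landing nearest to "queen".

```python
import numpy as np

# Toy 3-d "embeddings" (illustrative values, not from a trained model).
# The last axis loosely plays the role of a gender-polarity direction.
emb = {
    "king":  np.array([0.9, 0.8,  0.7]),
    "queen": np.array([0.9, 0.8, -0.7]),
    "man":   np.array([0.5, 0.2,  0.7]),
    "woman": np.array([0.5, 0.2, -0.7]),
}

def cosine(u, v):
    """Cosine similarity: distance in embedding space as semantic similarity."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Directions as polarities: king - man + woman lands near queen.
analogy = emb["king"] - emb["man"] + emb["woman"]
nearest = max(emb, key=lambda w: cosine(analogy, emb[w]))
print(nearest)  # -> queen
```

Real embedding models exhibit the same pattern in hundreds of dimensions, with the polarity direction learned rather than placed by hand.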

12 mentions: @wattenberg @viegasf @earnmyturns @wattenberg @jakubzavrel @mat_kelcey @FabienCampagne @wzuidema
Date: 2019/06/08 14:17

Referring Tweets

@wattenberg How does a neural net represent language? See the visualizations and geometry in this PAIR team paper and blog post
@viegasf Analyzing and visualizing syntax trees in the high-dimensional spaces of neural nets. Check out the new PAIR paper on BERT geometry, and the blog post on “Language, trees, and geometry in neural networks”
@earnmyturns Very cool exploration of the geometry of language embeddings, with some fun math I did not know.