Analyze in detail and summarize.
++
Information
Word2vec, published by a team of Google researchers led by Tomas Mikolov and regarded as a “breakthrough technique” in the natural language processing field, is now eight years old. The team pioneered the concept of the word embedding as the foundation of the technique.
An **embedding** is a relatively low-dimensional space into which you can translate high-dimensional vectors.
One such model, Word2Vec (word to vector), developed by Google in 2013, efficiently creates word embeddings using a two-layer neural network. It takes a word as input and outputs an n-dimensional coordinate (the embedding vector), so that when these word vectors are plotted (or projected down to two or three dimensions for visualization), synonyms cluster together.
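As a rough illustration of this workflow, the sketch below trains a tiny Word2Vec model with the gensim library on a toy corpus and looks up the resulting vectors and nearest neighbours. The library choice, corpus, and parameter values are assumptions for demonstration; the original text does not name a specific implementation.

```python
# A minimal sketch, assuming the gensim library and a made-up toy corpus;
# neither is specified in the original text.
from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of tokens.
corpus = [
    ["dad", "drives", "the", "kids", "to", "school"],
    ["mom", "drives", "the", "kids", "to", "school"],
    ["father", "reads", "a", "book", "to", "the", "kids"],
    ["mother", "reads", "a", "book", "to", "the", "kids"],
]

# Train a small two-layer Word2Vec model (skip-gram) with 50-dimensional vectors.
model = Word2Vec(
    sentences=corpus,
    vector_size=50,   # dimensionality n of the embedding space
    window=3,         # context window size
    min_count=1,      # keep every word, even rare ones (toy data)
    sg=1,             # 1 = skip-gram, 0 = CBOW
    epochs=200,
    seed=42,
)

# Each word is now an n-dimensional vector, e.g. model.wv["dad"].
print(model.wv["dad"][:5])           # first five coordinates of the "dad" vector
print(model.wv.most_similar("dad"))  # words whose vectors lie closest to "dad"
```

On a corpus this small the neighbours are essentially arbitrary; with realistic amounts of text, words used in similar contexts end up with similar vectors.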
Here is how two words, “dad” and “mom”, might be represented as vectors:
“dad” = [0.1548, 0.4848, …, 1.864]
“mom” = [0.8785, 0.8974, …, 2.794]
Although there is some similarity between these two words, we would expect “father” to lie much closer to “dad” in the vector space, resulting in a higher dot product (_a measure of how closely two vectors align in the direction they point_).
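To make the comparison concrete, the short sketch below computes the dot product and its length-normalised form, cosine similarity, for three made-up 4-dimensional vectors standing in for “dad”, “mom”, and “father”. The values are hypothetical and not taken from a trained model; real embeddings would have hundreds of dimensions.

```python
# A minimal sketch with hypothetical 4-dimensional vectors.
import numpy as np

dad    = np.array([0.20, 0.50, 0.90, 1.80])
father = np.array([0.25, 0.45, 0.95, 1.85])  # assumed to point in nearly the same direction as "dad"
mom    = np.array([0.90, 0.90, 0.30, 1.00])

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Dot product normalised by vector lengths: 1.0 means identical direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print("dot(dad, mom)    =", np.dot(dad, mom))
print("dot(dad, father) =", np.dot(dad, father))
print("cos(dad, mom)    =", cosine_similarity(dad, mom))
print("cos(dad, father) =", cosine_similarity(dad, father))
# Expectation: the "dad"/"father" pair scores higher on both measures than "dad"/"mom".
```

In practice the cosine form is usually preferred, since it compares direction only and is not skewed by vector length.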
++
Sources
61. Embeddings, IBM: https://www.ibm.com/topics/embedding
62. What is embedding and what can you do with it, Towards Data Science: https://towardsdatascience.com/what-is-embedding-and-what-can-you-do-with-it-61ba7c05efd8