How would you convert TextStructure into a list of vertices that are directed from the first member of the list to the next and then the next?

conceptNetModel = 
  NetModel["ConceptNet Numberbatch Word Vectors V17.06"];
gloveModel = 
  NetModel[
   "GloVe 300-Dimensional Word Vectors Trained on Wikipedia and \
Gigaword 5 Data"];
wordsGlove = NetExtract[gloveModel, "Input"][["Tokens"]];
vecsGlove = NetExtract[gloveModel, "Weights"] // Normal;
word2vecGlove = AssociationThread[wordsGlove -> Most[vecsGlove]];
wordsConceptNet = NetExtract[conceptNetModel, "Input"][["Tokens"]];
vecsConceptNet = NetExtract[conceptNetModel, "Weights"] // Normal;
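word2vecConceptNet = 
  AssociationThread[wordsConceptNet -> Most[vecsConceptNet]];
(* short aliases used by the later snippets *)
word2vecGl = word2vecGlove; word2vecCn = word2vecConceptNet;
vecsGl = vecsGlove; vecsCn = vecsConceptNet;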
SemanticShift[word_, shiftVec_, model_] := 
  Nearest[model, model[word] + shiftVec, 8];
genderShift = word2vecGlove["he"] - word2vecGlove["she"];
SemanticShift["actor", genderShift, word2vecGlove]
FindAnalogy[word1_, word2_, word3_, model_] := 
  Nearest[model, model[word1] - model[word2] + model[word3], 8];
FindAnalogy["king", "man", "woman", word2vecGlove]
ExploreConceptNetRelations[word_, model_, n_ : 10] := 
  Nearest[model, model[word], n];
ExploreGloveRelations[word_, model_, n_ : 10] := 
  Nearest[model, model[word], n];
ExploreConceptNetRelations["wisdom", word2vecConceptNet]
ExploreGloveRelations["technology", word2vecGlove]
SemanticDistance[word1_, word2_, model_] := 
  EuclideanDistance[model[word1], model[word2]];
SemanticSimilarity[word1_, word2_, model_] := 
  1/(1 + SemanticDistance[word1, word2, model]);
SemanticDistance["cat", "dog", word2vecGlove]
SemanticSimilarity["cat", "dog", word2vecGlove]
ClusterWords[words_, model_] := FindClusters[Map[model, words]];
VisualizeEmbeddings[words_, model_] := 
  FeatureSpacePlot[Map[model, words], PlotLabels -> words];
wordsToAnalyze = {"science", "art", "mathematics", "literature", 
   "physics", "philosophy"};
clusterYourWords = ClusterWords[wordsToAnalyze, word2vecGlove];
VisualizeEmbeddings[wordsToAnalyze, word2vecGlove]
LinguisticTransformation[word_, transformationVec_, model_] := 
  Nearest[model, model[word] + transformationVec, 8];
positiveShift = word2vecGlove["joy"] - word2vecGlove["sadness"];
LinguisticTransformation["melancholy", positiveShift, word2vecGlove]

Features of Linguistics


Extracting Linguistic

One specific aspect is the extraction of linguistic relations using translation vectors in word embedding space.

For example,

TextStructure["Jupiter is the biggest planet in our solar system, \
eleven times bigger in diameter than Earth and two-and-a-half times \
more massive than all the other planets put together. Jupiter has no \
solid surface. Beneath the gas clouds lies hot, liquid hydrogen, then \
a layer of hydrogen in a form similar to liquid metal, and finally a \
rocky core. Jupiter has a faint ring around its equator made of \
microscopic dust particles.", "DependencyGraphs"]

The classic example "king - man + woman = queen" demonstrates the geometric structure of word embedding space, where adding and subtracting word vectors produces meaningful results. Directed lists, which generate the abstraction graphs used in natural language processing, really aid in the visualization of latent spaces. The techniques we discuss for analyzing word embedding spaces include dimensionality reduction and the formation of basis vectors using arbitrary orthogonal bases.
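To make the vector arithmetic concrete without downloading a model, here is a minimal sketch on invented 2D vectors. The names toyModel and toyAnalogy and the numbers are assumptions for illustration only; excluding the three query words from the search is also my choice, not something the pre-trained models do for you.

```wolfram
(* Toy 2D embeddings, invented for illustration only *)
toyModel = <|"king" -> {1., 1.}, "queen" -> {1., -1.}, 
   "man" -> {0., 1.}, "woman" -> {0., -1.}, "apple" -> {-3., 0.}|>;

(* a - b + c, searched among the words that are not part of the query *)
toyAnalogy[a_, b_, c_, model_] := 
  Nearest[KeyDrop[model, {a, b, c}], 
   model[a] - model[b] + model[c], 1];

toyAnalogy["king", "man", "woman", toyModel]
(* {"queen"} *)
```

With real embeddings the query words themselves often come back as the nearest results, which is why they are dropped here before searching.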

What would you do if you wanted to create the directed list with strings of words instead of natural numbers? That is, how do you generate a list of DirectedEdges from a list of words?

With[{text = 
   StringSplit[
    "What do you call an arrow function that cares about where it is \
invoked & called (for example, not about where it is defined)? The \
idea is that the 8 year old has a lot to look forward to, therefore \
is most analogous to the arrow function; he's taking his true calling \
and following the laws of physics, rather than being affected by the \
laws of physics. He thus finds the illogical arrangement in the data, \
caring about the logical progression of the data (for example, not \
its definition).", {" ", ". ", ", ", "\n"}]}, 
 (* link each token to the one that follows it; the wrapper lines were
    lost in the paste, so DeleteDuplicates is an assumption *)
 Graph[DeleteDuplicates[
   Select[Thread[Rule[Drop[text, -1], Drop[text, 1]]], 
    UnsameQ[#, {}] &]], VertexLabels -> "Name"]]

Community Graph Plot (Named)
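For the question as asked (a list of vertices, each directed to the next), a shorter route is to pair every word with its successor. This is a minimal sketch on a short sentence; chainWords and chainEdges are names I made up, and it assumes plain word order is what you want rather than the dependency structure that TextStructure returns.

```wolfram
(* Split a sentence into words; TextWords drops the punctuation *)
chainWords = TextWords["Jupiter has no solid surface."];

(* Pair every word with its successor: {w1, w2}, {w2, w3}, ... *)
chainEdges = DirectedEdge @@@ Partition[chainWords, 2, 1];

Graph[chainEdges, VertexLabels -> "Name"]
```

Partition with offset 1 also copes with repeated words, which simply become vertices that the chain revisits. For a list with no repeated words, PathGraph[chainWords, DirectedEdges -> True] should give the same chain.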

(* cosine similarity is 1 minus the cosine distance *)
CosineSimilarity[word1_, word2_, model_] := 
 1 - CosineDistance[model[word1], model[word2]]
CosineSimilarity["dog", "cat", word2vecGl]

I also found the visualization of latent spaces kind of interesting: it demonstrates how word embeddings form clusters and concentrate most of their variance in a limited set of principal components. Cosine similarity is intertwined with this structure, and it offers new possibilities for efficiency, transparency, and fairness. For the pair above it gives the following value:

Cosine Similarity
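As a quick sanity check of the relation between cosine similarity and the built-in CosineDistance, here is a small sketch on two hand-picked vectors. The helper cosSim and the vectors cosU and cosV are hypothetical, chosen so that the arithmetic is easy to follow.

```wolfram
(* cosine of the angle between two vectors *)
cosSim[u_, v_] := (u . v)/(Norm[u] Norm[v]);

cosU = {1, 2, 2}; cosV = {2, 1, 2};

cosSim[cosU, cosV]               (* 8/9 *)
1 - CosineDistance[cosU, cosV]   (* 8/9 as well *)
```

Both vectors have norm 3 and their dot product is 8, so the similarity is 8/9 and the distance is 1/9.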

Nearest[word2vecGl, word2vecGl["king"], 10]

Nearest King

Nearest[word2vecGl, 
 word2vecGl["paris"] - word2vecGl["france"] + word2vecGl["italy"], 10]

Nearest Paris

Nearest[word2vecCn, word2vecCn["birthday"], 10]

Nearest Birthday


Mean Synonym Distances

Mean Antonyms Distances

There are some linguistic and computational interpretability issues related to ethics and privacy to address before we get to the mean synonym and antonym distances. The word class subspaces that we investigate come from clustering word embeddings into meaningful classes: we discover classes of words within subspaces based on the volume of the simplices formed by their embeddings. Exploring the grammar embedded within these models also challenges the assumption that word embeddings lack grammatical information.

Nearest[word2vecGl, 
 word2vecGl["king"] - word2vecGl["man"] + word2vecGl["woman"], 8]
Nearest[word2vecGl, word2vecGl["waitress"] + 2.2*converterTrans, 8]
Nearest[word2vecGl, word2vecGl["actress"] + 2.2*converterTrans, 8]
WordRelations[words_List, modelAssoc_] := 
 Column[Nearest[modelAssoc, modelAssoc[#], 5] & /@ words]
WordRelations[{"king", "queen", "prince", "princess"}, word2vecGl]

King Man Woman

Waitress 2.2 converterTrans

Actress 2.2 converterTrans


The philosophical implications of the king-queen dichotomy, or of the bartender-actor shifts (ranging from "bartender" to "guy" and "actor" to "mr."), would need to be translated into computational terms. Human language is inherently ambiguous, which leads to many legal disputes. Framing the work around the application areas that the student cares about provides valuable insights into the linguistic relationships encoded within word embedding models and language models, and into potential applications in natural language processing tasks such as machine translation.

PrincipalRelations["kitten", "cat"]
PrincipalRelations["summer", "winter"]
PrincipalRelations["republican", "democrat"]

Kitten Cat

Summer Winter

Republican Democrat

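(* randomWordDistances and randomVectorDistances are placeholder names:
   the lines that defined and plotted them are missing from the post *)
{Histogram[synoDistances, PlotLabel -> "Synonyms Distances", 
  ImageSize -> Medium],
 Histogram[antonymsDistances, PlotLabel -> "Antonyms Distances", 
  ImageSize -> Medium],
 Histogram[randomWordDistances, 
  PlotLabel -> "Random Word Embedding Distances", ImageSize -> Medium],
 Histogram[randomVectorDistances, 
  PlotLabel -> "Random Vector Distances", ImageSize -> Medium]}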

Random Vector Distances Histograms

This ambiguity is a major challenge: tasks such as sentiment analysis and text generation involve understanding and parsing meaning from words and language. On the geometric side, the Cayley-Menger determinant is extensively used in geometric computation and in representations of geometric and algebraic problems. When this determinant equals zero, the simplex is degenerate: its points lie in a common (n-1)-dimensional flat and the volume is zero. And if the determinant has the wrong sign for the number of points, so that the implied squared volume is negative, the distances do not come from any Euclidean space. For three points this is exactly a violation of the triangle inequality.

(* n words with the smallest cosine distance to the given word *)
FindSimilarWords[word_String, modelAssoc_, n_ : 10] := 
  SortBy[
    DeleteCases[Keys[modelAssoc], word] // 
     AssociationMap[CosineDistance[modelAssoc[word], modelAssoc[#]] &],
    Identity] // Take[#, n] &
FindSimilarWords["king", word2vecGl]


The Cayley-Menger determinant, named after the mathematicians Arthur Cayley and Karl Menger, provides an easy way to calculate the volume of an n-dimensional simplex from its pairwise distances. A separate idea is that of linguistic equations, which take the form word == adjective + noun, for example "ostrich" == "flightless bird". The question is whether such equations can be represented mathematically, with nouns represented by word embeddings and adjectives represented as translation vectors in word embedding space.
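The volume computation can be sketched directly from the standard formula V^2 = (-1)^(n+1) Det[CM] / (2^n (n!)^2) for a simplex with n + 1 points. The function names cayleyMengerMatrix and simplexVolume are mine, and the points are small exact examples rather than word embeddings.

```wolfram
(* Cayley-Menger matrix: squared pairwise distances bordered by a
   row and a column of ones, with 0 in the corner *)
cayleyMengerMatrix[pts_List] := 
  Module[{d2 = Outer[SquaredEuclideanDistance, pts, pts, 1]}, 
   Prepend[Prepend[#, 1] & /@ d2, 
    Prepend[ConstantArray[1, Length[pts]], 0]]];

(* Volume of the simplex spanned by n + 1 points *)
simplexVolume[pts_List] := 
  With[{n = Length[pts] - 1}, 
   Sqrt[(-1)^(n + 1) Det[cayleyMengerMatrix[pts]]/(2^n (n!)^2)]];

simplexVolume[{{0, 0}, {1, 0}, {0, 1}}]   (* 1/2, a right triangle *)
simplexVolume[{{0, 0}, {1, 1}, {2, 2}}]   (* 0, collinear points *)
```

Because only pairwise distances enter the matrix, the same function applies unchanged to 300-dimensional embedding vectors, for instance the members of a word class together with their centroid.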

plots = ListLinePlot[#] & /@ clusterYourWords;

Extracting Linguistic 2

The procedure searches for antonyms of words and retrieves their word embeddings. These word embeddings are then compared to a dataset of antonym-pair translation vectors to find the best match. This match is the translation vector that we use for the adjective in the linguistic equation. The equation is considered valid when the distance between the word embedding of the word and the result of adding the translation vector to the word embedding of the noun is within a certain threshold.
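The threshold test at the end of that procedure could look roughly like the sketch below. Everything here is hypothetical: toyEquationModel holds invented 2D vectors, flightlessShift stands in for the matched antonym-pair translation vector, and the tolerance 0.5 is arbitrary.

```wolfram
(* Toy 2D embeddings, invented for illustration only *)
toyEquationModel = <|"bird" -> {1., 2.}, "ostrich" -> {1., 0.1}, 
   "sparrow" -> {1.1, 2.1}|>;

(* hypothetical translation vector standing in for "flightless" *)
flightlessShift = {0., -2.};

(* word == adjective + noun holds when noun + shift lands near word *)
validEquationQ[word_, shift_, noun_, model_, tol_] := 
  EuclideanDistance[model[word], model[noun] + shift] < tol;

validEquationQ["ostrich", flightlessShift, "bird", toyEquationModel, 0.5]
(* True *)
validEquationQ["sparrow", flightlessShift, "bird", toyEquationModel, 0.5]
(* False *)
```

In this toy setup "bird" plus the shift lands at {1., 0.}, which is 0.1 away from "ostrich" and about 2.1 away from "sparrow".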

genderVector = Normalize[word2vecGl["man"] - word2vecGl["woman"]];
genderBias = Dot[genderVector, #] & /@ word2vecGl;
genderBias // KeySort // ReverseSort

Gender Bias
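To see what the projection above measures, here is the same computation on a tiny invented model where the answer can be checked by eye. The names toyBiasModel, biasAxis and biasScores and all of the numbers are assumptions for illustration.

```wolfram
(* Toy 2D embeddings, invented for illustration only *)
toyBiasModel = <|"man" -> {1., 0.}, "woman" -> {-1., 0.}, 
   "uncle" -> {0.8, 0.3}, "aunt" -> {-0.7, 0.4}, 
   "table" -> {0., 1.}|>;

(* unit vector pointing from "woman" to "man" *)
biasAxis = Normalize[toyBiasModel["man"] - toyBiasModel["woman"]];

(* signed projection of every word onto that axis, largest first *)
biasScores = ReverseSort[Map[biasAxis . # &, toyBiasModel]]
```

Here the axis is {1., 0.}, so each score is just the first coordinate: "man" and "uncle" come out positive, "table" sits at zero, and "aunt" and "woman" come out negative.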

This gender bias illuminates the relationship between downstream task performance and the analogy task, and helps uncover how these models represent and learn linguistic information. So it's really an illustration of how the validity of linguistic equations, which represent word meanings, is determined from word embeddings and translation vectors, separating combinations that exist from those that don't. Viewed as a linear map described by its eigenvectors and eigenvalues, the analogy "man is to king as woman is to queen" corresponds to a map that transforms the word embedding of 'man' to that of 'king'.

SynonymFinder[word_String] := 
 Nearest[word2vecGl, word2vecGl[word], 6]

Synonym Finder Happy

(* the rest of the original word list was lost in the paste *)
wordList = {"king", "queen", "man", "woman", "bread", "butter", "cat"};
wordVectors = word2vecGl /@ wordList;
PCA = PrincipalComponents[wordVectors];
(* keep the first two principal components of every word *)
proj = PCA[[All, 1 ;; 2]];
ListPlot[proj,
 PlotStyle -> PointSize[Medium],
 PlotLabel -> "Word Embeddings Visualized via PCA",
 FrameLabel -> {"Principal Component 1", "Principal Component 2"},
 Frame -> True,
 ImageSize -> Medium,
 Epilog -> (Text[wordList[[#]], proj[[#]], {-1, -1}] & /@ 
    Range[Length[wordList]])]

Word Embeddings Visualized via PCA

The plot above shows word embeddings visualized via Principal Component Analysis, which is one way to look at the behavior of these linear maps. An open question is whether linear maps learned in a word analogy task still hold when applied to other tasks, and how these kinds of models perform on downstream tasks. The concept of abstract and concrete words is defined based on their relationship to physical instantiation or conceptual existence. The goal is to develop an "abstract index" that assigns each word a score indicating its level of abstraction, ranging from 0 (concrete) to 1 (abstract). One approach, the distance of surrounding nearest words, analyzes the distances from a word's embedding to its nearest neighbors, under the hypothesis that abstract words cluster closer together. However, this approach does not yield significant differences between abstract and concrete words.

PrincipalComponentsVisualization[words_, model_] := 
  Module[{vectors, pca, reducedVectors}, vectors = model /@ words;
   pca = PrincipalComponents[vectors];
   reducedVectors = pca[[All, 1 ;; 2]]; 
   ListPlot[reducedVectors, AspectRatio -> 1, PlotRange -> All, 
    PlotStyle -> PointSize[Medium], AxesLabel -> {"PC1", "PC2"}, 
    PlotLabel -> "Word Embeddings Visualization", 
    PlotMarkers -> Automatic, GridLines -> Automatic, 
    ImageSize -> Large]];
wordsToAnalyze = {"science", "art", "mathematics", "literature", 
   "physics", "philosophy"};
PrincipalComponentsVisualization[wordsToAnalyze, word2vecGlove]

Extracting Linguistic 3

Another approach, the translation vector to nearest words direction, explores the directional differences in word embeddings to determine abstraction. This involves comparing the average pointing-direction vectors for abstract and concrete words within specific subclasses. A further approach uses prompts to transformer-based language models such as GPT-2 to identify "to be" relationships between words.

wordPairs = Tuples[wordsToAnalyze, 2];
similarities = 
  Outer[SemanticSimilarity[#1, #2, word2vecGlove] &, wordsToAnalyze, 
   wordsToAnalyze];
heatmap = 
 MatrixPlot[similarities, ColorFunction -> "Rainbow", 
  PlotLegends -> Automatic, 
  FrameTicks -> {{Table[{i, wordsToAnalyze[[i]]}, {i, 
       Length[wordsToAnalyze]}], None}, {None, 
     Table[{i, wordsToAnalyze[[i]]}, {i, Length[wordsToAnalyze]}]}}, 
  Mesh -> True, MeshStyle -> Gray]

Word Pairs

Constructing an abstraction hierarchy represented as a directed graph, where edges point from concrete to abstract words, serves as a visualization of how the words relate in terms of abstraction. The ambiguity of the English language poses challenges, as not all subset relationships follow a clear "to be" pattern. This motivates methods for accurately assessing and representing abstraction in NLP systems.
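Such a hierarchy can be sketched as a directed graph built from {abstract, concrete} pairs. The pairs in toyPairs are hypothetical examples I picked by hand, not the output of any model, and abstractionGraph is a made-up name.

```wolfram
(* {abstract, concrete} pairs, hypothetical examples *)
toyPairs = {{"animal", "dog"}, {"animal", "cat"}, 
   {"organism", "animal"}, {"vehicle", "car"}};

(* edges point from the concrete word to the more abstract one *)
abstractionGraph = 
  Graph[DirectedEdge[#2, #1] & @@@ toyPairs, VertexLabels -> "Name"];

(* words with no outgoing edge are the most abstract in this toy set *)
Pick[VertexList[abstractionGraph], 
 VertexOutDegree[abstractionGraph], 0]
(* {"organism", "vehicle"} *)
```

The chain dog -> animal -> organism shows how the tree-like structure arises once the pairs overlap.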

word = "pizza";
distances = 
  AssociationMap[EuclideanDistance[word2vecGl[word], word2vecGl[#]] &,
    Keys[word2vecGl]];
closest = TakeSmallest[distances, 50];
WordCloud[closest, ImageSize -> Large]

Word Cloud

It's interesting to think about where these word cloud errors come from. While word2vec and similar models can capture some intriguing semantic relationships, they have limitations and do not always perform as expected. The word2vec model often reflects biases in the training data and may not always provide the most intuitively correct answers to these analogy problems. The famous word2vec example, "king - man + woman = ?", often results in 'queen', but it does not always do so! The exact result depends on the specific model implementation and word vectors.

Rasterize[FeatureSpacePlot[RandomSample[vecsCn, 2000]]]


randomVecNormal[r_, d_] := 
 r*Normalize@RandomVariate[NormalDistribution[], d]
Rasterize[FeatureSpacePlot[Table[randomVecNormal[1, 300], 1000]]]

Feature Space Plot Rasterize

Here's another example: the output of FeatureSpacePlot can be rasterized as usual. On a different note, what's the balance between production readiness and prototyping speed? In the TensorFlow vs PyTorch comparison for text classification, convolutional neural networks perform text classification on two distinct datasets: 20 Newsgroups and the Movie Review Data. PyTorch is flexible and "Pythonic" in its presentation. Both it and TensorFlow are described as a "low-level library with high-level APIs built on top", but TensorFlow is more verbose and provides more explicit control, which might be beneficial when deploying in a production environment.

synonyms = {"happy", "joyful", "cheerful", "gleeful", "jubilant"};
meanSynonymEmbedding = Mean[word2vecCn[#] & /@ synonyms];
Nearest[word2vecCn, meanSynonymEmbedding, 10]

Nearest[word2vecCn, meanSynonymEmbedding, 10]

WordDiffWalkSteps[w1_String, w2_String, steps_Integer, modelAssoc_] :=
   Module[{e1, e2, transSteps}, e1 = Flatten[modelAssoc[w1]];
   e2 = Flatten[modelAssoc[w2]];
   transSteps = Table[e1 + i*((e2 - e1)/steps), {i, steps}];
   Nearest[modelAssoc, #, 5] & /@ transSteps];
Column[WordDiffWalkSteps["cat", "dog", 15, word2vecCn]]


categories = <|
   "animals" -> {"dog", "cat", "tiger", "elephant", "bear", "fish", 
     "dolphin", "bird"}, 
   "furniture" -> {"table", "chair", "sofa", "cupboard", "bed", 
     "desk", "shelf", "drawer"}, 
   "emotions" -> {"happy", "sad", "angry", "excited", "afraid", 
     "curious", "bored", "surprised"}|>;
categoryEmbeddings = Map[word2vecGl, categories, {2}];
categorySamples = RandomSample /@ categoryEmbeddings;
words = Join[categories["animals"], categories["furniture"]];
vectors = word2vecGl /@ words;
clusters = FindClusters[vectors];
(* cluster the words themselves by passing vectors -> words *)
wordsByCluster = FindClusters[vectors -> words];
Rasterize /@ Map[FeatureSpacePlot, categorySamples]


There are interactive visualization tools that let users explore word analogies in word2vec-style models using pre-trained word vectors from GloVe. It's compelling how pairs like 'uncle' and 'aunt', 'niece' and 'nephew', 'brother' and 'sister', or 'actor' and 'actress' are positioned close together, signifying that they are similar apart from the gender difference. Capturing and representing abstraction in language models is complex, so we focus on methods for evaluating abstraction in word equations that involve concrete and abstract words, as a way to improve both our understanding and our use of abstraction in NLP systems. The approaches explored, including the distance of surrounding nearest words and the translation vector to nearest words direction, yield mixed results.

word = "actor";
shiftedWords = SemanticShift[word, genderShift, word2vecGlove];
shiftedWordVector = Mean[Map[word2vecGlove, shiftedWords]]; 
(* reduce all vectors in one call so that both points share the same
   2D coordinate system *)
reduced = 
  DimensionReduce[
   Join[{word2vecGlove[word], shiftedWordVector}, 
    Map[word2vecGlove, shiftedWords]], 2];
{reducedWord, reducedShiftedWord} = Take[reduced, 2];
Graphics[{Arrow[{reducedWord, reducedShiftedWord}], Red, 
  Point[reducedWord], Blue, Point[reducedShiftedWord]}]

Extracting Linguistic 5

We also propose using our own transformer-based language models to assess abstraction through a "to be" relationship: phrases like "cat is an animal" are given as input in a Socratic back-and-forth, and the directionality of the relationship between the words is examined to determine abstraction levels. From this we construct a graph representation of the abstraction hierarchy in the English language by forming directed edges between words based on their abstraction levels, which creates a tree-like structure. Ambiguity in language limits how accurately abstraction can be represented.

Rasterize[FeatureSpacePlot[RandomSample[vecsCn, 1000]]]
Rasterize[FeatureSpacePlot[RandomSample[vecsGl, 1000]]]



{PhraseBogusPair["the sky"], PhraseBogusPair["the green"], 
 PhraseBogusPair["a cat"], PhraseBogusPair["many dogs"]}


Row[Style[Keys[#] <> " ", Background -> Hue[Values[#]]] & /@ 
      "In the beginning God created the heavens and the earth", " "]],
     2, 1], {1}]]


Pointwise Mutual Information (PMI), the log of the ratio between the probability that two words co-occur and the probability expected if they were independent, can be approximated in a high-dimensional space by the scalar product of word vectors. The technical papers, tutorials, and pre-trained models that accompany "Extracting linguistic relations from word embeddings & language models" have furthered our exploration of word2vec.

word = "technology";
neighbors = ExploreGloveRelations[word, word2vecGlove];
neighborsReduced = 
  DimensionReduce[Map[word2vecGlove, neighbors], 2, 
   Method -> "TSNE"];
(* Callout attaches each neighbor word to its own point *)
ListPlot[MapThread[Callout, {neighborsReduced, neighbors}], 
 PlotStyle -> PointSize[Medium]]

Extracting Linguistic 6

reducedWords = 
  DimensionReduce[Map[word2vecGlove, wordsToPlot], 2, 
   Method -> "TSNE"];
shiftedVectors = Map[word2vecGlove[#] + genderShift &, wordsToPlot];
reducedShiftedWords = 
  DimensionReduce[shiftedVectors, 2, Method -> "TSNE"];
ListLinePlot[{reducedWords, reducedShiftedWords}, 
 PlotMarkers -> Automatic, PlotLegends -> wordsToPlot]

Extracting Linguistic 7

This highlights the potential for future work in improving NLP systems, understanding linguistic phenomena, and advancing symbolic understanding and interpretation of language.

abstractConcretePairs = {{"vehicle", "car"}, {"fruit", 
    "apple"}, {"color", "blue"}, {"animal", "dog"}};
abstractConcretePairsEmbeddings = 
  Apply[{Rule[#1, word2vecCn[#1]], Rule[#2, word2vecCn[#2]]} &, 
   abstractConcretePairs, 1];
abstractConcretePairsNearest = 
  Apply[{Rule[Values[#1], Nearest[word2vecCn, Values[#1], 1000]], 
     Rule[Values[#2], Nearest[word2vecCn, Values[#2], 1000]]} &, 
   abstractConcretePairsEmbeddings, 1];
abstractConcretePairsNearestEmbeddings = 
  Map[{Keys[#], Map[word2vecCn, Values[#]]} &, 
   abstractConcretePairsNearest, {2}];
abstractConcretePairsNearestDistances = 
  Map[Map[Function[{var}, EuclideanDistance[var, #[[1]]]], #[[
      2]], {1}] &, abstractConcretePairsNearestEmbeddings, {2}];
Apply[PairedHistogram, abstractConcretePairsNearestDistances, {1}]


The concept of "linguistic equations" requires the left-hand side to be a single word and the right-hand side to be a synonymous, definitional expression. Word embeddings contain more grammatical information than is usually assumed, just not in the conventional sense: grammatically correct phrases yield larger inner products than ungrammatical phrases, so they can be assigned a higher value. In such an equation the noun is represented by its word embedding and the adjective by a translation vector in word embedding space, which lets us test the validity of the equation. Simple distance metrics do not capture word classes. A class of words that has a unique relationship occupies a subspace of the word embedding space, so instead of merely relying on the proximity between words, we also calculate the volume of the simplex formed by the embeddings of each word in the class together with the centroid of those embeddings. This exposes the subspaces and relationships that exist within different classes of words.

data = RandomReal[{-10, 10}, {300, 2}];
clusters = FindClusters[data, 4];
centers = Mean /@ clusters;
(* one color per cluster, with the cluster means marked in red *)
ListPlot[clusters,
 PlotStyle -> PointSize[Medium],
 PlotTheme -> "Detailed",
 Epilog -> {Red, PointSize[Large], Point[centers]},
 PlotLabel -> "Cluster Plot of Synthetic Data",
 FrameLabel -> {"Feature 1", "Feature 2"}]

Cluster Plot

While word embeddings and language models provide valuable insights, they also pose challenges in representing meaning accurately. To summarize: cosine similarity is used to analyze word similarities and the gender bias in word embeddings, and the word cloud helps us understand semantic relationships and biases. The rasterized feature space plots, the analysis of synonyms, the word difference walk steps, and the categorization of words in semantic space cover abstract semantic categories like animal, furniture, and emotion. The TensorFlow and PyTorch comparison concerns the balance between production readiness and prototyping speed when using machine learning frameworks for text classification tasks.

