
XKCD in LUV and relationships: semantic proximity of similar colors

Posted 10 years ago

Images / animations are large, wait till they load. The best part IMHO is at the end. Huge table is NOT the end.

A color can be a hard thing to pinpoint. A harder question, perhaps: do visually close colors evoke close semantic descriptions? Is electric lime close to goblin grin? That’s not RGB… but these are “real colors”, at least according to the “public color poll” run by the ever-inventive creator of the XKCD comic, Randall Munroe. And from 222,500 user sessions and over five million colors we can finally pose a question: whether visual similarity of colors, like this graph (I will show how to build it from XKCD data later):

enter image description here

can be used to define "semantic proximity" of subjective color descriptions like these (also from XKCD data):

enter image description here

enter image description here

Well, we will investigate, or at least try to pinpoint, visual similarity to comprehend the semantic one (if there is any). BTW, this is not a typo: ladies do prefer to use “camel” for a color! And I am not going to comment on what type of glasses gentlemen see the world through. So what does programming have to do with this? Patience, there is a huge chart and a network way down this post, the result of some coding… and probably a few more jokes. When Randall Munroe published the poll data there were a few efforts to visualize the results. A simple table by XKCD like this

enter image description here

was a bit disorienting for me because the colors were visually random. The Data Pointed efforts were excellent. But the first one, while stunning, had too few of the most popular names:

enter image description here

The second one, quite an interactive marvel, had some tiny points which were hard to see, so good names were easy to miss:

enter image description here

I wanted to browse all ~1000 names but in a sort of consistent color-wise way. The main point being, when I see a "goblin green" color, I would like neighboring colors to be similar, so I can see which names should also be close semantically. Basically I wanted to compare names of similar colors. Let’s import the data and see a sample:

data = Import["", "Data"][[All, 1 ;; 2]];
data // Length
data[[;; 4]] // Column


{{{"cloudy blue", "#acc2d9"}},
{{"dark pastel green", "#56ae57"}},
{{"dust", "#b2996e"}},
{{"electric lime", "#a8ff04"}}}

Note, colors are given as hexadecimal HTML codes. We can use Interpreter to get colors in WL format, say RGB:

clrs = Interpreter["Color"][data[[All, 2]]];
Multicolumn[clrs, 30]

enter image description here
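As a cross-check on what this conversion does, here is a minimal sketch that parses a hex code by hand. The helper name hexToRGB is hypothetical (it is not part of the original post), and it assumes a well-formed "#rrggbb" string:

```wolfram
(* hexToRGB is a hypothetical helper; assumed input format: "#rrggbb" *)
hexToRGB[hex_String] :=
 RGBColor @@ (FromDigits[#, 16]/255. & /@
    StringPartition[StringDrop[hex, 1], 2])

(* the "cloudy blue" entry from the sample above *)
hexToRGB["#acc2d9"]
```

Each pair of hex digits is read as a base-16 integer and rescaled from 0-255 to the 0-1 range that RGBColor expects.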

We, of course, could just throw all the points on a chromaticity diagram:

ChromaticityPlot[{clrs, "RGB"}, PlotTheme -> "Detailed", 
 Appearance -> {"VisibleSpectrum", "Wavelengths" -> True}]

enter image description here

...or in 3D

ChromaticityPlot3D[{clrs, "RGB"}, PlotTheme -> "Marketing", 
 Appearance -> "VisibleSpectrum", SphericalRegion -> True]

enter image description here

...but that of course would not get me anywhere with readability of color names. So I decided to do the simplest thing - a table. Columns would arrange colors in one way while rows in another. This won't be perfect, but let's try it. For example in LUV color space there are 3 parameters:

  • L - lightness, approximate luminance
  • U - chromaticity coordinate, roughly the green-red axis
  • V - chromaticity coordinate, roughly the blue-yellow axis

LUV is a color space designed to have perceptual uniformity; i.e. equal changes in its components will be perceived by a human to have equal effects. I hope this uniformity will help me to rearrange the colors. LUV is extensively used for applications such as computer graphics which deal with colored lights and is device independent. Let's get data in a convenient format:

dataP = MapAt[ColorConvert[Interpreter["Color"][#], "LUV"] &, #, 2] & /@ data;
dataP[[;; 5]] // Column

enter image description here

I will sort by abstract colors U and V and sacrifice lightness L to keep things simple and 2-dimensional. 2D sorting already will be helpful. Once data are sorted according to U

dataPA = SortBy[dataP, #[[2, 2]] &];

we ragged-partition them into groups of 10 (the 10 columns of the final table) and sort each group according to V:

dataPAB = SortBy[#, #[[2, 3]] &] & /@ Partition[dataPA, 10, 10, 1, {}];
dataPAB[[;; 5, ;; 5]] // TableForm

enter image description here
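To see what the five-argument form of Partition used above does, here is a toy sketch on a short list of integers (the list is made up for illustration). The empty padding {} is what keeps the short final group instead of dropping it:

```wolfram
(* default Partition drops the incomplete last group *)
Partition[Range[7], 3]
(* → {{1, 2, 3}, {4, 5, 6}} *)

(* block size 3, offset 3, alignment 1, padding {} keeps the ragged tail *)
Partition[Range[7], 3, 3, 1, {}]
(* → {{1, 2, 3}, {4, 5, 6}, {7}} *)
```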

Note the tricky syntax for Partition to keep partitioning ragged and not cut off a short remaining column. Now I will just build a grid where cells are rectangles of color with the color name written inside. But here is a tricky part: text color should be in contrast to the color of cell background, to be readable. Good that we have ColorNegate! We can use ColorNegate[x] when cell color is x - cool! ...except when cell color is gray because

ColorNegate[Gray] // InputForm

GrayLevel[0.5]
Hmmm... Well, let's be inventive. When the ColorDistance between the negated cell color and Gray is too small, we'll simply use White for the text. Define:

rect[{x_, y_}] := Framed[Style[x, 10, 
   If[ColorDistance[ColorNegate[y], Gray] < .2, White, 
    ColorNegate[y]]], Background -> y, ImageSize -> {80, 50}]


rect@{"speechless green", Green}

enter image description here

Great, we are now ready. Behold, read, and wonder (right-click and "open image in new tab" to see a bigger version). Do not forget - there is more stuff after this table.

Grid[ParallelMap[rect, dataPAB, {2}], Spacings -> {0, 0}]

enter image description here

Well, could there be a better or different way to visualize relationships? What about a network graph? Judging by social analytics approaches, graphs are the best way to represent relationships. Let's make a clean cut and get the data again:

data = Import["", "Data"][[All, 1 ;; 2]];

And we turn strings of color descriptions into WL format colors with Interpreter again:

data = Reverse[MapAt[Interpreter["Color"], #, 2]] & /@ data;
data[[;; 5]] // Column

enter image description here

Now, like on Facebook - you have friends and they have friends and so on - we need to find the closest friends of each color. For that we can use ColorDistance, which supports many measures, for example Euclidean distance in LABColor and such. Let's define our distance function:

neco[{u_, v_}, {x_, y_}] := ColorDistance[u, x]

Now in WL we have an awesome function, Nearest, that can operate on any objects to find the ones closest to a given object:

neig[c_] := Nearest[DeleteCases[data, c], c, {All, .16}, DistanceFunction -> neco]

where {All, .16} means: among all objects, find those within radius 0.16 as measured by DistanceFunction. DeleteCases is needed to exclude the original object as its own friend. Check:

enter image description here
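The same {All, r} radius query can be tried on plain numbers, where the result is easy to verify by eye. This is only a toy sketch: the list pts, the radius 2 and the name friends are made up for illustration, and the default distance for numbers is used instead of neco:

```wolfram
(* toy data: four numbers instead of {color, name} pairs *)
pts = {1., 2., 3.5, 10.};

(* all other points within distance 2 of x, nearest first *)
friends[x_] := Nearest[DeleteCases[pts, x], x, {All, 2}]

friends[2.]   (* → {1., 3.5} *)
friends[10.]  (* → {} since nothing lies within radius 2 *)
```

An isolated point simply gets an empty list of friends, which is why a radius that is too small leaves disconnected vertices in the network.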

This function will connect the original color and its closest friends within the 0.16 measure of DistanceFunction:

edgs[v_] := v <-> # & /@ neig[v]


enter image description here

Note the trick with Sort: it flips edges so that b<->a is oriented as a<->b, which lets us delete duplicates using Union when building all edges between all colors and their friends:

edgsALL = Union[Sort /@ Flatten[ParallelMap[edgs, data], 1]];
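The effect of Sort followed by Union is easiest to see on a few symbolic edges. This is a toy sketch with made-up vertices a, b, c, not the color data:

```wolfram
(* b <-> a and a <-> b describe the same undirected edge *)
toyEdges = {b <-> a, a <-> b, c <-> a};

(* Sort reorders the two endpoints inside each edge, Union drops duplicates *)
Union[Sort /@ toyEdges]
(* → {a <-> b, a <-> c} *)
```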

To get a simple color-proximity Graph define a VertexLabels function:

panelLabel[lbl_] := lbl[[1]]

And now behold:

g = Graph[data, edgsALL, VertexLabels -> Table[i -> Placed[{i}, Center, panelLabel], {i, data}], 
  EdgeStyle -> Opacity[.2], EdgeShapeFunction -> "Line", VertexSize -> 0, ImageSize -> 900]

enter image description here

To build a large scale browseable network with readable labels define new label function:

panelLabel[lbl_] := Panel[Style[lbl[[2]], 14, Bold, 
   If[ColorConvert[lbl[[1]], "GrayLevel"][[1]] < .5, White, Black]], 
  FrameMargins -> 0, Background -> lbl[[1]]]

Instead of negating the color of text (as we did in the huge table) we make it White if GrayLevel of background is < 0.5 and Black if it is > 0.5. A different approach. Check:

enter image description here
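The text-color rule on its own can be sketched as a small function. The name textColor is hypothetical and the four test colors are arbitrary; the threshold 0.5 is the one used in panelLabel above:

```wolfram
(* White text on dark backgrounds, Black text on light ones *)
textColor[bg_] := If[First[ColorConvert[bg, "GrayLevel"]] < .5, White, Black]

(* dark backgrounds (Black, Blue) should get White, light ones (White, Yellow) Black *)
textColor /@ {Black, White, Blue, Yellow}
```

Unlike ColorNegate, this rule never produces low-contrast text on a mid-gray background, because the text is always one of the two extremes.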

Perfect. Now the monster network:

g = Graph[data, edgsALL, VertexLabels -> 
    Table[i -> Placed[{i}, Center, panelLabel], {i, data}], 
   EdgeStyle -> Opacity[.2], EdgeShapeFunction -> "Line", 
   VertexSize -> 0, ImageSize -> 10000];

To browse it open ==> THIS LINK <== in a NEW TAB and zoom in/out. It will look something like this:

enter image description here

The interesting part is: why did we choose radius 0.16? Two words: percolation theory. Radius 0.16 for the XKCD data serves as a percolation threshold, much below which the network has a lot of disconnected components and much above which the network is an "overconnected" complete Graph. The former is a lack of information and the latter is "too much" info for a meaningful sharp description. My intuition is that the percolation threshold is the golden middle that allows for concise but precise definitions. You can experiment with lowering and increasing it to see how the network under- and over-connects. The percolation threshold is that moment when you can get from one description to another, then to the next one, and eventually to any other description. Using association chains you can deduce deeper connections among remote meanings in the whole network. This is, of course, speculative and arguable. Let me know if you are familiar with relevant research or have an opinion. And now, using the percolation threshold, we can define new colors based on old descriptions:

Labeled[Grid[neig[{#, ""}], Frame -> All], 
   Row[{"New color ", Graphics[{#, Disk[]}, ImageSize -> 30], " is like"}], Top] &@RandomColor[]

enter image description here
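Coming back to the percolation argument: one way to look for such a threshold is to count connected components as the radius grows. The sketch below does this on made-up data, 200 random points in the unit square with the default Euclidean distance, not on the XKCD colors. The names pts, nf and componentCount are hypothetical:

```wolfram
(* toy data: random 2D points stand in for colors *)
SeedRandom[1];
pts = RandomReal[1, {200, 2}];

(* NearestFunction that returns point indices instead of coordinates *)
nf = Nearest[pts -> Automatic];

(* number of connected components when points within radius r are linked *)
componentCount[r_] := Module[{edges},
  edges = Union[Sort /@ Flatten[
      Table[i <-> j, {i, Length[pts]},
       {j, DeleteCases[nf[pts[[i]], {All, r}], i]}]]];
  Length[ConnectedComponents[Graph[Range[Length[pts]], edges]]]]

(* scan a few radii: many components at small r, a single one at large r *)
componentCount /@ {0.02, 0.05, 0.1, 0.2}
```

The same scan over the color data would replace pts with data and the default distance with neco; the radius where the count first drops to a handful of components is a rough estimate of the threshold.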

Concise (much less than full ~1000 descriptors) but precise (you "got the feeling"). Now what is next? It would be really great to make a "machine" have an imagination and form its own new color descriptors. How? - not sure but probably running WL machine learning on some large color-related corpora. When I figure it out - I will write a continuation. Or maybe you will?

POSTED BY: Vitaliy Kaurov
5 Replies

You have earned a Featured Contributor Badge. Your exceptional post has been selected for our editorial column Staff Picks, and your profile is now distinguished by a Featured Contributor Badge and is displayed on the Featured Contributor Board. Thank you!

POSTED BY: Moderation Team

As a newcomer to Wolfram-Community and as a nonscientist I'm overwhelmed by what is possible in Mathematica. And, Vitaliy, what a creative train of thought to combine social data on colors with color theory and percolation theory! What I missed in the xkcd poll were more shades of grey and dark hues. Incidentally, in January I finished a new painting, "In CIELab color space", which provides more colors to be named.


That's awesome!

After reading this I downloaded the data and did some processing of my own. I focused a bit on the names given to colors.

Starting off similar to you (including the code just so that any small differences won't cause confusion) I import the data and get the names:

names = fullData[[All, 1]];

We can make a histogram showing the distribution of the number of words used in a description:



It's not very interesting because not very many words are typically used. Instead we can calculate the average number of accompanying words to get a more interesting plot:



Next we can look at which words are used the most often. This pie chart shows the average color of all colors that share a certain word in their names:


Pies are good

And another (uglier) version which labels each slice:

Do I need a description?

We can immediately see what words are popular for describing colors.

This will tell us which words are the most vague; this is calculated by the mean deviation of


Here are the last few:

<| "green" -> 0.301527, "yellowish" -> 0.301806, 
 "orangish" -> 0.309022, "greeny" -> 0.310186, "magenta" -> 0.310729, 
 "red" -> 0.326567, "fuchsia" -> 0.331209, "greenish" -> 0.331261, 
 "cool" -> 0.340017, "purply" -> 0.344614, "yellowy" -> 0.349218, 
 "powder" -> 0.358974, "blood" -> 0.362786, "browny" -> 0.375569, 
 "pink" -> 0.377552, "purple" -> 0.39065, "sea" -> 0.392243, 
 "reddish" -> 0.407342, "pinkish" -> 0.409191, "off" -> 0.414668, 
 "purpley" -> 0.415278, "blue" -> 0.417053, "orangey" -> 0.417463, 
 "purpleish" -> 0.418713, "pale" -> 0.421891, "flat" -> 0.422791, 
 "dull" -> 0.422931, "dusty" -> 0.429657, "muted" -> 0.430807, 
 "dirty" -> 0.431016, "purplish" -> 0.43483, "baby" -> 0.441156, 
 "marine" -> 0.444411, "dark" -> 0.450202, "very" -> 0.458118, 
 "pinky" -> 0.46324, "ugly" -> 0.478061, "bluey" -> 0.489151, 
 "violet" -> 0.490251, "light" -> 0.502694, "faded" -> 0.505302, 
 "soft" -> 0.531751, "mid" -> 0.538431, "deep" -> 0.544057, 
 "warm" -> 0.548372, "bluish" -> 0.549277, "medium" -> 0.553847, 
 "rich" -> 0.571403, "pastel" -> 0.580348, "darkish" -> 0.612625, 
 "true" -> 0.777189, "hot" -> 0.83747, "easter" -> 0.879319, 
 "strong" -> 0.93553, "vibrant" -> 0.938836, "bright" -> 0.939123, 
 "lightish" -> 0.957704, "lighter" -> 0.967594, "vivid" -> 0.998611, 
 "electric" -> 1.07712, "neon" -> 1.08666|>

It makes sense that words like vivid and neon are at the top, but it is amusing to see anyway. Now it's time to do something really cool: we are going to make a graph with a word at each vertex, where vertexes are connected if there is a color that uses both words. Example: "bright red" would connect the bright and red vertexes. Here is the graph:



Here edges are colored by the color that both words are mentioned in, and vertexes colored with the mean color of connecting edges. Next we can find communities in this graph:

communities = FindGraphCommunities[namesGraph];
CommunityGraphPlot[namesGraph, communities]
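The construction of namesGraph is not shown in this reply, so here is a minimal sketch of how such a word graph might be built, using four made-up color names. The names toyNames, wordEdges and toyGraph are hypothetical, and the edge and vertex coloring described above is left out:

```wolfram
(* toy stand-in for the list of color names *)
toyNames = {"bright red", "dark red", "bright blue", "dark sea green"};

(* connect every pair of words that appear together in one name *)
wordEdges = Union[Sort /@ Flatten[
     (UndirectedEdge @@@ Subsets[StringSplit[#], {2}]) & /@ toyNames]];

toyGraph = Graph[wordEdges, VertexLabels -> "Name"];

(* same community step as above, applied to the toy graph *)
CommunityGraphPlot[toyGraph, FindGraphCommunities[toyGraph]]
```

A three-word name such as "dark sea green" contributes three edges, one for each pair of its words.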


Finally we can do some coloring to see what colors are in these groups:


Filled in

I may do some more with this data, if I do I'll post it here.

Random note: CommunityGraphPlot returns a particularly pretty result when run over your final graph:


Poll data is probably a new thing, but maybe the type of experiments done in psychology would be more appropriate.

Or there is eye tracking, already an old technology. Has anyone tracked the eyes as they see this color representation?

Back in 2004 Anirudh Tiwathia did an eye tracking experiment with cellular automata at the summer school, which was later published in Complex Systems. The experienced researchers would start at the bottom, but those unfamiliar always started at the top. In the case of Vitaliy's color visualizations I find myself looking through them in a definite movement, although I can't recall exactly what. From some perspective the eye movement is a more basic measure of semantic pathways than a poll, which is something that has to go through many layers and is more difficult to interpret.

POSTED BY: Todd Rowland

What a colorful Community post! :-)

Vitaliy, another way to have text in contrast is to use your trick posted in Stackexchange

text = First[First[ImportString[ExportString[Style["gray", Italic, FontSize -> 24, 
FontFamily -> "Times"], "PDF"], "PDF", "TextMode" -> "Outlines"]]];

and setting the fonts to be always black with white borders:

Graphics[{EdgeForm[Directive[White, Thick]], Black, text},
Background -> Gray, PlotRange -> {{-5, 25}, {-0, 20}}]

gray white borders black font

Btw, your awesome function to find color relationships can be used as a "Color Blind Assistant". Here is the modified function that does this given a color:

cassist[c_] := Part[Nearest[data, {c, ""}, DistanceFunction -> neco],1,-1]

for example:


"darkish red"

which really tells what that particular color looks like.

POSTED BY: Bernat Espigulé