Images and animations are large; wait until they load. The best part, IMHO, is at the end. The huge table is NOT the end.
A color can be a hard thing to pinpoint. A harder question, perhaps: do visually close colors evoke close semantic descriptions? Is electric lime close to goblin grin? That's not RGB,
but these are real colors, at least according to the public color poll run by the ever-inventive creator of the XKCD comic, Randall Munroe. And with 222,500 user sessions and over five million colors we can finally pose a question: can the visual similarity of colors - like this graph (I will show how to build it from the XKCD data later):
be used to define the "semantic proximity" of subjective color descriptions ...like these ones (also from the XKCD data):
Well, we will investigate, or at least try to use, visual similarity to comprehend the semantic kind (if there is any). BTW, this is not a typo - ladies do prefer to use camel for a color! And I am not going to comment on what type of glasses gentlemen see the world through. So what does programming have to do with this? Patience - there is a huge chart and a network, the result of some coding, way down this post,
and probably a few more jokes. When Randall Munroe published the poll data, there were a few efforts to visualize the results. A simple table by XKCD like this
was a bit disorienting for me because the colors were visually random. The Data Pointed efforts were excellent, but the first one, while stunning, showed too few of the most popular names:
The second one, quite an interactive marvel, had good names but some tiny points that were hard to see and easy to miss:
I wanted to browse all ~1000 names, but in a consistent, color-wise way. The main point: when I see a "goblin green" color, I would like the neighboring colors to be similar, so I can see which names should also be close semantically. Basically, I wanted to compare the names of similar colors. Let's import the data and see a sample:
data = Import["http://xkcd.com/color/rgb.txt", "Data"][[All, 1 ;; 2]];
data // Length
data[[;; 4]] // Column
949

{"cloudy blue", "#acc2d9"}
{"dark pastel green", "#56ae57"}
{"dust", "#b2996e"}
{"electric lime", "#a8ff04"}
Note that the colors are given as hexadecimal HTML codes. We can use Interpreter to get colors in WL format, say RGB:
clrs = Interpreter["Color"][data[[All, 2]]];
Multicolumn[clrs, 30]
We, of course, could just throw all the points on a chromaticity diagram:
ChromaticityPlot[{clrs, "RGB"}, PlotTheme -> "Detailed",
Appearance -> {"VisibleSpectrum", "Wavelengths" -> True}]
...or in 3D
ChromaticityPlot3D[{clrs, "RGB"}, PlotTheme -> "Marketing",
Appearance -> "VisibleSpectrum", SphericalRegion -> True]
...but that, of course, would not get me anywhere with the readability of color names. So I decided to do the simplest thing - a table, where columns arrange colors in one way and rows in another. This won't be perfect, but let's try it. For example, the LUV color space has 3 parameters:
- L - lightness, approximate luminance
- U - chromaticity (color) component
- V - chromaticity (color) component
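For a feel of what this looks like, here is a quick check on one of the poll colors (an illustrative snippet of my own, not part of the original workflow; the exact numeric values may vary between WL versions):

luv = ColorConvert[Interpreter["Color"]["#a8ff04"], "LUV"]  (* "electric lime" *)
List @@ luv  (* the numeric {L, U, V} components *)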
LUV is a color space designed for perceptual uniformity, i.e. equal changes in its components are perceived by a human as roughly equal effects. I hope this uniformity will help me rearrange the colors. LUV is extensively used in applications such as computer graphics that deal with colored lights, and it is device independent. Let's get the data in a convenient format:
dataP = MapAt[ColorConvert[Interpreter["Color"][#], "LUV"] &, #, 2] & /@ data;
dataP[[;; 5]] // Column
I will sort by the chromaticity components U and V and sacrifice the lightness L to keep things simple and 2-dimensional. 2D sorting will already be helpful. Once the data are sorted according to U
dataPA = SortBy[dataP, #[[2, 2]] &];
we ragged-partition them into groups of 10 and sort each group according to V:
dataPAB = SortBy[#, #[[2, 3]] &] & /@ Partition[dataPA, 10, 10, 1, {}];
dataPAB[[;; 5, ;; 5]] // TableForm
Note the tricky syntax of Partition, which keeps the partitioning ragged and does not cut off the short remainder.
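A tiny illustration of the difference (my own example, not from the original post):

Partition[Range[7], 3]            (* {{1, 2, 3}, {4, 5, 6}} - the leftover 7 is dropped *)
Partition[Range[7], 3, 3, 1, {}]  (* {{1, 2, 3}, {4, 5, 6}, {7}} - the short remainder is kept *)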
Now I will just build a grid where the cells are rectangles of color with the color name written inside. But here is a tricky part: the text color should contrast with the cell background color to be readable. Good that we have ColorNegate! We can use ColorNegate[x] when the cell color is x - cool! ...except when the cell color is gray, because
ColorNegate[Gray] // InputForm
GrayLevel[0.5]
Hmmm... Well, let's be inventive: when the ColorDistance between the negated cell color and Gray is too small, we'll simply use White for the text. Define:
rect[{x_, y_}] := Framed[Style[x, 10,
If[ColorDistance[ColorNegate[y], Gray] < .2, White,
ColorNegate[y]]], Background -> y, ImageSize -> {80, 50}]
Check:
rect@{"speechless green", Green}
Great, we are now ready. Behold, read, and wonder (right-click and "open image in new tab" to see a bigger version). Do not forget - there is more stuff after this table.
Grid[ParallelMap[rect, dataPAB, {2}], Spacings -> {0, 0}]
Well, could there be a better or different way to visualize relationships? What about a network graph - judging by social analytics approaches, networks are the best way to represent relationships. Let's make a clean cut and get the data again:
data = Import["http://xkcd.com/color/rgb.txt", "Data"][[All, 1 ;; 2]];
And we again turn the hexadecimal codes into WL colors with Interpreter, this time putting the color first in each pair:
data = Reverse[MapAt[Interpreter["Color"], #, 2]] & /@ data;
data[[;; 5]] // Column
Now, like on Facebook - you have friends and they have friends and so on - we need to find the closest friends of each color. We can use ColorDistance for that; it supports many measures, for example the Euclidean distance in LABColor space and such. Let's define our distance function:
neco[{u_, v_}, {x_, y_}] := ColorDistance[u, x]
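A quick illustrative check (my own): the distance ignores the name strings and compares only the colors themselves:

neco[{Red, "red"}, {Pink, "pink"}]  (* same as ColorDistance[Red, Pink] *)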
Now, in WL we have an awesome function, Nearest, that can operate on arbitrary objects to find the ones closest to a given object:
neig[c_] := Nearest[DeleteCases[data, c], c, {All, .16}, DistanceFunction -> neco]
where {All, .16} means: among all objects, find the closest ones within radius 0.16 as measured by the DistanceFunction. DeleteCases is needed to exclude the original object from being its own friend. Check:
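For example, a call like this (my own illustrative input) returns the list of {color, name} pairs within the radius:

neig[data[[1]]]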
This function will connect the original color to its closest friends within a 0.16 radius of the DistanceFunction:
edgs[v_] := v <-> # & /@ neig[v]
Check:
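Again, an illustrative call of my own:

edgs[data[[1]]]  (* undirected edges from the first color to each of its friends *)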
Noticed the trick with Sort? Sort will flip the edges, orienting b<->a as a<->b, so we can delete duplicates with Union when building all edges between all colors and their friends:
edgsALL = Union[Sort /@ Flatten[ParallelMap[edgs, data], 1]];
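Just to see the Sort trick in isolation (my own tiny example):

Sort[UndirectedEdge["b", "a"]]  (* -> "a" <-> "b", so flipped duplicates collapse under Union *)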
To get a simple color-proximity Graph, define a VertexLabels function:
panelLabel[lbl_] := lbl[[1]]
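A quick check of my own: applied to a vertex pair, this returns just the color, so each vertex is labeled with a plain swatch:

panelLabel[data[[1]]]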
And now behold:
g = Graph[data, edgsALL, VertexLabels -> Table[i -> Placed[{i}, Center, panelLabel], {i, data}],
EdgeStyle -> Opacity[.2], EdgeShapeFunction -> "Line", VertexSize -> 0, ImageSize -> 900]
To build a large-scale browsable network with readable labels, define a new label function:
panelLabel[lbl_] := Panel[Style[lbl[[2]], 14, Bold,
If[ColorConvert[lbl[[1]], "GrayLevel"][[1]] < .5, White, Black]],
FrameMargins -> 0, Background -> lbl[[1]]]
Instead of negating the text color (as we did in the huge table), we make it White if the GrayLevel of the background is < 0.5 and Black otherwise - a different approach. Check:
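For instance (an illustrative call of my own):

panelLabel[data[[1]]]  (* a colored panel with the color name in readable text *)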
Perfect. Now the monster network:
g = Graph[data, edgsALL, VertexLabels ->
Table[i -> Placed[{i}, Center, panelLabel], {i, data}],
EdgeStyle -> Opacity[.2], EdgeShapeFunction -> "Line",
VertexSize -> 0, ImageSize -> 10000];
To browse it open ==> THIS LINK <== in a NEW TAB and zoom in/out. It will look something like this:
The interesting part is: why did we choose the radius 0.16? Two words - percolation theory. For the XKCD data, the radius 0.16 serves as a percolation threshold: much below it the network has a lot of disconnected components, and much above it the network becomes an "overconnected", nearly complete Graph. The former is a lack of information and the latter is "too much" information for a meaningful, sharp description. My intuition is that the percolation threshold is the golden middle that allows for concise but precise definitions. You can experiment with lowering and increasing it to see how the network under- and over-connects (see the sketch below). The percolation threshold is the moment when you can get from one description to another, and then to the next one, and reach any other description. Using such association chains you can deduce deeper connections among remote meanings across the whole network. This is, of course, speculative and arguable. Let me know if you are familiar with relevant research or have an opinion.
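To try that experiment, here is a minimal sketch of my own (the helper names neigR, edgesR, and componentCount are mine, not from the original analysis; it is slow, since it recomputes Nearest for every color at every radius - ParallelMap can speed it up as above):

neigR[c_, r_] := Nearest[DeleteCases[data, c], c, {All, r}, DistanceFunction -> neco];
edgesR[r_] := Union[Sort /@ Flatten[Table[v <-> # & /@ neigR[v, r], {v, data}], 1]];
componentCount[r_] := Length[ConnectedComponents[Graph[data, edgesR[r]]]]

(* a small radius leaves many isolated islands, a large one glues everything into a single component *)
componentCount /@ {0.08, 0.16, 0.32}

And now, using the percolation threshold, we can define new colors based on old descriptions: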
Labeled[Grid[neig[{#, ""}], Frame -> All],
 Row[{"New color ", Graphics[{#, Disk[]}, ImageSize -> 30], " is like"}], Top] &@RandomColor[]
Concise (much fewer than the full ~1000 descriptors) but precise (you "get the feeling"). Now what is next? It would be really great to make a "machine" have an imagination and form its own new color descriptors. How? I am not sure, but probably by running WL machine learning on some large color-related corpora. When I figure it out, I will write a continuation. Or maybe you will?