# Mosaic plots for data visualization

Posted 10 years ago | 15573 Views | 6 Replies | 15 Total Likes
I just published a blog post announcing the implementation of the function MosaicPlot, which gives a visual representation of the contingencies of categorical variables in a list of records. The blog post has examples and explanations: http://mathematicaforprediction.wordpress.com/2014/03/17/mosaic-plots-for-data-visualization/

If we consider the census income data set known as the "adult data set", which is summarized in this table:

we can visualize the co-occurrence of (categorical variable) values with mosaic plots like this one:

By comparing the sizes of the rectangles corresponding to the values "Bachelors", "Doctorate", "Masters", and "Some-college" on the "sex vs. education" mosaic plot, we can see that the fraction of men who have finished college is larger than the fraction of women who have finished college.

We can further subdivide the rectangles according to the co-occurrence frequencies with a third categorical variable. We choose that third variable to be "income", the values of which can be seen as outcomes or consequents of the values of the first two variables of the mosaic plot.

From the mosaic plot "sex vs. education vs. income" we can make the following observations.

1. Approximately 75% of the males with a doctorate degree or a professional school degree earn more than $50000 per year.
2. Approximately 60% of the females with a doctorate degree earn more than $50000 per year.
3. Approximately 45% of the females with a professional school degree earn more than $50000 per year.
4. Across all education types, females are (much) less likely to earn more than $50000 per year.
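The rectangle sizes in a mosaic plot reflect co-occurrence counts and conditional fractions. As a rough illustration of the quantities involved, here is a minimal sketch that computes them with built-in functions only. The `records` below are hypothetical toy data, not the census data set, and this is not the code MosaicPlot itself uses.

```mathematica
(* Hypothetical toy records of the form {sex, education, income};
   an assumption for illustration, not the actual "adult" data *)
records = {
   {"Male", "Bachelors", ">50K"}, {"Male", "Bachelors", "<=50K"},
   {"Male", "HS-grad", "<=50K"}, {"Female", "Bachelors", "<=50K"},
   {"Female", "HS-grad", "<=50K"}, {"Female", "Masters", ">50K"}};

(* Co-occurrence (contingency) counts of the first two variables *)
counts = Tally[records[[All, {1, 2}]]];

(* Fraction of each education value within each sex group; in a
   "sex vs. education" mosaic plot these would drive the split of
   each sex rectangle *)
fractions =
  Map[
   Function[grp,
    grp[[1, 1]] ->
     Map[#[[1]] -> #[[2]]/Length[grp] &, Tally[grp[[All, 2]]]]],
   GatherBy[records, First]];

(* For the toy data the "Male" entry is
   "Male" -> {"Bachelors" -> 2/3, "HS-grad" -> 1/3} *)
```

Applying the same grouping step again inside each (sex, education) group would give the fractions for a third variable such as income.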
6 Replies
Posted 1 month ago
 Here is the corresponding "MosaicPlot" paclet:
Posted 9 years ago
Here is a new blog post of mine that further analyzes the census income data, "Classification and association rules for census income data": http://mathematicaforprediction.wordpress.com/2014/03/30/classification-and-association-rules-for-census-income-data/

I found using MosaicPlot with Manipulate and Tooltip very useful:
Posted 9 years ago
I just published a blog post describing the enhancements to MosaicPlot that I implemented this week: http://mathematicaforprediction.wordpress.com/2014/03/24/enhancements-of-mosaicplot/

The functionality that took the most effort and design work was the coloring of the rectangles. I chose an approach that makes the plots easier to read. Here is a grid of examples:

I also updated my previous posts in this discussion with color plots.
Posted 9 years ago
Thanks, Mark!

Here is the code for the function RecordsSummary that can be used together with Grid to make summary tables. It calls the helper functions NumericVectorSummary and CategoricalVectorSummary, whose definitions are not included in this post.

    Clear[DataColumnsSummary]
    Options[DataColumnsSummary] = {"MaxTallies" -> 7, "NumberedColumns" -> True};

    (* Without column names: generate "column 1", "column 2", ... *)
    DataColumnsSummary[dataColumns_, opts : OptionsPattern[]] :=
      DataColumnsSummary[dataColumns,
       Table["column " <> ToString[i], {i, 1, Length[dataColumns]}], opts];

    (* With column names: one summary Column per data column *)
    DataColumnsSummary[dataColumns_, columnNamesArg_, opts : OptionsPattern[]] :=
      Block[{columnTypes, columnNames = columnNamesArg,
         maxTallies = OptionValue[DataColumnsSummary, "MaxTallies"],
         numberedColumnsQ =
          TrueQ[OptionValue[DataColumnsSummary, "NumberedColumns"]]},
        (* Optionally prefix each name with its column index *)
        If[numberedColumnsQ,
         columnNames =
          MapIndexed[ToString[#2[[1]]] <> " " <> #1 &, columnNames]
         ];
        (* The type of a column is decided by its first element *)
        columnTypes =
         Map[If[NumberQ[#], Number, Symbol] &, dataColumns[[All, 1]]];
        MapThread[
         Column[{
            Style[#1, Blue, FontFamily -> "Times"],
            If[TrueQ[#2 === Number],
             Grid[NumericVectorSummary[#3], Alignment -> Left],
             Grid[CategoricalVectorSummary[#3, maxTallies],
              Alignment -> Left]
             ]}] &, {columnNames, columnTypes, dataColumns}, 1]
        ] /; Length[dataColumns] == Length[columnNamesArg];

    (* Records are rows; transpose them into columns before summarizing *)
    Clear[RecordsSummary];
    RecordsSummary[dataRecords_, opts : OptionsPattern[]] :=
      DataColumnsSummary[Transpose[dataRecords], opts];
    RecordsSummary[dataRecords_, columnNames_, opts : OptionsPattern[]] :=
      DataColumnsSummary[Transpose[dataRecords], columnNames, opts];
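To try the code above without the original helper functions, one possibility is to define simple stand-ins. The two definitions below are hypothetical sketches written for this illustration, not the author's implementations; they only assume that each helper returns a list of rows that Grid can display. The toy records are also made up.

```mathematica
(* Hypothetical stand-ins for the helpers used by DataColumnsSummary;
   the author's actual definitions are not shown in this thread *)
Clear[NumericVectorSummary, CategoricalVectorSummary];

(* Rows of {statistic name, value} for a numeric column *)
NumericVectorSummary[vec_List] :=
  Transpose[{{"Min", "Median", "Mean", "Max"},
    N[{Min[vec], Median[vec], Mean[vec], Max[vec]}]}];

(* The most frequent values of a categorical column, at most maxTallies rows *)
CategoricalVectorSummary[vec_List, maxTallies_Integer: 7] :=
  With[{tallies = SortBy[Tally[vec], -#[[2]] &]},
   Take[tallies, Min[maxTallies, Length[tallies]]]];

(* Hypothetical toy records of the form {sex, education, age} *)
toyRecords = {
   {"Male", "Bachelors", 39}, {"Female", "Masters", 45},
   {"Male", "HS-grad", 28}, {"Female", "Bachelors", 52}};

(* RecordsSummary returns a list of Column objects, one per variable;
   wrapping the list in braces lays them out side by side in one Grid row *)
Grid[{RecordsSummary[toyRecords, {"sex", "education", "age"}]},
 Alignment -> Top, Dividers -> All]
```

With these stand-ins, the "age" column gets a numeric summary and the other two columns get tallies, because the column type is decided by the first element of each column.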
Posted 9 years ago
Thank you for this post and for linking to your blog, especially the posts on mosaic plots with data summaries and on quantile regression. I look forward to exploring these.
Posted 10 years ago
 I updated the implementation of the function MosaicPlot to have an interactive feature using Tooltip that gives a table with the exact co-occurrence (contingency) values when hovering with the mouse over the rectangles. Here is an example:
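As a rough idea of how such hover tables can be attached to rectangles, here is a minimal sketch using only the built-in Graphics, Rectangle, and Tooltip. The counts are hypothetical, and this is an illustration of the general technique rather than the MosaicPlot implementation.

```mathematica
(* Hypothetical counts for one categorical variable *)
counts = {{"Male", 4}, {"Female", 2}};

(* Rectangle widths proportional to the counts *)
widths = counts[[All, 2]]/Total[counts[[All, 2]]];

(* Left and right edges of the rectangles: {0, 2/3, 1} *)
xs = Accumulate[Prepend[widths, 0]];

(* Each rectangle is wrapped in Tooltip, so hovering over it
   shows a small table with the exact values *)
Graphics[
 Table[
  Tooltip[
   {EdgeForm[Black], Lighter[Hue[i/Length[counts]]],
    Rectangle[{xs[[i]], 0}, {xs[[i + 1]], 1}]},
   Grid[{{"value", counts[[i, 1]]},
     {"count", counts[[i, 2]]},
     {"fraction", N[widths[[i]]]}}, Alignment -> Left]],
  {i, Length[counts]}],
 Frame -> True]
```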