Classification and association rules for census income data
Anton Antonov, Accendo Data LLC
10 years ago
This blog post discusses the application of several algorithms to the analysis of census income data.
I used the same data in a previous discussion about mosaic plots because of the data's categorical variables.
Here is a table of the histograms for age, education-num, and hours-per-week:
The two classifiers used are (1) decision trees and (2) naive Bayesian classifiers. Both classifiers are trained with the same training data set and tested with the same test data set. With each classifier I measured the classification success rates after shuffling each of the columns in the test data. (Only one column is shuffled at a time.)
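The shuffling procedure can be sketched in a few lines of Python. This is only an illustration, not the implementation behind the results in this post: `toy_classifier`, the two-column test rows, and the labels are hypothetical stand-ins for a trained classifier and the census test set.

```python
import random

def accuracy(classify, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for row, label in zip(rows, labels) if classify(row) == label)
    return hits / len(rows)

def shuffle_column(rows, col, rng):
    """Return a copy of rows in which only column `col` is permuted."""
    values = [row[col] for row in rows]
    rng.shuffle(values)
    return [row[:col] + (v,) + row[col + 1:] for row, v in zip(rows, values)]

def accuracy_drops(classify, rows, labels, seed=0):
    """Baseline accuracy minus the accuracy after shuffling each column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(classify, rows, labels)
    return {
        col: baseline - accuracy(classify, shuffle_column(rows, col, rng), labels)
        for col in range(len(rows[0]))
    }

# Hypothetical stand-in for a trained classifier: it looks only at column 0
# (years of education) and ignores column 1 (a noise column).
def toy_classifier(row):
    return ">50K" if row[0] >= 13 else "<=50K"

# Made-up test data; the labels are generated by the toy classifier itself,
# so the baseline accuracy is 1.0.
test_rows = [(9, 1), (10, 2), (13, 3), (16, 4), (8, 5), (14, 6)]
test_labels = [toy_classifier(r) for r in test_rows]

drops = accuracy_drops(toy_classifier, test_rows, test_labels)
print(drops)
```

A column the classifier relies on tends to show a large drop, while a column it ignores (column 1 here) shows none. The drop for a single shuffle depends on the random permutation, so in practice one would average over several shuffles.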
Here is a comparison of how much worse the success rates become after the shuffling:
I had to "categorize" the numerical columns in order to be able to apply association rules learning.
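As a rough illustration of what "categorizing" means here, the Python sketch below maps each numerical value to an interval label so that a record becomes a set of categorical items. The breakpoints and the label format are assumptions made for the example; they are not the ones used for the tables in this post.

```python
from bisect import bisect_right

def categorize(value, breakpoints, name):
    """Map a number to an interval label such as 'age:[25,45)'.

    `breakpoints` must be a sorted list of interval boundaries.
    """
    edges = [float("-inf")] + list(breakpoints) + [float("inf")]
    i = bisect_right(breakpoints, value)
    return f"{name}:[{edges[i]},{edges[i + 1]})"

def record_to_items(record, breakpoints_by_column):
    """Turn one record (a dict) into a set of 'column:value' items."""
    items = set()
    for name, value in record.items():
        if name in breakpoints_by_column:
            # Numerical column: replace the number by its interval label.
            items.add(categorize(value, breakpoints_by_column[name], name))
        else:
            # Categorical column: keep the value as it is.
            items.add(f"{name}:{value}")
    return items

# Hypothetical breakpoints, chosen only for illustration.
breakpoints = {"age": [25, 45, 65], "education-num": [10, 13]}
record = {"age": 22, "education-num": 9, "marital-status": "Never-married"}

items = record_to_items(record, breakpoints)
print(sorted(items))
```

Each record is then a "basket" of items, which is the input form that association rule learning algorithms expect.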
Here is a table with some of the rules with the highest confidence:
The confidence of an association rule A->C with antecedent A and consequent C is defined to be the ratio P(A and C) / P(A), that is, the fraction of the records satisfying A that also satisfy C. The higher the ratio, the more confidence we have in the rule. (If the ratio is 1 we have a logical rule: every record that satisfies A also satisfies C.)
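A small worked example may help. The Python sketch below computes confidence with the standard definition, P(A and C) / P(A), by counting records. The five records are made up for illustration and are not taken from the census data.

```python
def confidence(records, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent.

    Counts the records containing every antecedent item, and the share of
    those that also contain every consequent item.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_a = sum(1 for r in records if antecedent <= r)
    n_ac = sum(1 for r in records if (antecedent | consequent) <= r)
    if n_a == 0:
        raise ValueError("no record satisfies the antecedent")
    return n_ac / n_a

# Made-up records, each a set of categorical items.
records = [
    {"married", "edu:high", ">50K"},
    {"married", "edu:high", ">50K"},
    {"married", "edu:low", "<=50K"},
    {"never-married", "edu:low", "<=50K"},
    {"never-married", "edu:high", "<=50K"},
]

c1 = confidence(records, {"never-married"}, {"<=50K"})  # 2 of 2 records
c2 = confidence(records, {"married"}, {">50K"})         # 2 of 3 records
print(c1, c2)
```

In this toy data the rule never-married -> "<=50K" has confidence 1, so it holds for every record that matches the antecedent, while married -> ">50K" holds for two of the three married records.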
Here is a table showing the rules with the highest confidence for the consequent being "<=50K":
The analysis confirmed (and quantified) what is considered common sense:
Age, education, occupation, and marital status (or relationship kind) are good for predicting income (above a certain threshold).
Using the association rules we see, for example, that (1) if a person earns more than $50000 he is very likely to be a married man with a large number of years of education; (2) single parents, younger than 25 years, who studied less than 10 years, and were never-married make less than $50000.