Goals

  • Complete a data visualization with a new tool
  • Discover questions raised by the correlation in the data

Background

As an educator, I worry about the things that affect my students before (and while) they come to me. Of the many things that influence education, one general category is "inequality." In Silicon Valley, racial and gender inequality are frequently identified as causes of the lack of diversity in tech, and these definitely affect educational outcomes as well. Another kind that has caught the attention of both America and the rest of the world is income inequality.

Since the total difference between top and bottom is so large, an easier way to evaluate inequality is with a measurement called the Gini coefficient, which summarizes how unevenly income is distributed on a scale from 0 (everyone has the same income) to 1 (one person has all of it). It's not a measure of absolute wealth, absolute poverty, or changes in income over a lifetime; it's a snapshot of how unequal the distribution is.
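
To make the Gini coefficient more concrete, here is a minimal Python sketch of one standard way to compute it from a list of incomes (the rank-weighted formula). The function name and the two tiny "towns" are assumptions made up for illustration; they are not the state-level data used in this post.

```python
def gini(incomes):
    """Gini coefficient of a list of non-negative incomes.

    0 means everyone has the same income; for a sample of n people the
    maximum is (n - 1) / n, which approaches 1 as n grows.
    """
    values = sorted(incomes)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted form: G = 2 * sum(rank_i * x_i) / (n * total) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical towns of four people each (illustrative numbers only).
equal_town = [40_000, 40_000, 40_000, 40_000]
unequal_town = [0, 0, 0, 160_000]

print(gini(equal_town))    # 0.0
print(gini(unequal_town))  # 0.75
```

Both towns have the same total income, which is the point: the coefficient says nothing about how rich a place is, only about how the income is shared.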

I matched Gini data and educational attainment data from 2009 to test out a new data visualization tool, Google Fusion Tables. Looking at just the Gini data applied to each U.S. state, we see that some "rich" states are more unequal than some "poor" states, but that isn't always the case. Since this was an exercise in visualization, I didn't dig into the data more, but it would be informative to tease out the relationship between Gini and the absolute per capita income for population percentiles.

Although per capita income, income inequality, and spending on education are not the same thing (I'm from Florida, which has low spending on education in part because of reluctance to raise taxes), I wanted to see if there was anything in the data that suggested a correlation between Gini and educational outcomes.

There was (though there was no way to test the significance of the correlation within this tool).
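
Outside the tool, one rough way to check whether a correlation like this could be chance is a permutation test. The sketch below uses only the Python standard library; the function names are my own, and the two lists are made-up placeholder numbers, not the 2009 Gini or attainment figures.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, trials=10_000, seed=0):
    """Two-sided p-value: how often does shuffling ys give a correlation
    at least as strong as the observed one?"""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)


# Placeholder values for eight hypothetical states (NOT real data).
gini_by_state = [0.41, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.50]
hs_completion = [91.0, 90.0, 88.0, 87.0, 86.0, 85.0, 83.0, 82.0]

r = pearson_r(gini_by_state, hs_completion)
p = permutation_p_value(gini_by_state, hs_completion)
print(f"r = {r:.2f}, permutation p-value = {p:.4f}")
```

A small p-value would only say the pattern is unlikely to be random noise; it would not say anything about what causes it.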

If the visualization tool offered more advanced charts, like a 3D scatter plot, the relationship would be clearer with the addition of data like per capita spending, parents' education, and family income. This correlation raises more questions than it answers, though, such as:

  1. Why are the lower completion rates so low? And for which groups?
  2. Are there any clusters within the data?
  3. How strong is the correlation among poverty, education, and inequality on a county-by-county level?


[Visualization: high school or more education]

[Visualization: bachelor's or more education]

[Visualization: advanced degree education]

More questions

One interesting generalization: several of the states with higher Gini coefficients have higher advanced degree completion rates than the rest of the country. Some of the questions this raises are:

  1. Are the states with more advanced degrees more unequal compared to their neighbors because people with advanced degrees have greater incomes?
  2. Are the states with fewer advanced degrees more equal compared to their neighbors because everyone is poorer?
  3. What is the quality and availability of higher education in states with more inequality?
  4. What besides the quality and availability of higher education in each state could explain the differences in educational attainment rates?
  5. What are the non-economic barriers to educational equality in these states?

Additional data that could clarify the causality, at the cost of much more computation, includes longitudinal changes in inequality, migration, demographic shifts, and state taxes. Ultimately, this is a huge issue, one that an afternoon of research and a hunch about a correlation won't solve.

Tool evaluation

I spent more time corralling and formatting the data than actually visualizing it. One frustration I had with Google Fusion Tables was that it didn't let me edit the data the way I wanted to, and a lot of unwanted data showed up in the visualizations even though I tried to change the display settings from a few different menus.

Google Fusion Tables didn't require installing new software, but it had very limited views. Or perhaps I didn't understand the documentation for how to get different views, and Google doesn't offer many video tutorials. The next tools I'd like to try are Tableau Public and CartoDB, which seem to have more options and produce better visualizations (though maybe that's the community).