Article Review
10/1/2014 Averages vs histograms | SuperFly Physics
http://arundquist.wordpress.com/2014/09/28/averages-vs-histograms/ 1/3
Averages vs histograms
Posted on September 28, 2014
With graphics being so easy to add to documents these days, why don’t we show more histograms in place of the
typical approach of representing very complicated data with one or two numbers (e.g., average and standard
deviation)? Sure, if your data is normally distributed, then those two numbers really are a great distillation of the
data. However, lots of things aren’t normally distributed, and I’m lobbying for more use of histograms instead of
(or, I suppose, in conjunction with) the numeric characteristics of the data set.
Here’s the example that got me thinking about this today. At my school, student evaluations of instructors are very
important. We use a seven-point Likert scale on questions such as “The instructor encourages me to learn
actively” and “This course was a valuable learning experience.” Quite often reviews of faculty are peppered with
means and occasionally standard deviations of evaluation data for the reviewed faculty member. However, the
data is not normally distributed at all! It can be bimodal (some hate me, some love me), or highly skewed in other
ways. I’ve been working lately to provide an interface for our evaluations to help people on the tenure and
promotion committee make wise recommendations. Instead of having to click through to each course, I’ve made a
nice table that shows the average for the class on each question. The table rows are the various courses the faculty
member has taught. But while thinking about the notion of showing histograms in addition to averages, I hit upon
using PHP to dynamically create SVGs with the histograms. Here’s what it looks like:
Figure: Five courses for an anonymous faculty member. Each column is a different question on our standard evaluation.
I feel like you learn a lot by looking at the (tiny) histograms. Take the three “4.44”s that are in the third class. The
middle one is much more bimodal than the other two.
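For the curious, here is a minimal sketch of the tiny-histogram idea. The version described above was written in PHP and isn't shown, so this is a Python stand-in rather than the actual code. The function names `histogram_svg` and `mean_rating`, the pixel sizes, the fill color, and the response counts are all assumptions made up for illustration.

```python
from xml.sax.saxutils import escape


def mean_rating(counts):
    """Mean rating when counts[i] is the number of responses at rating i + 1."""
    total = sum(counts)
    return sum(rating * c for rating, c in enumerate(counts, start=1)) / total


def histogram_svg(counts, width=70, height=30, gap=1, label=None):
    """Return a small inline SVG bar chart for a list of per-bin counts."""
    if not counts:
        raise ValueError("counts must contain at least one bin")
    n = len(counts)
    tallest = max(counts) or 1  # avoid division by zero when every bin is empty
    bar_width = (width - gap * (n - 1)) / n
    bars = []
    for i, count in enumerate(counts):
        bar_height = height * count / tallest
        x = i * (bar_width + gap)
        y = height - bar_height  # SVG y grows downward, so bars sit on the bottom edge
        bars.append(
            f'<rect x="{x:.2f}" y="{y:.2f}" width="{bar_width:.2f}" '
            f'height="{bar_height:.2f}" fill="steelblue"/>'
        )
    title = f"<title>{escape(label)}</title>" if label else ""
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}" '
        f'viewBox="0 0 {width} {height}">{title}{"".join(bars)}</svg>'
    )


# Hypothetical counts for ratings 1..7 (invented, not real evaluation data).
bimodal = [0, 4, 0, 0, 0, 3, 2]
svg = histogram_svg(bimodal, label=f"mean {mean_rating(bimodal):.2f}")
```

The returned string can be dropped straight into a table cell next to the average, which is roughly the layout described above.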
What am I lobbying for? I’d love it if many more reports/journal articles/newspaper stories did this kind of thing.
The graphics generation and inclusion is really not that hard, and I think it communicates the whole story, not
just a distilled version.
One downside is the inability to describe the data very easily. I was showing this to my partner and I was trying to
say “this one is different than that one” and I had to point to them. I couldn’t easily describe them. So I resorted to
saying “the 4.44 one . . .” etc. I suppose this is backing up my point that the data sets are complex and resist easy
description, but I know my colleagues on the tenure and promotion committee like to really discuss these
evaluations a lot.
Here’s another interesting point from a friend of mine (who’ll remain anonymous):
Averages and SDs are **NOT** appropriate for categorical data. They assume the “distance”
between each category is equal, as if the numerical choices were locations on a spatial scale. They
are not. You’ve got two choices: Report number of responses in each bin (as you’re playing with);
or turn to Rasch analysis, which is designed for exactly this problem. But it’s not for the faint of
heart…
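The first option, reporting the number of responses in each bin, is easy to sketch. Below is a small Python illustration (the helper `bin_counts` and both response lists are hypothetical, invented only to make the point): two sets of responses share the same mean, yet their bin counts look nothing alike.

```python
from collections import Counter
from statistics import mean


def bin_counts(responses, scale_max=7):
    """Count how many responses fall on each point of a 1..scale_max scale."""
    tally = Counter(responses)
    return [tally.get(point, 0) for point in range(1, scale_max + 1)]


# Hypothetical responses on a seven-point scale, invented for illustration only.
clustered = [4, 4, 4, 4, 4, 5, 5, 5, 5]
polarized = [2, 2, 2, 2, 6, 6, 6, 7, 7]

print(round(mean(clustered), 2), bin_counts(clustered))  # 4.44 [0, 0, 0, 5, 4, 0, 0]
print(round(mean(polarized), 2), bin_counts(polarized))  # 4.44 [0, 4, 0, 0, 0, 3, 2]
```

Rasch analysis, the second option, is a much bigger topic and is not attempted here.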
Interesting, huh?
SuperFly Physics
Physics questions, ideas, hare-brained
schemes
About Andy "SuperFly" Rundquist
Associate professor of physics at Hamline.
Your thoughts? Here are some starters for you:
This is great. I totally agree that representing all of the data is much better than any distillations. I would even
go further by suggesting . . .
This is dumb. We use the distillations for several very good reasons . . .
Why do you use evaluation data at all? They’ve clearly been shown to be problematic.
Why a 7-point Likert scale? How about a 2-point Love-ert scale?
How did you make those SVG histograms in PHP?
PHP?!!? I’m never reading this blog again.
Wait, I thought you only knew how to use Mathematica.
This entry was posted in math and tagged data display, php.
4 Responses to Averages vs histograms
bretbenesh says:
September 28, 2014 at 10:08 pm
I give a big thumbs up to your anonymous friend: categorical data should definitely not be averaged.
I think that your histograms are preferable, although I might even prefer a basic list of frequency distributions if I were on
Rank and Tenure (two notes: this is probably a personal preference, and I definitely prefer just seeing the numerical
frequencies if there are only five possibilities on the Likert scale, as I am used to, but I might start preferring the histograms if
I had to deal with seven possibilities).
And that is some pretty nice PHP.
Mr. John says:
September 29, 2014 at 10:23 am
Hello Andy. I am a former student of yours. I have a few thoughts on this piece. Based on your description, it seems that
Hamline puts way too much stock in student evals. I just read an article the other day about how student evals are actually
one of the worst indicators of teaching quality, whereas a peer evaluation system would be much preferred.
Nonetheless, I love your point that the 4.44 averages are not all the same, as the histograms showed.
As a side note, I think that if the evals are so high-stakes, teachers should have some say in how questions are worded and
what gets asked, so you have more input in the evaluation process.
What I learned in your class cannot be quantified in an evaluation process. I learned the value of struggling. I learned what it
is like to delve into really difficult material and try to make it work. Having had that experience has really helped me better
understand the students I am working with this year.
Finally, I think your classes really foster self-knowledge and reflection, which again do not really get captured in the student
eval process.
Joss Ives says:
September 29, 2014 at 3:49 pm
Ignoring the categorical data issue, and the fact that I would almost always take a visual representation over numerical
summary representation, wouldn’t the standard deviation of the bimodal 4.44 be so large that it could only represent
bimodal data? More data is usually better, but sometimes one just wants a summary (hello Metacritic).
andrewkbennett says:
October 1, 2014 at 9:58 am
It makes sense that most of the time, the standard deviation of a bimodal distribution will be larger, but is it
necessarily? Those shape-related features can still get lost in translation, I think. Consider {2,2,2,3,3,4,4,5,5,6,6,6}
and {1,2,2,3,3,4,4,5,5,6,6,7}. Not a great example, perhaps, but it does show that the standard deviation of the more
“bimodal-looking” distribution can be smaller.
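The two sets above are easy to check. Here is a quick Python sketch using the standard library's `statistics` module; the variable names `two_humps` and `spread_out` are just labels chosen for this illustration. Both sets have a mean of 4, and the first one has the smaller standard deviation whether you use the population or the sample formula (roughly 1.53 versus 1.78 for the population version).

```python
from statistics import mean, pstdev, stdev

# The two data sets from the comment above.
two_humps = [2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 6]
spread_out = [1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7]

for name, data in (("two_humps", two_humps), ("spread_out", spread_out)):
    # pstdev divides by n (population); stdev divides by n - 1 (sample).
    print(name, mean(data), round(pstdev(data), 3), round(stdev(data), 3))
```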