Tag Archives: data

Studio Usage Heat Map

studio usage heat map - by day

If you’ve been following along, you know that I’ve been working to pull together a recording studio on a budget. Our first steps were clearing out the old office that was destined to become the studio, minimizing the echo in the room, and painting one wall Sparkling Apple to use as a green screen. This is where our first $100 went. Next, we spent another $50 or so to light both the green screen and the talent in front of it. I’m currently working on sorting out the best solution for audio and video. (Stay tuned for updates!)

Fortunately, the lack of A/V equipment hasn’t prevented our staff from using the studio.  In fact, since the doors first opened in July, it has seen over 150 hours of use.  At this point, it is interesting to look at the patterns of usage that have emerged. Thus, the heat map, above.

To make the heat map, I added a “1” to each half-hour timeslot that the studio was reserved each week in an Excel spreadsheet. I then color-coded the data in the sheet with hotter colors reflecting higher numbers. The colors help to visualize trends in usage. For example, usage increases as the week goes on with Thursday and Friday afternoons appearing in oranges and reds. In contrast, there are times early on Monday and Tuesday that have never been reserved.

Studio usage heat map - by week

I also have a heat map that compresses all of the days into one, which I made by totaling the times for each half-hour block on the spreadsheet and then color-coding it. Click to enlarge it. Again, it’s pretty easy to see the studio warm up as the day goes on, indicating increased usage.  Having a couple of regular evening reservations also contributes to this pattern.
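For anyone who would rather script this than fill in a spreadsheet, here is a minimal Python sketch of the same tallying idea. The reservations, the function names, and the color buckets are all made up for illustration; my actual heat maps were built by hand in Excel.

```python
from collections import Counter

# Hypothetical reservations: (day, start, end) as 24-hour "HH:MM" strings.
reservations = [
    ("Thu", "13:00", "14:30"),
    ("Thu", "13:30", "15:00"),
    ("Fri", "14:00", "15:00"),
]

def to_minutes(hhmm):
    """Convert "HH:MM" to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def half_hour_slots(start, end):
    """Start minute of each half-hour slot from start up to (not including) end."""
    return list(range(to_minutes(start), to_minutes(end), 30))

def tally_by_day(reservations):
    """Add 1 to each (day, slot) cell for every reservation covering it."""
    counts = Counter()
    for day, start, end in reservations:
        for slot in half_hour_slots(start, end):
            counts[(day, slot)] += 1
    return counts

def tally_all_days(by_day):
    """Compress all days into one by totaling each half-hour slot."""
    totals = Counter()
    for (_day, slot), count in by_day.items():
        totals[slot] += count
    return totals

def color_for(count, max_count):
    """Assumed color buckets: hotter colors for higher counts."""
    if count == 0:
        return "blank"
    if count * 3 <= max_count:
        return "yellow"
    if count * 3 <= 2 * max_count:
        return "orange"
    return "red"

by_day = tally_by_day(reservations)
totals = tally_all_days(by_day)
hottest = max(totals.values())
for slot in sorted(totals):
    print(f"{slot // 60:02d}:{slot % 60:02d}", totals[slot], color_for(totals[slot], hottest))
```

The by-day tally corresponds to the first heat map and the all-days total to the second. A spreadsheet’s conditional formatting does the job of `color_for`.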

Color coding numbers in a spreadsheet isn’t rocket science, but it is an easy way to visualize the data to quickly get a read on the studio. And, I can see that I’m going to have to start coming in earlier on Mondays if I want to use the studio.

Filed under Projects

Raw. What is it good for?

students vs teachers-1 cropped

When I first came across Raw, a free, online data visualization tool, I channeled my inner Edwin Starr and asked, “What is it good for?”  It turns out the answer is “absolutely everything.”  Or pretty close to it.

Raw is extremely user friendly.  It’s built on D3.js, which is pretty powerful.  If you, like me, haven’t had time to explore D3 in depth (or if, also like me, you’re not sure you have the skills to take it on), Raw greatly simplifies the process.  And all of the data is processed in your browser, which means your data is never copied and stored on their servers.

So, what can Raw do for you?  Well, take your favorite data set and paste it into the text box (or choose one of the four example data sets provided).  Then choose one of the 15 chart types and drag components of your data onto the axes or other options for the chart type you have chosen.  You can do this as many times as you like to try your data on different chart types.  Finally, customize your visualization by adjusting its size, scale, and colors before choosing how you want to export your results.  It’s amazingly easy!

I created the visualization at the top of this post by feeding in some data on teachers (left) and students (right).  The lines connecting them represent classes that the students had with each teacher with thin lines for one semester and thick ones for the next.  I wanted to explore how students move through our program.  Here, it’s easy to see that most students move up from one level to the next, but there are some that skip levels and some that repeat levels.  The students and teachers are not arranged in order from lowest to highest level, though this would be possible and might make it easier to see these trends.

There are lots of other options within Raw and, depending on what your data include, some may be more useful than others.  But the beauty of Raw is that you are only a couple of clicks away from any of them, making it very easy to try several visualizations until you find one you like.

Filed under Resources

The Largest Vocabulary in Hip Hop

turntable

“technics sl-1200 mk2” by Rick Harrison / Flickr

I spent much of my youth listening to hip hop, or, as it was called back then, rap music.  This was long before MP3 players and long before you could Google your favorite song lyrics.  It was also long before I knew anything about textual analysis, let alone before I thought about using unique words per n words as a measure of variety in vocabulary.

So, when Matt Daniels published this piece called The Largest Vocabulary in Hip Hop last month, it was both a flash back to the music of my youth and a flash forward to some of my current interests in corpus linguistics.

Daniels does a very nice analysis, so I won’t repeat much of it here.  Just follow the link and scroll down to see the details.  Be aware that some of the analysis incorporates a bit of slang that may not make it completely kid friendly.

Most noteworthy in the analysis are the two baselines of comparison:  Shakespeare (5,170 unique words per 35,000 words) and Herman Melville (6,022 unique words in the first 35,000 words of Moby Dick).  Of the 85 rappers analyzed, 16 use a wider vocabulary than Shakespeare and 3 are above Melville.  So, if you ever thought all hip hop was a simplistic art form, you may want to take another look.  It’s amazing what an analysis of the data can show us.
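If you want to try the “unique words per n words” measure on a text of your own, a rough Python sketch might look like the one below. The tokenizing rule (lowercase everything, keep runs of letters and straight apostrophes) is my own assumption; Daniels may well have counted words differently, so don’t expect to reproduce his numbers exactly.

```python
import re

def unique_words_in_first_n(text, n):
    """Count distinct words among the first n words of text.

    Assumed tokenization: lowercase, then runs of letters and apostrophes.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return len(set(words[:n]))

# Tiny made-up sample, just to show the call.
sample = "The cat and the hat and the bat"
print(unique_words_in_first_n(sample, 5))
print(unique_words_in_first_n(sample, 35000))
```

With a full corpus, you would pass in the complete lyrics or novel and set n to 35,000, as in the analysis.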

Filed under Inspiration

Data is Beautiful

graph of "language" as a tag in TED talks

Visualization of how often “language” is a tag in TED Talks.

I’ve mentioned data visualizations in several previous posts, so it may not be surprising that I’m writing about a trove I’ve recently found: the dataisbeautiful subreddit.  In addition to lots of excellent data visualizations (and some mediocre ones), there’s lots of interesting discussion, including responses to previous visualizations (for example, compare this early version of “How we die” to this follow-up).

One I just came across is someone asking about a pattern in some data, specifically why Google searches for “1990s” peak in May of almost every year.  Other decades follow the same pattern.  Several correlates are suggested (high school reunions, for example) but it turns out that high school proms look like the best correlate.  So, 1950s, 1960s, 1970s, 1980s, and, yes, 1990s, seem to be heavily-Googled prom themes.

If you’re not familiar with Reddit, this is a great subreddit to jump into.  One of the key features of Reddit is that users can vote content up or down, which means that the best content rises to the top (though the definition of “best” is open to the interpretation of every user).  It’s free to join and not even an email address is required.  You can lurk for a while, simply upvote and downvote, or jump right into conversations with people from across the internet on almost every conceivable topic, including the data visualizations in dataisbeautiful.

Filed under Resources

Data Visualizations from the New York Times

Screen Shot 2014-04-02 at 12.43.17 PM (2)

Everyone loves a good data visualization.  And everyone loves a good data visualization even more if the visualization is interactive.  Unfortunately, I can’t embed an interactive visualization above, but click on it to link to the interactive version.  The circles represent the volume of traffic at airports around the U.S.  Clicking on a circle reveals all of the connecting flights to that airport.  I’m sure you could get this information out of some kind of heinous Excel spreadsheet, but this format is way more engaging.

This is why I was attracted to this year’s Wherry Lecture, which is hosted by the Departments of Statistics and Psychology at Ohio State.  The speaker was Amanda Cox from the New York Times’ graphics department, who spoke about the Times’ use of data visualizations.  Amanda shared many examples that illustrated the importance of context, how a good visualization sometimes limits the amount of data in order to highlight patterns, and the importance of how the text and the visuals work together.  These are a few of my favorites.

The Jobless Rate for People Like You – Not all groups have felt the recession equally.  This visualization allows you to view trends in different demographics.  The differences can be startling.

One Report, Diverging Perspectives – Employment numbers with “Democrat” and “Republican” buttons that allow you to view the same data through different lenses.

Over the Decades, How States Have Shifted – A look at how each state has voted – Democratic or Republican – with connections to every election since 1952.

Counties Blue and Red, Moving Right and Left – Imagine a map of the wind blowing across the U.S.  Now instead of that wind representing, well, wind, imagine it representing the changes in vote margin between Democratic and Republican presidential candidates.

Mapping America: Every City, Every Block – Based on U.S. Census data from 2005 to 2009, you can choose to represent ethnicity, income, housing, education, and other information on a map and then zoom out to view the entire nation or zoom in to view your neighborhood.

All of these examples provide different paths to understanding the data that is represented.  To see some of the other examples in this lecture, check out my Twitter stream (@eslchill) or follow the New York Times Graphics Department (@NYTgraphics).

Filed under Inspiration

Choose Your Own Visualization

Like many kids who grew up in the ’80s and ’90s, I discovered Choose Your Own Adventure (CYOA) books in my school library and later at my local public library.  I read and re-read many of them and eventually owned a few of them.

For those not familiar with the genre of gamebooks, the reader reads the first couple of pages at which point she is faced with a decision.  For example, after passing through an antimatter storm, do you keep your spaceship on course or do you return to your home planet?  Depending on the choice, the reader is directed to another page where that branch of the story continues.  More choices follow every page or two and the story branches off into several directions with many possible endings.

For young readers, the challenge of finding successful endings can spur multiple readings of the story.  For young authors (I wrote a CYOA story as a writing project in high school), the process of creating and managing multiple story lines can be an interesting challenge.  For ESL students, both reading and writing CYOA stories can be a compelling way to practice English.

The branching structure means the story can grow exponentially.  To see just how complex these stories can quickly become, take a look at some of these visualizations of CYOA stories.

The first example, pictured above, is from seanmichaelragan.com.  This graph clearly illustrates how the stories branch and, in some cases, reattach.  Each node represents the page number of each choice.

The second example orients the graph horizontally and uses color to denote critical plot points as well as happy or tragic endings.

The third example fans the story out from the center, but includes even more information.  Happy endings, cliffhanger endings, and reader death endings are noted, but additional text pops up when you mouseover each node describing each decision.
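To get a feel for what these graphs encode, here is a small Python sketch of a story as a set of pages and choices. The story itself is invented, and the sketch assumes the story never loops back to an earlier page (real CYOA books sometimes do). Note that page 5 can be reached from two different pages, which is the “reattaching” visible in the graphs.

```python
from functools import lru_cache

# Hypothetical story: each page maps to the pages a reader can turn to next.
# An empty list marks an ending.
story = {
    1: [2, 3],
    2: [4, 5],
    3: [5, 6],
    4: [],
    5: [7, 8],
    6: [],
    7: [],
    8: [],
}

def endings(story):
    """Pages where the story stops."""
    return sorted(page for page, choices in story.items() if not choices)

def count_paths(story, start):
    """Number of distinct ways to read from start to any ending.

    Assumes the story has no loops; a loop would recurse forever.
    """
    @lru_cache(maxsize=None)
    def paths_from(page):
        choices = story[page]
        if not choices:
            return 1
        return sum(paths_from(next_page) for next_page in choices)

    return paths_from(start)

print(endings(story))
print(count_paths(story, 1))
```

If every page offered two choices and nothing ever reattached, the number of paths would double with each decision, which is why these stories get complicated so quickly.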

As a fan of both Choose Your Own Adventure stories and data visualizations, I’ve really enjoyed looking through these images.  If you’re not familiar with Choose Your Own Adventure stories, I recommend tracking a couple of them down for yourself and for your students.  Students could enjoy reading, writing, and analyzing these stories, which are accessible to high intermediate readers.

Filed under Resources

My Favorite Data Visualization

info graphic of best teams in baseball

I’m a visual person, so it’s no surprise that I like to see data represented visually.  One of my favorite data visualizations is the one above from a blog called Flip Flop Fly Ball.  The blog focuses on baseball, but the website it’s hosted on features lots of other quirky data visualizations including a graph of how many smarties are in the tube and a representation of the relationships between characters in the movie Love Actually.

I’m not really a big fan of baseball, but I do like sports and the discussion and statistics they generate.  This graph crams in a lot of information.  World Series winners from 1995 to 2009 are represented in pink (losers are in purple).  Teams that had a better regular season record than the champions are above and those with a worse record are below (teams with identical records all appear in the same box); National League teams are in yellow; American League teams are in white; and wild card teams are in italics.

If this information were all presented in columns, it would be a bit hard to decipher.  But shifting each column to align the winners puts the data in a new light.  Whenever I think of this site, this is the data visualization I think of first.  I’m not sure if there is a way to apply a similar approach to data generated in a classroom, but I imagine it would give a different perspective on students’ performance from the traditional bell curve.

Take a look around the site and you’ll get lots of different ideas for ways in which to represent data.  You’ll probably also learn a few things about baseball and maybe find something that would get your ESL students talking.

Filed under Inspiration

Data Visualization: Attendance vs. GPA

Above is a plot of students’ attendance versus their grade point averages (GPAs).  See any trends?  Obviously, students with higher attendance tend to have higher GPAs.  While this is not particularly surprising, it’s nice to be able to support this notion with actual data.

(I should say that this “actual data” is not actual data, but it is based on actual data.  I’ve taken the actual “actual data” and randomly added or subtracted up to 5% so that the general trends remain, but none of the actual data points are the same, except by chance.)
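For the curious, the kind of nudging I’m describing could be sketched in Python like this. It assumes “up to 5%” means up to 5% of each value, and the function names and sample points are made up for illustration; none of them are real student data.

```python
import random

def jitter(value, rng, max_fraction=0.05):
    """Randomly raise or lower value by up to max_fraction of itself."""
    return value * (1 + rng.uniform(-max_fraction, max_fraction))

def anonymize(points, seed=None):
    """Jitter every (attendance, gpa) pair independently."""
    rng = random.Random(seed)
    return [(jitter(attendance, rng), jitter(gpa, rng)) for attendance, gpa in points]

# Made-up (attendance %, GPA) pairs, not real student data.
points = [(90.0, 3.0), (75.0, 2.1)]
for attendance, gpa in anonymize(points, seed=1):
    print(round(attendance, 1), round(gpa, 2))
```

One thing this sketch does not handle is the ceiling: a jittered value can drift above 100% attendance or a 4.0 GPA, so you might want to cap the results.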

In addition to the general trend that GPAs correlate positively with attendance, I can say that no student who had 100% attendance got less than a C+ (2.85 GPA) and that no student who got a 4.0 GPA (straight As) attended less than 96% (at least in the “actual” data).
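If you want to put a number on that trend rather than eyeball it, a Pearson correlation is one common choice. The Python sketch below uses invented sample points (chosen only to echo the thresholds mentioned above, not taken from our records) and hypothetical helper names.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)

def lowest_gpa_with_full_attendance(points):
    """Lowest GPA among students with 100% attendance."""
    return min(gpa for attendance, gpa in points if attendance == 100)

def lowest_attendance_with_top_gpa(points):
    """Lowest attendance among students with a 4.0 GPA."""
    return min(attendance for attendance, gpa in points if gpa == 4.0)

# Made-up (attendance %, GPA) pairs for illustration only.
points = [(100, 4.0), (100, 2.85), (96, 4.0), (90, 3.2), (75, 2.1), (60, 1.5)]
attendance = [a for a, _ in points]
gpas = [g for _, g in points]

print(round(pearson(attendance, gpas), 2))
print(lowest_gpa_with_full_attendance(points))
print(lowest_attendance_with_top_gpa(points))
```

A value near 1 means attendance and GPA rise together; it still says nothing about which one causes the other.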

Can I claim causality?  Not exactly.  I don’t know that higher attendance causes higher grades, or vice versa, but I think it could be claimed that low attendance causes low grades — if you aren’t in class, you can’t get an A.

Admittedly, this isn’t the most cutting edge visualization — it’s just a graph I made using Microsoft Excel — but I think it represents a relatively simple set of data effectively.

I plan to show this graph to all of our students at our program-wide meeting at the beginning of the academic year.  If nothing else, it should get them thinking a bit about the importance of attending class if they want to be successful.  This isn’t a big issue for most of our students but, as you can see, it is an issue for some.  And if it helps them to have me connect the dots, I gladly will (see below, click to enlarge).

Filed under Projects

Processing Data Visualizations

closeup of CPU chip

I’ve seen a lot of interesting data visualizations lately but have struggled to figure out how to visualize my own data.  It seems like there is a vast chasm between creating pie charts in Excel and Hans Rosling’s TED Talks.  Then I stumbled upon Processing.

Processing was used to create the genetics simulation I described in an earlier post.  After looking into it some more, I learned that Processing was developed out of a project at MIT’s Media Lab.  It is an object-oriented programming language conceived as a way to sketch out images, animations and interactions with the user.

Examples of Processing projects include everything from a New York Times data visualization of how articles move through the internet and a visual representation of the data in an annual report to more esoteric and artistic works.

To get started, download the application at http://processing.org and go through some of the tutorials on the site.  There are lots of examples included with the download so you can also open them up and start tweaking and hacking them, if that’s your preferred method of learning.  Once your code is complete, or after you’ve made a minor tweak, click on the play button to open a new window and see how it looks.  Once you’ve completed your project, you can export it as an applet, which can be uploaded to a web server, or as an executable file for a Mac, Windows, or Linux computer.

I’ve been through the first half-dozen tutorials and am to the point of making lines and circles dance around.  I can even make the colors and sizes vary based on mouse position.  I have also opened up some of the more advanced examples and started picking away at them to see what I can understand and what I still need to learn more about.  Once I can import data from an external source, it will be really exciting to see the different ways to represent it.

I haven’t had a foreign language learning experience in a while.  I am learning (and re-learning) many valuable lessons as I try to express myself in this new language.  Not surprisingly, I’m finding that I need a balance between instruction (going through the tutorials) and practice / play (experimenting with the code I’m writing or hacking together).  I’m also a bit frustrated by my progress because I can see what can be done by fluent speakers (see examples, above) but am stuck making short, choppy utterances (see my circles and lines, which really aren’t worth sharing.)  I plan to both work my way through the basics (L+1) as well as dabble with some more advanced projects (L+10) to see if I can pull them off.  If not, I’ll know what to learn next.

Fortunately, I have one or two friends who are also learning Processing at the same time.  They are more advanced than me (in programming languages, but I hold the advantage in human languages), but it has been helpful and fun to bounce examples and ideas off of one another.  We plan to begin a wiki to document our progress and questions as they arise, a little like a student’s notebook where vocabulary and idioms are jotted down so they can be reviewed later.

Watch for more updates as projects get pulled together as well as notes on other ways to visualize data in the near future.

Filed under Projects

2010 in review

automatic transmission

I got this auto-generated post direct from WordPress, which hosts ESL Technology.com, and thought some of it was worth sharing.  Every time I log in to my blog, I can check on these numbers and other interesting data.  I can see how many page views I’ve had by day, week, and month as well as which pages were most popular and what sites are referring people to my blog.  More interesting than the numbers themselves is the fact that this data is so easily available that WordPress can automagically pull it together into a blog post for me.  (Incidentally, you can search “2010 in review” on WordPress to find other bloggers who have posted the auto-generated post.)

This kind of data is becoming easier and easier to work with — to mashup.  And all kinds of new software allows us to pull together lots of data in enlightening ways.  Governments that are making this kind of data available are finding citizens stepping forward to develop ways to make it more useful.  See Tim Berners-Lee’s TED Talk for a six-minute rundown of the highlights, below.

I recognize that the numbers generated for me by my blog are not as important as which roads are impassable after an earthquake in Haiti.  But, on almost every scale, this data is becoming easier to find, use, and mashup.  Some of our students may already be doing this.  Surely, many are not.  Developing the ability to work with this kind of data in very dynamic ways is sure to be an asset, if not an expectation, in the near future.

So, without further ado, my numbers.  Thanks for reading.

Crunchy numbers

Featured image

A helper monkey made this abstract painting, inspired by these stats.

A Boeing 747-400 passenger jet can hold 416 passengers. This blog was viewed about 13,000 times in 2010. That’s about 31 full 747s.

In 2010, there were 50 new posts, growing the total archive of this blog to 119 posts.

The busiest day of the year was August 4th with 95 views. The most popular post that day was OutSMARTed.

Where did they come from?

The top referring sites in 2010 were twitter.com, esl.osu.edu, en.wordpress.com, en.bab.la, and google.com.

Some visitors came searching, mostly for esl technology, esl and technology, technology esl, technology for esl students, and kinesthetic learners.

Attractions in 2010

These are the posts and pages that got the most views in 2010.

1. OutSMARTed, August 2010 (4 comments)
2. Interactive Whiteboard FAQ (Wii), March 2009 (16 comments)
3. About Me, July 2008 (4 comments)
4. How do I know my IR LED works?, October 2008 (1 comment)
5. Projects, August 2008 (1 comment)

Filed under Inspiration