I spent much of my youth listening to hip hop, or, as it was called back then, rap music. This was long before MP3 players and long before you could Google your favorite song lyrics. It was also long before I knew anything about textual analysis, let alone before I thought about using unique words per n words as a measure of variety in vocabulary.
So, when Matt Daniels published this piece called The Largest Vocabulary in Hip Hop last month, it was both a flash back to the music of my youth and a flash forward to some of my current interests in corpus linguistics.
Daniels does a very nice analysis, so I won’t repeat much of it here. Just follow the link and scroll down to see the details. Be aware that some of the analysis incorporates a bit of slang that may not make it completely kid friendly.
Most noteworthy in the analysis are the two baselines of comparison: Shakespeare (5,170 unique words per 35,000 words) and Herman Melville (6,022 unique words in the first 35,000 words of Moby Dick). Of the 85 rappers analyzed, 16 use a wider vocabulary than Shakespeare and 3 are above Melville. So, if you ever thought all hip hop was a simplistic art form, you may want to take another look. It’s amazing what an analysis of the data can show us.
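If you are curious how a measure like "unique words per 35,000 words" might be computed, here is a minimal sketch in Python. It is my own illustration, not Daniels's actual method: the function name `unique_words` is made up, and the rule for splitting text into words (lowercase everything, keep runs of letters and apostrophes) is a simplifying assumption. A real analysis would have to decide how to treat plurals, slang spellings, and so on.

```python
import re

def unique_words(text, n=35000):
    """Count distinct words among the first n word tokens of text."""
    # Assumption: a "word" is any run of letters or apostrophes, case ignored.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Only the first n tokens count, so long and short texts are comparable.
    return len(set(tokens[:n]))

sample = "the cat and the hat and the bat"
print(unique_words(sample, n=5))  # first five tokens: the, cat, and, the, hat -> 4
```

Capping every text at the same number of words is what makes the comparison fair: a longer text would otherwise look richer simply because it has more room for new words.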
A friend recently lent me the book Uncharted: Big Data as a Lens on Human Culture, which discusses the development of the Google N-Gram Corpus. After scanning millions of books, Google could not simply make them all freely available because this would essentially be republishing copyrighted works. Instead, Google has made them all searchable by N-Grams (one-, two-, three-word phrases and so on up to n words), which protects the copyrighted works because they are really only viewable in aggregate. The corpus is, of course, limited in that it only includes books (as opposed to also including magazines, newspapers, oral texts, etc.), but given that it goes back hundreds of years, the size and the scope of the corpus are pretty amazing.
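To make the idea of an n-gram concrete, here is a small Python sketch that counts every run of n consecutive words in a text. This is only an illustration under simple assumptions (words are split on whitespace and lowercased, and `ngram_counts` is a name I made up). It says nothing about how Google actually builds its corpus.

```python
from collections import Counter

def ngram_counts(text, n):
    """Count each run of n consecutive words in text."""
    # Assumption: words are whatever is separated by whitespace, case ignored.
    words = text.lower().split()
    # Zipping the word list against shifted copies of itself yields every n-word window.
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)

counts = ngram_counts("to be or not to be", 2)
print(counts.most_common(1))  # [('to be', 2)]
```

Notice that the counts alone do not let you rebuild the original sentence in order, which is the same property that keeps the scanned books viewable only in aggregate.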
Early on in Uncharted, a book called Legendary Lexical Loquacious Love, a concordance of a romance novel, is affectionately described as a conceptual art piece that helped to inspire the N-Gram Corpus. In Love, every word from a romance novel is presented in alphabetical order. So, a word like a, which appears throughout the original source novel, is repeated scores of times in a row. The authors talk about how different the experience of reading a concordance of a romance novel is from reading the original romance novel, but how the former is compelling in its own way. For example, they offer the following quote:
These 29 occurrences of the word beautiful are, presumably, spread throughout the original novel. But seeing them juxtaposed with other words that begin with b (and with the scores of occurrences of the word a) gives you a different perspective on a romance novel.
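The alphabetical rearrangement described above is easy to try on any text. Here is a rough Python sketch: it sorts every word alphabetically while keeping the repeats, then tallies how many times each word occurs. The function names and the word-splitting rule are my own assumptions, not anything taken from the book.

```python
import re
from itertools import groupby

def alphabetized(text):
    """Return every word of text in alphabetical order, repeats included."""
    # Assumption: a "word" is any run of letters or apostrophes, case ignored.
    return sorted(re.findall(r"[a-z']+", text.lower()))

def tallies(sorted_words):
    """Collapse an already sorted word list into (word, count) pairs."""
    return [(word, len(list(group))) for word, group in groupby(sorted_words)]

words = alphabetized("A beautiful day and a beautiful night")
print(" ".join(words))  # a a and beautiful beautiful day night
print(tallies(words))   # [('a', 2), ('and', 1), ('beautiful', 2), ('day', 1), ('night', 1)]
```

Even in a seven-word example, the repeated words cluster together in a way they never do in the original sentence.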
What does this have to do with Star Wars? Great question. While reading Uncharted, I came across the following YouTube video:
Created by Tom Murphy, the video is “meant to be provocative in its uselessness.” It took 42 hours to produce the 43-minute video, which is oddly compelling to watch. In addition to the video, a small data bar at the bottom graphs the frequency of each word, which is also tallied onscreen throughout the video. It’s a different experience, much like reading a concordance is different from reading the original source text. For example, the famous scene in which Obi-Wan uses a Jedi mind trick on a couple of Stormtroopers appears in the original movie as follows:
Stormtrooper: Let me see your identification. Obi-Wan: [with a small wave of his hand] You don’t need to see his identification. Stormtrooper: We don’t need to see his identification. Obi-Wan: These aren’t the droids you’re looking for. Stormtrooper: These aren’t the droids we’re looking for.
In Arst Arsw, this interaction is best summarized by the three occurrences of the word identification, which are the only three times that this word appears in the film. Identification appears at 16:08 of the video. There are many other interesting moments, particularly when different voices utter the same word several times (for example, leader by several rebel pilots) or when only one character uses the same word several times (for example, kid by Han Solo). For me, longer words are generally more interesting because they take longer to say, whereas the shorter words can fly by so quickly that they can be hard to comprehend. One exception, however, is the word know, all 32 occurrences of which fly by in under 5 seconds. But because the 26th know is so emphatic, it stands out against the rest.
I’m not sure if there are any other video concordances out there, but if there are, I would love to see them. Especially if the original source material is as compelling as the original Star Wars.
I’ve somehow managed to avoid the pop cultural phenomenon that is Game of Thrones. I’m aware that it exists, and that it’s adapted from a series of fantasy novels, but I’ve never seen an episode. An awareness of the show is hard to avoid. For example, one of my favorite podcasts, Nerdist, hosted by Chris Hardwick, references it all the time. I bring this up because one of the recent guests on the podcast was David J. Peterson, a linguist who created Dothraki, the language that is used by characters in Game of Thrones. (Actually, as Peterson explains, George R. R. Martin, the author of the novels, invented the language and then Peterson had to flesh it out further, develop the phonology, etc.)
So, if you’re interested in linguistics and Game of Thrones (or either of these things) you will probably enjoy Nerdist episode #502, in which Peterson goes into depth on creating Dothraki and several other topics. Please note, as often happens on the Nerdist, the hosts and guests occasionally drop an F-bomb or two out of enthusiasm, which means that the entire episode may not be appropriate for younger audiences. Enjoy your burrito!
Everyone loves a good data visualization. And everyone loves a good data visualization even more if the visualization is interactive. Unfortunately, I can’t embed an interactive visualization above, but click on it to link to the interactive version. The circles represent the volume of traffic at airports around the U.S. Clicking on a circle reveals all of the connecting flights to that airport. I’m sure you could get this information out of some kind of heinous Excel spreadsheet, but this format is way more engaging.
This is why I was attracted to this year’s Wherry Lecture, which is hosted by the Departments of Statistics and Psychology at Ohio State. The speaker was Amanda Cox from the New York Times’ graphics department, who spoke about the Times’ use of data visualizations. Amanda shared many examples that illustrated the importance of context, how a good visualization sometimes limits the amount of data in order to highlight patterns, and the importance of how the text and the visuals work together. These are a few of my favorites.
The Jobless Rate for People Like You – Not all groups have felt the recession equally. This visualization allows you to view trends in different demographics. The differences can be startling.
One Report, Diverging Perspectives – Employment numbers with “Democrat” and “Republican” buttons that allow you to view the same data through different lenses.
Counties Blue and Red, Moving Right and Left – Imagine a map of the wind blowing across the U.S. Now instead of that wind representing, well, wind, imagine it representing the change in vote margin between Democratic and Republican presidential candidates.
Mapping America: Every City, Every Block – Based on U.S. Census data from 2005 to 2009, you can choose to represent ethnicity, income, housing, education, and other information on a map and then zoom out to view the entire nation or zoom in to view your neighborhood.
All of these examples provide different paths to understanding the data that is represented. To see some of the other examples in this lecture, check out my Twitter stream (@eslchill) or follow the New York Times Graphics Department (@NYTgraphics).
Blue Water Silver Moon (Mermaid), 1991 by Kerry James Marshall. Photo copyright Dispatch.com.
I’m currently teaching one of my favorite classes: the Field Experience elective. In this class, I plan a series of field trips on and around campus so students can explore their community as well as English, the field they are studying.
One of our recent trips was to the Wexner Center for the Arts, the campus art gallery. The current show is Blues for Smoke, which explores Blues music as a “catalyst of experimentation within contemporary cultural production.” Works in the show span several decades and include a variety of media.
As part of our trip, I ask each student to identify a favorite piece, which we later discuss in class. One student chose the painting above. We had talked in front of the painting and I helped her understand some of the vocabulary in the information placard next to the work:
Marshall’s portrait of a mythical female nude lounging under the moonlight in a shimmering pond was inspired by a pulp comic book he was reading in the early 1980s. He notes, “Up until then, I had not considered that a black woman could be considered as a goddess of love and beauty. Even I took the classic European ideal for granted …. I wanted to develop a stylized representation of beauty that would be unequivocally black.”
We discussed how the painting includes faces from pulp romance novels that typify this “classic European ideal” for beauty and how the mermaid figure is beautiful and unequivocally black.
But what I interpreted as an interesting insight into the experience of African Americans was something that my student took to heart. The next day, she shared that this was her favorite piece because she, too, had felt the pressure to conform to this classic European ideal of beauty in her native China. For example, she and many of her friends stayed out of the sun so that their skin could be lighter and whiter. But, in this painting, she discovered that black is beautiful, an idea she could relate to and share.
I wouldn’t have guessed that this piece of art would strike this student in this way. But by exposing students to a wide variety of art, the opportunity for this to happen was created. Never underestimate the power of art. Or a good field trip.
In December 2012, Beck Hansen released an album called Song Reader in an extremely traditional way: on sheet music. Best known for genre-bending songs such as Loser and Where It’s At, Beck is blazing another new trail by reaching back to a format that predates recorded audio. But, why?
Well, in an age of Instructables, MakerBots, and GarageBand, making things has never seemed less intimidating. And with YouTube, you’ve got a way to share your creations whether you’ve played a song on your piano or mashed up a couple of hit songs into something new.
These songs are meant to be pulled apart and reshaped. The idea of them being played by choirs, brass bands, string ensembles, anything outside of traditional rock-band constructs—it’s interesting because it’s outside of where my songs normally exist. I thought a lot about making these songs playable and approachable, but still musically interesting. I think some of the best covers will reimagine the chord structure, take liberties with the melodies, the phrasing, even the lyrics themselves. There are no rules in interpretation.
In education, we talk about Massive Open Online Courses (MOOCs). Beck has released an album that is completely open to interpretation and assembly by the user. By trusting and empowering his listeners to participate in his music, Beck has created something much larger than just twenty songs. He has created a community.
Anyone can post their version of one of these songs to Song Reader.net via YouTube or Soundcloud. As more songs are performed and uploaded, the works will form a kind of dialog, with each one influencing the next.
Why mention this on ESL Technology.com? There are some parallels between this open approach to making an album and the open education movement. Trusting your students and empowering them to make decisions can be very scary — for both teachers and students. Letting students choose their own projects and then working with them to make sure the projects fit the curriculum is more difficult and time consuming, but it’s a process that can really infuse students with a sense of ownership over their work. Being responsible for their own learning is an important lesson for all students.
Opening your classroom to the real world (by making student videos and blog posts public, for example) can also be a scary, but rewarding, opportunity. Teaching in an open environment also means preparing students for the challenges in that real world — teaching strategies for dealing with griefers and phishing attacks, for example — which is probably some of the most useful learning they can carry forward from your classroom. They all have to join the real world eventually.
Is Song Reader a model that can guide your teaching? Not directly. But the novel way that this album has been conceptualized connects to some interesting ideas about how many are re-thinking traditional approaches to teaching.
For more on Beck’s album, visit Song Reader.net. Some of my favorite interpretations of “Old Shanghai”, the single that was released before the rest of the album was available, are below. If you’ve played “Old Shanghai” or anything else from Song Reader, please post a link to your work in the comments.
Recently, the Rhetoric, Composition, and Literacy Studies program in the Department of English at Ohio State was awarded a $50,000 grant from the Bill & Melinda Gates Foundation to create a “Writing II: Rhetorical Composition” MOOC. Read more details on the OSU Department of English website.
What’s a MOOC? MOOC stands for Massive Open Online Course. So, imagine an online course that is open and (typically) free to anyone who wants to register. In essence, MOOCs bring information technology’s promise of exponential scalability to education. And, obviously, there are some administrative challenges inherent to this kind of teaching. The recent spectacular failure of the “Fundamentals of Online Education: Planning and Application” course brought MOOCs attention from mainstream media, in part because the course topic made the failure irresistibly ironic.
Can anything be taught in an online classroom with tens of thousands of students? Apparently, yes. I have friends who have learned programming languages this way. Of course, programming languages are much simpler and easier to test (Does your program do this? Good! You pass the quiz!) than most human languages and particularly the advanced rhetoric of a language.
MOOCs frequently crowdsource some of the evaluation of student assignments (think peer editing), which may work well for advanced writing. But students who enroll expecting to receive 1/20th of the instructor’s attention, as they might in a traditional classroom, might be surprised by some of these techniques.
This is truly the cutting edge / Wild West of online learning. The good news is, if you’re interested in learning more, you (and all of your friends) can sign up for the course yourself via Coursera, a “social entrepreneurship company” that has partnered with OSU and many other universities to offer MOOCs.
So, maybe the question is, can everything be taught with MOOCs? It’s too early to answer that question. But lots of people are asking it.
Will MOOCs eventually replace traditional brick-and-mortar institutions? New technologies rarely replace old ones completely. For example, you have a television, but you probably still listen to the radio sometimes (in your car, when your iPod battery dies, say). But, if MOOCs are even moderately successful, it will be difficult for every school to compete with a free course offered by Harvard, MIT, or Stanford. Or Ohio State.
Have you ever been amazed by a TED Talks video? This is one of those. Using principles from the insect world, these robots communicate with each other in ways that allow them to interact and work together. These robots can map 3D spaces, build complex structures out of modular pieces, and even jump through hoops — literally.
This video doesn’t necessarily have a direct-to-classroom ESL application — though I’m sure it would get your students talking — but it is a pretty impressive demonstration of how far this technology has come. With the work that is being done with Microsoft Kinect in the DIY community, I wonder how long before we are building these in our backyard.
I read this article about a 3D printer that was recently unveiled at the Consumer Electronics Show and couldn’t help but get a bit excited. Sure, as the article points out, at $1300, this “affordable” printer may not be affordable for everyone. (It’s not for me.) But it’s getting closer to affordable.
The notion of being able to create or download a 3D image file on my computer, send it to the printer via a USB cable, and have the real object in my hand a few minutes (or a couple of hours) later is pretty amazing — and I’m not even in a business that does any rapid prototyping, nor do I have a burning need for my own custom designed neon ABS plastic chess set, two of the most often cited uses for such a device.
The best part will be watching the prices come down on these. They are a bit expensive now, but in five years, I could see myself forking over $500 for something like this. Especially if the media that is “printed” comes down in price as well.
I’m sure, in addition to being a fun, novel tool with which to experiment, I could find more and more uses for it once I had one. Kids break one part of their favorite toy? Make another! Wish this gadget were exactly the same but with a built-in loop for hanging it from a hook? No problem! Like something I have? I’ll scan it and email it to you and you can print one for yourself (almost) instantly! It’s a pretty exciting future.
I’m not really a big fan of baseball, but I do like sports and the discussion and statistics they generate. This graph crams in a lot of information. World Series winners from 1995 to 2009 are represented in pink (losers are in purple). Teams that had a better regular season record than the champions are above and those with a worse record are below (teams with identical records all appear in the same box); National League teams are in yellow; American League teams are in white; and wild card teams are in italics.
If this information were all presented in columns, it would be a bit hard to decipher. But shifting each column to align the winners puts the data in a new light. Whenever I think of this site, this is the data visualization I think of first. I’m not sure if there is a way to apply a similar approach to data generated in a classroom, but I imagine it would give a different perspective on students’ performance from the traditional bell curve.
Take a look around the site and you’ll get lots of different ideas for ways in which to represent data. You’ll probably also learn a few things about baseball and maybe find something that would get your ESL students talking.