Corpus Tools for English Teachers

typesetting letters for a printing press

I recently attended Ohio University’s annual CALL Conference where I discovered a handful of interesting corpus-based resources worth blogging about.  Most of these come from Chris DiStasio’s presentation “How Corpus-based Tools Can Benefit Your ESL Classroom” and from my subsequent exploration of them.

Corpus of Contemporary American English (COCA) – COCA is a huge (450 million words and counting) balanced corpus to which 20 million words have been added each year since 1990.  The interface takes some getting used to, but it is quite powerful.  You can search for frequency of words, frequency of collocates, structures based on part-of-speech, and much, much more.  One of the instructors in the highest level of our program asks his students to do searches based on the words in their vocabulary book.  From the collocates, they can identify the most frequent prototype strings or chunks.  These often sound far more native-like than what students (and in many cases, vocabulary textbook authors) come up with.  If you haven’t yet, take a few minutes (or hours) and explore COCA.
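For readers who like to see the idea behind a collocate search, here is a minimal Python sketch that counts the words appearing within two words of a target word.  It is not how COCA works internally; the function name and the tiny sample text are my own inventions for illustration.

```python
from collections import Counter
import re

def collocates(text, node, window=2):
    """Count words that appear within `window` tokens of `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the target word itself
                counts[tokens[j]] += 1
    return counts

# Invented sample text, standing in for a real corpus.
sample = (
    "She made a strong argument. He drinks strong coffee every morning. "
    "They make strong coffee here. A strong argument needs evidence."
)
print(collocates(sample, "strong").most_common(5))
```

On a real corpus the window would also be filtered by part of speech, which is what makes a tool like COCA so much more useful than a raw count.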

Word and Phrase.info – This site, which Chris shared in his presentation, at first seems to be the COCA corpus with a simplified interface.  But in addition to offering a simpler way to query COCA, it lets you upload texts and analyze them by word frequency band (the 500 most frequent words, the next 2500 most frequent, the least frequent, and “academic” words; a note on this last set is below), with each word linked to examples in the COCA corpus.  This can be a very useful tool for students who want a quick snapshot of how their writing compares to a target sample.  For example, if they aspire to be published in a given academic journal, they can upload an article (or several articles) from that journal and compare the analysis to their own writing.  As with the COCA interface, there are lots of other features that warrant further exploration.
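The frequency-band analysis can be sketched in a few lines of Python.  This is only a rough illustration under my own assumptions: the ranks in TOY_RANKS are invented, the “academic” category is left out, and a real profile would load a full frequency list.

```python
from collections import Counter
import re

# Invented ranks for illustration only, not taken from any real word list.
TOY_RANKS = {
    "the": 1, "of": 4, "a": 5, "in": 6, "study": 320, "data": 480,
    "results": 610, "method": 900, "analysis": 1200, "hypothesis": 3400,
}

BANDS = ("1-500", "501-3000", "3000+")

def band(word, ranks=TOY_RANKS):
    """Return the frequency band for a word; unknown words count as rare."""
    rank = ranks.get(word)
    if rank is None or rank > 3000:
        return "3000+"
    if rank <= 500:
        return "1-500"
    return "501-3000"

def profile(text):
    """Return the share of tokens in `text` that falls in each band."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {}
    counts = Counter(band(t) for t in tokens)
    return {b: counts[b] / len(tokens) for b in BANDS}

print(profile("The analysis of the data in a study"))
```

Running the same function on a student essay and on a published article gives two sets of percentages that can be compared side by side.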

Academic Vocabulary Lists – My curiosity about what Word and Phrase.info defined as an “academic” word led me to this site, which describes how the Academic Vocabulary List (AVL) was created.  Like the Academic Word List (AWL) that Averil Coxhead developed in 2000, the AVL is a corpus-based list of vocabulary words that appear with higher frequency in academic texts.  In both cases, high frequency words are first omitted, leaving only academic words.  But whereas Coxhead built her own 3.5 million-word academic corpus and omitted the General Service List (GSL), a list that has been around since 1953, the AVL is based entirely on the 120 million-word academic portion of the COCA corpus.  Its creators claim better coverage of the COCA academic corpus (14%) compared to the AWL (7.2%).  And although I find this logic a bit circular (how could a list based on a given corpus not cover that corpus better than a list based on a different corpus?), the development of a more recent (2013) list of academic vocabulary is intriguing.
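“Coverage” here simply means the share of running words in a text that appear on the list.  The Python sketch below shows the arithmetic under simplifying assumptions of my own: it matches exact word forms rather than the lemmas or word families that real lists use, and the sample sentence and mini word list are invented.

```python
import re

def coverage(text, word_list):
    """Return the fraction of tokens in `text` that appear in `word_list`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in word_list)
    return hits / len(tokens)

# Invented example: 4 of the 11 tokens are on the mini list.
sentence = "The data support the theory and the analysis confirms the theory"
mini_list = {"data", "theory", "analysis"}
print(f"{coverage(sentence, mini_list):.1%}")
```

Measuring two lists against the same text is then just two calls to the same function, which makes it easy to see why a list built from a corpus tends to score well on that corpus.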

Just The Word.com – This is another resource described by Chris in his presentation.  This website, based on the 80 million-word British National Corpus (BNC), offers an even simpler, Google-inspired interface.  The user enters a word or phrase in the search box and clicks on one of three buttons: Combinations, which provides collocates; Alternatives from Thesaurus, which links to the phrase with one or more words replaced with synonyms to show the strength of the links between words in the original phrase; and Alternatives from Learner Errors, which purports to link to actual user errors, but I wasn’t able to see much difference between this and Alternatives from Thesaurus.  Although simpler, this tool took me a few tries to get the hang of.  For example, Alternatives from Thesaurus only works with phrases, which I did not immediately realize.  But aside from this initial learning curve, this tool is a very straightforward way for students to easily search for collocates and to learn more about the nativeness of their word choices.  And, like Word and Phrase.info, search results are linked to the corpus for quick and easy access to multiple authentic examples.

If you use these tools, use them in ways other than I’ve described, or know of others, let me know in the Comments.

History For All

Roman Colosseum

How Earth Made Us is a documentary series produced by the BBC.  Like many BBC programs, the cinematography is spectacular.  But perhaps more interesting is the approach the program takes to history.  Instead of only examining human interactions, the program focuses on how natural forces such as geology, geography, and climate have shaped history.  And the whole series is available on YouTube.

In the first episode, Water, host Iain Stewart explores the effects that extreme conditions have had on human development.  He visits the Sahara Desert, which receives less than a centimeter of rainfall each year, and Tonlé Sap, which swells to become the largest freshwater lake in Southeast Asia during monsoon season.  The contrast is striking.  One interesting factoid is that the world’s reservoirs now hold 10,000 cubic kilometers of water (2400 cubic miles).  Because most of these reservoirs are in the northern hemisphere, they have actually affected the earth’s rotation very slightly.

The second episode, Deep Earth, begins in a stunning crystal cave in Mexico, in which crystals have grown to several meters long.  The cave, which lies about 300 meters below the earth’s surface, was discovered by accident when miners broke into it.  I can’t imagine what they thought when they first set foot inside.

The third episode, Wind, explores the trade winds, which spread trade and colonization and led to the beginning of globalization.  This brought fortune to some who exploited resources and tragedy to others who were enslaved.  The view from the doorway through which thousands of Africans passed on their way to the Americas is a chilling reminder of this period of history.

Fire, the fourth episode, moves from cultures that held the flame as sacred, to the role of carbon in everything from plants to diamonds to flames.  And carbon is also the basis of petroleum, which has powered the growth of humankind.  Several methods of extracting crude oil around the world are explored.

The final episode, Human Planet, turns the equation around, tying the first four episodes together by looking at how humans have had an impact on the earth.  One of the most compelling examples is the Great Pacific Garbage Patch, which is the result of ocean currents bringing plastic and other debris from countries around the Pacific Rim.  This garbage collects, is broken down by the sun, and eventually settles to the bottom to become part of the earth’s crust.  This is juxtaposed with rock strata in the Grand Canyon, pointing out that eventually, one layer of rock under the garbage patch in the Pacific will be made up of this debris.

In all, there is almost 5 hours of documentary video here.  It is a compelling production with spectacular imagery.  There are any number of ways to use these videos with an ESL class.  And because they are available on YouTube, there are even more options available to an ESL instructor.  Instead of everyone watching together in the classroom, the videos can be posted in an online content management system and students can watch them anywhere, anytime on their laptops and smartphones, if they have access to that kind of technology.  And if the videos are being watched outside of the classroom, different groups of students can be assigned different episodes and then talk with classmates who watched other ones.  The ubiquity of online video can bring learning to students outside of the classroom.

Genetics for Kids

test tubes

Over the past ten or twenty years, the news media has become saturated with stories about genetics.  But do you really understand how genes interact?  A new genetics simulation being developed at Ohio State can help.

The simulation begins with a series of cartoon faces from which the user can choose to populate the gene pool for the next generation.  (The term “parents” is used, but more than two can be selected.)  This process can be repeated several times to create successive generations of cartoon faces.

Over 50 “genes” are incorporated into the faces (affecting everything from the dimensions of the head and other features to how asymmetrical the face is and whether the eyes follow your mouse or not) and the genes of the “parents” interact to produce the subsequent generation.  You can also adjust the amount of mutation, which leads to a wider (or narrower) variety of offspring.
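I don’t know how the simulation is actually coded, but the basic idea of mixing parents’ genes and adding mutation can be sketched in Python.  In this hypothetical version, a face is just a list of numbers, each gene is copied from a randomly chosen parent, and the mutation setting controls how much random noise is added.

```python
import random

def breed(parents, mutation=0.0, rng=random):
    """Build one child from two or more parents.

    Each gene is copied from a randomly chosen parent, then nudged by
    Gaussian noise whose spread is set by `mutation` (0.0 means no change).
    """
    n_genes = len(parents[0])
    child = []
    for g in range(n_genes):
        value = rng.choice(parents)[g]
        value += rng.gauss(0.0, mutation)
        child.append(value)
    return child

# Hypothetical three-gene faces: head width, eye size, asymmetry.
mom = [1.0, 2.0, 3.0]
dad = [4.0, 5.0, 6.0]
rng = random.Random(0)  # fixed seed so the run is repeatable
print(breed([mom, dad], mutation=0.0, rng=rng))
print(breed([mom, dad], mutation=0.5, rng=rng))
```

With mutation at zero, every gene in the child matches one parent exactly; raising it produces offspring that drift away from both parents, which is the wider variety described above.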

Another interesting feature is the ability to view genotypes.  This allows you to view a graph under each offspring representing which genes come from which parent.  You can also choose two faces and drag them to the Gene Exam Room to view to what degree each gene is represented in each face.  This also allows you to see the effect of each individual gene.  You can even increase or decrease the representation of each gene to see how it changes each face.

What can you (or your students) do with this simulation?  Imagine the faces are puppies and you want to develop a new breed that is cute (or whatever other trait you’re interested in).  This simulation clearly demonstrates how breeders (of animals, plants, etc.) select for certain traits and refine them over generations.

Or imagine the choices you make in the simulation are not choices, but represent the effects of the environment.  For example, say the Sun grows dim, giving people with big eyes that can see in low light an advantage over people with small eyes.  This advantage results in a higher percentage of offspring surviving and a wider representation in the gene pool.  What effect would this have after several generations?
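To get a feel for the answer, here is a deliberately crude Python sketch of the dim-Sun scenario.  It is my own simplification, not the simulation’s method: each person is reduced to a single eye-size number, only the big-eyed half of each generation reproduces, and there is no mutation.

```python
import random
from statistics import mean

def next_generation(eye_sizes, rng):
    """Keep the big-eyed half as parents; each child copies one parent."""
    survivors = sorted(eye_sizes)[len(eye_sizes) // 2:]
    return [rng.choice(survivors) for _ in eye_sizes]

rng = random.Random(1)  # fixed seed so the run is repeatable
population = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # invented starting eye sizes
history = [mean(population)]
for _ in range(5):
    population = next_generation(population, rng)
    history.append(mean(population))

print(history)  # average eye size, one entry per generation
```

The average jumps as soon as the small-eyed half stops contributing offspring, but without mutation it can never exceed the largest eye size in the original population.  That limit is a nice discussion point in itself.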

Think of how much richer students’ discussions of designer pets and natural disasters will be after they have “experienced” the process instead of just reading about it.  In addition to genetics, this simulation can also stimulate interest in probability (how likely are offspring to have certain characteristics), design (ideas behind evolutionary design were the impetus for the interface), as well as all of the social issues behind decisions we are now able to make regarding genetics.

In terms of ESL teaching, I think giving students something interesting to do and then having them talk or write about it is a great way to get them to practice English.  This genetics simulation is simple but engaging enough that it could generate lots of ideas for students to talk about.

21st Century Newspapers

rolled up newspapers

A long, long time ago (maybe 6 or 7 years now) I taught an elective ESL class centered around a student newspaper.  We tried various formats including weekly, monthly, and quarterly editions, which ranged from 2 to 32 pages.  We also experimented with various online editions, but at the time that mostly consisted of cutting and pasting the documents into HTML pages.

Fast-forward to 2011 and look how online publishing has changed.  Blogs are ubiquitous, if not approaching passé.  Everyone but my Mom has a Facebook page.  (Don’t worry, my aunts fill her in.)  And many people get news, sports scores, Twitter posts, friends’ Facebook updates, and other information of interest pushed directly to their smartphones.

It’s no surprise, then, that a website like paper.li has found its niche.  The slogan for paper.li is “Create your newspaper.  Today.”  Essentially, paper.li is an RSS aggregator in the form of a newspaper.  RSS aggregators are nothing new (see iGoogle, My Yahoo!, etc.).  As the name implies, the user selects a variety of different feeds from favorite blogs, people on Twitter, Facebook friends, etc. and aggregates the updates onto one page.
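If you are curious what “aggregating” amounts to, the Python sketch below merges the items from two RSS feeds into one list sorted newest first.  It says nothing about how paper.li itself is built; the feed contents are invented and are read from strings rather than fetched from the web.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

def parse_feed(xml_text):
    """Return a list of items (source, title, published) from one RSS feed."""
    root = ET.fromstring(xml_text)
    source = root.findtext("channel/title", default="(untitled)")
    items = []
    for item in root.iter("item"):
        items.append({
            "source": source,
            "title": item.findtext("title", default=""),
            "published": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items

def aggregate(feeds):
    """Merge the items from several feeds, newest first."""
    merged = [entry for xml_text in feeds for entry in parse_feed(xml_text)]
    return sorted(merged, key=lambda e: e["published"], reverse=True)

# Two invented feeds standing in for a class blog and a campus news page.
CLASS_BLOG = """<rss version="2.0"><channel><title>Class Blog</title>
<item><title>Field trip photos</title>
<pubDate>Mon, 07 Mar 2011 09:00:00 +0000</pubDate></item>
<item><title>Welcome back</title>
<pubDate>Sun, 06 Mar 2011 18:00:00 +0000</pubDate></item>
</channel></rss>"""

CAMPUS_NEWS = """<rss version="2.0"><channel><title>Campus News</title>
<item><title>Library hours extended</title>
<pubDate>Tue, 08 Mar 2011 12:30:00 +0000</pubDate></item>
</channel></rss>"""

for entry in aggregate([CLASS_BLOG, CAMPUS_NEWS]):
    print(entry["published"].date(), entry["source"], "-", entry["title"])
```

A newspaper-style layout is then a matter of presentation: the same merged list, arranged in columns by source.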

The twist with paper.li is that the aggregated page looks very much like a newspaper, or at least a newspaper’s website.  For people not on Twitter, Facebook, and Tumblr, paper.li might feel much more comfortable.  Also, publicizing one’s pages seems to be built right in to paper.li’s source code.  I say that because I first learned of paper.li when I read a tweet that said a new edition of that person’s paper was out featuring me.  How flattering!  Of course, I had to take a look.

Would paper.li be a good platform to relaunch a student newspaper?  It might.  If students have multiple blogs, paper.li could certainly aggregate the most recent posts into one convenient location.  Other feeds could easily be incorporated as well.  (Think of this as akin to your local community newspaper printing stories from the Associated Press.)  The most recent news stories about your city or region, updates from your institution’s website, and photos posted to Flickr tagged with your city or school name could each be a column in your paper.li paper right beside the articles crafted by the students themselves.  You could even include updates from other paper.li papers.

To see examples of paper.li papers, visit the paper.li website.  (And note that .li is the website suffix — no need to type .com no matter how automatically your fingers try to do so.)  You can search paper.li for existing papers to see what is possible.  A search for ESL, for example, brought up 5 pages of examples, some with hundreds of followers.  Take a look.  You might just get an idea for your own paper.li.

Interacting With Video

hand in monitor

#edtech #esl YouTube annotations provide a discussion space layered onto each video.

In my previous post, Interactive Videos, I shared some examples of YouTube videos that use some of the site’s new interactive features: overlaid buttons and links that can take you to a different segment of the video or to a different video or website entirely.

These kinds of pop-up messages have been crowding onto YouTube videos since this feature became available.  If used gratuitously, they are annoying, but when used to add supplemental information, they can be quite useful.  As one example, take a look at the video tutorial for making the above image.  It’s a straightforward and informative two-minute video.  At about the 1:30 mark, some red text appears that seems to be essential information that was omitted in the original shooting of the video.  Adding a quick note is a simple solution that does not require reshooting the video.

But there must be more we can do with these tools.  I’d been thinking about some different ways to incorporate these techniques when I came across a presentation made by Craig Howard at the Indiana University Foreign / Second Language Share Fair.  The page includes a recording of the presentation, a handout that summarizes how to annotate YouTube videos, and a link to an example video, which I’ve included below.

The nice thing about this approach is that a video, in this case a video for teachers-in-training to discuss, can include the online conversation layered right on top of the video.  Comments by different speakers can be made in different colors and the length of time they are displayed can easily be adjusted as appropriate.  Of course, everyone involved needs to have free Google or Gmail accounts to sign in, and the video must be configured to allow annotations by people other than the person who uploaded it.

The ability to integrate video materials and online discussion so seamlessly opens up real potential for interacting with videos in new ways.  I’ve recently looked at some options for online bulletin boards / sticky notes, including Google Docs, but incorporating this style of discussion directly onto the video is fantastic.

I’m still kicking around different options for making YouTube videos more interactive.  If you have other examples or ideas, please share them in the comments below.

Interactive Videos

mocap character

When I hear the phrase interactive videos, I think of people covered in fluorescent mocap ping-pong balls or choppy, Choose Your Own Adventure-style stories like Dragon’s Lair.  And there are those.  But it seems that some creative tinkerers have pushed the envelope with some of YouTube’s interactive features and come up with some interesting results.

How can they be used with ESL and EFL students?  Well, in addition to viewing and interacting with the videos and then discussing or reporting on the experience, students could be challenged to determine how the videos were made.  For the more ambitious, students could make their own videos using the same techniques.  Some of them, like the Oscars find-the-difference photo challenge, would be relatively easy to remake.

For more interactive videos that will get your students talking, watch 15 Awesome YouTube Tricks.

Online Bulletin Boards

bulletin board

Most schools and classrooms have bulletin boards, but what is the online equivalent?  If you are using a course management system, there are lots of built-in tools that approximate this experience.  But if not, there are various sites that offer lots of options for interaction between users.

They can be used asynchronously so that people can leave messages anytime and the conversation happens over a long period of time.  They could also be used in real time so that users can interact in a very visual environment.  Messages can be various sizes, color-coded, and dragged around so they can be grouped together in various ways.


One online bulletin board is Wallwisher.com, which allows a user to create a wall to which other users can add “sticky notes.”  It’s quick and easy to use, but unfortunately it appears to be a victim of its own success: in my recent experience the site is not loading quickly, possibly due to being overwhelmed by a large volume of users.  If these issues can be worked out, Wallwisher will be a very useful tool.


A very similar tool is Stixy, which allows sticky notes and other items (photos, documents, and dated to-do list items) to be posted on the wall.  Clicking on an item opens a menu with lots of options for color, font, and placement (in the front or in the back, relative to the other notes).  You can also lock certain notes so that instructions or introductions, for example, can’t be moved around like the rest of the notes.  And the site doesn’t seem to have any problems loading due to demand.  Yet.


This site also allows the creation of sticky notes, including very small word-sized stickies, which could work very well on an interactive whiteboard as a way to make fridge-magnet-poetry draggable words.

Google Docs

In addition to the sticky-specific applications above, it’s worth noting that documents created in Google Docs can be configured to be edited by a group of people.  Create a new document and use different colored boxes in place of stickies and the same effect can be achieved.


For information on these tools and others, visit The Pursuit of Technology Integration Happiness, which includes several examples that you can test drive.

