Category Archives: Research

CoLE: Corpus of Learner English

photo of students taking an exam.

The exam by bitjungle.

So here’s the big project that’s been keeping me away from my blog for the last little while: the Corpus of Learner English (CoLE).  This is a project I have been working on for a couple of years and we are finally ready to start sharing it with the world.  Every step of the way has been an adventure, from designing the corpus, to applying for IRB approval, to compiling the data.  Here are the nitty-gritty details.

I’ve always been interested in corpora — more in the idea of them than in any particular research question.  A few years ago, I initiated the American Language Program’s (ALP — Ohio State’s intensive English program) transition from paper-based placement testing to computer-based testing.  Shortly after that, I started thinking about how much easier the 30-minute placement composition components would be to analyze.  Word counts, for example, could be compared with a couple of keystrokes.  And, of course, more complex comparisons were possible such as differences in pronoun use between male and female students.  (One interesting preliminary finding there was that our male students used a lot more first person pronouns while our female students used a lot more third person pronouns.  Was this some sort of cultural artifact?  Not my research question!  But it could be yours…)

A year or two after we moved to computer-based testing in ALP, Ohio State’s ESL Composition Program also moved their testing online, in part to make testing accessible to students before they arrive in the U.S.  Previously, students could not take their placement tests until they arrived.  Because test results were prerequisites for many classes, students often registered late and found many classes had already filled.  Again, I saw some data that could be an interesting corpus.

I talked with my colleague Jack Rouzer about the potential for such a corpus, and he was also very enthusiastic about the project.  We immediately began working out the details and submitted an IRB application.  This was my first experience with our IRB and it was an interesting one.  For one, I don’t think our IRB is as familiar with linguistic corpora (or even “data repositories” as the project was classified) as it is with medical testing or psychological experiments.  Once we were able to create a protocol that would reasonably protect our student participants’ privacy, we were approved.  Here’s what we came up with:

First, obviously, we ask for students’ informed consent.  We describe that we will make their writing available online in a de-identified way with only some demographic information attached.  In the corpus, we include each student’s age range, sex, country of origin, college of study, graduate or undergraduate status, and their placement level (1, 2, or out, which means they are exempt from taking ESL Composition classes).  We ask for their consent after they have written their essays so that they know exactly what will be included.  Second, we read each placement essay to be sure students don’t self-identify in any way within the content of their essays.  Third, we only include essays in the corpus for which we have at least fifty members of every demographic category.  So, for example, we will include an essay if it is written by one of 500 students aged 18-21, one of 1000 female students, one of 400 Chinese students, one of 400 College of Arts & Sciences students, one of 300 graduate students, and one of 200 students who placed into undergraduate level 2.  It is extremely unlikely that you would be able to identify who wrote this essay based on these demographics.  However, we would not release an essay written by one of 3 Botswanans or one of 20 students over age 25 because it is more likely that you could identify them if, for example, you know a student from Botswana.  The good news is that, as we include more and more essays each year, every population will go up and we will be able to include more essays in the corpus as this threshold is reached in different demographics.
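For readers who think in code, the third rule above could be sketched roughly like this in Python.  This is only an illustration under stated assumptions, not the actual CoLE process: the field names, the `releasable` function, and the choice to count each category once across the whole pool of consenting students are all hypothetical.

```python
from collections import Counter

# Hypothetical field names for the demographics listed in the post.
FIELDS = ("age_range", "sex", "country", "college", "status", "placement")
THRESHOLD = 50  # "at least fifty members of every demographic category"


def releasable(essays, fields=FIELDS, threshold=THRESHOLD):
    """Return the essays whose value for every field is shared by at
    least `threshold` essays in the pool (assumed single-pass count)."""
    counts = {f: Counter(e[f] for e in essays) for f in fields}
    return [
        e for e in essays
        if all(counts[f][e[f]] >= threshold for f in fields)
    ]


# Made-up pool: 60 students from China and 3 from Botswana,
# identical in every other category.
base = {"age_range": "18-21", "sex": "F", "college": "Engineering",
        "status": "undergraduate", "placement": "2"}
pool = [dict(base, country="China") for _ in range(60)]
pool += [dict(base, country="Botswana") for _ in range(3)]

print(len(releasable(pool)))  # 60: the three Botswanan essays are held back
```

Rerunning the same check each year as the pool grows would let held-back essays enter the corpus once their categories reach the threshold.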

In the first semester, we were only able to include male and female, grad and undergrad Chinese students under 25 years old in Business, Arts and Sciences, and Engineering with low and intermediate placements, but subsequent semesters have broadened the pool to include more age bands, countries, colleges, and placement levels.

If you are interested in accessing this corpus, please contact me for more information.

Filed under Projects, Research, Resources

Pop Psychology for Teachers

balloon popping

“Red Pop Three” by Brent Schneeman used under CC BY-NC 2.0

I originally intended this post to be about an article I came across on creativity, but as I looked around What Makes Them Click, I found that the whole site deserves a mention.

Susan Weinschenk, who writes this blog, draws on her 30+ years of experience applying her PhD in Psychology to the workplace.  She identifies interesting research articles and then summarizes them in a way that makes them very easy to apply to the workplace, including the classroom.  Some examples are below.

knitted duck on a street

“Afloat on Grey Street” by Nicola Stock used under CC BY-NC-SA 2.0

4 Types of Creativity

Evidently, there are four types of creativity, each a combination of cognitive / emotional and deliberate / spontaneous.  Thomas Edison, who is said to have gone through thousands of failed experiments before inventing something, is classified as cognitive and deliberate.  In contrast, artists and musicians tend to be spontaneous and emotional in their creativity.  Each type has different requirements in order to be successful.  For example, the Thomas Edisons need lots of knowledge and time, whereas artists and musicians require skill to create based on a spontaneous impulse.  So, there may not be a one-size-fits-all way to facilitate creativity in the classroom.

jack in the box toy

“Good Lord” by Kevin O’Mara used under CC BY-NC-ND 2.0


Your brain craves surprises.  This is, ironically, not a surprise to any good language teacher who fills lesson plans with a variety of activities to hold students’ interest.  This summary is based on a study which demonstrated that people find surprises more pleasurable than things they like.  How do they know?  They squirted fruit juice in people’s mouths.  Seriously.

blue screen of death -- Windows computer error

“BSOD 0x07B” by Justin used under CC BY 2.0

Error Strategies

This study looked at what strategies older and younger adults used when encountering an error when trying to use a new electronic device.  Some interesting differences: the older group didn’t receive meaningful hints from their actions or use their past knowledge as much as the younger group did.  These results may be particularly useful for teachers who integrate technology into their classrooms.  Common sense would have us believe that older adults would have different difficulties navigating a content management system for the first time.  Perhaps this study can help teachers to better anticipate these problems.

There are lots of other interesting studies summarized on this site.  Take a look around and if you find others that are particularly applicable to ESL teachers, leave a link in the comments.

Filed under Research

Mobile / Gaming Resources

Rubik's cube

Where to begin?

As you can probably tell from my recent flurry of posts, I’ve gotten a lot out of coming to CALICO.  This is a great conference with great people.  Everyone is extremely approachable even though their expertise usually seems intimidatingly beyond mine.  I wanted to share some of the gaming resources I’ve come across during this conference, some of which have begun to answer the questions I have been asking over the last couple of days.

10 Key Principles for Designing Video Games for Foreign Language Learning by Ravi Purushotma, Steven L. Thorne, and Julian Wheatley.  I’ve heard Steve speak a couple of times and have gotten a chance to get to know him.  He’s a real Renaissance man in that he pulls together research from pretty diverse fields in ways that can inform each (and then is as engaging a speaker as a “monkey on crack,” his description, which I only use in the most positive and appreciative sense).  There is some great guidance in this paper, which is grounded in SLA theory.

What might mobile media afford education? by David Gagnon.  A nice look at some possible uses for mobile learning including everything from repackaging existing content to mobile data collection and augmented reality.  At first glance it seems very futuristic and cutting edge (which it is) but much of it is already being developed.  The future is now.

Spoil-sports Save the Day on Wise.  Spoil-sports are defined as those who intentionally disrupt the game by ignoring its rules.  There is some really thought-provoking information on the importance of rules, but also on the necessity of breaking them, both in games and in life.

Filed under Research

How Is Technology Changing Learning?

Recently, as part of my final project for EDU P&L 823 – The Functions of the Computer in the Classroom, I asked the question “How is technology changing learning?” using six different channels of communication: on this blog, Twitter, Facebook, Flickr, via email and face-to-face.  The question was deliberately very open-ended and I received some very interesting responses.  But perhaps more interesting were the differences in how people responded on each of these channels.

Obviously, the channels that reached people with whom I had close connections (email, face-to-face) received a lot of responses.  Other, more ephemeral forms of communication, where connections are not as strong, received far fewer.  In some ways, this was a bit humbling: I have a hundred followers on Twitter and even more on Facebook, but the response rate was very low.  Perhaps the people with whom I communicate via these channels simply weren’t interested in this question?

Although these new channels (Twitter, Facebook) are changing communication, clearly they do not completely replace the others.  And perhaps integrating them all is the most effective approach.  Watch my final presentation below.


Filed under Research