Category Archives: Projects

CoLE: Corpus of Learner English

photo of students taking an exam.

The exam by bitjungle.

So here’s the big project that’s been keeping me away from my blog for the last little while: the Corpus of Learner English (CoLE). This is a project I have been working on for a couple of years, and we are finally ready to start sharing it with the world. Every step of the way has been an adventure, from designing the corpus, to applying for IRB approval, to compiling the data. Here are the nitty-gritty details.

I’ve always been interested in corpora — more in the idea of them than in any particular research question.  A few years ago, I initiated the American Language Program’s (ALP — Ohio State’s intensive English program) transition from paper-based placement testing to computer-based testing.  Shortly after that, I started thinking about how much easier the 30-minute placement composition components would be to analyze.  Word counts, for example, could be compared with a couple of keystrokes.  And, of course, more complex comparisons were possible such as differences in pronoun use between male and female students.  (One interesting preliminary finding there was that our male students used a lot more first person pronouns while our female students used a lot more third person pronouns.  Was this some sort of cultural artifact?  Not my research question!  But it could be yours…)

A year or two after we moved to computer-based testing in ALP, Ohio State’s ESL Composition Program also moved their testing online, in part to make testing accessible to students before they arrive in the U.S. Previously, students could not take their placement tests until they arrived. Because test results were prerequisites for many classes, students often registered late and found that many classes had already filled. Again, I saw some data that could be an interesting corpus.

I talked with my colleague Jack Rouzer about the potential for such a corpus, and he was also very enthusiastic about the project.  We immediately began working out the details and submitted an IRB application.  This was my first experience with our IRB and it was an interesting one.  For one, I don’t think our IRB is as familiar with linguistic corpora (or even “data repositories” as the project was classified) as it is with medical testing or psychological experiments.  Once we were able to create a protocol that would reasonably protect our student participants’ privacy, we were approved.  Here’s what we came up with:

First, obviously, we ask for students’ informed consent. We explain that we will make their writing available online in a de-identified way with only some demographic information attached. In the corpus, we include each student’s age range, sex, country of origin, college of study, graduate or undergraduate status, and placement level (1, 2, or out, which means they are exempt from taking ESL Composition classes). We ask for their consent after they have written their essays so that they know exactly what will be included. Second, we read each placement essay to be sure students don’t self-identify in any way within the content of their essays. Third, we only include an essay in the corpus if its author belongs to a group of at least fifty students in every demographic category. So, for example, we will include an essay if it is written by one of 500 students aged 18-21, one of 1000 female students, one of 400 Chinese students, one of 400 College of Arts & Sciences students, one of 300 graduate students, and one of 200 students who placed into undergraduate level 2. It is extremely unlikely that you would be able to identify who wrote this essay based on these demographics. However, we would not release an essay written by one of 3 Botswanans or one of 20 students over age 25 because it is more likely that you could identify them if, for example, you know a student from Botswana. The good news is that, as we include more and more essays each year, every population will grow and we will be able to include more essays in the corpus as this threshold is reached in different demographics.
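For anyone who wants to try a similar release rule on their own data, here is a minimal sketch of the third step. It is not our actual script: the field names and the sample records are made up for illustration. As in the example above, it checks each demographic category separately, not combinations of categories.

```python
from collections import Counter

MIN_GROUP_SIZE = 50  # threshold described in the protocol above

# Hypothetical field names; real corpus metadata may be labeled differently.
FIELDS = ("age_range", "sex", "country", "college", "status", "placement")


def releasable(essays, min_size=MIN_GROUP_SIZE):
    """Return essays whose value in every field is shared by at least min_size essays."""
    counts = {field: Counter(e[field] for e in essays) for field in FIELDS}
    return [
        e for e in essays
        if all(counts[field][e[field]] >= min_size for field in FIELDS)
    ]


# Made-up records, with a tiny threshold so the effect is visible.
base = {"age_range": "18-21", "sex": "F", "country": "China",
        "college": "Engineering", "status": "undergrad", "placement": "1"}
essays = [
    {"id": 1, **base},
    {"id": 2, **base},
    {"id": 3, **base, "country": "Botswana"},
]
print([e["id"] for e in releasable(essays, min_size=2)])  # → [1, 2]
```

Essay 3 is held back only because its country group is too small. Once more essays from that country arrive, rerunning the same check would release it.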

In the first semester, we were only able to include male and female, grad and undergrad Chinese students under 25 years old in Business, Arts and Sciences, and Engineering with low and intermediate placements, but subsequent semesters have broadened the pool to include more age bands, countries, colleges, and placement levels.

If you are interested in accessing this corpus please contact me for more information.

Filed under Projects, Research, Resources

Studio Usage Heat Map

studio usage heat map - by day

If you’ve been following along, you know that I’ve been working to pull together a recording studio on a budget. Our first steps were clearing out the old office that was destined to become the studio, working on minimizing the echo in the room, and painting one wall Sparkling Apple to use as a green screen. This is where our first $100 went. Next, we spent another $50 or so to light both the green screen and the talent in front of it. I’m currently working on sorting out the best solution for audio and video. (Stay tuned for updates!)

Fortunately, the lack of A/V equipment hasn’t prevented our staff from using the studio.  In fact, since the doors first opened in July, it has seen over 150 hours of use.  At this point, it is interesting to look at the patterns of usage that have emerged. Thus, the heat map, above.

To make the heat map, I added a “1” to each half-hour timeslot that the studio was reserved each week in an Excel spreadsheet. I then color-coded the data in the sheet with hotter colors reflecting higher numbers. The colors help to visualize trends in usage. For example, usage increases as the week goes on with Thursday and Friday afternoons appearing in oranges and reds. In contrast, there are times early on Monday and Tuesday that have never been reserved.

Studio usage heat map - by week

I also have a heat map that compresses all of the days into one, which I made by totaling the times for each half-hour block on the spreadsheet and then color-coding it. Again, it’s pretty easy to see the studio warm up as the day goes on, indicating increased usage. Having a couple of regular evening reservations also contributes to this pattern.
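If you would rather script the counting than type a “1” into each cell, the tallying behind both heat maps can be sketched in a few lines. This is only an illustration of the approach I took in Excel: the reservation list and its (day, start, end) format are invented for the example.

```python
from collections import Counter

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]

# Hypothetical reservations: (day, start, end), times in 24-hour "HH:MM".
reservations = [
    ("Thu", "13:00", "14:30"),
    ("Fri", "13:30", "15:00"),
    ("Thu", "13:30", "14:00"),
]


def to_minutes(hhmm):
    """Convert "HH:MM" to minutes after midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def half_hour_slots(start, end):
    """Yield the start time of each half-hour slot from start up to (not including) end."""
    for m in range(to_minutes(start), to_minutes(end), 30):
        yield f"{m // 60:02d}:{m % 60:02d}"


by_day = Counter()   # (day, slot) -> number of reservations (the by-day map)
by_slot = Counter()  # slot -> reservations across all days (the compressed map)

for day, start, end in reservations:
    for slot in half_hour_slots(start, end):
        by_day[(day, slot)] += 1
        by_slot[slot] += 1

# One row per half-hour slot: counts for each day, then the all-days total.
for slot in sorted(by_slot):
    print(slot, [by_day[(day, slot)] for day in DAYS], by_slot[slot])
```

The color-coding itself is left out. In a spreadsheet, conditional formatting on these counts gives the hotter-is-busier effect.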

Color coding numbers in a spreadsheet isn’t rocket science, but it is an easy way to visualize the data to quickly get a read on the studio. And, I can see that I’m going to have to start coming in earlier on Mondays if I want to use the studio.

Filed under Projects

Build a $150 Studio

Our $100 studio gets $50 worth of lighting.

If you’ve been following along, you’ve already read about the $100 studio we built in an old office to record better audio and video resources for our students. We’ve recently installed $50 worth of lights to get the studio ready for video production.  Here’s what we used:

Item                            Qty   Unit cost   Total
4′ two-light shop light         2     $14.98      $29.96
8 1/2″ clamp light              2     $7.85       $15.70
CFL bulbs – daylight (2 pack)   1     $9.98       $9.98
Total                                             $55.64

Again, we did come a few dollars over our target of $50, but we’re in the neighborhood. Our list does not include bulbs for the shop lights (I brought in four bulbs from a twelve-pack I had in my garage) or the power strips we plugged the lights into because we scrounged those from around the office.
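If you are adapting this shopping list to your own prices, a quick tally like the one below keeps the running total honest. It is just a sketch using the three purchases from the table above. Prices are stored in cents to avoid rounding surprises.

```python
# (item, quantity, unit price in cents), copied from the table above
items = [
    ("4' two-light shop light", 2, 1498),
    ('8 1/2" clamp light', 2, 785),
    ("CFL bulbs, daylight (2 pack)", 1, 998),
]
TARGET_CENTS = 5000  # our $50 lighting target

total = sum(quantity * price for _, quantity, price in items)
over = total - TARGET_CENTS

print(f"Total: ${total / 100:.2f} (${over / 100:.2f} over target)")
# → Total: $55.64 ($5.64 over target)
```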

IMG_4536

The installation was relatively straightforward. We hung the shop lights as close to our green screen wall as possible in order to wash the wall with light evenly. An evenly lit green screen is easier to replace with another image or video in postproduction using iMovie or a similar application. We attached a paper baffle using magnets to try to keep the light from the shop lights from backlighting the subject. Green paper was not necessary, but it was readily available so we used it.

IMG_4535

We hung the clamp lights from the ceiling at approximately a 45-degree angle from the subject. The goal is to light the subject from just above her eyes, which means these lights may be a little high, but the ceiling was an easy way to hang them and keep them out of the way. We used binder clips to attach parchment paper over the bulbs to diffuse the light, making it less harsh. In the photo, you can see that we have added a second light (for two on each side). We did this to make sure there was plenty of light on the subject. Although the CFL lightbulbs do warm up and become brighter after about five minutes, they still have to compete with all of the light reflecting off of the green screen. So, we added the second set of lights to be sure there was plenty of light, though these may not be absolutely necessary.

Each set of lights, left and right, is plugged into a power strip on the wall. None of the lights have switches, so the switch on the power strip becomes an easy way to turn them on and off without having to plug or unplug them. Finally, the last critical detail was to get “daylight” bulbs rated at 6500K. This is the best color temperature for most cameras. Fortunately, daylight bulbs were easy to acquire and not any more expensive than other temperatures (warm, cool, etc.).

So, for a few bucks at your local home improvement warehouse, you can find plenty of lights to outfit your studio on a budget. Our next step is to test a few camera / microphone combinations to see what will fit our budget and be quick and easy to use for anyone in our program who wants to make a video. Stay tuned.

Filed under Projects

Building a $100 Studio

panorama 3a_small

Like many educators, we find ourselves producing more and more online content. Currently, to record audio, we try to find a quiet room and record directly onto our laptops, which makes for pretty lousy audio. For video, the process is the same, including stacking furniture and books to get the webcam in our laptops to the best possible position. Far from ideal. As we move to more and more audio and video production, the lack of a dedicated studio space is becoming an issue. So, we decided to build a dedicated studio.

As for most educational organizations, cost is a big factor for us. We just don’t have thousands of dollars to throw at the latest 4K cameras. We also don’t need a full-blown Hollywood studio to make materials for our students to view on the web. We started by looking at acoustical foam as a way to insulate our space, but this quickly added up to hundreds of dollars for our 10′ x 12′ room. Our search for other options led us to Justin Troyer, OSU’s resident media services expert and author of Medialogue, who showed us a studio on campus that he had insulated with mover’s blankets. These looked to be a solution to some of our biggest audio issues because they would both help to block out external noise and reduce the echo within the room.

We had also been struggling with what sort of background to use for video production. We were leaning towards a velvet or velour curtain in a neutral color because it would help to further absorb the echo within the studio. But that fabric is expensive and it would lock us into a single background for every video, which is not ideal. Justin suggested a green screen, which can be removed digitally and replaced with almost anything. He has several different-sized pop-up green screens which are easy to put behind the video subjects. But in the end we decided to go with another option he suggested: paint a wall green. This saves both money and space because the wall does not have to be set up or stored when not in use.

So, after starting with an empty office space, we used the following items to create our studio:

Item                                                                    Qty   Unit cost   Total
Mover’s Blankets – Harbor Freight                                       6     $7.99       $47.94
Light-Duty Ceiling Hooks – Home Depot (4 pack)                          4     $1.49       $5.96
Gallon Behr Premium Plus Ultra Interior Latex Paint – Sparkling Apple   1     $30.98      $30.98
Assorted painting sundries (roller covers, masking tape)                -     -           $15.87
Total                                                                                     $100.75

We came in just over $100, which is pretty close to our target.  Included in the costs are items that got used and disposed of while we were painting (roller covers and masking tape) but not items that I already had at home that I brought in to use (paint roller, roller tray, brushes).  I also filled in a few holes in the wall with my own putty and putty knife.  You may need to factor in additional costs if you don’t have access to these basic tools.

In the end, we incurred one final cost, which was to purchase a short curtain rod and rings that allow us to slide the mover’s blanket out from in front of the door, which makes getting in and out much, much easier. The rod and rings cost just under $22.

Now the real fun begins. You can see from the picture that we already have a small table, chair, microphone stand, and camera tripod. The table will be used for straight audio recording, which is why we wrapped the end of one mover’s blanket around it to enclose it on three sides. We still need to find a microphone or two, a video camera, and some lights. Stay tuned as we work on acquiring these items to complete our studio.

Filed under Projects

Paper-based Games for ESL Students

dice

At the inaugural Playful Learning Summit at Ohio University, I shared a couple of games that I developed for use with ESL students at Ohio State. These are both paper-based games, which stood out in a room full of computer games and an Oculus Rift connected to a Kinect. This last project — an immersive, gesture-controlled, virtual reality interface — was really cool, but isn’t something I know how to develop (yet).  But, fortunately, everyone gets paper.  I hope these two games serve as an inspiration for anyone who doesn’t think she can design a game for her students.

Football Simulation – I’ve posted about this one before, but it still stands as an easy-to-prepare, easy-to-play simulation that can help international students to understand the game of American football.  The focus, when I use the game in the classroom, is to understand what down and distance are as well as the importance of basic offensive and defensive strategies.  All that is required is one six-sided die and a printout of the document with the offense and defense  cards cut out.

Orientation to Campus Game – This is a board game I developed based on the Madeline board game. Players travel around the campus map / board uncovering tokens when they land next to them. If the player uncovers one of the 5 buckeye symbols, she keeps it. If the player uncovers the name of a building, she must move to that space immediately. The best things about this game are that it is very easy to play and that students really focus and pay attention to the most important buildings on the map. There are no dice and you can use almost anything for player tokens. I also really like the mechanic of moving to the place listed on the token because this changes every time the game is played. On the downside, it is a kids’ game, so it doesn’t hold adults’ attention for very long. And if the students have been on campus for even a couple of weeks, they are already familiar with most of the buildings in the game. Still, this game could be useful for students to play while waiting for our orientation program to start because it might help them to discover buildings that they do not yet know.

So, don’t be afraid of developing games on paper if, like me, you don’t have a wide array of programming skills.  Any game that is prototyped and play-tested on paper could later be converted to a computer version.  But, by working out the kinks on paper, you can develop your game to its final version without even picking up your keyboard.

Filed under Projects

Create a Second Screen Video Experience in the Classroom

zits comic

Popular television shows like Breaking Bad and The Walking Dead offer second screen experiences called “Story Sync” that let viewers engage with additional content on their tablets and laptops while they watch. Free online polling software can be used to quickly and easily create a similar experience for students in the classroom. In this workshop at OSU’s Innovate Conference, participants will see an example second screen experience, learn about student reactions to this approach, and create their own, which will be shared during the workshop.

Examples you can use

You can use the following videos and screenshots in the second screen experience you create as part of this workshop, or you can use your own.  You can pause your video in the middle to ask a question, ask a question at the end, or both.

1. Forrest Gump – meeting Jenny

forrest gump screenshot

2. The King’s Speech

kings speech screenshot

3. Planes, Trains, and Automobiles

planes trains screenshot

Your turn

If you are participating in this conference, and you create a second screen experience, post a link to the video to watch (i.e. the first screen experience) and your Socrative.com room number in the comments so that we can share what you’ve made.

Filed under Projects

How Do You Spell Success?

Statue of Rocky in Philadelphia, his arms raised in triumph.

To find the prescriptive answer to this question, look in a dictionary.  To find the descriptive answer to this question, look in a corpus.

In ESL Programs at Ohio State, I have been working towards building a couple of corpora of learner language not only for our own analysis, but also for researchers around the world to access. Our plan is to include the English placement compositions that all international students write when they arrive on campus in the first corpus and the Intensive ESL Program (IEP) students’ placement and end-of-term compositions in the second. Because almost all of these compositions are now written on computers instead of paper, it is relatively easy to take the next step and format them for analysis by corpus tools.

Both corpora should be interesting.  The former could grow by more than a thousand compositions per year as international students are admitted to Ohio State in ever increasing numbers.  Because these students have met the English proficiency requirements to be admitted, their level of proficiency is relatively high.  The latter will include fewer students, but will include longitudinal data because each student will write multiple compositions as they progress through the program.

As I was scoring some of the recent end-of-semester IEP compositions, and encountering the usual and frequent errors in our lowest-level students’ writing, I began thinking about how our students’ creative spelling would affect, and possibly inhibit, searches of this corpus.  For example, how can you search for past tense verbs when so many of them are misspelled?  Then it occurred to me that these misspellings could themselves be quite interesting.  So, to answer the question posed in the title of this post, here are some of the ways our students spell success (and its cognates), listed in order of frequency:

successful, success, succeed, sucessful, successfull, succesful, secessful, succes, succed, sucssed, successfully, succeful, seccsessful, suessful, suecess, suceessful, succsful, succsess, successul, successufl, successfufl, successeful, succeshul, succefull, succeess, succees, succeeded, succeccful, secuessful, secssed, seccssful, seccessful, scuccess, sccesful.

We are currently working on securing IRB (Institutional Review Board) approval for this project, after which we will be able to share the data and results more publicly.  As part of our IRB application, we are alpha testing our procedures and this question about the spelling of success became an interesting test case.  To create this list, I took a set of student compositions and fed them through AntConc, a free concordancer written by Laurence Anthony.  In addition to the frequency of words, lots of other interesting queries are possible with this application and others.
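I built the list above with AntConc, but if you want to try something similar without a concordancer, here is a rough sketch using only Python’s standard library. To be clear, this is not what AntConc does internally: it swaps in fuzzy string matching from difflib. The target words, the 0.8 similarity cutoff, and the sample sentence are all assumptions for the example. A cutoff like this will miss some creative spellings and catch some unrelated words, so the results still need a human read-through.

```python
import re
from collections import Counter
from difflib import SequenceMatcher

TARGETS = ("success", "successful", "succeed")  # assumed target forms
CUTOFF = 0.8  # assumed similarity threshold; tune for your own data


def spelling_variants(text, targets=TARGETS, cutoff=CUTOFF):
    """Return (word, count) pairs for words that resemble any target, most frequent first."""
    freq = Counter(re.findall(r"[a-z]+", text.lower()))
    hits = {
        word: count
        for word, count in freq.items()
        if any(SequenceMatcher(None, word, t).ratio() >= cutoff for t in targets)
    }
    # Sort by descending frequency, then alphabetically to break ties.
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample text, not drawn from the corpus.
sample = "He was sucessful and successful. Success is succes; she was successful."
for word, count in spelling_variants(sample):
    print(count, word)
```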

All of the compositions will be coded with the demographic information we have for each student (age, gender, country of origin, first language, major or degree program) as well as information about each composition (score, topic, date).  By sorting for whatever factor is interesting, we’ll be able to make any comparison we like.  Want to see what the compositions above and below a certain score look like?  No problem.  Want to see how Chinese speakers compare to Arabic speakers?  Male to female?  Grad to undergrad?  We will be able to do it.

We’re looking forward to bringing this Big Data approach to our programs.  Not only will this data inform our curriculum, but it will also become a useful resource for researchers across our campus and around the world.

Filed under Projects

Interactive Fiction

The text-based game Zork being played via teletype machine

If you’re like me, one of your first computer game experiences was with an interactive fiction text-based game.  Zork was probably the most popular, but I discovered the Hitchhiker’s Guide to the Galaxy game first (before I knew it was a book.)  In fact, I had never played Zork until quite recently when I encountered a version of it on Frotz, which I discovered as an iPhone application.  Within Frotz, one can play a wide variety of text-based adventure games.

If you’ve never played one of these games, there really isn’t much to learn. Players are typically presented with a description of their character’s surroundings followed by a prompt. Players can type simple directions at the prompt, such as “go north” or “pick up phone.” This process repeats with the game presenting the results of the previous command or a description of the new scene if the player has moved. From there, the player enters further directions, and the game continues.

Obviously, the focus of the game is the writing as there are typically no graphics involved.  These games also have a rich tradition of Easter Eggs and snarky responses, particularly when commands are malformed or not recognized.  As the player proceeds through the game, objects can be collected (such as a key) that can later be used to solve a problem or make progress through the game (such as unlocking a door.)
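To make the command-and-response loop concrete, here is a toy sketch of one locked door and one key. It is purely illustrative: the rooms, commands, and replies are invented, and real interpreters like Frotz use far richer parsers than this exact-match one.

```python
def step(state, command):
    """Apply one typed command to the game state and return the game's reply."""
    words = command.lower().split()
    if (words == ["take", "key"] and state["room"] == "hall"
            and "key" not in state["inventory"]):
        state["inventory"].add("key")
        return "Taken."
    if words == ["go", "north"] and state["room"] == "hall":
        if "key" not in state["inventory"]:
            return "The door is locked."
        state["room"] = "library"
        return "You unlock the door and step into a dusty library."
    # Anything else is an unrecognized command.
    return "I don't know how to do that."


# A scripted play-through; a real game would read each command from the player.
state = {"room": "hall", "inventory": set()}
for command in ["go north", "take key", "go north", "dance"]:
    print(">", command)
    print(step(state, command))
```

The last command shows where a snarky response would go: anything the game does not recognize falls through to the final line.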

These games are now rediscoverable thanks to new technologies.  Not only that, but it has also become very easy to create a game with virtually zero programming involved.  Two examples of tools that can be used to create interactive fiction are Twine and Inform7.

Twine is the simpler of the two.  Resulting stories are interactive in the way that Choose Your Own Adventure stories are interactive, but they use linked texts to allow the reader to progress in a non-linear way.  Examples of stories written using Twine can be played on gimcrackd.com.

Inform7 is much more complex, but the results are actual interactive text-based adventure games. Stories are written as natural-language-style source text that declares the rooms, objects, and relationships that form the basis of the story. Examples of some of the best interactive text adventures can be found in this article on makeuseof.com.

I haven’t used either of these tools yet, but I’m curious about what ESL students might make of them (and make with them). The process of writing can be a challenge in itself and, if they are not familiar with interactive fiction, explaining it would be an additional difficulty. But by working in small groups, there might be some interesting possibilities for collaborating and giving peer feedback. At the very least, interactive text adventure games can provide ESL students with a rich source of input. And because the syntax for interacting with the game is so simplified, even intermediate level learners can play them.

I plan to give Inform7 a try to see how easy it is to use.  If you’ve used it, or know of other similar tools, leave a comment.

Filed under Projects

ELTU Unconference

breakout area 2 for ELTU

We’re about a week and a half away from the fourth annual Exploring Learning Technologies Unconference (ELTU4).  This year, we’ve moved the event to the start of the academic year because the spring was becoming crowded with other conferences and events.  So, on Friday, October 14, we meet again from 9am to 1:30pm to unconference.

What is an unconference? There are lots of different variations, from Open Space to various camps (FooCamp, Barcamp, Mashup Camp, etc.; see Wikipedia for more). Our variant resembles a traditional conference in many ways (there are meeting areas for different breakout sessions that begin every hour), but the biggest difference is that none of the content is set in advance of the meeting.

We spend the first half hour with introductions and generating session topics.  From there, the group negotiates which topics go in which time slots and we begin.  Being a technology-themed unconference, we use some technology to facilitate this process: we project the session grid on screens around the room so everyone can see and participate in the process.  We also set up a wiki in advance with one page that lists the schedule and links to one page per session so someone in each session can take notes.  (Visit http://go.osu.edu/eltu to see the wikis from the last three unconferences.)

Once organized, the unconference runs a lot like a regular conference, though participants are encouraged to move between sessions as a way of cross-pollinating the various discussions.  In fact, we have traditionally hosted the unconference in one big open space or computer lab in order to facilitate this movement.

The beauty of the process is that, if everything works as intended, the discussions are all appealing to those in attendance because they were generated only by those in attendance (instead of presenters who submitted an abstract months in advance and then failed to attend the conference.)

The effect is intentionally a bit like the hallway conversations you have at a traditional conference — when you actually get to talk to someone with similar interests to you instead of just watching a speaker read their PowerPoint slides.  By attracting interesting people from across campus and throughout Ohio, the discussion at the unconference is always a good one.

I’d recommend the format to any organization interested in hosting a stimulating conversation.  I’d also welcome you to our next unconference on Friday, October 14 from 9am to 1:30pm.  Details are available at http://go.osu.edu/eltu and registration is available (and free!) at http://eltu4.crowdvine.com.

Filed under Projects

Data Visualization: Attendance vs. GPA

Above is a plot of students’ attendance versus their grade point averages (GPAs).  See any trends?  Obviously, students with higher attendance tend to have higher GPAs.  While this is not particularly surprising, it’s nice to be able to support this notion with actual data.

(I should say that this “actual data” is not actual data, but it is based on actual data.  I’ve taken the actual “actual data” and randomly added or subtracted up to 5% so that the general trends remain, but none of the actual data points are the same, except by chance.)
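If you want to blur your own data the same way before sharing a plot, the idea can be sketched as below. This is not the exact procedure I used. It assumes the 5% is relative to each value, and the optional cap (so that a jittered GPA never exceeds 4.0, for example) is my own addition for the sketch.

```python
import random


def jitter(values, max_fraction=0.05, upper=None, seed=None):
    """Return a copy of values with each one nudged up or down by at most max_fraction."""
    rng = random.Random(seed)  # pass a seed for a repeatable result
    noisy = []
    for value in values:
        nudged = value * (1 + rng.uniform(-max_fraction, max_fraction))
        if upper is not None:
            nudged = min(nudged, upper)  # assumed cap, e.g. 4.0 for GPA
        noisy.append(nudged)
    return noisy


# Made-up GPAs, not real student data.
gpas = [2.85, 3.4, 4.0]
print([round(g, 2) for g in jitter(gpas, upper=4.0, seed=1)])
```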

In addition to the general trend that GPAs correlate positively with attendance, I can say that no student who had 100% attendance got less than a C+ (2.85 GPA) and that no student who got a 4.0 GPA (straight As) attended less than 96% (at least in the “actual” data).
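To put a number on “correlate positively,” you can compute a Pearson correlation coefficient. The sketch below writes the formula out by hand so that it runs on any Python 3. The attendance and GPA values are invented for illustration and are not our students’ data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up illustrative values: attendance in percent, GPA on a 4.0 scale.
attendance = [70, 80, 90, 95, 100]
gpa = [2.0, 2.6, 3.1, 3.5, 3.8]

r = pearson_r(attendance, gpa)
print(f"r = {r:.3f}")
```

A value near +1 means the two measures rise together. It says nothing by itself about which one causes the other.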

Can I claim causality?  Not exactly.  I don’t know that higher attendance causes higher grades, or vice versa, but I think it could be claimed that low attendance causes low grades — if you aren’t in class, you can’t get an A.

Admittedly, this isn’t the most cutting edge visualization — it’s just a graph I made using Microsoft Excel — but I think it represents a relatively simple set of data effectively.

I plan to show this graph to all of our students at our program-wide meeting at the beginning of the academic year.  If nothing else, it should get them thinking a bit about the importance of attending class if they want to be successful.  This isn’t a big issue for most of our students but, as you can see, it is an issue for some.  And if it helps them to have me connect the dots, I gladly will (see below, click to enlarge).

Filed under Projects