Building a $100 Studio

[Image: panorama of the studio]

Like many educators, we find ourselves producing more and more online content.  Currently, to record audio, we try to find a quiet room and record directly onto our laptops, which makes for pretty lousy audio.  For video, the process is the same, including stacking furniture and books to get the webcam in our laptops to the best possible position.  Far from ideal.  As we move to more and more audio and video production, the lack of a dedicated studio space is becoming an issue.  So, we decided to build a dedicated studio.

For us, as for most educational organizations, cost is a big factor.  We just don’t have thousands of dollars to throw at the latest 4K cameras.  We also don’t need a full-blown Hollywood studio to make materials for our students to view on the web.  We started by looking at acoustical foam as a way to insulate our space, but this quickly added up to hundreds of dollars for our 10′ x 12′ room.  Our search for other options led us to Justin Troyer, OSU’s resident media services expert and author of Medialogue, who showed us a studio on campus that he had insulated with mover’s blankets.  This looked to be a solution to some of our biggest audio issues because the blankets would both help to block out external noise and reduce the echo within the room.

We had also been struggling with what sort of background to use for video production.  We were leaning towards a velvet or velour curtain in a neutral color because it would help to further absorb the echo within the studio.  But that fabric is expensive and it would lock us into a single background for every video, which is not ideal.  Justin suggested a green screen, which can be removed digitally and replaced with almost anything.  He has several different-sized pop-up green screens which are easy to put behind the video subjects.  But in the end we decided to go with another option he suggested: paint a wall green.  This saves both money and space because the wall does not have to be set up or stored when not in use.

So, after starting with an empty office space, we used the following items to create our studio:

Item                                                                   #  Cost    Total
Mover’s Blankets – Harbor Freight                                      6  $7.99   $47.94
Light-Duty Ceiling Hooks – Home Depot (4 pack)                         4  $1.49   $5.96
Gallon Behr Premium Plus Ultra Interior Latex Paint – Sparkling Apple  1  $30.98  $30.98
Assorted painting sundries (roller covers, masking tape)                          $15.87
Total:                                                                            $100.75
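
If you want to adapt the budget to your own room, the arithmetic in the table is easy to script.  This is just a minimal sketch: the quantities and prices come from the table above, the sundries are entered as a single lump sum, and the names items and line_total are my own for illustration.

```python
from decimal import Decimal

# (item, quantity, unit cost) copied from the table above
items = [
    ("Mover's blankets", 6, Decimal("7.99")),
    ("Ceiling hooks (4 pack)", 4, Decimal("1.49")),
    ("Paint, 1 gallon", 1, Decimal("30.98")),
    ("Painting sundries", 1, Decimal("15.87")),  # listed only as a lump sum
]

def line_total(quantity, unit_cost):
    """Cost of one row: quantity times unit cost."""
    return quantity * unit_cost

# Decimal avoids the rounding surprises that floats can cause with money
total = sum(line_total(q, c) for _, q, c in items)
print(total)
```

Change a quantity (more blankets for a bigger room, say) and rerun it to see where your own total lands.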

We came in just over $100, which is pretty close to our target.  Included in the costs are items that got used and disposed of while we were painting (roller covers and masking tape) but not items that I already had at home that I brought in to use (paint roller, roller tray, brushes).  I also filled in a few holes in the wall with my own putty and putty knife.  You may need to factor in additional costs if you don’t have access to these basic tools.

In the end, we incurred one final cost: a short curtain rod and rings, which allow us to slide the mover’s blanket out from in front of the door and make getting in and out much, much easier.  The rod and rings cost just under $22.

Now the real fun begins.  You can see from the picture that we already have a small table, chair, microphone stand, and camera tripod.  The table will be used for straight audio recording, which is why we wrapped the end of one mover’s blanket around it to enclose it on three sides.  We still need to find a microphone or two, a video camera, and some lights.  Stay tuned as we work on acquiring these items to complete our studio.

Filed under Projects

The List of Lists

[Image: dictionaries]

I’ve been tinkering with AntConc, Laurence Anthony’s free concordancer, which has led me down a bit of a rabbit hole of lists generated by corpus linguists over the past 60 years.  I’ve listed a few that I’ve used, sometimes within AntConc, to analyze students’ writing.  If you’ve taught students to investigate their linguistic hunches via the Corpus of Contemporary American English (COCA), you might also consider teaching them to put their own writing into a tool like AntConc and analyze it as well.  By loading the lists below as a blacklist (do not show) or a whitelist (show only these), students can home in on a more specific part of their vocabulary.  Most of these lists are available for download, which means you can be up and running with your own analysis very quickly.
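
To make the blacklist / whitelist idea concrete, here is a rough sketch in Python.  It is not how AntConc works internally; the tokenize and filter_tokens functions are hypothetical, and sample_list is a made-up stand-in for one of the downloadable lists.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def filter_tokens(tokens, word_list, mode="blacklist"):
    """Hide listed words (blacklist) or show only listed words (whitelist)."""
    if mode == "blacklist":
        return [t for t in tokens if t not in word_list]
    if mode == "whitelist":
        return [t for t in tokens if t in word_list]
    raise ValueError("mode must be 'blacklist' or 'whitelist'")

# Made-up stand-in; a real analysis would load a downloaded list instead
sample_list = {"the", "a", "of", "to", "and"}
essay = "The analysis of the data led to a hypothesis."
tokens = tokenize(essay)

print(filter_tokens(tokens, sample_list, "blacklist"))
print(filter_tokens(tokens, sample_list, "whitelist"))
```

With a general list as the blacklist, what remains is the less common vocabulary a student chose, which is usually the more interesting part to discuss.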

The lists (in chronological order):

General Service List (GSL) – developed by Michael West in 1953; based on a 2.5 million word corpus.  (Can you imagine doing corpus linguistics in 1953?  Much of it must have been by hand, which is mind boggling.)  Despite criticism that it is out of date (words such as plastic and television are not included, for example), this pioneering list still provides about 80% coverage of English.

Academic Word List (AWL) – developed by Averil Coxhead in 2000; 570 words (word families) selected from a purpose-built academic corpus with the 2000 most frequent GSL words removed; organized into 9 lists of 60 and one of 30, sorted by frequency.  Scores of textbooks have been written based on these lists, and for good reason.  In fact, we have found that students are so familiar with these materials, they test disproportionately highly on these words versus other advanced vocabulary.

Academic Vocabulary List (AVL) – the 3000 most frequent words in the 120 million words in the academic portion of the 440 million word Corpus of Contemporary American English (COCA). This word list includes groupings by word families, definitions, and an online interface for browsing or uploading texts to be analyzed according to the list.

New General Service List (NGSL) – developed by Charles Browne, Brent Culligan, and Joseph Phillips in 2013; based on the two-billion-word Cambridge English Corpus (CEC); 2368 words that cover 90.34% of the CEC.

New Academic Word List (NAWL) – based on three components: the CEC Academic Corpus; two oral corpora, the Michigan Corpus of Academic Spoken English (MICASE) and the British Academic Spoken English (BASE) corpus; and a corpus of published textbooks, for a total of 288 million words. The NAWL is to the NGSL what the AWL is to the GSL in that it contains the 964 most frequent words in the academic corpus after the NGSL words have been removed.
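
Several of the descriptions above mention coverage, meaning the share of the running words in a text that a list accounts for.  Here is a minimal sketch of that calculation.  It assumes exact word-form matching, whereas the published figures are based on word families or lemmas, so real numbers would differ; coverage and toy_list are hypothetical names.

```python
def coverage(tokens, word_list):
    """Return the share of running words (tokens) that appear in word_list."""
    if not tokens:
        return 0.0
    covered = sum(1 for t in tokens if t in word_list)
    return covered / len(tokens)

# Toy data for illustration only, not taken from any real list
tokens = "the cat sat on the mat near the door".split()
toy_list = {"the", "on", "cat"}

print(f"{coverage(tokens, toy_list):.0%}")
```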

Filed under Resources

Raw. What is it good for?

[Image: students vs. teachers visualization]

When I first came across Raw, a free, online data visualization tool, I channeled my inner Edwin Starr and asked, “What is it good for?”  It turns out the answer is “absolutely everything.”  Or pretty close to it.

Raw is extremely user friendly.  It’s built on D3.js, which is pretty powerful.  If you, like me, haven’t had time to explore D3 in depth (or if, also like me, you’re not sure you have the skills to take it on), Raw greatly simplifies the process.  And all of the data is processed in your browser, which means your data is never copied and stored on their servers.

So, what can Raw do for you?  Well, take your favorite data set and paste it into the text box (or choose one of the four example data sets provided).  Then choose one of the 15 chart types and drag components of your data onto the axes or other options for the chart type you have chosen.  You can do this as many times as you like to try your data in different configurations.  Finally, customize your visualization by adjusting its size, scale, and colors before choosing how you want to export your results.  It’s amazingly easy!

I created the visualization at the top of this post by feeding in some data on teachers (left) and students (right).  The lines connecting them represent classes that the students had with each teacher with thin lines for one semester and thick ones for the next.  I wanted to explore how students move through our program.  Here, it’s easy to see that most students move up from one level to the next, but there are some that skip levels and some that repeat levels.  The students and teachers are not arranged in order from lowest to highest level, though this would be possible and might make it easier to see these trends.

There are lots of other options within Raw and, depending on what your data include, some may be more useful than others.  But the beauty of Raw is that you are only a couple of clicks away from any of them, making it very easy to try several visualizations until you find one you like.

Filed under Resources

The Largest Vocabulary in Hip Hop

[Image: turntable.  “technics sl-1200 mk2” by Rick Harrison / Flickr]

I spent much of my youth listening to hip hop, or, as it was called back then, rap music.  This was long before MP3 players and long before you could Google your favorite song lyrics.  It was also long before I knew anything about textual analysis, let alone before I thought about using unique words per n words as a measure of variety in vocabulary.

So, when Matt Daniels published this piece called The Largest Vocabulary in Hip Hop last month, it was both a flashback to the music of my youth and a flash forward to some of my current interests in corpus linguistics.

Daniels does a very nice analysis, so I won’t repeat much of it here.  Just follow the link and scroll down to see the details.  Be aware that some of the analysis incorporates a bit of slang that may not make it completely kid friendly.

Most noteworthy in the analysis are the two baselines of comparison:  Shakespeare (5,170 unique words per 35,000 words) and Herman Melville (6,022 unique words in the first 35,000 words of Moby Dick).  Of the 85 rappers analyzed, 16 use a wider vocabulary than Shakespeare and 3 are above Melville.  So, if you ever thought all hip hop was a simplistic art form, you may want to take another look.  It’s amazing what an analysis of the data can show us.
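
If you’d like to try this measure on your own texts, here is a rough sketch of counting unique words in the first n words.  I don’t know exactly how Daniels tokenized the lyrics, so the simple regular expression here is my own assumption, as are the function name and the toy sample sentence.

```python
import re

def unique_words_in_first_n(text, n=35000):
    """Count distinct word forms among the first n word tokens of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(set(tokens[:n]))

# Toy sample; a real comparison would use a full text of 35,000+ words
sample = "the quick brown fox jumps over the lazy dog and the fox"

print(unique_words_in_first_n(sample, n=7))
print(unique_words_in_first_n(sample))
```

Fixing n matters because longer texts naturally contain more unique words, so comparing the same number of words keeps the comparison fair.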

Filed under Inspiration

Arst Arsw: Star Wars in Alphabetical Order

[Image: baby Darth.  “Father’s Day” by Artiee / Flickr]

A friend recently lent me the book Uncharted: Big Data as a Lens on Human Culture, which discusses the development of the Google N-Gram Corpus.  After scanning millions of books, Google could not simply make them all freely available because this would essentially be republishing copyrighted works.  Instead, Google has made them all searchable by N-Grams (one-, two-, three-word phrases and so on up to n-words) which protects the copyrighted works because they are really only viewable in aggregate.  The corpus is, of course, limited in that it only includes books (as opposed to also including magazines, newspapers, oral texts, etc.), but given that it goes back hundreds of years, the size and the scope of the corpus are pretty amazing.
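
For anyone who hasn’t worked with n-grams before, here is a small sketch of how they can be pulled out of a text and counted.  It only illustrates the idea and says nothing about how Google actually built its corpus; the ngrams function is a hypothetical helper.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "to be or not to be".split()

# Count two-word phrases (bigrams); only the totals are kept, not the text
bigram_counts = Counter(ngrams(tokens, 2))
print(bigram_counts.most_common(1))
```

Once only the counts are kept, the original sentence order is gone, which is the sense in which the books are viewable only in aggregate.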

Early on in Uncharted, a book called Legendary Lexical Loquacious Love, a concordance of a romance novel, is affectionately described as a conceptual art piece that helped to inspire the N-Gram Corpus.  In Love, every word from a romance novel is presented in alphabetical order.  So, a word like a, which appears several times in the original source novel, is repeated scores of times.  The authors talk about how different the experience of reading a concordance of a romance novel is from reading the original romance novel, but how the former is compelling in its own way.  For example, they offer the following quote:

beautiful beautiful beautiful beautiful beautiful beautiful beautiful
beautiful beautiful beautiful beautiful beautiful beautiful beautiful
beautiful beautiful beautiful,  beautiful, beautiful, beautiful, beautiful,
beautiful, beautiful, beautiful,” beautiful. beautiful. beautiful.”
beautiful… beautiful…

These 29 occurrences of the word beautiful are, presumably, spread throughout the original novel.  But seeing them juxtaposed with other words that begin with b (and with the scores of occurrences of the word a) gives you a different perspective on a romance novel.
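
You can get a similar effect from any text with a few lines of Python.  This is only a sketch and I don’t know how Love itself was produced; it assumes words are split on whitespace with their punctuation left attached, as in the quote above, and the alphabetize function and sample sentence are my own.

```python
def alphabetize(text):
    """Split on whitespace and sort the words, ignoring case."""
    return sorted(text.split(), key=str.lower)

# Made-up sample sentence for illustration
sample = "It was beautiful. A beautiful day, beautiful and bright."

print(" ".join(alphabetize(sample)))
```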

What does this have to do with Star Wars?  Great question.  While reading Uncharted, I came across the following YouTube video:

Created by Tom Murphy, the video is “meant to be provocative in its uselessness.”  It took 42 hours to produce the 43-minute video, which is oddly compelling to watch.  A small data bar at the bottom of the video graphs the frequency of each word, and the count is also tallied onscreen throughout the video.  It’s a different experience, much like reading a concordance is different from reading the original source text.  For example, the famous scene in which Obi-Wan uses a Jedi mind trick on a couple of Stormtroopers appears in the original movie as follows:

Stormtrooper: Let me see your identification.
Obi-Wan: [with a small wave of his hand] You don’t need to see his identification.
Stormtrooper: We don’t need to see his identification.
Obi-Wan: These aren’t the droids you’re looking for.
Stormtrooper: These aren’t the droids we’re looking for.

(Source: imdb.com)

In Arst Arsw, this interaction is best summarized by the three occurrences of the word identification, which are the only three times that this word appears in the film.  Identification appears at 16:08 of the video.  There are many other interesting moments, particularly when different voices utter the same word several times (for example, leader by several rebel pilots) or when only one character uses the same word several times (for example, kid by Han Solo).  For me, longer words are generally more interesting because they take longer to say, whereas the shorter words can fly by so quickly that they can be hard to comprehend.  One exception, however, is the word know, all 32 occurrences of which fly by in under 5 seconds.  But because the 26th know is so emphatic, it stands out against the rest.

I’m not sure if there are any other video concordances out there, but if there are, I would love to see them.  Especially if the original source material is as compelling as the original Star Wars.

Filed under Inspiration

Data is Beautiful

[Image: Visualization of how often “language” is a tag in TED Talks.]

I’ve mentioned data visualizations in several previous posts, so it may not be surprising that I’m writing about a trove I’ve recently found: the dataisbeautiful subreddit.  In addition to lots of excellent data visualizations (and some mediocre ones), there’s lots of interesting discussion, including responses to previous visualizations (for example, compare this early version of “How we die” to this follow-up).

One I just came across is someone asking about a pattern in some data, specifically why Google searches for “1990s” peak in May of almost every year.  Other decades follow the same pattern.  Several correlates are suggested (high school reunions, for example) but it turns out that high school proms look like the best correlate.  So, 1950s, 1960s, 1970s, 1980s, and, yes, 1990s, seem to be heavily-Googled prom themes.

If you’re not familiar with Reddit, this is a great subreddit to jump into.  One of the key features of Reddit is that users can vote content up or down, which means that the best content rises to the top (though the definition of “best” is open to the interpretation of every user).  It’s free to join and not even an email address is required.  You can lurk for a while, simply upvote or downvote, or jump right into conversations with people from across the internet on almost every conceivable topic, including the data visualizations in dataisbeautiful.

Filed under Resources

More Reaction GIFs for the ESL Classroom

[Image: Tom Brady, no high five]

I’ve written about using reaction GIFs in the classroom before, but a few collections recently caught my eye.  A reaction GIF is a small, animated image that typically summarizes a mood or feeling more quickly or succinctly than words can.  For example, in the image above, quarterback Tom Brady unsuccessfully searches for a teammate to high five.  Many of us can probably relate to this situation; even if you’ve never been left hanging for a high five, this GIF can still be a metaphor for other times in your life in which the people surrounding you are unable or unwilling to share in your excitement.

The following links to Reddit contain a treasure trove of reaction GIFs.  Note that, like anything on the internet, some of the content may not be safe for work (NSFW).  Depending on the student population you work with, you may want to preview this material before you use any of these reaction GIFs in your classroom.  As I wrote in my previous post, these GIFs can serve as excellent starting points for student discussions, writing activities, and more.

If you could sum up your life in a GIF, what would it be? – In this Reddit forum, Redditors post their reaction GIF responses to this question.  As you click through them, you’ll notice themes of self-deprecating humor and a bit of depression becoming the common refrain.  Many of these GIFs summarize a generally frustrated attitude, which can be interesting.

GIFs as comments collection – This is a collection of comment / reaction GIFs.  Many of the posts have links to multiple GIFs.  Lots of general and generic internet forum reactions here.

Retired GIF – This is a subreddit in which Redditors post links to conversation threads in which a GIF has been posted as a response in the “most appropriate context conceivable.”  Each link will take you to the conversation including the GIF and the context in which it was used.  If you’re not familiar with how GIFs are used as part of online discussions, this will get you acquainted very quickly.

Filed under Resources