Term paper specifications
You are responsible for a term paper that counts something of linguistic interest using a non-trivial number of Python features.
If you are not sure whether your project satisfies the above specifications, email a brief description to Kyle before proceeding.
A brief list of ideas:
- Count the words most associated with each of the 12 zodiac signs in a corpus of horoscopes
- Count the number of words ending in various derivational suffixes in a digital dictionary
- Count the number of words ending in syllabic sonorants in a pronunciation dictionary
- Count the frequencies of the different pronunciations of the word live using a tagger (n.b.: this works because one pronunciation is used when it's a noun, and the other when it's a verb)
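To give a sense of scale, here is a minimal sketch of the suffix-counting idea above. The suffix inventory and the toy word list are made up for illustration; a real project would read the words from a digital dictionary and justify its own list of suffixes.

```python
from collections import Counter

# Hypothetical suffix inventory (an assumption, not a required list).
SUFFIXES = ("ness", "ity", "ment", "ize")


def count_suffixes(words, suffixes=SUFFIXES):
    """Counts how many words end in each suffix (longest match wins)."""
    counts = Counter()
    # Try longer suffixes first so each word is credited to one suffix only.
    ordered = sorted(suffixes, key=len, reverse=True)
    for word in words:
        word = word.strip().lower()
        for suffix in ordered:
            # Require at least one character of stem before the suffix.
            if word.endswith(suffix) and len(word) > len(suffix):
                counts[suffix] += 1
                break
    return counts


# Toy word list standing in for a digital dictionary.
sample = ["happiness", "sanity", "Payment", "modernize", "darkness", "city", "ness"]
counts = count_suffixes(sample)
for suffix, n in counts.most_common():
    print(suffix, n)
```

Note that pure string matching also counts words like city, which ends in the letters ity but is not derived with that suffix. Deciding how to handle cases like this is part of what makes the counting non-trivial.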
What to submit
Your submission should include:
- Any interesting samples of code (though I won't be reviewing code quality in my grading)
- Data used (or instructions or code to obtain it, if it's more than 10 MB or so)
- A write-up of 3-4 pages describing:
  - the data you used
  - what you counted
  - what the counts were (please make a table; don't just dump Python output here)
  - why this might be an interesting thing to count
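If you are unsure how to get from raw counts to a table, one possible approach is sketched below. The function name format_table and the counts are hypothetical; a spreadsheet or your word processor's table tool works just as well.

```python
def format_table(counts, headers=("Item", "Count")):
    """Renders a {label: count} mapping as an aligned plain-text table."""
    # Sort by descending count, breaking ties alphabetically.
    rows = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    label_width = max([len(headers[0])] + [len(label) for label, _ in rows])
    count_width = max([len(headers[1])] + [len(str(n)) for _, n in rows])
    lines = [
        f"{headers[0]:<{label_width}}  {headers[1]:>{count_width}}",
        f"{'-' * label_width}  {'-' * count_width}",
    ]
    for label, n in rows:
        lines.append(f"{label:<{label_width}}  {n:>{count_width}}")
    return "\n".join(lines)


# Made-up counts, for illustration only.
table = format_table({"-ness": 2, "-ity": 2, "-ment": 1})
print(table)
```

The result is a header row, a rule, and one row per item, sorted from most to least frequent, which you can paste into the write-up or use as a starting point for a nicer table.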
Rubric
The term paper will be graded on the degree to which the submission satisfies the above specifications.
I will grade the submission up to the point where I am required to submit grades to the registrar's office; this is usually a week or so after the end of the semester. If I have not received a term paper by then, you will receive an "I" (incomplete) grade until you submit the term paper.
Hints
- While it's technically possible to work with audio data for this project, it's a lot harder than working with discrete data (e.g., text) unless you've also studied acoustic phonetics and/or signal processing.
- It's okay (good, even) if this harmonizes with some other projects you're doing for credit (e.g., qualifying papers), so long as you make it clear in your write-up what part of the project is unique to the term paper.