Your final project for this class should involve the creation of some speech or language technology.
You should submit the project as a term paper that
- motivates the technology (why does it exist? why did you create it?) and
- evaluates the technology using standard evaluation procedures (i.e., held-out evaluation).
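
For concreteness, held-out evaluation means reporting results only on data the system never saw during training or tuning. The following is a minimal sketch of that idea, not a required implementation; the function names `split_data` and `accuracy`, the 80/10/10 proportions, and the use of exact-match accuracy are all illustrative assumptions, and your task may call for a different metric.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_data(
    examples: Sequence[T],
    seed: int = 0,
    train: float = 0.8,
    dev: float = 0.1,
) -> Tuple[List[T], List[T], List[T]]:
    """Shuffles with a fixed seed and splits into train/dev/test lists.

    The proportions are assumptions; whatever is left after the train
    and dev portions becomes the held-out test set.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_dev = int(len(shuffled) * dev)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_dev],
        shuffled[n_train + n_dev:],
    )


def accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Returns the fraction of predictions that exactly match the gold labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on empty data")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)
```

In this sketch, you would train on the first split, tune on the second, and compute `accuracy` on the third only once, at the end.
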
While you're welcome to submit data or code, you will not be graded on these. Rather, your project will be graded on:
- technical sophistication,
- how you motivate the technology in question, and
- how you evaluate the technology in question.
Some suggestions for the project are given below.
- Experiment with neural network grapheme-to-phoneme conversion, cf. HW2:
  - Study the effects of morphological features or segmentations (e.g., in French or German)
  - Address outstanding issues in WikiPron
  - Compare LSTM and transformer models across several languages, cf. Gorman et al. (2020)
  - Focus on one language and perform a detailed error analysis, cf. Ashby et al. (2021)
- Build a finite-state morphological analyzer
- Experiment with morphological generation, cf. HW3:
  - Study the effects of inherent features (e.g., gender and animacy in Polish, gender in German, aspect in Russian)
  - Compare LSTM and transformer models across several languages
  - Focus on one language and perform a detailed error analysis
- Something to do with "unnamed morphological abstractness project"
- Conduct an ambitious grid, random, or black-box hyperparameter search for HW2 or HW3
- Extend FairSeq for:
  - POS tagging
  - Named entity recognition
  - Morphological analysis/lemmatization
  - Text classification
- Experiment with machine translation using FairSeq and data from the WMT shared tasks
- Experiment with neural network homograph disambiguation using data from Gorman et al. (2018)
- Experiment with neural network abbreviation expansion using data from Gorman et al. (2021)
- Build a speech recognizer using Kaldi and data from OpenSLR
- Use speech recognition to perform a sociolinguistics experiment
If you're pursuing an idea not on the list above, you are strongly encouraged to submit a brief abstract to Kyle (either by email or at office hours) describing your concept. This will allow Kyle to ensure the project is both feasible and of appropriate scope.