I am on sabbatical leave for fall 2024. Prof. Spencer Caplan is serving
as interim director of the master's program in computational linguistics.
I am an associate professor of linguistics at the Graduate Center, City University of New York, and director of the master's program in computational linguistics.
I also work as a software engineer at Google Research.
Some topics I've worked on include the morphophonology of Latin, productivity and defectivity, phonotactics, statistical methods for computational model comparison, finite-state techniques, text normalization, grapheme-to-phoneme conversion, morphological analysis, writing systems, developmental disorders, and aphasia.
I am most excited about research that addresses questions in the cognitive science of language or builds high-quality linguistic resources.
Teaching
Tutorials
Publications
- Kyle Gorman and Yuval Pinter. 2024. Don't touch my diacritics. arXiv:2410.24140.
- Kyle Gorman and Brian Roark. 2024. Abbreviation across the world's languages and scripts. In Proceedings of the 2nd Workshop on Computation and Written Language, pages 36-42.
- Kyle Gorman and Richard Sproat. 2024. Was rongorongo an independent invention of writing? (Ms.)
- Kyle Gorman and Cyril Allauzen. 2024. A* shortest string decoding for non-idempotent semirings. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, pages 732-739.
- Adam Wiemerslage, Kyle Gorman, and Katharina von der Wense. 2024. Quantifying the hyperparameter sensitivity of neural networks for character-level sequence-to-sequence tasks. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, pages 674-689.
- Kyle Gorman and Richard Sproat. 2023. Myths about writing systems in speech & language technology. In Proceedings of the Workshop on Computation and Written Language, pages 1-5.
- Kyle Gorman. 2023. Asymmetries in Latin glide formation. Paper presented at the 12th North American Phonology Conference.
- Kyle Gorman and Charles Reiss. 2023. Maximal feature specification is feasible; minimal feature specification is not. Paper presented at the 47th Penn Linguistics Conference and the 46th Generative Linguistics in the Old World Colloquium.
- Kyle Gorman. 2022. Computational morphology. In Mark Aronoff and Kirsten Fudeman, What is Morphology?, pages 246-273. 3rd edition. John Wiley & Sons.
- Géza Kiss, Kyle Gorman, and Jan van Santen. 2021. Group-matching algorithms for subjects and items. arXiv:2110.04432.
- Kyle Gorman, Christo Kirov, Brian Roark, and Richard Sproat. 2021. Structured abbreviation expansion in context. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 995-1005. [Data]
- Lucas F.E. Ashby, Travis M. Bartley, Simon Clematide, Luca Del Signore, Cameron Gibson, Kyle Gorman, Yeonju Lee-Sikka, Peter Makarov, Aidan Malanoski, Sean Miller, Omar Ortiz, Reuben Raff, Arundhati Sengupta, Bora Seo, Yulia Spektor, and Winnie Yan. 2021. Results of the second SIGMORPHON shared task on multilingual grapheme-to-phoneme conversion. In Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 115-125. [Data] [Software]
- Kyle Gorman and Richard Sproat. 2021. Finite-State Text Processing. Morgan & Claypool. [Software] [Errata for first printing] [Review] (Now published by Springer.)
- Kyle Gorman. 2020. Anatomy of an analogy. Handout of a talk given at Stony Brook University.
- Angie Waller and Kyle Gorman. 2020. Detecting objectifying language in online professor reviews. In Proceedings of the Sixth Workshop on Noisy User-generated Text, pages 171-180. Best Paper Award.
- Piotr Szymański and Kyle Gorman. 2020. Is the best better? Bayesian statistical model comparison for natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, pages 2203-2212.
- Kyle Gorman, Lucas F.E. Ashby, Aaron Goyzueta, Arya D. McCarthy, Shijie Wu, and Daniel You. 2020. The SIGMORPHON 2020 shared task on multilingual grapheme-to-phoneme conversion. In Proceedings of the 17th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, pages 40-50. [Data] [Software]
- Jackson L. Lee, Lucas F.E. Ashby, M. Elizabeth Garza, Yeonju Lee-Sikka, Sean Miller, Alan Wong, Arya D. McCarthy, and Kyle Gorman. 2020. Massively multilingual pronunciation mining with WikiPron. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 4223-4228. [Data] [Software]
- Arya D. McCarthy, Christo Kirov, Matteo Grella, Amrit Nidhi, Patrick Xia, Kyle Gorman, Ekaterina Vylomova, Sabrina J. Mielke, Garrett Nicolai, Miikka Silfverberg, Timofey Arkhangelskij, Natalya Krizhanovsky, Andrew Krizhanovsky, Elena Klyachko, Alexey Sorokin, John Mansfield, Valts Ernštreits, Yuval Pinter, Cassandra L. Jacobs, Ryan Cotterell, Mans Hulden, and David Yarowsky. 2020. UniMorph 3.0: universal morphology. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 3922-3931. [Data] [Software]
- Kyle Gorman, Arya D. McCarthy, Ryan Cotterell, Ekaterina Vylomova, Miikka Silfverberg, and Magdalena Markowska. 2019. Weird inflects but OK: making sense of morphological generation errors. In Proceedings of the 23rd Conference on Computational Natural Language Learning, pages 140-151.
- Kyle Gorman and Charles Yang. 2019. When nobody wins. In Franz Rainer, Francesco Gardani, Hans Christian Luschützky and Wolfgang U. Dressler (eds.), Competition in Inflection and Word Formation, pages 169-193. Springer. [Preprint]
- Sandy Ritchie, Richard Sproat, Kyle Gorman, Daan van Esch, Christian Schallhart, Nikos Bampounis, Benoît Brard, Jonas Fromseier Mortensen, Millie Holt, and Eoin Mahon. 2019. Unified verbalization for speech recognition & synthesis across languages. In INTERSPEECH, pages 3530-3534. [Data]
- Kyle Gorman and Steven Bedrick. 2019. We need to talk about standard splits. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2786-2791. Outstanding Paper Award. [Software]
- Sabrina J. Mielke, Ryan Cotterell, Kyle Gorman, Brian Roark, and Jason Eisner. 2019. What kind of language is hard to language-model? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 4975-4989.
- Hao Zhang, Richard Sproat, Axel H. Ng, Felix Stahlberg, Xiaochang Peng, Kyle Gorman, and Brian Roark. 2019. Neural models of text normalization for speech applications. Computational Linguistics 45(2): 293-337.
- Kyle Gorman, Gleb Mazovetskiy, and Vitaly Nikolaev. 2018. Improving homograph disambiguation with machine learning. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation, pages 1349-1352. [Data]
- Axel H. Ng, Kyle Gorman, and Richard Sproat. 2017. Minimally supervised written-to-spoken text normalization. In ASRU, pages 665-670. [Data]
- Joel Adams, Steven Bedrick, Gerasimos Fergadiotis, Kyle Gorman, and Jan van Santen. 2017. Target word prediction and paraphasia classification in spoken discourse. In Proceedings of the BioNLP Workshop, pages 1-8.
- Heather MacFarlane, Kyle Gorman, Rosemary Ingham, Alison Presmanes Hill, Katina Papadakis, Géza Kiss, and Jan van Santen. 2017. Quantitative analysis of disfluency in children with autism spectrum disorder or language impairment. PLOS ONE 12(3): e0173936. [Data]
- Gerasimos Fergadiotis, Kyle Gorman, and Steven Bedrick. 2016. Algorithmic classification of five characteristic types of paraphasias. American Journal of Speech-Language Pathology 25(4S): S776-S787.
- Kyle Gorman and Richard Sproat. 2016. Minimally supervised number normalization. Transactions of the Association for Computational Linguistics 4: 507-519. [Data]
- Kyle Gorman. 2016. Pynini: a Python library for weighted finite-state grammar compilation. In Proceedings of the ACL Workshop on Statistical NLP and Weighted Automata, pages 75-80. [Software]
- Kyle Gorman, Lindsay Olson, Alison Presmanes Hill, Rebecca Lunsford, Peter Heeman, and Jan van Santen. 2016. Uh and um in children with autism spectrum disorders or language impairment. Autism Research 9(8): 854-865.
- Alison Presmanes Hill, Jan van Santen, Kyle Gorman, Beth H. Langhorst, and Eric Fombonne. 2015. Memory in language-impaired children with and without autism. Journal of Neurodevelopmental Disorders 7: 19.
- Kyle Gorman, Steven Bedrick, Géza Kiss, Eric Morley, Rosemary Ingham, Metrah Mohammad, Katina Papadakis, and Jan van Santen. 2015. Automated morphological analysis of clinical language samples. In Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology, pages 108-116.
- Constantine Lignos and Kyle Gorman. 2014. Revisiting frequency and storage in morphological processing. In Proceedings of the Forty-Eighth Annual Meeting of the Chicago Linguistic Society, pages 447-461.
- Kyle Gorman. 2014b. Exceptions to rhotacism. In Proceedings of the Forty-Eighth Annual Meeting of the Chicago Linguistic Society, pages 279-293.
- Kyle Gorman. 2014a. A program for phonotactic theory. In Proceedings of the Forty-Seventh Annual Meeting of the Chicago Linguistic Society: The Main Session, pages 79-93.
- Maider Lehr, Kyle Gorman, and Izhak Shafran. 2014. Discriminative pronunciation modeling for dialectal speech recognition. In INTERSPEECH, pages 1458-1462.
- Lars Hinrichs, Axel Bohmann, and Kyle Gorman. 2013. Real-time trends in the Texas English vowel system: F2 trajectory in GOOSE as an index of a variety's ongoing delocalization. Rice Working Papers in Linguistics 4: 1-12.
- Kyle Gorman. 2013. Generative phonotactics. University of Pennsylvania dissertation.
- Kyle Gorman and Daniel Ezra Johnson. 2013. Quantitative analysis. In Robert Bayley, Richard Cameron, and Ceil Lucas (eds.), The Oxford Handbook of Sociolinguistics, pages 214-240. Oxford University Press. [Preprint PDF] [Supplemental material]
- Kyle Gorman. 2011. Review: Defective Paradigms: Missing Forms and What They Tell Us. LINGUIST List 22.2894.
- Kyle Gorman. 2011. Latin rhotacism for real. Ms., University of Pennsylvania. [Superseded by Gorman 2014b]
- Kyle Gorman, Jonathan Howell, and Michael Wagner. 2011. Prosodylab-Aligner: a tool for forced alignment of laboratory speech. Journal of the Canadian Acoustical Association 39(3): 192-193. [Software]
- Josef Fruehwald and Kyle Gorman. 2011. Cross-derivational feeding is epiphenomenal. Studies in the Linguistic Sciences 2011:36-50.
- Kyle Gorman. 2010. The consequences of multicollinearity among socioeconomic predictors of negative concord in Philadelphia. U. Penn Working Papers in Linguistics 16(2): 66-75.
- Kyle Gorman. 2009. Hierarchical regression for language research. University of Pennsylvania IRCS Technical Report 09-02.
- Catherine Lai, Kyle Gorman, Jiahong Yuan, and Mark Liberman. 2007. Perception of disfluency: Language differences and listener bias. In INTERSPEECH, pages 2345-2348.
Patents
Grants
Open-source software
- SWIPE': robust pitch tracking
- OpenFst: weighted finite-state transducers
- Pynini: finite-state grammar compilation in Python
- WikiPron: scrapes pronunciation data from Wiktionary
- Prosodylab-Aligner: forced alignment
- ldamatch: statistical condition matching in R
- Detector Morse: sentence boundary detection in Python
- Baum-Welch: OpenFst extension for Baum-Welch/expectation maximization training
- Perceptronix: sparse and dense binomial and multinomial linear models in C++ and Python
Other
"The pale Usher–threadbare in coat, heart, body, and brain; I see him now. He was ever dusting his old lexicons and grammars, with a queer handkerchief, mockingly embellished with all the gay flags of all the known nations of the world. He loved to dust his old grammars; it somehow mildly reminded him of his mortality." Herman Melville, Moby-Dick; or The Whale