Markdown isn’t good enough to replace LaTeX

I am generally sympathetic with calls to replace LaTeX with something else. LaTeX has terrible defaults, Unicode and font support is a constant problem, the syntax is deliberately obfuscatory, and actual generation is painfully slow (probably because the whole thing is a big pasta factory of interpreted code instead of a single static library).

But at the same time, I don’t think Markdown is really good enough to replace LaTeX. Of course one can use Pandoc to generate LaTeX from Markdown notes, and its output is often a decent thing to copy and paste into your LaTeX document. But Markdown just doesn’t solve any of the issues I mention, except making the syntax a tad more WYSIWYG than it would be otherwise. And Markdown is quite a bit worse at one thing: the extended syntax for tables is very hard to key in and still much less expressive than LaTeX’s actually pretty rational tabular environment.

On the different types of magic

In two earlier posts, I discussed the idea of magic, my term for the deductive necessity that some linguistic property distinguishes those morphemes which undergo or do not undergo surface-unpredictable alternations. For instance, the Spanish verb negar ‘to deny’ diphthongizes under stress (e.g., niego ‘I deny’), whereas the superficially similar pegar ‘to stick to s.t.’ does not (pego ‘I stick to s.t.’), and there must be something different about the two stems that causes this.

In a forthcoming handbook chapter, Baković, Heinz, and Rawski (in press; henceforth BHR) take up the familiar Kiparskian question of the loci of phonological abstractness, and in doing so, they discuss several ways in which this magic might be encoded. I would like to briefly review their taxonomy.

Under the suppletive analysis, magic verbs like negar have two underlying stems, perhaps /neg-/ and /njɛg-/, the latter used when primary stress falls on the stem. Linguists have (rightly, I think) been uncomfortable with this kind of analysis when the supposedly suppletive stem allomorphs are phonologically similar, and when the distribution of the allomorphs is easily stated in phonological or morphosyntactic terms; both are the case here. However, Aronoff (1994) argues that one must recognize the existence of suppletive patterns, and his major case studies (from Hebrew and Latin) involve less-similar stem allomorphs whose distributions are not so easily stated. I am not immediately convinced by Aronoff’s arguments, but I think they should be taken seriously. BHR are similarly skeptical of the use of suppletion except in cases where the allomorphs share little material (e.g., the Korean nominative, which is realized as /-ka/ or /-i/ depending on context).²

Under the abstract diacritic analysis, magical stems bear a feature which is part of the environment for some rule. One concrete version of this is to make the diacritic literally a rule feature, such that rule R cannot apply to a stem unless that stem bears the feature [+R]. For instance, we might represent the stems of the two Spanish verbs as /neg- {+diph}/ and /peg- {−diph}/.¹ We of course need then to write the diphthongization structural change (but this is not hard) and to specify its environment (but this is no more or less hard than it would be under the suppletive analysis).
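As a rough illustration of the rule-feature version of this analysis, the following Python sketch stores [±diph] as a boolean alongside each stem and lets diphthongization apply only to stressed [+diph] stems. The `Stem` class, the `diphthongize` function, and the use of orthographic e → ie as a stand-in for the phonological structural change are all assumptions of the sketch, not part of BHR's proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stem:
    form: str   # Orthographic stand-in for the underlying stem.
    diph: bool  # The rule feature: True for [+diph], False for [-diph].


def diphthongize(stem: Stem, stem_stressed: bool) -> str:
    """Applies e -> ie only to stressed stems bearing [+diph]."""
    if stem.diph and stem_stressed and "e" in stem.form:
        # Rewrites the last e in the stem (a simplifying assumption).
        head, _, tail = stem.form.rpartition("e")
        return f"{head}ie{tail}"
    return stem.form


NEGAR = Stem("neg", diph=True)
PEGAR = Stem("peg", diph=False)

print(diphthongize(NEGAR, stem_stressed=True) + "o")      # niego
print(diphthongize(PEGAR, stem_stressed=True) + "o")      # pego
print(diphthongize(NEGAR, stem_stressed=False) + "amos")  # negamos
```

The two stems differ only in the boolean, which is the sense in which the diacritic, rather than the segmental content, carries the magic.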

Finally, under the abstract phoneme analysis, magical stems contain phonemes³ which are “abstract” (in a sense to be specified shortly) and trigger the relevant rule (here, diphthongization). BHR discern two types of abstract phonemes: absolutely abstract phonemes are feature bundles which do not appear on the surface, and restrictedly abstract phonemes consist of surface-licit feature bundles which surface in some, but not all, of the contexts in which they are posited.⁴

The distinction between abstract diacritics and abstract phonemes seems important. It is probably not surprising that self-described morphologists seem to prefer abstract diacritics whereas phonologists prefer abstract phonemes.

Endnotes

One can further imagine that {−diph} is not present underlyingly but is filled in by a lexical redundancy rule early in the derivation, at least for 1st (-a-) and 2nd (-e-) conjugation verbs for which non-diphthongization seems to be the default (see Gorman & Yang 2019 and citations therein). Similar redundancy rules will be needed for all rules of “pure phonology”, those which do not show morpholexical conditioning.
This type of analysis is closely related to Gouskova and Pater’s concept of “exceptional morphs” in Optimality Theory.
I note it is not exactly the phoneme itself which is abstract, but rather the overall phonemic form of the morph. For instance, according to Lieber (1987:100f.), German umlaut (a fronting of a [+back] stem vowel) is triggered by a “floating” [−back] feature which is underlyingly present in just those stems which undergo umlaut.
It is not clear to me whether BHR treat this distinction as an object of grammar, or whether it’s just a descriptive notion.

References

Aronoff, M. 1994. Morphology by Itself. MIT Press.
Baković, E., Heinz, J., and Rawski, J. In press. Phonological abstractness in the mental lexicon. In The Oxford Handbook of the Mental Lexicon, to appear.
Gorman, K. and Yang, C. 2019. When nobody wins. In Franz Rainer, Francesco Gardani, Hans Christian Luschützky and Wolfgang U. Dressler (ed.), Competition in inflection and word formation, pages 169-193. Springer.
Harris, J. 1969. Spanish Phonology. MIT Press.
Harris, J. 1985. Spanish diphthongisation and stress: a paradox resolved. Phonology Yearbook 2:31-45.
Lieber, R. 1987. An Integrated Theory of Autosegmental Processes. State University of New York Press.

Noam on neural networks

I just crashed a Zoom conference in which Noam Chomsky was the discussant. (What I have to say will be heavily paraphrased: I wasn’t taking notes.) One back-and-forth stuck with me. Someone asked Noam what people interested in language and cognition ought to study, other than linguistics itself. He mentioned various biological systems, and said, however, that they probably shouldn’t bother to study neural networks, since they have very little in common with intelligent biological systems (despite their branding as “neural” and “brain-inspired”). He stated that he is grateful for Zoom closed captions (he has some hearing loss), but that one should not conflate that with language understanding. He said, similarly, that he’s grateful for snow plows, but one shouldn’t confuse such a useful technology with theories of the physical world.

For myself, I think they’re not uninteresting devices, and that linguists are uniquely situated to evaluate them (adversarially, I hope) as models of language. I also think they can be viewed as powerful black boxes for studying the limits of domain-general pattern learning. Sometimes we want to ask whether certain linguistic information is actually present in the input, and some of my work (e.g., Gorman et al. 2019) looks at that in some detail. But I do share some intuition that they are not likely to greatly expand our understanding of human language overall.

References

Gorman, K., McCarthy, A. D., Cotterell, R., Vylomova, E., Silfverberg, M., and Markowska, M. 2019. Weird inflects but OK: making sense of morphological generation errors. In Proceedings of the 23rd Conference on Computational Natural Language Learning, pages 140-151.

O in truncated compounds

English uses stump compounds formed by taking (roughly) the first syllable of two (or three) words and adjoining them. This is presumably the process behind real-estate neologisms like Soho (< South of Houston) and Noho (< North of Houston), truncated brand names like HoJo (< Howard Johnson, a hotel chain), and nicknames for celebrities like BoJo (< Beau Johnson) and FloJo (< Florence Griffith Joyner). One strange property of these compounds (documented in an unpublished paper I presented with Laurel MacKenzie at an LSA meeting many years ago) is that when such compounds contain an orthographic <o>, it is almost always pronounced with the GOAT vowel (e.g., American English [oʊ]) even when that is unfaithful to the underlying pronunciation. Thus Soho is [ˈsoʊ.hoʊ], not the more faithful [ˈsaʊ.haʊ]. And the second syllable of Samohi, a stump compound for Santa Monica High School, is presumably read […moʊ…] even though it stands in for [ˈmɑ.nɪ.kə].

(h/t: Laurel, of course.)

banner reading "Samohi"

Defectivity in Swedish

[This is part of a small but growing series of defectivity case studies.]

Swedish has two genders: a common (or uter) and a neuter. The uter form consists solely of the adjectival stem, whereas the neuter is formed by appending a suffix normally spelled -tt. This suffix, by hypothesis /-tː/, triggers voice assimilation, degemination and/or vowel shortening in some stems. For instance, the neuter form of röd [røːd] ‘red’ is rött [rœt]: here /…d-tː/ is realized as just [t] as the result of assimilation and degemination, and long /øː/ is shortened to short (and lower) [œ].

However, not all adjectives have a well-formed neuter (e.g., Hellberg 1972, Eliasson 1975, Iverson 1981, Löwenadler 2010). Some of the defective categories, after Löwenadler, are:

Both monosyllabic adjectives ending in a short vowel followed by -dd: fadd ‘stale’, and rädd ‘scared’. (However, Hellberg notes that neuter past participles, which have the same surface form, are well-formed: thus fött is the well-formed neuter past participle of föda ‘to feed’. Presumably the past participle formative /-d-/ is treated differently than stem-final /-d/.)
Certain monosyllabic adjectives with long vowels ending in -t or -d: lat ‘lazy’, flat ‘ibid.’, kåt ‘horny’, rät ‘straight’, pryd ‘prudish’, vred ‘wrathful’, snöd ‘vile’.
Most polysyllabic adjectives in -d with final stress, many of which are borrowings from French: morbid ‘ibid.’, hybrid ‘ibid.’, rapid ‘ibid.’, gravid ‘pregnant’, timid ‘ibid.’. (However, Hellberg reports that solid ‘ibid.’ has a neuter: solitt [sulitː] is apparently well-formed.)
Adjectives ending in a stressed vowel: disträ ‘absent-minded’, blasé ‘ibid.’, kry ‘healthy’.

As with Norwegian, I am left wondering whether there are other places in Swedish grammar where -dd affixation might lead to ineffability. Eliasson (1975) and Iverson (1981) claim that verbs in -dd never follow the second or third conjugation, in which certain cells would pose similar problems to the neuter adjectives. Instead such verbs all belong to the first conjugation, which has a theme marker -a- which avoids this issue.

It also seems that the wellformedness of solitt will be an important point for any final theory. There is clearly some individual variation too, as documented by Löwenadler (2010).

Other theoretical accounts of this phenomenon, which I didn’t find much to say about, include Buchanan 2007, Lofstedt 2010, and Raffelsiefen 2002.

References

Buchanan, C. H. 2007. Deriving asymmetry in Swedish and Icelandic inflexional paradigms. Master’s thesis, University of Tromsø.
Eliasson, S. 1975. On the issue of directionality. In K.-H. Dahlstedt (ed.), The Nordic Languages and Modern Linguistics 2, pages 421-455. Almqvist & Wiksell.
Hellberg, S. 1972. Ordering relations in the phonology of Swedish adjectives. Gothenburg Papers in Theoretical Linguistics 13: 1-16.
Iverson, G. 1981. Rules, constraints, and paradigm lacunae. Glossa 15: 136-144.
Lofstedt, I. P. M. 2010. Phonetic effects in Swedish phonology: allomorphy and paradigms. Doctoral dissertation, University of California, Los Angeles.
Löwenadler, J. 2010. Restrictions on productivity: Defectiveness in Swedish adjective paradigms. Morphology 20: 70-107.
Raffelsiefen, R. 2002. Quantity and syllable weight in Swedish. Ms.

Defectivity in Norwegian

[This is part of a small but growing series of defectivity case studies.]

Icelandic is not the only Scandinavian language to exhibit defectivity in imperatives: Rice (2003, 2005; henceforth R) describes a superficially similar pattern in Norwegian.

In Norwegian, the infinitival form of most verbs consists of the particle å, the verb stem, and a schwa (which, like in German, is spelled -e). Such verbs’ imperatives then consist of the bare stem, without a particle or the schwa; e.g., å skrive ‘to write’/skriv ‘write!’. A second, smaller class of verbs are monosyllables ending in a (non-schwa) vowel. These verbs use the bare verb stem in both infinitive and imperative; e.g., å tre ‘to step’/tre ‘step!’. While R does not go into any details about how these two patterns might be encoded, one might posit two allomorphs of the infinitive suffix, -e and zero. Presumably this allomorphy is in part lexically conditioned, since it seems necessary to distinguish between minimal pairs like å vie ‘to dedicate’/vi ‘dedicate!’, which belongs to the former class, and å si ‘to say’/si ‘say!’, which belongs to the latter. However, R only gives a few examples of vowel-final monosyllables with infinitives in -e (all other verbs of this shape have zero infinitives), so it’s possible these are just exceptions and the allomorphy conditioning is mostly phonological.

A third class of verbs are those whose stem ends in a rising-sonority consonant cluster; e.g., åpne ‘to open’, sykle ‘to bike’.¹ These superficially resemble the first class of verbs (e.g., å skrive) in that they end in a schwa in the infinitive. However, Norwegian does not permit rising sonority codas, so the expected *åpn, *sykl, and so on are ill-formed.
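The three verb classes described so far can be sketched, very roughly, in Python. This is an orthography-based toy, not R's analysis: the vowel and sonorant inventories, the treatment of any final obstruent + sonorant sequence as a rising-sonority coda, and the function name `imperative` are all assumptions of the sketch. Because it takes the infinitive as input, it also sidesteps the question of how the -e/zero allomorphy is conditioned.

```python
import unicodedata
from typing import Optional

# Assumed (orthographic) inventories; not taken from R.
VOWELS = set("aeiouyæøå")
SONORANTS = set("lmnr")


def imperative(infinitive: str) -> Optional[str]:
    """Returns the bare-stem imperative, or None if it would be ill-formed."""
    word = unicodedata.normalize("NFC", infinitive.lower())
    # Drops the infinitival particle, if present.
    if word.startswith("å "):
        word = word[2:]
    # Strips the infinitival schwa (-e), unless that would leave no vowel,
    # as with monosyllables like "tre".
    if word.endswith("e") and any(ch in VOWELS for ch in word[:-1]):
        word = word[:-1]
    # Treats a final obstruent + sonorant cluster as a rising-sonority coda.
    if (
        len(word) >= 2
        and word[-1] in SONORANTS
        and word[-2] not in VOWELS
        and word[-2] not in SONORANTS
    ):
        return None
    return word


for verb in ["å skrive", "å tre", "å vie", "å si", "åpne", "sykle"]:
    print(verb, "->", imperative(verb))
```

Returning `None` here models only the gap itself; it says nothing about the repair strategies some speakers use.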

According to R, some speakers simply use circumlocutions to avoid the imperative of such verbs, making this a standard case of defectivity. However, R mentions several other strategies used by Norwegian speakers:²

The word-final sonorant can be made syllabic (e.g., [oːpn̩]).
If the cluster consists of a voiceless consonant followed by a sonorant, the sonorant can be devoiced, reducing the sonority rise (e.g., [oːpn̥]).
One can insert a schwa to break up the cluster (e.g., [oːp.pɘn]).
One can insert a schwa after the cluster (e.g., [oːp.nɘ]).

One question that arises is whether there are any other places in the Norwegian grammar where we would expect word-final rising sonority consonant clusters to surface. As others have noted (e.g., Albright 2009), most if not all instances of inflectional defectivity are limited to specific morphological categories. For speakers who cannot generate an imperative of verbs like åpne or sykle, is this defectivity limited to the category of imperatives, or is it found anywhere else in the language?

Endnotes

R gives the infinitives of this third class of verbs without the å particle. It is unclear to me whether this is intentional or just an oversight.
These forms are ones I have posited on the basis of R’s description, which is not as detailed as one might like.

References

Albright, A. 2009. Lexical and morphological conditioning of paradigm gaps. In C. Rice and S. Blaho (ed.), When Nothing Wins: Modeling Ungrammaticality in OT, pages 117-164. Equinox.
Rice, C. 2003. Dialectal variation in Norwegian imperatives. Nordlyd 31: 372-384.
Rice, C. 2005. Optimal gaps in optimal paradigms. Catalan Journal of Linguistics 4: 155-170.

Defectivity in Icelandic

Hansson (1999; henceforth H) discusses an interesting case of defectivity in Icelandic imperative formation. According to H, this language has three types of (2sg.) imperative.

The root imperative is available only as a “deliberate archaism”; it won’t be considered further.
The full imperative consists of the root plus a coronal suffix plus a 2sg. pronominal enclitic -u /ʏ/.
The clipped imperative also consists of the root plus a coronal suffix but uses a contrastively stressed pronoun ‘you’ (cf. English ‘YOU work!’) instead of a clitic.

For example, the full imperative for taka ‘to take’ is taktu [ˈtʰaxtʏ] and the clipped imperative is takt ÞÚ [tʰaxt ˈθuː].¹ H develops an account of the allomorphy of the dental suffix in the full and clipped imperatives; going forward I will cite the full forms, since the distinction is irrelevant. Under H’s analysis, there are two allomorphs:

/-T-/ is a [−spread glottis] coronal obstruent surfacing as [t] or [ð] depending on context; e.g., the full imperative for negla ‘to nail’ is negldu [ˈnɛɣ͡ltʏ].²
/-Tʰ-/ is a [+spread glottis] coronal obstruent, surfacing as [t] with devoicing of preceding stem-final consonants; e.g., the full imperative for synda ‘to swim’ is syntu [ˈsɪn̥tʏ].

H claims that “[f]or the vast majority of verbs, the choice of allomorph is uniquely determined on the basis of the root-final consonant(s)” (p. 108), implying that this is a phonologically conditioned allomorphy, though the conditioning is not given in prose form. H also implies (fn. 4) that this is suppletive allomorphy, though this assumption is also not justified. Let us assume, for sake of argument, that both assumptions are correct and this is a case of phonologically conditioned suppletive allomorphy. Finally, H notes that under his assumptions, there are certain roots for which either allomorph would give the same imperative surface form.

There are several exceptional verbs for which the phonological conditioning H proposes yields an incorrect result. For instance, the full imperative of senda ‘to send’ is the /-T-/ form sendu [ˈsɛntʏ] rather than the expected /-Tʰ-/ form *[ˈsɛn̥tʏ].³ H draws attention to weak verbs whose roots end in /ll, nn/. For these, H’s account of the phonological conditioning ought to prefer /-T-/, but most select /-Tʰ-/.⁴

There are four strong verbs whose roots end in /ll, nn/. So far, other than the characteristic ablaut, we have seen no reason to treat imperative formation in the strong verbs differently than in weak verbs.⁵ For example, for stela ‘to steal’, the full imperative is the /-T-/ form steldu [ˈstɛltʏ]. Yet, there are three strong verbs in /ll, nn/ for which neither possible form of the imperative is well-formed. These are the verbs vinna ‘to work’ (*vinndu, *vinntu), spinna ‘to spin (s.t.)’ (*spinndu, *spinntu), and falla ‘to fall; flunk’ (*falldu, *falltu). And to make matters more complex, there is one strong verb in /nn/ for which the “expected” /-T-/ is acceptable: the full imperative of finna ‘to find’ is finndu [ˈfɪntʏ].

H identifies the following explananda for imperative formation in Icelandic.

The imperative stem is always the same as the past stem in weak verbs.
Yet, defectivity is found only in imperatives and never in pasts.
Defectivity occurs only in strong verbs.
Defectivity is found only in roots in /ll, nn/, a form which “usually is indicative of exceptionality in allomorph selection” (p. 344).

It is not obvious to me that the first explanandum is meaningful. While many linguists believe in “Priscian”-like mechanisms which permit direct encoding of these kinds of facts, the mere stem identity of two semantically distant parts of speech is not itself compelling evidence. In this particular case, one might implement these facts without referring to identity by deriving the allomorphy from a verbal theme, perhaps a floating [α spread glottis] feature, which surfaces in both the imperative and the past. Thus roots selecting /-Tʰ-/ might be underlyingly something like /√-ʰ/ where the surd denotes the root and /ʰ/ a thematic [+spread glottis] specification.

The second explanandum does seem to be meaningful, even independently of the first. One possible fact that might be relevant here is that (other than the enclitic) the Icelandic imperative is bare, whereas weak verb stems are, to my knowledge, always followed by a vowel-initial suffix. So one could imagine that this is, in part, a phonotactic effect at some level of prosodic structure that does not include the clitic.

The third explanandum also seems meaningful. One can, for instance, frame it as a simple statistical hypothesis test, the null hypothesis being that imperative defectivity is independent of the strong/weak distinction. While I don’t have psychologically plausible counts of the strong and weak verbs—the numbers I need to compute sufficient statistics for this test—in front of me, I suspect the probability of observing this pattern under the null hypothesis is going to be vanishingly small.
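To make the shape of such a test concrete, here is a minimal Python sketch of a one-sided exact (hypergeometric) computation. Since lexicon-wide counts are not available here, it restricts attention, purely as an assumption of the sketch, to verbs in /ll, nn/, using figures mentioned in this post: four strong verbs, 33 weak verbs (H's count), and three defective verbs, all of them strong. The function name is hypothetical, and a real test would need counts over a psychologically plausible lexicon.

```python
from math import comb


def prob_all_defective_strong(n_strong: int, n_weak: int, n_defective: int) -> float:
    """Probability, under independence, that every defective verb is strong.

    Treats the defective verbs as a random draw (without replacement) from
    all verbs; this is the one-sided p-value for the most extreme outcome.
    """
    total = n_strong + n_weak
    return comb(n_strong, n_defective) / comb(total, n_defective)


# Counts restricted to verbs in /ll, nn/ (an assumption of this sketch).
p = prob_all_defective_strong(n_strong=4, n_weak=33, n_defective=3)
print(f"{p:.6f}")
```

With these counts the probability is 4/7770, roughly 0.0005, so even this restricted version of the test points the same way as the intuition above.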

The fourth and final explanandum is certainly one worth incorporating into any analysis. However, I think the obvious step has not yet been taken: serious attempts ought to be made to incorporate it into a phonological account of the coronal suffix allomorphy, something H unfortunately has not attempted. If we are in fact to regard verbs in /ll, nn/ as lexically exceptional, one should first reasonably exhaust possible phonological accounts. One direction for future research would be to better understand the allomorphy associated with the imperative and past stems in Icelandic in general.

H proposes, essentially, that defectivity results in strong verbs in /ll, nn/ because such verbs lack a coronal-suffixed past tense form elsewhere in the paradigm; he adds that the strong imperative finndu is exempted because there are other /…nt/ forms in the paradigm of that verb. So many, many, many different things have to go wrong for a defective imperative in Icelandic: essentially, one has to be imperative, in /ll, nn/, and lack other coronal-final stems, and these come together in just three verbs in the entire language. Whether or not one finds H’s account compelling, it is very difficult to reason much about the theory of defectivity from the existence of no more than three verbs in a language. We might do better to focus on languages, like Greek or Russian, in which inflectional defectivity has much higher type frequency.

Endnotes

Whether or not the full and the clipped imperative are pragmatically substitutable is unclear to me from H’s description.
Unfortunately, H does not always give the orthographic form of the words he is citing, and given the language’s famously difficult spelling, I am not always certain I have guessed the correct spelling for inflected forms. However, it appears to me that the contrast between /-T-/ and /-Tʰ-/ is spelled as -d- vs. -t-.
Once again, it is not clear why this is the expected form because the only description of the phonological conditioning is given in a sketchy Optimality Theory analysis (H:§2.1-2).
The relevant statistic is that 6 out of 33 weak verbs in /ll, nn/ select the “expected” /-T-/. From this H concludes that in this environment, “the exceptions far outnumber the regulars” (p. 113). I note briefly that under the tolerance principle (Yang 2005), an environment of 33 examples can tolerate up to 9 exceptions, so this could be a productive generalization according to that theory.
In H’s examples, strong imperatives use the same ablaut grade as the infinitive, so we just have to take his word that they are in fact strong.
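The tolerance-principle arithmetic in the endnote on weak verbs in /ll, nn/ can be checked with a few lines of Python. This sketch assumes the usual statement of the threshold, N / ln N, for a rule with N items in its scope; the function names are hypothetical.

```python
import math


def tolerance_threshold(n: int) -> float:
    """Returns N / ln N, the assumed tolerance threshold for N items."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return n / math.log(n)


def tolerates(n: int, exceptions: int) -> bool:
    """True if a rule over n items can tolerate this many exceptions."""
    return exceptions <= tolerance_threshold(n)


print(math.floor(tolerance_threshold(33)))  # 9
print(tolerates(33, 6))   # True
print(tolerates(33, 27))  # False
```

On this arithmetic, a generalization over the 33 verbs survives if the six /-T-/ verbs are counted as its exceptions, but not if the other 27 are.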

References

Hansson, G. Ó. 1999. ‘When in doubt…’: intraparadigmatic dependencies and gaps in Icelandic. In Proceedings of NELS 29, pages 105-119. GLSA Publications.
Yang, C. 2005. On productivity. Linguistic Variation Yearbook 5: 333-370.

Grapholinguistics talk

Here are slides from a talk, coauthored with Richard Sproat, given at the Grapholinguistics in the 21st Century conference, on how we talk about writing systems in speech and language processing. We will try to get this into archival form soon.

On “from scratch”

For a variety of historical and sociocultural reasons, nearly all natural language processing (NLP) research involves processing of text, i.e., written documents (Gorman & Sproat 2022). Furthermore, most speech processing research uses written text either as input or output.

A great deal of speech or language processing treats words (however they are understood) as atomic, indivisible units rather than the “intricately structured objects linguists have long recognized them to be” (Gorman in press). But there has been a recent trend to instead work with individual Unicode codepoints, or even the individual bytes of a Unicode string encoded in UTF-8. When such systems are part of an “end-to-end” neural network, these systems are sometimes said to be “from scratch”; see, e.g., Gillick et al. 2016 and Li et al. 2019, who both use this exact phrase to describe their contributions. There is an implication that such systems, by bypassing the fraught notion of word, have somehow eliminated the need for linguistic insight altogether.
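For concreteness, the difference between the codepoint and byte representations can be seen in a short Python sketch. The example word is arbitrary; the escape `\u00ef` is used so that the string unambiguously contains a single precomposed codepoint.

```python
# "naïve" with a precomposed i-diaeresis (U+00EF).
text = "na\u00efve"

# One element per Unicode codepoint.
codepoints = [f"U+{ord(ch):04X}" for ch in text]
# One element per byte of the UTF-8 encoding.
utf8_bytes = list(text.encode("utf-8"))

print(len(codepoints), codepoints)
# 5 ['U+006E', 'U+0061', 'U+00EF', 'U+0076', 'U+0065']
print(len(utf8_bytes), utf8_bytes)
# 6 [110, 97, 195, 175, 118, 101]
```

Either way, the model's input is still a sequence of symbols produced by a writing system, just sliced at a finer grain.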

The expression “from scratch” makes an analogy to baking: it is as if we are making angel food cake by sifting flour, superfine sugar, and cream of tartar, rather than using the “just add water and egg whites” mixes from Betty Crocker. But this analogy understates just how much linguistic knowledge can be baked in (or perhaps “sifted in”) to writing systems. Writing systems are essentially a type of linguistic analysis (Sproat 2010), and like any language technology, they necessarily reify the analysis that underlies them.¹ The linguistic analysis underlying a writing system may be quite naïve but may also encode sophisticated phonemic and/or morphemic insights. Thus written text, whether expressed as Unicode codepoints or UTF-8 bytes, may have quite a bit of linguistic knowledge sifted and folded in.

A familiar example of this kind of knowledge comes from English (Gorman in press). In this language, changes in vowel quality triggered by the addition of “level 1” suffixes like -ity are generally not indicated in written form. Thus sane [seɪn] and sanity [sæ.nɪ.ti], for instance, are spelled more similarly than they are pronounced (Chomsky and Halle 1968: 44f.), meaning that this vowel change need not be modeled when working with written text.

Endnotes

The Sumerian and Egyptian scribes were thus history’s first linguists, and history’s first language technologists.

References

Chomsky, N., and Halle, M. 1968. The Sound Pattern of English. Harper & Row.
Gillick, D., Brunk, C., Vinyals, O., and Subramanya, A. 2016. Multilingual language processing from bytes. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1296-1306.
Gorman, K. In press. Computational morphology. In Aronoff, M. and Fudeman, K., What is Morphology? 3rd edition. Blackwell.
Gorman, K., and Sproat, R. 2022. The persistent conflation of writing and language. Paper presented at Grapholinguistics in the 21st Century.
Li, B., Zhang, Y., Sainath, T., Wu, Y., and Chan, W. 2019. Bytes are all you need: end-to-end multilingual speech recognition and synthesis with bytes. In ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 5621-5625.
Sproat, R. 2010. Language, Technology, and Society. Oxford University Press.

The computational revolution in linguistics

(Throughout this post, I have taken pains not to name any names. The beauty of subtweeting and other forms of subposting is that nobody knows for sure you’re the person being discussed unless you volunteer yourself. So, don’t.)

One of the more salient developments in linguistics as a discipline over the last two decades is the way in which computational knowledge has diffused into the field.¹ Twenty years ago, there were but a handful of linguistics professors in North America who could perform elaborate corpus analyses, apply machine learning and statistical analysis, or extract acoustic measurements from an audio file. And, while it was in some ways quite robust, speech and language processing at the turn of the century simply did not hold the same importance it does nowadays.

While some professors—including, to their credit, many of my mentors and colleagues—can be commended for having “skilled up” in the intervening years, this knowledge has, I am sad to say, mostly advanced one death (and subsequent tenure line renewal) at a time. This has negative consequences for linguistics students who want to train for or pivot to a career in the tech sector, since there are professors who were, in their time, computationally sophisticated, but lack the skills a rising computational linguist is expected to have mastered. In an era of contracting tenure rolls and other forms of casualization in the academy, this has the risk of pushing out legitimate, albeit staid, lines of linguistic inquiry in favor of areas favored by capitalists.²

Yet I believe that this upskilling has a lot to contribute to linguistics as a discipline. There are many core questions about language use, acquisition, variation, and change which are best answered with a computational simulation that forces us to be explicit about our assumptions, or a corpus study that tells us what people really said, or a statistical analysis that tells us whether our correlations are likely to be meaningful, or even a machine learning system that helps us rapidly label linguistic data.³ It is a boon to our field that linguists of any age can employ these tools when appropriate.

This is not to say that the transition has not been occasionally ugly. First, there are the occasional nasty turf wars over who exactly is a linguist.⁴ Secondly, the standards of quality for work in this area must be negotiated and imposed. While a syntax paper in NL&LT from even 30 years ago is easily readable today, the computational methods of even widely-praised papers from 15 or 20 years ago are, frankly, often quite sloppy. I have found it necessary to explain this to students who want to interact with this older work lest they lower their own methodological standards.

I discern at least a few common sloppy habits in this older computational work, focusing for the moment on computational cognitive models of linguistic behavior.

If a proposed computational model is compared to some “baseline” or older model, this older model is usually an ancient associationist model from psychology. This older model naturally lacks much of the rich linguistic specifications of the proposed model, and naturally it fails to model the data. Deliberately picking a bad baseline is putting one’s finger on the scale.
Comparison of different computational models is usually informal. One should instead use statistical model comparison methods.
The dependent variable for modeling is often derived from poorly-designed human subjects experiments. The subjects in these experiments may be instructed to perform a task they are unlikely to be able to do consciously (i.e., the tasks are cognitively impenetrable). Unjustified assumptions about appropriate scales of measurement may have been made. Finally, the n’s are often needlessly small. Computational cognitive models demand high-quality measures of the behaviors they’re meant to model.
Once the proposed model has been shown better than the baseline, it is reified far beyond what the evidence suggests. Computational cognitive modeling can at most show that certain explicit assumptions are consistent with the observed data: they cannot establish much beyond that.

The statistician Andrew Gelman writes that scientific discourse sometimes proceeds as if earlier published work has a greater claim to truth than later research that is critical of the original findings (which may or may not be published yet).⁵ Critical interpretation of this older computational work is increasingly called for, as our methodological standards continue to mature. I find reviewers (and literature-reviewers) overly deferential to prior work of dubious quality simply because of its priority.

Endnotes

An under-appreciated element to this process is that it is simply easier to do linguistically-relevant things with computers than it was 20 years prior. For this, one should thank Python and R, NumPy and Scikit-learn, and of course tools like Praat and Parselmouth.
I happen to think college education should not be merely vocational training.
I happen to think most of these questions can be answered with a cheap laptop, and only a few require a CUDA-enabled GPU.
I suspect this is mostly a response to the rapidly casualizing academy. Unfortunately, any question about whether we should be doing X in linguistics is misinterpreted as a question about whether people who do X deserve to have a job. This is a presupposition failure for me: I believe everyone deserves meaningful work, and that academic tenure is a model of labor relations that should be expanded beyond the academy.
To free ourselves of this bias, Gelman proposes what he calls the time-reversal heuristic, in which one imagines the temporal order reversed (e.g., that the later failed replication is now the first published result on the matter) and then re-evaluates the evidence. When interacting with older computational work, similar thinking is called for here.