Dutch names in LaTeX

One thing I recently figured out is a sensible way to handle Dutch names (i.e., those that begin with den, van, or similar particles). Traditionally, these particles are part of the cited name in author-date citations (e.g., den Dikken 2003, van Oostendorp 2009) but are ignored when alphabetizing (thus, van Oostendorp is alphabetized under O, alongside Orgun & Sprouse and Otheguy, not under V, between Vago and Vaux). This is not something handled automatically by tools like LaTeX and BibTeX, but it is relatively easy to annotate name particles like this so that they do the right thing.

First, place, at the top of your BibTeX file, the following:

@preamble{{\providecommand{\noopsort}[1]{}}}

Then, in the individual BibTeX entries, wrap the surname in the author field with this command like so:

 author = {{\noopsort{Dikken}{den Dikken}}, Marcel},

This preserves the correct in-text author-date citations, but also gives the intended alphabetization in the bibliography.
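To make the intended ordering concrete, here is a minimal Python sketch of the alphabetization convention described above. It is not how BibTeX computes its sort keys; the particle list and the function name sort_key are assumptions chosen for illustration.

```python
# Particles to ignore when alphabetizing; an illustrative, incomplete list.
PARTICLES = ("van der", "van den", "van", "den", "der", "de", "ten", "ter")


def sort_key(surname: str) -> str:
    """Returns a case-folded surname with any leading particle removed."""
    folded = surname.casefold()
    # Tries longer particles first so that "van der" wins over "van".
    for particle in sorted(PARTICLES, key=len, reverse=True):
        prefix = particle + " "
        if folded.startswith(prefix):
            return folded[len(prefix):]
    return folded


surnames = ["Vaux", "van Oostendorp", "Otheguy", "Vago", "den Dikken", "Orgun"]
# Particle-blind ordering, as in a traditional bibliography.
ordered = sorted(surnames, key=sort_key)
# Naive ordering, which files van Oostendorp between Vago and Vaux.
naive = sorted(surnames, key=str.casefold)
```

With sort_key, den Dikken files under D and van Oostendorp under O; the naive ordering puts van Oostendorp between Vago and Vaux, which is the behavior the \noopsort annotation is meant to avoid.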

Note of course that not all people with van (etc.) names in the Anglosphere treat the van as if it were a particle to be ignored; a few deliberately alphabetize their last name as if it begins with v.

X moment

A Reddit moment is an expression used to refer to a certain type of cringe ‘cringeworthy behavior or content’ judged characteristic of Redditors, habitual users of the forum website reddit.com. It seems hard to pin down what makes cringe Redditor-like, but discussion on Urban Dictionary suggests that one salient feature is a belief in one’s superiority, or the superiority of Redditors in general; a related feature is irl behavior that takes Reddit too seriously. The normal usage is as an interjection of sorts; presented with cringeworthy internet content (a screenshot or URL), one might simply respond “Reddit moment”.

However, Reddit isn’t the only community that can have a similar type of pejorative X moment. One can find many instances of crackhead moment, describing unpredictable or spazzy behavior. A more complicated example comes from a friend, who shared a link about a software developer who deliberately sabotaged a widely used JavaScript software library to protest the Russian invasion of Ukraine. JavaScript, and the Node.js community in particular, has been extremely vulnerable to both deliberate sabotage and accidental bricking ‘irreversible destruction of technology’, and naturally my friend sent the link with the commentary “js moment”. The one thing that seems to unite all X moment snowclones is a shared negative evaluation of the community in the common ground.

Evaluations from the past

In a literature review, speech and language processing specialists often feel tempted to report evaluation metrics like accuracy, F-score, or word error rate for systems described in the literature review. In my opinion, this is only informative if the prior and present work use the exact same data set(s) for evaluations. (Such results should probably be presented in a table along with results from the present work, not in the body of the literature review.) If, instead, the prior systems were tested on some proprietary data set, an obsolete corpus, or a data set the authors of the present work have declined to evaluate on, this information is inactionable. Authors should omit this information, and reviewers and editors should insist that it be omitted.

It is also clear to me that these numbers are rarely meaningful as measures of how difficult a task is “generally”. To take an example from an unnamed 2019 NAACL paper (one guilty of the sin described above), word error rates on a single task in a single language range between 9.1% and 23.61% (note also the mixed precision). What could we possibly reason from this enormous spread of results across different data sets?

Country (dead)naming

Current events reminded me of an ongoing Discourse about how we ought to refer to the country Ukraine in English. William Taylor, US ambassador to the country under George W. Bush, is quoted on the subject in this Time magazine piece (“Ukraine, Not the Ukraine: The Significance of Three Little Letters”, March 5th, 2014; emphasis mine), which is circulating again today:

The Ukraine is the way the Russians referred to that part of the country during Soviet times … Now that it is a country, a nation, and a recognized state, it is just Ukraine.

Apparently they don’t fact-check claims like this, because this is utter nonsense. Russian doesn’t have definite articles, i.e., words like the. There is simply no straightforward way to express the contrast between the Ukraine and Ukraine in Russian (or in Ukrainian for that matter).

Now, it’s true that the before Ukraine has long been proscribed in English, but this seems to be more a matter of style—the the variant sounds archaic to my ear—than ideology. And, in Russian, there is variation between в Украине and на Украине, both of which I would translate as ‘in Ukraine’. My understanding is that both have been attested for centuries, but one (на) was more widely used during the Soviet era and thus the other (в) is thought to emphasize the country’s sovereignty in the modern era. As I understand it, that one preposition is indexical of Ukrainian nationalist sentiment and another is indexical of Russian revanchist-nationalist sentiment is more or less linguistically arbitrary in the Saussurean sense. Or, more weakly, the connotative differences between the two prepositions are subtle and don’t map cleanly onto the relevant ideologies. But I am not a native (or even competent) speaker of Russian so you should not take my word for it.

Taylor, in the Time article, continues to argue that US media should use the Ukrainian-style transliteration Kyiv instead of the Russian-style transliteration Kiev. This is a more interesting prescription, at least in that the linguistic claim—that Kyiv is the standard Ukrainian transliteration and Kiev is the standard Russian transliteration—is certainly true. However, it probably should be noted that dozens of other cities and countries in non-Anglophone Europe are known by their English exonyms, and no one seems to be demanding that Americans start referring to Wien [viːn] ‘Vienna’ or Moskva ‘Moscow’. In other words Taylor’s prescription is a political exercise rather than a matter of grammatical correctness. (One can’t help but notice that Taylor is a retired neoconservative diplomat pleading for “political correctness”.)

On conspiracies

Kisseberth (1970) introduces the notion of conspiracies, cases in which a series of phonological rules in a single language “conspire” to create similar output configurations. Supposedly, Haj Ross chose the term “conspiracy”, and it is perhaps not an accident that the term he chose immediately reminds one of conspiracy theory, which has a strong negative connotation implying that the existence of the conspiracy cannot be proven. Kisseberth’s discovery of conspiracies motivated the rise of Optimality Theory (OT) two decades later—Prince & Smolensky (1993:1) refer to conspiracies as a “conceptual crisis” at the heart of phonological theory, and Zuraw (2003) explicitly links Kisseberth’s data to OT—but curiously, it seemingly had little effect on contemporary phonological theorizing. (A positivist might say that the theoretical technology needed to encode conspiratorial thinking simply did not exist at the time; a cynic might say that contemporaries did not take Kisseberth’s conspiratorial thinking seriously until it became easy to do so.) I discern two major objections to the logic of conspiracies: the evolutionary argument and the prosodic argument, which I’ll briefly review.

The evolutionary argument

What I am calling the evolutionary argument was first made by Kiparsky (1973:75f.) and is presented as an argument against OT by Hale & Reiss (2008:14). Roughly, if a series of rules leads to the same set of output configurations, they must be surface-true, or they would not contribute to the putative conspiracy. Since surface-true rules are assumed to be easy to learn, especially relative to opaque rules, which are assumed to be difficult to learn, and since failure to learn rules would contribute to language change, grammars will naturally accumulate functionally related surface-true rules. I think we should question the assumption (au courant in 1973) that opacity is the end-all of what makes a rule difficult to acquire, but otherwise I find this basic logic sound.

The prosodic argument

At the time Kisseberth was writing, standard phonological theory included few of the prosodic primitives; even the notion of syllable was considered dubious. Subsequent revisions of the theory have introduced rich hierarchies of prosodic primitives. In particular, a subsequent generation of phonologists hypothesized that speakers “build” or “parse” sequences of segments into onsets and rimes, syllables, and feet, with repairs like stray erasure (i.e., deletion of unsyllabified segmental material) or epenthesis used to resolve conflicts (McCarthy 1979, Steriade 1982, Itô 1986). It seems to me that this approach accounts for most of the facts of Yowlumne (formerly Yawelmani) reviewed by Kisseberth in his study:

  1. there are no word-initial CC clusters
  2. there are no word-final CC clusters
  3. derived CCCs are resolved either by deletion or i-epenthesis
  4. there are no CCC clusters in underlying form

The relevant observation that links all these facts is simply that Yowlumne does not permit branching onsets or codas, or more specifically, that Yowlumne’s syllable-parsing algorithm does not build branching onsets or codas. This immediately accounts for facts #1-2. Assuming the logic of McCarthy and contemporaries, #3 is also unsurprising: these clusters simply cannot be realized faithfully; the fact that there are multiple resolutions for the *CCC pathology is beside the point. And finally, adopting the logic that Prince & Smolensky (1993:54) were later to call Stampean occultation, the absence of underlying CCC clusters follows from their inability to surface, since the generalizations in question are all surface-true. (Here, we are treading closely to Kiparsky’s thoughts on the matter too.) Crucially, the analysis given above does not reify any surface constraints; the facts all follow from the feed-forward derivational structure of prosodically-informed phonological theory current a decade before Prince & Smolensky.
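The parsing logic can be sketched in a few lines of Python. This is a deliberately simplified illustration, not Kisseberth’s analysis or a faithful model of Yowlumne: it works over abstract C/V skeletons, ignores vowel length, and assumes a syllable template with at most one onset consonant and at most one coda consonant. The names SYLLABLE and parsable are hypothetical.

```python
import re

# Assumed template: an optional single-consonant onset, a vowel nucleus, and
# an optional single-consonant coda. No branching onsets or codas are built.
SYLLABLE = re.compile(r"(?:C?VC?)+")


def parsable(skeleton: str) -> bool:
    """True if a C/V skeleton can be exhaustively parsed into syllables."""
    return SYLLABLE.fullmatch(skeleton) is not None


# Medial CC is fine: the cluster is split between a coda and an onset.
medial_cc = parsable("CVCCVC")
# Facts 1-2: initial and final CC cannot be parsed.
initial_cc = parsable("CCVC")
final_cc = parsable("CVCC")
# Facts 3-4: CCC leaves a consonant unsyllabified.
medial_ccc = parsable("CVCCCVC")
# Schematic repairs: either epenthesis or deletion yields a parsable string.
after_epenthesis = parsable("CVCVCCVC")
after_deletion = parsable("CVCCVC")
```

Nothing in the sketch states *CC or *CCC as a constraint; the gaps fall out of the one assumption about which syllables the parser builds.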

Conclusion

While Prince & Smolensky are right to say that OT provides a principled solution to Kisseberth’s notion of conspiracies, researchers in the ’70s and ’80s treated Kisseberth’s conspiracies as epiphenomena of acquisition (Kiparsky) or prosodic structure-building (McCarthy and contemporaries). Perhaps, then, OT does not deserve credit for solving an unsolved problem in this regard. Of course, it remains to be seen whether the many implicit conjectures in these two objections can be sustained.

References

Hale, M. and Reiss, C. 2008. The Phonological Enterprise. Oxford University Press.
Itô, J. 1986. Syllable theory in prosodic phonology. Doctoral dissertation, University of Massachusetts, Amherst. Published by Garland Publishers, 1988.
Kiparsky, P. 1973. Phonological representations. In O. Fujimura (ed.), Three Dimensions of Linguistic Theory, pages 1-135. TEC Corporation.
Kisseberth, C. W. 1970. On the functional unity of phonological rules. Linguistic Inquiry 1(3): 291-306.
McCarthy, J. 1979. Formal problems in Semitic phonology and morphology. Doctoral dissertation, MIT. Published by Garland Publishers, 1985.
Prince, A. and Smolensky, P. 1993. Optimality Theory: constraint interaction in generative grammar. Rutgers Center for Cognitive Science Technical Report TR-2.
Steriade, D. 1982. Greek prosodies and the nature of syllabification. Doctoral dissertation, MIT.
Zuraw, K. 2003. Optimality Theory in linguistics. In M. Arbib (ed.), Handbook of Brain Theory and Neural Networks, pages 819-822. 2nd edition. MIT Press.

On the Germanic *tl gap

One “parochial” constraint in Germanic is the absence of branching onsets consisting of a coronal stop followed by /l/. Thus /pl, bl, kl, gl/ are all common in Germanic, but *tl and *dl are not. It is difficult to understand what might give rise to this phonotactic gap.

Blevins & Grawunder (2009), henceforth B&G, note that in portions of Saxony and points south, *kl has in fact shifted to [tl] and *gl to [dl]. This sound change has been noted in passing by several linguists, going back to at least the 19th century. This change has the hallmarks of a change from below: it does not appear to be subject to social evaluation and is not subject to “correction” in careful speech styles. B&G also note that many varieties of English have undergone this change; according to Wright, it could be found in parts of east Yorkshire. Similarly, no social stigma seems to have attached to this pronunciation, and B&G suggest it may have even made its way into American English. B&G argue that since it has occurred at least twice, KL > TL is a natural sound change in the relevant sense.

Of particular interest to me is B&G’s claim that one structural factor supporting *KL > TL is the absence of TL in Germanic before this change; in all known instances of *KL > TL, the preceding stage of the language lacked (contrastive) TL. While many linguists have argued that TL is universally marked, and that its absence in Germanic is a structural gap in the relevant sense, this does not seem to be borne out by quantitative typology of a wide range of language families.

Of course, other phonotactic gaps, even statistically robust ones, are similarly filled with ease. I submit that evidence of this sort suggests that phonologists habitually overestimate the “structural” nature of phonotactic gaps.

References

Blevins, J. and Grawunder, S. 2009. *KL > TL sound change in Germanic and elsewhere: descriptions, explanations, and implications. Linguistic Typology 13: 267-303.

The curious case of -pilled

A correspondent asks whether -pilled is a libfix. I note grillpilled (when you stop caring about politics and focus on cooking meat outdoors) and catpilled (when you get toxoplasmosis). While writing this, I was wondering whether anyone has declared themselves tennispilled; yes, someone has.

The etymology of -pilled seems clear enough. The phrase taking the {blue, red} pill from that scene in The Matrix (1999) gave rise to the idiomatic compounds blue pill and red pill. These then underwent zero derivation, giving us bluepilled and (especially) redpilled. The most common syntactic function for these two words seems to be as a sort of perfective adjective, possibly with an agentive by-phrase (e.g., “I was redpilled by Donald Trump Jr.’s IG”), but I also recognize a construction where the agent has been promoted to subject position and the object is the benefactor (e.g., “Donald Trump Jr.’s IG redpilled me”).

The thing, though, is that -pilled derives from two idiomatic compounds and still has the form of an English past participle. There is no clear evidence of recutting, just a new reading for the zero-derived pill plus the past participle marker -ed. It is thus much like other not-exactly-libfixes like -core (< hardcore) and -gate (< Watergate), in my estimation.

On expanding acronyms

Student writers are often taught that acronyms should also be given in expanded form on first use. While this is a good rule of thumb in my opinion, there is an exception for any acronym whose expansion the author believes to be misleading about its referent, particularly when the acronym in question seems to have been coined after the fact and purely for the creator’s amusement.

“Many such cases.”

An author-date citation may be preferable to spelling out the silly acronym.

The role of phonotactics in language change

How does phonotactic knowledge influence the path taken by language change? As is often the case, the null hypothesis seems to be simply that it doesn’t. Perhaps speakers have projected a phonotactic constraint C into the grammar of Old English, but that doesn’t necessarily mean that Middle English will conform to C, or even that Middle English won’t freely borrow words that flagrantly violate C.

One case comes from the history of English. As is well known, modern English /ʃ/ descends from Old English sk; modern instances of word-initial sk are mostly borrowed from Dutch (e.g., skipper) or Norse (e.g., ski); sky was borrowed from an Old Norse word meaning ‘cloud’ (which tells you a lot about the weather in the Danelaw). Furthermore, Old English forbids super-heavy long vowel-consonant cluster rimes. Because the one major source for /ʃ/ is sk, and because a word-final long vowel followed by sk was unheard of, V̄ʃ# was rare in Middle English and word-final sequences of tense vowels followed by [ʃ] are still rare in Modern English (Iverson & Salmons 2005). Of course there are exceptions, but according to Iverson & Salmons, they tend to:

  • be markedly foreign (e.g., cartouche),
  • be proper names (e.g., LaRouche),
  • or convey an “affective, onomatopoeic quality” (e.g., sheesh, woosh).

However, it is reasonably clear that all of these were added during the Middle or Modern period. Clearly, this constraint, which is still statistically robust (Gorman 2014a:85), did not prevent speakers from borrowing and coining exceptions to it. Still, it is hard to rule out any historical effect of the constraint: perhaps there would be more Modern English V̄ʃ# words otherwise.

Another case of interest comes from Latin. As is well known, Old Latin went through a near-exceptionless “Neogrammarian” sound change, a “primary split” or “conditioned merger” of intervocalic s with r. (The terminus ante quem, i.e., the latest possible date, for the actuation of this change is the 4th c. BCE.) This change had the effect of temporarily eliminating all traces of intervocalic s in late Old Latin (Gorman 2014b). From this fact, one might posit that speakers of this era of Latin might project a *VsV constraint. And, one might posit that this would prevent subsequent sound changes from reintroducing intervocalic s. But this is clearly not the case: in the 1st c. BCE, degemination of ss after diphthongs and long monophthongs reintroduced intervocalic s (e.g., caussa > classical causa ‘cause’). It is also clear that loanwords with intervocalic s were freely borrowed, and with the exception of the very early Greek borrowing tūs-tūris ‘incense’, none of them were adapted in any way to conform to a putative *VsV constraint:

(1) Greek loanwords: ambrosia ‘id.’, *asōtus ‘libertine’ (acc.sg. asōtum), basis ‘pedestal’, basilica ‘public hall’, casia ‘cinnamon’ (cf. cassia), cerasus ‘cherry’, gausapa ‘woolen cloth’, lasanum ‘cooking utensil’, nausea ‘id.’, pausa ‘pause’, philosophus ‘philosopher’, poēsis ‘poetry’, sarīsa ‘lance’, seselis ‘seseli’
(2) Celtic loanwords: gaesī ‘javelins’, omāsum ‘tripe’
(3) Germanic loanwords: glaesum ‘amber’, bisōntes ‘wild oxen’
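The chronology can be sketched as two string rewrites applied in historical order. This is a rough illustration over orthographic forms, not a serious model of Latin phonology: the vowel and diphthong inventories are simplified assumptions, the function names are hypothetical, and the form lases (the archaic spelling of lares) is brought in from outside the discussion above as a standard textbook example of rhotacism.

```python
import re

# Simplified inventories, assumed for illustration only.
VOWEL = "aeiouāēīōū"


def rhotacize(form: str) -> str:
    """Rewrites intervocalic s as r (the Old Latin change)."""
    return re.sub(rf"(?<=[{VOWEL}])s(?=[{VOWEL}])", "r", form)


def degeminate(form: str) -> str:
    """Simplifies ss after a diphthong or long monophthong (1st c. BCE)."""
    return re.sub(r"(ae|au|oe|[āēīōū])ss", r"\1s", form)


# Rhotacism applies to an inherited intervocalic s.
early = rhotacize("lases")
# The geminate in caussa is not intervocalic s, so it escapes rhotacism;
# later degemination then yields a new intervocalic s.
attested = degeminate(rhotacize("caussa"))
# If *VsV were still enforced after degemination, we would expect this instead.
counterfactual = rhotacize(degeminate("caussa"))
# Likewise, a loanword adapted to *VsV would have come out differently.
adapted_loan = rhotacize("basis")
```

The attested outcome is causa, not the counterfactual caura, and basis was not adapted to baris; both are what one expects if the earlier change left no active constraint behind.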

References

Gorman, K. 2014a. A program for phonotactic theory. In Proceedings of the 47th Annual Meeting of the Chicago Linguistic Society, pages 79-93.
Gorman, K. 2014b. Exceptions to rhotacism. In Proceedings of the 48th Annual Meeting of the Chicago Linguistic Society, pages 279-293.
Iverson, G. K. and Salmons, J. C. 2005. Filling the gap: English tense vowel plus final /š/. Journal of English Linguistics 33: 1-15.

Allophones and pure allophones

I assume you know what an allophone is. But what this blog post supposes […beat…] is that you could be more careful about how you talk about them.

Let us suppose the following:

  • the phonemic inventory of some grammar G contains t and d
  • does not contain s or z
  • yet instances of s or z are found on the surface

Thus we might say that /t, d/ are phonemes and [s, z] are allophones (perhaps of /t, d/: maybe in G, derived coronal stop clusters undergo assibilation).

Let us suppose that you’re writing the introduction to a phonological analysis of G, and in Table 1 (it’s usually Table 1) you list the phonemes you posit, sorted by place and manner. Perhaps you will place s and z in italics or brackets, and the caption will indicate that this refers to segments which are allophones.

I find this imprecise. It suggests that all instances of surface t or d are phonemic (or perhaps more precisely, and more vacuously, are faithful allophones),1 which need not be the case. Perhaps G has a rule of perseveratory obstruent cluster voice assimilation and one can derive surface [pt] from /…p-d…/, or surface [gd] from /…g-t…/, and so on. The confusion here seems to be that we are implicitly treating the sets of allophones and phonemes as disjoint when the former is a superset of the latter. What we seem to actually mean when we say that [s, z] are allophones is rather that they are pure allophones: allophones which are not also phonemes.
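In set-theoretic terms, the distinction is just a set difference. Here is a small Python sketch for the hypothetical grammar G; the inventory members other than t and d are invented for illustration, and the sketch assumes that every phoneme has at least one faithful surface realization.

```python
# Hypothetical phonemic inventory of G; only t and d come from the running
# example, and the other stops are invented for illustration.
phonemes = {"p", "t", "k", "b", "d", "g"}
# Segments observed in surface representations of G.
surface_segments = phonemes | {"s", "z"}

# Assuming every phoneme has a faithful realization, the allophones are just
# the surface segments, and so form a superset of the phonemes.
allophones = surface_segments
# Pure allophones are allophones which are not also phonemes.
pure_allophones = allophones - phonemes
```

On this view t and d are allophones too, and only s and z are pure allophones.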

Another possible way to clarify the hypothetical Table 1 is to simply state what phonemes s and z are allophones of, exactly. For instance, if they are purely derived by assibilation, we might write that “the stridents s, z are (pure) allophones of the associated coronal stops /t, d/ respectively”. However, since this might be beside the point, and because there’s no principled upper bound on how many phonemic sources a given (pure or otherwise) allophone might have, I think it should suffice to suggest that s and z are pure allophones and leave it at that.2

This imprecision, I suspect, is a hang-over from structuralist phonemics, which viewed allophony as separate from (and arguably more privileged or entrenched than) alternations (then called morphophonemics). Of course, this assumption does not appear to have any compelling justification, and as Halle (1959) shows, it leads to substantial duplication (in the sense of Kisseberth 1970) between rules of allophony and rules of neutralization.3 Most linguists since Halle seem to have found the structuralist stipulation and the duplication it gives rise to aesthetically displeasing; I concur.

Endnotes

  1. I leave open the question of whether surface representations ever contain phonemes: perhaps vacuous rules “faithfully” convert them to allophones.
  2. One could (and perhaps should) go further into feature logic, and as such, regard both phonemes and pure allophones as mere bundles of features linked to a single timing slot. However, this makes things harder to talk about.
  3. I do not assume that “neutralization” is a grammatical primitive. It is easily defined (see Bale & Reiss 2018, ch. 20) but I see no reason to suppose that grammars distinguish neutralizing processes from other processes.

References

Bale, A. and Reiss, C. 2018. Phonology: A Formal Introduction. MIT Press.
Halle, M. 1959. The Sound Pattern of Russian. Mouton.
Kisseberth, C. W. 1970. On the functional unity of phonological rules. Linguistic Inquiry 1(3): 291-306.