Use the minus sign for feature specifications

LaTeX has a dizzying number of options for different types of horizontal dash. The following are available:

  • A single - is a short dash appropriate for hyphenated compounds (like encoder-decoder).
  • A single dash in math mode, $-$, is a longer minus sign.
  • A double -- is a longer “en-dash” appropriate for numerical ranges (like 3–5).
  • A triple --- is a long “em-dash” appropriate for interjections (like this—no, I mean like that).

My plea to linguists is to actually use math mode and the minus sign when writing binary features. If you want to turn this into a simple macro, you can place the following in your preamble:

\newcommand{\feature}[2]{\ensuremath{#1}\textsc{#2}}

and then write \feature{-}{Back} for nicely formatted feature specifications.

Note that this issue has an exact parallel in Word and other WYSIWYG setups: there the solution is as simple as selecting the Unicode minus sign (U+2212) from the inventory of special characters (or just googling “Unicode minus sign” and copying and pasting what you find).
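As a quick sanity check, the hyphen-minus and the Unicode minus sign really are distinct characters; one can verify this in, for instance, Python:

```python
import unicodedata

hyphen = "-"        # U+002D HYPHEN-MINUS, what the keyboard produces
minus = "\u2212"    # U+2212 MINUS SIGN, the typographically correct symbol

print(hyphen == minus)          # False: distinct code points
print(unicodedata.name(minus))  # MINUS SIGN
```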

A note on pure allophony

I have previously discussed the notion of pure allophony, contrasting it with the facts of alternations. What follows is a lightly edited section from my recent NAPhC 12 talk, which in part hinges on this notion.


While Halle (1959) famously dispenses with the structuralist distinction between phonemics and morphophonemics, some later generativists reject pure allophony outright. Let the phonemic inventory of some grammar G be P and the set of surface phones generated by G from P be S. If some phoneme p ∈ P always corresponds—in some sense to be made precise—to some phone s ∈ S, and if s ∉ P, then s is a pure allophone of p. For example, if /s/ is a phoneme and [ʃ] is not, but all [ʃ]s correspond to /s/s, then [ʃ] is a pure allophone of /s/. According to some descriptions, this is the case for Korean, as [ʃ] is a (pure) allophone of /s/ when followed by [i].

One might argue that alternations are more entrenched facts than pure allophony, simply because it is always possible to construct a grammar free of pure allophony. For instance, if one wants to do away with pure allophony, one can derive the Korean word [ʃi] ‘poem’ from /ʃi/ rather than from /si/. One early attempt to rule out pure allophony—and thus to motivate the choice of /ʃi/ over /si/ for this problem—is the alternation condition (Kiparsky 1968). As Kenstowicz & Kisseberth (1979:215) state it, this condition holds that “the UR of a morpheme may not contain a phoneme /x/ that is always realized phonetically as identical to the realization of some other phoneme /y/.” [Note here that /x, y/ are to be interpreted as variables rather than as the voiceless velar fricative or the high front rounded vowel.–KBG] Another version of this idea—often attributed to Dell (1973) or Stampe (1973)—is the more recent notion of lexicon optimization (Prince & Smolensky 1993:192).

A correspondent to this list wonders why, in a grammar G such that G(a) = G(b) for potential input elements /a, b/, a nonalternating observed element [a] is not (sometimes, always, freely) lexically /b/. The correct answer is surely “why bother?”—i.e. to set up /b/ for [a] when /a/ will do […] The basic idea reappears as “lexicon optimization” in recent discussions. (Alan Prince, electronic discussion; cited in Hale & Reiss 2008:246)

Should grammars with pure allophony be permitted? The question is not, as is sometimes supposed, a purely philosophical one (see Hale & Reiss 2008:16-22): both linguists and infants acquiring language require a satisfactory answer. In my opinion, the burden of proof lies with those who would deny pure allophony. They must explain how the language acquisition device (LAD) either directly induces grammars that satisfy the alternation condition, or optimizes all pure allophony out of them after the fact. “Why bother” could go either way: why posit either complication to the LAD when pure allophony will do? The linguist faces a similar problem to the infant. To wit, I began this project assuming Latin glide formation was purely allophonic, and only later uncovered—subtle and rare—evidence for vowel-glide alternations. Thus in this study, I make no apology for—and draw no further attention to—the fact that some data are purely allophonic. This important question will have to be settled by other means.

References

Dell, F. 1973. Les règles et les sons. Hermann.
Hale, M. and Reiss, C. 2008. The Phonological Enterprise. Oxford University Press.
Halle, M. 1959. The Sound Pattern of Russian. Mouton.
Kenstowicz, M. and Kisseberth, C. 1979. Generative Phonology: Description and Theory. Academic Press.
Kiparsky, P. 1968. How Abstract is Phonology? Indiana University Linguistics Club.
Prince, A. and Smolensky, P. 1993. Optimality Theory: Constraint interaction in generative grammar. Technical Report TR-2, Rutgers University Center For Cognitive Science and Technical Report CU-CS-533-91, University of Colorado, Boulder Department of Computer Science.
Stampe, D. 1973. A Dissertation on Natural Phonology. Garland.

Defectivity in Amharic

[This is part of a series of defectivity case studies.]

According to Sande (2015), only Amharic verb stems that contain a geminate can form a frequentative. Since not all imperfective verbs have geminates, some lack frequentatives and speakers must resort to periphrasis. If I understand the data correctly, the frequentative is a /Ca-/ reduplicant template which docks to the immediate left of the first geminate; the C (consonant) slot takes its value from that geminate. For instance, for the perfect verb [ˈsäb.bärä] ‘he broke’, the frequentative is [sä.ˈbab.bärä] ‘he broke repeatedly’. But there is no corresponding frequentative for the imperfective verb [ˈjə.säb(ə)r] ‘he breaks’, since there is no geminate for the reduplicant to dock against; Sande marks *[jə.sä.ˈbab(ə)r] as ungrammatical, and presumably other options are out too.
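If my reading of Sande’s generalization is right, the docking process can be sketched as a toy function over segment strings; the vowel inventory and the flat-string representation here are simplifying assumptions for illustration, not claims about Amharic phonology:

```python
# Assumed vowel set, for distinguishing consonant geminates (illustrative only).
VOWELS = {"a", "ä", "e", "i", "ə", "o", "u"}

def frequentative(segments):
    """Dock a /Ca-/ reduplicant immediately left of the first geminate,
    where C copies the geminate consonant. Returns None if there is no
    geminate, modeling the defectivity: such verbs lack a frequentative."""
    for i in range(len(segments) - 1):
        if segments[i] == segments[i + 1] and segments[i] not in VOWELS:
            return segments[:i] + [segments[i], "a"] + segments[i:]
    return None

# [säbbärä] 'he broke' -> [säbabbärä] 'he broke repeatedly'
print("".join(frequentative(list("säbbärä"))))  # säbabbärä
# [jəsäbər] 'he breaks' has no geminate, so no frequentative can be formed.
print(frequentative(list("jəsäbər")))  # None
```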

(h/t: Heather Newell)

References

Sande, H. 2015. Amharic infixing reduplication: support for a stratal approach to morphophonology. Talk presented at NELS 46.

More than one rule

[Leaving this as a note to myself to circle back.]

I’m just going to say it: some “rules” are probably two or three rules, because the idea that rules are defined by natural classes (and thus free of disjunctions) is more entrenched than our intuitions about whether or not a process in some language is really one rule, and we should be Galilean about this. Here are some phonological “rules” that are probably two or three different rules.

  • “Ruki” in the Indo-Iranian and Balto-Slavic families and in Albanian (environment: preceding {w, j, k, r}): it is not clear to me whether any of these languages actually needs this as a synchronic rule at all.
  • Breton voiced stop lenition (change: /b/ to [v], /d/ to [z], /g/ to [x]): the devoicing of /g/ must be a separate rule. Hat tip: Richard Sproat. I believe there’s a parallel set of processes in German.
  • Lamba palatalization (change: /k/ to [tʃ], /s/ to [ʃ]): two rules, possibly with a Duke-of-York thing. Hat tip: Charles Reiss.
  • Mid-Atlantic (e.g., Philadelphia) English /æ/-tensing (environment: following tautosyllabic, same-stem {m, n, f, θ, s, ʃ}): let’s assume this is allophony; then the anterior nasal and voiceless fricative cases should be separate rules. It is possible that the incipient restructuring of this as having a simple [+nasal] context provides evidence for the multi-rule analysis.
  • Latin glide formation (environment: complex). Front and back glides are formed from high short monophthongs in different but partially overlapping contexts.

Feature maximization and phonotactics

[This is a quick writing exercise for in-progress work with Charles Reiss. Sorry if it doesn’t make sense out of context.]

An anonymous reviewer asks:

I wonder how the author(s) would reconcile this learning model with the evidence that both children and adults seem to aggressively generalize phonotactic restrictions from limited data (e.g. just [p]) to larger, unobserved natural classes (e.g. [p f b v]). See e.g. the discussion in Linzen & Gallagher (2017). If those results are credible, they seem much more consistent with learning minimal feature specifications for natural classes than learning maximal ones.

First, note that Linzen & Gallagher’s study is a study of phonotactic learning, whereas our proposal concerns induction of phonological rules. We have been, independently but complementarily, quite critical of the naïve assumptions inherent in prior work on this topic (e.g., Gorman 2013, ch. 2; Reiss 2017, §6); we have both argued that knowledge of phonotactic generalizations may require much less grammatical knowledge than is generally believed.

Secondly, we note that Linzen & Gallagher’s subjects are (presumably; they were recruited on Mechanical Turk and were paid $0.65 USD for their efforts) adults briefly exposed to an artificial language. While we recognize that adult “artificial language learning” studies are common practice in psycholinguistics, it is not clear what such studies contribute to our understanding of phonotactic acquisition (whatever the phonotactic acquirenda turn out to be) by children robustly exposed to realistic languages in situ.

Third, the reviewer is incorrect; the result reported by Linzen & Gallagher (henceforth L&G) is not consistent with minimal generalization. Let us grant—for the sake of argument—that our proposal about rule induction in children is relevant to their work on rapid phonotactic learning in adults. One hypothesis they entertain is that their participants will construct “minimal classes”:

For example, when acquiring the phonotactics of English, learners may first learn that both [b] and [g] are valid onsets for English syllables before they can generalize to other voiced stops (e.g., [d]). This generalization will be restricted to the minimal class that contained the attested onsets (i.e., voiced stops), at least until a voiceless stop onset is encountered.

If by a “minimal class” L&G are referring to a natural class which is consistent with the data and has an extension with the fewest members, then presumably they would endorse our proposal of feature maximization, since the class that satisfies this definition is the most fully specified empirically adequate class. However, it is an open question whether or not such a class would actually contain [d]. For instance, if one assumes that major place features are bivalent, then the intersection of the features associated with [b, g] will contain the specification [−coronal], which rules out [d].

Interestingly, the matter is similarly unclear if we interpret “minimal class” intensionally, in terms of the number of features rather than the number of phonemes the class picks out. The (featurewise-)minimal specification for a single phone (as in the reviewer’s example) is the empty set, which would (it is generally assumed) pick out any segment. We would therefore expect any generalization which held of [p], as in the reviewer’s example, to extend not just to other labial obstruents (as the reviewer suggests) but to any segment at all. Minimal feature specification cannot yield a generalization from [p] to any proper subset of segments, contra the anonymous reviewer and L&G; an adequate minimal specification which picks out [p] will pick out just [p]. L&G suggest that maximum entropy models of phonotactic knowledge may have this property, but do not provide a demonstration of this for any particular implementation of these models.
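Both points — that the featurewise intersection of [b, g] contains [−coronal] and so excludes [d], and that a featurewise-minimal (empty) specification picks out every segment — can be checked with a toy computation. The feature inventory below is a deliberately simplified assumption for illustration, not a claim about the correct feature set:

```python
# Hypothetical bivalent feature specifications (an assumption, for illustration).
FEATURES = {
    "b": {"voice": "+", "labial": "+", "coronal": "-", "dorsal": "-"},
    "d": {"voice": "+", "labial": "-", "coronal": "+", "dorsal": "-"},
    "g": {"voice": "+", "labial": "-", "coronal": "-", "dorsal": "+"},
    "p": {"voice": "-", "labial": "+", "coronal": "-", "dorsal": "-"},
}

def shared_specification(segments):
    """The featurewise intersection: feature-value pairs common to all segments."""
    specs = [set(FEATURES[s].items()) for s in segments]
    return dict(set.intersection(*specs))

def extension(spec):
    """All segments consistent with a (possibly empty) specification."""
    return {s for s, fs in FEATURES.items()
            if all(fs[f] == v for f, v in spec.items())}

spec_bg = shared_specification(["b", "g"])
print(spec_bg)             # contains ('coronal', '-'), among others
print(extension(spec_bg))  # {'b', 'g'}: [d] is ruled out by [-coronal]
print(extension({}))       # the empty specification picks out every segment
```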

We thank the anonymous reviewer for drawing our attention to this study and the opportunity their comment has given us to clarify the scope of our proposal and to draw attention to a defect in L&G’s argumentation.

References

Gorman, K. 2013. Generative phonotactics. Doctoral dissertation, University of Pennsylvania.
Linzen, T., and Gallagher, G. 2017. Rapid generalization in phonotactic learning. Laboratory Phonology: Journal of the Association for Laboratory Phonology 8(1): 1-32.
Reiss, C. 2017. Substance free phonology. In S. J. Hannahs and A. Bosch (eds.), The Routledge Handbook of Phonological Theory, pages 425-452. Routledge.

Codon math

It is well known that there are twenty “proteinogenic” amino acids—those capable of creating proteins—in eukaryotes (i.e., lifeforms with nucleated cells). When biologists first began to realize that DNA synthesizes RNA, which synthesizes amino acids, it was not yet known how many DNA bases (the vocabulary being A, T, C, and G) were required to code an amino acid. It turns out the answer is three: each codon is a base triple, each corresponding to an amino acid. However, one might have deduced that answer ahead of time using some basic algebra, as did Soviet-American polymath George Gamow. Given that one needs at least 20 amino acids (and admitting that some redundancy is not impossible), it should be clear that pairs of bases will not suffice to uniquely identify the different amino acids: 4² = 16, which is less than 20 (plus some epsilon). However, triples will more than suffice: 4³ = 64. This holds assuming that the codons are interpreted consistently, independently of their context (as Gamow correctly deduced), and whether or not the triplets are interpreted as overlapping (Gamow incorrectly guessed that they overlapped, so that a six-base sequence would contain four triplet codons; in fact it contains no more than two).
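Gamow’s back-of-the-envelope deduction can be written out as a few lines of arithmetic:

```python
AMINO_ACIDS = 20  # proteinogenic amino acids to be encoded
BASES = 4         # the DNA vocabulary: A, T, C, G

# The smallest codon length n such that BASES**n >= AMINO_ACIDS.
n = 1
while BASES ** n < AMINO_ACIDS:
    n += 1
print(n)  # 3: pairs give only 4**2 = 16 codes, but triples give 4**3 = 64

# In a six-base sequence, an overlapping reading would yield 6 - 3 + 1 = 4
# triplets (Gamow's incorrect guess); the non-overlapping reading yields
# only 6 // 3 = 2 codons.
print(6 - n + 1, 6 // n)
```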

All of this is a long way to link back to the idea of counting entities in phonology.  It seems to me we can ask just how many features might be necessary to mark all the distinctions needed. At the same time, Matamoros & Reiss (2016), for instance, following some broader work by Gallistel & King (2009), take it as desirable that a cognitive theory involve a small number of initial entities that give rise to a combinatoric explosion that, at the etic level, is “essentially infinite”. Surely similar thinking can be applied throughout linguistics.

References

Gallistel, C. R. and King, A. P. 2009. Memory and the Computational Brain: Why Cognitive Science Will Transform Neuroscience. Wiley-Blackwell.
Matamoros, C. and Reiss, C. 2016. Symbol taxonomy in biophonology. In A. M. Di Sciullo (ed.), Biolinguistic Investigations on the Language Faculty, pages 41-54. John Benjamins Publishing Company.

Phonological nihilism

One might argue that phonology is in something of a crisis period. Phonology seems to be going through the early stages of grief for what I see as the failure of teleological, substance-rich, constraint-based, parallel-evaluation approaches to make headway, but the next paradigm shift is yet to become clear to us. I personally think that logical, substance-free, serialist approaches ought to represent our next i-phonology paradigm, with “evolutionary”-historical thinking providing the e-language context, but I may be wrong, and an altogether different paradigm may be waiting in the wings. The thing that troubles me is that phonologists from the still-dominant constraint-based traditions seem to have less and less faith in the tenets of their theories, and in the worst case this expresses itself as a sort of nihilism. I discern two forms of this nihilism. The first is the phonologist who thinks we’re doing “word sudoku”, playing games of minimal description that produce generalizations without a shred of cognitive support. The second is the phonologist who thinks that everything is memorized, so that the actual domain of phonological generalization is just Psych 101 subject-pool nonce-word experiments. My pitch to both types of nihilists is the same: if you truly believe this, you ought to spend more time at the beach and less in the classroom, and save some space in the discourse for those of us who believe in something.

Thought experiment #3

[The semester is finally winding down and I am back to writing again.]

Let us suppose one encounters a language in which the only adjacent consonants are affricates like [tʃ, ts, tɬ].1 One might be tempted to argue that these affricates are in fact singleton contour phonemes2 and that the language does not permit true consonant clusters.3

Let us suppose instead that one finds a language in which word-internal nasal-stop clusters are common, but nasal-glide and nasal-liquid clusters are not found except at transparent morpheme boundaries.4 One then might be tempted to argue that in this language, nasal-stop clusters are in fact sequences of nasal followed by an oral consonant rather than singleton contour phonemes.

In my opinion, neither of these arguments “goes through”. They follow from nothing, or at least nothing that has been explicitly stated. Allow me to explain, but first, consider the following hypothetical:

The metrical system of Centaurian, the lingua franca of the hominid aliens of the Alpha Centauri system, historically formed weight-insensitive trochees, with final extrametricality for prosodic words with an odd syllable count greater than one. However, a small group of Centaurian exiles has been hurtling towards the Sol system at .05 parsecs a year (roughly 100 million MPH) for the last century or so. Because of their rapid speed of travel it is impossible for these pioneers to stay in communication with their homeworld, and naturally their language has undergone drift in the interim. In particular, Pioneer Centaurian (as we’ll call it) has slowly but surely lost all the final extrametrical syllables of Classical Centaurian, and as a result there are no longer any 3-, 5-, 7-, or 9- (etc.) syllable words in the Pioneer dialect.

As a result of a phonetically well-grounded, “plausible”, Neogrammarian sound change, Pioneer Centaurian (PC) lacks long words with an odd number of syllables, though it still has 1-syllable words. What then is the status of this generalization in the grammar of PC speakers? The null hypothesis has to be that it has no status at all. Even though the lexical entries of PC have undergone changes, the metrical grammar of PC could easily be identical to that of Classical Centaurian: weight-insensitive trochees, with a now-vacuous rule of final extrametricality. Furthermore, it is quite possible that PC speakers have simply not noticed the relevant metrical facts, either consciously or subconsciously. Would PC speakers rate, say, 5-syllable nonce words as ill-formed possible words? No one knows. When PC speakers inevitably come in contact with English, will they be reluctant to borrow 6-syllable words like anthropomorphism or detoxification into their language, or will they feel the need to append or delete a syllable to conform to their language’s lexicon? Once again, no one knows.

The same is essentially true of the aforementioned language in which the only consonant clusters are affricates, or of the aforementioned language in which nasal-consonant clusters are highly restricted. It might be the case that the grammar treats the former as single segments and the latter as clusters, but absolutely nothing presented thus far suggests this has to be true.

Let us refer to the idea that the grammar needs to encode phonotactic generalizations (somehow) as the phonotactic hypothesis. I have argued—though more for the sake of argument than out of genuine commitment—for a constrained version of this hypothesis; I note that any surface-true rule will rule out certain surface forms. Thus, if desired, one can derive—or perhaps more accurately, project—certain phonotactic generalizations by taking a free-ride on surface-true rules.5 But note: I have not argued that the phonotactic hypothesis is correct. Rather, I have simply provided a way to derive some phonotactic generalizations using entrenched grammatical machinery (i.e., phonological alternations). And this can only account for a subset of possible phonotactic generalizations.

Let us again consider the language in which the only consonant clusters are affricates. Linguists are often heard to say that one needs to posit phonotactic generalizations to “rule out” consonant clusters in this language. I disagree. Imagine that we have two grammars, G and G’. G has a set of URs which includes contour-phoneme affricates (/t͡ɬakaʔ-/ ‘people’, /t͡sopelik-/ ‘sweet’, etc., where the IPA tie bar symbolizes contour phonemes) but no consonant clusters. G also has a surface constraint against consonant clusters other than the affricates (which can be assumed to be contour phonemes, for the sake of simplicity). G’ has the same set of URs, but lacks the surface constraint. Is there any reason to prefer G over G’? With the evidence given so far, I submit that there is not. Of course, there might be some grammatical patterns which, if otherwise unconstrained, would produce consonant clusters, in which case the phonotactic constraint of G may have some work to do. And there may be additional facts (perhaps the adaptation of loanwords, or wordlikeness judgments, though these data cannot be applied to this problem without making additional strong assumptions) that also militate in favor of G. But rarely if ever are these additional facts presented when positing G. Now let us consider a third grammar, G”. This grammar is the same as G’, except that the affricates are now represented as consonant clusters (/tɬakaʔ-/ ‘people’, /tsopelik-/ ‘sweet’, etc.) rather than contour phonemes. Is there any reason to prefer either G’ or G” given the facts available to us thus far? It seems to me there is not.

This is a minor scandal for phonemic analysis. But it is not a purely philosophical issue: it is the same issue that children acquiring Nahuatl face. “Phonotacticians” have largely sidestepped these issues by making a completely implicit assumption that grammars (or perhaps, language learners) abhor a vacuum, in the sense that phonotactic constraints need to be posited to rule out that which does not occur. The problem is that there is often no reason to think these things would occur in the first place. If we assume that grammars do not abhor a vacuum—allowing us to rid ourselves of the increasingly complex machinery used to encode phonotactic generalizations not derived from alternations—we obtain exactly the same results in the vast majority of cases.

Endnotes

  1. One language with this property is Classical Nahuatl.
  2. Whatever that means! It’s not immediately clear, since there does not seem to be a fully articulated theory of what it means for a single segment in underlying representation to correspond to multiple articulatory targets on the surface. Without such a theory this feels like mere phenomenological description.
  3. Recently, Gouskova & Stanton (2021) expressed this heuristic, which has antecedents going back at least to Trubetzkoy, as a simple computational model.
  4. One language which supposedly has this property is Gurindji (McConvell 1988), though I have only seen the relevant data reprinted in secondary sources. Thanks to Andrew Lamont (p.c.) for drawing my attention to these data. Note that in this language, nasal-obstruent clusters undergo dissimilation when preceded by another nasal-obstruent cluster, which might—under certain assumptions—be a further argument that nasal-obstruent sequences are really clusters.
  5. See also Gorman 2013, particularly chapters 3-4.

References

Gorman, K. 2013. Generative phonotactics. Doctoral dissertation, University of Pennsylvania.
Gouskova, M. and Stanton, J. 2021. Learning complex segments. Language 97(1): 151-193.
McConvell, P. 1988. Nasal cluster dissimilation and constraints on phonological variables in Gurindji and related languages. Aboriginal Linguistics 1: 135-165.