“…phonology is logically (and causally) prior to phonetics.”

Two important consequences follow from this. First, that phonology is logically (and causally) prior to phonetics as here defined. Second, phonology is also epistemologically prior to phonetics. Judgments about phonetic events are invariably made in terms of perceptual phonology. (Hammarberg 1976:356)

In this post I’d like to briefly review a view of the relationship between phonetics and phonology as related by Hammarberg (1976) and Appelbaum (1996), the former being primarily concerned with production and the latter with perception.

Phonetics, being concerned with the material and physical, has tended to align itself with the physical sciences (and physics in particular), and with the empiricist tradition in science.1,2 In contrast, much of what has been called the cognitive revolution in the cognitive sciences, and in linguistics in particular, is explicitly anti-empiricist. As Hammarberg and Appelbaum argue, the empiricist biases of phonetics make it ill-suited to explain fundamental facts about speech.

It is generally understood that spoken language is not produced as a discrete sequence but rather as a series of overlapping gestures and acoustic signatures. Anyone who has looked closely at the acoustics of speech will already recognize that it is impossible to say exactly where, in a word like cat, the [æ]-ness ends and the [t]-ness begins. In a word like soon, the fricative portion shows signs of rounding not found in words like scene. From an acoustic record alone, one cannot determine empirically how many segments are present. And, one cannot produce natural-sounding synthesized speech via simple concatenation of segments. It is not just that the [æ, t, s] and other segments are coarticulated with nearby segments, however: it is also the case that there are simply no invariant acoustic-phonetic properties that uniquely characterize [t]. A [t] spoken by a child, by a man with a mouth full of chili, by a woman missing her front teeth, and so on may have radically different acoustic properties, yet we as scientists understand them to be in some sense identical phenomena.

This is a basic principle of scientific discovery: one must assume that “the vast multitude of phenomena he encounters may be accounted for in terms of the interactions of a fairly small number of basic entities, standard elementary individuals. His task thus becomes one of identifying the basic entities and describing the interactions in virtue of which the encountered phenomena are generated. From this emerge our…notions of the identity and nonidentity of phenomena.” (Hammarberg, p. 354) The linguistic notion of segment is perhaps the most important of these basic entities. It was recognized by those early lay-linguists, the Iron Age scribes who gave us the alphabet, and it is also one of the most venerable notions in the history of modern linguistics. Yet, segments do not have a physical reality of their own; they do not exist in the physical world, but only in the human mind. They are “internally generated, the creature of some kind of perceptual-cognitive process.”

It is generally uncontroversial to speak of the output of the phonological component as the input to the phonetic component. From this it follows that phonology is cognitively and epistemically prior to phonetics. Coarticulation, for instance, results from the process which maps segments (which, remember, exist only in the mind of speakers) onto articulatory and acoustic events. But one cannot talk about coarticulation without segments, since it is the spreading of articulatory-acoustic properties between segments that defines coarticulation. One must know that /s/ exists, and that it has inherent properties not normally associated with (or compatible with) lip rounding, to even observe the anticipatory lip rounding in words like soon.

The existence of coarticulation is often understood teleologically, in the sense that it is taken to be in part mechanical, automatic, inertial. This too is a mistake, according to Hammarberg: apparent teleological explanations of human behavior should be recast, as is the tradition in Western philosophy, as the result of intentional, causal behavior. The existence of anticipatory articulation shows us that the influence that the /u/ in soon has on the realization of the preceding /s/ occurred some time before instructions to the articulators were generated, and the level at which this influence occurs should therefore be identified with the mental rather than the physiological. Hammarberg goes on to argue that coarticulatory processes are akin to ordinary allophony and should fall within the scope of phonological theory. This argument is strengthened insofar as coarticulation has a language-specific character, as is sometimes claimed.

Appelbaum, while not citing Hammarberg’s original paper, extends this critique to the theory of speech perception. It is an assumption of the so-called motor theory that there are invariant properties which identify “phonetic gestures”. Since the motor theorists do not present any evidence that such invariants so much as exist, these gestures must instead be abstracted into mental entities which have all the properties of (and which Appelbaum identifies with) what we are calling segments, or perhaps lower-level entities like phonological features. Under this approach, then, there is no content to the motor theory of speech perception beyond the obvious point that phonetic experience, somehow, turns into purely mental representations. Again, the empiricist biases of phonetics have led us astray.

The above discussion may influence the way we think about the role of phonetics in linguistics education. Phonetics is generally viewed as its own autonomous subdiscipline, and modern acoustic and articulatory analysis is certainly complex enough to justify serious graduate instruction, but this discussion would seem to suggest that phonetic tools exist primarily as a way of gathering phonological information rather than as an autonomous discipline. I am not sure I am ready to conclude that, but it certainly is provocative!

Endnotes

  1. Empiricism refers to a theory of epistemology and should not be confused with the empirical method in science (the use of sense-based observation). Many prominent thinkers reject empiricism in favor of rationalism, but support the use of empirical methods. No one is seriously arguing against the use of the senses.
  2. This will be shown to be yet another example of physics envy as the source of sloppy thinking in linguistics.

References

Appelbaum, I. 1996. The lack of invariance problem and the goal of speech perception. In Proceedings of the Fourth International Conference on Spoken Language Processing, 1541-1544.
Hammarberg, R. 1976. The metaphysics of coarticulation. Journal of Phonetics 4: 353-363.

Noam on phonotactics

(Emphasis mine.)

Take the question of sound structure. Here too the person who has acquired knowledge of a language has quite specific knowledge about the facts that transcend his or her experience, for example, about which nonexistent words are possible words and which are not. Consider the forms strid and bnid. Speakers of English have not heard either of these forms, but they know that strid is a possible word, perhaps the name of some exotic fruit they have not seen before, but bnid, though pronounceable, is not a possible word of the language. Speakers of Arabic, in contrast, know that bnid is a possible word and strid is not; speakers of Spanish know that neither strid nor bnid is a possible word of their language. The facts can be explained in terms of rules of sound structure that the language learner comes to know in the course of acquiring the language.

Acquisition of the rules of sound structure, in turn, depends on fixed principles governing possible sound systems for human languages, the elements of which they are constituted, the manner of their combination and the modifications that they may undergo in various contexts. These principles are common to English, Arabic, Spanish, and all other human languages and are used unconsciously by a person acquiring any of these languages…

Suppose one were to argue that the knowledge of possible words is derived “by analogy.” The explanation is empty until an account is given of this notion. If we attempt to develop a concept of “analogy” that will account for these facts, we will discover that we are building into this notion the rules and principles of sound structure. (Chomsky 1988:26)

 

References

Chomsky, N. 1988. Language and Problems of Knowledge: the Managua Lectures. MIT Press.

Defectivity in Tagalog

[This is part of a small but growing series of defectivity case studies. Here I am well out of my linguistic comfort zone, working with a language I know very little about, so please take my comments cum grano salis.]

The behavior of the Tagalog actor focus (AF) infix (and occasionally, prefix) -um- has received an enormous amount of attention since the days of prosodic morphology. Schachter & Otanes (1972; henceforth SO), cited in Orgun & Sprouse (1999), claim that “-um- does not occur with bases beginning with /m/ or /w/” (p. 292). Presumably this statement means such bases are defective with respect to their actor focus form; that is certainly how Orgun & Sprouse, and most of the subsequent literature, have interpreted it. I am aware of one complication, however: many verbs instead use the prefix mag- to mark actor focus, so SO could simply be making a distributional statement about two allomorphs of the actor focus marker. As I understand it, whether a verb takes -um-, mag-, or both is conditioned by verb semantics, whether the verb is derived or a bare root, whether the verb is borrowed, and so on, and there is probably some regional, register, and individual variation too.1 And there are other focus markers beyond -um- and mag-.

Orgun & Sprouse (henceforth OS) provide just a few examples (p. 206). According to them, there is no AF form *mumahal ‘to become expensive’ (< mahal). It’s not really clear what we ought to reason from *mumahal. First, do all adjectives have a corresponding AF verb form? Secondly, one might ask whether magmahal is the AF form of this adjective. According to Wiktionary, it is, so this is probably just an instance of ordinary morphological blocking. Third, this is obviously a loanword, which might have something to do with its choice of AF affix and/or whether it participates in the AF system at all. Similarly, OS give the example *mumura ‘to become cheap’ (< mura), but Wiktionary says magmura exists and has the relevant reading. If this is correct, OS may have confused defectivity and blocking.

OS provide two other types of examples of what they call defectivity.

First, OS claim that /Cw…/-initial stems borrowed from English can form AF forms of the form /C-um-w…/ but not /Cw-um…/. Thus the AF infinitive sumwer (< Eng. swear) is well-formed, but *swumer is not. It is not clear this generalization is correct, since Ross (1996) elicits the AF infinitive [twumɪtɘɾ] (< Eng. twitter; p. 15) from “a native speaker of Tagalog in her thirties who had recently come to Canada from Manila…” who was “asked to ‘borrow’ hypothetical English loanwords…” (p. 2).2 OS do not discuss /Cm-…/-initial borrowings, and they give us no reason to suspect that /Cw…/- and /Cm…/-initial loanword stems would behave differently, but Ross also elicits [smumajl] (< Eng. smile; p. 15).
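
To make the two competing infixation sites concrete, here is a minimal sketch in Python. It works on spelling rather than on phonological representations, and the function name, the `after_whole_onset` flag, and the five-letter vowel set are all my own assumptions, not anything proposed by OS or Ross. Treating -um- as a prefix on vowel-initial stems is likewise a simplifying assumption.

```python
VOWELS = set("aeiou")  # assumption: orthographic vowels only


def infix_um(stem: str, after_whole_onset: bool = False) -> str:
    """Insert -um- into an orthographic stem.

    By default -um- goes right after the first consonant (the /C-um-w.../
    pattern); with after_whole_onset=True it goes after the entire initial
    cluster (the /Cw-um.../ pattern).
    """
    # Index of the first vowel letter, or None if the stem has no vowel.
    first_vowel = next((i for i, ch in enumerate(stem) if ch in VOWELS), None)
    if first_vowel is None:
        raise ValueError(f"no vowel in stem {stem!r}")
    if first_vowel == 0:
        # Vowel-initial stem: -um- surfaces as a prefix (assumption).
        return "um" + stem
    split = first_vowel if after_whole_onset else 1
    return stem[:split] + "um" + stem[split:]


print(infix_um("swer"))                          # sumwer
print(infix_um("swer", after_whole_onset=True))  # swumer
print(infix_um("punta"))                         # pumunta
```

For stems with a single initial consonant the two settings give the same result, so the choice only matters for cluster-initial loanwords, which is exactly where OS and Ross disagree.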

Secondly, OS claim that /m, w/-initial stems borrowed from English do not form AF verbs in -um-. I have not been able to find any of their examples in a Tagalog dictionary, so these may just be poorly assimilated loanwords. 

OS note that there is no general restriction on homomorphemic /…mum…/ sequences in Tagalog, and they note that reduplication may also produce /…mum…/. Even if their description is correct, it is a mystery why this restriction holds only of a specific AF affix. But I suspect that OS have either misunderstood SO, or perhaps misgeneralized from SO’s admittedly vague comment.

Before I conclude, I should note that the empirical situation for Tagalog linguistics is dire. The language has many tens of millions of speakers, and has long been of interest to linguists. There are extensive grammatical resources on Tagalog in English and Spanish. Yet any time I interact with Tagalog examples in the literature, I find data inconsistencies, analytical laziness, or both. As a student put it to me: “As a Filipina it feels disrespectful and offensive, and as a linguist it feels super shady and raises so many philosophy of science red flags.” There may be some relevant results in Zuraw 2007, which elicits a corpus of the AF forms of Tagalog loanwords, including forms in /Sm-…/, but I am unable to reconcile those findings with Ross 1996, despite the fact that Ross and Zuraw are the same person.

Endnotes

  1. For roots that take both affixes, the two AF forms may or may not be synonymous. For example, pumunta and magpunta are roughly equivalent AF forms of ‘to go’. However, bumili means ‘to buy’ whereas magbili means ‘to sell’.
  2. Note that it was the ’90s, mannnnnn, so this is about songbirds; it has nothing to do with microblogging.

References

Orgun, C. O. and Sprouse, R. L. 1999. From MPARSE to CONTROL: deriving ungrammaticality. Phonology 16:191-224.
Ross, K. 1996. Floating phonotactics: variability in reduplication and infixation in Tagalog loanwords. Master’s thesis, University of California, Los Angeles.
Schachter, P. and Otanes, F. 1972. Tagalog Reference Grammar. University of California Press.
Zuraw, K. 2007. The role of phonetic knowledge in phonological patterning: corpus and survey evidence from Tagalog infixation. Language 83: 277-316.

Magic and productivity: Spanish metaphony

In Gorman & Yang 2019 (henceforth GY), we provide an analysis of metaphonic patterns in Spanish. This is just one of four or five case studies and it is a bit too brief to go into some interesting representational issues. In this post I’ll try to fill in some of the missing details as I understand them, with the caveat that Charles does not necessarily endorse any of my proposals here.

The tolerance principle approach to productivity is somewhat unusual in that it is not tied to any particular theory of rules or representations, so long as such theories provide a way to encode competing rules applying in order of decreasing specificity (Pāṇini’s principle or the elsewhere principle). Yet any particular tolerance analysis requires us to commit to a specific formal analysis of the phenomenon (the relevant rules and the representations over which they operate) so that we know what to count. The way in which I apply the tolerance principle also presumes that productivity (e.g., as witnessed by child overregularization errors) or its lack (as witnessed by inflectional gaps) is a first-class empirical observation and that any explanatorily-adequate tolerance analysis ought to account for it. What this means to me is that the facts of productivity can adjudicate between different formal analyses, as the following example shows.
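
This post does not restate the principle itself, so here is a minimal sketch in Python for readers who want the arithmetic. As I understand Yang's formulation, a rule whose scope contains N items is productive only if its exceptions number at most N / ln N. The function names and the counts in the example are hypothetical, chosen purely for illustration.

```python
import math


def tolerance_threshold(n: int) -> float:
    """Largest number of exceptions a rule over n items can tolerate: n / ln n."""
    if n < 2:
        # ln(1) is 0, so the threshold is undefined below two items.
        raise ValueError("threshold is undefined for n < 2")
    return n / math.log(n)


def is_productive(n: int, exceptions: int) -> bool:
    """True if a rule with this many exceptions among n items is tolerated."""
    return exceptions <= tolerance_threshold(n)


# Hypothetical counts: a rule in scope for 100 stems.
print(round(tolerance_threshold(100), 2))  # 21.71
print(is_productive(100, 21))              # True
print(is_productive(100, 22))              # False
```

The point made in the text is that N and the number of exceptions are only defined relative to a formal analysis: different rules and representations put different items in scope.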

The facts are these. A large percentage of Spanish verbs, all of which have a surface mid vowel (e or o) in the infinitive, exhibit alternations targeting the nucleus of the final syllable of the stem. In all three conjugations, one can find verbs in which this surface mid vowel diphthongizes to ie [je] or ue [we], respectively.1 Furthermore, in the third conjugation, there is a class of verbs in which the e in the final syllable of certain forms alternates with an i.2

The issue, of course, is that there are verbs which are almost identical to the diphthongizing or e-i stems but which do not undergo these alternations (GY:178f.). One can of course deny that magic is operating here, but this does not seem workable.3 We need therefore to identify the type of magic: the rules and representations involved.

There is some reason to think that conjugation class is relevant to these verb stem alternations. First, Mayol (2007) analyzes verb stem errors in a sample of six children acquiring Spanish, a corpus of roughly 2,000 verb tokens. Nearly all errors in this corpus involve underapplication of diphthongization to diphthongizing verbs in the first and second conjugation; errors in the third conjugation are extremely rare. Secondly, the e-i alternations are limited to the third conjugation. As Harris (1969:111) points out, the e form surfaces only when the stem is followed by an i in the first syllable of the desinence. This suggests that the alternation is a lowering rather than a raising one, and explains why this pattern is confined to the third (-i-) conjugation. Finally, there are about a dozen Spanish verbs, all of the third conjugation, which are defective in exactly those inflectional forms (those in which there is stress on the stem or in which the stem is followed by a desinential /i/ in the following syllable) which would reveal to us whether the stem is diphthongizing or lowering. These three facts seem to be telling us that these alternations are sensitive to conjugation class.

Jim Harris has long argued for an abstract phoneme analysis of Spanish diphthongization. In Harris 1969, diphthongization reflects abstract phonemes, present underlyingly, denoted /E, O/; no featural decomposition is provided, but one could imagine that they are underspecified for some features related to height. Harris (1985) instead supposes that the vowels which undergo diphthongization under stress bear two skeletal “x” slots, one linked and one unlinked, as follows.

o
|
X X

This distinguishes them from ordinary non-alternating mid vowels (which only have one “x”) and non-alternating diphthongs (which are prelinked to two “x”s). Harris argues this also provides an explanation for why stress conditions this alternation.

One interesting property of Harris’ account, one which I do not believe has been remarked on before, is that it seems to rule out the idea that diphthongization vs. non-diphthongization is “governed by the grammar”: it is purely a fact of lexical representation and surface forms follow directly from applying the rules to the abstract phonemic forms. To put it more fancifully, there is no “daemon” inside the phonemic storage unit of the lexicon deciding where the diphthongs or lowering vowels go; such facts are of interest for “evolutionary” theorizing, but are accidents of diachrony.

However, I believe the facts of productivity and the conditioning effects of conjugation support an alternative—and arguably more traditional—analysis, in which diphthongization and lowering are governed by abstract diacritics at the root level, in the form of rule features of the sort proposed by Kisseberth (1970) and Lakoff (1970).

I propose that verbs with a mid vowel in the final syllable of their stem which do not undergo diphthongization, like pegar ‘to stick to’ (e.g., pego ‘I stick to’), are marked [−diph], and those which do undergo diphthongization, like negar ‘to deny’ (niego ‘I deny’), are marked [+diph]; both are assumed to have an /e/ in underlying form. Similarly, I propose that verbs which undergo lowering, like pedir ‘to ask for’ (e.g., pido ‘I ask for’), are specified [+lowering] and non-lowering verbs, like vivir ‘to live’ (vivo ‘I live’), are specified [−lowering]; both have an underlying /i/. Then, the rule of lowering is

Lowering: i -> e / __ C_0 i

or, in prose, an /i/ lowers to /e/ when followed by zero or more consonants and an /i/. I assume a convention of rule application such that rule R can apply only to those /i/s which are part of a root marked [+R]; it is as if there is an implicit [+R] specification on the rule’s target. Therefore, the rule of lowering does not apply to vivir. This rule feature convention is assumed to apply to all phonological rules, including diphthongization.
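
The rule and the rule feature convention can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: forms are plain orthographic strings, "consonant" means any character outside `aeiou`, and the function and its boolean `lowering` argument (standing in for [±lowering]) are names I have made up for the example.

```python
import re

# i -> e / __ C_0 i : an i followed by zero or more consonants and another i.
# The lookahead leaves the triggering i in place.
LOWERING = re.compile(r"i(?=[^aeiou]*i)")


def apply_lowering(root: str, desinence: str, lowering: bool) -> str:
    """Apply lowering to root + desinence, respecting the rule feature.

    The rule only targets an i inside a root marked [+lowering]; an i in the
    desinence, or in a [-lowering] root, is left alone.
    """
    form = root + desinence
    if not lowering:
        return form

    def lower_if_in_root(match: re.Match) -> str:
        # Only positions inside the root are eligible targets.
        return "e" if match.start() < len(root) else "i"

    return LOWERING.sub(lower_if_in_root, form)


print(apply_lowering("pid", "ir", lowering=True))   # pedir
print(apply_lowering("pid", "o", lowering=True))    # pido
print(apply_lowering("viv", "ir", lowering=False))  # vivir
```

Without the feature check the rule would wrongly turn vivir into *vevir, which is the work the [−lowering] specification is doing.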

I furthermore propose that [diph] and [lowering] rule features are inserted during the derivation according to GY’s tolerance analysis. For first (-a-) and second (-e-) conjugation verbs, [−diph] is the default and [+diph] is lexically conditioned.

[] -> [+diph] / __ {√neg-, ...}
   -> [-diph] / __

For third (-i-) conjugation verbs, I assume that there is no default specification for either rule feature.

[] -> [+lowering] / __ {√ped-, ...}
[] -> [-lowering] / __ {√viv-, ...}
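
The difference between the two insertion schemes (a lexical list plus a default, versus two lexical lists and no default) can be sketched as follows in Python. The sets contain only the roots mentioned in this post, the function names are hypothetical, and the nonce root `blim` is invented for the example.

```python
from typing import Optional

# Lexically listed roots (only those mentioned in the text).
PLUS_DIPH = {"neg"}
PLUS_LOWERING = {"ped"}
MINUS_LOWERING = {"viv"}


def diph_feature(root: str) -> bool:
    """First and second conjugation: [+diph] if listed, else default [-diph]."""
    return root in PLUS_DIPH


def lowering_feature(root: str) -> Optional[bool]:
    """Third conjugation: both values are listed and there is no default.

    Returns None for a root on neither list, i.e. the feature stays unset.
    """
    if root in PLUS_LOWERING:
        return True
    if root in MINUS_LOWERING:
        return False
    return None


print(diph_feature("neg"))       # True
print(diph_feature("peg"))       # False (default)
print(lowering_feature("ped"))   # True
print(lowering_feature("blim"))  # None (hypothetical unlisted root)
```

On this sketch an unlisted first or second conjugation root always receives a value, whereas an unlisted third conjugation root receives none.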

I have not yet provided formal machinery to limit these generalizations to the particular conjugations, but I wish to stay agnostic about morphological theory and so I assume that any adequate model of the morphophonological interface ought to be able to encode conjugation class-specific generalizations like the above.

I leave open the question as to how roots which fail to satisfy the phonological conditions for lowering (like those which do not contain a final-syllable /i/) or diphthongization (like those which do not contain a final-syllable mid vowel) are specified with respect to the [diph] and [lowering] features. I am inclined to say that they remain underspecified for these features throughout the derivation. However, all that is essential here is that such roots are not in scope for the tolerance computation.

Let us suppose that we wish to encode, synchronically, phonological “trends” in the lexicon with respect to the distribution of diphthongizing and/or lowering verbs, such as Bybee & Pardo’s (1981) claim that e-ie diphthongization is facilitated when followed by the trill rr. Such observations could be encoded at the point in which rule features are inserted, if desired. It is unclear how a similar effect might be achieved under the abstract phoneme analysis. I remain agnostic on this question, which may ultimately bear on the past tense debate.

In future work (if blogging can be called “work”), it would be interesting to expand the proposal to other cases of morpholexical behavior studied by Kisseberth (1970), Lakoff (1970), and Zonneveld (1978), among others. Yet my proposal does not entail that we draw similar conclusions for all superficially similar case studies. For instance, I am unaware at present of evidence contradicting Rubach’s (2016) arguments that the Polish yers are abstract phonemes.

Endnotes

  1. Let us assume, as does Harris, that the appearance of the [e] in both diphthongs is the result of a default insertion rule applying after diphthongization converts the nucleus to the corresponding glide.
  2. This of course does not exhaust the set of verbal alternations, as there are highly-irregular consonantal and vocalic alternations in a handful of other verbs.
  3. Albright et al. (2001) and Bybee & Pardo (1981) are sometimes understood to have found solid evidence for a “non-magical” analysis, in which the local context in which a stem mid vowel is found is the sole determinant. This is a massive overinterpretation. Bybee & Pardo identify some local contexts which seem to favor diphthongization, and the results of a small nonce word cloze task are consistent with these findings. Albright et al. use a simple computational model to discover some contexts which seem to favor diphthongization, and find that subjects’ ratings of possible nonce words (on a seven-point Likert scale) are correlated with the model’s predictions for diphthongization. Schütze (2005) gives a withering critique of the general nonce word rating approach. Even ignoring this, neither study links nonce word tasks to adult knowledge of, or child acquisition of, actual Spanish words.

References

Albright, A., Andrade, A., and Hayes, B. 2001. Segmental environments of Spanish diphthongization. UCLA Working Papers in Linguistics 7: 117-151.
Baković, E., Heinz, J., and Rawski, J. In press. Phonological abstractness in the mental lexicon. In The Oxford Handbook of the Mental Lexicon, to appear.
Bale, A., and Reiss, C. 2018. Phonology: a Formal Introduction. MIT Press.
Bybee, J., and Pardo, E. 1981. Morphological and lexical conditioning of rules: experimental evidence from Spanish. Linguistics 19: 937-968.
Gorman, K. and Yang, C. 2019. When nobody wins. In F. Rainer, F. Gardani, H. C. Luschützky and W. U. Dressler (ed.), Competition in Inflection and Word Formation, 169-193. Springer.
Harris, J. 1969. Spanish Phonology. MIT Press.
Harris, J. 1985. Spanish diphthongisation and stress: a paradox resolved. Phonology Yearbook 2:31-45.
Kisseberth, C. W. 1970. The treatment of exceptions. Papers in Linguistics 2:44-58.
Lakoff, G. 1970. Irregularity in Syntax. Holt, Rinehart and Winston.
Mayol, L. 2007. Acquisition of irregular patterns in Spanish verbal morphology. In Proceedings of the Twelfth ESSLLI Student Session, 1-11.
Schütze, C. 2005. Thinking about what we are asking speakers to do. In S. Kepser and M. Reis (ed.), Linguistic Evidence: Empirical, Theoretical, and Computational Perspectives, pages 457-485. Mouton de Gruyter.
Zonneveld, W. 1978. A Formal Theory of Exceptions in Generative Phonology. Peter de Ridder.

On the different types of magic

In two earlier posts, I discussed the idea of magic, my term for the deductive necessity that some linguistic property distinguishes those morphemes which undergo or do not undergo surface-unpredictable alternations. For instance, the Spanish verb negar ‘to deny’ diphthongizes under stress (e.g., niego ‘I deny’), whereas the superficially similar pegar ‘to stick to s.t.’ does not (pego ‘I stick to s.t.’), and there must be something different about the two stems that causes this.

In a forthcoming handbook chapter, Baković, Heinz, and Rawski (in press; henceforth BHR) take up the familiar Kiparskian question of the loci of phonological abstractness, and in doing so, they discuss several ways in which this magic might be encoded. I would like to briefly review their taxonomy.

Under the suppletive analysis, magic verbs like negar have two underlying stems, perhaps /neg-/ and /njɛg-/, the latter used when primary stress falls on the stem. Linguists have (rightly, I think) been uncomfortable with this kind of analysis when the supposedly suppletive stem allomorphs are phonologically similar, and when the distribution of the allomorphs is easily stated in phonological or morphosyntactic terms; both are the case here. However, Aronoff (1994) argues that one must recognize the existence of suppletive patterns and his major case studies (from Hebrew and Latin) involve less-similar stem allomorphs whose distributions are not so easily stated. I am not immediately convinced by Aronoff’s arguments, but I think they should be taken seriously. BHR are similarly skeptical of the use of suppletion except in cases where the allomorphs share little material (e.g., the Korean nominative, which is realized as /-ka/ or /-i/ depending on context).2

Under the abstract diacritic analysis, magical stems bear a feature which is part of the environment for some rule. One concrete version of this is to make the diacritic literally a rule feature, such that rule R cannot apply to a stem unless that stem bears the feature [+R]. For instance, we might represent the stems of the two Spanish verbs as /neg- {+diph}/ and /peg- {−diph}/.1 We of course need then to write the diphthongization structural change (but this is not hard) and to specify its environment (but this is no more or less hard than it would be under the suppletive analysis).

Finally, under the abstract phoneme analysis, magical stems contain phonemes3 which are “abstract” (in a sense to be specified shortly) and trigger the relevant rule (here, diphthongization). BHR discern two types of abstract phonemes: absolutely abstract phonemes are feature bundles which do not appear on the surface, and restrictedly abstract phonemes consist of surface-licit feature bundles which surface in some, but not all, of the contexts in which they are posited.4

The distinction between abstract diacritics and abstract phonemes seems important. It is probably not surprising that self-described morphologists seem to prefer abstract diacritics whereas phonologists prefer abstract phonemes.

Endnotes

  1. One can further imagine that {−diph} is not present underlyingly but is filled in by a lexical redundancy rule early in the derivation, at least for 1st (-a-) and 2nd (-e-) conjugation verbs for which non-diphthongization seems to be the default (see Gorman & Yang 2019 and citations therein). Similar redundancy rules will be needed for all rules of “pure phonology”, those which do not show morpholexical conditioning.
  2. This type of analysis is closely related to Gouskova and Pater’s concept of “exceptional morphs” in Optimality Theory.
  3. I note it is not exactly the phoneme itself which is abstract, but rather the overall phonemic form of the morph. For instance, according to Lieber (1987:100f.), German umlaut (a fronting of a [+back] stem vowel) is triggered by a “floating” [−back] feature which is underlyingly present in just those stems which undergo umlaut.
  4. It is not clear to me whether BHR treat this distinction as an object of grammar, or whether it’s just a descriptive notion.

References

Aronoff, M. 1994. Morphology by Itself. MIT Press.
Baković, E., Heinz, J., and Rawski, J. In press. Phonological abstractness in the mental lexicon. In The Oxford Handbook of the Mental Lexicon, to appear.
Gorman, K. and Yang, C. 2019. When nobody wins. In Franz Rainer, Francesco Gardani, Hans Christian Luschützky and Wolfgang U. Dressler (ed.), Competition in inflection and word formation, pages 169-193. Springer.
Harris, J. 1969. Spanish Phonology. MIT Press.
Harris, J. 1985. Spanish diphthongisation and stress: a paradox resolved. Phonology Yearbook 2:31-45.
Lieber, R. 1987. An Integrated Theory of Autosegmental Processes. State University of New York Press.

O in truncated compounds

English uses stump compounds formed by taking (roughly) the first syllable of two (or three) words and adjoining them. This is presumably the process behind real-estate neologisms like Soho (< South of Houston) and Noho (< North of Houston), truncated brand names like HoJo (< Howard Johnson, a hotel chain), and nicknames for celebrities like BoJo (< Beau Johnson) and FloJo (< Florence Griffith Joyner). One strange property of these compounds, documented in an unpublished paper I presented with Laurel MacKenzie at an LSA meeting many years ago, is that when such compounds contain an orthographic <o>, it is almost always pronounced with the GOAT vowel (e.g., American English [oʊ]) even when that is unfaithful to the underlying pronunciation. Thus Soho is [ˈsoʊ.hoʊ], not the more faithful [ˈsaʊ.haʊ]. And, the second syllable of Samohi, a stump compound for Santa Monica High School, is presumably read […moʊ…] even though it stands in for [ˈmɑ.nɪ.kə].
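
As a rough illustration of the orthographic side of this process, here is a small Python sketch that keeps each word's letters up to and including its first vowel letter and joins the pieces. This is my own crude approximation of "the first syllable", not a claim about how speakers actually form these compounds, and the helper names are hypothetical. It also does not reproduce the internal capitals of spellings like HoJo.

```python
VOWELS = set("aeiou")  # assumption: orthographic vowels only


def first_chunk(word: str) -> str:
    """Letters of the word up to and including its first vowel letter."""
    lowered = word.lower()
    for i, ch in enumerate(lowered):
        if ch in VOWELS:
            return lowered[: i + 1]
    return lowered  # no vowel letter: keep the whole word


def stump(words: list[str]) -> str:
    """Join the first chunks of the content words and capitalize the result."""
    return "".join(first_chunk(w) for w in words).capitalize()


print(stump(["South", "Houston"]))         # Soho
print(stump(["North", "Houston"]))         # Noho
print(stump(["Santa", "Monica", "High"]))  # Samohi
```

Note that the <o> in Soho comes from truncating the digraph <ou>, which is exactly the kind of case where the spelling and the source pronunciation come apart.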

(h/t: Laurel, of course.)

banner reading "Samohi"

Defectivity in Norwegian

[This is part of a small but growing series of defectivity case studies.]

Icelandic is not the only Scandinavian language to exhibit defectivity in imperatives: Rice (2003, 2005; henceforth R) describes a superficially similar pattern of defectivity in Norwegian imperatives.

In Norwegian, the infinitival form of most verbs consists of the particle å, the verb stem, and a schwa (which, like in German, is spelled -e). Such verbs’ imperatives then consist of the bare stem, without a particle or the schwa; e.g., å skrive ‘to write’/skriv ‘write!’. A second, smaller class of verbs are monosyllables ending in a (non-schwa) vowel. These verbs use the bare verb stem in both infinitive and imperative; e.g., å tre ‘to step’/tre ‘step!’. While R does not go into any details about how these two patterns might be encoded, one might posit two allomorphs of the infinitive suffix, -e and zero. Presumably this allomorphy is in part lexically conditioned, since it seems necessary to distinguish between minimal pairs like å vie ‘to dedicate’/vi ‘dedicate!’, which belongs to the former class, and å si ‘to say’/si ‘say!’, which belongs to the latter. However, R only gives a few examples of vowel-final monosyllables with infinitives in -e (all other verbs of this shape have zero infinitives), so it’s possible these are just exceptions and the allomorphy conditioning is mostly phonological.

A third class of verbs are those whose stem ends in a rising-sonority consonant cluster; e.g., åpne ‘to open’, sykle ‘to bike’.1 These superficially resemble the first class of verbs (e.g., å skrive) in that they end in a schwa in the infinitive. However, Norwegian does not permit rising sonority codas, so the expected *åpn, *sykl, and so on are ill-formed.
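The phonotactic restriction just described can be sketched in a few lines of Python. This is only an illustration: the coarse sonority scale, the use of orthographic stems as a stand-in for phonological forms, and the function names are all my assumptions, not part of R's analysis.

```python
# Coarse sonority scale over orthographic consonants. The classes and ranks
# are assumptions made for illustration, not something R proposes.
SONORITY = {}
for letters, rank in [("pbtdkg", 1), ("fvs", 2), ("mn", 3), ("lr", 4), ("j", 5)]:
    for letter in letters:
        SONORITY[letter] = rank

VOWELS = set("aeiouyæøå")


def final_cluster(stem: str) -> str:
    """Returns the stem-final run of consonant letters (possibly empty)."""
    i = len(stem)
    while i > 0 and stem[i - 1] not in VOWELS:
        i -= 1
    return stem[i:]


def has_rising_sonority_coda(stem: str) -> bool:
    """True if sonority rises between any two adjacent final consonants."""
    ranks = [SONORITY[c] for c in final_cluster(stem)]
    return any(a < b for a, b in zip(ranks, ranks[1:]))


for stem in ["skriv", "tre", "åpn", "sykl"]:
    print(stem, has_rising_sonority_coda(stem))
```

On this sketch, the bare stems of the first two classes pass, whereas åpn and sykl are flagged as ending in a rising-sonority cluster. Stems containing letters outside the toy scale raise a KeyError rather than being silently accepted.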

According to R, some speakers simply use circumlocutions to avoid the imperative of such verbs, making this a standard case of defectivity. However, R mentions several other strategies used by Norwegian speakers:2

  • The word-final sonorant can be made syllabic (e.g., [oːpn̩]).
  • If the cluster consists of a voiceless consonant followed by a sonorant, the sonorant can be devoiced, reducing the sonority rise (e.g., [oːpn̥]).
  • One can insert a schwa to break up the cluster (e.g., [oːp.pɘn]).
  • One can insert a schwa after the cluster (e.g., [oːp.nɘ]).

One question that arises is whether there are any other places in the Norwegian grammar where we would expect word-final rising sonority consonant clusters to surface. As others have noted (e.g., Albright 2009), most if not all instances of inflectional defectivity are limited to specific morphological categories. For speakers who cannot generate an imperative of verbs like åpne or sykle, is this defectivity limited to the category of imperatives, or is it found anywhere else in the language?

Endnotes

  1. R gives the infinitives of this third class of verbs without the å particle. It is unclear to me whether this is intentional or just an oversight.
  2. These forms are ones I have posited on the basis of R’s description, which is not as detailed as one might like.

References

Albright, A. 2009. Lexical and morphological conditioning of paradigm gaps. In C. Rice and S. Blaho (eds.), When Nothing Wins: Modeling Ungrammaticality in OT, pages 117-164. Equinox.
Rice, C. 2003. Dialectal variation in Norwegian imperatives. Nordlyd 31: 372-384.
Rice, C. 2005. Optimal gaps in optimal paradigms. Catalan Journal of Linguistics 4: 155-170.

Defectivity in Icelandic

Hansson (1999; henceforth H) discusses an interesting case of defectivity in Icelandic imperative formation. According to H, this language has three types of (2sg.) imperative.

  • The root imperative is available only as a “deliberate archaism”; it won’t be considered further.
  • The full imperative consists of the root plus a coronal suffix plus a 2sg. pronominal enclitic -u /ʏ/.
  • The clipped imperative also consists of the root plus a coronal suffix but uses a contrastively stressed pronoun ‘you’ (cf. English ‘YOU work!’) instead of a clitic.

For example, the full imperative for taka ‘to take’ is taktu [ˈtʰaxtʏ] and the clipped imperative is takt ÞÚ [tʰaxt ˈθuː].1 H develops an account of the allomorphy of the dental suffix in the full and clipped imperatives; going forward I will cite the full forms, since the distinction is irrelevant. Under H’s analysis, there are two allomorphs:

  • /-T-/ is a [−spread glottis] coronal obstruent surfacing as [t] or [ð] depending on context; e.g., the full imperative for negla ‘to nail’ is negldu [ˈnɛɣ͡ltʏ].2
  • /-Tʰ-/ is a [+spread glottis] coronal obstruent, surfacing as [t] with devoicing of preceding stem-final consonants; e.g., the full imperative for synda ‘to swim’ is syntu [ˈsɪn̥tʏ].

H claims that “[f]or the vast majority of verbs, the choice of allomorph is uniquely determined on the basis of the root-final consonant(s)” (p. 108), implying that this is a phonologically conditioned allomorphy, though the conditioning is not given in prose form. H also implies (fn. 4) that this is suppletive allomorphy, though this assumption is also not justified. Let us assume, for the sake of argument, that both assumptions are correct and this is a case of phonologically conditioned suppletive allomorphy. Finally, H notes that under his assumptions, there are certain roots for which either allomorph would give the same imperative surface form.

There are several exceptional verbs for which the phonological conditioning H proposes yields an incorrect result. For instance, the full imperative of senda ‘to send’ is the /-T-/ form sendu [ˈsɛntʏ] rather than the expected /-Tʰ-/ form *[ˈsɛn̥tʏ].3 H draws attention to weak verbs whose roots end in /ll, nn/. For these, H’s account of the phonological conditioning ought to prefer /-T-/, but most select /-Tʰ-/.4

There are four strong verbs whose roots end in /ll, nn/. So far, other than the characteristic ablaut, we have seen no reason to treat imperative formation in the strong verbs differently than in weak verbs.5 For example, for stela ‘to steal’, the full imperative is the /-T-/ form steldu [ˈstɛltʏ]. Yet, there are three strong verbs in /ll, nn/ for which neither possible form of the imperative is well-formed. These are the verbs vinna ‘to work’ (*vinndu, *vinntu), spinna ‘to spin (s.t.)’ (*spinndu, *spinntu), and falla ‘to fall; flunk’ (*falldu, *falltu). And to make matters more complex, there is one strong verb in /nn/ for which the “expected” /-T-/ is acceptable: the full imperative of finna ‘to find’ is finndu [ˈfɪntʏ].

H identifies the following explananda for imperative formation in Icelandic.

  • The imperative stem is always the same as the past stem in weak verbs.
  • Yet, defectivity is found only in imperatives and never in pasts.
  • Defectivity occurs only in strong verbs.
  • Defectivity is found only in roots in /ll, nn/, a form which “usually is indicative of exceptionality in allomorph selection” (p. 344).

It is not obvious to me that the first explanandum is meaningful. While many linguists believe in “Priscian”-like mechanisms which permit direct encoding of these kinds of facts, the mere stem identity of two semantically distant parts of speech is not itself compelling evidence. In this particular case, one might implement these facts without referring to identity by deriving the allomorphy from a verbal theme, perhaps a floating [α spread glottis] feature, which surfaces in both the imperative and the past. Thus roots selecting /-Tʰ-/ might be underlyingly something like /√-ʰ/ where the surd denotes the root and /ʰ/ a thematic [+spread glottis] specification.

The second explanandum does seem to be meaningful, even independently of the first. One possible fact that might be relevant here is that (other than the enclitic) the Icelandic imperative is bare, whereas weak verb stems are, to my knowledge, always followed by a vowel-initial suffix. So one could imagine that this is, in part, a phonotactic effect at some level of prosodic structure that does not include the clitic.

The third explanandum also seems meaningful. One can, for instance, frame it as a simple statistical hypothesis test, the null hypothesis being that imperative defectivity is independent of the strong/weak distinction. While I don’t have psychologically plausible counts of the strong and weak verbs—the numbers I need to compute sufficient statistics for this test—in front of me, I suspect the probability of observing this pattern under the null hypothesis is going to be vanishingly small.
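To give a rough sense of the magnitudes involved, here is a minimal sketch of such a test. It rests on loudly stated assumptions: I restrict attention to verbs with roots in /ll, nn/, and I use the raw type counts mentioned in this post (4 strong verbs, 33 weak verbs, 3 defective verbs, all of them strong) rather than psychologically plausible counts. The function name is hypothetical.

```python
from math import comb


def prob_all_defective_strong(n_strong: int, n_weak: int, n_defective: int) -> float:
    """Hypergeometric probability that every defective verb is strong.

    Under the null hypothesis, the defective verbs are a random sample
    (without replacement) from the pooled strong and weak verbs.
    """
    return comb(n_strong, n_defective) / comb(n_strong + n_weak, n_defective)


# Counts taken from this post, restricted to roots in /ll, nn/ (an assumption).
p = prob_all_defective_strong(n_strong=4, n_weak=33, n_defective=3)
print(f"{p:.6f}")
```

Since the observed outcome (all defective verbs are strong) is also the most extreme possible one, this probability doubles as a one-sided exact p-value for this restricted sample. With these counts it comes out to roughly five in ten thousand, though the more interesting test over the whole verbal lexicon would require counts I do not have.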

The fourth and final explanandum is certainly one worth incorporating into any analysis. However, I think the obvious step has not yet been taken: serious attempts ought to be made to incorporate it into a phonological account of the coronal suffix allomorphy, something H unfortunately has not attempted. If we are in fact to regard verbs in /ll, nn/ as lexically exceptional, one should first reasonably exhaust possible phonological accounts. One direction for future research would be to better understand the allomorphy associated with the imperative and past stems in Icelandic in general.

H proposes, essentially, that defectivity results in strong verbs in /ll, nn/ because such verbs lack a coronal-suffixed past tense form elsewhere in the paradigm; he adds that the strong imperative finndu is exempted because there are other /…nt/ forms in the paradigm of that verb. So many, many, many different things have to go wrong for a defective imperative in Icelandic: essentially, one has to be imperative, in /ll, nn/, and lack other coronal-final stems, and these come together in just three verbs in the entire language. Whether or not one finds H’s account compelling, it is very difficult to reason much about the theory of defectivity from the existence of no more than three verbs in a language. We might do better to focus on languages, like Greek or Russian, in which inflectional defectivity has much higher type frequency.

Endnotes

  1. Whether or not the full and the clipped imperative are pragmatically substitutable is unclear to me from H’s description.
  2. Unfortunately, H does not always give the orthographic form of the words he is citing, and given the language’s famously difficult spelling, I am not always certain I have guessed the correct spelling for inflected forms. However, it appears to me that the contrast between /-T-/ and /-Tʰ-/ is spelled as -d- vs. -t-.
  3. Once again, it is not clear why this is the expected form because the only description of the phonological conditioning is given in a sketchy Optimality Theory analysis (H:§2.1-2).
  4. The relevant statistic is that 6 out of 33 weak verbs in /ll, nn/ select the “expected” /-T-/. From this H concludes that in this environment, “the exceptions far outnumber the regulars” (p. 113). I note briefly that under the tolerance principle (Yang 2005), an environment of 33 examples can tolerate up to 9 exceptions, so this could be a productive generalization according to that theory.
  5. In H’s examples, strong imperatives use the same ablaut grade as the infinitive, so we just have to take his word that they are in fact strong.
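The arithmetic behind endnote 4 can be checked with a short sketch. It assumes the usual statement of the tolerance principle, on which a generalization over N items tolerates at most N / ln N exceptions; the function names are mine.

```python
from math import floor, log


def tolerance_threshold(n: int) -> float:
    """Maximum number of exceptions tolerated among n items: n / ln(n)."""
    return n / log(n)


def is_productive(n: int, exceptions: int) -> bool:
    """True if the number of exceptions does not exceed the threshold."""
    return exceptions <= tolerance_threshold(n)


n = 33  # weak verbs in /ll, nn/, per H
print(floor(tolerance_threshold(n)))
# If /-Tʰ-/ is taken as the rule in this environment, the 6 verbs
# selecting /-T-/ are the exceptions.
print(is_productive(n, exceptions=6))
# If /-T-/ is taken as the rule instead, the other 27 verbs are exceptions.
print(is_productive(n, exceptions=27))
```

On this reading, it is a generalization favoring /-Tʰ-/ in this environment that falls under the threshold, whereas one favoring /-T-/ does not.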

References

Hansson, G. Ó. 1999. ‘When in doubt…’: intraparadigmatic dependencies and gaps in Icelandic. In Proceedings of NELS 29, pages 105-119. GLSA Publications.
Yang, C. 2005. On productivity. Language Variation Yearbook 5: 333-370.

When rule directionality does and does not matter

At the Graduate Center we recently hosted an excellent lecture by Jane Chandlee of Haverford College. Those familiar with her work may know that she’s been studying, for some time now, two classes of string-to-string functions called the input strictly local (ISL) and output strictly local (OSL) functions. These are generalizations of the familiar notion of the strictly local (SL) languages proposed by McNaughton and Papert (1971) many years ago. For definitions of ISL and OSL functions, see Chandlee et al. 2014 and Chandlee 2014. Chandlee and colleagues have been arguing, for some time now, that virtually all phonological processes are ISL, OSL, or both (note that their intersection is non-null).

In her talk, Chandlee attempted to formalize the notions of iterativity and non-iterativity in phonology with reference to ISL and OSL functions. One interesting side effect of this work is that one can, quite easily, determine what makes a phonological process direction-invariant or direction-specific. In FSTP (Gorman & Sproat 2021:§5.1.1) we describe three notions of rule directionality (ones which are quite a bit less general than Chandlee’s notions) from the literature, but conclude: “Note, however, that directionality of application has no discernable effect for perhaps the majority of rules, and can often be ignored.” (op. cit., 53) We didn’t bother to determine when this is the case, but Chandlee shows that the set of rules which are invariant to direction of application (in our sense) are exactly those which are ISL ∩ OSL; that is, they describe processes which are both ISL and OSL, in the sense that they are string-to-string functions (or maps, to use her term) which can be encoded either as ISL or OSL.
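A toy example may help make direction-specificity concrete. The sketch below is my own illustration, not Chandlee's formalization or the FSTP implementation: it applies a rule of the form target → replacement / left_context _ over single symbols, either left-to-right, right-to-left, or simultaneously, and the function name and example strings are hypothetical.

```python
def apply_rule(s: str, target: str, replacement: str, left_context: str,
               direction: str) -> str:
    """Applies target -> replacement / left_context _ across the string s.

    direction is one of "ltr", "rtl", or "simultaneous".
    """
    original = list(s)
    chars = list(s)
    n = len(chars)
    indices = range(n - 1, -1, -1) if direction == "rtl" else range(n)
    # Simultaneous application reads the context from the input string;
    # directional application reads it from the partially rewritten string.
    context = original if direction == "simultaneous" else chars
    for i in indices:
        if i > 0 and original[i] == target and context[i - 1] == left_context:
            chars[i] = replacement
    return "".join(chars)


for direction in ["ltr", "rtl", "simultaneous"]:
    # a -> b / b _ : the output of the rule can feed its own context.
    spreading = apply_rule("baa", "a", "b", "b", direction)
    # a -> b / c _ : the output of the rule never creates a new context.
    local = apply_rule("caa", "a", "b", "c", direction)
    print(direction, spreading, local)
```

The first rule is direction-specific: applied left-to-right it rewrites every a in baa, whereas the other two modes rewrite only the first. The second rule gives the same result under all three modes, which is the situation the quotation from FSTP has in mind.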

As Richard Sproat (p.c.) points out to me, there are weaker notions of direction-invariance we may care about in the context of grammar engineering. For instance, it might be the case that some rule is, strictly speaking, direction-specific, but the language of input strings is not expected to contain any relevant examples. I suspect this is also quite common.

References

Chandlee, J. 2014. Strictly local phonological processes. Doctoral dissertation, University of Delaware.
Chandlee, J., Eyraud, R., and Heinz, J. 2014. Learning strictly local subsequential functions. Transactions of the Association for Computational Linguistics 2: 491-503.
Gorman, K., and Sproat, R. 2021. Finite-State Text Processing. Morgan & Claypool.
McNaughton, R., and Papert, S. A. 1971. Counter-Free Automata. MIT Press.

Dutch names in LaTeX

One thing I recently figured out is a sensible way to handle Dutch names (i.e., those that begin with den, van, or similar particles). Traditionally, these particles are part of the cited name in author-date citations (e.g., den Dikken 2003, van Oostendorp 2009) but are ignored when alphabetizing (thus, van Oostendorp is alphabetized between Orgun & Sprouse and Otheguy, not between Vago and Vaux). This is not something handled automatically by tools like LaTeX and BibTeX, but it is relatively easy to annotate name particles like this so that they do the right thing.

First, place, at the top of your BibTeX file, the following:

@preamble{{\providecommand{\noopsort}[1]{}}}

Then, in the individual BibTeX entries, wrap the author field with this command like so:

 author = {{\noopsort{Dikken}{den Dikken}}, Marcel},

This preserves the correct in-text author-date citations, but also gives the intended alphabetization in the bibliography.

Note of course that not all people with van (etc.) names in the Anglosphere treat the van as if it were a particle to be ignored; a few deliberately alphabetize their last name as if it begins with v.