Defectivity in English: more observations

[This is part of a series of defectivity case studies.]

In an earlier post I listed some defective verbs in my idiolect. After talking with our PhD student Aidan Malanoski, I have a couple additional generalizations to note.

  1. Aidan is fine with infinitival BEWARE (e.g., Caesar was told to beware the Ides of March). I am not sure about this myself. 
  2. Aidan points out that SCRAM, SHOO, and GO AWAY (we might call them, along with BEWARE, “imperative-dominant verbs”) have a similarly restricted distribution. Roughly, our judgments are:
  • imperatives ok: Scram! Shoo! Go away!
  • infinitives ok: Roaches started to scram when I turned the lights on. She shouted for the pigeons to shoo. The waiters couldn’t wait for them to go away.
  • gerunds marginal: Scramming would be a good idea right about now. Just going away might be the best thing.
  • other -ing participles degraded: (past continuous) Roaches started scramming when I turned the lights on. (small clause) I saw the police scramming.
  • simple pasts degraded: Roaches scrammed when I turned the lights on. (compositional reading only) He went away.

They point out that the same -ing surface forms may differ in acceptability. I also note that for me shooed [s.o.] away is fine as a transitive.

Why binarity is probably right

Consider the following passage, about phonological features:

I have not seen any convincing justification for the doctrine that all features must be underlyingly binary rather than ternary, quaternary, etc. The proponents of the doctrine often realize it needs defending, but the calibre of the defense is not unfairly represented by the subordinate clause devoted to the subject in SPE (297): ‘for the natural way of indicating whether or not an item belongs to a particular category is by means of binary features.’ The restriction to two underlying specifications creates problems and solves none. (Sommerstein 1977: 109)

Similarly, I recently spoke with someone who insisted that certain English multi-object constructions in syntax are better handled by assuming the possibility of ternary branching.

I disagree with Sommerstein, though: a logical defense of the assumption of binarity—both for the specification of phonological feature polarity and for the arity of syntactic trees—is so obvious that it fits on a single page. Roughly: (1) less than two is not enough, and (2) two is enough.

Less than two is not enough. This much should be obvious: theories in which features only have one value, or in which syntactic constituents cannot dominate more than one element, have no expressive power whatsoever.1,2

Two is enough. Every time we might desire to use a ternary feature polarity, or a ternary-branching non-terminal, there exists a weakly equivalent specification which uses binary polarity or binary branching, respectively, and more features or non-terminals. It is then up to the analyst to determine whether they are happy with the natural classes and/or constituents so obtained, but this possibility is always available. One opposed to this strategy has a duty to say why the hypothesized features or non-terminals are wrong.
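The branching half of this claim can be illustrated with a standard binarization, of the kind used in conversion to Chomsky normal form. The following is a minimal sketch; the tuple encoding of trees and the primed intermediate labels are my own illustrative assumptions, not a claim about anyone's formalism:

```python
def binarize(tree):
    """Convert an n-ary constituency tree to a weakly equivalent
    binary one. Trees are encoded as (label, [children]) tuples,
    with bare strings as terminals."""
    if isinstance(tree, str):  # terminal: nothing to do
        return tree
    label, children = tree
    children = [binarize(child) for child in children]
    if len(children) <= 2:
        return (label, children)
    # Ternary or wider: peel off the first child and pack the rest
    # under a fresh primed non-terminal, recursing as needed.
    return (label, [children[0], binarize((label + "'", children[1:]))])
```

Applied to a ternary VP like (VP, [gave, NP, NP]), this yields (VP, [gave, (VP′, [NP, NP])]): the terminal yield is unchanged, and whether the extra VP′ constituent is welcome is exactly the analyst's call.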

Endnotes

  1. It is important to note in this regard that privative approaches to feature theory (as developed by Trubetzkoy and his disciples) are themselves special cases of the binary hypothesis which happen to treat absence as a non-referable. For instance, if we treat the set of nasals as a natural class (specified [Nasal]) but deny the existence of the (admittedly rather diverse) natural class [−Nasal]—and if we further insist that rules be defined in terms of natural classes, and deny the possibility of disjunctive specification—we are still working in a binary setting; we have merely added the stipulation that negated features cannot be referred to by rules.
  2. I put aside the issue of cumulativity of stress—a common critique in the early days—since nobody believes this is done by feature in 2023.
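The stipulation in note 1 can be made concrete with a few lines of code. In the sketch below, the toy inventory and its feature assignments are invented purely for illustration; the privative flag does nothing except ban reference to minus values:

```python
# Toy inventory; the segments and feature values are invented
# purely for illustration.
INVENTORY = {
    "m": {"Nasal": True,  "Labial": True},
    "n": {"Nasal": True,  "Labial": False},
    "p": {"Nasal": False, "Labial": True},
    "t": {"Nasal": False, "Labial": False},
}

def natural_class(spec, privative=False):
    """Return the set of segments matching a feature specification,
    given as a mapping from feature names to True (+) or False (-).
    In a privative setting, minus values are non-referable."""
    if privative and not all(spec.values()):
        raise ValueError("privative system: minus values are not referable")
    return {seg for seg, feats in INVENTORY.items()
            if all(feats.get(f) == v for f, v in spec.items())}
```

Here natural_class({"Nasal": True}) picks out the nasals, natural_class({"Nasal": False}) picks out the rather diverse complement class, and setting privative=True blocks only the latter: the same binary machinery, plus one stipulation.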

References

Sommerstein, A. 1977. Modern Phonology. Edward Arnold.

The different functions of probability in probabilistic grammar

I have long been critical of naïve interpretations of probabilistic grammar. To me, the major motivation for this approach seems to derive from a naïve—I’d say overly naïve—linking hypothesis mapping acceptability judgments onto grammaticality, as seen in Likert scale-style acceptability tasks. (See chapter 2 of my dissertation for a concrete argument against this.) On this interpretation, the probabilities are measures of well-formedness.

It occurs to me that there are a number of ontologically distinct interpretations of grammatical probabilities of the sort produced by “maxent”, i.e., logistic regression models.
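For concreteness, here is a minimal sketch of the shared mathematical core: candidates are scored by weighted constraint violations, and probabilities come from exponentiating and normalizing those scores, exactly as in (conditional) logistic regression. The constraint names, violation counts, and weights below are invented for illustration:

```python
import math

def maxent_probs(candidates, weights):
    """Maxent probability distribution over candidates.
    candidates maps each candidate to its constraint-violation
    counts; weights maps each constraint to a nonnegative weight."""
    # Harmony is the negated weighted sum of violations.
    harmony = {
        cand: -sum(weights.get(con, 0.0) * n for con, n in viols.items())
        for cand, viols in candidates.items()
    }
    z = sum(math.exp(h) for h in harmony.values())  # partition function
    return {cand: math.exp(h) / z for cand, h in harmony.items()}

# An invented two-candidate tableau: the lower-penalty candidate
# receives the larger share of probability mass.
probs = maxent_probs(
    {"faithful":   {"*Complex": 1},
     "epenthetic": {"Dep": 1}},
    {"*Complex": 2.0, "Dep": 1.0},
)
```

The arithmetic itself is neutral: it assigns numbers to candidates without committing to what those numbers mean, which is precisely why the interpretive question arises.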

For instance, at M100 this weekend, I heard Bruce Hayes talk about another use of maximum entropy models: scansion. In poetic meters, there is variation in, say, whether the caesura is masculine (after a stressed syllable) or feminine (after an unstressed syllable), and the probabilities reflect that.1 However, I don’t think it makes sense to equate this with grammaticality, since we are talking about variation in highly self-conscious linguistic artifacts here and there is no reason to think one style of caesura is more grammatical than the other.2

And of course there is a third interpretation, in which the probabilities are production probabilities, representing actual variation in production, within a speaker or across multiple speakers.

It is not obvious to me that these facts all ought to be modeled the same way, yet the maxent community seems comfortable assuming a single cognitive model to cover all three scenarios. To state the obvious, it makes no sense for a cognitive model to account for interspeaker variation, because there is no such thing as “interspeaker cognition”; there are just individual mental grammars.

Endnotes

  1. This is a fabricated example because Hayes and colleagues mostly study English meter—something I know nothing about—whereas I’m interested in Latin poetry. I imagine English poetry has caesurae too but I’ve given it no thought yet.
  2. I am not trying to say that we can’t study grammar with poetry. Separately, I note (as, I believe, Paul Kiparsky did at the talk) that this model assumes the input text the poet is trying to fit to the meter plays no role in constraining what happens.

Use the minus sign for feature specifications

LaTeX has a dizzying number of options for different types of horizontal dash. The following are available:

  • A single - is a short dash appropriate for hyphenated compounds (like encoder-decoder).
  • A single dash in math mode, $-$, is a longer minus sign.
  • A double -- is a longer “en-dash” appropriate for numerical ranges (like 3–5).
  • A triple --- is a long “em-dash” appropriate for interjections (like this—no, I mean like that).

My plea to linguists is to actually use math mode and the minus sign when writing binary features. If you want to turn this into a simple macro, you can place the following in your preamble:

\newcommand{\feature}[2]{\ensuremath{#1}\textsc{#2}}

and then write \feature{-}{Back} for nicely formatted feature specifications.

Note that this issue has an exact parallel in Word and other WYSIWYG setups: there the issue is as simple as selecting the Unicode minus sign (U+2212) from the inventory of special characters (or just googling “Unicode minus sign” and copying and pasting what you find).