I just got back from the excellent Workshop on Model Theoretic Representations in Phonology. While I am not exactly a member of the “Delaware school”, i.e., the model-theoretic phonology (MTP) crowd, I am a big fan. In my talk, I contrasted the model-theoretic approach to an approach I called the black box approach, using neural networks and program synthesis solvers as examples of the latter. I likened the two styles to neat vs. scruffy, better is better vs. worse is better, rationalists vs. empiricists, and cowboys vs. aliens.
One lesson I drew from this comparison is the need for MTPists to develop high-quality software—the next toolkit 2. I didn’t say much during my talk about what I imagine this to be like, so I thought I’d leave my thoughts here. Several people—Alëna Aksënova, Hossep Dolatian, and Dakotah Lambert, for example—have developed interesting MTP-oriented libraries. While I do not want to give short schrift to their work, I think there are two useful models for the the next next toolkit: (my own) Pynini and PyTorch. Here is what I see as the key features:
- They are ordinary Python on the front-end. Of course, both have a C++ back-end, and PyTorch has a rarely used C++ API, but that’s purely a matter of performance; both have been slowly moving Python code into the C++ layer over the course of their development.The fact of the matter is that in 2022, just about anyone who can code at all can do so in Python.
- While both are devilishly complex, their design follows the principle of least suprise; there is only a bit of what Pythonistas call exhuberant syntax (Pynini’s use of the @ operator, PyTorch’s use of _ to denote in-place methods).
- They have extensive documentation (both in-module and up to book length).
- They have extensive test suites.
- They are properly packaged and can be installed via PyPi (i.e., via pip) or Conda-Forge (via conda).
- They have corporate backing.
I understand that many in the MTP community are naturally—constitutionally, even—drawn to functional languages and literate programming. I think this should not be the initial focus. It should be ease of use, and for that it is hard to beat ordinary Python in 2022. Jupyter/Colab support is a great idea, though, and might satisfy the literate programming itch too.