I have been meaning to describe some of the work I have been doing on Pynini, our weighted finite-state grammar development platform. For one, while I have been the primary contributor through the history of the project (Richard Sproat wrote the excellent path iteration library), we are now also getting many contributions from Lawrence Wolf-Sonkin (rewrite of the symbol table wrapper, type hints) and lots of usability and bug reports from the Google linguists.
We are currently on Pynini release 2.1.1. Here are some new features/improvements from the last few releases:
- 2.0.9: Adds an efficient multi-argument `union`.
- 2.0.9: Pynini (and the rest of OpenGrm) are available on Conda via Conda-Forge. This means that for most users, there is no longer any need to compile Pynini by hand; instead Pynini is compiled (for a variety of platforms) in the cloud, using a continuous integration framework.
- 2.1.0: Rewrites the string compiler so that symbol tables are no longer attached to compiled FSTs, eliminating the need for expensive symbol table merging and relabeling options.
- 2.1.0: Rewrites the FST and symbol table class hierarchies to better reflect the organization of lower-level APIs.
- 2.1.1: Adds PEP 484/PEP 561-compatible type stubs.
We also have removed or renamed quite a few features:
- `stringify` is renamed `string`.
- `text` is renamed `print` (cf. the command-line tool `fstprint`).
- The `defaults` struct is removed, though it may be reintroduced as a context manager at some point.
- The `*` infix operator, previously used for composition, is removed; use `@` instead.
- `transducer`'s arguments `input_token_type` and `output_token_type` are merged as `token_type`.
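
For readers unfamiliar with `@`: it is the infix operator added to Python by PEP 465, and a class supports it by defining `__matmul__`. As a rough illustration of how composition can be hung on that operator, here is a minimal sketch built around a hypothetical `Relation` class, a finite set of string pairs standing in for a transducer. It is not Pynini's implementation and uses none of its API; the class name and the example data are assumptions made purely for illustration.

```python
class Relation:
    """Toy stand-in for a transducer: a finite set of (input, output) pairs.

    Hypothetical class for illustration only; it is not part of Pynini.
    """

    def __init__(self, pairs):
        self.pairs = frozenset(pairs)

    def __matmul__(self, other):
        # Composition: keep (x, z) whenever self maps x to some y
        # and other maps that same y to z.
        return Relation(
            (x, z)
            for (x, y1) in self.pairs
            for (y2, z) in other.pairs
            if y1 == y2
        )


# Map number words to digits, then digits to ordinals.
words_to_digits = Relation({("one", "1"), ("two", "2")})
digits_to_ordinals = Relation({("1", "1st"), ("2", "2nd"), ("3", "3rd")})

# The @ operator dispatches to Relation.__matmul__.
composed = words_to_digits @ digits_to_ordinals
print(sorted(composed.pairs))
```

The pair `("3", "3rd")` drops out of the result because nothing on the left-hand side produces `"3"`, which is the usual behavior of relational composition.
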
Finally, we have broken Python 2.7 compatibility as of 2.1.0; `pywrapfst`, the lower-level API, still has some degree of Python 2.7 compatibility, but this is probably the last release to maintain that property.