LING83800: Methods in Computational Linguistics II

Spring 2024

Graduate Center, CUNY

Instructors: Prof. Spencer Caplan and Prof. Kyle Gorman
Practicum leader: Natalia Tyulina
Lecture: Monday 4:15-6:15 (rm. 7314)
Practicum: Friday 11:45-1:45 (rm. 7400.13)
Office hours: (Caplan) Wednesday 12:30-2:30 (rm. 7400.02), (Gorman) Wednesday 2-3 (rm. 7400.05) and by appointment

Synopsis

This course is the second of a two-semester series introducing computational linguistics and software development. The intended audience is students interested in speech and language processing technologies, though the materials will be beneficial to all language researchers.

Objectives

Using the Python programming language, students will learn the formalisms and tools used to build speech and language technologies.

Materials

There is no textbook, but readings will be assigned. Students are strongly encouraged to bring a laptop computer to the practicum.

Assignments

Assignments will take the form of either pencil-and-paper exercises or small software development projects. Software assignments must always be turned in with a write-up describing the general approach taken and any challenges encountered. Students will often be able to verify the technical correctness of their code by running provided tests. We will use GitHub Classroom for assignment turn-in.

The final assignment will be an open-ended project that either extends earlier projects or builds and evaluates a speech and language technology system. Students are encouraged to conceive of projects relevant to their research interests. Students should discuss project plans with the instructor during office hours to confirm that they are both feasible and of appropriate scope.

Grading

Assignments account for 80% of each student's grade; the remaining 20% is reserved for participation and attendance. Assignments should be submitted on time; late assignments may receive a grade of 0 (barring a documented emergency). No separate grade will be assigned for the practicum.

Accommodations

The instructor will attempt to provide all reasonable accommodations to students upon request. If you believe you are covered under the Americans With Disabilities Act, please direct accommodations requests to Matthew G. Schoengood, Vice President for Student Affairs.

Attendance

Students are expected to attend all lectures and practica in person. Other absences will not be excused, and the instructor reserves the right to tie grades to attendance records.

Integrity

In line with the Student Handbook policies on plagiarism, students are expected to complete their own work. The instructor reserves the right to refer violations to the Academic Integrity Officer.

Respect

For the sake of privacy, students are not permitted to record lectures. Students are expected to be considerate of their peers and to treat them with respect during discussions.

Schedule

(Please note that this is subject to change and will be updated as we go.)

Date | Due | Instructor | Topic | Materials | Readings
F 1/26 | | | No class | |
M 1/29 | | Kyle | Syllabus; tooling | Notes; Slides |
F 2/2 | | Natasha | Practicum | Handout 1 2 |
M 2/5 | | Kyle | Git; GitHub | Handout | Chacon & Straub ch. 1.1-3.2, 6.1-6.3
F 2/9 | | Natasha | Practicum | Handout |
M 2/12 | | | No class | |
F 2/16 | | | No class | |
M 2/19 | | | No class | |
Th 2/22 | HW1 due | Kyle | Formal languages I | Handout | Partee et al. ch. 1 (Hopcroft et al. ch. 1.5)
F 2/23 | | Natasha | Practicum | Handout |
M 2/26 | | | No class | |
W 2/28 | HW2 due [solution] | Kyle | Formal languages II | Handout | Jurafsky & Martin ch. 17-17.5 (Jäger & Rogers; Graf)
F 3/1 | | Natasha | Practicum | Notebook |
M 3/4 | | Kyle | Automata | Slides | Gorman & Sproat ch. 1-1.4; Jurafsky & Martin ch. 2-2.1; (Freeman et al. ch. 10; Hopcroft et al. ch. 3-3.1, 3.3)
F 3/8 | | Natasha | Practicum | Notebook |
M 3/11 | HW3 due [solution] | Kyle | Transducers; Rewrite rules | Slides; Notebook | Gorman & Sproat ch. 5 (Hopcroft et al. ch. 2, 3.2)
F 3/15 | | Natasha | Practicum | Notebook |
M 3/18 | HW4 due [solution] | Spencer | Probability theory | Handout | Manning & Schütze ch. 2
F 3/22 | | Natasha | Practicum | Handout; Notebook |
M 3/25 | HW5 due [solution] | Spencer | Language models I | Handout | Gorman & Sproat ch. 1.5-1.6; Roark & Sproat ch. 6.1; (Manning & Schütze ch. 6)
F 3/29 | | | No classes | |
M 4/1 | | Spencer | Language models II | Slides 1 2; Handout |
F 4/5 | | Natasha | | Notebook; Notes |
M 4/8 | | Spencer | Tagging; chunking | Slides | Bird et al. ch. 5; Jurafsky & Martin appendix A; (Manning & Schütze ch. 9)
W 4/10 | HW6 due [solution] | | | |
F 4/12 | | Natasha | | Slides; Notebook |
M 4/15 | | Spencer | Generative classifiers | Slides; Handout | Bird et al. ch. 6.1-3, 6.5-6.9; Jurafsky & Martin ch. 4
F 4/19 | | Natasha | Practicum | Notebook; Handout |
M 4/22 | | | No classes | |
F 4/26 | | | No classes | |
M 4/29 | | | No classes | |
F 5/3 | | | No classes | |
M 5/6 | | Spencer | Discriminative classifiers | Slides | Pedregosa et al.; Breiman; Ng & Jordan; (Collins)
W 5/8 | HW7 due [solution] | | | |
F 5/10 | | Natasha | Practicum | Notebook |
M 5/13 | | Spencer | Text classification; Regularization & tuning | Slides | Scikit-learn tutorials 1, 2, 3
F 5/17 | | Natasha | Practicum | Handout; Notebook |
M 5/20 | HW8 due [solution] | Kyle | Evaluation; Ethical thinking | Slides 1 2; Handout | Resnik & Lin; Hovy & Spruit; (Gorman & Bedrick); (Strubell et al.)
W 5/22 | | | End of semester | |
T 5/28 | Term paper due | | | |

References