This graduate course will provide a research-oriented overview of the key concepts in natural language processing (NLP) and the techniques used for statistical modeling of natural language data. We will introduce and discuss several NLP tasks, including but not limited to text classification, sentiment analysis, information extraction, language modeling, syntactic parsing, and semantic analysis.
Lectures are held on Tuesdays and Thursdays, 6:00-7:15pm, in Physics Building Room 203.
Instructors and TAs | Office Hours |
---|---|
Abulhair Saparov | After class |
Nathaniel Getachew | Wednesdays 4:30-5:30pm, DSAI B061 |
Yunxin Sun | Thursdays 4:00-5:00pm, DSAI B047 |
Note that the following schedule is subject to change throughout the course.
Date | Topic | Resources |
---|---|---|
08/26 | Lecture 1: Introduction to NLP | [slides] |
08/28 | Lecture 2: Text Classification | [slides] |
09/02 | Lecture 3: Language Modeling | |
09/04 | Lecture 4: Recurrent Neural Networks | |
09/09 | Lecture 5: LSTMs and GRUs | |
09/11 | Lecture 6: Attention and Transformers | |
09/16 | Lecture 7: Transformers II | |
09/18 | Lecture 8: Scaling | |
09/23 | Lecture 9: Prompting | |
09/25 | Lecture 10: Reinforcement Learning | |
09/30 | Lecture 11: Reinforcement Learning II | |
10/02 | Lecture 12: Efficiency | |
10/07 | Lecture 13: Efficiency II | |
10/09 | TBA | |
10/14 | Fall Break: No class | |
10/16 | Lecture 14: Efficiency III | |
10/21 | Lecture 15: Efficiency IV | |
10/23 | Lecture 16: Efficiency V | |
10/28 | Lecture 17: Efficiency VI | |
10/30 | Lecture 18: Mixture of Experts and Retrieval | |
11/04 | Lecture 19: Computational Linguistics and Morphology | |
11/06 | Lecture 20: Syntax | |
11/11 | Lecture 21: Syntax II | |
11/13 | Lecture 22: Syntax III | |
11/18 | Lecture 23: Syntax IV and Semantics | |
11/20 | Lecture 24: Semantics II | |
11/25 | TBA | |
11/27 | Thanksgiving: No class | |
12/02 | TBA | |
12/04 | Lecture 25: Semantics III | |
12/09 | Lecture 26: Multi-modal NLP | |
12/11 | Lecture 27: Multi-modal NLP II | |
Assignments are to be submitted by the due date listed. Each person is allowed a total of 5 late days, which can be applied to any combination of assignments during the semester without penalty. Once those are used up, a late penalty of 15% per day is applied. A partial day counts as a full day.
Use of extension days must be stated explicitly in the late submission (either directly in the submission header or in an accompanying email to the TA); otherwise, late penalties will apply. Extensions cannot be used after the final day of classes (i.e., May 9th 11:59pm).
Extension days cannot be rearranged after they are applied to a submission. Use them wisely!
Assignments will NOT be accepted if they are more than five days late. Additional extensions will be granted only for serious and documented medical or family emergencies.
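To make the arithmetic of the late policy above concrete, here is a minimal sketch in Python. It is illustrative only and not an official grading tool. The function names and parameters are hypothetical. It also assumes that the 15% penalty is a flat deduction of percentage points per penalized day, which the policy text does not spell out.

```python
import math

FREE_LATE_DAYS = 5      # total penalty-free late days per person (from the policy)
PENALTY_PER_DAY = 15    # percentage points lost per penalized day (from the policy)
MAX_DAYS_LATE = 5       # submissions more than five days late are not accepted


def late_days(hours_late: float) -> int:
    """Round lateness up to whole days: a partial day counts as a full day."""
    return max(0, math.ceil(hours_late / 24))


def percent_kept(hours_late: float, days_applied: int, days_remaining: int):
    """Return the percentage of credit retained, or None if not accepted.

    days_applied   -- late days the student explicitly declares for this submission
    days_remaining -- late days the student still has (at most FREE_LATE_DAYS)
    """
    if not 0 <= days_applied <= days_remaining <= FREE_LATE_DAYS:
        raise ValueError("invalid number of late days")
    days = late_days(hours_late)
    if days > MAX_DAYS_LATE:
        return None  # too late to be accepted at all
    # Assumption: declared late days beyond the actual lateness are not consumed.
    penalized = days - min(days, days_applied)
    return max(0, 100 - PENALTY_PER_DAY * penalized)


if __name__ == "__main__":
    # 30 hours late counts as 2 days; both covered by declared late days.
    print(percent_kept(30, days_applied=2, days_remaining=5))
    # Same lateness with no late days declared: two penalized days.
    print(percent_kept(30, days_applied=0, days_remaining=5))
```

Under these assumptions, a submission that is 30 hours late counts as two days late. It keeps full credit if two late days are declared, and it loses 30 percentage points if none are.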
Please read the departmental academic integrity policy. It will be followed unless we provide written documentation of exceptions. We encourage you to interact amongst yourselves: you may discuss and obtain help with basic concepts covered in lectures or the textbook, homework specifications (but not solutions), and program implementation (but not design). However, unless otherwise noted, work turned in should reflect your own efforts and knowledge. Sharing or copying solutions is unacceptable and could result in failure. We use copy detection software, so do not copy code (either from the Web or from other students) and make changes to it. You are expected to take reasonable precautions to prevent others from using your work.
Students are not only permitted but encouraged to use generative AI tools such as ChatGPT if they find the tools to be helpful. These tools can help to accelerate low-level tasks, such as writing boilerplate code. However, we urge students to be wary of the output of such models on some tasks. These tools can be very effective for tasks such as paraphrasing or correcting grammar, but they do produce errors on other tasks, such as analysis of research papers or scientific scrutiny of an experimental setup. Be very mindful when using such tools to generate code, as they will insert bugs (often making unnatural, non-human mistakes, which can sometimes be very difficult to detect).