This course gives a comprehensive overview of the key concepts in natural language processing (NLP) and the techniques used for statistical modeling of natural language data. We will introduce and discuss several NLP tasks, such as sentiment analysis, information extraction, language modeling, syntactic parsing, and semantic analysis. The course is divided into three modules: (1) key statistical machine learning methods in NLP, (2) computational linguistic tasks and modeling approaches, and (3) generative AI, including large language models (LLMs).
Lectures are held on Tuesdays and Thursdays at 1:30-2:45pm in Stanley Coulter Hall Room 239.
We will use Ed for class discussion and announcements, including announcements regarding assignments. If you are not in the Ed course, ask an instructor to be added.
Note that the following schedule is subject to change throughout the semester.
| Date | Topic | Resources |
|---|---|---|
| 01/13 | Lecture 1: Introduction to NLP | [slides] |
| 01/15 | Lecture 2: Text Classification | [slides] |
| 01/20 | Lecture 3: Text Classification II | |
| 01/22 | Lecture 4: Neural Networks | |
| 01/27 | Lecture 5: Representation Learning | |
| 01/29 | Lecture 6: Recurrent Neural Networks | |
| 02/03 | Lecture 7: Attention and Transformers | |
| 02/05 | Lecture 8: Transformers II | |
| 02/10 | Lecture 9: Computational Linguistics | |
| 02/12 | Lecture 10: Morphology | |
| 02/17 | Lecture 11: Syntax | |
| 02/19 | Lecture 12: Syntax II | |
| 02/24 | Lecture 13: Semantics | |
| 02/26 | Lecture 14: Pragmatics | |
| 03/03 | Lecture 15: Discourse | |
| 03/05 | Lecture 16: Language Modeling | |
| 03/10 | Lecture 17: Transformer Language Models | |
| 03/12 | Lecture 18: Scaling | |
| 03/17 | Spring Break: No class | |
| 03/19 | Spring Break: No class | |
| 03/24 | Lecture 19: Prompting | |
| 03/26 | Lecture 20: Retrieval and Agents | |
| 03/31 | Lecture 21: Fine-tuning | |
| 04/02 | Lecture 22: Distillation | |
| 04/07 | Lecture 23: Quantization | |
| 04/09 | Lecture 24: Reinforcement Learning | |
| 04/14 | Lecture 25: Reinforcement Learning II | |
| 04/16 | Lecture 26: Multi-modal NLP | |
| 04/21 | Lecture 27: Multi-modal NLP II | |
| 04/23 | TBA | |
| 04/28 | TBA | |
| 04/30 | TBA | |
Assignments are to be submitted by the due date listed. Each person is allowed a total of 5 late days, which may be applied to any combination of assignments during the semester without penalty. After that, a late penalty of 15% per day will be applied; for example, a submission that is 2 days late after your 5 free late days are exhausted incurs a 30% penalty. Use of a partial day counts as a full day.
Use of extension days must be stated explicitly in the late submission (either directly in the submission header or in an accompanying email to the TA); otherwise, late penalties will apply. Extension days cannot be used after the final day of classes.
Extension days cannot be rearranged after they are applied to a submission. Use them wisely!
Assignments will NOT be accepted if they are more than five days late. Additional extensions will be granted only for serious and documented medical or family emergencies.
Please read the departmental academic integrity policy. It will be followed unless we provide written documentation of exceptions. We encourage you to interact with one another: you may discuss and obtain help with basic concepts covered in lectures or the textbook, with the homework specification (but not its solution), and with program implementation (but not design). However, unless otherwise noted, work turned in should reflect your own efforts and knowledge. Sharing or copying solutions is unacceptable and may result in failure. We use copy-detection software, so do not copy code (whether from the Web or from other students) and make superficial changes. You are expected to take reasonable precautions to prevent others from using your work.
Students are permitted to use generative AI tools such as ChatGPT if they find them helpful. These tools can accelerate low-level tasks, such as writing boilerplate code. However, we urge students to be wary of model output on some tasks. These tools can be very effective for paraphrasing or correcting grammar, but they do produce errors on other tasks, such as analyzing research papers or scientifically scrutinizing an experimental setup. Be especially mindful when using such tools to generate code, as they will insert bugs (often making unnatural, non-human mistakes that can sometimes be very difficult to detect). Over-relying on AI can leave you poorly prepared for the final exam, which is a significant portion of the grade.