CSCI 7000: Neuro-Symbolic Approaches to NLP

Instructor: Maria L. Pacheco
Class times: Tue/Thu 11:00 am-12:15 pm
Class location: ECCR 1B55
Office hours:

  • Location: ECOT 822
  • Thursdays before class: 9:30 to 10:45 am, no appointment needed
  • By appointment

Course Description and Learning Objectives

The goal of neuro-symbolic methods is to combine symbolic representations and neural networks to benefit from the complementary strengths of the two paradigms. These ideas have a long history in AI and are currently experiencing a resurgence of interest in several AI communities, including Natural Language Processing. In this class, we will survey and discuss foundational and emerging literature on neuro-symbolic representations as applied to natural language processing and computational linguistics. The main goal of this class is to provide students with a framework for analyzing the different modeling and algorithmic choices when combining symbolic and neural models for knowledge representation, learning and reasoning, building a base from which students can go on to do research in these areas.

The bulk of the class will be dedicated to paper reading, presentations and discussion. Each student should expect to present one research paper during the semester. Students will also need to write a short paper review (1 or 2 paragraphs) in the form of comments and questions and post it on Canvas before each paper discussion. There will be a course project, in which students will work alone or in groups to design and carry out a research project or a paper reproduction project related to neuro-symbolic NLP.

Pre-requisites

This is an advanced research class and assumes that students have taken a graduate-level AI/Machine Learning/NLP class or have acquired equivalent experience. Students should also be comfortable with at least one programming language.

Grading

  • Course project: 35%
  • Paper reviews: 25%
  • Paper presentation: 20%
  • Class participation: 15%
  • Quizzes: 10%

Course Schedule

Week 1: Introduction

  • Tue. Aug 29: Introduction and logistics, review of relevant NLP/ML concepts
  • Thu. Aug 31: Tutorial on Research Computing GPU resources at CU by OIT team

Week 2: Modularity and Decomposability

Week 3: Node and Graph Embeddings

Week 4: Augmenting NNets with Logic

Week 5: Logic-based Loss Design

Week 6: Structured Learning and Prediction

Week 7: Deep Structured Learning and Prediction

Week 8: Probabilistic Soft Logic

Week 9: Deep Probabilistic Logic

Week 10: Vision + Language

Week 11: Dynamic Induction and Reasoning

Week 12: Large Language Models - Modularity

Week 13: Fall Break

  • Tue. Nov 21: No class
  • Thu. Nov 23: No class

Week 14: Large Language Models - Knowledge Distillation

Week 15: Large Language Models - Constraints

Week 16: Final Project Presentations

  • Tue. Dec 12: Final Project Presentations. Quiz 6 at the end of the class (Weeks 12-15)
  • Thu. Dec 14: Final Project Presentations

Week 17: Final Project Report due on Tue. Dec 19 at midnight

Assignments in More Detail

Paper Reviews

For each lecture, students should expect to read one research paper on a specific topic in Neuro-Symbolic NLP. For each required paper reading, students need to submit a short paper review (1 or 2 paragraphs) in the form of questions or comments on Canvas before the class. The grading of the paper review will depend on the overall quality of the questions and/or comments. As you read a paper and write your review, focus on the following perspectives:

  • Motivation of the work: What are the research questions that this paper tries to answer? How important is the problem that the paper aims to address? Who will care about the findings and why should they care?
  • Novelty and significance of the work: What was new at the time of publication? What are the main contributions of the paper? What did you find most interesting?
  • Limitations, flaws, and blind spots: Is the research question ill defined? Does the paper fail to deliver on its promise? Are there any unrealistic or false assumptions about the goal or target domain? Are there flaws or mistakes in the technical approach or experimental design?
  • Future work: How would you improve on this work? Does this paper inspire any new ideas in your own research?

Reviews are due at 9 am before the class, no late submissions allowed. To calculate your final grade, your paper review with the lowest score will be dropped (1 out of 26 paper reviews).

Paper Presentation

Each student should expect to present one research paper during the course. The instructor will ask students to sign up for papers by the end of the first day of class. The instructor will present any unselected papers during the course. Each paper presentation should take no more than 40 minutes, so that we have enough time for discussion. The presentation should elaborate on the motivation, related work, research questions, methodology/experimental design, findings, limitations, and future work stated in the paper. To make your presentation more insightful, try to position the paper's contribution with respect to the related literature and tell the audience why this work was proposed in the first place, how it advances our understanding of the topic, and how it differs from earlier related work. You are also encouraged to connect the assigned paper to your own research. You should prepare a set of questions (either written by yourself or based on questions other students post on Canvas) and co-lead an in-class discussion with the instructor after the presentation.

I recommend that you give yourself enough time to read the paper and prepare your presentation. If you encounter any difficulties understanding the technical content of the paper, please attend office hours for help.

In-Class Discussions

The in-class discussion will follow the think-pair-share format.

  • Think. The presenter or the instructor will provoke students’ thinking with a question. The students should take one or two minutes just to THINK about the question.
  • Pair. Using designated partners (such as with Clock Buddies), nearby neighbors, or a deskmate, students PAIR up to talk about their answers with each other. They compare their mental or written notes and identify the answers they think are best, most convincing, or most unique.
  • Share. After students talk in pairs for a few minutes, the presenter or instructor will call for pairs to SHARE their thinking with the rest of the class.

Course Project

Students are expected to work on a course project either alone or in groups (no more than 3 students in a group). There are two options for projects:

1) Research Projects

Students can pick any topic related to NLP as long as they incorporate at least one symbolic and one distributed component in the method design. Between Week 1 and Week 5, students should make sure to stop by during office hours to discuss project ideas with the instructor and get early feedback on the relevance, novelty, feasibility, and significance of the ideas. Students are encouraged to use this class to advance their own research, so it is perfectly acceptable to present ongoing individual research work as part of the course project as long as it is relevant to the class.

2) Paper Reproduction Projects

Students can select an existing, published, experimental paper and carry out a reproduction study. The paper does not need to be from the class reading list, as long as it is relevant to the topics covered in this class. The objective is to assess whether the experiments are reproducible and to determine whether the conclusions of the paper are supported by your findings. Your results can be either positive (i.e., confirm reproducibility) or negative (i.e., explain what you were unable to reproduce, and potentially explain why). I suggest that you first attempt to reimplement the experiments of the paper from scratch. Using published code is allowed, as long as you state this clearly in your report. I recommend that you focus on the central claim of the paper. Note that exact reproducibility is in most cases very difficult due to minor implementation details, so results that are close to those in the original paper are enough for a positive conclusion. You do not need to reproduce all experiments in your selected paper, only those that you feel are sufficient to verify the validity of the central claim.

Just re-running code is not a reproducibility study. You need to approach any code with critical thinking, verify that it does what is described in the paper, and check that the experiments are sufficient to support the conclusions of the paper. I suggest that you take a look at the ML Reproducibility Challenge for guidelines and resources on how to carry out good reproducibility research. Between Week 1 and Week 5, students should make sure to stop by during office hours to get early feedback on their paper choice.

Course Project Grading:
  • Project Proposal: A short project proposal is due on Sun. Oct 1 at midnight (Week 5). This proposal is worth 5% of your overall grade. The proposal should be no longer than 2 pages (minus references) and include the following sections:
    1. Title for your project (this can also change later on).
    2. Introduction section, which should explain the context of the paper. It should contain the following information:
      1. Task / Research Question Description: For research projects, what is the task you are trying to solve or what is the research question you are trying to answer? For reproduction studies, what does the original paper propose and which claims do you want to verify?
      2. Motivation & Limitations of existing work: Have others tried to solve the same task or answer a similar research question? What are you/they trying to do differently and why? What were the limitations or shortcomings of prior work?
      3. Likely challenges and mitigations: What is hard about this task / research question? What are your contingency plans if the reproduction turns out to be harder than expected or experiments do not go as planned?
    3. Related Work Section: Include 3-4 sentence descriptions of no fewer than 4 papers directly relevant to the proposed research. Also mention how the paper you are working on differs from these.

  • Mid-Point Project Report: A mid-point project summary is due on Sun. Oct 29 at midnight (Week 9). This summary is worth 10% of your overall grade. The report should be no longer than 4 pages (minus references) and it should include the following sections:
    1. Title: Can include any revisions to your previous title.
    2. Introduction: Can include any revisions to your previous introduction.
    3. Methodology Section: For research projects, this section should describe the envisioned approach/methodology. For reproduction projects, this section should include a road map for the reproduction study, i.e., a detailed description of which methods you are going to re-implement from scratch and which existing resources you are re-using (if any).
    4. Experimental Section: a description of the experimental setup and any preliminary results obtained.

  • Final Project Presentation: In Week 16, each team will deliver a presentation of their project. The presentation will be about 20 minutes + 5 minutes for Q&A. The presentation is worth 10% of your total grade.

  • Final Report: A final report is due on Tue. Dec 19 at midnight. This report should be no longer than 8 pages (minus references) and it is worth 10% of your overall grade. Your final project report should be built upon your proposal and project summary. Feel free to reuse sections from those two reports in your final report. You may include an appendix beyond 8 pages, but your paper must be understandable without it. Submissions should be in the ACL format. Your final report should be structured like a conference paper. It should contain:
    1. Title
    2. Abstract
    3. A well-motivated introduction
    4. Related work with proper citations
    5. Description of your methodology
    6. Experimental results
    7. Discussion of your findings and limitations
    8. Conclusions and future work

Please include a link to your code in your final report. Please also add a README file in your repository to describe how to run and test your code.

Important dates:
  • Project proposal due on Sun. Oct 1 at midnight (Week 5)
  • Mid-point project report due on Sun. Oct 29 at midnight (Week 9)
  • Final project presentations in class in Week 16
  • Final project report due on Tue. Dec 19 at midnight (Week 17)

Quizzes

We will have six quizzes during this course. Each quiz will assess your understanding of the topics that we have covered in the previous two or three weeks. The quizzes will include multiple-choice questions and open-ended questions. You should be able to do well in quizzes as long as you attend the lectures and pay attention to the discussions in class. Each quiz will take 10 minutes. To calculate your final grade, your lowest scoring quiz will be dropped (1 out of 6 quizzes). The instructor cannot accommodate quizzes on a different date unless there are extenuating circumstances.

Late Submission Policy

  • All paper review comments must be submitted by 9 AM before the class. No late submission is allowed.
  • For the course project proposal, mid-point project summary, and final project report, late submissions will be accepted with a penalty of 20% of the credit per day late.

University Policies

Classroom Behavior

Students and faculty are responsible for maintaining an appropriate learning environment in all instructional settings, whether in person, remote, or online. Those who fail to adhere to such behavioral standards may be subject to discipline. Professional courtesy and sensitivity are especially important with respect to individuals and topics dealing with race, color, national origin, sex, pregnancy, age, disability, creed, religion, sexual orientation, gender identity, gender expression, veteran status, political affiliation, or political philosophy.

For more information, see the classroom behavior policy, the Student Code of Conduct, and the Office of Institutional Equity and Compliance.

Requirements for Infectious Diseases

Members of the CU Boulder community and visitors to campus must follow university, department, and building health and safety requirements and all public health orders to reduce the risk of spreading infectious diseases.

The CU Boulder campus is currently mask optional. However, if masks are again required in classrooms, students who fail to adhere to masking requirements will be asked to leave class. Students who do not leave class when asked or who refuse to comply with these requirements will be referred to Student Conduct & Conflict Resolution. Students who require accommodation because a disability prevents them from fulfilling safety measures related to infectious disease will be asked to follow the steps in the “Accommodation for Disabilities” statement on this syllabus.

If you feel ill and think you might have COVID-19, or if you have tested positive for COVID-19, please stay home and follow the further guidance of the Public Health Office. If you have been in close contact with someone who has COVID-19 but do not have any symptoms and have not tested positive for COVID-19, you do not need to stay home.

Accommodation for Disabilities, Temporary Medical Conditions, and Medical Isolation

Disability Services determines accommodations based on documented disabilities in the academic environment. If you qualify for accommodations because of a disability, submit your accommodation letter from Disability Services to your faculty member in a timely manner so your needs can be addressed. Contact Disability Services at 303-492-8671 or dsinfo@colorado.edu for further assistance.

If you have a temporary medical condition or required medical isolation for which you require accommodation, please get in touch with the instructor as soon as possible. Also see Temporary Medical Conditions on the Disability Services website.

Preferred Student Names and Pronouns

CU Boulder recognizes that students’ legal information doesn’t always align with how they identify. Students may update their preferred names and pronouns via the student portal; those preferred names and pronouns are listed on instructors’ class rosters. In the absence of such updates, the name that appears on the class roster is the student’s legal name.

Honor Code

All students enrolled in a University of Colorado Boulder course are responsible for knowing and adhering to the Honor Code. Violations of the Honor Code may include but are not limited to: plagiarism (including use of paper writing services or technology [such as essay bots]), cheating, fabrication, lying, bribery, threat, unauthorized access to academic materials, clicker fraud, submitting the same or similar work in more than one course without permission from all course instructors involved, and aiding academic dishonesty.

All incidents of academic misconduct will be reported to Student Conduct & Conflict Resolution: honor@colorado.edu, 303-492-5550. If you have read this far into the syllabus, well done! Send me a link to your favorite song or video for a small bump to your grade. Students found responsible for violating the Honor Code will be assigned resolution outcomes from the Student Conduct & Conflict Resolution as well as be subject to academic sanctions from the faculty member. Visit Honor Code for more information on the academic integrity policy.

Sexual Misconduct, Discrimination, Harassment and/or Related Retaliation

CU Boulder is committed to fostering an inclusive and welcoming learning, working, and living environment. University policy prohibits protected-class discrimination and harassment, sexual misconduct (harassment, exploitation, and assault), intimate partner violence (dating or domestic violence), stalking, and related retaliation by or against members of our community on- and off-campus. These behaviors harm individuals and our community. The Office of Institutional Equity and Compliance (OIEC) addresses these concerns, and individuals who believe they have been subjected to misconduct can contact OIEC at 303-492-2127 or email cureport@colorado.edu. Information about university policies, reporting options, and support resources can be found on the OIEC website.

Please know that faculty and graduate instructors have a responsibility to inform OIEC when they are made aware of incidents related to these policies regardless of when or where something occurred. This is to ensure that individuals impacted receive an outreach from OIEC about their options for addressing a concern and the support resources available. To learn more about reporting and support resources for a variety of issues, visit Don’t Ignore It.

Religious Holidays

Campus policy regarding religious observances requires that faculty make every effort to deal reasonably and fairly with all students who, because of religious obligations, have conflicts with scheduled exams, assignments or required attendance. In this class, please let me know of upcoming religious holidays at least two weeks ahead of time if you will need an accommodation. See the campus policy regarding religious observances for full details.

Mental Health and Wellness

The University of Colorado Boulder is committed to the well-being of all students. If you are struggling with personal stressors, mental health or substance use concerns that are impacting academic or daily life, please contact Counseling and Psychiatric Services (CAPS) located in C4C or call (303) 492-2277, 24/7.

Free and unlimited telehealth is also available through Academic Live Care. The Academic Live Care site also provides information about additional wellness services on campus that are available to students.