Best Papers

ACL’23 implemented the new award policy, which aims for broader recognition of exceptional research, in particular by significantly increasing the pool of outstanding papers to 1.5-2.5% of the total submissions. As a result, this year we have a total of 3 best papers, 4 special award papers (Resource Award, Social Impact Award, Reproduction Award, Theme Paper Award), and 39 outstanding papers! Additionally, there are Area Chair Awards: the Senior Area Chairs of each track had the opportunity to nominate one of their papers for a separate award. Many thanks to our Best Paper Committee for helping us with the selection process!

This page lists all the awards and honorable mentions, as well as the demo track and SRW awards. We also congratulate everybody who was considered for an award: only 1.6% of papers were even nominated by the reviewers. Next year, let’s all be more generous with nominations!

Best Paper Awards

Do Androids Laugh at Electric Sheep? Humor “Understanding” Benchmarks from The New Yorker Caption Contest
Jack Hessel, Ana Marasovic, Jena D. Hwang, Lillian Lee, Jeff Da, Rowan Zellers, Robert Mankoff and Yejin Choi
What the DAAM: Interpreting Stable Diffusion Using Cross Attention
Raphael Tang, Linqing Liu, Akshat Pandey, Zhiying Jiang, Gefei Yang, Karun Kumar, Pontus Stenetorp, Jimmy Lin and Ferhan Ture
From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models
Shangbin Feng, Chan Young Park, Yuhan Liu and Yulia Tsvetkov

Special Awards

Reproduction Award:

Do CoNLL-2003 Named Entity Taggers Still Work Well in 2023?
Shuheng Liu and Alan Ritter

Resource Award:

When Does Translation Require Context? A Data-driven, Multilingual Exploration
Patrick Fernandes, Kayo Yin, Emmy Liu, André Martins and Graham Neubig

Social Impact Award:

Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models
Myra Cheng, Esin Durmus and Dan Jurafsky

Theme Paper Award:

Weaker Than You Think: A Critical Look at Weakly Supervised Learning
Dawei Zhu, Xiaoyu Shen, Marius Mosbach, Andreas Stephan and Dietrich Klakow

Outstanding Papers

Backpack Language Models
John Hewitt, John Thickstun, Christopher Manning and Percy Liang
CAME: Confidence-guided Adaptive Memory Efficient Optimization
Yang Luo, Xiaozhe Ren, Zangwei Zheng, Zhuo Jiang, Xin Jiang and Yang You
Causes and Cures for Interference in Multilingual Translation
Uri Shaham, Maha Elbayad, Vedanuj Goswami, Omer Levy and Shruti Bhosale
Cognitive Reframing of Negative Thoughts through Human-Language Model Interaction
Ashish Sharma, Kevin Rushton, Inna Lin, David Wadden, Khendra Lucas, Adam Miner, Theresa Nguyen and Tim Althoff
Compositional Generalization without Trees using Multiset Tagging and Latent Permutations
Matthias Lindemann, Alexander Koller and Ivan Titov
Considerations for meaningful sign language machine translation based on glosses
Mathias Müller, Zifan Jiang, Amit Moryossef, Annette Rios and Sarah Ebling
Dense-ATOMIC: Towards Densely-connected ATOMIC with High Knowledge Coverage and Massive Multi-hop Paths
Xiangqing Shen, Siwei Wu and Rui Xia
Dissecting Transformer Length Extrapolation via the Lens of Receptive Field Analysis
Ta-Chung Chi, Ting-Han Fan, Alexander Rudnicky and Peter Ramadge
Distilling Script Knowledge from Large Language Models for Constrained Language Planning
Siyu Yuan, Jiangjie Chen, Ziquan Fu, Xuyang Ge, Soham Shah, Charles Jankowski, Yanghua Xiao and Deqing Yang
Do PLMs Know and Understand Ontological Knowledge?
Weiqi Wu, Chengyue Jiang, Yong Jiang, Pengjun Xie and Kewei Tu
Don’t Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments
Yu Gu, Xiang Deng and Yu Su
Extrinsic Evaluation of Machine Translation Metrics
Nikita Moghe, Tom Sherborne, Mark Steedman and Alexandra Birch
Faithful Low-Resource Data-to-Text Generation through Cycle Training
Zhuoer Wang, Marcus Collins, Nikhita Vedula, Simone Filice, Shervin Malmasi and Oleg Rokhlenko
Generalizing Backpropagation for Gradient-Based Interpretability
Kevin Du, Lucas Torroba Hennigen, Niklas Stoehr, Alex Warstadt and Ryan Cotterell
Hexatagging: Projective Dependency Parsing as Tagging
Afra Amini, Tianyu Liu and Ryan Cotterell
Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks
Yun Tang, Anna Sun, Hirofumi Inaguma, Xinyue Chen, Ning Dong, Xutai Ma, Paden Tomasello and Juan Pino
Improving Pretraining Techniques for Code-Switched NLP
Richeek Das, Sahasra Ranjan, Shreya Pathak and Preethi Jyothi
Knowledge Transfer in Incremental Learning for Multilingual Neural Machine Translation
Kaiyu Huang, Peng Li, Jin Ma, Ting Yao and Yang Liu
Language model acceptability judgements are not always robust to context
Koustuv Sinha, Jon Gauthier, Aaron Mueller, Kanishka Misra, Keren Fuentes, Roger Levy and Adina Williams
Linear Classifier: An Often-Forgotten Baseline for Text Classification
Yu-Chen Lin, Si-An Chen, Jie-Jyun Liu and Chih-Jen Lin
Minding Language Models’ (Lack of) Theory of Mind: A Plug-and-Play Multi-Character Belief Tracker
Melanie Sclar, Sachin Kumar, Peter West, Alane Suhr, Yejin Choi and Yulia Tsvetkov
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
Zhiyang Xu, Ying Shen and Lifu Huang
Multilingual LLMs are Better Cross-lingual In-context Learners with Alignment
Eshaan Tanwar, Subhabrata Dutta, Manish Borthakur and Tanmoy Chakraborty
Neural Machine Translation Methods for Translating Text to Sign Language Glosses
Dele Zhu, Vera Czehmann and Eleftherios Avramidis
NLPositionality: Characterizing Design Biases of Datasets and Models
Sebastin Santy, Jenny Liang, Ronan Le Bras, Katharina Reinecke and Maarten Sap
PeaCoK: Persona Commonsense Knowledge for Consistent and Engaging Narratives
Silin Gao, Beatriz Borges, Soyoung Oh, Deniz Bayazit, Saya Kanno, Hiromi Wakaki, Yuki Mitsufuji and Antoine Bosselut
QUEST: A Retrieval Dataset of Entity-Seeking Queries with Implicit Set Operations
Chaitanya Malaviya, Peter Shaw, Ming-Wei Chang, Kenton Lee and Kristina Toutanova
Question-Answering in a Low-resourced Language: Benchmark Dataset and Models for Tigrinya
Fitsum Gaim, Wonsuk Yang, Hancheol Park and Jong Park
Scaling in Cognitive Modelling: a Multilingual Approach to Human Reading Times
Andrea Gregor de Varda and Marco Marelli
SCOTT: Self-Consistent Chain-of-Thought Distillation
Peifeng Wang, Zhengyang Wang, Zheng Li, Yifan Gao, Bing Yin and Xiang Ren
The Mechanical Bard: An Interpretable Machine Learning Approach to Shakespearean Sonnet Generation
Edwin Agnew, Michelle Qiu, Lily Zhu, Sam Wiseman and Cynthia Rudin
The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks
Nikil Selvam, Sunipa Dev, Daniel Khashabi, Tushar Khot and Kai-Wei Chang
Towards Zero-Shot Multilingual Transfer for Code-Switched Responses
Ting-Wei Wu, Changsheng Zhao, Ernie Chang, Yangyang Shi, Pierce Chuang, Vikas Chandra and Biing Juang
Transfer and Active Learning for Dissonance Detection: Addressing the Rare-Class Challenge
Vasudha Varadarajan, Swanie Juhng, Syeda Mahwish, Xiaoran Liu, Jonah Luby, Christian Luhmann and H. Andrew Schwartz
VisText: A Benchmark for Semantically Rich Chart Captioning
Benny Tang, Angie Boggust and Arvind Satyanarayan
What’s the Meaning of Superhuman Performance in Today’s NLU?
Simone Tedeschi, Johan Bos, Thierry Declerck, Jan Hajič, Daniel Hershcovich, Eduard Hovy, Alexander Koller, Simon Krek, Steven Schockaert, Rico Sennrich, Ekaterina Shutova and Roberto Navigli
WikiBio: a Semantic Resource for the Intersectional Analysis of Biographical Events
Marco Antonio Stranisci, Rossana Damiano, Enrico Mensa, Viviana Patti, Daniele Radicioni and Tommaso Caselli
World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models
Ziqiao Ma, Jiayi Pan and Joyce Chai

Area Chair Awards

Linguistic Diversity:

Small Data, Big Impact: Leveraging Minimal Data for Effective Machine Translation
Jean Maillard, Cynthia Gao, Elahe Kalbassi, Kaushik Ram Sadagopan, Vedanuj Goswami, Philipp Koehn, Angela Fan and Francisco Guzman

Sentiment Analysis, Stylistic Analysis, and Argument Mining:

StoryTrans: Non-Parallel Story Author-Style Transfer with Discourse Representations and Content Enhancing
Xuekai Zhu, Jian Guan, Minlie Huang and Juan Liu

Discourse and Pragmatics:

Resolving Indirect Referring Expressions for Entity Selection
Mohammad Javad Hosseini, Filip Radlinski, Silvia Pareti and Annie Louis

Semantics: Sentence-level Semantics, Textual Inference, and Other Areas:

ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset by AMR Back-Translation
Kuan-Hao Huang, Varun Iyer, I-Hung Hsu, Anoop Kumar, Kai-Wei Chang and Aram Galstyan

Question Answering:

DisentQA: Disentangling Parametric and Contextual Knowledge with Counterfactual Question Answering
Ella Neeman, Roee Aharoni, Or Honovich, Leshem Choshen, Idan Szpektor and Omri Abend

Semantics: Lexical:

LexSym: Compositionality as Lexical Symmetry
Ekin Akyurek and Jacob Andreas

NLP Applications:

Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark
Wenjun Peng, Jingwei Yi, Fangzhao Wu, Shangxi Wu, Bin Bin Zhu, Lingjuan Lyu, Binxing Jiao, Tong Xu, Guangzhong Sun and Xing Xie

Speech and Multimodality:

Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition
Yuchen Hu, Ruizhe Li, Chen Chen, Chengwei Qin, Qiu-Shi Zhu and Eng Siong Chng

Interpretability and Analysis of Models for NLP:

Entity Tracking in Language Models
Najoung Kim and Sebastian Schuster

Linguistic Theories, Cognitive Modeling, and Psycholinguistics:

Exploring How Generative Adversarial Networks Learn Phonological Representations
Jingyi Chen and Micha Elsner

Resources and Evaluation:

Tell2Design: A Dataset for Language-Guided Floor Plan Generation
Sicong Leng, Yang Zhou, Mohammed Haroon Dupty, Wee Sun Lee, Sam Joyce and Wei Lu

Multilingualism and Cross-Lingual NLP:

Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages
Ayyoob ImaniGooghari, Peiqin Lin, Amir Hossein Kargaran, Silvia Severini, Masoud Jalili Sabet, Nora Kassner, Chunlan Ma, Helmut Schmid, André Martins, François Yvon and Hinrich Schütze

Demo Track Awards

Best Paper Award:
VisKoP: Visual Knowledge oriented Programming for Interactive Knowledge Base Question Answering
Zijun Yao, Yuanyong Chen, Xin Lv, Shulin Cao, Amy Xin, Jifan Yu, Hailong Jin, Jianjun Xu, Peng Zhang, Lei Hou and Juanzi Li
Outstanding demo paper:
CB2: Collaborative Natural Language Interaction Research Platform
Jacob Sharf, Mustafa Omer Gul and Yoav Artzi
Outstanding demo paper:
disco: a toolkit for Distributional Control of Generative Models
Germán Kruszewski, Jos Rozen and Marc Dymetman

Student Research Workshop Awards

Assessing Chain-of-Thought Reasoning against Lexical Negation: A Case Study on Syllogism
Mengyu Ye, Tatsuki Kuribayashi, Jun Suzuki, Hiroaki Funayama and Goro Kobayashi
Is a Knowledge-based Response Engaging?: An Analysis on Knowledge-Grounded Dialogue with Information Source Annotation
Takashi Kodama, Hirokazu Kiyomaru, Yin Jou Huang, Taro Okahisa and Sadao Kurohashi
LECO: Improving Early Exiting via Learned Exits and Comparison-based Exiting Mechanism
Jingfan Zhang, Ming Tan, Pengyu Dai and Wei Zhu
How-to Guides for Specific Audiences: A Corpus and Initial Findings
Nicola Fanton, Agnieszka Falenska and Michael Roth

Honorable Mentions

ByGPT5: End-to-End Style-conditioned Poetry Generation with Token-free Language Models
Jonas Belouadi and Steffen Eger
DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models
Zijie J. Wang, Evan Montoya, David Munechika, Haoyang Yang, Benjamin Hoover and Duen Horng Chau
DrBERT: A Robust Pre-trained Model in French for Biomedical and Clinical domains
Yanis Labrak, Adrien Bazoge, Richard Dufour, Mickael Rouvier, Emmanuel Morin, Béatrice Daille and Pierre-Antoine Gourraud
Entity Tracking in Language Models
Najoung Kim and Sebastian Schuster
Forgotten Knowledge: Examining the Citational Amnesia in NLP
Janvijay Singh, Mukund Rungta, Diyi Yang and Saif Mohammad
From Characters to Words: Hierarchical Pre-trained Language Model for Open-vocabulary Language Understanding
Li Sun, Florian Luisier, Kayhan Batmanghelich, Dinei Florencio and Cha Zhang
GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding
Jia-Chen Gu, Zhenhua Ling, Quan Liu, Cong Liu and Guoping Hu
Human Inspired Progressive Alignment and Comparative Learning for Grounded Word Acquisition
Yuwei Bao, Barrett Lattimer and Joyce Chai
Latent Positional Information is in the Self-Attention Variance of Transformer Language Models Without Positional Embeddings
Ta-Chung Chi, Ting-Han Fan, Li-Wei Chen, Alexander Rudnicky and Peter Ramadge
Revisiting non-English Text Simplification: A Unified Multilingual Benchmark
Michael Ryan, Tarek Naous and Wei Xu
Synthetic Text Generation with Differential Privacy: A Simple and Practical Recipe
Xiang Yue, Huseyin Inan, Xuechen Li, Girish Kumar, Julia McAnallen, Hoda Shajari, Huan Sun, David Levitan and Robert Sim
Theory-Grounded Computational Text Analysis
Arya D. McCarthy and Giovanna Maria Dora Dore
Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters
Boshi Wang, Sewon Min, Xiang Deng, Jiaming Shen, You Wu, Luke Zettlemoyer and Huan Sun
UniCoRN: Unified Cognitive Signal ReconstructioN bridging cognitive signals and human language
Nuwa Xi, Sendong Zhao, Haochun Wang, Chi Liu, Bing Qin and Ting Liu

Best Video Recordings

You can watch my seven-minute highlights of the best video recordings via the video below. After the highlights, all of the complete video recordings are viewable.

Most Viewed:

KILM: Knowledge Injection into Encoder-Decoder Language Models
Yan Xu, Mahdi Namazifar, Devamanyu Hazarika, Aishwarya Padmakumar, Yang Liu, and Dilek Hakkani-Tür
When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
Alex Mallen, Akari Asai, Victor Zhong, Rajarshi Das, Daniel Khashabi, and Hannaneh Hajishirzi

Highest Rated:

Neural Machine Translation for Mathematical Formulae
Felix Petersen, Moritz Schubotz, André Greiner-Petter, and Bela Gipp

Outstanding Videos:

Generalizing Backpropagation for Gradient-Based Interpretability
Kevin Du, Lucas Torroba Hennigen, Niklas Stoehr, Alexander Warstadt, and Ryan Cotterell
WikiHowQA: A Comprehensive Benchmark for Multi-Document Non-Factoid Question Answering
Valeriia Bolotova, Vladislav Blinov, Sofya Filippova, Falk Scholer, and Mark Sanderson
Score It All Together: A Multi-Task Learning Study on Automatic Scoring of Argumentative Essays
Yuning Ding, Marie Bexte, and Andrea Horbach