Preprints
How Well Does Agent Development Reflect Real-World Work?
Zora Zhiruo Wang, Sanidhya Vijayvargiya, Aspen Chen, Hanmo Zhang, Venu Arvind Arangarajan, Jett Chen, Valerie Chen, Diyi Yang, Daniel Fried, Graham Neubig
arXiv, 2026.
site
Hybrid-Gym: Training Coding Agents to Generalize Across Tasks
Yiqing Xie, Emmy Liu, Gaokai Zhang, Nachiket Kotalwar, Shubham Gandhi, Sathwik Acharya, Xingyao Wang, Carolyn Rose, Graham Neubig, Daniel Fried
arXiv, 2026.
code
Position: Humans are Missing from AI Coding Agent Research
Zora Zhiruo Wang*, John Yang*, Kilian Lieret*, Alexa Tartaglini, Valerie Chen, Yuxiang Wei, Zijian Wang, Lingming Zhang, Karthik Narasimhan, Ludwig Schmidt, Graham Neubig, Daniel Fried, Diyi Yang
preprint, 2026.
Reasoning with Latent Tokens in Diffusion Language Models
Andre He, Sean Welleck*, Daniel Fried*
arXiv, 2026.
Propose, Solve, Verify: Self-Play Through Formal Verification
Alex Wilf, Pranjal Aggarwal, Bryan Parno, Daniel Fried, Louis-Philippe Morency, Paul Pu Liang, Sean Welleck
arXiv, 2025.
Toward Training Superintelligent Software Agents through Self-Play SWE-RL
Yuxiang Wei, Zhiqing Sun, Emily McMilin, Jonas Gehring, David Zhang, Gabriel Synnaeve, Daniel Fried, Lingming Zhang, Sida Wang
arXiv, 2025.
Measuring Fine-Grained Negotiation Tactics of Humans and LLMs in Diplomacy
Wenkai Li*, Lynnette Hui Xian Ng*, Andy Liu, Daniel Fried
arXiv, 2025.
How Do AI Agents Do Human Work? Comparing AI and Human Workflows Across Diverse Occupations
Zora Zhiruo Wang, Yijia Shao, Omar Shaikh, Daniel Fried, Graham Neubig, Diyi Yang
arXiv, 2025.
Success and Cost Elicit Convention Formation for Efficient Communication
Saujas Vaduguru, Yilun Hua, Yoav Artzi, Daniel Fried
arXiv, 2025.
Analyzing Information Sharing and Coordination in Multi-Agent Planning
Tianyue Ou, Saujas Vaduguru, Daniel Fried
arXiv, 2025.
MetaLint: Generalizable Idiomatic Code Quality Analysis through Instruction-Following and Easy-to-Hard Generalization
Atharva Naik, Lawanya Baghel, Dhakshin Govindarajan, Darsh Agrawal, Daniel Fried, Carolyn Rose
arXiv, 2025.
CodeBenchGen: Creating Scalable Execution-based Code Generation Benchmarks
Yiqing Xie, Alex Xie, Divyanshu Sheth, Pengfei Liu, Daniel Fried, and Carolyn Rose
arXiv, 2024.
2026
Generative Value Conflicts Reveal LLM Priorities
Andy Liu, Kshitish Ghate, Mona Diab*, Daniel Fried*, Atoosa Kasirzadeh*, Max Kleiman-Weiner*
ICLR, 2026.
code
From Reproduction to Replication: Evaluating Research Agents with Progressive Code Masking
Gyeongwon James Kim, Alex Wilf, Louis-Philippe Morency, Daniel Fried
ICLR, 2026.
code
2025
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Yuxiang Wei, Olivier Duchenne, Jade Copet, Quentin Carbonneaux, Lingming Zhang, Daniel Fried, Gabriel Synnaeve, Rishabh Singh, and Sida I. Wang
NeurIPS, 2025.
code
Identifying and Interactively Refining Ambiguous User Goals for Data Visualization Code Generation
Mert Inan, Anthony Sicilia, Alex Xie, Saujas Vaduguru, Daniel Fried, and Malihe Alikhani
EMNLP, 2025.
Rewarding the Unlikely: Lifting GRPO Beyond Distribution Sharpening
Andre He, Daniel Fried, and Sean Welleck
EMNLP, 2025.
mrCAD: Multimodal Refinement of Computer-aided Designs
William P. McCarthy, Saujas Vaduguru, Karl D. D. Willis, Justin Matejka, Judith E. Fan, Daniel Fried, and Yewen Pu
Findings of EMNLP, 2025.
dataset
Inducing Programmatic Skills for Agentic Tasks
Zora Zhiruo Wang, Apurva Gandhi, Graham Neubig, Daniel Fried
COLM, 2025.
code
RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing
Yiqing Xie, Alex Xie, Divyanshu Sheth, Pengfei Liu, Daniel Fried, Carolyn Rose
COLM, 2025.
code
Improving Model Factuality with Fine-grained Critique-based Evaluator
Yiqing Xie, Wenxuan Zhou, Pradyot Prakash, Di Jin, Yuning Mao, Quintin Fettes, Arya Talebzadeh, Sinong Wang, Han Fang, Carolyn Rose, Daniel Fried, and Hejia Zhang
ACL, 2025.
Agent Workflow Memory
Zora Zhiruo Wang, Jiayuan Mao, Daniel Fried, and Graham Neubig
ICML, 2025.
code
Dynamic Coalition Structure Detection in Natural Language-based Interactions
Abhishek N. Kulkarni*, Andy Liu*, Jean-Raphael Gaglione, Daniel Fried, and Ufuk Topcu
AAMAS, 2025.
AutoPresent: Designing Structured Visuals from Scratch
Jiaxin Ge*, Zora Zhiruo Wang*, Xuhui Zhou, Yi-Hao Peng, Sanjay Subramanian, Qinyue Tan, Maarten Sap, Alane Suhr**, Daniel Fried**, Graham Neubig**, and Trevor Darrell**
CVPR, 2025.
code
CRScore: Grounding Automated Evaluation of Code Review Comments in Code Claims and Smells
Atharva Naik, Marcus Alenius, Daniel Fried, and Carolyn Rose
NAACL, 2025.
CodeRAG-Bench: Can Retrieval Augment Code Generation?
Zora Zhiruo Wang*, Akari Asai*, Xinyan Velocity Yu, Frank F. Xu, Yiqing Xie, Graham Neubig, and Daniel Fried
Findings of NAACL, 2025.
project page,
code
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Terry Yue Zhuo et al. (33 authors from the BigCode project)
ICLR, 2025.
project page,
code
Human-Aligned Chess with a Bit of Search
Yiming Zhang, Athul Paul Jacob, Vivian Lai, Daniel Fried, and Daphne Ippolito
ICLR, 2025.
code and models
Repetition Improves Language Model Embeddings
Jacob Mitchell Springer, Suhas Kotha, Daniel Fried, Graham Neubig, and Aditi Raghunathan
ICLR, 2025.
Dissecting Adversarial Robustness of Multimodal LM Agents
Chen Henry Wu, Rishi Shah, Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried, and Aditi Raghunathan
ICLR, 2025.
project page
Tree Search for Language Model Agents
Jing Yu Koh, Stephen McAleer, Daniel Fried, and Ruslan Salakhutdinov
TMLR, 2025.
code,
project page
2024
Comparative Knowledge Distillation
Alex Tianyi Xu*, Alex Wilf*, Paul Pu Liang, Alexander Obolenskiy, Daniel Fried, and Louis-Philippe Morency
WACV, 2024.
ECCO: Can We Improve Model-Generated Code Efficiency Without Sacrificing Functional Correctness?
Siddhant Waghjale*, Vishruth Veerendranath*, Zora Zhiruo Wang, and Daniel Fried
EMNLP, 2024.
code,
project page
What Are Tools Anyway? A Survey from the Language Model Perspective
Zora Zhiruo Wang, Zhoujun Cheng, Hao Zhu, Daniel Fried, and Graham Neubig
COLM, 2024.
Human-Agent Cooperation in Games under Incomplete Information through Natural Language Communication
Shenghui Chen, Daniel Fried, and Ufuk Topcu
IJCAI, 2024.
Evaluating Large Language Model Biases in Persona-Steered Generation
Andy Liu, Mona T. Diab, and Daniel Fried
Findings of ACL, 2024.
Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Non-Literal Intent Resolution in LLMs
Akhila Yerukola, Saujas Vaduguru, Daniel Fried, and Maarten Sap
ACL, 2024.
VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks
Jing Yu Koh, Robert Lo*, Lawrence Jang*, Vikram Duvvur*, Ming Chong Lim*, Po-Yu Huang*, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, and Daniel Fried
ACL, 2024.
TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks
Zhiruo Wang, Graham Neubig, and Daniel Fried
ICML, 2024.
code
Amortizing Pragmatic Program Synthesis with Rankings
Yewen Pu, Saujas Vaduguru, Priyan Vaithilingam, Elena Glassman, and Daniel Fried
ICML, 2024.
Asking More Informative Questions for Grounded Retrieval
Sedrick Keh, Justin T. Chiu, and Daniel Fried
Findings of NAACL, 2024.
Generating Pragmatic Examples to Train Neural Program Synthesizers
Saujas Vaduguru, Daniel Fried, and Yewen Pu
ICLR, 2024.
Sotopia: Interactive Evaluation for Social Intelligence in Language Agents
Xuhui Zhou*, Hao Zhu*, Leena Mathur, Ruohong Zhang, Haofei Yu, Zhengyang Qi, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig, and Maarten Sap
ICLR, 2024.
project page
WebArena: A Realistic Web Environment for Building Autonomous Agents
Shuyan Zhou*, Frank Xu*, Hao Zhu**, Xuhui Zhou**, Robert Lo**, Abishek Sridhar**, Xianyi Cheng, Yonatan Bisk, Daniel Fried, Uri Alon, and Graham Neubig
ICLR, 2024.
project page
2023
API-Assisted Code Generation for Question Answering on Varied Table Structures
Yihan Cao*, Shuyi Chen*, Ryan Liu*, Zhiruo Wang, and Daniel Fried
EMNLP, 2023.
Symbolic Planning and Code Generation for Grounded Dialogue
Justin Chiu, Wenting Zhao, Derek Chen, Saujas Vaduguru, Alexander Rush, and Daniel Fried
EMNLP, 2023.
Pragmatics in Language Grounding: Phenomena, Tasks, and Modeling Approaches
Daniel Fried*, Nicholas Tomlin*, Jennifer Hu, Roma Patel, and Aida Nematzadeh
Findings of EMNLP, 2023.
Execution-Based Evaluation for Open-Domain Code Generation
Zhiruo Wang, Shuyan Zhou, Daniel Fried, and Graham Neubig
Findings of EMNLP, 2023.
dataset
Data Augmentation for Code Translation with Comparable Corpora and Multiple References
Yiqing Xie, Atharva Naik, Daniel Fried, Carolyn Rose
Findings of EMNLP, 2023.
AutoReply: Detecting Nonsense in Dialogue Introspectively with Discriminative Replies
Weiyan Shi, Emily Dinan, Adi Renduchintala, Daniel Fried, Athul Paul Jacob, Zhou Yu, and Mike Lewis
Findings of EMNLP, 2023.
Generating Images with Multimodal Language Models
Jing Yu Koh, Daniel Fried, and Ruslan Salakhutdinov
NeurIPS, 2023.
project page
Pragmatic Inference with a CLIP Listener for Contrastive Captioning
Jiefu Ou, Benno Krojer, and Daniel Fried
Findings of ACL, 2023.
Contrastive Decoding: Open-ended Text Generation as Optimization
Xiang Lisa Li, Ari Holtzman, Daniel Fried, Percy Liang, Jason Eisner, Tatsunori Hashimoto, Luke Zettlemoyer, and Mike Lewis
ACL, 2023.
code
Grounding Language Models to Images for Multimodal Inputs and Outputs
Jing Yu Koh, Ruslan Salakhutdinov, and Daniel Fried
ICML, 2023.
project page
Coder Reviewer Reranking for Code Generation
Tianyi Zhang, Tao Yu, Tatsunori B. Hashimoto, Mike Lewis, Wen-tau Yih, Daniel Fried, and Sida I. Wang
ICML, 2023.
code
DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation
Yuhang Lai*, Chengxi Li*, Yiming Wang*, Tianyi Zhang*, Ruiqi Zhong*, Luke Zettlemoyer, Scott Wen-tau Yih, Daniel Fried, Sida I. Wang, and Tao Yu
ICML, 2023.
site,
code,
data
InCoder: A Generative Model for Code Infilling and Synthesis
Daniel Fried*, Armen Aghajanyan*, Jessy Lin, Sida I. Wang, Eric Wallace, Freda Shi, Ruiqi Zhong, Wen-tau Yih, Luke Zettlemoyer, and Mike Lewis
ICLR, 2023.
site,
code and models,
demo
StarCoder: May the Source Be With You!
Raymond Li et al. (68 authors from the BigCode Project)
TMLR, 2023.
project page,
models,
demo
SantaCoder: Don't Reach for the Stars
Loubna Ben Allal*, Raymond Li*, Denis Kocetkov*, et al. (41 authors from the BigCode Project)
Deep Learning for Code Workshop, 2023.
models
Best Paper Award
2022
Natural Language to Code Translation with Execution
Freda Shi, Daniel Fried, Marjan Ghazvininejad, Luke Zettlemoyer, and Sida I. Wang
EMNLP, 2022.
code
Neural Theory-of-Mind? On the Limits of Social Intelligence in Large LMs
Maarten Sap, Ronan Le Bras, Daniel Fried, and Yejin Choi
EMNLP, 2022.
G3: Geolocation via Guidebook Grounding
Grace Luo*, Giscard Biamby*, Trevor Darrell, Daniel Fried, and Anna Rohrbach
Findings of EMNLP, 2022.
code
Inferring Rewards from Language in Context
Jessy Lin, Daniel Fried, Dan Klein, and Anca Dragan
ACL, 2022.
code and data
Human-Level Play in the Game of Diplomacy by Combining Language Models with Strategic Reasoning
FAIR Diplomacy Team
Science, 2022.
site,
code,
blog,
article
Modeling Perspective-Dependent Ambiguity in Grounded Collaborative Dialogue
Justin Chiu, Wenting Zhao, Alexander M. Rush, and Daniel Fried
Wordplay: When Language Meets Games Workshop, 2022.
2021
Reference-Centric Models for Grounded Collaborative Dialogue
Daniel Fried, Justin Chiu, and Dan Klein
EMNLP, 2021.
talk,
slides [pdf],
poster [pdf],
code
Modular Networks for Compositional Instruction Following
Rodolfo Corona, Daniel Fried, Coline Devin, Dan Klein, and Trevor Darrell
NAACL, 2021.
Learning Grounded Pragmatic Communication
Daniel Fried
PhD thesis, 2021.
job talk slides,
video
Interactive Assignments for Teaching Structured Neural NLP
David Gaddy, Daniel Fried, Nikita Kitaev, Mitchell Stern, Rodolfo Corona, John DeNero, and Dan Klein
Teaching NLP Workshop at NAACL, 2021.
2020 and before
Learning to Segment Actions from Observation and Narration
Daniel Fried, Jean-Baptiste Alayrac, Phil Blunsom, Chris Dyer, Stephen Clark, Aida Nematzadeh
ACL, 2020.
talk,
slides [pdf],
code
Syntactic Structure Distillation Pretraining for Bidirectional Encoders
Adhiguna Kuncoro*, Lingpeng Kong*, Daniel Fried*, Dani Yogatama, Laura Rimell, Chris Dyer, and Phil Blunsom
TACL, 2020.
talk
Cross-Domain Generalization of Neural Constituency Parsers
Daniel Fried*, Nikita Kitaev*, and Dan Klein
ACL, 2019.
talk,
slides [pdf],
code & models
Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation
Ronghang Hu, Daniel Fried, Anna Rohrbach, Dan Klein, Trevor Darrell, and Kate Saenko
ACL, 2019.
poster [pdf]
Pragmatically Informative Text Generation
Sheng Shen, Daniel Fried, Jacob Andreas, and Dan Klein
NAACL, 2019.
slides [pdf]
Speaker-Follower Models for Vision-and-Language Navigation
Daniel Fried*, Ronghang Hu*, Volkan Cirik*, Anna Rohrbach, Jacob Andreas, Louis-Philippe Morency, Taylor Berg-Kirkpatrick, Kate Saenko, Dan Klein**, and Trevor Darrell**
NeurIPS, 2018.
Policy Gradient as a Proxy for Dynamic Oracles in Constituency Parsing
Daniel Fried and Dan Klein
ACL, 2018.
talk,
slides [pptx],
slides [pdf],
code
Unified Pragmatic Models for Generating and Following Instructions
Daniel Fried, Jacob Andreas, and Dan Klein
NAACL, 2018.
talk,
slides [pptx],
slides [pdf],
code
Effective Inference for Generative Neural Parsing
Mitchell Stern, Daniel Fried, and Dan Klein
EMNLP, 2017.
poster
Improving Neural Parsing by Disentangling Model Combination and Reranking Effects
Daniel Fried*, Mitchell Stern*, and Dan Klein
ACL, 2017.
talk,
slides [pptx],
slides [pdf],
code
Towards Using Social Media to Identify Individuals at Risk for Preventable Chronic Illness
Dane Bell, Daniel Fried, Luwen Huangfu, Mihai Surdeanu, and Stephen Kobourov
LREC, 2016.
Challenges for Using Social Media for Early Detection of Type II Diabetes Mellitus
Dane Bell, Daniel Fried, Luwen Huangfu, Mihai Surdeanu, and Stephen Kobourov
International Workshop on Social Media World Sensors, 2016.
Low-Rank Tensors for Verbs in Compositional Distributional Semantics
Daniel Fried, Tamara Polajnar, and Stephen Clark
ACL, 2015.
poster [pdf]
Higher-Order Lexical Semantic Models for Non-Factoid Answer Reranking
Daniel Fried, Peter Jansen, Gustave Hahn-Powell, Mihai Surdeanu, and Peter Clark
TACL, 2015.
slides [pdf]
Low-rank Tensor Approximations for Compositional Distributional Semantics
Daniel Fried
MPhil thesis, 2015.
Learning Low-Rank Tensors for Transitive Verbs
Daniel Fried, Tamara Polajnar, and Stephen Clark
Advances in Distributional Semantics Workshop, 2015.
Incorporating both Distributional and Relational Semantics in Word Representations
Daniel Fried and Kevin Duh
ICLR, 2015.
long version [arxiv],
poster [pdf]
Analyzing the Language of Food on Social Media
Daniel Fried, Mihai Surdeanu, Stephen Kobourov, Melanie Hingle, and Dane Bell
International Conference on Big Data, 2014.
long version [arxiv],
slides [pdf],
demo
Maps of Computer Science
Daniel Fried and Stephen Kobourov
PacificVis, 2014.
slides,
poster,
code,
demo
Predicting Parallelization of Sequential Programs Using Supervised Learning
Daniel Fried, Zhen Li, Ali Jannesari, and Felix Wolf
International Conference on Machine Learning and Applications, 2013.
A Generative Probabilistic Framework for Learning Spatial Language
Colin Dawson, Jeremy Wright, Antons Rebguns, Marco Valenzuela Escarcega, Daniel Fried, and Paul Cohen
International Conference on Development and Learning, 2013.
Best Paper Award
Bayesian Geometric Modeling of Indoor Scenes
Luca Del Pero, Joshua Bowdish, Daniel Fried, Bonnie Kermgard, Emily Hartley, and Kobus Barnard
CVPR, 2012.