William Merrill

I am a Ph.D. student at the Center for Data Science (CDS) at NYU, where I am advised by Tal Linzen and supported by an NSF Graduate Research Fellowship and by AI2.

My research develops theory to better understand what language models can do, as well as what they can't. I've worked on characterizing the computational power of transformers for representing linguistic structure and solving reasoning problems. I've also analyzed the aspects of semantics that can be learned from co-occurrence patterns as a way to understand the potential of self-supervised learning.

Contact: willm[æt]nyu.edu (an anonymous feedback form is also available)

Outside of research, I like exploring New York City by foot, train, and boat. I like cooking new things and trying hole-in-the-wall restaurants. I also play basketball, ping pong, and Age of Empires II.

Latest posts

Apr 15, 2022	Project: Improved Adversarial Robustness via Abstract Interpretation
Apr 16, 2020	A Formal Hierarchy of RNN Architectures
Sep 6, 2019	Theory of Saturated Neural Networks

Publications

2024

arXiv

Let’s Think Dot by Dot: Hidden Computation in Transformer Language Models

Jacob Pfau, William Merrill, and Samuel Bowman

Apr 2024

ICML

The Illusion of State in State-Space Models

William Merrill, Jackson Petty, and Ashish Sabharwal

In ICML, Jul 2024

arXiv

Can You Learn Semantics Through Next-Word Prediction? The Case of Entailment

William Merrill, Zhaofeng Wu, Norihito Naka, and 2 more authors

Feb 2024

arXiv

OLMo: Accelerating the Science of Language Models

Dirk Groeneveld, Iz Beltagy, Pete Walsh, and 40 more authors

Feb 2024

ICML

How Language Model Hallucinations Can Snowball

Muru Zhang, Ofir Press, William Merrill, and 2 more authors

In ICML, Jul 2024

ICLR

The Expressive Power of Transformers with Chain of Thought

William Merrill and Ashish Sabharwal

In ICLR, May 2024

2023

DLT

Formal Languages and the NLP Black Box

William Merrill

In Developments in Language Theory, Jun 2023

ME-FoMo

A Tale of Two Circuits: Grokking as Competition of Sparse and Dense Subnetworks

William Merrill, Nikolaos Tsilivis, and Aman Shukla

In ICLR Workshop on Mathematical and Empirical Understanding of Foundation Models, May 2023

TACL

Transparency Helps Reveal When Language Models Learn Meaning

Zhaofeng Wu, William Merrill, Hao Peng, and 2 more authors

TACL, May 2023

NeurIPS

A Logic for Expressing Log-Precision Transformers

William Merrill and Ashish Sabharwal

In NeurIPS, Dec 2023

TACL

The Parallelism Tradeoff: Limitations of Log-Precision Transformers

William Merrill and Ashish Sabharwal

TACL, Jun 2023

2022

CoNLL

Entailment Semantics Can Be Extracted from an Ideal Language Model

William Merrill, Alex Warstadt, and Tal Linzen

In CoNLL, Dec 2022

Extracting Finite Automata from RNNs Using State Merging

William Merrill and Nikolaos Tsilivis

Jan 2022

TACL

Saturated Transformers are Constant-Depth Threshold Circuits

William Merrill, Ashish Sabharwal, and Noah A. Smith

TACL, Aug 2022

ACL

ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension

Sanjay Subramanian, William Merrill, Trevor Darrell, and 3 more authors

In ACL, May 2022

2021

EMNLP

Competency Problems: On Finding and Removing Artifacts in Language Data

Matt Gardner, William Merrill, Jesse Dodge, and 4 more authors

In EMNLP, Nov 2021

Formal Language Theory Meets Modern NLP

William Merrill

Feb 2021

TACL

Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?

William Merrill, Yoav Goldberg, Roy Schwartz, and 1 more author

TACL, Sep 2021

EMNLP

Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent

William Merrill, Vivek Ramanujan, Yoav Goldberg, and 2 more authors

In EMNLP, Nov 2021

2020

ACL

A Formal Hierarchy of RNN Architectures

William Merrill, Gail Weiss, Yoav Goldberg, and 3 more authors

In ACL, Jul 2020

COVID19

CORD-19: The COVID-19 Open Research Dataset

Lucy Lu Wang, Kyle Lo, Yoganand Chandrasekhar, and 25 more authors

In ACL Workshop on NLP for COVID-19, Jul 2020

arXiv

On the Linguistic Capacity of Real-Time Counter Automata

William Merrill

Sep 2020

2019

DeLeFoL

Sequential Neural Networks as Automata

William Merrill

In ACL Workshop on Deep Learning and Formal Languages, Aug 2019

BlackboxNLP

Finding Hierarchical Structure in Neural Stacks Using Unsupervised Parsing

William Merrill, Lenny Khazan, Noah Amsel, and 3 more authors

In ACL Workshop BlackboxNLP, Aug 2019

LChange

Detecting Syntactic Change Using a Neural Part-of-Speech Tagger

William Merrill, Gigi Stark, and Robert Frank

In ACL Workshop on Computational Approaches to Historical Language Change, Aug 2019

2018

BlackboxNLP

Context-Free Transductions with Neural Stacks

Yiding Hao, William Merrill, Dana Angluin, and 4 more authors

In EMNLP Workshop BlackboxNLP, Nov 2018

NAACL

End-to-End Graph-Based TAG Parsing with Neural Networks

Jungo Kasai, Robert Frank, Pauli Xu, and 2 more authors

In NAACL, Nov 2018

TULCon

A Semantics of Subordinate Clauses Using Delayed Evaluation

William Merrill

Toronto Undergraduate Linguistics Conference, Nov 2018
