I am a PhD student in Computer Science at Mila and McGill, where I am supervised by Prof. Dzmitry Bahdanau and Prof. Siva Reddy. Previously, I spent 2.5 amazing years as a Research Fellow at Microsoft Research India, where I worked with Dr. Navin Goyal. I also interned with the AllenNLP team at the Allen Institute for Artificial Intelligence (AI2), where I worked with Pradeep Dasigi on evaluating code generation in LLMs.

I do research in Machine Learning, with an emphasis on Large Language Models (LLMs). My work focuses on building a principled and predictive understanding of how LLMs behave across varying data and training regimes. My goal is to uncover general laws that help explain when and why capabilities emerge, how they transfer or fail out of distribution, and which signals reliably anticipate downstream performance. I aim to leverage these insights for forecasting model behaviour and for informing the design of more reliable LLMs.

Keywords: generalization, scaling, reasoning, evaluation, safety, analysis and interpretability

I graduated with a B.E. (Hons.) in Computer Science from BITS Pilani - Goa Campus, India in 2020. For more details about my background, refer to my CV. If you'd like to chat with me about my work or research in general, feel free to reach out!
News
Jan 12, 2026

Our Thoughtology paper investigating the reasoning chains of thought of Large Reasoning Models like DeepSeek-R1 has been published at TMLR!

May 01, 2025

Our paper proposing SafeArena, a benchmark for evaluating the safety of autonomous web agents, has been accepted at ICML 2025!

Mar 30, 2025

Our paper on AI safety investigating the transferability of adversarial triggers in LLMs has been accepted to TACL!

Mar 22, 2025

I'm a visiting graduate student at the Simons Institute at UC Berkeley as a part of their special year on LLMs and Transformers.

Feb 20, 2025

Our paper proposing the CHASE method to automatically generate challenging synthetic data for evaluating LLMs is out!

Jun 16, 2024

Presented my AI2 internship work on evaluating code generation in LLMs at NAACL 2024 in Mexico City!

Publications
Google Scholar | Semantic Scholar

How to Get Your LLM to Generate Challenging Problems for Evaluation
Arkil Patel, Siva Reddy, Dzmitry Bahdanau
Preprint
pdf code abstract

DeepSeek-R1 Thoughtology: Let’s think about LLM reasoning
Sara Vera Marjanović*, Arkil Patel*, Vaibhav Adlakha, Milad Aghajohari, Parishad BehnamGhader, Mehar Bhatia, Aditi Khandelwal, Austin Kraft, Benno Krojer, Xing Han Lù, Nicholas Meade, Dongchan Shin, Amirhossein Kazemnejad, Gaurav Kamath, Marius Mosbach, Karolina Stańczak, Siva Reddy
TMLR'26
pdf code abstract

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
Xing Han Lù, Amirhossein Kazemnejad, Nicholas Meade, Arkil Patel, Dongchan Shin, Alejandra Zambrano, Karolina Stańczak, Peter Shaw, Christopher Pal, Siva Reddy
CoLM'25
pdf code abstract

SafeArena: Evaluating the Safety of Autonomous Web Agents
Ada Defne Tur, Nicholas Meade, Xing Han Lù, Alejandra Zambrano, Arkil Patel, Esin Durmus, Spandana Gella, Karolina Stańczak, Siva Reddy
ICML'25
pdf code abstract

Universal Adversarial Triggers Are Not Universal
Nicholas Meade, Arkil Patel, Siva Reddy
TACL'25
pdf code abstract

Evaluating In-Context Learning of Libraries for Code Generation
Arkil Patel, Siva Reddy, Dzmitry Bahdanau, Pradeep Dasigi
NAACL'24
pdf code abstract

Understanding In-Context Learning in Transformers and LLMs by Learning to Learn Discrete Functions
Satwik Bhattamishra, Arkil Patel, Phil Blunsom, Varun Kanade
ICLR'24 [Oral]
pdf code abstract

Simplicity Bias in Transformers and their Ability to Learn Sparse Boolean Functions
Satwik Bhattamishra, Arkil Patel, Varun Kanade, Phil Blunsom
ACL'23
pdf code abstract

Revisiting the Compositional Generalization Abilities of Neural Sequence Models
Arkil Patel, Satwik Bhattamishra, Phil Blunsom, Navin Goyal
ACL'22
pdf code abstract

Are NLP Models really able to Solve Simple Math Word Problems?
Arkil Patel, Satwik Bhattamishra, Navin Goyal
NAACL'21
pdf code abstract article

VehicleChain: Blockchain-based Vehicular Data Transmission Scheme for Smart City
Arkil Patel, Naigam Shah, Trupil Limbasiya, Debasis Das
IEEE SMC'19
pdf

Service
Teaching
  • Winter 2026: Teaching Assistant for COMP 767: Large Language Models - McGill University
  • Fall 2024: Teaching Assistant for COMP 767: Large Language Models - McGill University
  • Winter 2024: Teaching Assistant for COMP 596: From Natural Language to Data Science - McGill University
  • Winter 2023: Teaching Assistant for COMP 596: From Natural Language to Data Science - McGill University
  • Winter 2020: Teaching Assistant for BITS F312: Neural Networks and Fuzzy Logic - BITS Goa
  • Winter 2019: Teaching Assistant for CS F415: Data Mining - BITS Goa
Reviewer: ICML, ICLR, NeurIPS, CoLM, ACL Rolling Review

BITS Pilani
2016 - 2020
Microsoft Research India
2019 - 2022
Allen Institute for AI
Summer 2023
Mila - Quebec AI Institute
2022 - Present
McGill University
2022 - Present