Publications


We Politely Insist: Your LLM Must Learn the Persian Art of Taarof
EMNLP 2025

Cite arXiv Code Dataset


Personality Matters: User Traits Predict LLM Preferences in Multi-Turn Collaborative Tasks
EMNLP 2025

Cite arXiv


The World According to LLMs: How Geographic Origin Influences LLMs' Entity Deduction Capabilities
EMNLP 2025 Findings

Cite arXiv Code Dataset


The World According to LLMs: How Geographic Origin Influences LLMs' Entity Deduction Capabilities
COLM 2025

Cite arXiv Code + Website Dataset


Fine-Tuned LLMs are "Time Capsules" for Tracking Societal Bias Through Books
NAACL 2025

Cite arXiv Code


NYT-Connections: A Deceptively Simple Text Classification Task that Stumps System-1 Thinkers
COLING 2025, Oral Presentation, Best Dataset Paper Award

Cite ACL Anthology arXiv NYT-Connections (dataset)


Can We Afford The Perfect Prompt? Balancing Cost and Accuracy with the Economical Prompting Index
COLING 2025, Oral Presentation

Cite ACL Anthology arXiv Code


STOP! Benchmarking Large Language Models with Sensitivity Testing on Offensive Progressions
EMNLP 2024, Oral Presentation, Social Impact Paper Award

Cite arXiv Code Dataset


Picturing Ambiguity: A Visual Twist on the Winograd Schema Challenge
ACL 2024, Oral Presentation

Cite ACL Anthology arXiv Code


An application of pseudo-log-likelihoods to natural language scoring
arXiv preprint arXiv:2201.09377

Cite arXiv


Predicting irregularities in arrival times for transit buses with recurrent neural networks using GPS coordinates and weather data
Journal of Ambient Intelligence and Humanized Computing

Cite Springer


ADEPT: An Adjective-Dependent Plausibility Task
ACL-IJCNLP 2021, Oral Presentation

Cite ACL Anthology Code Video


An analysis of dataset overlap on Winograd-style tasks
COLING 2020

Cite ACL Anthology arXiv Code


The KnowRef coreference corpus: Removing gender and number cues for difficult pronominal anaphora resolution
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019)

Cite ACL Anthology arXiv Code


How reasonable are common-sense reasoning tasks: A case-study on the Winograd schema challenge and SWAG
EMNLP-IJCNLP 2019

Cite ACL Anthology arXiv Code


A knowledge hunting framework for common sense reasoning
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018)

Cite ACL Anthology arXiv Code


A generalized knowledge hunting framework for the Winograd schema challenge
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop (NAACL 2018, Best Paper Award)

Cite ACL Anthology Video


Behavioral patterns and associations with glucose control during 12-week randomized free-living clinical trial of day and night hybrid closed-loop insulin delivery in adults with type 1 diabetes
Diabetes Technology & Therapeutics

Cite PubMed


Modeling glucagon action in patients with type 1 diabetes
IEEE Journal of Biomedical and Health Informatics

Cite PubMed


Efficacy of single-hormone and dual-hormone artificial pancreas during continuous and interval exercise in adult patients with type 1 diabetes: randomised controlled crossover trial
Diabetologia

Cite Springer


Comparison of two continuous glucose monitoring systems, Dexcom G4 Platinum and Medtronic Paradigm Veo Enlite System, at rest and during exercise
Diabetes Technology & Therapeutics

Cite PubMed


The efficacy of single- and dual-hormone artificial pancreas systems at regulating glucose levels during continuous and interval exercise in type 1 diabetes
Canadian Journal of Diabetes

Cite CJD


Enhancing glucose sensor models: modeling the drop-outs
Diabetes Technology & Therapeutics

Cite PubMed


Practical Approach to Physical-Chemical Acid-Base Management: Stewart at the Bedside
Annals of the American Thoracic Society

Cite PubMed