Sireesh Gururaja

I’m a PhD student at Carnegie Mellon University’s Language Technologies Institute, advised by Emma Strubell. I previously completed a master’s degree here under the supervision of Carolyn Rose, and a BA in computer science at Columbia University. My work is supported by the Army Research Lab’s HTMDEC US Citizen Fellowship and the Mozilla Foundation.

My research focuses on NLP and AI tools that allow users in specialized domains to keep agency in their work. How can we empower people to customize and change their tools to reflect and be useful to how they see their own jobs, rather than how their boss or a tech company with a billion other users does? More concretely, I focus on user-customizable, on-device models that live in the browser, and how to effectively reason about their limitations and update them. I’m also interested in the incentives that shape NLP research, whether funding, tooling, or culture.

Before coming to CMU, I spent six years in industry. I started at IBM Watson in 2015 on a team that did bespoke prototypes; I then moved to Kensho Technologies in 2018, where I spent three and a half years, first working as an ML engineer focused on NLP, then as the first lead of the ML Ops and Internal Tools team.

news

Apr 24, 2025 I’m heading to NAACL 2025 next week, where I’ll be presenting our work on data-driven materials design as a new benchmark for information extraction at the AI and Scientific Discovery Workshop.
Apr 04, 2025 I’m going to be back in Boston for (some of) the summer, interning at Ikigai Labs. I’ll be working at the intersection of UX and ML, figuring out how to make user-controllable models for time series prediction.

selected publications

  1. Preprint
    Beyond Text: Expert Needs in Document Research
    Sireesh Gururaja, Nupoor Gandhi, Jeremiah Milbauer, and 1 more author
    2025
  2. Non-archival@NAACL AISD ’25
    Data Driven Design as a Challenge Task for Few- and Zero-Shot Information Extraction
    Sireesh Gururaja, Jeremiah Milbauer, Hung-Yi Lin, and 2 more authors
    2025
  3. Preprint
    Basic Research, Lethal Effects: Military AI Research Funding as Enlistment
    David Gray Widder, Sireesh Gururaja, and Lucy Suchman
    2024
  4. Preprint
    Collage: Decomposable Rapid Prototyping for Information Extraction on Scientific PDFs
    Sireesh Gururaja, Yueheng Zhang, Guannan Tang, and 6 more authors
    2024
  5. EMNLP ’23
    To Build Our Future, We Must Know Our Past: Contextualizing Paradigm Shifts in Natural Language Processing
    Sireesh Gururaja, Amanda Bertsch, Clara Na, and 2 more authors
    In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, Dec 2023
  6. ACL ’23
    Linguistic representations for fewer-shot relation extraction across domains
    Sireesh Gururaja, Ritam Dutt, Tinglong Liao, and 1 more author
    In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Jul 2023