Waris Gill

Hi, I’m Waris Gill, a 5th-year PhD candidate in Computer Science at Virginia Tech. I’m currently working as an Applied Scientist (intern) at Microsoft, focusing on enhancing the safety and defense of Microsoft’s Generative AI systems (e.g., Azure OpenAI).

Previously, in collaboration with Cisco, my work MeanCache led to real-world adoption of semantic caching for LLMs. As a Machine Learning Engineer (intern) at Redis, I developed redis/langcache-embed-v1 and v2 embedding models for semantic caching, with thousands of downloads on Hugging Face, outperforming both open- and closed-source models from OpenAI and Amazon on tasks related to semantic caching.

My research mainly focuses on interpretability techniques for distributed privacy-preserving AI systems and stress-testing Large Language Models (LLMs) in complex software engineering tasks. My work is published in prestigious computer science venues such as ICSE, FSE (SE4SafeML), MSR, and IPDPS.

Advisor & Mentor

news

May 27, 2025 I started my internship at Microsoft as an Applied Scientist in the AI Safety team.
Apr 03, 2025 My research at Redis on compact and efficient embeddings for semantic caching has been open sourced (read the paper here). The redis/langcache-embed-v1 and redis/langcache-embed-v2 models have surpassed 60K and 72K downloads on Hugging Face. Both are optimized for semantic caching in LLM services.
Mar 26, 2025 Delivered a talk on TraceFL, an interpretability technique based on neuron provenance for federated learning, at the Flower AI Summit 2025, the world’s largest federated AI conference, held in London, UK. The talk is available at this link. [Slides]
Jan 30, 2025 Our work on restoring Jupyter Notebooks has been accepted at MSR-2025. Congratulations, Tien!
Jan 21, 2025 I started my internship at Redis as a Machine Learning Engineer in the Redis AI team.
Dec 19, 2024 Our paper, in collaboration with Cisco on MeanCache, has been accepted at IPDPS-2025. MeanCache is a semantic cache for LLM services.
Nov 01, 2024 The baseline of our paper, FedDebug, for debugging malicious/faulty clients in Federated Learning is available in the Flower AI framework. Check out the code here.
Oct 31, 2024 Our paper, TraceFL, is accepted at ICSE-2025 (acceptance rate ~10% [132/1219]). TraceFL addresses the open challenge of interpretability in federated learning using neuron provenance.
Oct 03, 2024 Presented MeanCache, a semantic cache for LLMs, at the Amazon - Virginia Tech Initiative for Efficient and Robust ML. Selected as one of 18 participants for the poster presentation. [Poster] [Paper]
Aug 19, 2024 Serving as a program committee member on the artifact evaluation track for the 47th International Conference on Software Engineering (ICSE) 2025.
Aug 07, 2024 Delivered an invited talk on Achieving Debugging and Interpretability in Federated Learning Systems at Flower AI, a premier platform for federated learning. [Slides]
Dec 04, 2023 Presented our paper, FedDefender, during the SE4SafeML event at FSE-2023 in San Francisco, California.
Sep 20, 2023 My work at Cisco was open-sourced (Link).
May 22, 2023 I started working at Cisco with Shannon and Pallavi.
May 14, 2023 I received a National Science Foundation (NSF) award to present our paper, FedDebug, at ICSE-2023 in Melbourne, Australia. [Slides]

selected publications

  1. TraceFL: Interpretability-Driven Debugging in Federated Learning via Neuron Provenance
    Waris Gill, Ali Anwar, and Muhammad Ali Gulzar
    In 2025 IEEE/ACM 47th International Conference on Software Engineering (ICSE), 2025
  2. MeanCache: User-Centric Semantic Caching for LLM Web Services
    Waris Gill, Mohamed Elidrisi, Pallavi Kalapatapu, and 2 more authors
    In 2025 IEEE 39th International Parallel & Distributed Processing Symposium (IPDPS), 2025
  3. Advancing Semantic Caching for LLMs with Domain-Specific Embeddings and Synthetic Data
    Waris Gill, Justin Cechmanek, Tyler Hutcherson, and 5 more authors
    2025
  4. Are the Majority of Public Computational Notebooks Pathologically Non-Executable?
    Tien Nguyen, Waris Gill, and Muhammad Ali Gulzar
    In 2025 IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR), 2025
  5. How Accurately Do Large Language Models Understand Code?
    Sabaat Haroon, Ahmad Faraz Khan, Ahmad Humayun, and 5 more authors
    2025
  6. FedDebug: Systematic Debugging for Federated Learning Applications
    Waris Gill, Ali Anwar, and Muhammad Ali Gulzar
    In 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), 2023
  7. FedDefender: Backdoor Attack Defense in Federated Learning
    Waris Gill, Ali Anwar, and Muhammad Ali Gulzar
    In Proceedings of the 1st International Workshop on Dependability and Trustworthiness of Safety-Critical Systems with Machine Learned Components, San Francisco, CA, USA, 2023