Ahmad Humayun

4th Floor Gilbert Place

220 Gilbert St

Blacksburg, VA 24060

ahmad35@vt.edu
LinkedIn
GitHub
Google Scholar
CV

I’m a fifth-year PhD candidate in Computer Science at Virginia Tech, advised by Prof. Muhammad Ali Gulzar, working on automated software testing and security of distributed data-intensive scalable computing (DISC) and machine learning (ML) applications. I also collaborate closely with Prof. Miryung Kim at UCLA.

My research focuses on developing novel methods to improve testing and debugging across two domains: (1) big data analytics applications, targeting DISC frameworks like Apache Spark and Apache Flink, and (2) large language models (LLMs), where I investigate code reasoning capabilities, fault localization, and provenance tracking in federated learning settings. I’ve published my work at top-tier venues including ESEC/FSE and IEEE/ACM ASE. My tools have discovered multiple previously unknown faults in popular distributed frameworks such as Apache Spark, Apache Flink, Polars, and Dask. My recent work introduces token-level attribution techniques for federated LLMs to enable debugging, malicious client detection, and trust verification in collaborative learning environments.

I recently completed an internship as an Applied Scientist at Amazon Web Services (Summer 2025), where I developed an LLM-powered application to automate the modeling of complex distributed algorithms in low-resource programming languages, deployed both as a standalone application and an MCP server. Previously, as an Applied Scientist intern at AWS (Summer 2024), I enhanced the automated testing infrastructure of critical AWS Services.

news

Feb 10, 2026	🎉 Our paper “ProToken: Token-Level Attribution for Federated Large Language Models” has been accepted to MLSys 2026!
Feb 10, 2026	🎉 Our paper “Generating and Understanding Tests via Path-Aware Symbolic Execution with LLMs” has been accepted to ICPC 2026!
Jan 26, 2026	🎉 We just submitted exciting work on DAG-based fuzzing for dataflow frameworks to ISSTA ’26! Our work has exposed several faults across four different frameworks: Apache Spark, Apache Flink, Dask, and Polars!
Aug 15, 2025	🚀 I was invited to return for another Applied Science internship at AWS!
Nov 07, 2024	🎖️ Honored to serve on the Program Committee for TaPP 2024 - Workshop on the Theory and Practice of Provenance!
Aug 15, 2024	🚀 Excited to start my internship at AWS as an Applied Scientist working with Ankush Desai and Aman Goel!
Apr 15, 2024	🎉 Our paper on natural symbolic execution-based testing for big data analytics has been accepted at ESEC/FSE 2024!
Sep 13, 2023	🎤 I presented our work on Natural Input Generation for Big Data Analytics at ASE 2023 in Luxembourg!
Sep 11, 2023	🎤 I presented Co-dependence Aware Fuzzing for Dataflow-based Big Data Analytics at ESEC/FSE 2023 in San Francisco. You can find my talk here!
Aug 21, 2023	🏆 I was awarded a SIGSOFT grant to present my work at the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023) in Luxembourg.
Jul 17, 2023	🎉 Our paper on natural input generation for data-intensive applications has been accepted at ASE ’23!
May 04, 2023	🎉 Our paper on co-dependence aware fuzzing for dataflow-based big data analytics has been accepted at ESEC/FSE 2023!

selected publications

MLSys 2026

ProToken: Token-Level Attribution for Federated Large Language Models

Waris Gill, Ahmad Humayun, Ali Anwar, and 1 more author

Ninth Annual Conference on Machine Learning and Systems, 2026

PDF
ICPC 2026

Generating and Understanding Tests via Path-Aware Symbolic Execution with LLMs

Yaoxuan Wu, Xiaojie Zhou, Ahmad Humayun, and 2 more authors

The 34th IEEE/ACM International Conference on Program Comprehension, 2026

PDF
ESEC/FSE 2024

Natural Symbolic Execution-Based Testing for Big Data Analytics

Yaoxuan Wu, Ahmad Humayun, Muhammad Ali Gulzar, and 1 more author

Proceedings of the 32nd ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Jul 2024

Abs DOI PDF

Symbolic execution is an automated test input generation technique that models individual program paths as logical constraints. However, the realism of concrete test inputs generated by SMT solvers often comes into question. Existing symbolic execution tools only seek arbitrary solutions for given path constraints. These constraints do not incorporate the naturalness of inputs that observe statistical distributions, range constraints, or preferred string constants. This results in unnatural-looking inputs that fail to emulate real-world data. In this paper, we extend symbolic execution with consideration for incorporating naturalness. Our key insight is that users typically understand the semantics of program inputs, such as the distribution of height or possible values of zipcode, which can be leveraged to advance the ability of symbolic execution to produce natural test inputs. We instantiate this idea in NaturalSym, a symbolic execution-based test generation tool for data-intensive scalable computing (DISC) applications. NaturalSym generates natural-looking data that mimics real-world distributions by utilizing user-provided input semantics to drastically enhance the naturalness of inputs, while preserving strong bug-finding potential. On DISC applications and commercial big data test benchmarks, NaturalSym achieves a higher degree of realism, as evidenced by a perplexity score 35.1 points lower on median, and detects 1.29× injected faults compared to the state-of-the-art symbolic executor for DISC, BigTest. This is because BigTest draws inputs purely based on the satisfiability of path constraints constructed from branch predicates, while NaturalSym is able to draw natural concrete values based on user-specified semantics and prioritize using these values in input generation. Our empirical results demonstrate that NaturalSym finds injected faults 47.8× more than NaturalFuzz (a coverage-guided fuzzer) and 19.1× more than ChatGPT. Meanwhile, TestMiner (a mining-based approach) fails to detect any injected faults. NaturalSym is the first symbolic executor that combines the notion of input naturalness in symbolic path constraints during SMT-based input generation. We make our code available at https://github.com/UCLA-SEAL/NaturalSym.
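As a rough illustration of the idea in this abstract (not NaturalSym's actual implementation), the sketch below prefers values drawn from user-provided input semantics when searching for an input that satisfies a path constraint, and falls back to an arbitrary satisfying value only when no natural one is found. The names `natural_input`, `height_cm`, and `taller_than_180`, the distribution parameters, and the constraint itself are all hypothetical, and simple rejection sampling stands in for an SMT solver.

```python
import random

def natural_input(path_constraint, natural_sampler, fallback_range, tries=1000, seed=0):
    """Return (value, is_natural) for a value satisfying path_constraint.

    Hypothetical helper: first tries values drawn from the user-provided
    sampler; if none satisfies the constraint, scans fallback_range for
    any satisfying value (the "arbitrary solution" case).
    """
    rng = random.Random(seed)
    for _ in range(tries):
        candidate = natural_sampler(rng)
        if path_constraint(candidate):
            return candidate, True
    for candidate in fallback_range:
        if path_constraint(candidate):
            return candidate, False
    raise ValueError("no satisfying input found")

# Assumed input semantics: adult height in cm, roughly normal around 170.
def height_cm(rng):
    return round(rng.gauss(170, 10))

# Assumed path constraint, as if taken from a branch like `if height > 180:`.
def taller_than_180(h):
    return h > 180

value, is_natural = natural_input(taller_than_180, height_cm, range(0, 10**6))
```

Here the returned height satisfies the branch condition while still looking like a plausible height. A constraint that no natural sample satisfies, such as `h > 10**5`, would instead be resolved by the fallback scan and flagged with `is_natural == False`.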
ESEC/FSE 2023

Co-dependence Aware Fuzzing for Dataflow-Based Big Data Analytics

Ahmad Humayun, Miryung Kim, and Muhammad Ali Gulzar

In Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, San Francisco, CA, USA, Jul 2023

Abs DOI PDF

Data-intensive scalable computing has become popular due to the increasing demands of analyzing big data. For example, Apache Spark and Hadoop allow developers to write dataflow-based applications with user-defined functions to process data with custom logic. Testing such applications is difficult. (1) These applications often take multiple datasets as input. (2) Unlike in SQL, there is no explicit schema for these datasets and each unstructured (or semi-structured) dataset is segmented and parsed at runtime. (3) Dataflow operators (e.g., join) create implicit co-dependence constraints between the fields of multiple datasets. An efficient and effective testing technique must analyze co-dependence among different regions of multiple datasets at the level of rows and columns and orchestrate input mutations jointly on co-dependent regions. We propose DepFuzz to increase the effectiveness and efficiency of fuzz testing dataflow-based big data applications. The key insight behind DepFuzz is twofold. It keeps track of which code segments operate on which datasets, which rows, and which columns. By analyzing the use of dataflow operators (e.g., join and groupByKey) in tandem with the semantics of UDFs, DepFuzz generates test data that subsequently reach hard-to-reach regions of the application code. In real-world big data applications, DepFuzz finds 3.4× more faults, achieving 29% more statement coverage in half the time of Jazzer, a state-of-the-art commercial fuzzer for Java bytecode. It outperforms prior DISC testing by exposing deeper semantic faults beyond simpler input formatting errors, especially when multiple datasets have complex interactions through dataflow operators.
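A toy sketch of why co-dependence matters for mutation, under assumptions not taken from the paper: two datasets are plain row lists joined on their first column, and `join`, `mutate_independently`, and `mutate_codependent` are hypothetical helpers rather than DepFuzz's implementation. Mutating the join key in only one dataset breaks the match, so code downstream of the join never sees that row; mutating the co-dependent fields together keeps the rows joined.

```python
def join(left, right):
    """Inner join of two row lists on column 0, like a dataflow join operator."""
    index = {}
    for row in right:
        index.setdefault(row[0], []).append(row)
    return [l + r[1:] for l in left for r in index.get(l[0], [])]

def mutate_independently(left, right, row, new_key):
    """Change the join key in one row of the left dataset only."""
    new_left = [list(r) for r in left]
    new_left[row][0] = new_key
    return new_left, [list(r) for r in right]

def mutate_codependent(left, right, row, new_key):
    """Change the key in the left row and in every right row that shared the old key."""
    old_key = left[row][0]
    new_left = [list(r) for r in left]
    new_left[row][0] = new_key
    new_right = [[new_key] + list(r[1:]) if r[0] == old_key else list(r) for r in right]
    return new_left, new_right

# Hypothetical inputs: (user id, amount) and (user id, state).
orders = [["u1", 30], ["u2", 45]]
users = [["u1", "VA"], ["u2", "CA"]]

broken = join(*mutate_independently(orders, users, 0, "u9"))
kept = join(*mutate_codependent(orders, users, 0, "u9"))
```

With these inputs, `broken` contains only the untouched `u2` row, while `kept` still contains both joined rows, so a UDF applied after the join would receive the mutated record.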
ASE 2023

NaturalFuzz: Natural Input Generation for Big Data Analytics

Ahmad Humayun, Yaoxuan Wu, Miryung Kim, and 1 more author

In 2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE), Jul 2023

DOI PDF
[under review]

How Accurately Do Large Language Models Understand Code?

Sabaat Haroon, Ahmad Faraz Khan, Ahmad Humayun, and 5 more authors

Jul 2025