Haochen Pan
Thank you for stopping by! I am a fifth-year CS Ph.D. student at the University of Chicago and a proud member of Globus Labs, advised by Dr. Kyle Chard, Dr. Ian Foster, and Dr. Ryan Chard.

Current Research
My research focuses on distributed systems at the intersection of cloud and high-performance computing (HPC), with an emphasis on resilience and efficiency for AI-guided scientific workflows and time-sensitive data analysis.
- We developed Octopus, a Kafka-based hierarchical event fabric for high-performance exchange of control and metadata events across cloud and HPC environments.
- Building on this, we designed Icicle (to be released), a real-time metadata monitoring and indexing system for Lustre and IBM Storage Scale that integrates Octopus, Apache Flink, and Globus Search to provide live visibility and historical usage analysis.
- More recently, we developed Science-MCP, which applies the Model Context Protocol (MCP) to expose these capabilities as discoverable and composable services for LLM-powered agents across heterogeneous cyberinfrastructure.
Selected Publications
The complete list is available on Google Scholar and my CV.
- [Preprint] Experiences with Model Context Protocol Servers for Science and High Performance Computing. Haochen Pan, Ryan Chard, Reid Mello, Christopher Grams, Tanjin He, Alexander Brace, Owen Price Skelly, Will Engler, Hayden Holbrook, Song Young Oh, Maxime Gonthier, Michael Papka, Ben Blaiszik, Kyle Chard, Ian Foster
- [FGCS Vol. 153] The Globus Compute Dataset: An Open Function-as-a-Service Dataset From the Edge to the Cloud. André Bauer, Haochen Pan, Ryan Chard, Yadu Babuji, Josh Bryan, Devesh Tiwari, Ian Foster, Kyle Chard
- [OSDI'22] Cancellation in Systems: An Empirical Study of Task Cancellation Patterns and Failures. Utsav Sethi, Haochen Pan, Shan Lu, Madanlal Musuvathi, Suman Nath
- [SOSP'21] Rabia: Simplifying State-Machine Replication Through Randomization. Haochen Pan, Jesse Tuglu, Neo Zhou, Tianshu Wang, Yicheng Shen, Xiong Zheng, Joseph Tassarotti, Lewis Tseng, Roberto Palmieri
Projects
- 2025 Science MCPs: Model Context Protocol (MCP) servers that enable AI assistants to interact with scientific computing resources and data management services
- 2024 Diaspora: a resilience-enabling event fabric for real-time scientific workflows across HPC systems
- 2019 GitHub Trending Timeline: a full-stack Python application to track GitHub Trending repositories over time
High-Performance Computing
- Aug 2025 [i4] [Preprint] Experiences with Model Context Protocol Servers for Science and High Performance Computing
- Aug 2025 [j4] [FHPCP] Toward a Persistent Event-Streaming System for High-Performance Computing Applications
- Jul 2025 [i3] [Preprint] Throughput Estimation of Data Transport Networks from Digital Twin Measurements
- Jul 2025 [j3] [ApJS] RADAR—Radio Afterglow Detection and AI‑Driven Response: A Federated Framework for Gravitational Wave Event Follow‑Up
- Jun 2025 [c22] [ICS'25] D-Rex: Heterogeneity-Aware Reliability Framework and Adaptive Algorithms for Distributed Storage
- May 2025 [c21] [CCGrid'25] DynoStore: A wide-area distribution system for the management of data over heterogeneous storage
- May 2025 [c20] [CCGrid'25] WRATH: Workload Resilience Across Task Hierarchies in Task-based Parallel Programming Frameworks
- May 2025 [c19] [IPDPS'25] Optimizing Fine-Grained Parallelism Through Dynamic Load Balancing on Multi-Socket Many-Core Systems
- Jan 2025 [i2] [Preprint] MOFA: Discovering Materials for Carbon Capture with a GenAI- and Simulation-Based Workflow
- Nov 2024 [c18] [FTXS'24] Octopus: Experiences with a Hybrid Event-Driven Architecture for Distributed Scientific Computing
Cloud Computing
- Sep 2024 [c16] [eScience'24] An Empirical Investigation of Container Building Strategies and Warm Times to Reduce Cold Starts in Scientific Computing Serverless Functions
- Apr 2024 [j2] [FGCS Vol. 153] The Globus Compute Dataset: An Open Function-as-a-Service Dataset From the Edge to the Cloud
Distributed Systems
- Jul 2022 [c14] [OSDI'22] Cancellation in Systems: An Empirical Study of Task Cancellation Patterns and Failures
- Jan 2021 [c11] [ICDCN'21] Practical Experience Report: Cassandra+: Trading-Off Consistency, Latency, and Fault-tolerance in Cassandra
- Dec 2020 [j1] [Computer Networks Vol. 182] Reliable broadcast with trusted nodes: Energy reduction, resilience, and speed
- Nov 2020 [c9] [NCA'20] CassandrEAS: Highly Available and Storage-Efficient Distributed Key-Value Store with Erasure Coding
- Mar 2020 [c8] [PerVehicle'20] Make Multi-hop Broadcast in VANET Fast by Selecting a Better Route for Source Vehicle
- Sep 2019 [c1] [Sarnoff'19] A First Step Towards Production-Ready Network Function Storage: Benchmarking with NFSB