Evaluating Intel TDX for Secure, Scalable Verification of Academic Replication Packages

Authors

  • Sam Lam, Department of Finance, Costello College of Business, George Mason University, Fairfax, VA
  • Kevin Su, Department of Finance, Costello College of Business, George Mason University, Fairfax, VA
  • Nohith Challa, Department of Finance, Costello College of Business, George Mason University, Fairfax, VA
  • Jiasun Li, Department of Finance, Costello College of Business, George Mason University, Fairfax, VA

DOI:

https://doi.org/10.13021/jssr2025.5250

Abstract

Reproducibility is fundamental to scientific progress, yet verifying replication packages often relies on ad hoc, resource-intensive workflows. Standard editorial and conference processes lack a unified, confidential framework for running proprietary or open-source code at scale. We propose using Intel Trust Domain Extensions (TDX) within Google Cloud Confidential VMs to securely evaluate replication packages from all Management Science articles published in 2024. Building on recent advances in confidential computing and containerization, our approach automates the deployment of R, SAS, and other environments inside TDX-protected enclaves. In a pilot study of 352 packages, we document preliminary performance benchmarks, estimate per-package cloud costs, identify key technical challenges (such as handling non-open-source binaries via Docker wrappers), and prescribe best practices for library management, data access, and attestation. Our methodology also incorporates a collaborative, peer-reviewed workflow to flag failures and share fixes in real time. Preliminary data suggest that this framework can substantially reduce manual overhead while improving runtime transparency: evaluated packages show a mean computational cost of $1.80 per instance and an average runtime of 2.3 hours per replication. Cross-platform benchmarking reveals performance trade-offs between Microsoft Azure TDX implementations and Google Cloud Confidential VMs. Azure achieves 40% higher success rates and 25-30% lower resource utilization costs ($0.22-$3.21 versus $0.27-$7.69 per replication), while Google Cloud demonstrates superior scalability and more robust handling of complex multi-language dependency chains. These results indicate significant potential for journals to adopt this approach over current isolated sandbox methods, though comprehensive validation across the full dataset is still necessary.

Published

2025-09-25

Section

Costello College of Business: Department of Finance