Polars inside Intel SGX2 Enclaves: An Empirical Study of Confidential Analytical Query Processing
Source: arXiv cs.CR
Published: Fri, 22 May 2026 00:00:00 -0400
Summary
arXiv:2605.21797v1 Announce Type: new
Abstract: Trusted Execution Environments (TEEs) have renewed interest in confidential analytics, but most prior evaluations focus on SQL database engines or earlier SGX generations. This paper studies an Arrow-native DataFrame engine, Polars, running inside Intel SGX2 enclaves via Gramine on TPC-H SF30 with Azure Blob Storage. We report both the standard TPC-H power score and a query-only variant that removes table-loading time in order to separate compute overhead from data-ingestion overhead. Across four dataset-width configurations (approximately 22-73 GB), end-to-end overhead remains nearly constant at 1.49-1.56$\times$, but this composite metric obscures two distinct behaviors: query-only overhead declines from 1.51-1.52$\times$ to 1.43-1.44$\times$, whereas table-loading overhead rises from 2.27$\times$ to 4.07$\times$. We further show that overhead is not uniform across queries: for the len130 configuration, the median per-query SGX slowdown is 1.45$\times$ with a maximum of 2.57$\times$, and a small set of queries exhibits pronounced run-to-run spikes consistent with stateful EPC pressure. Finally, we compare Polars' lazy and eager API
Sources
Lyrie Verdict
Lyrie's autonomous defense layer flags this class of exposure the moment it surfaces — no signature update required.