As large language models (LLMs) become central to AI products and services, the ability to verify model performance without compromising confidentiality is critical. Whether you're a model provider seeking to validate claims about fairness or accuracy, or a regulator or partner demanding assurance, trustworthy evaluation is no longer optional—it's essential.
Imagine this scenario:
- You've built an AI model and want to demonstrate its performance to a partner, regulator, or customer.
- You don’t want to reveal the model itself due to intellectual property risk.
- You do want to run a third-party evaluation script that both parties trust.
- And everyone needs to be confident that the results weren't faked, modified, or cherry-picked.
This is where ManaTEE, a secure model evaluation framework leveraging Trusted Execution Environments (TEEs), comes in.
This post explores how ManaTEE enables trusted evaluation of LLMs using the popular evaluation framework lm-evaluation-harness. By combining industry-standard benchmarks with a TEE, ManaTEE allows users to:
- Keep proprietary models confidential
- Run evaluation scripts transparently
- Generate cryptographically verifiable results
Why model evaluation is a trust problem
In traditional AI workflows, evaluation typically happens in one of two ways:
- On your own infrastructure, where others can’t be sure you're playing fair
- On external infrastructure, which exposes your proprietary model to risk
Neither option is ideal.
Collaborators and regulators want proof:
- That you're using the agreed-upon script
- That you're not secretly modifying the model mid-test
- That the results are honest
At the same time, you want to protect your model from leaks. ManaTEE solves this trust paradox.
What is ManaTEE?
ManaTEE is an open-source framework designed to run model evaluation inside a Trusted Execution Environment (TEE), such as Intel TDX or AMD SEV. It creates a cryptographically sealed black box where code and data are loaded securely and cannot be tampered with, not even by the machine owner.
The result? You can now prove your model's performance without revealing the model itself.
How it works: Step-by-step
The diagram below illustrates the AI model verification scenario.
Let's walk through a secure evaluation process using ManaTEE.
Step 1: Prepare your model
You upload your LLM (or any ML model) into ManaTEE. Inside this secure enclave, no one, not even the cloud provider or system administrator, can see the model weights.
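ManaTEE handles the secure upload itself, but it helps to record a digest of your model artifacts before uploading so you can later match it against the hash the TEE binds into its attestation report. Below is a minimal sketch; the directory layout and helper name are illustrative, not part of ManaTEE's API.

```python
import hashlib
from pathlib import Path

def digest_model_dir(model_dir: str) -> str:
    """Deterministically hash every file under the model directory.

    The digest can later be compared against the model hash that the
    TEE binds into its attestation report.
    """
    h = hashlib.sha256()
    for path in sorted(Path(model_dir).rglob("*")):
        if path.is_file():
            # Include the relative path so renames change the digest too.
            h.update(path.relative_to(model_dir).as_posix().encode())
            with path.open("rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
    return h.hexdigest()

# Record this value before upload; "./my-llm" is a placeholder path.
print(digest_model_dir("./my-llm"))
```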
Step 2: Agree on an evaluation script
You and your collaborator agree on an evaluation script, either from a benchmark suite or a custom test.
This script will also be securely loaded into the TEE.
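For example, the agreed-upon script can be a few lines calling lm-evaluation-harness's Python API. The sketch below assumes a Hugging Face-format model mounted inside the enclave at a placeholder path; the task list and output filename are illustrative.

```python
# evaluation_script.py -- the script both parties review and hash beforehand.
import json

import lm_eval  # lm-evaluation-harness (EleutherAI)

# "/enclave/model" stands in for wherever ManaTEE mounts the model weights.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=/enclave/model",
    tasks=["hellaswag", "arc_easy"],
    num_fewshot=0,
)

# Persist only the aggregate metrics; raw model outputs stay inside the TEE.
with open("results.json", "w") as f:
    json.dump(results["results"], f, indent=2, default=str)
```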
Step 3: Execute the evaluation
ManaTEE runs the script against your model within the TEE. The model cannot be exfiltrated, even by malicious scripts. The TEE strictly enforces isolation and secure memory usage.
Step 4: Generate and sign results
The evaluation results are digitally signed inside the TEE. The TEE includes hashes of:
- The model
- The evaluation script
- A nonce (to guarantee freshness)
This ensures the output is cryptographically bound to exactly what was executed.
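Conceptually, the signed payload looks something like the sketch below: the claims tie the results file to the model and script digests plus a nonce, and a key held inside the TEE signs the whole bundle. Here an in-memory Ed25519 key stands in for the enclave's sealed signing key, and all paths are placeholders.

```python
import hashlib
import json
import secrets

from cryptography.hazmat.primitives.asymmetric import ed25519

def sha256_hex(path: str) -> str:
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

# Inside the enclave: bind the results to exactly what was executed.
claims = {
    "model_hash": sha256_hex("/enclave/model.bin"),
    "script_hash": sha256_hex("evaluation_script.py"),
    "results_hash": sha256_hex("results.json"),
    "nonce": secrets.token_hex(16),  # in practice supplied by the verifier
}

# Stand-in for the TEE-held key; the real key never leaves the enclave.
signing_key = ed25519.Ed25519PrivateKey.generate()
payload = json.dumps(claims, sort_keys=True).encode()
signature = signing_key.sign(payload)
```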
Step 5: Remote attestation
ManaTEE generates an attestation report, signed by the hardware vendor. This proves:
- Which code was run (verified by the script hash)
- That it ran inside a genuine TEE
Step 6: Share and verify
You send the results along with the signed attestation to your collaborator. They can independently verify:
- The integrity of the evaluation
- That it came from your model
- That the results weren't tampered with
In the attestation report, the eat_nonce value carries a hash of the AI model, and the report also includes additional information such as hardware/software metadata and domain details for comprehensive verification.
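The exact report format depends on the TEE vendor (for example, an Intel TDX quote wrapped in an Entity Attestation Token). The hypothetical verify_evaluation helper below only sketches the application-level checks, assuming the hardware quote has already been validated with the vendor's tooling and the enclave's public key extracted from it.

```python
import json

from cryptography.hazmat.primitives.asymmetric import ed25519

def verify_evaluation(payload: bytes, signature: bytes,
                      enclave_pubkey: ed25519.Ed25519PublicKey,
                      expected_script_hash: str, expected_nonce: str) -> dict:
    # 1. The signature must come from the key attested to the genuine TEE.
    #    verify() raises InvalidSignature if the payload was tampered with.
    enclave_pubkey.verify(signature, payload)

    claims = json.loads(payload)

    # 2. The script that ran must be the one both parties agreed on.
    if claims["script_hash"] != expected_script_hash:
        raise ValueError("evaluation script does not match the agreed version")

    # 3. The nonce must match the challenge we sent, proving freshness.
    if claims["nonce"] != expected_nonce:
        raise ValueError("stale or replayed attestation")

    return claims  # includes model_hash and results_hash for further checks
```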
Conclusion: Trust through technology
In the future of AI, trust will be as important as performance. As models become more powerful and sensitive, secure evaluation will be critical, not just as a technical need, but as a foundation for AI governance.
ManaTEE offers a forward-looking solution: evaluate without exposure, prove without compromise.
The project was initiated by TikTok in 2024 as a core use case and is now open source under the Linux Foundation's Confidential Computing Consortium. ManaTEE continues to explore and address the growing challenges of data collaboration and verifiable transparency.
Want to try it yourself?
Get started with the ManaTEE documentation and explore how your team can bring verifiable trust to your AI workflows.
