The rise of generative AI and large language models (LLMs) has brought incredible advancements, but also new security risks. Imagine training a cutting-edge AI model or querying it with sensitive data—you wouldn't want your cloud provider or anyone else snooping around, would you? That's where confidential computing comes in.

Traditional methods using CPUs or integrated GPUs for trusted execution environments (TEEs) often fall short. They either don't provide enough protection or rely on components that broaden the attack surface. A new approach, called Ascend-CC, offers a fresh perspective. It uses discrete NPU devices as the bedrock of trust, eliminating reliance on the potentially compromised host system.

Ascend-CC wraps a shield of encryption around your AI models and data, keeping them safe from prying eyes, even the cloud provider's. It does this with a clever 'delegation-based memory' method. Think of it like a secure drop-off point: the host delivers the encrypted data and model to the NPU, then loses access. Only then does the NPU decrypt and process the information, encrypting the results before sending them back. Even trickier attacks, like injecting malicious commands into the model, are thwarted through task and binary attestation. This rigorous verification ensures that the model executes exactly as intended, preserving its integrity.

The best part? Ascend-CC works seamlessly with current AI frameworks like PyTorch. Developers don't need to overhaul their code—confidential computing becomes a simple yet powerful upgrade. Tests with leading LLMs like Llama 2 and Llama 3 show that Ascend-CC adds minimal performance overhead, making it a practical solution for protecting your valuable AI assets. This research marks a significant step towards making confidential computing the norm in AI, opening doors for broader adoption and paving the way for a more secure AI-powered future.
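The task and binary attestation idea above can be sketched as a digest check: the NPU refuses to run any binary whose hash does not match a trusted reference. This is a minimal illustration only—Ascend-CC's actual protocol also covers the task command stream and uses signed measurements, and the names below are hypothetical:

```python
import hashlib

def attest_binary(binary: bytes, expected_digest: str) -> bool:
    """Accept a task binary only if its SHA-256 digest matches a trusted reference."""
    return hashlib.sha256(binary).hexdigest() == expected_digest

# The reference digest would normally come from a signed manifest
# supplied by the model owner, not computed locally as it is here.
model_blob = b"hypothetical operator binary"
reference = hashlib.sha256(model_blob).hexdigest()

assert attest_binary(model_blob, reference)             # untampered binary passes
assert not attest_binary(model_blob + b"!", reference)  # any modification is rejected
```

Because any single flipped byte changes the digest, an attacker who injects malicious commands into the binary fails attestation and the task never runs.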
Questions & Answers
How does Ascend-CC's delegation-based memory mechanism work to protect AI models?
Ascend-CC's delegation-based memory mechanism creates a secure enclave within the NPU device. The process works through three main steps: First, encrypted data and models are transferred from the host to the NPU. Second, the NPU becomes the sole entity with decryption capabilities, completely isolating the processing from the host system. Finally, results are re-encrypted before being sent back to the host. This is similar to a secure vault system where only the vault (NPU) has the key, and all items must be processed inside before leaving. In practice, this allows organizations to run sensitive AI models on cloud infrastructure while maintaining complete data privacy, even from the cloud provider.
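The three-step flow above can be sketched in code. This is a toy model, not Ascend-CC's implementation: the session key is shared between the data owner and the NPU at attestation time (the untrusted host never sees it), the `HypotheticalNPU` class stands in for the enclave, and the SHA-256 counter-mode keystream is a stand-in for a real cipher such as AES-GCM:

```python
import hashlib
import secrets

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. Real systems use AES-GCM."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor(data: bytes, ks: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, ks))

class HypotheticalNPU:
    """Stands in for the NPU enclave: the sole holder of the session key."""
    def __init__(self, key: bytes):
        self._key = key

    def run(self, ciphertext: bytes, nonce: bytes) -> tuple[bytes, bytes]:
        # Step 2: decrypt and process entirely inside the device.
        plaintext = xor(ciphertext, keystream(self._key, nonce, len(ciphertext)))
        result = plaintext.upper()  # placeholder for actual inference
        # Step 3: re-encrypt before anything leaves the device.
        out_nonce = secrets.token_bytes(12)
        return xor(result, keystream(self._key, out_nonce, len(result))), out_nonce

# Data owner's side: the key is provisioned to the NPU after attestation.
key = secrets.token_bytes(32)
npu = HypotheticalNPU(key)

# Step 1: encrypt the query before it ever touches the untrusted host.
nonce = secrets.token_bytes(12)
query = b"sensitive prompt"
ciphertext = xor(query, keystream(key, nonce, len(query)))

enc_result, out_nonce = npu.run(ciphertext, nonce)
result = xor(enc_result, keystream(key, out_nonce, len(enc_result)))
```

The host only ever relays `ciphertext` and `enc_result`; plaintext exists solely at the data owner and inside the device.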
What are the main benefits of confidential computing for everyday AI applications?
Confidential computing makes AI applications more secure and trustworthy for everyday use. It ensures that sensitive data remains private when using AI services, similar to how encrypted messaging keeps your conversations secure. The main benefits include protecting personal information (like health records or financial data) when using AI systems, enabling businesses to safely use cloud-based AI without exposing proprietary data, and building trust in AI services. For example, a healthcare provider could use AI to analyze patient data while ensuring complete privacy, or a financial institution could leverage AI for fraud detection while maintaining client confidentiality.
Why are NPUs becoming more important for AI security compared to traditional processors?
NPUs (Neural Processing Units) are emerging as crucial components for AI security because they offer specialized processing capabilities with built-in security features. Unlike traditional CPUs or GPUs, NPUs are designed specifically for AI workloads and can provide better isolation of sensitive data. This makes them ideal for running AI applications securely while maintaining high performance. Think of NPUs as purpose-built secure processors for AI, similar to how a dedicated security chip in your smartphone protects your biometric data. Industries like healthcare, finance, and government agencies can benefit from NPUs when handling sensitive AI applications.
PromptLayer Features
Access Controls
Aligns with Ascend-CC's secure execution environment principles by managing prompt access and versioning in a controlled manner
Implementation Details
1. Configure role-based access controls
2. Implement encryption for sensitive prompts
3. Set up audit logging for prompt access
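Steps 1 and 3 can be sketched generically. This is not PromptLayer's API—just a minimal illustration of role-based prompt access with an audit trail, with all names hypothetical:

```python
import datetime

AUDIT_LOG: list[tuple] = []
ROLES = {"alice": "admin", "bob": "viewer"}          # user -> role
PERMISSIONS = {"admin": {"read", "write"}, "viewer": {"read"}}  # role -> actions

def access_prompt(user: str, action: str, prompt_id: str) -> str:
    """Check the user's role before allowing an action, logging every attempt."""
    allowed = action in PERMISSIONS.get(ROLES.get(user, ""), set())
    timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
    AUDIT_LOG.append((timestamp, user, action, prompt_id, allowed))
    if not allowed:
        raise PermissionError(f"{user} may not {action} {prompt_id}")
    return f"{action} granted on {prompt_id}"

access_prompt("alice", "write", "prompt-42")  # admin: allowed
access_prompt("bob", "read", "prompt-42")     # viewer: allowed
```

Denied attempts still land in `AUDIT_LOG`, so the trail records both successful and refused access.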
Key Benefits
• Granular control over prompt access
• Audit trail of prompt usage
• Protected intellectual property
Potential Improvements
• Integration with hardware security modules
• Enhanced encryption options
• Multi-factor authentication for critical prompts
Business Value
Efficiency Gains
Reduced security overhead through automated access management
Cost Savings
Prevented data breaches and intellectual property theft
Quality Improvement
Enhanced compliance and security posture
Testing & Evaluation
Mirrors Ascend-CC's attestation mechanisms through systematic testing and validation of prompt behaviors
Implementation Details
1. Set up automated testing pipelines
2. Implement regression testing
3. Configure performance monitoring
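Step 2 can be sketched as a golden-case regression check: run each stored prompt against the model and flag any answer that drifts from the expected output. The `model_stub` below is a hypothetical stand-in for a deployed LLM call:

```python
def model_stub(prompt: str) -> str:
    """Hypothetical stand-in for a deployed LLM endpoint."""
    return "4" if "2+2" in prompt else "unknown"

# Golden cases: (prompt, expected answer) pairs captured from a known-good version.
GOLDEN_CASES = [
    ("What is 2+2?", "4"),
    ("What is the capital of Atlantis?", "unknown"),
]

def run_regression(cases: list[tuple[str, str]]) -> list[tuple[str, str, str]]:
    """Return (prompt, expected, actual) for every case whose output drifted."""
    return [
        (prompt, expected, model_stub(prompt))
        for prompt, expected in cases
        if model_stub(prompt) != expected
    ]

failures = run_regression(GOLDEN_CASES)  # empty list means no regressions
```

An empty failure list plays the same role as a passing attestation check: the prompt's behavior is verified against a trusted reference before it is promoted.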
Key Benefits
• Consistent prompt behavior verification
• Early detection of security issues
• Performance impact assessment