[RFD] TPM Enrollment and secure secret delivery #40
Comments
Further work here could include using the attestation process to set up a WireGuard tunnel for cloud-init and removing the need for authentication in cloud-init itself.
https://github.com/keylime/keylime may provide much of the functionality needed for the actual enrollment and management of certs/keys.
Should we be looking at the Rust version instead, since the Python version is deprecated?
I think both repos are relevant. From the README:
I think only the agent has been rewritten in Rust and moved to the other repo. I don't see evidence that they are moving the Registrar or Verifier to Rust at this point.
Relevant to this discussion:
Possible alternative to Keylime. Doesn't look ready for primetime yet.
(All views expressed are my own. If at all, they originate from my role as an OpenCUBE developer.)
In general I think that tracking faults per device is yet another story and this data should be tied to subcomponents instead if possible.
I agree that additional IDs must be maintained within SMD. Keeping it separate makes it unmaintainable, quickly.
How the actual transport happens is secondary to this proposal I think. I would prefer HTTP mTLS over a full separate protocol, though.
I am missing how the third option improves on the MITM situation. But then I am fine with either. What I am missing here is a disclaimer that this proposal is geared towards managed nodes and already assumes that the environment where OpenCHAMI runs is to be considered secure.
Attestation Background
Attestation is a method for verifying the integrity of a computer’s software, hardware, and firmware using a Trusted Platform Module (TPM). As the system boots, cryptographic measurements of its software and firmware configuration are recorded in the TPM's Platform Configuration Registers (PCRs). A "quote" is a report over these recorded measurements that the TPM signs. The signature is anchored in an embedded Public Key Infrastructure (PKI), which keeps private keys secure within the TPM.
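To make the idea of a measurement concrete, here is a minimal sketch of how a TPM accumulates measurements in a Platform Configuration Register (PCR): each new digest is folded into the register by hashing it together with the previous value, so the final value depends on every component and on their order. It is a software model only; the component names are hypothetical and a SHA-256 PCR bank is assumed.

```python
import hashlib

PCR_SIZE = hashlib.sha256().digest_size  # 32 bytes for a SHA-256 PCR bank


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Return the new PCR value after extending it with a measurement digest."""
    return hashlib.sha256(pcr + measurement).digest()


def replay(event_digests: list[bytes]) -> bytes:
    """Recompute a PCR value from an ordered list of measurement digests."""
    pcr = bytes(PCR_SIZE)  # assumption: the register starts at all zeros
    for digest in event_digests:
        pcr = extend(pcr, digest)
    return pcr


# Hypothetical boot components, measured in boot order.
boot_chain = [hashlib.sha256(c).digest() for c in (b"firmware", b"bootloader", b"kernel")]
golden = replay(boot_chain)
```

A verifier that knows the expected digests can call `replay` itself and compare the result with the value reported in a quote. Any changed or reordered component produces a different value.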
During remote attestation, this signed report is sent to a remote verifier, which uses PKI to authenticate and validate it. The report includes a nonce—a unique, random number generated for each request—to prevent replay attacks and ensure the report's freshness. The verifier checks the report against expected values, leveraging PKI to confirm the authenticity of the quotes and the system’s integrity. Successful validation allows the verifier to grant access or permissions, affirming the system's trustworthiness.
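The nonce exchange can be sketched as follows. This is a simplified model rather than a TPM implementation: an HMAC over a shared secret stands in for the TPM's asymmetric signature so that the example runs with the Python standard library alone, and `Verifier`, `make_quote`, and `DEVICE_KEY` are hypothetical names.

```python
import hashlib
import hmac
import secrets

# ASSUMPTION: a shared secret plus HMAC stands in for the TPM's private key
# and asymmetric signature. A real verifier would check a signature against
# a certified public key instead.
DEVICE_KEY = secrets.token_bytes(32)


def make_quote(nonce: bytes, pcr_digest: bytes, key: bytes = DEVICE_KEY) -> bytes:
    """Device side: bind the verifier's nonce to the measured state."""
    return hmac.new(key, nonce + pcr_digest, hashlib.sha256).digest()


class Verifier:
    def __init__(self, expected_pcr_digest: bytes, key: bytes = DEVICE_KEY) -> None:
        self.expected = expected_pcr_digest
        self.key = key
        self.outstanding: set[bytes] = set()  # nonces issued but not yet used

    def challenge(self) -> bytes:
        """Issue a fresh random nonce for one attestation request."""
        nonce = secrets.token_bytes(20)
        self.outstanding.add(nonce)
        return nonce

    def verify(self, nonce: bytes, quote: bytes) -> bool:
        """Accept a quote only for an unused nonce and the expected state."""
        if nonce not in self.outstanding:
            return False  # unknown or already-used nonce: treat as a replay
        self.outstanding.discard(nonce)  # each nonce is single use
        want = hmac.new(self.key, nonce + self.expected, hashlib.sha256).digest()
        return hmac.compare_digest(want, quote)
```

Because every nonce is accepted at most once, a captured quote cannot be replayed later, and a quote computed over a different system state fails the comparison.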
Beyond integrity verification, the same PKI framework used in attestation can facilitate secure communications between nodes. Information can be encrypted to a TPM's public key so that only the node holding that TPM can decrypt and read it, ensuring data security. Additionally, PKI allows nodes to prove that a message originated from a TPM-equipped system by signing the message with a private key held in the TPM. Recipients use PKI to verify this signature, confirming both the message’s origin and its authenticity. This combined use of PKI and TPM strengthens security by enabling both secure connections and reliable verification of communications.
Bootstrapping Attestation
Bootstrapping remote attestation requires establishing trust in TPMs themselves. This foundational trust is essential for the effective functioning of the attestation process.
Initial trust is established through key provisioning and certification. When a TPM is initialized, it generates a primary endorsement key (EK) and additional keys for various functions. The EK is used to obtain a certificate, known as the Endorsement Certificate (EKCert), from a trusted Certificate Authority (CA). The EKCert binds the TPM’s public key to its identity and, because it is signed by a trusted CA, establishes a basis of trust for the TPM’s operations.
With initial trust established, the remote attestation process can proceed. The remote verifier sends a request to the TPM, including a nonce to ensure the report’s freshness. The TPM generates a signed quote, which includes the nonce and a measurement of the system’s state. This quote serves as proof of the system's integrity. The verifier uses the TPM’s EKCert to validate the TPM’s public key and the authenticity of the quote. Successful validation confirms the TPM’s trustworthiness and the system’s integrity.
Ongoing trust management involves periodic attestation checks to ensure system integrity, key rotation to maintain security, and mechanisms for certificate revocation if a TPM is compromised. These practices help maintain the robustness and reliability of the attestation framework.
Infrastructure Challenges for Remote Attestation
Managing the original Endorsement Keys (EKs) securely is a critical challenge, especially when integrating new computers as new racks are delivered or nodes are swapped. The integrity of the attestation process depends on the secure handling of these keys from generation to deployment. Either in the factory or on delivery, each TPM generates a unique EK whose public portion must be securely transmitted to a trusted Certificate Authority (CA) for certification. The private portion never leaves the TPM, so the concern in transit is tampering: protecting the public key against modification or substitution is essential to prevent an unauthorized device from being certified.
Ongoing management of EKs also includes secure handling of key rotations and updates. Procedures must be in place to address the replacement of TPMs and the updating or revocation of EKs to maintain system integrity and trustworthiness.
OpenCHAMI Attestation and Enrollment Service
This RFD proposes a process for managing enrollment keys and supporting remote attestation.
Extend OpenCHAMI to use TPMs for identity
In today's system, each node is primarily identified by the xname which denotes a location in the system. This idea of location as primary identifier is inherited from CSM through our use of the CSM service SMD as the primary inventory interface in OpenCHAMI. Location, while unique across the system, isn't stable. It is possible and even somewhat common to remove a blade from one chassis and replace it in another chassis. Tracking errors per blade as it is moved from one part of the system to another is possible in CSM, but not trivial.
The TPM contains several pieces of data that can be used for identity and are both unique and stable. The specification for TPM 2.0, which is linked below in the references, describes two IDs that are practical for our use.
One approach would be to include a TPM identifier as an additional piece of data stored by SMD and provide functions for interacting with the unique and stable identities in addition to the xnames.
A second approach would be to create a new service for externally managing these identities and provide integrations with SMD and other microservices.
I recommend the first approach as identity and inventory are intrinsically linked. Keeping them separate introduces race conditions and other potential consistency problems.
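As an illustration of the first approach, the sketch below keeps a stable TPM-derived identifier and the location-based xname in a single record, so one update moves a blade and preserves its history. It is not SMD's actual data model: `Component`, `Inventory`, and the field names are hypothetical, and the xname values are only examples.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Component:
    tpm_id: str   # stable identity: follows the hardware
    xname: str    # location: changes when the blade is moved
    history: list[str] = field(default_factory=list)  # previous locations


class Inventory:
    """Hypothetical inventory indexed by both TPM identity and xname."""

    def __init__(self) -> None:
        self._by_tpm: dict[str, Component] = {}
        self._by_xname: dict[str, str] = {}  # xname -> tpm_id

    def place(self, tpm_id: str, xname: str) -> Component:
        """Record that the hardware identified by tpm_id now sits at xname."""
        occupant = self._by_xname.get(xname)
        if occupant is not None and occupant != tpm_id:
            raise ValueError(f"{xname} is already occupied by {occupant}")
        comp = self._by_tpm.get(tpm_id)
        if comp is None:
            comp = Component(tpm_id=tpm_id, xname=xname)
            self._by_tpm[tpm_id] = comp
        elif comp.xname != xname:
            del self._by_xname[comp.xname]   # free the old location
            comp.history.append(comp.xname)
            comp.xname = xname
        self._by_xname[xname] = tpm_id
        return comp

    def by_tpm(self, tpm_id: str) -> Optional[Component]:
        return self._by_tpm.get(tpm_id)

    def by_xname(self, xname: str) -> Optional[Component]:
        tpm_id = self._by_xname.get(xname)
        return self._by_tpm.get(tpm_id) if tpm_id is not None else None
```

Because both indexes are updated inside one `place` call, there is no window in which identity and location disagree, which is the consistency concern raised against a separate identity service.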
Extend OpenCHAMI to boot a dedicated discovery image for collecting TPM keys/IDs
The remote attestation process requires establishing a collection of valid Public Keys/Certificates that identify the TPMs and which can be compared with responses in the remote attestation process. We have considered several options for a process that works with OpenCHAMI.
We believe the first option to be the most secure, but it is also the most labor intensive. We did not pursue it because we consider it impractical.
We believe the second option creates an opportunity for a rogue device to register itself as a fake node and could provide an avenue for future attacks. If we can adjust network settings or provide other protections, it might be workable. We chose not to pursue it at this time.
The third option appears to provide us with the security and manageability we need and allows us to build on tooling we already have. We are pursuing this option, but are ensuring that other options remain available for sites that do not wish to use Ansible in this way.
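Whichever option is used to collect the keys, the result is a set of known TPM public keys that later attestation responses are compared against. A minimal sketch of such a collection is shown below, assuming keys are tracked by the SHA-256 fingerprint of their public bytes. `EnrollmentRegistry` and its methods are hypothetical and do not describe an existing OpenCHAMI API.

```python
import hashlib
from typing import Optional


def fingerprint(ek_public: bytes) -> str:
    """Stable identifier for an EK public key (hex SHA-256 of its bytes)."""
    return hashlib.sha256(ek_public).hexdigest()


class EnrollmentRegistry:
    """Hypothetical allowlist of enrolled TPM public keys."""

    def __init__(self) -> None:
        self._enrolled: dict[str, str] = {}  # fingerprint -> node label
        self._revoked: set[str] = set()

    def enroll(self, ek_public: bytes, node: str) -> str:
        """Add a key collected during discovery; returns its fingerprint."""
        fp = fingerprint(ek_public)
        if fp in self._revoked:
            raise ValueError("refusing to enroll a revoked key")
        self._enrolled[fp] = node
        return fp

    def revoke(self, ek_public: bytes) -> None:
        """Remove a key, for example when a TPM is replaced or compromised."""
        fp = fingerprint(ek_public)
        self._enrolled.pop(fp, None)
        self._revoked.add(fp)

    def lookup(self, ek_public: bytes) -> Optional[str]:
        """Return the node label for an enrolled key, or None if unknown."""
        return self._enrolled.get(fingerprint(ek_public))
```

A key presented by a device that was never enrolled simply returns `None` from `lookup`. This makes explicit that protection against a rogue device depends on who is allowed to call `enroll`, which is where the three options differ.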
References
Identity and Attestation