Launch the service on Bare Metal
This document describes the steps required to host a Silent Network node on bare-metal hardware. It is intended for DevOps or software engineers, preferably with basic knowledge of Docker and the shell.
The hardware needs to meet several criteria. We provide the Operator software as a simple Docker image.
The current software is meant for the TESTNET launch; the MAINNET-ready software will be announced soon. Also note that there is no way to migrate data from TESTNET to MAINNET.
Prerequisites
The CPU must support Intel SGX (Software Guard Extensions)
Example CPU: Intel(R) Xeon(R) Gold 5412U
The whole platform must be up to date. Use recent CPU models and update the BIOS and other firmware to the newest versions; otherwise the software will not launch. See Platform provisioning for more information.
The host operating system: Ubuntu 22.04 (Jammy)
The Intel libraries in use are officially supported on Ubuntu: https://download.01.org/intel-sgx/sgx_repo/ubuntu/dists/
Please check which SGX features your CPU has. The simplest way is to use sgx-detect.
Example output:
➜ ~ sudo ./sgx-detect
Detecting SGX, this may take a minute...
✔ SGX instruction set
✔ CPU support
✔ CPU configuration
✔ Enclave attributes
✔ Enclave Page Cache
SGX features
✔ SGX2 ✔ EXINFO ✔ ENCLV ✔ OVERSUB ✔ KSS
Total EPC size: 92.2MiB
✔ Flexible launch control
✔ CPU support
✔ CPU configuration
✔ Able to launch production mode enclave
✔ SGX system software
✔ SGX kernel device (/dev/sgx_enclave)
✔ libsgx_enclave_common
✔ AESM service
✔ Able to launch enclaves
✔ Debug mode
✔ Production mode
✔ Production mode (Intel whitelisted)
You're all set to start running SGX programs!
It's important to have all green ticks in SGX instruction set, Flexible launch control, and SGX system software. Among the SGX features, the important ones are SGX2 and EXINFO.
The operator software needs to be tied to one particular CPU die. Once you run it on a machine, it must always run on that same machine. Restarting the container on another SGX-enabled CPU will cause different MRSIGNER and MRENCLAVE keys to be generated, resulting in different encryption keys. The enclave will then be unable to unseal the state that was stored while using the previous CPU, making the software inoperable.
The host machine needs to have a container runtime installed, such as Docker.
The software uses a disk as a persistent storage. The minimum size required is 64 GB. The storage should be exclusive to this software. No other service should use it. The storage should be persistent, i.e., data should stay after the power cycle.
The storage should be periodically backed up so that it is possible to roll back to the last valid state in case of a database write failure, a disk write failure, or an unexpected software bug.
The minimum RAM is 16 GB.
You need to provide a static IP address or a URL that points to the Operator software.
The running service requires a high-bandwidth external network interface.
The system date and time must be valid and synchronized by NTP (enabled by default on Ubuntu).
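Some of the prerequisites above can be checked from the shell before going further. The following is a minimal sketch, assuming GNU coreutils and a Linux /proc filesystem; the function names are illustrative and not part of the Operator software, and the file name sgx-detect.txt is only an example.

```shell
#!/bin/sh
# Preflight helpers for the prerequisites above (illustrative names).

# at_least ACTUAL REQUIRED: succeeds when ACTUAL >= REQUIRED (integers).
at_least() {
    [ "$1" -ge "$2" ]
}

# ram_gib: total RAM in GiB, rounded up, because the kernel reports
# slightly less than the installed amount in /proc/meminfo.
ram_gib() {
    awk '/^MemTotal:/ { printf "%d\n", ($2 + 1048575) / 1048576 }' /proc/meminfo
}

# disk_size_gib PATH: size in GiB of the filesystem holding PATH (GNU df).
disk_size_gib() {
    df -BG --output=size "$1" | tail -n 1 | tr -dc '0-9'
    echo
}

# has_sgx_feature FILE NAME: succeeds when the saved sgx-detect output in
# FILE shows a tick in front of feature NAME (for example SGX2 or EXINFO).
has_sgx_feature() {
    grep -Eq "✔[[:space:]]+$2([[:space:]]|\$)" "$1"
}

# Example use (not run automatically):
#   sudo ./sgx-detect > sgx-detect.txt
#   has_sgx_feature sgx-detect.txt SGX2   || echo "SGX2 is missing"
#   has_sgx_feature sgx-detect.txt EXINFO || echo "EXINFO is missing"
#   at_least "$(ram_gib)" 16              || echo "less than 16 GB of RAM"
#   at_least "$(disk_size_gib /path/to/storage)" 64 || echo "storage too small"
```

These checks only cover what can be read locally; firmware freshness and network bandwidth still need to be verified by other means.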
Platform provisioning
The Operator software does remote attestation when the Aggregator service connects to it. The attestation procedure involves external infrastructure (including Intel's web services). The platform on which the Operator service will be run must first be correctly configured.
For more details, refer to Intel's official documentation. Reading it is not required to complete the setup process.
Install PCK ID Retrieval tool and others
Add Debian repo (command for Ubuntu Jammy):
echo 'deb [arch=amd64] https://download.01.org/intel-sgx/sgx_repo/ubuntu jammy main' | sudo tee /etc/apt/sources.list.d/intel-sgx.list > /dev/null
wget -O - https://download.01.org/intel-sgx/sgx_repo/ubuntu/intel-sgx-deb.key | sudo apt-key add -
Install required packages:
sudo apt update
sudo apt install sgx-pck-id-retrieval-tool sgx-aesm-service libsgx-urts libsgx-dcap-ql libsgx-dcap-default-qpl libsgx-aesm-ecdsa-plugin libsgx-aesm-quote-ex-plugin sgx-ra-service
Make changes in /opt/intel/sgx-pck-id-retrieval-tool/network_setting.conf:
Change PCCS_URL to match our caching service:
PCCS_URL=https://pccs.el.silencelaboratories.com/sgx/certification/v4/platforms
Set USE_SECURE_CERT to true:
USE_SECURE_CERT=TRUE
Uncomment user_token and set it to the given value:
user_token =NBFPy5R9dJTa
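The three edits above can also be scripted. This is a rough sketch, assuming the file uses simple key=value lines where a disabled setting is prefixed with #; set_conf_line is a hypothetical helper, not something shipped with the Intel tools.

```shell
#!/bin/sh
# set_conf_line FILE KEY LINE
# Replaces every line that sets KEY (commented out or not) with LINE, and
# appends LINE when KEY does not appear at all. KEY is used as a regular
# expression, so keep it to plain letters, digits and underscores.
set_conf_line() {
    scl_tmp=$(mktemp) || return 1
    awk -v key="$2" -v line="$3" '
        $0 ~ "^[# \t]*" key "[ \t]*=" { print line; found = 1; next }
        { print }
        END { if (!found) print line }
    ' "$1" > "$scl_tmp" && cat "$scl_tmp" > "$1"
    scl_status=$?
    rm -f "$scl_tmp"
    return "$scl_status"
}

# Example use (run as root, not run automatically):
#   conf=/opt/intel/sgx-pck-id-retrieval-tool/network_setting.conf
#   set_conf_line "$conf" PCCS_URL 'PCCS_URL=https://pccs.el.silencelaboratories.com/sgx/certification/v4/platforms'
#   set_conf_line "$conf" USE_SECURE_CERT 'USE_SECURE_CERT=TRUE'
#   set_conf_line "$conf" user_token 'user_token =NBFPy5R9dJTa'
```

Writing back with cat keeps the original file's ownership and permissions. Review the file afterwards, since a comment line that happens to start with the key name would be replaced as well.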
Provision this host
Run the following command to provision the host. It fills the PCCS cache database and needs to be done only once.
sudo PCKIDRetrievalTool
The valid output of this command looks like this:
> sudo PCKIDRetrievalTool
Intel(R) Software Guard Extensions PCK Cert ID Retrieval Tool Version 1.21.100.3
Warning: platform manifest is not available or current platform is not multi-package platform.
the data has been sent to cache server successfully and pckid_retrieval.csv has been generated successfully!
This command creates pckid_retrieval.csv. Please do not remove it.
Set aesmd config
Edit /etc/aesmd.conf. Uncomment and set the default quoting type:
default quoting type = ecdsa_256
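A quick way to confirm that the line is really uncommented is a pattern check like the one below. It is a small sketch; the function name is made up for illustration.

```shell
#!/bin/sh
# quoting_type_is_ecdsa FILE: succeeds when FILE contains an uncommented
# "default quoting type = ecdsa_256" line.
quoting_type_is_ecdsa() {
    grep -Eq '^[[:space:]]*default quoting type[[:space:]]*=[[:space:]]*ecdsa_256[[:space:]]*$' "$1"
}

# Example use (not run automatically):
#   quoting_type_is_ecdsa /etc/aesmd.conf || echo "quoting type is not set"
```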
Set QCNL config
The QCNL config is a JSON-like file describing the network configuration used during attestation. In particular, it contains the URL of the PCCS service and other parameters.
Download this file:
Put it at /etc/sgx_default_qcnl.conf
Do not change this file in any way! Make sure its SHA-256 sum is:
1ad7f16fd1335229a81ed98a84b24e80df46614c95a2431b949c94a94d037b96
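The checksum comparison can be done with sha256sum from GNU coreutils. A minimal sketch, with an illustrative function name:

```shell
#!/bin/sh
# sha256_matches FILE EXPECTED: succeeds when the SHA-256 sum of FILE
# equals EXPECTED (lowercase hex).
sha256_matches() {
    sm_actual=$(sha256sum "$1" | cut -d ' ' -f 1)
    [ -n "$sm_actual" ] && [ "$sm_actual" = "$2" ]
}

# Example use (not run automatically):
#   sha256_matches /etc/sgx_default_qcnl.conf \
#       1ad7f16fd1335229a81ed98a84b24e80df46614c95a2431b949c94a94d037b96 \
#       || echo "checksum mismatch, download the file again"
```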
Restart the aesmd service:
sudo systemctl restart aesmd
Check if platform is free from known hardware vulnerabilities
Put the get_tcb_info.py script on the SGX machine next to pckid_retrieval.csv (the file generated by PCKIDRetrievalTool in the Platform provisioning section) and execute it.
The script will output JSON to the console. Check whether the tcbStatus property is set to UpToDate anywhere in that JSON. Example:
{
  "tcbInfo": {
    "id": "SGX",
    "version": 3,
    ...
    "tcbLevels": [
      {
        "tcb": {
          "sgxtcbcomponents": [
            ...
          ],
          "pcesvn": 11
        },
        "tcbDate": "2024-03-13T00:00:00Z",
        "tcbStatus": "UpToDate"
      },
      ...
}
If the JSON does not contain the UpToDate value, the hardware contains unfixable bugs and cannot be used to run the software. Please use other hardware.
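Searching the JSON by eye is error-prone, so the check can be done with grep on a saved copy of the output. This is a sketch only: the function name and the file name tcb_info.json are illustrative, and how exactly get_tcb_info.py is invoked is an assumption not confirmed by this document.

```shell
#!/bin/sh
# tcb_is_up_to_date FILE: succeeds when the saved JSON output in FILE
# contains a tcbStatus property set to UpToDate.
tcb_is_up_to_date() {
    grep -Eq '"tcbStatus"[[:space:]]*:[[:space:]]*"UpToDate"' "$1"
}

# Example use (not run automatically; the invocation is assumed):
#   python3 get_tcb_info.py > tcb_info.json
#   tcb_is_up_to_date tcb_info.json || echo "no UpToDate TCB level found"
```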
Run the service
There are several services to run: sgx-secret-vault, operator-sgx, and postgres.
We provide a sample docker-compose file to launch them together.
Download the operator directory to your SGX-enabled machine:
Extract the operator directory:
tar zxf operator.tar.gz
The directory structure is as follows:
.
├── config
│  ├── init-user-db.sql
│  └── resolv.conf
├── docker-compose.yaml
└── silent-network-operator.env
Set up the environment variables required to run the containers
The file silent-network-operator.env contains the environment variables used to configure the services when launching the compose. Most of them are predefined. Please set ORIG_ID to the name of your organization; it will be used, for example, in Grafana dashboards.
For security reasons, change DB_PASS from the default operator_password. Apply that change in the config/init-user-db.sql file as well.
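Because the password has to match in two files, replacing it in one step avoids a mismatch. The sketch below assumes both files contain the literal default string operator_password and relies on GNU sed; the function name is illustrative. It accepts only alphanumeric passwords so that the value cannot interfere with the sed expression.

```shell
#!/bin/sh
# replace_default_db_password NEW_PASSWORD FILE...
# Replaces every occurrence of the default "operator_password" in each FILE.
replace_default_db_password() {
    rdp_new=$1
    shift
    case $rdp_new in
        ''|*[!A-Za-z0-9]*)
            echo "password must be non-empty and alphanumeric" >&2
            return 1
            ;;
    esac
    for rdp_file in "$@"; do
        sed -i "s/operator_password/$rdp_new/g" "$rdp_file" || return 1
    done
}

# Example use from the operator directory (not run automatically):
#   replace_default_db_password "YourNewPassword123" \
#       silent-network-operator.env config/init-user-db.sql
```

Do this before the first start of the services, since the SQL init script is typically applied only when the database is first created.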
Run docker containers
Once you have all environment variables set up, run the containers:
Pass us your GitHub username so that we can grant you access to the container registry.
Create a GitHub Personal Access Token (with the read:packages scope) and log Docker in to the registry.
From the operator directory, run the compose:
docker compose --env-file silent-network-operator.env up -d
The startup can take a while. Eventually, the logs from the service should appear:
2024-07-24T14:10:13.220863Z INFO el_party_svc::signer: Master VK: "XXXX"
2024-07-24T14:10:13.222779Z INFO el_party_svc: listening on 0.0.0.0:80
You should be able to reach the service by calling a simple command:
curl --insecure https://localhost:{SN_OPERATOR_EXTERNAL_PORT}/v1/version
It should respond with details of running software.
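Since startup can take a while, the version check may fail on the first attempts. One option is to retry it in a loop. The helper below is a generic sketch with an illustrative name; the attempt count and delay in the example are arbitrary.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND...
# Runs COMMAND until it succeeds or ATTEMPTS tries have been made,
# sleeping DELAY_SECONDS between tries.
retry() {
    rt_attempts=$1
    rt_delay=$2
    shift 2
    rt_i=1
    while [ "$rt_i" -le "$rt_attempts" ]; do
        if "$@"; then
            return 0
        fi
        rt_i=$((rt_i + 1))
        if [ "$rt_i" -le "$rt_attempts" ]; then
            sleep "$rt_delay"
        fi
    done
    return 1
}

# Example use (not run automatically), with SN_OPERATOR_EXTERNAL_PORT set
# in the shell to the port the service is published on:
#   retry 30 10 curl --insecure --silent --fail \
#       "https://localhost:${SN_OPERATOR_EXTERNAL_PORT}/v1/version"
```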
If you want to shut down the services, use the following command:
docker compose --env-file silent-network-operator.env down
The storage
Once you launch the services, they will keep their state on the storage, in the db and sgx-secret-vault directories.
.
├── config
│  ├── init-user-db.sql
│  └── resolv.conf
├── db
├── docker-compose.yaml
├── sgx-secret-vault
│  └── seed_file
└── silent-network-operator.env
The contents of the db and sgx-secret-vault directories should be periodically backed up so that it is possible to recover a previous state in case of a database write failure, a hardware failure, or a software bug.
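A periodic backup can be as simple as a timestamped tar archive of the two directories. The following is a minimal sketch with an illustrative function name and archive naming scheme. Whether the services must be stopped first to get a consistent database snapshot is not stated in this document, so treat that as something to confirm.

```shell
#!/bin/sh
# backup_state OPERATOR_DIR DEST_DIR
# Archives the db and sgx-secret-vault directories from OPERATOR_DIR into
# a timestamped .tar.gz under DEST_DIR and prints the archive path.
backup_state() {
    bs_archive="$2/operator-state-$(date -u +%Y%m%dT%H%M%SZ).tar.gz"
    mkdir -p "$2" || return 1
    tar -czf "$bs_archive" -C "$1" db sgx-secret-vault || return 1
    echo "$bs_archive"
}

# Example use (not run automatically; root may be needed to read db):
#   backup_state /path/to/operator /path/to/backups
```

Keep in mind that the backup is only usable on the same machine, because the sealed state is tied to the CPU it was created on.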
Wrapping up
Make the service externally available and provide us with:
The IP address, or URL, together with the port by which the service is accessible
Response from this command (insecure because certificates are self-signed):
curl --insecure https://localhost:{SN_OPERATOR_EXTERNAL_PORT}/v1/version
The city where the hosting hardware is located
Troubleshooting
Quote verify failed func_verify_quote_result: "0xE019"
If you receive this error message during startup:
ra_tls_verify_callback: sgx_qv_verify_quote failed: 57369
2024-11-18T11:25:02.252594Z DEBUG remote_attestation::verifier: check certificate return -9984
2024-11-18T11:25:02.253033Z ERROR el_party_svc: RA-TLS certificate verification results: VerifyError {
return_code: -9984,
callback_results: ra_tls_verify_callback_results {
attestation_scheme: RA_TLS_ATTESTATION_SCHEME_DCAP,
err_loc: AT_VERIFY_EXTERNAL,
dcap: dcap {
func_verify_quote_result: "0xE019",
quote_verification_result: "UNSPECIFIED",
},
},
}
This might happen if:
The PCCS service is down. Check its accessibility with a simple curl command:
curl https://pccs.el.silencelaboratories.com/sgx/certification/v4/qe/identity
It should return JSON response with HTTP status 200. If it doesn't, please reach out to the Silence Laboratories team.
The configuration files are invalid:
Make sure the files are mounted to the container:
-v /etc/sgx_default_qcnl.conf:/etc/sgx_default_qcnl.conf \
-v ${CONFIG_MOUNT}/resolv.conf:/etc/resolv.conf \
Make sure the files have the expected SHA-256 sums, as mentioned earlier.
Container startup fails with AESM service returned error 30;
The error:
error: AESM service returned error 30; this may indicate that infrastructure for the DCAP attestation requested by Gramine is missing on this machine
Make sure all packages mentioned in Platform provisioning were installed correctly.
For further debugging, call:
sudo journalctl -u aesmd.service
PCKIDRetrievalTool error
The output of the PCKID tool:
Error: unexpected error occurred while sending data to cache server.
pckid_retrieval.csv has been generated successfully, however the data couldn't be sent to cache server!
Our PCCS server reports:
2025-04-02 11:18:23.310 [error]: Intel PCS server returns error(404).
2025-04-02 11:18:23.310 [error]: Error: No cache data for this platform.
It means the CPU is not registered. The registration service needs to be installed:
sudo apt install sgx-ra-service
Then run the PCKID tool again:
sudo PCKIDRetrievalTool
For further debugging, read the logs in /var/log/mpa_registration.log
Other issues
In case of any problems with the service, please provide us with the logs from the container: docker logs operator