Posted to commits@teaclave.apache.org by ms...@apache.org on 2020/10/28 03:58:01 UTC
[incubator-teaclave-sgx-sdk] 01/03: Add document: is_x86_feature_detected in Teaclave SGX SDK

This is an automated email from the ASF dual-hosted git repository.

mssun pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-teaclave-sgx-sdk.git

commit c5ece876482ce446347f518ca4656288fe50fb1a
Author: Mingshen Sun <bo...@mssun.me>
AuthorDate: Tue Oct 27 20:34:59 2020 -0700

    Add document: is_x86_feature_detected in Teaclave SGX SDK
---
 documents/is_x86_feature_detected-in-sgx-sdk.md | 99 +++++++++++++++++++++++++
 1 file changed, 99 insertions(+)

diff --git a/documents/is_x86_feature_detected-in-sgx-sdk.md b/documents/is_x86_feature_detected-in-sgx-sdk.md
new file mode 100644
index 0000000..8c71865
--- /dev/null
+++ b/documents/is_x86_feature_detected-in-sgx-sdk.md
@@ -0,0 +1,99 @@
+---
+permalink: /sgx-sdk-docs/is_x86_feature_detected-in-sgx-sdk
+---
+
+# `is_x86_feature_detected` in Teaclave SGX SDK
+
+## Background
+
+Crates often use `is_x86_feature_detected` to select an appropriate
+implementation (such as AVX, SSE, SSSE3, or FMA). In the default `libstd`
+implementation on x86_64, it executes the `cpuid` instruction. We want to avoid
+such SGX-incompatible instructions and unnecessary AEX events.
+
+## Solution
+
+We found that Intel's SDK initializes its optimized libraries in two steps:
+
+1. Initialize a global cpu feature indicator from the enclave initialization parameter
+   in [urts](https://github.com/intel/linux-sgx/blob/042849cef8db1f0384e52e8cebcd8820c7754398/psw/urts/enclave_creator_hw_com.cpp#L61)
+
+```c
+//Since CPUID instruction is NOT supported within enclave, we enumerate the cpu features here and send to tRTS.
+get_cpu_features(&info.cpu_features);
+get_cpu_features_ext(&info.cpu_features_ext);
+init_cpuinfo((uint32_t *)info.cpuinfo_table);
+```
+
+2. Initialize optimized libraries according to the global cpu feature indicator
+   in [trts](https://github.com/intel/linux-sgx/blob/042849cef8db1f0384e52e8cebcd8820c7754398/sdk/trts/init_enclave.cpp#L169)
+
+```c
+// optimized libs
+if (SDK_VERSION_2_0 < g_sdk_version || sys_features.size != 0)
+{
+  if (0 != init_optimized_libs(cpu_features, (uint32_t*)sys_features.cpuinfo_table, xfrm))
+  {
+    return -1;
+  }
+}
+```
+
+We found that in `init_optimized_libs`, a global variable
+`g_cpu_feature_indicator` is initialized to store the `feature_bit_array` which
+contains everything we need!
+
+```c
+static int set_global_feature_indicator(uint64_t feature_bit_array, uint64_t xfrm) {
+    ......
+    g_cpu_feature_indicator = feature_bit_array;
+    return 0;
+}
+```
+
+Since the Rust SGX SDK depends on trts, we can simply reuse
+`g_cpu_feature_indicator` and emulate the `is_x86_feature_detected` macro
+easily! First, we import the value from trts:
+
+```rust
+#[link(name = "sgx_trts")]
+extern "C" {
+    static g_cpu_feature_indicator: uint64_t;
+    static EDMM_supported: c_int;
+}
+
+#[inline]
+pub fn rsgx_get_cpu_feature() -> u64 {
+    unsafe { g_cpu_feature_indicator }
+}
+```
+
+Then we parse `g_cpu_feature_indicator` the way `std_detect` does:
+
+```rust
+#[macro_export]
+macro_rules! is_cpu_feature_supported {
+    ($feature:expr) => ( (($feature & $crate::enclave::rsgx_get_cpu_feature()) != 0) )
+}
+
+#[macro_export]
+macro_rules! is_x86_feature_detected {
+    ("ia32") => {
+        $crate::cpu_feature::check_for($crate::cpu_feature::Feature::ia32)
+    };
+    ...
+}
+```
+
+## Performance concerns
+
+We observed that some crates (such as matrixmultiply) tend to use the most
+advanced instruction set available for speed. But that may not be the best
+choice. For example, the "machine-learning" SGX sample depends on rusty-machine
+and matrixmultiply, which use AVX instructions if supported. However, if we
+use the "fallback" mode, it is about 10x faster than the AVX version. The AVX
+optimization is pretty complicated and I have had no time to read Intel's [Intel® 64
+and IA-32 Architectures Optimization Reference
+Manual](https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf).
+And I don't think either the crate owners or the LLVM backend can optimize it
+ideally. I recommend choosing the appropriate instruction set per workload.

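Note on the "Solution" section of the added document: the mask test behind `is_cpu_feature_supported` and `is_x86_feature_detected` can be tried outside an enclave with a sketch like the one below. It is not the SDK's code. The bit positions in `Feature`, the `check_for` signature that takes the indicator as an argument, and the `feature_detected!` macro name are all assumptions made for illustration; inside an enclave the indicator would come from `rsgx_get_cpu_feature()`.

```rust
/// Hypothetical feature bits. The real bit layout is defined by Intel's SDK
/// and is NOT reproduced here.
#[allow(non_camel_case_types, dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u64)]
pub enum Feature {
    ia32 = 1 << 0,
    sse2 = 1 << 1,
    avx = 1 << 2,
}

/// Returns true if the bit for `feature` is set in `indicator`.
/// In an enclave, `indicator` would be the value trts cached during
/// initialization, so no `cpuid` instruction is executed here.
pub fn check_for(indicator: u64, feature: Feature) -> bool {
    indicator & (feature as u64) != 0
}

/// Maps string literals to `Feature` values, mirroring the shape of
/// `is_x86_feature_detected!`. Taking the indicator explicitly is an
/// assumption of this sketch so that it can run outside an enclave.
#[allow(unused_macros)]
macro_rules! feature_detected {
    ($ind:expr, "ia32") => { check_for($ind, Feature::ia32) };
    ($ind:expr, "sse2") => { check_for($ind, Feature::sse2) };
    ($ind:expr, "avx") => { check_for($ind, Feature::avx) };
}
```

With an indicator of `0b011` under these made-up bit positions, `feature_detected!(0b011, "sse2")` evaluates to `true` and `feature_detected!(0b011, "avx")` to `false`.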

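Note on the "Performance concerns" section: choosing the instruction set per workload could look roughly like the sketch below, where the caller states a policy instead of always taking the widest instruction set the indicator reports. `AVX_BIT`, `Kernel`, `Policy`, and `select_kernel` are hypothetical names for illustration; neither the SDK nor matrixmultiply is known to expose such an API.

```rust
/// Hypothetical position of the AVX bit in the feature indicator
/// (illustrative only, not the SDK's real layout).
pub const AVX_BIT: u64 = 1 << 2;

/// Which code path a numeric routine should take.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Kernel {
    Fallback,
    Avx,
}

/// How the caller wants the code path to be chosen for a given workload.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Policy {
    /// Use the widest instruction set the indicator reports.
    Widest,
    /// Always use the portable code path, whatever the CPU supports.
    ForceFallback,
}

/// Picks a kernel from the cached feature indicator and the caller's policy.
pub fn select_kernel(indicator: u64, policy: Policy) -> Kernel {
    match policy {
        Policy::ForceFallback => Kernel::Fallback,
        Policy::Widest if indicator & AVX_BIT != 0 => Kernel::Avx,
        Policy::Widest => Kernel::Fallback,
    }
}
```

A workload that measures faster on the fallback path, as the machine-learning sample did, would pass `Policy::ForceFallback` and get `Kernel::Fallback` even on an AVX-capable CPU.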