Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2021/04/13 05:31:15 UTC

[GitHub] [tvm] csullivan opened a new pull request #7834: [OpenCL] Refactor cl_program generation

csullivan opened a new pull request #7834:
URL: https://github.com/apache/tvm/pull/7834


   I have encountered a few pathological bugs in the OpenCL compiler provided on the Snapdragon Android platform (e.g. the OpenCL compiler hanging for 5+ hours in a call to `clBuildProgram`, and non-deterministic emission of `cl_a6x_cmdbuf_mgr_submit_ibs`). I've isolated them into a minimal reproducible example and found that they occur only when all kernels are created from a single cl_program. If a cl_program is instead created for each kernel, these issues are avoided.
   
   This PR proposes adding a kernel primitive delimiter to the OpenCL code generation, and having the OpenCL module runtime use this delimiter to build and cache a separate cl_program for each generated kernel source.
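
To illustrate the idea, here is a minimal sketch of splitting a delimited source string into one source per kernel. It assumes each kernel is preceded by a `// Function: <name>` comment line, as in the diff quoted later in this thread. The helper name `SplitByDelimiter` is hypothetical, and this is not the code merged in the PR.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical helper (not the PR's implementation): splits `source` at every
// "// Function: <name>" marker and maps each kernel name to its source text.
// Any text before the first marker is ignored in this sketch.
std::unordered_map<std::string, std::string> SplitByDelimiter(const std::string& source) {
  const std::string del = "// Function: ";
  std::unordered_map<std::string, std::string> kernels;
  std::size_t begin = source.find(del);
  while (begin != std::string::npos) {
    // The kernel name runs from the end of the delimiter to the end of that line.
    std::size_t name_begin = begin + del.size();
    std::size_t name_end = source.find('\n', name_begin);
    if (name_end == std::string::npos) name_end = source.size();
    std::string name = source.substr(name_begin, name_end - name_begin);
    // The kernel source runs from the end of the name line up to the next
    // delimiter, or to the end of the input if there is no further delimiter.
    std::size_t next = source.find(del, name_end);
    std::size_t body_end = (next == std::string::npos) ? source.size() : next;
    kernels[name] = source.substr(name_end, body_end - name_end);
    begin = next;
  }
  return kernels;
}
```

Each resulting entry could then be handed to `clCreateProgramWithSource` on its own, so that a compiler problem in one kernel does not involve the others.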


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [tvm] tqchen commented on a change in pull request #7834: [OpenCL] Refactor cl_program generation

Posted by GitBox <gi...@apache.org>.
tqchen commented on a change in pull request #7834:
URL: https://github.com/apache/tvm/pull/7834#discussion_r612406840



##########
File path: src/runtime/opencl/opencl_module.cc
##########
@@ -181,49 +185,81 @@ void OpenCLModuleNode::Init() {
     e.version = workspace_->timestamp++;
     kid_map_[key] = e;
   }
+
+  // Use function delimiters to parse the serialized source
+  // into separate source files for each kernel primitive
+  std::string source = GetSource("cl");

Review comment:
       Decouple this into a separate function, `SplitKernels`.







[GitHub] [tvm] csullivan commented on a change in pull request #7834: [OpenCL] Refactor cl_program generation

Posted by GitBox <gi...@apache.org>.
csullivan commented on a change in pull request #7834:
URL: https://github.com/apache/tvm/pull/7834#discussion_r624401398



##########
File path: src/runtime/opencl/opencl_common.h
##########
@@ -315,6 +315,13 @@ class OpenCLModuleNode : public ModuleNode {
   cl_kernel InstallKernel(cl::OpenCLWorkspace* w, cl::OpenCLThreadEntry* t,
                           const std::string& func_name, const KTRefEntry& e);
 
+  /*
+   * \brief Splits the provided serialized source file into separate
+   * source for each kernel primitive.
+   * \param source The serialized program source file (fmt: cl)

Review comment:
       Thanks for the ping @tqchen, should be good to go.







[GitHub] [tvm] jroesch commented on a change in pull request #7834: [OpenCL] Refactor cl_program generation

Posted by GitBox <gi...@apache.org>.
jroesch commented on a change in pull request #7834:
URL: https://github.com/apache/tvm/pull/7834#discussion_r612776428



##########
File path: src/runtime/opencl/opencl_module.cc
##########
@@ -181,56 +185,94 @@ void OpenCLModuleNode::Init() {
     e.version = workspace_->timestamp++;
     kid_map_[key] = e;
   }
+
+  // split into source artifacts for each kernel
+  parsed_kernels_ = SplitKernels(GetSource("cl"));
+  // zero initialize cl_program pointers for each device kernel
+  for (auto& kv : parsed_kernels_) {
+    programs_.insert({kv.first, std::vector<cl_program>(workspace_->devices.size(), nullptr)});
+  }
 }
 
 cl_kernel OpenCLModuleNode::InstallKernel(cl::OpenCLWorkspace* w, cl::OpenCLThreadEntry* t,
                                           const std::string& func_name, const KTRefEntry& e) {
   std::lock_guard<std::mutex> lock(build_lock_);
   int device_id = t->device.device_id;
-  if (!device_built_flag_[device_id]) {
+  if (programs_[func_name][device_id] == nullptr) {
     // create program
     if (fmt_ == "cl") {
-      if (program_ == nullptr) {
-        const char* s = data_.c_str();
-        size_t len = data_.length();
-        cl_int err;
-        program_ = clCreateProgramWithSource(w->context, 1, &s, &len, &err);
-        OPENCL_CHECK_ERROR(err);
-      }
+      const char* s = parsed_kernels_[func_name].c_str();
+      size_t len = parsed_kernels_[func_name].length();
+      cl_int err;
+      programs_[func_name][device_id] = clCreateProgramWithSource(w->context, 1, &s, &len, &err);
+      OPENCL_CHECK_ERROR(err);
     } else if (fmt_ == "xclbin" || fmt_ == "awsxclbin" || fmt_ == "aocx") {
       const unsigned char* s = (const unsigned char*)data_.c_str();
       size_t len = data_.length();
       cl_int err;
       cl_device_id dev = w->devices[device_id];
-      program_ = clCreateProgramWithBinary(w->context, 1, &dev, &len, &s, NULL, &err);
+      programs_[func_name][device_id] =
+          clCreateProgramWithBinary(w->context, 1, &dev, &len, &s, NULL, &err);
       OPENCL_CHECK_ERROR(err);
     } else {
       LOG(FATAL) << "Unknown OpenCL format " << fmt_;
     }
     // build program
     cl_int err;
     cl_device_id dev = w->devices[device_id];
-    err = clBuildProgram(program_, 1, &dev, nullptr, nullptr, nullptr);
+    err = clBuildProgram(programs_[func_name][device_id], 1, &dev, nullptr, nullptr, nullptr);
     if (err != CL_SUCCESS) {
       size_t len;
       std::string log;
-      clGetProgramBuildInfo(program_, dev, CL_PROGRAM_BUILD_LOG, 0, nullptr, &len);
+      clGetProgramBuildInfo(programs_[func_name][device_id], dev, CL_PROGRAM_BUILD_LOG, 0, nullptr,
+                            &len);
       log.resize(len);
-      clGetProgramBuildInfo(program_, dev, CL_PROGRAM_BUILD_LOG, len, &log[0], nullptr);
-      LOG(FATAL) << "OpenCL build error for device=" << dev << log;
+      clGetProgramBuildInfo(programs_[func_name][device_id], dev, CL_PROGRAM_BUILD_LOG, len,
+                            &log[0], nullptr);
+      LOG(FATAL) << "OpenCL build error for device=" << dev << "\n" << log;
     }
-    device_built_flag_[device_id] = true;
   }
   // build kernel
   cl_int err;
-  cl_kernel kernel = clCreateKernel(program_, func_name.c_str(), &err);
+  cl_kernel kernel = clCreateKernel(programs_[func_name][device_id], func_name.c_str(), &err);
   OPENCL_CHECK_ERROR(err);
   t->kernel_table[e.kernel_id].kernel = kernel;
   t->kernel_table[e.kernel_id].version = e.version;
   kernels_.push_back(kernel);
   return kernel;
 }
 
+std::unordered_map<std::string, std::string> OpenCLModuleNode::SplitKernels(
+    std::string source) const {
+  std::unordered_map<std::string, std::string> split_kernels;
+  if (source.size()) {
+    std::string del{"// Function: "};
+    size_t end;
+    size_t begin = source.find(del);
+    ICHECK(begin != std::string::npos) << "The OpenCL module expects a kernel delimited "
+                                       << "source from code generation, but no kernel "
+                                       << "delimiter was found.";
+    while (true) {

Review comment:
       Are we sure this string matching code always terminates? 







[GitHub] [tvm] tqchen commented on pull request #7834: [OpenCL] Refactor cl_program generation

Posted by GitBox <gi...@apache.org>.
tqchen commented on pull request #7834:
URL: https://github.com/apache/tvm/pull/7834#issuecomment-830617843


   Thanks @csullivan. This is merged.





[GitHub] [tvm] tqchen commented on a change in pull request #7834: [OpenCL] Refactor cl_program generation

Posted by GitBox <gi...@apache.org>.
tqchen commented on a change in pull request #7834:
URL: https://github.com/apache/tvm/pull/7834#discussion_r620245071



##########
File path: src/runtime/opencl/opencl_common.h
##########
@@ -315,6 +315,13 @@ class OpenCLModuleNode : public ModuleNode {
   cl_kernel InstallKernel(cl::OpenCLWorkspace* w, cl::OpenCLThreadEntry* t,
                           const std::string& func_name, const KTRefEntry& e);
 
+  /*
+   * \brief Splits the provided serialized source file into separate
+   * source for each kernel primitive.
+   * \param source The serialized program source file (fmt: cl)

Review comment:
       Document the return value.







[GitHub] [tvm] tqchen merged pull request #7834: [OpenCL] Refactor cl_program generation

Posted by GitBox <gi...@apache.org>.
tqchen merged pull request #7834:
URL: https://github.com/apache/tvm/pull/7834


   





[GitHub] [tvm] jroesch commented on a change in pull request #7834: [OpenCL] Refactor cl_program generation

Posted by GitBox <gi...@apache.org>.
jroesch commented on a change in pull request #7834:
URL: https://github.com/apache/tvm/pull/7834#discussion_r612776612



##########
File path: src/runtime/opencl/opencl_module.cc
##########
+    while (true) {

Review comment:
       Usually it's good to put some kind of upper bound on things like this to ensure termination.
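
One way to make the termination argument explicit is sketched below. It relies on two facts: each search resumes strictly after the previous match, and non-overlapping matches each consume at least `del.size()` characters, so the number of iterations cannot exceed `source.size() / del.size() + 1`. The helper name `FindDelimiters` and the use of exceptions are assumptions made for illustration, not what the PR does.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: returns the offset of every non-overlapping occurrence
// of `del` in `source`, with an explicit upper bound on loop iterations.
std::vector<std::size_t> FindDelimiters(const std::string& source, const std::string& del) {
  if (del.empty()) throw std::invalid_argument("delimiter must be non-empty");
  std::vector<std::size_t> offsets;
  // Each match consumes at least del.size() characters, which caps the count.
  const std::size_t max_iters = source.size() / del.size() + 1;
  std::size_t pos = source.find(del);
  for (std::size_t i = 0; pos != std::string::npos; ++i) {
    if (i >= max_iters) throw std::runtime_error("delimiter scan exceeded its iteration bound");
    offsets.push_back(pos);
    // Resume strictly after the current match so the search position always advances.
    pos = source.find(del, pos + del.size());
  }
  return offsets;
}
```

In this sketch the bound should never trigger for a correct `std::string::find`; it serves as a defensive guard rather than a normal exit path.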







[GitHub] [tvm] csullivan commented on a change in pull request #7834: [OpenCL] Refactor cl_program generation

Posted by GitBox <gi...@apache.org>.
csullivan commented on a change in pull request #7834:
URL: https://github.com/apache/tvm/pull/7834#discussion_r614426856



##########
File path: src/runtime/opencl/opencl_module.cc
##########
+    while (true) {

Review comment:
       Great catch, thanks! I updated this and added a check to ensure the number of parsed kernels matches the number of registered kernels.
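
A consistency check of the kind described here could look roughly like the following sketch. It assumes the parsed kernels are held in a map from kernel name to source and that the registered kernel names are available as a list. The function name `ParsedKernelsMatch` is hypothetical, and the updated PR code itself is not shown in this thread.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical check: true only if the number of parsed kernels equals the
// number of registered kernels and every registered name has a parsed source.
bool ParsedKernelsMatch(const std::unordered_map<std::string, std::string>& parsed,
                        const std::vector<std::string>& registered) {
  if (parsed.size() != registered.size()) return false;
  for (const std::string& name : registered) {
    if (parsed.find(name) == parsed.end()) return false;
  }
  return true;
}
```

Inside TVM, a failed check like this would more likely abort through one of the project's `ICHECK` macros than return a boolean; the boolean form is used here only to keep the sketch self-contained.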







[GitHub] [tvm] tqchen commented on a change in pull request #7834: [OpenCL] Refactor cl_program generation

Posted by GitBox <gi...@apache.org>.
tqchen commented on a change in pull request #7834:
URL: https://github.com/apache/tvm/pull/7834#discussion_r624280088



##########
File path: src/runtime/opencl/opencl_common.h
##########
@@ -315,6 +315,13 @@ class OpenCLModuleNode : public ModuleNode {
   cl_kernel InstallKernel(cl::OpenCLWorkspace* w, cl::OpenCLThreadEntry* t,
                           const std::string& func_name, const KTRefEntry& e);
 
+  /*
+   * \brief Splits the provided serialized source file into separate
+   * source for each kernel primitive.
+   * \param source The serialized program source file (fmt: cl)

Review comment:
       gentle ping @csullivan 



