You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by mo...@apache.org on 2022/06/09 15:36:16 UTC

[incubator-doris] branch master updated: [dependency] add hyperscan and its dependency ragel to thirdparty (#9964)

This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-doris.git


The following commit(s) were added to refs/heads/master by this push:
     new 73a3093539 [dependency] add hyperscan and its dependency ragel to thirdparty (#9964)
73a3093539 is described below

commit 73a30935399108a6fc2abf8b5017bd1ff2872dc6
Author: Kang <kx...@gmail.com>
AuthorDate: Thu Jun 9 23:36:09 2022 +0800

    [dependency] add hyperscan and its dependency ragel to thirdparty (#9964)
---
 dist/LICENSE-dist.txt                    |   1 +
 dist/licenses/LICENSE-hyperscan.txt      | 118 +++++++++++++++++++++++++++++++
 thirdparty/CHANGELOG.md                  |   4 ++
 thirdparty/build-thirdparty.sh           |  15 ++++
 thirdparty/download-thirdparty.sh        |  12 ++++
 thirdparty/patches/hyperscan-5.4.0.patch |  18 +++++
 thirdparty/vars.sh                       |  14 ++++
 7 files changed, 182 insertions(+)

diff --git a/dist/LICENSE-dist.txt b/dist/LICENSE-dist.txt
index eab7c03f94..2330685ebb 100644
--- a/dist/LICENSE-dist.txt
+++ b/dist/LICENSE-dist.txt
@@ -1549,6 +1549,7 @@ Other dependencies:
     * rapidjson@1a803826 -- license/LICENSE-rapidjson.txt
     * curl: 7.79.0 -- license/LICENSE-curl.txt
     * re2: 2021-02-02 -- license/LICENSE-re2.txt
+    * hyperscan: 5.4.0 -- license/LICENSE-hyperscan.txt
     * boost: 1.73.0 -- license/LICENSE-boost.txt
     * unixodbc: 2.3.7 -- license/LICENSE-unixodbc.txt
     * leveldb: 1.20 -- license/LICENSE-leveldb.txt
diff --git a/dist/licenses/LICENSE-hyperscan.txt b/dist/licenses/LICENSE-hyperscan.txt
new file mode 100644
index 0000000000..30c57a8013
--- /dev/null
+++ b/dist/licenses/LICENSE-hyperscan.txt
@@ -0,0 +1,118 @@
+Hyperscan is licensed under the BSD License.
+
+Copyright (c) 2015, Intel Corporation
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+  * Redistributions of source code must retain the above copyright notice,
+    this list of conditions and the following disclaimer.
+  * Redistributions in binary form must reproduce the above copyright
+    notice, this list of conditions and the following disclaimer in the
+    documentation and/or other materials provided with the distribution.
+  * Neither the name of Intel Corporation nor the names of its contributors
+    may be used to endorse or promote products derived from this software
+    without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+--------------------------------------------------------------------------------
+
+This product also contains code from third parties, under the following
+licenses:
+
+Intel's Slicing-by-8 CRC32 implementation
+-----------------------------------------
+
+Copyright (c) 2004-2006, Intel Corporation
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+  * Redistributions of source code must retain the above copyright notice,
+    this list of conditions and the following disclaimer.
+  * Redistributions in binary form must reproduce the above copyright notice,
+    this list of conditions and the following disclaimer in the documentation
+    and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+Boost C++ Headers Library
+-------------------------
+
+Boost Software License - Version 1.0 - August 17th, 2003
+
+Permission is hereby granted, free of charge, to any person or organization
+obtaining a copy of the software and accompanying documentation covered by
+this license (the "Software") to use, reproduce, display, distribute,
+execute, and transmit the Software, and to prepare derivative works of the
+Software, and to permit third-parties to whom the Software is furnished to
+do so, all subject to the following:
+
+The copyright notices in the Software and this entire statement, including
+the above license grant, this restriction and the following disclaimer,
+must be included in all copies of the Software, in whole or in part, and
+all derivative works of the Software, unless such copies or derivative
+works are solely in the form of machine-executable object code generated by
+a source language processor.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT
+SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE
+FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE,
+ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+DEALINGS IN THE SOFTWARE.
+
+
+The Google C++ Testing Framework (Google Test)
+----------------------------------------------
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+    * Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above
+copyright notice, this list of conditions and the following disclaimer
+in the documentation and/or other materials provided with the
+distribution.
+    * Neither the name of Google Inc. nor the names of its
+contributors may be used to endorse or promote products derived from
+this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
diff --git a/thirdparty/CHANGELOG.md b/thirdparty/CHANGELOG.md
index ddd3dfb7f3..9287a2838b 100644
--- a/thirdparty/CHANGELOG.md
+++ b/thirdparty/CHANGELOG.md
@@ -2,6 +2,10 @@
 
 This file contains version of the third-party dependency libraries in the build-env image. The docker build-env image is apache/incubator-doris, and the tag is `build-env-${version}`
 
+## v20220606
+- Added: hyperscan 5.4.0, and a patch for compilation
+- Added: ragel 6.1.0, it is used by hyperscan to generate files before compilation
+
 ## v20220522
 
 - Added: libgsasl 1.8.0, this version of gsasl is only used for libhdfs3 with kerberos
diff --git a/thirdparty/build-thirdparty.sh b/thirdparty/build-thirdparty.sh
index ec4bf9b985..1690e7e8f6 100755
--- a/thirdparty/build-thirdparty.sh
+++ b/thirdparty/build-thirdparty.sh
@@ -505,6 +505,20 @@ build_re2() {
     ${BUILD_SYSTEM} -j $PARALLEL install
 }
 
+# hyperscan
+build_hyperscan() {
+    check_if_source_exist $RAGEL_SOURCE
+    cd $TP_SOURCE_DIR/$RAGEL_SOURCE
+    ./configure --prefix=$TP_INSTALL_DIR && make install
+
+    check_if_source_exist $HYPERSCAN_SOURCE
+    cd $TP_SOURCE_DIR/$HYPERSCAN_SOURCE
+    mkdir -p $BUILD_DIR && cd $BUILD_DIR
+    PATH=$TP_INSTALL_DIR/bin:$PATH ${CMAKE_CMD} -G "${GENERATOR}" -DBUILD_SHARED_LIBS=0 \
+    -DBOOST_ROOT=$BOOST_SOURCE -DCMAKE_INSTALL_PREFIX=$TP_INSTALL_DIR ..
+    ${BUILD_SYSTEM} -j $PARALLEL install
+}
+
 # boost
 build_boost() {
     check_if_source_exist $BOOST_SOURCE
@@ -1019,6 +1033,7 @@ build_snappy
 build_gperftools
 build_curl
 build_re2
+build_hyperscan
 build_thrift
 build_leveldb
 build_brpc
diff --git a/thirdparty/download-thirdparty.sh b/thirdparty/download-thirdparty.sh
index 6014611ab5..526e549512 100755
--- a/thirdparty/download-thirdparty.sh
+++ b/thirdparty/download-thirdparty.sh
@@ -312,6 +312,18 @@ if [ $LIBRDKAFKA_SOURCE = "librdkafka-1.8.2" ]; then
 fi
 echo "Finished patching $LIBRDKAFKA_SOURCE"
 
+# patch hyperscan
+# https://github.com/intel/hyperscan/issues/292
+if [ $HYPERSCAN_SOURCE = "hyperscan-5.4.0" ]; then
+    cd $TP_SOURCE_DIR/$HYPERSCAN_SOURCE
+    if [ ! -f $PATCHED_MARK ]; then
+        patch -p0 < $TP_PATCH_DIR/hyperscan-5.4.0.patch
+        touch $PATCHED_MARK
+    fi
+    cd -
+fi
+echo "Finished patching $HYPERSCAN_SOURCE"
+
 cd $TP_SOURCE_DIR/$AWS_SDK_SOURCE
 if [ ! -f $PATCHED_MARK ]; then
     if [ $AWS_SDK_SOURCE == "aws-sdk-cpp-1.9.211" ]; then
diff --git a/thirdparty/patches/hyperscan-5.4.0.patch b/thirdparty/patches/hyperscan-5.4.0.patch
new file mode 100644
index 0000000000..8bce7a88ba
--- /dev/null
+++ b/thirdparty/patches/hyperscan-5.4.0.patch
@@ -0,0 +1,18 @@
+diff --git cmake/build_wrapper.sh cmake/build_wrapper.sh
+index 1962813..becfbf4 100755
+--- cmake/build_wrapper.sh
++++ cmake/build_wrapper.sh
+@@ -17,11 +17,11 @@ KEEPSYMS=$(mktemp -p /tmp keep.syms.XXXXX)
+ LIBC_SO=$("$@" --print-file-name=libc.so.6)
+ cp ${KEEPSYMS_IN} ${KEEPSYMS}
+ # get all symbols from libc and turn them into patterns
+-nm -f p -g -D ${LIBC_SO} | sed -s 's/\([^ ]*\).*/^\1$/' >> ${KEEPSYMS}
++nm -f posix -g -D ${LIBC_SO} | sed -s 's/\([^ @]*\).*/^\1$/' >> ${KEEPSYMS}
+ # build the object
+ "$@"
+ # rename the symbols in the object
+-nm -f p -g ${OUT} | cut -f1 -d' ' | grep -v -f ${KEEPSYMS} | sed -e "s/\(.*\)/\1\ ${PREFIX}_\1/" >> ${SYMSFILE}
++nm -f posix -g ${OUT} | cut -f1 -d' ' | grep -v -f ${KEEPSYMS} | sed -e "s/\(.*\)/\1\ ${PREFIX}_\1/" >> ${SYMSFILE}
+ if test -s ${SYMSFILE}
+ then
+     objcopy --redefine-syms=${SYMSFILE} ${OUT}
diff --git a/thirdparty/vars.sh b/thirdparty/vars.sh
index 9696c76a48..3adcd3f5b9 100755
--- a/thirdparty/vars.sh
+++ b/thirdparty/vars.sh
@@ -152,6 +152,18 @@ RE2_NAME=re2-2021-02-02.tar.gz
 RE2_SOURCE=re2-2021-02-02
 RE2_MD5SUM="48bc665463a86f68243c5af1bac75cd0"
 
+# hyperscan
+HYPERSCAN_DOWNLOAD="https://github.com/intel/hyperscan/archive/refs/tags/v5.4.0.tar.gz"
+HYPERSCAN_NAME=hyperscan-5.4.0.tar.gz
+HYPERSCAN_SOURCE=hyperscan-5.4.0
+HYPERSCAN_MD5SUM="65e08385038c24470a248f6ff2fa379b"
+
+# ragel (dependency for hyperscan)
+RAGEL_DOWNLOAD="http://www.colm.net/files/ragel/ragel-6.10.tar.gz"
+RAGEL_NAME=ragel-6.10.tar.gz
+RAGEL_SOURCE=ragel-6.10
+RAGEL_MD5SUM="748cae8b50cffe9efcaa5acebc6abf0d"
+
 # boost
 BOOST_DOWNLOAD="https://boostorg.jfrog.io/artifactory/main/release/1.73.0/source/boost_1_73_0.tar.gz"
 BOOST_NAME=boost_1_73_0.tar.gz
@@ -401,6 +413,8 @@ BZIP
 LZO2
 CURL
 RE2
+HYPERSCAN
+RAGEL
 BOOST
 MYSQL
 ODBC


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org