You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@phoenix.apache.org by ChinmaySKulkarni <gi...@git.apache.org> on 2018/02/02 20:23:13 UTC

[GitHub] phoenix pull request #292: PHOENIX-4231: Support restriction of remote UDF l...

GitHub user ChinmaySKulkarni opened a pull request:

    https://github.com/apache/phoenix/pull/292

    PHOENIX-4231: Support restriction of remote UDF load sources

    - Added feature to be able to add jars from an HDFS URI.
    - Restrict loading of jars to be only from the hbase.dynamic.jars.dir
    directory.
    
    Testing done:
    - Tested that the user is able to add jars from an HDFS URI reachable on the
    network as well as local filesystem.
    - Tested that the user is unable to create a function where the jar is
    loaded from any directory apart from hbase.dynamic.jars.dir.
    - Tested that HDFS URIs without scheme and authority work.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ChinmaySKulkarni/phoenix PHOENIX-4231

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/phoenix/pull/292.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #292
    
----
commit 9351e1b4d4a5741bbf063f0d6cd31f14cfa1e6b6
Author: Chinmay Kulkarni <ch...@...>
Date:   2018-02-02T20:16:47Z

    PHOENIX-4231: Support restriction of remote UDF load sources
    
    - Added feature to be able to add jars from an HDFS URI.
    - Restrict loading of jars to be only from the hbase.dynamic.jars.dir
    directory.
    
    Testing done:
    - Tested that the user is able to add jars from an HDFS URI reachable on the
    network as well as local filesystem.
    - Tested that the user is unable to create a function where the jar is
    loaded from any directory apart from hbase.dynamic.jars.dir.
    - Tested that HDFS URIs without scheme and authority work.

----


---

[GitHub] phoenix pull request #292: PHOENIX-4231: Support restriction of remote UDF l...

Posted by aertoria <gi...@git.apache.org>.
Github user aertoria commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/292#discussion_r169509766
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixStatement.java ---
    @@ -907,10 +909,15 @@ public MutationState execute() throws SQLException {
                         try {
                             FileSystem fs = dynamicJarsDirPath.getFileSystem(conf);
                             List<LiteralParseNode> jarPaths = getJarPaths();
    -                        for (LiteralParseNode jarPath : jarPaths) {
    -                            File f = new File((String) jarPath.getValue());
    -                            fs.copyFromLocalFile(new Path(f.getAbsolutePath()), new Path(
    -                                    dynamicJarsDir + f.getName()));
    +                        for (LiteralParseNode jarPathNode : jarPaths) {
    +                          String jarPathName = (String) jarPathNode.getValue();
    +                          File f = new File(jarPathName);
    +                          Path dynamicJarsDirPathWithJar = new Path(dynamicJarsDir + f.getName());
    +                          // Copy the jar (can be local or on HDFS) to the hbase.dynamic.jars.dir directory.
    +                          // Note that this does not support HDFS URIs without scheme and authority.
    +                          Path jarPath = new Path(jarPathName);
    +                          FileUtil.copy(jarPath.getFileSystem(conf), jarPath, fs, dynamicJarsDirPathWithJar,
    --- End diff --
    
    This copy happen at phoenixStatment.java which is at client side. Shouldn't the checking be happening at Server side


---

[GitHub] phoenix pull request #292: PHOENIX-4231: Support restriction of remote UDF l...

Posted by apurtell <gi...@git.apache.org>.
Github user apurtell commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/292#discussion_r166479944
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixStatement.java ---
    @@ -907,10 +909,15 @@ public MutationState execute() throws SQLException {
                         try {
                             FileSystem fs = dynamicJarsDirPath.getFileSystem(conf);
                             List<LiteralParseNode> jarPaths = getJarPaths();
    -                        for (LiteralParseNode jarPath : jarPaths) {
    -                            File f = new File((String) jarPath.getValue());
    -                            fs.copyFromLocalFile(new Path(f.getAbsolutePath()), new Path(
    -                                    dynamicJarsDir + f.getName()));
    +                        for (LiteralParseNode jarPathNode : jarPaths) {
    +                          String jarPathName = (String) jarPathNode.getValue();
    +                          File f = new File(jarPathName);
    +                          Path dynamicJarsDirPathWithJar = new Path(dynamicJarsDir + f.getName());
    +                          // Copy the jar (can be local or on HDFS) to the hbase.dynamic.jars.dir directory.
    +                          // Note that this does not support HDFS URIs without scheme and authority.
    +                          Path jarPath = new Path(jarPathName);
    +                          FileUtil.copy(jarPath.getFileSystem(conf), jarPath, fs, dynamicJarsDirPathWithJar,
    --- End diff --
    
    Actually, doesn't this imply the client should have write perms to hbase.dynamic.jars.dir? We don't want to allow arbitrary clients to write there. 


---

[GitHub] phoenix pull request #292: PHOENIX-4231: Support restriction of remote UDF l...

Posted by aertoria <gi...@git.apache.org>.
Github user aertoria commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/292#discussion_r166487244
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixStatement.java ---
    @@ -907,10 +909,15 @@ public MutationState execute() throws SQLException {
                         try {
                             FileSystem fs = dynamicJarsDirPath.getFileSystem(conf);
                             List<LiteralParseNode> jarPaths = getJarPaths();
    -                        for (LiteralParseNode jarPath : jarPaths) {
    -                            File f = new File((String) jarPath.getValue());
    -                            fs.copyFromLocalFile(new Path(f.getAbsolutePath()), new Path(
    -                                    dynamicJarsDir + f.getName()));
    +                        for (LiteralParseNode jarPathNode : jarPaths) {
    +                          String jarPathName = (String) jarPathNode.getValue();
    +                          File f = new File(jarPathName);
    +                          Path dynamicJarsDirPathWithJar = new Path(dynamicJarsDir + f.getName());
    +                          // Copy the jar (can be local or on HDFS) to the hbase.dynamic.jars.dir directory.
    +                          // Note that this does not support HDFS URIs without scheme and authority.
    +                          Path jarPath = new Path(jarPathName);
    +                          FileUtil.copy(jarPath.getFileSystem(conf), jarPath, fs, dynamicJarsDirPathWithJar,
    +                            false, true, conf);
    --- End diff --
    
    @ChinmaySKulkarni 
    How is Hadoop.fileUtil.copy different from filesystem.copy


---

[GitHub] phoenix pull request #292: PHOENIX-4231: Support restriction of remote UDF l...

Posted by apurtell <gi...@git.apache.org>.
Github user apurtell commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/292#discussion_r166479267
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixStatement.java ---
    @@ -907,10 +909,15 @@ public MutationState execute() throws SQLException {
                         try {
                             FileSystem fs = dynamicJarsDirPath.getFileSystem(conf);
                             List<LiteralParseNode> jarPaths = getJarPaths();
    -                        for (LiteralParseNode jarPath : jarPaths) {
    -                            File f = new File((String) jarPath.getValue());
    -                            fs.copyFromLocalFile(new Path(f.getAbsolutePath()), new Path(
    -                                    dynamicJarsDir + f.getName()));
    +                        for (LiteralParseNode jarPathNode : jarPaths) {
    +                          String jarPathName = (String) jarPathNode.getValue();
    +                          File f = new File(jarPathName);
    +                          Path dynamicJarsDirPathWithJar = new Path(dynamicJarsDir + f.getName());
    +                          // Copy the jar (can be local or on HDFS) to the hbase.dynamic.jars.dir directory.
    +                          // Note that this does not support HDFS URIs without scheme and authority.
    +                          Path jarPath = new Path(jarPathName);
    +                          FileUtil.copy(jarPath.getFileSystem(conf), jarPath, fs, dynamicJarsDirPathWithJar,
    --- End diff --
    
    If the client does not have perms to write to hbase.dynamic.jars.dir (and I expect normally clients will not have write perms to this directory, only admin clients will have it), the copy will fail and throw an IOException. The result may not be user friendly, though. Did you try this? What happens? 


---

[GitHub] phoenix pull request #292: PHOENIX-4231: Support restriction of remote UDF l...

Posted by aertoria <gi...@git.apache.org>.
Github user aertoria commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/292#discussion_r170387933
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/expression/function/UDFExpression.java ---
    @@ -198,18 +199,26 @@ public static DynamicClassLoader getClassLoader(final PName tenantId, final Stri
                 }
                 return cl;
             } else {
    -            cl = pathSpecificCls.get(jarPath);
    -            if (cl == null) {
    -                Configuration conf = HBaseConfiguration.create(config);
    -                conf.set(DYNAMIC_JARS_DIR_KEY, parent);
    -                cl = new DynamicClassLoader(conf, UDFExpression.class.getClassLoader());
    -            }
    -            // Cache class loader as a weak value, will be GC'ed when no reference left
    -            DynamicClassLoader prev = pathSpecificCls.putIfAbsent(jarPath, cl);
    -            if (prev != null) {
    -                cl = prev;
    +            String rawPath = new Path(config.get(DYNAMIC_JARS_DIR_KEY)).toUri().getRawPath();
    +            // jarPath is provided as an HDFS URI without scheme and authority, but the jar is inside the configured hbase.dynamic.jars.dir
    +            if(rawPath.equals(parent)) {
    --- End diff --
    
    This if block seems will never be reachable. See line 188.  Aren't these two condition equivalent?


---

[GitHub] phoenix pull request #292: PHOENIX-4231: Support restriction of remote UDF l...

Posted by aertoria <gi...@git.apache.org>.
Github user aertoria commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/292#discussion_r166491289
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixStatement.java ---
    @@ -907,10 +909,15 @@ public MutationState execute() throws SQLException {
                         try {
                             FileSystem fs = dynamicJarsDirPath.getFileSystem(conf);
                             List<LiteralParseNode> jarPaths = getJarPaths();
    -                        for (LiteralParseNode jarPath : jarPaths) {
    -                            File f = new File((String) jarPath.getValue());
    -                            fs.copyFromLocalFile(new Path(f.getAbsolutePath()), new Path(
    -                                    dynamicJarsDir + f.getName()));
    +                        for (LiteralParseNode jarPathNode : jarPaths) {
    +                          String jarPathName = (String) jarPathNode.getValue();
    +                          File f = new File(jarPathName);
    +                          Path dynamicJarsDirPathWithJar = new Path(dynamicJarsDir + f.getName());
    +                          // Copy the jar (can be local or on HDFS) to the hbase.dynamic.jars.dir directory.
    +                          // Note that this does not support HDFS URIs without scheme and authority.
    +                          Path jarPath = new Path(jarPathName);
    +                          FileUtil.copy(jarPath.getFileSystem(conf), jarPath, fs, dynamicJarsDirPathWithJar,
    --- End diff --
    
    @apurtell 
    Since @ChinmaySKulkarni will be absent for a while, I will be wrap up this patch for the team.
    do you know how is hbase.dynamic.jars.dir used differently than hbase.local.jars.dir


---

[GitHub] phoenix pull request #292: PHOENIX-4231: Support restriction of remote UDF l...

Posted by ChinmaySKulkarni <gi...@git.apache.org>.
Github user ChinmaySKulkarni closed the pull request at:

    https://github.com/apache/phoenix/pull/292


---

[GitHub] phoenix pull request #292: PHOENIX-4231: Support restriction of remote UDF l...

Posted by aertoria <gi...@git.apache.org>.
Github user aertoria commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/292#discussion_r170755367
  
    --- Diff: phoenix-core/src/main/java/org/apache/phoenix/jdbc/PhoenixStatement.java ---
    @@ -907,10 +909,15 @@ public MutationState execute() throws SQLException {
                         try {
                             FileSystem fs = dynamicJarsDirPath.getFileSystem(conf);
                             List<LiteralParseNode> jarPaths = getJarPaths();
    -                        for (LiteralParseNode jarPath : jarPaths) {
    -                            File f = new File((String) jarPath.getValue());
    -                            fs.copyFromLocalFile(new Path(f.getAbsolutePath()), new Path(
    -                                    dynamicJarsDir + f.getName()));
    +                        for (LiteralParseNode jarPathNode : jarPaths) {
    +                          String jarPathName = (String) jarPathNode.getValue();
    +                          File f = new File(jarPathName);
    +                          Path dynamicJarsDirPathWithJar = new Path(dynamicJarsDir + f.getName());
    +                          // Copy the jar (can be local or on HDFS) to the hbase.dynamic.jars.dir directory.
    +                          // Note that this does not support HDFS URIs without scheme and authority.
    +                          Path jarPath = new Path(jarPathName);
    +                          FileUtil.copy(jarPath.getFileSystem(conf), jarPath, fs, dynamicJarsDirPathWithJar,
    --- End diff --
    
    The comment "Note that this does not support HDFS URIs without scheme and authority." I wonder where did you discover that.
    
    If I'm not mistaken FileUtil.copy internally calls fs.copyFromLocalFile without doing file.islocal check.


---

[GitHub] phoenix issue #292: PHOENIX-4231: Support restriction of remote UDF load sou...

Posted by aertoria <gi...@git.apache.org>.
Github user aertoria commented on the issue:

    https://github.com/apache/phoenix/pull/292
  
    Sent out another patch for review. Let us close this p.r.


---

[GitHub] phoenix issue #292: PHOENIX-4231: Support restriction of remote UDF load sou...

Posted by ChinmaySKulkarni <gi...@git.apache.org>.
Github user ChinmaySKulkarni commented on the issue:

    https://github.com/apache/phoenix/pull/292
  
    @apurtell @twdsilva @jtaylor-sfdc please review. Thanks.


---

[GitHub] phoenix issue #292: PHOENIX-4231: Support restriction of remote UDF load sou...

Posted by JamesRTaylor <gi...@git.apache.org>.
Github user JamesRTaylor commented on the issue:

    https://github.com/apache/phoenix/pull/292
  
    Thanks for the patch, @ChinmaySKulkarni. Best person to review is @chrajeshbabu who originally added support for UDFs.


---

[GitHub] phoenix pull request #292: PHOENIX-4231: Support restriction of remote UDF l...

Posted by aertoria <gi...@git.apache.org>.
Github user aertoria commented on a diff in the pull request:

    https://github.com/apache/phoenix/pull/292#discussion_r169213703
  
    --- Diff: phoenix-core/src/it/java/org/apache/phoenix/end2end/UserDefinedFunctionsIT.java ---
    @@ -322,6 +327,99 @@ public void testDeleteJar() throws Exception {
             assertFalse(rs.next());
         }
     
    +    /**
    +     * Test adding jars from an HDFS URI.
    +     * @throws Exception
    +     */
    +    @Test
    +    public void testAddJarsFromHDFS() throws Exception {
    +        compileTestClass(MY_ARRAY_INDEX_CLASS_NAME, MY_ARRAY_INDEX_PROGRAM, 7);
    +        Statement stmt = driver.connect(url, EMPTY_PROPS).createStatement();
    +        // Note that we have already added all locally created UDF jars to the hbase.dynamic.jars.dir directory
    +        ResultSet rs = stmt.executeQuery("list jars");
    +        int count = 0;
    +        while(rs.next()) {
    +            count++;
    +        }
    +        Path destJarPathOnHDFS = copyJarsFromDynamicJarsDirToDummyHDFSDir("myjar7.jar");
    +        stmt.execute("delete jar '"+ util.getConfiguration().get(QueryServices.DYNAMIC_JARS_DIR_KEY)+"/"+"myjar7.jar'");
    +        stmt.execute("add jars '" + destJarPathOnHDFS.toString() + "'");
    +        rs = stmt.executeQuery("list jars");
    +        int finalCount = 0;
    +        while(rs.next()) {
    +            finalCount++;
    +        }
    +        assertEquals(count, finalCount);
    --- End diff --
    
    original `list jars` equal to `list jars`, shouldn't the newly added jar make it plus 1? i.e.,
     assertEquals(count+1, finalCount);


---