Posted to issues@phoenix.apache.org by GitBox <gi...@apache.org> on 2020/08/14 08:46:00 UTC

[GitHub] [phoenix-connectors] richardantal opened a new pull request #26: PHOENIX-6076 Refactor Phoenix Hive Connectors introduced by PHOENIX-6057

richardantal opened a new pull request #26:
URL: https://github.com/apache/phoenix-connectors/pull/26


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [phoenix-connectors] stoty commented on pull request #26: PHOENIX-6076 Refactor Phoenix Hive Connectors introduced by PHOENIX-6057

Posted by GitBox <gi...@apache.org>.
stoty commented on pull request #26:
URL: https://github.com/apache/phoenix-connectors/pull/26#issuecomment-680982766


   The Java part basically looks good.
   
   My problem is that we are building the code in the -common modules, when there is no reason to do that. (This applies both to Hive and to the rest of the connectors.)
   
   The -common modules should really just define the common dependencies, and have compiling/testing disabled.
   
   You can base this on how it's done in Tephra.
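   
   A minimal sketch of what such a dependencies-only module's pom.xml might contain (the plugin configuration here is illustrative, not taken from Tephra or from this PR):
   
   ```
   <!-- declare the shared <dependencies> here, then switch off compilation
        and tests so the -common module builds nothing of its own -->
   <build>
       <plugins>
           <plugin>
               <groupId>org.apache.maven.plugins</groupId>
               <artifactId>maven-compiler-plugin</artifactId>
               <configuration>
                   <skipMain>true</skipMain> <!-- skips compile -->
                   <skip>true</skip>         <!-- skips testCompile -->
               </configuration>
           </plugin>
           <plugin>
               <groupId>org.apache.maven.plugins</groupId>
               <artifactId>maven-surefire-plugin</artifactId>
               <configuration>
                   <skipTests>true</skipTests>
               </configuration>
           </plugin>
       </plugins>
   </build>
   ```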





[GitHub] [phoenix-connectors] stoty commented on a change in pull request #26: PHOENIX-6076 Refactor Phoenix Hive Connectors introduced by PHOENIX-6057

Posted by GitBox <gi...@apache.org>.
stoty commented on a change in pull request #26:
URL: https://github.com/apache/phoenix-connectors/pull/26#discussion_r470619252



##########
File path: phoenix-hive-base/src/main/java/org/apache/phoenix/hive/util/PhoenixStorageHandlerUtilBase.java
##########
@@ -20,18 +20,27 @@
 import java.io.ByteArrayInputStream;

Review comment:
       Please mark the *Base classes abstract, to make sure no-one tries to instantiate them.
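
   A sketch of the intent, assuming the class currently has only the implicit public constructor:

   ```
   // abstract: shared code for the phoenix4/phoenix5 subclasses only,
   // never meant to be instantiated directly
   public abstract class PhoenixStorageHandlerUtilBase {
   }
   ```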

##########
File path: phoenix-hive-base/phoenix5-hive/src/main/java/org/apache/phoenix/hive/SpecificPhoenixSeriazer.java
##########
@@ -15,21 +15,21 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
-
 package org.apache.phoenix.hive;
 
-import static org.junit.Assert.fail;
-
-import org.junit.BeforeClass;
+import org.apache.hadoop.hive.common.type.Date;
+import org.apache.hadoop.hive.common.type.Timestamp;
 
-public class HiveMapReduceIT extends HivePhoenixStoreIT {
-
-    @BeforeClass
-    public static void setUpBeforeClass() throws Exception {
-        final String hadoopConfDir = System.getenv("HADOOP_CONF_DIR");
-        if (hadoopConfDir != null && hadoopConfDir.length() != 0) {
-            fail("HADOOP_CONF_DIR is non-empty in the current shell environment which will very likely cause this test to fail.");
+/**
+ * Serializer used in PhoenixSerDe and PhoenixRecordUpdater to produce Writable.
+ */
+public class SpecificPhoenixSeriazer {

Review comment:
       This is a pretty awkward name. 
   The diff seems to be a bit confused, but if this is a new class, try to find a better name for it.

##########
File path: phoenix-hive-base/pom.xml
##########
@@ -195,101 +180,75 @@
                 </exclusion>
             </exclusions>
         </dependency>
-        <dependency>
-            <groupId>org.eclipse.jetty</groupId>
-            <artifactId>jetty-util</artifactId>
-            <scope>test</scope>
-            <version>${jetty.version}</version>
-        </dependency>
-        <dependency>
-            <groupId>org.eclipse.jetty</groupId>
-            <artifactId>jetty-http</artifactId>
-            <scope>test</scope>
-            <version>${jetty.version}</version>
-        </dependency>
-        <dependency>
-            <groupId>org.eclipse.jetty</groupId>
-            <artifactId>jetty-server</artifactId>
-            <scope>test</scope>
-            <version>${jetty.version}</version>
-        </dependency>
         <dependency>
             <groupId>org.mockito</groupId>
             <artifactId>mockito-all</artifactId>
             <version>${mockito-all.version}</version>
             <scope>test</scope>
         </dependency>
-        <dependency>
-            <groupId>com.google.guava</groupId>
-            <artifactId>guava</artifactId>
-            <version>19.0</version>
-        </dependency>
-        <dependency>
-            <groupId>org.apache.calcite.avatica</groupId>
-            <artifactId>avatica</artifactId>
-            <!-- Overriding the version of Avatica that PQS uses so that Hive will work -->
-            <version>${avatica.version}</version>
-            <scope>test</scope>
-            <!-- And removing a bunch of dependencies that haven't been shaded in this older
-                 Avatica version which conflict with HDFS -->
-            <exclusions>
-                <exclusion>
-                    <groupId>org.hsqldb</groupId>
-                    <artifactId>hsqldb</artifactId>
-                </exclusion>
-                <exclusion>
-                    <groupId>com.fasterxml.jackson.core</groupId>
-                    <artifactId>jackson-databind</artifactId>
-                </exclusion>
-                <exclusion>
-                    <groupId>com.fasterxml.jackson.core</groupId>
-                    <artifactId>jackson-annotations</artifactId>
-                </exclusion>
-                <exclusion>
-                    <groupId>com.fasterxml.jackson.core</groupId>
-                    <artifactId>jackson-core</artifactId>
-                </exclusion>
-            </exclusions>
-        </dependency>
     </dependencies>
 
     <build>
         <plugins>
             <plugin>
                 <groupId>org.codehaus.mojo</groupId>
                 <artifactId>build-helper-maven-plugin</artifactId>
-            </plugin>
-            <plugin>
-                <groupId>org.apache.maven.plugins</groupId>
-                <artifactId>maven-failsafe-plugin</artifactId>
+                <version>3.0.0</version>
                 <executions>
                     <execution>
-                        <id>HBaseManagedTimeTests</id>
+                        <id>add-source</id>
+                        <phase>generate-sources</phase>
+                        <goals>
+                            <goal>add-source</goal>
+                        </goals>
                         <configuration>
-                            <encoding>UTF-8</encoding>
-                            <forkCount>1</forkCount>
-                            <runOrder>alphabetical</runOrder>
-                            <reuseForks>false</reuseForks>
-                            <argLine>-enableassertions -Xmx2000m -XX:MaxPermSize=256m
-                                -Djava.security.egd=file:/dev/./urandom
-                                "-Djava.library.path=${hadoop.library.path}${path.separator}${java.library.path}"
-                                -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./target/
-                                -Dorg.apache.hadoop.hbase.shaded.io.netty.packagePrefix=org.apache.hadoop.hbase.shaded.
-                            </argLine>
-                            <redirectTestOutputToFile>${test.output.tofile}
-                            </redirectTestOutputToFile>
-                            <testSourceDirectory>${basedir}/src/it/java</testSourceDirectory>
-                            <groups>org.apache.phoenix.end2end.HBaseManagedTimeTest</groups>
-                            <shutdown>kill</shutdown>
-                            <useSystemClassLoader>false</useSystemClassLoader>
+                            <sources>
+                                <source>${project.parent.basedir}/src/main/java</source>
+                            </sources>
                         </configuration>
+                    </execution>
+                    <execution>
+                        <id>add-test-source</id>
+                        <phase>generate-sources</phase>
                         <goals>
-                            <goal>integration-test</goal>
-                            <goal>verify</goal>
+                            <goal>add-test-source</goal>
                         </goals>
+                        <configuration>

Review comment:
       Probably depending on common would be better.
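
   As a sketch, something like this in the child modules (the coordinates are illustrative, and this assumes the shared test code is published as a test-jar):

   ```
   <dependency>
       <groupId>org.apache.phoenix</groupId>
       <artifactId>phoenix-hive-base</artifactId>
       <version>${project.version}</version>
       <type>test-jar</type>
       <scope>test</scope>
   </dependency>
   ```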

##########
File path: phoenix-hive-base/pom.xml
##########
@@ -195,101 +180,75 @@
                 </exclusion>
             </exclusions>
         </dependency>
-        <dependency>
-            <groupId>org.eclipse.jetty</groupId>
-            <artifactId>jetty-util</artifactId>
-            <scope>test</scope>
-            <version>${jetty.version}</version>
-        </dependency>
-        <dependency>
-            <groupId>org.eclipse.jetty</groupId>
-            <artifactId>jetty-http</artifactId>
-            <scope>test</scope>
-            <version>${jetty.version}</version>
-        </dependency>
-        <dependency>
-            <groupId>org.eclipse.jetty</groupId>
-            <artifactId>jetty-server</artifactId>
-            <scope>test</scope>
-            <version>${jetty.version}</version>
-        </dependency>
         <dependency>
             <groupId>org.mockito</groupId>
             <artifactId>mockito-all</artifactId>
             <version>${mockito-all.version}</version>
             <scope>test</scope>
         </dependency>
-        <dependency>
-            <groupId>com.google.guava</groupId>
-            <artifactId>guava</artifactId>
-            <version>19.0</version>
-        </dependency>
-        <dependency>
-            <groupId>org.apache.calcite.avatica</groupId>
-            <artifactId>avatica</artifactId>
-            <!-- Overriding the version of Avatica that PQS uses so that Hive will work -->
-            <version>${avatica.version}</version>
-            <scope>test</scope>
-            <!-- And removing a bunch of dependencies that haven't been shaded in this older
-                 Avatica version which conflict with HDFS -->
-            <exclusions>
-                <exclusion>
-                    <groupId>org.hsqldb</groupId>
-                    <artifactId>hsqldb</artifactId>
-                </exclusion>
-                <exclusion>
-                    <groupId>com.fasterxml.jackson.core</groupId>
-                    <artifactId>jackson-databind</artifactId>
-                </exclusion>
-                <exclusion>
-                    <groupId>com.fasterxml.jackson.core</groupId>
-                    <artifactId>jackson-annotations</artifactId>
-                </exclusion>
-                <exclusion>
-                    <groupId>com.fasterxml.jackson.core</groupId>
-                    <artifactId>jackson-core</artifactId>
-                </exclusion>
-            </exclusions>
-        </dependency>
     </dependencies>
 
     <build>
         <plugins>
             <plugin>
                 <groupId>org.codehaus.mojo</groupId>
                 <artifactId>build-helper-maven-plugin</artifactId>
-            </plugin>
-            <plugin>
-                <groupId>org.apache.maven.plugins</groupId>
-                <artifactId>maven-failsafe-plugin</artifactId>
+                <version>3.0.0</version>
                 <executions>
                     <execution>
-                        <id>HBaseManagedTimeTests</id>
+                        <id>add-source</id>
+                        <phase>generate-sources</phase>
+                        <goals>
+                            <goal>add-source</goal>
+                        </goals>
                         <configuration>
-                            <encoding>UTF-8</encoding>
-                            <forkCount>1</forkCount>
-                            <runOrder>alphabetical</runOrder>
-                            <reuseForks>false</reuseForks>
-                            <argLine>-enableassertions -Xmx2000m -XX:MaxPermSize=256m
-                                -Djava.security.egd=file:/dev/./urandom
-                                "-Djava.library.path=${hadoop.library.path}${path.separator}${java.library.path}"
-                                -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=./target/
-                                -Dorg.apache.hadoop.hbase.shaded.io.netty.packagePrefix=org.apache.hadoop.hbase.shaded.
-                            </argLine>
-                            <redirectTestOutputToFile>${test.output.tofile}
-                            </redirectTestOutputToFile>
-                            <testSourceDirectory>${basedir}/src/it/java</testSourceDirectory>
-                            <groups>org.apache.phoenix.end2end.HBaseManagedTimeTest</groups>
-                            <shutdown>kill</shutdown>
-                            <useSystemClassLoader>false</useSystemClassLoader>
+                            <sources>
+                                <source>${project.parent.basedir}/src/main/java</source>
+                            </sources>
                         </configuration>
+                    </execution>
+                    <execution>
+                        <id>add-test-source</id>
+                        <phase>generate-sources</phase>
                         <goals>
-                            <goal>integration-test</goal>
-                            <goal>verify</goal>
+                            <goal>add-test-source</goal>
                         </goals>
+                        <configuration>
+                            <sources>
+                                <source>${project.parent.basedir}/src/it/java</source>
+                                <source>${project.basedir}/src/it/java</source>
+                                <source>${project.parent.basedir}/src/test/java</source>
+                            </sources>
+                        </configuration>
                     </execution>
                 </executions>
             </plugin>
+            <plugin>

Review comment:
       I think that in this case we simply want to depend on -base, instead of copying stuff from it.
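
   A sketch of what the child modules would declare instead (the coordinates are illustrative, and this assumes -base is built as a jar rather than pom packaging):

   ```
   <dependency>
       <groupId>org.apache.phoenix</groupId>
       <artifactId>phoenix-hive-base</artifactId>
       <version>${project.version}</version>
   </dependency>
   ```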

##########
File path: phoenix-hive-base/src/main/java/org/apache/phoenix/hive/PhoenixSerializer.java
##########
@@ -167,6 +168,8 @@ public Writable serialize(Object values, ObjectInspector objInspector, DmlType d
                             value = ((HiveDecimal) value).bigDecimalValue();
                         } else if (value instanceof HiveChar) {
                             value = ((HiveChar) value).getValue().trim();
+                        } else if (CompatUtil.DateAndTimestampSupport()){
+                            value = SpecificPhoenixSeriazer.GetValue(value);

Review comment:
       Fix the typo ("Seriazer" -> "Serializer") while you're at it.

##########
File path: phoenix-hive-base/pom.xml
##########
@@ -29,42 +29,46 @@
         <artifactId>phoenix-connectors</artifactId>
         <version>6.0.0-SNAPSHOT</version>
     </parent>
-    <artifactId>phoenix-hive3</artifactId>
-    <name>Phoenix Hive Connector for Phoenix 5</name>
+    <artifactId>phoenix-hive-base</artifactId>
+    <name>Phoenix Hive Connector - Base</name>
+
+    <packaging>pom</packaging>
+    <modules>
+        <module>phoenix4-hive</module>
+        <module>phoenix5-hive</module>
+    </modules>
+
     <properties>
         <test.tmp.dir>${project.build.directory}/tmp</test.tmp.dir>
-        <netty.version>4.1.47.Final</netty.version>
-        <phoenix.version>5.1.0-SNAPSHOT</phoenix.version>
-        <hbase.version>2.2.4</hbase.version>
-        <hadoop.version>3.0.3</hadoop.version>
-        <avatica.version>1.12.0</avatica.version>
-        <hive3.version>3.1.2</hive3.version>
-        <curator.version>4.0.0</curator.version>
+        <netty.version>4.0.52.Final</netty.version>

Review comment:
       These dependencies are pretty ancient.
   Use the Hive 3 dependency versions, if you can.
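
   Concretely, that would mean keeping the values this diff removes rather than downgrading, e.g.:

   ```
   <netty.version>4.1.47.Final</netty.version>
   <hbase.version>2.2.4</hbase.version>
   <hadoop.version>3.0.3</hadoop.version>
   <hive3.version>3.1.2</hive3.version>
   ```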

##########
File path: phoenix-hive-base/src/main/java/org/apache/phoenix/hive/PhoenixSerializer.java
##########
@@ -167,6 +168,8 @@ public Writable serialize(Object values, ObjectInspector objInspector, DmlType d
                             value = ((HiveDecimal) value).bigDecimalValue();
                         } else if (value instanceof HiveChar) {
                             value = ((HiveChar) value).getValue().trim();
+                        } else if (CompatUtil.DateAndTimestampSupport()){

Review comment:
       Nice solution, but it doesn't follow the Java naming conventions.
   Use .isDateAndTimeSupported() instead.
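
   That is, something along these lines (the return value is per compat module and shown only to illustrate the naming):

   ```
   public static boolean isDateAndTimeSupported() {
       return true; // e.g. true in one compat module, false in the other
   }
   ```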







[GitHub] [phoenix-connectors] stoty commented on pull request #26: PHOENIX-6076 Refactor Phoenix Hive Connectors introduced by PHOENIX-6057

Posted by GitBox <gi...@apache.org>.
stoty commented on pull request #26:
URL: https://github.com/apache/phoenix-connectors/pull/26#issuecomment-682368843


   I'll wait for the checks to pass before committing.
   





[GitHub] [phoenix-connectors] stoty closed pull request #26: PHOENIX-6076 Refactor Phoenix Hive Connectors introduced by PHOENIX-6057

Posted by GitBox <gi...@apache.org>.
stoty closed pull request #26:
URL: https://github.com/apache/phoenix-connectors/pull/26


   





[GitHub] [phoenix-connectors] stoty commented on a change in pull request #26: PHOENIX-6076 Refactor Phoenix Hive Connectors introduced by PHOENIX-6057

Posted by GitBox <gi...@apache.org>.
stoty commented on a change in pull request #26:
URL: https://github.com/apache/phoenix-connectors/pull/26#discussion_r478834431



##########
File path: phoenix4-compat/src/main/java/org/apache/phoenix/compat/CompatUtil.java
##########
@@ -22,12 +22,14 @@
 import org.apache.hadoop.hbase.client.HBaseAdmin;
 import org.apache.hadoop.hbase.client.RegionLocator;
 import org.apache.hadoop.hbase.util.RegionSizeCalculator;
-import org.junit.Assert;
 
 import java.io.IOException;
 
 public class CompatUtil {
 
+    public static boolean PHOENIX4ONLY = true;

Review comment:
       I'd prefer
   
   ```
   public static boolean isPhoenix4() {
       return true;
   }
   ```
   instead
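   
   A call site then reads naturally as `if (CompatUtil.isPhoenix4()) { ... }`, and the JIT can still fold the constant.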

##########
File path: phoenix-hive-base/phoenix4-hive/src/main/java/org/apache/phoenix/compat/HiveCompatUtil.java
##########
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.compat;
+
+import org.apache.commons.logging.Log;
+import org.apache.hadoop.hive.conf.HiveConf;
+import org.apache.hadoop.hive.ql.QueryState;
+import org.apache.hadoop.hive.ql.io.AcidOutputFormat;
+import org.apache.hadoop.hive.ql.metadata.Hive;
+import org.apache.hadoop.hive.ql.metadata.HiveMaterializedViewsRegistry;
+import org.apache.hadoop.hive.ql.parse.ParseDriver;
+import org.apache.hadoop.hive.ql.parse.SemanticAnalyzer;
+import org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc;
+import org.apache.hadoop.hive.ql.session.SessionState;
+
+import java.io.File;
+import java.lang.reflect.Method;
+import java.net.MalformedURLException;
+import java.net.URL;
+import java.util.concurrent.atomic.AtomicReference;
+
+public class HiveCompatUtil {
+
+    private HiveCompatUtil() {
+        // Not to be instantiated
+    }
+
+    public static ExprNodeGenericFuncDesc getComparisonExpr(ExprNodeGenericFuncDesc comparisonExpr, boolean isNot) {
+        return comparisonExpr;
+    }
+
+    public static String getOptionsValue(AcidOutputFormat.Options options, AtomicReference<Method> GET_BUCKET_METHOD_REF, AtomicReference<Method> GET_BUCKET_ID_METHOD_REF, Log LOG) {
+        StringBuilder content = new StringBuilder();
+
+        int bucket = options.getBucket();
+        String inspectorInfo = options.getInspector().getCategory() + ":" + options.getInspector()
+                .getTypeName();
+        long maxTxnId = options.getMaximumTransactionId();
+        long minTxnId = options.getMinimumTransactionId();
+        int recordIdColumn = options.getRecordIdColumn();
+        boolean isCompresses = options.isCompressed();
+        boolean isWritingBase = options.isWritingBase();
+
+        content.append("bucket : ").append(bucket).append(", inspectorInfo : ").append
+                (inspectorInfo).append(", minTxnId : ").append(minTxnId).append(", maxTxnId : ")
+                .append(maxTxnId).append(", recordIdColumn : ").append(recordIdColumn);
+        content.append(", isCompressed : ").append(isCompresses).append(", isWritingBase : ")
+                .append(isWritingBase);
+
+        return content.toString();
+    }
+
+    public static Object GetDateOrTimestampValue(Object value){

Review comment:
       Please keep to the Java naming conventions in new code (do not capitalize method names).
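
   That is, renamed along these lines (signature only; the body stays as in the diff above):

   ```
   public static Object getDateOrTimestampValue(Object value) { ... }
   ```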







[GitHub] [phoenix-connectors] richardantal commented on pull request #26: PHOENIX-6076 Refactor Phoenix Hive Connectors introduced by PHOENIX-6057

Posted by GitBox <gi...@apache.org>.
richardantal commented on pull request #26:
URL: https://github.com/apache/phoenix-connectors/pull/26#issuecomment-682369491


   Thank you @stoty for your thorough review (as always)! :) 

