Posted to issues@kylin.apache.org by GitBox <gi...@apache.org> on 2019/12/24 11:46:59 UTC

[GitHub] [kylin] hit-lacus edited a comment on issue #996: KYLIN-4206 Add kylin supports aws glue catalog metastroeclient

hit-lacus edited a comment on issue #996: KYLIN-4206 Add kylin supports aws glue catalog metastroeclient
URL: https://github.com/apache/kylin/pull/996#issuecomment-568732240
 
 
   ## Kylin on AWS EMR (Glue)
   
   > This change adds support for the AWS Glue catalog to Kylin. The associated JIRA issue is [KYLIN-4206](https://issues.apache.org/jira/browse/KYLIN-4206).
   
   
   1. First, modify the aws-glue-data-catalog-client source code. The GitHub repository for aws-glue-data-catalog-client-for-apache-hive-metastore is [https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore](https://github.com/awslabs/aws-glue-data-catalog-client-for-apache-hive-metastore); see its README.MD for how to set up the development environment.
   I downloaded Hive 2.3.7 locally, so after following the steps in the README.MD file, the Hive version is 2.3.7-SNAPSHOT. Modify the pom.xml file in the project's top-level directory to set `<hive2.version>2.3.7-SNAPSHOT</hive2.version>` and `<spark-hive.version>1.2.1.spark2</spark-hive.version>`.
   
   2. Modify the class com.amazonaws.glue.catalog.metastore.AWSCatalogMetastoreClient in aws-glue-datacatalog-hive2-client.
   
   Implement the following method:
   ```java
   @Override
   public PartitionValuesResponse listPartitionValues(PartitionValuesRequest partitionValuesRequest)
           throws MetaException, TException, NoSuchObjectException {
       // Stub: always returns null.
       return null;
   }
   ```
   
   3. Modify the class com.amazonaws.glue.catalog.metastore.AWSCatalogMetastoreClient in aws-glue-datacatalog-spark-client.
   The problem is that the class contains a method that is not available in the parent class, so delete that method.
   
   Then copy the method from com.amazonaws.glue.catalog.metastore.AWSCatalogMetastoreClient in aws-glue-datacatalog-hive2-client, and add the following dependency to aws-glue-datacatalog-spark-client/pom.xml:
   
   ```xml
   <dependency>
     <groupId>org.apache.hive</groupId>
     <artifactId>hive-exec</artifactId>
     <version>${hive2.version}</version>
     <scope>provided</scope>
   </dependency>
   ```
   4. Package the project. Three modules need to be packaged.
   
   5. Copy the three packaged jars (aws-glue-datacatalog-client-common-1.10.0-SNAPSHOT.jar, aws-glue-datacatalog-hive2-client-1.10.0-SNAPSHOT.jar, and aws-glue-datacatalog-spark-client-1.10.0-SNAPSHOT.jar) to /kylin/lib.
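
As a quick sanity check for the copy step, the sketch below reports which of the three jars are missing from a lib directory. `GlueJarCheck` and `missingJars` are hypothetical names introduced here for illustration; they are not part of Kylin or the Glue client. The directory defaults to /kylin/lib and can be overridden with the first command-line argument, which is also an assumption.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Sketch only: a hypothetical helper, not part of Kylin or the Glue client.
class GlueJarCheck {
    // Jar names exactly as listed in step 5.
    static final List<String> EXPECTED_JARS = Arrays.asList(
            "aws-glue-datacatalog-client-common-1.10.0-SNAPSHOT.jar",
            "aws-glue-datacatalog-hive2-client-1.10.0-SNAPSHOT.jar",
            "aws-glue-datacatalog-spark-client-1.10.0-SNAPSHOT.jar");

    // Returns the expected jar names that do not appear in presentFileNames.
    static List<String> missingJars(Collection<String> presentFileNames) {
        List<String> missing = new ArrayList<>();
        for (String jar : EXPECTED_JARS) {
            if (!presentFileNames.contains(jar)) {
                missing.add(jar);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: the lib directory is the first argument, defaulting to /kylin/lib.
        File libDir = new File(args.length > 0 ? args[0] : "/kylin/lib");
        String[] names = libDir.list(); // null if libDir is not a readable directory
        List<String> present = names == null
                ? Collections.<String>emptyList()
                : Arrays.asList(names);
        List<String> missing = missingJars(present);
        System.out.println(missing.isEmpty()
                ? "All three Glue client jars found in " + libDir
                : "Missing from " + libDir + ": " + missing);
    }
}
```

The check only compares file names; it does not verify that the jars were built from the modified sources.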
   
   6. Modify the Kylin source code; see the changes submitted in this PR.
   Add the gluecatalog option to kylin.properties. The PR adds these commented-out lines:
   `##The default access HiveMetastoreClient is hcatalog. If AWS user and glue catalog is used, it can be configured as gluecatalog`
   `##kylin.source.hive.metadata-type=hcatalog`
   The default is hcatalog. To use Glue, set kylin.source.hive.metadata-type=gluecatalog.
   If gluecatalog is configured, the following property must also be set in hive-site.xml:
   
   ```xml
   <property>
     <name>hive.metastore.client.factory.class</name>
     <value>com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory</value>
   </property>
   ```
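
The default-versus-override behaviour of kylin.source.hive.metadata-type can be illustrated with a minimal sketch. This is not Kylin's actual implementation: `MetadataTypeSketch`, `resolveMetadataType`, and `needsGlueFactory` are hypothetical names, and trimming the value and matching it case-sensitively are assumptions. Only the property key and the values hcatalog and gluecatalog come from the description above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Sketch only: illustrates the documented default, NOT Kylin's real code path.
class MetadataTypeSketch {
    static final String KEY = "kylin.source.hive.metadata-type";
    static final String DEFAULT_TYPE = "hcatalog";

    // Returns the configured metadata type, or "hcatalog" when the key is absent.
    // Trimming surrounding whitespace is an assumption made for this sketch.
    static String resolveMetadataType(Properties kylinProps) {
        return kylinProps.getProperty(KEY, DEFAULT_TYPE).trim();
    }

    // True when gluecatalog is selected, i.e. when hive-site.xml also needs
    // hive.metastore.client.factory.class set to the Glue client factory.
    static boolean needsGlueFactory(Properties kylinProps) {
        return "gluecatalog".equals(resolveMetadataType(kylinProps));
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        // Stand-in for reading kylin.properties from disk.
        props.load(new StringReader("kylin.source.hive.metadata-type = gluecatalog\n"));
        System.out.println("metadata type: " + resolveMetadataType(props));
        System.out.println("needs Glue factory in hive-site.xml: " + needsGlueFactory(props));
    }
}
```

With an empty `Properties` object the sketch falls back to hcatalog, which mirrors the documented default.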
   
   
   
   
   7. Install on EMR.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services