Posted to issues@spark.apache.org by "Andriy Kushnir (JIRA)" <ji...@apache.org> on 2017/10/24 16:38:00 UTC

[jira] [Commented] (SPARK-9686) Spark Thrift server doesn't return correct JDBC metadata

    [ https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217201#comment-16217201 ] 

Andriy Kushnir commented on SPARK-9686:
---------------------------------------

[~rxin], I did some research into this error.
To invoke {{run()}} → {{runInternal()}} on any {{org.apache.hive.service.cli.operation.Operation}} (for example, {{GetSchemasOperation}}), we need an {{IMetaStoreClient}}. Currently it is obtained from the {{HiveSession}} instance:
{code:java}
public class GetSchemasOperation extends MetadataOperation {
    @Override
    public void runInternal() throws HiveSQLException {
        IMetaStoreClient metastoreClient = getParentSession().getMetaStoreClient();
    }
}
{code}

All opened {{HiveSession}}s are managed by the {{org.apache.hive.service.cli.session.SessionManager}} instance.
{{SessionManager}}, among others, implements the {{org.apache.hive.service.Service}} interface, and all {{Service}}s are initialized with the same Hive configuration:
{code:java}
public interface Service { 
    void init(HiveConf conf);
}
{code}
When {{org.apache.spark.sql.hive.thriftserver.HiveThriftServer2}} is initialized, all {{org.apache.hive.service.CompositeService}}s receive the same {{HiveConf}}:

{code:java}
private[hive] class HiveThriftServer2(sqlContext: SQLContext) extends HiveServer2 with ReflectedCompositeService {
    override def init(hiveConf: HiveConf) {
        initCompositeService(hiveConf)
    }
}

object HiveThriftServer2 extends Logging {
    @DeveloperApi
    def startWithContext(sqlContext: SQLContext): Unit = {
        val server = new HiveThriftServer2(sqlContext)

        val executionHive = HiveUtils.newClientForExecution(
          sqlContext.sparkContext.conf,
          sqlContext.sessionState.newHadoopConf())

        server.init(executionHive.conf)
    }
}
{code}

So {{HiveUtils#newClientForExecution()}} returns an implementation of {{IMetaStoreClient}} that *always* points to a local Derby metastore (see the docstrings and comments in {{org.apache.spark.sql.hive.HiveUtils#newTemporaryConfiguration()}}).

IMHO, to get correct metadata we need to additionally create another {{IMetaStoreClient}} with {{newClientForMetadata()}} and pass its {{HiveConf}} to the underlying {{Service}}s.
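A rough sketch of that idea, assuming {{HiveUtils.newClientForMetadata()}} is reachable from the thrift-server code and that the returned client exposes its {{HiveConf}} the same way the execution client does (both are assumptions, not verified against the actual access modifiers):
{code:java}
object HiveThriftServer2 extends Logging {
    @DeveloperApi
    def startWithContext(sqlContext: SQLContext): Unit = {
        val server = new HiveThriftServer2(sqlContext)

        // Hypothetical: build a client against the configured metastore
        // instead of the temporary Derby one from newClientForExecution().
        val metadataHive = HiveUtils.newClientForMetadata(
          sqlContext.sparkContext.conf,
          sqlContext.sessionState.newHadoopConf())

        // Initializing the composite services with this HiveConf would let
        // GetSchemasOperation and friends see the real metastore.
        server.init(metadataHive.conf)
    }
}
{code}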

> Spark Thrift server doesn't return correct JDBC metadata 
> ---------------------------------------------------------
>
>                 Key: SPARK-9686
>                 URL: https://issues.apache.org/jira/browse/SPARK-9686
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.4.0, 1.4.1, 1.5.0, 1.5.1, 1.5.2
>            Reporter: pin_zhang
>            Priority: Critical
>         Attachments: SPARK-9686.1.patch.txt
>
>
> 1. Start start-thriftserver.sh
> 2. Connect with beeline
> 3. Create a table
> 4. Show tables; the newly created table is returned
> 5. Run the following JDBC code:
>         Class.forName("org.apache.hive.jdbc.HiveDriver");
>         String URL = "jdbc:hive2://localhost:10000/default";
>         Properties info = new Properties();
>         Connection conn = DriverManager.getConnection(URL, info);
>         ResultSet tables = conn.getMetaData().getTables(conn.getCatalog(),
>                 null, null, null);
> Problem:
>            No tables are returned by this API; this worked in Spark 1.3.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org