Posted to issues@spark.apache.org by "pin_zhang (JIRA)" <ji...@apache.org> on 2015/04/16 10:04:58 UTC
[jira] [Commented] (SPARK-6923) Get invalid hive table columns after save DataFrame to hive table
[ https://issues.apache.org/jira/browse/SPARK-6923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497701#comment-14497701 ]
pin_zhang commented on SPARK-6923:
----------------------------------
With the Spark 1.1.0 client, getting the table schema through the JDBC API returns
age(bigint), id(string)
while with Spark 1.3.0 it returns {name=col, type=array<string>}.
That is not expected. The code used to read the schema:
// cnn is an open java.sql.Connection; database and table are Strings.
List<Map<String, String>> results = new ArrayList<>();
DatabaseMetaData meta = cnn.getMetaData();
ResultSet rsColumns = meta.getColumns(database, null, table, null);
while (rsColumns.next()) {
    // One map per column: its name and its type name.
    Map<String, String> col = new HashMap<>();
    col.put("name", rsColumns.getString("COLUMN_NAME"));
    String typeName = rsColumns.getString("TYPE_NAME");
    col.put("type", typeName);
    results.add(col);
}
rsColumns.close();
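The issue description notes that the result set metadata of a query such as "select * from test" does contain id and age. A minimal sketch of reading the columns that way, as a possible cross-check against the getColumns() result, could look like the following. The class name QueryColumns and the method describe are hypothetical names chosen for illustration; only standard java.sql calls are used, and the exact labels and type names returned depend on the JDBC driver.

```java
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Spark or Hive.
public class QueryColumns {
    // Collects name/type pairs from the metadata of an already executed query.
    public static List<Map<String, String>> describe(ResultSetMetaData md)
            throws SQLException {
        List<Map<String, String>> results = new ArrayList<>();
        // JDBC column indexes are 1-based.
        for (int i = 1; i <= md.getColumnCount(); i++) {
            Map<String, String> col = new LinkedHashMap<>();
            col.put("name", md.getColumnLabel(i));
            col.put("type", md.getColumnTypeName(i));
            results.add(col);
        }
        return results;
    }
}
```

Assuming stmt is a java.sql.Statement created from the same connection, it could be called as QueryColumns.describe(stmt.executeQuery("select * from test").getMetaData()) and the result compared with the list built from DatabaseMetaData.getColumns().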
> Get invalid hive table columns after save DataFrame to hive table
> -----------------------------------------------------------------
>
> Key: SPARK-6923
> URL: https://issues.apache.org/jira/browse/SPARK-6923
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.3.0
> Reporter: pin_zhang
>
> HiveContext hctx = new HiveContext(sc);
> List<String> sample = new ArrayList<String>();
> sample.add("{\"id\": \"id_1\", \"age\":1}");
> RDD<String> sampleRDD = new JavaSparkContext(sc).parallelize(sample).rdd();
> DataFrame df = hctx.jsonRDD(sampleRDD);
> String table = "test";
> df.saveAsTable(table, "json", SaveMode.Overwrite);
> Table t = hctx.catalog().client().getTable(table);
> System.out.println(t.getCols());
> --------------------------------------------------------------
> After saving the DataFrame to a Hive table with the code above,
> getting the table columns returns a single column named 'col':
> [FieldSchema(name:col, type:array<string>, comment:from deserializer)]
> The expected field schema is id, age.
> As a result, the JDBC API cannot retrieve the table columns via ResultSet DatabaseMetaData.getColumns(String catalog, String schemaPattern, String tableNamePattern, String columnNamePattern).
> However, the result set metadata for the query "select * from test" contains the fields id and age.