Posted to dev@carbondata.apache.org by "Sanoj MG (JIRA)" <ji...@apache.org> on 2017/04/08 21:26:42 UTC
[jira] [Created] (CARBONDATA-888) Dictionary include / exclude option in dataframe writer
Sanoj MG created CARBONDATA-888:
-----------------------------------
Summary: Dictionary include / exclude option in dataframe writer
Key: CARBONDATA-888
URL: https://issues.apache.org/jira/browse/CARBONDATA-888
Project: CarbonData
Issue Type: Improvement
Components: spark-integration
Affects Versions: 1.2.0-incubating
Environment: HDP 2.5, Spark 1.6
Reporter: Sanoj MG
Priority: Minor
Fix For: 1.2.0-incubating
When creating a CarbonData table from a DataFrame, it is currently not possible to specify which columns should be included in or excluded from the dictionary. An option is needed to specify this, as below:
df.write.format("carbondata")
  .option("tableName", "test")
  .option("compress", "true")
  .option("dictionary_include", "incol1,intcol2")
  .option("dictionary_exclude", "stringcol1,stringcol2")
  .mode(SaveMode.Overwrite)
  .save()
We have a lot of integer columns that are dimensions. dataframe.save is used to quickly create tables instead of writing DDL, so it would be nice to have this feature when running POCs.
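
For illustration only, here is a minimal sketch of how the two proposed options might be carried over into the CREATE TABLE statement issued on save. It assumes the writer builds a DDL string and that the options map onto the existing DICTIONARY_INCLUDE / DICTIONARY_EXCLUDE table properties. The object DictionaryOptionsSketch and its methods are hypothetical names, not part of CarbonData, and the actual implementation may look quite different.

```scala
// Hypothetical helper, not CarbonData code: passes the proposed writer
// options through to the TBLPROPERTIES clause of a generated CREATE TABLE.
object DictionaryOptionsSketch {
  // Option keys as proposed in this issue.
  private val PassThroughKeys = Seq("dictionary_include", "dictionary_exclude")

  /** Builds the TBLPROPERTIES clause; returns "" if neither option is set. */
  def tblProperties(options: Map[String, String]): String = {
    val props = PassThroughKeys.flatMap { key =>
      options.get(key).map(_.trim).filter(_.nonEmpty).map { cols =>
        s"'${key.toUpperCase}'='$cols'"
      }
    }
    if (props.isEmpty) "" else props.mkString("TBLPROPERTIES (", ", ", ")")
  }

  /** Assembles the full statement from (column name, SQL type) pairs. */
  def createTable(table: String,
                  columns: Seq[(String, String)],
                  options: Map[String, String]): String = {
    val cols = columns.map { case (name, sqlType) => s"$name $sqlType" }.mkString(", ")
    Seq(
      s"CREATE TABLE IF NOT EXISTS $table ($cols)",
      "STORED BY 'carbondata'", // assumed storage clause
      tblProperties(options)
    ).filter(_.nonEmpty).mkString(" ")
  }
}
```

With options such as Map("dictionary_include" -> "intcol2", "dictionary_exclude" -> "stringcol1"), createTable would append both properties to the statement. Options that are not set are simply left out, so existing callers would see no change.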
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)