Posted to commits@carbondata.apache.org by ch...@apache.org on 2017/03/08 15:28:27 UTC

[1/2] incubator-carbondata git commit: Documented how to handle bad records

Repository: incubator-carbondata
Updated Branches:
  refs/heads/master 9be5262ed -> 15874a042


Documented how to handle bad records

Resolved review comments for FAQs


Project: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/commit/33dabe45
Tree: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/tree/33dabe45
Diff: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/diff/33dabe45

Branch: refs/heads/master
Commit: 33dabe455fdaaba7cc06733faac82fa2fc706d77
Parents: 9be5262
Author: PallaviSingh1992 <pa...@yahoo.co.in>
Authored: Tue Feb 28 12:14:45 2017 +0530
Committer: chenliang613 <ch...@huawei.com>
Committed: Wed Mar 8 23:24:06 2017 +0800

----------------------------------------------------------------------
 docs/faq.md | 70 ++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 50 insertions(+), 20 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-carbondata/blob/33dabe45/docs/faq.md
----------------------------------------------------------------------
diff --git a/docs/faq.md b/docs/faq.md
index a72a96f..57ac171 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -18,30 +18,60 @@
 -->
 
 # FAQs
-* **Auto Compaction not Working**
 
-    The Property carbon.enable.auto.load.merge in carbon.properties need to be set to true.
+* [What are Bad Records?](#what-are-bad-records)
+* [Where are Bad Records Stored in CarbonData?](#where-are-bad-records-stored-in-carbondata)
+* [How to enable Bad Record Logging?](#how-to-enable-bad-record-logging)
+* [How to ignore the Bad Records?](#how-to-ignore-the-bad-records)
+* [How to specify store location while creating carbon session?](#how-to-specify-store-location-while-creating-carbon-session)
+* [What is Carbon Lock Type?](#what-is-carbon-lock-type)
+* [How to resolve Abstract Method Error?](#how-to-resolve-abstract-method-error)
 
-* **Getting Abstract method error**
+## What are Bad Records?
+Records that fail to load into CarbonData because of a data type incompatibility, an empty value, or an incompatible format are classified as Bad Records.
 
-    You need to specify the spark version while using Maven to build project.
+## Where are Bad Records Stored in CarbonData?
+Bad Records are stored at the location set by the carbon.badRecords.location property in the carbon.properties file.
+By default, **carbon.badRecords.location** is set to ``/opt/Carbon/Spark/badrecords``.
 
-* **Getting NotImplementedException for subquery using IN and EXISTS**
+## How to enable Bad Record Logging?
+The approach for handling Bad Records can be specified while loading data. To analyse the cause of the Bad Records, the parameter ``BAD_RECORDS_LOGGER_ENABLE`` must be set to ``TRUE``. The handling approach itself is specified by the parameter ``BAD_RECORDS_ACTION``.
 
-    Subquery with in and exists not supported in CarbonData.
-    
-* **Getting Exceptions on creating  a view**
-    
-    View not supported in CarbonData.
-    
-* **How to verify if ColumnGroups have been created as desired.**
+- To replace the incorrect values in the CSV rows with NULL and load the data into CarbonData, set the following in the query:
+```
+'BAD_RECORDS_ACTION'='FORCE'
+```
+
+- To write the Bad Records to the raw CSV file (at the location set in the parameter **carbon.badRecords.location**) without padding incorrect values with NULL, set the following in the query:
+```
+'BAD_RECORDS_ACTION'='REDIRECT'
+```
+
+## How to ignore the Bad Records?
+To ignore the Bad Records, so that they are not stored in the raw CSV file, set the following in the query:
+```
+'BAD_RECORDS_ACTION'='IGNORE'
+```
+
+## How to specify store location while creating carbon session?
+The store location specified while creating the carbon session is used by CarbonData to store metadata such as the schema, dictionary files, dictionary metadata, and sort indexes.
+
+Create the ``carbonsession`` with the ``storepath`` specified in the following manner:
+```
+val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession(<store_path>)
+```
+Example:
+```
+val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("hdfs://localhost:9000/carbon/store")
+```
+
+## What is Carbon Lock Type?
+Apache CarbonData acquires a lock on files to prevent concurrent operations from modifying the same files. The lock type depends on the storage location; for HDFS, specify the type HDFSLOCK. By default, it is set to LOCALLOCK.
+The carbon.lock.type property specifies the type of lock to be acquired during concurrent operations on a table. This property can be set to the following values:
+- **LOCALLOCK**: This lock is created as a file on the local file system. It is useful when only one Spark driver (thrift server) runs on a machine and no other CarbonData Spark application is launched concurrently.
+- **HDFSLOCK**: This lock is created as a file on the HDFS file system. It is useful when multiple CarbonData Spark applications are launched, no ZooKeeper is running on the cluster, and HDFS supports file-based locking.
+
+## How to resolve Abstract Method Error?
+To build the CarbonData project, it is necessary to specify the Spark profile, which sets the Spark version. Specify the ``spark version`` when using Maven to build the project.
 
-    Try using desc table query.
-    
-* **Did anyone try to run CarbonData on windows? Is it supported on Windows?**
-    
-    We may provide support for windows in future. You are welcome to contribute if you want to add the support :) 
-   
-    
-    
 

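As an illustration of the bad-record options documented in the diff above, a load statement might combine them as in the following sketch. It assumes CarbonData's `LOAD DATA ... OPTIONS(...)` syntax; the table name `sample_table` and the input path are hypothetical placeholders and do not come from the commit.

```sql
-- Hypothetical table name and input path, for illustration only.
LOAD DATA INPATH 'hdfs://localhost:9000/data/sample.csv'
INTO TABLE sample_table
OPTIONS(
  'BAD_RECORDS_LOGGER_ENABLE'='TRUE',  -- log the reason each record was rejected
  'BAD_RECORDS_ACTION'='REDIRECT'      -- write rejected rows under carbon.badRecords.location
)
```

Swapping `REDIRECT` for `FORCE` or `IGNORE` selects the other behaviours described in the FAQ.
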

[2/2] incubator-carbondata git commit: [CARBONDATA-714]Documented how to handle bad records This closes #614

Posted by ch...@apache.org.
[CARBONDATA-714]Documented how to handle bad records This closes #614


Project: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/commit/15874a04
Tree: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/tree/15874a04
Diff: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/diff/15874a04

Branch: refs/heads/master
Commit: 15874a0428f4bc2d0328d16369ed48ee80db266f
Parents: 9be5262 33dabe4
Author: chenliang613 <ch...@huawei.com>
Authored: Wed Mar 8 23:27:58 2017 +0800
Committer: chenliang613 <ch...@huawei.com>
Committed: Wed Mar 8 23:27:58 2017 +0800

----------------------------------------------------------------------
 docs/faq.md | 70 ++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 50 insertions(+), 20 deletions(-)
----------------------------------------------------------------------