Posted to issues@carbondata.apache.org by ajantha-bhat <gi...@git.apache.org> on 2018/04/24 08:22:53 UTC

[GitHub] carbondata pull request #2220: [CARBONDATA-2369] FAQ update related to carbo...

GitHub user ajantha-bhat opened a pull request:

    https://github.com/apache/carbondata/pull/2220

    [CARBONDATA-2369] FAQ update related to carbon SDK scenario

    [CARBONDATA-2369] FAQ update related to carbon SDK scenario
    
     - [ ] Any interfaces changed? no
     
     - [ ] Any backward compatibility impacted? no
     
     - [ ] Document update required? yes, updated
    
     - [ ] Testing done. NA
           
    
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ajantha-bhat/carbondata faq

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/2220.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2220
    
----
commit 6ee180f0c4f11207e73b19a46ab48ba01ec7128a
Author: ajantha-bhat <aj...@...>
Date:   2018-04-24T08:20:35Z

    [CARBONDATA-2369] FAQ update related to SDK scenario

----


---

[GitHub] carbondata issue #2220: [CARBONDATA-2369] FAQ update related to carbon SDK s...

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on the issue:

    https://github.com/apache/carbondata/pull/2220
  
    This will be handled in #2198.
    
    No need for a separate PR.


---

[GitHub] carbondata issue #2220: [CARBONDATA-2369] FAQ update related to carbon SDK s...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2220
  
    Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5366/



---

[GitHub] carbondata issue #2220: [CARBONDATA-2369] FAQ update related to carbon SDK s...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/2220
  
    Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4197/



---

[GitHub] carbondata issue #2220: [CARBONDATA-2369] FAQ update related to carbon SDK s...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/2220
  
    SDV Build Success, Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4508/



---

[GitHub] carbondata pull request #2220: [CARBONDATA-2369] FAQ update related to carbo...

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2220#discussion_r183928302
  
    --- Diff: docs/faq.md ---
    @@ -182,3 +183,15 @@ select cntry,sum(gdp) from gdp21,pop1 where cntry=ctry group by cntry;
     ## Why do all executors show success in the Spark UI even after the data load command failed on the driver side?
     A Spark executor normally marks a task as failed only after the maximum number of retry attempts. However, when the data being loaded contains bad records and BAD_RECORDS_ACTION (carbon.bad.records.action) is set to “FAIL”, the task attempts only once and signals failure to the driver instead of throwing an exception to trigger a retry, since there is no point in retrying once a bad record is found and BAD_RECORDS_ACTION is set to fail. Hence the Spark executor displays this single attempt as successful even though the command has actually failed to execute. Check the task attempts or executor logs to find the failure reason.
     
    +## Why does a select query on SDK writer output return results in a different time zone?
    +The SDK writer is an independent entity, so it can generate carbondata files on a non-cluster machine whose time zone differs from the cluster's. When the cluster reads those files, it always applies the cluster's time zone, so the values of timestamp and date datatype fields are not the original values.
    +If you do not want the values adjusted for the time-zone difference, set the cluster's time zone in the SDK writer by calling the below API.
    --- End diff --
    
    Done. Will take these changes in #2198.


---

[GitHub] carbondata pull request #2220: [CARBONDATA-2369] FAQ update related to carbo...

Posted by ajantha-bhat <gi...@git.apache.org>.
Github user ajantha-bhat closed the pull request at:

    https://github.com/apache/carbondata/pull/2220


---

[GitHub] carbondata issue #2220: [CARBONDATA-2369] FAQ update related to carbon SDK s...

Posted by sgururajshetty <gi...@git.apache.org>.
Github user sgururajshetty commented on the issue:

    https://github.com/apache/carbondata/pull/2220
  
    LGTM


---

[GitHub] carbondata pull request #2220: [CARBONDATA-2369] FAQ update related to carbo...

Posted by gvramana <gi...@git.apache.org>.
Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2220#discussion_r183798281
  
    --- Diff: docs/faq.md ---
    @@ -182,3 +183,15 @@ select cntry,sum(gdp) from gdp21,pop1 where cntry=ctry group by cntry;
     ## Why do all executors show success in the Spark UI even after the data load command failed on the driver side?
     A Spark executor normally marks a task as failed only after the maximum number of retry attempts. However, when the data being loaded contains bad records and BAD_RECORDS_ACTION (carbon.bad.records.action) is set to “FAIL”, the task attempts only once and signals failure to the driver instead of throwing an exception to trigger a retry, since there is no point in retrying once a bad record is found and BAD_RECORDS_ACTION is set to fail. Hence the Spark executor displays this single attempt as successful even though the command has actually failed to execute. Check the task attempts or executor logs to find the failure reason.
     
    +## Why does a select query on SDK writer output return results in a different time zone?
    +The SDK writer is an independent entity, so it can generate carbondata files on a non-cluster machine whose time zone differs from the cluster's. When the cluster reads those files, it always applies the cluster's time zone, so the values of timestamp and date datatype fields are not the original values.
    +If you do not want the values adjusted for the time-zone difference, set the cluster's time zone in the SDK writer by calling the below API.
    --- End diff --
    
    If you want to control the time zone of the data while writing, set the cluster's time zone in the SDK writer by calling the below API.
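
    A minimal sketch of the kind of call this suggestion likely refers to, assuming the standard JVM java.util.TimeZone API; the zone ID "Asia/Shanghai" is only an illustrative placeholder for the cluster's actual time zone:

        import java.util.TimeZone;

        public class WriterTimeZoneExample {
            public static void main(String[] args) {
                // Set the JVM default time zone to the cluster's zone before
                // invoking the SDK writer, so that timestamp and date values
                // are encoded consistently with how the cluster will read them.
                TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"));
                // ... build and use the SDK writer here ...
            }
        }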


---