Posted to issues@carbondata.apache.org by GitBox <gi...@apache.org> on 2020/10/08 10:55:30 UTC

[GitHub] [carbondata] VenuReddy2103 opened a new pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

VenuReddy2103 opened a new pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972


   ### Why is this PR needed?
   At present, when we run `insert into table select from` or `create table as select from`, we launch a single task per node. In contrast, for a simple `select * from table` query, the number of tasks launched equals the number of carbondata files (the CARBON_TASK_DISTRIBUTION default is CARBON_TASK_DISTRIBUTION_BLOCK).

   This slows down the load performance of the insert into select and ctas cases.
   Refer to the [community discussion regarding task launch](http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-Query-Regarding-Task-launch-mechanism-for-data-load-operations-tt98711.html)

   ### What changes were proposed in this PR?
   Launch the same number of tasks as the select query for the insert into select and ctas cases when the target table is no_sort.
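
The difference in task counts described above can be illustrated with a minimal sketch. This is a toy model only, not CarbonData's scheduling code: `DataFile`, `TaskCountSketch`, `tasksPerNode`, and `tasksPerFile` are hypothetical names introduced for illustration, and the file-to-node placement is made up.

```scala
// Hypothetical model of the two task-launch strategies described in this PR.
// None of these names come from CarbonData itself.
final case class DataFile(path: String, node: String)

object TaskCountSketch {
  // Behaviour described for insert into select / ctas before this PR:
  // one task per node that holds data.
  def tasksPerNode(files: Seq[DataFile]): Int =
    files.map(_.node).distinct.size

  // Behaviour described for a plain select with block-level distribution:
  // one task per carbondata file.
  def tasksPerFile(files: Seq[DataFile]): Int =
    files.size

  def main(args: Array[String]): Unit = {
    // Made-up placement: four files spread over two nodes.
    val files = Seq(
      DataFile("part-0.carbondata", "node1"),
      DataFile("part-1.carbondata", "node1"),
      DataFile("part-2.carbondata", "node2"),
      DataFile("part-3.carbondata", "node2")
    )
    println(s"tasks per node: ${tasksPerNode(files)}")
    println(s"tasks per file: ${tasksPerFile(files)}")
  }
}
```

In this toy placement the per-node strategy launches fewer tasks than the per-file strategy, which is the gap the PR proposes to close for no_sort target tables. Whether more tasks actually load faster depends on the workload.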
       
    ### Does this PR introduce any user interface change?
    - No
   
    ### Is any new testcase added?
    - No
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [carbondata] akashrn5 commented on pull request #3972: [CARBONDATA-4042]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
akashrn5 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-717401461


   LGTM





[GitHub] [carbondata] VenuReddy2103 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
VenuReddy2103 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-708269487


   retest this please





[GitHub] [carbondata] QiangCai edited a comment on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
QiangCai edited a comment on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-714861784


   TestSIWithSecondaryIndex
   1. change line 92, add order by
   `
   checkAnswerWithoutSort(sql("select id, country from table1_index order by id"),
         Seq(Row("1", "india"), Row("2", "china")))
   `
   
   2. change line 115, add order by





[GitHub] [carbondata] QiangCai commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
QiangCai commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-713264026


   retest this please





[GitHub] [carbondata] VenuReddy2103 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
VenuReddy2103 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-705575486


   retest this please





[GitHub] [carbondata] ajantha-bhat commented on pull request #3972: [CARBONDATA-4042]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-719310576


   @VenuReddy2103: Do you have a performance benchmark for this change?
   
   I once tried sending no sort [1 node 1 task] to the global sort flow [launch more tasks] and observed a performance degradation for a 15GB insert into the TPCH lineitem table. So I suggest you check the performance with this change.





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-705549859


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2589/
   











[GitHub] [carbondata] QiangCai commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
QiangCai commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-706830943


   How about load data of no_sort?
   
   





[GitHub] [carbondata] VenuReddy2103 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
VenuReddy2103 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-706748561


   retest this please





[GitHub] [carbondata] asfgit closed pull request #3972: [CARBONDATA-4042]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972


   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-706705226


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4353/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [CARBONDATA-4042]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-718612012


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2970/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [CARBONDATA-4042]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-715138306


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2899/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-706765963


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4354/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-714794831


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4647/
   





[GitHub] [carbondata] ajantha-bhat commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
ajantha-bhat commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-714925420


   @QiangCai, @VenuReddy2103: Not just line 95. In this file, `TestSIWithSecondaryIndex`, look for all uses of checkAnswerWithoutSort and replace them with checkAnswer, as the former can cause random failures depending on which query task finishes first.
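
A minimal sketch of why an order-sensitive comparison becomes flaky once rows arrive from several tasks. This is illustrative only and does not reproduce the actual `checkAnswer` or `checkAnswerWithoutSort` test helpers: `AnswerCheckSketch`, `sameInOrder`, and `sameIgnoringOrder` are hypothetical names, and rows are modelled as plain `Seq[String]` rather than Spark `Row` objects.

```scala
// Hypothetical stand-ins for an order-sensitive and an order-insensitive
// result comparison. Not the project's real test helpers.
object AnswerCheckSketch {
  type SimpleRow = Seq[String]

  // Order-sensitive: passes only if rows arrive in exactly the expected order.
  def sameInOrder(actual: Seq[SimpleRow], expected: Seq[SimpleRow]): Boolean =
    actual == expected

  // Order-insensitive: sorts both sides on a joined key before comparing,
  // so the outcome does not depend on which task finished first.
  def sameIgnoringOrder(actual: Seq[SimpleRow], expected: Seq[SimpleRow]): Boolean =
    actual.sortBy(_.mkString(",")) == expected.sortBy(_.mkString(","))

  def main(args: Array[String]): Unit = {
    val expected = Seq(Seq("1", "india"), Seq("2", "china"))
    // Same rows, but the task producing ("2", "china") happened to finish first.
    val arrived = Seq(Seq("2", "china"), Seq("1", "india"))
    println(s"order-sensitive check passes: ${sameInOrder(arrived, expected)}")
    println(s"order-insensitive check passes: ${sameIgnoringOrder(arrived, expected)}")
  }
}
```

Adding `order by` to the query, as suggested earlier in this thread, is the other way to make the comparison deterministic, since it fixes the row order at the source.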





[GitHub] [carbondata] QiangCai commented on pull request #3972: [CARBONDATA-4042]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
QiangCai commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-718478244









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-705660474


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4341/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-714797045


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2891/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-713350810


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4565/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-706765787


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2604/
   





[GitHub] [carbondata] QiangCai commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
QiangCai commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-714861784


   please fix TestSIWithSecondaryIndex line 92:
   `
   checkAnswerWithoutSort(sql("select id, country from table1_index order by id"),
         Seq(Row("1", "india"), Row("2", "china")))
   `





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-708359537


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2680/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [CARBONDATA-4042]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-718593609


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4729/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [CARBONDATA-4042]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-715134807


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4655/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-713343302


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2809/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-705656666


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2591/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-708366544


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4432/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-705547810


   Build Failed  with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4339/
   





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3972: [WIP]Launch same number of task as select query for insert into select and ctas cases when target table is of no_sort

Posted by GitBox <gi...@apache.org>.
CarbonDataQA1 commented on pull request #3972:
URL: https://github.com/apache/carbondata/pull/3972#issuecomment-706705055


   Build Failed  with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2603/
   

