Posted to issues@carbondata.apache.org by ffpeng90 <gi...@git.apache.org> on 2017/03/12 12:38:46 UTC

[GitHub] incubator-carbondata pull request #650: [CARBONDATA-](WIP) a...

GitHub user ffpeng90 opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/650

    [CARBONDATA-<Jira issue 728>](WIP) add intergation with presto

    ---


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ffpeng90/incubator-carbondata add_presto

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/650.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #650
    
----
commit 82168f48a07b1757160147551332bb456df2e65a
Author: ffpeng90 <ff...@126.com>
Date:   2017-03-12T12:27:32Z

    add presto integration 0.0.1

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    add to whitelist



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

Posted by ffpeng90 <gi...@git.apache.org>.
Github user ffpeng90 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    Hi:
        1. This version only supports DML.
            All test tables are created by spark-sql (DML part),
            and I submit queries to Presto to get results.
            I have only tested the "Select" case, such as where, group by, sum, and join.
    
    
        2. I use APIs such as createQueryPlan and resolveFilter from the class "CarbonInputFormatUtil".
           To read a carbon-formatted table, I split the read process into several steps:
           a). load the table metadata
           b). get splits from the table (pushing filters down to prune the data blocks of one segment, @CarbonTableReader.getInputSplits2)
           c). parse records (pushing column projection and filtering down into QueryModel, @CarbondataRecordSetProvider.getRecordSet)
    
    
        3. As I described in step c) "parse records", I use QueryModel to get decoded records.
           For lazy decoding, I will keep exploring a better solution. Maybe we can take inspiration from the presto-orc and presto-parquet modules.
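    
    The three read steps above can be sketched roughly as follows. This is only a
    minimal, self-contained illustration and not the code in this PR: "Block",
    "loadTable", "getSplits", and "readRecords" are hypothetical stand-ins, and the
    min/max pruning is an assumed simplification of how a filter could discard
    data blocks before any rows are read.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical stand-ins only; none of these are CarbonData or Presto classes. */
public class ReadPipelineSketch {

    /** A block of rows plus min/max statistics for column 0, used for pruning. */
    static final class Block {
        final int min;
        final int max;
        final List<int[]> rows;

        Block(List<int[]> rows) {
            this.rows = rows;
            int lo = Integer.MAX_VALUE;
            int hi = Integer.MIN_VALUE;
            for (int[] r : rows) {
                lo = Math.min(lo, r[0]);
                hi = Math.max(hi, r[0]);
            }
            this.min = lo;
            this.max = hi;
        }
    }

    /** Step a): "load table metadata"; here just the list of blocks of a toy table. */
    static List<Block> loadTable() {
        return Arrays.asList(
            new Block(Arrays.asList(new int[]{1, 10}, new int[]{2, 20})),
            new Block(Arrays.asList(new int[]{5, 50}, new int[]{7, 70})));
    }

    /** Step b): keep only blocks whose max can satisfy the filter col0 >= lowerBound. */
    static List<Block> getSplits(List<Block> blocks, int lowerBound) {
        List<Block> splits = new ArrayList<>();
        for (Block b : blocks) {
            if (b.max >= lowerBound) {
                splits.add(b);
            }
        }
        return splits;
    }

    /** Step c): apply the row filter, then copy out only the projected columns. */
    static List<int[]> readRecords(List<Block> splits, Predicate<int[]> filter, int[] projection) {
        List<int[]> out = new ArrayList<>();
        for (Block b : splits) {
            for (int[] row : b.rows) {
                if (!filter.test(row)) {
                    continue;
                }
                int[] projected = new int[projection.length];
                for (int i = 0; i < projection.length; i++) {
                    projected[i] = row[projection[i]];
                }
                out.add(projected);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int lowerBound = 6; // roughly: SELECT col1 FROM t WHERE col0 >= 6
        List<Block> splits = getSplits(loadTable(), lowerBound);
        List<int[]> rows = readRecords(splits, r -> r[0] >= lowerBound, new int[]{1});
        System.out.println(splits.size() + " split(s), " + rows.size() + " row(s)");
    }
}
```

    In this sketch the filter is used twice: once against block statistics to
    skip whole blocks, and once per row inside the remaining blocks. Only the
    projected column is copied into the result.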
          
    At 2017-03-15 09:11:19, "Jacky Li" <no...@github.com> wrote:
    
    
    Thanks for working on this. Can you describe which features are added, in terms of:
    
    What SQL syntax is supported? DDL and DML?
    I think it uses CarbonInputFormat to read, so are you pushing down column projection and filtering by setting the configuration in CarbonInputFormat?
    Is there any SQL optimization integration with Presto's optimizer, such as leveraging carbon's global dictionary to do lazy decoding?
    



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

Posted by ffpeng90 <gi...@git.apache.org>.
Github user ffpeng90 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    I'm focusing on two things:
    1. Let users debug presto-carbondata in their IDE.
    2. Use the new Presto API to support lazy decoding.
    Both should be ready soon.



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1290/




[GitHub] incubator-carbondata issue #650: [CARBONDATA-728] add intergation with prest...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1296/




[GitHub] incubator-carbondata issue #650: [CARBONDATA-728] add intergation with prest...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    retest this please



[GitHub] incubator-carbondata pull request #650: [CARBONDATA-728] add intergation wit...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-carbondata/pull/650



[GitHub] incubator-carbondata issue #650: [CARBONDATA-728] add intergation with prest...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    @ffpeng90 
    In presto/pom.xml, please change the groupId from "com.facebook.presto" to "org.apache.carbondata".



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    I also think this would be easier to review and could be merged sooner if you broke this PR down into smaller ones. Just provide the very basic functionality in the first round of the integration; you can add more functionality in subsequent PRs.



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    @ffpeng90 Please also update the PR title.



[GitHub] incubator-carbondata issue #650: [CARBONDATA-](WIP) add inte...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    Can one of the admins verify this patch?



[GitHub] incubator-carbondata issue #650: [CARBONDATA-728] add intergation with prest...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    LGTM



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

Posted by ffpeng90 <gi...@git.apache.org>.
Github user ffpeng90 commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    As you wish.



[GitHub] incubator-carbondata pull request #650: [WIP] add intergation with presto

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/650#discussion_r105558847
  
    --- Diff: integration/presto/pom.xml ---
    @@ -0,0 +1,167 @@
    +<?xml version="1.0" encoding="UTF-8"?>
    +<project xmlns="http://maven.apache.org/POM/4.0.0"
    +         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    +         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    +    <modelVersion>4.0.0</modelVersion>
    +
    +    <parent>
    +        <groupId>com.facebook.presto</groupId>
    --- End diff --
    
    Please change the groupId to org.apache.carbondata.
    You can take the integration/spark module as a reference and update accordingly.



[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    Build Failed  with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1100/




[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/incubator-carbondata/pull/650
  
    Thanks for working on this. Can you describe which features are added, in terms of:
    1. What SQL syntax is supported? DDL and DML?
    2. I think it uses CarbonInputFormat to read, so are you pushing down column projection and filtering by setting the configuration in CarbonInputFormat?
    3. Is there any SQL optimization integration with Presto's optimizer, such as leveraging carbon's global dictionary to do lazy decoding?
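    
    To make the lazy-decode idea in question 3 concrete, here is a minimal sketch
    under stated assumptions. It assumes a dictionary that maps small integer
    surrogate keys to string values; "DICTIONARY" and "sumByCity" are hypothetical
    names for illustration and are neither CarbonData's dictionary API nor
    Presto's SPI. The aggregation runs on the encoded keys, and each distinct key
    is decoded once at the end instead of once per row.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration; not CarbonData's dictionary API or Presto's SPI. */
public class LazyDecodeSketch {

    /** Assumed dictionary: surrogate key (array index) -> original string value. */
    static final String[] DICTIONARY = {"beijing", "shenzhen", "hangzhou"};

    /** Sums amounts per encoded city key, decoding only the distinct keys at the end. */
    static Map<String, Long> sumByCity(int[] cityKeys, long[] amounts) {
        long[] sums = new long[DICTIONARY.length];
        boolean[] seen = new boolean[DICTIONARY.length];
        for (int i = 0; i < cityKeys.length; i++) {
            sums[cityKeys[i]] += amounts[i]; // grouping on integers, no string lookup per row
            seen[cityKeys[i]] = true;
        }
        Map<String, Long> result = new LinkedHashMap<>();
        for (int key = 0; key < sums.length; key++) {
            if (seen[key]) {
                result.put(DICTIONARY[key], sums[key]); // the only place a key is decoded
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] cityKeys = {0, 1, 0, 1, 1};
        long[] amounts = {10, 20, 30, 40, 50};
        System.out.println(sumByCity(cityKeys, amounts));
    }
}
```

    With five input rows and two distinct cities, the sketch performs two
    dictionary lookups rather than five; the saving grows with the ratio of rows
    to distinct values.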

