Posted to dev@beam.apache.org by Anton Kedin <ke...@google.com> on 2018/06/01 17:10:09 UTC

Re: [SQL] Unsupported features

This looks very helpful, thank you.

Can you file Jiras for the major problems? Or maybe a single jira for the
whole thing with sub-tasks for specific problems.

Regards,
Anton

On Wed, May 30, 2018 at 9:12 AM Kenneth Knowles <kl...@google.com> wrote:

> This is extremely useful. Thanks for putting so much information together!
>
> Kenn
>
> On Wed, May 30, 2018 at 8:19 AM Kai Jiang <ji...@gmail.com> wrote:
>
>> Hi all,
>>
>> Based on pull/5481 <https://github.com/apache/beam/pull/5481>, I
>> manually ran a coverage test with TPC-DS queries (65%) and TPC-H queries
>> (100%) to see which features Beam SQL does not currently support.
>> The test ran on the DirectRunner.
>>
>> I want to share the results.
>> TPC-DS queries on Beam
>> <https://docs.google.com/spreadsheets/d/12iO0vnPWJC-SFp1dBXd_iClf2ERjewl6IRAC2Z0AzdY/edit?usp=drive_web>
>>
>> TL;DR:
>>
>>    1. Aggregation function (stddev) missing, or calculations combining
>>    aggregation functions.
>>    2. Nested beamjoinrel(condition=[true], joinType=[inner]) / cross
>>    join error.
>>    3. Date type casting/calculation, and casting of other types.
>>    4. LIKE operator on strings / alias for the substring function.
>>    5. ORDER BY without a LIMIT clause.
>>    6. OR operator is not supported in join conditions.
>>    7. Syntax: EXISTS / NOT EXISTS (errors); rank() over (partition by)
>>    / view (unsupported).
>>
>>
>> Best,
>> Kai
>>
>
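
The coverage test described in the quoted message (run each benchmark query, record which ones the engine rejects, and report the share that passes) can be sketched roughly as below. This is only an illustration under loud assumptions: it is not the snippet used in the thread, it uses Python's built-in sqlite3 module as a stand-in engine rather than Beam SQL on the DirectRunner, and the table, CSV data, and query names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; the test in the thread read benchmark tables
# from plain CSV files instead.
SALES_CSV = """item,qty,price
apple,3,1.50
pear,1,2.25
apple,2,1.50
"""

# Hypothetical query set, loosely modelled on the categories in the TL;DR.
QUERIES = {
    "sum_by_item": "SELECT item, SUM(qty) FROM sales GROUP BY item",
    "like_filter": "SELECT item FROM sales WHERE item LIKE 'a%'",
    "order_no_limit": "SELECT item, price FROM sales ORDER BY price",
    "stddev_agg": "SELECT STDDEV(price) FROM sales",
}


def load_csv(conn, table, text):
    """Create `table` from CSV text; columns are declared without types."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    cols = ", ".join(header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE {table} ({cols})")
    conn.executemany(f"INSERT INTO {table} VALUES ({marks})", data)


def run_coverage(conn, queries):
    """Run each query; return (passed_names, {failed_name: error_message})."""
    passed, failed = [], {}
    for name, sql in queries.items():
        try:
            conn.execute(sql).fetchall()
            passed.append(name)
        except sqlite3.Error as exc:
            # The engine rejected the query: record why, then keep going.
            failed[name] = str(exc)
    return passed, failed


conn = sqlite3.connect(":memory:")
load_csv(conn, "sales", SALES_CSV)
passed, failed = run_coverage(conn, QUERIES)
coverage = 100.0 * len(passed) / len(QUERIES)
print(f"coverage: {coverage:.0f}%")
for name, err in failed.items():
    print(f"unsupported: {name}: {err}")
```

Grouping the recorded error messages by cause is one way to arrive at a summary like the TL;DR list; with a stock SQLite build the STDDEV query is the one expected to be rejected, since SQLite has no built-in stddev aggregate.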

Re: [SQL] Unsupported features

Posted by Kai Jiang <ji...@gmail.com>.
FYI, Umbrella JIRA ticket: https://issues.apache.org/jira/browse/BEAM-4476

On Mon, Jun 4, 2018 at 3:08 PM Kai Jiang <ji...@gmail.com> wrote:

> Ismaël, I was running this naive code snippet
> <https://gist.github.com/vectorijk/7c54f90aeebfd6fd9e9d2ee224bfed50>.
> Yes, an IT would be interesting. As a next step, I was thinking of
> automating the process and integrating it with Nexmark.
> Do you have any ideas about this? Currently, I ingest the data by reading
> plain CSV files. Is it possible to run a batch job with non-generated data
> in Nexmark?
>
> Best,
> Kai

Re: [SQL] Unsupported features

Posted by Kai Jiang <ji...@gmail.com>.
Ismaël, I was running this naive code snippet
<https://gist.github.com/vectorijk/7c54f90aeebfd6fd9e9d2ee224bfed50>.
Yes, an IT would be interesting. As a next step, I was thinking of
automating the process and integrating it with Nexmark.
Do you have any ideas about this? Currently, I ingest the data by reading
plain CSV files. Is it possible to run a batch job with non-generated data
in Nexmark?

Best,
Kai

On Mon, Jun 4, 2018 at 4:41 AM Ismaël Mejía <ie...@gmail.com> wrote:

> This is super interesting, great work Kai!
>
> Just out of curiosity, how are you validating this?
> It would be really interesting to have this also as part of some kind of
> IT for the future.

Re: [SQL] Unsupported features

Posted by Ismaël Mejía <ie...@gmail.com>.
This is super interesting, great work Kai!

Just out of curiosity, how are you validating this?
It would be really interesting to have this also as part of some kind of IT
for the future.


On Fri, Jun 1, 2018 at 7:43 PM Kai Jiang <ji...@gmail.com> wrote:

> Sounds like a good idea! I will file the major problems later and use a
> task issue to track them.
>
> Best,
> Kai

Re: [SQL] Unsupported features

Posted by Kai Jiang <ji...@gmail.com>.
Sounds like a good idea! I will file the major problems later and use a
task issue to track them.

Best,
Kai

On Fri, Jun 1, 2018 at 10:10 AM Anton Kedin <ke...@google.com> wrote:

> This looks very helpful, thank you.
>
> Can you file Jiras for the major problems? Or maybe a single jira for the
> whole thing with sub-tasks for specific problems.
>
> Regards,
> Anton