Posted to dev@pig.apache.org by "Cheolsoo Park (JIRA)" <ji...@apache.org> on 2013/10/10 02:52:42 UTC

[jira] [Commented] (PIG-3510) New filter extractor fails with more than one filter statement

    [ https://issues.apache.org/jira/browse/PIG-3510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13791059#comment-13791059 ] 

Cheolsoo Park commented on PIG-3510:
------------------------------------

Here is what I see when I print out the optimization rules with the new and old partition filter optimizers:
{code:title=new}
ImplicitSplitInserter
DuplicateForEachColumnRewrite
--> NewPartitionFilterOptimizer
--> NewPartitionFilterOptimizer
--> NewPartitionFilterOptimizer
StreamTypeCastInserter
LoadTypeCastInserter
SplitFilter
PushUpFilter
PushUpFilter
PushUpFilter
--> MergeFilter
PushDownForEachFlatten
ColumnMapKeyPrune
AddForEach
MergeForEach
GroupByConstParallelSetter
LimitOptimizer
{code}
{code:title=old}
ImplicitSplitInserter
DuplicateForEachColumnRewrite
LoadTypeCastInserter
StreamTypeCastInserter
SplitFilter
SplitFilter
PushUpFilter
PushUpFilter
--> MergeFilter
PushUpFilter
--> MergeFilter
PushUpFilter
--> MergeFilter
PushUpFilter
--> PartitionFilterOptimizer
--> PartitionFilterOptimizer
PushDownForEachFlatten
ColumnMapKeyPrune
AddForEach
MergeForEach
GroupByConstParallelSetter
LimitOptimizer
{code}
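
To make the two listings easier to compare, here is a minimal sketch that checks where the partition filter rule is printed relative to MergeFilter. The rule names are copied from the listings above with the --> markers dropped; the helper name printed_before is hypothetical and is not part of Pig.

```python
# Rule names copied from the two listings above, in printed order.
# The "-->" markers are dropped; only the order is kept.
NEW_RULES = [
    "ImplicitSplitInserter", "DuplicateForEachColumnRewrite",
    "NewPartitionFilterOptimizer", "NewPartitionFilterOptimizer",
    "NewPartitionFilterOptimizer", "StreamTypeCastInserter",
    "LoadTypeCastInserter", "SplitFilter", "PushUpFilter", "PushUpFilter",
    "PushUpFilter", "MergeFilter", "PushDownForEachFlatten",
    "ColumnMapKeyPrune", "AddForEach", "MergeForEach",
    "GroupByConstParallelSetter", "LimitOptimizer",
]
OLD_RULES = [
    "ImplicitSplitInserter", "DuplicateForEachColumnRewrite",
    "LoadTypeCastInserter", "StreamTypeCastInserter", "SplitFilter",
    "SplitFilter", "PushUpFilter", "PushUpFilter", "MergeFilter",
    "PushUpFilter", "MergeFilter", "PushUpFilter", "MergeFilter",
    "PushUpFilter", "PartitionFilterOptimizer", "PartitionFilterOptimizer",
    "PushDownForEachFlatten", "ColumnMapKeyPrune", "AddForEach",
    "MergeForEach", "GroupByConstParallelSetter", "LimitOptimizer",
]


def printed_before(rules, first, second):
    """Return True if the first occurrence of `first` precedes that of `second`."""
    return rules.index(first) < rules.index(second)


if __name__ == "__main__":
    print("new: partition filter rule before MergeFilter:",
          printed_before(NEW_RULES, "NewPartitionFilterOptimizer", "MergeFilter"))
    print("old: MergeFilter before partition filter rule:",
          printed_before(OLD_RULES, "MergeFilter", "PartitionFilterOptimizer"))
```

In the new listing the partition filter rule is printed before MergeFilter, while in the old listing it is printed after MergeFilter. This only summarizes the printed order; whether that order explains the regression is not established by the listings alone.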

> New filter extractor fails with more than one filter statement
> --------------------------------------------------------------
>
>                 Key: PIG-3510
>                 URL: https://issues.apache.org/jira/browse/PIG-3510
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.12.0
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.12.1
>
>
> This is a regression from PIG-3461 (the rewrite of the partition filter optimizer). Here is an example that demonstrates the problem:
> {code:title=two filters}
> b = FILTER a BY (dateint >= 20130901 AND dateint <= 20131001);
> c = FILTER b BY (event_id == 419 OR event_id == 418);
> {code}
> {code:title=one filter}
> b = FILTER a BY (dateint >= 20130901 AND dateint <= 20131001) AND (event_id == 419 OR event_id == 418);
> {code}
> Both dateint and event_id are partition columns. In the one-filter case, the whole expression is pushed down, whereas in the two-filter case, only (event_id == 419 OR event_id == 418) is pushed down.


