Posted to dev@pig.apache.org by "Tyson Condie (JIRA)" <ji...@apache.org> on 2008/07/10 01:33:31 UTC
[jira] Created: (PIG-299) Filter operator not included in the main
predecessor plan structure
Filter operator not included in the main predecessor plan structure
-------------------------------------------------------------------
Key: PIG-299
URL: https://issues.apache.org/jira/browse/PIG-299
Project: Pig
Issue Type: Bug
Components: impl
Affects Versions: types_branch
Environment: N/A
Reporter: Tyson Condie
Priority: Blocker
Take the following query, which can be found in the testQuery80() method of TestLogicalPlanBuilder.java:
a = load 'input1' as (name, age, gpa);
b = filter a by age < '20';
c = group b by (name,age);
d = foreach c {
cf = filter b by gpa < '3.0';
cp = cf.gpa;
cd = distinct cp;
co = order cd by gpa;
generate group, flatten(co);
};
The filter statement 'cf = filter b by gpa < '3.0'' is not accessible via the LogicalPlan::getPredecessor method. Here is the explain plan printout of the inner foreach plan:
|---SORT Test-Plan-Builder-17 Schema: {gpa: bytearray} Type: bag
| |
| Project Test-Plan-Builder-16 Projections: [0] Overloaded: false FieldSchema: gpa: bytearray cn: 2 Type: bytearray
| Input: Distinct Test-Plan-Builder-1
|
|---Distinct Test-Plan-Builder-15 Schema: {gpa: bytearray} Type: bag
|
|---Project Test-Plan-Builder-14 Projections: [2] Overloaded: false FieldSchema: gpa: bytearray cn: 2 Type: bytearray
Input: Project Test-Plan-Builder-13 Projections: [*] Overloaded: false|
|---Project Test-Plan-Builder-13 Projections: [*] Overloaded: false FieldSchema: cf: tuple({name: bytearray,age: bytearray,gpa: bytearray}) Type: tuple
Input: Filter Test-Plan-Builder-12OPERATOR PROJECT SCHEMA {name: bytearray,age: bytearray,gpa: bytearray}
As you can see, the filter is only accessible via the LOProject::getExpression() method; it does not show up as an input operator. Focus on the projection immediately following the filter: if I remove this projection, I get a correct plan. For example, let the inner foreach plan be as follows:
d = foreach c {
cf = filter b by gpa < '3.0';
cd = distinct cf;
co = order cd by gpa;
generate group, flatten(co);
};
Then I get the following (correct) explain plan output:
|---SORT Test-Plan-Builder-15 Schema: {name: bytearray,age: bytearray,gpa: bytearray} Type: bag
| |
| Project Test-Plan-Builder-14 Projections: [2] Overloaded: false FieldSchema: gpa: bytearray cn: 2 Type: bytearray
| Input: Distinct Test-Plan-Builder-1
|
|---Distinct Test-Plan-Builder-13 Schema: {name: bytearray,age: bytearray,gpa: bytearray} Type: bag
|
|---Filter Test-Plan-Builder-12 Schema: {name: bytearray,age: bytearray,gpa: bytearray} Type: bag
| |
| LesserThan Test-Plan-Builder-11 FieldSchema: null Type: Unknown
| |
| |---Project Test-Plan-Builder-9 Projections: [2] Overloaded: false FieldSchema: Type: Unknown
| | Input: CoGroup Test-Plan-Builder-7
| |
| |---Const Test-Plan-Builder-10 FieldSchema: chararray Type: chararray
|
|---Project Test-Plan-Builder-8 Projections: [1] Overloaded: false FieldSchema: b: bag({name: bytearray,age: bytearray,gpa: bytearray}) Type: bag
Input: CoGroup Test-Plan-Builder-7OPERATOR PROJECT SCHEMA {name: bytearray,age: bytearray,gpa: bytearray}
Alan said that the problem is we don't generate a foreach operator for the 'cp = cf.gpa' statement. Please let me know if this can be resolved.
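The symptom described above can be illustrated with a toy model. This is a sketch only and not Pig's code: the `Op` class, its `expression` field, and the `reachable` helper are hypothetical stand-ins for a logical operator, the operator a projection refers to, and a walk over predecessor edges. It shows why an operator that is referenced from a projection, but not connected by a plan edge, is missed by a predecessor walk.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Op:
    """Hypothetical stand-in for a logical operator; not a Pig class."""
    name: str
    # Operators connected to this one by ordinary plan edges.
    predecessors: List["Op"] = field(default_factory=list)
    # Operator referenced only from inside a projection, if any.
    expression: Optional["Op"] = None


def reachable(op: Op) -> List[str]:
    """Names of all operators reachable by following predecessor edges only."""
    seen: List[str] = []
    stack = [op]
    while stack:
        current = stack.pop()
        if current.name in seen:
            continue
        seen.append(current.name)
        stack.extend(current.predecessors)
    return seen


def broken_plan() -> Op:
    # 'cp = cf.gpa' is modelled as a projection that merely refers to the filter.
    filt = Op("Filter")
    project_star = Op("Project[*]", expression=filt)
    project_gpa = Op("Project[2]", predecessors=[project_star])
    distinct = Op("Distinct", predecessors=[project_gpa])
    return Op("Sort", predecessors=[distinct])


def expected_plan() -> Op:
    # The filter is an ordinary predecessor of the operator that follows it.
    filt = Op("Filter")
    distinct = Op("Distinct", predecessors=[filt])
    return Op("Sort", predecessors=[distinct])


print(reachable(broken_plan()))
print(reachable(expected_plan()))
```

In this model the walk over the first plan never visits the filter, because the only path to it runs through the `expression` reference rather than a predecessor edge.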
Thanks,
Tyson
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Assigned: (PIG-299) Filter operator not included in the main
predecessor plan structure
Posted by "Alan Gates (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alan Gates reassigned PIG-299:
------------------------------
Assignee: Santhosh Srinivasan
[jira] Commented: (PIG-299) Filter operator not included in the
main predecessor plan structure
Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613667#action_12613667 ]
Santhosh Srinivasan commented on PIG-299:
-----------------------------------------
Just adding the noformat tags around the plans to make them readable.
{noformat}
|---SORT Test-Plan-Builder-17 Schema: {gpa: bytearray} Type: bag
| |
| Project Test-Plan-Builder-16 Projections: [0] Overloaded: false FieldSchema: gpa: bytearray cn: 2 Type: bytearray
| Input: Distinct Test-Plan-Builder-1
|
|---Distinct Test-Plan-Builder-15 Schema: {gpa: bytearray} Type: bag
|
|---Project Test-Plan-Builder-14 Projections: [2] Overloaded: false FieldSchema: gpa: bytearray cn: 2 Type: bytearray
Input: Project Test-Plan-Builder-13 Projections: [*] Overloaded: false|
|---Project Test-Plan-Builder-13 Projections: [*] Overloaded: false FieldSchema: cf: tuple({name: bytearray,age: bytearray,gpa: bytearray}) Type: tuple
Input: Filter Test-Plan-Builder-12OPERATOR PROJECT SCHEMA {name: bytearray,age: bytearray,gpa: bytearray}
{noformat}
{noformat}
|---SORT Test-Plan-Builder-15 Schema: {name: bytearray,age: bytearray,gpa: bytearray} Type: bag
| |
| Project Test-Plan-Builder-14 Projections: [2] Overloaded: false FieldSchema: gpa: bytearray cn: 2 Type: bytearray
| Input: Distinct Test-Plan-Builder-1
|
|---Distinct Test-Plan-Builder-13 Schema: {name: bytearray,age: bytearray,gpa: bytearray} Type: bag
|
|---Filter Test-Plan-Builder-12 Schema: {name: bytearray,age: bytearray,gpa: bytearray} Type: bag
| |
| LesserThan Test-Plan-Builder-11 FieldSchema: null Type: Unknown
| |
| |---Project Test-Plan-Builder-9 Projections: [2] Overloaded: false FieldSchema: Type: Unknown
| | Input: CoGroup Test-Plan-Builder-7
| |
| |---Const Test-Plan-Builder-10 FieldSchema: chararray Type: chararray
|
|---Project Test-Plan-Builder-8 Projections: [1] Overloaded: false FieldSchema: b: bag({name: bytearray,age: bytearray,gpa: bytearray}) Type: bag
Input: CoGroup Test-Plan-Builder-7OPERATOR PROJECT SCHEMA {name: bytearray,age: bytearray,gpa: bytearray}
{noformat}
[jira] Resolved: (PIG-299) Filter operator not included in the main
predecessor plan structure
Posted by "Pi Song (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pi Song resolved PIG-299.
-------------------------
Resolution: Fixed
Committed.
Thanks Santhosh!
[jira] Commented: (PIG-299) Filter operator not included in the
main predecessor plan structure
Posted by "Pi Song (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613938#action_12613938 ]
Pi Song commented on PIG-299:
-----------------------------
That is what I would suggest as well.
Running tests now.
[jira] Updated: (PIG-299) Filter operator not included in the main
predecessor plan structure
Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Santhosh Srinivasan updated PIG-299:
------------------------------------
Fix Version/s: types_branch
[jira] Updated: (PIG-299) Filter operator not included in the main
predecessor plan structure
Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Santhosh Srinivasan updated PIG-299:
------------------------------------
Attachment: nested_project_as_foreach.patch
The nested_project_as_foreach.patch contains the following:
1. Project statements such as A = $1.$0; B = A.($1, $2); C = A.$1; are rewritten as foreach statements with nested plans that project the columns.
2. Unit test cases for testing the rewrite.
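The rewrite in item 1 can be pictured with a small sketch. It is illustrative only and does not reflect the patch's actual classes: `NestedProject`, `NestedForEach`, and `rewrite` are hypothetical names, and the choice of one nested plan per projected column is an assumption.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NestedProject:
    """Hypothetical form of a statement such as 'cp = cf.gpa'."""
    alias: str
    source: str
    columns: Tuple[int, ...]


@dataclass(frozen=True)
class NestedForEach:
    """Hypothetical form after the rewrite: a foreach over `source`
    whose nested plans each project one column (an assumption)."""
    alias: str
    source: str
    nested_plans: Tuple[Tuple[str, int], ...]


def rewrite(stmt: NestedProject) -> NestedForEach:
    # One nested plan per projected column, in the original order.
    plans = tuple(("project", col) for col in stmt.columns)
    return NestedForEach(stmt.alias, stmt.source, plans)


# 'cp = cf.gpa', with gpa at column position 2 of (name, age, gpa).
print(rewrite(NestedProject("cp", "cf", (2,))))
```

Because the result is an operator in its own right, it can sit in the plan between its source and whatever consumes it, which is what lets the source be reached as a predecessor.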
Unit test cases that still fail are:
[junit] Running org.apache.pig.test.TestEvalPipeline
[junit] Tests run: 8, Failures: 0, Errors: 1, Time elapsed: 142.518 sec
[junit] Test org.apache.pig.test.TestEvalPipeline FAILED
[junit] Running org.apache.pig.test.TestFilterOpNumeric
[junit] Tests run: 8, Failures: 0, Errors: 1, Time elapsed: 246.872 sec
[junit] Test org.apache.pig.test.TestFilterOpNumeric FAILED
[junit] Running org.apache.pig.test.TestStoreOld
[junit] Tests run: 3, Failures: 0, Errors: 2, Time elapsed: 21.584 sec
[junit] Test org.apache.pig.test.TestStoreOld FAILED
[jira] Commented: (PIG-299) Filter operator not included in the
main predecessor plan structure
Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PIG-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613766#action_12613766 ]
Santhosh Srinivasan commented on PIG-299:
-----------------------------------------
The logical plan for the first query in the initial bug report will now look like this:
{noformat}
ForEach Test-Plan-Builder-784 Schema: {group: (name: (null),age: (null),bytearray),co::gpa: bytearray} Type: bag
| |
| Project Test-Plan-Builder-781 Projections: [0] Overloaded: false FieldSchema: group: tuple({name: (null),age: (null),bytearray}) Type: tuple
| Input: CoGroup Test-Plan-Builder-77
| |
| Project Test-Plan-Builder-782 Projections: [*] Overloaded: false FieldSchema: co: tuple({gpa: bytearray}) Type: tuple
| Input: SORT Test-Plan-Builder-780|
| |---SORT Test-Plan-Builder-780 Schema: {gpa: bytearray} Type: bag
| | |
| | Project Test-Plan-Builder-779 Projections: [0] Overloaded: false FieldSchema: gpa: bytearray cn: 122 Type: bytearray
| | Input: Distinct Test-Plan-Builder-77
| |
| |---Distinct Test-Plan-Builder-778 Schema: {gpa: bytearray} Type: bag
| |
| |---ForEach Test-Plan-Builder-777 Schema: {gpa: bytearray} Type: bag
| | |
| | Project Test-Plan-Builder-776 Projections: [2] Overloaded: false FieldSchema: gpa: bytearray cn: 122 Type: bytearray
| | Input: Filter Test-Plan-Builder-77
| |
| |---Filter Test-Plan-Builder-775 Schema: {name: bytearray,age: bytearray,gpa: bytearray} Type: bag
| | |
| | LesserThan Test-Plan-Builder-774 FieldSchema: null Type: Unknown
| | |
| | |---Project Test-Plan-Builder-772 Projections: [2] Overloaded: false FieldSchema: Type: Unknown
| | | Input: CoGroup Test-Plan-Builder-770
| | |
| | |---Const Test-Plan-Builder-773 FieldSchema: chararray Type: chararray
| |
| |---Project Test-Plan-Builder-771 Projections: [1] Overloaded: false FieldSchema: b: bag({name: bytearray,age: bytearray,gpa: bytearray}) Type: bag
| Input: CoGroup Test-Plan-Builder-770
|
|---CoGroup Test-Plan-Builder-770 Schema: {group: (name: (null),age: (null),bytearray),b: {name: bytearray,age: bytearray,gpa: bytearray}} Type: Unknown
| |
| Project Test-Plan-Builder-768 Projections: [0] Overloaded: false FieldSchema: name: bytearray cn: 120 Type: bytearray
| Input: Filter Test-Plan-Builder-76
| |
| Project Test-Plan-Builder-769 Projections: [1] Overloaded: false FieldSchema: age: bytearray cn: 121 Type: bytearray
| Input: Filter Test-Plan-Builder-76
|
|---Filter Test-Plan-Builder-767 Schema: {name: bytearray,age: bytearray,gpa: bytearray} Type: bag
| |
| LesserThan Test-Plan-Builder-766 FieldSchema: null Type: Unknown
| |
| |---Project Test-Plan-Builder-764 Projections: [1] Overloaded: false FieldSchema: age: bytearray cn: 121 Type: bytearray
| | Input: Load Test-Plan-Builder-763
| |
| |---Const Test-Plan-Builder-765 FieldSchema: chararray Type: chararray
|
|---Load Test-Plan-Builder-763 Schema: {name: bytearray,age: bytearray,gpa: bytearray} Type: bag
{noformat}
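To make the difference concrete, here is a minimal, self-contained sketch of the predecessor walk over the inner plan. It does not use the real Pig classes: PlanSketch, Plan, connect, getPredecessors and ancestors are hypothetical names, and the edges are only my reading of the two explain printouts (the originally reported one and the one above).

```java
import java.util.*;

// Toy model only; none of these names are Pig's actual API.
class PlanSketch {
    /** Stand-in for a logical plan: operator label -> its direct predecessors. */
    static final class Plan {
        private final Map<String, List<String>> preds = new HashMap<>();

        /** Record that `from` feeds `to`, i.e. `from` is a predecessor of `to`. */
        void connect(String from, String to) {
            preds.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
            preds.computeIfAbsent(from, k -> new ArrayList<>());
        }

        List<String> getPredecessors(String op) {
            return preds.getOrDefault(op, Collections.emptyList());
        }
    }

    /** Collect every operator reachable from `leaf` by following predecessor edges. */
    static Set<String> ancestors(Plan plan, String leaf) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(leaf);
        while (!work.isEmpty()) {
            String op = work.pop();
            for (String p : plan.getPredecessors(op)) {
                if (seen.add(p)) {
                    work.push(p);
                }
            }
        }
        return seen;
    }

    /** Inner plan as originally reported (assumed edges): the chain ends at a Project. */
    static Plan reportedInnerPlan() {
        Plan p = new Plan();
        p.connect("Distinct", "SORT");
        p.connect("Project-14", "Distinct");
        p.connect("Project-13", "Project-14");
        // No edge from Filter: it is only held as the Project's expression.
        return p;
    }

    /** Inner plan after the change (assumed edges): the projection is a ForEach fed by the Filter. */
    static Plan fixedInnerPlan() {
        Plan p = new Plan();
        p.connect("Distinct", "SORT");
        p.connect("ForEach", "Distinct");
        p.connect("Filter", "ForEach");
        p.connect("Project(b)", "Filter");
        return p;
    }

    public static void main(String[] args) {
        System.out.println(ancestors(reportedInnerPlan(), "SORT").contains("Filter")); // prints false
        System.out.println(ancestors(fixedInnerPlan(), "SORT").contains("Filter"));    // prints true
    }
}
```

In the sketch, a visitor that starts at SORT and only follows predecessor edges never sees the Filter in the reported plan, but does reach it once the projection is modelled as a ForEach.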
> Filter operator not included in the main predecessor plan structure
> -------------------------------------------------------------------
>
> Key: PIG-299
> URL: https://issues.apache.org/jira/browse/PIG-299
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: types_branch
> Environment: N/A
> Reporter: Tyson Condie
> Assignee: Santhosh Srinivasan
> Priority: Blocker
> Fix For: types_branch
>
> Attachments: nested_project_as_foreach.patch
>
>
> Take the following query, which can be found in TestLogicalPlanBuilder.java method testQuery80();
> a = load 'input1' as (name, age, gpa);
> b = filter a by age < '20';
> c = group b by (name,age);
> d = foreach c {
> cf = filter b by gpa < '3.0';
> cp = cf.gpa;
> cd = distinct cp;
> co = order cd by gpa;
> generate group, flatten(co);
> };
> The filter statement 'cf = filter b by gpa < '3.0'' is not accessible via the LogicalPlan::getPredecessor method. Here is the explain plan printout of the inner foreach plan:
> |---SORT Test-Plan-Builder-17 Schema: {gpa: bytearray} Type: bag
> | |
> | Project Test-Plan-Builder-16 Projections: [0] Overloaded: false FieldSchema: gpa: bytearray cn: 2 Type: bytearray
> | Input: Distinct Test-Plan-Builder-1
> |
> |---Distinct Test-Plan-Builder-15 Schema: {gpa: bytearray} Type: bag
> |
> |---Project Test-Plan-Builder-14 Projections: [2] Overloaded: false FieldSchema: gpa: bytearray cn: 2 Type: bytearray
> Input: Project Test-Plan-Builder-13 Projections: [*] Overloaded: false|
> |---Project Test-Plan-Builder-13 Projections: [*] Overloaded: false FieldSchema: cf: tuple({name: bytearray,age: bytearray,gpa: bytearray}) Type: tuple
> Input: Filter Test-Plan-Builder-12OPERATOR PROJECT SCHEMA {name: bytearray,age: bytearray,gpa: bytearray}
> As you can see, the filter is only accessible via the LOProject::getExpression() method; it does not show up as an input operator. Focus on the projection immediately following the filter: if I remove this projection, I get a correct plan. For example, let the inner foreach plan be as follows:
> d = foreach c {
> cf = filter b by gpa < '3.0';
> cd = distinct cf;
> co = order cd by gpa;
> generate group, flatten(co);
> };
> Then I get the following (correct) explain plan output:
> |---SORT Test-Plan-Builder-15 Schema: {name: bytearray,age: bytearray,gpa: bytearray} Type: bag
> | |
> | Project Test-Plan-Builder-14 Projections: [2] Overloaded: false FieldSchema: gpa: bytearray cn: 2 Type: bytearray
> | Input: Distinct Test-Plan-Builder-1
> |
> |---Distinct Test-Plan-Builder-13 Schema: {name: bytearray,age: bytearray,gpa: bytearray} Type: bag
> |
> |---Filter Test-Plan-Builder-12 Schema: {name: bytearray,age: bytearray,gpa: bytearray} Type: bag
> | |
> | LesserThan Test-Plan-Builder-11 FieldSchema: null Type: Unknown
> | |
> | |---Project Test-Plan-Builder-9 Projections: [2] Overloaded: false FieldSchema: Type: Unknown
> | | Input: CoGroup Test-Plan-Builder-7
> | |
> | |---Const Test-Plan-Builder-10 FieldSchema: chararray Type: chararray
> |
> |---Project Test-Plan-Builder-8 Projections: [1] Overloaded: false FieldSchema: b: bag({name: bytearray,age: bytearray,gpa: bytearray}) Type: bag
> Input: CoGroup Test-Plan-Builder-7OPERATOR PROJECT SCHEMA {name: bytearray,age: bytearray,gpa: bytearray}
> Alan said that the problem is that we don't generate a foreach operator for the 'cp = cf.gpa' statement. Please let me know if this can be resolved.
> Thanks,
> Tyson