Posted to dev@pig.apache.org by Rohini Palaniswamy <ro...@gmail.com> on 2014/01/23 22:22:25 UTC
Review Request 17266: [PIG-3661] Piggybank AvroStorage fails if used in more
than one load or store statement
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17266/
-----------------------------------------------------------
Review request for pig.
Bugs: PIG-3661
https://issues.apache.org/jira/browse/PIG-3661
Repository: pig
Description
-------
Besides fixing the use of AvroStorage in more than one load or store statement, this patch fixes other issues with AvroStorage:
- Hidden files were not excluded (PIG-3717)
- mapred.input.dir was populated with every file instead of the top-level directory, which made the conf very large
- Default value was not set for a Union
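As an illustration of the hidden-file item above, here is a minimal sketch of the kind of name filter involved. It is not the code from the patch: the class and method names are hypothetical, it works on plain strings rather than Hadoop Path objects, and the rule that names starting with "." or "_" are hidden is an assumption based on the common Hadoop convention.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch; not the PATH_FILTER from AvroStorageUtils.
class HiddenPathFilterSketch {
    // Assumption: a name is hidden if it starts with "." or "_".
    static final Predicate<String> VISIBLE =
            name -> !name.startsWith(".") && !name.startsWith("_");

    // Returns the last component of a slash-separated path.
    static String baseName(String path) {
        int slash = path.lastIndexOf('/');
        return slash < 0 ? path : path.substring(slash + 1);
    }

    // Keeps only the paths whose last component is not hidden.
    static List<String> visiblePaths(List<String> paths) {
        return paths.stream()
                .filter(p -> VISIBLE.test(baseName(p)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> listing = Arrays.asList(
                "/data/.svn", "/data/_SUCCESS", "/data/part-00000.avro");
        // Prints [/data/part-00000.avro]
        System.out.println(visiblePaths(listing));
    }
}
```

With a filter like this applied when listing a directory, entries such as .svn no longer reach the loader, which is the failure mode PIG-3717 describes for testGlob.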
Diffs
-----
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java 1560805
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java 1560805
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java 1560805
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java 1560805
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java 1560805
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorageUtils.java 1560805
Diff: https://reviews.apache.org/r/17266/diff/
Testing
-------
Unit test added. testGlob was passing in git but failing when run in the svn code base due to hidden .svn files (PIG-3717). It now passes as well.
Thanks,
Rohini Palaniswamy
Re: Review Request 17266: [PIG-3661] Piggybank AvroStorage fails if used in
more than one load or store statement
Posted by Rohini Palaniswamy <ro...@gmail.com>.
> On Jan. 24, 2014, 5:09 p.m., Cheolsoo Park wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java, lines 204-212
> > <https://reviews.apache.org/r/17266/diff/1/?file=436566#file436566line204>
> >
> > Isn't (statuses.length == 0 && !status.isDir()) == fs.isFile(path)? If so, can you simplify this to the following?
> >
> > if (fs.isFile(path)) {
> >     return path;
> > }
> >
> > FileStatus[] statuses = fs.listStatus(path, PATH_FILTER);
> > <the rest goes here>
I found that fs.isFile() returns false for a nonexistent path instead of throwing FileNotFoundException, so I am doing the following instead:
FileStatus status = fs.getFileStatus(path);
if (!status.isDir()) {
    return path;
}
- Rohini
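The difference between the two variants discussed above can be sketched with a small in-memory stand-in. This is not the Hadoop FileSystem API: the class, its methods, and the "LIST:" marker are hypothetical and only model the behaviour described in the reply, where isFile() reports a missing path as false while a status lookup throws FileNotFoundException.

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model; not the Hadoop FileSystem class.
class IsFileSketch {
    // Maps a path to true if it is a directory, false if it is a file.
    private final Map<String, Boolean> entries = new HashMap<>();

    IsFileSketch add(String path, boolean isDir) {
        entries.put(path, isDir);
        return this;
    }

    // Models a status lookup: a missing path is an error.
    boolean isDir(String path) throws FileNotFoundException {
        Boolean dir = entries.get(path);
        if (dir == null) {
            throw new FileNotFoundException("No such path: " + path);
        }
        return dir;
    }

    // Models isFile() as described in the reply: a missing path yields false.
    boolean isFile(String path) {
        try {
            return !isDir(path);
        } catch (FileNotFoundException e) {
            return false;
        }
    }

    // Variant suggested in review: a missing path silently falls through to listing.
    String resolveWithIsFile(String path) {
        return isFile(path) ? path : "LIST:" + path;
    }

    // Variant adopted in the reply: a missing path surfaces as an exception.
    String resolveWithStatus(String path) throws FileNotFoundException {
        return !isDir(path) ? path : "LIST:" + path;
    }

    public static void main(String[] args) {
        IsFileSketch fs = new IsFileSketch()
                .add("/data", true)
                .add("/data/part-0.avro", false);
        System.out.println(fs.resolveWithIsFile("/data/part-0.avro"));
        System.out.println(fs.resolveWithIsFile("/missing"));
        try {
            fs.resolveWithStatus("/missing");
        } catch (FileNotFoundException e) {
            System.out.println("error surfaced: " + e.getMessage());
        }
    }
}
```

In this model the isFile() variant treats a missing path the same way as a directory, whereas the status-based variant fails early, which appears to be the reason for the change.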
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17266/#review32718
-----------------------------------------------------------
Re: Review Request 17266: [PIG-3661] Piggybank AvroStorage fails if used in
more than one load or store statement
Posted by Cheolsoo Park <pi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17266/#review32718
-----------------------------------------------------------
Looks good. I have a few minor comments below.
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
<https://reviews.apache.org/r/17266/#comment61661>
Can we remove this method? Why not let AvroStorage directly call getPaths(String, Configuration, boolean) and get rid of an extra indirection?
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
<https://reviews.apache.org/r/17266/#comment61667>
f.getPath() => path.
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
<https://reviews.apache.org/r/17266/#comment61668>
} else {
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
<https://reviews.apache.org/r/17266/#comment61663>
Can we remove this method? It seems not used anywhere.
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java
<https://reviews.apache.org/r/17266/#comment61670>
Isn't (statuses.length == 0 && !status.isDir()) == fs.isFile(path)? If so, can you simplify this to the following?
if (fs.isFile(path)) {
    return path;
}
FileStatus[] statuses = fs.listStatus(path, PATH_FILTER);
<the rest goes here>
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java
<https://reviews.apache.org/r/17266/#comment61669>
for( => for (
- Cheolsoo Park
Re: Review Request 17266: [PIG-3661] Piggybank AvroStorage fails if used in
more than one load or store statement
Posted by Cheolsoo Park <pi...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17266/#review32745
-----------------------------------------------------------
Ship it!
- Cheolsoo Park
Re: Review Request 17266: [PIG-3661] Piggybank AvroStorage fails if used in
more than one load or store statement
Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17266/
-----------------------------------------------------------
(Updated Jan. 24, 2014, 8:27 p.m.)
Review request for pig.
Changes
-------
Addressed review comments
Bugs: PIG-3661
https://issues.apache.org/jira/browse/PIG-3661
Repository: pig
Description
-------
Besides fixing the use of AvroStorage in more than one load or store statement, this patch fixes other issues with AvroStorage:
- Hidden files were not excluded (PIG-3717)
- mapred.input.dir was populated with every file instead of the top-level directory, which made the conf very large
- Default value was not set for a Union
Diffs (updated)
-----
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java 1560805
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorageUtils.java 1560805
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java 1560805
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java 1560805
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java 1560805
http://svn.apache.org/repos/asf/pig/trunk/contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorageUtils.java 1560805
Diff: https://reviews.apache.org/r/17266/diff/
Testing
-------
Unit test added. testGlob was passing in git but failing when run in the svn code base due to hidden .svn files (PIG-3717). It now passes as well.
Thanks,
Rohini Palaniswamy