Posted to dev@crunch.apache.org by Josh Wills <jw...@cloudera.com> on 2012/11/25 20:19:35 UTC

Review Request: Adding Sources for NLine and KeyValueText InputFormats

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8215/
-----------------------------------------------------------

Review request for crunch.


Description
-------

Added support for the NLine and KeyValueText InputFormats to the o.a.c.io.text package. This completes Crunch's support for the InputFormats that ship as part of hadoop-client.

In the process, I refactored the ReaderFactory code that is used to read SequenceFiles and text files during materialization to eliminate some duplicate code.
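For readers unfamiliar with the NLine format: Hadoop's NLineInputFormat hands each map task a fixed number of input lines rather than a byte-range split. The sketch below only illustrates that grouping in plain Java. The class and method names (NLineSplitSketch, chunk) are hypothetical and are not taken from this patch or from Hadoop.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not code from the patch under review.
class NLineSplitSketch {

    // Groups the input lines into consecutive chunks of at most n lines,
    // mimicking the "N lines per split" idea behind the NLine format.
    public static List<List<String>> chunk(List<String> lines, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < lines.size(); i += n) {
            int end = Math.min(i + n, lines.size());
            // Copy the sublist so each chunk is independent of the input list.
            chunks.add(new ArrayList<>(lines.subList(i, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(chunk(lines, 2)); // prints [[a, b], [c, d], [e]]
    }
}
```

The last chunk may hold fewer than n lines, which is why the end index is clamped to the list size.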


This addresses bug CRUNCH-119.
    https://issues.apache.org/jira/browse/CRUNCH-119


Diffs
-----

  crunch/src/it/java/org/apache/crunch/io/CompositePathIterableIT.java 796b821 
  crunch/src/it/java/org/apache/crunch/io/NLineInputIT.java PRE-CREATION 
  crunch/src/it/java/org/apache/crunch/io/TextFileTableIT.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/avro/AvroFileReaderFactory.java 6f21dd2 
  crunch/src/main/java/org/apache/crunch/io/avro/AvroFileSource.java 2226556 
  crunch/src/main/java/org/apache/crunch/io/impl/AutoClosingIterator.java d58f290 
  crunch/src/main/java/org/apache/crunch/io/impl/FileTableSourceImpl.java f6e8f1d 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileReaderFactory.java ad1b81b 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileSource.java e8f3dcf 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableReaderFactory.java 20c749a 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableSource.java 56ed985 
  crunch/src/main/java/org/apache/crunch/io/text/LineParser.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/NLineFileSource.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileReaderFactory.java a0c48e0 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileSource.java ee51c04 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSource.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSourceTarget.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileTarget.java c7e06d3 
  crunch/src/test/java/org/apache/crunch/io/avro/AvroFileReaderFactoryTest.java 66863ba 

Diff: https://reviews.apache.org/r/8215/diff/


Testing
-------

Integration tests that use the new formats.


Thanks,

Josh Wills


Re: Review Request: Adding Sources for NLine and KeyValueText InputFormats

Posted by Josh Wills <jw...@cloudera.com>.
Yeah, I don't think that RB likes that I rebased relative to the parent
diff. I'm going to close this one and open a new one. Sorry about the spam.


On Sun, Dec 2, 2012 at 2:10 PM, Josh Wills <jw...@cloudera.com> wrote:

>    This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8215/
>   Review request for crunch.
> By Josh Wills.
>
> *Updated Dec. 2, 2012, 10:10 p.m.*
> Description
>
> Added support for the NLine and KeyValueText InputFormats to the o.a.c.io.text package. This completes Crunch's support for the InputFormats that ship as part of hadoop-client.
>
> In the process, I refactored the ReaderFactory code that is used to read SequenceFiles and text files during materialization to eliminate some duplicate code.
>
>   Testing
>
> Integration tests that use the new formats.
>
>   *Bugs: * CRUNCH-119 <https://issues.apache.org/jira/browse/CRUNCH-119>
> Diffs (updated)
>
>    - crunch/src/it/java/org/apache/crunch/io/CompositePathIterableIT.java
>    (796b821)
>    - crunch/src/it/java/org/apache/crunch/io/NLineInputIT.java
>    (PRE-CREATION)
>    - crunch/src/it/java/org/apache/crunch/io/TextFileTableIT.java
>    (PRE-CREATION)
>    - crunch/src/main/java/org/apache/crunch/io/ReadableSource.java
>    (73a13a3)
>    - crunch/src/main/java/org/apache/crunch/io/avro/AvroFileReaderFactory.java
>    (6f21dd2)
>    - crunch/src/main/java/org/apache/crunch/io/avro/AvroFileSource.java
>    (2226556)
>    - crunch/src/main/java/org/apache/crunch/io/impl/AutoClosingIterator.java
>    (d58f290)
>    - crunch/src/main/java/org/apache/crunch/io/impl/FileTableSourceImpl.java
>    (f6e8f1d)
>    - crunch/src/main/java/org/apache/crunch/io/seq/SeqFileReaderFactory.java
>    (ad1b81b)
>    - crunch/src/main/java/org/apache/crunch/io/seq/SeqFileSource.java
>    (e8f3dcf)
>    - crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableReaderFactory.java
>    (20c749a)
>    - crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableSource.java
>    (56ed985)
>    - crunch/src/main/java/org/apache/crunch/io/text/LineParser.java
>    (PRE-CREATION)
>    - crunch/src/main/java/org/apache/crunch/io/text/NLineFileSource.java
>    (PRE-CREATION)
>    - crunch/src/main/java/org/apache/crunch/io/text/TextFileReaderFactory.java
>    (a0c48e0)
>    - crunch/src/main/java/org/apache/crunch/io/text/TextFileSource.java
>    (ee51c04)
>    - crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSource.java
>    (PRE-CREATION)
>    - crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSourceTarget.java
>    (PRE-CREATION)
>    - crunch/src/main/java/org/apache/crunch/io/text/TextFileTarget.java
>    (c7e06d3)
>    - crunch/src/test/java/org/apache/crunch/io/avro/AvroFileReaderFactoryTest.java
>    (66863ba)
>
> View Diff <https://reviews.apache.org/r/8215/diff/>
>



-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>

Re: Review Request: Adding Sources for NLine and KeyValueText InputFormats

Posted by Josh Wills <jw...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8215/
-----------------------------------------------------------

(Updated Dec. 2, 2012, 10:10 p.m.)


Review request for crunch.


Description
-------

Added support for the NLine and KeyValueText InputFormats to the o.a.c.io.text package. This completes Crunch's support for the InputFormats that ship as part of hadoop-client.

In the process, I refactored the ReaderFactory code that is used to read SequenceFiles and text files during materialization to eliminate some duplicate code.


This addresses bug CRUNCH-119.
    https://issues.apache.org/jira/browse/CRUNCH-119


Diffs (updated)
-----

  crunch/src/it/java/org/apache/crunch/io/CompositePathIterableIT.java 796b821 
  crunch/src/it/java/org/apache/crunch/io/NLineInputIT.java PRE-CREATION 
  crunch/src/it/java/org/apache/crunch/io/TextFileTableIT.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/ReadableSource.java 73a13a3 
  crunch/src/main/java/org/apache/crunch/io/avro/AvroFileReaderFactory.java 6f21dd2 
  crunch/src/main/java/org/apache/crunch/io/avro/AvroFileSource.java 2226556 
  crunch/src/main/java/org/apache/crunch/io/impl/AutoClosingIterator.java d58f290 
  crunch/src/main/java/org/apache/crunch/io/impl/FileTableSourceImpl.java f6e8f1d 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileReaderFactory.java ad1b81b 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileSource.java e8f3dcf 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableReaderFactory.java 20c749a 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableSource.java 56ed985 
  crunch/src/main/java/org/apache/crunch/io/text/LineParser.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/NLineFileSource.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileReaderFactory.java a0c48e0 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileSource.java ee51c04 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSource.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSourceTarget.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileTarget.java c7e06d3 
  crunch/src/test/java/org/apache/crunch/io/avro/AvroFileReaderFactoryTest.java 66863ba 

Diff: https://reviews.apache.org/r/8215/diff/


Testing
-------

Integration tests that use the new formats.


Thanks,

Josh Wills


Re: Review Request: Adding Sources for NLine and KeyValueText InputFormats

Posted by Josh Wills <jw...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8215/
-----------------------------------------------------------

(Updated Dec. 2, 2012, 10:09 p.m.)


Review request for crunch.


Description
-------

Added support for the NLine and KeyValueText InputFormats to the o.a.c.io.text package. This completes Crunch's support for the InputFormats that ship as part of hadoop-client.

In the process, I refactored the ReaderFactory code that is used to read SequenceFiles and text files during materialization to eliminate some duplicate code.


This addresses bug CRUNCH-119.
    https://issues.apache.org/jira/browse/CRUNCH-119


Diffs (updated)
-----

  crunch/src/it/java/org/apache/crunch/io/CompositePathIterableIT.java 796b821 
  crunch/src/it/java/org/apache/crunch/io/NLineInputIT.java PRE-CREATION 
  crunch/src/it/java/org/apache/crunch/io/TextFileTableIT.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/ReadableSource.java 73a13a3 
  crunch/src/main/java/org/apache/crunch/io/avro/AvroFileReaderFactory.java 6f21dd2 
  crunch/src/main/java/org/apache/crunch/io/avro/AvroFileSource.java 2226556 
  crunch/src/main/java/org/apache/crunch/io/impl/AutoClosingIterator.java d58f290 
  crunch/src/main/java/org/apache/crunch/io/impl/FileTableSourceImpl.java f6e8f1d 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileReaderFactory.java ad1b81b 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileSource.java e8f3dcf 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableReaderFactory.java 20c749a 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableSource.java 56ed985 
  crunch/src/main/java/org/apache/crunch/io/text/LineParser.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/NLineFileSource.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileReaderFactory.java a0c48e0 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileSource.java ee51c04 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSource.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSourceTarget.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileTarget.java c7e06d3 
  crunch/src/test/java/org/apache/crunch/io/avro/AvroFileReaderFactoryTest.java 66863ba 

Diff: https://reviews.apache.org/r/8215/diff/


Testing
-------

Integration tests that use the new formats.


Thanks,

Josh Wills


Re: Review Request: Adding Sources for NLine and KeyValueText InputFormats

Posted by Brock Noland <br...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8215/#review13959
-----------------------------------------------------------


Hmm, clicking "View Diff" gives an error:

reversed (or previously applied) patch detected!  Assume -R? [n] 
Apply anyway? [n] 
Skipping patch.
3 out of 3 hunks ignored -- saving rejects to file /tmp/reviewboard.iLp_yV/tmpRBAjIL-new.rej


- Brock Noland


On Dec. 2, 2012, 9:57 p.m., Josh Wills wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8215/
> -----------------------------------------------------------
> 
> (Updated Dec. 2, 2012, 9:57 p.m.)
> 
> 
> Review request for crunch.
> 
> 
> Description
> -------
> 
> Added support for the NLine and KeyValueText InputFormats to the o.a.c.io.text package. This completes Crunch's support for the InputFormats that ship as part of hadoop-client.
> 
> In the process, I refactored the ReaderFactory code that is used to read SequenceFiles and text files during materialization to eliminate some duplicate code.
> 
> 
> This addresses bug CRUNCH-119.
>     https://issues.apache.org/jira/browse/CRUNCH-119
> 
> 
> Diffs
> -----
> 
>   crunch/src/it/java/org/apache/crunch/io/CompositePathIterableIT.java 796b821 
>   crunch/src/it/java/org/apache/crunch/io/NLineInputIT.java PRE-CREATION 
>   crunch/src/it/java/org/apache/crunch/io/TextFileTableIT.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/ReadableSource.java 73a13a3 
>   crunch/src/main/java/org/apache/crunch/io/avro/AvroFileReaderFactory.java 6f21dd2 
>   crunch/src/main/java/org/apache/crunch/io/avro/AvroFileSource.java 2226556 
>   crunch/src/main/java/org/apache/crunch/io/impl/AutoClosingIterator.java d58f290 
>   crunch/src/main/java/org/apache/crunch/io/impl/FileTableSourceImpl.java f6e8f1d 
>   crunch/src/main/java/org/apache/crunch/io/seq/SeqFileReaderFactory.java ad1b81b 
>   crunch/src/main/java/org/apache/crunch/io/seq/SeqFileSource.java e8f3dcf 
>   crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableReaderFactory.java 20c749a 
>   crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableSource.java 56ed985 
>   crunch/src/main/java/org/apache/crunch/io/text/LineParser.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/text/NLineFileSource.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileReaderFactory.java a0c48e0 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileSource.java ee51c04 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSource.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSourceTarget.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileTarget.java c7e06d3 
>   crunch/src/test/java/org/apache/crunch/io/avro/AvroFileReaderFactoryTest.java 66863ba 
> 
> Diff: https://reviews.apache.org/r/8215/diff/
> 
> 
> Testing
> -------
> 
> Integration tests that use the new formats.
> 
> 
> Thanks,
> 
> Josh Wills
> 
>


Re: Review Request: Adding Sources for NLine and KeyValueText InputFormats

Posted by Josh Wills <jw...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8215/
-----------------------------------------------------------

(Updated Dec. 2, 2012, 9:57 p.m.)


Review request for crunch.


Changes
-------

Updated with more javadoc and a fix for the KeyValueLineParser.


Description
-------

Added support for the NLine and KeyValueText InputFormats to the o.a.c.io.text package. This completes Crunch's support for the InputFormats that ship as part of hadoop-client.

In the process, I refactored the ReaderFactory code that is used to read SequenceFiles and text files during materialization to eliminate some duplicate code.


This addresses bug CRUNCH-119.
    https://issues.apache.org/jira/browse/CRUNCH-119


Diffs (updated)
-----

  crunch/src/it/java/org/apache/crunch/io/CompositePathIterableIT.java 796b821 
  crunch/src/it/java/org/apache/crunch/io/NLineInputIT.java PRE-CREATION 
  crunch/src/it/java/org/apache/crunch/io/TextFileTableIT.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/ReadableSource.java 73a13a3 
  crunch/src/main/java/org/apache/crunch/io/avro/AvroFileReaderFactory.java 6f21dd2 
  crunch/src/main/java/org/apache/crunch/io/avro/AvroFileSource.java 2226556 
  crunch/src/main/java/org/apache/crunch/io/impl/AutoClosingIterator.java d58f290 
  crunch/src/main/java/org/apache/crunch/io/impl/FileTableSourceImpl.java f6e8f1d 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileReaderFactory.java ad1b81b 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileSource.java e8f3dcf 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableReaderFactory.java 20c749a 
  crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableSource.java 56ed985 
  crunch/src/main/java/org/apache/crunch/io/text/LineParser.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/NLineFileSource.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileReaderFactory.java a0c48e0 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileSource.java ee51c04 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSource.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSourceTarget.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/io/text/TextFileTarget.java c7e06d3 
  crunch/src/test/java/org/apache/crunch/io/avro/AvroFileReaderFactoryTest.java 66863ba 

Diff: https://reviews.apache.org/r/8215/diff/


Testing
-------

Integration tests that use the new formats.


Thanks,

Josh Wills


Re: Review Request: Adding Sources for NLine and KeyValueText InputFormats

Posted by Matthias Friedrich <ma...@mafr.de>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8215/#review13937
-----------------------------------------------------------


Looks good, although the new classes at least could use some javadocs. I know they're surrounded by undocumented stuff, but we have to start somewhere ;-)


crunch/src/main/java/org/apache/crunch/io/text/LineParser.java
<https://reviews.apache.org/r/8215/#comment29728>

    Looks like the rest of the line is lost. Is this intended behavior?


- Matthias Friedrich


On Nov. 25, 2012, 7:19 p.m., Josh Wills wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8215/
> -----------------------------------------------------------
> 
> (Updated Nov. 25, 2012, 7:19 p.m.)
> 
> 
> Review request for crunch.
> 
> 
> Description
> -------
> 
> Added support for the NLine and KeyValueText InputFormats to the o.a.c.io.text package. This completes Crunch's support for the InputFormats that ship as part of hadoop-client.
> 
> In the process, I refactored the ReaderFactory code that is used to read SequenceFiles and text files during materialization to eliminate some duplicate code.
> 
> 
> This addresses bug CRUNCH-119.
>     https://issues.apache.org/jira/browse/CRUNCH-119
> 
> 
> Diffs
> -----
> 
>   crunch/src/it/java/org/apache/crunch/io/CompositePathIterableIT.java 796b821 
>   crunch/src/it/java/org/apache/crunch/io/NLineInputIT.java PRE-CREATION 
>   crunch/src/it/java/org/apache/crunch/io/TextFileTableIT.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/avro/AvroFileReaderFactory.java 6f21dd2 
>   crunch/src/main/java/org/apache/crunch/io/avro/AvroFileSource.java 2226556 
>   crunch/src/main/java/org/apache/crunch/io/impl/AutoClosingIterator.java d58f290 
>   crunch/src/main/java/org/apache/crunch/io/impl/FileTableSourceImpl.java f6e8f1d 
>   crunch/src/main/java/org/apache/crunch/io/seq/SeqFileReaderFactory.java ad1b81b 
>   crunch/src/main/java/org/apache/crunch/io/seq/SeqFileSource.java e8f3dcf 
>   crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableReaderFactory.java 20c749a 
>   crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableSource.java 56ed985 
>   crunch/src/main/java/org/apache/crunch/io/text/LineParser.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/text/NLineFileSource.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileReaderFactory.java a0c48e0 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileSource.java ee51c04 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSource.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSourceTarget.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileTarget.java c7e06d3 
>   crunch/src/test/java/org/apache/crunch/io/avro/AvroFileReaderFactoryTest.java 66863ba 
> 
> Diff: https://reviews.apache.org/r/8215/diff/
> 
> 
> Testing
> -------
> 
> Integration tests that use the new formats.
> 
> 
> Thanks,
> 
> Josh Wills
> 
>


Re: Review Request: Adding Sources for NLine and KeyValueText InputFormats

Posted by Josh Wills <jw...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8215/#review13958
-----------------------------------------------------------


Thanks, Matthias. I will upload a new patch with more javadoc.


crunch/src/main/java/org/apache/crunch/io/text/LineParser.java
<https://reviews.apache.org/r/8215/#comment29821>

    No, it's not intended. Will fix.
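
The LineParser code itself is not shown in this thread, so the following is only a sketch of the behavior being asked for, not the patch's implementation. It assumes the usual KeyValueText convention: split each line at the first separator and keep everything after it as the value, so the rest of the line is not lost. The names KeyValueLineSketch and split are hypothetical.

```java
// Hypothetical illustration only: not the LineParser from the patch.
class KeyValueLineSketch {

    // Splits a line at the FIRST occurrence of the separator.
    // Returns {key, value}; the value keeps any later separators intact.
    // If the separator is absent, the whole line is the key and the value is empty.
    public static String[] split(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new String[] { line, "" };
        }
        return new String[] { line.substring(0, pos), line.substring(pos + 1) };
    }

    public static void main(String[] args) {
        // The value is "v1<TAB>v2": the second tab stays part of the value.
        String[] kv = split("k\tv1\tv2", '\t');
        System.out.println("key=" + kv[0] + " value=" + kv[1]);
    }
}
```

Splitting with indexOf rather than String.split avoids discarding fields after the second separator, which is the kind of truncation the review comment describes.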


- Josh Wills


On Nov. 25, 2012, 7:19 p.m., Josh Wills wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/8215/
> -----------------------------------------------------------
> 
> (Updated Nov. 25, 2012, 7:19 p.m.)
> 
> 
> Review request for crunch.
> 
> 
> Description
> -------
> 
> Added support for the NLine and KeyValueText InputFormats to the o.a.c.io.text package. This completes Crunch's support for the InputFormats that ship as part of hadoop-client.
> 
> In the process, I refactored the ReaderFactory code that is used to read SequenceFiles and text files during materialization to eliminate some duplicate code.
> 
> 
> This addresses bug CRUNCH-119.
>     https://issues.apache.org/jira/browse/CRUNCH-119
> 
> 
> Diffs
> -----
> 
>   crunch/src/it/java/org/apache/crunch/io/CompositePathIterableIT.java 796b821 
>   crunch/src/it/java/org/apache/crunch/io/NLineInputIT.java PRE-CREATION 
>   crunch/src/it/java/org/apache/crunch/io/TextFileTableIT.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/avro/AvroFileReaderFactory.java 6f21dd2 
>   crunch/src/main/java/org/apache/crunch/io/avro/AvroFileSource.java 2226556 
>   crunch/src/main/java/org/apache/crunch/io/impl/AutoClosingIterator.java d58f290 
>   crunch/src/main/java/org/apache/crunch/io/impl/FileTableSourceImpl.java f6e8f1d 
>   crunch/src/main/java/org/apache/crunch/io/seq/SeqFileReaderFactory.java ad1b81b 
>   crunch/src/main/java/org/apache/crunch/io/seq/SeqFileSource.java e8f3dcf 
>   crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableReaderFactory.java 20c749a 
>   crunch/src/main/java/org/apache/crunch/io/seq/SeqFileTableSource.java 56ed985 
>   crunch/src/main/java/org/apache/crunch/io/text/LineParser.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/text/NLineFileSource.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileReaderFactory.java a0c48e0 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileSource.java ee51c04 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSource.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileTableSourceTarget.java PRE-CREATION 
>   crunch/src/main/java/org/apache/crunch/io/text/TextFileTarget.java c7e06d3 
>   crunch/src/test/java/org/apache/crunch/io/avro/AvroFileReaderFactoryTest.java 66863ba 
> 
> Diff: https://reviews.apache.org/r/8215/diff/
> 
> 
> Testing
> -------
> 
> Integration tests that use the new formats.
> 
> 
> Thanks,
> 
> Josh Wills
> 
>