Posted to dev@crunch.apache.org by Josh Wills <jw...@cloudera.com> on 2012/12/03 01:00:59 UTC

Review Request: Add helpers for parsing PCollection instances

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8310/
-----------------------------------------------------------

Review request for crunch.


Description
-------

We should make it a bit easier to parse delimited text files into specific data types (e.g., ints and floats) or into combinations of types, such as pairs of strings and ints or a Tuple3 of booleans.
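
To illustrate the kind of parsing described above, here is a minimal sketch in plain Java. It does not use the classes added by this patch. The names DelimitedParseSketch, LineExtractor, stringIntPair, and parseAll are hypothetical, and falling back to a default value on a malformed field is an assumption, not behavior confirmed by the diff.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch only; these names are NOT the API introduced by the patch.
final class DelimitedParseSketch {

  /** Hypothetical stand-in for an extractor: turns one line of text into a typed value. */
  interface LineExtractor<T> {
    T extract(String line);
  }

  /**
   * Builds an extractor that splits a line on a literal delimiter and returns
   * the first field as a String and the second as an int.
   * Assumption: a missing or malformed int field yields defaultInt instead of failing.
   */
  static LineExtractor<Map.Entry<String, Integer>> stringIntPair(
      String delimiter, final int defaultInt) {
    // Quote the delimiter so characters like "|" are not treated as regex syntax.
    final String regex = Pattern.quote(delimiter);
    return line -> {
      // Limit -1 keeps trailing empty fields, so "a," yields ["a", ""].
      String[] fields = line.split(regex, -1);
      String first = fields[0];
      int second = defaultInt;
      if (fields.length > 1) {
        try {
          second = Integer.parseInt(fields[1].trim());
        } catch (NumberFormatException e) {
          // Keep the default value for fields that are not valid ints.
        }
      }
      return new SimpleImmutableEntry<>(first, second);
    };
  }

  /** Applies the extractor to every line, preserving input order. */
  static <T> List<T> parseAll(List<String> lines, LineExtractor<T> extractor) {
    List<T> out = new ArrayList<>(lines.size());
    for (String line : lines) {
      out.add(extractor.extract(line));
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> lines = Arrays.asList("a,1", "b,oops", "c");
    for (Map.Entry<String, Integer> pair : parseAll(lines, stringIntPair(",", -1))) {
      System.out.println(pair.getKey() + " -> " + pair.getValue());
    }
  }
}
```

In a real pipeline the same per-line logic would run over the elements of a collection of strings rather than an in-memory list; the sketch only shows the typed-extraction step.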


This addresses bug CRUNCH-97.
    https://issues.apache.org/jira/browse/CRUNCH-97


Diffs
-----

  crunch/src/main/java/org/apache/crunch/lib/text/AbstractCompositeExtractor.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/lib/text/AbstractSimpleExtractor.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/lib/text/Extractor.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/lib/text/ExtractorStats.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/lib/text/Extractors.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/lib/text/Parse.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/lib/text/Tokenizer.java PRE-CREATION 
  crunch/src/main/java/org/apache/crunch/lib/text/TokenizerFactory.java PRE-CREATION 
  crunch/src/test/java/org/apache/crunch/lib/text/ParseTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/8310/diff/


Testing
-------

Unit tests.


Thanks,

Josh Wills