Posted to dev@crunch.apache.org by "Dave Beech (JIRA)" <ji...@apache.org> on 2012/12/13 16:34:13 UTC

[jira] [Updated] (CRUNCH-131) Input paths containing globs/wildcards are not accepted

     [ https://issues.apache.org/jira/browse/CRUNCH-131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Beech updated CRUNCH-131:
------------------------------

    Attachment: CRUNCH-131.patch

Here's a quick patch to add support for these paths. 

A couple of questions about the current version:
- What's the reason for throwing and then catching FileNotFoundException? Couldn't you just log and return 0 inside the null-check if-statement?

- What's the logic behind returning -1 in some cases and 0 in others, depending on which check fails? (I've kept the -1 return so that the unit tests still pass.)
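
To make the two questions concrete, here is a minimal sketch of the idea, not the code in the attached patch. It uses plain java.nio on a local directory rather than Hadoop, so that it runs without any dependencies; on HDFS the analogous call would presumably be FileSystem.globStatus. The class name GlobSizeSketch, the method sizeOfGlob, and the UNKNOWN_SIZE constant are all hypothetical, and using -1 to mean "nothing matched" is an assumption of this sketch, not a claim about what the existing return values mean. Note that it only matches file names within one directory, whereas the path in the report also has a wildcard in a directory component.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

class GlobSizeSketch {
    // Hypothetical sentinel meaning "size could not be determined".
    static final long UNKNOWN_SIZE = -1L;

    /**
     * Sums the sizes of the regular files in dir whose names match glob.
     * Returns UNKNOWN_SIZE when dir is not a directory or nothing matches,
     * using a plain check-and-return instead of throwing and catching
     * FileNotFoundException.
     */
    static long sizeOfGlob(Path dir, String glob) throws IOException {
        if (!Files.isDirectory(dir)) {
            return UNKNOWN_SIZE;
        }
        long total = 0L;
        boolean matched = false;
        // newDirectoryStream(dir, glob) filters entry names by the glob pattern.
        try (DirectoryStream<Path> matches = Files.newDirectoryStream(dir, glob)) {
            for (Path p : matches) {
                if (Files.isRegularFile(p)) {
                    total += Files.size(p);
                    matched = true;
                }
            }
        }
        return matched ? total : UNKNOWN_SIZE;
    }
}
```

With this shape, a caller can tell "matched files that are all empty" (0) apart from "no input found" (-1), which is one possible reason for keeping two distinct return values.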

Thanks 


                
> Input paths containing globs/wildcards are not accepted
> -------------------------------------------------------
>
>                 Key: CRUNCH-131
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-131
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.4.0
>            Reporter: Dave Beech
>            Assignee: Josh Wills
>         Attachments: CRUNCH-131.patch
>
>
> Crunch fails to calculate the size of paths containing wildcards - example error below:
> Exception in thread "main" java.lang.IllegalStateException: Input source SeqFile(/my/path/containing/wildcards*/part*) does not exist!
>  	at org.apache.crunch.impl.mr.collect.InputCollection.getSizeInternal(InputCollection.java:53)
>  	at org.apache.crunch.impl.mr.collect.PCollectionImpl.getSize(PCollectionImpl.java:253)
>  	at org.apache.crunch.impl.mr.collect.DoCollectionImpl.getSizeInternal(DoCollectionImpl.java:43)
>  	at org.apache.crunch.impl.mr.collect.PCollectionImpl.getSize(PCollectionImpl.java:253)
>  	at org.apache.crunch.impl.mr.collect.DoTableImpl.getSizeInternal(DoTableImpl.java:47)
>  	at org.apache.crunch.impl.mr.collect.PGroupedTableImpl.getSizeInternal(PGroupedTableImpl.java:75)
>  	at org.apache.crunch.impl.mr.collect.PCollectionImpl.getSize(PCollectionImpl.java:253)
>  	at org.apache.crunch.impl.mr.collect.PGroupedTableImpl.configureShuffle(PGroupedTableImpl.java:63)
>  	at org.apache.crunch.impl.mr.plan.JobPrototype.build(JobPrototype.java:162)
>  	at org.apache.crunch.impl.mr.plan.JobPrototype.getCrunchJob(JobPrototype.java:114)
