Posted to dev@crunch.apache.org by "mac champion (JIRA)" <ji...@apache.org> on 2015/10/01 17:55:28 UTC

[jira] [Comment Edited] (CRUNCH-565) CSVInputFormat needs to be more defensive when configuring itself

    [ https://issues.apache.org/jira/browse/CRUNCH-565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939926#comment-14939926 ] 

mac champion edited comment on CRUNCH-565 at 10/1/15 3:55 PM:
--------------------------------------------------------------

[~mkwhitacre]
Well, at first I started using it just because that's what I'm comfortable with. But later I realized I wasn't completely certain how to manipulate it into returning null instead of blank strings. With Mockito that's easy: just don't mock anything and the return value will be null.

That said, if I switch all of these to get(opt, default), I will have to do some extra work, but I shouldn't have to handle nulls or do anything weird like that. Can you take another look here? https://github.com/champgm/crunch/pull/7
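
To illustrate the difference between get(opt) and get(opt, default), here is a minimal sketch. It is not the CSVInputFormat code: it uses java.util.Properties as a stand-in for the Hadoop Configuration so that it runs on its own (getProperty(key) returns null for an unset key, much as Configuration.get(name) can), and the option key and default value are hypothetical names made up for the example.

```java
import java.util.Properties;

public class DefensiveConfigSketch {
    // Hypothetical option key and fallback value, for illustration only.
    static final String DELIMITER_KEY = "example.csv.delimiter";
    static final String DEFAULT_DELIMITER = ",";

    // Fragile: getProperty(key) returns null when the key is unset,
    // so charAt(0) throws a NullPointerException.
    static char delimiterUnsafe(Properties conf) {
        return conf.getProperty(DELIMITER_KEY).charAt(0);
    }

    // Defensive: the two-argument form falls back to the default,
    // so the caller never sees null.
    static char delimiterSafe(Properties conf) {
        return conf.getProperty(DELIMITER_KEY, DEFAULT_DELIMITER).charAt(0);
    }

    public static void main(String[] args) {
        Properties empty = new Properties();
        System.out.println(delimiterSafe(empty));

        Properties custom = new Properties();
        custom.setProperty(DELIMITER_KEY, ";");
        System.out.println(delimiterSafe(custom));

        try {
            delimiterUnsafe(empty);
        } catch (NullPointerException e) {
            System.out.println("unset option caused a NullPointerException");
        }
    }
}
```

The sketch still assumes a set value is non-empty; an empty string would need its own check.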

Also, sorry about the pull request to apache/crunch. I've forked that and I use it to play around and create pull requests so I can have a nice place to review and comment on the diffs. When the code looks good and it builds, I'll squash, create a patch, and attach it to the JIRA. Is that an okay workflow? The official one is pretty sparse and doesn't include any kind of review steps: https://cwiki.apache.org/confluence/display/CRUNCH/Committer+Workflow





was (Author: champgm):
[~mkwhitacre]
Well, at first I started using it just because that's what I'm comfortable with. But later I realized I wasn't completely certain how to manipulate it into returning null instead of blank strings. With Mockito that's easy: just don't mock anything and the return value will be null.

That said, if I switch all of these to get(opt, default), I will have to do some extra work, but I shouldn't have to handle nulls or do anything weird like that. Can you take another look here? https://github.com/champgm/crunch/pull/7

Also, sorry about the pull request to apache/crunch. I've forked that and I use it to play around and create pull requests so I can have a nice place to review and comment on the diffs. When the code looks good and it builds, I'll create a patch and attach it to the JIRA. Is that an okay workflow? The official one is pretty sparse and doesn't include any kind of review steps: https://cwiki.apache.org/confluence/display/CRUNCH/Committer+Workflow




> CSVInputFormat needs to be more defensive when configuring itself
> -----------------------------------------------------------------
>
>                 Key: CRUNCH-565
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-565
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.10.0, 0.8.3
>            Reporter: mac champion
>            Assignee: mac champion
>            Priority: Minor
>              Labels: csv, csvparser
>
> It seems that the behavior of the Hadoop Configuration has changed somewhere along the line. It is possible that a call to .get(OPTION) will return null. CSVInputFormat does not handle that case gracefully:
> https://github.com/apache/crunch/blob/apache-crunch-0.10.0/crunch-core/src/main/java/org/apache/crunch/io/text/csv/CSVInputFormat.java#L178-L183
> Some more relevant details can be found in this JIRA:
> https://issues.apache.org/jira/browse/CRUNCH-564?focusedCommentId=14938186&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14938186


