Posted to mapreduce-issues@hadoop.apache.org by "Zac Hopkinson (JIRA)" <ji...@apache.org> on 2015/12/30 20:12:49 UTC
[jira] [Created] (MAPREDUCE-6596) MultipleInputs does not escape Path characters
Zac Hopkinson created MAPREDUCE-6596:
----------------------------------------
Summary: MultipleInputs does not escape Path characters
Key: MAPREDUCE-6596
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6596
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 2.6.2
Reporter: Zac Hopkinson
Assignee: Zac Hopkinson
Filenames containing commas or semicolons break MultipleInputs, because it uses these characters as delimiters when joining and storing the path names.
MultipleInputs stores mapreduce.input.multipleinputs.dir.formats as:
```
path;inputFormatClass,path2;inputFormatClass2[, ...]
```
If a filename contains one of the characters used for joining the data, then getInputFormatMap and getMapperTypeMap will fail.
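To illustrate the failure mode, here is a minimal standalone sketch. It is not the actual MultipleInputs code: the class `NaiveFormatMap` and its `join` and `parse` helpers are hypothetical stand-ins that only mimic the `path;inputFormatClass,...` layout described above and split it back without any escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the join/parse logic; not Hadoop code.
public class NaiveFormatMap {

    // Joins entries as path;inputFormatClass,path2;inputFormatClass2
    static String join(Map<String, String> pathToFormat) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : pathToFormat.entrySet()) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(e.getKey()).append(';').append(e.getValue());
        }
        return sb.toString();
    }

    // Splits the stored value on ',' and ';' with no notion of escaping.
    static Map<String, String> parse(String stored) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String mapping : stored.split(",")) {
            String[] parts = mapping.split(";");
            // Throws ArrayIndexOutOfBoundsException when a piece has no ';'
            result.put(parts[0], parts[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> inputs = new LinkedHashMap<>();
        inputs.put("/data/a,b.txt", "TextInputFormat");
        String stored = join(inputs);
        try {
            parse(stored);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("parse failed for: " + stored);
        }
    }
}
```

In this sketch a comma in the path makes parsing throw, while a semicolon in the path is arguably worse: the entry parses without error but maps a truncated path to the wrong value.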
FileInputFormat.addInputPath() handles this with escapeString and unescapeString from StringUtils, so I took the same approach for escaping in MultipleInputs.
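The escaping idea can be sketched in plain Java as follows. This is an illustration under assumptions, not the attached patch: `EscapedFormatMap` and its `escape`, `unescape`, and `split` methods are hypothetical helpers written for this example (they do not call Hadoop's StringUtils), and the backslash escape character is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helpers illustrating escape-before-join; not Hadoop code.
public class EscapedFormatMap {
    static final char ESCAPE = '\\'; // assumed escape character

    // Prefixes the escape character and both delimiters with a backslash.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == ESCAPE || c == ',' || c == ';') {
                sb.append(ESCAPE);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Drops each escape character and keeps the character that follows it.
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == ESCAPE && i + 1 < s.length()) {
                c = s.charAt(++i);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Splits on sep only where it is not escaped; pieces stay escaped.
    static List<String> split(String s, char sep) {
        List<String> parts = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == ESCAPE && i + 1 < s.length()) {
                cur.append(c).append(s.charAt(++i));
            } else if (c == sep) {
                parts.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        parts.add(cur.toString());
        return parts;
    }

    public static void main(String[] args) {
        String stored = escape("/data/a,b;c.txt") + ";TextInputFormat,"
                + escape("/data/plain") + ";KeyValueTextInputFormat";
        for (String mapping : split(stored, ',')) {
            List<String> kv = split(mapping, ';');
            System.out.println(unescape(kv.get(0)) + " -> " + kv.get(1));
        }
    }
}
```

The key design point is that splitting happens on the still-escaped string and unescaping is applied only to each extracted path, so delimiters inside a filename never reach the splitter unprotected.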