Posted to issues@hbase.apache.org by "Harsh J Chouraria (JIRA)" <ji...@apache.org> on 2011/03/11 05:20:59 UTC
[jira] Updated: (HBASE-3623) Allow non-XML representable separator
characters in the ImportTSV tool
[ https://issues.apache.org/jira/browse/HBASE-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Harsh J Chouraria updated HBASE-3623:
-------------------------------------
Attachment: hbase.importtsv.xml.friendly.r1.diff
I've attached a patch (against trunk/) that uses Base64 encoding to achieve this.
Perhaps this can be back-ported too: it greatly helps imports in scenarios where one would otherwise have to translate the files (with tr, etc.) before using this tool.
The existing test case for ImportTSV passes, and I have added a new one for the importtsv mapper, which had no test at all.
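As a rough illustration of the Base64 idea (not the code from the attached diff), the sketch below encodes the separator before storing it and decodes it again on the reading side. It assumes the JDK's java.util.Base64 (Java 8+) rather than whatever Base64 utility the patch uses; the SeparatorCodec class is hypothetical, and a plain Map stands in for the job configuration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not taken from the patch.
final class SeparatorCodec {
    // Property name as it appears in the issue description.
    static final String SEPARATOR_KEY = "importtsv.separator";

    // Submitter side: the stored value contains only Base64 alphabet characters.
    static String encode(String separator) {
        return Base64.getEncoder()
                .encodeToString(separator.getBytes(StandardCharsets.UTF_8));
    }

    // Mapper side: recover the original separator from the stored value.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A plain Map stands in for the job configuration (assumption).
        Map<String, String> conf = new HashMap<>();
        conf.put(SEPARATOR_KEY, encode("\u001b"));

        String stored = conf.get(SEPARATOR_KEY);
        String separator = decode(stored);
        System.out.println("stored value: " + stored);
        System.out.println("round trip ok: " + separator.equals("\u001b"));
    }
}
```

Because the encoded value is plain ASCII, it should survive XML serialization unchanged, and the reading side only needs one extra decode step.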
> Allow non-XML representable separator characters in the ImportTSV tool
> ----------------------------------------------------------------------
>
> Key: HBASE-3623
> URL: https://issues.apache.org/jira/browse/HBASE-3623
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Affects Versions: 0.90.1
> Environment: Cloudera Hadoop/HBase (3B4)
> Reporter: Harsh J Chouraria
> Labels: import
> Fix For: 0.92.0
>
> Attachments: hbase.importtsv.xml.friendly.r1.diff
>
> Original Estimate: 0.5h
> Remaining Estimate: 0.5h
>
> The current importtsv functionality does not work if one passes a non-XML-representable character as the separator (say, the escape character \u001b, which is fairly commonly used).
> {code}
> -Dimporttsv.separator=$'\x1b' # This param fails the submitter when serialized.
> {code}
> While this is a limitation of the Configuration class being serialized as XML, it can be circumvented by applying a suitable encoding that makes the string XML-compatible.
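To make "non-XML representable" concrete, here is a small sketch that checks a string against the Char production of XML 1.0 (#x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]). The XmlCharCheck class and its method names are hypothetical and are not part of HBase or Hadoop; the sketch only shows which separators fall outside the allowed range.

```java
// Hypothetical helper for illustration only; not part of HBase or Hadoop.
final class XmlCharCheck {
    // True if the code point is allowed by the XML 1.0 Char production.
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // True if every code point of the string may appear in an XML 1.0 document.
    static boolean isXmlRepresentable(String s) {
        return s.codePoints().allMatch(XmlCharCheck::isXml10Char);
    }

    public static void main(String[] args) {
        System.out.println("ESC (\\u001b): " + isXmlRepresentable("\u001b"));
        System.out.println("TAB (\\t):     " + isXmlRepresentable("\t"));
        System.out.println("\"Gw==\":       " + isXmlRepresentable("Gw=="));
    }
}
```

The escape character falls outside every allowed range, while a tab and the Base64 alphabet (letters, digits, '+', '/', '=') fall inside, which is why an encoded separator can be serialized safely.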