Posted to hdfs-dev@hadoop.apache.org by "Zsolt Venczel (JIRA)" <ji...@apache.org> on 2018/07/18 13:32:00 UTC

[jira] [Created] (HDFS-13744) OIV tool should better handle control characters present in file or directory names

Zsolt Venczel created HDFS-13744:
------------------------------------

             Summary: OIV tool should better handle control characters present in file or directory names
                 Key: HDFS-13744
                 URL: https://issues.apache.org/jira/browse/HDFS-13744
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs, tools
    Affects Versions: 3.0.3, 2.7.6, 2.8.4, 2.9.1, 2.6.5
            Reporter: Zsolt Venczel
            Assignee: Zsolt Venczel


In certain cases, when control characters or whitespace are present in file or directory names, the OIV tool processors can export data in a misleading format.

In the examples below, EXAMPLE_NAME is both a file name and a directory name; the directory name has a line feed character at the end (the actual production case has multiple line feeds and multiple spaces).
 * CSV processor case:
 ** misleading example:
{code:java}
/user/data/EXAMPLE_NAME
,0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
/user/data/EXAMPLE_NAME,2016-08-26 03:00,2017-05-16 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
{code}

 ** expected example as suggested by [https://tools.ietf.org/html/rfc4180#section-2]:
{code:java}
"/user/data/EXAMPLE_NAME%x0A",0,2017-04-24 04:34,1969-12-31 16:00,0,0,0,-1,-1,drwxrwxr-x+,user,group
"/user/data/EXAMPLE_NAME",2016-08-26 03:00,2017-05-16 10:05,134217728,1,520,0,0,-rw-rwxr--+,user,group
{code}
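
The quoting rule from RFC 4180 section 2 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the OIV tool's code: the class name CsvFieldQuoter and the method quote are hypothetical. The sketch wraps a field in double quotes when it contains a comma, a double quote, CR or LF, doubles any embedded double quote, and keeps the line break itself literally inside the quotes.

```java
public final class CsvFieldQuoter {
    private CsvFieldQuoter() {}

    /**
     * Hypothetical helper: quotes a field following RFC 4180 section 2.
     * A field is quoted only if it contains a comma, a double quote, CR or LF.
     */
    public static String quote(String field) {
        boolean needsQuoting = false;
        for (int i = 0; i < field.length(); i++) {
            char c = field.charAt(i);
            if (c == ',' || c == '"' || c == '\r' || c == '\n') {
                needsQuoting = true;
                break;
            }
        }
        if (!needsQuoting) {
            return field;
        }
        // Embedded double quotes are escaped by doubling them.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        // The directory name ends with a line feed, so it gets quoted.
        System.out.println(quote("/user/data/EXAMPLE_NAME\n"));
        // The file name has no special characters and is left as is.
        System.out.println(quote("/user/data/EXAMPLE_NAME"));
    }
}
```

RFC 4180 also allows quoting every field unconditionally, as the expected example above does; quoting only when needed is a design choice of this sketch.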

 * XML processor case:
 ** misleading example:
{code:java}
<inode><id>479867791</id><type>DIRECTORY</type><name>EXAMPLE_NAME
</name><mtime>1493033668294</mtime><permission>user:group:0775</permission></inode>

<inode><id>113632535</id><type>FILE</type><name>EXAMPLE_NAME</name><replication>3</replication><mtime>1472205657504</mtime><atime>1494954320141</atime><preferredBlockSize>134217728</preferredBlockSize><permission>user:group:0674</permission></inode>
{code}

 ** expected example as specified in [https://www.w3.org/TR/REC-xml/#sec-line-ends]:
{code:java}
<inode><id>479867791</id><type>DIRECTORY</type><name>EXAMPLE_NAME#xA</name><mtime>1493033668294</mtime><permission>user:group:0775</permission></inode>

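<inode><id>113632535</id><type>FILE</type><name>EXAMPLE_NAME</name><replication>3</replication><mtime>1472205657504</mtime><atime>1494954320141</atime><preferredBlockSize>134217728</preferredBlockSize><permission>user:group:0674</permission></inode>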
{code}
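
One way to keep a trailing line feed visible in XML output is to write it as a numeric character reference. The sketch below is an assumption about how that could look, not the OIV XML processor's implementation: the class name XmlTextEscaper and the method escape are hypothetical. The expected example above spells the line feed as #xA, the notation used in the XML specification; in a well-formed document the reference is written &#xA;, which is what this sketch emits.

```java
public final class XmlTextEscaper {
    private XmlTextEscaper() {}

    /**
     * Hypothetical helper: escapes element text so that markup characters and
     * the whitespace controls tab, LF and CR survive a round trip.
     * Other C0 control characters are not legal in XML 1.0 even as character
     * references; this sketch leaves them unchanged, so a real tool would need
     * a separate encoding for them.
     */
    public static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '\t': sb.append("&#x9;");  break;
                case '\n': sb.append("&#xA;");  break;
                case '\r': sb.append("&#xD;");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints <name>EXAMPLE_NAME&#xA;</name>
        System.out.println("<name>" + escape("EXAMPLE_NAME\n") + "</name>");
    }
}
```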

 * JSON:
 The OIV Web Processor behaves correctly and produces the following:
{code:java}
{
  "FileStatuses": {
    "FileStatus": [
      {
        "fileId": 113632535,
        "accessTime": 1494954320141,
        "replication": 3,
        "owner": "user",
        "length": 520,
        "permission": "674",
        "blockSize": 134217728,
        "modificationTime": 1472205657504,
        "type": "FILE",
        "group": "group",
        "childrenNum": 0,
        "pathSuffix": "EXAMPLE_NAME"
      },
      {
        "fileId": 479867791,
        "accessTime": 0,
        "replication": 0,
        "owner": "user",
        "length": 0,
        "permission": "775",
        "blockSize": 0,
        "modificationTime": 1493033668294,
        "type": "DIRECTORY",
        "group": "group",
        "childrenNum": 0,
        "pathSuffix": "EXAMPLE_NAME\n"
      }
    ]
  }
}
{code}


