Posted to dev@knox.apache.org by "Larry McCay (JIRA)" <ji...@apache.org> on 2017/06/17 14:39:00 UTC

[jira] [Updated] (KNOX-971) Putting files with special characters in the name mangles the name in HDFS

     [ https://issues.apache.org/jira/browse/KNOX-971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Larry McCay updated KNOX-971:
-----------------------------
    Description: 
When issuing a CREATE for a filename that contains special characters, with something like the following:

{code}
curl -i -L -u guest:guest-password "https://localhost:8443/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?op=CREATE"
{code}

The file is successfully created and written, and it can be successfully retrieved.
However, the resulting filename within HDFS is actually "test_�lectronique_embarqu�.pdf". If the same filename is used from the dfs CLI, the filename is correct in HDFS.

Moreover, trying to retrieve the properly named file through the gateway results in a 404.
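
As a point of comparison for the request above, the following Python sketch shows what the path would look like if the client percent-encoded the filename as UTF-8 before sending it. The helper name `webhdfs_create_url` and the base URL are assumptions made for illustration, and this report does not establish whether pre-encoding the path avoids the mangling.

```python
from urllib.parse import quote


def webhdfs_create_url(base_url: str, hdfs_path: str) -> str:
    """Build a WebHDFS CREATE URL with the path percent-encoded as UTF-8.

    Hypothetical helper for illustration; it is not part of Knox or Hadoop.
    """
    # quote() encodes non-ASCII characters as UTF-8 bytes by default and
    # leaves "/" unescaped so the path structure is preserved.
    encoded_path = quote(hdfs_path, safe="/")
    return f"{base_url.rstrip('/')}/{encoded_path.lstrip('/')}?op=CREATE"


# Assumed gateway base URL, modelled on the host and topology path in this report.
BASE = "https://localhost:8443/gateway/sandbox/webhdfs/v1"
url = webhdfs_create_url(BASE, "/user/admin/test_\u00e9lectronique_embarqu\u00e9.pdf")
print(url)
```

Each accented character becomes the two-byte sequence `%C3%A9` here, which is what a UTF-8 aware server would decode back to the intended name.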

{code}
17/06/17 10:28:41 ||349b43e9-f449-4c5f-9dfb-f84e7ef943a2|audit|127.0.0.1|WEBHDFS||||access|uri|/gateway/sandbox/webhdfs/v1//user/admin/test_�lectronique_embarqu�.pdf?op=CREATE|unavailable|Request method: PUT
17/06/17 10:28:41 ||349b43e9-f449-4c5f-9dfb-f84e7ef943a2|audit|127.0.0.1|WEBHDFS|guest|||authentication|uri|/gateway/sandbox/webhdfs/v1//user/admin/test_�lectronique_embarqu�.pdf?op=CREATE|success|
17/06/17 10:28:41 ||349b43e9-f449-4c5f-9dfb-f84e7ef943a2|audit|127.0.0.1|WEBHDFS|guest|||authentication|uri|/gateway/sandbox/webhdfs/v1//user/admin/test_�lectronique_embarqu�.pdf?op=CREATE|success|Groups: []
17/06/17 10:28:41 ||349b43e9-f449-4c5f-9dfb-f84e7ef943a2|audit|127.0.0.1|WEBHDFS|guest|hdfs||identity-mapping|principal|guest|success|Effective User: hdfs
17/06/17 10:28:41 ||349b43e9-f449-4c5f-9dfb-f84e7ef943a2|audit|127.0.0.1|WEBHDFS|guest|hdfs||dispatch|uri|http://c6401.ambari.apache.org:50070/webhdfs/v1/user/admin/test_�lectronique_embarqu�.pdf?op=CREATE&user.name=hdfs|unavailable|Request method: PUT
17/06/17 10:28:41 ||349b43e9-f449-4c5f-9dfb-f84e7ef943a2|audit|127.0.0.1|WEBHDFS|guest|hdfs||dispatch|uri|http://c6401.ambari.apache.org:50070/webhdfs/v1/user/admin/test_�lectronique_embarqu�.pdf?op=CREATE&user.name=hdfs|success|Response status: 307
17/06/17 10:28:41 ||349b43e9-f449-4c5f-9dfb-f84e7ef943a2|audit|127.0.0.1|WEBHDFS|guest|hdfs||access|uri|/gateway/sandbox/webhdfs/v1//user/admin/test_�lectronique_embarqu�.pdf?op=CREATE|success|Response status: 307
17/06/17 10:28:41 ||82c97677-3b7b-4c19-bfea-e82fc149fc30|audit|127.0.0.1|WEBHDFS||||access|uri|/gateway/sandbox/webhdfs/data/v1/webhdfs/v1/user/admin/test_�lectronique_embarqu�.pdf?_=AAAACAAAABAAAACwKFV5ruVkBa7y6-HR3hqRqWFrapQYx523sBG1Vkvfg88gfaoAs4u2AZcbpm-KRYVqgQanuBkZFPyA4lxqwzptXGis5FNuQjk3fTHxEfGvsqGP2TbVQL24MT59dxszeVqwGxLrPS8SCLruYA5XmCEYt4Zhbty5IPdZFXikUc0aqolHSeafnc9j0gkrBBzfbUexrTSMMY26-Su8622oG5bFPlTWH1klTZ7vx_gRIwb6IdHhUnRwGgamJ3CpwArtXAJ9a9J8JHClvvk|unavailable|Request method: PUT
17/06/17 10:28:41 ||82c97677-3b7b-4c19-bfea-e82fc149fc30|audit|127.0.0.1|WEBHDFS|guest|||authentication|uri|/gateway/sandbox/webhdfs/data/v1/webhdfs/v1/user/admin/test_�lectronique_embarqu�.pdf?_=AAAACAAAABAAAACwKFV5ruVkBa7y6-HR3hqRqWFrapQYx523sBG1Vkvfg88gfaoAs4u2AZcbpm-KRYVqgQanuBkZFPyA4lxqwzptXGis5FNuQjk3fTHxEfGvsqGP2TbVQL24MT59dxszeVqwGxLrPS8SCLruYA5XmCEYt4Zhbty5IPdZFXikUc0aqolHSeafnc9j0gkrBBzfbUexrTSMMY26-Su8622oG5bFPlTWH1klTZ7vx_gRIwb6IdHhUnRwGgamJ3CpwArtXAJ9a9J8JHClvvk|success|
17/06/17 10:28:41 ||82c97677-3b7b-4c19-bfea-e82fc149fc30|audit|127.0.0.1|WEBHDFS|guest|||authentication|uri|/gateway/sandbox/webhdfs/data/v1/webhdfs/v1/user/admin/test_�lectronique_embarqu�.pdf?_=AAAACAAAABAAAACwKFV5ruVkBa7y6-HR3hqRqWFrapQYx523sBG1Vkvfg88gfaoAs4u2AZcbpm-KRYVqgQanuBkZFPyA4lxqwzptXGis5FNuQjk3fTHxEfGvsqGP2TbVQL24MT59dxszeVqwGxLrPS8SCLruYA5XmCEYt4Zhbty5IPdZFXikUc0aqolHSeafnc9j0gkrBBzfbUexrTSMMY26-Su8622oG5bFPlTWH1klTZ7vx_gRIwb6IdHhUnRwGgamJ3CpwArtXAJ9a9J8JHClvvk|success|Groups: []
17/06/17 10:28:41 ||82c97677-3b7b-4c19-bfea-e82fc149fc30|audit|127.0.0.1|WEBHDFS|guest|hdfs||identity-mapping|principal|guest|success|Effective User: hdfs
17/06/17 10:28:41 ||82c97677-3b7b-4c19-bfea-e82fc149fc30|audit|127.0.0.1|WEBHDFS|guest|hdfs||dispatch|uri|http://c6401.ambari.apache.org:50075/webhdfs/v1/user/admin/test_%EF%BF%BDlectronique_embarqu%EF%BF%BD.pdf?op=CREATE&namenoderpcaddress=c6401.ambari.apache.org%3A8020&user.name=hdfs&createflag&createparent=true&overwrite=false|unavailable|Request method: PUT
17/06/17 10:28:41 ||82c97677-3b7b-4c19-bfea-e82fc149fc30|audit|127.0.0.1|WEBHDFS|guest|hdfs||dispatch|uri|http://c6401.ambari.apache.org:50075/webhdfs/v1/user/admin/test_%EF%BF%BDlectronique_embarqu%EF%BF%BD.pdf?op=CREATE&namenoderpcaddress=c6401.ambari.apache.org%3A8020&user.name=hdfs&createflag&createparent=true&overwrite=false|success|Response status: 201
17/06/17 10:28:41 ||82c97677-3b7b-4c19-bfea-e82fc149fc30|audit|127.0.0.1|WEBHDFS|guest|hdfs||access|uri|/gateway/sandbox/webhdfs/data/v1/webhdfs/v1/user/admin/test_�lectronique_embarqu�.pdf?_=AAAACAAAABAAAACwKFV5ruVkBa7y6-HR3hqRqWFrapQYx523sBG1Vkvfg88gfaoAs4u2AZcbpm-KRYVqgQanuBkZFPyA4lxqwzptXGis5FNuQjk3fTHxEfGvsqGP2TbVQL24MT59dxszeVqwGxLrPS8SCLruYA5XmCEYt4Zhbty5IPdZFXikUc0aqolHSeafnc9j0gkrBBzfbUexrTSMMY26-Su8622oG5bFPlTWH1klTZ7vx_gRIwb6IdHhUnRwGgamJ3CpwArtXAJ9a9J8JHClvvk|success|Response status: 201
17/06/17 10:28:47 ||78217ce2-1302-4dcd-a1a9-9e5a7548e552|audit|127.0.0.1|WEBHDFS||||access|uri|/gateway/sandbox/webhdfs/v1/user/admin/?op=LISTSTATUS|unavailable|Request method: GET
17/06/17 10:28:47 ||78217ce2-1302-4dcd-a1a9-9e5a7548e552|audit|127.0.0.1|WEBHDFS|guest|||authentication|uri|/gateway/sandbox/webhdfs/v1/user/admin/?op=LISTSTATUS|success|
17/06/17 10:28:47 ||78217ce2-1302-4dcd-a1a9-9e5a7548e552|audit|127.0.0.1|WEBHDFS|guest|||authentication|uri|/gateway/sandbox/webhdfs/v1/user/admin/?op=LISTSTATUS|success|Groups: []
17/06/17 10:28:47 ||78217ce2-1302-4dcd-a1a9-9e5a7548e552|audit|127.0.0.1|WEBHDFS|guest|hdfs||identity-mapping|principal|guest|success|Effective User: hdfs
17/06/17 10:28:47 ||78217ce2-1302-4dcd-a1a9-9e5a7548e552|audit|127.0.0.1|WEBHDFS|guest|hdfs||dispatch|uri|http://c6401.ambari.apache.org:50070/webhdfs/v1/user/admin?op=LISTSTATUS&user.name=hdfs|unavailable|Request method: GET
17/06/17 10:28:47 ||78217ce2-1302-4dcd-a1a9-9e5a7548e552|audit|127.0.0.1|WEBHDFS|guest|hdfs||dispatch|uri|http://c6401.ambari.apache.org:50070/webhdfs/v1/user/admin?op=LISTSTATUS&user.name=hdfs|success|Response status: 200
17/06/17 10:28:47 ||78217ce2-1302-4dcd-a1a9-9e5a7548e552|audit|127.0.0.1|WEBHDFS|guest|hdfs||access|uri|/gateway/sandbox/webhdfs/v1/user/admin/?op=LISTSTATUS|success|Response status: 200
{code}
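
The `%EF%BF%BD` sequences in the dispatch lines to port 50075 above are the UTF-8 percent-encoding of U+FFFD, the Unicode replacement character, which suggests the original bytes were already lost before that request was dispatched. The Python sketch below checks this and shows one way such a name can arise: a Latin-1 encoded "é" (byte 0xE9) decoded as UTF-8 with replacement. That mechanism is an assumption for illustration; the log does not show which component or charset is responsible.

```python
from urllib.parse import quote, unquote

# Path segment copied from the dispatch lines in the audit log.
logged = "test_%EF%BF%BDlectronique_embarqu%EF%BF%BD.pdf"
# The filename the client intended to create.
intended = "test_\u00e9lectronique_embarqu\u00e9.pdf"

# Decoding the logged segment yields U+FFFD where each accented letter was.
decoded = unquote(logged)
has_replacement = "\ufffd" in decoded

# What a UTF-8 percent-encoding of the intended name would look like.
expected = quote(intended, safe="")

# Assumed mechanism: Latin-1 bytes misread as UTF-8, invalid bytes replaced.
mangled = intended.encode("latin-1").decode("utf-8", errors="replace")
reproduces_log = quote(mangled, safe="") == logged

print(has_replacement, expected, reproduces_log)
```

Because U+FFFD carries no trace of the original byte, the intended name cannot be recovered from the mangled one, which is consistent with the properly named file not being reachable through the gateway.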

  was:
When issuing a CREATE of a filename with special characters with something like the following:

{code}
curl -i -L -u guest:guest-password "https://localhost:8443/webhdfs/v1/user/shfs3453/WORK/datasets/test_électronique_embarqué.pdf?op=CREATE"
{code}

The file is successfully created and written and can be successfully retrieved.
However the resulting filename within HDFS is actually "test_�lectronique_embarqu�.pdf" and if the same filename is used from dfs CLI the filename is correct in HDFS.

Moreover, trying to retrieve the properly named file through the gateway results in a 404.



> Putting files with special characters in the name mangles the name in HDFS
> --------------------------------------------------------------------------
>
>                 Key: KNOX-971
>                 URL: https://issues.apache.org/jira/browse/KNOX-971
>             Project: Apache Knox
>          Issue Type: Bug
>          Components: Server
>            Reporter: Larry McCay
>            Assignee: Larry McCay
>             Fix For: 0.13.0


