Posted to dev@ant.apache.org by GitBox <gi...@apache.org> on 2022/10/26 17:57:50 UTC

[GitHub] [ant] keithc-ca opened a new pull request, #194: Names end before the first NULL (not the last)

keithc-ca opened a new pull request, #194:
URL: https://github.com/apache/ant/pull/194

   This fixes parsing of archives produced on macOS.
   See the discussion in https://github.com/ibmruntimes/Semeru-Runtimes/issues/15.
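
As a rough illustration of the change named in the PR title, the sketch below contrasts ending a name before the first NUL with only stripping trailing NULs. It is a minimal sketch, not Ant's actual tar code: the class name, both method names, and the sample field contents (leftover bytes after the first NUL) are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;

public class TarNameSketch {
    /** Returns the name that ends before the first NUL byte in the field. */
    static String nameBeforeFirstNul(byte[] field) {
        int len = 0;
        while (len < field.length && field[len] != 0) {
            len++;
        }
        return new String(field, 0, len, StandardCharsets.ISO_8859_1);
    }

    /** Strips only trailing NULs, so bytes after an embedded NUL survive. */
    static String nameBeforeTrailingNuls(byte[] field) {
        int len = field.length;
        while (len > 0 && field[len - 1] == 0) {
            len--;
        }
        return new String(field, 0, len, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Hypothetical field: "a.txt", a NUL, leftover bytes, then NUL padding.
        byte[] field = {'a', '.', 't', 'x', 't', 0, 'j', 'u', 'n', 'k', 0, 0};
        System.out.println(nameBeforeFirstNul(field));              // a.txt
        System.out.println(nameBeforeTrailingNuls(field).length()); // 10
    }
}
```

With a single-byte encoding, stopping at the first NUL gives the expected name, while stripping only the padding keeps the embedded NUL and everything after it.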


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@ant.apache.org
For additional commands, e-mail: dev-help@ant.apache.org


[GitHub] [ant] keithc-ca commented on pull request #194: Names end before the first NULL (not the last)

Posted by GitBox <gi...@apache.org>.
keithc-ca commented on PR #194:
URL: https://github.com/apache/ant/pull/194#issuecomment-1305963866

   Thanks for merging this.
   
   > We'd like to credit you in CONTRIBUTORS and contributors.xml
   
   Please use
   ```
   <name>
       <first>Keith</first>
       <middle>W.</middle>
       <last>Campbell</last>
   </name>
   ```
   




[GitHub] [ant] keithc-ca commented on pull request #194: Names end before the first NULL (not the last)

Posted by GitBox <gi...@apache.org>.
keithc-ca commented on PR #194:
URL: https://github.com/apache/ant/pull/194#issuecomment-1326867522

   > `ZipEncoding` might be a bit too simplistic
   
   Yes, I think so. I explored the idea of just decoding the whole byte array, but that will fail if there's a bad number of nulls at the end (e.g. an odd number for UTF16).
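
The failure mode described here can be reproduced with a strict decoder. The following is a minimal sketch, assuming UTF-16BE as the encoding and using hypothetical class and method names; it only shows that an odd number of padding bytes makes whole-field decoding fail.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

public class WholeFieldDecodeSketch {
    /** Decodes every byte strictly as UTF-16BE; malformed input raises an exception. */
    static String decodeStrict(byte[] field) throws CharacterCodingException {
        return StandardCharsets.UTF_16BE.newDecoder()
                .decode(ByteBuffer.wrap(field))
                .toString();
    }

    /** Returns true if the whole field decodes without error. */
    static boolean decodes(byte[] field) {
        try {
            decodeStrict(field);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] evenPadding = {0, 'a', 0, 0};    // "a" followed by one encoded NUL
        byte[] oddPadding = {0, 'a', 0, 0, 0};  // one stray padding byte at the end
        System.out.println(decodes(evenPadding)); // true
        System.out.println(decodes(oddPadding));  // false
    }
}
```

Note that `new String(bytes, charset)` would not throw here, because it replaces malformed input instead of reporting it; the sketch uses `CharsetDecoder` so that the error is visible.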




[GitHub] [ant] keithc-ca commented on pull request #194: Names end before the first NULL (not the last)

Posted by GitBox <gi...@apache.org>.
keithc-ca commented on PR #194:
URL: https://github.com/apache/ant/pull/194#issuecomment-1329179830

   That's an interesting idea, but it has the potential to misidentify the end of a multi-byte encoding.
   For example, `"Ā!"` = `"\u0100!"` is encoded by UTF-16 as { 1, 0, 0, 33 }. The null sequence would be found at offset 1, so only { 1 } would be given to the decoder, resulting in a `MalformedInputException`.
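
The example above can be checked with a short program. This is a sketch under assumptions: it uses UTF-16BE (no BOM) to obtain exactly the bytes { 1, 0, 0, 33 }, and the class and method names are hypothetical. It searches for the encoded NUL at every byte offset, which is the behaviour being criticised.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;

public class NulSearchSketch {
    /** Returns the first offset at which needle occurs in haystack, or -1. */
    static int indexOf(byte[] haystack, byte[] needle) {
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            int j = 0;
            while (j < needle.length && haystack[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) {
                return i;
            }
        }
        return -1;
    }

    /** Strictly decodes the first len bytes as UTF-16BE; false on malformed input. */
    static boolean decodesStrictly(byte[] bytes, int len) {
        try {
            StandardCharsets.UTF_16BE.newDecoder().decode(ByteBuffer.wrap(bytes, 0, len));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] name = "\u0100!".getBytes(StandardCharsets.UTF_16BE); // { 1, 0, 0, 33 }
        byte[] nul = "\0".getBytes(StandardCharsets.UTF_16BE);       // { 0, 0 }
        int end = indexOf(name, nul);
        System.out.println("NUL found at offset " + end);                    // 1
        System.out.println("prefix decodes: " + decodesStrictly(name, end)); // false
    }
}
```

The match at offset 1 straddles two code units, so the one-byte prefix handed to the decoder is not valid UTF-16.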




[GitHub] [ant] bodewig commented on pull request #194: Names end before the first NULL (not the last)

Posted by GitBox <gi...@apache.org>.
bodewig commented on PR #194:
URL: https://github.com/apache/ant/pull/194#issuecomment-1305971189

   thank you!




[GitHub] [ant] bodewig commented on pull request #194: Names end before the first NULL (not the last)

Posted by GitBox <gi...@apache.org>.
bodewig commented on PR #194:
URL: https://github.com/apache/ant/pull/194#issuecomment-1330216852

   Good catch, thank you.
   
   I could try to turn the byte array into a string, catch the exception and then keep searching for encoded NULs later.
   
   But even then it might just fail for some other encoding, something that's even more complex and uses different numbers of bytes per glyph like UTF-8. The more I think about it the more I feel it is impossible to make the code work for general multi-byte encodings, at least without moving the burden to `ZipEncoding` itself. Will need to think about that a bit more.




[GitHub] [ant] qf28 commented on pull request #194: Names end before the first NULL (not the last)

Posted by GitBox <gi...@apache.org>.
qf28 commented on PR #194:
URL: https://github.com/apache/ant/pull/194#issuecomment-1305727468

   Hi




[GitHub] [ant] bodewig commented on pull request #194: Names end before the first NULL (not the last)

Posted by GitBox <gi...@apache.org>.
bodewig commented on PR #194:
URL: https://github.com/apache/ant/pull/194#issuecomment-1327750192

   well, in the end the tar format is not meant to store file names in the name field with a multi-byte encoding. The few docs that one can find talk about "local variant of ASCII" and the UTF16 test only worked by accident before the change.
   
   I'll change the docs in either case, discouraging the use of multi-byte encodings, and change the test to use codepage 1252 or something like that. And then I'll try to think about a proper - ideally backwards compatible - extension of ZipEncoding that would help us here.




[GitHub] [ant] bodewig merged pull request #194: Names end before the first NULL (not the last)

Posted by GitBox <gi...@apache.org>.
bodewig merged PR #194:
URL: https://github.com/apache/ant/pull/194




[GitHub] [ant] bodewig commented on pull request #194: Names end before the first NULL (not the last)

Posted by GitBox <gi...@apache.org>.
bodewig commented on PR #194:
URL: https://github.com/apache/ant/pull/194#issuecomment-1305954729

   I should have asked my five-years-younger self whether I remember why the code is what it is :-) - see https://github.com/apache/commons-compress/pull/54
   
   I'll merge this PR right away. We'd like to credit you in CONTRIBUTORS and contributors.xml if you want to. Which name would you want us to add?




[GitHub] [ant] bodewig commented on pull request #194: Names end before the first NULL (not the last)

Posted by GitBox <gi...@apache.org>.
bodewig commented on PR #194:
URL: https://github.com/apache/ant/pull/194#issuecomment-1320928103

   Actually, this breaks support for file names using multi-byte encodings, where NULs may just be part of a multi-byte sequence and not signal the end of the name. You can see this by running
   
   ```
   ./build.sh -f src/etc/testcases/taskdefs/untar.xml encodingTest
   ```
   
   in Ant's root. This creates a tar using UTF-16 and fails to expand it again. I'm not 100% sure how to solve this as `ZipEncoding` might be a bit too simplistic.
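
To see why a byte-wise "first NUL" rule cannot work for UTF-16, it may help to look at where the first zero byte lands in an encoded name. This is only a sketch: the class name, the method name, and the sample name `a.txt` are assumptions, and it does not reproduce the `encodingTest` target itself.

```java
import java.nio.charset.StandardCharsets;

public class FirstNulUtf16Sketch {
    /** Returns the number of bytes before the first NUL byte (the whole length if none). */
    static int lengthBeforeFirstNul(byte[] field) {
        int len = 0;
        while (len < field.length && field[len] != 0) {
            len++;
        }
        return len;
    }

    public static void main(String[] args) {
        byte[] latin1 = "a.txt".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf16be = "a.txt".getBytes(StandardCharsets.UTF_16BE); // { 0, 'a', 0, '.', ... }
        byte[] utf16 = "a.txt".getBytes(StandardCharsets.UTF_16);     // BOM FE FF, then { 0, 'a', ... }

        System.out.println(lengthBeforeFirstNul(latin1));  // 5: the whole name
        System.out.println(lengthBeforeFirstNul(utf16be)); // 0: nothing left to decode
        System.out.println(lengthBeforeFirstNul(utf16));   // 2: only the BOM survives
    }
}
```

For ASCII characters, every UTF-16 code unit contains a zero byte, so the name is cut off before its first character.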




[GitHub] [ant] bodewig commented on pull request #194: Names end before the first NULL (not the last)

Posted by GitBox <gi...@apache.org>.
bodewig commented on PR #194:
URL: https://github.com/apache/ant/pull/194#issuecomment-1328088190

   It seems e04fbe7ff works for our tests. I thought about letting the encoding tell us what a NUL would look like and then searching for that. This seems to work, even though it was more complex than I hoped for, as Java's UTF-16 encoder prepends a BOM to every string, including one consisting only of a NUL.
   
   Of course this is slower than the code we had before but parsing tar archive names is not likely to be the performance bottleneck of a build process. And people should not use Ant's tar package but rather switch to Commons Compress outside of the context of Ant itself.
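
The BOM behaviour mentioned above is easy to observe. The sketch below asks a charset for its encoding of NUL and drops a leading UTF-16 BOM if one is present. It is not the code from e04fbe7ff: the class name and the `encodedNul` helper are hypothetical, and only the two UTF-16 BOMs are handled.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodedNulSketch {
    /** Hypothetical helper: the bytes a charset produces for NUL, minus any UTF-16 BOM. */
    static byte[] encodedNul(Charset charset) {
        byte[] raw = "\0".getBytes(charset);
        boolean bigEndianBom = raw.length > 2
                && raw[0] == (byte) 0xFE && raw[1] == (byte) 0xFF;
        boolean littleEndianBom = raw.length > 2
                && raw[0] == (byte) 0xFF && raw[1] == (byte) 0xFE;
        return (bigEndianBom || littleEndianBom)
                ? Arrays.copyOfRange(raw, 2, raw.length)
                : raw;
    }

    public static void main(String[] args) {
        // Java's "UTF-16" charset writes a big-endian BOM when encoding.
        System.out.println(Arrays.toString("\0".getBytes(StandardCharsets.UTF_16)));    // [-2, -1, 0, 0]
        System.out.println(Arrays.toString(encodedNul(StandardCharsets.UTF_16)));       // [0, 0]
        System.out.println(Arrays.toString(encodedNul(StandardCharsets.ISO_8859_1)));   // [0]
    }
}
```

Searching for the returned byte sequence is still subject to the alignment problem raised elsewhere in this thread, so the helper only covers the "what does a NUL look like" half of the idea.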
   

