Posted to issues@commons.apache.org by "Peter Karich (JIRA)" <ji...@apache.org> on 2013/04/30 09:58:16 UTC
[jira] [Comment Edited] (COMPRESS-224) Cannot uncompress very large bzip2 files
[ https://issues.apache.org/jira/browse/COMPRESS-224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644645#comment-13644645 ]
Peter Karich edited comment on COMPRESS-224 at 4/30/13 7:56 AM:
----------------------------------------------------------------
Compared with bzip2 it looks like apache-compress works fine now (I'll report in a few days if parsing the resulting xml is ok)! Can I set this as the default setting or will certain bz2 files fail?
BTW: It looks like apache-compress is only 2 times slower than bzip2 which is quite good IMO :) ! But do you know how one could improve this further? (Would save 3 hours on such big beasts :))
was (Author: peathal):
Compared with bzip2 it looks like apache-compress works fine now (I'll report in a few days if parsing the resulting xml is ok)! Can I set this as the default setting or will certain bz2 files fail?
BTW: It looks like apache-compress is only 2.4 times slower than bzip2 which is quite good IMO :) ! Or do you think there is room for improvement?
> Cannot uncompress very large bzip2 files
> ----------------------------------------
>
> Key: COMPRESS-224
> URL: https://issues.apache.org/jira/browse/COMPRESS-224
> Project: Commons Compress
> Issue Type: Bug
> Affects Versions: 1.5
> Environment: Java 1.7.0_03
> Reporter: Peter Karich
> Priority: Blocker
>
> When extracting big files like http://download.geofabrik.de/europe/germany/bayern-latest.osm.bz2 apache-compress works nicely. But when trying the same for e.g. http://ftp5.gwdg.de/pub/misc/openstreetmap/planet.openstreetmap.org/planet/planet-latest.osm.bz2 it stops without an error after exactly 900000 bytes.
> I'm using the following code:
> {code:title=App.java|borderStyle=solid}
> public static void main(String[] args) throws IOException {
>     if (args.length == 0)
>         throw new IllegalArgumentException("You need to specify the bz2 file!");
>     String fromFile = args[0];
>     if (!fromFile.endsWith(".bz2"))
>         throw new IllegalArgumentException("You need to specify a bz2 file! But was:" + fromFile);
>     String toFile = pruneFileEnd(fromFile);
>     FileInputStream in = new FileInputStream(fromFile);
>     FileOutputStream out = new FileOutputStream(toFile);
>     BZip2CompressorInputStream bzIn = new BZip2CompressorInputStream(in);
>     try {
>         final byte[] buffer = new byte[1024 * 8];
>         int n = 0;
>         while (-1 != (n = bzIn.read(buffer))) {
>             out.write(buffer, 0, n);
>         }
>     } finally {
>         out.close();
>         bzIn.close();
>     }
> }
>
> public static String pruneFileEnd(String file) {
>     int index = file.lastIndexOf(".");
>     if (index < 0)
>         return file;
>     return file.substring(0, index);
> }
> {code}
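
One way to confirm where the output stops is to count the bytes written by the copy loop. The following is only a minimal sketch using the standard library; the class name `CountingCopy` and the method `copyAndCount` are hypothetical and not part of Commons Compress or of the reporter's code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: same copy loop as in App.java, but it also
// reports how many decompressed bytes were actually written.
final class CountingCopy {
    private CountingCopy() {}

    /** Copies everything from in to out and returns the number of bytes written. */
    static long copyAndCount(InputStream in, OutputStream out) throws IOException {
        final byte[] buffer = new byte[8 * 1024];
        long total = 0;
        int n;
        // read() returns -1 once the stream reports end of input
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }
}
```

Passing the `BZip2CompressorInputStream` from the report as `in` and printing the result would show whether the copy ends at 900000 bytes. The comment above does not name the setting that made decompression work. If it is the two-argument constructor `new BZip2CompressorInputStream(in, true)` (an assumption), that stream can be passed to the same helper to compare the totals.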