Posted to issues@commons.apache.org by "Frédérik Bilhaut (JIRA)" <ji...@apache.org> on 2015/10/12 14:33:05 UTC
[jira] [Commented] (COMPRESS-325) Unable to uncompress bzip2 dbPedia files
[ https://issues.apache.org/jira/browse/COMPRESS-325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953059#comment-14953059 ]
Frédérik Bilhaut commented on COMPRESS-325:
-------------------------------------------
Actually, re-compressing the same file with another tool (the "bzip2" command on macOS) produces a file that BZip2CompressorInputStream reads correctly.
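One possible explanation, which nothing in this thread confirms, is that the original download consists of several bzip2 streams concatenated back to back (parallel compressors such as pbzip2 write files like that), while the single-argument BZip2CompressorInputStream constructor stops after the first stream. The sketch below is a rough way to check that guess on a locally saved copy of the file, using only the JDK: it counts byte-aligned bzip2 stream headers. The class name Bzip2StreamHeaderCounter and its methods are hypothetical helpers, not part of Commons Compress, and the scan is only a heuristic, since the same ten bytes could in principle occur inside compressed data.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical diagnostic helper; not part of Commons Compress. */
class Bzip2StreamHeaderCounter {

    // A bzip2 stream starts with "BZh", a block-size digit '1'..'9', and then
    // either the block magic or, for an empty stream, the end-of-stream magic.
    private static final int[] BLOCK_MAGIC = {0x31, 0x41, 0x59, 0x26, 0x53, 0x59};
    private static final int[] END_MAGIC = {0x17, 0x72, 0x45, 0x38, 0x50, 0x90};
    private static final int HEADER_LENGTH = 10;

    /** Counts byte-aligned occurrences of a bzip2 stream header in the input. */
    static int countStreamHeaders(InputStream in) throws IOException {
        int[] window = new int[HEADER_LENGTH]; // the last 10 bytes seen
        long bytesRead = 0;
        int count = 0;
        int b;
        while ((b = in.read()) != -1) {
            // Slide the window one byte to the left and append the new byte.
            System.arraycopy(window, 1, window, 0, HEADER_LENGTH - 1);
            window[HEADER_LENGTH - 1] = b;
            bytesRead++;
            if (bytesRead >= HEADER_LENGTH && looksLikeHeader(window)) {
                count++;
            }
        }
        return count;
    }

    /** Convenience overload for in-memory data. */
    static int countStreamHeaders(byte[] data) {
        try {
            return countStreamHeaders(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean looksLikeHeader(int[] w) {
        if (w[0] != 'B' || w[1] != 'Z' || w[2] != 'h' || w[3] < '1' || w[3] > '9') {
            return false;
        }
        return matchesAt(w, 4, BLOCK_MAGIC) || matchesAt(w, 4, END_MAGIC);
    }

    private static boolean matchesAt(int[] w, int offset, int[] magic) {
        for (int i = 0; i < magic.length; i++) {
            if (w[offset + i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java Bzip2StreamHeaderCounter <local-file.bz2>");
            return;
        }
        // Assumes the .bz2 file was downloaded to a local path beforehand.
        try (InputStream in = new BufferedInputStream(Files.newInputStream(Paths.get(args[0])))) {
            System.out.println("bzip2 stream headers found: " + countStreamHeaders(in));
        }
    }
}
```

If the count is greater than 1 for the original file and exactly 1 for the re-compressed one, that would be consistent with the guess above. In that case the two-argument constructor BZip2CompressorInputStream(InputStream, boolean), called with decompressConcatenated set to true, is the variant Commons Compress provides for reading every stream in the file.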
> Unable to uncompress bzip2 dbPedia files
> ----------------------------------------
>
> Key: COMPRESS-325
> URL: https://issues.apache.org/jira/browse/COMPRESS-325
> Project: Commons Compress
> Issue Type: Bug
> Affects Versions: 1.10
> Reporter: Frédérik Bilhaut
>
> Sample code:
> {code:java}
> // Open the remote dump and wrap it in the bzip2 decompressor (single-argument constructor).
> URL url = new URL("http://downloads.dbpedia.org/current/core-i18n/en/labels_en.nt.bz2");
> InputStream input = new BZip2CompressorInputStream(url.openConnection().getInputStream());
> BufferedReader reader = new BufferedReader(new InputStreamReader(input, "US-ASCII"));
>
> // Print the first 10000 lines, each prefixed with its line number.
> int count = 0;
> for (String line = reader.readLine(); line != null; line = reader.readLine()) {
>     if (++count > 10000) break;
>     System.out.println(count + ": " + line);
> }
> {code}
> Output stops partway through line 7801, where the stream reports EOF:
> {code}
> 7799: <http://dbpedia.org/resource/Gamemaster> <http://www.w3.org/2000/01/rdf-schema#label> "Gamemaster"@en .
> 7800: <http://dbpedia.org/resource/Genetic_engineering> <http://www.w3.org/2000/01/rdf-schema#label> "Genetic engineering"@en .
> 7801: <http://dbpedia.org/resource/Gradius_(video_game)> <http://www.w3.org/2000/01/rdf-s
> {code}