Posted to issues@commons.apache.org by "Frédérik Bilhaut (JIRA)" <ji...@apache.org> on 2015/10/12 14:24:05 UTC

[jira] [Comment Edited] (COMPRESS-325) Unable to uncompress bzip2 dbPedia files

    [ https://issues.apache.org/jira/browse/COMPRESS-325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14953054#comment-14953054 ] 

Frédérik Bilhaut edited comment on COMPRESS-325 at 10/12/15 12:23 PM:
----------------------------------------------------------------------

It seems that the files were packed with pbzip2. Could this be related to COMPRESS-185?


was (Author: fbilhaut):
It seems that files were packed with pbzip2. Is it related to #185 ?

> Unable to uncompress bzip2 dbPedia files
> ----------------------------------------
>
>                 Key: COMPRESS-325
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-325
>             Project: Commons Compress
>          Issue Type: Bug
>    Affects Versions: 1.10
>            Reporter: Frédérik Bilhaut
>
> Sample code:
> {code:java}
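> import java.io.BufferedReader;
> import java.io.InputStream;
> import java.io.InputStreamReader;
> import java.net.URL;
> import org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream;
>
> // Stream the remote .bz2 file through the single-argument constructor.
> URL url = new URL("http://downloads.dbpedia.org/current/core-i18n/en/labels_en.nt.bz2");
> InputStream input = new BZip2CompressorInputStream(url.openConnection().getInputStream());
> BufferedReader reader = new BufferedReader(new InputStreamReader(input, "US-ASCII"));
>
> // Print at most the first 10000 decompressed lines.
> int count = 0;
> for (String line = reader.readLine(); line != null; line = reader.readLine()) {
> 	if (++count > 10000) break;
> 	System.out.println(count + ": " + line);
> }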
> {code}
> The output stops partway through line 7801, where the stream reports EOF:
> {code}
> 7799: <http://dbpedia.org/resource/Gamemaster> <http://www.w3.org/2000/01/rdf-schema#label> "Gamemaster"@en .
> 7800: <http://dbpedia.org/resource/Genetic_engineering> <http://www.w3.org/2000/01/rdf-schema#label> "Genetic engineering"@en .
> 7801: <http://dbpedia.org/resource/Gradius_(video_game)> <http://www.w3.org/2000/01/rdf-s
> {code}
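
If the files really are multi-stream archives, which is how pbzip2 output is generally understood to be laid out, the truncation above may be what a single-stream read looks like. The following is only a minimal sketch of that idea, not a verified fix for the dbPedia file. It assumes the two-argument constructor BZip2CompressorInputStream(InputStream, boolean decompressConcatenated) from Commons Compress. The class ConcatenatedBzip2Demo and its helper methods are hypothetical names made up for illustration, and the two in-memory chunks merely imitate a file made of several bzip2 streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

import org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream;
import org.apache.commons.compress.compressors.bzip2.BZip2CompressorOutputStream;

// Hypothetical demo class: not part of Commons Compress or of the original report.
public class ConcatenatedBzip2Demo {

    // Compresses the text as one complete, standalone bzip2 stream.
    static byte[] compressOne(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (OutputStream bz = new BZip2CompressorOutputStream(buffer)) {
            bz.write(text.getBytes(StandardCharsets.US_ASCII));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The compressor has been closed, so the stream is finished here.
        return buffer.toByteArray();
    }

    // Appends one compressed stream after another (assumed multi-stream layout).
    static byte[] concat(byte[] first, byte[] second) {
        byte[] joined = Arrays.copyOf(first, first.length + second.length);
        System.arraycopy(second, 0, joined, first.length, second.length);
        return joined;
    }

    // Decompresses the data, optionally continuing past the first bzip2 stream.
    static String decompress(byte[] data, boolean decompressConcatenated) {
        try (InputStream in = new BZip2CompressorInputStream(
                new ByteArrayInputStream(data), decompressConcatenated)) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = concat(compressOne("first\n"), compressOne("second\n"));
        // Single-stream mode: reading ends after the first stream.
        System.out.print(decompress(data, false));
        // Concatenated mode: reading continues into the second stream.
        System.out.print(decompress(data, true));
    }
}
```

Applied to the sample code in the report, the equivalent change would be passing true as the second constructor argument. Whether that resolves the EOF at line 7801 depends on the file actually consisting of several concatenated streams.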
