Posted to dev@jena.apache.org by "Lorenz Bühmann (Jira)" <ji...@apache.org> on 2022/03/03 13:33:00 UTC

[jira] [Updated] (JENA-2225) TDB/TDB2 dataset size stat serialized incorrectly for large datasets

     [ https://issues.apache.org/jira/browse/JENA-2225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lorenz Bühmann updated JENA-2225:
---------------------------------
    Attachment: stats.opt

> TDB/TDB2 dataset size stat serialized incorrectly for large datasets
> --------------------------------------------------------------------
>
>                 Key: JENA-2225
>                 URL: https://issues.apache.org/jira/browse/JENA-2225
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB, TDB2
>    Affects Versions: Jena 4.3.1
>            Reporter: Lorenz Bühmann
>            Assignee: Andy Seaborne
>            Priority: Minor
>             Fix For: Jena 4.4.0
>
>         Attachments: stats.opt
>
>
> When the TDB/TDB2 stats are computed via the CLI, the dataset size is serialized incorrectly for large datasets.
> For example, for the latest Wikidata Truthy we get
> {noformat}
> (count -1983667112)){noformat}
> This happens because, for both TDB and TDB2, the corresponding `Stats.java` class forces the count into an int-typed Node even though the value is a long, so any count above Integer.MAX_VALUE overflows:
> {code:java}
> if ( count >= 0 )
>     addPair(meta.getList(), StatsMatcher.COUNT, NodeFactoryExtra.intToNode((int)count)) ; {code}
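
For illustration, the narrowing cast in the snippet above can be reproduced in isolation. This is a minimal sketch, not code from Jena: the class name `CountTruncationDemo`, the helper `narrow`, and the count of 3,000,000,000 are hypothetical, chosen only to exceed Integer.MAX_VALUE. It uses plain JDK calls and no Jena API.

```java
public class CountTruncationDemo {

    // Same kind of narrowing cast as in the quoted snippet:
    // only the low 32 bits of the long survive.
    static int narrow(long count) {
        return (int) count;
    }

    public static void main(String[] args) {
        // Hypothetical triple count, larger than Integer.MAX_VALUE (2147483647).
        long count = 3_000_000_000L;

        // The cast silently wraps around to a negative number.
        System.out.println("(count " + narrow(count) + ")"); // prints "(count -1294967296)"

        // Keeping the value as a long preserves it.
        System.out.println("(count " + count + ")"); // prints "(count 3000000000)"

        // Math.toIntExact fails loudly instead of wrapping.
        try {
            Math.toIntExact(count);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```

The sketch suggests that the stat only stays correct if the count remains a long all the way to serialization. Whether the fix uses a long-accepting node factory method is an assumption here and is not confirmed by this message.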



--
This message was sent by Atlassian Jira
(v8.20.1#820001)