Posted to dev@uima.apache.org by "Jens Grivolla (JIRA)" <de...@uima.apache.org> on 2013/02/14 12:10:12 UTC

[jira] [Created] (UIMA-2670) FileSystemCollectionReader doesn't set lastSegment correctly

Jens Grivolla created UIMA-2670:
-----------------------------------

             Summary: FileSystemCollectionReader doesn't set lastSegment correctly
                 Key: UIMA-2670
                 URL: https://issues.apache.org/jira/browse/UIMA-2670
             Project: UIMA
          Issue Type: Bug
          Components: Examples
    Affects Versions: 2.4.0SDK
            Reporter: Jens Grivolla
            Priority: Minor


FileSystemCollectionReader sets lastSegment=true (in the SourceDocumentInformation) only on the last document it reads. Since it loads whole documents, not segments of a document, lastSegment should be "true" for every CAS that it generates.

This is a problem when later using a CAS multiplier to segment the CAS. It should be possible to check whether the CAS is a complete document or a segment by testing for "offsetInSource==0 && lastSegment==true".
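
The check described above can be sketched as follows. This is a minimal, self-contained illustration that does not call the UIMA API: DocInfo is a hypothetical stand-in for the offsetInSource and lastSegment features of SourceDocumentInformation, and isCompleteDocument is a helper name invented for this example.

```java
public class SegmentCheck {

    /** Hypothetical stand-in for the two SourceDocumentInformation features used here. */
    static final class DocInfo {
        final int offsetInSource;
        final boolean lastSegment;

        DocInfo(int offsetInSource, boolean lastSegment) {
            this.offsetInSource = offsetInSource;
            this.lastSegment = lastSegment;
        }
    }

    /** A CAS holds a complete document if it starts at offset 0 and is the last segment. */
    static boolean isCompleteDocument(DocInfo info) {
        return info.offsetInSource == 0 && info.lastSegment;
    }

    public static void main(String[] args) {
        // Whole document, flagged the way this report proposes.
        System.out.println(isCompleteDocument(new DocInfo(0, true)));   // true
        // Whole document that is not the last file, flagged by the current reader.
        System.out.println(isCompleteDocument(new DocInfo(0, false)));  // false
        // A later segment produced by a CAS multiplier.
        System.out.println(isCompleteDocument(new DocInfo(500, true))); // false
    }
}
```

With the current reader behavior, the second case is what a downstream component sees for every file except the last one, so the check wrongly reports those documents as incomplete.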

In org.apache.uima.examples.cpe.FileSystemCollectionReader, line 166:

srcDocInfo.setLastSegment(mCurrentIndex == mFiles.size());

should be:

srcDocInfo.setLastSegment(true);
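
To illustrate the difference, here is a small simulation of the flag each document would receive. It is only a sketch and does not use the UIMA classes: the loop stands in for repeated reads, and it assumes the reader advances its index before computing the flag, which is what the expression mCurrentIndex == mFiles.size() implies. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class LastSegmentDemo {

    /** Flags produced by the current expression: currentIndex == fileCount. */
    static List<Boolean> currentFlags(int fileCount) {
        List<Boolean> flags = new ArrayList<>();
        int currentIndex = 0;
        while (currentIndex < fileCount) {
            currentIndex++; // assumption: index is advanced when the file is picked
            flags.add(currentIndex == fileCount);
        }
        return flags;
    }

    /** Flags produced by the proposed fix: always true, one document per CAS. */
    static List<Boolean> proposedFlags(int fileCount) {
        List<Boolean> flags = new ArrayList<>();
        for (int i = 0; i < fileCount; i++) {
            flags.add(true);
        }
        return flags;
    }

    public static void main(String[] args) {
        System.out.println(currentFlags(3));  // [false, false, true]
        System.out.println(proposedFlags(3)); // [true, true, true]
    }
}
```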

