Posted to issues@opennlp.apache.org by "Jörn Kottmann (JIRA)" <ji...@apache.org> on 2011/06/09 10:04:01 UTC

[jira] [Created] (OPENNLP-201) Sentence Detector Trainer stops reading data when it contains two lines

Sentence Detector Trainer stops reading data when it contains two lines
-----------------------------------------------------------------------

                 Key: OPENNLP-201
                 URL: https://issues.apache.org/jira/browse/OPENNLP-201
             Project: OpenNLP
          Issue Type: Bug
          Components: Sentence Detector
    Affects Versions: tools-1.5.1-incubating
            Reporter: Jörn Kottmann
            Assignee: Jörn Kottmann
             Fix For: tools-1.5.2-incubating


The Sentence Detector Trainer stops reading the training data when the input stream contains two or more consecutive empty lines. A single empty line is used to mark a document boundary.

To fix this issue, the code that reads the training data should treat a run of consecutive empty lines in the same way as one empty line.
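
The behavior described above can be illustrated with a small, self-contained sketch. This is not the OpenNLP code: the class name EmptyLineCollapser and the method collapseEmptyLines are hypothetical, and the choices to treat whitespace-only lines as empty and to drop empty lines at the very start of the input are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of OpenNLP.
public class EmptyLineCollapser {

    /**
     * Returns a copy of the input in which every run of consecutive
     * empty lines is reduced to a single empty line.
     */
    public static List<String> collapseEmptyLines(List<String> lines) {
        List<String> result = new ArrayList<>();
        // Assumption: starting in the "previous line was empty" state
        // drops empty lines that appear before the first document.
        boolean previousWasEmpty = true;

        for (String line : lines) {
            // Assumption: whitespace-only lines count as empty.
            boolean isEmpty = line.trim().isEmpty();
            if (!isEmpty) {
                result.add(line);
            } else if (!previousWasEmpty) {
                // First empty line of a run: keep it as the document boundary.
                result.add("");
            }
            previousWasEmpty = isEmpty;
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "Last sentence of the first document.",
                "",
                "",
                "",
                "First sentence of the second document.");
        for (String line : collapseEmptyLines(input)) {
            System.out.println(line);
        }
    }
}
```

With input normalized this way, a reader that ends a document at each empty line sees one boundary between the two documents, no matter how many empty lines the original data contained.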

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Closed] (OPENNLP-201) Sentence Detector Trainer stops reading data when it contains two empty lines

Posted by "Jörn Kottmann (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörn Kottmann closed OPENNLP-201.
---------------------------------

    Resolution: Fixed

Added an empty-line preprocessing stream that skips over runs of multiple empty lines.



[jira] [Updated] (OPENNLP-201) Sentence Detector Trainer stops reading data when it contains two empty lines

Posted by "Jörn Kottmann (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/OPENNLP-201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jörn Kottmann updated OPENNLP-201:
----------------------------------

    Summary: Sentence Detector Trainer stops reading data when it contains two empty lines  (was: Sentence Detector Trainer stops reading data when it contains two lines)

