Posted to issues@opennlp.apache.org by "Peter Thygesen (JIRA)" <ji...@apache.org> on 2017/09/14 09:02:00 UTC

[jira] [Created] (OPENNLP-1131) LeipzigLanguageSampleStreamFactory should not load hidden files

Peter Thygesen created OPENNLP-1131:
---------------------------------------

             Summary: LeipzigLanguageSampleStreamFactory should not load hidden files
                 Key: OPENNLP-1131
                 URL: https://issues.apache.org/jira/browse/OPENNLP-1131
             Project: OpenNLP
          Issue Type: Bug
          Components: Language Detector
    Affects Versions: 1.8.2
            Reporter: Peter Thygesen
            Assignee: Peter Thygesen


A .DS_Store file is loaded as a sentence sample file. This should not happen.

Exception in thread "main" java.io.UncheckedIOException: java.nio.charset.MalformedInputException: Input length = 1
	at java.io.BufferedReader$1.hasNext(BufferedReader.java:574)
	at java.util.Iterator.forEachRemaining(Iterator.java:115)
	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.LongPipeline.reduce(LongPipeline.java:438)
	at java.util.stream.LongPipeline.sum(LongPipeline.java:396)
	at java.util.stream.ReferencePipeline.count(ReferencePipeline.java:526)
	at opennlp.tools.formats.leipzig.LeipzigLanguageSampleStream$LeipzigSentencesStream.<init>(LeipzigLanguageSampleStream.java:57)
	at opennlp.tools.formats.leipzig.LeipzigLanguageSampleStream.read(LeipzigLanguageSampleStream.java:157)
	at opennlp.tools.formats.leipzig.LeipzigLanguageSampleStream.read(LeipzigLanguageSampleStream.java:42)
	at opennlp.tools.formats.leipzig.SampleShuffleStream.<init>(SampleShuffleStream.java:38)
	at opennlp.tools.formats.leipzig.LeipzigLanguageSampleStreamFactory.create(LeipzigLanguageSampleStreamFactory.java:76)
	at opennlp.tools.cmdline.AbstractConverterTool.run(AbstractConverterTool.java:106)
	at opennlp.tools.cmdline.CLI.main(CLI.java:256)
Caused by: java.nio.charset.MalformedInputException: Input length = 1
	at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
	at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:339)
	at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
	at java.io.InputStreamReader.read(InputStreamReader.java:184)
	at java.io.BufferedReader.fill(BufferedReader.java:161)
	at java.io.BufferedReader.readLine(BufferedReader.java:324)
	at java.io.BufferedReader.readLine(BufferedReader.java:389)
	at java.io.BufferedReader$1.hasNext(BufferedReader.java:571)
	... 16 more
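
One possible way to avoid this is to skip hidden files when listing the sentences directory. The sketch below is not the actual OpenNLP code or patch. The class HiddenFileFilterSketch, the method listSentenceFiles, and the sample file name are hypothetical and only illustrate the idea, assuming the sample files are collected by listing a directory with java.io.File.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Arrays;

public class HiddenFileFilterSketch {

  // Hypothetical helper, not part of OpenNLP: returns the regular,
  // non-hidden files in a directory, sorted by path.
  static File[] listSentenceFiles(File dir) {
    // File.isHidden() is platform dependent, so dot-prefixed names such as
    // .DS_Store are also rejected explicitly.
    File[] files = dir.listFiles(
        f -> f.isFile() && !f.isHidden() && !f.getName().startsWith("."));

    if (files == null) {
      throw new IllegalArgumentException("Not a readable directory: " + dir);
    }

    Arrays.sort(files);
    return files;
  }

  public static void main(String[] args) throws IOException {
    // Build a throwaway directory with one hidden and one regular file.
    File dir = Files.createTempDirectory("leipzig-sketch").toFile();
    new File(dir, ".DS_Store").createNewFile();
    new File(dir, "eng_sample-sentences.txt").createNewFile(); // hypothetical name

    for (File f : listSentenceFiles(dir)) {
      System.out.println(f.getName());
    }
  }
}
```

With a filter like this, the .DS_Store file would never reach the line-counting step that throws the MalformedInputException shown in the trace above.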



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)