You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@opennlp.apache.org by GitBox <gi...@apache.org> on 2022/12/20 12:53:01 UTC

[GitHub] [opennlp] mawiesne commented on a diff in pull request #461: OPENNLP-1416 Enhance JavaDoc in opennlp.tools.formats.ad package

mawiesne commented on code in PR #461:
URL: https://github.com/apache/opennlp/pull/461#discussion_r1053282816


##########
opennlp-tools/src/main/java/opennlp/tools/formats/ad/ADNameSampleStream.java:
##########
@@ -154,61 +153,49 @@ public class ADNameSampleStream implements ObjectStream<NameSample> {
 
   private final ObjectStream<ADSentenceStream.Sentence> adSentenceStream;
 
-  /**
+  /*
    * To keep the last left contraction part
    */
   private String leftContractionPart = null;
 
   private final boolean splitHyphenatedTokens;
 
   /**
-   * Creates a new {@link NameSample} stream from a line stream, i.e.
-   * {@link ObjectStream}&lt;{@link String}&gt;, that could be a
-   * {@link PlainTextByLineStream} object.
+   * Initializes a new {@link ADNameSampleStream} stream from a {@link ObjectStream<String>},
+   * that could be a {@link PlainTextByLineStream} object.
    *
-   * @param lineStream
-   *          a stream of lines as {@link String}
-   * @param splitHyphenatedTokens
-   *          if true hyphenated tokens will be separated: "carros-monstro" &gt;
-   *          "carros" "-" "monstro"
+   * @param lineStream An {@link ObjectStream<String>} as input.
+   * @param splitHyphenatedTokens If {@code true} hyphenated tokens will be separated:
+   *                              "carros-monstro" &gt; "carros" "-" "monstro".
    */
   public ADNameSampleStream(ObjectStream<String> lineStream, boolean splitHyphenatedTokens) {
     this.adSentenceStream = new ADSentenceStream(lineStream);
     this.splitHyphenatedTokens = splitHyphenatedTokens;
   }
 
   /**
-   * Creates a new {@link NameSample} stream from a {@link InputStream}
+   * Initializes a new {@link ADNameSampleStream} from an {@link InputStreamFactory}
    *
-   * @param in
-   *          the Corpus {@link InputStream}
-   * @param charsetName
-   *          the charset of the Arvores Deitadas Corpus
-   * @param splitHyphenatedTokens
-   *          if true hyphenated tokens will be separated: "carros-monstro" &gt;
-   *          "carros" "-" "monstro"
+   * @param in The Corpus {@link InputStreamFactory}.
+   * @param charsetName  The {@link java.nio.charset.Charset charset} to use
+   *                     for reading of the corpus.
+   * @param splitHyphenatedTokens If {@code true} hyphenated tokens will be separated:
+   *                              "carros-monstro" &gt; "carros" "-" "monstro".
    */
   @Deprecated
   public ADNameSampleStream(InputStreamFactory in, String charsetName,
       boolean splitHyphenatedTokens) throws IOException {
-
-    try {
-      this.adSentenceStream = new ADSentenceStream(new PlainTextByLineStream(
-          in, charsetName));
-      this.splitHyphenatedTokens = splitHyphenatedTokens;
-    } catch (UnsupportedEncodingException e) {
-      // UTF-8 is available on all JVMs, will never happen
-      throw new IllegalStateException(e);
-    }
+    this(new PlainTextByLineStream(in, charsetName), splitHyphenatedTokens);
   }
 
   private int textID = -1;
 
+  @Override
   public NameSample read() throws IOException {
 
     Sentence paragraph;
     // we should look for text here.
-    while ((paragraph = this.adSentenceStream.read()) != null) {
+    if ((paragraph = this.adSentenceStream.read()) != null) {

Review Comment:
   The first element read is being directly returned. Therefore `while` made no sense here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@opennlp.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org