You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@opennlp.apache.org by GitBox <gi...@apache.org> on 2022/12/04 15:30:25 UTC

[GitHub] [opennlp] kinow commented on a diff in pull request #446: OPENNLP-1404 Enhance JavaDoc in opennlp.tools.postag package

kinow commented on code in PR #446:
URL: https://github.com/apache/opennlp/pull/446#discussion_r1038987138


##########
opennlp-tools/src/main/java/opennlp/tools/postag/WordTagSampleStream.java:
##########
@@ -58,6 +61,7 @@ public POSSample read() throws IOException {
       try {
         sample = POSSample.parse(sentence);
       } catch (InvalidFormatException e) {
+        // TODO: An exception in error case should be thrown.

Review Comment:
   :+1: 



##########
opennlp-tools/src/main/java/opennlp/tools/postag/POSTaggerFactory.java:
##########
@@ -363,11 +382,11 @@ public void validateArtifactMap() throws InvalidFormatException {
     if (ngramDictEntry != null && !(ngramDictEntry instanceof Dictionary)) {
       throw new InvalidFormatException("NGram dictionary has wrong type!");
     }
-
   }
 
+  // reduced visibility to ensure deprecation is respected in future versions
   @Deprecated

Review Comment:
   We are more strict about changing visibility or removing public methods in Commons between non-major releases, but I think it should be fine for OpenNLP, as its API is not expected to be used by most users (i.e. most should be using the tools to train/infer with models instead). Even then, worth adding a note to the changelog about it IMO.



##########
opennlp-tools/src/main/java/opennlp/tools/postag/POSTagger.java:
##########
@@ -26,14 +26,38 @@ public interface POSTagger {
 
   /**
    * Assigns the sentence of tokens pos tags.
+   *
    * @param sentence The sentence of tokens to be tagged.
-   * @return an array of pos tags for each token provided in sentence.
+   * @return An array of pos tags for each token provided in {@code sentence}.
    */
   String[] tag(String[] sentence);
 
+  /**
+   * Assigns the sentence of tokens pos tags.
+   *
+   * @param sentence The sentence of tokens to be tagged.
+   * @param additionalContext The context to provide additional information with.
+   *                          
+   * @return An array of pos tags for each token provided in {@code sentence}.
+   */
   String[] tag(String[] sentence, Object[] additionalContext);
 
+  /**
+   * Assigns the sentence the top-k {@link Sequence sequences}.
+   *
+   * @param sentence The sentence of tokens to be tagged.
+   *
+   * @return An array of {@link Sequence sequeneces} for each token provided in {@code sentence}.
+   */
   Sequence[] topKSequences(String[] sentence);
 
+  /**
+   * Assigns the sentence the top-k {@link Sequence sequences}.
+   *
+   * @param sentence The sentence of tokens to be tagged.
+   * @param additionalContext The context to provide additional information with.
+   *
+   * @return An array of {@link Sequence sequeneces} for each token provided in {@code sentence}.

Review Comment:
   s/sequeneces/sequences



##########
opennlp-tools/src/main/java/opennlp/tools/postag/POSTaggerFactory.java:
##########
@@ -95,8 +83,19 @@ public POSTaggerFactory(byte[] featureGeneratorBytes, final Map<String, Object>
     this.posDictionary = posDictionary;
   }
 
+
+  // reduced visibility to ensure deprecation is respected in future versions
+  @Deprecated
+  POSTaggerFactory(Dictionary ngramDictionary, TagDictionary posDictionary) {
+    this.init(ngramDictionary, posDictionary);
+
+    // TODO: This could be made functional by creating some default feature generation
+    // which uses the dictionary ...

Review Comment:
   :ok_man: 



##########
opennlp-tools/src/main/java/opennlp/tools/postag/TagDictionary.java:
##########
@@ -25,11 +25,18 @@
 public interface TagDictionary {
 
   /**
-   * Returns a list of valid tags for the specified word.
+   * Retrieves a list of valid tags for the specified {@code word}.
    *
    * @param word The word.
-   * @return A list of valid tags for the specified word or null if no information
-   * is available for that word.
+   * @return An array of valid tags for the specified {@code word} or {@code null} if
+   *         no information is available for that word.
    */
   String[] getTags(String word);
+
+  /**
+   * Whether if the dictionary is case-sensitive or not.

Review Comment:
   I think we can drop the `if` (i.e. Whether the dictionary is... or not).



##########
opennlp-tools/src/main/java/opennlp/tools/postag/POSTaggerFactory.java:
##########
@@ -225,49 +240,51 @@ public void setTagDictionary(TagDictionary dictionary) {
     this.posDictionary = dictionary;
   }
 
+  /**
+   * @return The key-value based resources map, or an empty map.
+   */
   protected Map<String, Object> getResources() {
-
-
     if (resources != null) {
       return resources;
     }
 
     return Collections.emptyMap();
   }
 
+  /**
+   * @return The feature generator bytes used.
+   */
   protected byte[] getFeatureGenerator() {
     return featureGeneratorBytes;
   }
 
+  /**
+   * @return The {@link TagDictionary} used.
+   */
   public TagDictionary getTagDictionary() {
     if (this.posDictionary == null && artifactProvider != null)
       this.posDictionary = artifactProvider.getArtifact(TAG_DICTIONARY_ENTRY_NAME);
     return this.posDictionary;
   }
 
-  /**
-   * @deprecated this will be reduced in visibility and later removed
-   */
-  @Deprecated
-  public Dictionary getDictionary() {
+  @Deprecated // will be removed when only 8 series models are supported
+  private Dictionary getDictionary() {

Review Comment:
   Ditto the note below, about changing public methods :point_down: 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@opennlp.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org