You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/11/27 14:16:42 UTC

[GitHub] [lucene] mocobeta opened a new pull request #482: Use the same analysis chain to StandardAnalyzer (a follow-up #480)

mocobeta opened a new pull request #482:
URL: https://github.com/apache/lucene/pull/482


   This mimics the StandardAnalyzer.
   
   https://github.com/apache/lucene/pull/480#pullrequestreview-817120063
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta commented on a change in pull request #482: Use the same analysis chain to StandardAnalyzer (a follow-up of #480)

Posted by GitBox <gi...@apache.org>.
mocobeta commented on a change in pull request #482:
URL: https://github.com/apache/lucene/pull/482#discussion_r757823244



##########
File path: lucene/luke/src/java/org/apache/lucene/luke/models/analysis/AnalysisImpl.java
##########
@@ -154,7 +158,12 @@ public Analyzer createAnalyzerFromClassName(String analyzerType) {
 
   private Analyzer defaultAnalyzer() {
     try {
-      return CustomAnalyzer.builder().withTokenizer("standard").build();
+      Map<String, String> params = new HashMap<>();
+      params.put("maxTokenLength", Integer.toString(StandardAnalyzer.DEFAULT_MAX_TOKEN_LENGTH));
+      return CustomAnalyzer.builder()
+          .withTokenizer(StandardTokenizerFactory.NAME, params)

Review comment:
       > You don't need a map. Customanalyzer takes a Varars of strings. We should use the code in the "official way" (see examples in Javadocs).
   
   I know the builder interface with variable arguments. Ok, I'll keep in mind that.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] uschindler commented on a change in pull request #482: Use the same analysis chain to StandardAnalyzer (a follow-up of #480)

Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #482:
URL: https://github.com/apache/lucene/pull/482#discussion_r757797871



##########
File path: lucene/luke/src/java/org/apache/lucene/luke/models/analysis/AnalysisImpl.java
##########
@@ -154,7 +158,12 @@ public Analyzer createAnalyzerFromClassName(String analyzerType) {
 
   private Analyzer defaultAnalyzer() {
     try {
-      return CustomAnalyzer.builder().withTokenizer("standard").build();
+      Map<String, String> params = new HashMap<>();
+      params.put("maxTokenLength", Integer.toString(StandardAnalyzer.DEFAULT_MAX_TOKEN_LENGTH));
+      return CustomAnalyzer.builder()
+          .withTokenizer(StandardTokenizerFactory.NAME, params)

Review comment:
       Here's a quite complex example by Robert. Please also note the recommend formatting:
   
   https://github.com/uschindler/german-decompounder#lucene-api-example




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] uschindler commented on a change in pull request #482: Use the same analysis chain to StandardAnalyzer (a follow-up of #480)

Posted by GitBox <gi...@apache.org>.
uschindler commented on a change in pull request #482:
URL: https://github.com/apache/lucene/pull/482#discussion_r757797473



##########
File path: lucene/luke/src/java/org/apache/lucene/luke/models/analysis/AnalysisImpl.java
##########
@@ -154,7 +158,12 @@ public Analyzer createAnalyzerFromClassName(String analyzerType) {
 
   private Analyzer defaultAnalyzer() {
     try {
-      return CustomAnalyzer.builder().withTokenizer("standard").build();
+      Map<String, String> params = new HashMap<>();
+      params.put("maxTokenLength", Integer.toString(StandardAnalyzer.DEFAULT_MAX_TOKEN_LENGTH));
+      return CustomAnalyzer.builder()
+          .withTokenizer(StandardTokenizerFactory.NAME, params)

Review comment:
       You don't need a map. Customanalyzer takes a Varars of strings. We should use the code in the "official way" (see examples in Javadocs).
   
   ```
   return CustomAnalyzer.builder()
             .withTokenizer(StandardTokenizerFactory.NAME, params), "maxTokenLength", Integer.toString(StandardAnalyzer.DEFAULT_MAX_TOKEN_LENGTH))...
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] uschindler commented on pull request #482: Use the same analysis chain to StandardAnalyzer (a follow-up of #480)

Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #482:
URL: https://github.com/apache/lucene/pull/482#issuecomment-980783528


   Hi,
   I did some quick tests. I was expecting to see the standard analyzer defaults already predefined in the UI.
   As this is not the case, I would stay with `new StandardAnalyzer()`. It is easier and useful.
   
   I had some other idea: You could also add a line to enter the class name of an Analyzer and add a button "Instantiate class (requires no-arg constructor)".


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] uschindler commented on pull request #482: Use the same analysis chain to StandardAnalyzer (a follow-up of #480)

Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #482:
URL: https://github.com/apache/lucene/pull/482#issuecomment-981080121


   Thanks 👍


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta merged pull request #482: Use the same analysis chain to StandardAnalyzer (a follow-up of #480)

Posted by GitBox <gi...@apache.org>.
mocobeta merged pull request #482:
URL: https://github.com/apache/lucene/pull/482


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta commented on pull request #482: Use the same analysis chain to StandardAnalyzer (a follow-up of #480)

Posted by GitBox <gi...@apache.org>.
mocobeta commented on pull request #482:
URL: https://github.com/apache/lucene/pull/482#issuecomment-980806417


   I replaced the default to `StandardAnalyzer`.
   
   > I did some quick tests. I was expecting to see the standard analyzer defaults already predefined in the UI.
   > As this is not the case, I would stay with new StandardAnalyzer(). It is easier and useful.
   
   Yes - while StandardAnalyzer is already used as the default (in 9.0), that is defined in the UI layer. My intention here is that set the default in the model layer; though the difference is not visible in the UI, this is more manageable and test-friendly. 
   
   > I had some other idea: You could also add a line to enter the class name of an Analyzer and add a button "Instantiate class (requires no-arg constructor)".
   
   This is definitely possible, but some work is needed (adding a UI component requires layout rebalancing and proper event handling); can come later.
   
   Here is the default look of the Luke Analysis UI with the patch (no difference from 9.0, in terms of the user interface).
   ![Screenshot from 2021-11-28 08-32-48](https://user-images.githubusercontent.com/1825333/143723393-cc7ce7f5-92a3-425b-8bcb-ca26e0150fef.png)
   
   I'll push it to the main soon.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org