You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/02/02 11:05:58 UTC

[GitHub] [lucene] mocobeta opened a new pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

mocobeta opened a new pull request #638:
URL: https://github.com/apache/lucene/pull/638


   See https://issues.apache.org/jira/browse/LUCENE-10393
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta commented on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
mocobeta commented on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1027850413


   The test fails since this wouldn't work on module mode (the resource loader attempts to read resources in another module). I'll look at it later.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta commented on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
mocobeta commented on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1028811056


   Personally, I don't care about the APIs (two-args constructors to load external resources) too, but I know there are people who need and add the feature to kuromoji and nori (I forgot the issue).
   
   > I would really not move this dictionary code to commons module and add opens. 
   
   Sure, yes - I understood your intention.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] uschindler commented on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1028808508


   I would really not move this dictionary code to commons module and add opens. This is as bad as having the (now deprecated) methods in IOUtils to load stopwords files. Loading of a class resource must always be done by the class itsself not delegated to somebody else - because it is caller-sensitive!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta edited a comment on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
mocobeta edited a comment on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1028790164


   Thanks for your comment. Minor correction - we currently don't even have DictionaryLoader but the resource loading methods are embedded in two abstract BinaryDictionarys themselves. My intention here is, first simplify the resource loading stuff then break it down into proper methods. I know the current switching between classpath/filepath is not good at all and agree with we should remove it at some point.
   I'll explore if we can remove the delegation around resource loading without changing the API interface (for now).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta commented on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
mocobeta commented on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1028790164


   Thanks for your comment. Minor correction - we currently don't even have DictionaryLoader but the resource loading methods are embedded in two BinaryDictionarys themselves. My intention here is, first simplify the resource loading stuff then break it down into proper methods. I know the current switching between classpath/filepath is not good at all and agree with we should remove it at some point.
   I'll explore if we can remove the delegation around resource loading without changing the API interface (for now).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] uschindler commented on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1028771777


   > Please don't do the current code or add opens. Just simple remove the DictionaryLoader class completely and just use getResourceAsStream() directly. Most important: Remove the ENUM, its a misnomer anyways (CLASSPATH is incorrect).
   
   This is exactly the same: Never ever delegate opening class' resources to other code. Call getResourceAsStream() from the class that wants to load the files and where they reside. Don't add abstractions: ConnectionCosts class should call getResourceAsStream in its own code and load the file placed next to the class in JAR.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta edited a comment on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
mocobeta edited a comment on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1028811056


   Personally, I don't care about the API or two-args constructor to load external resources (the main interest/concern here) too, but I know there are people who need and add the feature to kuromoji and nori (I forgot the issue). To completely get rid of the methods for loading external resources, we'd need to work on LUCENE-8816. But we seem to have a long way to go.
   
   > I would really not move this dictionary code to commons module and add opens. 
   
   Sure, yes - I understood your intention.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta commented on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
mocobeta commented on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1028858655


   I will open an issue to clean up the constructors' signatures for the dictionaries if I have a chance. I might need some time and good motivation for the refactoring.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta commented on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
mocobeta commented on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1028598383


   I opened `o.a.l.a.ja.dict` and `o.a.l.a.ko.dict` to `analysis-common` module to allow `DictionaryResourceLoader` (common utility) to read dictionary resources in kuromoji and nori module.
   
   Then `TestModuleLayer.testAllOpenAnalysisPackagesInSync()` says
   ```
   error: [Opens should only be targeted to Lucene Core.] 
   ```
   
   Is there a way to lose this restriction? I don't think the resource loader should resident in lucene-core, it's too specific to kuromoji and nori (and a bit hacky; I think the two-args constructor should be break down into two constructors at the next major release - one for resources on the classpath and another for ones on the file path. In the future, I hope we will be able to resolve LUCENE-8816 then we'll need only one constructor that loads the dictionary resources from classpaths).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] msokolov commented on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
msokolov commented on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1028994632


   I think the only reason why we have the file loading option is to support unit testing the dictionary-loading components. Maybe there's a better way?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] uschindler commented on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1028805953


   > I'll explore if we can remove the delegation around resource loading without changing the API interface (for now).
   
   I won't care about API, it is all not really public, because most of the stuff may be public visible, but can't be used, because the JapaneseTokenizer can't really load any other resources.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] uschindler commented on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
uschindler commented on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1028862245


   Thanks. This really needs motivation, one reason why I did not do it!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta closed pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
mocobeta closed pull request #638:
URL: https://github.com/apache/lucene/pull/638


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta commented on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
mocobeta commented on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1029151546


   I opened a draft PR: https://github.com/apache/lucene/pull/643.
   Would you take a look at it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta edited a comment on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
mocobeta edited a comment on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1028598383


   I opened `o.a.l.a.ja.dict` and `o.a.l.a.ko.dict` to `analysis-common` module to allow `DictionaryResourceLoader` (common utility) to read dictionary resources in kuromoji and nori module.
   
   Then `TestModuleLayer.testAllOpenAnalysisPackagesInSync()` says
   ```
   error: [Opens should only be targeted to Lucene Core.] 
   ```
   
   Is there a way to loosen this restriction? I don't think the resource loader should resident in lucene-core, it's too specific to kuromoji and nori (and a bit hacky; I think the two-args constructor should be break down into two constructors at the next major release - one for resources on the classpath and another for ones on the file path. In the future, I hope we will be able to resolve LUCENE-8816 then we'll need only one constructor that loads the dictionary resources from classpaths).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mocobeta commented on pull request #638: LUCENE-10393: Unify resource loader in kuromoji and nori

Posted by GitBox <gi...@apache.org>.
mocobeta commented on pull request #638:
URL: https://github.com/apache/lucene/pull/638#issuecomment-1028736869


   Hi @uschindler, I think I need your review or opinion.
   This reduces code duplication and technically works, but adds "open" directives in the module descriptors.
   If this approach is not preferable, I'll back the dictionary loader utility to both kuromoji and nori - we'll need to maintain two dictionary loaders as before though, I think the code was simplified by the refactoring here.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org