Posted to dev@solr.apache.org by Jan Høydahl <ja...@cominvent.com> on 2022/01/21 10:46:36 UTC

[DISCUSS] Standardizing module naming

Hi,

In https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming I suggested standardizing contrib/module names. We did not discuss it in yesterday's committer meeting, and it may be a bit too much for 9.0. But I'd like to discuss it, since we are renaming everything anyway in SOLR-15917 "contrib->module".

With as few contribs as we have had so far, it has not really been an issue. But I suggested it because I anticipate a huge growth in the number of modules/packages during 9.x, and it can get messy. Another reason for having a convention is that it forces the module/package creator to think through whether the proposed module has the right granularity. Take for instance the new "HDFS" or "Hadoop" module. It won't fit into any of my proposed types, as it contains a directoryFactory, one or two authentication plugins and one backup repository. That of course suggests that the module is too big and should be divided. Another reason is that when we have 50 modules/packages it would be far better for users to be able to find all backup repositories by looking for backup-* rather than guessing from the name what each one is. Perhaps a bad example, since both repo contribs have a "-repository" suffix today. But then "-repository" is not as user-friendly as "backup-".
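
As a rough illustration of the "look for backup-*" idea, here is a minimal Java sketch that filters a list of module names by a type prefix. The module names are the examples floated in this thread; the class and method names are hypothetical and are not part of Solr.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not Solr code: finds modules of one "type" by name prefix.
class ModulePrefixSketch {

    /** Returns the names that start with "<type>-", keeping the input order. */
    static List<String> modulesOfType(List<String> moduleNames, String type) {
        String prefix = type + "-";
        return moduleNames.stream()
                .filter(name -> name.startsWith(prefix))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Names taken from the proposal discussed in this thread.
        List<String> names = List.of(
                "backup-s3", "backup-gce",
                "update-extraction", "update-langid",
                "search-analytics");
        System.out.println(modulesOfType(names, "backup")); // prints [backup-s3, backup-gce]
    }
}
```

With a suffix convention such as "-repository" the same lookup would need endsWith, and a plain alphabetical listing would no longer group related modules together.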

So I guess I'd like your opinion on

1) Do we even want a convention (at least for our own code)?
2) If yes, should we rename the contribs/modules for 9.0 when we throw them around anyway?
3) When we start adding package manifests to the modules, should there be a 1:1 mapping between module name and package name?

Regarding the last point, we could apply such a standardized naming convention to the packages only and leave module names as-is, i.e. you'd do "solr package install update-extraction" even if the module name is "extraction".
Jan

Re: [DISCUSS] Standardizing module naming

Posted by Ishan Chattopadhyaya <ic...@gmail.com>.
> Or we could just leave it up to each module author as before :)

+1

On Fri, Jan 21, 2022 at 9:31 PM Jan Høydahl <ja...@cominvent.com> wrote:

> There is kind of a proposal in
> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
> already, but I'd like to discuss the general idea and what structure makes
> the most sense here. With my "type" proposal, you can easily map the new
> names for the various contribs, e.g. "backup-s3", "backup-gce",
> "update-extraction", "update-langid", "search-analytics" etc. Other
> structures are also probably possible? Or we could just leave it up to each
> module author as before :)
>
> Jan
>
> On 21 Jan 2022, at 15:25, David Smiley <ds...@apache.org> wrote:
>
> Now is a great time to do some name changes.  I suggest that you make a
> specific proposal of what the names should be.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Fri, Jan 21, 2022 at 8:18 AM Alessandro Benedetti <a....@sease.io>
> wrote:
>
>> I would also add a tangential question (rather than answers at this
>> point):
>> What makes a module(contrib) a module(contrib)?
>> *From now on I'll use 'module' where I intend a package under contrib.*
>>
>> I am referring to first-party modules such as ltr or langid.
>> My initial understanding was that a module in contrib is an integration
>> with some external dependency (like langid with OpenNLP, Tika or
>> langdetect).
>> But then, why is *ltr* a module? It doesn't really integrate with any
>> external dependency.
>> It's additional query parsers and components for a key Solr functionality.
>> Is it just a legacy consequence of the fact that initially, Bloomberg
>> contributed the module?
>> Maybe this applies to other modules as well (analytics?).
>> Then, should this be fixed and brought inside the Solr core?
>>
>> And what about first party/third party modules?
>> I don't think there's any visible difference right now, but in case we
>> want to make a difference, should we create a sort of official "Solr Plugin
>> Marketplace" ?
>> (I proposed the idea to Lucidworks many years ago when I was working for
>> a partner, and for a certain amount of time, I think there was a Solr
>> Plugin Marketplace, but it was proprietary).
>>
>> I am curious to understand what you think about this and then reason
>> about the naming convention.
>>
>> Cheers
>>
>>
>> --------------------------
>> Alessandro Benedetti
>> Apache Lucene/Solr PMC member and Committer
>> Director, R&D Software Engineer, Search Consultant
>>
>> www.sease.io
>>
>>

Re: [DISCUSS] Standardizing module naming

Posted by David Smiley <ds...@apache.org>.
I don't think we should embrace the Java module system in 9.x; too much pain
for too little gain.  Maybe the only exception would be the SolrJ side as it
affects our users more directly.

As for changing package names without breaking back-compat: depending on the
plugin, it may already be supported via the "solr.MyClassName" pattern.
These will work automatically.  If needed, we could do some small hack in
SolrResourceLoader to remap older class names to new ones.
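
A minimal Java sketch of what such a remapping could look like. This is not Solr's actual SolrResourceLoader code: the class, the map and the "new" package name below are hypothetical and only illustrate the lookup-before-load idea.

```java
import java.util.Map;

// Hypothetical illustration, not SolrResourceLoader: translate a configured
// class name to its new location before attempting to load it.
class ClassNameRemapSketch {

    // The old name is the one mentioned in this thread; the new name is invented.
    static final Map<String, String> RENAMED = Map.of(
            "org.apache.solr.security.HadoopAuthPlugin",
            "org.apache.solr.security.hadoop.HadoopAuthPlugin");

    /** Returns the new class name if the configured one was moved, else the input unchanged. */
    static String resolve(String configuredName) {
        return RENAMED.getOrDefault(configuredName, configuredName);
    }

    public static void main(String[] args) {
        System.out.println(resolve("org.apache.solr.security.HadoopAuthPlugin"));
        System.out.println(resolve("solr.SearchHandler")); // unknown names pass through untouched
    }
}
```

Existing configuration files would keep the old name, and only the loader would need to know about the move.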

Another interesting case would be auto-registered handlers like /sql for
modularizing the SQLHandler --
https://issues.apache.org/jira/browse/SOLR-15904. I think we could support
this by making ImplicitPlugins.json loading tolerant of a
ClassNotFoundException.  Alternatively it could be interesting to support a
way for a module to self-declare plugins that should be automatically
registered... although that would need to be reconciled with the package
manager's approach.
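
The "tolerant of a ClassNotFoundException" idea can be sketched in Java as follows. This is an assumption-laden illustration rather than how Solr loads ImplicitPlugins.json: the class name, the map shape (handler path to class name) and the example entries are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: register only those implicit handlers whose
// classes are actually on the classpath, and skip the rest.
class TolerantImplicitPluginsSketch {

    /** Loads each class by name; entries whose class is missing are skipped. */
    static Map<String, Class<?>> loadAvailable(Map<String, String> pathToClassName) {
        Map<String, Class<?>> loaded = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : pathToClassName.entrySet()) {
            try {
                loaded.put(entry.getKey(), Class.forName(entry.getValue()));
            } catch (ClassNotFoundException e) {
                // The module providing this class is not enabled; leave the path unregistered.
            }
        }
        return loaded;
    }

    public static void main(String[] args) {
        Map<String, String> declared = new LinkedHashMap<>();
        declared.put("/present", "java.util.ArrayList");         // stands in for a core handler
        declared.put("/sql", "com.example.MissingSqlHandler");   // stands in for a handler moved to a module
        System.out.println(loadAvailable(declared).keySet());    // prints [/present]
    }
}
```

A real implementation would probably also want to log the skipped entries so that a missing module is not silently ignored.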

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Mon, Jan 31, 2022 at 6:01 AM Jan Høydahl <ja...@cominvent.com> wrote:

> Could this work? :
>
> Say that in 9.1 we move feature FOO out of core into a module. We
> introduce a new SOLR_DEFAULT_MODULES=foo variable that will append to
> SOLR_MODULES,
> so if a 9.0 user has SOLR_MODULES=extracting and upgrades to 9.1, they will
> not need to change anything and will get both. But if they do not need FOO,
> then they
> can remove it from the classpath by setting SOLR_DEFAULT_MODULES="". Then
> in 10.0 SOLR_DEFAULT_MODULES is again empty.
>
> The only thing I'm worried about is split packages. E.g. HadoopAuthPlugin
> lives in org.apache.solr.security, which will be shared with core. As I
> understand it, that may be a
> problem for JavaDoc, and for the Java module system if we want to embrace it.
> Anything else? Is there a clever way we could change the package name in 9.1
> without
> breaking back-compat?
>
> Jan
>
> On 31 Jan 2022, at 04:11, David Smiley <ds...@apache.org> wrote:
>
> Yes; I was thinking something like this as well.  This way we can make
> meaningful progress on modularization during the 9.x series without breaking
> compatibility.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Sat, Jan 29, 2022 at 4:37 PM Jan Høydahl <ja...@cominvent.com> wrote:
>
>> Hi,
>>
>> There does not seem to be overwhelming support for enforcing a naming
>> convention - at least not yet.
>> So let the suggestion be a recommendation, and we'll see during 9.x what
>> naming makes sense for new modules.
>>
>> I thought about whether we can extract code from solr-core into modules
>> in a 9.x minor release.
>> If it breaks existing use, e.g. a package name change, or if the plugin is
>> no longer on the classpath by default, we cannot.
>> But if we want to extract a certain feature, such as Hadoop-Auth, in 9.1
>> - if we keep the package name and make the new module included in
>> SOLR_MODULES by default, then perhaps? Views?
>>
>> Jan
>>
>>
>> On 24 Jan 2022, at 17:23, Jason Gerlowski <ge...@gmail.com> wrote:
>>
>>
>> 1. [Do we want a convention?] I'd be fine with a convention as long as
>> we're willing to be flexible on it or evolve it as more modules come in.
>> If we're expecting that 9.x will bring in other new modules but we don't
>> know what those are, then we can't be too strict on any particular naming.
>>
>> 2. [should we rename the contribs/modules for 9.0 when we throw them
>> around anyway?] Sure, +1 to the proposed names.
>>
>> Jason
>>
>> On Fri, Jan 21, 2022 at 1:53 PM Houston Putman <ho...@gmail.com>
>> wrote:
>>
>>> I agree that standardizing the names would be nice.
>>>
>>> Another good option is to have a ref-guide page that lists all the
>>> modules, explains their purpose and links to relevant documentation.
>>> This page could be broken down by feature, much like your proposed names
>>> would be.
>>>
>>> On Fri, Jan 21, 2022 at 1:47 PM David Smiley <ds...@apache.org> wrote:
>>>
>>>> +1 I like your proposed names.  Some of our names are so short now that
>>>> only we know what they are at a glance.
>>>>
>>>>
>>>> ~ David Smiley
>>>> Apache Lucene/Solr Search Developer
>>>> http://www.linkedin.com/in/davidwsmiley

Re: [DISCUSS] Standardizing module naming

Posted by Jan Høydahl <ja...@cominvent.com>.
Could this work? :

Say that in 9.1 we move feature FOO out of core into a module. We introduce a new SOLR_DEFAULT_MODULES=foo variable that will append to SOLR_MODULES,
so if a 9.0 user has SOLR_MODULES=extracting and upgrades to 9.1, they will not need to change anything and will get both. But if they do not need FOO, then they
can remove it from the classpath by setting SOLR_DEFAULT_MODULES="". Then in 10.0 SOLR_DEFAULT_MODULES is again empty.
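
The intended merge of the two variables can be sketched in Java like this. It is only a model of the proposal: the class is hypothetical, "unset" is represented by null, and the built-in default of "foo" is the placeholder from the example above, not a real Solr setting.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical model of the proposal: SOLR_DEFAULT_MODULES is appended to SOLR_MODULES.
class DefaultModulesSketch {

    // What a 9.1 release would ship as the default if the user sets nothing (placeholder value).
    static final String BUILT_IN_DEFAULT = "foo";

    /**
     * Returns the modules to put on the classpath, user-chosen modules first.
     * A null solrDefaultModules means "not set", so the built-in default applies;
     * an empty string means the user opted out of the defaults.
     */
    static Set<String> effectiveModules(String solrModules, String solrDefaultModules) {
        String defaults = (solrDefaultModules == null) ? BUILT_IN_DEFAULT : solrDefaultModules;
        Set<String> result = new LinkedHashSet<>(); // keeps order, drops duplicates
        addAll(result, solrModules);
        addAll(result, defaults);
        return result;
    }

    private static void addAll(Set<String> target, String commaSeparated) {
        if (commaSeparated == null) {
            return;
        }
        Arrays.stream(commaSeparated.split(","))
                .map(String::trim)
                .filter(name -> !name.isEmpty())
                .forEach(target::add);
    }

    public static void main(String[] args) {
        System.out.println(effectiveModules("extracting", null)); // prints [extracting, foo]
        System.out.println(effectiveModules("extracting", ""));   // prints [extracting]
    }
}
```

The distinction between "unset" and "set to empty" is what lets an upgrading user keep FOO without touching their configuration while still allowing an explicit opt-out.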

The only thing I'm worried about is split packages. E.g. HadoopAuthPlugin lives in org.apache.solr.security, which will be shared with core. As I understand it, that may be a
problem for JavaDoc, and for the Java module system if we want to embrace it. Anything else? Is there a clever way we could change the package name in 9.1 without
breaking back-compat?

Jan


Re: [DISCUSS] Standardizing module naming

Posted by David Smiley <ds...@apache.org>.
Yes; I was thinking something like this as well.  This way we can make
meaningful progress on modularization during the 9.x series without breaking
compatibility.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley



Re: [DISCUSS] Standardizing module naming

Posted by Jan Høydahl <ja...@cominvent.com>.
Hi,

There does not seem to be overwhelming support for enforcing a naming convention, at least not yet.
So let the suggestion be a recommendation, and we'll see during 9.x what naming makes sense for new modules.

I thought about whether we can extract code from solr-core into modules in a 9.x minor release.
If it breaks existing use, e.g. through a package name change, or if the plugin is no longer on the classpath by default, we cannot.
But if we want to extract a certain feature, such as Hadoop-Auth, in 9.1, and we keep the package name and include the new module in SOLR_MODULES by default, then perhaps we can? Views?

Jan


> On 24 Jan 2022, at 17:23, Jason Gerlowski <ge...@gmail.com> wrote:
> 
> 
> 1. [Do we want a convention?] I'd be fine with a convention as long as we're willing to be flexible on it or evolve it as more modules come in.  If we're expecting that 9.x will bring in other new modules but we don't know what those are, then we can't be too strict on any particular naming.  
> 
> 2. [should we rename the contribs/modules for 9.0 when we throw them around anyway?] Sure, +1 to the proposed names.
> 
> Jason
> 
> On Fri, Jan 21, 2022 at 1:53 PM Houston Putman <houstonputman@gmail.com> wrote:
> I agree that standardizing the names would be nice. 
> 
> Another good option is to have a ref-guide page that lists all the modules, explains their purpose and links to relevant documentation.
> This page could be broken down by feature, much like your proposed names would be.
> 
> On Fri, Jan 21, 2022 at 1:47 PM David Smiley <dsmiley@apache.org> wrote:
> +1 I like your proposed names.  Some of our names are so short now that only we know what they are at a glance.
> 
> 
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
> 
> On Fri, Jan 21, 2022 at 11:01 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
> There is kind of a proposal in https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming already, but I'd like to discuss the general idea and what structure makes the most sense here. With my "type" proposal, you can easily map the new names for the various contribs, e.g. "backup-s3", "backup-gce", "update-extraction", "update-langid", "search-analytics" etc. Other structures are also probably possible? Or we could just leave it up to each module author as before :)
> 
> Jan
> 
>> On 21 Jan 2022, at 15:25, David Smiley <dsmiley@apache.org> wrote:
>> 
>> Now is a great time to do some name changes.  I suggest that you make a specific proposal of what the names should be.
>> 
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>> 
>> On Fri, Jan 21, 2022 at 8:18 AM Alessandro Benedetti <a.benedetti@sease.io> wrote:
>> I would also add a tangential question (rather than answers at this point):
>> What makes a module(contrib) a module(contrib)?
>> From now on I'll use 'module' where I intend a package under contrib.
>> 
>> I am referring to first-party modules such as ltr or langid.
>> My initial understanding was that a module in contrib, is an integration with some external dependency (like langid with OpenNLP, Tika or langdetect).
>> But then, why is ltr a module? It doesn't really integrate with any external dependency.
>> It's additional query parsers and components for a key Solr functionality.
>> Is it just a legacy consequence of the fact that initially, Bloomberg contributed the module?
>> Maybe this applies to other modules as well (analytics?).
>> Then, should this be fixed and brought inside the Solr core?
>> 
>> And what about first party/third party modules?
>> I don't think there's any visible difference right now, but in case we want to make a difference, should we create a sort of official "Solr Plugin Marketplace" ?
>> (I proposed the idea to Lucidworks many years ago when I was working for a partner, and for a certain amount of time, I think there was a Solr Plugin Marketplace, but it was proprietary).
>> 
>> I am curious to understand what you think about this and then reason about the naming convention.
>> 
>> Cheers
>> 
>> 
>> --------------------------
>> Alessandro Benedetti
>> Apache Lucene/Solr PMC member and Committer
>> Director, R&D Software Engineer, Search Consultant
>> 
>> www.sease.io
>> 
>> On Fri, 21 Jan 2022 at 10:47, Jan Høydahl <jan.asf@cominvent.com> wrote:
>> Hi,
>> 
>> In https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming I suggested standardizing contrib/module names. We did not discuss it in yesterday's committer meeting, and it may be a bit too much for 9.0. But I'd like to discuss it, since we are anyway renaming everything in SOLR-15917 "contrib->module".
>> 
>> With as few contribs as we had so far it has not really been an issue. But the reason I suggested it is because I anticipate a huge growth in number of modules/packages during 9.x, and it can get messy. Another reason for having a convention is that it forces the module/package creator to think through whether the proposed module has the right granularity. Take for instance the new "HDFS" or "Hadoop" module. It won't fit into either of my proposed types, as it contains both a directoryFactory, one or two authentication plugins and one backup repository. That of course suggests that the module is too big and should be divided. Another reason is that when we have 50 modules / packages it would be far better for users to be able to find all backup repositories by looking for backup-* rather than guess from naming what it is. Perhaps a bad example since both repo contribs have a suffix "-repository" today. But then "-repository" is not as user friendly as "backup-".
>> 
>> So I guess I'd like your opinion on
>> 
>> 1) Do we even want a convention (at least for our own code?)
>> 2) If yes, should we rename the contribs/modules for 9.0 when we throw them around anyway?
>> 3) When we start adding package manifests to the modules, should there be a 1:1 between module name and package name?
>> 
>> Regarding the last point, we could apply such standardized naming convention for the packages only and leave module names as-is, i.e. you'd do "solr package install update-extraction" even if the module name is "extraction".
>> Jan
>> 
> 


Re: [DISCUSS] Standardizing module naming

Posted by Jason Gerlowski <ge...@gmail.com>.
1. [Do we want a convention?] I'd be fine with a convention as long as
we're willing to be flexible on it or evolve it as more modules come in.
If we're expecting that 9.x will bring in other new modules but we don't
know what those are, then we can't be too strict on any particular naming.

2. [should we rename the contribs/modules for 9.0 when we throw them around
anyway?] Sure, +1 to the proposed names.

Jason

On Fri, Jan 21, 2022 at 1:53 PM Houston Putman <ho...@gmail.com>
wrote:

> I agree that standardizing the names would be nice.
>
> Another good option is to have a ref-guide page that lists all the
> modules, explains their purpose and links to relevant documentation.
> This page could be broken down by feature, much like your proposed names
> would be.
>
> On Fri, Jan 21, 2022 at 1:47 PM David Smiley <ds...@apache.org> wrote:
>
>> +1 I like your proposed names.  Some of our names are so short now that
>> only we know what they are at a glance.
>>
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Fri, Jan 21, 2022 at 11:01 AM Jan Høydahl <ja...@cominvent.com>
>> wrote:
>>
>>> There is kind of a proposal in
>>> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
>>> already, but I'd like to discuss the general idea and what structure makes
>>> the most sense here. With my "type" proposal, you can easily map the new
>>> names for the various contribs, e.g. "backup-s3", "backup-gce",
>>> "update-extraction", "update-langid", "search-analytics" etc. Other
>>> structures are also probably possible? Or we could just leave it up to each
>>> module author as before :)
>>>
>>> Jan
>>>
>>> On 21 Jan 2022, at 15:25, David Smiley <ds...@apache.org> wrote:
>>>
>>> Now is a great time to do some name changes.  I suggest that you make a
>>> specific proposal of what the names should be.
>>>
>>> ~ David Smiley
>>> Apache Lucene/Solr Search Developer
>>> http://www.linkedin.com/in/davidwsmiley
>>>
>>>
>>> On Fri, Jan 21, 2022 at 8:18 AM Alessandro Benedetti <
>>> a.benedetti@sease.io> wrote:
>>>
>>>> I would also add a tangential question (rather than answers at this
>>>> point):
>>>> What makes a module(contrib) a module(contrib)?
>>>> *From now on I'll use 'module' where I intend a package under contrib.*
>>>>
>>>> I am referring to first-party modules such as ltr or langid.
>>>> My initial understanding was that a module in contrib, is an
>>>> integration with some external dependency (like langid with OpenNLP, Tika
>>>> or langdetect).
>>>> But then, why is *ltr* a module? It doesn't really integrate with any
>>>> external dependency.
>>>> It's additional query parsers and components for a key Solr
>>>> functionality.
>>>> Is it just a legacy consequence of the fact that initially, Bloomberg
>>>> contributed the module?
>>>> Maybe this applies to other modules as well (analytics?).
>>>> Then, should this be fixed and brought inside the Solr core?
>>>>
>>>> And what about first party/third party modules?
>>>> I don't think there's any visible difference right now, but in case we
>>>> want to make a difference, should we create a sort of official "Solr Plugin
>>>> Marketplace" ?
>>>> (I proposed the idea to Lucidworks many years ago when I was working
>>>> for a partner, and for a certain amount of time, I think there was a Solr
>>>> Plugin Marketplace, but it was proprietary).
>>>>
>>>> I am curious to understand what you think about this and then reason
>>>> about the naming convention.
>>>>
>>>> Cheers
>>>>
>>>>
>>>> --------------------------
>>>> Alessandro Benedetti
>>>> Apache Lucene/Solr PMC member and Committer
>>>> Director, R&D Software Engineer, Search Consultant
>>>>
>>>> www.sease.io
>>>>
>>>>
>>>> On Fri, 21 Jan 2022 at 10:47, Jan Høydahl <ja...@cominvent.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> In
>>>>> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
>>>>> I suggested standardizing contrib/module names. We did not discuss it in
>>>>> yesterday's committer meeting, and it may be a bit too much for 9.0. But
>>>>> I'd like to discuss it, since we are anyway renaming everything in
>>>>> SOLR-15917 "contrib->module".
>>>>> With as few contribs as we had so far it has not really been an issue.
>>>>> But the reason I suggested it is because I anticipate a huge growth in
>>>>> number of modules/packages during 9.x, and it can get messy. Another reason
>>>>> for having a convention is that it forces the module/package creator to
>>>>> think through whether the proposed module has the right granularity. Take
>>>>> for instance the new "HDFS" or "Hadoop" module. It won't fit into either of
>>>>> my proposed types, as it contains both a directoryFactory, one or two
>>>>> authentication plugins and one backup repository. That of course suggests
>>>>> that the module is too big and should be divided. Another reason is that
>>>>> when we have 50 modules / packages it would be far better for users to be
>>>>> able to find all backup repositories by looking for backup-* rather than
>>>>> guess from naming what it is. Perhaps a bad example since both repo
>>>>> contribs have a suffix "-repository" today. But then "-repository" is not
>>>>> as user friendly as "backup-".
>>>>>
>>>>> So I guess I'd like your opinion on
>>>>>
>>>>> 1) Do we even want a convention (at least for our own code?)
>>>>> 2) If yes, should we rename the contribs/modules for 9.0 when we throw
>>>>> them around anyway?
>>>>> 3) When we start adding package manifests to the modules, should there
>>>>> be a 1:1 between module name and package name?
>>>>>
>>>>> Regarding the last point, we could apply such standardized naming
>>>>> convention for the packages only and leave module names as-is, i.e. you'd
>>>>> do "solr package install update-extraction" even if the module name
>>>>> is "extraction".
>>>>>
>>>>> Jan
>>>>>
>>>>
>>>

Re: [DISCUSS] Standardizing module naming

Posted by Houston Putman <ho...@gmail.com>.
I agree that standardizing the names would be nice.

Another good option is to have a ref-guide page that lists all the modules,
explains their purpose and links to relevant documentation.
This page could be broken down by feature, much like your proposed names
would be.

On Fri, Jan 21, 2022 at 1:47 PM David Smiley <ds...@apache.org> wrote:

> +1 I like your proposed names.  Some of our names are so short now that
> only we know what they are at a glance.
>
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Fri, Jan 21, 2022 at 11:01 AM Jan Høydahl <ja...@cominvent.com>
> wrote:
>
>> There is kind of a proposal in
>> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
>> already, but I'd like to discuss the general idea and what structure makes
>> the most sense here. With my "type" proposal, you can easily map the new
>> names for the various contribs, e.g. "backup-s3", "backup-gce",
>> "update-extraction", "update-langid", "search-analytics" etc. Other
>> structures are also probably possible? Or we could just leave it up to each
>> module author as before :)
>>
>> Jan
>>
>> On 21 Jan 2022, at 15:25, David Smiley <ds...@apache.org> wrote:
>>
>> Now is a great time to do some name changes.  I suggest that you make a
>> specific proposal of what the names should be.
>>
>> ~ David Smiley
>> Apache Lucene/Solr Search Developer
>> http://www.linkedin.com/in/davidwsmiley
>>
>>
>> On Fri, Jan 21, 2022 at 8:18 AM Alessandro Benedetti <
>> a.benedetti@sease.io> wrote:
>>
>>> I would also add a tangential question (rather than answers at this
>>> point):
>>> What makes a module(contrib) a module(contrib)?
>>> *From now on I'll use 'module' where I intend a package under contrib.*
>>>
>>> I am referring to first-party modules such as ltr or langid.
>>> My initial understanding was that a module in contrib, is an integration
>>> with some external dependency (like langid with OpenNLP, Tika or
>>> langdetect).
>>> But then, why is *ltr* a module? It doesn't really integrate with any
>>> external dependency.
>>> It's additional query parsers and components for a key Solr
>>> functionality.
>>> Is it just a legacy consequence of the fact that initially, Bloomberg
>>> contributed the module?
>>> Maybe this applies to other modules as well (analytics?).
>>> Then, should this be fixed and brought inside the Solr core?
>>>
>>> And what about first party/third party modules?
>>> I don't think there's any visible difference right now, but in case we
>>> want to make a difference, should we create a sort of official "Solr Plugin
>>> Marketplace" ?
>>> (I proposed the idea to Lucidworks many years ago when I was working for
>>> a partner, and for a certain amount of time, I think there was a Solr
>>> Plugin Marketplace, but it was proprietary).
>>>
>>> I am curious to understand what you think about this and then reason
>>> about the naming convention.
>>>
>>> Cheers
>>>
>>>
>>> --------------------------
>>> Alessandro Benedetti
>>> Apache Lucene/Solr PMC member and Committer
>>> Director, R&D Software Engineer, Search Consultant
>>>
>>> www.sease.io
>>>
>>>
>>> On Fri, 21 Jan 2022 at 10:47, Jan Høydahl <ja...@cominvent.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> In
>>>> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
>>>> I suggested standardizing contrib/module names. We did not discuss it in
>>>> yesterday's committer meeting, and it may be a bit too much for 9.0. But
>>>> I'd like to discuss it, since we are anyway renaming everything in
>>>> SOLR-15917 "contrib->module".
>>>> With as few contribs as we had so far it has not really been an issue.
>>>> But the reason I suggested it is because I anticipate a huge growth in
>>>> number of modules/packages during 9.x, and it can get messy. Another reason
>>>> for having a convention is that it forces the module/package creator to
>>>> think through whether the proposed module has the right granularity. Take
>>>> for instance the new "HDFS" or "Hadoop" module. It won't fit into either of
>>>> my proposed types, as it contains both a directoryFactory, one or two
>>>> authentication plugins and one backup repository. That of course suggests
>>>> that the module is too big and should be divided. Another reason is that
>>>> when we have 50 modules / packages it would be far better for users to be
>>>> able to find all backup repositories by looking for backup-* rather than
>>>> guess from naming what it is. Perhaps a bad example since both repo
>>>> contribs have a suffix "-repository" today. But then "-repository" is not
>>>> as user friendly as "backup-".
>>>>
>>>> So I guess I'd like your opinion on
>>>>
>>>> 1) Do we even want a convention (at least for our own code?)
>>>> 2) If yes, should we rename the contribs/modules for 9.0 when we throw
>>>> them around anyway?
>>>> 3) When we start adding package manifests to the modules, should there
>>>> be a 1:1 between module name and package name?
>>>>
>>>> Regarding the last point, we could apply such standardized naming
>>>> convention for the packages only and leave module names as-is, i.e. you'd
>>>> do "solr package install update-extraction" even if the module name is
>>>> "extraction".
>>>>
>>>> Jan
>>>>
>>>
>>

Re: [DISCUSS] Standardizing module naming

Posted by David Smiley <ds...@apache.org>.
+1 I like your proposed names.  Some of our names are so short now that
only we know what they are at a glance.


~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 21, 2022 at 11:01 AM Jan Høydahl <ja...@cominvent.com> wrote:

> There is kind of a proposal in
> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
> already, but I'd like to discuss the general idea and what structure makes
> the most sense here. With my "type" proposal, you can easily map the new
> names for the various contribs, e.g. "backup-s3", "backup-gce",
> "update-extraction", "update-langid", "search-analytics" etc. Other
> structures are also probably possible? Or we could just leave it up to each
> module author as before :)
>
> Jan
>
> On 21 Jan 2022, at 15:25, David Smiley <ds...@apache.org> wrote:
>
> Now is a great time to do some name changes.  I suggest that you make a
> specific proposal of what the names should be.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Fri, Jan 21, 2022 at 8:18 AM Alessandro Benedetti <a....@sease.io>
> wrote:
>
>> I would also add a tangential question (rather than answers at this
>> point):
>> What makes a module(contrib) a module(contrib)?
>> *From now on I'll use 'module' where I intend a package under contrib.*
>>
>> I am referring to first-party modules such as ltr or langid.
>> My initial understanding was that a module in contrib, is an integration
>> with some external dependency (like langid with OpenNLP, Tika or
>> langdetect).
>> But then, why is *ltr* a module? It doesn't really integrate with any
>> external dependency.
>> It's additional query parsers and components for a key Solr functionality.
>> Is it just a legacy consequence of the fact that initially, Bloomberg
>> contributed the module?
>> Maybe this applies to other modules as well (analytics?).
>> Then, should this be fixed and brought inside the Solr core?
>>
>> And what about first party/third party modules?
>> I don't think there's any visible difference right now, but in case we
>> want to make a difference, should we create a sort of official "Solr Plugin
>> Marketplace" ?
>> (I proposed the idea to Lucidworks many years ago when I was working for
>> a partner, and for a certain amount of time, I think there was a Solr
>> Plugin Marketplace, but it was proprietary).
>>
>> I am curious to understand what you think about this and then reason
>> about the naming convention.
>>
>> Cheers
>>
>>
>> --------------------------
>> Alessandro Benedetti
>> Apache Lucene/Solr PMC member and Committer
>> Director, R&D Software Engineer, Search Consultant
>>
>> www.sease.io
>>
>>
>> On Fri, 21 Jan 2022 at 10:47, Jan Høydahl <ja...@cominvent.com> wrote:
>>
>>> Hi,
>>>
>>> In
>>> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
>>> I suggested standardizing contrib/module names. We did not discuss it in
>>> yesterday's committer meeting, and it may be a bit too much for 9.0. But
>>> I'd like to discuss it, since we are anyway renaming everything in
>>> SOLR-15917 "contrib->module".
>>> With as few contribs as we had so far it has not really been an issue.
>>> But the reason I suggested it is because I anticipate a huge growth in
>>> number of modules/packages during 9.x, and it can get messy. Another reason
>>> for having a convention is that it forces the module/package creator to
>>> think through whether the proposed module has the right granularity. Take
>>> for instance the new "HDFS" or "Hadoop" module. It won't fit into either of
>>> my proposed types, as it contains both a directoryFactory, one or two
>>> authentication plugins and one backup repository. That of course suggests
>>> that the module is too big and should be divided. Another reason is that
>>> when we have 50 modules / packages it would be far better for users to be
>>> able to find all backup repositories by looking for backup-* rather than
>>> guess from naming what it is. Perhaps a bad example since both repo
>>> contribs have a suffix "-repository" today. But then "-repository" is not
>>> as user friendly as "backup-".
>>>
>>> So I guess I'd like your opinion on
>>>
>>> 1) Do we even want a convention (at least for our own code?)
>>> 2) If yes, should we rename the contribs/modules for 9.0 when we throw
>>> them around anyway?
>>> 3) When we start adding package manifests to the modules, should there
>>> be a 1:1 between module name and package name?
>>>
>>> Regarding the last point, we could apply such standardized naming
>>> convention for the packages only and leave module names as-is, i.e. you'd
>>> do "solr package install update-extraction" even if the module name is
>>> "extraction".
>>>
>>> Jan
>>>
>>
>

Re: [DISCUSS] Standardizing module naming

Posted by Jan Høydahl <ja...@cominvent.com>.
There is kind of a proposal in https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming already, but I'd like to discuss the general idea and what structure makes the most sense here. With my "type" proposal, you can easily map the new names for the various contribs, e.g. "backup-s3", "backup-gce", "update-extraction", "update-langid", "search-analytics" etc. Other structures are also probably possible? Or we could just leave it up to each module author as before :)
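
To illustrate the "type" idea, here is a minimal sketch in Python. It is purely illustrative and not part of Solr: the helper name `group_by_type` and the rule "the type is whatever precedes the first hyphen" are assumptions made for this example, and the names are just the ones listed above.

```python
from collections import defaultdict

# Example names from the proposal above. Treating the text before the
# first hyphen as the module "type" is an assumption for illustration.
PROPOSED_NAMES = [
    "backup-s3",
    "backup-gce",
    "update-extraction",
    "update-langid",
    "search-analytics",
]


def group_by_type(names):
    """Group module names by their type prefix (hypothetical helper)."""
    groups = defaultdict(list)
    for name in names:
        # partition splits on the first hyphen only, so a name such as
        # "backup-foo-bar" would map to type "backup", module "foo-bar".
        module_type, _, module = name.partition("-")
        groups[module_type].append(module)
    return dict(groups)


if __name__ == "__main__":
    for module_type, modules in sorted(group_by_type(PROPOSED_NAMES).items()):
        print(f"{module_type}-*: {', '.join(modules)}")
```

With such a prefix, finding every backup repository reduces to looking at the "backup" group, rather than guessing from each name what the module does.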

Jan

> On 21 Jan 2022, at 15:25, David Smiley <ds...@apache.org> wrote:
> 
> Now is a great time to do some name changes.  I suggest that you make a specific proposal of what the names should be.
> 
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
> 
> On Fri, Jan 21, 2022 at 8:18 AM Alessandro Benedetti <a.benedetti@sease.io> wrote:
> I would also add a tangential question (rather than answers at this point):
> What makes a module(contrib) a module(contrib)?
> From now on I'll use 'module' where I intend a package under contrib.
> 
> I am referring to first-party modules such as ltr or langid.
> My initial understanding was that a module in contrib, is an integration with some external dependency (like langid with OpenNLP, Tika or langdetect).
> But then, why is ltr a module? It doesn't really integrate with any external dependency.
> It's additional query parsers and components for a key Solr functionality.
> Is it just a legacy consequence of the fact that initially, Bloomberg contributed the module?
> Maybe this applies to other modules as well (analytics?).
> Then, should this be fixed and brought inside the Solr core?
> 
> And what about first party/third party modules?
> I don't think there's any visible difference right now, but in case we want to make a difference, should we create a sort of official "Solr Plugin Marketplace" ?
> (I proposed the idea to Lucidworks many years ago when I was working for a partner, and for a certain amount of time, I think there was a Solr Plugin Marketplace, but it was proprietary).
> 
> I am curious to understand what you think about this and then reason about the naming convention.
> 
> Cheers
> 
> 
> --------------------------
> Alessandro Benedetti
> Apache Lucene/Solr PMC member and Committer
> Director, R&D Software Engineer, Search Consultant
> 
> www.sease.io
> 
> On Fri, 21 Jan 2022 at 10:47, Jan Høydahl <jan.asf@cominvent.com> wrote:
> Hi,
> 
> In https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming I suggested standardizing contrib/module names. We did not discuss it in yesterday's committer meeting, and it may be a bit too much for 9.0. But I'd like to discuss it, since we are anyway renaming everything in SOLR-15917 "contrib->module".
> 
> With as few contribs as we had so far it has not really been an issue. But the reason I suggested it is because I anticipate a huge growth in number of modules/packages during 9.x, and it can get messy. Another reason for having a convention is that it forces the module/package creator to think through whether the proposed module has the right granularity. Take for instance the new "HDFS" or "Hadoop" module. It won't fit into either of my proposed types, as it contains both a directoryFactory, one or two authentication plugins and one backup repository. That of course suggests that the module is too big and should be divided. Another reason is that when we have 50 modules / packages it would be far better for users to be able to find all backup repositories by looking for backup-* rather than guess from naming what it is. Perhaps a bad example since both repo contribs have a suffix "-repository" today. But then "-repository" is not as user friendly as "backup-".
> 
> So I guess I'd like your opinion on
> 
> 1) Do we even want a convention (at least for our own code?)
> 2) If yes, should we rename the contribs/modules for 9.0 when we throw them around anyway?
> 3) When we start adding package manifests to the modules, should there be a 1:1 between module name and package name?
> 
> Regarding the last point, we could apply such standardized naming convention for the packages only and leave module names as-is, i.e. you'd do "solr package install update-extraction" even if the module name is "extraction".
> Jan
> 


Re: [DISCUSS] Standardizing module naming

Posted by David Smiley <ds...@apache.org>.
Now is a great time to do some name changes.  I suggest that you make a
specific proposal of what the names should be.

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley


On Fri, Jan 21, 2022 at 8:18 AM Alessandro Benedetti <a....@sease.io>
wrote:

> I would also add a tangential question (rather than answers at this point):
> What makes a module(contrib) a module(contrib)?
> *From now on I'll use 'module' where I intend a package under contrib.*
>
> I am referring to first-party modules such as ltr or langid.
> My initial understanding was that a module in contrib, is an integration
> with some external dependency (like langid with OpenNLP, Tika or
> langdetect).
> But then, why is *ltr* a module? It doesn't really integrate with any
> external dependency.
> It's additional query parsers and components for a key Solr functionality.
> Is it just a legacy consequence of the fact that initially, Bloomberg
> contributed the module?
> Maybe this applies to other modules as well (analytics?).
> Then, should this be fixed and brought inside the Solr core?
>
> And what about first party/third party modules?
> I don't think there's any visible difference right now, but in case we
> want to make a difference, should we create a sort of official "Solr Plugin
> Marketplace" ?
> (I proposed the idea to Lucidworks many years ago when I was working for a
> partner, and for a certain amount of time, I think there was a Solr Plugin
> Marketplace, but it was proprietary).
>
> I am curious to understand what you think about this and then reason about
> the naming convention.
>
> Cheers
>
>
> --------------------------
> Alessandro Benedetti
> Apache Lucene/Solr PMC member and Committer
> Director, R&D Software Engineer, Search Consultant
>
> www.sease.io
>
>
> On Fri, 21 Jan 2022 at 10:47, Jan Høydahl <ja...@cominvent.com> wrote:
>
>> Hi,
>>
>> In
>> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
>> I suggested standardizing contrib/module names. We did not discuss it in
>> yesterday's committer meeting, and it may be a bit too much for 9.0. But
>> I'd like to discuss it, since we are anyway renaming everything in
>> SOLR-15917 "contrib->module".
>> With as few contribs as we had so far it has not really been an issue.
>> But the reason I suggested it is because I anticipate a huge growth in
>> number of modules/packages during 9.x, and it can get messy. Another reason
>> for having a convention is that it forces the module/package creator to
>> think through whether the proposed module has the right granularity. Take
>> for instance the new "HDFS" or "Hadoop" module. It won't fit into either of
>> my proposed types, as it contains both a directoryFactory, one or two
>> authentication plugins and one backup repository. That of course suggests
>> that the module is too big and should be divided. Another reason is that
>> when we have 50 modules / packages it would be far better for users to be
>> able to find all backup repositories by looking for backup-* rather than
>> guess from naming what it is. Perhaps a bad example since both repo
>> contribs have a suffix "-repository" today. But then "-repository" is not
>> as user friendly as "backup-".
>>
>> So I guess I'd like your opinion on
>>
>> 1) Do we even want a convention (at least for our own code?)
>> 2) If yes, should we rename the contribs/modules for 9.0 when we throw
>> them around anyway?
>> 3) When we start adding package manifests to the modules, should there be
>> a 1:1 between module name and package name?
>>
>> Regarding the last point, we could apply such standardized naming
>> convention for the packages only and leave module names as-is, i.e. you'd
>> do "solr package install update-extraction" even if the module name is
>> "extraction".
>>
>> Jan
>>
>

Re: [DISCUSS] Standardizing module naming

Posted by Jan Høydahl <ja...@cominvent.com>.
Hi,

This is off-topic for this thread.

I'll post my answer to the "Modularizing Solr with new contrib packages" thread, if you don't mind:

https://lists.apache.org/thread/0k6s7h95yzwvhyk8049mr254ywr8vt2l

Jan


Re: [DISCUSS] Standardizing module naming

Posted by Alessandro Benedetti <a....@sease.io>.
I would also add a tangential question (rather than answers at this point):
What makes a module(contrib) a module(contrib)?
*From now on I'll use 'module' where I mean a package under contrib.*

I am referring to first-party modules such as ltr or langid.
My initial understanding was that a module in contrib is an integration
with some external dependency (like langid with OpenNLP, Tika or
langdetect).
But then, why is *ltr* a module? It doesn't really integrate with any
external dependency.
It consists of additional query parsers and components for a key Solr functionality.
Is it just a legacy consequence of the fact that initially, Bloomberg
contributed the module?
Maybe this applies to other modules as well (analytics?).
Then, should this be fixed and brought inside the Solr core?

And what about first-party/third-party modules?
I don't think there's any visible difference right now, but if we want
to distinguish them, should we create a sort of official "Solr Plugin
Marketplace"?
(I proposed the idea to Lucidworks many years ago when I was working for a
partner, and for a certain amount of time, I think there was a Solr Plugin
Marketplace, but it was proprietary).

I am curious to understand what you think about this and then reason about
the naming convention.

Cheers


--------------------------
Alessandro Benedetti
Apache Lucene/Solr PMC member and Committer
Director, R&D Software Engineer, Search Consultant

www.sease.io


On Fri, 21 Jan 2022 at 10:47, Jan Høydahl <ja...@cominvent.com> wrote:

> Hi,
>
> In
> https://github.com/apache/solr/blob/main/dev-docs/plugins-modules-packages.adoc#module-naming
> I suggested standardizing contrib/module names. We did not discuss it in
> yesterday's committer meeting, and it may be a bit too much for 9.0. But
> I'd like to discuss it, since we are anyway renaming everything in
> SOLR-15917 "contrib->module".
> With as few contribs as we had so far it has not really been an issue. But
> the reason I suggested it is because I anticipate a huge growth in number
> of modules/packages during 9.x, and it can get messy. Another reason for
> having a convention is that it forces the module/package creator to think
> through whether the proposed module has the right granularity. Take for
> instance the new "HDFS" or "Hadoop" module. It won't fit into either of my
> proposed types, as it contains both a directoryFactory, one or two
> authentication plugins and one backup repository. That of course suggests
> that the module is too big and should be divided. Another reason is that
> when we have 50 modules / packages it would be far better for users to be
> able to find all backup repositories by looking for backup-* rather than
> guess from naming what it is. Perhaps a bad example since both repo
> contribs have a suffix "-repository" today. But then "-repository" is not
> as user friendly as "backup-".
>
> So I guess I'd like your opinion on
>
> 1) Do we even want a convention (at least for our own code)?
> 2) If yes, should we rename the contribs/modules for 9.0 when we throw
> them around anyway?
> 3) When we start adding package manifests to the modules, should there be
> a 1:1 between module name and package name?
>
> Regarding the last point, we could apply such a standardized naming
> convention for the packages only and leave module names as-is, i.e. you'd
> do "solr package install update-extraction" even if the module name is
> "extraction".
>
> Jan
>