Posted to issues@solr.apache.org by "Jason Gerlowski (Jira)" <ji...@apache.org> on 2023/05/30 15:14:00 UTC

[jira] [Commented] (SOLR-16820) PackageUtils collection validation is more restrictive than CreateCollectionAPI allows

    [ https://issues.apache.org/jira/browse/SOLR-16820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17727574#comment-17727574 ] 

Jason Gerlowski commented on SOLR-16820:
----------------------------------------

bq. As highlighted by Gus Heck in this thread changing the validation of collection names could be a risky change to make

I agree with Gus' word of caution in general, but the cluster of tickets (SOLR-8725, SOLR-8677, SOLR-8110, etc.) that set up the name validation Solr uses today at creation time had all landed by Solr 7.0, so that validation has been in place for 3 major versions at this point! And as a purely "widening" change to the package manager, it should be safe by definition.

[~willdotwhite] - would you be willing to put together a PR using the SIV-based approach you used [here|https://github.com/apache/solr/commit/638fd768ebd7ed7908029ced08e56bed05a4a2a5]?

> PackageUtils collection validation is more restrictive than CreateCollectionAPI allows
> --------------------------------------------------------------------------------------
>
>                 Key: SOLR-16820
>                 URL: https://issues.apache.org/jira/browse/SOLR-16820
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Package Manager
>            Reporter: Will White
>            Priority: Minor
>              Labels: packagemanager
>
> It's possible to create a collection via the CreateCollectionAPI which [passes validation from the SolrIdentifierValidator|https://github.com/apache/solr/blob/main/solr/solrj/src/java/org/apache/solr/client/solrj/util/SolrIdentifierValidator.java#L50-L52] (a regex which, among other elements, includes the '.' character), but that same collection name won't then pass validation when deployed/undeployed via the PackageTool because of the [packagemanager.PackageUtils validateCollections() method|https://github.com/apache/solr/blob/main/solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java#L271].
> A change [like this, using the existing SolrIdentifierValidator|https://github.com/apache/solr/commit/638fd768ebd7ed7908029ced08e56bed05a4a2a5] would bring the two validation steps back in line, although there's presumably a better approach.
> *Potential risks*
> As highlighted by Gus Heck [in this thread|https://lists.apache.org/thread/h7hnksgqwxxl7nkwkhn01r6jn8xjkjjs], changing the validation of collection names could be a risky change to make. The source of the PackageUtils regex appears to be [https://github.com/apache/lucene-solr/pull/994] from before Solr split from the Lucene project, and it seems that the regex wasn't crafted for a particular set of use cases that deliberately excluded the '.' character; it just appears to be the regex implemented at the time.
> Using the {{SolrIdentifierValidator}} approach mentioned above as an example, other than disallowing a collection name that begins with a '-' character, the {{SolrIdentifierValidator.identifierPattern}} would be a strict expansion of the allowed collection names for {{PackageUtils.validateCollections}}. Any other solution (such as [this more naive example|https://github.com/apache/solr/blame/998fffdccf51a0560589e2cb413e9da127a5f26e/solr/core/src/java/org/apache/solr/packagemanager/PackageUtils.java#L271]) could similarly mitigate a lot of the potential risk by only expanding the allowed collection names.
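
To make the "widening" argument concrete, here is a rough, standalone sketch comparing the two validation rules. Both regexes below are assumptions transcribed from memory and should be checked against the linked sources before relying on them; the class and method names ({{CollectionNameCheck}}, {{packageUtilsAccepts}}, {{sivAccepts}}) are hypothetical and do not exist in Solr.

```java
import java.util.List;
import java.util.regex.Pattern;

public class CollectionNameCheck {
    // ASSUMPTION: stand-in for the regex inside PackageUtils.validateCollections
    // (letters, digits, '_' and '-' only). Verify against PackageUtils.java.
    static final Pattern PACKAGE_UTILS_PATTERN = Pattern.compile("^[a-zA-Z0-9_-]*$");

    // ASSUMPTION: stand-in for SolrIdentifierValidator.identifierPattern
    // (additionally allows '.', but rejects a leading '-').
    // Verify against SolrIdentifierValidator.java.
    static final Pattern SIV_PATTERN = Pattern.compile("^(?!\\-)[\\._A-Za-z0-9\\-]+$");

    // Hypothetical helper: would the package manager's current rule accept this name?
    static boolean packageUtilsAccepts(String name) {
        return PACKAGE_UTILS_PATTERN.matcher(name).matches();
    }

    // Hypothetical helper: would the creation-time rule accept this name?
    static boolean sivAccepts(String name) {
        return SIV_PATTERN.matcher(name).matches();
    }

    public static void main(String[] args) {
        // Print how each rule treats a handful of sample collection names.
        for (String name : List.of("products", "my_coll-1", "products.v2", "-products")) {
            System.out.printf("%-12s packageUtils=%-5b siv=%b%n",
                    name, packageUtilsAccepts(name), sivAccepts(name));
        }
    }
}
```

Under these assumed patterns, a name such as "products.v2" is rejected by the package-manager rule but accepted at creation time (the mismatch this ticket describes), while "-products" is the one kind of name mentioned above that the creation-time rule rejects and the package-manager rule accepts.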


