Posted to dev@lucene.apache.org by "Mark Miller (JIRA)" <ji...@apache.org> on 2014/12/23 21:18:13 UTC

[jira] [Commented] (SOLR-6887) SolrResourceLoader does not canonicalise the path

    [ https://issues.apache.org/jira/browse/SOLR-6887?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14257459#comment-14257459 ] 

Mark Miller commented on SOLR-6887:
-----------------------------------

This is really interesting! Why is it not a problem for the first collection?

What interests me most, though, is why this is happening at all. I think it's because a path that starts with '.' is treated as hidden, so Java will actually say it doesn't exist!

{code}
    public int getBooleanAttributes(File f) {
        int rv = getBooleanAttributes0(f);
        String name = f.getName();
        boolean hidden = (name.length() > 0) && (name.charAt(0) == '.');
        return rv | (hidden ? BA_HIDDEN : 0);
    }
{code}

So these paths that start with ../ are treated as hidden files, and if you canonicalize them (or, I assume, call getAbsolutePath), it works.

Very surprising to me if that is the case. That is a really interesting ugly corner.
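
A quick way to test this guess is a small standalone check, sketched below. The class name, the report helper and the temp-directory layout are made up for illustration; none of this is Solr code. One thing worth keeping in mind when reading the result: File.getName() returns only the last path element, so for ../../../dist it is "dist", not "..".

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical diagnostic, not part of Solr: shows what java.io.File reports
// for a path that contains "../" segments.
class DotDotPathCheck {

    /** Summarise the attributes relevant to the "hidden path" guess. */
    static String report(File f) throws IOException {
        return "name=" + f.getName()            // last path element only
            + " hidden=" + f.isHidden()
            + " exists=" + f.exists()
            + " canonical=" + f.getCanonicalPath();
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout: <tmp>/dist and <tmp>/node1/solr/core1
        File root = Files.createTempDirectory("solr6887").toFile();
        File core = new File(root, "node1/solr/core1");
        new File(root, "dist").mkdirs();
        core.mkdirs();

        // Same shape as the path in the WARN line of the issue.
        File viaDots = new File(core, "../../../dist");
        System.out.println(report(viaDots));
    }
}
```

If exists comes back true here, it may be worth repeating the check without creating the core directory first, to see whether a missing intermediate directory changes the answer.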

> SolrResourceLoader does not canonicalise the path
> -------------------------------------------------
>
>                 Key: SOLR-6887
>                 URL: https://issues.apache.org/jira/browse/SOLR-6887
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.10.2
>            Reporter: Martijn Koster
>            Priority: Minor
>
> I get 
> {quote}
> Can't find (or read) directory to add to classloader
> {quote}
> errors for valid config files.
> To reproduce:
> Step 1: start a Solr instance with ZooKeeper (default collection, 1 node, 1 shard):
> {noformat}
> tar xvf ~/Downloads/solr-4.10.2.tgz 
> cd solr-4.10.2/
> ./bin/solr -e cloud
> Welcome to the SolrCloud example!
> This interactive session will help you launch a SolrCloud cluster on your local workstation.
> To begin, how many Solr nodes would you like to run in your local cluster? (specify 1-4 nodes) [2] 1
> Ok, let's start up 1 Solr nodes for your example SolrCloud cluster.
> Please enter the port for node1 [8983] 
> 8983
> Cloning /Users/mak/solr-4.10.2/example into /Users/mak/solr-4.10.2/node1
> Starting up SolrCloud node1 on port 8983 using command:
> solr start -cloud -d node1 -p 8983   
> Waiting to see Solr listening on port 8983 [/]  
> Started Solr server on port 8983 (pid=14245). Happy searching!
> Now let's create a new collection for indexing documents in your 1-node cluster.
> Please provide a name for your new collection: [gettingstarted] 
> gettingstarted
> How many shards would you like to split gettingstarted into? [2] 1
> 1
> How many replicas per shard would you like to create? [2] 1
> 1
> Please choose a configuration for the gettingstarted collection, available options are: default or schemaless [default] 
> default
> Deploying default Solr configuration files to embedded ZooKeeper using command:
> /Users/mak/solr-4.10.2/example/scripts/cloud-scripts/zkcli.sh -zkhost localhost:9983 -cmd upconfig -confdir /Users/mak/solr-4.10.2/example/solr/collection1/conf -confname default
> Successfully deployed the /Users/mak/solr-4.10.2/example/solr/collection1/conf configuration directory to ZooKeeper as default
> Creating new collection gettingstarted with 1 shards and replication factor 1 using Collections API command:
> http://localhost:8983/solr/admin/collections?action=CREATE&name=gettingstarted&replicationFactor=1&numShards=1&collection.configName=default&maxShardsPerNode=1&wt=json&indent=2
> For more information about the Collections API, please see: https://cwiki.apache.org/confluence/display/solr/Collections+API
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":2139},
>   "success":{
>     "":{
>       "responseHeader":{
>         "status":0,
>         "QTime":1906},
>       "core":"gettingstarted_shard1_replica1"}}}
> {noformat}
> Verify the server is running at http://localhost:8983/solr/#/
> Step 2: duplicate the ZooKeeper config:
> {noformat}
> mkdir zkshell
> cd zkshell/
> virtualenv venv
> source venv/bin/activate
> pip install zk_shell
> zk-shell localhost:9983
> Welcome to zk-shell (0.99.05)
> (CONNECTING) /> 
> (CONNECTED) /> 
> (CONNECTED) /> cd configs
> (CONNECTED) /configs> cp myconf myconf2 true
> (CONNECTED) /configs> cd myconf
> (CONNECTED) /configs/myconf> get solrconfig.xml
> (CONNECTED) /configs> quit
> {noformat}
> Look at the config file, and note the {{<lib dir="../../../contrib/extraction/lib" regex=".*\.jar" />}}. That configuration comes from [somewhere like this|https://github.com/apache/lucene-solr/blob/lucene_solr_4_10_2/solr/example/solr/collection1/conf/solrconfig.xml#L75].
> Step 3: create a collection with the new config:
> {noformat}
> curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=collection2&collection.configName=myconf2&numShards=1'
> {noformat}
> Step 4: check the logs:
> {noformat}
> grep org.apache.solr.core.SolrResourceLoader ./node1/logs/solr.log
> ...
> INFO  - 2014-12-23 18:32:55.165; org.apache.solr.core.SolrResourceLoader; new SolrResourceLoader for directory: '/Users/mak/solr-4.10.2/node1/solr/collection2_shard1_replica1/'
> WARN  - 2014-12-23 18:32:55.218; org.apache.solr.core.SolrResourceLoader; Can't find (or read) directory to add to classloader: ../../../contrib/extraction/lib (resolved as: /Users/mak/solr-4.10.2/node1/solr/collection2_shard1_replica1/../../../contrib/extraction/lib).
> WARN  - 2014-12-23 18:32:55.218; org.apache.solr.core.SolrResourceLoader; Can't find (or read) directory to add to classloader: ../../../dist/ (resolved as: /Users/mak/solr-4.10.2/node1/solr/collection2_shard1_replica1/../../../dist).
> {noformat}
> Note the error for {{/Users/mak/solr-4.10.2/node1/solr/collection2_shard1_replica1/../../../dist}}.
> But that path does exist:
> {noformat}
> ls /Users/mak/solr-4.10.2/node1/solr/collection2_shard1_replica1/../../../dist | grep '\.jar$' | wc -l
>       32
> {noformat}
> but the {{../../..}} causes trouble here.
> The error message comes from https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/core/SolrResourceLoader.java#L192 
> If I modify the code to add a File.getCanonicalFile() call ([gist|https://gist.github.com/makuk66/20b04e4a8e4ff682714f], which also adds some expanded error checking that I've not verified), it works without error.
> To patch:
> {noformat}
> mak@crab 529 lucene-solr [detached] $ git co lucene_solr_4_10_2
> HEAD is now at be4a1a6... Lucene/Solr release 4.10.2
> mak@crab 545 lucene-solr [detached] $ curl -# -o p https://gist.githubusercontent.com/makuk66/20b04e4a8e4ff682714f/raw/c0981917ffad96e53939902e4a7938f2ee264d89/gistfile1.diff 
> ######################################################################## 100.0%
> mak@crab 546 lucene-solr [detached] $ patch -p1 < p
> patching file solr/core/src/java/org/apache/solr/core/SolrResourceLoader.java
> mak@crab 548 lucene-solr [detached] $ ant compile && (cd solr; ant dist-war)
> ...
> mak@crab 552 lucene-solr [detached] $ killall java
> mak@crab 554 lucene-solr [detached] $ rm -fr ~/solr-4.10.2/
> mak@crab 537 ~ $ tar xf ~/Downloads/solr-4.10.2.tgz 
> mak@crab 541 ~ $ cp ~/github/lucene-solr/solr/dist/solr-4.10.2-SNAPSHOT.war solr-4.10.2/dist/solr-4.10.2.war 
> mak@crab 544 ~ $ cp ~/github/lucene-solr/solr/dist/solr-4.10.2-SNAPSHOT.war solr-4.10.2//example/webapps/solr.war
> {noformat}
> Then reproduce as above, but with "collection3" instead of "collection2", and now we get:
> {noformat}
> INFO  - 2014-12-23 19:12:34.406; org.apache.solr.core.SolrResourceLoader; new SolrResourceLoader for directory: '/Users/mak/solr-4.10.2/node1/solr/collection3_shard1_replica1/'
> INFO  - 2014-12-23 19:12:34.453; org.apache.solr.core.SolrResourceLoader; Adding 'file:/Users/mak/solr-4.10.2/contrib/extraction/lib/apache-mime4j-core-0.7.2.jar' to classloader
> {noformat}
> So I conclude that the canonicalisation is desirable here.
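
For reference, the idea of canonicalising before the existence check could look roughly like the sketch below. This is not the patch from the gist: resolveLibDir and the class name are hypothetical, and the only assumption carried over from the issue is that a relative lib path is resolved against the core's instance directory.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, not the code from the gist or from SolrResourceLoader.
class CanonicalLibDir {

    /**
     * Resolve a possibly relative lib path against the instance dir, then
     * canonicalise it so "../" segments are collapsed before any check.
     */
    static File resolveLibDir(File instanceDir, String path) throws IOException {
        File f = new File(path);
        if (!f.isAbsolute()) {
            f = new File(instanceDir, path);
        }
        return f.getCanonicalFile();
    }

    public static void main(String[] args) throws IOException {
        // Paths mirror the shape of the ones in the log; adjust to a real install.
        File dir = resolveLibDir(
            new File("node1/solr/collection2_shard1_replica1"), "../../../dist");
        boolean usable = dir.isDirectory() && dir.canRead();
        System.out.println(dir + " usable=" + usable);
    }
}
```

A side effect of this approach is that any warning then reports the collapsed path rather than the one with ../ segments in it, which is probably easier to read in the logs.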


