Posted to dev@lucene.apache.org by "Radu Gheorghe (JIRA)" <ji...@apache.org> on 2017/10/16 16:00:04 UTC

[jira] [Updated] (SOLR-11473) Make HDFSDirectoryFactory support other prefixes (besides hdfs:/)

     [ https://issues.apache.org/jira/browse/SOLR-11473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Radu Gheorghe updated SOLR-11473:
---------------------------------
    Attachment: SOLR-11473.patch

Thanks, [~thetaphi], [~mdrob] and [~thelabdude] for your comments. I tried specifying HdfsUpdateLog for the transaction log. That does help with storing the transaction log in Alluxio, but it doesn't fix this path issue.

I'm attaching a patch that would fix this, with a few comments:
* I've only really changed the isAbsolute() method. I looked around in HDFSDirectoryFactory and didn't find other places where the change would be useful. Maybe I'm wrong, or maybe this could be refactored to look nicer, but I thought starting with a minimal patch would be better :)
* While I tested this patch with Alluxio and it worked well, I didn't add any unit tests. On one hand, a test would basically be testing the URI class, which seemed pointless. On the other hand, it would be nice to make sure we don't lose this functionality in the future (i.e. that non-hdfs:/ paths keep working). Let me know if tests are needed, or if you have any suggestions on what should be tested (I'm thinking of an hdfs:/ path, an alluxio:/ path and a relative path).
* This was all tested with Solr 6.6.1, and I've based my changes on the 6_6 branch from GitHub. I didn't add anything to CHANGES.txt because it's unclear to me where this change would go, or whether a CHANGES.txt entry should be added at this stage, without knowing the Fix Version. Also, `git format-patch` misbehaved for me, so I've generated the patch with `git diff`. Is that OK?

Besides the last question, do you have any other thoughts or questions?
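
For illustration, a scheme-based check and the three test cases mentioned above (an hdfs:/ path, an alluxio:/ path and a relative path) could look roughly like the sketch below. This is not the attached patch: the class name AbsolutePathSketch is made up, and it assumes that java.net.URI is an acceptable way to decide whether a path is absolute.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical stand-in for the isAbsolute() method of HDFSDirectoryFactory.
// The class name and structure are illustrative only, not the attached patch.
final class AbsolutePathSketch {

    // Assumption: a path counts as absolute when it parses as a URI with a
    // scheme (hdfs:, alluxio:, ...). Strings that are not valid URIs are
    // treated as not absolute.
    static boolean isAbsolute(String path) {
        try {
            return new URI(path).isAbsolute();
        } catch (URISyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The three cases suggested for a unit test.
        System.out.println(isAbsolute("hdfs:/solr/test/"));
        System.out.println(isAbsolute("alluxio://localhost:19998/solr/test/"));
        System.out.println(isAbsolute("core_node1/data"));
    }
}
```

One consequence of this approach is that any scheme is accepted, not just hdfs:/ and alluxio:/, so it is more permissive than a fixed list of prefixes.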

> Make HDFSDirectoryFactory support other prefixes (besides hdfs:/)
> -----------------------------------------------------------------
>
>                 Key: SOLR-11473
>                 URL: https://issues.apache.org/jira/browse/SOLR-11473
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: hdfs
>    Affects Versions: 6.6.1
>            Reporter: Radu Gheorghe
>            Priority: Minor
>         Attachments: SOLR-11473.patch
>
>
> Not sure if it's a bug or a missing feature :) I'm trying to make Solr work on Alluxio, as described by [~thelabdude] in https://www.slideshare.net/thelabdude/running-solr-in-the-cloud-at-memory-speed-with-alluxio/1
> The problem I'm facing here is with autoAddReplicas. If I have replicationFactor=1 and the node with that replica dies, the node taking over incorrectly assigns the data directory. For example:
> before
> {code}"dataDir":"alluxio://localhost:19998/solr/test/",{code}
> after
> {code}"dataDir":"alluxio://localhost:19998/solr/test/core_node1/alluxio://localhost:19998/solr/test/",{code}
> The same happens for ulogDir. Apparently, this has to do with this bit from HDFSDirectoryFactory:
> {code}  public boolean isAbsolute(String path) {
>     return path.startsWith("hdfs:/");
>   }{code}
> If I add "alluxio:/" in there, the paths are correct and the index is recovered.
> I see a few options here:
> * add "alluxio:/" to the list there
> * add a regular expression along the lines of \[a-z]*:/. I hope that's not too expensive; I'm not sure how often this method is called
> * don't do anything and expect Alluxio to work with an "hdfs:/" path? I actually tried that and didn't manage to make it work
> * have a different DirectoryFactory or something else?
> What do you think?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
