Posted to dev@lucene.apache.org by "Hari Sekhon (JIRA)" <ji...@apache.org> on 2015/03/18 15:32:39 UTC
[jira] [Comment Edited] (SOLR-6305) Ability to set the replication factor for index files created by HDFSDirectoryFactory
[ https://issues.apache.org/jira/browse/SOLR-6305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14367192#comment-14367192 ]
Hari Sekhon edited comment on SOLR-6305 at 3/18/15 2:32 PM:
------------------------------------------------------------
I'm also having problems with this in 4.10.3. I tried creating a separate Hadoop conf dir pointed to via solr.hdfs.confdir with dfs.replication=1 in hdfs-site.xml, then restarted all Solr instances and deleted and recreated the collection and dataDir, but found that it only set the write locks to replication factor 1 and still wrote the data/index/segments* files at replication factor 2. Even setting dfs.replication cluster-wide resulted in the same behaviour, which is odd (I didn't bounce the NN + DNs since this should be HDFS client / writer-side config).
Not sure if this is related to SOLR-6528.
was (Author: harisekhon):
I also tried creating a separate Hadoop conf dir pointed to via solr.hdfs.confdir with dfs.replication=1, then restarted all Solr instances and deleted and recreated the collection and dataDir, but found that it only set the write locks to replication factor 1 and still wrote the data/index/segments* files at replication factor 2. Even setting dfs.replication cluster-wide resulted in the same behaviour, which is odd (I didn't bounce the NN + DNs since this should be HDFS client / writer-side config).
Not sure if this is related to SOLR-6528.
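For context, the client-side override described above is the standard Hadoop dfs.replication property in an hdfs-site.xml placed under the directory that solr.hdfs.confdir points to. A minimal sketch of such a fragment (the standard property, not the exact file used above):

```xml
<!-- hdfs-site.xml fragment in the directory pointed to by solr.hdfs.confdir -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```

Note that dfs.replication is normally honoured by the HDFS client at file-creation time, which is why the index files keeping a higher replication factor despite this setting points at the writer ignoring client config.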
> Ability to set the replication factor for index files created by HDFSDirectoryFactory
> -------------------------------------------------------------------------------------
>
> Key: SOLR-6305
> URL: https://issues.apache.org/jira/browse/SOLR-6305
> Project: Solr
> Issue Type: Improvement
> Components: hdfs
> Environment: hadoop-2.2.0
> Reporter: Timothy Potter
>
> HdfsFileWriter doesn't allow us to create files in HDFS with a different replication factor than the configured DFS default because it uses:
> {{FsServerDefaults fsDefaults = fileSystem.getServerDefaults(path);}}
> Since we have two forms of replication going on when using HDFSDirectoryFactory, it would be nice to be able to set the HDFS replication factor for the Solr directories to a lower value than the default. I realize this might reduce the chance of data locality but since Solr cores each have their own path in HDFS, we should give operators the option to reduce it.
> My original thinking was to just use Hadoop setrep to customize the replication factor, but that's a one-time shot and doesn't affect new files created. For instance, I did:
> {{hadoop fs -setrep -R 1 solr49/coll1}}
> My default dfs.replication is set to 3 (hence the 3 above); I'm setting it to 1 just as an example
> Then added some more docs to the coll1 and did:
> {{hadoop fs -stat %r solr49/hdfs1/core_node1/data/index/segments_3}}
> 3 <-- should be 1
> So it looks like new files don't inherit the replication factor from their parent directory.
> Not sure if we need to go as far as allowing different replication factor per collection but that should be considered if possible.
> I looked at the Hadoop 2.2.0 code to see if there was a way to work through this using the Configuration object but nothing jumped out at me ... and the implementation for getServerDefaults(path) is just:
> public FsServerDefaults getServerDefaults(Path p) throws IOException {
>   return getServerDefaults();
> }
> Path is ignored ;-)
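One way around the server-defaults limitation is the FileSystem.create overload that takes an explicit replication factor. The sketch below is hypothetical (it is not the SOLR-6305 patch, and the solr.hdfs.replication.factor property name is made up for illustration); it shows reading an override from the Configuration and falling back to the server default:

```java
// Hypothetical sketch: pass an explicit replication factor to HDFS instead of
// relying on getServerDefaults(), which ignores the path entirely.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsServerDefaults;
import org.apache.hadoop.fs.Path;

public class ReplicationAwareWriter {
  public static FSDataOutputStream create(FileSystem fs, Configuration conf,
                                          Path path) throws IOException {
    FsServerDefaults defaults = fs.getServerDefaults(path);
    // "solr.hdfs.replication.factor" is a hypothetical property name;
    // fall back to the server default when no override is configured.
    short replication = (short) conf.getInt("solr.hdfs.replication.factor",
                                            defaults.getReplication());
    int bufferSize = conf.getInt("io.file.buffer.size", 4096);
    // FileSystem.create(Path, overwrite, bufferSize, replication, blockSize)
    // honours the replication argument at file-creation time.
    return fs.create(path, true, bufferSize, replication,
                     defaults.getBlockSize());
  }
}
```

Because the replication factor is fixed when the file is created, this would also explain why a one-time `hadoop fs -setrep` has no effect on segments written afterwards.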
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org