Posted to issues@lucene.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2019/12/10 13:22:00 UTC
[jira] [Commented] (SOLR-13563) SPLITSHARD - Using LINK method fails on disk usage checks

    [ https://issues.apache.org/jira/browse/SOLR-13563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16992534#comment-16992534 ] 

ASF subversion and git services commented on SOLR-13563:
--------------------------------------------------------

Commit fed199df7b3370b27f173d221e52c4c6983e8020 in lucene-solr's branch refs/heads/master from Andrzej Bialecki
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=fed199d ]

SOLR-13563: SPLITSHARD using LINK method fails on disk usage checks.


> SPLITSHARD - Using LINK method fails on disk usage checks
> ---------------------------------------------------------
>
>                 Key: SOLR-13563
>                 URL: https://issues.apache.org/jira/browse/SOLR-13563
>             Project: Solr
>          Issue Type: Bug
>          Components: AutoScaling, SolrCloud
>    Affects Versions: 7.7.2
>            Reporter: Andrew Kettmann
>            Assignee: Andrzej Bialecki
>            Priority: Major
>             Fix For: 8.4
>
>         Attachments: disk_check.patch
>
>
> Raised this on the mailing list and was told to open an issue, copy/pasting the context here:
>  
> Using the Solr 7.7.2 Docker image, testing some of the new autoscale features, huge fan so far. Tested with the link method on a 2GB core and found that it took less than 1MB of additional space. Then filled the core quite a bit larger, to 12GB of a 20GB PVC, and now splitting the shard fails with the following error message on my overseer:
>  
>  
>  
> {code:java}
> 2019-06-18 16:27:41.754 ERROR 
> (OverseerThreadFactory-49-thread-5-processing-n:10.0.192.74:8983_solr) 
> [c:test_autoscale s:shard1  ] 
> o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: test_autoscale 
> operation: splitshard
> failed:org.apache.solr.common.SolrException: not enough free disk space 
> to perform index split on node 10.0.193.23:8983_solr, required: 
> 23.35038321465254, available: 7.811378479003906
>     at org.apache.solr.cloud.api.collections.SplitShardCmd.checkDiskSpace(SplitShardCmd.java:567)
>     at org.apache.solr.cloud.api.collections.SplitShardCmd.split(SplitShardCmd.java:138)
>     at org.apache.solr.cloud.api.collections.SplitShardCmd.call(SplitShardCmd.java:94)
>     at 
> org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:294)
>     at org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:505)
>     at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:209)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:834)
> {code}
>  
>  
> I attempted sending the request to the node itself to see if it did anything different, but no luck. My parameters are (Note Python formatting as that is my language of choice):
>  
> {code:python}
> splitparams = {'action':'SPLITSHARD',
>                'collection':'test_autoscale',
>                'shard':'shard1',
>                'splitMethod':'link',
>                'timing':'true',
>                'async':'shardsplitasync'}{code}
>  
>  
> And this is confirmed by the log message from the node itself:
>  
> {code:java}
> 2019-06-18 16:27:41.730 INFO  
> (qtp1107530534-16) [c:test_autoscale   ] o.a.s.s.HttpSolrCall [admin] 
> webapp=null path=/admin/collections 
> params={async=shardsplitasync&timing=true&action=SPLITSHARD&collection=test_autoscale&shard=shard1&splitMethod=link}
> status=0 QTime=20{code}
>  
>  
> While it is true that I would not have enough space for the rewrite method, the link method on a 2GB core used less than 1MB of additional space. Is there something I am missing here? Is there an option to disable the disk space check that I need to pass? I can't find anything in the documentation at this point.
>  
> --
>  
> After this initial email, I found the issue and compiled Solr with the attached patch. Running the modified build on the overseer alone resolved the issue, since the overseer is what runs the check.
>  
>  
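
For readers following along, the figures in the error message above (required: 23.35, available: 7.81, on a core of about 12GB) look consistent with a check that asks for roughly twice the index size no matter which split method is requested. The Python sketch below illustrates one way a method-aware check could behave. It is not the code in SplitShardCmd and not the attached disk_check.patch; the function name, the space factors, and the derived index size are all assumptions made for illustration.

```python
# Hypothetical space factors per split method. These are illustrative
# assumptions, NOT values taken from Solr or from the attached patch.
SPACE_FACTOR = {
    "rewrite": 2.0,   # assumed: rewriting needs about twice the index size
    "link": 0.05,     # assumed: hard-linking needs only a small fraction
}


def check_disk_space(index_size_gb, free_gb, split_method="rewrite"):
    """Return the required space in GB, or raise if free_gb is too small.

    A sketch of a method-aware check; the real Solr logic may differ.
    """
    factor = SPACE_FACTOR[split_method.lower()]
    required = factor * index_size_gb
    if free_gb < required:
        raise RuntimeError(
            "not enough free disk space to perform index split, "
            f"required: {required}, available: {free_gb}"
        )
    return required


# Figures from the log above; the index size is inferred as required / 2,
# which is itself an assumption.
required_in_log = 23.35038321465254
available_in_log = 7.811378479003906
index_size = required_in_log / 2

# With the assumed factors, a link split passes the check...
link_required = check_disk_space(index_size, available_in_log, "link")

# ...while a rewrite split fails, as in the reported error.
try:
    check_disk_space(index_size, available_in_log, "rewrite")
    rewrite_failed = False
except RuntimeError:
    rewrite_failed = True
```

Under these assumed factors, the same node that rejects a rewrite split would accept a link split, which matches the behaviour the reporter expected.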


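As a reference for reproducing the request quoted in the issue, here is a minimal Python sketch that turns the reporter's parameters into a Collections API URL using only the standard library. The base URL is an assumption: the host and port are copied from the overseer log line, and the /solr context path is assumed rather than taken from the report. The sketch only builds the URL and does not send it.

```python
from urllib.parse import urlencode

# Assumed base URL: host and port come from the overseer log line above,
# and the "/solr" context path is an assumption.
BASE_URL = "http://10.0.192.74:8983/solr/admin/collections"

# Parameters as given in the issue description.
splitparams = {
    "action": "SPLITSHARD",
    "collection": "test_autoscale",
    "shard": "shard1",
    "splitMethod": "link",
    "timing": "true",
    "async": "shardsplitasync",
}


def build_split_url(base_url, params):
    """Return the full request URL for the given query parameters."""
    return base_url + "?" + urlencode(params)


url = build_split_url(BASE_URL, splitparams)
print(url)
```

The resulting URL can be passed to any HTTP client. Because the request is asynchronous (async=shardsplitasync), the immediate response only acknowledges submission, so the disk space failure shows up in the overseer log rather than in that response.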
