You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Anshum Gupta (JIRA)" <ji...@apache.org> on 2017/10/02 16:47:00 UTC

[jira] [Commented] (SOLR-11277) Add auto hard commit setting based on tlog size

    [ https://issues.apache.org/jira/browse/SOLR-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188416#comment-16188416 ] 

Anshum Gupta commented on SOLR-11277:
-------------------------------------

Disclaimer: Me, and Rupa are coworkers and I’ve discussed this idea, and looked at this code before reviewing it here.

This looks really good! I'll be adding to this feedback, but here are a few things to start looking at:

CommitTracker.java
-  SIZE_COMMIT_DELAY_MS can be static

DirectUpdateHandler2.java
- Can we rename fileSizeUpperBound to tLogFileSizeUpperBound? Just so that it’s clear?
- In addedDocument, we should extract a method for the docsUpperBound part as well. It’s not directly a part of your change, but would be good to do while we are at it.
- Can we define a fileSizeUpperBound value of -1 to a static final and use it instead of hardcoding it in the CommitTracker constructor ?
- We need the currentTlogSize at multiple places, we should extract that into a method

SolrConfig.java
- convertAutoCommitMaxSizeStringToBytes is more generic, so either we should rename it to a more generic name like convertConfigStringToBytes or we should call it getAutoCommitMaxSizeInBytes, not pass the path, and have it default to -1.
- The Javadoc for convertAutoCommitMaxSizeStringToBytes doesn’t mention that it returns -1 when autoCommitMaxSizeStr is not set.
- I would like more information to be spit out with the RuntimeException. A good idea would be to highlight what the correct/accepted format looks like.
- UpdateHandlerInfo constructor now has autoCommitMaxSize but is missing the entry from the Javadoc

TransactionLog.java
- Synchronizing is required in getLogSizeFromStream(), but can we run a basic benchmark to make sure that this isn’t impacting the update throughput? 

bad-solrconfig-no-autocommit-tag.xml
- Can you add one line about what is this config used for. It would be a good idea to just replace the current “Minimal  solrconfig.xml with /select, /admin, and /update….” line.

Thanks for adding the javadocs to older methods and removing commented out code from years ago too :) 

> Add auto hard commit setting based on tlog size
> -----------------------------------------------
>
>                 Key: SOLR-11277
>                 URL: https://issues.apache.org/jira/browse/SOLR-11277
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Rupa Shankar
>            Assignee: Anshum Gupta
>         Attachments: max_size_auto_commit.patch
>
>
> When indexing documents of variable sizes and at variable schedules, it can be hard to estimate the optimal auto hard commit maxDocs or maxTime settings. We’ve had some occurrences of really huge tlogs, resulting in serious issues, so in an attempt to avoid this, it would be great to have a “maxSize” setting based on the tlog size on disk. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org