Posted to dev@accumulo.apache.org by "David Medinets (Created) (JIRA)" <ji...@apache.org> on 2012/01/05 16:48:39 UTC

[jira] [Created] (ACCUMULO-251) Add wording to README.bloom about reason for flushing.

Add wording to README.bloom about reason for flushing.
------------------------------------------------------

                 Key: ACCUMULO-251
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-251
             Project: Accumulo
          Issue Type: Improvement
          Components: docs
            Reporter: David Medinets
            Assignee: Adam Fuchs
            Priority: Trivial


The README.bloom file says this:

 * Insert 1 million entries using  RandomBatchWriter with a seed of 7
 * Flush the table using the shell
 * Insert 1 million entries using  RandomBatchWriter with a seed of 8
 * Flush the table using the shell
 * Insert 1 million entries using  RandomBatchWriter with a seed of 9
 * Flush the table using the shell

However, no reasons are given for why three flushes are used instead of one. Please explain the reasons.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (ACCUMULO-251) Add wording to README.bloom about reason for flushing.

Posted by "Keith Turner (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Turner resolved ACCUMULO-251.
-----------------------------------

       Resolution: Fixed
    Fix Version/s: 1.4.0
    


[jira] [Commented] (ACCUMULO-251) Add wording to README.bloom about reason for flushing.

Posted by "David Medinets (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180485#comment-13180485 ] 

David Medinets commented on ACCUMULO-251:
-----------------------------------------

From K. Turner on the mailing list: The flushes after each insert are there for a specific purpose: to
ensure the data written with different seeds ends up in different
files.  This is done to show that at scan time the bloom filter will
let you skip seeking 2 of 3 files.
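
The effect described above can be sketched with a small, self-contained simulation. This is only an illustration, not Accumulo code: the BloomFilter and DataFile classes, the helper names, and the scaled-down row count (1000 rows per seed instead of 1 million) are all assumptions made for the example. Each "file" stands for the output of one flush and carries its own bloom filter, and a lookup consults the filter before counting a seek.

```python
import hashlib
import random


class BloomFilter:
    """Tiny Bloom filter: a few hash positions per key in a fixed bit array."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


class DataFile:
    """Hypothetical stand-in for one file produced by one flush."""

    def __init__(self, rows):
        self.rows = set(rows)
        self.bloom = BloomFilter()
        for row in self.rows:
            self.bloom.add(row)
        self.seeks = 0

    def lookup(self, row, use_bloom):
        if use_bloom and not self.bloom.might_contain(row):
            return False  # file skipped, no seek needed
        self.seeks += 1
        return row in self.rows


def rows_for_seed(seed, count=1000):
    # Same seed gives the same rows, loosely mirroring the seeded inserts.
    rng = random.Random(seed)
    return [f"row_{rng.randrange(10**9):010d}" for _ in range(count)]


def count_seeks(use_bloom):
    # One file per seed, as if the table had been flushed after each insert.
    files = [DataFile(rows_for_seed(seed)) for seed in (7, 8, 9)]
    for row in rows_for_seed(7):
        # Consult every file, as a scan must; the list forces all lookups to run.
        found = any([f.lookup(row, use_bloom) for f in files])
        assert found
    return [f.seeks for f in files]


if __name__ == "__main__":
    print("seeks per file without bloom filters:", count_seeks(False))
    print("seeks per file with bloom filters:   ", count_seeks(True))
```

Without the filters every lookup seeks all three files. With them, the file written with seed 7 is still sought for every row, while the other two files are sought only on rare false positives. If all three inserts had been flushed together into a single file, there would be nothing to skip, which is why the example flushes after each insert.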

Part of the confusion is that I did not read the README carefully enough. I missed this part: "To illustrate this two identical tables were created using the following process." I expected the README to walk me through the steps so that I could replicate the results, but its purpose is simply to report times.

I suggest the README be expanded into a step-by-step process to replicate the results. If y'all agree, it can be a separate ticket.
                


[jira] [Work started] (ACCUMULO-251) Add wording to README.bloom about reason for flushing.

Posted by "Keith Turner (Work started) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on ACCUMULO-251 started by Keith Turner.



[jira] [Commented] (ACCUMULO-251) Add wording to README.bloom about reason for flushing.

Posted by "Keith Turner (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183499#comment-13183499 ] 

Keith Turner commented on ACCUMULO-251:
---------------------------------------

I pressed enter while typing my commit message.  The following commit is for this ticket.

http://svn.apache.org/viewvc?view=revision&revision=1229699
                


[jira] [Commented] (ACCUMULO-251) Add wording to README.bloom about reason for flushing.

Posted by "David Medinets (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180493#comment-13180493 ] 

David Medinets commented on ACCUMULO-251:
-----------------------------------------

From A. Fuchs on the mailing list: I think the confusion here might be that there are two different operations called "flush". One is the flush of the BatchWriter's local buffer, and the other is the flush of the TabletServer's in-memory map (AKA minor compaction). This example refers to the latter. There are also auto-flushes in both cases, but the flush in this case is effectively forcing the minor-compaction operation with a known quantity of data.

This information might also be useful in the README.bloom file.
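
To make the distinction between the two flushes concrete, here is a toy model. It is a sketch under stated assumptions, not the Accumulo API: ToyBatchWriter, ToyTabletServer, the max_buffered threshold, and the row naming are hypothetical names invented for this illustration. The client-side flush only moves buffered data into the server's memory; the server-side flush (the minor compaction) is what turns that memory into a file.

```python
class ToyTabletServer:
    """Toy server: an in-memory map plus a list of files written by flushes."""

    def __init__(self):
        self.in_memory_map = {}
        self.files = []

    def apply(self, mutations):
        # Data sent by a client lands in memory first, not in a file.
        self.in_memory_map.update(mutations)

    def flush(self):
        """Server-side flush (minor compaction): write memory out as one file."""
        if self.in_memory_map:
            self.files.append(dict(sorted(self.in_memory_map.items())))
            self.in_memory_map = {}


class ToyBatchWriter:
    """Toy client writer with a local buffer (not the real Accumulo class)."""

    def __init__(self, server, max_buffered=100):
        self.server = server
        self.max_buffered = max_buffered  # hypothetical auto-flush threshold
        self.buffer = {}

    def add(self, row, value):
        self.buffer[row] = value
        if len(self.buffer) >= self.max_buffered:
            self.flush()  # auto-flush when the local buffer fills up

    def flush(self):
        """Client-side flush: send buffered data to the server. No file is created."""
        self.server.apply(self.buffer)
        self.buffer = {}


def demo():
    server = ToyTabletServer()
    writer = ToyBatchWriter(server)
    for seed in (7, 8, 9):
        for i in range(250):
            writer.add(f"seed{seed}_row{i}", "value")
        writer.flush()  # client-side: data reaches the server's memory
        server.flush()  # server-side: a known quantity of data becomes one file
    return server


if __name__ == "__main__":
    result = demo()
    print("files written:", len(result.files))
    print("entries per file:", [len(f) for f in result.files])
```

In this model, calling only the writer's flush leaves the data in the server's in-memory map, and the number and size of files would then depend on when the server decided to flush on its own. Forcing the server-side flush after each batch gives one file per seed.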
                


[jira] [Assigned] (ACCUMULO-251) Add wording to README.bloom about reason for flushing.

Posted by "Keith Turner (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Turner reassigned ACCUMULO-251:
-------------------------------------

    Assignee: Keith Turner  (was: Adam Fuchs)
    
