Posted to user@accumulo.apache.org by "michael.griffiths3@baesystems.com" <mi...@baesystems.com> on 2015/07/15 11:15:43 UTC

Combiner design clarification

Hi all,

I'm implementing a range of combiners. My assumption is that at scan time only a single combined value will be returned (regardless of whether the versioning iterator is applied). However, the iterator_design.txt document isn't clear on whether that assumption is correct:

A second consideration is that a Combiner is not guaranteed to see every Key-Value pair
which differ only by timestamp every time it is invoked. For example, if there are 5 Key-Value
pairs in a table which only differ by the timestamps 1, 2, 3, 4, and 5, it is not guaranteed that
every invocation of the Combiner will see 5 timestamps. One invocation might see the Values for
Keys with timestamp 1 and 4, while another invocation might see the Values for Keys with the
timestamps 1, 2, 4 and 5.

Is the above an implementation detail that we need to consider when writing combiners, but not in client code? Will the client only have one value returned to it?
Equally, does the statement above hold for other kinds of iterators that are not combiners?

Many thanks,

Michael

Michael Griffiths
Developer
BAE Systems Applied Intelligence
___________________________________________________________

 E: michael.griffiths3@baesystems.com

BAE Systems Applied Intelligence, Surrey Research Park, Guildford, Surrey, GU2 7RQ.
www.baesystems.com/ai


Re: Combiner design clarification

Posted by Eric Newton <er...@gmail.com>.
This paragraph probably refers to the fact that Key/Values may be combined
using a subset of the files for a tablet.

If you have 5 files, and 3 of them are being merged in a compaction, the
combiner won't see the keys from the other two files.  When you are doing a
scan, it will see all the entries.

-Eric
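
To illustrate the point about partial compactions, here is a minimal sketch in plain Java. It deliberately does not use the Accumulo API (a real combiner would implement Combiner.reduce(Key, Iterator<Value>)). The class name, the helper methods, and the sample values 2, 4, 6, 8, 10 are all hypothetical, and "compaction" is simulated by reducing the first three values before the final scan-time reduce. It suggests why a reduce function that is associative and commutative (such as a sum) gives the same scan result whether or not a partial compaction happened first, while a function such as a mean does not.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class PartialCombineSketch {

    // Associative and commutative: safe to apply to any subset and then re-apply.
    static long sum(List<Long> values) {
        long total = 0;
        for (long v : values) {
            total += v;
        }
        return total;
    }

    // Not associative: a mean of partial means is not the overall mean.
    // (Integer division is used only to keep the sketch simple.)
    static long mean(List<Long> values) {
        return sum(values) / values.size();
    }

    // Simulates a compaction that merges only the first three "files",
    // followed by a scan that sees the compacted value plus the two
    // values from the files that were not part of the compaction.
    static long compactThenScan(List<Long> values, Function<List<Long>, Long> reduce) {
        long compacted = reduce.apply(values.subList(0, 3));
        List<Long> seenAtScan = Arrays.asList(compacted, values.get(3), values.get(4));
        return reduce.apply(seenAtScan);
    }

    public static void main(String[] args) {
        // Hypothetical values for five Key-Value pairs differing only by timestamp.
        List<Long> values = Arrays.asList(2L, 4L, 6L, 8L, 10L);

        System.out.println("sum, all five at once:        " + sum(values));
        System.out.println("sum, after partial compaction: "
                + compactThenScan(values, PartialCombineSketch::sum));

        System.out.println("mean, all five at once:        " + mean(values));
        System.out.println("mean, after partial compaction: "
                + compactThenScan(values, PartialCombineSketch::mean));
    }
}
```

With these values the two sum results agree (30 and 30), while the two mean results differ (6 and 7). Under the assumptions above, that is the practical consequence of the quoted paragraph: the client still gets one combined value at scan time, but the reduce logic has to tolerate being applied to subsets.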



