Posted to dev@asterixdb.apache.org by Muhammad Abu Bakar Siddique <ms...@ucr.edu> on 2018/04/10 17:33:55 UTC

Is there an easier way to wrap/unwrap the entire tuple as a ByteBuffer?

Hi Dev,
I'm working on a Hyracks application for parallel random sampling that
consists of two operators. The first operator generates a new field and
appends it to each tuple; the second operator processes that additional
field and removes it before writing the final output. So, the output of the
second operator should have the same format as the input of the first
operator. In other words, I want the first operator to wrap the tuple as-is
and add one field, and the second operator to remove that field and
unwrap the tuple. Currently, I use FrameTupleAppender and
ArrayTupleBuilder, which means adding each field of the input record
separately, and that seems like unnecessary overhead in the code. Is there
an easier way to wrap/unwrap the entire tuple as a ByteBuffer without having
to worry about the individual fields inside it?

Re: Is there an easier way to wrap/unwrap the entire tuple as a ByteBuffer?

Posted by Mike Carey <dt...@gmail.com>.
Note that this is the AsterixDB format, right? (Hyracks - at its level - 
doesn't dictate the contents' details AFAIK except for its built-in 
primitive types; the rest are black boxes and accessed by functions like 
comparators, etc.)


On 4/10/18 5:36 PM, Taewoo Kim wrote:
> Hello Ahmed,
>
> This doc might help.
> https://code.google.com/archive/p/asterixdb/wikis/Serialization.wiki
>
> Best,
> Taewoo


Re: Is there an easier way to wrap/unwrap the entire tuple as a ByteBuffer?

Posted by Taewoo Kim <wa...@gmail.com>.
Hello Ahmed,

This doc might help.
https://code.google.com/archive/p/asterixdb/wikis/Serialization.wiki

Best,
Taewoo


Re: Is there an easier way to wrap/unwrap the entire tuple as a ByteBuffer?

Posted by Ahmed Eldawy <el...@ucr.edu>.
Mike,

What you're suggesting makes more sense. We just don't know how to do it :)
BTW, is there any document that describes the binary format of the
frame/tuple/fields? I was able to find out some information myself by
digging into the code, but a document or page that describes this would be
a great help.



-- 

Ahmed Eldawy
Assistant Professor
http://www.cs.ucr.edu/~eldawy
Tel: +1 (951) 827-5654

Re: Is there an easier way to wrap/unwrap the entire tuple as a ByteBuffer?

Posted by Mike Carey <dt...@gmail.com>.
Naive (me as a stupid observer :-)) question:  Is there a reason to 
wrap/unwrap instead of extend/unextend? (I.e., couldn't you add an 
additional Hyracks tuple field and then project it away - i.e., expand 
and contract the tuple horizontally rather than nesting and unnesting it?)
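
The extend/project idea can be sketched without reading any individual
field value. The following is a minimal sketch in plain Java, assuming a
simplified tuple layout: one 4-byte end-offset slot per field (relative to
the start of the data region), followed by the concatenated field bytes.
The class TupleExtendSketch and its methods are hypothetical and are not
the Hyracks API. The Hyracks FrameTupleAppender may already expose methods
along these lines (appendConcat and appendProjection are worth checking in
the version you build against); treat that as an assumption to verify.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Simplified stand-in for a serialized tuple: n int end-offset slots, then field bytes. */
public final class TupleExtendSketch {
    static final int SLOT = Integer.BYTES;

    /** Builds a tuple from individual fields (used here only to create sample input). */
    public static byte[] build(byte[][] fields) {
        int dataLen = 0;
        for (byte[] f : fields) {
            dataLen += f.length;
        }
        ByteBuffer out = ByteBuffer.allocate(fields.length * SLOT + dataLen);
        int end = 0;
        for (byte[] f : fields) {
            end += f.length;
            out.putInt(end);                  // end offset of this field within the data region
        }
        for (byte[] f : fields) {
            out.put(f);
        }
        return out.array();
    }

    /** Extends the tuple by one field; the original data region is copied in one piece. */
    public static byte[] extend(byte[] tuple, int fieldCount, byte[] extra) {
        int slotsLen = fieldCount * SLOT;
        int dataLen = tuple.length - slotsLen;
        ByteBuffer out = ByteBuffer.allocate(tuple.length + SLOT + extra.length);
        out.put(tuple, 0, slotsLen);          // existing end offsets stay valid
        out.putInt(dataLen + extra.length);   // end offset of the new last field
        out.put(tuple, slotsLen, dataLen);    // all original field bytes, one bulk copy
        out.put(extra);
        return out.array();
    }

    /** Projects the last field away, restoring the layout the tuple had before extend(). */
    public static byte[] dropLast(byte[] tuple, int fieldCount) {
        int keep = fieldCount - 1;
        int keptDataLen = keep == 0 ? 0 : ByteBuffer.wrap(tuple).getInt((keep - 1) * SLOT);
        ByteBuffer out = ByteBuffer.allocate(keep * SLOT + keptDataLen);
        out.put(tuple, 0, keep * SLOT);                  // slots of the kept fields
        out.put(tuple, fieldCount * SLOT, keptDataLen);  // data of the kept fields
        return out.array();
    }

    public static void main(String[] args) {
        byte[] original = build(new byte[][] {{1, 2}, {3, 4, 5}});
        byte[] extended = extend(original, 2, new byte[] {9});
        byte[] restored = dropLast(extended, 3);
        System.out.println(Arrays.equals(original, restored)); // prints true
    }
}
```

Because the end offsets are relative to the data region, adding or dropping
a trailing field leaves the earlier slots unchanged, so both operators only
need the field count, not the field contents.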




Re: Is there an easier way to wrap/unwrap the entire tuple as a ByteBuffer?

Posted by Chen Luo <cl...@uci.edu>.
Hi,

You can try IFrameFieldAppender (and its implementation
FrameFixedFieldAppender) to append the wrapped tuple directly (field by
field) to the output buffer, without going through the array tuple builder.
But in general, because of the tuple format, I'm not sure there is a more
efficient way to wrap/unwrap tuples directly.

Best regards,
Chen Luo
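
The field-by-field approach described above can be illustrated with a small
sketch in plain Java. It assumes the same simplified layout as a Hyracks-like
tuple (one 4-byte end-offset slot per field, then the field bytes) and a
fixed number of fields per tuple. FixedFieldWriterSketch and its methods are
hypothetical names for illustration only; they do not reproduce the
signatures of IFrameFieldAppender or FrameFixedFieldAppender.

```java
import java.nio.ByteBuffer;

/** Hypothetical writer that appends fields of fixed-arity tuples straight into a buffer. */
public final class FixedFieldWriterSketch {
    private final ByteBuffer out;
    private final int fieldCount;
    private int tupleStart;     // where the current tuple's slots begin
    private int fieldsWritten;  // fields appended to the current tuple so far
    private int dataLen;        // bytes of field data in the current tuple so far

    public FixedFieldWriterSketch(ByteBuffer out, int fieldCount) {
        this.out = out;
        this.fieldCount = fieldCount;
    }

    /** Copies one field into the buffer; returns false if it does not fit. */
    public boolean appendField(byte[] bytes, int offset, int length) {
        int dataStart = tupleStart + fieldCount * Integer.BYTES;
        if (dataStart + dataLen + length > out.capacity()) {
            return false; // a real appender would flush the frame and retry
        }
        ByteBuffer view = out.duplicate();      // shares content, has its own position
        view.position(dataStart + dataLen);
        view.put(bytes, offset, length);        // field bytes go directly to the output
        dataLen += length;
        out.putInt(tupleStart + fieldsWritten * Integer.BYTES, dataLen); // fill this field's slot
        fieldsWritten++;
        if (fieldsWritten == fieldCount) {      // tuple complete: start the next one
            tupleStart = dataStart + dataLen;
            fieldsWritten = 0;
            dataLen = 0;
        }
        return true;
    }

    /** Number of bytes occupied by completed tuples. */
    public int bytesUsed() {
        return tupleStart;
    }

    public static void main(String[] args) {
        ByteBuffer frame = ByteBuffer.allocate(64);
        FixedFieldWriterSketch writer = new FixedFieldWriterSketch(frame, 2);
        writer.appendField(new byte[] {1, 2}, 0, 2);
        writer.appendField(new byte[] {3}, 0, 1);
        System.out.println(writer.bytesUsed()); // prints 11 (8 slot bytes + 3 data bytes)
    }
}
```

Since the field count is known up front, the slot area can be reserved
before any data arrives, which is what removes the need for an intermediate
tuple builder in this sketch.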
