Posted to mapreduce-user@hadoop.apache.org by Ken Sullivan <su...@mayachitra.com> on 2013/08/24 00:28:57 UTC

Writable readFields question

For my application I'm decoding data of non-predetermined length in
readFields().  I've found that checking for a byte value of 4 (ASCII End Of
Transmission) or -1 tends to mark the end of the data stream.  Is this
reliable, or is there a better way?

Thanks,
Ken

Re: Writable readFields question

Posted by Abhijit Sarkar <ab...@gmail.com>.
Ken,
What about supplementary characters, the major reason why Hadoop's
Writable implementations store the length?
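
To illustrate the point about supplementary characters, here is a minimal
sketch in plain Java (the class name SupplementaryLength and the helper
utf8Length are made up for this example, and nothing here uses Hadoop). It
shows that the number of Java chars, the number of code points, and the
number of UTF-8 bytes can all differ for the same string, which is one reason
a serialized form needs an explicit length rather than a guess:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical example class; not part of Hadoop.
public class SupplementaryLength {

    // Number of bytes the string occupies when encoded as UTF-8.
    public static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the Basic Multilingual Plane.
        String clef = new String(Character.toChars(0x1D11E));

        System.out.println(clef.length());                          // 2 (UTF-16 code units)
        System.out.println(clef.codePointCount(0, clef.length()));  // 1 (code point)
        System.out.println(utf8Length(clef));                       // 4 (UTF-8 bytes)
    }
}
```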


On Sun, Aug 25, 2013 at 4:09 PM, Ken Sullivan <su...@mayachitra.com> wrote:

> That could be a possibility, but ideally we wouldn't have to change how the
> data is being inserted.  The data is originally going into Accumulo tables
> from an existing C++ system with a JNI wrapper to insert a language
> independent serialized blob; the code for that is tested and running, and
> in the best case we don't have to change it.  Checking for EOT and
> negative values is working so far...just wondering if there was an
> official list of things to check.
>
>
> On Fri, Aug 23, 2013 at 6:19 PM, Harsh J <ha...@cloudera.com> wrote:
>
>> When you're encoding/write()-ing the writable, do you not know the
>> length? If you do, store the length first, and you can solve your
>> problem?
>>
>> On Sat, Aug 24, 2013 at 3:58 AM, Ken Sullivan <su...@mayachitra.com>
>> wrote:
>> > For my application I'm decoding data in readFields() of
>> non-predetermined
>> > length.  I've found parsing for "4" (ASCII End Of Transmission) or "-1"
>> tend
>> > to mark the end of the data stream.  Is this reliable, or is there a
>> better
>> > way?
>> >
>> > Thanks,
>> > Ken
>>
>>
>>
>> --
>> Harsh J
>>
>
>

Re: Writable readFields question

Posted by Ken Sullivan <su...@mayachitra.com>.
That could be a possibility, but ideally we wouldn't have to change how the
data is being inserted.  The data is originally going into Accumulo tables
from an existing C++ system with a JNI wrapper to insert a language
independent serialized blob; the code for that is tested and running, and
in the best case we don't have to change it.  Checking for EOT and
negative values is working so far...just wondering if there was an
official list of things to check.
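
For reference, a minimal sketch of why the EOT / negative-value check can
misfire on binary data. This is plain Java with no Hadoop dependency; the
class SentinelPitfalls and the method bytesBeforeSentinel are hypothetical
names, and the sample blob is invented. The assumption is that the payload is
arbitrary binary, so the bytes 0x04 and 0xFF (which readByte() returns as -1)
can legitimately occur in the middle of it:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical example class; not part of Hadoop.
public class SentinelPitfalls {

    // Counts the bytes read before a 4 or a -1 shows up, or before the real end of input.
    public static int bytesBeforeSentinel(byte[] data) {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int count = 0;
        try {
            while (true) {
                byte b = in.readByte();   // a 0xFF byte comes back as -1 here
                if (b == 4 || b == -1) {
                    return count;         // treated as "end of data" by the sentinel check
                }
                count++;
            }
        } catch (EOFException e) {
            return count;                 // DataInput signals the real end with an exception
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Invented binary blob that happens to contain both "sentinel" values as data.
        byte[] blob = {10, 20, 4, 30, (byte) 0xFF, 40};
        System.out.println(bytesBeforeSentinel(blob) + " of " + blob.length); // 2 of 6
    }
}
```

Note that the -1 returned by InputStream.read() at end of stream is an int and
is distinct from a data byte, whereas DataInput.readByte() has no such
out-of-band value and throws EOFException instead.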


On Fri, Aug 23, 2013 at 6:19 PM, Harsh J <ha...@cloudera.com> wrote:

> When you're encoding/write()-ing the writable, do you not know the
> length? If you do, store the length first, and you can solve your
> problem?
>
> On Sat, Aug 24, 2013 at 3:58 AM, Ken Sullivan <su...@mayachitra.com>
> wrote:
> > For my application I'm decoding data in readFields() of non-predetermined
> > length.  I've found parsing for "4" (ASCII End Of Transmission) or "-1"
> tend
> > to mark the end of the data stream.  Is this reliable, or is there a
> better
> > way?
> >
> > Thanks,
> > Ken
>
>
>
> --
> Harsh J
>

Re: Writable readFields question

Posted by Harsh J <ha...@cloudera.com>.
When you're encoding/write()-ing the writable, do you not know the
length? If you do, store the length first; that should solve your
problem.
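
A rough sketch of the length-first idea, assuming the writer side can be
changed. The class LengthPrefixedBlob and its roundTrip helper are
hypothetical; the class only mirrors the write(DataOutput) and
readFields(DataInput) signatures of Hadoop's Writable and deliberately does
not implement the interface, so that the example compiles without Hadoop on
the classpath:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical example class; mirrors the Writable method signatures only.
public class LengthPrefixedBlob {
    private byte[] payload = new byte[0];

    public LengthPrefixedBlob() {}

    public LengthPrefixedBlob(byte[] payload) {
        this.payload = payload.clone();
    }

    public byte[] getPayload() {
        return payload.clone();
    }

    // Writes the payload length first, then the raw bytes.
    public void write(DataOutput out) throws IOException {
        out.writeInt(payload.length);
        out.write(payload);
    }

    // Reads the length, then exactly that many bytes; no sentinel scanning needed.
    public void readFields(DataInput in) throws IOException {
        int length = in.readInt();
        if (length < 0) {
            throw new IOException("Negative length: " + length);
        }
        byte[] buf = new byte[length];
        in.readFully(buf);
        payload = buf;
    }

    // Serializes and deserializes in memory, returning the decoded payload.
    public static byte[] roundTrip(byte[] data) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new LengthPrefixedBlob(data).write(new DataOutputStream(bytes));
            LengthPrefixedBlob copy = new LengthPrefixedBlob();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy.getPayload();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Payload that contains both 4 and -1 as ordinary data bytes.
        byte[] data = {1, 4, -1, 4, 7};
        System.out.println(Arrays.equals(data, roundTrip(data))); // true
    }
}
```

Because the reader consumes exactly the number of bytes announced by the
prefix, values such as 4 or -1 inside the payload need no special handling.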

On Sat, Aug 24, 2013 at 3:58 AM, Ken Sullivan <su...@mayachitra.com> wrote:
> For my application I'm decoding data in readFields() of non-predetermined
> length.  I've found parsing for "4" (ASCII End Of Transmission) or "-1" tend
> to mark the end of the data stream.  Is this reliable, or is there a better
> way?
>
> Thanks,
> Ken



-- 
Harsh J
