Posted to dev@pig.apache.org by Dmitriy Ryaboy <dv...@gmail.com> on 2011/06/06 18:28:54 UTC

Why does TextLoader return bytes?

Is there a reason TextLoader returns bytes by default, and converts to
Strings only if requested?

Text value = (Text) in.getCurrentValue();
byte[] ba = value.getBytes();
// make a copy of the bytes representing the input since
// TextInputFormat will reuse the byte array
return mTupleFactory.newTuple(new DataByteArray(ba, 0, value.getLength()));

This makes the common case, where the line is wanted as a string,
require two copies (first to copy the bytes into the DataByteArray,
then to decode them into a String). Why not return a chararray and
avoid the second copy?
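
To make the copy counting concrete, here is a rough sketch in plain Java,
with no Hadoop or Pig classes. The class and method names (TextCopySketch,
viaByteCopy, direct) are made up for illustration, and a plain byte[]
stands in for the buffer that the input format reuses. It is not the
TextLoader code, only the shape of the two paths.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; none of these names exist in Pig or Hadoop.
class TextCopySketch {

    // Two-copy path: copy the valid bytes out of the reused buffer,
    // then decode that copy into a String later.
    static String viaByteCopy(byte[] reused, int length) {
        byte[] copy = Arrays.copyOfRange(reused, 0, length); // copy 1: bytes
        return new String(copy, StandardCharsets.UTF_8);     // copy 2: decode
    }

    // One-copy path: decode straight from the reused buffer.
    // The String owns its own storage, so later reuse of the buffer is harmless.
    static String direct(byte[] reused, int length) {
        return new String(reused, 0, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] buf = "hello world".getBytes(StandardCharsets.UTF_8);
        String a = viaByteCopy(buf, 5);
        String b = direct(buf, 5);
        // Simulate the input format overwriting its buffer for the next record.
        Arrays.fill(buf, (byte) 'x');
        System.out.println(a + " " + b);
    }
}
```

Both strings survive the buffer being overwritten, so the direct path
gives the same safety with one copy fewer whenever the caller wants a
string anyway.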

D