Posted to commits@pig.apache.org by Apache Wiki <wi...@apache.org> on 2010/02/23 22:46:14 UTC

[Pig Wiki] Update of "LoadStoreMigrationGuide" by daijy

The "LoadStoreMigrationGuide" page has been changed by daijy.
http://wiki.apache.org/pig/LoadStoreMigrationGuide?action=diff&rev1=27&rev2=28

--------------------------------------------------

  ||No equivalent method ||setLocation() ||!LoadFunc ||This method is called by Pig to communicate the load location to the loader. The loader should use this method to communicate the same information to the underlying !InputFormat. Pig calls this method multiple times, so implementations should ensure that repeated calls have no inconsistent side effects. ||
  ||bindTo() ||prepareToRead() ||!LoadFunc ||bindTo() was the old method, which provided an !InputStream (among other things) to the !LoadFunc. The !LoadFunc implementation would then read from the !InputStream in getNext(). In the new API, data is read through the !InputFormat provided by the !LoadFunc. The equivalent call is therefore prepareToRead(), in which the !RecordReader associated with that !InputFormat is passed to the !LoadFunc. The implementation can then use the !RecordReader in getNext() to return a tuple representing a record of data back to Pig. ||
  ||getNext() ||getNext() ||!LoadFunc ||The meaning of getNext() has not changed: it is called by the Pig runtime to get the next tuple in the data. In the new API, this is the method in which the implementation uses the underlying !RecordReader to construct a tuple. ||
- ||bytesToInteger(),...bytesToBag() ||bytesToInteger(),...bytesToBag() ||!LoadCaster ||The meaning of these methods has not changed: they are called by the Pig runtime to cast !DataByteArray fields to the right type when needed. In the new API, a !LoadFunc implementation should give a !LoadCaster object back to Pig as the return value of the getLoadCaster() method so that it can be used for casting. The default implementation in !LoadFunc returns an instance of !Utf8StorageConverter, which can handle casting from UTF-8 bytes to different types. If null is returned, casting from !DataByteArray to any other type (implicitly or explicitly) in the Pig script will not be possible ||
+ ||bytesToInteger(),...bytesToBag() ||bytesToInteger(),...bytesToBag() ||!LoadCaster ||The meaning of these methods has not changed: they are called by the Pig runtime to cast !DataByteArray fields to the right type when needed. In the new API, a !LoadFunc implementation should give a !LoadCaster object back to Pig as the return value of the getLoadCaster() method so that it can be used for casting. The default implementation in !LoadFunc returns an instance of !Utf8StorageConverter, which can handle casting from UTF-8 bytes to different types. If null is returned, casting from !DataByteArray to any other type (implicitly or explicitly) in the Pig script will not be possible. The signatures of bytesToTuple() and bytesToBag() have also changed to take the field schema of the tuple/bag, and these methods should construct the tuple/bag in conformance with the given field schema. ||
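
The schema-driven behaviour described for bytesToTuple() in the last row above can be illustrated roughly as follows. This is only a sketch in plain Java and does not use the real Pig classes: FieldType, SchemaAwareCaster and the List-based "tuple" are hypothetical stand-ins for the field schema, the LoadCaster and Pig's Tuple, and the "(a,b,c)" text format is assumed for illustration. Nested tuples and bags are not handled.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for the type of one field in a schema; NOT a real Pig class.
enum FieldType { INTEGER, DOUBLE, CHARARRAY }

// Hypothetical caster showing how a field schema can drive tuple construction.
class SchemaAwareCaster {

    // Parses UTF-8 bytes such as "(1,hello,3.5)" into a list of values,
    // converting each field to the type the schema names for that position.
    // Fields beyond the schema are dropped; missing fields become null.
    static List<Object> bytesToTuple(byte[] bytes, List<FieldType> schema) {
        String text = new String(bytes, StandardCharsets.UTF_8).trim();
        if (text.length() < 2 || text.charAt(0) != '('
                || text.charAt(text.length() - 1) != ')') {
            return null; // not in the assumed tuple format
        }
        String[] parts = text.substring(1, text.length() - 1).split(",", -1);
        List<Object> tuple = new ArrayList<>();
        for (int i = 0; i < schema.size(); i++) {
            String raw = i < parts.length ? parts[i].trim() : "";
            tuple.add(convert(raw, schema.get(i)));
        }
        return tuple;
    }

    // Converts one raw field; values that cannot be parsed become null.
    static Object convert(String raw, FieldType type) {
        if (raw.isEmpty()) {
            return null;
        }
        try {
            switch (type) {
                case INTEGER: return Integer.valueOf(raw);
                case DOUBLE:  return Double.valueOf(raw);
                default:      return raw;
            }
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        List<FieldType> schema =
            Arrays.asList(FieldType.INTEGER, FieldType.CHARARRAY, FieldType.DOUBLE);
        byte[] data = "(1,hello,3.5)".getBytes(StandardCharsets.UTF_8);
        System.out.println(bytesToTuple(data, schema)); // prints [1, hello, 3.5]
    }
}
```

The point of the sketch is only that the same bytes yield differently typed fields depending on the schema passed in; a real implementation would work against Pig's own schema and tuple types.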
  
  
  An example of how a simple !LoadFunc implementation based on old interface can be converted to the new interfaces is shown in the Examples section below.
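
As a rough illustration of the call sequence described in the table (setLocation(), then prepareToRead(), then repeated getNext() calls), the following sketch models it in plain Java. It deliberately avoids the real Pig and Hadoop classes: LineReader stands in for !RecordReader, a Map stands in for the job configuration, a List of strings stands in for a tuple, and the configuration key is made up. Returning null at the end of input is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Stand-in for a record reader; NOT the real Hadoop interface.
interface LineReader {
    String nextLine(); // returns null when the input is exhausted
}

// Hypothetical loader modelling setLocation / prepareToRead / getNext.
class SketchLoader {
    static final String INPUT_KEY = "sketch.input.path"; // made-up config key

    private LineReader reader;

    // May be called several times. It overwrites a single config entry,
    // so repeated calls leave the configuration in the same state.
    void setLocation(String location, Map<String, String> jobConf) {
        jobConf.put(INPUT_KEY, location);
    }

    // Receives the reader that getNext() will pull records from.
    void prepareToRead(LineReader reader) {
        this.reader = reader;
    }

    // Returns the next record split on tabs, or null at end of input.
    List<String> getNext() {
        String line = reader.nextLine();
        if (line == null) {
            return null;
        }
        return Arrays.asList(line.split("\t", -1));
    }

    public static void main(String[] args) {
        SketchLoader loader = new SketchLoader();
        Map<String, String> conf = new HashMap<>();
        loader.setLocation("/data/in", conf);
        loader.setLocation("/data/in", conf); // second call changes nothing

        Iterator<String> lines = Arrays.asList("a\t1", "b\t2").iterator();
        loader.prepareToRead(() -> lines.hasNext() ? lines.next() : null);

        List<String> tuple;
        while ((tuple = loader.getNext()) != null) {
            System.out.println(tuple);
        }
    }
}
```

Writing setLocation() as a plain overwrite, rather than appending to a list of paths, is one simple way to keep repeated calls free of inconsistent side effects.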