Posted to common-user@hadoop.apache.org by Mark Kerzner <ma...@gmail.com> on 2011/03/10 07:15:36 UTC

What is the best way to output string values?

Hi,

In my MapReduce job I parse document metadata, so for each document I
know things like author=john, last_printed=3/3/10, and so on.

What is the best way to put all this into a single line of text output in
the reducer? In other words, I want to imitate HBase, putting (qualifier,
value) pairs under the same rowkey - but in text output. I will probably
need to base64-encode the values, because Hadoop does not like any
characters other than ASCII.

So is "Key=Value", "Key=Value" the best practice?
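
To make the format concrete, here is a minimal sketch of one possible line layout: the rowkey, then tab-separated qualifier=value pairs with each value Base64-encoded. The class MetadataLine and its methods are hypothetical names, not part of Hadoop, and the sketch assumes that rowkeys and qualifiers never contain a tab, a newline, or '='. Hadoop's Text type stores UTF-8, so the encoding here mainly keeps tabs, newlines, and '=' inside a value from colliding with the delimiters. Because TextOutputFormat by default writes the key and the value separated by a tab, a reducer could emit the rowkey as the key and the joined pairs as the value to get the same line.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not a Hadoop class): one text line per document.
final class MetadataLine {
    // Unpadded Base64, so encoded values never contain '=', tab, or newline.
    private static final Base64.Encoder ENC = Base64.getEncoder().withoutPadding();
    private static final Base64.Decoder DEC = Base64.getDecoder();

    private MetadataLine() {}

    // Builds "rowKey<TAB>qualifier=base64(value)<TAB>...".
    // Assumes rowKey and qualifiers contain no tab, newline, or '='.
    static String format(String rowKey, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder(rowKey);
        for (Map.Entry<String, String> e : fields.entrySet()) {
            byte[] raw = e.getValue().getBytes(StandardCharsets.UTF_8);
            sb.append('\t').append(e.getKey()).append('=').append(ENC.encodeToString(raw));
        }
        return sb.toString();
    }

    // Recovers the (qualifier, value) pairs; the rowkey in column 0 is skipped.
    // Assumes every later column was produced by format() above.
    static Map<String, String> parseFields(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        String[] cols = line.split("\t");
        for (int i = 1; i < cols.length; i++) {
            int eq = cols[i].indexOf('=');
            byte[] raw = DEC.decode(cols[i].substring(eq + 1));
            fields.put(cols[i].substring(0, eq), new String(raw, StandardCharsets.UTF_8));
        }
        return fields;
    }
}
```

With this layout, a document with rowkey doc1 and author=john becomes doc1<TAB>author=am9obg, and parseFields turns that line back into the original pair.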

What do I do with those values? Eventually I want to create a CSV file with
all the metadata values in a table format, produced by running the next
MapReduce job.
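
For the CSV step, the main constraint is that a table needs a fixed set of columns, while each document may carry a different set of qualifiers. A minimal sketch, assuming the column list is known up front (or collected in an earlier pass): the class CsvRow and its methods are hypothetical names, cells are quoted in the RFC 4180 style, and a qualifier missing from a document becomes an empty cell.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns one document's decoded metadata into a CSV row.
final class CsvRow {
    private CsvRow() {}

    // Quotes a cell only when needed, doubling any embedded double quotes.
    static String escape(String cell) {
        boolean needsQuotes = cell.contains(",") || cell.contains("\"")
                || cell.contains("\n") || cell.contains("\r");
        return needsQuotes ? "\"" + cell.replace("\"", "\"\"") + "\"" : cell;
    }

    // Emits cells in the given column order; absent qualifiers yield empty cells.
    static String toCsv(List<String> columns, Map<String, String> fields) {
        List<String> cells = new ArrayList<>();
        for (String column : columns) {
            cells.add(escape(fields.getOrDefault(column, "")));
        }
        return String.join(",", cells);
    }
}
```

A follow-up job could call toCsv once per input line in the mapper, after decoding the pairs, and write the result as the output value.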

Thank you,
Mark