Posted to user@phoenix.apache.org by Anil <an...@gmail.com> on 2017/01/31 09:22:03 UTC

Phoenix mapreduce

Hello,

I have a Phoenix table that holds both child and parent records.
I have written a Phoenix MapReduce job to copy a few columns from each
parent record into its child records.

I see two ways to populate the parent columns into the child records:

1.
a. In the mapper, run a Phoenix query for each child record to fetch the
parent's columns.
b. Set the number of reducers to zero.

2.
a. Group the records by parent id (which is available in both parent and
child records). That is, use the parent id as the mapper output key and the
record as the mapper output value.
b. In the reducer, copy the parent columns into each child record.
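
To make approach 2 concrete, here is a minimal plain-Java sketch of the
grouping and copy logic only. It uses no Hadoop or Phoenix classes; the Row
class, its fields, and both method names are hypothetical and are not part of
any Phoenix API. It also assumes that all rows for one parent id fit in memory
at once.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ParentChildJoinSketch {

    // Hypothetical row shape: a parent or child record with its column values.
    static final class Row {
        final String parentId;
        final boolean isParent;
        final Map<String, String> columns;

        Row(String parentId, boolean isParent, Map<String, String> columns) {
            this.parentId = parentId;
            this.isParent = isParent;
            this.columns = new HashMap<>(columns); // defensive copy
        }
    }

    // Stands in for step 2a: key every record by its parent id.
    static Map<String, List<Row>> groupByParentId(List<Row> rows) {
        Map<String, List<Row>> groups = new HashMap<>();
        for (Row row : rows) {
            groups.computeIfAbsent(row.parentId, k -> new ArrayList<>()).add(row);
        }
        return groups;
    }

    // Stands in for step 2b: within one group, copy the chosen parent
    // columns into every child record and return the children.
    static List<Row> enrichChildren(List<Row> group, List<String> columnsToCopy) {
        Row parent = null;
        for (Row row : group) {
            if (row.isParent) {
                parent = row;
                break;
            }
        }
        List<Row> children = new ArrayList<>();
        for (Row row : group) {
            if (row.isParent) {
                continue;
            }
            if (parent != null) {
                for (String column : columnsToCopy) {
                    String value = parent.columns.get(column);
                    if (value != null) {
                        row.columns.put(column, value);
                    }
                }
            }
            children.add(row); // children without a parent pass through unchanged
        }
        return children;
    }
}
```

In a real reducer the values arrive as a single-pass iterator, so this two-pass
loop would need the children buffered until the parent is seen, or a secondary
sort that delivers the parent first.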

I tried approach 1, and the job always fails with either an insufficient
container memory error or a GC overhead error.

What is the recommended approach? Thanks for your help.

Thanks.