Posted to mapreduce-user@hadoop.apache.org by enes yücer <en...@gmail.com> on 2013/07/23 10:19:23 UTC

Join Operation with Regular Expression

Hi,

I have two data sets: one contains string text, and the other contains string
patterns (to be searched for in the text) along with an id.

As a temporary solution, I created two external Hive tables and did a full
join of them, then applied a regex function in the WHERE clause. But it takes
too long, because Hive does not support regex-based join conditions.


How can I do this operation in Hadoop? Has anyone implemented an MR job (in
Hive, Pig, or Java) like this? Any advice?


thanks.

RE: Join Operation with Regular Expression

Posted by Devaraj k <de...@huawei.com>.
You can try writing a MapReduce job for this. In the job, you can filter the records in the Mapper using the regex from the WHERE condition and then perform the join in the Reducer.

Please refer to the classes in the hadoop-datajoin module to get an idea of how to implement the join job.
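
As a rough illustration of the regex matching step, here is a minimal sketch in plain Java with no Hadoop dependencies. It is not the hadoop-datajoin approach mentioned above: it assumes the pattern table is small enough to hold in memory on every mapper (a map-side, replicated-style join), so each text record can be checked against all patterns without a Reducer. The class name RegexJoinSketch and its methods are hypothetical names chosen for this example. In a real job, the compile step would presumably live in Mapper.setup() and the matching step in map().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: the per-record logic a Mapper
// could run once the small pattern table has been loaded into memory.
final class RegexJoinSketch {

    // Compile every pattern once up front instead of once per text record.
    static Map<String, Pattern> compile(Map<String, String> idToRegex) {
        Map<String, Pattern> compiled = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : idToRegex.entrySet()) {
            compiled.put(e.getKey(), Pattern.compile(e.getValue()));
        }
        return compiled;
    }

    // Counterpart of a single map() call: emit "id<TAB>text" for each
    // pattern that occurs anywhere in the text (find(), not matches()).
    static List<String> matchRecord(Map<String, Pattern> patterns, String text) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Pattern> e : patterns.entrySet()) {
            if (e.getValue().matcher(text).find()) {
                out.add(e.getKey() + "\t" + text);
            }
        }
        return out;
    }

    // Run the match over all text records, as the framework would per split.
    static List<String> join(Map<String, String> idToRegex, List<String> texts) {
        Map<String, Pattern> patterns = compile(idToRegex);
        List<String> out = new ArrayList<>();
        for (String text : texts) {
            out.addAll(matchRecord(patterns, text));
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        Map<String, String> patterns = new LinkedHashMap<>();
        patterns.put("1", "err(or)?");
        patterns.put("2", "node \\d+");
        List<String> texts = Arrays.asList("disk error on node 3", "all good");
        for (String row : join(patterns, texts)) {
            System.out.println(row);
        }
    }
}
```

This still compares every text record against every pattern, but it avoids materializing the full join before filtering. If the pattern table is too large for memory, a reduce-side join along the lines described above would be the fallback.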

Thanks
Devaraj k

From: enes yücer [mailto:enesycr@gmail.com]
Sent: 23 July 2013 13:49
To: user@hadoop.apache.org
Subject: Join Operation with Regular Expression

Hi,
I have two data sets: one contains string text, and the other contains string patterns (to be searched for in the text) along with an id.
As a temporary solution, I created two external Hive tables and did a full join of them, then applied a regex function in the WHERE clause. But it takes too long, because Hive does not support regex-based join conditions.

How can I do this operation in Hadoop? Has anyone implemented an MR job (in Hive, Pig, or Java) like this? Any advice?

thanks.
