Posted to dev@nifi.apache.org by shashank_kiett <sh...@yahoo.com.INVALID> on 2016/02/23 16:29:54 UTC
Regarding execution of Map Reduce Jobs with Apache NIFI
Hi,
I want to run a MapReduce job from Apache NiFi as a processor. The
scenario this job was developed for is as follows:
There are two files:
1. User_data, containing tab-separated records with the fields
userid, username, movieid, rating
2. Movie_data, containing |-separated records with the fields
movieid|movie_name
The requirement is:
To get each movie name and its aggregated rating in a single
result file.
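To make the requirement concrete, here is a minimal Python sketch of the join and aggregation, independent of NiFi, Hive, or MapReduce. It assumes that "aggregated rating" means the average rating per movie (a sum or count would work the same way) and that every User_data record has exactly the four fields listed above. The function name aggregate_ratings and the sample records are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def aggregate_ratings(user_lines, movie_lines):
    """Return {movie_name: average rating} for the two inputs.

    Assumption: "aggregated" is taken to mean the average rating.
    """
    # Build movieid -> movie_name from the |-separated movie data.
    names = {}
    for line in movie_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        movie_id, movie_name = line.split("|", 1)
        names[movie_id] = movie_name

    # Build movieid -> [sum of ratings, count] from the tab-separated user data.
    totals = defaultdict(lambda: [0.0, 0])
    for line in user_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        user_id, user_name, movie_id, rating = line.split("\t")
        totals[movie_id][0] += float(rating)
        totals[movie_id][1] += 1

    # Join on movieid; ratings for movies with no name entry are skipped.
    return {
        names[movie_id]: total / count
        for movie_id, (total, count) in totals.items()
        if movie_id in names
    }


if __name__ == "__main__":
    # Hypothetical sample records in the two formats described above.
    users = ["1\talice\t10\t4", "2\tbob\t10\t2", "3\tcarol\t20\t5"]
    movies = ["10|Toy Story", "20|Heat"]
    for name, avg in aggregate_ratings(users, movies).items():
        print(f"{name}\t{avg}")
```

The same logic corresponds to a join of the two tables on movieid followed by a GROUP BY on the movie name, which is what the Hive queries in the approach below would need to express.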
The approach I have used so far, step by step:
1. Used the ExecuteCommandScript processor with a shell script to load
data into and fetch data from Hive.
2. In the shell script I wrote SQL queries for loading and fetching
the data; the output was then written to disk using the PutFile
processor.
Please suggest:
Have I chosen the right approach? I think the ExecuteSQL processor should be
used to execute SQL queries on Hive, but I do not know what the DB connection
string for it is.
What is the best approach for this?
Thanks and regards,
Shashank Tiwari