Posted to user@hive.apache.org by Guru Prasad <gu...@ibibogroup.com> on 2010/08/25 12:28:42 UTC

Can I modify a stream of data and then insert it into Hive?

Hi,
Suppose I receive a stream of data.
I want to filter this stream and then insert the result into a Hive table.

For example, let's say a file "datas.txt" contains the following:
------------------------------------------------
guru    12    delhi
prasad    13    gurgaon
----------------------------------------------

I would open this file with fp = open("datas.txt", "r"), read it one line
at a time (line = fp.readline()), and do some processing on each line, such
as adding only the first two columns from "datas.txt" to the table.
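
To make the filtering step concrete, here is a minimal sketch of what I have in mind. It assumes the columns are separated by whitespace, and the function name first_two_columns is only a placeholder I made up for illustration. It covers the filtering only, not the Hive insert itself.

```python
def first_two_columns(lines):
    """Yield the first two whitespace-separated columns of each line,
    joined with a tab. Lines with fewer than two columns are skipped."""
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # blank or malformed line
        yield "\t".join(fields[:2])


if __name__ == "__main__":
    # Same sample rows as in datas.txt above.
    sample = ["guru    12    delhi\n", "prasad    13    gurgaon\n"]
    for row in first_two_columns(sample):
        print(row)
```

Because an open file object yields lines one at a time when iterated, the same function could be called as first_two_columns(fp) on the file itself.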

*I want to do this using Hive's Thrift API from Python.*

The reason is that I can then work with the incoming data and filter or
modify it according to my needs.

*Note:* I will receive the stream of data continuously, for example the
logs of some server.

So I want to know whether this is possible or not.

Thanks & Regards
~guru