Posted to user@hive.apache.org by Karol Gwaj <Ka...@skillpages.com> on 2012/02/08 18:12:49 UTC

Hive & custom InputFormat

Hi,

I'm trying to create a custom input format that works with Hive 0.7.0 and Hadoop 0.20.205 (the current Amazon EMR setup).
Attached are the dummy input format, record reader, and input split I created.

Below are the steps I'm performing to try to make it work (without success), so if I'm missing something, please let me know:

1. Start an EMR job in Hive interactive mode (one node only, the master) and SSH to it after it starts.

2. Copy the jar with the attached code into the /home/hadoop/lib directory.

3. Run Hive in interactive mode (the hive command).

4. Create the Hive table:

CREATE TABLE dummy (idx STRING)
STORED AS INPUTFORMAT 'test.DummyInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat';

5. select * from dummy;

Nothing happens (zero results).
I'm not getting any errors, so it's hard to say what is going on.

I know that Hive can see my jar, because if I use an input format name that doesn't exist in my jar, I get an error message.

I even tried creating some files on S3 from inside the record reader and input format (to see whether Hive actually calls any of their methods), but no files were created.
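
A minimal sketch of that kind of call tracing, writing to a local file instead of S3 so that S3 access problems cannot hide the result. The CallTracer class and the log path in the usage note are hypothetical and are not part of the attached code:

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

// Hypothetical helper (not in the attached jar): appends one line per call
// to a local file, so it is visible whether a given method was ever reached.
final class CallTracer {
    private CallTracer() {}

    static synchronized void trace(File logFile, String where) {
        FileWriter out = null;
        try {
            out = new FileWriter(logFile, true); // true = append to existing file
            out.write(where + "\n");
        } catch (IOException e) {
            // Tracing should never break the job itself; report and carry on.
            System.err.println("trace failed: " + e.getMessage());
        } finally {
            if (out != null) {
                try {
                    out.close();
                } catch (IOException ignored) {
                    // Nothing useful to do if close fails.
                }
            }
        }
    }
}
```

Calling something like CallTracer.trace(new File("/tmp/dummy-inputformat.log"), "getSplits") as the first statement of each input format and record reader method would show which of them, if any, are invoked. The path is an assumption and only needs to be writable on the master node.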

Any help would be much appreciated.
Thanks,