Posted to user@hive.apache.org by ya...@uwaterloo.ca on 2012/11/26 17:07:17 UTC

Permissions preventing me from inserting data into table I have just created

Hello,

I'm using Cloudera's CDH4 with Hive 0.9 and Hive Server 2. I am trying
to load data into Hive using the JDBC driver (the one distributed with
Cloudera CDH4, "org.apache.hive.jdbc.HiveDriver"). I can create the
staging table and LOAD DATA LOCAL into it. However, when I try to insert
data into a table with a Columnar SerDe stored as RCFILE, I get an error
caused by file permissions. I don't think the SerDe or the STORED AS
parameters have anything to do with the problem, but I mention them for
completeness. The problem is that Hive creates a temporary file in its
(local) scratch folder, owned by hive:hive with permissions 755, then
passes it as an input to a mapper running as the user mapred:mapred.
The mapper then tries to create something inside the input folder
(it could probably do this elsewhere), and the following exception is
thrown:

org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: Mkdirs failed to create file:/home/yaboulnaga/tmp/hive-scratch/hive_2012-11-26_10-46-44_887_2004468370569495405/_task_tmp.-ext-10002
	at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:237)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:477)
	at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:709)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
	at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
	at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:393)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:327)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)


As you might have noticed, I moved the scratch folder to a directory
under my home dir so that I could give it 777 permissions. The idea was
to use a hive.files.umask.value of 0000 so that subdirectories would
inherit the same open permissions (not the best workaround, but it
wouldn't hurt on my local machine). Unfortunately this didn't work even
after I added "umask 0000" to /etc/init.d/hiveserver2. Can someone
please tell me the right way to do this? I mean, create a table and
then insert values into it! The HiveQL statements I use are very
similar to the ones in the tutorials about loading data.
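
The statements I run are essentially of this shape (table names, column
names, and the input path below are placeholders; the real schema doesn't
matter for the problem):

```sql
-- Placeholder names and path; the real schema doesn't matter here.
CREATE TABLE staging (id INT, body STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

LOAD DATA LOCAL INPATH '/home/yaboulnaga/data/sample.tsv'
  OVERWRITE INTO TABLE staging;

CREATE TABLE target (id INT, body STRING)
  ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe'
  STORED AS RCFILE;

-- This is the statement that fails with the Mkdirs permission error:
INSERT OVERWRITE TABLE target SELECT id, body FROM staging;
```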

Cheers!
-- Younos




Re: Permissions preventing me from inserting data into table I have just created

Posted by Edward Capriolo <ed...@gmail.com>.
You may have to go directly to Cloudera support for this one.
HiveServer2 is not officially part of Hive yet, so technically we
shouldn't be supporting it (yet). However, someone on the list might
still answer you.



On Mon, Nov 26, 2012 at 11:07 AM,  <ya...@uwaterloo.ca> wrote:
> [original message quoted in full; snipped]

Re: Permissions preventing me from inserting data into table I have just created

Posted by ya...@uwaterloo.ca.
I solved the problem by using a fully qualified path for
hive.exec.scratchdir, and then the umask trick worked. It turns out
that Hive was creating a different directory (on HDFS) than the one
MapReduce was trying to write into, which is why the umask didn't
work. This remains a nasty workaround, and I wish someone would
explain how to do this right!
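
Concretely, the fix looks roughly like this (the namenode host and port
below are placeholders for whatever your fs.default.name points to):

```sql
-- Fully qualified scratch dir plus the umask workaround.
-- hdfs://namenode:8020 is a placeholder for your fs.default.name value.
SET hive.exec.scratchdir=hdfs://namenode:8020/tmp/hive-scratch;
SET hive.files.umask.value=0000;
```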

Quoting yaboulna@uwaterloo.ca:

> [earlier messages in the thread quoted in full; snipped]



Best regards,
Younos Aboulnaga

Master's candidate
David Cheriton School of Computer Science
University of Waterloo
http://cs.uwaterloo.ca

E-Mail: younos.aboulnaga@uwaterloo.ca
Mobile: +1 (519) 497-5669



Re: Permissions preventing me from inserting data into table I have just created

Posted by ya...@uwaterloo.ca.
Thanks for the reply, Tim. It is writable by all (permission 777). As a
side note, I have discovered that the MapReduce task spawned by
the RCFileOutputDriver is setting mapred.output.dir to a folder under
file:// regardless of fs.default.name. This might be expected
behaviour, but I just wanted to note it.
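
For anyone reproducing this, the relevant settings can be inspected from
the Hive CLI session (the session values may of course differ from what
the spawned job ultimately sees, which is part of the confusion):

```sql
-- Show the effective values in the current Hive session.
SET fs.default.name;
SET hive.exec.scratchdir;
SET hive.files.umask.value;
```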

Quoting Tim Havens <ti...@gmail.com>:

> [earlier messages in the thread quoted in full; snipped]



Best regards,
Younos Aboulnaga

Master's candidate
David Cheriton School of Computer Science
University of Waterloo
http://cs.uwaterloo.ca

E-Mail: younos.aboulnaga@uwaterloo.ca
Mobile: +1 (519) 497-5669



Re: Permissions preventing me from inserting data into table I have just created

Posted by Tim Havens <ti...@gmail.com>.
Make sure /home/yaboulnaga/tmp/hive-scratch/ is writeable by your
processes.
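
One quick way to check from inside the Hive CLI is a dfs listing (a
sketch; since the scratch dir here is on the local filesystem, a file://
URI is used, and a plain `ls -ld` from a shell works just as well):

```sql
-- List ownership and permission bits of the scratch dir.
dfs -ls file:///home/yaboulnaga/tmp/hive-scratch/;
```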


On Mon, Nov 26, 2012 at 10:07 AM, <ya...@uwaterloo.ca> wrote:

> [original message quoted in full; snipped]


-- 
"The whole world is you. Yet you keep thinking there is something else." -
Xuefeng Yicun 822-902 A.D.

Tim R. Havens
Google Phone: 573.454.1232
ICQ: 495992798
ICBM:  37°51'34.79"N   90°35'24.35"W
ham radio callsign: NW0W