Posted to dev@parquet.apache.org by Srinivas M <sm...@gmail.com> on 2018/09/24 15:29:47 UTC

Exception while writing a Parquet File in a secured Cluster

 Hi

We have an application that writes Parquet files using the AvroParquetWriter.
This code works fine in a Kerberos-only environment, but it fails once SSL is
enabled in the Hadoop cluster. I modified the code to use the swebhdfs
protocol instead of webhdfs, and it is still failing with the following exception.

             conf.set("hadoop.security.authentication", "kerberos");
             UserGroupInformation.setConfiguration(conf);
             ugi = UserGroupInformation.loginUserFromKeytabAndReturnUGI(_user, _keytab);

             try
             {
                ugi.doAs(new PrivilegedExceptionAction<Object>()
                {
                  public Object run() throws IOException
                  {
                     // Obtain the filesystem for the swebhdfs URI as the keytab user
                     fs = FileSystem.get(hdfsuri, conf);

                     if (_fileExistsAction == VAL_FILEEXISTS_OVERWRITE)
                        fs.delete(new Path(_fileName), false);

                     _writer = new AvroParquetWriter(new Path(hdfsuri), _schema,
                           _ParquetCompressionCodec, _ParquetBlockSize, _ParquetPageSize);
                     return null;
                  }
                });
             }
             catch (InterruptedException ie)
             {
                // ugi.doAs also throws InterruptedException; the original snippet
                // was truncated here, so this handler is minimal
                throw new IOException(ie);
             }

hdfsuri in this case is of the form "swebhdfs://" + _host + ":" + _port + "/" + _fileName.

The application is failing with the following exception:
Caused by: org.apache.hadoop.ipc.RemoteException(javax.ws.rs.WebApplicationException): null
    at org.apache.hadoop.hdfs.web.JsonUtil.toRemoteException(JsonUtil.java:124)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:420)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:108)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:596)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:674)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:524)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:554)
    at java.security.AccessController.doPrivileged(AccessController.java:488)
    at javax.security.auth.Subject.doAs(Subject.java:572)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:550)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.create(WebHdfsFileSystem.java:1257)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:926)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:907)
    at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:804)
    at parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:225)
    at parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:302)
    at parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:253)
    at parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:219)
    at parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:153)
    at parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:119)
    at parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:92)
    at parquet.hadoop.ParquetWriter.<init>(ParquetWriter.java:66)
    at parquet.avro.AvroParquetWriter.<init>(AvroParquetWriter.java:54)
    at com.ibm.iis.jis.utilities.parquet.ParquetBuilder$1.run(ParquetBuilder.java:191)
    at java.security.AccessController.doPrivileged(AccessController.java:488)
    at javax.security.auth.Subject.doAs(Subject.java:572)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
    at com.ibm.iis.jis.utilities.parquet.ParquetBuilder.open(ParquetBuilder.java:182)
    at com.ibm.iis.jis.utilities.dochandler.impl.OutputBuilder.<init>(OutputBuilder.java:83)
    at com.ibm.iis.jis.utilities.dochandler.impl.Registrar.getBuilder(Registrar.java:331)
    at com.ibm.iis.jis.utilities.dochandler.impl.Registrar.getBuilder(Registrar.java:293)
    at com.ibm.iis.cc.filesystem.FileSystem.getBuilder(FileSystem.java:2177)
    at com.ibm.iis.cc.filesystem.FileSystem.writeDelimitedFiles(FileSystem.java:1168)
    at com.ibm.iis.cc.filesystem.FileSystem.writeFiles(FileSystem.java:922)
    at com.ibm.iis.cc.filesystem.FileSystem.process(FileSystem.java:763)
    at com.ibm.is.cc.javastage.connector.CC_JavaAdapter.run(CC_JavaAdapter.java:443)



Is there any reason why the write is failing with a security exception? I have
checked the Kerberos and SSL debug logs, but there is no indication of why it
fails with a security exception.
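
For reference, the debug output mentioned above is typically enabled with
switches along these lines; this is a sketch assuming an Oracle/OpenJDK client
and a log4j-based Hadoop setup (IBM JDKs use different property names):

    # JVM options: Kerberos protocol trace and TLS handshake trace
    -Dsun.security.krb5.debug=true
    -Djavax.net.debug=ssl,handshake

    # log4j.properties: Hadoop security-layer logging
    log4j.logger.org.apache.hadoop.security=DEBUG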


-- 
Srinivas
(*-*)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
You have to grow from the inside out. None can teach you, none can make you
spiritual.
                      -Narendra Nath Dutta(Swamy Vivekananda)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Re: Exception while writing a Parquet File in a secured Cluster

Posted by Deepak Majeti <ma...@gmail.com>.
Hi Srinivas,

We have been using swebhdfs in Vertica to support secured (Kerberos + SSL)
Hadoop clusters without any issues.
I don't think this is a Parquet issue.
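One thing worth double-checking is the client-side SSL configuration: the
swebhdfs client normally reads its truststore settings from the Hadoop
ssl-client.xml resource on the classpath. A minimal sketch, with placeholder
path and password (the property names are the standard Hadoop ones):

    <!-- ssl-client.xml on the client classpath -->
    <configuration>
      <property>
        <name>ssl.client.truststore.location</name>
        <value>/path/to/truststore.jks</value>  <!-- placeholder -->
      </property>
      <property>
        <name>ssl.client.truststore.password</name>
        <value>changeit</value>  <!-- placeholder -->
      </property>
      <property>
        <name>ssl.client.truststore.type</name>
        <value>jks</value>
      </property>
    </configuration>
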
One alternative is to use command-line curl with verbose logging to see if
something shows up.
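For example, something along these lines against the WebHDFS/HttpFS endpoint;
the host, port, and path are placeholders, and -k skips certificate
verification for a first test:

    # get a ticket, then probe the endpoint with SPNEGO auth and verbose output
    kinit user@EXAMPLE.COM
    curl -v -k --negotiate -u : \
        "https://<host>:<port>/webhdfs/v1/tmp?op=GETFILESTATUS"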

On Tue, Sep 25, 2018 at 11:06 PM Srinivas M <sm...@gmail.com> wrote:



-- 
regards,
Deepak Majeti

Re: Exception while writing a Parquet File in a secured Cluster

Posted by Srinivas M <sm...@gmail.com>.
Hi Ryan, thanks a lot for taking the time to respond to my email and for
providing your perspective. Yes, the FileSystem can be accessed from outside
through both the hadoop CLI and Hive. But when we try to access it through
webhdfs (and swebhdfs as well), we run into issues.

When using the webhdfs protocol, we were seeing the following error.

Caused by: java.net.SocketException: bda6node02.infoftps.com:14000: Unexpected end of file from server
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:86)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:58)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:542)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:691)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:519)
    at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:549)
    at java.security.AccessController.doPrivileged(AccessController.java:488)
    at javax.security.auth.Subject.doAs(Subject.java:572)

After investigating the issue, we identified that it was caused by a protocol
mismatch: the server is configured with SSL, while the client application was
making a plain HTTP request, hence the error. So I switched the protocol to
swebhdfs, and we then started seeing the new error mentioned in the earlier mail.

Is there any additional debugging that could be enabled to understand what is
failing? I could not make out much from the Kerberos and SSL debug logs.

On a side note, is the swebhdfs implementation fully stable, and is it
expected to work with Parquet when accessing files over secure HDFS?

Thanks once again for taking the time to respond to my questions.

On Mon, Sep 24, 2018 at 9:55 PM Ryan Blue <rb...@netflix.com.invalid> wrote:



-- 
Srinivas
(*-*)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
You have to grow from the inside out. None can teach you, none can make you
spiritual.
                      -Narendra Nath Dutta(Swamy Vivekananda)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Re: Exception while writing a Parquet File in a secured Cluster

Posted by Ryan Blue <rb...@netflix.com.INVALID>.
This is probably related to the fact that your FS is getting created inside
a call to Parquet (org.apache.hadoop.fs.FileSystem.create). Can you access
that target file system first to make sure it is set up properly?

It could be that Parquet isn't handling Configuration correctly in this
stack.
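
For example, a minimal check along these lines, run inside the same ugi.doAs
block before the writer is constructed (a sketch only; hdfsuri and conf are
the ones from your snippet, and FileStatus is org.apache.hadoop.fs.FileStatus):

      // Touch the swebhdfs filesystem directly first; this exercises the same
      // Kerberos + SSL path that Parquet's FileSystem.create() call will use.
      FileSystem fs = FileSystem.get(hdfsuri, conf);
      Path target = new Path(hdfsuri);
      System.out.println("exists: " + fs.exists(target));
      for (FileStatus st : fs.listStatus(target.getParent())) {
          System.out.println(st.getPath());
      }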

rb

On Mon, Sep 24, 2018 at 8:30 AM Srinivas M <sm...@gmail.com> wrote:



-- 
Ryan Blue
Software Engineer
Netflix