Posted to user@flink.apache.org by Aniket Deshpande <an...@gmail.com> on 2017/10/04 01:15:49 UTC

Classloader error after SSL setup

Background: We have a setup of Flink 1.3.1 along with a secure MapR
cluster (Flink is running on MapR client nodes). We run this Flink cluster
via the flink-jobmanager.sh foreground and flink-taskmanager.sh foreground
commands through Marathon. To make this work, we had to add
-Djavax.net.ssl.trustStore="$JAVA_HOME/jre/lib/security/cacerts" in
flink-console.sh as an extra JVM arg (otherwise, Flink picked up MapR's
ssl_truststore as the default truststore and we then ran into issues with
3rd-party jars such as the aws_sdk). This entire setup was working fine as
is, and we could submit our jars and the pipelines ran without any problem.
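
For reference, this is roughly the change we made (a sketch only; we did not
verify whether setting env.java.opts in flink-conf.yaml would also be picked
up by flink-console.sh in 1.3, so we edited the script directly, and the
JVM_ARGS variable is simply what our copy of the script passes on to java;
adjust if yours wires the options differently):

    # appended in bin/flink-console.sh before the java invocation
    JVM_ARGS="$JVM_ARGS -Djavax.net.ssl.trustStore=$JAVA_HOME/jre/lib/security/cacerts"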


Problem: We started experimenting with enabling SSL for all Flink
communication. For this, we followed
https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/security-ssl.html
for generating the CA and keystore. I added the following properties to
flink-conf.yaml:


security.ssl.enabled: true
security.ssl.keystore: /opt/flink/certs/node1.keystore
security.ssl.keystore-password: <password>
security.ssl.key-password: <password>
security.ssl.truststore: /opt/flink/certs/ca.truststore
security.ssl.truststore-password: <password>
jobmanager.web.ssl.enabled: true
taskmanager.data.ssl.enabled: true
blob.service.ssl.enabled: true
akka.ssl.enabled: true
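
For context, the keystore and truststore above were generated roughly along
the lines of the linked guide. The commands below are only a sketch; the
hostnames, SANs and passwords are placeholders rather than our actual values:

    # self-signed CA, exported and imported into the shared truststore
    keytool -genkeypair -alias ca -keystore ca.keystore -dname "CN=Sample CA" \
      -storepass <password> -keyalg RSA -keysize 4096 -ext "bc=ca:true"
    keytool -exportcert -alias ca -keystore ca.keystore -storepass <password> -file ca.cer
    keytool -importcert -alias ca -keystore ca.truststore -storepass <password> -noprompt -file ca.cer

    # per-node key pair signed by the CA; the SAN should cover the addresses
    # peers actually use to reach the node
    keytool -genkeypair -alias node1 -keystore node1.keystore -dname "CN=node1.example.com" \
      -ext "SAN=dns:node1.example.com,ip:10.0.0.1" -storepass <password> -keyalg RSA -keysize 4096
    keytool -certreq -alias node1 -keystore node1.keystore -storepass <password> -file node1.csr
    keytool -gencert -alias ca -keystore ca.keystore -storepass <password> \
      -ext "SAN=dns:node1.example.com,ip:10.0.0.1" -infile node1.csr -outfile node1.cer
    keytool -importcert -alias ca -keystore node1.keystore -storepass <password> -noprompt -file ca.cer
    keytool -importcert -alias node1 -keystore node1.keystore -storepass <password> -noprompt -file node1.cer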


We then spun up a cluster and tried submitting the same job that was
working before. We get the following error:
org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot load
user class: org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
ClassLoader info: URL ClassLoader:
Class not resolvable through given classloader.
        at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:229)
        at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:95)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:230)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
        at java.lang.Thread.run(Thread.java:748)


This error disappears when we remove the SSL config properties, i.e. run the
Flink cluster without SSL enabled.


So, did we miss any steps for enabling SSL?


P.S.: We tried removing the extra JVM arg mentioned above, but we still get
the same error.

-- 

Aniket

Re: Classloader error after SSL setup

Posted by Aniket Deshpande <an...@gmail.com>.
So, following Eron's suggestion, I tried the security.ssl.verify-hostname:
false configuration and that did the trick. I no longer get the
classloader error, even with blob.service.ssl.enabled: true.
Do you think the hostname verification fails because we are running the
Flink jobmanager and taskmanager via Marathon (and hence essentially as
Mesos tasks)?
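
For reference, the only change compared to the configuration in my first mail
is the line below. As an alternative we have not tried, I suppose one could
keep verification on and instead issue the node certificates with SAN entries
matching the addresses peers actually connect to (placeholders below):

    # flink-conf.yaml
    security.ssl.verify-hostname: false

    # untested alternative: regenerate the node keystore so the certificate
    # covers the Mesos agents' addresses, e.g.
    #   keytool -genkeypair ... -ext "SAN=dns:<agent-host>,ip:<agent-ip>" ...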

On Wed, Oct 4, 2017 at 5:47 PM, Chesnay Schepler <ch...@apache.org> wrote:

> I don't think this is a configuration problem, but a bug in Flink. But
> we'll have to dig a little deeper to be sure.
>
> Besides the actual SSL problem, what concerns me is that we didn't fail
> earlier. If a bug in the SSL setup prevents
> the up- or download of jars then we should fail earlier. Looping in Nico
> who may have some input.
>
>
> On 04.10.2017 22:58, Aniket Deshpande wrote:
>
> Hi Chesnay,
> Thanks for the reply. After your suggestion, I found out that setting
> blob.service.ssl.enabled: false solved the issue and now all the pipelines run as expected.
> So, the issue is kinda narrowed down to blob service ssl now.
> I also checked the jobmanager logs when blob ssl is enabled and I see the
> following error:
>
> 2017-10-03 23:28:50.459 [BLOB connection for /<jm_ip>:46932] ERROR
> org.apache.flink.runtime.blob.BlobServerConnection  - Error while executing
> BLOB connection.
> javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown
>         at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
>         at sun.security.ssl.Alerts.getSSLException(Alerts.java:154)
>         at sun.security.ssl.SSLSocketImpl.recvAlert(SSLSocketImpl.java:2023)
>         at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1125)
>         at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
>         at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:928)
>         at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
>         at sun.security.ssl.AppInputStream.read(AppInputStream.java:71)
>         at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:119)
> So, are there any additional steps that I have to follow for enabling SSL
> for blob service?
>
> On Wed, Oct 4, 2017 at 4:09 PM, Eron Wright <er...@gmail.com> wrote:
>
>> By following Chesnay's recommendation we will hopefully uncover an SSL
>> error that is being masked.  Another thing to try is to disable hostname
>> verification (it is enabled by default) to see whether the certificate is
>> being rejected.
>>
>> On Wed, Oct 4, 2017 at 5:15 AM, Chesnay Schepler <ch...@apache.org>
>> wrote:
>>
>>> something that would also help us narrow down the problematic area is to
>>> enable SSL for one component at a time and see
>>> which one causes the job to fail.
>>>
>>>
>>> On 04.10.2017 14:11, Chesnay Schepler wrote:
>>>
>>> The configuration looks reasonable. Just to be sure, are the paths
>>> accessible by all nodes?
>>>
>>> As a first step, could you set the logging level to DEBUG (by modifying
>>> the 'conf/log4j.properties' file), resubmit the job (after a cluster
>>> restart) and check the Job- and TaskManager logs for any exception?
>>>
>>> On 04.10.2017 03:15, Aniket Deshpande wrote:
>>>
>>> Background: We have a setup of Flink 1.3.1 along with a secure MapR
>>> cluster (Flink is running on MapR client nodes). We run this Flink cluster
>>> via the flink-jobmanager.sh foreground and flink-taskmanager.sh foreground
>>> commands through Marathon. To make this work, we had to add
>>> -Djavax.net.ssl.trustStore="$JAVA_HOME/jre/lib/security/cacerts" in
>>> flink-console.sh as an extra JVM arg (otherwise, Flink picked up MapR's
>>> ssl_truststore as the default truststore and we then ran into issues with
>>> 3rd-party jars such as the aws_sdk). This entire setup was working fine as
>>> is, and we could submit our jars and the pipelines ran without any problem.
>>>
>>>
>>> Problem: We started experimenting with enabling SSL for all
>>> communication for Flink. For this, we followed
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/security-ssl.html
>>> for generating the CA and keystore. I added the following properties to
>>> flink-conf.yaml:
>>>
>>>
>>> security.ssl.enabled: true
>>> security.ssl.keystore: /opt/flink/certs/node1.keystore
>>> security.ssl.keystore-password: <password>
>>> security.ssl.key-password: <password>
>>> security.ssl.truststore: /opt/flink/certs/ca.truststore
>>> security.ssl.truststore-password: <password>
>>> jobmanager.web.ssl.enabled: true
>>> taskmanager.data.ssl.enabled: true
>>> blob.service.ssl.enabled: true
>>> akka.ssl.enabled: true
>>>
>>>
>>> We then spun up a cluster and tried submitting the same job that was
>>> working before. We get the following error:
>>> org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot load
>>> user class: org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
>>> ClassLoader info: URL ClassLoader:
>>> Class not resolvable through given classloader.
>>>         at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:229)
>>>         at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:95)
>>>         at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:230)
>>>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>>>         at java.lang.Thread.run(Thread.java:748)
>>>
>>>
>>> This error disappears when we remove the SSL config properties, i.e. run
>>> the Flink cluster without SSL enabled.
>>>
>>>
>>> So, did we miss any steps for enabling ssl?
>>>
>>>
>>> P.S.: We tried removing the extra JVM arg mentioned above, but we still
>>> get the same error.
>>>
>>> --
>>>
>>> Aniket
>>>
>>>
>>>
>>>
>>
>
>
> --
> Yours Sincerely,
> Aniket S Deshpande.
>
>
>


-- 
Yours Sincerely,
Aniket S Deshpande.

Re: Classloader error after SSL setup

Posted by Chesnay Schepler <ch...@apache.org>.
I don't think this is a configuration problem, but a bug in Flink. But 
we'll have to dig a little deeper to be sure.

Besides the actual SSL problem, what concerns me is that we didn't fail 
earlier. If a bug in the SSL setup prevents
the up- or download of jars then we should fail earlier. Looping in Nico 
who may have some input.

On 04.10.2017 22:58, Aniket Deshpande wrote:
> Hi Chesnay,
> Thanks for the reply. After your suggestion, I found out that setting
> blob.service.ssl.enabled: false solved the issue and now all the
> pipelines run as expected.
> So, the issue is kinda narrowed down to blob service ssl now.
> I also checked the jobmanager logs when blob ssl is enabled and I see 
> the following error:
> 2017-10-03 23:28:50.459 [BLOB connection for /<jm_ip>:46932] ERROR
> org.apache.flink.runtime.blob.BlobServerConnection  - Error while
> executing BLOB connection.
> javax.net.ssl.SSLHandshakeException: Received fatal alert:
> certificate_unknown
>         at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
>         at sun.security.ssl.Alerts.getSSLException(Alerts.java:154)
>         at sun.security.ssl.SSLSocketImpl.recvAlert(SSLSocketImpl.java:2023)
>         at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1125)
>         at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
>         at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:928)
>         at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
>         at sun.security.ssl.AppInputStream.read(AppInputStream.java:71)
>         at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:119)
> So, are there any additional steps that I have to follow for enabling
> SSL for blob service?
>
> On Wed, Oct 4, 2017 at 4:09 PM, Eron Wright <eronwright@gmail.com> wrote:
>
>     By following Chesnay's recommendation we will hopefully uncover an
>     SSL error that is being masked. Another thing to try is to disable
>     hostname verification (it is enabled by default) to see whether
>     the certificate is being rejected.
>
>     On Wed, Oct 4, 2017 at 5:15 AM, Chesnay Schepler
>     <chesnay@apache.org> wrote:
>
>         something that would also help us narrow down the problematic
>         area is to enable SSL for one component at a time and see
>         which one causes the job to fail.
>
>
>         On 04.10.2017 14:11, Chesnay Schepler wrote:
>>         The configuration looks reasonable. Just to be sure, are the
>>         paths accessible by all nodes?
>>
>>         As a first step, could you set the logging level to DEBUG (by
>>         modifying the 'conf/log4j.properties' file), resubmit the job
>>         (after a cluster restart) and check the Job- and TaskManager
>>         logs for any exception?
>>
>>         On 04.10.2017 03:15, Aniket Deshpande wrote:
>>>         Background: We have a setup of Flink 1.3.1 along with a
>>>         secure MapR cluster (Flink is running on MapR client nodes).
>>>         We run this Flink cluster via the flink-jobmanager.sh
>>>         foreground and flink-taskmanager.sh foreground commands
>>>         through Marathon. To make this work, we had to add
>>>         -Djavax.net.ssl.trustStore="$JAVA_HOME/jre/lib/security/cacerts" in
>>>         flink-console.sh as an extra JVM arg (otherwise, Flink picked
>>>         up MapR's ssl_truststore as the default truststore and we
>>>         then ran into issues with 3rd-party jars such as the
>>>         aws_sdk). This entire setup was working fine as is, and we
>>>         could submit our jars and the pipelines ran without any problem.
>>>
>>>
>>>         Problem: We started experimenting with enabling SSL for all
>>>         communication for Flink. For this, we followed
>>>         https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/security-ssl.html
>>>         for generating the CA and keystore. I added the following
>>>         properties to flink-conf.yaml:
>>>
>>>
>>>         security.ssl.enabled: true
>>>         security.ssl.keystore: /opt/flink/certs/node1.keystore
>>>         security.ssl.keystore-password: <password>
>>>         security.ssl.key-password: <password>
>>>         security.ssl.truststore: /opt/flink/certs/ca.truststore
>>>         security.ssl.truststore-password: <password>
>>>         jobmanager.web.ssl.enabled: true
>>>         taskmanager.data.ssl.enabled: true
>>>         blob.service.ssl.enabled: true
>>>         akka.ssl.enabled: true
>>>
>>>
>>>         We then spun up a cluster and tried submitting the same job
>>>         that was working before. We get the following error:
>>>         org.apache.flink.streaming.runtime.tasks.StreamTaskException:
>>>         Cannot load user class:
>>>         org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
>>>         ClassLoader info: URL ClassLoader:
>>>         Class not resolvable through given classloader.
>>>         at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:229)
>>>         at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:95)
>>>         at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:230)
>>>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>>>         at java.lang.Thread.run(Thread.java:748)
>>>
>>>
>>>         This error disappears when we remove the SSL config
>>>         properties, i.e. run the Flink cluster without SSL enabled.
>>>
>>>
>>>         So, did we miss any steps for enabling ssl?
>>>
>>>
>>>         P.S.: We tried removing the extra JVM arg mentioned above,
>>>         but we still get the same error.
>>>
>>>         -- 
>>>
>>>         Aniket
>>
>>
>
>
>
>
>
> -- 
> Yours Sincerely,
> Aniket S Deshpande.



Re: Classloader error after SSL setup

Posted by Aniket Deshpande <an...@gmail.com>.
Hi Chesnay,
Thanks for the reply. After your suggestion, I found out that setting
blob.service.ssl.enabled: false solved the issue and now all the pipelines
run as expected.
So, the issue is kinda narrowed down to blob service ssl now.
I also checked the jobmanager logs when blob ssl is enabled and I see the
following error:

2017-10-03 23:28:50.459 [BLOB connection for /<jm_ip>:46932] ERROR
org.apache.flink.runtime.blob.BlobServerConnection  - Error while executing
BLOB connection.
javax.net.ssl.SSLHandshakeException: Received fatal alert: certificate_unknown
        at sun.security.ssl.Alerts.getSSLException(Alerts.java:192)
        at sun.security.ssl.Alerts.getSSLException(Alerts.java:154)
        at sun.security.ssl.SSLSocketImpl.recvAlert(SSLSocketImpl.java:2023)
        at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:1125)
        at sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1375)
        at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:928)
        at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
        at sun.security.ssl.AppInputStream.read(AppInputStream.java:71)
        at org.apache.flink.runtime.blob.BlobServerConnection.run(BlobServerConnection.java:119)
So, are there any additional steps that I have to follow for enabling SSL
for blob service?
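
If it helps with debugging, one check we could run from a client node to see
which certificate the blob server actually presents (a hypothetical command;
the port is whatever blob.server.port resolves to on the jobmanager):

    openssl s_client -connect <jm_ip>:<blob_server_port> -showcerts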

On Wed, Oct 4, 2017 at 4:09 PM, Eron Wright <er...@gmail.com> wrote:

> By following Chesnay's recommendation we will hopefully uncover an SSL
> error that is being masked.  Another thing to try is to disable hostname
> verification (it is enabled by default) to see whether the certificate is
> being rejected.
>
> On Wed, Oct 4, 2017 at 5:15 AM, Chesnay Schepler <ch...@apache.org>
> wrote:
>
>> something that would also help us narrow down the problematic area is to
>> enable SSL for one component at a time and see
>> which one causes the job to fail.
>>
>>
>> On 04.10.2017 14:11, Chesnay Schepler wrote:
>>
>> The configuration looks reasonable. Just to be sure, are the paths
>> accessible by all nodes?
>>
>> As a first step, could you set the logging level to DEBUG (by modifying
>> the 'conf/log4j.properties' file), resubmit the job (after a cluster
>> restart) and check the Job- and TaskManager logs for any exception?
>>
>> On 04.10.2017 03:15, Aniket Deshpande wrote:
>>
>> Background: We have a setup of Flink 1.3.1 along with a secure MapR
>> cluster (Flink is running on MapR client nodes). We run this Flink cluster
>> via the flink-jobmanager.sh foreground and flink-taskmanager.sh foreground
>> commands through Marathon. To make this work, we had to add
>> -Djavax.net.ssl.trustStore="$JAVA_HOME/jre/lib/security/cacerts" in
>> flink-console.sh as an extra JVM arg (otherwise, Flink picked up MapR's
>> ssl_truststore as the default truststore and we then ran into issues with
>> 3rd-party jars such as the aws_sdk). This entire setup was working fine as
>> is, and we could submit our jars and the pipelines ran without any problem.
>>
>>
>> Problem: We started experimenting with enabling SSL for all
>> communication for Flink. For this, we followed
>> https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/security-ssl.html
>> for generating the CA and keystore. I added the following properties to
>> flink-conf.yaml:
>>
>>
>> security.ssl.enabled: true
>> security.ssl.keystore: /opt/flink/certs/node1.keystore
>> security.ssl.keystore-password: <password>
>> security.ssl.key-password: <password>
>> security.ssl.truststore: /opt/flink/certs/ca.truststore
>> security.ssl.truststore-password: <password>
>> jobmanager.web.ssl.enabled: true
>> taskmanager.data.ssl.enabled: true
>> blob.service.ssl.enabled: true
>> akka.ssl.enabled: true
>>
>>
>> We then spun up a cluster and tried submitting the same job that was
>> working before. We get the following error:
>> org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot load
>> user class: org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
>> ClassLoader info: URL ClassLoader:
>> Class not resolvable through given classloader.
>>         at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:229)
>>         at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:95)
>>         at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:230)
>>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>>         at java.lang.Thread.run(Thread.java:748)
>>
>>
>> This error disappears when we remove the SSL config properties, i.e. run
>> the Flink cluster without SSL enabled.
>>
>>
>> So, did we miss any steps for enabling ssl?
>>
>>
>> P.S.: We tried removing the extra JVM arg mentioned above, but we still
>> get the same error.
>>
>> --
>>
>> Aniket
>>
>>
>>
>>
>


-- 
Yours Sincerely,
Aniket S Deshpande.

Re: Classloader error after SSL setup

Posted by Eron Wright <er...@gmail.com>.
By following Chesnay's recommendation we will hopefully uncover an SSL
error that is being masked.  Another thing to try is to disable hostname
verification (it is enabled by default) to see whether the certificate is
being rejected.
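
For example, in flink-conf.yaml (verification is on by default, so setting it
explicitly rules the hostname check in or out):

    security.ssl.verify-hostname: false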

On Wed, Oct 4, 2017 at 5:15 AM, Chesnay Schepler <ch...@apache.org> wrote:

> something that would also help us narrow down the problematic area is to
> enable SSL for one component at a time and see
> which one causes the job to fail.
>
>
> On 04.10.2017 14:11, Chesnay Schepler wrote:
>
> The configuration looks reasonable. Just to be sure, are the paths
> accessible by all nodes?
>
> As a first step, could you set the logging level to DEBUG (by modifying
> the 'conf/log4j.properties' file), resubmit the job (after a cluster
> restart) and check the Job- and TaskManager logs for any exception?
>
> On 04.10.2017 03:15, Aniket Deshpande wrote:
>
> Background: We have a setup of Flink 1.3.1 along with a secure MapR
> cluster (Flink is running on MapR client nodes). We run this Flink cluster
> via the flink-jobmanager.sh foreground and flink-taskmanager.sh foreground
> commands through Marathon. To make this work, we had to add
> -Djavax.net.ssl.trustStore="$JAVA_HOME/jre/lib/security/cacerts" in
> flink-console.sh as an extra JVM arg (otherwise, Flink picked up MapR's
> ssl_truststore as the default truststore and we then ran into issues with
> 3rd-party jars such as the aws_sdk). This entire setup was working fine as
> is, and we could submit our jars and the pipelines ran without any problem.
>
>
> Problem: We started experimenting with enabling SSL for all communication
> for Flink. For this, we followed
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/security-ssl.html
> for generating the CA and keystore. I added the following properties to
> flink-conf.yaml:
>
>
> security.ssl.enabled: true
> security.ssl.keystore: /opt/flink/certs/node1.keystore
> security.ssl.keystore-password: <password>
> security.ssl.key-password: <password>
> security.ssl.truststore: /opt/flink/certs/ca.truststore
> security.ssl.truststore-password: <password>
> jobmanager.web.ssl.enabled: true
> taskmanager.data.ssl.enabled: true
> blob.service.ssl.enabled: true
> akka.ssl.enabled: true
>
>
> We then spun up a cluster and tried submitting the same job that was
> working before. We get the following error:
> org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot load
> user class: org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
> ClassLoader info: URL ClassLoader:
> Class not resolvable through given classloader.
>         at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:229)
>         at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:95)
>         at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:230)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>         at java.lang.Thread.run(Thread.java:748)
>
>
> This error disappears when we remove the SSL config properties, i.e. run
> the Flink cluster without SSL enabled.
>
>
> So, did we miss any steps for enabling ssl?
>
>
> P.S.: We tried removing the extra JVM arg mentioned above, but we still
> get the same error.
>
> --
>
> Aniket
>
>
>
>

Re: Classloader error after SSL setup

Posted by Chesnay Schepler <ch...@apache.org>.
something that would also help us narrow down the problematic area is to 
enable SSL for one component at a time and see
which one causes the job to fail.
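
i.e. keep security.ssl.enabled: true, but turn on only one of the component
switches per run, roughly like this, and note which run breaks the job:

    # run 1; repeat with the next flag set to true in runs 2-4
    jobmanager.web.ssl.enabled: true
    taskmanager.data.ssl.enabled: false
    blob.service.ssl.enabled: false
    akka.ssl.enabled: false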

On 04.10.2017 14:11, Chesnay Schepler wrote:
> The configuration looks reasonable. Just to be sure, are the paths 
> accessible by all nodes?
>
> As a first step, could you set the logging level to DEBUG (by 
> modifying the 'conf/log4j.properties' file), resubmit the job (after a 
> cluster restart) and check the Job- and TaskManager logs for any 
> exception?
>
> On 04.10.2017 03:15, Aniket Deshpande wrote:
>> Background: We have a setup of Flink 1.3.1 along with a secure MapR
>> cluster (Flink is running on MapR client nodes). We run this Flink cluster
>> via the flink-jobmanager.sh foreground and flink-taskmanager.sh foreground
>> commands through Marathon. To make this work, we had to add
>> -Djavax.net.ssl.trustStore="$JAVA_HOME/jre/lib/security/cacerts" in
>> flink-console.sh as an extra JVM arg (otherwise, Flink picked up MapR's
>> ssl_truststore as the default truststore and we then ran into issues with
>> 3rd-party jars such as the aws_sdk). This entire setup was working fine as
>> is, and we could submit our jars and the pipelines ran without any problem.
>>
>>
>> Problem: We started experimenting with enabling SSL for all
>> communication for Flink. For this, we followed
>> https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/security-ssl.html
>> for generating the CA and keystore. I added the following properties to
>> flink-conf.yaml:
>>
>>
>> security.ssl.enabled: true
>> security.ssl.keystore: /opt/flink/certs/node1.keystore
>> security.ssl.keystore-password: <password>
>> security.ssl.key-password: <password>
>> security.ssl.truststore: /opt/flink/certs/ca.truststore
>> security.ssl.truststore-password: <password>
>> jobmanager.web.ssl.enabled: true
>> taskmanager.data.ssl.enabled: true
>> blob.service.ssl.enabled: true
>> akka.ssl.enabled: true
>>
>>
>> We then spun up a cluster and tried submitting the same job that was
>> working before. We get the following error:
>> org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot load
>> user class: org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
>> ClassLoader info: URL ClassLoader:
>> Class not resolvable through given classloader.
>>         at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:229)
>>         at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:95)
>>         at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:230)
>>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>>         at java.lang.Thread.run(Thread.java:748)
>>
>>
>> This error disappears when we remove the SSL config properties, i.e.
>> run the Flink cluster without SSL enabled.
>>
>>
>> So, did we miss any steps for enabling ssl?
>>
>>
>> P.S.: We tried removing the extra JVM arg mentioned above, but we still
>> get the same error.
>>
>> -- 
>>
>> Aniket
>
>


Re: Classloader error after SSL setup

Posted by Chesnay Schepler <ch...@apache.org>.
The configuration looks reasonable. Just to be sure, are the paths 
accessible by all nodes?

As a first step, could you set the logging level to DEBUG (by modifying 
the 'conf/log4j.properties' file), resubmit the job (after a cluster 
restart) and check the Job- and TaskManager logs for any exception?
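
For example, switching the root logger in conf/log4j.properties to DEBUG
should be enough (the appender name may differ slightly in your distribution):

    log4j.rootLogger=DEBUG, file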

On 04.10.2017 03:15, Aniket Deshpande wrote:
> Background: We have a setup of Flink 1.3.1 along with a secure MapR
> cluster (Flink is running on MapR client nodes). We run this Flink cluster
> via the flink-jobmanager.sh foreground and flink-taskmanager.sh foreground
> commands through Marathon. To make this work, we had to add
> -Djavax.net.ssl.trustStore="$JAVA_HOME/jre/lib/security/cacerts" in
> flink-console.sh as an extra JVM arg (otherwise, Flink picked up MapR's
> ssl_truststore as the default truststore and we then ran into issues with
> 3rd-party jars such as the aws_sdk). This entire setup was working fine as
> is, and we could submit our jars and the pipelines ran without any problem.
>
>
> Problem: We started experimenting with enabling SSL for all
> communication for Flink. For this, we followed
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/security-ssl.html
> for generating the CA and keystore. I added the following properties to
> flink-conf.yaml:
>
>
> security.ssl.enabled: true
> security.ssl.keystore: /opt/flink/certs/node1.keystore
> security.ssl.keystore-password: <password>
> security.ssl.key-password: <password>
> security.ssl.truststore: /opt/flink/certs/ca.truststore
> security.ssl.truststore-password: <password>
> jobmanager.web.ssl.enabled: true
> taskmanager.data.ssl.enabled: true
> blob.service.ssl.enabled: true
> akka.ssl.enabled: true
>
>
> We then spun up a cluster and tried submitting the same job that was
> working before. We get the following error:
> org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot load
> user class: org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
> ClassLoader info: URL ClassLoader:
> Class not resolvable through given classloader.
>         at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperator(StreamConfig.java:229)
>         at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:95)
>         at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:230)
>         at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
>         at java.lang.Thread.run(Thread.java:748)
>
>
> This error disappears when we remove the SSL config properties, i.e. run
> the Flink cluster without SSL enabled.
>
>
> So, did we miss any steps for enabling ssl?
>
>
> P.S.: We tried removing the extra JVM arg mentioned above, but we still
> get the same error.
>
> -- 
>
> Aniket