Posted to user@flink.apache.org by Flavio Pompermaier <po...@okkam.it> on 2021/04/14 15:36:49 UTC
Flink Hadoop config on docker-compose
Hi everybody,
I'm trying to set up reading from HDFS using docker-compose and Flink
1.11.3.
If I pass 'env.hadoop.conf.dir' and 'env.yarn.conf.dir'
using FLINK_PROPERTIES (under environment section of the docker-compose
service) I see in the logs the following line:
"Could not find Hadoop configuration via any of the supported method"
If I'm not wrong, this means that HADOOP_CONF_DIR is actually not
exported by the run scripts.
Indeed, if I add HADOOP_CONF_DIR and YARN_CONF_DIR (again under the
environment section of the docker-compose service) I don't see that line.
Is this the expected behavior?
Below is the relevant docker-compose service I use (I've removed the content
of HADOOP_CLASSPATH because it is too long, and I didn't report the
taskmanager service, which is similar):
flink-jobmanager:
  container_name: flink-jobmanager
  build:
    context: .
    dockerfile: Dockerfile.flink
    args:
      FLINK_VERSION: 1.11.3-scala_2.12-java11
  image: 'flink-test:1.11.3-scala_2.12-java11'
  ports:
    - "8091:8081"
    - "8092:8082"
  command: jobmanager
  environment:
    - |
      FLINK_PROPERTIES=
      jobmanager.rpc.address: flink-jobmanager
      rest.port: 8081
      historyserver.web.port: 8082
      web.upload.dir: /opt/flink
      env.hadoop.conf.dir: /opt/hadoop/conf
      env.yarn.conf.dir: /opt/hadoop/conf
    - |
      HADOOP_CLASSPATH=...
    - HADOOP_CONF_DIR=/opt/hadoop/conf
    - YARN_CONF_DIR=/opt/hadoop/conf
  volumes:
    - 'flink_shared_folder:/tmp/test'
    - 'flink_uploads:/opt/flink/flink-web-upload'
    - 'flink_hadoop_conf:/opt/hadoop/conf'
    - 'flink_hadoop_libs:/opt/hadoop-3.2.1/share'
Thanks in advance for any support,
Flavio
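A quick way to see the effect described above is to check, inside the running
container, which of the Hadoop-related environment variables the entrypoint
actually exported. The helper below is a hypothetical illustration (not part of
Flink), written in Python only for brevity:

```python
import os

def hadoop_env_report(env=os.environ):
    """Report which Hadoop-related variables are visible to the process.

    Flink can only discover a Hadoop configuration directory if one of
    these variables is actually set in the container's environment;
    options passed via FLINK_PROPERTIES end up in flink-conf.yaml instead.
    """
    keys = ("HADOOP_CONF_DIR", "YARN_CONF_DIR", "HADOOP_HOME", "HADOOP_CLASSPATH")
    return {k: env.get(k) for k in keys}

# Simulate a container where only FLINK_PROPERTIES was used:
report = hadoop_env_report({"FLINK_PROPERTIES": "env.hadoop.conf.dir: /opt/hadoop/conf"})
# All four values are None, so Hadoop discovery has nothing to work with.
```

With HADOOP_CONF_DIR set explicitly (as in the compose file above), the report
would contain the expected path instead of None.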
Re: Flink Hadoop config on docker-compose
Posted by Flavio Pompermaier <po...@okkam.it>.
Great! Thanks for the support
On Thu, Apr 22, 2021 at 2:57 PM Matthias Pohl <ma...@ververica.com>
wrote:
Re: Flink Hadoop config on docker-compose
Posted by Matthias Pohl <ma...@ververica.com>.
I think you're right, Flavio. I created FLINK-22414 [1] to cover this. Thanks
for bringing it up.
Matthias
[1] https://issues.apache.org/jira/browse/FLINK-22414
On Fri, Apr 16, 2021 at 9:32 AM Flavio Pompermaier <po...@okkam.it>
wrote:
Re: Flink Hadoop config on docker-compose
Posted by Flavio Pompermaier <po...@okkam.it>.
Hi Yang,
isn't this something to fix? If I look at the documentation at [1], in
the "Passing configuration via environment variables" section, there is:
"The environment variable FLINK_PROPERTIES should contain a list of Flink
cluster configuration options separated by new line,
the same way as in the flink-conf.yaml. FLINK_PROPERTIES takes precedence
over configurations in flink-conf.yaml."
To me this means that if I specify "env.hadoop.conf.dir" it should be
handled as well. Am I wrong?
[1]
https://ci.apache.org/projects/flink/flink-docs-stable/deployment/resource-providers/standalone/docker.html
Best,
Flavio
On Fri, Apr 16, 2021 at 4:52 AM Yang Wang <da...@gmail.com> wrote:
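The documentation passage Flavio quotes ("a list of Flink cluster configuration
options separated by new line, the same way as in the flink-conf.yaml") can be
illustrated with a toy parser. This is a simplified sketch of the idea, not the
actual docker-entrypoint.sh logic:

```python
def parse_flink_properties(blob: str) -> dict:
    """Parse a FLINK_PROPERTIES-style blob into config key/value pairs.

    Each non-empty line is treated like a flink-conf.yaml entry
    ("key: value"). Note that the result is plain Flink configuration:
    an entry such as env.hadoop.conf.dir becomes a config option; it is
    NOT automatically exported as the HADOOP_CONF_DIR env variable.
    """
    conf = {}
    for line in blob.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        conf[key.strip()] = value.strip()
    return conf

props = """
jobmanager.rpc.address: flink-jobmanager
rest.port: 8081
env.hadoop.conf.dir: /opt/hadoop/conf
"""
conf = parse_flink_properties(props)
# conf["env.hadoop.conf.dir"] == "/opt/hadoop/conf"
```

This makes the gap visible: the option is recorded in the configuration, but
nothing in this path touches the process environment.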
Re: Flink Hadoop config on docker-compose
Posted by Yang Wang <da...@gmail.com>.
It seems that we do not export HADOOP_CONF_DIR as an environment variable in
the current implementation, even though the env.xxx Flink config options are
set. They are only used to construct the classpath for the JM/TM process [1].
However, "HadoopUtils" [2] does not support getting the Hadoop
configuration from the classpath.
[1].
https://github.com/apache/flink/blob/release-1.11/flink-dist/src/main/flink-bin/bin/config.sh#L256
[2].
https://github.com/apache/flink/blob/release-1.11/flink-connectors/flink-hadoop-compatibility/src/main/java/org/apache/flink/api/java/hadoop/mapred/utils/HadoopUtils.java#L64
Best,
Yang
Flavio Pompermaier <po...@okkam.it> wrote on Fri, Apr 16, 2021 at 3:55 AM:
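The behavior Yang describes (the env.xxx options feed only the classpath, while
HadoopUtils looks at environment variables and cannot read the configuration
from the classpath) can be sketched roughly as follows. This is an illustrative
approximation in Python, not Flink's actual Java code, and the exact candidate
order is an assumption based on [2]:

```python
import os

def find_hadoop_conf_dir(env=os.environ):
    """Approximate the env-variable-based discovery HadoopUtils performs.

    Returns the first candidate directory, or None when discovery fails,
    which is when Flink logs "Could not find Hadoop configuration via
    any of the supported method". Candidate order is an assumption.
    """
    candidates = []
    if env.get("HADOOP_CONF_DIR"):
        candidates.append(env["HADOOP_CONF_DIR"])
    if env.get("HADOOP_HOME"):
        home = env["HADOOP_HOME"]
        candidates.append(os.path.join(home, "conf"))           # Hadoop 1.x layout
        candidates.append(os.path.join(home, "etc", "hadoop"))  # Hadoop 2.x+ layout
    return candidates[0] if candidates else None

# With only FLINK_PROPERTIES set (no env vars exported), discovery finds nothing:
assert find_hadoop_conf_dir({}) is None
```

Exporting HADOOP_CONF_DIR in the compose file, as Flavio did, makes the first
candidate available and silences the warning.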
Re: Flink Hadoop config on docker-compose
Posted by Flavio Pompermaier <po...@okkam.it>.
Hi Robert,
indeed my docker-compose setup works only if I also add the Hadoop and YARN
conf dirs as environment variables, while I was expecting those two variables
to be generated automatically just by setting the env.xxx options in the
FLINK_PROPERTIES variable.
I just want to understand what to expect: whether I really need to specify
HADOOP_CONF_DIR and YARN_CONF_DIR as env variables or not.
On Thu, Apr 15, 2021 at 8:39 PM Robert Metzger <rm...@apache.org> wrote:
Re: Flink Hadoop config on docker-compose
Posted by Robert Metzger <rm...@apache.org>.
Hi,
I'm not aware of any known issues with Hadoop and Flink on Docker.
I also tried what you are doing locally, and it seems to work:
flink-jobmanager | 2021-04-15 18:37:48,300 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Starting
StandaloneSessionClusterEntrypoint.
flink-jobmanager | 2021-04-15 18:37:48,338 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Install
default filesystem.
flink-jobmanager | 2021-04-15 18:37:48,375 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] - Install
security context.
flink-jobmanager | 2021-04-15 18:37:48,404 INFO
org.apache.flink.runtime.security.modules.HadoopModule [] - Hadoop
user set to flink (auth:SIMPLE)
flink-jobmanager | 2021-04-15 18:37:48,408 INFO
org.apache.flink.runtime.security.modules.JaasModule [] - Jaas
file will be created as /tmp/jaas-811306162058602256.conf.
flink-jobmanager | 2021-04-15 18:37:48,415 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint [] -
Initializing cluster services.
Here's my code:
https://gist.github.com/rmetzger/0cf4ba081d685d26478525bf69c7bd39
Hope this helps!
On Wed, Apr 14, 2021 at 5:37 PM Flavio Pompermaier <po...@okkam.it>
wrote: