Posted to user@flink.apache.org by Be...@Dev.Helaba.de on 2017/12/04 09:03:37 UTC

Blob server not working with 1.4.0.RC2

Hi
Since we switched to Release 1.4 the taskmanagers are unable to download blobs from the jobmanager.
The taskmanager registration still works.
Netstat on jobmanager shows open ports at 6123 and 50000. But a telnet connection from taskmanager to jobmanager on port 50000 times out.

Any ideas are welcome.

Regards

Bernd

Jobmanager log:

2017-12-04 08:48:30,167 INFO  org.apache.flink.runtime.jobmanager.JobManager                - Starting JobManager actor
2017-12-04 08:48:30,197 INFO  org.apache.flink.runtime.blob.BlobServer                      - Created BLOB server storage directory /tmp/blobStore-81cd12c7-394e-4777-85a1-98389b72dd08
2017-12-04 08:48:30,205 INFO  org.apache.flink.runtime.blob.BlobServer                      - Started BLOB server at 0.0.0.0:50000 - max concurrent requests: 50 - max backlog: 1000
2017-12-04 08:48:30,608 INFO  org.apache.flink.runtime.jobmanager.JobManager                - Starting JobManager at akka.tcp://flink@flink-jobmanager:6123/user/jobmanager.
2017-12-04 08:48:30,628 INFO  org.apache.flink.runtime.jobmanager.MemoryArchivist           - Started memory archivist akka://flink/user/archive
2017-12-04 08:48:30,676 INFO  org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager  - Trying to associate with JobManager leader akka.tcp://flink@flink-jobmanager:6123/user/jobmanager
2017-12-04 08:48:30,692 INFO  org.apache.flink.runtime.jobmanager.JobManager                - JobManager akka.tcp://flink@flink-jobmanager:6123/user/jobmanager was granted leadership with leader session ID Some(00000000-0000-0000-0000-000000000000).
2017-12-04 08:48:30,700 INFO  org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager  - Resource Manager associating with leading JobManager Actor[akka://flink/user/jobmanager#886586058] - leader session 00000000-0000-0000-0000-000000000000
2017-12-04 08:53:50,635 INFO  org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager  - TaskManager 627338086a766c140909ba45f2e717d0 has started.
2017-12-04 08:53:50,638 INFO  org.apache.flink.runtime.instance.InstanceManager             - Registered TaskManager at flink-taskmanager-65cf757d9b-hj65d (akka.tcp://flink@flink-taskmanager-65cf757d9b-hj65d:45932/user/taskmanager) as f9d2843d0223b15d8fce52aea8231cc6. Current number of registered hosts is 1. Current number of alive task slots is 8.
2017-12-04 08:53:50,658 WARN  akka.serialization.Serialization(akka://flink)                - Using the default Java serializer for class [org.apache.flink.runtime.messages.JobManagerMessages$LeaderSessionMessage] which is not recommended because of performance implications. Use another serializer or disable this warning using the setting 'akka.actor.warn-about-java-serializer-usage'
2017-12-04 08:53:55,714 INFO  org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager  - TaskManager 08c3e6f7c765e2ab88e2ea645049cb9d has started.
2017-12-04 08:53:55,714 INFO  org.apache.flink.runtime.instance.InstanceManager             - Registered TaskManager at flink-taskmanager-65cf757d9b-jtzw5 (akka.tcp://flink@flink-taskmanager-65cf757d9b-jtzw5:41710/user/taskmanager) as da8a8da3650ce53f460784c54938a071. Current number of registered hosts is 2. Current number of alive task slots is 16.
2017-12-04 09:04:08,850 ERROR org.apache.flink.runtime.rest.handler.legacy.files.StaticFileServerHandler  - Caught exception
java.io.IOException: Operation timed out
        at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
        at sun.nio.ch.IOUtil.read(IOUtil.java:192)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)


Taskmanager log:
2017-12-04 08:53:55,511 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Starting TaskManager actor at akka://flink/user/taskmanager#-977591027.
2017-12-04 08:53:55,511 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - TaskManager data connection information: 08c3e6f7c765e2ab88e2ea645049cb9d @ flink-taskmanager-65cf757d9b-jtzw5 (dataPort=42142)
2017-12-04 08:53:55,512 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - TaskManager has 8 task slot(s).
2017-12-04 08:53:55,513 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Memory usage stats: [HEAP: 131/981/981 MB, NON HEAP: 34/35/-1 MB (used/committed/max)]
2017-12-04 08:53:55,518 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Trying to register at JobManager akka.tcp://flink@flink-jobmanager:6123/user/jobmanager (attempt 1, timeout: 500 milliseconds)
2017-12-04 08:53:55,671 WARN  akka.serialization.Serialization(akka://flink)                - Using the default Java serializer for class [org.apache.flink.runtime.messages.JobManagerMessages$LeaderSessionMessage] which is not recommended because of performance implications. Use another serializer or disable this warning using the setting 'akka.actor.warn-about-java-serializer-usage'
2017-12-04 08:53:55,737 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Successful registration at JobManager (akka.tcp://flink@flink-jobmanager:6123/user/jobmanager), starting network stack and library cache.
2017-12-04 08:53:55,742 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Determined BLOB server address to be flink-jobmanager/10.104.5.130:50000. Starting BLOB cache.
2017-12-04 08:53:55,752 INFO  org.apache.flink.runtime.blob.PermanentBlobCache              - Created BLOB cache storage directory /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5
2017-12-04 08:53:55,776 INFO  org.apache.flink.runtime.blob.TransientBlobCache              - Created BLOB cache storage directory /tmp/blobStore-8d7c8660-4455-43e9-93be-dadeb622b9e1
2017-12-04 08:54:00,796 WARN  akka.serialization.Serialization(akka://flink)                - Using the default Java serializer for class [org.apache.flink.runtime.messages.TaskManagerMessages$Heartbeat] which is not recommended because of performance implications. Use another serializer or disable this warning using the setting 'akka.actor.warn-about-java-serializer-usage'
2017-12-04 09:10:38,726 WARN  akka.serialization.Serialization(akka://flink)                - Using the default Java serializer for class [org.apache.flink.runtime.metrics.dump.MetricDumpSerialization$MetricSerializationResult] which is not recommended because of performance implications. Use another serializer or disable this warning using the setting 'akka.actor.warn-about-java-serializer-usage'
2017-12-04 09:11:35,403 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Received task Source: PACS008 Generator -> Double Submission Generator -> Invalid IBAN Generator -> Sink: IP-PREPROCESSOR-INBOUND (1/1)
2017-12-04 09:11:35,404 INFO  org.apache.flink.runtime.taskmanager.Task                     - Source: PACS008 Generator -> Double Submission Generator -> Invalid IBAN Generator -> Sink: IP-PREPROCESSOR-INBOUND (1/1) (797de317bebf24df087f2da63cf5118e) switched from CREATED to DEPLOYING.
2017-12-04 09:11:35,404 INFO  org.apache.flink.runtime.taskmanager.Task                     - Creating FileSystem stream leak safety net for task Source: PACS008 Generator -> Double Submission Generator -> Invalid IBAN Generator -> Sink: IP-PREPROCESSOR-INBOUND (1/1) (797de317bebf24df087f2da63cf5118e) [DEPLOYING]
2017-12-04 09:11:35,408 WARN  akka.serialization.Serialization(akka://flink)                - Using the default Java serializer for class [org.apache.flink.runtime.messages.Acknowledge] which is not recommended because of performance implications. Use another serializer or disable this warning using the setting 'akka.actor.warn-about-java-serializer-usage'
2017-12-04 09:11:35,410 INFO  org.apache.flink.runtime.taskmanager.Task                     - Loading JAR files for task Source: PACS008 Generator -> Double Submission Generator -> Invalid IBAN Generator -> Sink: IP-PREPROCESSOR-INBOUND (1/1) (797de317bebf24df087f2da63cf5118e) [DEPLOYING].
2017-12-04 09:11:35,440 INFO  org.apache.flink.runtime.blob.BlobClient                      - Downloading 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120d2b1b-392b55cacd4ed70b48e620f131a8bc61 from flink-jobmanager/10.104.5.130:50000
2017-12-04 09:13:42,663 ERROR org.apache.flink.runtime.blob.BlobClient                      - Failed to fetch BLOB 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120d2b1b-392b55cacd4ed70b48e620f131a8bc61 from flink-jobmanager/10.104.5.130:50000 and store it under /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5/incoming/temp-00000000 Retrying...
2017-12-04 09:13:42,663 INFO  org.apache.flink.runtime.blob.BlobClient                      - Downloading 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120d2b1b-392b55cacd4ed70b48e620f131a8bc61 from flink-jobmanager/10.104.5.130:50000 (retry 1)
2017-12-04 09:15:49,895 ERROR org.apache.flink.runtime.blob.BlobClient                      - Failed to fetch BLOB 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120d2b1b-392b55cacd4ed70b48e620f131a8bc61 from flink-jobmanager/10.104.5.130:50000 and store it under /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5/incoming/temp-00000000 Retrying...
2017-12-04 09:15:49,895 INFO  org.apache.flink.runtime.blob.BlobClient                      - Downloading 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120d2b1b-392b55cacd4ed70b48e620f131a8bc61 from flink-jobmanager/10.104.5.130:50000 (retry 2)
2017-12-04 09:17:57,127 ERROR org.apache.flink.runtime.blob.BlobClient                      - Failed to fetch BLOB 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120d2b1b-392b55cacd4ed70b48e620f131a8bc61 from flink-jobmanager/10.104.5.130:50000 and store it under /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5/incoming/temp-00000000 Retrying...
2017-12-04 09:17:57,127 INFO  org.apache.flink.runtime.blob.BlobClient                      - Downloading 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120d2b1b-392b55cacd4ed70b48e620f131a8bc61 from flink-jobmanager/10.104.5.130:50000 (retry 3)
2017-12-04 09:20:04,359 ERROR org.apache.flink.runtime.blob.BlobClient                      - Failed to fetch BLOB 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120d2b1b-392b55cacd4ed70b48e620f131a8bc61 from flink-jobmanager/10.104.5.130:50000 and store it under /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5/incoming/temp-00000000 Retrying...
2017-12-04 09:20:04,359 INFO  org.apache.flink.runtime.blob.BlobClient                      - Downloading 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120d2b1b-392b55cacd4ed70b48e620f131a8bc61 from flink-jobmanager/10.104.5.130:50000 (retry 4)
2017-12-04 09:22:11,590 ERROR org.apache.flink.runtime.blob.BlobClient                      - Failed to fetch BLOB 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120d2b1b-392b55cacd4ed70b48e620f131a8bc61 from flink-jobmanager/10.104.5.130:50000 and store it under /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5/incoming/temp-00000000 Retrying...
2017-12-04 09:22:11,591 INFO  org.apache.flink.runtime.blob.BlobClient                      - Downloading 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120d2b1b-392b55cacd4ed70b48e620f131a8bc61 from flink-jobmanager/10.104.5.130:50000 (retry 5)
2017-12-04 09:24:18,823 ERROR org.apache.flink.runtime.blob.BlobClient                      - Failed to fetch BLOB 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120d2b1b-392b55cacd4ed70b48e620f131a8bc61 from flink-jobmanager/10.104.5.130:50000 and store it under /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5/incoming/temp-00000000 No retries left.
java.io.IOException: Could not connect to BlobServer at address flink-jobmanager/10.104.5.130:50000
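
A note on the timing in the TaskManager log above: each failed download attempt takes about 127 seconds. That would be consistent with connection attempts whose packets are silently dropped (for example by a firewall, or a port that is not exposed) rather than actively refused, because 127 s is the point at which a Linux host with the default tcp_syn_retries=6 and a 1 s initial retransmission timeout gives up on a connect. Whether the TaskManager hosts actually run with those defaults is an assumption; the sketch below only reproduces the arithmetic from the timestamps in the log, and the function names are made up for illustration.

```python
from datetime import datetime

# (attempt start, attempt failure) timestamps copied from the TaskManager log above.
ATTEMPTS = [
    ("2017-12-04 09:11:35,440", "2017-12-04 09:13:42,663"),
    ("2017-12-04 09:13:42,663", "2017-12-04 09:15:49,895"),
    ("2017-12-04 09:15:49,895", "2017-12-04 09:17:57,127"),
    ("2017-12-04 09:17:57,127", "2017-12-04 09:20:04,359"),
    ("2017-12-04 09:20:04,359", "2017-12-04 09:22:11,590"),
    ("2017-12-04 09:22:11,591", "2017-12-04 09:24:18,823"),
]

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def attempt_durations(attempts):
    """Return the duration of each download attempt in seconds."""
    durations = []
    for start, end in attempts:
        started = datetime.strptime(start, LOG_TIME_FORMAT)
        failed = datetime.strptime(end, LOG_TIME_FORMAT)
        durations.append((failed - started).total_seconds())
    return durations


def syn_timeout_seconds(retries=6, initial_rto=1.0):
    """Total wait for an unanswered TCP connect with exponential backoff.

    Assumes Linux defaults (tcp_syn_retries=6, 1 s initial timeout): the
    initial SYN plus `retries` retransmissions, each waiting twice as long.
    """
    return sum(initial_rto * 2 ** i for i in range(retries + 1))


if __name__ == "__main__":
    for seconds in attempt_durations(ATTEMPTS):
        print(f"attempt took {seconds:.3f} s")
    print(f"expected SYN timeout: {syn_timeout_seconds():.0f} s")
```

If the attempts had failed within milliseconds instead, a refused connection (nothing listening on the port) would be the more likely explanation.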



  ________________________________


Landesbank Hessen-Thueringen Girozentrale
Anstalt des oeffentlichen Rechts
Sitz: Frankfurt am Main / Erfurt
Amtsgericht Frankfurt am Main, HRA 29821 / Amtsgericht Jena, HRA 102181

Bitte nutzen Sie die E-Mail-Verbindung mit uns ausschliesslich zum Informationsaustausch. Wir koennen auf diesem Wege keine rechtsgeschaeftlichen Erklaerungen (Auftraege etc.) entgegennehmen.

Der Inhalt dieser Nachricht ist vertraulich und nur fuer den angegebenen Empfaenger bestimmt. Jede Form der Kenntnisnahme oder Weitergabe durch Dritte ist unzulaessig. Sollte diese Nachricht nicht fur Sie bestimmt sein, so bitten wir Sie, sich mit uns per E-Mail oder telefonisch in Verbindung zu setzen.

Please use your E-mail connection with us exclusively for the exchange of information. We do not accept legally binding declarations (orders, etc.) by this means of communication.

The contents of this message is confidential and intended only for the recipient indicated. Taking notice of this message or disclosure by third parties is not
permitted. In the event that this message is not intended for you, please contact us via E-mail or phone.

Re: Blob server not working with 1.4.0.RC2

Posted by Nico Kruber <ni...@data-artisans.com>.
Hi Bernd,
thanks for the report. I tried to reproduce it locally but both a telnet
connection to the BlobServer as well as the BLOB download by the
TaskManagers work for me. Can you share your configuration that is
causing the problem? You could also try increasing the log level to
DEBUG and see if there is something more in the logs (the exception
thrown in StaticFileServerHandler looks suspicious but is not related to
the BlobServer).
Apparently, the TaskManager resolves flink-jobmanager to 10.104.5.130.
Is that the correct address and can the TaskManager talk to this IP?
(Could a firewall be blocking this?)
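
The reachability question above can also be checked from inside a TaskManager container with a few lines of code, which helps when telnet is not installed in the image. This is a minimal sketch: the helper name can_connect is made up for illustration, and the host name and ports mentioned afterwards are simply the ones from the logs in this thread.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling can_connect("flink-jobmanager", 6123) and can_connect("flink-jobmanager", 50000) from a TaskManager would show whether only the BLOB server port is unreachable while the RPC port works.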

Did you, by any chance, set up SSL, too? There was a recent thread on
the mailing list [1] where a user had some problems with
"security.ssl.verify-hostname" being set to true which may be related.


Nico

[1]
https://lists.apache.org/thread.html/879d072bfd6761947b4bd703324489db50e8b14c328992118af875d8@%3Cuser.flink.apache.org%3E



Re: Blob server not working with 1.4.0.RC2

Posted by Fabian Hueske <fh...@gmail.com>.
Hi Bernd,

just to make sure, is your setup working now as expected?

Thanks, Fabian

2017-12-06 8:45 GMT+01:00 <Be...@dev.helaba.de>:

> Hi Nico
> I think there were changes in the default port for the BLOB server. I
> missed the fact that the Kubernetes configuration was still exposing 6124
> for the JobManager BLOB server.
> Thanks
>
> Bernd
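
The mismatch Bernd describes can be caught before deployment by comparing the BLOB server port in the Flink configuration with the ports the Kubernetes Service exposes. The following is only a sketch: blob.server.port is Flink's configuration key for the BLOB server port, but the helper names, the sample configuration text and the list of exposed ports are illustrative assumptions, not taken from Bernd's actual setup.

```python
def read_flink_option(conf_text, key):
    """Return the value of `key` from flink-conf.yaml style text, or None."""
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        name, value = line.split(":", 1)
        if name.strip() == key:
            return value.strip()
    return None


def blob_port_is_exposed(conf_text, exposed_ports):
    """Check that a fixed blob.server.port is among the exposed ports."""
    value = read_flink_option(conf_text, "blob.server.port")
    if value is None or not value.isdigit():
        # Unset or a port range: cannot be matched against a fixed list here.
        return False
    return int(value) in exposed_ports


# Hypothetical values: the BLOB server listens on 50000 (as in the logs),
# while the Service still exposes the old port 6124.
SAMPLE_CONF = """
jobmanager.rpc.address: flink-jobmanager
jobmanager.rpc.port: 6123
blob.server.port: 50000
"""
SERVICE_PORTS = [6123, 6124, 8081]
```

With the sample values, blob_port_is_exposed(SAMPLE_CONF, SERVICE_PORTS) is False, which mirrors the situation in this thread; exposing 50000 on the Service, or setting blob.server.port to 6124, makes the check pass.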
>
> -----Original Message-----
> From: Nico Kruber [mailto:nico@data-artisans.com]
> Sent: Monday, 4 December 2017 14:17
> To: Winterstein, Bernd; user@flink.apache.org
> Subject: Re: Blob server not working with 1.4.0.RC2
>
> Hi Bernd,
> thanks for the report. I tried to reproduce it locally but both a telnet
> connection to the BlobServer as well as the BLOB download by the
> TaskManagers work for me. Can you share your configuration that is causing
> the problem? You could also try increasing the log level to DEBUG and see
> if there is something more in the logs (the exception thrown in
> StaticFileServerHandler looks suspicious but is not related to the
> BlobServer).
> Apparently, the TaskManager resolves flink-jobmanager to 10.104.5.130.
> Is that the correct address and can the TaskManager talk to this IP?
> (Could a firewall be blocking this?)
>
> Did you, by any chance, set up SSL, too? There was a recent thread on the
> mailing list [1] where a user had some problems with
> "security.ssl.verify-hostname" being set to true which may be related.
>
>
> Nico
>
> [1]
> https://lists.apache.org/thread.html/879d072bfd6761947b4bd703324489db50e8b14c328992118af875d8@%3Cuser.flink.apache.org%3E
>
> On 04/12/17 10:03, Bernd.Winterstein@Dev.Helaba.de wrote:
> > Hi
> > Since we switched to Release 1.4 the taskmanagers are unable to
> > download blobs from the jobmanager.
> > The taskmanager registration still works.
> > Netstat on jobmanager shows open ports at 6123 and 50000. But a telnet
> > connection from taskmanager to jobmanager on port 50000 times out.
> >
> > Any ideas are welcome.
> >
> > Regards
> >
> > Bernd
> >
> > Jobmanager log:
> >
> > 2017-12-04 08:48:30,167 INFO  org.apache.flink.runtime.jobmanager.JobManager                - Starting JobManager actor
> > 2017-12-04 08:48:30,197 INFO  org.apache.flink.runtime.blob.BlobServer                      - Created BLOB server storage directory /tmp/blobStore-81cd12c7-394e-4777-85a1-98389b72dd08
> > 2017-12-04 08:48:30,205 INFO  org.apache.flink.runtime.blob.BlobServer                      - Started BLOB server at 0.0.0.0:50000 - max concurrent requests: 50 - max backlog: 1000
> > 2017-12-04 08:48:30,608 INFO  org.apache.flink.runtime.jobmanager.JobManager                - Starting JobManager at akka.tcp://flink@flink-jobmanager:6123/user/jobmanager.
> > 2017-12-04 08:48:30,628 INFO  org.apache.flink.runtime.jobmanager.MemoryArchivist           - Started memory archivist akka://flink/user/archive
> > 2017-12-04 08:48:30,676 INFO  org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager  - Trying to associate with JobManager leader akka.tcp://flink@flink-jobmanager:6123/user/jobmanager
> > 2017-12-04 08:48:30,692 INFO  org.apache.flink.runtime.jobmanager.JobManager                - JobManager akka.tcp://flink@flink-jobmanager:6123/user/jobmanager was granted leadership with leader session ID Some(00000000-0000-0000-0000-000000000000).
> > 2017-12-04 08:48:30,700 INFO  org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager  - Resource Manager associating with leading JobManager Actor[akka://flink/user/jobmanager#886586058] - leader session 00000000-0000-0000-0000-000000000000
> > 2017-12-04 08:53:50,635 INFO  org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager  - TaskManager 627338086a766c140909ba45f2e717d0 has started.
> > 2017-12-04 08:53:50,638 INFO  org.apache.flink.runtime.instance.InstanceManager             - Registered TaskManager at flink-taskmanager-65cf757d9b-hj65d (akka.tcp://flink@flink-taskmanager-65cf757d9b-hj65d:45932/user/taskmanager) as f9d2843d0223b15d8fce52aea8231cc6. Current number of registered hosts is 1. Current number of alive task slots is 8.
> > 2017-12-04 08:53:50,658 WARN  akka.serialization.Serialization(akka://flink)                - Using the default Java serializer for class [org.apache.flink.runtime.messages.JobManagerMessages$LeaderSessionMessage] which is not recommended because of performance implications. Use another serializer or disable this warning using the setting 'akka.actor.warn-about-java-serializer-usage'
> > 2017-12-04 08:53:55,714 INFO  org.apache.flink.runtime.clusterframework.standalone.StandaloneResourceManager  - TaskManager 08c3e6f7c765e2ab88e2ea645049cb9d has started.
> > 2017-12-04 08:53:55,714 INFO  org.apache.flink.runtime.instance.InstanceManager             - Registered TaskManager at flink-taskmanager-65cf757d9b-jtzw5 (akka.tcp://flink@flink-taskmanager-65cf757d9b-jtzw5:41710/user/taskmanager) as da8a8da3650ce53f460784c54938a071. Current number of registered hosts is 2. Current number of alive task slots is 16.
> > 2017-12-04 09:04:08,850 ERROR org.apache.flink.runtime.rest.handler.legacy.files.StaticFileServerHandler  - Caught exception
> > java.io.IOException: Operation timed out
> >         at sun.nio.ch.FileDispatcherImpl.read0(Native Method)
> >         at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
> >         at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223)
> >         at sun.nio.ch.IOUtil.read(IOUtil.java:192)
> >         at
> > sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380)
> >
> >
> > Taskmanager log:
> > 2017-12-04 08:53:55,511 INFO
> > org.apache.flink.runtime.taskmanager.TaskManager              -
> > Starting TaskManager actor at akka://flink/user/taskmanager#-977591027.
> > 2017-12-04 08:53:55,511 INFO
> > org.apache.flink.runtime.taskmanager.TaskManager              -
> > TaskManager data connection information:
> > 08c3e6f7c765e2ab88e2ea645049cb9d @ flink-taskmanager-65cf757d9b-jtzw5
> > (dataPort=42142)
> > 2017-12-04 08:53:55,512 INFO
> > org.apache.flink.runtime.taskmanager.TaskManager              -
> > TaskManager has 8 task slot(s).
> > 2017-12-04 08:53:55,513 INFO
> > org.apache.flink.runtime.taskmanager.TaskManager              - Memory
> > usage stats: [HEAP: 131/981/981 MB, NON HEAP: 34/35/-1 MB
> > (used/committed/max)]
> > 2017-12-04 08:53:55,518 INFO
> > org.apache.flink.runtime.taskmanager.TaskManager              - Trying
> > to register at JobManager
> > akka.tcp://flink@flink-jobmanager:6123/user/jobmanager (attempt 1,
> > timeout: 500 milliseconds)
> > 2017-12-04 08:53:55,671 WARN
> > akka.serialization.Serialization(akka://flink)                - Using
> > the default Java serializer for class
> > [org.apache.flink.runtime.messages.JobManagerMessages$LeaderSessionMes
> > sage] which is not recommended because of performance implications.
> > Use another serializer or disable this warning using the setting
> > 'akka.actor.warn-about-java-serializer-usage'
> > 2017-12-04 08:53:55,737 INFO
> > org.apache.flink.runtime.taskmanager.TaskManager              -
> > Successful registration at JobManager
> > (akka.tcp://flink@flink-jobmanager:6123/user/jobmanager), starting
> > network stack and library cache.
> > 2017-12-04 08:53:55,742 INFO
> > org.apache.flink.runtime.taskmanager.TaskManager              -
> > Determined BLOB server address to be
> > flink-jobmanager/10.104.5.130:50000. Starting BLOB cache.
> > 2017-12-04 08:53:55,752 INFO
> > org.apache.flink.runtime.blob.PermanentBlobCache              -
> > Created BLOB cache storage directory
> > /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5
> > 2017-12-04 08:53:55,776 INFO
> > org.apache.flink.runtime.blob.TransientBlobCache              -
> > Created BLOB cache storage directory
> > /tmp/blobStore-8d7c8660-4455-43e9-93be-dadeb622b9e1
> > 2017-12-04 08:54:00,796 WARN
> > akka.serialization.Serialization(akka://flink)                - Using
> > the default Java serializer for class
> > [org.apache.flink.runtime.messages.TaskManagerMessages$Heartbeat]
> > which is not recommended because of performance implications. Use
> > another serializer or disable this warning using the setting
> > 'akka.actor.warn-about-java-serializer-usage'
> > 2017-12-04 09:10:38,726 WARN
> > akka.serialization.Serialization(akka://flink)                - Using
> > the default Java serializer for class
> > [org.apache.flink.runtime.metrics.dump.MetricDumpSerialization$MetricS
> > erializationResult] which is not recommended because of performance
> > implications. Use another serializer or disable this warning using the
> > setting 'akka.actor.warn-about-java-serializer-usage'
> > 2017-12-04 09:11:35,403 INFO
> > org.apache.flink.runtime.taskmanager.TaskManager              -
> > Received task Source: PACS008 Generator -> Double Submission Generator
> > -> Invalid IBAN Generator -> Sink: IP-PREPROCESSOR-INBOUND (1/1)
> > 2017-12-04 09:11:35,404 INFO
> > org.apache.flink.runtime.taskmanager.Task                     - Source:
> > PACS008 Generator -> Double Submission Generator -> Invalid IBAN
> > Generator -> Sink: IP-PREPROCESSOR-INBOUND (1/1)
> > (797de317bebf24df087f2da63cf5118e) switched from CREATED to DEPLOYING.
> > 2017-12-04 09:11:35,404 INFO
> > org.apache.flink.runtime.taskmanager.Task                     -
> > Creating FileSystem stream leak safety net for task Source: PACS008
> > Generator -> Double Submission Generator -> Invalid IBAN Generator ->
> > Sink: IP-PREPROCESSOR-INBOUND (1/1) (797de317bebf24df087f2da63cf5118e)
> > [DEPLOYING]
> > 2017-12-04 09:11:35,408 WARN
> > akka.serialization.Serialization(akka://flink)                - Using
> > the default Java serializer for class
> > [org.apache.flink.runtime.messages.Acknowledge] which is not
> > recommended because of performance implications. Use another
> > serializer or disable this warning using the setting
> > 'akka.actor.warn-about-java-serializer-usage'
> > 2017-12-04 09:11:35,410 INFO
> > org.apache.flink.runtime.taskmanager.Task                     -
> > Loading JAR files for task Source: PACS008 Generator -> Double
> > Submission Generator -> Invalid IBAN Generator -> Sink:
> > IP-PREPROCESSOR-INBOUND
> > (1/1) (797de317bebf24df087f2da63cf5118e) [DEPLOYING].
> > 2017-12-04 09:11:35,440 INFO
> > org.apache.flink.runtime.blob.BlobClient                      -
> > Downloading
> > 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120
> > d2b1b-392b55cacd4ed70b48e620f131a8bc61
> > from flink-jobmanager/10.104.5.130:50000
> > 2017-12-04 09:13:42,663 ERROR
> > org.apache.flink.runtime.blob.BlobClient                      - Failed
> > to fetch BLOB
> > 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120
> > d2b1b-392b55cacd4ed70b48e620f131a8bc61
> > from flink-jobmanager/10.104.5.130:50000 and store it under
> > /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5/incoming/temp-0000
> > 0000
> > Retrying...
> > 2017-12-04 09:13:42,663 INFO
> > org.apache.flink.runtime.blob.BlobClient                      -
> > Downloading
> > 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120
> > d2b1b-392b55cacd4ed70b48e620f131a8bc61
> > from flink-jobmanager/10.104.5.130:50000 (retry 1)
> > 2017-12-04 09:15:49,895 ERROR
> > org.apache.flink.runtime.blob.BlobClient                      - Failed
> > to fetch BLOB
> > 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120
> > d2b1b-392b55cacd4ed70b48e620f131a8bc61
> > from flink-jobmanager/10.104.5.130:50000 and store it under
> > /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5/incoming/temp-0000
> > 0000
> > Retrying...
> > 2017-12-04 09:15:49,895 INFO
> > org.apache.flink.runtime.blob.BlobClient                      -
> > Downloading
> > 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120
> > d2b1b-392b55cacd4ed70b48e620f131a8bc61
> > from flink-jobmanager/10.104.5.130:50000 (retry 2)
> > 2017-12-04 09:17:57,127 ERROR
> > org.apache.flink.runtime.blob.BlobClient                      - Failed
> > to fetch BLOB
> > 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120
> > d2b1b-392b55cacd4ed70b48e620f131a8bc61
> > from flink-jobmanager/10.104.5.130:50000 and store it under
> > /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5/incoming/temp-0000
> > 0000
> > Retrying...
> > 2017-12-04 09:17:57,127 INFO
> > org.apache.flink.runtime.blob.BlobClient                      -
> > Downloading
> > 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120
> > d2b1b-392b55cacd4ed70b48e620f131a8bc61
> > from flink-jobmanager/10.104.5.130:50000 (retry 3)
> > 2017-12-04 09:20:04,359 ERROR
> > org.apache.flink.runtime.blob.BlobClient                      - Failed
> > to fetch BLOB
> > 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120
> > d2b1b-392b55cacd4ed70b48e620f131a8bc61
> > from flink-jobmanager/10.104.5.130:50000 and store it under
> > /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5/incoming/temp-0000
> > 0000
> > Retrying...
> > 2017-12-04 09:20:04,359 INFO
> > org.apache.flink.runtime.blob.BlobClient                      -
> > Downloading
> > 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120
> > d2b1b-392b55cacd4ed70b48e620f131a8bc61
> > from flink-jobmanager/10.104.5.130:50000 (retry 4)
> > 2017-12-04 09:22:11,590 ERROR
> > org.apache.flink.runtime.blob.BlobClient                      - Failed
> > to fetch BLOB
> > 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120
> > d2b1b-392b55cacd4ed70b48e620f131a8bc61
> > from flink-jobmanager/10.104.5.130:50000 and store it under
> > /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5/incoming/temp-0000
> > 0000
> > Retrying...
> > 2017-12-04 09:22:11,591 INFO
> > org.apache.flink.runtime.blob.BlobClient                      -
> > Downloading
> > 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120
> > d2b1b-392b55cacd4ed70b48e620f131a8bc61
> > from flink-jobmanager/10.104.5.130:50000 (retry 5)
> > 2017-12-04 09:24:18,823 ERROR
> > org.apache.flink.runtime.blob.BlobClient                      - Failed
> > to fetch BLOB
> > 27c97ae389eba8e2a4ec988cac86848a/p-756ea9bdf6f93f80dbe8b923ce288523120
> > d2b1b-392b55cacd4ed70b48e620f131a8bc61
> > from flink-jobmanager/10.104.5.130:50000 and store it under
> > /tmp/blobStore-39a9ec2f-32ff-4a31-a1e5-03294cd3cea5/incoming/temp-0000
> > 0000
> > No retries left.
> > java.io.IOException: Could not connect to BlobServer at address
> > flink-jobmanager/10.104.5.130:50000
> >
> >

Re: AW: Blob server not working with 1.4.0.RC2

Posted by Nico Kruber <ni...@data-artisans.com>.
Hi Bernd,
at least from our side I don't see a change in the default BlobServer ports 
between 1.3 and 1.4 - without configuration, the OS chooses the port.
If you want to influence the range it is chosen from (or want to fix a 
specific port), you need to set the blob.server.port configuration parameter 
in Flink's flink-conf.yaml.
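
As a rough illustration of the setting described above (a sketch, not a recommended configuration): the key blob.server.port comes from this thread, while the concrete values are assumptions. Port 6124 is used only because it is the port mentioned elsewhere in this thread as exposed by the Kubernetes configuration, and the range is an arbitrary example.

```yaml
# flink-conf.yaml (fragment)
# Option 1: pin the BLOB server to one fixed port (value is an assumption).
blob.server.port: 6124

# Option 2: let the port be picked from a range instead (arbitrary example range).
# blob.server.port: 50100-50200
```

Whichever value is used, the same port has to be reachable from the TaskManagers, for example through the container or firewall configuration.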


Regards
Nico

On Wednesday, 6 December 2017 08:45:19 CET Bernd.Winterstein@Dev.Helaba.de 
wrote:
> Hi Nico
> I think there were changes in the default port for the BLOB server. I
> missed the fact that the Kubernetes configuration was still exposing 6124
> for the JobManager BLOB server. Thanks
> 
> Bernd
> 
> -----Original Message-----
> From: Nico Kruber [mailto:nico@data-artisans.com]
> Sent: Monday, 4 December 2017 14:17
> To: Winterstein, Bernd; user@flink.apache.org
> Subject: Re: Blob server not working with 1.4.0.RC2
> 
> Hi Bernd,
> thanks for the report. I tried to reproduce it locally but both a telnet
> connection to the BlobServer as well as the BLOB download by the
> TaskManagers work for me. Can you share your configuration that is causing
> the problem? You could also try increasing the log level to DEBUG and see
> if there is something more in the logs (the exception thrown in
> StaticFileServerHandler looks suspicious but is not related to the
> BlobServer). Apparently, the TaskManager resolves flink-jobmanager to
> 10.104.5.130. Is that the correct address, and can the TaskManager reach
> this IP? (Could a firewall be blocking it?)
> 
> Did you, by any chance, set up SSL, too? There was a recent thread on the
> mailing list [1] where a user had some problems with
> "security.ssl.verify-hostname" being set to true, which may be related.
> 
> 
> Nico
> 
> [1]
> https://lists.apache.org/thread.html/879d072bfd6761947b4bd703324489db50e8b14c328992118af875d8@%3Cuser.flink.apache.org%3E


AW: Blob server not working with 1.4.0.RC2

Posted by Be...@Dev.Helaba.de.
Hi Nico
I think there were changes in the default port for the BLOB server. I missed the fact that the Kubernetes configuration was still exposing 6124 for the JobManager BLOB server.
Thanks

Bernd
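
A minimal sketch of the mismatch described above, assuming the JobManager is reached through a Kubernetes Service. The Service name is taken from the host name in the logs; the labels and port names are hypothetical. The point is only that the exposed BLOB port has to be the one the BLOB server actually listens on (50000 in the logs of this thread), not a stale 6124.

```yaml
# Kubernetes Service for the JobManager (illustrative fragment)
apiVersion: v1
kind: Service
metadata:
  name: flink-jobmanager          # host name seen in the logs
spec:
  selector:
    app: flink                    # hypothetical labels
    component: jobmanager
  ports:
    - name: rpc
      port: 6123                  # JobManager RPC port from the logs
    - name: blob
      port: 50000                 # must match the port the BLOB server binds to
```

The alternative is to keep the exposed port unchanged and set blob.server.port in flink-conf.yaml to that value.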

-----Original Message-----
From: Nico Kruber [mailto:nico@data-artisans.com]
Sent: Monday, 4 December 2017 14:17
To: Winterstein, Bernd; user@flink.apache.org
Subject: Re: Blob server not working with 1.4.0.RC2

Hi Bernd,
thanks for the report. I tried to reproduce it locally but both a telnet connection to the BlobServer as well as the BLOB download by the TaskManagers work for me. Can you share your configuration that is causing the problem? You could also try increasing the log level to DEBUG and see if there is something more in the logs (the exception thrown in StaticFileServerHandler looks suspicious but is not related to the BlobServer).
Apparently, the TaskManager resolves flink-jobmanager to 10.104.5.130.
Is that the correct address, and can the TaskManager talk to this IP?
(Could a firewall be blocking this?)
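
If telnet is not available inside the TaskManager container, a rough equivalent of the connectivity check can be sketched in Python. This is only an illustration: the helper name `can_connect` and the 3-second timeout are my own choices, not anything Flink provides.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host name and opens a TCP connection;
        # resolution errors, refusals and timeouts are all subclasses of OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `can_connect("flink-jobmanager", 6123)` and `can_connect("flink-jobmanager", 50000)` from a TaskManager would show whether the RPC port is reachable while the BLOB port is not, which is what the logs in this thread suggest.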

Did you, by any chance, set up SSL, too? There was a recent thread on the mailing list [1] where a user had some problems with "security.ssl.verify-hostname" being set to true, which may be related.


Nico

[1]
https://lists.apache.org/thread.html/879d072bfd6761947b4bd703324489db50e8b14c328992118af875d8@%3Cuser.flink.apache.org%3E

On 04/12/17 10:03, Bernd.Winterstein@Dev.Helaba.de wrote:
> Hi
> Since we switched to Release 1.4 the taskmanagers are unable to
> download blobs from the jobmanager.
> The taskmanager registration still works.
> Netstat on jobmanager shows open ports at 6123 and 50000. But a telnet
> connection from taskmanager to jobmanager on port 50000 times out.
>
> Any ideas are welcome.
>
> Regards
>
> Bernd

