Posted to dev@toree.apache.org by "Jim Rhyness (JIRA)" <ji...@apache.org> on 2017/03/09 12:51:37 UTC

[jira] [Created] (TOREE-391) Messages to Jupyter kernel gateway are dropped in jeromq

Jim Rhyness created TOREE-391:
---------------------------------

             Summary: Messages to Jupyter kernel gateway are dropped in jeromq
                 Key: TOREE-391
                 URL: https://issues.apache.org/jira/browse/TOREE-391
             Project: TOREE
          Issue Type: Bug
    Affects Versions: 0.1.0
         Environment: Linux ( RHEL 7.3 )
            Reporter: Jim Rhyness


Kernel restart from Jupyter kernel gateway fails with a timeout.  The kernel is restarted, but the kernel gateway times out waiting for the kernel_info_reply message that it expects in response to the kernel_info_request it sends after initiating the restart.

The problem is reproducible most of the time with something like this:

curl -v -X POST --data '{ "name":"apache_toree_scala" }'  http://127.0.0.1:8888/api/kernels
curl -v -X POST --data '{}'  http://127.0.0.1:8888/api/kernels/<kernelid-from-above>/restart


From the IPython message protocol doc, this is the message format:

[
  b'u-u-i-d',         # zmq identity(ies)
  b'<IDS|MSG>',       # delimiter
  b'baddad42',        # HMAC signature
  b'{header}',        # serialized header dict
  b'{parent_header}', # serialized parent header dict
  b'{metadata}',      # serialized metadata dict
  b'{content}',       # serialized content dict
  b'blob',            # extra raw data buffer(s)
  ...
]
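
As an illustration of the layout above, here is a minimal Python sketch that splits a multipart message at the delimiter and keeps the identity frames as raw bytes. The function name split_message and the sample frames are hypothetical and are not taken from Toree or the kernel gateway; only the frame order comes from the protocol description.

```python
import json

DELIMITER = b"<IDS|MSG>"

def split_message(frames):
    """Split a multipart message into raw identities and decoded JSON parts."""
    boundary = frames.index(DELIMITER)
    identities = frames[:boundary]   # opaque routing ids: keep as bytes
    signature = frames[boundary + 1]
    # The four dict frames are JSON text, so decoding them as UTF-8 is safe.
    header, parent_header, metadata, content = (
        json.loads(frame.decode("utf-8"))
        for frame in frames[boundary + 2 : boundary + 6]
    )
    buffers = frames[boundary + 6 :]  # extra raw data, also left as bytes
    return identities, signature, header, parent_header, metadata, content, buffers

# Hypothetical sample message with a five-byte binary identity.
frames = [
    b"\x00\x8f\x1c\x9a\x41",
    DELIMITER,
    b"baddad42",
    b'{"msg_type": "kernel_info_request"}',
    b"{}",
    b"{}",
    b"{}",
]
identities, signature, header, *_ = split_message(frames)
```

Only the frames after the delimiter are decoded as text; everything before it is passed through untouched.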

The first frame of the message contains the zmq identities.  In some cases, on a Router-type socket, jeromq generates these identities itself; each then consists of five bytes: a zero byte followed by a random int.

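In Toree, all frames are treated as Strings.  Decoding the identity frame as UTF-8 corrupts the zmq id: every byte that is not valid UTF-8 is replaced by the replacement character U+FFFD, which is encoded as the bytes 0xEFBFBD when the String is converted back.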
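
The effect can be reproduced outside Toree. The following Python sketch is an analogy only, not Toree's Scala code, and the identity value is made up to match the shape described here (a zero byte followed by four arbitrary bytes). It round-trips the identity through a UTF-8 string with replacement of invalid bytes.

```python
# Hypothetical five-byte identity: a zero byte followed by four arbitrary bytes.
identity = b"\x00\x8f\x1c\x9a\x41"

# Treat the frame as text: invalid bytes become U+FFFD during decoding.
as_string = identity.decode("utf-8", errors="replace")

# Convert the string back to bytes, as happens when the id is reused in a reply.
round_tripped = as_string.encode("utf-8")

print(identity.hex())       # original id
print(round_tripped.hex())  # id after the String round trip
```

The two bytes 0x8f and 0x9a are not valid UTF-8 on their own, so each one comes back as the three bytes 0xEFBFBD, and the id no longer matches the original.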

When the corrupted id is used in a message sent to the Router socket, the peer to send the message to is not found and the message is dropped.
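
A toy model of this lookup, in Python, shows why the message disappears silently. It is not jeromq's implementation: the routing table, the route function and the identity value are all hypothetical, and the table is simply a dict keyed by the peer's identity bytes.

```python
# Hypothetical identity and its corrupted form after a UTF-8 String round trip.
identity = b"\x00\x8f\x1c\x9a\x41"
corrupted = identity.decode("utf-8", errors="replace").encode("utf-8")

# Toy routing table: peer identity -> list of messages queued for that peer.
peers = {identity: []}

def route(peers, frames):
    """Queue frames[1:] for the peer named by frames[0]; return False if dropped."""
    queue = peers.get(frames[0])
    if queue is None:
        return False  # unknown peer: the message is dropped without an error
    queue.append(frames[1:])
    return True

delivered_ok = route(peers, [identity, b"<IDS|MSG>", b"reply"])
delivered_bad = route(peers, [corrupted, b"<IDS|MSG>", b"reply"])
```

With the original id the reply is queued; with the corrupted id no peer matches and nothing is delivered.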

This affects other messages as well, not just kernel_info_reply.



