Posted to dev@mesos.apache.org by "brian wickman (JIRA)" <ji...@apache.org> on 2014/04/30 19:44:18 UTC

[jira] [Commented] (MESOS-857) restructure mesos python namespace

    [ https://issues.apache.org/jira/browse/MESOS-857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13985801#comment-13985801 ] 

brian wickman commented on MESOS-857:
-------------------------------------

I propose that we restructure the mesos python project.  Right now it's fractured haphazardly, yet there are idioms made available by the python packaging ecosystem to do this correctly.

For example, src/cli is a mishmash of C++ and python, and it contains a redeclaration of 'mesos' in unpackaged form that would conflict with the existing code in src/python.  Meanwhile, src/python bundles mesos_pb2.py, mesos.py and _mesos.so into the top-level namespace.  Ordinarily if you 'pip install baz', you expect one top-level package name with everything residing underneath it, e.g. 'import baz' with baz.foo, baz.bar, baz.bak subpackages.

We should structure the mesos namespace such that bits and pieces of mesos can be installed a la carte.  Right now you have to go all-in, bringing in C extensions (which are challenging to build and have no pure source distribution available yet), and that is a hindrance to adoption.

It seems reasonable that I might just want the API stubs, or the code-generated protobuf classes, or just the CLI.  We can do this in a few ways, but it means splitting everything into different packages with dependencies between each (codified by "install_requires" in setup.py).  The following proposal uses a top-level 'mesos' namespace package, but it could be done with separate top-level packages, e.g. mesos_api, mesos_driver, instead of mesos.api or mesos.driver.

I propose the following packages (which would also mirror the import namespace):

{noformat}
  mesos [nspkg]
  mesos.api [pkg]
  mesos.cli [pkg]
  mesos.driver [pkg]
  mesos.native [pkg]
  mesos.protocol [pkg]
{noformat}

mesos should be a namespace package: it contains no symbols and, in and of itself, no sources.  By default, though, the 'mesos' distribution would have install_requires on everything provided within the mesos project, so that 'pip install mesos' does approximately the correct thing.
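
A minimal sketch of how that metadata might be laid out, assuming pkg_resources-style namespace packages.  The helper names, the version string and the choice to pin every subpackage to the same version are illustrative assumptions, not part of the proposal:

```python
# Sketch only: builds the keyword arguments a setup.py could pass to
# setuptools.setup().  VERSION and both helper functions are hypothetical.
VERSION = "0.0.0"  # placeholder, not a real release

SUBPACKAGES = [
    "mesos.api",
    "mesos.cli",
    "mesos.driver",
    "mesos.native",
    "mesos.protocol",
]


def meta_setup_kwargs(version=VERSION):
    """Metadata for the source-free 'mesos' distribution."""
    return dict(
        name="mesos",
        version=version,
        packages=[],  # ships no code of its own
        install_requires=["%s==%s" % (name, version) for name in SUBPACKAGES],
    )


def subpackage_setup_kwargs(name, version=VERSION, install_requires=()):
    """Metadata for one distribution living under the 'mesos' namespace."""
    return dict(
        name=name,
        version=version,
        # Each distribution ships a mesos/__init__.py containing only:
        #   __import__("pkg_resources").declare_namespace(__name__)
        namespace_packages=["mesos"],
        packages=["mesos", name],
        install_requires=list(install_requires),
    )

# In a real setup.py this would end with something like:
#   from setuptools import setup
#   setup(**meta_setup_kwargs())
```

With this layout 'pip install mesos' pulls in everything, while 'pip install mesos.api' would bring in only the stubs.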

mesos.api should contain just the Scheduler, SchedulerDriver, Executor, ExecutorDriver (and in the future, possibly Log, LogDriver, Containerizer, ContainerizerDriver) stubs.  It has no dependencies on anything else.

mesos.cli should contain all the CLI commands.  It also shouldn't need to depend on any other packages, except maybe mesos.protocol.  We can use the console_scripts entry point in mesos.cli to handle script installation (see http://www.scotttorborg.com/python-packaging/command-line-scripts.html#the-console-scripts-entry-point ).  This means 'pip install mesos.cli' would create wrapper scripts for mesos-cat, mesos-ps, etc., that invoke the underlying python modules with all the dependencies set up correctly, and would put them on the $PATH in the same place as your python interpreter.
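
As a rough illustration, the console_scripts declaration could be generated like this.  The module layout (mesos.cli.cat, mesos.cli.ps) and the 'main' function name are assumptions; only the mesos-cat and mesos-ps script names come from the existing CLI:

```python
# Sketch only: the module paths and the "main" callable are hypothetical.
CLI_COMMANDS = ["cat", "ps"]


def console_scripts(commands):
    """Build 'script-name = module:function' specs for setuptools."""
    return ["mesos-%s = mesos.cli.%s:main" % (cmd, cmd) for cmd in commands]


def cli_setup_kwargs():
    """Keyword arguments that mesos.cli's setup.py could pass to setup()."""
    return dict(
        name="mesos.cli",
        namespace_packages=["mesos"],
        packages=["mesos", "mesos.cli"],
        entry_points={"console_scripts": console_scripts(CLI_COMMANDS)},
    )
```

At install time setuptools turns each spec into a wrapper script next to the interpreter, so nothing has to be copied onto the $PATH by hand.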

mesos.driver should be a small wrapper package around pkg_resources (e.g. iter_entry_points or get_entry_map), used to detect any python packages in the environment that export concrete driver implementations (e.g. _mesos.MesosSchedulerDriver or _mesos.MesosExecutorDriver).  This would be done via EntryPoints (see https://pythonhosted.org/setuptools/pkg_resources.html#entry-points )
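
A minimal sketch of the discovery side.  It uses the standard library's importlib.metadata rather than pkg_resources (a substitution on my part; the entry-point concept is the same), and the group name "mesos.driver" and both function names are assumptions:

```python
from importlib import metadata

# Hypothetical entry-point group; the proposal does not fix a name.
DRIVER_GROUP = "mesos.driver"


def find_drivers(group=DRIVER_GROUP):
    """Return {entry point name: EntryPoint} advertised under `group`."""
    eps = metadata.entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: selectable collection of entry points
        selected = eps.select(group=group)
    else:
        # Python 3.8 / 3.9: plain dict mapping group -> entry points
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


def load_driver(name, group=DRIVER_GROUP):
    """Import and return the driver class registered under `name`."""
    drivers = find_drivers(group)
    if name not in drivers:
        raise LookupError("no %r driver registered under %r" % (name, group))
    return drivers[name].load()
```

Nothing is imported until load_driver is called, so an environment with no concrete driver installed can still import mesos.driver.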

mesos.native should be the package that contains _mesos.so and, in its setup.py, the entry_point metadata expected by mesos.driver.  We could even go so far as to publish mesos.native.el5 or mesos.native.el6 binary wheels to PyPI in order to differentiate linux ABIs, and still have them correctly detected and picked up by mesos.driver at runtime.  This strategy is also compatible with the pesos project (https://github.com/wickman/pesos ), which would just publish PesosSchedulerDriver and PesosExecutorDriver entry points for mesos.driver, allowing a pure python scheduler or executor to be implemented.
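
For illustration, the entry_point metadata published by mesos.native and by pesos might look like the following.  The group name, the entry-point names and the module paths ("_mesos", "pesos") are assumptions; only the class names come from the text above:

```python
# Sketch only: group, entry-point names and module paths are hypothetical.
DRIVER_GROUP = "mesos.driver"


def driver_entry_points(impl, module, scheduler_cls, executor_cls):
    """Build the entry_points mapping for one driver implementation."""
    return {
        DRIVER_GROUP: [
            "%s_scheduler = %s:%s" % (impl, module, scheduler_cls),
            "%s_executor = %s:%s" % (impl, module, executor_cls),
        ]
    }


# What mesos.native's setup.py could pass as entry_points=...
NATIVE = driver_entry_points(
    "native", "_mesos", "MesosSchedulerDriver", "MesosExecutorDriver")

# What a pure python implementation such as pesos could pass instead.
PESOS = driver_entry_points(
    "pesos", "pesos", "PesosSchedulerDriver", "PesosExecutorDriver")
```

Because both distributions register under the same group, mesos.driver can enumerate whichever implementations happen to be installed without importing any of them up front.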

Finally, mesos.protocol would be the package containing all of the code-generated protobuf stubs.  We could even split mesos.protocol out as a namespace package with separate subpackages mesos.protocol.pb and mesos.protocol.json.  Currently protobuf only supports python 2.x (there are some branches out there with support for 3.x, but afaik there is no plan for those to reach master).  mesos.protocol.pb would have an install_requires on protobuf, while mesos.protocol.json would be dependency-free and hence friendly with python 3.x.  Ideally there would be helper messages for constructing the body of libprocess messages (the "wire protocol").  In the future that could be ported over to the Event/Call interface that Ben has described.

In order to support legacy applications, we could have a mesos.legacy package, which would map all the above names into their _mesos, mesos_pb2 and mesos.* counterparts.
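
One possible way to build such a shim is to alias the old module names in sys.modules.  This is only a sketch: the mapping below (which new module stands in for which legacy name) is a guess, and install_aliases is a hypothetical helper:

```python
import importlib
import sys

# Hypothetical mapping of legacy import names to restructured modules.
LEGACY_ALIASES = {
    "mesos_pb2": "mesos.protocol.pb",
    "_mesos": "mesos.native",
}


def install_aliases(aliases=LEGACY_ALIASES):
    """Register each importable new module under its legacy name.

    Returns the list of legacy names that were aliased; targets that are
    not installed are skipped rather than treated as errors.
    """
    installed = []
    for legacy, new in aliases.items():
        try:
            module = importlib.import_module(new)
        except ImportError:
            continue  # that piece was not installed a la carte
        # Do not clobber a genuinely installed legacy module.
        sys.modules.setdefault(legacy, module)
        installed.append(legacy)
    return installed
```

After install_aliases() runs, a legacy 'import mesos_pb2' resolves to whatever new module was registered, because the import system consults sys.modules first.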

> restructure mesos python namespace
> ----------------------------------
>
>                 Key: MESOS-857
>                 URL: https://issues.apache.org/jira/browse/MESOS-857
>             Project: Mesos
>          Issue Type: Improvement
>          Components: python api
>            Reporter: brian wickman
>
> Right now the mesos_pb2 and mesos dependencies are bundled together into the mesos egg. We have some tooling that uses just the compiled protobufs, but because they're lumped together with the mesos egg, we get all the dependency/platform nightmare that comes along with it, not to mention the bloat of including 20MB of .so files.  This proposes splitting the mesos protobufs into a separate mesos_pb distribution that the mesos distribution should depend upon via install_requires (e.g. "mesos_pb==0.15.0-rc4")


