Posted to dev@metron.apache.org by Michael Miklavcic <mi...@gmail.com> on 2017/09/28 02:03:00 UTC

[DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

I am working on upgrading Elasticsearch and Kibana. There are quite a few
changes involved with this fix. I believe I'm mostly finished with the
Ambari mpack side of things; however, we currently only support one version
with no backwards compatibility. What are the community's thoughts on this?

Here is some work contributed to the community that I'm referencing while
working on this upgrade - https://github.com/apache/metron/pull/619/files

Best,
Michael Miklavcic

Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by Michael Miklavcic <mi...@gmail.com>.
Agreed, Matt. I do think Jira is probably the most sensible place at this
point. I typically link my PRs back to the Jira in the main comments to
make access easy and fast anyhow.

On Thu, Oct 12, 2017 at 10:50 AM, Matt Foley <mf...@hortonworks.com> wrote:

> Github also allows file attachments, via comments on PRs.  Not necessarily
> intuitive for a “Discussion”.  There would be more sensible places to put
> files for discussion in github, such as “wiki” or “issues”, but those
> aren’t enabled on apache projects in github.

Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by Matt Foley <mf...@hortonworks.com>.
Github also allows file attachments, via comments on PRs.  Not necessarily intuitive for a “Discussion”.  There would be more sensible places to put files for discussion in github, such as “wiki” or “issues”, but those aren’t enabled on apache projects in github.

On 10/11/17, 10:39 PM, "Michael Miklavcic" <mi...@gmail.com> wrote:

    We've generally preferred communication workflows via Github and the
    mailing list rather than Jira for most things on this project, but you're
    right that we could probably leverage it for sharing attachments to the dev
    list.
    
    On Wed, Oct 11, 2017 at 9:54 PM, Matt Foley <mf...@hortonworks.com> wrote:
    
    > You can avoid the permission issues by attaching it to an Apache jira.
    >
    > On 10/11/17, 6:10 PM, "James Sirota" <js...@apache.org> wrote:
    >
    >     I can't see it.  You probably want to link to a google drive
    >
    >     11.10.2017, 18:01, "Michael Miklavcic" <mi...@gmail.com>:
    >     > I attached a PDF - shows up on my end. Is that not coming through?
    >     >
    >     > On Wed, Oct 11, 2017 at 6:42 PM, Otto Fowler <ottobackwards@gmail.com>
    >     > wrote:
    >     >
    >     >>  I think there is a missing attachment?
    >     >>
    >     >>  On October 11, 2017 at 20:22:33, Michael Miklavcic (
    >     >>  michael.miklavcic@gmail.com) wrote:
    >     >>
    >     >>  For community reference, here is a class diagram that depicts our
    >     >>  current Metron 0.4.1 dependencies, for both prod and test code,
    >     >>  against the old ES client APIs along with an "after" diagram
    >     >>  showing the world with the new client. Feedback welcome.
    >     >>
    >     >>  On Fri, Oct 6, 2017 at 8:13 AM, Casey Stella <ce...@gmail.com>
    >     >>  wrote:
    >     >>
    >     >>>  Yeah, I agree with what Michael "fine whine" Miklavcic said; I'm
    >     >>>  in favor of the high level client.
    >     >>>
    >     >>>  On Thu, Oct 5, 2017 at 3:35 PM, Michael Miklavcic <
    >     >>>  michael.miklavcic@gmail.com> wrote:
    >     >>>
    >     >>>  > Justin, thanks for the feedback! I'm inclined to agree with you
    >     >>>  > about using the high level client. It's a bummer that we still
    >     >>>  > need to do jar shading, but I think that's a reasonable short
    >     >>>  > term sacrifice considering the other benefits. And they're
    >     >>>  > angling towards slowly removing the ES core dep over time anyhow
    >     >>>  > so, like myself, this will get better with age.
    >     >>>  >
    >     >>>  > On Thu, Oct 5, 2017 at 12:40 PM, Justin Leet <justinjleet@gmail.com>
    >     >>>  > wrote:
    >     >>>  >
    >     >>>  > > Do we intend on (or have interest in) supporting ES across major
    >     >>>  > > versions for a given version of Metron? I'm not convinced it's
    >     >>>  > > worth the work of using the low level client.
    >     >>>  > >
    >     >>>  > > This really only seems useful for ES clusters that are being used
    >     >>>  > > outside Metron and need to be on a different ES major version. Is
    >     >>>  > > that a use case we want/need to support? I'm willing to bet it's
    >     >>>  > > significantly more work and means we're modifying queries and
    >     >>>  > > even templates/mappings based on what ES version we're
    >     >>>  > > interacting with (e.g. meta alerts in 5.x can exploit a query
    >     >>>  > > param to not screw around with the mapping, but that param
    >     >>>  > > doesn't exist in 2.x). At that point, we're either back to
    >     >>>  > > writing for ES 2.x or writing for every version of ES.
    >     >>>  > >
    >     >>>  > > Unless that's something we have a demand for (or someone else
    >     >>>  > > persuades me otherwise), I'm in favor of using the high level
    >     >>>  > > client. It seems like it'd be easier to migrate to also, given
    >     >>>  > > the similarities API-wise to the current client we're using.
    >     >>>  > >
    >     >>>  > > On Thu, Oct 5, 2017 at 1:52 PM, Michael Miklavcic <
    >     >>>  > > michael.miklavcic@gmail.com> wrote:
    >     >>>  > >
    >     >>>  > > > I think it might help the discussion to share my impressions
    >     >>>  > > > of looking over the new API recommendations from ES. I've
    >     >>>  > > > summarized some info provided by ES back in December 2016
    >     >>>  > > > regarding the reasons for switching to a new client model. [1]
    >     >>>  > > >
    >     >>>  > > > *Summary points:*
    >     >>>  > > >
    >     >>>  > > > Pre-5.x had Java API - binary exchange format used for
    >     >>>  > > > node-to-node communications.
    >     >>>  > > > In 5.x a low level REST API was added. Now there's also a high
    >     >>>  > > > level REST client that handles request marshalling and response
    >     >>>  > > > un-marshalling.
    >     >>>  > > >
    >     >>>  > > > *Benefits of existing Java API*
    >     >>>  > > >
    >     >>>  > > > 1. Theoretically faster - binary format, no JSON parsing
    >     >>>  > > > 2. Hardened, used for internal ES node to node communications
    >     >>>  > > >
    >     >>>  > > > *Cons of Java API*
    >     >>>  > > >
    >     >>>  > > > 1. Benchmarks show it's not really that much faster.
    >     >>>  > > > 2. Backwards compatibility - Java API changes often.
    >     >>>  > > > 3. Upgrades more challenging - need to refactor client code for
    >     >>>  > > >    new and deprecated features.
    >     >>>  > > > 4. Minor releases may contain breaking changes in the Java API
    >     >>>  > > > 5. Client and server *should* be on same JVM version (not as
    >     >>>  > > >    important post 2.x, but still potentially necessary bc of
    >     >>>  > > >    serialization w/binary format)
    >     >>>  > > > 6. Requires dependency on the entire elasticsearch server in
    >     >>>  > > >    order to use the client. We end up shading jars.
    >     >>>  > > >
    >     >>>  > > > *Benefits of new REST API*
    >     >>>  > > >
    >     >>>  > > > 1. Upgrades
    >     >>>  > > >    1. Breaking changes only made in major releases - "We are
    >     >>>  > > >       very careful with backwards compatibility on the REST
    >     >>>  > > >       layer where breaking changes are made only in major
    >     >>>  > > >       releases."
    >     >>>  > > >    2. "The REST interface is much more stable and can be
    >     >>>  > > >       upgraded out of step with the Elasticsearch cluster."
    >     >>>  > > > 2. REST client and server can be on different JVM's
    >     >>>  > > > 3. Dependencies for the low level client are very slim. No need
    >     >>>  > > >    for shading.
    >     >>>  > > > 4. The RestHighLevelClient supports the same request and
    >     >>>  > > >    response objects as the TransportClient
    >     >>>  > > > 5. Can be secured via HTTPS
    >     >>>  > > >
    >     >>>  > > > There are some additional benefits to the new API, however they
    >     >>>  > > > depend on whether we choose to go with the high or low level
    >     >>>  > > > client. More comments below.
    >     >>>  > > >
    >     >>>  > > > *Cons of new API*
    >     >>>  > > >
    >     >>>  > > > 1. Dependencies - The high level client still requires the full
    >     >>>  > > >    ES dependency, though this will slim down in future releases.
    >     >>>  > > >
    >     >>>  > > > *Other comments specific to Metron*
    >     >>>  > > >
    >     >>>  > > > There's a question of whether we should use the low or high
    >     >>>  > > > level REST client. The main differences between the two are how
    >     >>>  > > > they handle lib dependencies and marshaling/unmarshaling. The
    >     >>>  > > > low level client cleans up the dependencies dramatically,
    >     >>>  > > > whereas the high level client still requires you to depend on
    >     >>>  > > > elasticsearch core. On the other hand, the low level client
    >     >>>  > > > does no work to handle marshaling/unmarshaling the
    >     >>>  > > > requests/responses from the HTTP calls while the high level
    >     >>>  > > > client handles this for you and exposes api-specific methods.
    >     >>>  > > > The high level client accepts the same request arguments as the
    >     >>>  > > > TransportClient and returns the same response objects. One more
    >     >>>  > > > thing to note is that the low level client claims to be
    >     >>>  > > > compatible with all versions of ES whereas the high level
    >     >>>  > > > client appears to be only major version compatible.
    >     >>>  > > >
    >     >>>  > > > "The 5.6 client can communicate with any 5.6.x Elasticsearch
    >     >>>  > > > node. Previous 5.x minor versions like 5.5.x, 5.4.x etc. are
    >     >>>  > > > not (fully) supported." [2]
    >     >>>  > > >
    >     >>>  > > > Just as an example, here's a simple comparison of an index
    >     >>>  > > > request in the low and high level API's.
    >     >>>  > > >
    >     >>>  > > > *Low Level*
    >     >>>  > > >
    >     >>>  > > > Map<String, String> params = Collections.emptyMap();
    >     >>>  > > > String jsonString = "{" +
    >     >>>  > > >     "\"user\":\"kimchy\"," +
    >     >>>  > > >     "\"postDate\":\"2013-01-30\"," +
    >     >>>  > > >     "\"message\":\"trying out Elasticsearch\"" +
    >     >>>  > > >     "}";
    >     >>>  > > > HttpEntity entity = new NStringEntity(jsonString,
    >     >>>  > > >     ContentType.APPLICATION_JSON);
    >     >>>  > > > Response response = restClient.performRequest("PUT",
    >     >>>  > > >     "/posts/doc/1", params, entity);
    >     >>>  > > >
    >     >>>  > > > *High Level*
    >     >>>  > > >
    >     >>>  > > > IndexRequest indexRequest = new IndexRequest("posts", "doc", "1")
    >     >>>  > > >     .source("user", "kimchy",
    >     >>>  > > >             "postDate", new Date(),
    >     >>>  > > >             "message", "trying out Elasticsearch");
    >     >>>  > > >
    >     >>>  > > > *Note*: there are a few ways to do this with the high level API,
    >     >>>  > > > but this was the most concise for me to offer a comparison of
    >     >>>  > > > benefits over the low level API.
    >     >>>  > > >
    >     >>>  > > > *Thoughts/Recommendations*: I do think we should migrate to the
    >     >>>  > > > new API. I think the question is which of the new APIs we
    >     >>>  > > > should use. The high level client seems to shield us from
    >     >>>  > > > having to deal with constructing special JSON handling code,
    >     >>>  > > > whereas the low level client handles all versions of ES. I
    >     >>>  > > > don't have a good feel (yet) for just how much work it would
    >     >>>  > > > require to use the low level API, or how difficult it would be
    >     >>>  > > > to add new request features in the future. Actually, we could
    >     >>>  > > > probably leverage existing code we have for dealing with JSON
    >     >>>  > > > maps, so this might be really easy. Someone with more
    >     >>>  > > > experience in Metron's ES client use might have a better idea
    >     >>>  > > > of the pros and cons to this. The high level client appears to
    >     >>>  > > > handle all JSON manipulation for us, but we lose the benefit of
    >     >>>  > > > a simpler dependency tree and support for all versions of ES.
    >     >>>  > > > My only concern with "supports all versions" is that I have to
    >     >>>  > > > imagine there are specific calls that we'd have to be careful
    >     >>>  > > > of when constructing the JSON requests, so it's unclear to me
    >     >>>  > > > if this is better or worse in the end.
    >     >>>  > > >
    >     >>>  > > > Best,
    >     >>>  > > > Mike
    >     >>>  > > >
    >     >>>  > > >
    >     >>>  > > > 1. https://www.elastic.co/blog/state-of-the-official-elasticsearch-java-clients
    >     >>>  > > > 2. https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-compatibility.html
    >     >>>  > > >
    >     >>>  > > >
    >     >>>  > > >
    >     >>>  > > >
    >     >>>  > > > On Wed, Sep 27, 2017 at 8:03 PM, Michael Miklavcic <
    >     >>>  > > > michael.miklavcic@gmail.com> wrote:
    >     >>>  > > >
    >     >>>  > > > > I am working on upgrading Elasticsearch and Kibana. There are
    >     >>>  > > > > quite a few changes involved with this fix. I believe I'm
    >     >>>  > > > > mostly finished with the Ambari mpack side of things, however
    >     >>>  > > > > we currently only support one version with no backwards
    >     >>>  > > > > compatibility. What is the community's thoughts on this?
    >     >>>  > > > >
    >     >>>  > > > > Here is some work contributed to the community that I'm
    >     >>>  > > > > referencing while working on this upgrade -
    >     >>>  > > > > https://github.com/apache/metron/pull/619/files
    >     >>>  > > > >
    >     >>>  > > > > Best,
    >     >>>  > > > > Michael Miklavcic
    >     >>>  > > > >
    >     >>>  > > >
    >     >>>  > >
    >     >>>  >
    >
    >     -------------------
    >     Thank you,
    >
    >     James Sirota
    >     PMC- Apache Metron
    >     jsirota AT apache DOT org
    >
    >
    >
    >
    


Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by Michael Miklavcic <mi...@gmail.com>.
We've generally preferred communication workflows via Github and the
mailing list rather than Jira for most things on this project, but you're
right that we could probably leverage it for sharing attachments to the dev
list.

On Wed, Oct 11, 2017 at 9:54 PM, Matt Foley <mf...@hortonworks.com> wrote:

> You can avoid the permission issues by attaching it to an Apache jira.

Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by Matt Foley <mf...@hortonworks.com>.
You can avoid the permission issues by attaching it to an Apache jira.

On 10/11/17, 6:10 PM, "James Sirota" <js...@apache.org> wrote:

    I can't see it.  You probably want to link to a google drive
    


Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by James Sirota <js...@apache.org>.
I can't see it. You probably want to link to it on Google Drive instead.

11.10.2017, 18:01, "Michael Miklavcic" <mi...@gmail.com>:
> I attached a PDF - shows up on my end. Is that not coming through?
>
> On Wed, Oct 11, 2017 at 6:42 PM, Otto Fowler <ot...@gmail.com>
> wrote:
>
>>  I think there is a missing attachment?
>>
>>  On October 11, 2017 at 20:22:33, Michael Miklavcic (
>>  michael.miklavcic@gmail.com) wrote:
>>
>>  For community reference, here is a class diagram that depicts our current
>>  Metron 0.4.1 dependencies, for both prod and test code, against the old ES
>>  client APIs along with an "after" diagram showing the world with the new
>>  client. Feedback welcome.
>>
>>  On Fri, Oct 6, 2017 at 8:13 AM, Casey Stella <ce...@gmail.com> wrote:
>>
>>>  Yeah, I agree with what Michael "fine whine" Miklavcic said; I'm in favor
>>>  of the high level client.
>>>
>>>  On Thu, Oct 5, 2017 at 3:35 PM, Michael Miklavcic <
>>>  michael.miklavcic@gmail.com> wrote:
>>>
>>>  > Justin, thanks for the feedback! I'm inclined to agree with you about using
>>>  > the high level client. It's a bummer that we still need to do jar shading,
>>>  > but I think that's a reasonable short term sacrifice considering the other
>>>  > benefits. And they're angling towards slowly removing the ES core dep over
>>>  > time anyhow so, like myself, this will get better with age.
>>>  >
>>>  > On Thu, Oct 5, 2017 at 12:40 PM, Justin Leet <ju...@gmail.com>
>>>  > wrote:
>>>  >
>>>  > > Do we intend to support (or have interest in supporting) ES across major
>>>  > > versions for a given version of Metron? I'm not convinced it's worth the
>>>  > > work of using the low level client.
>>>  > >
>>>  > > This really only seems useful for ES clusters that are being used outside
>>>  > > Metron and need to be on a different ES major version. Is that a use case
>>>  > > we want/need to support? I'm willing to bet it's significantly more work
>>>  > > and means we're modifying queries and even templates/mappings based on what
>>>  > > ES version we're interacting with (e.g. meta alerts in 5.x can exploit a
>>>  > > query param to not screw around with the mapping, but that param doesn't
>>>  > > exist in 2.x). At that point, we're either back to writing for ES 2.x or
>>>  > > writing for every version of ES.
>>>  > >
>>>  > > Unless that's something we have a demand for (or someone else persuades me
>>>  > > otherwise), I'm in favor of using the high level client. It seems like
>>>  > > it'd be easier to migrate to also, given the similarities API-wise to the
>>>  > > current client we're using.
>>>  > >
>>>  > > On Thu, Oct 5, 2017 at 1:52 PM, Michael Miklavcic <
>>>  > > michael.miklavcic@gmail.com> wrote:
>>>  > >
>>>  > > > I think it might help the discussion to share my impressions of looking
>>>  > > > over the new API recommendations from ES. I've summarized some info
>>>  > > > provided by ES back in December 2016 regarding the reasons for switching to
>>>  > > > a new client model. [1]
>>>  > > >
>>>  > > > *Summary points:*
>>>  > > >
>>>  > > > Pre-5.x, the Java client was the Java API, which uses the same binary
>>>  > > > exchange format as node-to-node communications. In 5.x a low level REST
>>>  > > > client was added. Now there's also a high level REST client that handles
>>>  > > > request marshalling and response un-marshalling.
>>>  > > >
>>>  > > > *Benefits of existing Java API*
>>>  > > >
>>>  > > > 1. Theoretically faster - binary format, no JSON parsing
>>>  > > > 2. Hardened, used for internal ES node to node communications
>>>  > > >
>>>  > > > *Cons of Java API*
>>>  > > >
>>>  > > > 1. Benchmarks show it's not really that much faster.
>>>  > > > 2. Backwards compatibility - Java API changes often.
>>>  > > > 3. Upgrades more challenging - need to refactor client code for new and
>>>  > > >    deprecated features.
>>>  > > > 4. Minor releases may contain breaking changes in the Java API
>>>  > > > 5. Client and server *should* be on same JVM version (not as important
>>>  > > >    post 2.x, but still potentially necessary bc of serialization w/binary
>>>  > > >    format)
>>>  > > > 6. Requires dependency on the entire elasticsearch server in order to
>>>  > > >    use the client. We end up shading jars.
>>>  > > >
>>>  > > > *Benefits of new REST API*
>>>  > > >
>>>  > > > 1. Upgrades
>>>  > > >    1. Breaking changes only made in major releases - "We are very
>>>  > > >       careful with backwards compatibility on the REST layer where
>>>  > > >       breaking changes are made only in major releases."
>>>  > > >    2. "The REST interface is much more stable and can be upgraded
>>>  > > >       out of step with the Elasticsearch cluster."
>>>  > > > 2. REST client and server can be on different JVMs
>>>  > > > 3. Dependencies for the low level client are very slim. No need for
>>>  > > >    shading.
>>>  > > > 4. The RestHighLevelClient supports the same request and response
>>>  > > >    objects as the TransportClient
>>>  > > > 5. Can be secured via HTTPS
>>>  > > >
>>>  > > > There are some additional benefits to the new API, however they depend on
>>>  > > > whether we choose to go with the high or low level client. More comments
>>>  > > > below.
>>>  > > >
>>>  > > > *Cons of new API*
>>>  > > >
>>>  > > > 1. Dependencies - The high level client still requires the full ES
>>>  > > >    dependency, though this will slim down in future releases.
>>>  > > >
>>>  > > > *Other comments specific to Metron*
>>>  > > >
>>>  > > > There's a question of whether we should use the low or high level REST
>>>  > > > client. The main differences between the two are how they handle lib
>>>  > > > dependencies and marshaling/unmarshaling. The low level client cleans up
>>>  > > > the dependencies dramatically, whereas the high level client still requires
>>>  > > > you to depend on elasticsearch core. On the other hand, the low level
>>>  > > > client does no work to handle marshaling/unmarshaling the
>>>  > > > requests/responses from the HTTP calls while the high level client handles
>>>  > > > this for you and exposes api-specific methods. The high level client
>>>  > > > accepts the same request arguments as the TransportClient and returns the
>>>  > > > same response objects. One more thing to note is that the low level client
>>>  > > > claims to be compatible with all versions of ES whereas the high level
>>>  > > > client appears to be only major version compatible.
>>>  > > >
>>>  > > > "The 5.6 client can communicate with any 5.6.x Elasticsearch node. Previous
>>>  > > > 5.x minor versions like 5.5.x, 5.4.x etc. are not (fully) supported." [2]
>>>  > > >
>>>  > > > Just as an example, here's a simple comparison of an index request
>>>  in
>>>  > the
>>>  > > > low and high level APIs.
>>>  > > >
>>>  > > > *Low Level*
>>>  > > >
>>>  > > > Map<String, String> params = Collections.emptyMap();
>>>  > > > String jsonString = "{" +
>>>  > > >             "\"user\":\"kimchy\"," +
>>>  > > >             "\"postDate\":\"2013-01-30\"," +
>>>  > > >             "\"message\":\"trying out Elasticsearch\"" +
>>>  > > >         "}";
>>>  > > > HttpEntity entity = new NStringEntity(jsonString,
>>>  > > > ContentType.APPLICATION_JSON);
>>>  > > > Response response = restClient.performRequest("PUT",
>>>  "/posts/doc/1",
>>>  > > > params, entity);
>>>  > > >
>>>  > > > *High Level*
>>>  > > >
>>>  > > > IndexRequest indexRequest = new IndexRequest("posts", "doc", "1")
>>>  > > >         .source("user", "kimchy",
>>>  > > >                      "postDate", new Date(),
>>>  > > >                      "message", "trying out Elasticsearch");
>>>  > > >
>>>  > > > *Note*: there are a few ways to do this with the high level API, but
>>>  > this
>>>  > > > was the most concise for me to offer a comparison of benefits over
>>>  the
>>>  > > low
>>>  > > > level API.
>>>  > > >
>>>  > > > *Thoughts/Recommendations*: I do think we should migrate to the new
>>>  > API.
>>>  > > I
>>>  > > > think the question is which of the new APIs we should use. The high
>>>  > level
>>>  > > > client seems to shield us from having to deal with constructing
>>>  special
>>>  > > > JSON handling code, whereas the low level client handles all
>>>  versions
>>>  > of
>>>  > > > ES. I don't have a good feel (yet) for just how much work it would
>>>  > > require
>>>  > > > to use the low level API, or how difficult it would be to add new
>>>  > request
>>>  > > > features in the future. Actually, we could probably leverage
>>>  existing
>>>  > > code
>>>  > > > we have for dealing with JSON maps, so this might be really easy.
>>>  > Someone
>>>  > > > with more experience in Metron's ES client use might have a better
>>>  idea
>>>  > > of
>>>  > > > the pros and cons to this. The high level client appears to handle
>>>  > > > all JSON manipulation for us, but we lose the benefit of
>>>  a
>>>  > > > simpler dependency tree and support for all versions of ES. My only
>>>  > > concern
>>>  > > > with "supports all versions" is that I have to imagine there are
>>>  > specific
>>>  > > > calls that we'd have to be careful of when constructing the JSON
>>>  > > requests,
>>>  > > > so it's unclear to me if this is better or worse in the end.
>>>  > > >
>>>  > > > Best,
>>>  > > > Mike
>>>  > > >
>>>  > > >
>>>  > > > 1. https://www.elastic.co/blog/state-of-the-official-
>>>  > > > elasticsearch-java-clients
>>>  > > > 2. https://www.elastic.co/guide/en/elasticsearch/client/java-
>>>  > > > rest/current/java-rest-high-compatibility.html
>>>  > > > <https://www.elastic.co/guide/en/elasticsearch/client/java-
>>>  > > > rest/current/java-rest-high-compatibility.html>
>>>  > > >
>>>  > > >
>>>  > > >
>>>  > > >
>>>  > > > On Wed, Sep 27, 2017 at 8:03 PM, Michael Miklavcic <
>>>  > > > michael.miklavcic@gmail.com> wrote:
>>>  > > >
>>>  > > > > I am working on upgrading Elasticsearch and Kibana. There are
>>>  quite a
>>>  > > few
>>>  > > > > changes involved with this fix. I believe I'm mostly finished with
>>>  > the
>>>  > > > > Ambari mpack side of things, however we currently only support one
>>>  > > > version
>>>  > > > > with no backwards compatibility. What are the community's thoughts
>>>  on
>>>  > > > this?
>>>  > > > >
>>>  > > > > Here is some work contributed to the community that I'm
>>>  referencing
>>>  > > while
>>>  > > > > working on this upgrade - https://github.com/apache/
>>>  > > > metron/pull/619/files
>>>  > > > >
>>>  > > > > Best,
>>>  > > > > Michael Miklavcic
>>>  > > > >
>>>  > > >
>>>  > >
>>>  >

------------------- 
Thank you,

James Sirota
PMC- Apache Metron
jsirota AT apache DOT org

Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by Michael Miklavcic <mi...@gmail.com>.
Never mind, my bad - Apache dev lists don't support attachments.
https://commons.apache.org/mail-lists.html

"Note: please don't send patches or attachments to any of the mailing
lists. Patches are best handled via the Issue Tracking system. Otherwise,
please upload the file to a public server and include the URL in the mail."

I'm open to suggestions for better ways to share, but for now how's this?
https://imgur.com/a/xSnPr


On Wed, Oct 11, 2017 at 6:57 PM, Otto Fowler <ot...@gmail.com>
wrote:

> I don’t see one :(
>
>
> On October 11, 2017 at 20:54:33, Michael Miklavcic (
> michael.miklavcic@gmail.com) wrote:
>
> I attached a PDF - shows up on my end. Is that not coming through?
>
> On Wed, Oct 11, 2017 at 6:42 PM, Otto Fowler <ot...@gmail.com>
> wrote:
>
>> I think there is a missing attachment?
>>
>>
>> On October 11, 2017 at 20:22:33, Michael Miklavcic (
>> michael.miklavcic@gmail.com) wrote:
>>
>> For community reference, here is a class diagram that depicts our current
>> Metron 0.4.1 dependencies, for both prod and test code, against the old ES
>> client APIs along with an "after" diagram showing the world with the new
>> client. Feedback welcome.
>>
>>
>>
>> On Fri, Oct 6, 2017 at 8:13 AM, Casey Stella <ce...@gmail.com> wrote:
>>
>>> Yeah, I agree with what Michael "fine whine" Miklavcic said; I'm in favor
>>> of the high level client.
>>>
>>> On Thu, Oct 5, 2017 at 3:35 PM, Michael Miklavcic <
>>> michael.miklavcic@gmail.com> wrote:
>>>
>>> > Justin, thanks for the feedback! I'm inclined to agree with you about
>>> using
>>> > the high level client. It's a bummer that we still need to do jar
>>> shading,
>>> > but I think that's a reasonable short term sacrifice considering the
>>> other
>>> > benefits. And they're angling towards slowly removing the ES core dep
>>> over
>>> > time anyhow so, like myself, this will get better with age.
>>> >
>>> > On Thu, Oct 5, 2017 at 12:40 PM, Justin Leet <ju...@gmail.com>
>>> > wrote:
>>> >
>>> > > Do we intend on (or have interest in) supporting ES across major
>>> version
>>> > > for a given version of Metron?  I'm not convinced it's worth the
>>> work of
>>> > > using the low level client.
>>> > >
>>> > > This really only seems useful for ES clusters that are being used
>>> outside
>>> > > Metron and need to be on a different ES major version. Is that a use
>>> case
>>> > > we want/need to support? I'm willing to bet it's significantly more
>>> work
>>> > > and means we're modifying queries and even templates/mappings based
>>> on
>>> > what
>>> > > ES version we're interacting with (e.g. meta alerts in 5.x can
>>> exploit a
>>> > > query param to not screw around with the mapping, but that param
>>> doesn't
>>> > > exist in 2.x). At that point, we're either back to writing for ES
>>> 2.x or
>>> > > writing for every version of ES.
>>> > >
>>> > > Unless that's something we have a demand for (or someone else
>>> persuades
>>> > me
>>> > > otherwise), I'm in favor of using the high level client.  It seems
>>> like
>>> > > it'd be easier to migrate to also, given the similarities API-wise
>>> to the
>>> > > current client we're using.
>>> > >
>>> > > On Thu, Oct 5, 2017 at 1:52 PM, Michael Miklavcic <
>>> > > michael.miklavcic@gmail.com> wrote:
>>> > >
>>> > > > I think it might help the discussion to share my impressions of
>>> looking
>>> > > > over the new API recommendations from ES. I've summarized some info
>>> > > > provided by ES back in December 2016 regarding the reasons for
>>> > switching
>>> > > to
>>> > > > a new client model. [1]
>>> > > >
>>> > > > *Summary points:*
>>> > > >
>>> > > > Pre-5.x had Java API - binary exchange format used for node-to-node
>>> > > > communications.
>>> > > > In 5.x a low level REST API was added. Now there's also a high
>>> level
>>> > REST
>>> > > > client that handles request marshalling and response
>>> un-marshalling.
>>> > > >
>>> > > > *Benefits of existing Java API*
>>> > > >
>>> > > >    1. Theoretically faster - binary format, no JSON parsing
>>> > > >    2. Hardened, used for internal ES node to node communications
>>> > > >
>>> > > > *Cons of Java API*
>>> > > >
>>> > > >    1. Benchmarks show it's not really that much faster.
>>> > > >    2. Backwards compatibility - Java API changes often.
>>> > > >    3. Upgrades more challenging - need to refactor client code for
>>> new
>>> > > and
>>> > > >    deprecated features.
>>> > > >    4. Minor releases may contain breaking changes in the Java API
>>> > > >    5. Client and server *should* be on same JVM version (not as
>>> > important
>>> > > >    post 2.x, but still potentially necessary bc of serialization
>>> > w/binary
>>> > > >    format)
>>> > > >    6. Requires dependency on the entire elasticsearch server in
>>> order
>>> > to
>>> > > >    use the client. We end up shading jars.
>>> > > >
>>> > > > *Benefits of new REST API*
>>> > > >
>>> > > >    1. Upgrades
>>> > > >       1. Breaking changes only made in major releases - "We are
>>> very
>>> > > >       careful with backwards compatibility on the REST layer where
>>> > > breaking
>>> > > >       changes are made only in major releases."
>>> > > >       2. "The REST interface is much more stable and can be
>>> upgraded
>>> > out
>>> > > of
>>> > > >       step with the Elasticsearch cluster."
>>> > > >    2. REST client and server can be on different JVMs
>>> > > >    3. Dependencies for the low level client are very slim. No need
>>> for
>>> > > >    shading.
>>> > > >    4. The RestHighLevelClient supports the same request and
>>> response
>>> > > >    objects as the TransportClient
>>> > > >    5. Can be secured via HTTPS
>>> > > >
>>> > > > There are some additional benefits to the new API, however they
>>> depend
>>> > on
>>> > > > whether we choose to go with the high or low level client. More
>>> > comments
>>> > > > below.
>>> > > >
>>> > > > *Cons of new API*
>>> > > >
>>> > > >    1. Dependencies - The high level client still requires the full
>>> ES
>>> > > >    dependency, though this will slim down in future releases.
>>> > > >
>>> > > > *Other comments specific to Metron*
>>> > > >
>>> > > > There's a question of whether we should use the low or high level
>>> REST
>>> > > > client. The main differences between the two are how they handle
>>> lib
>>> > > > dependencies and marshaling/unmarshaling. The low level client
>>> cleans
>>> > up
>>> > > > the dependencies dramatically, whereas the high level client still
>>> > > requires
>>> > > > you to depend on elasticsearch core. On the other hand, the low
>>> level
>>> > > > client does no work to handle marshaling/unmarshaling the
>>> > > > requests/responses from the HTTP calls while the high level client
>>> > > handles
>>> > > > this for you and exposes api-specific methods. The high level
>>> client
>>> > > > accepts the same request arguments as the TransportClient and
>>> returns
>>> > the
>>> > > > same response objects. One more thing to note is that the low level
>>> > > client
>>> > > > claims to be compatible with all versions of ES whereas the high
>>> level
>>> > > > client appears to be only major version compatible.
>>> > > >
>>> > > > "The 5.6 client can communicate with any 5.6.x Elasticsearch node.
>>> > > Previous
>>> > > > 5.x minor versions like 5.5.x, 5.4.x etc. are not (fully)
>>> supported."
>>> > [2]
>>> > > >
>>> > > > Just as an example, here's a simple comparison of an index request
>>> in
>>> > the
>>> > > > low and high level APIs.
>>> > > >
>>> > > > *Low Level*
>>> > > >
>>> > > > Map<String, String> params = Collections.emptyMap();
>>> > > > String jsonString = "{" +
>>> > > >             "\"user\":\"kimchy\"," +
>>> > > >             "\"postDate\":\"2013-01-30\"," +
>>> > > >             "\"message\":\"trying out Elasticsearch\"" +
>>> > > >         "}";
>>> > > > HttpEntity entity = new NStringEntity(jsonString,
>>> > > > ContentType.APPLICATION_JSON);
>>> > > > Response response = restClient.performRequest("PUT",
>>> "/posts/doc/1",
>>> > > > params, entity);
>>> > > >
>>> > > > *High Level*
>>> > > >
>>> > > > IndexRequest indexRequest = new IndexRequest("posts", "doc", "1")
>>> > > >         .source("user", "kimchy",
>>> > > >                      "postDate", new Date(),
>>> > > >                      "message", "trying out Elasticsearch");
>>> > > >
>>> > > > *Note*: there are a few ways to do this with the high level API,
>>> but
>>> > this
>>> > > > was the most concise for me to offer a comparison of benefits over
>>> the
>>> > > low
>>> > > > level API.
>>> > > >
>>> > > > *Thoughts/Recommendations*: I do think we should migrate to the new
>>> > API.
>>> > > I
>>> > > > think the question is which of the new APIs we should use. The high
>>> > level
>>> > > > client seems to shield us from having to deal with constructing
>>> special
>>> > > > JSON handling code, whereas the low level client handles all
>>> versions
>>> > of
>>> > > > ES. I don't have a good feel (yet) for just how much work it would
>>> > > require
>>> > > > to use the low level API, or how difficult it would be to add new
>>> > request
>>> > > > features in the future. Actually, we could probably leverage
>>> existing
>>> > > code
>>> > > > we have for dealing with JSON maps, so this might be really easy.
>>> > Someone
>>> > > > with more experience in Metron's ES client use might have a better
>>> idea
>>> > > of
>>> > > > the pros and cons to this. The high level client appears to handle
>>> > > > all JSON manipulation for us, but we lose the benefit
>>> of a
>>> > > > simpler dependency tree and support for all versions of ES. My only
>>> > > concern
>>> > > > with "supports all versions" is that I have to imagine there are
>>> > specific
>>> > > > calls that we'd have to be careful of when constructing the JSON
>>> > > requests,
>>> > > > so it's unclear to me if this is better or worse in the end.
>>> > > >
>>> > > > Best,
>>> > > > Mike
>>> > > >
>>> > > >
>>> > > >    1. https://www.elastic.co/blog/state-of-the-official-
>>> > > >    elasticsearch-java-clients
>>> > > >    2. https://www.elastic.co/guide/en/elasticsearch/client/java-
>>> > > >    rest/current/java-rest-high-compatibility.html
>>> > > >    <https://www.elastic.co/guide/en/elasticsearch/client/java-
>>> > > > rest/current/java-rest-high-compatibility.html>
>>> > > >
>>> > > >
>>> > > >
>>> > > >
>>> > > > On Wed, Sep 27, 2017 at 8:03 PM, Michael Miklavcic <
>>> > > > michael.miklavcic@gmail.com> wrote:
>>> > > >
>>> > > > > I am working on upgrading Elasticsearch and Kibana. There are
>>> quite a
>>> > > few
>>> > > > > changes involved with this fix. I believe I'm mostly finished
>>> with
>>> > the
>>> > > > > Ambari mpack side of things, however we currently only support
>>> one
>>> > > > version
>>> > > > > with no backwards compatibility. What are the community's
>>> thoughts on
>>> > > > this?
>>> > > > >
>>> > > > > Here is some work contributed to the community that I'm
>>> referencing
>>> > > while
>>> > > > > working on this upgrade - https://github.com/apache/
>>> > > > metron/pull/619/files
>>> > > > >
>>> > > > > Best,
>>> > > > > Michael Miklavcic
>>> > > > >
>>> > > >
>>> > >
>>> >
>>>
>>
>>
>

Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by Otto Fowler <ot...@gmail.com>.
I don’t see one :(


On October 11, 2017 at 20:54:33, Michael Miklavcic (
michael.miklavcic@gmail.com) wrote:

I attached a PDF - shows up on my end. Is that not coming through?

On Wed, Oct 11, 2017 at 6:42 PM, Otto Fowler <ot...@gmail.com>
wrote:

> I think there is a missing attachment?
>
>
> On October 11, 2017 at 20:22:33, Michael Miklavcic (
> michael.miklavcic@gmail.com) wrote:
>
> For community reference, here is a class diagram that depicts our current
> Metron 0.4.1 dependencies, for both prod and test code, against the old ES
> client APIs along with an "after" diagram showing the world with the new
> client. Feedback welcome.
>
>
>
> On Fri, Oct 6, 2017 at 8:13 AM, Casey Stella <ce...@gmail.com> wrote:
>
>> Yeah, I agree with what Michael "fine whine" Miklavcic said; I'm in favor
>> of the high level client.
>>
>> On Thu, Oct 5, 2017 at 3:35 PM, Michael Miklavcic <
>> michael.miklavcic@gmail.com> wrote:
>>
>> > Justin, thanks for the feedback! I'm inclined to agree with you about
>> using
>> > the high level client. It's a bummer that we still need to do jar
>> shading,
>> > but I think that's a reasonable short term sacrifice considering the
>> other
>> > benefits. And they're angling towards slowly removing the ES core dep
>> over
>> > time anyhow so, like myself, this will get better with age.
>> >
>> > On Thu, Oct 5, 2017 at 12:40 PM, Justin Leet <ju...@gmail.com>
>> > wrote:
>> >
>> > > Do we intend on (or have interest in) supporting ES across major
>> version
>> > > for a given version of Metron?  I'm not convinced it's worth the work
>> of
>> > > using the low level client.
>> > >
>> > > This really only seems useful for ES clusters that are being used
>> outside
>> > > Metron and need to be on a different ES major version. Is that a use
>> case
>> > > we want/need to support? I'm willing to bet it's significantly more
>> work
>> > > and means we're modifying queries and even templates/mappings based on
>> > what
>> > > ES version we're interacting with (e.g. meta alerts in 5.x can
>> exploit a
>> > > query param to not screw around with the mapping, but that param
>> doesn't
>> > > exist in 2.x). At that point, we're either back to writing for ES 2.x
>> or
>> > > writing for every version of ES.
>> > >
>> > > Unless that's something we have a demand for (or someone else
>> persuades
>> > me
>> > > otherwise), I'm in favor of using the high level client.  It seems
>> like
>> > > it'd be easier to migrate to also, given the similarities API-wise to
>> the
>> > > current client we're using.
>> > >
>> > > On Thu, Oct 5, 2017 at 1:52 PM, Michael Miklavcic <
>> > > michael.miklavcic@gmail.com> wrote:
>> > >
>> > > > I think it might help the discussion to share my impressions of
>> looking
>> > > > over the new API recommendations from ES. I've summarized some info
>> > > > provided by ES back in December 2016 regarding the reasons for
>> > switching
>> > > to
>> > > > a new client model. [1]
>> > > >
>> > > > *Summary points:*
>> > > >
>> > > > Pre-5.x had Java API - binary exchange format used for node-to-node
>> > > > communications.
>> > > > In 5.x a low level REST API was added. Now there's also a high level
>> > REST
>> > > > client that handles request marshalling and response un-marshalling.
>> > > >
>> > > > *Benefits of existing Java API*
>> > > >
>> > > >    1. Theoretically faster - binary format, no JSON parsing
>> > > >    2. Hardened, used for internal ES node to node communications
>> > > >
>> > > > *Cons of Java API*
>> > > >
>> > > >    1. Benchmarks show it's not really that much faster.
>> > > >    2. Backwards compatibility - Java API changes often.
>> > > >    3. Upgrades more challenging - need to refactor client code for
>> new
>> > > and
>> > > >    deprecated features.
>> > > >    4. Minor releases may contain breaking changes in the Java API
>> > > >    5. Client and server *should* be on same JVM version (not as
>> > important
>> > > >    post 2.x, but still potentially necessary bc of serialization
>> > w/binary
>> > > >    format)
>> > > >    6. Requires dependency on the entire elasticsearch server in
>> order
>> > to
>> > > >    use the client. We end up shading jars.
>> > > >
>> > > > *Benefits of new REST API*
>> > > >
>> > > >    1. Upgrades
>> > > >       1. Breaking changes only made in major releases - "We are very
>> > > >       careful with backwards compatibility on the REST layer where
>> > > breaking
>> > > >       changes are made only in major releases."
>> > > >       2. "The REST interface is much more stable and can be upgraded
>> > out
>> > > of
>> > > >       step with the Elasticsearch cluster."
>> > > >    2. REST client and server can be on different JVMs
>> > > >    3. Dependencies for the low level client are very slim. No need
>> for
>> > > >    shading.
>> > > >    4. The RestHighLevelClient supports the same request and response
>> > > >    objects as the TransportClient
>> > > >    5. Can be secured via HTTPS
>> > > >
>> > > > There are some additional benefits to the new API, however they
>> depend
>> > on
>> > > > whether we choose to go with the high or low level client. More
>> > comments
>> > > > below.
>> > > >
>> > > > *Cons of new API*
>> > > >
>> > > >    1. Dependencies - The high level client still requires the full
>> ES
>> > > >    dependency, though this will slim down in future releases.
>> > > >
>> > > > *Other comments specific to Metron*
>> > > >
>> > > > There's a question of whether we should use the low or high level
>> REST
>> > > > client. The main differences between the two are how they handle lib
>> > > > dependencies and marshaling/unmarshaling. The low level client
>> cleans
>> > up
>> > > > the dependencies dramatically, whereas the high level client still
>> > > requires
>> > > > you to depend on elasticsearch core. On the other hand, the low
>> level
>> > > > client does no work to handle marshaling/unmarshaling the
>> > > > requests/responses from the HTTP calls while the high level client
>> > > handles
>> > > > this for you and exposes api-specific methods. The high level client
>> > > > accepts the same request arguments as the TransportClient and
>> returns
>> > the
>> > > > same response objects. One more thing to note is that the low level
>> > > client
>> > > > claims to be compatible with all versions of ES whereas the high
>> level
>> > > > client appears to be only major version compatible.
>> > > >
>> > > > "The 5.6 client can communicate with any 5.6.x Elasticsearch node.
>> > > Previous
>> > > > 5.x minor versions like 5.5.x, 5.4.x etc. are not (fully)
>> supported."
>> > [2]
>> > > >
>> > > > Just as an example, here's a simple comparison of an index request
>> in
>> > the
>> > > > low and high level APIs.
>> > > >
>> > > > *Low Level*
>> > > >
>> > > > Map<String, String> params = Collections.emptyMap();
>> > > > String jsonString = "{" +
>> > > >             "\"user\":\"kimchy\"," +
>> > > >             "\"postDate\":\"2013-01-30\"," +
>> > > >             "\"message\":\"trying out Elasticsearch\"" +
>> > > >         "}";
>> > > > HttpEntity entity = new NStringEntity(jsonString,
>> > > > ContentType.APPLICATION_JSON);
>> > > > Response response = restClient.performRequest("PUT",
>> "/posts/doc/1",
>> > > > params, entity);
>> > > >
>> > > > *High Level*
>> > > >
>> > > > IndexRequest indexRequest = new IndexRequest("posts", "doc", "1")
>> > > >         .source("user", "kimchy",
>> > > >                      "postDate", new Date(),
>> > > >                      "message", "trying out Elasticsearch");
>> > > >
>> > > > *Note*: there are a few ways to do this with the high level API, but
>> > this
>> > > > was the most concise for me to offer a comparison of benefits over
>> the
>> > > low
>> > > > level API.
>> > > >
>> > > > *Thoughts/Recommendations*: I do think we should migrate to the new
>> > API.
>> > > I
>> > > > think the question is which of the new APIs we should use. The high
>> > level
>> > > > client seems to shield us from having to deal with constructing
>> special
>> > > > JSON handling code, whereas the low level client handles all
>> versions
>> > of
>> > > > ES. I don't have a good feel (yet) for just how much work it would
>> > > require
>> > > > to use the low level API, or how difficult it would be to add new
>> > request
>> > > > features in the future. Actually, we could probably leverage
>> existing
>> > > code
>> > > > we have for dealing with JSON maps, so this might be really easy.
>> > Someone
>> > > > with more experience in Metron's ES client use might have a better
>> idea
>> > > of
>> > > > the pros and cons to this. The high level client appears to handle
>> > > > all JSON manipulation for us, but we lose the benefit of
>> a
>> > > > simpler dependency tree and support for all versions of ES. My only
>> > > concern
>> > > > with "supports all versions" is that I have to imagine there are
>> > specific
>> > > > calls that we'd have to be careful of when constructing the JSON
>> > > requests,
>> > > > so it's unclear to me if this is better or worse in the end.
>> > > >
>> > > > Best,
>> > > > Mike
>> > > >
>> > > >
>> > > >    1. https://www.elastic.co/blog/state-of-the-official-
>> > > >    elasticsearch-java-clients
>> > > >    2. https://www.elastic.co/guide/en/elasticsearch/client/java-
>> > > >    rest/current/java-rest-high-compatibility.html
>> > > >    <https://www.elastic.co/guide/en/elasticsearch/client/java-
>> > > > rest/current/java-rest-high-compatibility.html>
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Wed, Sep 27, 2017 at 8:03 PM, Michael Miklavcic <
>> > > > michael.miklavcic@gmail.com> wrote:
>> > > >
>> > > > > I am working on upgrading Elasticsearch and Kibana. There are
>> quite a
>> > > few
>> > > > > changes involved with this fix. I believe I'm mostly finished with
>> > the
>> > > > > Ambari mpack side of things, however we currently only support one
>> > > > version
>> > > > > with no backwards compatibility. What are the community's thoughts
>> on
>> > > > this?
>> > > > >
>> > > > > Here is some work contributed to the community that I'm
>> referencing
>> > > while
>> > > > > working on this upgrade - https://github.com/apache/
>> > > > metron/pull/619/files
>> > > > >
>> > > > > Best,
>> > > > > Michael Miklavcic
>> > > > >
>> > > >
>> > >
>> >
>>
>
>

Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by Michael Miklavcic <mi...@gmail.com>.
I attached a PDF - shows up on my end. Is that not coming through?

On Wed, Oct 11, 2017 at 6:42 PM, Otto Fowler <ot...@gmail.com>
wrote:

> I think there is a missing attachment?
>
>
> On October 11, 2017 at 20:22:33, Michael Miklavcic (
> michael.miklavcic@gmail.com) wrote:
>
> For community reference, here is a class diagram that depicts our current
> Metron 0.4.1 dependencies, for both prod and test code, against the old ES
> client APIs along with an "after" diagram showing the world with the new
> client. Feedback welcome.
>
>
>
> On Fri, Oct 6, 2017 at 8:13 AM, Casey Stella <ce...@gmail.com> wrote:
>
>> Yeah, I agree with what Michael "fine whine" Miklavcic said; I'm in favor
>> of the high level client.
>>
>> On Thu, Oct 5, 2017 at 3:35 PM, Michael Miklavcic <
>> michael.miklavcic@gmail.com> wrote:
>>
>> > Justin, thanks for the feedback! I'm inclined to agree with you about
>> using
>> > the high level client. It's a bummer that we still need to do jar
>> shading,
>> > but I think that's a reasonable short term sacrifice considering the
>> other
>> > benefits. And they're angling towards slowly removing the ES core dep
>> over
>> > time anyhow so, like myself, this will get better with age.
>> >
>> > On Thu, Oct 5, 2017 at 12:40 PM, Justin Leet <ju...@gmail.com>
>> > wrote:
>> >
>> > > Do we intend on (or have interest in) supporting ES across major
>> version
>> > > for a given version of Metron?  I'm not convinced it's worth the work
>> of
>> > > using the low level client.
>> > >
>> > > This really only seems useful for ES clusters that are being used
>> outside
>> > > Metron and need to be on a different ES major version. Is that a use
>> case
>> > > we want/need to support? I'm willing to bet it's significantly more
>> work
>> > > and means we're modifying queries and even templates/mappings based on
>> > what
>> > > ES version we're interacting with (e.g. meta alerts in 5.x can
>> exploit a
>> > > query param to not screw around with the mapping, but that param
>> doesn't
>> > > exist in 2.x). At that point, we're either back to writing for ES 2.x
>> or
>> > > writing for every version of ES.
>> > >
>> > > Unless that's something we have a demand for (or someone else
>> persuades
>> > me
>> > > otherwise), I'm in favor of using the high level client.  It seems
>> like
>> > > it'd be easier to migrate to also, given the similarities API-wise to
>> the
>> > > current client we're using.
>> > >
>> > > On Thu, Oct 5, 2017 at 1:52 PM, Michael Miklavcic <
>> > > michael.miklavcic@gmail.com> wrote:
>> > >
>> > > > I think it might help the discussion to share my impressions of
>> looking
>> > > > over the new API recommendations from ES. I've summarized some info
>> > > > provided by ES back in December 2016 regarding the reasons for
>> > switching
>> > > to
>> > > > a new client model. [1]
>> > > >
>> > > > *Summary points:*
>> > > >
>> > > > Pre-5.x had Java API - binary exchange format used for node-to-node
>> > > > communications.
>> > > > In 5.x a low level REST API was added. Now there's also a high level
>> > REST
>> > > > client that handles request marshalling and response un-marshalling.
>> > > >
>> > > > *Benefits of existing Java API*
>> > > >
>> > > >    1. Theoretically faster - binary format, no JSON parsing
>> > > >    2. Hardened, used for internal ES node to node communications
>> > > >
>> > > > *Cons of Java API*
>> > > >
>> > > >    1. Benchmarks show it's not really that much faster.
>> > > >    2. Backwards compatibility - Java API changes often.
>> > > >    3. Upgrades more challenging - need to refactor client code for
>> new
>> > > and
>> > > >    deprecated features.
>> > > >    4. Minor releases may contain breaking changes in the Java API
>> > > >    5. Client and server *should* be on same JVM version (not as
>> > important
>> > > >    post 2.x, but still potentially necessary bc of serialization
>> > w/binary
>> > > >    format)
>> > > >    6. Requires dependency on the entire elasticsearch server in
>> order
>> > to
>> > > >    use the client. We end up shading jars.
>> > > >
>> > > > *Benefits of new REST API*
>> > > >
>> > > >    1. Upgrades
>> > > >       1. Breaking changes only made in major releases - "We are very
>> > > >       careful with backwards compatibility on the REST layer where
>> > > breaking
>> > > >       changes are made only in major releases."
>> > > >       2. "The REST interface is much more stable and can be upgraded
>> > out
>> > > of
>> > > >       step with the Elasticsearch cluster."
>> > > >    2. REST client and server can be on different JVMs
>> > > >    3. Dependencies for the low level client are very slim. No need
>> for
>> > > >    shading.
>> > > >    4. The RestHighLevelClient supports the same request and response
>> > > >    objects as the TransportClient
>> > > >    5. Can be secured via HTTPS
>> > > >
>> > > > There are some additional benefits to the new API, however they
>> depend
>> > on
>> > > > whether we choose to go with the high or low level client. More
>> > comments
>> > > > below.
>> > > >
>> > > > *Cons of new API*
>> > > >
>> > > >    1. Dependencies - The high level client still requires the full
>> ES
>> > > >    dependency, though this will slim down in future releases.
>> > > >
>> > > > *Other comments specific to Metron*
>> > > >
>> > > > There's a question of whether we should use the low or high level REST
>> > > > client. The main differences between the two are how they handle lib
>> > > > dependencies and marshaling/unmarshaling. The low level client cleans
>> > > > up the dependencies dramatically, whereas the high level client still
>> > > > requires you to depend on elasticsearch core. On the other hand, the
>> > > > low level client does no work to handle marshaling/unmarshaling the
>> > > > requests/responses from the HTTP calls, while the high level client
>> > > > handles this for you and exposes API-specific methods. The high level
>> > > > client accepts the same request arguments as the TransportClient and
>> > > > returns the same response objects. One more thing to note is that the
>> > > > low level client claims to be compatible with all versions of ES,
>> > > > whereas the high level client appears to be only major version
>> > > > compatible.
>> > > >
>> > > > "The 5.6 client can communicate with any 5.6.x Elasticsearch node.
>> > > > Previous 5.x minor versions like 5.5.x, 5.4.x etc. are not (fully)
>> > > > supported." [2]
>> > > >
>> > > > Just as an example, here's a simple comparison of an index request in
>> > > > the low and high level APIs.
>> > > >
>> > > > *Low Level*
>> > > >
>> > > > Map<String, String> params = Collections.emptyMap();
>> > > > String jsonString = "{" +
>> > > >             "\"user\":\"kimchy\"," +
>> > > >             "\"postDate\":\"2013-01-30\"," +
>> > > >             "\"message\":\"trying out Elasticsearch\"" +
>> > > >         "}";
>> > > > HttpEntity entity = new NStringEntity(jsonString,
>> > > >         ContentType.APPLICATION_JSON);
>> > > > Response response = restClient.performRequest("PUT", "/posts/doc/1",
>> > > >         params, entity);
>> > > >
>> > > > *High Level*
>> > > >
>> > > > IndexRequest indexRequest = new IndexRequest("posts", "doc", "1")
>> > > >         .source("user", "kimchy",
>> > > >                      "postDate", new Date(),
>> > > >                      "message", "trying out Elasticsearch");
>> > > >
>> > > > *Note*: there are a few ways to do this with the high level API, but
>> > > > this was the most concise for me to offer a comparison of benefits
>> > > > over the low level API.
>> > > >
>> > > > *Thoughts/Recommendations*: I do think we should migrate to the new
>> > > > API. I think the question is which of the new APIs we should use. The
>> > > > high level client seems to shield us from having to deal with
>> > > > constructing special JSON handling code, whereas the low level client
>> > > > handles all versions of ES. I don't have a good feel (yet) for just
>> > > > how much work it would require to use the low level API, or how
>> > > > difficult it would be to add new request features in the future.
>> > > > Actually, we could probably leverage existing code we have for dealing
>> > > > with JSON maps, so this might be really easy. Someone with more
>> > > > experience in Metron's ES client use might have a better idea of the
>> > > > pros and cons to this. The high level client appears to handle all
>> > > > JSON manipulation for us, but we lose the benefit of a simpler
>> > > > dependency tree and support for all versions of ES. My only concern
>> > > > with "supports all versions" is that I have to imagine there are
>> > > > specific calls that we'd have to be careful of when constructing the
>> > > > JSON requests, so it's unclear to me if this is better or worse in the
>> > > > end.
>> > > >
>> > > > Best,
>> > > > Mike
>> > > >
>> > > >
>> > > >    1. https://www.elastic.co/blog/state-of-the-official-elasticsearch-java-clients
>> > > >    2. https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-compatibility.html
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On Wed, Sep 27, 2017 at 8:03 PM, Michael Miklavcic <
>> > > > michael.miklavcic@gmail.com> wrote:
>> > > >
>> > > > > I am working on upgrading Elasticsearch and Kibana. There are quite
>> > > > > a few changes involved with this fix. I believe I'm mostly finished
>> > > > > with the Ambari mpack side of things, however we currently only
>> > > > > support one version with no backwards compatibility. What are the
>> > > > > community's thoughts on this?
>> > > > >
>> > > > > Here is some work contributed to the community that I'm referencing
>> > > > > while working on this upgrade -
>> > > > > https://github.com/apache/metron/pull/619/files
>> > > > >
>> > > > > Best,
>> > > > > Michael Miklavcic
>> > > > >
>> > > >
>> > >
>> >
>>
>
>

Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by Otto Fowler <ot...@gmail.com>.
I think there is a missing attachment?


On October 11, 2017 at 20:22:33, Michael Miklavcic (
michael.miklavcic@gmail.com) wrote:

For community reference, here is a class diagram that depicts our current
Metron 0.4.1 dependencies, for both prod and test code, against the old ES
client APIs along with an "after" diagram showing the world with the new
client. Feedback welcome.




Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by Michael Miklavcic <mi...@gmail.com>.
For community reference, here is a class diagram that depicts our current
Metron 0.4.1 dependencies, for both prod and test code, against the old ES
client APIs along with an "after" diagram showing the world with the new
client. Feedback welcome.



On Fri, Oct 6, 2017 at 8:13 AM, Casey Stella <ce...@gmail.com> wrote:

> Yeah, I agree with what Michael "fine whine" Miklavcic said; I'm in favor
> of the high level client.

Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by Casey Stella <ce...@gmail.com>.
Yeah, I agree with what Michael "fine whine" Miklavcic said; I'm in favor
of the high level client.

On Thu, Oct 5, 2017 at 3:35 PM, Michael Miklavcic <
michael.miklavcic@gmail.com> wrote:

> Justin, thanks for the feedback! I'm inclined to agree with you about using
> the high level client. It's a bummer that we still need to do jar shading,
> but I think that's a reasonable short term sacrifice considering the other
> benefits. And they're angling towards slowly removing the ES core dep over
> time anyhow so, like myself, this will get better with age.

Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by Michael Miklavcic <mi...@gmail.com>.
Justin, thanks for the feedback! I'm inclined to agree with you about using
the high level client. It's a bummer that we still need to do jar shading,
but I think that's a reasonable short term sacrifice considering the other
benefits. And they're angling towards slowly removing the ES core dep over
time anyhow so, like myself, this will get better with age.

On Thu, Oct 5, 2017 at 12:40 PM, Justin Leet <ju...@gmail.com> wrote:

> Do we intend on (or have interest in) supporting ES across major version
> for a given version of Metron?  I'm not convinced it's worth the work of
> using the low level client.
>
> This really only seems useful for ES clusters that are being used outside
> Metron and need to be on a different ES major version. Is that a use case
> we want/need to support? I'm willing to bet it's significantly more work
> and means we're modifying queries and even templates/mappings based on what
> ES version we're interacting with (e.g. meta alerts in 5.x can exploit a
> query param to not screw around with the mapping, but that param doesn't
> exist in 2.x). At that point, we're either back to writing for ES 2.x or
> writing for every version of ES.
>
> Unless that's something we have a demand for (or someone else persuades me
> otherwise), I'm in favor of using the high level client.  It seems like
> it'd be easier to migrate to also, given the similarities API-wise to the
> current client we're using.

Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by Justin Leet <ju...@gmail.com>.
Do we intend to support (or have interest in supporting) ES across major
versions for a given version of Metron?  I'm not convinced it's worth the
work of using the low level client.

This really only seems useful for ES clusters that are being used outside
Metron and need to be on a different ES major version. Is that a use case
we want/need to support? I'm willing to bet it's significantly more work
and means we're modifying queries and even templates/mappings based on what
ES version we're interacting with (e.g. meta alerts in 5.x can exploit a
query param to not screw around with the mapping, but that param doesn't
exist in 2.x). At that point, we're either back to writing for ES 2.x or
writing for every version of ES.

Unless that's something we have a demand for (or someone else persuades me
otherwise), I'm in favor of using the high level client.  It seems like
it'd be easier to migrate to also, given the similarities API-wise to the
current client we're using.

On Thu, Oct 5, 2017 at 1:52 PM, Michael Miklavcic <
michael.miklavcic@gmail.com> wrote:

> I think it might help the discussion to share my impressions of looking
> over the new API recommendations from ES. I've summarized some info
> provided by ES back in December 2016 regarding the reasons for switching to
> a new client model. [1]

Re: [DISCUSS] Upgrading Elasticsearch from 2.x to 5.x

Posted by Michael Miklavcic <mi...@gmail.com>.
I think it might help the discussion to share my impressions of looking
over the new API recommendations from ES. I've summarized some info
provided by ES back in December 2016 regarding the reasons for switching to
a new client model. [1]

*Summary points:*

Pre-5.x there was the Java API, which uses the same binary exchange format
that is used for node-to-node communications.
In 5.x a low level REST client was added. Now there's also a high level REST
client that handles request marshalling and response unmarshalling.

*Benefits of existing Java API*

   1. Theoretically faster - binary format, no JSON parsing
   2. Hardened, used for internal ES node to node communications

*Cons of Java API*

   1. Benchmarks show it's not really that much faster.
   2. Backwards compatibility - Java API changes often.
   3. Upgrades more challenging - need to refactor client code for new and
   deprecated features.
   4. Minor releases may contain breaking changes in the Java API
   5. Client and server *should* be on the same JVM version (not as
   important post 2.x, but still potentially necessary because of
   serialization with the binary format)
   6. Requires dependency on the entire elasticsearch server in order to
   use the client. We end up shading jars.

*Benefits of new REST API*

   1. Upgrades
      1. Breaking changes only made in major releases - "We are very
      careful with backwards compatibility on the REST layer where breaking
      changes are made only in major releases."
      2. "The REST interface is much more stable and can be upgraded out of
      step with the Elasticsearch cluster."
   2. REST client and server can be on different JVMs
   3. Dependencies for the low level client are very slim. No need for
   shading.
   4. The RestHighLevelClient supports the same request and response
   objects as the TransportClient
   5. Can be secured via HTTPS

There are some additional benefits to the new API, but they depend on
whether we choose to go with the high or low level client. More comments
below.

*Cons of new API*

   1. Dependencies - The high level client still requires the full ES
   dependency, though this will slim down in future releases.

*Other comments specific to Metron*

There's a question of whether we should use the low or high level REST
client. The main differences between the two are how they handle lib
dependencies and marshaling/unmarshaling. The low level client cleans up
the dependencies dramatically, whereas the high level client still requires
you to depend on elasticsearch core. On the other hand, the low level
client does no work to handle marshaling/unmarshaling the
requests/responses from the HTTP calls while the high level client handles
this for you and exposes api-specific methods. The high level client
accepts the same request arguments as the TransportClient and returns the
same response objects. One more thing to note is that the low level client
claims to be compatible with all versions of ES, whereas the high level
client appears to be compatible only within a major version, and even then
not fully with older minor versions:

"The 5.6 client can communicate with any 5.6.x Elasticsearch node. Previous
5.x minor versions like 5.5.x, 5.4.x etc. are not (fully) supported." [2]
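
Read literally, the compatibility note quoted above amounts to a "same major and minor version" rule. Here is a minimal sketch of that rule as a check, assuming plain dotted numeric version strings. The class and method names are hypothetical and are not part of Metron or of any Elasticsearch client; this only illustrates the quoted statement, it is not how the client itself decides anything.

```java
// Hypothetical helper illustrating the quoted compatibility note.
// Assumption: versions are plain "major.minor[.patch]" strings.
final class ClientCompatibilitySketch {

    // True when client and node share the same major and minor version,
    // e.g. a 5.6.x client talking to a 5.6.x node.
    static boolean isFullySupported(String clientVersion, String nodeVersion) {
        int[] client = majorMinor(clientVersion);
        int[] node = majorMinor(nodeVersion);
        return client[0] == node[0] && client[1] == node[1];
    }

    // Parses the first two numeric components of a dotted version string.
    private static int[] majorMinor(String version) {
        String[] parts = version.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException(
                "expected major.minor[.patch], got: " + version);
        }
        return new int[] {
            Integer.parseInt(parts[0]),
            Integer.parseInt(parts[1])
        };
    }

    public static void main(String[] args) {
        System.out.println(isFullySupported("5.6.3", "5.6.0"));
        System.out.println(isFullySupported("5.6.3", "5.5.2"));
    }
}
```

If we went with the high level client, a check along these lines at startup could at least warn when Metron is pointed at a cluster on a different minor version.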

Just as an example, here's a simple comparison of an index request in the
low and high level APIs.

*Low Level*

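// No extra query string parameters for this request.
Map<String, String> params = Collections.emptyMap();
// The low level client does no marshalling, so the JSON body is built by hand.
String jsonString = "{" +
            "\"user\":\"kimchy\"," +
            "\"postDate\":\"2013-01-30\"," +
            "\"message\":\"trying out Elasticsearch\"" +
        "}";
HttpEntity entity = new NStringEntity(jsonString,
        ContentType.APPLICATION_JSON);
// Index the document with id 1 into index "posts", type "doc".
Response response = restClient.performRequest("PUT", "/posts/doc/1",
        params, entity);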

*High Level*

IndexRequest indexRequest = new IndexRequest("posts", "doc", "1")
        .source("user", "kimchy",
                     "postDate", new Date(),
                     "message", "trying out Elasticsearch");

*Note*: there are a few ways to do this with the high level API, but this
was the most concise for me to offer a comparison of benefits over the low
level API.

*Thoughts/Recommendations*: I do think we should migrate to the new API. The
question is which of the new clients we should use. The high level client
seems to shield us from having to write our own JSON handling code, whereas
the low level client works with all versions of ES. I don't have a good feel
(yet) for just how much work it would take to use the low level API, or how
difficult it would be to add new request features in the future. Actually,
we could probably leverage existing code we have for dealing with JSON maps,
so this might be really easy. Someone with more experience in Metron's ES
client use might have a better idea of the pros and cons here. The high
level client appears to handle all JSON manipulation for us, but we lose the
benefit of a simpler dependency tree and support for all versions of ES. My
only concern with "supports all versions" is that I have to imagine there
are specific calls that we'd have to be careful with when constructing the
JSON requests, so it's unclear to me if this is better or worse in the end.
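
To make the "JSON maps" idea a bit more concrete, here is a minimal sketch of building the request body for the low level client from a Map instead of concatenating strings. Everything in it is an assumption for illustration: JsonBodySketch and toJson are hypothetical names, not existing Metron or Elasticsearch code, and it only handles flat maps of strings, numbers, and booleans. In practice we would more likely lean on an existing JSON library than hand-roll this.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Metron or the Elasticsearch client.
final class JsonBodySketch {

    // Serializes a flat map of strings, numbers, and booleans to a JSON object.
    // Nested maps, arrays, and non-finite doubles are not handled here.
    static String toJson(Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":");
            Object v = e.getValue();
            if (v == null) {
                sb.append("null");
            } else if (v instanceof Number || v instanceof Boolean) {
                sb.append(v);
            } else {
                sb.append('"').append(escape(v.toString())).append('"');
            }
        }
        return sb.append('}').toString();
    }

    // Escapes the characters that JSON requires to be escaped inside a string.
    private static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Same document as the low level example above.
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("user", "kimchy");
        doc.put("postDate", "2013-01-30");
        doc.put("message", "trying out Elasticsearch");
        System.out.println(toJson(doc));
    }
}
```

The resulting string could be passed to NStringEntity in place of the hand-built jsonString, which is roughly the marshalling work the high level client would otherwise do for us.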

Best,
Mike


   1. https://www.elastic.co/blog/state-of-the-official-elasticsearch-java-clients
   2. https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-compatibility.html




On Wed, Sep 27, 2017 at 8:03 PM, Michael Miklavcic <
michael.miklavcic@gmail.com> wrote:

> I am working on upgrading Elasticsearch and Kibana. There are quite a few
> changes involved with this vix. I believe I'm mostly finished with the
> Ambari mpack side of things, however we currently only support one version
> with no backwards compatibility. What is the community's thoughts on this?
>
> Here is some work contributed to the community that I'm referencing while
> working on this upgrade - https://github.com/apache/metron/pull/619/files
>
> Best,
> Michael Miklavcic
>