Posted to dev@beam.apache.org by Etienne Chauchot <ec...@apache.org> on 2020/04/09 09:06:24 UTC

Re: A new reworked Elasticsearch 7+ IO module

Hi Kenn,

The user does not specify the targeted backendVersion (at least in the 
current version of the IO); it is transparent to them: the IO detects 
the version with a REST call and adapts its behavior. Anyway, I agree 
that we need at least a WARN if the detected version is 2. As the new IO 
will not be compatible with ES v2 (the ES classes differ too much to 
share a common production code base), the only option on the new IO is, 
IMHO, to reject completely if the version is 2.
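
To make the two behaviors concrete, here is a minimal sketch of the check. 
It is not the actual IO code: the class and method names are hypothetical, 
and it assumes the version string has already been read from the 
version.number field of the cluster's root REST endpoint response.

```java
import java.util.logging.Logger;

/** Hypothetical helper, not part of ElasticsearchIO. */
final class BackendVersionPolicy {
  enum Action { ACCEPT, WARN, REJECT }

  private static final Logger LOG =
      Logger.getLogger(BackendVersionPolicy.class.getName());

  // Assumptions taken from this thread: the new IO targets ES 5 to 7,
  // while the old REST-based IO still accepts ES 2 but should warn.
  static final int MIN_SUPPORTED_NEW_IO = 5;
  static final int DEPRECATED_IN_OLD_IO = 2;

  /** Extracts the major version from a string such as "7.6.2". */
  static int parseMajor(String versionNumber) {
    int dot = versionNumber.indexOf('.');
    String major = dot < 0 ? versionNumber : versionNumber.substring(0, dot);
    return Integer.parseInt(major.trim());
  }

  /** Decides what to do for a detected major version. */
  static Action decide(int major, boolean newIo) {
    if (newIo) {
      return major < MIN_SUPPORTED_NEW_IO ? Action.REJECT : Action.ACCEPT;
    }
    return major <= DEPRECATED_IN_OLD_IO ? Action.WARN : Action.ACCEPT;
  }

  /** Applies the decision: throws on REJECT, logs on WARN. */
  static void enforce(String versionNumber, boolean newIo) {
    int major = parseMajor(versionNumber);
    switch (decide(major, newIo)) {
      case REJECT:
        throw new IllegalArgumentException(
            "Elasticsearch " + versionNumber + " is not supported by this IO");
      case WARN:
        LOG.warning("Elasticsearch " + versionNumber
            + " support is deprecated and will be removed");
        break;
      default:
        break;
    }
  }
}
```

With this sketch, enforce("2.4.6", false) only logs a warning (old IO), 
whereas enforce("2.4.6", true) throws (new IO).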

Best

Etienne

On 06/03/2020 18:49, Kenneth Knowles wrote:
> Since the user provides backendVersion, here are some possible levels 
> of things to add in expand() based on that (these are extra niceties 
> beyond the agreed number of releases to remove)
>
>  - WARN for backendVersion < n
>  - reject for backendVersion < n with opt-in pipeline option to keep 
> it working one more version (gets their attention and indicates urgency)
>  - reject completely
>
> Kenn
>
> On Fri, Mar 6, 2020 at 2:26 AM Etienne Chauchot <echauchot@apache.org 
> <ma...@apache.org>> wrote:
>
>     Hi all,
>
>     it's been 3 weeks since the survey on ES versions the users use.
>
>     The survey received very few responses: only 9 responses for now
>     (multiple versions possible of course). The responses are the
>     following:
>
>     ES2: 0 clients, ES5: 1, ES6: 5, ES7: 8
>
>     The results point toward dropping ES2 support, but for now the
>     sample is still not very representative.
>
>     I'm cross-posting to @users to let you know that I'm closing the
>     survey within 1 or 2 weeks. So please respond if you're using ESIO.
>
>     Best
>
>     Etienne
>
>     On 13/02/2020 12:37, Etienne Chauchot wrote:
>>
>>     Hi Cham, thanks for your comments!
>>
>>     I just sent an email to user ML with a survey link to count ES
>>     uses per version:
>>
>>     https://lists.apache.org/thread.html/rc8185afb8af86a2a032909c13f569e18bd89e75a5839894d5b5d4082%40%3Cuser.beam.apache.org%3E
>>
>>     Best
>>
>>     Etienne
>>
>>     On 10/02/2020 19:46, Chamikara Jayalath wrote:
>>>
>>>
>>>     On Thu, Feb 6, 2020 at 8:13 AM Etienne Chauchot
>>>     <echauchot@apache.org <ma...@apache.org>> wrote:
>>>
>>>         Hi,
>>>
>>>         please see my comments inline
>>>
>>>         On 06/02/2020 16:24, Alexey Romanenko wrote:
>>>>         Please, see my comments inline.
>>>>
>>>>>         On 6 Feb 2020, at 10:50, Etienne Chauchot
>>>>>         <echauchot@apache.org <ma...@apache.org>> wrote:
>>>>>>
>>>>>>>>                 1. regarding version support: ES v2 has not been
>>>>>>>>                 maintained by Elastic since 2018/02, so we plan
>>>>>>>>                 to remove it from the IO. In the past we
>>>>>>>>                 already retired versions (like spark 1.6 for
>>>>>>>>                 instance).
>>>>>>>>
>>>>>>
>>>>>>             My only concern here is that there might be users who
>>>>>>             use the existing module who might not be able to
>>>>>>             easily upgrade the Beam version if we remove it. But
>>>>>>             given that V2 is 5 versions behind the latest release
>>>>>>             this might be OK.
>>>>>>
>>>>>>
>>>>>>         It seems we have a consensus on this.
>>>>>>         I think there should be another general discussion on the
>>>>>>         long-term support of our preferred tool IO modules.
>>>>>
>>>>>         => yes, consensus, let's drop ESV2
>>>>>
>>>>         We had (and still have) a similar problem with KafkaIO to
>>>>         support different versions of Kafka, especially very old
>>>>         version 0.9. We raised this question on user@ and it
>>>>         appears that there are users who for some reasons still use
>>>>         old Kafka versions. So, before dropping support for any ES
>>>>         version, I’d suggest asking on user@ to see whether anyone
>>>>         would be affected by this.
>>>         Yes, we can survey users, but the question is: should we
>>>         support an ES version that is no longer supported by
>>>         Elastic themselves?
>>>
>>>
>>>     +1 for asking in the user list. I guess this is more about
>>>     whether users need this specific version that we hope to drop
>>>     support for. Whether we need to support unsupported versions is
>>>     a more generic question that should prob. be addressed in the
>>>     dev list. (and I personally don't think we should unless there's
>>>     a large enough user base for a given version).
>>>
>>>>>>>>                 2. regarding the user: the aim is to unlock
>>>>>>>>                 some new features (listed by Ludovic) and give
>>>>>>>>                 users more flexibility in their requests. That
>>>>>>>>                 requires using the high-level Java ES client
>>>>>>>>                 in place of the low-level REST client (which
>>>>>>>>                 was used because it is the only one
>>>>>>>>                 compatible with all ES versions). We plan to
>>>>>>>>                 replace the API (JSON document in and out) with
>>>>>>>>                 more complete standard ES objects that contain
>>>>>>>>                 the request logic (insert/update, doc routing,
>>>>>>>>                 etc.) and the data. There are already IOs
>>>>>>>>                 like SpannerIO that use similar objects in the
>>>>>>>>                 input PCollection rather than pure POJOs.
>>>>>>>>
>>>>>>
>>>>>>             Won't this be a breaking change for all users? IMO
>>>>>>             using POJOs in PCollections is safer, since otherwise
>>>>>>             we have to worry about changes to the underlying client
>>>>>>             library API. An exception would be when the underlying
>>>>>>             client library offers a backward-compatibility guarantee
>>>>>>             that we can rely on for the foreseeable future (for
>>>>>>             example, BQ TableRow).
>>>>>>
>>>>>>
>>>>>>         Agreed but actually, there will be POJOs in order to
>>>>>>         abstract Elasticsearch's version support. The following
>>>>>>         third point explains this.
>>>>>
>>>>>         => indeed, it will be a breaking change, hence this email
>>>>>         to get a consensus on that. Also, I think our wrappers of
>>>>>         ES request objects will offer the same backward
>>>>>         compatibility as the underlying objects.
>>>>>
>>>>         I just want to remind everyone that, according to what we
>>>>         agreed some time ago on dev@ (at least for IOs), all breaking
>>>>         user API changes have to be added along with a deprecation of
>>>>         the old API, which can be removed after 3 consecutive Beam
>>>>         releases. That way, users have time to move to the new API
>>>>         smoothly.
>>>
>>>         We are mostly discussing the target architecture of the new
>>>         module here, but the deprecation process is important to
>>>         recall, I agree. When I say the DTOs are backward compatible
>>>         above, I mean between per-version sub-modules inside the new
>>>         module. Anyway, sure, for some time both modules (the old
>>>         REST-based one that supports v2-7 and the new one that
>>>         supports v5-7) will coexist, and the old one will receive the
>>>         deprecation annotations.
>>>
>>>
>>>     +1 for supporting both versions for at least three minor
>>>     versions to give users time to migrate. Also, we should try to
>>>     produce a warning for users who use the deprecated versions.
>>>
>>>     Thanks,
>>>     Cham
>>>
>>>         Best
>>>
>>>         Etienne
>>>
>>>>
>>>>