Posted to dev@beam.apache.org by Etienne Chauchot <ec...@apache.org> on 2020/04/09 09:06:24 UTC
Re: A new reworked Elasticsearch 7+ IO module
Hi Kenn,
The user does not specify the targeted backendVersion (at least in the
current version of the IO); it is transparent to them: the IO detects the
version with a REST call and adapts its behavior. But anyway, I agree,
we need to log at least a WARN if the detected version is 2. As the new IO
will not be compatible with ES v2 (because the ES classes differ too much to
share a common code base), the only option in the new IO is to
reject completely if the version is 2, IMHO.
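
To make the detect-then-reject idea concrete, here is a minimal sketch in
plain Java. It is not the IO's actual code: the class and method names are
made up for illustration, the minimum supported major version (5) is an
assumption taken from the "v5-7" range mentioned further down in this
thread, and the regex-based parsing only stands in for a real JSON parser.
It assumes the cluster root endpoint (GET /) returns a body with a
version.number field such as "7.6.1".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, for illustration only; not part of ElasticsearchIO. */
final class BackendVersionCheck {

  // Assumption: the new IO supports ES 5 to 7, so anything older is rejected.
  static final int MIN_SUPPORTED_MAJOR = 5;

  // Captures the major part of the "number" field, e.g. "7" in "7.6.1".
  private static final Pattern VERSION_NUMBER =
      Pattern.compile("\"number\"\\s*:\\s*\"(\\d+)\\.");

  /** Extracts the major version from the JSON body of the cluster root response. */
  static int parseMajorVersion(String rootResponseJson) {
    Matcher m = VERSION_NUMBER.matcher(rootResponseJson);
    if (!m.find()) {
      throw new IllegalArgumentException("No version number found in response");
    }
    return Integer.parseInt(m.group(1));
  }

  /** Rejects backends older than the assumed minimum supported major version. */
  static void checkSupported(int major) {
    if (major < MIN_SUPPORTED_MAJOR) {
      throw new IllegalStateException(
          "Elasticsearch " + major + ".x is not supported by this IO; minimum is "
              + MIN_SUPPORTED_MAJOR + ".x");
    }
  }
}
```

With this sketch, a detected "2.4.6" backend would fail fast in
checkSupported, while "7.6.1" would pass through unchanged.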
Best
Etienne
On 06/03/2020 18:49, Kenneth Knowles wrote:
> Since the user provides backendVersion, here are some possible levels
> of things to add in expand() based on that (these are extra niceties
> beyond the agreed number of releases to remove)
>
> - WARN for backendVersion < n
> - reject for backendVersion < n with opt-in pipeline option to keep
> it working one more version (gets their attention and indicates urgency)
> - reject completely
>
> Kenn
>
> On Fri, Mar 6, 2020 at 2:26 AM Etienne Chauchot <echauchot@apache.org>
> wrote:
>
> Hi all,
>
> it's been 3 weeks since the survey on ES versions the users use.
>
> The survey received very few responses: only 9 so far
> (multiple versions could be selected, of course). The responses are
> as follows:
>
> ES2: 0 clients, ES5: 1, ES6: 5, ES7: 8
>
> This points toward dropping ES2 support, but for now the sample is
> still not very representative.
>
> I'm cross-posting to @users to let you know that I'm closing the
> survey within 1 or 2 weeks. So please respond if you're using ESIO.
>
> Best
>
> Etienne
>
> On 13/02/2020 12:37, Etienne Chauchot wrote:
>>
>> Hi Cham, thanks for your comments !
>>
>> I just sent an email to the user ML with a survey link to count ES
>> usage per version:
>>
>> https://lists.apache.org/thread.html/rc8185afb8af86a2a032909c13f569e18bd89e75a5839894d5b5d4082%40%3Cuser.beam.apache.org%3E
>>
>> Best
>>
>> Etienne
>>
>> On 10/02/2020 19:46, Chamikara Jayalath wrote:
>>>
>>>
>>> On Thu, Feb 6, 2020 at 8:13 AM Etienne Chauchot
>>> <echauchot@apache.org> wrote:
>>>
>>> Hi,
>>>
>>> please see my comments inline
>>>
>>> On 06/02/2020 16:24, Alexey Romanenko wrote:
>>>> Please, see my comments inline.
>>>>
>>>>> On 6 Feb 2020, at 10:50, Etienne Chauchot
>>>>> <echauchot@apache.org> wrote:
>>>>>>
>>>>>>>> 1. regarding version support: ES v2 has not been
>>>>>>>> maintained by Elastic since 2018/02, so we plan
>>>>>>>> to remove it from the IO. In the past we
>>>>>>>> already retired versions (like Spark 1.6, for
>>>>>>>> instance).
>>>>>>>>
>>>>>>
>>>>>> My only concern here is that there might be users who
>>>>>> use the existing module who might not be able to
>>>>>> easily upgrade the Beam version if we remove it. But
>>>>>> given that V2 is 5 versions behind the latest release
>>>>>> this might be OK.
>>>>>>
>>>>>>
>>>>>> It seems we have a consensus on this.
>>>>>> I think there should be another general discussion on the
>>>>>> long-term support of our preferred tool IO modules.
>>>>>
>>>>> => yes, consensus, let's drop ESV2
>>>>>
>>>> We had (and still have) a similar problem with KafkaIO in
>>>> supporting different versions of Kafka, especially the very old
>>>> version 0.9. We raised this question on user@ and it
>>>> appears that there are users who, for some reason, still use
>>>> old Kafka versions. So, before dropping support for any ES
>>>> version, I'd suggest asking on user@ to see whether anyone
>>>> would be affected by this.
>>> Yes, we can do a survey among users, but the question is:
>>> should we support an ES version that is no longer supported by
>>> Elastic themselves?
>>>
>>>
>>> +1 for asking in the user list. I guess this is more about
>>> whether users need this specific version that we hope to drop
>>> support for. Whether we need to support unsupported versions is
>>> a more generic question that should prob. be addressed in the
>>> dev list. (and I personally don't think we should unless there's
>>> a large enough user base for a given version).
>>>
>>>>>>>> 2. regarding the user: the aim is to unlock
>>>>>>>> some new features (listed by Ludovic) and give
>>>>>>>> the user more flexibility in their requests.
>>>>>>>> That requires using the high-level Java ES
>>>>>>>> client in place of the low-level REST client
>>>>>>>> (which was used because it is the only one
>>>>>>>> compatible with all ES versions). We plan to
>>>>>>>> replace the API (JSON document in and out) with
>>>>>>>> more complete standard ES objects that contain
>>>>>>>> the request logic (insert/update, doc routing,
>>>>>>>> etc.) and the data. There are already IOs
>>>>>>>> like SpannerIO that use similar objects in the
>>>>>>>> input PCollection rather than pure POJOs.
>>>>>>>>
>>>>>>
>>>>>> Won't this be a breaking change for all users? IMO
>>>>>> using POJOs in PCollections is safer, since otherwise we
>>>>>> have to worry about changes to the underlying client
>>>>>> library API. An exception would be when the underlying
>>>>>> client library offers a backwards compatibility guarantee
>>>>>> that we can rely on for the foreseeable future (for
>>>>>> example, BQ TableRow).
>>>>>>
>>>>>>
>>>>>> Agreed but actually, there will be POJOs in order to
>>>>>> abstract Elasticsearch's version support. The following
>>>>>> third point explains this.
>>>>>
>>>>> => indeed it will be a breaking change, hence this email
>>>>> to get a consensus on that. Also, I think our wrappers of
>>>>> ES request objects will offer the same backward
>>>>> compatibility as the underlying objects
>>>>>
>>>> I just want to remind everyone that, according to what we agreed
>>>> some time ago on dev@ (at least for IOs), all breaking user API
>>>> changes have to be added along with a deprecation of the old API,
>>>> which can be removed after 3 consecutive Beam releases. That
>>>> way, users will have time to move to the new API smoothly.
>>>
>>> We are mostly discussing the target architecture of the new
>>> module here, but the deprecation process is important to
>>> recall, I agree. When I say above that the DTOs are backward
>>> compatible, I mean between per-version sub-modules inside the
>>> new module. Anyway, sure, for some time both modules (the old
>>> REST-based one that supports v2-7 and the new one that supports
>>> v5-7) will coexist, and the old one will receive the
>>> deprecation annotations.
>>>
>>>
>>> +1 for supporting both versions for at least three minor
>>> versions to give users time to migrate. Also, we should try to
>>> produce a warning for users who use the deprecated versions.
>>>
>>> Thanks,
>>> Cham
>>>
>>> Best
>>>
>>> Etienne
>>>
>>>>
>>>>