Posted to dev@pulsar.apache.org by anonymitaet _ <an...@hotmail.com> on 2019/11/11 13:33:53 UTC

[Community Weekly Update] 2019-11-02 ~ 2019-11-08

Dear Pulsar enthusiast,

This is the weekly community update for 2019-11-02 ~ 2019-11-08, with updates on sticky consumers for key-shared subscriptions, producer support for sending messages with different schemas, the release of BookKeeper 4.10, and more.

This Pulsar weekly update is also available at https://streamnative.io/weekly/2019/2019-11/2019-11-08-pulsar-weekly/.

===================
Pulsar Development
===================

- [PIP-49][Permissions] The discussion on improving permissions continues. The proposal is to narrow the scope of the PIP to documenting the permissions for the admin API and making the implementation conform to that documentation.

    https://github.com/apache/pulsar/wiki/PIP-49%3A-Permission-levels-and-inheritance

- [PIP-43][Schema] PIP-43 is completed. Producers are now able to send messages with different schemas. Many thanks to Yi Tang for this incredible contribution.

    https://github.com/apache/pulsar/wiki/PIP-43%3A-producer-send-message-with-different-schema

- [PIP-34] Penghui Li introduced sticky consumers for `key_shared` subscriptions. This makes it possible to implement an exactly-once source connector for Flink with auto-scaling.

    https://github.com/apache/pulsar/pull/5388

- [Pulsar Manager] The vote for the first Apache release of Pulsar Manager has started.

    https://lists.apache.org/thread.html/eb7766661380652fd61ba09c1f6bf198c49b7f45dac9915908a073d1@%3Cdev.pulsar.apache.org%3E

- [BookKeeper] The vote for the Apache BookKeeper 4.10 release has passed. Pulsar is bumping its BookKeeper version to 4.10 in its upcoming releases.

    https://lists.apache.org/thread.html/b4f1545bd23552e7fe9f9946b73c135e4d49a0717af8a1051913b48e@%3Cdev.bookkeeper.apache.org%3E

===================
Notable Feature
===================

- [Connectors] Add subscribe position param for consumers of input topics of a sink (Release: 2.4.2, 2.5.0)

    https://github.com/apache/pulsar/pull/5532

- [Client][C++/Python] Increase number of digits in temporary subscription name for readers (Release: 2.5.0)

    https://github.com/apache/pulsar/pull/5547

- [Client][C++/Python] Add seek-by-time support for C++ and Python clients (Release: 2.5.0)

    https://github.com/apache/pulsar/pull/5542

===================
Notable Bug Fix
===================

- [Schema] Fix "Trying to subscribe with incompatible schema" (Release: 2.4.2, 2.5.0)

    https://github.com/apache/pulsar/issues/4790

    https://github.com/apache/pulsar/pull/5563

- [Schema][Functions] Fix Java function errors using Protobuf schema (Release: 2.4.2, 2.5.0)

    https://github.com/apache/pulsar/pull/5569

- [Kubernetes][Prometheus] Fix ZooKeeper Kubernetes annotations so that Prometheus pulls metrics from port 8000 (Release: 2.4.2, 2.5.0)

    https://github.com/apache/pulsar/pull/5601

===================
Event/News
===================

* [Conference] Devoxx was held on November 4-8 in Belgium. Quentin Adam and Steven Le Roux presented at this conference and shared the architecture, concepts, and benchmarks of Apache Pulsar.

    https://www.youtube.com/watch?v=De6avNyQUMw

* [Conference] The 4th Workshop on Real-time & Stream Analytics in Big Data & Stream Data Management takes place on December 9-12 in Los Angeles. Matteo Merli will attend and give a talk about Apache Pulsar.

    https://workshop.euranova.eu/bigdata19.html

* [Anniversary] Apache BookKeeper celebrated its 5th anniversary this month.

    https://projects.apache.org/committee.html?bookkeeper

===================
Blog/Article
===================

* Apache Pulsar — One Cluster for the Entire Enterprise Using Multi-Tenancy (by Karthikeyan Palanivelu)

    https://medium.com/capital-one-tech/apache-pulsar-one-cluster-for-the-entire-enterprise-using-multi-tenancy-ac0bd925fbdf

* Apache Pulsar: configuring tiered-storage (AWS S3) via HELM (by Thomas Memenga)

    https://www.syscrest.com/2019/11/apache-pulsar-tiered-storage-helm-aws-s3/

If we missed anything, please reply to this thread. Thank you.

Cheers,

Sijie Guo, Yu Liu (@Anonymitaet)

From: anonymitaet _ <an...@hotmail.com>
Date: Tuesday, November 5, 2019 at 01:00
To: "users@pulsar.apache.org" <us...@pulsar.apache.org>, "dev@pulsar.apache.org" <de...@pulsar.apache.org>
Subject: [Community Weekly Update] 2019-10-26 ~ 2019-11-01

Dear Pulsar enthusiast,

This is the weekly community update for 2019-10-26 ~ 2019-11-01, with updates on new PIPs kicking off discussions on revisiting the admin API and permissions, the introduction of package management in Pulsar Functions, a new Pulsar admin CLI `pulsarctl`, the first Apache release of Pulsar Manager, and more.

This Pulsar weekly update is also available at https://streamnative.io/weekly/2019/2019-11/2019-11-01-pulsar-weekly/.

===================
Pulsar Development
===================

- [Pulsar Manager] The release plan has been settled. Guangning is going to kick off the first release.

    https://lists.apache.org/thread.html/b3f05f0f50a6b8c32536c5a29f8e1ce5c6efe42b2a6ccf91f28d33a0@%3Cdev.pulsar.apache.org%3E

- [PIP-47][Release] Time-Based Release Plan

    PIP-47 was submitted to propose moving towards a time-based release plan, so that Pulsar development can enter a faster feedback cycle and users can benefit from features shipped sooner.

    - https://lists.apache.org/thread.html/6509f012d52241090dd987609ed7eb27ab29af224217f6934cfee64c@%3Cdev.pulsar.apache.org%3E

    - https://github.com/apache/pulsar/wiki/PIP-47%3A-Time-Based-Release-Plan

- [PIP-48][admin] Hierarchical admin API

    Steven and Florentin from OVH kicked off a discussion on improving the current admin API.

    - https://lists.apache.org/thread.html/742ff9ffcc1f7212acf5e0eaae4bbd45c4754a43b7a36f22bd57578a@%3Cdev.pulsar.apache.org%3E

    - https://github.com/apache/pulsar/wiki/PIP-48%3A-hierarchical-admin-api

- [PIP-49][admin] Permission levels and inheritance

    Xiaolong started a proposal for revisiting the permission levels and inheritance in the Pulsar admin API.

    - https://lists.apache.org/thread.html/d56048b007d713d7b927c37e4cdf23a353852b2f7d6491840f1e3e0f@%3Cdev.pulsar.apache.org%3E

    - https://github.com/apache/pulsar/wiki/PIP-49%3A-Permission-levels-and-inheritance

- [PIP-50][Functions] Package Management

    Yong proposed introducing a package management system for managing different versions of functions and connectors.

    - https://lists.apache.org/thread.html/61a8138b0a304298d0de6178c5b870b38bd32bd762374486777f2c8c@%3Cdev.pulsar.apache.org%3E

    - https://github.com/apache/pulsar/wiki/PIP-50%3A-Package-Management

- [Document] The broken links in the website have been fixed. Kudos to Guangning and Jennifer.

- [Document] A new discussion thread started to introduce versioning in managing Pulsar client API documentation.

- [Functions] Distributed the CA for KubernetesSecretsTokenAuthProvider (Release: 2.5.0)

    - https://github.com/apache/pulsar/pull/5469

    - https://lists.apache.org/thread.html/a6a91990907cf8d2458ff8b7e1e916a9bfb17ec0cd216d57c66ab96c@%3Cdev.pulsar.apache.org%3E

- [BookKeeper] Apache BookKeeper is voting on its 4.10 release, which will unblock the Pulsar 2.4.2 release. Thank you to Enrico from the BookKeeper community!

    https://lists.apache.org/thread.html/d57f500fe8383903922fc4be3c3901fb0c00fda777672f32653c979e@%3Cdev.bookkeeper.apache.org%3E

===================
Notable Features
===================

- [Functions] Added deletion of state for Functions (Release: 2.5.0)

    https://github.com/apache/pulsar/pull/5469

- [Functions] Distributed the CA for KubernetesSecretsTokenAuthProvider (Release: 2.5.0)

    https://github.com/apache/pulsar/pull/5398

- [ZooKeeper] Bumped ZooKeeper to version 3.5.6

    https://github.com/apache/pulsar/pull/5043

- [Functions] Function runtime pluggable

    https://github.com/apache/pulsar/pull/5463

===================
Notable Bug Fix
===================

- [Client][Java] Wrongly report "3600 messages have timed-out" (Fixed, Release: 2.4.2 / 2.5.0)

    https://github.com/apache/pulsar/pull/5477

- [Functions] Fixed an issue where Pulsar could not load a customized SerDe (Fixed, Release: 2.4.2 / 2.5.0)

    https://github.com/apache/pulsar/pull/5357

- [Authentication] Fixed broken custom auth-provider that uses authenticationData (Fixed, Release: 2.4.2 / 2.5.0)

    https://github.com/apache/pulsar/pull/5462

- [Tiered Storage] Only seek when reading unexpected entry (Fixed, Release: 2.4.2 / 2.5.0)

    https://github.com/apache/pulsar/pull/5356

- [Broker] Trim messages whose position is less than the mark-delete position for message redelivery (Fixed, Release: 2.4.2 / 2.5.0)

    https://github.com/apache/pulsar/pull/5378

- [Admin] Fix listing of non-persistent topics also showing persistent topics (Fixed, Release: 2.4.2 / 2.5.0)

    https://github.com/apache/pulsar/pull/5502

===================
Ecosystem
===================

- [CLI] StreamNative open sourced a Pulsar Go admin client and a new CLI tool `pulsarctl` (built on the Pulsar Go admin client).

    https://github.com/streamnative/pulsarctl

- Hadoop-unit (https://github.com/jetoile/hadoop-unit) added support for Pulsar and BookKeeper.

    https://github.com/jetoile/hadoop-unit/blob/master/CHANGELOG.md#v36-20191101-1603-0000

- Pulsar.Client 0.8.0 was released with Reader support.

    https://www.nuget.org/packages/Pulsar.Client/

===================
Event / News
===================

- [Meetup] A new Apache Pulsar meetup is coming in Shanghai (China) on November 16, which will feature adoption stories from China Telecom, Zhaopin, and TuyaSmart.

    Signup link: https://www.eventbrite.com/e/apache-pulsar-meetup-shanghai-tickets-79293658467

===================
Blog / Article
===================

- [Video] The presentation "Apache Pulsar 101: architecture, concepts et comparaison" given by Quentin Adam & Steven Le Roux at DevFest Nantes 2019 is now available online.

    https://www.youtube.com/watch?v=5fqhT82wghY&list=PLuZ_sYdawLiUjPGPsOvBcgBxC6yP_HSA6&index=55&t=0s

If we missed anything, please reply to this thread. Thank you.

Cheers,

Sijie Guo, Yu Liu (@Anonymitaet)

From: anonymitaet _ <an...@hotmail.com>
Date: Monday, October 28, 2019 at 21:40
To: "users@pulsar.apache.org" <us...@pulsar.apache.org>, "dev@pulsar.apache.org" <de...@pulsar.apache.org>
Subject: [Community Weekly Update] 2019-10-19 ~ 2019-10-25

Dear Pulsar enthusiast,

This is the community weekly update for 2019-10-19 ~ 2019-10-25, which helps you quickly catch up on Pulsar's highlights and spot trends from the last week, while strengthening communication and connection within the Pulsar family.

All Pulsar weekly updates are available at https://streamnative.io/weekly/.

===================
Pulsar Development
===================

* [CI] ASF Jenkins is back to 'normal' after reverting 'Add default loader for latest pyyaml (#4974)' [1] in #5432 [2]. The problem came from the use of pyyaml with Python 2.7, which caused function workers to fail to start in integration tests. The committers have started merging pull requests again.

    [1] https://github.com/apache/pulsar/pull/4974

    [2] https://github.com/apache/pulsar/pull/5432

* [PIP-43] The main logic for allowing producers to send messages with different schemas was merged. With this change, Pulsar can support event sourcing applications that use different schemas. Kudos to Yi Tang!

    https://github.com/apache/pulsar/pull/5443

* [Functions] Jerry Peng started the effort to refactor the functions runtime to make it pluggable. This will make adding a new runtime easier and smoother in the future.

    https://github.com/apache/pulsar/pull/5463

* [PIP-45] The second pull request for the pluggable metadata interface is out. It changes the implementation of ManagedLedger to use the MetadataStore interface.

    https://github.com/apache/pulsar/pull/5358

* [Key-Shared] Penghui kicked off the implementation of sticky consumer support in key_shared subscriptions. This provides the capability to consume sub-streams from a given topic (partition) in order. It can be used in the Flink integration to support flexible scaling up and down.

    https://github.com/apache/pulsar/pull/5388

* [Transaction] Development of the transaction coordinator (TC) continues with the addition of a topic ownership listener that bootstraps the coordinator when a coordination topic is owned.

    https://github.com/apache/pulsar/pull/5457

===================
Notable Features
===================

* [Client][Java] Add support for partitioned topic consumer seek by time. (Release: 2.5.0)

    https://github.com/apache/pulsar/pull/5435

* [Functions] Make Function authentication provider pluggable. (Release: 2.5.0)

    https://github.com/apache/pulsar/pull/5404

* [Client][Java] Support setting the read position based on a timestamp. (Release: 2.5.0)

    https://github.com/apache/pulsar/pull/5075

===================
Notable Bug Fix
===================

* [Client][CGo] Return message ID for produced messages. (Fixed, Release: 2.5.0)

    https://github.com/apache/pulsar/pull/4811

* [Broker] Fix potential deadlock that can occur in addConsumer. (Fixed, Release: 2.4.2 / 2.5.0)

    https://github.com/apache/pulsar/pull/5371

* [Client][Java] Avoid leak on publish failure on batch message. (Fixed, Release: 2.4.2 / 2.5.0)

    https://github.com/apache/pulsar/pull/5442

* [Broker] Fix race condition: failed to read more entries on dispatcher. (Fixed, Release: 2.4.2 / 2.5.0)

    https://github.com/apache/pulsar/pull/5391

* [Client][Java] Fix message corruption on OOM for batch messages. (Fixed, Release: 2.4.2 / 2.5.0)

    https://github.com/apache/pulsar/pull/5443

===================
Ecosystem
===================

* Pulsar.Client 0.7.0 was released with TLS and token authentication support.

    https://www.nuget.org/packages/Pulsar.Client/

* More Pulsar tools, integrations, and resources can also be found at https://github.com/streamnative/awesome-pulsar.

===================
Event / News
===================

* Apache Pulsar is getting more attention from giants like Splunk. On Oct 21, Splunk announced its intent to acquire Streamlio to accelerate efforts in real-time stream processing and containerized multi-tenant cloud platform applications. Streamlio, which is powered by Apache Pulsar, specializes in designing and operating streaming data solutions at scale in demanding enterprise environments.

    https://www.splunk.com/blog/2019/10/21/splunk-to-expand-streaming-expertise-announces-intent-to-acquire-streamlio-open-source-distributed-messaging-leader.html

* Paris Data Engineers (meetup)

    A Paris Data Engineers meetup was held on Oct 22 in France. Quentin Adam talked about how CleverCloud is using Pulsar for scalable logs processing.

    https://www.meetup.com/fr-FR/Paris-Data-Engineers/events/264819837/

===================
Blog / Article
===================

* Powering Tencent Billing Platform with Apache Pulsar (by Dezhi Liu)

    https://streamnative.io/blog/tech/2019-10-22-powering-tencent-billing-platform-with-apache-pulsar/

* How to use Apache Pulsar Manager with HerdDB (by Enrico Olivelli)

    https://medium.com/streamnative/how-to-use-apache-pulsar-manager-with-herddb-dd265c955ca4

* Why Nutanix Beam went ahead with Apache Pulsar instead of Apache Kafka? (by Yuvaraj Loganathan)

    https://medium.com/@yuvarajl/why-nutanix-beam-went-ahead-with-apache-pulsar-instead-of-apache-kafka-1415f592dbbb

* Basic Pulsar producer and consumer (by Thomas Memenga)

    https://www.syscrest.com/2019/10/basic-pulsar-producer-and-consumer-json-helm-kubernetes/

If we missed anything, please reply to this thread. Thank you.

Cheers,

Sijie Guo, Yu Liu (@Anonymitaet)

From: anonymitaet _ <an...@hotmail.com>
Date: Saturday, October 19, 2019 at 13:28
To: "users@pulsar.apache.org" <us...@pulsar.apache.org>, "dev@pulsar.apache.org" <de...@pulsar.apache.org>
Subject: [Community Weekly Update] 2019-10-07 ~ 2019-10-18

Dear Pulsar enthusiast,

This is the first weekly community update, which helps you quickly catch up on Pulsar's highlights and spot trends from the last week, while strengthening communication and connection within the Pulsar family.

==================
Pulsar Development
==================

- [CI] ASF Jenkins is still in a flaky state. There is still a huge backlog of pull requests to be merged due to the Jenkins issue. Ali Ahmed drove the effort to look into different CI options to address the problem [1].

- [PIP] [metadata] Matteo proposed introducing a pluggable metadata interface in PIP-45 (https://github.com/apache/pulsar/wiki/PIP-45:-Pluggable-metadata-interface). It is a great move towards supporting metadata stores other than ZooKeeper. The first pull request was merged [2].

- [DOC] [connector] Yu Liu (@Anonymitaet) continues contributing to the documentation for built-in connectors (https://github.com/apache/pulsar/issues/5015). We hope to fill the documentation gap soon. Those changes are available in the latest version of the documentation (http://pulsar.apache.org/docs/en/next/io-connectors/).

[1] https://mail-archives.apache.org/mod_mbox/pulsar-dev/201910.mbox/%3CCANcJaZucPz%2BinJ%3DaVNVM0f-1qEJ_J%3DRcamkj8v6XeiqY4Thv_A%40mail.gmail.com%3E

[2] https://mail-archives.apache.org/mod_mbox/pulsar-dev/201910.mbox/%3CCA%2BJmKXYakGKi9d7j%2Bavo4WW%3DRu3BY-ZUpLDW3RLeAic-NVN1sA%40mail.gmail.com%3E

==================
Notable Bug Fix
==================

- [Broker] Deduplication may drop messages if there is an error persisting to BookKeeper. (Fixed, Release: 2.4.2 / 2.5.0)

https://github.com/apache/pulsar/issues/5218

- [Broker] Race condition while triggering message redelivery after an ack-timeout event. (Fixed, Release: 2.4.2 / 2.5.0)

https://github.com/apache/pulsar/pull/5276

- [Broker] If a cursor is not durable, close dispatcher when all consumers are removed from subscription. (Fixed, Release: 2.4.2 / 2.5.0)

https://github.com/apache/pulsar/pull/5340

- [TIEREDSTORAGE] Don't require both region and endpoint to be specified (Fixed, Release: 2.4.2 / 2.5.0)

https://github.com/apache/pulsar/pull/5355

==================
Ecosystem
==================

- Pulsar + Flink

The discussion of adding a Pulsar connector to the Flink main repo continues on the Flink mailing list. The contributions include a Sink Connector, a Source Connector, and Catalog integration. FLIP-72 is the umbrella for the whole contribution.

https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector

- Pulsar + Skywalking

The integration of Pulsar and Skywalking was completed [3]. The Pulsar plugin is now available in Skywalking main repo and will be released in its 6.5.0 release. Kudos to Penghui and the Skywalking community. Penghui also wrote a tutorial about using Skywalking to trace Pulsar messages [4].

[3] https://github.com/apache/skywalking/pull/3476

[4] https://medium.com/streamnative/use-apache-skywalking-to-trace-apache-pulsar-messages-b543ac253053

- Pulsar .NET Client

    Many features landed in the Pulsar .NET Client in the past 2 weeks (https://github.com/fsharplang-ru/pulsar-client-dotnet)

    - Oct 16th: 0.6.0 released with consumer seek support.

    - Oct 15th: 0.5.0 released with compacted topics support.

    - Oct 8th: 0.4.0 released with key/value properties support.

    The .NET client package is available at: https://www.nuget.org/packages/Pulsar.Client

- Pulsar Express

Pulsar Express released 0.5.0 on Oct 13 with many features like broker health check, namespace creation and deletion, topic creation, and so on.

==================
Event/News
==================

- [HUG] Special Apache Pulsar Meetup chez OVH (Paris 17)

The first Pulsar meetup in Paris was held at the OVH office. It was organized by HUG France and dedicated to talks on Pulsar. Committers and contributors from OVH, Clever Cloud and StreamNative gathered to give an introduction to Pulsar/BookKeeper and share the use cases of Pulsar.

https://www.meetup.com/fr-FR/Hadoop-User-Group-France/events/264920447/

- Flink Forward Europe 2019 | Berlin

The Flink community conference happened in Berlin last week. Sijie Guo gave a presentation about the latest integration with Flink 1.9+ around schema/catalog, exactly-once source, and more, and demonstrated the capability of using Pulsar as a unified event stream storage for unified data processing.

The presentation is available at https://www.slideshare.net/streamnative/query-pulsar-streams-using-apache-flink.

https://europe-2019.flink-forward.org/conference-program

- Nantes Java User Group

Bruno Bonnin provided an overview of Apache Pulsar at Nantes Java User Group on Oct 15th.

https://nantesjug.org/#/events/2019_10_15

- Crunch Data Conference | Budapest

Crunch Data Conference happened from Oct 16 to Oct 18. Ivan Kelly gave a presentation titled “Infinite topic backlogs with Apache Pulsar”.

https://crunchconf.com/speaker/IvanKelly#talks

- ParisDataEng’ #15 ~ Data Engineering with Delta Lake, Pulsar and Spark-tools

There will be a ParisDataEng meetup on Oct 22nd in Paris, including a Pulsar talk by Quentin Adam from CleverCloud. He will share CleverCloud's success story of using Pulsar to manage its highly scalable logs infrastructure.

https://www.meetup.com/Paris-Data-Engineers/events/264819837/

==================
Blog/Article
==================

- Life beyond Kafka with Apache Pulsar (by Avaro Santos Andres)

https://dzone.com/articles/life-beyond-kafka-with-apache-pulsar

- An introduction to Stream Processing with Pulsar Functions (by Matteo Merli and Jerry Peng)

https://dzone.com/articles/an-introduction-to-stream-processing-with-pulsar-f

- 5 More Reasons to Choose Apache Pulsar over Kafka (by Chris Bartholomew)

https://kafkaesque.io/5-more-reasons-to-choose-apache-pulsar-over-kafka/

Cheers,

Sijie Guo, Xiaorong Ran, Yu Liu