Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/01/28 11:30:01 UTC

[GitHub] [pulsar] sijie commented on pull request #9343: Issue 9004: Pulsar Schema API: provide Type information for Fields

sijie commented on pull request #9343:
URL: https://github.com/apache/pulsar/pull/9343#issuecomment-768990538

@eolivelli

There are two questions here. Let's not couple them together.

1) Do we want to introduce a type system at all? If so, what would the approach be?
2) How can a connector access the schema information?

For the first one, I strongly disagree with introducing a type system. I have seen the problems that come with converting types between different systems. AVRO is already an open standard that all the existing computing engines support. If we introduce another type system, we create a lot of trouble when integrating with other computing engines. That's something I would avoid. If you want to do that, let's think carefully. It is not just about connectors. Pulsar has a broader set of integrations that rely on schema. We can't afford to maintain different type systems and different approaches for different integrations. You end up needing to maintain a lot of converters between Pulsar schemas and many other systems. It is going to be a nightmare.

For the second one, we should just let users access the underlying libraries. AVRO, JSON, and PROTOBUF have different type systems. Let's use their existing tools and not reinvent one.
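
As a rough illustration of the point (a sketch, not Pulsar API code): an AVRO schema definition is plain JSON, so the per-field type information is already there to be read with existing tools. The `User` schema and the `field_types` helper below are hypothetical, made up for this example; in practice you would use the AVRO library itself instead of raw JSON parsing.

```python
import json

# Hypothetical AVRO record schema, in the JSON form an AVRO schema definition uses.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "email", "type": ["null", "string"], "default": null},
    {"name": "tags", "type": {"type": "array", "items": "string"}}
  ]
}
"""


def field_types(schema_json: str) -> dict:
    """Map each field name of an AVRO record schema to its declared AVRO type."""
    schema = json.loads(schema_json)
    if schema.get("type") != "record":
        raise ValueError("expected an AVRO record schema")
    return {field["name"]: field["type"] for field in schema["fields"]}


types = field_types(SCHEMA_JSON)
for name, avro_type in types.items():
    # Primitive types are strings, unions are lists, complex types are dicts.
    print(name, "->", avro_type)
```

The types come back in AVRO's own vocabulary (primitives, unions, arrays), with no Pulsar-specific type system in between.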

> without these features Pulsar IO won't be able to compete with Kafka Connect, and also to "port" existing Kafka Connect workflows to the Pulsar Ecosystem

First of all, Pulsar I/O doesn't compete with Kafka Connect. Pulsar I/O is an extension of the Pulsar ecosystem. Kafka Connect is an extension of Kafka. They might share similar goals, but they don't compete with each other.

Secondly, we embrace the Kafka ecosystem by providing a fully compatible wrapper to run existing Kafka connectors without introducing a type system. The key for this to work is using an open standard like AVRO.

```
Kafka Schema -> [ Open Standards: AVRO ] -> Pulsar Schema
```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org