Posted to commits@pulsar.apache.org by si...@apache.org on 2019/09/17 13:58:52 UTC
[pulsar] branch master updated: [Doc] Update Canal source connector guide (#5173)
This is an automated email from the ASF dual-hosted git repository.
sijie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/pulsar.git
The following commit(s) were added to refs/heads/master by this push:
new 627e6f8 [Doc] Update Canal source connector guide (#5173)
627e6f8 is described below
commit 627e6f8c2d7a64911517247f15db4e48e3ad89ae
Author: Anonymitaet <50...@users.noreply.github.com>
AuthorDate: Tue Sep 17 21:58:47 2019 +0800
[Doc] Update Canal source connector guide (#5173)
1. I've updated this guide based on `CanalSourceConfig.java` and `canal-mysql-source-config.yaml`.
2. Fix https://github.com/apache/pulsar/issues/5015
---
site2/docs/io-cdc-canal.md | 360 ++++++++++++++++++++++++---------------------
1 file changed, 195 insertions(+), 165 deletions(-)
diff --git a/site2/docs/io-cdc-canal.md b/site2/docs/io-cdc-canal.md
index e5d53aa..9d371ae 100644
--- a/site2/docs/io-cdc-canal.md
+++ b/site2/docs/io-cdc-canal.md
@@ -1,174 +1,204 @@
---
id: io-cdc-canal
-title: CDC Canal Connector
-sidebar_label: CDC Canal Connector
+title: Canal source connector
+sidebar_label: Canal source connector
---
-### Source Configuration Options
+The Canal source connector pulls messages from MySQL to Pulsar topics.
-The Configuration is mostly related to Canal task config.
+This guide explains how to configure and use the Canal source connector.
+
+## Configuration
+
+The configuration of Canal source connector has the following parameters.
+
+### Parameter
| Name | Required | Default | Description |
|------|----------|---------|-------------|
-| `zkServers` | `false` | `127.0.0.1:2181` | `The address and port of the zookeeper . if canal server configured to cluster mode` |
-| `batchSize` | `true` | `5120` | `Take 5120 records from the canal server in batches` |
-| `username` | `false` | `` | `Canal server account, not MySQL` |
-| `password` | `false` | `` | `Canal server password, not MySQL` |
-| `cluster` | `false` | `false` | `Decide whether to open cluster mode based on canal server configuration, true: cluster mode, false: standalone mode` |
-| `singleHostname` | `false` | `127.0.0.1` | `The address of canal server` |
-| `singlePort` | `false` | `11111` | `The port of canal server` |
-
-
-### Configuration Example
-
-Here is a configuration Json example:
-
-```$json
-{
- "zkServers": "127.0.0.1:2181",
- "batchSize": "5120",
- "destination": "example",
- "username": "",
- "password": "",
- "cluster": false,
- "singleHostname": "127.0.0.1",
- "singlePort": "11111",
-}
-```
-You could also find the yaml example in this [file](https://github.com/apache/pulsar/blob/master/pulsar-io/canal/src/main/resources/canal-mysql-source-config.yaml), which has similar content below:
-
-```$yaml
-configs:
- zkServers: "127.0.0.1:2181"
- batchSize: "5120"
- destination: "example"
- username: ""
- password: ""
- cluster: false
- singleHostname: "127.0.0.1"
- singlePort: "11111"
-```
-
-### Usage example
-
-Here is a simple example to store MySQL change data using above example config.
-
-- Start a MySQL server
-
-```$bash
-docker pull mysql:5.7
-docker run -d -it --rm --name pulsar-mysql -p 3306:3306 -e MYSQL_ROOT_PASSWORD=canal -e MYSQL_USER=mysqluser -e MYSQL_PASSWORD=mysqlpw mysql:5.7
-```
-- Modify configuration files mysqld.cnf
-
-```
-[mysqld]
-pid-file = /var/run/mysqld/mysqld.pid
-socket = /var/run/mysqld/mysqld.sock
-datadir = /var/lib/mysql
-#log-error = /var/log/mysql/error.log
-# By default we only accept connections from localhost
-#bind-address = 127.0.0.1
-# Disabling symbolic-links is recommended to prevent assorted security risks
-symbolic-links=0
-log-bin=mysql-bin
-binlog-format=ROW
-server_id=1
-```
-
-- Copy file to mysql server from local and restart mysql server
-```$bash
-docker cp mysqld.cnf pulsar-mysql:/etc/mysql/mysql.conf.d/
-docker restart pulsar-mysql
-```
-
-- Create test database in mysql server
-```$bash
-docker exec -it pulsar-mysql /bin/bash
-mysql -h 127.0.0.1 -uroot -pcanal -e 'create database test;'
-```
-
-- Start canal server and connect mysql server
-
-```
-docker pull canal/canal-server:v1.1.2
-docker run -d -it --link pulsar-mysql -e canal.auto.scan=false -e canal.destinations=test -e canal.instance.master.address=pulsar-mysql:3306 -e canal.instance.dbUsername=root -e canal.instance.dbPassword=canal -e canal.instance.connectionCharset=UTF-8 -e canal.instance.tsdb.enable=true -e canal.instance.gtidon=false --name=pulsar-canal-server -p 8000:8000 -p 2222:2222 -p 11111:11111 -p 11112:11112 -m 4096m canal/canal-server:v1.1.2
-```
-
-- Start pulsar standalone
-
-```$bash
-docker pull apachepulsar/pulsar:2.3.0
-docker run -d -it --link pulsar-canal-server -p 6650:6650 -p 8080:8080 -v $PWD/data:/pulsar/data --name pulsar-standalone apachepulsar/pulsar:2.3.0 bin/pulsar standalone
-```
-
-- Start pulsar-io in standalone
-
-- Config file canal-mysql-source-config.yaml
-
-```$yaml
-configs:
- zkServers: ""
- batchSize: "5120"
- destination: "test"
- username: ""
- password: ""
- cluster: false
- singleHostname: "pulsar-canal-server"
- singlePort: "11111"
-```
-- Consumer file pulsar-client.py for test
-```
-import pulsar
-
-client = pulsar.Client('pulsar://localhost:6650')
-consumer = client.subscribe('my-topic',
- subscription_name='my-sub')
-
-while True:
- msg = consumer.receive()
- print("Received message: '%s'" % msg.data())
- consumer.acknowledge(msg)
-
-client.close()
-```
-
-- Copy config file and test file to pulsar server
-
-```$bash
-docker cp canal-mysql-source-config.yaml pulsar-standalone:/pulsar/conf/
-docker cp pulsar-client.py pulsar-standalone:/pulsar/
-```
-
-- Download canal connector and start canal connector
-```$bash
-docker exec -it pulsar-standalone /bin/bash
-wget http://apache.01link.hk/pulsar/pulsar-2.3.0/connectors/pulsar-io-canal-2.3.0.nar -P connectors
-./bin/pulsar-admin sources localrun --archive ./connectors/pulsar-io-canal-2.3.0.nar --classname org.apache.pulsar.io.canal.CanalStringSource --tenant public --namespace default --name canal --destination-topic-name my-topic --source-config-file /pulsar/conf/canal-mysql-source-config.yaml --parallelism 1
-```
-
-- Consumption data
-
-```$bash
-docker exec -it pulsar-standalone /bin/bash
-python pulsar-client.py
-```
-
-- Open another window for login mysql server
-
-```$bash
-docker exec -it pulsar-mysql /bin/bash
-mysql -h 127.0.0.1 -uroot -pcanal
-```
-- Create table and insert, delete, update data in mysql server
-```
-mysql> use test;
-mysql> show tables;
-mysql> CREATE TABLE IF NOT EXISTS `test_table`(`test_id` INT UNSIGNED AUTO_INCREMENT,`test_title` VARCHAR(100) NOT NULL,
-`test_author` VARCHAR(40) NOT NULL,
-`test_date` DATE,PRIMARY KEY ( `test_id` ))ENGINE=InnoDB DEFAULT CHARSET=utf8;
-mysql> INSERT INTO test_table (test_title, test_author, test_date) VALUES("a", "b", NOW());
-mysql> UPDATE test_table SET test_title='c' WHERE test_title='a';
-mysql> DELETE FROM test_table WHERE test_title='c';
-```
+| `username` | true | None | Canal server account (not MySQL).|
+| `password` | true | None | Canal server password (not MySQL). |
+| `destination` | true | None | Source destination that the Canal source connector connects to. |
+| `singleHostname` | false | None | Canal server address.|
+| `singlePort` | false | None | Canal server port.|
+| `cluster` | true | false | Whether to enable cluster mode based on Canal server configuration or not.<br/><br/><li>true: **cluster** mode.<br/>If set to true, it talks to `zkServers` to figure out the actual database host.<br/><br/><li>false: **standalone** mode.<br/>If set to false, it connects to the database specified by `singleHostname` and `singlePort`. |
+| `zkServers` | true | None | Address and port of the ZooKeeper server that the Canal source connector talks to in order to figure out the actual database host.|
+| `batchSize` | false | 1000 | Batch size to fetch from Canal. |
+
+### Example
+
+Before using the Canal connector, you can create a configuration file through one of the following methods.
+
+* JSON
+
+ ```json
+ {
+ "zkServers": "127.0.0.1:2181",
+ "batchSize": "5120",
+ "destination": "example",
+ "username": "",
+ "password": "",
+ "cluster": false,
+ "singleHostname": "127.0.0.1",
+ "singlePort": "11111"
+ }
+ ```
+
+* YAML
+
+ You can create a YAML file and copy the [contents](https://github.com/apache/pulsar/blob/master/pulsar-io/canal/src/main/resources/canal-mysql-source-config.yaml) below to your YAML file.
+
+ ```yaml
+ configs:
+ zkServers: "127.0.0.1:2181"
+ batchSize: "5120"
+ destination: "example"
+ username: ""
+ password: ""
+ cluster: false
+ singleHostname: "127.0.0.1"
+ singlePort: "11111"
+ ```
+
+## Usage
+
+Here is an example of storing MySQL change data using the configuration file above.
+
+1. Start a MySQL server.
+
+ ```bash
+ $ docker pull mysql:5.7
+ $ docker run -d -it --rm --name pulsar-mysql -p 3306:3306 -e MYSQL_ROOT_PASSWORD=canal -e MYSQL_USER=mysqluser -e MYSQL_PASSWORD=mysqlpw mysql:5.7
+ ```
+
+2. Create a configuration file `mysqld.cnf`.
+
+ ```ini
+ [mysqld]
+ pid-file = /var/run/mysqld/mysqld.pid
+ socket = /var/run/mysqld/mysqld.sock
+ datadir = /var/lib/mysql
+ #log-error = /var/log/mysql/error.log
+ # By default we only accept connections from localhost
+ #bind-address = 127.0.0.1
+ # Disabling symbolic-links is recommended to prevent assorted security risks
+ symbolic-links=0
+ log-bin=mysql-bin
+ binlog-format=ROW
+ server_id=1
+ ```
+
+3. Copy the configuration file `mysqld.cnf` to MySQL server.
+
+ ```bash
+ $ docker cp mysqld.cnf pulsar-mysql:/etc/mysql/mysql.conf.d/
+ ```
+
+4. Restart the MySQL server.
+
+ ```bash
+ $ docker restart pulsar-mysql
+ ```
+
+5. Create a test database in MySQL server.
+
+ ```bash
+ $ docker exec -it pulsar-mysql /bin/bash
+ $ mysql -h 127.0.0.1 -uroot -pcanal -e 'create database test;'
+ ```
+
+6. Start a Canal server and connect to MySQL server.
+
+ ```bash
+ $ docker pull canal/canal-server:v1.1.2
+ $ docker run -d -it --link pulsar-mysql -e canal.auto.scan=false -e canal.destinations=test -e canal.instance.master.address=pulsar-mysql:3306 -e canal.instance.dbUsername=root -e canal.instance.dbPassword=canal -e canal.instance.connectionCharset=UTF-8 -e canal.instance.tsdb.enable=true -e canal.instance.gtidon=false --name=pulsar-canal-server -p 8000:8000 -p 2222:2222 -p 11111:11111 -p 11112:11112 -m 4096m canal/canal-server:v1.1.2
+ ```
+
+7. Start Pulsar standalone.
+
+ ```bash
+ $ docker pull apachepulsar/pulsar:2.3.0
+ $ docker run -d -it --link pulsar-canal-server -p 6650:6650 -p 8080:8080 -v $PWD/data:/pulsar/data --name pulsar-standalone apachepulsar/pulsar:2.3.0 bin/pulsar standalone
+ ```
+
+8. Modify the configuration file `canal-mysql-source-config.yaml`.
+
+ ```yaml
+ configs:
+ zkServers: ""
+ batchSize: "5120"
+ destination: "test"
+ username: ""
+ password: ""
+ cluster: false
+ singleHostname: "pulsar-canal-server"
+ singlePort: "11111"
+ ```
+
+9. Create a consumer file `pulsar-client.py`.
+
+ ```python
+ import pulsar
+
+ client = pulsar.Client('pulsar://localhost:6650')
+ consumer = client.subscribe('my-topic',
+ subscription_name='my-sub')
+
+ while True:
+ msg = consumer.receive()
+ print("Received message: '%s'" % msg.data())
+ consumer.acknowledge(msg)
+
+ client.close()
+ ```
+
+10. Copy the configuration file `canal-mysql-source-config.yaml` and the consumer file `pulsar-client.py` to Pulsar server.
+
+ ```bash
+ $ docker cp canal-mysql-source-config.yaml pulsar-standalone:/pulsar/conf/
+ $ docker cp pulsar-client.py pulsar-standalone:/pulsar/
+ ```
+
+11. Download a Canal connector and start it.
+
+ ```bash
+ $ docker exec -it pulsar-standalone /bin/bash
+ $ wget http://apache.01link.hk/pulsar/pulsar-2.3.0/connectors/pulsar-io-canal-2.3.0.nar -P connectors
+ $ ./bin/pulsar-admin sources localrun \
+ --archive ./connectors/pulsar-io-canal-2.3.0.nar \
+ --classname org.apache.pulsar.io.canal.CanalStringSource \
+ --tenant public \
+ --namespace default \
+ --name canal \
+ --destination-topic-name my-topic \
+ --source-config-file /pulsar/conf/canal-mysql-source-config.yaml \
+ --parallelism 1
+ ```
+
+12. Consume data from MySQL.
+
+ ```bash
+ $ docker exec -it pulsar-standalone /bin/bash
+ $ python pulsar-client.py
+ ```
+
+13. Open another window to log in to the MySQL server.
+
+ ```bash
+ $ docker exec -it pulsar-mysql /bin/bash
+ $ mysql -h 127.0.0.1 -uroot -pcanal
+ ```
+
+14. Create a table, and insert, delete, and update data in MySQL server.
+
+ ```bash
+ mysql> use test;
+ mysql> show tables;
+ mysql> CREATE TABLE IF NOT EXISTS `test_table`(`test_id` INT UNSIGNED AUTO_INCREMENT,`test_title` VARCHAR(100) NOT NULL,
+ `test_author` VARCHAR(40) NOT NULL,
+ `test_date` DATE,PRIMARY KEY ( `test_id` ))ENGINE=InnoDB DEFAULT CHARSET=utf8;
+ mysql> INSERT INTO test_table (test_title, test_author, test_date) VALUES("a", "b", NOW());
+ mysql> UPDATE test_table SET test_title='c' WHERE test_title='a';
+ mysql> DELETE FROM test_table WHERE test_title='c';
+ ```
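
The parameter table in this patch implies a dependency between `cluster` and the connection settings: cluster mode relies on `zkServers`, while standalone mode relies on `singleHostname` and `singlePort`. A minimal sketch of that rule as a pre-flight check is shown below. The function name `check_canal_config` is hypothetical, and the rules are inferred from the table and from the step 8 example (where `zkServers` is empty in standalone mode), not taken from `CanalSourceConfig.java`, so treat it as an assumption rather than the connector's actual validation.

```python
def check_canal_config(configs):
    """Return a list of problems found in a Canal source `configs` mapping.

    Hypothetical helper: the rules are inferred from the parameter table,
    not from the connector's own code.
    """
    problems = []

    # Keys the table marks as required regardless of mode. The examples use
    # empty strings for username/password, so only presence is checked.
    for key in ("username", "password", "destination"):
        if key not in configs:
            problems.append("missing required key: " + key)

    if configs.get("cluster", False):
        # Cluster mode: the connector asks ZooKeeper for the actual host.
        if not configs.get("zkServers"):
            problems.append("cluster mode needs a non-empty zkServers")
    else:
        # Standalone mode: the connector connects to one host and port.
        for key in ("singleHostname", "singlePort"):
            if not configs.get(key):
                problems.append("standalone mode needs a non-empty " + key)

    return problems


# The `configs` mapping from step 8 of the usage walkthrough.
step8 = {
    "zkServers": "",
    "batchSize": "5120",
    "destination": "test",
    "username": "",
    "password": "",
    "cluster": False,
    "singleHostname": "pulsar-canal-server",
    "singlePort": "11111",
}

print(check_canal_config(step8))
```

For the step 8 configuration the returned list is empty. Switching `cluster` to `True` while leaving `zkServers` empty yields a single problem entry.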