Posted to commits@inlong.apache.org by GitBox <gi...@apache.org> on 2022/06/15 12:00:29 UTC

[GitHub] [incubator-inlong] yunqingmoswu opened a new pull request, #4672: [INLONG-4670][Sort] Update the README.md for the sort

yunqingmoswu opened a new pull request, #4672:
URL: https://github.com/apache/incubator-inlong/pull/4672

   ### Prepare a Pull Request
   *(Change the title to follow the example below)*
   
   Title: [INLONG-4670][Sort] Update the README.md for the sort
   
   *(The following *XYZ* should be replaced by the actual [GitHub Issue](https://github.com/apache/incubator-inlong/issues) number)*
   
   - Fixes #4670 
   
   ### Motivation
   
   Update the README.md for the Sort module.
   
   ### Modifications
   
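   The README.md of the Sort module (`inlong-sort/README.md`) is updated: the overview is reworded, the supported extract nodes, transforms, and load nodes are listed, and the future plans are revised.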
   
   ### Verifying this change
   
   *(Please pick either of the following options)*
   
   - [ ] This change is a trivial rework/code cleanup without any test coverage.
   
   - [x] This change is already covered by existing tests, such as:
     *(please describe tests)*
   
   - [ ] This change added tests and can be verified as follows:
   
     *(example:)*
     - *Added integration tests for end-to-end deployment with large payloads (10MB)*
     - *Extended integration test for recovery after broker failure*
   
   ### Documentation
   
     - Does this pull request introduce a new feature? (yes / no)
     - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
     - If a feature is not applicable for documentation, explain why?
     - If a feature is not documented yet in this PR, please create a follow-up issue for adding the documentation
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@inlong.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-inlong] healchow commented on a diff in pull request #4672: [INLONG-4670][Sort] Update the README.md for the Sort module

Posted by GitBox <gi...@apache.org>.
healchow commented on code in PR #4672:
URL: https://github.com/apache/incubator-inlong/pull/4672#discussion_r897896416


##########
inlong-sort/README.md:
##########
@@ -1,35 +1,56 @@
-# Description
-## overview
-Inlong-sort is used to extract data from different source systems, then transforms the data and finally loads the data into diffrent storage systems.
-Inlong-sort is simply an Flink application, and relys on Inlong-manager to manage meta data(such as the source informations and storage informations)
+## Description
 
-## features
-### multi-tenancy
-Inlong-sort is an multi-tenancy system, which means you can extract data from different sources(these sources must be of the same source type) and load data into different sinks(these sinks must be of the same storage type).
-e.g. you can extract data form different topics of inlong-tubemq and the load them to different hive clusters.
+## Overview
 
-### change meta data without restart
-Inlong-sort uses zookeeper to manage its meta data, every time you change meta data on zk, inlong-sort application will be informed immediately.
-e.g if you want to change the schema of your data, just change the meta data on zk without restart your inlong-sort application.
+InLong-Sort is used to extract data from different source systems, then transforms the data and finally loads the data

Review Comment:
   Suggest a change to:
   
   ```
   InLong-Sort is used to extract data from different source systems, then transform the data, and finally load the data into different storage systems.
   
   InLong-Sort is simply a Flink Application and relies on InLong-Manager to manage metadata (such as the source information and storage information).
   ```



##########
inlong-sort/README.md:
##########
@@ -1,35 +1,56 @@
-# Description
-## overview
-Inlong-sort is used to extract data from different source systems, then transforms the data and finally loads the data into diffrent storage systems.
-Inlong-sort is simply an Flink application, and relys on Inlong-manager to manage meta data(such as the source informations and storage informations)
+## Description
 
-## features
-### multi-tenancy
-Inlong-sort is an multi-tenancy system, which means you can extract data from different sources(these sources must be of the same source type) and load data into different sinks(these sinks must be of the same storage type).
-e.g. you can extract data form different topics of inlong-tubemq and the load them to different hive clusters.
+## Overview
 
-### change meta data without restart
-Inlong-sort uses zookeeper to manage its meta data, every time you change meta data on zk, inlong-sort application will be informed immediately.
-e.g if you want to change the schema of your data, just change the meta data on zk without restart your inlong-sort application.
+InLong-Sort is used to extract data from different source systems, then transforms the data and finally loads the data
+into diffrent storage systems.
+InLong-Sort is simply a Flink Application, and relys on InLong-Manager to manage meta data(such as the source
+informations and storage informations).
 
-## supported sources
-**inlong-tubemq**
+## Features
 
-## supported storages
-**hive**
-Currently we just support parquet file format in hive
+### Supported Extract Node
 
-**clickhouse**
+- Pulsar
+- MySQL
+- Kafka
+- MongoDB
+- PostgreSQL
+- HDFS
+- Oracle
+- SQLServer
 
-## limitations
-Currently, we just support extracting specified fields in the stage of **Transform**.
+### Supported Transform
 
-# Plans in the future
-## More kinds of source systems
-pulsar, kafka and etc
+- String Split
+- String Regular Replace
+- String Regular Replace First Matched Value
+- Data Filter
+- Data Distinct
+- Regular Join
 
-## More kinds of storage systems
-Hbase, Elastic Search, and etc
+### Supported Load Node
 
-## More kinds of file format in hive sink
-sequence file, orc
+- Hive
+- Kafka
+- HBase
+- ClickHouse
+- Iceberg
+- PostgreSQL
+- HDFS
+- TDSQL PostgreSQL
+- Oracle
+- Elasticsearch
+- Greenplum
+- MySQL
+- SQLServer
+
+## Future Plans
+
+### More kinds of Transform
+
+Time window aggregation, Content extraction, Type conversion, Time format conversion, and etc.

Review Comment:
   Suggest removing the `and` before the `etc.`.
   
   





[GitHub] [incubator-inlong] EMsnap merged pull request #4672: [INLONG-4670][Sort] Update the README.md for the Sort module

Posted by GitBox <gi...@apache.org>.
EMsnap merged PR #4672:
URL: https://github.com/apache/incubator-inlong/pull/4672

