Posted to commits@seatunnel.apache.org by GitBox <gi...@apache.org> on 2022/03/25 02:09:58 UTC

[GitHub] [incubator-seatunnel] BenJFan opened a new issue #1557: [Feature][Core] Connector support multi language coding

BenJFan opened a new issue #1557:
URL: https://github.com/apache/incubator-seatunnel/issues/1557


   ### Search before asking
   
   - [X] I had searched in the [feature](https://github.com/apache/incubator-seatunnel/issues?q=is%3Aissue+label%3A%22Feature%22) and found no similar feature requirement.
   
   
   ### Description
   
   ## Target
   The goal is to make plug-in development faster and more convenient by supporting plug-ins written in multiple languages, so that plug-in developers can work in the languages they already know. This would help SeaTunnel expand its capabilities quickly. Below are a few ways to achieve multi-language support.
   ## GraalVM
   ### Simple Design
   > Python is used as an example here; the same applies to other languages such as JavaScript.
   
   Using GraalVM, SeaTunnel defines a specific Connector interface in Python. The connector developer implements that interface in a Python file, and SeaTunnel calls the developer's Python connector logic through the Graal SDK, then converts the result to the common data format of the SeaTunnel platform.
   
    
   ```Python
   # Python interface
   
   def open():
       pass
   
   def get_data(info):
       pass
   ```
   
   ``` Java 
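   // Java demo: call the Python connector through the GraalVM polyglot API
   import org.graalvm.polyglot.Context;
   import org.graalvm.polyglot.Value;

   // The members below belong inside a connector class; Row is the platform's row type.
   private static final String PYTHON_SOURCE = /* ... */;
   private Context polyglotContext = Context.newBuilder("python").allowAllAccess(true).build();
   
   private Row getData(String info) {
      // Evaluate the connector source so that get_data is defined in the Python bindings
      polyglotContext.eval("python", PYTHON_SOURCE);
      Value getDataFunction = polyglotContext.getBindings("python").getMember("get_data");
      // Call get_data(info) and convert the returned value to a Row
      return getDataFunction.execute(info).as(Row.class);
   }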
   
   ```
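As a rough illustration of what a connector author might write against the Python interface above, here is a minimal sketch. The in-memory sample rows, the module-level `_tables` dictionary, and the reading of `info` as a table name are all assumptions made for illustration; the issue does not define them.

```python
# Hypothetical connector implementing the proposed open()/get_data(info) interface.
# The sample data and the meaning of `info` are assumptions, not part of the proposal.

_tables = {}


def open():
    """Prepare the connector; here it only loads some in-memory sample rows."""
    _tables["users"] = [
        {"id": 1, "name": "alice"},
        {"id": 2, "name": "bob"},
    ]


def get_data(info):
    """Return the rows for the table named by `info` as a list of dicts."""
    return list(_tables.get(info, []))
```

Note that a module-level function named `open` shadows Python's built-in `open` inside the connector module, which may be a reason to pick a different name if the interface is ever finalized.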
   
   ### Problem
   1. Because jobs run on a Flink/Spark cluster, every node in the cluster needs to run GraalVM to support multiple languages.
   2. Licensing: we would use the GraalVM SDK (UPL), while users would run GraalVM (GPL2). Does this affect the license of our project?
   3. It requires at least JDK 11.
   
   ### Reference
   1. https://github.com/oracle/graal
   2. https://blogs.oracle.com/javamagazine/post/java-graalvm-polyglot-python-r
   3. https://blogs.oracle.com/java/post/apache-sparklightning-fast-on-graalvm-enterprise
   ## Jython
   ### Simple Design
   As with GraalVM, we can call a Python file from Java code, but there is no need to install GraalVM; the default JVM is enough.
   ### Problem
   1. Jython only supports Python 2.x; Python 3.x is not supported.
   2. Jython is slower than GraalVM.
   ### Reference
   1. https://www.jython.org/
   ## Nashorn
   > The default JavaScript engine in Java 8
   ### Simple Design
   As with Jython, we can call a JavaScript file from Java code.
   ### Problem
   ### Reference
   1. https://www.oracle.com/technical-resources/articles/java/jf14-nashorn.html
   ## Dependency
   We should package the Java/Python/JavaScript dependencies in the release package, including the dependencies used by connectors.
   
   
   ### Usage Scenario
   
   _No response_
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [incubator-seatunnel] William-GuoWei commented on issue #1557: [Feature][Core] Connector support multi language coding

Posted by GitBox <gi...@apache.org>.
William-GuoWei commented on issue #1557:
URL: https://github.com/apache/incubator-seatunnel/issues/1557#issuecomment-1078683132


   > > By "multi language coding", I suppose you are saying users can write the plugins in different languages, but the ways you listed include only one additional language, GraalVM for Python (I know GraalVM can support other languages too), Jython also for Python, Nashorn for javascript. Can we have a more general way to support not only Python OR JavaScript?
   > 
   > GraalVM can support multiple languages; this is a demo of what we would do when we need multi-language support. In my opinion, Python and JavaScript are the first choices to support. For now, the general way to support multiple languages is GraalVM.
   
   I don't think multi-language support is something to solve at the VM level. It seems like an easy way now, but it will be very hard in the end.





[GitHub] [incubator-seatunnel] ruanwenjun commented on issue #1557: [Feature][Core] Connector support multi language coding

Posted by GitBox <gi...@apache.org>.
ruanwenjun commented on issue #1557:
URL: https://github.com/apache/incubator-seatunnel/issues/1557#issuecomment-1080141087


   This is very cool and might be useful, but there are things we need to handle given our current architecture.
   
   If we want our core engine to load users' plugins directly (written in Python, Go, or other languages), we need to make sure the dependencies our plugins use are also available for the corresponding language.
   
   Perhaps we could write a code-generation module that transforms the user's code to Java; then we could control the dependencies in that transform module.
   
   All the work will be innovative.
   





[GitHub] [incubator-seatunnel] William-GuoWei commented on issue #1557: [Feature][Core] Connector support multi language coding

Posted by GitBox <gi...@apache.org>.
William-GuoWei commented on issue #1557:
URL: https://github.com/apache/incubator-seatunnel/issues/1557#issuecomment-1078681734


   > By "multi language coding", I suppose you are saying users can write the plugins in different languages, but the ways you listed include only one additional language, GraalVM for Python (I know GraalVM can support other languages too), Jython also for Python, Nashorn for javascript. Can we have a more general way to support not only Python OR JavaScript?
   
   So I just thought: what if we design a 'data format' for SeaTunnel, e.g. implemented with Apache Arrow? Contributors could then use their own language (such as Go, Rust, C...) to develop their Source and Sink. As long as their output follows the 'data format', SeaTunnel could sync the data anywhere.
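To make the 'data format' idea a little more concrete, here is a minimal sketch of a language-neutral record batch. It uses newline-delimited JSON from the Python standard library purely as a stand-in; it is not Apache Arrow, and the `schema` key and the helper names `encode_batch` and `decode_batch` are hypothetical.

```python
import json

# Hypothetical wire format: the first line carries the schema, each later line one row.
# JSON is only a stand-in here; the comment above suggests Apache Arrow instead.


def encode_batch(schema, rows):
    """Serialize a schema (list of [name, type] pairs) and rows into text lines."""
    lines = [json.dumps({"schema": schema})]
    for row in rows:
        if len(row) != len(schema):
            raise ValueError("row width does not match schema")
        lines.append(json.dumps(row))
    return "\n".join(lines) + "\n"


def decode_batch(text):
    """Parse the text produced by encode_batch back into (schema, rows)."""
    lines = text.splitlines()
    schema = json.loads(lines[0])["schema"]
    rows = [json.loads(line) for line in lines[1:] if line]
    return schema, rows
```

Any language that can write these lines could act as a Source, and any language that can read them could act as a Sink. A columnar binary format such as Arrow would presumably serve the same role with less overhead.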
   
   





[GitHub] [incubator-seatunnel] BenJFan commented on issue #1557: [Feature][Core] Connector support multi language coding

Posted by GitBox <gi...@apache.org>.
BenJFan commented on issue #1557:
URL: https://github.com/apache/incubator-seatunnel/issues/1557#issuecomment-1078690239


   > > By "multi language coding", I suppose you are saying users can write the plugins in different languages, but the ways you listed include only one additional language, GraalVM for Python (I know GraalVM can support other languages too), Jython also for Python, Nashorn for javascript. Can we have a more general way to support not only Python OR JavaScript?
   > 
   > So I just thought: what if we design a 'data format' for SeaTunnel, e.g. implemented with Apache Arrow? Contributors could then use their own language (such as Go, Rust, C...) to develop their Source and Sink. As long as their output follows the 'data format', SeaTunnel could sync the data anywhere.
   
   This approach would be multi-process, but we currently run programs in a Spark/Flink cluster. How would we manage the processes for the other languages?
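One small piece of the process-management question can be sketched as follows: a parent launches an external connector as a child process, enforces a timeout, checks the exit code, and collects the records the child prints. This is only an illustration under stated assumptions. The helper `run_connector`, the one-JSON-record-per-line output, and the `CHILD` stand-in command are hypothetical, and the sketch says nothing about how such processes would be scheduled across Spark/Flink worker nodes.

```python
import json
import subprocess
import sys


def run_connector(command, timeout_seconds=10):
    """Run an external connector process and collect the JSON records it prints."""
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising TimeoutExpired
        raise RuntimeError("connector timed out") from None
    if completed.returncode != 0:
        raise RuntimeError(f"connector failed: {completed.stderr.strip()}")
    return [json.loads(line) for line in completed.stdout.splitlines() if line]


# Stand-in child: a Python one-liner playing the role of a Go/Rust/C connector.
CHILD = [sys.executable, "-c", "print('{\"id\": 1}'); print('{\"id\": 2}')"]
```

Calling `run_connector(CHILD)` returns the two records printed by the child. A long-running Source would need streaming reads and restart handling instead of a one-shot call like this.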





[GitHub] [incubator-seatunnel] bigdataf commented on issue #1557: [Feature][Core] Connector support multi language coding

Posted by GitBox <gi...@apache.org>.
bigdataf commented on issue #1557:
URL: https://github.com/apache/incubator-seatunnel/issues/1557#issuecomment-1081916378


   Multi-language support is very cool, but it is also very complicated. I think the number of source and sink plugins is limited, and they do not need to change frequently; production applications generally do not need that many. Transforms, by contrast, change often and are far more numerous. Language support and function support for transforms are the more meaningful directions for multi-language work.





[GitHub] [incubator-seatunnel] kezhenxu94 commented on issue #1557: [Feature][Core] Connector support multi language coding

Posted by GitBox <gi...@apache.org>.
kezhenxu94 commented on issue #1557:
URL: https://github.com/apache/incubator-seatunnel/issues/1557#issuecomment-1078640544


   By "multi language coding", I suppose you are saying users can write the plugins in different languages, but the ways you listed include only one additional language, GraalVM for Python (I know GraalVM can support other languages too), Jython also for Python, Nashorn for javascript. Can we have a more general way to support not only Python OR JavaScript? 





[GitHub] [incubator-seatunnel] William-GuoWei commented on issue #1557: [Feature][Core] Connector support multi language coding

Posted by GitBox <gi...@apache.org>.
William-GuoWei commented on issue #1557:
URL: https://github.com/apache/incubator-seatunnel/issues/1557#issuecomment-1078678715


   The multi-language idea is great! 
   But the proposal should be more careful about license issues:
   Nashorn: It is part of Java SE 8  https://download.oracle.com/otndocs/jcp/java_se-18-final-spec/license.html 
   As far as I know, the Oracle Java SE license is not compatible with the Apache license (https://www.oracle.com/downloads/licenses/javase-license1.html), so I think it is not a good idea to include the Nashorn package in the project.
   





[GitHub] [incubator-seatunnel] CalvinKirs commented on issue #1557: [Feature][Core] Connector support multi language coding

Posted by GitBox <gi...@apache.org>.
CalvinKirs commented on issue #1557:
URL: https://github.com/apache/incubator-seatunnel/issues/1557#issuecomment-1078616177


   If you only use its SDK (the GraalVM SDK), then there is no problem.
   
   Also, IMO, we still need to maintain at least JDK 8 support. If everyone is in favor of this feature, we can launch a separate version for it.





[GitHub] [incubator-seatunnel] BenJFan commented on issue #1557: [Feature][Core] Connector support multi language coding

Posted by GitBox <gi...@apache.org>.
BenJFan commented on issue #1557:
URL: https://github.com/apache/incubator-seatunnel/issues/1557#issuecomment-1078661046


   > By "multi language coding", I suppose you are saying users can write the plugins in different languages, but the ways you listed include only one additional language, GraalVM for Python (I know GraalVM can support other languages too), Jython also for Python, Nashorn for javascript. Can we have a more general way to support not only Python OR JavaScript?
   
   GraalVM can support multiple languages; this is a demo of what we would do when we need multi-language support. In my opinion, Python and JavaScript are the first choices to support. For now, the general way to support multiple languages is GraalVM.





[GitHub] [incubator-seatunnel] William-GuoWei edited a comment on issue #1557: [Feature][Core] Connector support multi language coding

Posted by GitBox <gi...@apache.org>.
William-GuoWei edited a comment on issue #1557:
URL: https://github.com/apache/incubator-seatunnel/issues/1557#issuecomment-1078681734


   > By "multi language coding", I suppose you are saying users can write the plugins in different languages, but the ways you listed include only one additional language, GraalVM for Python (I know GraalVM can support other languages too), Jython also for Python, Nashorn for javascript. Can we have a more general way to support not only Python OR JavaScript?
   
   So I just thought: what if we design a 'data format' for SeaTunnel, e.g. implemented with Apache Arrow? Contributors could then use their own language (such as Go, Rust, C...) to develop their Source and Sink. As long as their output follows the 'data format', SeaTunnel could sync the data anywhere.
   
   

