You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@griffin.apache.org by chemikadze <gi...@git.apache.org> on 2018/11/17 22:38:52 UTC

[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

GitHub user chemikadze opened a pull request:

    https://github.com/apache/incubator-griffin/pull/456

    [GRIFFIN-213] Custom connector support

    Provide ability to extend batch and streaming data integrations
    with custom user-provided connectors. Introduces new data connector
    type, `CUSTOM`, parameterized with `class` property. Also adds support
    for custom data connector enum on service side.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chemikadze/incubator-griffin GRIFFIN-213

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-griffin/pull/456.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #456
    
----
commit d487347a363f172cfc9e26225d5687cc3f95ab73
Author: Nikolay Sokolov <ch...@...>
Date:   2018-11-17T22:37:36Z

    [GRIFFIN-213] Custom connector support
    
    Provide ability to extend batch and streaming data integrations
    with custom user-provided connectors. Introduces new data connector
    type, `CUSTOM`, parameterized with `class` property. Also adds support
    for custom data connector enum on service side.

----


---

[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-griffin/pull/456


---

[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

Posted by gavlyukovskiy <gi...@git.apache.org>.
Github user gavlyukovskiy commented on a diff in the pull request:

    https://github.com/apache/incubator-griffin/pull/456#discussion_r234425707
  
    --- Diff: measure/src/main/scala/org/apache/griffin/measure/datasource/connector/DataConnectorFactory.scala ---
    @@ -84,6 +87,26 @@ object DataConnectorFactory extends Loggable {
         }
       }
     
    +  private def getCustomConnector(session: SparkSession,
    +                                 context: StreamingContext,
    +                                 param: DataConnectorParam,
    +                                 storage: TimestampStorage,
    +                                 maybeClient: Option[StreamingCacheClient]): DataConnector = {
    +    val className = param.getConfig("class").asInstanceOf[String]
    +    val cls = Class.forName(className)
    +    if (classOf[BatchDataConnector].isAssignableFrom(cls)) {
    +      val ctx = BatchDataConnectorContext(session, param, storage)
    +      val meth = cls.getDeclaredMethod("apply", classOf[BatchDataConnectorContext])
    +      meth.invoke(null, ctx).asInstanceOf[BatchDataConnector]
    +    } else if (classOf[StreamingDataConnector].isAssignableFrom(cls)) {
    +      val ctx = StreamingDataConnectorContext(session, context, param, storage, maybeClient)
    +      val meth = cls.getDeclaredMethod("apply", classOf[StreamingDataConnectorContext])
    +      meth.invoke(null, ctx).asInstanceOf[StreamingDataConnector]
    +    } else {
    +      throw new ClassCastException("")
    --- End diff --
    
    It would be nice to have here message that custom connector class must extend `BatchDataConnector` or `StreamingDataConnector`


---

[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

Posted by chemikadze <gi...@git.apache.org>.
Github user chemikadze commented on a diff in the pull request:

    https://github.com/apache/incubator-griffin/pull/456#discussion_r235266418
  
    --- Diff: griffin-doc/measure/measure-configuration-guide.md ---
    @@ -188,7 +188,7 @@ Above lists DQ job configure parameters.
     - **sinks**: Whitelisted sink types for this job. Note: no sinks will be used, if empty or omitted. 
     
     ### <a name="data-connector"></a>Data Connector
    -- **type**: Data connector type, "AVRO", "HIVE", "TEXT-DIR" for batch mode, "KAFKA" for streaming mode.
    +- **type**: Data connector type: "AVRO", "HIVE", "TEXT-DIR", "CUSTOM" for batch mode; "KAFKA", "CUSTOM" for streaming mode.
    --- End diff --
    
    ANY seems to have "auto-detect" meaning to it, while in fact it's just plugin mechanism.
    What about CLASS, CLASSNAME, PLUGIN or EXTERNAL?


---

[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

Posted by toyboxman <gi...@git.apache.org>.
Github user toyboxman commented on a diff in the pull request:

    https://github.com/apache/incubator-griffin/pull/456#discussion_r235585607
  
    --- Diff: griffin-doc/measure/measure-configuration-guide.md ---
    @@ -188,7 +188,7 @@ Above lists DQ job configure parameters.
     - **sinks**: Whitelisted sink types for this job. Note: no sinks will be used, if empty or omitted. 
     
     ### <a name="data-connector"></a>Data Connector
    -- **type**: Data connector type, "AVRO", "HIVE", "TEXT-DIR" for batch mode, "KAFKA" for streaming mode.
    +- **type**: Data connector type: "AVRO", "HIVE", "TEXT-DIR", "CUSTOM" for batch mode; "KAFKA", "CUSTOM" for streaming mode.
    --- End diff --
    
    EXTERNAL seems better, which means a third-party or vendor's data connector.
    
    @guoyuepeng @chemikadze  how do you think about?


---

[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

Posted by toyboxman <gi...@git.apache.org>.
Github user toyboxman commented on a diff in the pull request:

    https://github.com/apache/incubator-griffin/pull/456#discussion_r234475362
  
    --- Diff: griffin-doc/measure/measure-configuration-guide.md ---
    @@ -188,7 +188,7 @@ Above lists DQ job configure parameters.
     - **sinks**: Whitelisted sink types for this job. Note: no sinks will be used, if empty or omitted. 
     
     ### <a name="data-connector"></a>Data Connector
    -- **type**: Data connector type, "AVRO", "HIVE", "TEXT-DIR" for batch mode, "KAFKA" for streaming mode.
    +- **type**: Data connector type: "AVRO", "HIVE", "TEXT-DIR", "CUSTOM" for batch mode; "KAFKA", "CUSTOM" for streaming mode.
    --- End diff --
    
    do you think about 'ANY' as replacement for 'CUSTOM'
    
    **type**: Data connector type, "AVRO", "HIVE", "TEXT-DIR" for batch mode, "KAFKA" for streaming mode, "ANY" for boths.


---

[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

Posted by chemikadze <gi...@git.apache.org>.
Github user chemikadze commented on a diff in the pull request:

    https://github.com/apache/incubator-griffin/pull/456#discussion_r235619493
  
    --- Diff: griffin-doc/measure/measure-configuration-guide.md ---
    @@ -188,7 +188,7 @@ Above lists DQ job configure parameters.
     - **sinks**: Whitelisted sink types for this job. Note: no sinks will be used, if empty or omitted. 
     
     ### <a name="data-connector"></a>Data Connector
    -- **type**: Data connector type, "AVRO", "HIVE", "TEXT-DIR" for batch mode, "KAFKA" for streaming mode.
    +- **type**: Data connector type: "AVRO", "HIVE", "TEXT-DIR", "CUSTOM" for batch mode; "KAFKA", "CUSTOM" for streaming mode.
    --- End diff --
    
    Let stick to CUSTOM then.


---

[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

Posted by toyboxman <gi...@git.apache.org>.
Github user toyboxman commented on a diff in the pull request:

    https://github.com/apache/incubator-griffin/pull/456#discussion_r235621954
  
    --- Diff: griffin-doc/measure/measure-configuration-guide.md ---
    @@ -188,7 +188,7 @@ Above lists DQ job configure parameters.
     - **sinks**: Whitelisted sink types for this job. Note: no sinks will be used, if empty or omitted. 
     
     ### <a name="data-connector"></a>Data Connector
    -- **type**: Data connector type, "AVRO", "HIVE", "TEXT-DIR" for batch mode, "KAFKA" for streaming mode.
    +- **type**: Data connector type: "AVRO", "HIVE", "TEXT-DIR", "CUSTOM" for batch mode; "KAFKA", "CUSTOM" for streaming mode.
    --- End diff --
    
    sure, please go ahead


---

[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

Posted by chemikadze <gi...@git.apache.org>.
Github user chemikadze commented on a diff in the pull request:

    https://github.com/apache/incubator-griffin/pull/456#discussion_r234426015
  
    --- Diff: measure/src/main/scala/org/apache/griffin/measure/datasource/connector/DataConnectorFactory.scala ---
    @@ -84,6 +87,26 @@ object DataConnectorFactory extends Loggable {
         }
       }
     
    +  private def getCustomConnector(session: SparkSession,
    +                                 context: StreamingContext,
    +                                 param: DataConnectorParam,
    +                                 storage: TimestampStorage,
    +                                 maybeClient: Option[StreamingCacheClient]): DataConnector = {
    +    val className = param.getConfig("class").asInstanceOf[String]
    +    val cls = Class.forName(className)
    +    if (classOf[BatchDataConnector].isAssignableFrom(cls)) {
    +      val ctx = BatchDataConnectorContext(session, param, storage)
    +      val meth = cls.getDeclaredMethod("apply", classOf[BatchDataConnectorContext])
    +      meth.invoke(null, ctx).asInstanceOf[BatchDataConnector]
    +    } else if (classOf[StreamingDataConnector].isAssignableFrom(cls)) {
    +      val ctx = StreamingDataConnectorContext(session, context, param, storage, maybeClient)
    +      val meth = cls.getDeclaredMethod("apply", classOf[StreamingDataConnectorContext])
    +      meth.invoke(null, ctx).asInstanceOf[StreamingDataConnector]
    +    } else {
    +      throw new ClassCastException("")
    --- End diff --
    
    Oh, thanks for reminding! Planned to do that, but got distracted.


---

[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

Posted by guoyuepeng <gi...@git.apache.org>.
Github user guoyuepeng commented on a diff in the pull request:

    https://github.com/apache/incubator-griffin/pull/456#discussion_r235603134
  
    --- Diff: griffin-doc/measure/measure-configuration-guide.md ---
    @@ -188,7 +188,7 @@ Above lists DQ job configure parameters.
     - **sinks**: Whitelisted sink types for this job. Note: no sinks will be used, if empty or omitted. 
     
     ### <a name="data-connector"></a>Data Connector
    -- **type**: Data connector type, "AVRO", "HIVE", "TEXT-DIR" for batch mode, "KAFKA" for streaming mode.
    +- **type**: Data connector type: "AVRO", "HIVE", "TEXT-DIR", "CUSTOM" for batch mode; "KAFKA", "CUSTOM" for streaming mode.
    --- End diff --
    
    I still think 'custom' better than others. 


---

[GitHub] incubator-griffin pull request #456: [GRIFFIN-213] Custom connector support

Posted by guoyuepeng <gi...@git.apache.org>.
Github user guoyuepeng commented on a diff in the pull request:

    https://github.com/apache/incubator-griffin/pull/456#discussion_r235226675
  
    --- Diff: griffin-doc/measure/measure-configuration-guide.md ---
    @@ -188,7 +188,7 @@ Above lists DQ job configure parameters.
     - **sinks**: Whitelisted sink types for this job. Note: no sinks will be used, if empty or omitted. 
     
     ### <a name="data-connector"></a>Data Connector
    -- **type**: Data connector type, "AVRO", "HIVE", "TEXT-DIR" for batch mode, "KAFKA" for streaming mode.
    +- **type**: Data connector type: "AVRO", "HIVE", "TEXT-DIR", "CUSTOM" for batch mode; "KAFKA", "CUSTOM" for streaming mode.
    --- End diff --
    
    CUSTOM looks good to me.


---