Posted to dev@griffin.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/06/17 15:07:00 UTC

[jira] [Work logged] (GRIFFIN-305) Standardize Sink Hierarchy

     [ https://issues.apache.org/jira/browse/GRIFFIN-305?focusedWorklogId=447342&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-447342 ]

ASF GitHub Bot logged work on GRIFFIN-305:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Jun/20 15:06
            Start Date: 17/Jun/20 15:06
    Worklog Time Spent: 10m 
      Work Description: chitralverma opened a new pull request #575:
URL: https://github.com/apache/griffin/pull/575


   **What changes were proposed in this pull request?**
   
   The current implementation of `Sinks` in Griffin has the issues listed below, which this PR aims to fix.
   - `Sinks` are based on the recursive `MultiSink` class, which is itself a sink but is implemented as a `Seq` underneath; this causes ambiguity and adds little value. `MultiSink` has been removed.
   - Some unused code like `SinkContext` has been removed.
   - Data is converted from the performant DataFrame to an RDD while persisting, in both streaming and batch pipelines. A new method, `sinkBatchRecords`, has been added to allow operations directly on the DataFrame for batch pipelines. Streaming still uses the old implementation, which will be replaced with Structured Streaming.
   - Refactored the methods of `Sink`: renamed `start`/`finish` to `open`/`close`, and fixed `jobName` being incorrectly passed as `metricName`.
   - Presently, only one instance of a sink of a given type can be defined in the env config, which rules out configuring multiple sinks of the same type, such as HDFS or JDBC. Added a sink `name` to the env config; the same name is used in the job config to select which sink a job should use.
   - Updated all sinks as per the changes above, with some additional changes to `ConsoleSink`.
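   
   The shape of the changes listed above can be illustrated with a minimal, dependency-free sketch. This is not Griffin's actual code: the signatures, the `InMemorySink` class, the `selectSinks` helper and the sink names are hypothetical, and a plain `Seq[String]` stands in for a Spark DataFrame so the example runs without Spark. It only shows the `open`/`close` lifecycle, a batch-level `sinkBatchRecords` call, and selection of sinks by `name` from a plain `Seq` rather than a recursive `MultiSink`.

   ```scala
   import scala.collection.mutable.ArrayBuffer

   // Hypothetical sketch, not Griffin's real Sink API.
   // Seq[String] stands in for a Spark DataFrame to keep this self-contained.
   trait Sink {
     def name: String                                  // name used to refer to the sink from configs
     def open(): Unit                                  // lifecycle start (formerly `start`)
     def close(): Unit                                 // lifecycle end (formerly `finish`)
     def sinkBatchRecords(records: Seq[String]): Unit  // write a whole batch at once
   }

   // Illustrative sink that just buffers records in memory.
   final class InMemorySink(val name: String) extends Sink {
     private val buffer = ArrayBuffer.empty[String]
     private var opened = false

     def open(): Unit = { opened = true }
     def close(): Unit = { opened = false }

     def sinkBatchRecords(records: Seq[String]): Unit = {
       require(opened, s"sink '$name' is not open")
       buffer ++= records
     }

     def written: Seq[String] = buffer.toSeq
   }

   object SinkDemo {
     // Sinks live in a plain Seq and are picked by name (assumed behaviour).
     def selectSinks(available: Seq[Sink], wanted: Seq[String]): Seq[Sink] =
       available.filter(s => wanted.contains(s.name))

     def main(args: Array[String]): Unit = {
       // Two sinks of the same type, told apart only by name.
       val hdfsA = new InMemorySink("hdfs_a")
       val hdfsB = new InMemorySink("hdfs_b")

       val selected = selectSinks(Seq(hdfsA, hdfsB), Seq("hdfs_b"))
       selected.foreach { s =>
         s.open()
         try s.sinkBatchRecords(Seq("r1", "r2"))
         finally s.close()
       }

       println(hdfsA.written.size) // prints 0
       println(hdfsB.written.size) // prints 2
     }
   }
   ```

   Keying sinks by name rather than by type is what lets two sinks of the same type coexist in this sketch; the real config keys and class names are defined in the pull request itself.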
   
   **Does this PR introduce any user-facing change?**
   Yes. As mentioned above, the sink config has changed in env and job configs.
   
   **How was this patch tested?**
   Griffin test suite and additional unit test cases


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 447342)
    Remaining Estimate: 0h
            Time Spent: 10m

> Standardize Sink Hierarchy
> --------------------------
>
>                 Key: GRIFFIN-305
>                 URL: https://issues.apache.org/jira/browse/GRIFFIN-305
>             Project: Griffin
>          Issue Type: Sub-task
>            Reporter: Chitral Verma
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)