Posted to dev@sqoop.apache.org by "Veena Basavaraj (JIRA)" <ji...@apache.org> on 2015/03/16 21:17:38 UTC

[jira] [Comment Edited] (SQOOP-1803) JobManager and Execution Engine changes: Support for a injecting and pulling out configs and job output in connectors

    [ https://issues.apache.org/jira/browse/SQOOP-1803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363551#comment-14363551 ] 

Veena Basavaraj edited comment on SQOOP-1803 at 3/16/15 8:17 PM:
-----------------------------------------------------------------

To clarify the point I made a few days ago, which may not have caught your attention [~jarcec], here are the details.

MutableContext is not persisted today, and it allows only certain types such as int/long/boolean/String. My question was whether we should also allow a list, a map, or any object to be stored in it. The key/value pairs are already uniquely identified, so any config sits underneath a key/value pair, and we can keep this interface to update or overwrite any of these config values. I do not see a need for a special API for doing this.

The only additional change is to look up the context map that the initializer has already set and then persist its entries. We can add a new property that marks each context value as "transient" or "persistent", so we don't end up shoving everything in this object into the repository. Makes sense?
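
To make the transient/persistent idea concrete, here is a rough sketch. It is not the actual MutableContext API: the class SketchContext, the Scope enum, and the method persistentEntries() are all hypothetical names used only for illustration, assuming that values can be arbitrary objects and that only entries flagged as persistent would be handed to the repository.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only; this is NOT Sqoop's MutableContext.
class SketchContext {
    // Assumed flag: should this value be written to the repository?
    enum Scope { TRANSIENT, PERSISTENT }

    private final Map<String, Object> values = new LinkedHashMap<>();
    private final Map<String, Scope> scopes = new HashMap<>();

    // Keys are unique, so setting an existing key overwrites its value and scope.
    void set(String key, Object value, Scope scope) {
        values.put(key, value);
        scopes.put(key, scope);
    }

    Object get(String key) {
        return values.get(key);
    }

    // Returns a read-only view of only the entries marked PERSISTENT.
    Map<String, Object> persistentEntries() {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : values.entrySet()) {
            if (scopes.get(e.getKey()) == Scope.PERSISTENT) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return Collections.unmodifiableMap(out);
    }
}
```

With something like this, the persisting step would only need to iterate over persistentEntries() and never see the transient values.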

Second, and most important, the code I posted above in the JobManager class runs only when the job has completed successfully, so there is no need to worry about any synchronization issues at this point:
{code}
      RepositoryManager.getInstance().getRepository().updateJobConfig( ...)

{code}

A few more details, after thinking it through. My thought when I first used the distributed cache was to do this update in the "output committer", since it is guaranteed to be called once. Similar to how the current SqoopDestroyerExecutor is invoked, we would need a MutableContextPersistExecutor or something along those lines.
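
As a rough illustration of the executor idea, the sketch below shows a hook that hands the collected state to a repository callback at most once. Everything here is an assumption for illustration: PersistOnceSketch is a made-up class, the Consumer stands in for whatever repository update call would be used, and the AtomicBoolean guard is my own addition to make the "once" contract explicit, not something the output committer requires.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

// Hypothetical sketch; not an existing Sqoop class.
class PersistOnceSketch {
    private final AtomicBoolean done = new AtomicBoolean(false);
    // Stand-in for the repository update; the real call is not shown here.
    private final Consumer<Map<String, Object>> repositorySink;

    PersistOnceSketch(Consumer<Map<String, Object>> repositorySink) {
        this.repositorySink = repositorySink;
    }

    // Intended to be called from the commit step.
    // Returns true only for the call that actually persisted the state.
    boolean commit(Map<String, Object> state) {
        if (!done.compareAndSet(false, true)) {
            return false; // already persisted, ignore repeated calls
        }
        repositorySink.accept(state);
        return true;
    }
}
```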


For this 



> JobManager and Execution Engine changes: Support for a injecting and pulling out configs and job output in connectors 
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SQOOP-1803
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1803
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Veena Basavaraj
>            Assignee: Veena Basavaraj
>             Fix For: 1.99.6
>
>
> The details are in the design wiki; as the implementation happens, more discussions can happen here.
> https://cwiki.apache.org/confluence/display/SQOOP/Delta+Fetch+And+Merge+Design#DeltaFetchAndMergeDesign-Howtogetoutputfromconnectortosqoop?
> The goal is to dynamically inject an IncrementalConfig instance into the FromJobConfiguration. The current MFromConfig and MToConfig can already hold a list of configs, and a strong sentiment was expressed to keep it as a list, so why not actually make use of it for the first time and group the incremental-related configs in one config object?
> This task will prepare the FromJobConfiguration from the job config data, and the ExtractorContext with the relevant values from the previous job run.
> This task will prepare the ToJobConfiguration from the job config data, and the LoaderContext with the relevant values from the previous job run, if any.
> We will use DistributedCache to get state information out of the Extractor and Loader, and finally persist it into the Sqoop repository, depending on SQOOP-1804, once the output committer's commit is called.


