Posted to dev@sqoop.apache.org by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org> on 2015/04/14 05:20:12 UTC

[jira] [Updated] (SQOOP-2299) Sqoop2: Store Context classes in repository

     [ https://issues.apache.org/jira/browse/SQOOP-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jarek Jarcec Cecho updated SQOOP-2299:
--------------------------------------
    Attachment: SQOOP-2299.patch

Attaching a partial patch: I have a working Derby repository, but support in the PostgreSQL repository is still missing. As a fairly large unit of work is done (and PostgreSQL will be "the same"), I think it would be great if others could take a look and comment on whether my direction makes sense.

Nevertheless, please do not commit this patch yet; I will include the PostgreSQL repo and probably clean it up a bit before committing.

> Sqoop2: Store Context classes in repository
> -------------------------------------------
>
>                 Key: SQOOP-2299
>                 URL: https://issues.apache.org/jira/browse/SQOOP-2299
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.99.5
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Jarek Jarcec Cecho
>             Fix For: 1.99.7
>
>         Attachments: SQOOP-2299.patch
>
>
> While looking into persisting state from an incremental job (SQOOP-1803), I've uncovered a Hadoop bug where any Hadoop 2 version will return an incorrect {{job.xml}} when using {{JobClient}} APIs to get a job's details. The issue is harder to track as it was initially fixed in Hadoop 2.7.0 via MAPREDUCE-5875, but subsequently reverted because of MAPREDUCE-6288, and it's not clear to me when/if the fix will be provided. This is relevant to us as we are storing our {{Context}} classes in the job conf. I've looked into why nobody has seen this problem before, and it seems that projects generally persist properties in their own repositories rather than using Hadoop APIs to retrieve the {{Configuration}} object back.
> Thinking about it a bit more, I think that it would be useful to keep track of the context classes, as they contain additional information that can be useful for debugging purposes. I'm not yet sure whether we should expose those objects over the REST interface, as they can possibly contain sensitive information, but it seems useful to at least persist them.


