Posted to dev@phoenix.apache.org by "Gabriel Reid (JIRA)" <ji...@apache.org> on 2014/03/19 22:08:48 UTC

[jira] [Resolved] (PHOENIX-129) Improve MapReduce-based import

     [ https://issues.apache.org/jira/browse/PHOENIX-129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabriel Reid resolved PHOENIX-129.
----------------------------------

       Resolution: Fixed
    Fix Version/s: 5.0.0
                   4.0.0
                   3.0.0

> Improve MapReduce-based import
> ------------------------------
>
>                 Key: PHOENIX-129
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-129
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Gabriel Reid
>            Assignee: Gabriel Reid
>             Fix For: 3.0.0, 4.0.0, 5.0.0
>
>         Attachments: PHOENIX-129-3.0.patch, PHOENIX-129-3.0_2.patch, PHOENIX-129-master.patch, PHOENIX-129-master_2.patch
>
>
> In implementing PHOENIX-66, it was noted that the current MapReduce-based importer implementation has a number of issues, including the following:
> * CSV handling is largely replicated from the non-MR code, with no ability to specify custom separators
> * No automated tests, and code is written in a way that makes it difficult to test
> * Unusual custom config loading and handling instead of using GenericOptionsParser and ToolRunner and friends
> The initial work towards PHOENIX-66 refactored the MR importer just enough to use common code, until the development of automated tests exposed the fact that the MR importer could use some major refactoring.
> This ticket is a proposal to do a relatively major rework of the MR import, fixing the above issues. The biggest improvements that will result from this are a common codebase for handling CSV input, and the addition of automated testing for the MR import.
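
As a rough illustration of the "common codebase for handling CSV input" idea, here is a minimal sketch of a line parser with a configurable separator that both the MR and non-MR import paths could share. The class name DelimitedLineParser and its methods are hypothetical and are not taken from the attached patches or from Phoenix itself; it only assumes one record per line and quote-doubling as the escape convention.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical shared line parser; not the code from the PHOENIX-129 patches. */
public class DelimitedLineParser {
    private final char separator;
    private final char quote;

    public DelimitedLineParser(char separator, char quote) {
        this.separator = separator;
        this.quote = quote;
    }

    /** Splits one line into fields, honoring quoted fields and doubled quotes. */
    public List<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == quote) {
                    if (i + 1 < line.length() && line.charAt(i + 1) == quote) {
                        current.append(quote); // doubled quote is a literal quote
                        i++;
                    } else {
                        inQuotes = false;      // closing quote
                    }
                } else {
                    current.append(c);         // separators inside quotes are data
                }
            } else if (c == quote) {
                inQuotes = true;
            } else if (c == separator) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());        // last field has no trailing separator
        return fields;
    }

    public static void main(String[] args) {
        // A custom separator ('|') instead of the default comma.
        DelimitedLineParser parser = new DelimitedLineParser('|', '"');
        System.out.println(parser.parse("1|Reid|\"x|y\"")); // prints [1, Reid, x|y]
    }
}
```

Because the parser has no dependency on Hadoop classes, it can be covered by plain unit tests and then called from a mapper and from the non-MR loader alike, which is the testability benefit the ticket is after.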



--
This message was sent by Atlassian JIRA
(v6.2#6252)