Posted to dev@sqoop.apache.org by "Cheolsoo Park (Commented) (JIRA)" <ji...@apache.org> on 2012/03/27 01:26:27 UTC

[jira] [Commented] (SQOOP-465) BLOB support for Avro import

    [ https://issues.apache.org/jira/browse/SQOOP-465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238976#comment-13238976 ] 

Cheolsoo Park commented on SQOOP-465:
-------------------------------------

Currently, large objects (LOBs) are handled in Sqoop as follows:

1) If a LOB is smaller than MAX_INLINE_LOB_LEN (16 MB by default), it is imported inline into text (or sequence) files, just like any other type of data.

2) If a LOB is larger than MAX_INLINE_LOB_LEN, it is saved in a .lob file, and reference files that contain information about these .lob files are created. The content of a reference file looks like this:

lf,<path>,<offset>,<length>

But if the --as-sequencefile option is enabled, Sqoop generates the reference files as sequence files, while the LOBs are still saved in .lob files. (The .lob file format specification can be found at https://github.com/cloudera/sqoop/wiki/sip-3.)
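
To make the reference record layout concrete, here is a minimal sketch of a parser for a line of the form lf,<path>,<offset>,<length>. This is not Sqoop code: the names LobReference and parse_lob_reference are hypothetical, the sample path in the usage note is made up, and the sketch assumes that offset and length are decimal integers.

```python
from typing import NamedTuple


class LobReference(NamedTuple):
    """Hypothetical container for one reference record."""
    kind: str    # leading field, "lf" in the format described above
    path: str    # location of the .lob file
    offset: int  # assumed: decimal offset of the LOB inside the .lob file
    length: int  # assumed: decimal length of the LOB


def parse_lob_reference(line: str) -> LobReference:
    """Parse a 'lf,<path>,<offset>,<length>' record (illustrative only)."""
    # Split the first field off from the left and the last two from the
    # right, so that a path containing commas still parses.
    kind, rest = line.strip().split(",", 1)
    path, offset, length = rest.rsplit(",", 2)
    return LobReference(kind, path, int(offset), int(length))
```

For example, parse_lob_reference("lf,_lob/large_obj_0.lob,68,1048576") would yield a record whose path is "_lob/large_obj_0.lob", whose offset is 68, and whose length is 1048576.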


As the first step toward BLOB support for Avro import, I am going to follow the current semantics of --as-sequencefile. That is, if the --as-avrodatafile option is enabled:

1) If a LOB is smaller than MAX_INLINE_LOB_LEN, it will be stored inline in the Avro data files.

2) If a LOB is larger than MAX_INLINE_LOB_LEN, the reference files will be generated as Avro data files, while the LOB itself is still saved in a .lob file.
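
The size-based rule above can be sketched as follows. This is only an illustration of the decision, not the actual Sqoop implementation: the function name choose_lob_storage is hypothetical, and because the description does not say what happens when a LOB is exactly MAX_INLINE_LOB_LEN bytes long, treating that case as inline is an assumption.

```python
# 16 MB default threshold, as stated in the comment above.
MAX_INLINE_LOB_LEN = 16 * 1024 * 1024


def choose_lob_storage(lob_length: int,
                       max_inline: int = MAX_INLINE_LOB_LEN) -> str:
    """Return where a LOB of the given length would be stored.

    "inline"   -> the LOB bytes go into the output data file itself
    "external" -> the LOB goes to a .lob file and the data file
                  only carries a reference record
    """
    # Assumption: a LOB of exactly max_inline bytes is kept inline.
    if lob_length > max_inline:
        return "external"
    return "inline"
```

Under this sketch, a 1 KB BLOB would be written inline, while a 32 MB BLOB would be written to a .lob file with only its reference stored in the Avro (or sequence) file.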
                
> BLOB support for Avro import
> ----------------------------
>
>                 Key: SQOOP-465
>                 URL: https://issues.apache.org/jira/browse/SQOOP-465
>             Project: Sqoop
>          Issue Type: Improvement
>            Reporter: Bilung Lee
>            Assignee: Cheolsoo Park
>
> BLOB is supported for text import already.
> Provide further support for Avro import.
