Posted to dev@pig.apache.org by "Hariprasad Kuppuswamy (JIRA)" <ji...@apache.org> on 2011/06/09 11:31:59 UTC

[jira] [Created] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
---------------------------------------------------------------------------------

                 Key: PIG-2114
                 URL: https://issues.apache.org/jira/browse/PIG-2114
             Project: Pig
          Issue Type: New Feature
          Components: impl
    Affects Versions: 0.9.0
            Reporter: Hariprasad Kuppuswamy
            Priority: Minor
             Fix For: 0.9.0
         Attachments: 0001-Enhancements-to-Pig-HBase-Load-Storage-function-as-l.patch

- Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
- Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
- Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
- Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
- Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Matt Henkel (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067991#comment-13067991 ] 

Matt Henkel commented on PIG-2114:
----------------------------------

Currently the "scanStartTimestamp" parameter description states that "Record version timestamps must be greater than this value"; however, it should say "greater than or equal to", as Scan's minStamp is inclusive (see https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#setTimeRange(long, long)).
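
The boundary semantics referred to here can be sketched as follows. This is only an illustration of the half-open interval [minStamp, maxStamp) documented for Scan.setTimeRange; TimeRangeSketch is a hypothetical stand-in written for this example, not an HBase or Pig class.

```java
/** Hypothetical model of the half-open time range [minStamp, maxStamp). */
final class TimeRangeSketch {
    private final long minStamp; // inclusive lower bound
    private final long maxStamp; // exclusive upper bound

    TimeRangeSketch(long minStamp, long maxStamp) {
        if (minStamp < 0 || maxStamp < minStamp) {
            throw new IllegalArgumentException("invalid time range");
        }
        this.minStamp = minStamp;
        this.maxStamp = maxStamp;
    }

    /** True when a cell with this timestamp falls inside the range. */
    boolean contains(long timestamp) {
        return timestamp >= minStamp && timestamp < maxStamp;
    }
}
```

With a range of (100, 200), a cell stamped exactly 100 is included while a cell stamped exactly 200 is not, which is why "greater than or equal to" is the accurate wording for the start timestamp.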

> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>             Fix For: 0.10
>
>         Attachments: Enhancments-to-enable-timestampversion-based-row-scan.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Commented] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065738#comment-13065738 ] 

Dmitriy V. Ryaboy commented on PIG-2114:
----------------------------------------

That'd be great!
Do note my point about your change to the default behavior w.r.t. multiple versions -- it can't go in like that, default has to stay "get the latest".

> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>             Fix For: 0.10
>
>         Attachments: 0001-Enhancements-to-Pig-HBase-Load-Storage-function-as-l.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Commented] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067823#comment-13067823 ] 

Hariprasad Kuppuswamy commented on PIG-2114:
--------------------------------------------

Okay, it took me a little longer to squeeze in some time for this.

For the timestamp-based row view, I have introduced a new flag, and this patch also has a small unit test with a test Pig script.
Let me know if this is good.

Thanks,

> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>             Fix For: 0.10
>
>         Attachments: Enhancments-to-enable-timestampversion-based-row-scan.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Updated] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hariprasad Kuppuswamy updated PIG-2114:
---------------------------------------

    Attachment:     (was: 0001-Enhancements-to-Pig-HBase-Load-Storage-function-as-l.patch)

> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>             Fix For: 0.10
>
>         Attachments: Enhancments-to-enable-timestampversion-based-row-scan.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Commented] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13213709#comment-13213709 ] 

Hariprasad Kuppuswamy commented on PIG-2114:
--------------------------------------------

Dmitriy: Any updates?

                
> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>         Attachments: Changes-to-configure-omitnulls-puttimestamp-rowkeyprefixes.patch
>
>
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Updated] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hariprasad Kuppuswamy updated PIG-2114:
---------------------------------------

    Description: 
- Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
- Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
- Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


  was:
- Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
- Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
- Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
- Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
- Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


    
> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Updated] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Alan Gates (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates updated PIG-2114:
----------------------------

    Fix Version/s:     (was: 0.10)
           Status: Open  (was: Patch Available)

Cancelling patch as there has been no response to Dmitriy's concerns.

Unlinking from 0.10.
                
> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>         Attachments: Enhancments-to-enable-timestampversion-based-row-scan.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Updated] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hariprasad Kuppuswamy updated PIG-2114:
---------------------------------------

    Attachment:     (was: Enhancments-to-enable-timestampversion-based-row-scan.patch)
    
> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Commented] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065734#comment-13065734 ] 

Hariprasad Kuppuswamy commented on PIG-2114:
--------------------------------------------

Bill/Dmitriy,
Agreed. I have a working version of the multiple-version fetch, with a unit test.
I will post a patch sometime this weekend.

Will that be good?

Thanks

> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>             Fix For: 0.10
>
>         Attachments: 0001-Enhancements-to-Pig-HBase-Load-Storage-function-as-l.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Updated] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hariprasad Kuppuswamy updated PIG-2114:
---------------------------------------

    Status: Patch Available  (was: Open)

> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>             Fix For: 0.9.0
>
>         Attachments: 0001-Enhancements-to-Pig-HBase-Load-Storage-function-as-l.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Assigned] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy reassigned PIG-2114:
--------------------------------------

    Assignee: Hariprasad Kuppuswamy

Assigning to Hariprasad so he can get proper JIRA credit :).

> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>             Fix For: 0.10
>
>         Attachments: 0001-Enhancements-to-Pig-HBase-Load-Storage-function-as-l.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Updated] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hariprasad Kuppuswamy updated PIG-2114:
---------------------------------------

    Status: Patch Available  (was: Open)

> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>             Fix For: 0.10
>
>         Attachments: Enhancments-to-enable-timestampversion-based-row-scan.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Updated] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hariprasad Kuppuswamy updated PIG-2114:
---------------------------------------

    Attachment: Changes-to-configure-omitnulls-puttimestamp-rowkeyprefixes.patch
    
> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>         Attachments: Changes-to-configure-omitnulls-puttimestamp-rowkeyprefixes.patch
>
>
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Updated] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hariprasad Kuppuswamy updated PIG-2114:
---------------------------------------

    Attachment: 0001-Enhancements-to-Pig-HBase-Load-Storage-function-as-l.patch

> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>             Fix For: 0.9.0
>
>         Attachments: 0001-Enhancements-to-Pig-HBase-Load-Storage-function-as-l.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Updated] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hariprasad Kuppuswamy updated PIG-2114:
---------------------------------------

    Attachment: Enhancments-to-enable-timestampversion-based-row-scan.patch

> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>             Fix For: 0.10
>
>         Attachments: Enhancments-to-enable-timestampversion-based-row-scan.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Commented] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Eric Yang (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243259#comment-13243259 ] 

Eric Yang commented on PIG-2114:
--------------------------------

The Puttimestamp option is a concern. The storefunc can only populate one timestamp per invocation, which can overwrite data with the same row key. Is this behavior acceptable? Could you provide a test case?
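
The overwrite concern can be illustrated with a small model. This is a sketch under stated assumptions, not the patch's code: PutTimestampSketch is a hypothetical in-memory stand-in for a single HBase column, where each cell version is identified by row key plus timestamp, so a second write with the same row key and the same fixed timestamp replaces the first instead of adding a version.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of one versioned column. */
final class PutTimestampSketch {
    // row key -> (timestamp -> value); one entry per stored version
    private final Map<String, TreeMap<Long, String>> cells = new HashMap<>();

    /** Writes a value at an explicit timestamp, replacing any value already stored there. */
    void put(String rowKey, long timestamp, String value) {
        cells.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(timestamp, value);
    }

    /** Number of distinct versions kept for the row. */
    int versionCount(String rowKey) {
        TreeMap<Long, String> versions = cells.get(rowKey);
        return versions == null ? 0 : versions.size();
    }

    /** Value with the highest timestamp, or null if the row is absent. */
    String latest(String rowKey) {
        TreeMap<Long, String> versions = cells.get(rowKey);
        return versions == null ? null : versions.lastEntry().getValue();
    }
}
```

If one store invocation stamps every record with the same timestamp, two records sharing a row key collapse into a single version, and only the later write survives. With distinct timestamps both versions would be kept.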
                
> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>         Attachments: Changes-to-configure-omitnulls-puttimestamp-rowkeyprefixes.patch
>
>
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Commented] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082955#comment-13082955 ] 

Dmitriy V. Ryaboy commented on PIG-2114:
----------------------------------------

Apologies again for taking a while to review.

Thanks, that looks like a fair bit of work.

First, just a couple of procedural notes:
1) make sure the new files don't have @author annotations and do have the apache headers
2) there is already a TestHBaseStorage. Why add a new one in util?
3) This is sort of a PITA request, especially as there are plenty of places in the codebase that don't adhere to this practice, but can you make sure to do things like put spaces after commas (as in Map<family,Map<qualifier,Map<timestamp,value>>>) and before opening parens (as in for(Map.Entry valueEntry: ...),  wrap lines to a reasonable length, etc?

My major concern with the patch is as follows.

In getNext() you inserted a completely new flow that is used if timestamps are used. It bypasses all the existing logic for how results are created, and as far as I can see, does not respect things like projection pushdown. It also makes it so any future work on the hbase loader logic has to happen in two places. Let's not do that. Isn't loading a single-version row just a special case of loading multiple versions (with n = 1)? We should be able to do this in one go.
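
The "n = 1 is a special case" idea can be sketched as a single read path. This is a minimal illustration, not HBaseStorage code: VersionedReadSketch, readVersions, and readLatest are hypothetical names, and a NavigableMap from timestamp to value stands in for the versions of one cell.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

/** Hypothetical single code path for reading one or many versions of a cell. */
final class VersionedReadSketch {

    /** Returns up to maxVersions values, newest timestamp first. */
    static List<String> readVersions(NavigableMap<Long, String> versionsByTimestamp,
                                     int maxVersions) {
        if (maxVersions < 1) {
            throw new IllegalArgumentException("maxVersions must be at least 1");
        }
        List<String> out = new ArrayList<>();
        for (String value : versionsByTimestamp.descendingMap().values()) {
            if (out.size() == maxVersions) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    /** Default behavior ("get the latest") expressed as the n = 1 case. */
    static String readLatest(NavigableMap<Long, String> versionsByTimestamp) {
        List<String> newest = readVersions(versionsByTimestamp, 1);
        return newest.isEmpty() ? null : newest.get(0);
    }
}
```

Because the single-version read delegates to the multi-version one, any later change to how results are built only has to be made in one place.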

There being so much stuff mixed in here, I propose we get the smaller stuff like PIG-2115 in. Some of the things you are doing here are also pretty non-controversial, like omitNulls and prefix filters, we can get those in pretty easily. Let's factor out the multiple versions changes and add them to PIG-1832, leaving this (blessedly unspecifically titled :)) ticket to deal with the smaller stuff.

> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>             Fix For: 0.10
>
>         Attachments: Enhancments-to-enable-timestampversion-based-row-scan.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Commented] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065469#comment-13065469 ] 

Dmitriy V. Ryaboy commented on PIG-2114:
----------------------------------------

Thanks for contributing, good stuff in there.

Comments on the patch:

- "none existent" should be "non-existent"

- if you add the line "Object storeValue = t.get(i);", might as well reuse storeValue in other places in this function rather than keep calling t.get

- the call to setMaxVersions() differs from current behavior. We currently fetch only the latest version and have no provisions for dealing with multiple versions. As Bill pointed out, that also means scanMaxVersions probably doesn't work

- this really needs tests for every new flag.
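
On the storeValue point above, a minimal sketch of reading each field once and reusing the local. It uses a plain List<Object> in place of a Pig Tuple; the class name, the method name, and the convention that index 0 holds the row key are assumptions for illustration only, not the patch's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; a List<Object> stands in for a Pig Tuple.
public class ReuseFieldValue {

    /** Collects the fields after the row key, skipping nulls when omitNulls is set. */
    public static List<Object> fieldsToStore(List<Object> tuple, boolean omitNulls) {
        List<Object> out = new ArrayList<>();
        for (int i = 1; i < tuple.size(); i++) {
            // Read the field once and reuse the local everywhere below.
            Object storeValue = tuple.get(i);
            if (omitNulls && storeValue == null) {
                continue;
            }
            out.add(storeValue);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Object> tuple = Arrays.asList("row1", "a", null, "b");
        System.out.println(fieldsToStore(tuple, true));  // prints [a, b]
        System.out.println(fieldsToStore(tuple, false)); // prints [a, null, b]
    }
}
```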



> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>             Fix For: 0.10
>
>         Attachments: 0001-Enhancements-to-Pig-HBase-Load-Storage-function-as-l.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Updated] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-2114:
-----------------------------------

    Status: Open  (was: Patch Available)

Canceling patch pending unit tests & other updates.


[jira] [Commented] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13122320#comment-13122320 ] 

Hariprasad Kuppuswamy commented on PIG-2114:
--------------------------------------------

Apologies for the long delay in responding!
This somehow slipped out of my backlog and I missed it completely.

I will upload a patch that addresses all of Dmitriy's concerns by this weekend.


                
> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>         Attachments: Enhancments-to-enable-timestampversion-based-row-scan.patch
>
>
> - Added capability to specify scan based on timestamps (Hariprasad Kuppuswamy)
> - Ability to specify number of versions to be fetched with current scan (Hariprasad Kuppuswamy)
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Commented] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125020#comment-13125020 ] 

Hariprasad Kuppuswamy commented on PIG-2114:
--------------------------------------------

Dmitriy,

"There being so much stuff mixed in here, I propose we get the smaller stuff like PIG-2115 in. Some of the things you are doing here are also pretty non-controversial, like omitNulls and prefix filters, we can get those in pretty easily."

Agreed. I modified the patch description and created a new patch as you requested.

"Let's factor out the multiple versions changes and add them to PIG-1832, leaving this (blessedly unspecifically titled :)) ticket to deal with the smaller stuff."

Okay, I will shortly update the patch there with the larger change, addressing all your concerns.

Hope this is good.
                
> Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-2114
>                 URL: https://issues.apache.org/jira/browse/PIG-2114
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Hariprasad Kuppuswamy
>            Assignee: Hariprasad Kuppuswamy
>            Priority: Minor
>              Labels: hbase, storage
>         Attachments: Changes-to-configure-omitnulls-puttimestamp-rowkeyprefixes.patch
>
>
> - Configure the rowkey prefixes filter for the scan (Hariprasad Kuppuswamy)
> - Added ability to omit nulls when dealing with hbase storage (Greg Bowyer)
> - Added ability to specify Put timestamps while insertion (Hariprasad Kuppuswamy)


[jira] [Commented] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Hariprasad Kuppuswamy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067830#comment-13067830 ] 

Hariprasad Kuppuswamy commented on PIG-2114:
--------------------------------------------

Also, this patch includes the changes from [PIG-2115|https://issues.apache.org/jira/browse/PIG-2115], as they were required for the unit test cases to run properly :)



[jira] [Commented] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Bill Graham (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063602#comment-13063602 ] 

Bill Graham commented on PIG-2114:
----------------------------------

The {{HBaseStorage}} schema returned does not support multiple cell versions, so I don't think the {{scanMaxVersions}} flag does anything in this implementation. A unit test might prove me wrong though. :)

There was discussion in PIG-1782 about creating an {{AdvancedHBaseStorage}} class with a more complex data structure that could support multiple cell versions.
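
For illustration only, one shape such a structure could take is a nested map of family -> qualifier -> timestamp -> value, with the current latest-only view derived from it. VersionedRow and its methods are hypothetical names, and String values stand in for raw bytes; this is a sketch under those assumptions, not the design discussed in PIG-1782.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch; names and types are assumptions, not an existing Pig class.
public class VersionedRow {
    // family -> qualifier -> timestamp -> value
    private final NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> cells =
            new TreeMap<>();

    public void put(String family, String qualifier, long timestamp, String value) {
        cells.computeIfAbsent(family, f -> new TreeMap<>())
             .computeIfAbsent(qualifier, q -> new TreeMap<>())
             .put(timestamp, value);
    }

    /** All stored versions of one cell, oldest first; empty if the cell is absent. */
    public NavigableMap<Long, String> versions(String family, String qualifier) {
        NavigableMap<String, NavigableMap<Long, String>> qualifiers = cells.get(family);
        if (qualifiers == null || !qualifiers.containsKey(qualifier)) {
            return new TreeMap<>();
        }
        return qualifiers.get(qualifier);
    }

    /** Latest value only, or null if the cell is absent. */
    public String latest(String family, String qualifier) {
        NavigableMap<Long, String> v = versions(family, qualifier);
        return v.isEmpty() ? null : v.lastEntry().getValue();
    }

    public static void main(String[] args) {
        VersionedRow row = new VersionedRow();
        row.put("cf", "col", 100L, "v1");
        row.put("cf", "col", 200L, "v2");

        System.out.println(row.latest("cf", "col"));   // prints v2
        System.out.println(row.versions("cf", "col")); // prints {100=v1, 200=v2}
    }
}
```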


[jira] [Updated] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Bill Graham (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Graham updated PIG-2114:
-----------------------------

    Patch Info:   (was: Patch Available)

Canceling patch due to unaddressed comments and missing unit tests.
                

[jira] [Updated] (PIG-2114) Enhancements to PIG HBaseStorage Load & Store Func with extra scan configurations

Posted by "Olga Natkovich (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Olga Natkovich updated PIG-2114:
--------------------------------

    Fix Version/s:     (was: 0.9.0)
                   0.10

Delaying until 0.10, since we are about to spin the release.
