Posted to dev@pig.apache.org by "Olga Natkovich (JIRA)" <ji...@apache.org> on 2007/12/12 19:28:43 UTC

[jira] Commented: (PIG-30) Get rid of DataBag and always use BigDataBag

    [ https://issues.apache.org/jira/browse/PIG-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12551056 ] 

Olga Natkovich commented on PIG-30:
-----------------------------------

A couple of other issues I observed with BigDataBag:

- It should check memory availability periodically, not on every add.
- It should try to buffer in memory first. Currently, we always write to disk after the first spill.
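
To make the two points concrete, here is a rough sketch of what such a bag could look like. It is not the actual BigDataBag code: the class name SpillableBag, its constructor parameters, the use of String as a stand-in for a tuple (assumed to contain no newlines), and the one-line-per-tuple spill file format are all assumptions made for illustration. Memory is probed once every checkInterval adds rather than on every add, and after a spill the in-memory buffer is cleared so later adds are buffered in memory again instead of going straight to disk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

/** Hypothetical sketch only; not Pig's BigDataBag. Tuples are plain Strings here. */
class SpillableBag {
    private final int checkInterval;       // probe memory once per this many adds
    private final long minFreeBytes;       // spill when free memory drops below this
    private final LongSupplier freeBytes;  // memory probe, injectable for testing
    private final List<String> buffer = new ArrayList<>();
    private final List<Path> spillFiles = new ArrayList<>();
    private int addsSinceCheck = 0;
    private int checksDone = 0;

    SpillableBag(int checkInterval, long minFreeBytes, LongSupplier freeBytes) {
        this.checkInterval = checkInterval;
        this.minFreeBytes = minFreeBytes;
        this.freeBytes = freeBytes;
    }

    /** A rough estimate of the heap still available to the JVM. */
    static long estimatedFreeHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.maxMemory() - (rt.totalMemory() - rt.freeMemory());
    }

    void add(String tuple) {
        buffer.add(tuple);                  // always buffer in memory first
        if (++addsSinceCheck < checkInterval) {
            return;                         // skip the memory probe on most adds
        }
        addsSinceCheck = 0;
        checksDone++;
        if (freeBytes.getAsLong() < minFreeBytes) {
            spill();
        }
    }

    /** Writes the buffered tuples to a new temp file, then reuses the buffer. */
    private void spill() {
        try {
            Path file = Files.createTempFile("bag-spill", ".txt");
            file.toFile().deleteOnExit();
            Files.write(file, buffer);      // one tuple per line, UTF-8
            spillFiles.add(file);
            buffer.clear();                 // later adds go to memory again
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns all tuples in insertion order: spilled files first, then the buffer. */
    List<String> contents() {
        List<String> all = new ArrayList<>();
        try {
            for (Path file : spillFiles) {
                all.addAll(Files.readAllLines(file));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        all.addAll(buffer);
        return all;
    }

    int memoryChecks() { return checksDone; }
    int spillCount() { return spillFiles.size(); }
    int inMemoryCount() { return buffer.size(); }
}
```

A caller could construct it as new SpillableBag(1000, 64L * 1024 * 1024, SpillableBag::estimatedFreeHeap); the interval and threshold values there are arbitrary and would need tuning against real workloads.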


> Get rid of DataBag and always use BigDataBag
> --------------------------------------------
>
>                 Key: PIG-30
>                 URL: https://issues.apache.org/jira/browse/PIG-30
>             Project: Pig
>          Issue Type: Bug
>          Components: data
>            Reporter: Benjamin Reed
>            Assignee: Alan Gates
>
> We should never use DataBag directly; instead, we should always use BigDataBag. I think we already do this. The problem is that the logic in BigDataBag is hard to follow and it is made more complicated because it subclasses DataBag. We should merge these two classes together.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.