Posted to issues@hbase.apache.org by "Francesco Angi (JIRA)" <ji...@apache.org> on 2010/11/04 12:19:41 UTC

[jira] Created: (HBASE-3197) Using the same HBaseConfiguration for multiple puts makes some data not to be written

Using the same HBaseConfiguration for multiple puts makes some data not to be written
-------------------------------------------------------------------------------------

                 Key: HBASE-3197
                 URL: https://issues.apache.org/jira/browse/HBASE-3197
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.20.6
         Environment: HBase cluster running on 6 machines (Centos 5.5, Intel Dual Core/core i3, ram 4/8GB)
            Reporter: Francesco Angi


I created a DAO object for loading and storing data in HBase. The DAO has an HBaseConfiguration field, created in the DAO constructor, and each DAO method creates a new HTable from that shared HBaseConfiguration. The problem shows up when several writes (each a put) are issued in succession: not all of the data is written. The behaviour is also non-deterministic: repeating the same writes never stores all of the data, but the items that go missing change from run to run.
I worked around this by removing the class-level HBaseConfiguration and creating a new HBaseConfiguration inside each method.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-3197) Using the same HBaseConfiguration for multiple puts makes some data not to be written

Posted by "Jean-Daniel Cryans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928292#action_12928292 ] 

Jean-Daniel Cryans commented on HBASE-3197:
-------------------------------------------

Thanks for reporting this, Francesco. Could you provide a small unit test that shows the issue?



[jira] Commented: (HBASE-3197) Using the same HBaseConfiguration for multiple puts makes some data not to be written

Posted by "Francesco Angi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-3197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12930561#action_12930561 ] 

Francesco Angi commented on HBASE-3197:
---------------------------------------

Hello, and sorry for the late reply. I tried to put together a "small unit test", but it was not that simple. Let me explain. The DAO I developed is part of a server that accepts telnet connections and stores the data it computes in HBase. The server is built on Mina. I was not able to write a standalone program that shows the same erroneous behaviour as the server. I tried a simple for loop calling the DAO's save method, and all the data was saved correctly. I then tried running multiple threads, and everything was saved correctly as well.
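A minimal sketch of the kind of standalone check described above: several threads each issue a batch of saves, and afterwards every id is looked up to list the ones that went missing. The names `ConcurrentSaveCheck`, `Saver` and `findMissing` are hypothetical and are not part of HBase or of the DAO in this report. The in-memory map only stands in for the table so the sketch runs on its own (it uses Java 8 lambdas); against a real cluster the two callbacks would wrap the DAO's save method and a read of the row.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

final class ConcurrentSaveCheck {

    /** Hypothetical stand-in for the DAO's save method; not an HBase API. */
    interface Saver {
        void save(String dataId, byte[] value) throws Exception;
    }

    /**
     * Saves ids of the form "t<thread>-<n>" from several threads, waits for
     * all writers to finish, then returns the ids that `exists` cannot find.
     */
    static Set<String> findMissing(Saver saver, Predicate<String> exists,
                                   int threads, int writesPerThread)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int threadNo = t;
            pool.submit(() -> {
                for (int n = 0; n < writesPerThread; n++) {
                    String id = "t" + threadNo + "-" + n;
                    try {
                        saver.save(id, id.getBytes(StandardCharsets.UTF_8));
                    } catch (Exception e) {
                        // A failed save is reported here and will also show
                        // up as a missing id in the verification pass below.
                        System.err.println("save failed for " + id + ": " + e);
                    }
                }
            });
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("writers did not finish in time");
        }

        // Verification pass: look up every id that should have been written.
        Set<String> missing = new TreeSet<>();
        for (int t = 0; t < threads; t++) {
            for (int n = 0; n < writesPerThread; n++) {
                String id = "t" + t + "-" + n;
                if (!exists.test(id)) {
                    missing.add(id);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        // In-memory stand-in for the table, used only to make the sketch runnable.
        Map<String, byte[]> store = new ConcurrentHashMap<>();
        Set<String> missing = findMissing(store::put, store::containsKey, 8, 1000);
        System.out.println("missing ids: " + missing.size());
    }
}
```

With the in-memory map nothing should be reported missing. The point of the harness is only to make lost writes visible as a concrete list of ids when the callbacks are pointed at the real DAO.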
Here's a snippet of the DAO involved:

{code:java}
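import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.thrift.TDeserializer;
import org.apache.thrift.TException;
import org.apache.thrift.TSerializer;
import org.apache.thrift.protocol.TBinaryProtocol;

public class HBaseDataStorageDAO implements DataStorageDAO {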
    private TSerializer serializer;
    private TDeserializer deserializer;
    private HBaseConfiguration config;
    private String zookeeperQuorumList;

    public HBaseDataStorageDAO(String zookeeperQuorumList) throws DataStorageException {
        serializer = new TSerializer(new TBinaryProtocol.Factory());
        deserializer = new TDeserializer(new TBinaryProtocol.Factory());
        this.zookeeperQuorumList = zookeeperQuorumList;
        config = new HBaseConfiguration();
        config.set("hbase.zookeeper.quorum", zookeeperQuorumList);
    }

    public void saveDataInfo(String dataId, DataInfoWrapper dataInfoWrapper) throws DataStorageException {
        HTable tableData = null;
        try {
            tableData = new HTable(config, "dataTable");
            Put p = new Put(Bytes.toBytes(dataId));
            p.add(Bytes.toBytes("dataInfo"),  // column family 
                    Bytes.toBytes(""),  // no column id
                    serializer.serialize(dataInfoWrapper.getDataInfo()));  // value
            tableData.put(p);
        } catch (IOException e) {
            handleException(e);
        } catch (TException e) {
            handleException(e);
        } finally {
            if (tableData != null)
                try {
                    tableData.close();
                } catch (IOException e) {
                    handleException(e);
                }
        }
    }
}
{code}

When the server starts, it instantiates a single HBaseDataStorageDAO, so, as you can see, every saveDataInfo call shares the same config, serializer and deserializer objects. I suspected the shared serializer and deserializer might be the problem, so I tried instantiating a new TSerializer in every saveDataInfo invocation, but that did not solve it either. As I said in my first post, the only fix I found was creating a new HBaseConfiguration for each saveDataInfo invocation, which seems a bit weird, at least to me.
In conclusion, I'm not sure whether I raised a false alarm; the problem could well be related to the underlying architecture of my server.
Sorry if I wasted your time,
best regards,
f.
