Posted to user@hbase.apache.org by Gaojinchao <ga...@huawei.com> on 2012/01/09 06:13:19 UTC

A question about how HBase accesses HDFS

In the DataNode logs there are a lot of "480000 millis timeout" exceptions after some scans finish, but the region server shows no exceptions.
I analyzed the flow of HBase and Hadoop and found that we use the API readBuffer(byte buf[], int off, int len). This means the whole block is expected to be read, so if we stop reading early, a timeout occurs on the DataNode side.

I want to know whether this has harmful effects, for example leaked sockets. I found connections in the CLOSE_WAIT state on our machines and do not know whether they are a result of this.

Does anyone have experience with this? Please help explain.

Thank you.

The HDFS API calls seen in the DFSClient log:
2012-01-07 07:02:40,753 INFO org.apache.hadoop.hdfs.DFSClient: Hbase is invoking .................... seek(long targetPos)
2012-01-07 07:02:40,753 INFO org.apache.hadoop.hdfs.DFSClient: Hbase is invoking.................................read(byte buf[], int off, int len)
2012-01-07 07:02:40,754 INFO org.apache.hadoop.hdfs.DFSClient: Hbase is invoking.................................readBuffer(byte buf[], int off, int len)

DN Logs:
2012-01-08 02:52:32,969 WARN  datanode.DataNode (DataXceiver.java:readBlock(274)) - DatanodeRegistration(158.1.130.33:10010, storageID=DS-1985031385-158.1.130.33-10010-1318824003883, infoPort=10075, ipcPort=10020):Got exception while serving blk_1325150650461_45153 to /158.1.130.33:
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/158.1.130.33:10010 remote=/158.1.130.33:53792]
         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:410)
         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:508)
         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:247)
         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:130)
         at java.lang.Thread.run(Thread.java:662)

2012-01-08 02:52:32,969 ERROR datanode.DataNode (DataXceiver.java:run(183)) - org.apache.hadoop.hdfs.server.datanode.DataNode (158.1.130.33:10010, storageID=DS-1985031385-158.1.130.33-10010-1318824003883, infoPort=10075, ipcPort=10020):DataXceiver
java.net.SocketTimeoutException: 480000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/158.1.130.33:10010 remote=/158.1.130.33:53792]
         at org.apache.hadoop.net.SocketIOWithTimeout.waitForIO(SocketIOWithTimeout.java:247)
         at org.apache.hadoop.net.SocketOutputStream.waitForWritable(SocketOutputStream.java:159)
         at org.apache.hadoop.net.SocketOutputStream.transferToFully(SocketOutputStream.java:198)
         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendChunks(BlockSender.java:410)
         at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(BlockSender.java:508)
         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:247)
         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:130)
         at java.lang.Thread.run(Thread.java:662)
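
For reference, the client-side pattern in question can be written against the
plain HDFS FileSystem API. The following is only a rough sketch (the HFile
path and offset are made-up placeholders, and it assumes fs.default.name in
the Configuration points at the cluster); it is not the HBase code itself,
just the seek()/read() calls the DFSClient log lines above correspond to.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PartialBlockRead {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Hypothetical HFile path, purely for illustration.
    Path path = new Path("/hbase/mytable/region/family/hfile");

    FSDataInputStream in = fs.open(path);
    byte[] buf = new byte[64 * 1024];
    in.seek(12345L);                      // position somewhere inside a block
    int n = in.read(buf, 0, buf.length);  // stream-style read, as readBuffer() does
    System.out.println("read " + n + " bytes");

    // If the stream is abandoned here without being drained or closed, the
    // DataNode's BlockSender can stay blocked in waitForWritable() until the
    // socket write timeout (480000 ms by default) fires -- the exception in
    // the DN log above. Closing the stream promptly releases the client-side
    // socket; otherwise it can linger in CLOSE_WAIT.
    in.close();
  }
}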

Re: schema optimisation - go for multiple tables, rows or column families?

Posted by kisalay <ki...@gmail.com>.
Tom,

I would like to add to what Jonathan suggested. Approach (1), using
multiple tables, has several problems:
a> As Jonathan said, regions are created on a per-table basis, so data
from different tables will fall in different regions, and there is no
guarantee about which servers those regions are assigned to.
b> The bigger problem I see with approach (1) is that the small
metadata table may not split well into regions (splitting is size
based) and can therefore become a hot spot, since a lot of keys will
fall in one region.

There is more. If you store the two kinds of data in different column
families, they will in turn be stored in different store files. So when you
fetch both of them, you will indeed be reading data from two different
store files, and possibly from two different physical nodes.

So I would ask you: can you store both the meta and the measurement data as
two different columns in the same column family? In that case a single fetch
on the key resolves both data points to the same region and the same store file.
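
A minimal sketch of that suggestion, using the HBase client API of the time
(the table name "sensor" and family "d" below are made-up placeholders, not
from this thread): metadata and measurements become different qualifiers in
the same family under one row key, so a single Get serves both from the same
region.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class SameRowSameFamily {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "sensor");   // placeholder table name
    byte[] family = Bytes.toBytes("d");          // placeholder family name
    byte[] row = Bytes.toBytes("uid-42");

    // Metadata and measurements as different qualifiers in the same family:
    // one row, one region, one set of store files for both kinds of data.
    Put put = new Put(row);
    put.add(family, Bytes.toBytes("email"), Bytes.toBytes("some@test.com"));
    put.add(family, Bytes.toBytes("1326138930"), Bytes.toBytes("2523"));
    table.put(put);

    // A single Get on the key returns both data points from the same region.
    Result result = table.get(new Get(row));
    System.out.println(Bytes.toString(
        result.getValue(family, Bytes.toBytes("email"))));
    table.close();
  }
}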

just a thought

~Kisalay


On Mon, Jan 9, 2012 at 5:21 PM, Jonathan Hsieh <jo...@cloudera.com> wrote:

> Hi Tom,
>
> In the case you describe -- two HTables -- there is no guarantee that they
> will end up going to the same region server.  If you have multiple tables,
> these are different regions and which can (and most likely will) be
> distributed to different regionserver machines.  The fact that both tables
> use the same rowkeys doesn't matter.
>
> If you use (2), the single table with column family approach, they would be
> located in the same region and thus the same regionserver.
>
> Given your concerns, and depending on your read patterns (do you do a lot
> of scans of only the meta data?), I'd probably take approach (2) or (3).
>
> Jon.
>
> On Mon, Jan 9, 2012 at 2:01 AM, Tom <fi...@gmail.com> wrote:
>
> > Hello,
> >
> > I got most, but not all, answers about schemas from the HBase Book and
> the
> > "Definite Guide".
> > Let's say there is a single row key and I use this key to add to two
> > tables, one row each (case (1)).
> > Could someone please confirm that even though the tables are different,
> > based on the key, this data will end up in the same or at least adjacent
> > regions? (I.e. my hbase client has to deal with two HTable instances but
> > only one region server needs to be looked up)?
> >
> > Thank you,
> > Tom
> >
> > Background:
> > I have two types of data: meta data (low volume) and measurement data
> > (high volume); and I get requests coming in where, based on an ID, I need
> > my HBase client to be able to access both metadata and measurement data
> for
> > this ID quickly. I want to reduce communication overhead (lookups, number
> > of tcp connections etc).
> >
> > In regards to dealing with the two types of data in Hbase, I see these
> > three design choices, which one to go for?
> >
> > (1) Multiple tables - single key - single column family
> >
> > (2) Single table - single key - multiple column families (the HBase Book
> > advises against that in section 6.2).
> >
> > (3) Single table - multiple keys (all made in such a way that they will
> be
> > co-located and system wide hot spots are avoided) - single column family
> >
> >
>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com
>

Re: schema optimisation - go for multiple tables, rows or column families?

Posted by Tom <fi...@gmail.com>.
Hi Jon, Kisalay and Rohit,

thank you for your feedback!

I almost always need to access my metadata and (the most recent subset
of) the measurement data together.
To make this access (scan/put) fast, it seems a valid goal to have my data
spread as little as possible across the cluster (ideally in the same
HFile).

From your input, it seems clear now that to reach this goal I can only
have a single table with one single column family.

Then, I use row keys as follows:

key            --> {column-qual-1, val1}, {column-qual-2, val2}
---------------------------------------------------------------
"[UID]-meta"   --> {'email', 'some@test.com'}, {'last-upd', '1/1/2012'}

"[UID]-data-x" --> {'1326138930', '2523'}, {'1326138931', '2520'} ...

"[UID]-data-y" --> {'1326138930', '2555'}, {'1326138931', '2544'} ...

While it is not guaranteed that these rows are in the same region, 
chances are pretty good that they will be on the same region server if I 
make the servers / regions large enough.
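
A minimal sketch of this key layout with the client API of the time (the
table name "sensor" and family "d" are placeholders, not part of the design
above): because the meta and data rows share the UID prefix, a single prefix
scan returns them together from adjacent rows.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class UidPrefixLayout {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "sensor");   // placeholder table name
    byte[] family = Bytes.toBytes("d");          // placeholder family name
    String uid = "42";

    // "[UID]-meta" row.
    Put meta = new Put(Bytes.toBytes(uid + "-meta"));
    meta.add(family, Bytes.toBytes("email"), Bytes.toBytes("some@test.com"));
    table.put(meta);

    // "[UID]-data-x" row, timestamps as column qualifiers.
    Put data = new Put(Bytes.toBytes(uid + "-data-x"));
    data.add(family, Bytes.toBytes("1326138930"), Bytes.toBytes("2523"));
    table.put(data);

    // One scan over the UID prefix returns metadata and measurements together;
    // the keys are adjacent, so they sit in the same region.
    Scan scan = new Scan();
    scan.setStartRow(Bytes.toBytes(uid + "-"));
    scan.setStopRow(Bytes.toBytes(uid + "."));   // '.' sorts just after '-'
    ResultScanner results = table.getScanner(scan);
    for (Result r : results) {
      System.out.println(Bytes.toString(r.getRow()));
    }
    results.close();
    table.close();
  }
}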

Cheers
Tom


On 01/09/2012 06:09 AM, Rohit Kelkar wrote:
> Tom, think of it this way (guys correct me if I am wrong)
>
> Each column family translates to 1 file on hdfs.
> You have 3 cases -
> case 1: Multiple tables - single key - single column family
> N tables and each table has 1 column family. This translates to N files on hdfs
>
> case 2: Single table - single key - multiple column families (the
> HBase Book advises against that in section 6.2).
> 1 table and M column families. This translates to M files on hdfs
>
> case 3: Single table - multiple keys (all made in such a way that they
> will be co-located and system wide hot spots are avoided) - single
> column family
> 1 table and 1 column family. Translates to 1 file on hdfs.
>
> One perspective of looking at the data is - your metadata is going to
> be static whereas the measurement data is going to grow over time. So
> for each rowId you would have n bytes of metadata and n*10 (perhaps)
> bytes of measurement data. This would cause metadata (which is small)
> to be distributed across regionservers and if you were to do a simple
> scan on your metadata then it would have to scan over many region
> servers. Do you foresee yourself doing a scan over just the metadata?
> Or is your main use case to randomly retrieve the metadata and
> measurement data for rowIds?
>
> - Rohit Kelkar
>
>
> On Mon, Jan 9, 2012 at 5:21 PM, Jonathan Hsieh<jo...@cloudera.com>  wrote:
>> Hi Tom,
>>
>> In the case you describe -- two HTables -- there is no guarantee that they
>> will end up going to the same region server.  If you have multiple tables,
>> these are different regions and which can (and most likely will) be
>> distributed to different regionserver machines.  The fact that both tables
>> use the same rowkeys doesn't matter.
>>
>> If you use (2), the single table with column family approach, they would be
>> located in the same region and thus the same regionserver.
>>
>> Given your concerns, and depending on your read patterns (do you do a lot
>> of scans of only the meta data?), I'd probably take approach (2) or (3).
>>
>> Jon.
>>
>> On Mon, Jan 9, 2012 at 2:01 AM, Tom<fi...@gmail.com>  wrote:
>>
>>> Hello,
>>>
>>> I got most, but not all, answers about schemas from the HBase Book and the
>>> "Definite Guide".
>>> Let's say there is a single row key and I use this key to add to two
>>> tables, one row each (case (1)).
>>> Could someone please confirm that even though the tables are different,
>>> based on the key, this data will end up in the same or at least adjacent
>>> regions? (I.e. my hbase client has to deal with two HTable instances but
>>> only one region server needs to be looked up)?
>>>
>>> Thank you,
>>> Tom
>>>
>>> Background:
>>> I have two types of data: meta data (low volume) and measurement data
>>> (high volume); and I get requests coming in where, based on an ID, I need
>>> my HBase client to be able to access both metadata and measurement data for
>>> this ID quickly. I want to reduce communication overhead (lookups, number
>>> of tcp connections etc).
>>>
>>> In regards to dealing with the two types of data in Hbase, I see these
>>> three design choices, which one to go for?
>>>
>>> (1) Multiple tables - single key - single column family
>>>
>>> (2) Single table - single key - multiple column families (the HBase Book
>>> advises against that in section 6.2).
>>>
>>> (3) Single table - multiple keys (all made in such a way that they will be
>>> co-located and system wide hot spots are avoided) - single column family
>>>
>>>
>>
>>
>> --
>> // Jonathan Hsieh (shay)
>> // Software Engineer, Cloudera
>> // jon@cloudera.com
>

Re: schema optimisation - go for multiple tables, rows or column families?

Posted by Rohit Kelkar <ro...@gmail.com>.
Tom, think of it this way (guys correct me if I am wrong)

Each column family translates to its own store file(s) on HDFS (per region).
You have 3 cases -
case 1: Multiple tables - single key - single column family
N tables, each with 1 column family. This translates to N separate sets of files on HDFS.

case 2: Single table - single key - multiple column families (the
HBase Book advises against that in section 6.2).
1 table with M column families. This translates to M separate sets of files on HDFS.

case 3: Single table - multiple keys (all made in such a way that they
will be co-located and system wide hot spots are avoided) - single
column family
1 table with 1 column family. Translates to 1 set of files on HDFS.

One way of looking at the data: your metadata is going to be static,
whereas the measurement data is going to grow over time. So for each
rowId you would have n bytes of metadata and perhaps 10*n bytes of
measurement data. This would cause the metadata (which is small) to be
spread across region servers, and a simple scan over just the metadata
would then have to touch many region servers. Do you foresee yourself
doing a scan over just the metadata? Or is your main use case to
randomly retrieve the metadata and measurement data for given rowIds?
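
To make the three file counts concrete, a rough sketch of table creation
(table and family names are placeholders): each HColumnDescriptor added to a
table gives that family its own store files in every region, which is what
the counts above refer to.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CreateSchemas {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    // Case 2: one table, two families -- each family gets its own set of
    // store files in every region of this table.
    HTableDescriptor twoFamilies = new HTableDescriptor("sensor_case2");
    twoFamilies.addFamily(new HColumnDescriptor("meta"));
    twoFamilies.addFamily(new HColumnDescriptor("data"));
    admin.createTable(twoFamilies);

    // Case 3: one table, one family -- meta and measurement rows share one
    // set of store files and are distinguished only by their row keys.
    HTableDescriptor oneFamily = new HTableDescriptor("sensor_case3");
    oneFamily.addFamily(new HColumnDescriptor("d"));
    admin.createTable(oneFamily);
  }
}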

- Rohit Kelkar


On Mon, Jan 9, 2012 at 5:21 PM, Jonathan Hsieh <jo...@cloudera.com> wrote:
> Hi Tom,
>
> In the case you describe -- two HTables -- there is no guarantee that they
> will end up going to the same region server.  If you have multiple tables,
> these are different regions and which can (and most likely will) be
> distributed to different regionserver machines.  The fact that both tables
> use the same rowkeys doesn't matter.
>
> If you use (2), the single table with column family approach, they would be
> located in the same region and thus the same regionserver.
>
> Given your concerns, and depending on your read patterns (do you do a lot
> of scans of only the meta data?), I'd probably take approach (2) or (3).
>
> Jon.
>
> On Mon, Jan 9, 2012 at 2:01 AM, Tom <fi...@gmail.com> wrote:
>
>> Hello,
>>
>> I got most, but not all, answers about schemas from the HBase Book and the
>> "Definite Guide".
>> Let's say there is a single row key and I use this key to add to two
>> tables, one row each (case (1)).
>> Could someone please confirm that even though the tables are different,
>> based on the key, this data will end up in the same or at least adjacent
>> regions? (I.e. my hbase client has to deal with two HTable instances but
>> only one region server needs to be looked up)?
>>
>> Thank you,
>> Tom
>>
>> Background:
>> I have two types of data: meta data (low volume) and measurement data
>> (high volume); and I get requests coming in where, based on an ID, I need
>> my HBase client to be able to access both metadata and measurement data for
>> this ID quickly. I want to reduce communication overhead (lookups, number
>> of tcp connections etc).
>>
>> In regards to dealing with the two types of data in Hbase, I see these
>> three design choices, which one to go for?
>>
>> (1) Multiple tables - single key - single column family
>>
>> (2) Single table - single key - multiple column families (the HBase Book
>> advises against that in section 6.2).
>>
>> (3) Single table - multiple keys (all made in such a way that they will be
>> co-located and system wide hot spots are avoided) - single column family
>>
>>
>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com

Re: schema optimisation - go for multiple tables, rows or column families?

Posted by Jonathan Hsieh <jo...@cloudera.com>.
Hi Tom,

In the case you describe -- two HTables -- there is no guarantee that they
will end up going to the same region server.  If you have multiple tables,
these are different regions, which can (and most likely will) be
distributed to different regionserver machines.  The fact that both tables
use the same rowkeys doesn't matter.

If you use (2), the single table with multiple column families, they would be
located in the same region and thus on the same regionserver.

Given your concerns, and depending on your read patterns (do you do a lot
of scans of only the meta data?), I'd probably take approach (2) or (3).
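
As a rough illustration of option (2), assuming placeholder table and family
names: one Get can name both families and is answered by a single region on
one regionserver, whereas two HTables may need lookups on two different
servers.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class SingleRowTwoFamilies {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "sensor");   // placeholder table name

    // One Get, both families: the row lives in exactly one region, so both
    // the metadata and the measurement cells come back in a single RPC.
    Get get = new Get(Bytes.toBytes("some-uid"));
    get.addFamily(Bytes.toBytes("meta"));
    get.addFamily(Bytes.toBytes("data"));
    Result result = table.get(get);

    byte[] email = result.getValue(Bytes.toBytes("meta"), Bytes.toBytes("email"));
    System.out.println(email == null ? "no email" : Bytes.toString(email));
    table.close();
  }
}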

Jon.

On Mon, Jan 9, 2012 at 2:01 AM, Tom <fi...@gmail.com> wrote:

> Hello,
>
> I got most, but not all, answers about schemas from the HBase Book and the
> "Definite Guide".
> Let's say there is a single row key and I use this key to add to two
> tables, one row each (case (1)).
> Could someone please confirm that even though the tables are different,
> based on the key, this data will end up in the same or at least adjacent
> regions? (I.e. my hbase client has to deal with two HTable instances but
> only one region server needs to be looked up)?
>
> Thank you,
> Tom
>
> Background:
> I have two types of data: meta data (low volume) and measurement data
> (high volume); and I get requests coming in where, based on an ID, I need
> my HBase client to be able to access both metadata and measurement data for
> this ID quickly. I want to reduce communication overhead (lookups, number
> of tcp connections etc).
>
> In regards to dealing with the two types of data in Hbase, I see these
> three design choices, which one to go for?
>
> (1) Multiple tables - single key - single column family
>
> (2) Single table - single key - multiple column families (the HBase Book
> advises against that in section 6.2).
>
> (3) Single table - multiple keys (all made in such a way that they will be
> co-located and system wide hot spots are avoided) - single column family
>
>


-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com

schema optimisation - go for multiple tables, rows or column families?

Posted by Tom <fi...@gmail.com>.
Hello,

I got most, but not all, answers about schemas from the HBase Book and
the "Definitive Guide".
Let's say there is a single row key and I use this key to add one row
each to two tables (case (1)).
Could someone please confirm that, even though the tables are different,
based on the key this data will end up in the same or at least adjacent
regions? (I.e. my HBase client has to deal with two HTable instances, but
only one region server needs to be looked up.)

Thank you,
Tom

Background:
I have two types of data: meta data (low volume) and measurement data 
(high volume); and I get requests coming in where, based on an ID, I 
need my HBase client to be able to access both metadata and measurement 
data for this ID quickly. I want to reduce communication overhead 
(lookups, number of tcp connections etc).

Regarding how to deal with the two types of data in HBase, I see these
three design choices; which one should I go for?

(1) Multiple tables - single key - single column family

(2) Single table - single key - multiple column families (the HBase Book 
advises against that in section 6.2).

(3) Single table - multiple keys (all made in such a way that they will 
be co-located and system wide hot spots are avoided) - single column family