Posted to user@hbase.apache.org by Bing Jiang <ji...@gmail.com> on 2013/04/09 13:41:51 UTC

Efficient way to use different storage medium

Hi,

We have some physical machines, each with a large SSD (2 TB) and a
regular disk (4 TB), and we want to build our HDFS and HBase environment
on them.

If we use all of the storage (6 TB) that each machine provides, is there
an efficient way to take advantage of the SSD, or to provide different
performance for different tables through HDFS or HBase configuration,
according to each application's requirements?

In my view, having different physical storage media is a good opportunity
to rethink the HDFS or HBase infrastructure, so any views you have would
be welcome.

Regards


-- 
Bing Jiang
weibo: http://weibo.com/jiangbinglover
BLOG: http://www.binospace.com
National Research Center for Intelligent Computing Systems
Institute of Computing technology
Graduate University of Chinese Academy of Science

Re: Efficient way to use different storage medium

Posted by Ted <yu...@gmail.com>.
Please take a look at https://issues.apache.org/jira/browse/HBASE-7404,
where the bucket cache can be configured as a secondary cache that takes advantage of the speed of your SSD device.
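
To make the suggestion concrete, here is a minimal sketch of a file-backed bucket cache in the region server's hbase-site.xml. The two property names are the ones documented for HBase releases that ship the bucket cache; the mount point /mnt/ssd and the 100 GB size are assumptions for illustration only, so check the reference guide for your version before copying them.

```xml
<!-- Sketch only: the path and the size below are hypothetical values. -->
<property>
  <name>hbase.bucketcache.ioengine</name>
  <!-- file:PATH keeps the cache in a file, here on the SSD mount -->
  <value>file:/mnt/ssd/hbase/bucketcache.data</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <!-- capacity of the cache in megabytes (102400 MB = 100 GB) -->
  <value>102400</value>
</property>
```

With a layout like this, the SSD would act as a large second-level block cache, while the HFiles themselves stay in HDFS on whichever disks the DataNode uses.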

Cheers


Re: Efficient way to use different storage medium

Posted by Jean-Marc Spaggiari <je...@spaggiari.org>.
Hi Bing,

If you mount all your drives into HDFS, some blocks are going to be on
SSD and some on regular drives. So some reads are going to be fast,
and some others are going to be slow.

On a single machine, I don't think you can specify which table will be
on which drive since the blocks are going to be spread over the
drives.

You will still see some performance improvement, since sometimes it's
the SSD that is going to be used.

Also, at some point, your SSD drives might fill up before your
regular drives. They will then be removed from the valid WRITE
destinations, but will still be used for READs.
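
The behaviour described above can be illustrated with a toy model. This is only a sketch: it assumes the DataNode picks its data directories round-robin and skips any volume that is full, and it counts abstract equal-sized blocks rather than bytes. The function name and the capacities are hypothetical, and real HDFS also accounts for reserved space and replication across nodes.

```python
def place_blocks(capacities, num_blocks):
    """Place blocks round-robin across volumes, skipping full ones.

    capacities: dict mapping a volume name to its capacity in blocks
                (hypothetical units, not bytes).
    Returns a dict mapping each volume name to the blocks stored on it.
    """
    used = {name: 0 for name in capacities}
    names = list(capacities)
    cursor = 0  # index of the volume to try first for the next block
    for _ in range(num_blocks):
        # Try each volume at most once, starting from the cursor.
        for attempt in range(len(names)):
            name = names[(cursor + attempt) % len(names)]
            if used[name] < capacities[name]:
                used[name] += 1
                cursor = (cursor + attempt + 1) % len(names)
                break
        else:
            raise RuntimeError("all volumes are full")
    return used


# Hypothetical capacities: one unit is roughly 1 GB of blocks.
volumes = {"ssd": 2000, "hdd": 4000}
print(place_blocks(volumes, 3000))  # both volumes still have room
print(place_blocks(volumes, 5000))  # the SSD is full, the rest goes to HDD
```

In this model the two volumes receive equal shares until the smaller SSD is full, after which every new block lands on the regular disk. That is why only part of the reads benefit from the SSD.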

JM


Re: Efficient way to use different storage medium

Posted by Ted Yu <yu...@gmail.com>.
This one is under active discussion:

HDFS-4672 Support tiered storage policies

Cheers


Re: Efficient way to use different storage medium

Posted by ramkrishna vasudevan <ra...@gmail.com>.
Hi

Interesting topic, and we already have JIRAs raised for such a feature,
but the work is still in progress:
https://issues.apache.org/jira/browse/HBASE-6572
https://issues.apache.org/jira/browse/HDFS-2832

Regards
Ram



Re: Efficient way to use different storage medium

Posted by Stack <st...@duboce.net>.
On Tue, Apr 9, 2013 at 4:41 AM, Bing Jiang <ji...@gmail.com> wrote:

> hi,
>
> There are some physical machines which each one contains a large ssd(2T)
> and general disk(4T),
> and we want to build our hdfs and hbase environment.
>

What kind of workload do you intend to run on these machines?  Do you have
enough space to run all of your workload on SSD?  At an extreme, you
could have two clusters -- one running on SSDs for low latency workloads
and the other on spinning disk -- and perhaps your segregation is such that
having to copy between the two systems is rare, etc., etc.

St.Ack