Posted to dev@hbase.apache.org by Vladimir Rodionov <vr...@carrieriq.com> on 2011/10/26 21:51:39 UTC

Random I/O performance

We have a reporting tool that runs queries against an Oracle DB, collects fact ids, and then
queries HBase for these facts one by one. It is single-threaded and uses simple Get operations.

It is slow, of course. 5 hours to retrieve 1M facts from HBase storage. Approx 55 rows per sec
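
For scale, the reported figures can be checked with a little arithmetic. This is a minimal sketch in Python; the only inputs are the numbers quoted above (1M facts, 5 hours), and the variable names are illustrative.

```python
# Back-of-the-envelope check of the reported throughput.
facts = 1_000_000           # rows retrieved, as stated in the post
elapsed_s = 5 * 60 * 60     # 5 hours expressed in seconds

rows_per_sec = facts / elapsed_s
ms_per_get = 1000.0 / rows_per_sec

print(f"{rows_per_sec:.1f} rows/sec, {ms_per_get:.0f} ms per Get")
# prints "55.6 rows/sec, 18 ms per Get"
```

About 18 ms per Get is in the same ballpark as a disk seek or two plus a network round trip, which would fit a workload where each request is issued serially and is rarely served from cache.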

I know I can use batch gets to increase the speed, but my question is: what else can we do to make our ops team happier?
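
The batching idea mentioned above can be sketched as follows. This is an illustration only, not HBase client code: `multi_get` is a hypothetical callable standing in for whatever batch-read call the client offers, assumed here to take a list of ids and return a dict of id to row. The batch size of 100 is also an arbitrary assumption.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def chunked(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` ids."""
    batch: List[str] = []
    for fact_id in ids:
        batch.append(fact_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield batch


def fetch_all(
    ids: Iterable[str],
    multi_get: Callable[[List[str]], Dict[str, object]],  # hypothetical batch read
    batch_size: int = 100,
) -> Dict[str, object]:
    """Fetch every id using one multi_get call per batch instead of one call per id."""
    results: Dict[str, object] = {}
    for batch in chunked(ids, batch_size):
        results.update(multi_get(batch))
    return results
```

With 1M ids and a batch size of 100, this issues 10,000 requests instead of 1,000,000, so the per-request overhead is paid far less often. The disk reads behind each row still have to happen.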

How can we optimize random I/O performance in HBase? (Hi, Facebook, we have the same problem as you guys :)

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com

________________________________________
From: Gary Helmling [ghelmling@gmail.com]
Sent: Wednesday, October 26, 2011 12:34 PM
To: dev@hbase.apache.org
Subject: Re: proposal for naming convention of patches for TRUNK

Also should be possible to use the file command?

$ file HBASE-4680.txt
HBASE-4680.txt: diff output text



On Wed, Oct 26, 2011 at 12:32 PM, Ted Yu <yu...@gmail.com> wrote:
> Looping in Giri.
>
> Giri:
> Do you think you have enough heuristics for the filter ?
>
> Thanks
>
> On Wed, Oct 26, 2011 at 12:29 PM, Todd Lipcon <to...@cloudera.com> wrote:
>
>> Should be pretty easy to use grep to determine if a file is a patch or
>> not. Patch files have lines starting with "---" and "+++".
>>
>>
>> On Wed, Oct 26, 2011 at 11:58 AM, Ted Yu <yu...@gmail.com> wrote:
>> > #1 is reasonable.
>> >
>> > For #2, the following would be included for test validation:
>> >
>> > how-to-reproduce-the-problem.txt
>> > script-I-used.txt
>> >
>> > Just a few examples.
>> >
>> > On Wed, Oct 26, 2011 at 11:52 AM, Jonathan Hsieh <jo...@cloudera.com>
>> wrote:
>> >
>> >> Suggestion:
>> >>
>> >> 1) Don't run check if the apache inclusion flag isn't checked?
>> >> 2) Require extension to be .diff, .patch, or .txt?
>> >>
>> >> Jon.
>> >>
>> >> On Wed, Oct 26, 2011 at 11:37 AM, Ted Yu <yu...@gmail.com> wrote:
>> >>
>> >> > How do we exclude non-patch attachments, such as
>> >> > EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo<
>> >> >
>> >>
>> http://issues.apache.org/jira/secure/attachment/12500832/EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo
>> >> > >?
>> >> >
>> >> > Thanks
>> >> >
>> >> > On Wed, Oct 26, 2011 at 11:32 AM, Todd Lipcon <to...@cloudera.com>
>> wrote:
>> >> >
>> >> > > I prefer to default to trunk, and require a -0.90 or -0.92 to
>> >> > > delineate a different branch. Most patches should be against trunk,
>> so
>> >> > > let's optimize for the common case.
>> >> > >
>> >> > > -Todd
>> >> > >
>> >> > > On Wed, Oct 26, 2011 at 11:04 AM, Ted Yu <yu...@gmail.com>
>> wrote:
>> >> > > > Hi,
>> >> > > > I am working with Giri on a filter that should help us avoid the
>> >> > > following
>> >> > > > (see HBASE-4377):
>> >> > > >
>> >> > > > -1 overall. Here are the results of testing the latest attachment
>> >> > > >
>> >> > >
>> >> >
>> >>
>> http://issues.apache.org/jira/secure/attachment/12500832/EXT_ATU_05f84d32cbc0bdabf00e00bc2f3570f0.regioninfo
>> >> > > > against trunk revision .
>> >> > > >
>> >> > > > I am proposing the following convention: TRUNK patch filename
>> should
>> >> > > contain
>> >> > > > the word 'trunk' in a prominent manner - surrounded by either dash
>> or
>> >> > > dot.
>> >> > > > Valid examples are:
>> >> > > >
>> >> > > > <
>> >> > >
>> >> >
>> >>
>> https://issues.apache.org/jira/secure/attachment/12500830/hbase-4377.trunk.v4.txt
>> >> > > >
>> >> > > >  hbase-4377.trunk.v4.txt<
>> >> > >
>> >> >
>> >>
>> https://issues.apache.org/jira/secure/attachment/12500830/hbase-4377.trunk.v4.txt
>> >> > > >
>> >> > > > <
>> >> > >
>> >> >
>> >>
>> https://issues.apache.org/jira/secure/attachment/12497503/hbase-4377-trunk.v2.patch
>> >> > > >
>> >> > > >  hbase-4377-trunk.v2.patch<
>> >> > >
>> >> >
>> >>
>> https://issues.apache.org/jira/secure/attachment/12497503/hbase-4377-trunk.v2.patch
>> >> > > >
>> >> > > > <
>> >> > >
>> >> >
>> >>
>> https://issues.apache.org/jira/secure/attachment/12499805/0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch
>> >> > > >
>> >> > > >
>> >> > >
>> >> >
>> >>
>>  0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch<
>> >> > >
>> >> >
>> >>
>> https://issues.apache.org/jira/secure/attachment/12499805/0001-HBASE-4377-hbck-Offline-rebuild-.META.-from-fs-data-.trunk.v3.patch
>> >> > > >
>> >> > > >
>> >> > > > This would allow Giri to write filter that correctly uploads patch
>> >> for
>> >> > > TRUNK
>> >> > > > to Jenkins for test build.
>> >> > > >
>> >> > > > Please provide your comments.
>> >> > > >
>> >> > >
>> >> > >
>> >> > >
>> >> > > --
>> >> > > Todd Lipcon
>> >> > > Software Engineer, Cloudera
>> >> > >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> // Jonathan Hsieh (shay)
>> >> // Software Engineer, Cloudera
>> >> // jon@cloudera.com
>> >>
>> >
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>

Re: Random I/O performance

Posted by Todd Lipcon <to...@cloudera.com>.

On Wed, Oct 26, 2011 at 4:13 PM, Ted Yu <yu...@gmail.com> wrote:
>>> Off-heap cache is experimental in 0.92 and TRUNK.
> As of now, TestSlabCache passes consistently in 0.92 and TRUNK.
>
> Li Pi's slides from Aug can be found here:
> https://docs.google.com/present/view?id=d23xkzr_55hgnvngf6
> Toward the end of it, you can find performance chart.

Those were micro-benchmarks, though. I tried doing some tests on a
large cluster and wasn't able to get great performance out of it. But,
that was on an earlier version of the patch, and the later fixes could
have also fixed the perf issue. Hopefully some other folks can try it
out and provide feedback!

-Todd

>
> On Wed, Oct 26, 2011 at 3:49 PM, Stack <st...@duboce.net> wrote:
>
>> On Wed, Oct 26, 2011 at 2:50 PM, Vladimir Rodionov
>> <vr...@carrieriq.com> wrote:
>> >>> Are you hitting cache at all?
>> >>
>> >> Its totally random, due to the proposed key design which favored fast
>> inserts. Keys are randomized
>> >> values, that is why there is no data locality in row look ups. Effect of
>> the cache (LruBlockCache?) is negligible
>> >> in this case.
>> >>
>> >
>> >>>So a different schema would get cache into the mix?
>> >
>> > You can/t change schema while system is in production
>> >
>>
>> True but caveat Ted's note and FB fellas apparently did it three times
>> before they hit on the 'right' schema (Not sure whether they took the
>> portion being modified offline when changing schema)
>>
>> >
>> >>>Its going to keep growing without bound?
>> >
>> >
>> > No, we keep data for XX days than purge stale data from the table.
>> >
>> >
>> > My question was: what else besides obvious -run all in parallel - can
>> help to improve random I/O?
>> >
>> > 1. Will BLOOM filter help to optimize HBase Read path?
>>
>> Yes.  0.92 blooms will be less expensive than those in 0.90 (because
>> the blooms are tiered and live in the LRU in 0.92 so they are let go
>> if unused).
>>
>>
>> > 2. We use compression already.
>> > 3. Block size - does it really matter much?
>>
>> Not much in my experience.  Smaller blocks can help a little at the
>> cost of some bloat in index size (Again 0.92 is better here because
>> indices are partitioned and now also are in the LRU rather than pegged
>> in RAM as they are in 0.90).
>>
>> > 4. Off heap block cache? Its in 92 trunk? Have anybody performed real
>> performance tests on Off heap cache?
>> >
>>
>> Off-heap cache is experimental in 0.92 and TRUNK.
>>
>> St.Ack
>>
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Random I/O performance

Posted by Li Pi <li...@idle.li>.

Because it hasn't been tested in a full cluster yet and I just got the
last race condition out a week and a half ago (hopefully).

On Wed, Oct 26, 2011 at 4:36 PM, Stack <st...@duboce.net> wrote:
> On Wed, Oct 26, 2011 at 4:31 PM, Vladimir Rodionov
> <vr...@carrieriq.com> wrote:
>> Great
>>
>> Why is nobody using it?
>>
>
> Probably because its not in a release yet V.
> St.Ack
>

Re: Random I/O performance

Posted by Stack <st...@duboce.net>.

On Wed, Oct 26, 2011 at 4:31 PM, Vladimir Rodionov
<vr...@carrieriq.com> wrote:
> Great
>
> Why is nobody using it?
>

Probably because it's not in a release yet, V.
St.Ack

RE: Random I/O performance

Posted by Vladimir Rodionov <vr...@carrieriq.com>.

Great

Why is nobody using it?

Best regards,
Vladimir Rodionov
Principal Platform Engineer
Carrier IQ, www.carrieriq.com
e-mail: vrodionov@carrieriq.com

________________________________________
From: Ted Yu [yuzhihong@gmail.com]
Sent: Wednesday, October 26, 2011 4:13 PM
To: dev@hbase.apache.org
Subject: Re: Random I/O performance

>> Off-heap cache is experimental in 0.92 and TRUNK.
As of now, TestSlabCache passes consistently in 0.92 and TRUNK.

Li Pi's slides from Aug can be found here:
https://docs.google.com/present/view?id=d23xkzr_55hgnvngf6
Toward the end of it, you can find performance chart.

On Wed, Oct 26, 2011 at 3:49 PM, Stack <st...@duboce.net> wrote:

> On Wed, Oct 26, 2011 at 2:50 PM, Vladimir Rodionov
> <vr...@carrieriq.com> wrote:
> >>> Are you hitting cache at all?
> >>
> >> Its totally random, due to the proposed key design which favored fast
> inserts. Keys are randomized
> >> values, that is why there is no data locality in row look ups. Effect of
> the cache (LruBlockCache?) is negligible
> >> in this case.
> >>
> >
> >>>So a different schema would get cache into the mix?
> >
> > You can/t change schema while system is in production
> >
>
> True but caveat Ted's note and FB fellas apparently did it three times
> before they hit on the 'right' schema (Not sure whether they took the
> portion being modified offline when changing schema)
>
> >
> >>>Its going to keep growing without bound?
> >
> >
> > No, we keep data for XX days than purge stale data from the table.
> >
> >
> > My question was: what else besides obvious -run all in parallel - can
> help to improve random I/O?
> >
> > 1. Will BLOOM filter help to optimize HBase Read path?
>
> Yes.  0.92 blooms will be less expensive than those in 0.90 (because
> the blooms are tiered and live in the LRU in 0.92 so they are let go
> if unused).
>
>
> > 2. We use compression already.
> > 3. Block size - does it really matter much?
>
> Not much in my experience.  Smaller blocks can help a little at the
> cost of some bloat in index size (Again 0.92 is better here because
> indices are partitioned and now also are in the LRU rather than pegged
> in RAM as they are in 0.90).
>
> > 4. Off heap block cache? Its in 92 trunk? Have anybody performed real
> performance tests on Off heap cache?
> >
>
> Off-heap cache is experimental in 0.92 and TRUNK.
>
> St.Ack
>

Re: Random I/O performance

Posted by Ted Yu <yu...@gmail.com>.

>> Off-heap cache is experimental in 0.92 and TRUNK.
As of now, TestSlabCache passes consistently in 0.92 and TRUNK.

Li Pi's slides from Aug can be found here:
https://docs.google.com/present/view?id=d23xkzr_55hgnvngf6
Toward the end of it, you can find performance chart.

On Wed, Oct 26, 2011 at 3:49 PM, Stack <st...@duboce.net> wrote:

> On Wed, Oct 26, 2011 at 2:50 PM, Vladimir Rodionov
> <vr...@carrieriq.com> wrote:
> >>> Are you hitting cache at all?
> >>
> >> Its totally random, due to the proposed key design which favored fast
> inserts. Keys are randomized
> >> values, that is why there is no data locality in row look ups. Effect of
> the cache (LruBlockCache?) is negligible
> >> in this case.
> >>
> >
> >>>So a different schema would get cache into the mix?
> >
> > You can/t change schema while system is in production
> >
>
> True but caveat Ted's note and FB fellas apparently did it three times
> before they hit on the 'right' schema (Not sure whether they took the
> portion being modified offline when changing schema)
>
> >
> >>>Its going to keep growing without bound?
> >
> >
> > No, we keep data for XX days than purge stale data from the table.
> >
> >
> > My question was: what else besides obvious -run all in parallel - can
> help to improve random I/O?
> >
> > 1. Will BLOOM filter help to optimize HBase Read path?
>
> Yes.  0.92 blooms will be less expensive than those in 0.90 (because
> the blooms are tiered and live in the LRU in 0.92 so they are let go
> if unused).
>
>
> > 2. We use compression already.
> > 3. Block size - does it really matter much?
>
> Not much in my experience.  Smaller blocks can help a little at the
> cost of some bloat in index size (Again 0.92 is better here because
> indices are partitioned and now also are in the LRU rather than pegged
> in RAM as they are in 0.90).
>
> > 4. Off heap block cache? Its in 92 trunk? Have anybody performed real
> performance tests on Off heap cache?
> >
>
> Off-heap cache is experimental in 0.92 and TRUNK.
>
> St.Ack
>

Re: Random I/O performance

Posted by Stack <st...@duboce.net>.

On Wed, Oct 26, 2011 at 2:50 PM, Vladimir Rodionov
<vr...@carrieriq.com> wrote:
>>> Are you hitting cache at all?
>>
>> Its totally random, due to the proposed key design which favored fast inserts. Keys are randomized
>> values, that is why there is no data locality in row look ups. Effect of the cache (LruBlockCache?) is negligible
>> in this case.
>>
>
>>>So a different schema would get cache into the mix?
>
> You can/t change schema while system is in production
>

True, but see Ted's note; also, the FB fellas apparently did it three times
before they hit on the 'right' schema. (Not sure whether they took the
portion being modified offline when changing schema.)

>
>>>Its going to keep growing without bound?
>
>
> No, we keep data for XX days than purge stale data from the table.
>
>
> My question was: what else besides obvious -run all in parallel - can help to improve random I/O?
>
> 1. Will BLOOM filter help to optimize HBase Read path?

Yes.  0.92 blooms will be less expensive than those in 0.90 (because
the blooms are tiered and live in the LRU in 0.92 so they are let go
if unused).
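
As a rough illustration of why a row-level bloom helps a random Get: the filter can say "definitely not here" for a store file, so that file is never read. The sketch below is a generic textbook Bloom filter in Python, not HBase's implementation; the class, the sizes, and the `files_to_read` helper are all assumptions made for the example.

```python
import hashlib
from typing import Dict, Iterator, List


class BloomFilter:
    """Generic Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes) -> Iterator[int]:
        # Derive num_hashes bit positions by hashing the key with a one-byte seed.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))


def files_to_read(filters: Dict[str, BloomFilter], row_key: bytes) -> List[str]:
    """Return only the (hypothetical) store files whose filter admits the key."""
    return [name for name, bf in filters.items() if bf.might_contain(row_key)]
```

When a row lives in one store file out of several, a lookup that consults the filters first usually touches just that one file rather than all of them.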


> 2. We use compression already.
> 3. Block size - does it really matter much?

Not much in my experience.  Smaller blocks can help a little at the
cost of some bloat in index size (Again 0.92 is better here because
indices are partitioned and now also are in the LRU rather than pegged
in RAM as they are in 0.90).
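
The block-size trade-off can be put in rough numbers. This is a sketch under stated assumptions: 50M rows (from this thread), 1536 bytes per row (the midpoint of the 1-2K mentioned), roughly one index entry per data block, and no allowance for compression or key overhead. The 64 KB figure is the default HFile block size; the 16 KB alternative is picked only for comparison.

```python
from typing import Tuple


def block_stats(rows: int, row_bytes: int, block_bytes: int) -> Tuple[int, int]:
    """Return (approximate index entries, rows per block) for a given block size."""
    total_bytes = rows * row_bytes
    index_entries = total_bytes // block_bytes   # about one entry per data block
    rows_per_block = block_bytes // row_bytes    # rows read along with the one wanted
    return index_entries, rows_per_block


for block_bytes in (64 * 1024, 16 * 1024):
    entries, per_block = block_stats(50_000_000, 1536, block_bytes)
    print(f"{block_bytes // 1024} KB blocks: {entries:,} index entries, "
          f"{per_block} rows per block")
```

Under these assumptions, going from 64 KB to 16 KB blocks cuts the data read per random Get to about a quarter and makes the block index about four times larger.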

> 4. Off heap block cache? Its in 92 trunk? Have anybody performed real performance tests on Off heap cache?
>

Off-heap cache is experimental in 0.92 and TRUNK.

St.Ack

Re: Random I/O performance

Posted by Ted Dunning <td...@maprtech.com>.

But you can change the key definition to get semi-sorted behavior.  This
would give you some parallelism while still grouping recent keys into the
same pages.

Doing an on-the-fly key transition is pretty hairy, but doable.

Accessing older data just requires knowing the cutoff date and doing one
encoding or the other.
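
One way to picture the key transition described above is sketched here in Python. Everything in it is hypothetical: the cutoff date, the number of buckets, the key layout, and the hash standing in for the existing randomized key. It assumes the reader knows the date of each fact (for example from the Oracle query) so that it can pick the right encoding.

```python
import hashlib
from datetime import date

CUTOFF = date(2011, 11, 1)   # hypothetical switch-over date
NUM_BUCKETS = 8              # hypothetical number of salt buckets


def old_key(fact_id: str) -> str:
    """Stand-in for the existing fully randomized key."""
    return hashlib.sha256(fact_id.encode()).hexdigest()


def new_key(fact_id: str, day: date) -> str:
    """Semi-sorted key: a small salt bucket, then the day, then the fact id.

    The bucket spreads writes over a few key ranges (some parallelism),
    while the day groups recent facts close together within each bucket.
    """
    bucket = int(hashlib.sha256(fact_id.encode()).hexdigest(), 16) % NUM_BUCKETS
    return f"{bucket:02d}-{day:%Y%m%d}-{fact_id}"


def row_key(fact_id: str, day: date) -> str:
    """Pick the encoding by comparing the fact's date with the cutoff."""
    return new_key(fact_id, day) if day >= CUTOFF else old_key(fact_id)
```

Facts dated before the cutoff keep their old keys and are never rewritten; once the retention window has passed, only the new encoding remains in the table.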

On Wed, Oct 26, 2011 at 2:50 PM, Vladimir Rodionov
<vr...@carrieriq.com> wrote:

> >>So a different schema would get cache into the mix?
>
> You can/t change schema while system is in production
>

RE: Random I/O performance

Posted by Vladimir Rodionov <vr...@carrieriq.com>.


>> Are you hitting cache at all?
>
> Its totally random, due to the proposed key design which favored fast inserts. Keys are randomized
> values, that is why there is no data locality in row look ups. Effect of the cache (LruBlockCache?) is negligible
> in this case.
>

>> So a different schema would get cache into the mix?

You can't change the schema while the system is in production.


>> Is it going to keep growing without bound?


No, we keep data for XX days, then purge stale data from the table.


My question was: what else, besides the obvious (run everything in parallel), can help to improve random I/O?

1. Will a Bloom filter help to optimize the HBase read path?
2. We use compression already.
3. Block size: does it really matter much?
4. Off-heap block cache? Is it in 0.92/trunk? Has anybody performed real performance tests on the off-heap cache?

We could easily allocate 10-15 GB per node, thus effectively caching hot data in other tables (not in the fact table).

Off-heap cache: what is the maximum size of off-heap cache we could try?
My major concerns are:

- Memory allocators are pretty hard to debug and get working right.
- Memory fragmentation?
- It still relies on on-heap Java data structures to perform eviction, which can degrade performance in the case of large caches.

Re: Random I/O performance

Posted by Stack <st...@duboce.net>.

On Wed, Oct 26, 2011 at 1:39 PM, Vladimir Rodionov
<vr...@carrieriq.com> wrote:
>> Can you do concurrent gets?
>
> Yes,
>

This should make a difference.
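
A minimal sketch of concurrent gets, in Python for brevity. `get_one` is a hypothetical callable that performs a single lookup and must be safe to call from several threads; in the HBase client of this era an HTable instance is not meant to be shared across threads, so each worker would likely need its own handle. The worker count of 16 is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

K = TypeVar("K")
V = TypeVar("V")


def fetch_concurrently(
    ids: Sequence[K],
    get_one: Callable[[K], V],   # hypothetical single-row lookup
    workers: int = 16,
) -> List[V]:
    """Run get_one(id) for every id on a thread pool.

    Results come back in the same order as `ids`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(get_one, ids))
```

Because each Get spends most of its time waiting on disk and network, overlapping many of them keeps more disks busy at once, which is where the extra throughput would come from.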


>> Whats your hardware like.  How many disks per machine?
>
> This is on our customer premises. I suppose - not less than 6
>

More disks, more i/o.


>> Is the table major compacted?
>
> Really doubt of that but I do not have direct access to the grid environment
>

It can make a difference.

Do these fellas know how to run HBase, or do you have to do it all for them?

>> Are you hitting cache at all?
>
> Its totally random, due to the proposed key design which favored fast inserts. Keys are randomized
> values, that is why there is no data locality in row look ups. Effect of the cache (LruBlockCache?) is negligible
> in this case.
>

So a different schema would get cache into the mix?

You are using totally random keys just so you can distribute
inserts?  Is this time series data?

>> It is slow, of course. 5 hours to retrieve 1M facts from HBase storage. Approx 55 rows per sec
>>
>
> How big is your table?
>
> Not too big yet (50M rows) each row is approx 1-2K but its getting every day.
>

Is it going to keep growing without bound?

St.Ack

Re: Random I/O performance

Posted by Sebastian Bauer <ad...@ugame.net.pl>.

If only inserts are performed, you don't need major_compaction, but you
probably need BLOOMFILTER=ROW, a smaller BLOCKSIZE(?), COMPRESSION lzo or
snappy, and probably a smaller region size (hbase.hregion.max.filesize).

Just my $0.02.


On 26.10.2011 22:39, Vladimir Rodionov wrote:
>
>
> On Wed, Oct 26, 2011 at 12:51 PM, Vladimir Rodionov
> <vr...@carrieriq.com> wrote:
>> We have a reporting tool which runs queries against Oracle DB, collects fact ids and then
>> queries HBase for these facts (one-by-one). This is single thread, simple Get op
>>
>> Can you do concurrent gets?
> Yes, 
>
>> Whats your hardware like.  How many disks per machine?
> This is on our customer premises. I suppose - not less than 6
>
>> Is the table major compacted?
> Really doubt of that but I do not have direct access to the grid environment
>
>> Are you hitting cache at all?
> Its totally random, due to the proposed key design which favored fast inserts. Keys are randomized
> values, that is why there is no data locality in row look ups. Effect of the cache (LruBlockCache?) is negligible
> in this case.  
>
>> It is slow, of course. 5 hours to retrieve 1M facts from HBase storage. Approx 55 rows per sec
>>
> How big is your table?
>
> Not too big yet (50M rows) each row is approx 1-2K but its getting every day.
>
>
> St.Ack
>

RE: Random I/O performance

Posted by Vladimir Rodionov <vr...@carrieriq.com>.

On Wed, Oct 26, 2011 at 12:51 PM, Vladimir Rodionov
<vr...@carrieriq.com> wrote:
> We have a reporting tool which runs queries against Oracle DB, collects fact ids and then
> queries HBase for these facts (one-by-one). This is single thread, simple Get op
>

> Can you do concurrent gets?

Yes, 

> Whats your hardware like.  How many disks per machine?

This is on our customer premises. I suppose - not less than 6

> Is the table major compacted?

I really doubt it, but I do not have direct access to the grid environment.

> Are you hitting cache at all?

It's totally random, due to the proposed key design, which favored fast inserts. Keys are randomized
values; that is why there is no data locality in row lookups. The effect of the cache (LruBlockCache?) is negligible
in this case.

> It is slow, of course. 5 hours to retrieve 1M facts from HBase storage. Approx 55 rows per sec
>

> How big is your table?

Not too big yet (50M rows); each row is approx. 1-2K, but it's getting bigger every day.

> St.Ack

Re: Random I/O performance

Posted by Stack <st...@duboce.net>.

On Wed, Oct 26, 2011 at 12:51 PM, Vladimir Rodionov
<vr...@carrieriq.com> wrote:
> We have a reporting tool which runs queries against Oracle DB, collects fact ids and then
> queries HBase for these facts (one-by-one). This is single thread, simple Get op
>

Can you do concurrent gets?

What's your hardware like?  How many disks per machine?

Is the table major compacted?

Are you hitting cache at all?

> It is slow, of course. 5 hours to retrieve 1M facts from HBase storage. Approx 55 rows per sec
>

How big is your table?

St.Ack

Re: Random I/O performance

Posted by Amandeep Khurana <am...@gmail.com>.

55 rows/sec? What's your row size? What % of your reads are hitting the
cache and what % are going to the disk?

One of the things you can do to improve the random read performance is
reduce the HFile block size.

-ak

On Wed, Oct 26, 2011 at 12:51 PM, Vladimir Rodionov <vrodionov@carrieriq.com
> wrote:

>
> We have a reporting tool which runs queries against Oracle DB, collects
> fact ids and then
> queries HBase for these facts (one-by-one). This is single thread, simple
> Get op
>
> It is slow, of course. 5 hours to retrieve 1M facts from HBase storage.
> Approx 55 rows per sec
>
> I know I can use batch get to increase the speed but my question is what
> else we can do to make our ops team happier?
>
> How to optimize random I/O performance in HBase (hi, Facebook we have the
> same problem as you guys :)
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>