Posted to user@hbase.apache.org by Rohit Jain <ro...@esgyn.com> on 2019/08/08 16:42:00 UTC

RE: The note of the round table meeting after HBaseConAsia 2019

Hi folks,

This is a nice write-up of the round-table meeting at HBaseConAsia.  I would like to address the points I have pulled out from the write-up (quoted at the bottom of this message).

Many in the HBase community may not be aware that, besides Apache Phoenix, there is a project called Apache Trafodion, contributed by Hewlett-Packard in 2015, which has now been a top-level project for a while.  Apache Trafodion is essentially technology from Tandem-Compaq-HP that started its OLTP / operational journey as NonStop SQL in the early 1990s.  Granted, it is a C++ project, but 170+ patents that were contributed to Apache are part of it.  These are capabilities that still don’t exist in other databases.

It is a full-fledged SQL relational database engine with broad ANSI SQL support, including the OLAP functions mentioned, as well as many de facto standard functions from databases like Oracle.  You can go to the Apache Trafodion wiki to see documentation on exactly what Trafodion supports.

When we introduced Apache Trafodion, we implemented a completely distributed transaction management capability right in the HBase engine using coprocessors; it is completely scalable, with no bottlenecks whatsoever.  We have made this infrastructure very efficient over time, e.g. by reducing the two-phase commit overhead for single-region transactions.  We have presented this at HBaseCon.
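(As an aside, the single-region optimization mentioned above is, in spirit, the classic one-phase-commit fast path: when a transaction touches only one participant, the coordinator can let that participant commit directly and skip the prepare round trip.  Below is a minimal, hypothetical Python sketch of the idea; it is illustrative only, not Trafodion's actual implementation, and the names are made up.)

```python
class Region:
    """A toy transaction participant; tracks whether 'prepare' was needed."""
    def __init__(self, name):
        self.name = name
        self.state = "idle"
        self.prepare_calls = 0

    def prepare(self):
        # Phase 1: the participant promises it can commit.
        self.prepare_calls += 1
        self.state = "prepared"
        return True

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def commit_transaction(regions):
    """Two-phase commit with a single-region fast path."""
    if len(regions) == 1:
        # Fast path: a lone participant decides by itself, so the
        # prepare round trip can be skipped entirely.
        regions[0].commit()
        return True
    # General case: phase 1 (prepare everywhere), then phase 2.
    if all(r.prepare() for r in regions):
        for r in regions:
            r.commit()
        return True
    for r in regions:
        r.abort()
    return False
```

(In a real coprocessor-based transaction manager the savings come from avoided RPCs and log writes, not method calls, but the control flow is the same.)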

The engine also supports secondary indexes.  However, because of our patented Multi-dimensional Access Method technology, the need for a secondary index is substantially reduced.  All DDL and index updates are completely protected by ACID transactions.

Probably because of our own inability to create excitement about the project, and potentially for other reasons, we could not get the community involvement we were expecting.  That is why you may see that, while we are maintaining the code base and introducing enhancements to it, much of our focus has shifted to the commercial product based on Apache Trafodion, namely EsgynDB.  But if community involvement increases, we can certainly refresh Trafodion with some of the additional functionality we have added on the HBase side of the product.

But let me be clear.  We are about 150 employees at Esgyn, with 40 or so in the US, mostly in Milpitas, and the rest in Shanghai, Beijing, and Guiyang.  We cannot sustain the company on service revenue alone.  You have seen that companies that tried to do that have not been successful, unless they had a way to leverage the open source project for a different business model – enhanced capabilities, cloud services, etc.

To that end, we have added to EsgynDB complete disaster recovery, point-in-time fuzzy backup and restore, manageability via a Database Manager, multi-tenancy, and a large number of other capabilities for high-availability, scale-out production deployments.  EsgynDB also provides full BI and analytics capabilities, leveraging Apache ORC and Parquet, again because of our heritage products supporting up to 250TB EDWs for HP and for customers like Walmart, competing with Teradata.  So yes, it can integrate with other storage engines as needed.

However, in spite of all this, the pricing of EsgynDB is very competitive – in other words, “cheap” compared to anything else with the same caliber of capabilities.

We have demonstrated the capability of the product by running the TPC-C and TPC-DS (all 99 queries) benchmarks, especially at high concurrency, for which our product is especially well suited based on its architecture and patents.  (The TPC-DS benchmarks are run on ORC and Parquet for obvious reasons.)

We just closed a couple of very large core banking deals in Guiyang, where we are replacing these banks' entire core banking systems, currently Oracle implementations with which they were having challenges scaling at a reasonable cost.  But we have many customers, both in the US and China, that are using EsgynDB for operational, BI, and analytics needs.  And now finally … OLTP.

I know that this is sounding more like a commercial for Esgyn, but that is not my intent.  I would like to make you aware of Apache Trafodion as a solution to many of the issues the community is facing.  We will provide full support for Trafodion with community involvement, and hope that some of that involvement results in EsgynDB revenue that we can sustain the company on 😊.  I would like to encourage the community to look at Trafodion to address many of the concerns cited below.

“Allan Yang said that most of their customers want a secondary index, even more than SQL. And for a globally strongly consistent secondary index, we agree that the only safe way is to use transactions. Other 'local' solutions will be in trouble when splitting/merging.”

“We talked about Phoenix; the problem with Phoenix is well known: it is not stable enough. We even had a user on the mailing list say he/she will never use Phoenix again.”

“Some attendees said that the current feature set for 3.0.0 is not good enough to attract more users, especially small companies: only internal improvements, no user-visible features. SQL and secondary indexes are very important.”

“Then we came back to SQL again. Alibaba said that most of their customers are migrating from old systems, so they need 'full' SQL support. That's why they need Phoenix. And lots of small companies want to run OLAP queries directly on the database; they do not want to use ETL. So maybe in the SQL proxy (planned above), we should delegate the OLAP queries to Spark SQL or something else, rather than just rejecting them.”

“And a Phoenix committer said that the Phoenix community is currently re-evaluating its relationship with HBase, because when upgrading to HBase 2.1.x, lots of things broke. They plan to break the tie between Phoenix and HBase, which means Phoenix plans to also run on other storage systems. Note: this was not said at the meeting, but personally I think this may be good news; since Phoenix is not HBase-only, we have more reason to introduce our own SQL layer.”

Rohit Jain
CTO
Esgyn



-----Original Message-----
From: Stack <st...@duboce.net>
Sent: Friday, July 26, 2019 12:01 PM
To: HBase Dev List <de...@hbase.apache.org>
Cc: hbase-user <us...@hbase.apache.org>
Subject: Re: The note of the round table meeting after HBaseConAsia 2019



Thanks for the thorough write-up Duo. Made for a good read....

S



On Fri, Jul 26, 2019 at 6:43 AM 张铎(Duo Zhang) <pa...@gmail.com> wrote:

> The conclusion of HBaseConAsia 2019 will be available later. Here is the
> note from the round table meeting after the conference. A bit long...
>
> First we talked about splittable meta. At Xiaomi we have a cluster with
> nearly 200k regions, and meta is very easy to overload and cannot recover.
> Anoop said we can try read replicas, but agreed that read replicas cannot
> solve all the problems; eventually we still need to split meta.
>
> Then we talked about SQL. Allan Yang said that most of their customers
> want a secondary index, even more than SQL. And for a globally strongly
> consistent secondary index, we agree that the only safe way is to use
> transactions. Other 'local' solutions will be in trouble when
> splitting/merging. Xiaomi has a global secondary index solution; open
> source it?
>
> Then back to SQL. We talked about Phoenix; the problem with Phoenix is
> well known: it is not stable enough. We even had a user on the mailing
> list say he/she will never use Phoenix again. Alibaba and Huawei both
> have in-house SQL solutions, and Huawei also talked about theirs at
> HBaseConAsia 2019; they will try to open source it. And we could
> introduce a SQL proxy in the hbase-connector repo. No push-down support
> at first; all logic is done on the proxy side, and we can optimize later.
>
> Some attendees said that the current feature set for 3.0.0 is not good
> enough to attract more users, especially small companies: only internal
> improvements, no user-visible features. SQL and secondary indexes are
> very important.
>
> Yu Li talked about CCSMap; we still want it to be released in 3.0.0. One
> problem is the relationship with in-memory compaction. Theoretically they
> should have no conflicts, but in practice they do. And the Xiaomi folks
> mentioned that in-memory compaction still has some bugs; even in basic
> mode, the MVCC write point may get stuck and hang the region server. And
> Jieshan Bi asked why not just use CCSMap to replace CSLM. Yu Li said this
> is for better memory usage, as the index and data can be placed together.
>
> Then we started to talk about HBase on cloud. For now it is a bit
> difficult to deploy HBase on cloud, as we need to deploy ZooKeeper and
> HDFS first. Then we talked about HBOSS and the WAL abstraction
> (HBASE-20952). Wellington said HBOSS basically works; it uses s3a and
> ZooKeeper to help simulate the operations of HDFS. We could introduce our
> own 'FileSystem' interface, not the Hadoop one, and we could remove the
> 'atomic renaming' dependency so the 'FileSystem' implementation will be
> easier. On the WAL abstraction, Wellington said some people are still
> working on it, but for now they are focused on patching Ratis rather than
> abstracting the WAL system first. We agreed that a better way is to
> abstract the WAL system at a level higher than FileSystem, so maybe we
> could even use Kafka to store the WAL.
>
> Then we talked about the FPGA usage for compaction at Alibaba. Jieshan Bi
> said that at Huawei they offload compaction to the storage layer. For an
> open source solution, maybe we could offload compaction to Spark, and
> then use something like bulk load to let the region server load the new
> HFiles. The problem with doing compaction inside the region server is the
> CPU cost and GC pressure: we need to scan every cell, so the CPU cost is
> high. Yu Li talked about their page-based compaction in the Flink state
> store; maybe it could also benefit HBase.
>
> Then it was time for MOB. Huawei said MOB cannot solve their problem. We
> still need to read the data through RPC, and it also puts pressure on the
> memstore, since the memstore is still a bit small compared to a MOB cell.
> And we will also flush a lot even though there are only a small number of
> MOB cells in the memstore, so we still need to compact a lot. So maybe
> the suitable scenario for MOB is that most of your data is still small
> and a small amount is a bit larger; there MOB can increase performance,
> and users do not need another system to store the larger data.
> Huawei said that they implement the logic on the client side: if the data
> is larger than a threshold, the client goes to another storage system
> rather than HBase.
> Alibaba said that if we want to support large blobs, we need to introduce
> a streaming API.
> And Kuaishou said that they do not use MOB; they just store the data on
> HDFS and the index in HBase, a typical solution.
>
> Then we talked about which company will host next year's HBaseConAsia. It
> will be Tencent or Huawei, or both, probably in Shenzhen. And since there
> is no HBaseCon in America any more (it is called 'NoSQL Day' now), maybe
> next year we could just call the conference HBaseCon.
>
> Then we came back to SQL again. Alibaba said that most of their customers
> are migrating from old systems, so they need 'full' SQL support. That's
> why they need Phoenix. And lots of small companies want to run OLAP
> queries directly on the database; they do not want to use ETL. So maybe
> in the SQL proxy (planned above), we should delegate the OLAP queries to
> Spark SQL or something else, rather than just rejecting them.
>
> And a Phoenix committer said that the Phoenix community is currently
> re-evaluating its relationship with HBase, because when upgrading to
> HBase 2.1.x, lots of things broke. They plan to break the tie between
> Phoenix and HBase, which means Phoenix plans to also run on other storage
> systems.
> Note: this was not said at the meeting, but personally I think this may
> be good news; since Phoenix is not HBase-only, we have more reason to
> introduce our own SQL layer.
>
> Then we talked about Kudu. It is faster than HBase on scans. If we want
> to increase scan performance, we should use a larger block size, but that
> leads to slower random reads, so there is a trade-off. The Kuaishou folks
> asked whether HBase could support storing HFiles in a columnar format.
> The answer is no; as said above, it would slow random reads. But we could
> learn from what Google has done in Bigtable: we could write a copy of the
> data in Parquet format to another FileSystem, and users could just scan
> the Parquet files for better analysis performance. And if they want the
> newest data, they could ask HBase for it, and it should be small. This is
> more of a solution than a feature, and not only HBase is involved. But at
> least we could introduce some APIs in HBase so users can build the
> solution in their own environment. And if you do not care about the
> newest data, you could also use replication to replicate the data to ES
> or other systems, and search there.
>
> And Didi talked about their problems using HBase. They use Kylin, so they
> also have lots of regions, and meta is also a problem for them. The
> pressure on ZooKeeper is also a problem, as the replication queues are
> stored on ZK. After 2.1, ZooKeeper is only used as an external storage in
> the replication implementation, so it is possible to switch to other
> storages, such as etcd. But it is still a bit difficult to store the data
> in a system table: today we need to start the replication system before
> the WAL system, but if we want to store the replication data in an HBase
> table, obviously the WAL system must be started before the replication
> system, as we need the region of the system table online first, and
> opening it writes an open marker to the WAL. We need to find a way to
> break the deadlock.
> And they also mentioned that the rsgroup feature makes a big znode on
> ZooKeeper, as they have lots of tables. We have HBASE-22514, which aims
> to solve the problem.
> And last, they shared their experience upgrading from 0.98 to 1.4.x. The
> releases should be compatible, but in practice there were problems. They
> agreed to post a blog about this.
>
> And the Flipkart folks said they will open source their test suite, which
> focuses on consistency (Jepsen?). This is good news; hopefully we will
> have another useful tool besides ITBLL.
>
> That's all. Thanks for reading.
>

Re: The note of the round table meeting after HBaseConAsia 2019

Posted by Geoffrey Jacoby <gj...@apache.org>.
Just want to chime in with my Phoenix PMC hat on and say that there are no
current plans endorsed by the PMC to "split" Phoenix away from HBase. I'm
not even aware of any JIRAs proposing such a thing, though if the anonymous
Phoenix committer at HBaseCon Asia wants to make one, he or she is of
course welcome to do so. Since Phoenix is implemented as a series of HBase
coprocessor hooks, I'm not even sure what an HBase-less Phoenix would _be_.

As always, we welcome any and all suggestions on our dev list or JIRA on
how we can integrate with HBase better, improve stability, or build cool
new features together.

Thanks,

Geoffrey Jacoby




On Thu, Aug 8, 2019 at 11:25 AM Andrew Purtell <ap...@apache.org> wrote:

> This is great, but in the future please refrain from borderline marketing
> of a commercial product on these lists. This is not the appropriate venue
> for that.
>
> It is especially poor form to dump on a fellow open source project, as you
> claim to be one. This, I think, is the tell behind the commercial motivation.
>
> Also I should point out, being pretty familiar with Phoenix in operation
> where I work, and from my interactions with various Phoenix committers
> and PMC members, that the particular group of HBasers at that meeting
> appeared to share a negative view - which I will not comment on; they are
> entitled to their opinions, and more choice in SQL access to HBase is
> good! - that should not be claimed to be universal or even representative.
>
>
>
> On Thu, Aug 8, 2019 at 9:42 AM Rohit Jain <ro...@esgyn.com> wrote:
>
> > Hi folks,
> >
> > This is a nice write-up of the round-table meeting at HBaseConAsia.  I
> > would like to address the points I have pulled out from write-up (at the
> > bottom of this message).
> >
> > Many in the HBase community may not be aware that besides Apache Phoenix,
> > there has been a project called Apache Trafodion, contributed by
> > Hewlett-Packard in 2015 that has now been top-level project for a while.
> > Apache Trafodion is essentially technology from Tandem-Compaq-HP that
> > started its OLTP / Operational journey as NonStop SQL effectively in the
> > early 1990s.  Granted it is a C++ project, but it has 170+ patents as
> part
> > of it that were contributed to Apache.  These are capabilities that still
> > don’t exist in other databases.
> >
> > It is a full-fledged SQL relational database engine with the breadth of
> > ANSI SQL support, including OLAP functions mentioned, and including many
> de
> > facto standard functions from databases like Oracle.  You can go to the
> > Apache Trafodion wiki to see the documentation as to what all is
> supported
> > by Trafodion.
> >
> > When we introduced Apache Trafodion, we implemented a completely
> > distributed transaction management capability right into the HBase engine
> > using coprocessors, that is completely scalable with no bottlenecks
> > what-so-ever.  We have made this infrastructure very efficient over time,
> > e.g. reducing two-phase commit overhead for single region transactions.
> We
> > have presented this at HBaseCon.
> >
> > The engine also supports secondary indexes.  However, because of our
> > Multi-dimensional Access Method patented technology the need to use a
> > secondary index is substantially reduced.  All DDL and index updates are
> > completely protected by ACID transactions.
> >
> > Probably because of our own inability to create excitement about the
> > project, and potentially other reasons, we could not get community
> > involvement as we were expecting.  That is why you may see that while we
> > are maintaining the code base and introducing enhancements to it, much of
> > our focus has shifted to the commercial product based on Apache
> Trafodion,
> > namely EsgynDB.  But if the community involvement increases, we can
> > certainly refresh Trafodion with some of the additional functionality we
> > have added on the HBase side of the product.
> >
> > But let me be clear.  We are about 150 employees at Esgyn with 40 or so
> in
> > the US, mostly in Milpitas, and the rest in Shanghai, Beijing, and
> > Guiyang.  We cannot sustain the company on service revenue alone.  You
> have
> > seen companies that tried to do that have not been successful, unless
> they
> > have a way to leverage the open source project for a different business
> > model – enhanced capabilities, Cloud services, etc.
> >
> > To that end we have added to EsgynDB complete Disaster Recovery,
> > Point-in-Time, fuzzy Backup and Restore, Manageability via a Database
> > Manager, Multi-tenancy, and a large number of other capabilities for High
> > Availability scale-out production deployments.  EsgynDB also provides
> full
> > BI and Analytics capabilities, again because of our heritage products
> > supporting up to 250TB EDWs for HP and customers like Walmart competing
> > with Teradata, leveraging Apache ORC and Parquet.  So yes, it can
> integrate
> > with other storage engines as needed.
> >
> > However, in spite of all this, the pricing on EsgynDB is very competitive
> > – in other words “cheap” compared to anything else with the same caliber
> of
> > capabilities.
> >
> > We have demonstrated the capability of the product by running the TPC-C
> > and TPC-DS (all 99 queries) benchmarks, especially at high concurrency
> > which our product is especially well suited for, based on its
> architecture
> > and patents.  (The TPC-DS benchmarks are run on ORC and Parquet for
> obvious
> > reasons.)
> >
> > We just closed a couple of very large Core Banking deals in Guiyang where
> > we are replacing the entire Core Banking system for these banks from
> their
> > current Oracle implementations – where they were having challenges
> scaling
> > at a reasonable cost.  But we have many customers both in the US and
> China
> > that are using EsgynDB for operational, BI and Analytics needs.  And now
> > finally … OLTP.
> >
> > I know that this is sounding more like a commercial for Esgyn, but that
> is
> > not my intent.  I would like to make you aware of Apache Trafodion as a
> > solution to many of these issues that the community is facing.  We will
> > provide full support for Trafodion with community involvement and hope
> that
> > some of that involvement results in EsgynDB revenue that we can sustain
> the
> > company on 😊.  I would like to encourage the community to look at
> > Trafodion to address many of the concerns sighted below.
> >
> > “Allan Yang said that most of their customers want secondary index, even
> > more than SQL. And for global strong consistent secondary index, we agree
> > that the only safe way is to use transaction. Other 'local' solutions
> will
> > be in trouble when splitting/merging.”
> >
> > “We talked about Phoenix, the problem for Phoenix is well known: not
> > stable enough. We even had a user on the mailing-list said he/she will
> > never use Phoenix again.”
> >
> > “Some guys said that the current feature set for 3.0.0 is not good enough
> > to attract more users, especially for small companies. Only internal
> > improvements, no users visible features. SQL and secondary index are very
> > important.”
> >
> > “Then we back to SQL again. Alibaba said that most of their customers are
> > migrate from old business, so they need 'full' SQL support. That's why
> they
> > need Phoenix. And lots of small companies wants to run OLAP queries
> > directly on the database, they do no want to use ETL. So maybe in the SQL
> > proxy (planned above), we should delegate the OLAP queries to spark SQL
> or
> > something else, rather than just rejecting them.”
> >
> > “And a Phoenix committer said that, the Phoenix community are currently
> > re-evaluate the relationship with HBase, because when upgrading to HBase
> > 2.1.x, lots of things are broken. They plan to break the tie between
> > Phoenix and HBase, which means Phoenix plans to also run on other storage
> > systems. Note: This is not on the meeting but personally, I think this
> > maybe a good news, since Phoenix is not HBase only, we have more reasons
> to
> > introduce our own SQL layer.”
> >
> > Rohit Jain
> > CTO
> > Esgyn
> >
> >
> >
> > -----Original Message-----
> > From: Stack <st...@duboce.net>
> > Sent: Friday, July 26, 2019 12:01 PM
> > To: HBase Dev List <de...@hbase.apache.org>
> > Cc: hbase-user <us...@hbase.apache.org>
> > Subject: Re: The note of the round table meeting after HBaseConAsia 2019
> >
> >
> >
> > External
> >
> >
> >
> > Thanks for the thorough write-up Duo. Made for a good read....
> >
> > S
> >
> >
> >
> > On Fri, Jul 26, 2019 at 6:43 AM 张铎(Duo Zhang) <palomino219@gmail.com
> > <ma...@gmail.com>> wrote:
> >
> >
> >
> > > The conclusion of the HBaseConAsia 2019 will be available later. And
> >
> > > here is the note of the round table meeting after the conference. A bit
> > long...
> >
> > >
> >
> > > First we talked about splittable meta. At Xiaomi we have a cluster
> >
> > > which has nearly 200k regions and meta is very easy to overload and
> >
> > > can not recover. Anoop said we can try read replica, but agreed that
> >
> > > read replica can not solve all the problems, finally we still need to
> > split meta.
> >
> > >
> >
> > > Then we talked about SQL. Allan Yang said that most of their customers
> >
> > > want secondary index, even more than SQL. And for global strong
> >
> > > consistent secondary index, we agree that the only safe way is to use
> > transaction.
> >
> > > Other 'local' solutions will be in trouble when splitting/merging.
> >
> > > Xiaomi has an global secondary index solution, open source it?
> >
> > >
> >
> > > Then we back to SQL. We talked about Phoenix, the problem for Phoenix
> >
> > > is well known: not stable enough. We even had a user on the
> >
> > > mailing-list said he/she will never use Phoenix again. Alibaba and
> >
> > > Huawei both have their in-house SQL solution, and Huawei also talked
> >
> > > about it on HBaseConAsia 2019, they will try to open source it. And we
> >
> > > could introduce a SQL proxy in hbase-connector repo. No push down
> >
> > > support first, all logics are done at the proxy side, can optimize
> later.
> >
> > >
> >
> > > Some guys said that the current feature set for 3.0.0 is not good
> >
> > > enough to attract more users, especially for small companies. Only
> >
> > > internal improvements, no users visible features. SQL and secondary
> >
> > > index are very important.
> >
> > >
> >
> > > Yu Li talked about the CCSMap, we still want it to be release in
> >
> > > 3.0.0. One problem is the relationship with in memory compaction.
> >
> > > Theoretically they should have no conflicts but actually they have.
> >
> > > And Xiaomi guys mentioned that in memory compaction still has some
> >
> > > bugs, even for basic mode, the MVCC writePoint may be stuck and hang
> >
> > > the region server. And Jieshan Bi asked why not just use CCSMap to
> >
> > > replace CSLM. Yu Li said this is for better memory usage, the index and
> > data could be placed together.
> >
> > >
> >
> > > Then we started to talk about the HBase on cloud. For now, it is a bit
> >
> > > difficult to deploy HBase on cloud as we need to deploy zookeeper and
> >
> > > HDFS first. Then we talked about the HBOSS and WAL
> > abstraction(HBASE-209520.
> >
> > > Wellington said the HBOSS basicly works, it use s3a and zookeeper to
> >
> > > help simulating the operations of HDFS. We could introduce our own
> > 'FileSystem'
> >
> > > interface, not the hadoop one, and we could remove the 'atomic
> renaming'
> >
> > > dependency so the 'FileSystem' implementation will be easier. And on
> >
> > > the WAL abstraction, Wellington said there are still some guys working
> >
> > > it, but now they focus on patching ratis, rather than abstracting the
> >
> > > WAL system first. We agreed that a better way is to abstract WAL
> >
> > > system at a level higher than FileSystem. so maybe we could even use
> > Kafka to store the WAL.
> >
> > >
> >
> > > Then we talked about the FPGA usage for compaction at Alibaba. Jieshan
> >
> > > Bi said that in Huawei they offload the compaction to storage layer.
> >
> > > For open source solution, maybe we could offload the compaction to
> >
> > > spark, and then use something like bulkload to let region server load
> >
> > > the new HFiles. The problem for doing compaction inside region server
> >
> > > is the CPU cost and GC pressure. We need to scan every cell so the CPU
> >
> > > cost is high. Yu Li talked about their page based compaction in flink
> >
> > > state store, maybe it could also benefit HBase.
> >
> > >
> >
> > > Then it is the time for MOB. Huawei said MOD can not solve their
> problem.
> >
> > > We still need to read the data through RPC, and it will also introduce
> >
> > > pressures on the memstore, since the memstore is still a bit small,
> >
> > > comparing to MOB cell. And we will also flush a lot although there are
> >
> > > only a small number of MOB cells in the memstore, so we still need to
> >
> > > compact a lot. So maybe the suitable scenario for using MOB is that,
> >
> > > most of your data are still small, and a small amount of the data are
> >
> > > a bit larger, where MOD could increase the performance, and users do
> >
> > > not need to use another system to store the larger data.
> >
> > > Huawei said that they implement the logic at client side. If the data
> >
> > > is larger than a threshold, the client will go to another storage
> >
> > > system rather than HBase.
> >
> > > Alibaba said that if we want to support large blob, we need to
> >
> > > introduce streaming API.
> >
> > > And Kuaishou said that they do not use MOB, they just store data on
> >
> > > HDFS and the index in HBase, typical solution.
> >
> > >
> >
> > > Then we talked about which company to host the next year's
> >
> > > HBaseConAsia. It will be Tencent or Huawei, or both, probably in
> >
> > > Shenzhen. And since there is no HBaseCon in America any more(it is
> >
> > > called 'NoSQL Day'), maybe next year we could just call the conference
> > HBaseCon.
> >
> > >
> >
> > > Then we back to SQL again. Alibaba said that most of their customers
> >
> > > are migrate from old business, so they need 'full' SQL support. That's
> >
> > > why they need Phoenix. And lots of small companies wants to run OLAP
> >
> > > queries directly on the database, they do no want to use ETL. So maybe
> >
> > > in the SQL proxy(planned above), we should delegate the OLAP queries
> >
> > > to spark SQL or something else, rather than just rejecting them.
> >
> > >
> >
> > > And a Phoenix committer said that, the Phoenix community are currently
> >
> > > re-evaluate the relationship with HBase, because when upgrading to
> >
> > > HBase 2.1.x, lots of things are broken. They plan to break the tie
> >
> > > between Phoenix and HBase, which means Phoenix plans to also run on
> >
> > > other storage systems.
> >
> > > Note: This is not on the meeting but personally, I think this maybe a
> >
> > > good news, since Phoenix is not HBase only, we have more reasons to
> >
> > > introduce our own SQL layer.
> >
> > >
> >
> > > Then we talked about Kudu. It is faster than HBase on scan. If we want
> >
> > > to increase the performance on scan, we should have larger block size,
> >
> > > but this will lead to a slower random read, so we need to trade-off.
> >
> > > The Kuaishou guys asked whether HBase could support storing HFile in
> >
> > > columnar format. The answer is no, as said above, it will slow random
> > read.
> >
> > > But we could learn what google done in bigtable. We could write a copy
> >
> > > of the data in parquet format to another FileSystem, and user could
> >
> > > just scan the parquet file for better analysis performance. And if
> >
> > > they want the newest data, they could ask HBase for the newest data,
> >
> > > and it should be small. This is more like a solution, not only HBase
> >
> > > is involved. But at least we could introduce some APIs in HBase so
> >
> > > users can build the solution in their own environment. And if you do
> >
> > > not care the newest data, you could also use replication to replicate
> >
> > > the data to ES or other systems, and search there.
> >
> > >
> >
> > > And Didi talked about their problems using HBase. They use Kylin, so
> > > they also have lots of regions, so meta is also a problem for them.
> > > The pressure on zookeeper is also a problem, as the replication
> > > queues are stored on zk. After 2.1, zookeeper is only used as
> > > external storage in the replication implementation, so it is possible
> > > to switch to other storages, such as etcd. But it is still a bit
> > > difficult to store the data in a system table, as now we need to
> > > start the replication system before the WAL system; but if we want to
> > > store the replication data in an hbase table, obviously the WAL
> > > system must be started before the replication system, as we need the
> > > region of the system table online first, and opening it will write an
> > > open marker to the WAL. We need to find a way to break the deadlock.
> >
> > > And they also mentioned that the rsgroup feature also makes a big
> > > znode on zookeeper, as they have lots of tables. We have HBASE-22514,
> > > which aims to solve the problem.
> >
> > > And last, they shared their experience when upgrading from 0.98 to
> > > 1.4.x. They should be compatible, but actually there are problems.
> > > They agreed to post a blog about this.
> >
> > >
> >
> > > And the Flipkart guys said they will open source their test suite,
> > > which focuses on consistency (Jepsen?). This is good news; hope we
> > > could have another useful tool other than ITBLL.
> >
> > >
> >
> > > That's all. Thanks for reading.
> >
> > >
> >
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>    - A23, Crosstalk
>

Re: The note of the round table meeting after HBaseConAsia 2019

Posted by Andrew Purtell <ap...@apache.org>.
Ok, in that spirit let me say I've always found Apache Trafodion to be
interesting and credible technology and worthy of anyone's consideration.


On Thu, Aug 8, 2019 at 1:34 PM Rohit Jain <ro...@esgyn.com> wrote:

> Andrew,
>
> I would never dump on Apache Phoenix.  I have worked with James for years
> and have always wanted to see how we could collaborate on various aspects,
> including common data type support and transaction management, to name a
> few.  I think the challenge we faced is the Java vs C++ nature of the two
> projects.  I am just pointing out that Apache Trafodion is an alternative
> option available.  I am also letting people know what is NOT in Apache
> Trafodion, so they understand that before making the time investment.
>
> Yes, I do apologize that it sounds a bit like marketing, even though I
> tried to minimize that.  But you will see that we have had no marketing
> at all elsewhere, which is one of the reasons why no one seems to know
> about Apache Trafodion.
>
> Rohit
>
> -----Original Message-----
> From: Andrew Purtell <ap...@apache.org>
> Sent: Thursday, August 8, 2019 1:25 PM
> To: Hbase-User <us...@hbase.apache.org>
> Cc: HBase Dev List <de...@hbase.apache.org>
> Subject: Re: The note of the round table meeting after HBaseConAsia 2019
>
> This is great, but in the future please refrain from borderline marketing
> of a commercial product on these lists. This is not the appropriate venue
> for that.
>
> It is especially poor form to dump on a fellow open source project, as you
> claim to be. This I think is the tell behind the commercial motivation.
>
> Also I should point out, being pretty familiar with Phoenix in operation
> where I work, and in my interactions with various Phoenix committers and
> PMC, that the particular group of HBasers in that group appeared to share a
> negative view - which I will not comment on, they are entitled to their
> opinions, and more choice in SQL access to HBase is good! - that should not
> be claimed to be universal or even representative.
>
>
>
> On Thu, Aug 8, 2019 at 9:42 AM Rohit Jain <ro...@esgyn.com> wrote:
>
> > Hi folks,
> >
> > This is a nice write-up of the round-table meeting at HBaseConAsia.  I
> > would like to address the points I have pulled out from write-up (at
> > the bottom of this message).
> >
> > Many in the HBase community may not be aware that, besides Apache
> > Phoenix, there is a project called Apache Trafodion, contributed by
> > Hewlett-Packard in 2015, that has now been a top-level project for a
> > while.  Apache Trafodion is essentially technology from Tandem-Compaq-HP
> > that started its OLTP / operational journey as NonStop SQL in the early
> > 1990s.  Granted, it is a C++ project, but it includes 170+ patents that
> > were contributed to Apache.  These are capabilities that still don’t
> > exist in other databases.
> >
> > It is a full-fledged SQL relational database engine with broad ANSI SQL
> > support, including the OLAP functions mentioned, and including many de
> > facto standard functions from databases like Oracle.  You can go to the
> > Apache Trafodion wiki to see the documentation on what is supported by
> > Trafodion.
> >
> > When we introduced Apache Trafodion, we implemented a fully
> > distributed transaction management capability right in the HBase
> > engine using coprocessors; it is completely scalable with no
> > bottlenecks whatsoever.  We have made this infrastructure very
> > efficient over time, e.g. reducing two-phase commit overhead for
> > single-region transactions.  We have presented this at HBaseCon.
> >
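The single-region optimization mentioned above is, in essence, a one-phase-commit fast path: when every mutation in a transaction lands in one region, the coordinator can skip the prepare round trip entirely. A minimal sketch of that decision logic (all names are illustrative, not actual Trafodion or HBase code):

```python
# Sketch of a 1PC fast path for single-region transactions.
# All names here are illustrative, not actual Trafodion/HBase APIs.

def commit(mutations, region_of):
    """Commit a transaction, skipping the prepare phase when all
    mutations target a single region (one-phase commit fast path)."""
    regions = {region_of(m) for m in mutations}
    if len(regions) == 1:
        # 1PC: the single region can commit atomically on its own;
        # no prepare round trip, one RPC instead of two per region.
        return ("1PC", 1)          # (protocol, RPC round trips)
    # 2PC: prepare on every region, then commit on every region.
    return ("2PC", 2 * len(regions))

# Example: three mutations, all in region "r1" -> fast path.
protocol, rpcs = commit(["a", "b", "c"], region_of=lambda m: "r1")
```

The point of the fast path is that the common OLTP case (all writes in one row or one region) pays a single round trip instead of the full two-phase protocol.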
> > The engine also supports secondary indexes.  However, because of our
> > patented Multi-dimensional Access Method technology, the need for a
> > secondary index is substantially reduced.  All DDL and index updates
> > are fully protected by ACID transactions.
> >
> > Probably because of our own inability to create excitement about the
> > project, and potentially other reasons, we could not get the community
> > involvement we were expecting.  That is why you may see that, while
> > we are maintaining the code base and introducing enhancements to it,
> > much of our focus has shifted to the commercial product based on
> > Apache Trafodion, namely EsgynDB.  But if community involvement
> > increases, we can certainly refresh Trafodion with some of the
> > additional functionality we have added on the HBase side of the product.
> >
> > But let me be clear.  We are about 150 employees at Esgyn, with 40 or
> > so in the US, mostly in Milpitas, and the rest in Shanghai, Beijing,
> > and Guiyang.  We cannot sustain the company on service revenue alone.
> > You have seen that companies that tried to do that have not been
> > successful, unless they have a way to leverage the open source project
> > for a different business model: enhanced capabilities, cloud services,
> > etc.
> >
> > To that end we have added to EsgynDB complete Disaster Recovery,
> > Point-in-Time, fuzzy Backup and Restore, Manageability via a Database
> > Manager, Multi-tenancy, and a large number of other capabilities for
> > High Availability scale-out production deployments.  EsgynDB also
> > provides full BI and Analytics capabilities, again because of our
> > heritage products supporting up to 250TB EDWs for HP and customers
> > like Walmart competing with Teradata, leveraging Apache ORC and
> > Parquet.  So yes, it can integrate with other storage engines as needed.
> >
> > However, in spite of all this, the pricing on EsgynDB is very
> > competitive – in other words “cheap” compared to anything else with
> > the same caliber of capabilities.
> >
> > We have demonstrated the capability of the product by running the
> > TPC-C and TPC-DS (all 99 queries) benchmarks, especially at high
> > concurrency which our product is especially well suited for, based on
> > its architecture and patents.  (The TPC-DS benchmarks are run on ORC
> > and Parquet for obvious
> > reasons.)
> >
> > We just closed a couple of very large core banking deals in Guiyang,
> > where we are replacing these banks' entire core banking systems,
> > currently Oracle implementations that were having challenges scaling
> > at a reasonable cost.  But we have many customers in both the US and
> > China that are using EsgynDB for operational, BI, and analytics needs.
> > And now finally … OLTP.
> >
> > I know that this is sounding more like a commercial for Esgyn, but
> > that is not my intent.  I would like to make you aware of Apache
> > Trafodion as a solution to many of these issues that the community is
> > facing.  We will provide full support for Trafodion with community
> > involvement, and hope that some of that involvement results in EsgynDB
> > revenue that we can sustain the company on 😊.  I would like to
> > encourage the community to look at Trafodion to address many of the
> > concerns cited below.
> >
> > “Allan Yang said that most of their customers want secondary index,
> > even more than SQL. And for a global strongly consistent secondary
> > index, we agree that the only safe way is to use transactions. Other
> > 'local' solutions will be in trouble when splitting/merging.”
> >
> > “We talked about Phoenix; the problem for Phoenix is well known: not
> > stable enough. We even had a user on the mailing list say he/she will
> > never use Phoenix again.”
> >
> > “Some guys said that the current feature set for 3.0.0 is not good
> > enough to attract more users, especially for small companies. Only
> > internal improvements, no user-visible features. SQL and secondary
> > index are very important.”
> >
> > “Then we got back to SQL again. Alibaba said that most of their
> > customers are migrating from old business, so they need 'full' SQL
> > support. That's why they need Phoenix. And lots of small companies
> > want to run OLAP queries directly on the database; they do not want to
> > use ETL. So maybe in the SQL proxy (planned above), we should delegate
> > the OLAP queries to Spark SQL or something else, rather than just
> > rejecting them.”
> >
> > “And a Phoenix committer said that the Phoenix community is currently
> > re-evaluating its relationship with HBase, because when upgrading to
> > HBase 2.1.x, lots of things broke. They plan to break the tie between
> > Phoenix and HBase, which means Phoenix plans to also run on other
> > storage systems. Note: This was not at the meeting, but personally, I
> > think this may be good news; since Phoenix is not HBase only, we have
> > more reasons to introduce our own SQL layer.”
> >
> > Rohit Jain
> > CTO
> > Esgyn
> >
> >
> >
> > -----Original Message-----
> > From: Stack <st...@duboce.net>
> > Sent: Friday, July 26, 2019 12:01 PM
> > To: HBase Dev List <de...@hbase.apache.org>
> > Cc: hbase-user <us...@hbase.apache.org>
> > Subject: Re: The note of the round table meeting after HBaseConAsia
> > 2019
> >
> >
> >
> > External
> >
> >
> >
> > Thanks for the thorough write-up Duo. Made for a good read....
> >
> > S
> >
> >
> >
> > On Fri, Jul 26, 2019 at 6:43 AM 张铎(Duo Zhang) <palomino219@gmail.com
> > <ma...@gmail.com>> wrote:
> >
> >
> >
> > > The conclusion of HBaseConAsia 2019 will be available later. And
> > > here is the note of the round table meeting after the conference. A
> > > bit long...
> >
> > >
> >
> > > First we talked about splittable meta. At Xiaomi we have a cluster
> > > which has nearly 200k regions, and meta is very easy to overload and
> > > cannot recover. Anoop said we can try read replicas, but agreed that
> > > read replicas cannot solve all the problems; finally we still need to
> > > split meta.
> >
> > >
> >
> > > Then we talked about SQL. Allan Yang said that most of their
> > > customers want secondary index, even more than SQL. And for a global
> > > strongly consistent secondary index, we agree that the only safe way
> > > is to use transactions. Other 'local' solutions will be in trouble
> > > when splitting/merging. Xiaomi has a global secondary index solution;
> > > open source it?
> >
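The point about transactions being the only safe way to keep a global index consistent can be shown with a toy sketch: the data-table write and the index-table write must commit or fail together, or a crash between the two leaves a dangling or missing index entry. A minimal illustration (plain Python, in-memory dicts standing in for the tables; not real HBase API):

```python
# Toy illustration of why a global secondary index needs transactions:
# the data write and the index write must be atomic together.
# Plain in-memory dicts stand in for the data and index tables.

class TxnStore:
    def __init__(self):
        self.data = {}    # row key -> value
        self.index = {}   # value  -> row key (global secondary index)

    def put(self, row, value, fail_between=False):
        """Write row and index entry atomically; roll back on failure."""
        old = self.data.get(row)
        self.data[row] = value
        try:
            if fail_between:
                raise RuntimeError("crash between data and index write")
            self.index[value] = row
        except Exception:
            # Without this rollback (i.e. without a transaction), the
            # index would silently disagree with the data table.
            if old is None:
                del self.data[row]
            else:
                self.data[row] = old
            raise

store = TxnStore()
store.put("row1", "alice")           # ok: both writes applied
try:
    store.put("row2", "bob", fail_between=True)
except RuntimeError:
    pass                             # rolled back: store stays consistent
```

A 'local' index avoids the distributed write but, as noted above, breaks down across region splits and merges; the transactional dual write is what keeps the two tables in agreement.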
> > >
> >
> > > Then we got back to SQL. We talked about Phoenix; the problem for
> > > Phoenix is well known: not stable enough. We even had a user on the
> > > mailing list say he/she will never use Phoenix again. Alibaba and
> > > Huawei both have their in-house SQL solutions, and Huawei also talked
> > > about theirs at HBaseConAsia 2019; they will try to open source it.
> > > And we could introduce a SQL proxy in the hbase-connectors repo. No
> > > push-down support at first; all logic is done at the proxy side, and
> > > we can optimize later.
> >
> > >
> >
> > > Some guys said that the current feature set for 3.0.0 is not good
> > > enough to attract more users, especially for small companies. Only
> > > internal improvements, no user-visible features. SQL and secondary
> > > index are very important.
> >
> > >
> >
> > > Yu Li talked about the CCSMap; we still want it to be released in
> > > 3.0.0. One problem is the relationship with in-memory compaction.
> > > Theoretically they should have no conflicts, but actually they do.
> > > And the Xiaomi guys mentioned that in-memory compaction still has
> > > some bugs; even in basic mode, the MVCC writePoint may get stuck and
> > > hang the region server. And Jieshan Bi asked why not just use CCSMap
> > > to replace CSLM. Yu Li said this is for better memory usage: the
> > > index and data can be placed together.
> >
> > >
> >
> > > Then we started to talk about HBase on cloud. For now, it is a bit
> > > difficult to deploy HBase on cloud, as we need to deploy zookeeper
> > > and HDFS first. Then we talked about HBOSS and the WAL abstraction
> > > (HBASE-20952). Wellington said that HBOSS basically works; it uses
> > > S3A and zookeeper to help simulate the semantics of HDFS. We could
> > > introduce our own 'FileSystem' interface, not the hadoop one, and we
> > > could remove the 'atomic renaming' dependency so the 'FileSystem'
> > > implementation will be easier. And on the WAL abstraction, Wellington
> > > said there are still some guys working on it, but for now they are
> > > focused on patching Ratis, rather than abstracting the WAL system
> > > first. We agreed that a better way is to abstract the WAL system at a
> > > level higher than FileSystem, so maybe we could even use Kafka to
> > > store the WAL.
> >
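HBOSS's trick, as described above, is to layer locking (via ZooKeeper) over S3A so that operations HBase expects to be atomic, such as rename, behave safely on an object store. A toy single-process sketch of the idea, with an in-process lock standing in for the ZooKeeper lock (illustrative only, not the HBOSS implementation):

```python
# Toy sketch of the HBOSS idea: wrap a non-atomic object store so that
# rename happens under a lock, giving HBase the atomicity it expects.
# A threading.Lock stands in for the ZooKeeper lock HBOSS really uses.
import threading

class LockingStore:
    def __init__(self):
        self.objects = {}              # path -> bytes, like an object store
        self._lock = threading.Lock()  # stand-in for a ZooKeeper lock

    def put(self, path, data):
        self.objects[path] = data

    def rename(self, src, dst):
        # On an object store, "rename" is copy + delete, which is not
        # atomic. Holding the lock keeps concurrent clients from
        # observing the intermediate state.
        with self._lock:
            if src not in self.objects:
                raise FileNotFoundError(src)
            self.objects[dst] = self.objects.pop(src)

store = LockingStore()
store.put("/hbase/.tmp/hfile1", b"data")
store.rename("/hbase/.tmp/hfile1", "/hbase/data/hfile1")
```

Dropping the atomic-rename requirement from a purpose-built 'FileSystem' interface, as proposed above, would make such a wrapper unnecessary in the first place.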
> > >
> >
> > > Then we talked about the FPGA usage for compaction at Alibaba.
> > > Jieshan Bi said that at Huawei they offload compaction to the
> > > storage layer. For an open source solution, maybe we could offload
> > > compaction to Spark, and then use something like bulk load to let the
> > > region server load the new HFiles. The problem with doing compaction
> > > inside the region server is the CPU cost and GC pressure; we need to
> > > scan every cell, so the CPU cost is high. Yu Li talked about their
> > > page-based compaction in the Flink state store; maybe it could also
> > > benefit HBase.
> >
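One reason offloading is attractive is that compaction is essentially a k-way merge of sorted files, which needs no region server state beyond the inputs. A minimal sketch of that merge, with sorted Python lists standing in for HFiles (illustrative only):

```python
# Minimal sketch: compaction as a k-way merge of sorted key/value runs.
# Each input list stands in for a sorted HFile; the output is the
# compacted file. Later files win on duplicate keys (newest version).
import heapq

def compact(hfiles):
    """Merge sorted (key, value) runs, keeping one value per key."""
    merged = heapq.merge(*[
        ((key, seq, value) for key, value in hfile)
        for seq, hfile in enumerate(hfiles)
    ])
    out, last_key = [], None
    for key, seq, value in merged:
        if key == last_key:
            out[-1] = (key, value)   # higher seq = newer file wins
        else:
            out.append((key, value))
            last_key = key
    return out

older = [("a", 1), ("c", 3)]
newer = [("b", 2), ("c", 30)]
result = compact([older, newer])   # -> [("a", 1), ("b", 2), ("c", 30)]
```

The merge itself is cheap; the CPU and GC cost mentioned above comes from decoding, comparing, and re-encoding every cell, which is exactly the work an FPGA, the storage layer, or a Spark job can take off the region server.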
> > >
> >
> > > Then it was time for MOB. Huawei said MOB cannot solve their
> > > problem. We still need to read the data through RPC, and it also
> > > introduces pressure on the memstore, since the memstore is still a
> > > bit small compared to a MOB cell. And we will also flush a lot even
> > > though there are only a small number of MOB cells in the memstore, so
> > > we still need to compact a lot. So maybe the suitable scenario for
> > > using MOB is that most of your data is still small and a small
> > > amount of the data is a bit larger, where MOB could increase
> > > performance and users do not need another system to store the larger
> > > data.
> > > Huawei said that they implement the logic at the client side: if the
> > > data is larger than a threshold, the client will go to another
> > > storage system rather than HBase.
> > > Alibaba said that if we want to support large blobs, we need to
> > > introduce a streaming API.
> > > And Kuaishou said that they do not use MOB; they just store the data
> > > on HDFS and the index in HBase, a typical solution.
> >
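The client-side pattern Huawei and Kuaishou describe can be sketched in a few lines: values over a threshold go to a blob store (e.g. HDFS) and HBase keeps only a small pointer. Dicts stand in for both systems; all names are illustrative, not any real client API:

```python
# Sketch of client-side blob routing: small values go to HBase, large
# values go to a blob store (e.g. HDFS) with only a pointer in HBase.
# Both stores are plain dicts here; names are illustrative only.

THRESHOLD = 1024  # bytes; values at or above this go to the blob store

hbase, blob_store = {}, {}

def put(key, value):
    if len(value) < THRESHOLD:
        hbase[key] = ("inline", value)
    else:
        path = f"/blobs/{key}"
        blob_store[path] = value
        hbase[key] = ("pointer", path)   # HBase holds only the pointer

def get(key):
    kind, payload = hbase[key]
    return payload if kind == "inline" else blob_store[payload]

put("small", b"x" * 10)      # stored inline in HBase
put("large", b"x" * 4096)    # stored in the blob store, pointer in HBase
```

This keeps the memstore and flush/compaction pressure described above proportional to the small values, at the cost of a second read for large ones; a streaming API, as Alibaba suggests, would be the next step for truly large blobs.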
> > >
> >
> > > Then we talked about which company will host next year's
> > > HBaseConAsia. It will be Tencent or Huawei, or both, probably in
> > > Shenzhen. And since there is no HBaseCon in America any more (it is
> > > now called 'NoSQL Day'), maybe next year we could just call the
> > > conference HBaseCon.
> >
> > >
> >
> > > Then we got back to SQL again. Alibaba said that most of their
> > > customers are migrating from old business, so they need 'full' SQL
> > > support. That's why they need Phoenix. And lots of small companies
> > > want to run OLAP queries directly on the database; they do not want
> > > to use ETL. So maybe in the SQL proxy (planned above), we should
> > > delegate the OLAP queries to Spark SQL or something else, rather
> > > than just rejecting them.
> >
> > >
> >
> > > And a Phoenix committer said that the Phoenix community is currently
> > > re-evaluating its relationship with HBase, because when upgrading to
> > > HBase 2.1.x, lots of things broke. They plan to break the tie between
> > > Phoenix and HBase, which means Phoenix plans to also run on other
> > > storage systems.
> > > Note: This was not at the meeting, but personally, I think this may
> > > be good news; since Phoenix is not HBase only, we have more reasons
> > > to introduce our own SQL layer.
> >
> > >
> >
> > > Then we talked about Kudu. It is faster than HBase on scan. If we
> > > want to increase scan performance, we should have a larger block
> > > size, but this will lead to slower random reads, so there is a
> > > trade-off. The Kuaishou guys asked whether HBase could support
> > > storing HFiles in columnar format. The answer is no; as said above,
> > > it would slow random reads.
> > > But we could learn from what Google has done in Bigtable. We could
> > > write a copy of the data in Parquet format to another FileSystem, and
> > > users could just scan the Parquet files for better analysis
> > > performance. And if they want the newest data, they could ask HBase
> > > for it, and it should be small. This is more like a solution, as more
> > > than HBase is involved. But at least we could introduce some APIs in
> > > HBase so users can build the solution in their own environment. And
> > > if you do not care about the newest data, you could also use
> > > replication to replicate the data to ES or other systems, and search
> > > there.
> >
> > >
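The Bigtable-style pattern described above, a bulk snapshot in Parquet plus a small recent delta in HBase, boils down to a merge-on-read: scan the snapshot, then overlay anything newer from HBase. A toy sketch with dicts standing in for the Parquet snapshot and the HBase delta (illustrative names only):

```python
# Toy sketch of merge-on-read over a Parquet snapshot plus an HBase
# delta: the snapshot serves the bulk of the scan, and the (small) set
# of rows written since the snapshot overrides it. Dicts stand in for
# both systems; all names are illustrative.

parquet_snapshot = {"row1": "v1", "row2": "v2", "row3": "v3"}
hbase_delta = {"row2": "v2-new", "row4": "v4"}  # written after snapshot

def scan():
    """Full scan: snapshot rows overlaid with newer HBase rows."""
    merged = dict(parquet_snapshot)   # cheap columnar bulk read
    merged.update(hbase_delta)        # small, recent delta wins
    return dict(sorted(merged.items()))

result = scan()
# -> {'row1': 'v1', 'row2': 'v2-new', 'row3': 'v3', 'row4': 'v4'}
```

The APIs proposed above would let users wire this up themselves: the heavy analytical scan hits Parquet, and only readers who need the newest data pay the extra HBase lookup.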
> >
> > > And Didi talked about their problems using HBase. They use Kylin, so
> > > they also have lots of regions, so meta is also a problem for them.
> > > The pressure on zookeeper is also a problem, as the replication
> > > queues are stored on zk. After 2.1, zookeeper is only used as
> > > external storage in the replication implementation, so it is possible
> > > to switch to other storages, such as etcd. But it is still a bit
> > > difficult to store the data in a system table, as now we need to
> > > start the replication system before the WAL system; but if we want to
> > > store the replication data in an hbase table, obviously the WAL
> > > system must be started before the replication system, as we need the
> > > region of the system table online first, and opening it will write an
> > > open marker to the WAL. We need to find a way to break the deadlock.
> >
> > > And they also mentioned that the rsgroup feature also makes a big
> > > znode on zookeeper, as they have lots of tables. We have HBASE-22514,
> > > which aims to solve the problem.
> >
> > > And last, they shared their experience when upgrading from 0.98 to
> > > 1.4.x. They should be compatible, but actually there are problems.
> > > They agreed to post a blog about this.
> >
> > >
> >
> > > And the Flipkart guys said they will open source their test suite,
> > > which focuses on consistency (Jepsen?). This is good news; hope we
> > > could have another useful tool other than ITBLL.
> >
> > >
> >
> > > That's all. Thanks for reading.
> >
> > >
> >
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>    - A23, Crosstalk
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk

Re: The note of the round table meeting after HBaseConAsia 2019

Posted by Andrew Purtell <ap...@apache.org>.
Ok, in that spirit let me say I've always found Apache Trafodion to be
interesting and credible technology and worthy of anyone's consideration.


On Thu, Aug 8, 2019 at 1:34 PM Rohit Jain <ro...@esgyn.com> wrote:

> Andrew,
>
> I would never dump on Apache Phoenix.  I have worked with James for years
> and have always wanted to see how we could collaborate on various aspects,
> including common data type support and transaction management, to name a
> few.  I think the challenges we faced is the Java vs C++ nature of the two
> projects.  I am just pointing out that Apache Trafodion is an alternate
> option available.  I am also letting people know what is NOT in Apache
> Trafodion, so they understand that before making the time investment.
>
> Yes, I do apologize it sounds a bit like marketing, even though I tried to
> minimize that.  But you will see that we have had no marketing at all
> elsewhere.  One of the reasons why no one seems to know about Apache
> Trafodion.
>
> Rohit
>
> -----Original Message-----
> From: Andrew Purtell <ap...@apache.org>
> Sent: Thursday, August 8, 2019 1:25 PM
> To: Hbase-User <us...@hbase.apache.org>
> Cc: HBase Dev List <de...@hbase.apache.org>
> Subject: Re: The note of the round table meeting after HBaseConAsia 2019
>
> This is great, but in the future please refrain from borderline marketing
> of a commercial product on these lists. This is not the appropriate venue
> for that.
>
> It is especially poor form to dump on a fellow open source project, as you
> claim to be. This I think is the tell behind the commercial motivation.
>
> Also I should point out, being pretty familiar with Phoenix in operation
> where I work, and in my interactions with various Phoenix committers and
> PMC, that the particular group of HBasers in that group appeared to share a
> negative view - which I will not comment on, they are entitled to their
> opinions, and more choice in SQL access to HBase is good! - that should not
> be claimed to be universal or even representative.
>
>
>
> On Thu, Aug 8, 2019 at 9:42 AM Rohit Jain <ro...@esgyn.com> wrote:
>
> > Hi folks,
> >
> > This is a nice write-up of the round-table meeting at HBaseConAsia.  I
> > would like to address the points I have pulled out from write-up (at
> > the bottom of this message).
> >
> > Many in the HBase community may not be aware that besides Apache
> > Phoenix, there has been a project called Apache Trafodion, contributed
> > by Hewlett-Packard in 2015 that has now been top-level project for a
> while.
> > Apache Trafodion is essentially technology from Tandem-Compaq-HP that
> > started its OLTP / Operational journey as NonStop SQL effectively in
> > the early 1990s.  Granted it is a C++ project, but it has 170+ patents
> > as part of it that were contributed to Apache.  These are capabilities
> > that still don’t exist in other databases.
> >
> > It is a full-fledged SQL relational database engine with the breadth
> > of ANSI SQL support, including OLAP functions mentioned, and including
> > many de facto standard functions from databases like Oracle.  You can
> > go to the Apache Trafodion wiki to see the documentation as to what
> > all is supported by Trafodion.
> >
> > When we introduced Apache Trafodion, we implemented a completely
> > distributed transaction management capability right into the HBase
> > engine using coprocessors, that is completely scalable with no
> > bottlenecks what-so-ever.  We have made this infrastructure very
> > efficient over time, e.g. reducing two-phase commit overhead for
> > single region transactions.  We have presented this at HBaseCon.
> >
> > The engine also supports secondary indexes.  However, because of our
> > Multi-dimensional Access Method patented technology the need to use a
> > secondary index is substantially reduced.  All DDL and index updates
> > are completely protected by ACID transactions.
> >
> > Probably because of our own inability to create excitement about the
> > project, and potentially other reasons, we could not get community
> > involvement as we were expecting.  That is why you may see that while
> > we are maintaining the code base and introducing enhancements to it,
> > much of our focus has shifted to the commercial product based on
> > Apache Trafodion, namely EsgynDB.  But if the community involvement
> > increases, we can certainly refresh Trafodion with some of the
> > additional functionality we have added on the HBase side of the product.
> >
> > But let me be clear.  We are about 150 employees at Esgyn with 40 or
> > so in the US, mostly in Milpitas, and the rest in Shanghai, Beijing,
> > and Guiyang.  We cannot sustain the company on service revenue alone.
> > You have seen companies that tried to do that have not been
> > successful, unless they have a way to leverage the open source project
> > for a different business model – enhanced capabilities, Cloud services,
> etc.
> >
> > To that end we have added to EsgynDB complete Disaster Recovery,
> > Point-in-Time, fuzzy Backup and Restore, Manageability via a Database
> > Manager, Multi-tenancy, and a large number of other capabilities for
> > High Availability scale-out production deployments.  EsgynDB also
> > provides full BI and Analytics capabilities, again because of our
> > heritage products supporting up to 250TB EDWs for HP and customers
> > like Walmart competing with Teradata, leveraging Apache ORC and
> > Parquet.  So yes, it can integrate with other storage engines as needed.
> >
> > However, in spite of all this, the pricing on EsgynDB is very
> > competitive – in other words “cheap” compared to anything else with
> > the same caliber of capabilities.
> >
> > We have demonstrated the capability of the product by running the
> > TPC-C and TPC-DS (all 99 queries) benchmarks, especially at high
> > concurrency which our product is especially well suited for, based on
> > its architecture and patents.  (The TPC-DS benchmarks are run on ORC
> > and Parquet for obvious
> > reasons.)
> >
> > We just closed a couple of very large Core Banking deals in Guiyang
> > where we are replacing the entire Core Banking system for these banks
> > from their current Oracle implementations – where they were having
> > challenges scaling at a reasonable cost.  But we have many customers
> > both in the US and China that are using EsgynDB for operational, BI
> > and Analytics needs.  And now finally … OLTP.
> >
> > I know that this is sounding more like a commercial for Esgyn, but
> > that is not my intent.  I would like to make you aware of Apache
> > Trafodion as a solution to many of these issues that the community is
> > facing.  We will provide full support for Trafodion with community
> > involvement and hope that some of that involvement results in EsgynDB
> > revenue that we can sustain the company on 😊.  I would like to
> > encourage the community to look at Trafodion to address many of the
> concerns sighted below.
> >
> > “Allan Yang said that most of their customers want secondary index,
> > even more than SQL. And for global strong consistent secondary index,
> > we agree that the only safe way is to use transaction. Other 'local'
> > solutions will be in trouble when splitting/merging.”
> >
> > “We talked about Phoenix, the problem for Phoenix is well known: not
> > stable enough. We even had a user on the mailing-list said he/she will
> > never use Phoenix again.”
> >
> > “Some guys said that the current feature set for 3.0.0 is not good
> > enough to attract more users, especially for small companies. Only
> > internal improvements, no users visible features. SQL and secondary
> > index are very important.”
> >
> > “Then we back to SQL again. Alibaba said that most of their customers
> > are migrate from old business, so they need 'full' SQL support. That's
> > why they need Phoenix. And lots of small companies wants to run OLAP
> > queries directly on the database, they do no want to use ETL. So maybe
> > in the SQL proxy (planned above), we should delegate the OLAP queries
> > to spark SQL or something else, rather than just rejecting them.”
> >
> > “And a Phoenix committer said that, the Phoenix community are
> > currently re-evaluate the relationship with HBase, because when
> > upgrading to HBase 2.1.x, lots of things are broken. They plan to
> > break the tie between Phoenix and HBase, which means Phoenix plans to
> > also run on other storage systems. Note: This is not on the meeting
> > but personally, I think this maybe a good news, since Phoenix is not
> > HBase only, we have more reasons to introduce our own SQL layer.”
> >
> > Rohit Jain
> > CTO
> > Esgyn
> >
> >
> >
> > -----Original Message-----
> > From: Stack <st...@duboce.net>
> > Sent: Friday, July 26, 2019 12:01 PM
> > To: HBase Dev List <de...@hbase.apache.org>
> > Cc: hbase-user <us...@hbase.apache.org>
> > Subject: Re: The note of the round table meeting after HBaseConAsia
> > 2019
> >
> >
> >
> > External
> >
> >
> >
> > Thanks for the thorough write-up Duo. Made for a good read....
> >
> > S
> >
> >
> >
> > On Fri, Jul 26, 2019 at 6:43 AM 张铎(Duo Zhang) <palomino219@gmail.com
> > <ma...@gmail.com>> wrote:
> >
> >
> >
> > > The conclusion of the HBaseConAsia 2019 will be available later. And
> >
> > > here is the note of the round table meeting after the conference. A
> > > bit
> > long...
> >
> > >
> >
> > > First we talked about splittable meta. At Xiaomi we have a cluster
> >
> > > which has nearly 200k regions and meta is very easy to overload and
> >
> > > can not recover. Anoop said we can try read replica, but agreed that
> >
> > > read replica can not solve all the problems, finally we still need
> > > to
> > split meta.
> >
> > >
> >
> > > Then we talked about SQL. Allan Yang said that most of their
> > > customers
> >
> > > want secondary index, even more than SQL. And for global strong
> >
> > > consistent secondary index, we agree that the only safe way is to
> > > use
> > transaction.
> >
> > > Other 'local' solutions will be in trouble when splitting/merging.
> >
> > > Xiaomi has an global secondary index solution, open source it?
> >
> > >
> >
> > > Then we back to SQL. We talked about Phoenix, the problem for
> > > Phoenix
> >
> > > is well known: not stable enough. We even had a user on the
> >
> > > mailing-list said he/she will never use Phoenix again. Alibaba and
> >
> > > Huawei both have their in-house SQL solution, and Huawei also talked
> >
> > > about it on HBaseConAsia 2019, they will try to open source it. And
> > > we
> >
> > > could introduce a SQL proxy in hbase-connector repo. No push down
> >
> > > support first, all logics are done at the proxy side, can optimize
> later.
> >
> > >
> >
> > > Some guys said that the current feature set for 3.0.0 is not good
> >
> > > enough to attract more users, especially for small companies. Only
> >
> > > internal improvements, no users visible features. SQL and secondary
> >
> > > index are very important.
> >
> > >
> >
> > > Yu Li talked about the CCSMap, we still want it to be release in
> >
> > > 3.0.0. One problem is the relationship with in memory compaction.
> >
> > > Theoretically they should have no conflicts but actually they have.
> >
> > > And Xiaomi guys mentioned that in memory compaction still has some
> >
> > > bugs, even for basic mode, the MVCC writePoint may be stuck and hang
> >
> > > the region server. And Jieshan Bi asked why not just use CCSMap to
> >
> > > replace CSLM. Yu Li said this is for better memory usage, the index
> > > and
> > data could be placed together.
> >
> > >
> >
> > > Then we started to talk about HBase on cloud. For now, it is a bit
> > > difficult to deploy HBase on cloud, as we need to deploy ZooKeeper
> > > and HDFS first. Then we talked about HBOSS and the WAL abstraction
> > > (HBASE-20952). Wellington said HBOSS basically works; it uses s3a
> > > and ZooKeeper to help simulate the semantics of HDFS. We could
> > > introduce our own 'FileSystem' interface, not the Hadoop one, and
> > > we could remove the 'atomic renaming' dependency so the
> > > 'FileSystem' implementation will be easier. And on the WAL
> > > abstraction, Wellington said there are still some people working on
> > > it, but for now they focus on patching Ratis rather than
> > > abstracting the WAL system first. We agreed that a better way is to
> > > abstract the WAL system at a level higher than FileSystem, so maybe
> > > we could even use Kafka to store the WAL.
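A sketch of what "abstract the WAL above the FileSystem level" could look like: the region server codes against a small append/replay interface, and backends (HDFS files, a Kafka topic, a Ratis log) plug in underneath. The class and method names are illustrative, not HBase's real WAL API.

```python
from abc import ABC, abstractmethod

# Hypothetical WAL interface sitting above any FileSystem: callers only need
# append() and read_from(); the storage backend is an implementation detail.

class WAL(ABC):
    @abstractmethod
    def append(self, entry: bytes) -> int:
        """Durably append an entry; returns its sequence id."""

    @abstractmethod
    def read_from(self, seq: int) -> list:
        """Replay entries from a sequence id, e.g. for recovery."""

class InMemoryWAL(WAL):
    """Stands in for any backend -- a Kafka-backed WAL would map a region's
    log to a topic partition and the sequence id to an offset."""
    def __init__(self):
        self._log = []

    def append(self, entry):
        self._log.append(entry)
        return len(self._log) - 1

    def read_from(self, seq):
        return self._log[seq:]

wal = InMemoryWAL()
wal.append(b"put row1")
seq = wal.append(b"put row2")
tail = wal.read_from(seq)
```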
> >
> > >
> >
> > > Then we talked about the FPGA usage for compaction at Alibaba.
> > > Jieshan Bi said that at Huawei they offload compaction to the
> > > storage layer. For an open source solution, maybe we could offload
> > > compaction to Spark, and then use something like bulk load to let
> > > the region server load the new HFiles. The problem with doing
> > > compaction inside the region server is the CPU cost and GC
> > > pressure: we need to scan every cell, so the CPU cost is high. Yu
> > > Li talked about their page-based compaction in the Flink state
> > > store; maybe it could also benefit HBase.
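At its core, the work an external compaction job (e.g. a Spark task per region) would do is a k-way merge of sorted runs before bulk-loading the result back. A minimal sketch, ignoring tombstones, TTLs, and multiple retained versions:

```python
import heapq

# Sketch of offloaded compaction as a pure merge job: k-way merge several
# sorted runs of (key, seq, value) cells, keeping only the newest version of
# each key. Each run must be sorted by key ascending, then seq descending.

def compact(runs):
    merged = heapq.merge(*runs, key=lambda cell: (cell[0], -cell[1]))
    out, last_key = [], None
    for key, seq, value in merged:
        if key != last_key:          # first (newest) version of this key wins
            out.append((key, seq, value))
            last_key = key
    return out

# Two "HFiles": 'a' appears in both, seq 5 is newer than seq 3.
r1 = [("a", 5, "new"), ("c", 1, "x")]
r2 = [("a", 3, "old"), ("b", 2, "y")]
result = compact([r1, r2])
```

Because the merge only streams sorted inputs, it moves the CPU and garbage-collection cost of scanning every cell out of the region server; the server then only bulk-loads the finished file.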
> >
> > >
> >
> > > Then it was time for MOB. Huawei said MOB cannot solve their
> > > problem. We still need to read the data through RPC, and it also
> > > introduces pressure on the memstore, since the memstore is still a
> > > bit small compared to a MOB cell. And we will also flush a lot even
> > > though there are only a small number of MOB cells in the memstore,
> > > so we still need to compact a lot. So maybe the suitable scenario
> > > for using MOB is when most of your data is still small and a small
> > > amount of it is a bit larger; there MOB could increase performance,
> > > and users do not need another system to store the larger data.
> >
> > > Huawei said that they implement the logic on the client side: if
> > > the data is larger than a threshold, the client goes to another
> > > storage system rather than HBase.
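Huawei's client-side approach, as described, can be sketched as a thin wrapper: values above a threshold go to a separate blob store, and HBase keeps only a small pointer record. Plain dicts stand in for both systems; the threshold and pointer encoding are made up for illustration (a real scheme would also guard against small values that happen to look like pointers).

```python
# Hypothetical client-side routing of large values: big payloads go to a blob
# store (HDFS / object storage), HBase holds only a pointer.

THRESHOLD = 1024  # bytes; illustrative cutoff

hbase = {}       # stands in for an HBase table
blob_store = {}  # stands in for HDFS / object storage

def put(key, value: bytes):
    if len(value) > THRESHOLD:
        blob_store[key] = value
        hbase[key] = b"ptr:" + key       # small pointer record in HBase
    else:
        hbase[key] = value

def get(key) -> bytes:
    v = hbase[key]
    if v.startswith(b"ptr:"):
        return blob_store[v[4:]]         # follow the pointer
    return v

put(b"small", b"x" * 10)
put(b"big", b"x" * 4096)
```

This keeps memstore and flush pressure proportional to the pointer size, not the payload size, which is exactly the problem MOB runs into for genuinely large values.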
> >
> > > Alibaba said that if we want to support large blobs, we need to
> > > introduce a streaming API.
> >
> > > And Kuaishou said that they do not use MOB; they just store the
> > > data on HDFS and the index in HBase, a typical solution.
> >
> > >
> >
> > > Then we talked about which company will host next year's
> > > HBaseConAsia. It will be Tencent or Huawei, or both, probably in
> > > Shenzhen. And since there is no HBaseCon in America any more (it is
> > > called 'NoSQL Day' now), maybe next year we could just call the
> > > conference HBaseCon.
> >
> > >
> >
> > > Then we came back to SQL again. Alibaba said that most of their
> > > customers are migrating from legacy businesses, so they need 'full'
> > > SQL support. That's why they need Phoenix. And lots of small
> > > companies want to run OLAP queries directly on the database; they
> > > do not want to use ETL. So maybe in the SQL proxy (planned above),
> > > we should delegate the OLAP queries to Spark SQL or something else,
> > > rather than just rejecting them.
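The delegation idea above amounts to routing in the proxy: detect analytical SQL and hand it to an OLAP engine such as Spark SQL, while point queries keep going to HBase. A sketch with a deliberately crude keyword heuristic; a real proxy would classify from the parsed query plan, not from substrings.

```python
# Hypothetical query router for the SQL proxy: analytical queries are
# delegated instead of rejected. The keyword list is purely illustrative.

OLAP_HINTS = ("GROUP BY", "JOIN", "ORDER BY", "AVG(", "SUM(", "COUNT(")

def route(sql: str) -> str:
    s = sql.upper()
    if any(h in s for h in OLAP_HINTS):
        return "spark-sql"   # delegate heavy analytics
    return "hbase"           # serve point lookups / short scans directly

q1 = route("SELECT v FROM t WHERE rowkey = 'r1'")
q2 = route("SELECT region, SUM(amount) FROM t GROUP BY region")
```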
> >
> > >
> >
> > > And a Phoenix committer said that the Phoenix community is
> > > currently re-evaluating its relationship with HBase, because when
> > > upgrading to HBase 2.1.x, lots of things broke. They plan to break
> > > the tie between Phoenix and HBase, which means Phoenix plans to
> > > also run on other storage systems.
> >
> > > Note: this was not discussed at the meeting, but personally I think
> > > this may be good news; since Phoenix will no longer be HBase-only,
> > > we have more reason to introduce our own SQL layer.
> >
> > >
> >
> > > Then we talked about Kudu. It is faster than HBase on scans. If we
> > > want to increase scan performance, we should have a larger block
> > > size, but this will lead to slower random reads, so there is a
> > > trade-off. The Kuaishou guys asked whether HBase could support
> > > storing HFiles in a columnar format. The answer is no; as said
> > > above, it would slow down random reads.
> >
> > > But we could learn from what Google has done with Bigtable. We
> > > could write a copy of the data in Parquet format to another
> > > FileSystem, and users could just scan the Parquet files for better
> > > analysis performance. And if they want the newest data, they could
> > > ask HBase for just the newest data, which should be small. This is
> > > more like a solution than a feature; not only HBase is involved.
> > > But at least we could introduce some APIs in HBase so users can
> > > build the solution in their own environment. And if you do not care
> > > about the newest data, you could also use replication to replicate
> > > the data to ES or other systems, and search there.
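The Bigtable-style split described above can be sketched as a snapshot-plus-delta read: analytics scan a columnar snapshot (the Parquet copy, written out of band), and only the small tail of writes since the snapshot is fetched from HBase and overlaid. Plain dicts stand in for both systems here.

```python
# Illustrative snapshot + delta overlay: the Parquet copy serves the bulk of
# an analytical scan; HBase only supplies recent writes when freshness is
# requested.

parquet_snapshot = {"r1": 10, "r2": 20}   # bulk columnar copy, possibly hours old
hbase_recent = {"r2": 25, "r3": 30}       # writes since the snapshot was taken

def analytic_scan(want_newest: bool):
    if not want_newest:
        return dict(parquet_snapshot)     # fast columnar-only path
    merged = dict(parquet_snapshot)
    merged.update(hbase_recent)           # small overlay fetched from HBase
    return merged

stale = analytic_scan(False)
fresh = analytic_scan(True)
```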
> >
> > >
> >
> > > And Didi talked about their problems using HBase. They use Kylin,
> > > so they also have lots of regions, so meta is also a problem for
> > > them. And the pressure on ZooKeeper is also a problem, as the
> > > replication queues are stored on ZK. After 2.1, ZooKeeper is only
> > > used as external storage in the replication implementation, so it
> > > is possible to switch to other storages, such as etcd. But it is
> > > still a bit difficult to store the data in a system table: right
> > > now we need to start the replication system before the WAL system,
> > > but if we want to store the replication data in an HBase table, the
> > > WAL system must obviously be started before the replication system,
> > > as we need the region of the system table online first, and opening
> > > it will write an open marker to the WAL. We need to find a way to
> > > break the deadlock.
> >
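The "ZooKeeper is only used as external storage" point means the replication queues now sit behind an implementation boundary, so the backend can be swapped. A sketch of that boundary; the class and method names are illustrative, not HBase's real ReplicationQueueStorage API.

```python
from abc import ABC, abstractmethod

# Hypothetical replication-queue storage interface: ZooKeeper, etcd, or an
# HBase system table could all implement it.

class ReplicationQueueStorage(ABC):
    @abstractmethod
    def add_wal(self, peer: str, wal: str): ...

    @abstractmethod
    def wals(self, peer: str) -> list: ...

class ZKQueueStorage(ReplicationQueueStorage):
    """ZooKeeper-style backend: available before the WAL system starts."""
    def __init__(self):
        self._znodes = {}

    def add_wal(self, peer, wal):
        self._znodes.setdefault(peer, []).append(wal)

    def wals(self, peer):
        return self._znodes.get(peer, [])

# A table-backed implementation would look identical from the caller's side,
# but it cannot come up until its region is online -- and opening that region
# itself writes to the WAL, which is the startup deadlock described above.

zk = ZKQueueStorage()
zk.add_wal("peer1", "wal.00001")
```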
> > > And they also mentioned that the rsgroup feature creates a big
> > > znode on ZooKeeper, as they have lots of tables. We have
> > > HBASE-22514, which aims to solve the problem.
> >
> > > And last, they shared their experience upgrading from 0.98 to
> > > 1.4.x. These versions should be compatible, but in practice there
> > > were problems. They agreed to post a blog about this.
> >
> > >
> >
> > > And the Flipkart guys said they will open source their test suite,
> > > which focuses on consistency (Jepsen?). This is good news;
> > > hopefully we will have another useful tool besides ITBLL.
> >
> > >
> >
> > > That's all. Thanks for reading.
> >
> > >
> >
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>    - A23, Crosstalk
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk

RE: The note of the round table meeting after HBaseConAsia 2019

Posted by Rohit Jain <ro...@esgyn.com>.
Andrew,

I would never dump on Apache Phoenix.  I have worked with James for years and have always wanted to see how we could collaborate on various aspects, including common data type support and transaction management, to name a few.  I think the challenges we faced is the Java vs C++ nature of the two projects.  I am just pointing out that Apache Trafodion is an alternate option available.  I am also letting people know what is NOT in Apache Trafodion, so they understand that before making the time investment.

Yes, I do apologize it sounds a bit like marketing, even though I tried to minimize that.  But you will see that we have had no marketing at all elsewhere.  One of the reasons why no one seems to know about Apache Trafodion.

Rohit

-----Original Message-----
From: Andrew Purtell <ap...@apache.org> 
Sent: Thursday, August 8, 2019 1:25 PM
To: Hbase-User <us...@hbase.apache.org>
Cc: HBase Dev List <de...@hbase.apache.org>
Subject: Re: The note of the round table meeting after HBaseConAsia 2019

This is great, but in the future please refrain from borderline marketing of a commercial product on these lists. This is not the appropriate venue for that.

It is especially poor form to dump on a fellow open source project, as you claim to be. This I think is the tell behind the commercial motivation.

Also I should point out, being pretty familiar with Phoenix in operation where I work, and in my interactions with various Phoenix committers and PMC, that the particular group of HBasers in that group appeared to share a negative view - which I will not comment on, they are entitled to their opinions, and more choice in SQL access to HBase is good! - that should not be claimed to be universal or even representative.



On Thu, Aug 8, 2019 at 9:42 AM Rohit Jain <ro...@esgyn.com> wrote:

> Hi folks,
>
> This is a nice write-up of the round-table meeting at HBaseConAsia.  I 
> would like to address the points I have pulled out from write-up (at 
> the bottom of this message).
>
> Many in the HBase community may not be aware that besides Apache 
> Phoenix, there has been a project called Apache Trafodion, contributed 
> by Hewlett-Packard in 2015 that has now been top-level project for a while.
> Apache Trafodion is essentially technology from Tandem-Compaq-HP that 
> started its OLTP / Operational journey as NonStop SQL effectively in 
> the early 1990s.  Granted it is a C++ project, but it has 170+ patents 
> as part of it that were contributed to Apache.  These are capabilities 
> that still don’t exist in other databases.
>
> It is a full-fledged SQL relational database engine with the breadth 
> of ANSI SQL support, including OLAP functions mentioned, and including 
> many de facto standard functions from databases like Oracle.  You can 
> go to the Apache Trafodion wiki to see the documentation as to what 
> all is supported by Trafodion.
>
> When we introduced Apache Trafodion, we implemented a completely 
> distributed transaction management capability right into the HBase 
> engine using coprocessors, that is completely scalable with no 
> bottlenecks what-so-ever.  We have made this infrastructure very 
> efficient over time, e.g. reducing two-phase commit overhead for 
> single region transactions.  We have presented this at HBaseCon.
>
> The engine also supports secondary indexes.  However, because of our 
> Multi-dimensional Access Method patented technology the need to use a 
> secondary index is substantially reduced.  All DDL and index updates 
> are completely protected by ACID transactions.
>
> Probably because of our own inability to create excitement about the 
> project, and potentially other reasons, we could not get community 
> involvement as we were expecting.  That is why you may see that while 
> we are maintaining the code base and introducing enhancements to it, 
> much of our focus has shifted to the commercial product based on 
> Apache Trafodion, namely EsgynDB.  But if the community involvement 
> increases, we can certainly refresh Trafodion with some of the 
> additional functionality we have added on the HBase side of the product.
>
> But let me be clear.  We are about 150 employees at Esgyn with 40 or 
> so in the US, mostly in Milpitas, and the rest in Shanghai, Beijing, 
> and Guiyang.  We cannot sustain the company on service revenue alone.  
> You have seen companies that tried to do that have not been 
> successful, unless they have a way to leverage the open source project 
> for a different business model – enhanced capabilities, Cloud services, etc.
>
> To that end we have added to EsgynDB complete Disaster Recovery, 
> Point-in-Time, fuzzy Backup and Restore, Manageability via a Database 
> Manager, Multi-tenancy, and a large number of other capabilities for 
> High Availability scale-out production deployments.  EsgynDB also 
> provides full BI and Analytics capabilities, again because of our 
> heritage products supporting up to 250TB EDWs for HP and customers 
> like Walmart competing with Teradata, leveraging Apache ORC and 
> Parquet.  So yes, it can integrate with other storage engines as needed.
>
> However, in spite of all this, the pricing on EsgynDB is very 
> competitive – in other words “cheap” compared to anything else with 
> the same caliber of capabilities.
>
> We have demonstrated the capability of the product by running the 
> TPC-C and TPC-DS (all 99 queries) benchmarks, especially at high 
> concurrency which our product is especially well suited for, based on 
> its architecture and patents.  (The TPC-DS benchmarks are run on ORC 
> and Parquet for obvious
> reasons.)
>
> We just closed a couple of very large Core Banking deals in Guiyang 
> where we are replacing the entire Core Banking system for these banks 
> from their current Oracle implementations – where they were having 
> challenges scaling at a reasonable cost.  But we have many customers 
> both in the US and China that are using EsgynDB for operational, BI 
> and Analytics needs.  And now finally … OLTP.
>
> I know that this is sounding more like a commercial for Esgyn, but 
> that is not my intent.  I would like to make you aware of Apache 
> Trafodion as a solution to many of these issues that the community is 
> facing.  We will provide full support for Trafodion with community 
> involvement and hope that some of that involvement results in EsgynDB 
> revenue that we can sustain the company on 😊.  I would like to 
> encourage the community to look at Trafodion to address many of the concerns sighted below.
>
> “Allan Yang said that most of their customers want secondary index, 
> even more than SQL. And for global strong consistent secondary index, 
> we agree that the only safe way is to use transaction. Other 'local' 
> solutions will be in trouble when splitting/merging.”
>
> “We talked about Phoenix, the problem for Phoenix is well known: not 
> stable enough. We even had a user on the mailing-list said he/she will 
> never use Phoenix again.”
>
> “Some guys said that the current feature set for 3.0.0 is not good 
> enough to attract more users, especially for small companies. Only 
> internal improvements, no users visible features. SQL and secondary 
> index are very important.”
>
> “Then we back to SQL again. Alibaba said that most of their customers 
> are migrate from old business, so they need 'full' SQL support. That's 
> why they need Phoenix. And lots of small companies wants to run OLAP 
> queries directly on the database, they do no want to use ETL. So maybe 
> in the SQL proxy (planned above), we should delegate the OLAP queries 
> to spark SQL or something else, rather than just rejecting them.”
>
> “And a Phoenix committer said that, the Phoenix community are 
> currently re-evaluate the relationship with HBase, because when 
> upgrading to HBase 2.1.x, lots of things are broken. They plan to 
> break the tie between Phoenix and HBase, which means Phoenix plans to 
> also run on other storage systems. Note: This is not on the meeting 
> but personally, I think this maybe a good news, since Phoenix is not 
> HBase only, we have more reasons to introduce our own SQL layer.”
>
> Rohit Jain
> CTO
> Esgyn
>
>
>
> -----Original Message-----
> From: Stack <st...@duboce.net>
> Sent: Friday, July 26, 2019 12:01 PM
> To: HBase Dev List <de...@hbase.apache.org>
> Cc: hbase-user <us...@hbase.apache.org>
> Subject: Re: The note of the round table meeting after HBaseConAsia 
> 2019
>
>
>
> External
>
>
>
> Thanks for the thorough write-up Duo. Made for a good read....
>
> S
>
>
>
> On Fri, Jul 26, 2019 at 6:43 AM 张铎(Duo Zhang) <palomino219@gmail.com 
> <ma...@gmail.com>> wrote:
>
>
>
> > The conclusion of the HBaseConAsia 2019 will be available later. And
>
> > here is the note of the round table meeting after the conference. A 
> > bit
> long...
>
> >
>
> > First we talked about splittable meta. At Xiaomi we have a cluster
>
> > which has nearly 200k regions and meta is very easy to overload and
>
> > can not recover. Anoop said we can try read replica, but agreed that
>
> > read replica can not solve all the problems, finally we still need 
> > to
> split meta.
>
> >
>
> > Then we talked about SQL. Allan Yang said that most of their 
> > customers
>
> > want secondary index, even more than SQL. And for global strong
>
> > consistent secondary index, we agree that the only safe way is to 
> > use
> transaction.
>



RE: The note of the round table meeting after HBaseConAsia 2019

Posted by Rohit Jain <ro...@esgyn.com>.
Andrew,

I would never dump on Apache Phoenix.  I have worked with James for years and have always wanted to see how we could collaborate on various aspects, including common data type support and transaction management, to name a few.  I think the challenges we faced is the Java vs C++ nature of the two projects.  I am just pointing out that Apache Trafodion is an alternate option available.  I am also letting people know what is NOT in Apache Trafodion, so they understand that before making the time investment.

Yes, I do apologize it sounds a bit like marketing, even though I tried to minimize that.  But you will see that we have had no marketing at all elsewhere.  One of the reasons why no one seems to know about Apache Trafodion.

Rohit

-----Original Message-----
From: Andrew Purtell <ap...@apache.org> 
Sent: Thursday, August 8, 2019 1:25 PM
To: Hbase-User <us...@hbase.apache.org>
Cc: HBase Dev List <de...@hbase.apache.org>
Subject: Re: The note of the round table meeting after HBaseConAsia 2019

This is great, but in the future please refrain from borderline marketing of a commercial product on these lists. This is not the appropriate venue for that.

It is especially poor form to dump on a fellow open source project, as you claim to be. This I think is the tell behind the commercial motivation.

Also I should point out, being pretty familiar with Phoenix in operation where I work, and in my interactions with various Phoenix committers and PMC, that the particular group of HBasers in that group appeared to share a negative view - which I will not comment on, they are entitled to their opinions, and more choice in SQL access to HBase is good! - that should not be claimed to be universal or even representative.



On Thu, Aug 8, 2019 at 9:42 AM Rohit Jain <ro...@esgyn.com> wrote:

> Hi folks,
>
> This is a nice write-up of the round-table meeting at HBaseConAsia.  I 
> would like to address the points I have pulled out from write-up (at 
> the bottom of this message).
>
> Many in the HBase community may not be aware that besides Apache 
> Phoenix, there has been a project called Apache Trafodion, contributed 
> by Hewlett-Packard in 2015 that has now been top-level project for a while.
> Apache Trafodion is essentially technology from Tandem-Compaq-HP that 
> started its OLTP / Operational journey as NonStop SQL effectively in 
> the early 1990s.  Granted it is a C++ project, but it has 170+ patents 
> as part of it that were contributed to Apache.  These are capabilities 
> that still don’t exist in other databases.
>
> It is a full-fledged SQL relational database engine with the breadth 
> of ANSI SQL support, including OLAP functions mentioned, and including 
> many de facto standard functions from databases like Oracle.  You can 
> go to the Apache Trafodion wiki to see the documentation as to what 
> all is supported by Trafodion.
>
> When we introduced Apache Trafodion, we implemented a completely 
> distributed transaction management capability right into the HBase 
> engine using coprocessors, that is completely scalable with no 
> bottlenecks what-so-ever.  We have made this infrastructure very 
> efficient over time, e.g. reducing two-phase commit overhead for 
> single region transactions.  We have presented this at HBaseCon.
>
> The engine also supports secondary indexes.  However, because of our 
> Multi-dimensional Access Method patented technology the need to use a 
> secondary index is substantially reduced.  All DDL and index updates 
> are completely protected by ACID transactions.
>
> Probably because of our own inability to create excitement about the 
> project, and potentially other reasons, we could not get community 
> involvement as we were expecting.  That is why you may see that while 
> we are maintaining the code base and introducing enhancements to it, 
> much of our focus has shifted to the commercial product based on 
> Apache Trafodion, namely EsgynDB.  But if the community involvement 
> increases, we can certainly refresh Trafodion with some of the 
> additional functionality we have added on the HBase side of the product.
>
> But let me be clear.  We are about 150 employees at Esgyn with 40 or 
> so in the US, mostly in Milpitas, and the rest in Shanghai, Beijing, 
> and Guiyang.  We cannot sustain the company on service revenue alone.  
> You have seen companies that tried to do that have not been 
> successful, unless they have a way to leverage the open source project 
> for a different business model – enhanced capabilities, Cloud services, etc.
>
> To that end we have added to EsgynDB complete Disaster Recovery, 
> Point-in-Time, fuzzy Backup and Restore, Manageability via a Database 
> Manager, Multi-tenancy, and a large number of other capabilities for 
> High Availability scale-out production deployments.  EsgynDB also 
> provides full BI and Analytics capabilities, again because of our 
> heritage products supporting up to 250TB EDWs for HP and customers 
> like Walmart competing with Teradata, leveraging Apache ORC and 
> Parquet.  So yes, it can integrate with other storage engines as needed.
>
> However, in spite of all this, the pricing on EsgynDB is very 
> competitive – in other words “cheap” compared to anything else with 
> the same caliber of capabilities.
>
> We have demonstrated the capability of the product by running the 
> TPC-C and TPC-DS (all 99 queries) benchmarks, especially at high 
> concurrency which our product is especially well suited for, based on 
> its architecture and patents.  (The TPC-DS benchmarks are run on ORC 
> and Parquet for obvious
> reasons.)
>
> We just closed a couple of very large Core Banking deals in Guiyang 
> where we are replacing the entire Core Banking system for these banks 
> from their current Oracle implementations – where they were having 
> challenges scaling at a reasonable cost.  But we have many customers 
> both in the US and China that are using EsgynDB for operational, BI 
> and Analytics needs.  And now finally … OLTP.
>
> I know that this is sounding more like a commercial for Esgyn, but 
> that is not my intent.  I would like to make you aware of Apache 
> Trafodion as a solution to many of these issues that the community is 
> facing.  We will provide full support for Trafodion with community 
> involvement and hope that some of that involvement results in EsgynDB 
> revenue that we can sustain the company on 😊.  I would like to 
> encourage the community to look at Trafodion to address many of the concerns sighted below.
>
> “Allan Yang said that most of their customers want secondary index, 
> even more than SQL. And for global strong consistent secondary index, 
> we agree that the only safe way is to use transaction. Other 'local' 
> solutions will be in trouble when splitting/merging.”
>
> “We talked about Phoenix, the problem for Phoenix is well known: not 
> stable enough. We even had a user on the mailing-list said he/she will 
> never use Phoenix again.”
>
> “Some guys said that the current feature set for 3.0.0 is not good 
> enough to attract more users, especially for small companies. Only 
> internal improvements, no users visible features. SQL and secondary 
> index are very important.”
>
> “Then we came back to SQL again. Alibaba said that most of their 
> customers are migrating from old businesses, so they need 'full' SQL 
> support. That's why they need Phoenix. And lots of small companies 
> want to run OLAP queries directly on the database; they do not want 
> to use ETL. So maybe in the SQL proxy (planned above), we should 
> delegate OLAP queries to Spark SQL or something else, rather than 
> just rejecting them.”
>
> “And a Phoenix committer said that the Phoenix community is 
> currently re-evaluating its relationship with HBase, because when 
> upgrading to HBase 2.1.x, lots of things broke. They plan to 
> break the tie between Phoenix and HBase, which means Phoenix plans to 
> also run on other storage systems. Note: This was not at the meeting, 
> but personally I think this may be good news: since Phoenix will no 
> longer be HBase-only, we have more reason to introduce our own SQL layer.”
>
> Rohit Jain
> CTO
> Esgyn
>
>
>
> -----Original Message-----
> From: Stack <st...@duboce.net>
> Sent: Friday, July 26, 2019 12:01 PM
> To: HBase Dev List <de...@hbase.apache.org>
> Cc: hbase-user <us...@hbase.apache.org>
> Subject: Re: The note of the round table meeting after HBaseConAsia 
> 2019
>
>
>
> External
>
>
>
> Thanks for the thorough write-up Duo. Made for a good read....
>
> S
>
>
>
> On Fri, Jul 26, 2019 at 6:43 AM 张铎(Duo Zhang) <palomino219@gmail.com 
> <ma...@gmail.com>> wrote:
>
>
>
> > The conclusions of HBaseConAsia 2019 will be available later. And 
> > here are the notes of the round table meeting after the conference. 
> > A bit long...
>
> >
>
> > First we talked about splittable meta. At Xiaomi we have a cluster
> > which has nearly 200k regions; meta is easily overloaded and then
> > cannot recover. Anoop said we could try read replicas, but agreed that
> > read replicas cannot solve all the problems; in the end we still need
> > to split meta.
>
> >
>
> > Then we talked about SQL. Allan Yang said that most of their
> > customers want a secondary index, even more than SQL. And for a
> > globally strongly consistent secondary index, we agreed that the only
> > safe way is to use transactions. Other 'local' solutions will run
> > into trouble when splitting/merging.
> > Xiaomi has a global secondary index solution; open source it?
>
> >
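[Editor's note: a toy sketch of why a global secondary index needs transactions. This is not the HBase or Trafodion API; the store, names, and atomic-commit point are all illustrative assumptions. The data row and index row normally live in different regions, so without a transaction the two writes below can land independently and disagree.]

```python
# Toy model (not a real HBase API): a global secondary index kept
# consistent by committing the data write and the index write together.
class TinyTxnStore:
    """A toy key-value store where both mutations apply all-or-nothing."""

    def __init__(self):
        self.data = {}    # row key -> value   (the data table)
        self.index = {}   # value   -> row key (the global index table)

    def put_with_index(self, row, value):
        # Stage both mutations, then apply them at a single commit point.
        # In a real system the two tables are on different servers, so
        # this atomicity requires a distributed transaction.
        staged = [("data", row, value), ("index", value, row)]
        for table, key, val in staged:
            getattr(self, table)[key] = val

    def lookup_by_value(self, value):
        row = self.index.get(value)
        return (row, self.data.get(row))

store = TinyTxnStore()
store.put_with_index("user#42", "alice@example.com")
print(store.lookup_by_value("alice@example.com"))  # ('user#42', 'alice@example.com')
```

A non-transactional 'local' variant would break exactly when a region split moves one of the two rows mid-update.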
>
> > Then we came back to SQL. We talked about Phoenix; the problem with
> > Phoenix is well known: not stable enough. We even had a user on the
> > mailing list who said he/she will never use Phoenix again. Alibaba and
> > Huawei both have their own in-house SQL solutions, and Huawei also
> > talked about theirs at HBaseConAsia 2019; they will try to open
> > source it. And we could introduce a SQL proxy in the hbase-connectors
> > repo. No push-down support at first; all logic is done on the proxy
> > side and can be optimized later.
>
> >
>
> > Some guys said that the current feature set for 3.0.0 is not good
> > enough to attract more users, especially for small companies. Only
> > internal improvements, no user-visible features. SQL and secondary
> > index are very important.
>
> >
>
> > Yu Li talked about CCSMap; we still want it to be released in
> > 3.0.0. One problem is its relationship with in-memory compaction.
> > Theoretically they should have no conflicts, but in practice they do.
> > And the Xiaomi guys mentioned that in-memory compaction still has some
> > bugs: even in basic mode, the MVCC writePoint may get stuck and hang
> > the region server. And Jieshan Bi asked why not just use CCSMap to
> > replace CSLM. Yu Li said this is for better memory usage, since the
> > index and data can be placed together.
>
> >
>
> > Then we started to talk about HBase on cloud. For now, it is a bit
> > difficult to deploy HBase in the cloud, as we need to deploy ZooKeeper
> > and HDFS first. Then we talked about HBOSS and the WAL abstraction
> > (HBASE-20952). Wellington said HBOSS basically works; it uses s3a and
> > ZooKeeper to help simulate the semantics of HDFS. We could introduce
> > our own 'FileSystem' interface, not the Hadoop one, and we could
> > remove the 'atomic renaming' dependency so that 'FileSystem'
> > implementations will be easier. And on the WAL abstraction, Wellington
> > said there are still some guys working on it, but for now they are
> > focused on patching Ratis, rather than abstracting the WAL system
> > first. We agreed that a better way is to abstract the WAL system at a
> > level higher than FileSystem, so maybe we could even use Kafka to
> > store the WAL.
>
> >
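[Editor's note: a hypothetical sketch of what "abstracting the WAL above the FileSystem level" could look like. The interface below is invented for illustration; it is not the HBASE-20952 API. The point is that a log is just an append/sync/replay contract, so a backend could target HDFS, an object store, or Kafka without any rename semantics.]

```python
# Hypothetical WAL abstraction above the FileSystem level (illustrative
# only; not the real HBase API).
from abc import ABC, abstractmethod

class WriteAheadLog(ABC):
    @abstractmethod
    def append(self, record: bytes) -> int:
        """Append a record; return its sequence id."""

    @abstractmethod
    def sync(self, seq_id: int) -> None:
        """Block until records up to seq_id are durable."""

    @abstractmethod
    def replay(self):
        """Yield (seq_id, record) pairs for recovery."""

class InMemoryWAL(WriteAheadLog):
    """Stand-in backend; real providers could target HDFS, S3, or Kafka."""
    def __init__(self):
        self._records = []

    def append(self, record: bytes) -> int:
        self._records.append(record)
        return len(self._records) - 1

    def sync(self, seq_id: int) -> None:
        pass  # nothing to flush for the in-memory stand-in

    def replay(self):
        return enumerate(self._records)

wal = InMemoryWAL()
seq = wal.append(b"put row1")
wal.sync(seq)
print(list(wal.replay()))  # [(0, b'put row1')]
```

Because the contract never mentions files or renames, an object-store or Kafka-backed provider needs no HBOSS-style locking layer underneath it.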
>
> > Then we talked about the use of FPGAs for compaction at Alibaba.
> > Jieshan Bi said that at Huawei they offload compaction to the storage
> > layer. For an open source solution, maybe we could offload compaction
> > to Spark, and then use something like bulkload to let the region
> > server load the new HFiles. The problem with doing compaction inside
> > the region server is the CPU cost and GC pressure: we need to scan
> > every cell, so the CPU cost is high. Yu Li talked about their
> > page-based compaction in the Flink state store; maybe it could also
> > benefit HBase.
>
> >
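[Editor's note: at its core, offloaded compaction is a k-way merge of sorted files done outside the region server, with the result bulk-loaded back in. A minimal sketch of the merge step, using toy (key, timestamp, value) runs instead of real HFiles; real compaction also honors deletes, TTLs, and max versions.]

```python
# Toy sketch of external compaction: merge sorted runs outside the
# server, keeping only the newest version of each key.
import heapq

def compact(runs):
    """runs: lists of (key, ts, value), each sorted by key asc, ts desc."""
    merged = heapq.merge(*runs, key=lambda kv: (kv[0], -kv[1]))
    out, last_key = [], None
    for key, ts, value in merged:
        if key != last_key:      # first entry per key is the newest version
            out.append((key, ts, value))
            last_key = key
    return out

run_a = [("row1", 5, "new"), ("row3", 1, "c")]
run_b = [("row1", 2, "old"), ("row2", 4, "b")]
print(compact([run_a, run_b]))
# [('row1', 5, 'new'), ('row2', 4, 'b'), ('row3', 1, 'c')]
```

Running this merge in Spark executors instead of the region server moves the cell-scanning CPU cost and the garbage it generates off the serving path.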
>
> > Then it was time for MOB. Huawei said MOB cannot solve their problem:
> > we still need to read the data through RPC, and it also puts pressure
> > on the memstore, since the memstore is still quite small compared to
> > MOB cells. And we will also flush a lot even though there are only a
> > small number of MOB cells in the memstore, so we still need to compact
> > a lot. So maybe the suitable scenario for MOB is one where most of
> > your data is still small and a small fraction of the data is a bit
> > larger; there MOB can improve performance, and users do not need
> > another system to store the larger data.
> > Huawei said that they implement the logic on the client side: if the
> > data is larger than a threshold, the client goes to another storage
> > system rather than HBase.
> > Alibaba said that if we want to support large blobs, we need to
> > introduce a streaming API.
> > And Kuaishou said that they do not use MOB; they just store the data
> > on HDFS and the index in HBase, the typical solution.
>
> >
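[Editor's note: the client-side routing Huawei described, and the HDFS-pointer pattern Kuaishou uses, can be sketched as follows. Plain dicts stand in for HBase and the blob store, and the 1 KB threshold is an arbitrary illustrative value, not anything from the meeting.]

```python
# Toy sketch of client-side large-value routing: values above a threshold
# go to a blob store, and the "HBase" table keeps only a pointer.
BLOB_THRESHOLD = 1024  # bytes (hypothetical cutoff)

hbase = {}       # row -> ("val", small value) or ("ptr", blob id)
blob_store = {}  # blob id -> large value

def put(row, value: bytes):
    if len(value) > BLOB_THRESHOLD:
        blob_id = f"blob/{row}"
        blob_store[blob_id] = value
        hbase[row] = ("ptr", blob_id)   # HBase holds only the pointer
    else:
        hbase[row] = ("val", value)

def get(row) -> bytes:
    kind, payload = hbase[row]
    return blob_store[payload] if kind == "ptr" else payload

put("r1", b"small")
put("r2", b"x" * 4096)
print(get("r1"), len(get("r2")))  # b'small' 4096
```

The appeal is that large values never touch the memstore at all, so flush and compaction volume stays proportional to the small-value workload.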
>
> > Then we talked about which company will host next year's
> > HBaseConAsia. It will be Tencent or Huawei, or both, probably in
> > Shenzhen. And since there is no HBaseCon in America any more (it is
> > now called 'NoSQL Day'), maybe next year we could just call the
> > conference HBaseCon.
>
> >
>
> > Then we came back to SQL again. Alibaba said that most of their
> > customers are migrating from old businesses, so they need 'full' SQL
> > support. That's why they need Phoenix. And lots of small companies
> > want to run OLAP queries directly on the database; they do not want
> > to use ETL. So maybe in the SQL proxy (planned above), we should
> > delegate OLAP queries to Spark SQL or something else, rather than
> > just rejecting them.
>
> >
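[Editor's note: delegation in such a proxy could start as simple statement classification and routing. The keyword check below is a crude hypothetical stand-in for a real parser/planner, sketched only to show the shape of "delegate instead of reject".]

```python
# Toy sketch of OLAP delegation in a SQL proxy: point/range lookups are
# served on the HBase path, while analytic queries are handed to a
# Spark-SQL-like engine instead of being rejected outright.
OLAP_HINTS = ("GROUP BY", "JOIN", "ORDER BY", "SUM(", "COUNT(", "AVG(")

def route(sql: str) -> str:
    upper = sql.upper()
    if any(hint in upper for hint in OLAP_HINTS):
        return "spark"   # delegate analytic work to the OLAP engine
    return "hbase"       # serve simple lookups directly

print(route("SELECT v FROM t WHERE k = 'row1'"))                   # hbase
print(route("SELECT region, SUM(amount) FROM t GROUP BY region"))  # spark
```

A real proxy would classify on the parsed plan, not raw text, but the routing decision itself can stay this simple with no push-down at first.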
>
> > And a Phoenix committer said that the Phoenix community is currently
> > re-evaluating its relationship with HBase, because when upgrading to
> > HBase 2.1.x, lots of things broke. They plan to break the tie
> > between Phoenix and HBase, which means Phoenix plans to also run on
> > other storage systems.
> > Note: This was not at the meeting, but personally I think this may be
> > good news: since Phoenix will no longer be HBase-only, we have more
> > reason to introduce our own SQL layer.
>
> >
>
> > Then we talked about Kudu. It is faster than HBase on scans. If we
> > want to increase scan performance, we should have a larger block
> > size, but this leads to slower random reads, so there is a trade-off.
> > The Kuaishou guys asked whether HBase could support storing HFiles in
> > a columnar format. The answer is no; as said above, it would slow
> > random reads.
> > But we could learn from what Google has done in Bigtable. We could
> > write a copy of the data in Parquet format to another FileSystem, and
> > users could just scan the Parquet files for better analytics
> > performance. And if they want the newest data, they could ask HBase
> > for it, and that portion should be small. This is more like a
> > solution than a feature; more than HBase is involved. But at least we
> > could introduce some APIs in HBase so users can build the solution in
> > their own environment. And if you do not care about the newest data,
> > you could also use replication to replicate the data to ES or other
> > systems, and search there.
>
> >
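[Editor's note: the read pattern implied by that Bigtable-style setup is "bulk columnar snapshot plus a small fresh delta". A toy model, with dicts standing in for the Parquet export and for HBase; nothing here is a real API.]

```python
# Toy sketch of the "Parquet snapshot + HBase delta" read pattern: a bulk
# columnar snapshot serves the analytic scan, and recent writes still in
# HBase override it.
parquet_snapshot = {"row1": 10, "row2": 20}   # periodic columnar export
hbase_delta = {"row2": 25, "row3": 30}        # writes since the export

def scan_with_freshness():
    merged = dict(parquet_snapshot)
    merged.update(hbase_delta)   # the small fresh portion wins
    return merged

print(scan_with_freshness())  # {'row1': 10, 'row2': 25, 'row3': 30}
```

The analytic engine pays columnar-scan prices for the bulk of the data, and only the delta lookup touches HBase.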
>
> > And Didi talked about their problems using HBase. They use Kylin, so
> > they also have lots of regions, so meta is also a problem for them.
> > And the pressure on ZooKeeper is also a problem, as the replication
> > queues are stored on ZK. After 2.1, ZooKeeper is only used as
> > external storage in the replication implementation, so it is possible
> > to switch to other storage, such as etcd. But it is still a bit
> > difficult to store the data in a system table: as it stands we need
> > to start the replication system before the WAL system, but if we want
> > to store the replication data in an HBase table, obviously the WAL
> > system must be started before the replication system, as we need the
> > region of the system table online first, and opening it writes an
> > open marker to the WAL. We need to find a way to break the deadlock.
> > And they also mentioned that the rsgroup feature creates big znodes
> > on ZooKeeper, as they have lots of tables. We have HBASE-22514, which
> > aims to solve the problem.
>
> > And last, they shared their experience upgrading from 0.98 to 1.4.x.
> > The versions should be compatible, but in practice there were
> > problems. They agreed to post a blog about this.
>
> >
>
> > And the Flipkart guys said they will open source their test suite,
> > which focuses on consistency (Jepsen?). This is good news; hopefully
> > we will have another useful tool besides ITBLL.
>
> >
>
> > That's all. Thanks for reading.
>
> >
>


--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's decrepit hands
   - A23, Crosstalk

Re: The note of the round table meeting after HBaseConAsia 2019

Posted by Andrew Purtell <ap...@apache.org>.
This is great, but in the future please refrain from borderline marketing
of a commercial product on these lists. This is not the appropriate venue
for that.

It is especially poor form to dump on a fellow open source project, as you
claim to be. This I think is the tell behind the commercial motivation.

Also I should point out, being pretty familiar with Phoenix in operation
where I work, and from my interactions with various Phoenix committers and
PMC members, that the particular group of HBasers at that meeting appeared
to share a negative view - which I will not comment on; they are entitled
to their opinions, and more choice in SQL access to HBase is good! - that
should not be claimed to be universal or even representative.



On Thu, Aug 8, 2019 at 9:42 AM Rohit Jain <ro...@esgyn.com> wrote:

> Hi folks,
>
> This is a nice write-up of the round-table meeting at HBaseConAsia.  I
> would like to address the points I have pulled out from write-up (at the
> bottom of this message).
>
> Many in the HBase community may not be aware that besides Apache Phoenix,
> there has been a project called Apache Trafodion, contributed by
> Hewlett-Packard in 2015 that has now been top-level project for a while.
> Apache Trafodion is essentially technology from Tandem-Compaq-HP that
> started its OLTP / Operational journey as NonStop SQL effectively in the
> early 1990s.  Granted it is a C++ project, but it has 170+ patents as part
> of it that were contributed to Apache.  These are capabilities that still
> don’t exist in other databases.
>
> It is a full-fledged SQL relational database engine with the breadth of
> ANSI SQL support, including OLAP functions mentioned, and including many de
> facto standard functions from databases like Oracle.  You can go to the
> Apache Trafodion wiki to see the documentation as to what all is supported
> by Trafodion.
>
> When we introduced Apache Trafodion, we implemented a completely
> distributed transaction management capability right into the HBase engine
> using coprocessors, that is completely scalable with no bottlenecks
> what-so-ever.  We have made this infrastructure very efficient over time,
> e.g. reducing two-phase commit overhead for single region transactions.  We
> have presented this at HBaseCon.
>
> The engine also supports secondary indexes.  However, because of our
> Multi-dimensional Access Method patented technology the need to use a
> secondary index is substantially reduced.  All DDL and index updates are
> completely protected by ACID transactions.
>
> Probably because of our own inability to create excitement about the
> project, and potentially other reasons, we could not get community
> involvement as we were expecting.  That is why you may see that while we
> are maintaining the code base and introducing enhancements to it, much of
> our focus has shifted to the commercial product based on Apache Trafodion,
> namely EsgynDB.  But if the community involvement increases, we can
> certainly refresh Trafodion with some of the additional functionality we
> have added on the HBase side of the product.
>
> But let me be clear.  We are about 150 employees at Esgyn with 40 or so in
> the US, mostly in Milpitas, and the rest in Shanghai, Beijing, and
> Guiyang.  We cannot sustain the company on service revenue alone.  You have
> seen companies that tried to do that have not been successful, unless they
> have a way to leverage the open source project for a different business
> model – enhanced capabilities, Cloud services, etc.
>
> To that end we have added to EsgynDB complete Disaster Recovery,
> Point-in-Time, fuzzy Backup and Restore, Manageability via a Database
> Manager, Multi-tenancy, and a large number of other capabilities for High
> Availability scale-out production deployments.  EsgynDB also provides full
> BI and Analytics capabilities, again because of our heritage products
> supporting up to 250TB EDWs for HP and customers like Walmart competing
> with Teradata, leveraging Apache ORC and Parquet.  So yes, it can integrate
> with other storage engines as needed.
>
> However, in spite of all this, the pricing on EsgynDB is very competitive
> – in other words “cheap” compared to anything else with the same caliber of
> capabilities.
>
> We have demonstrated the capability of the product by running the TPC-C
> and TPC-DS (all 99 queries) benchmarks, especially at high concurrency
> which our product is especially well suited for, based on its architecture
> and patents.  (The TPC-DS benchmarks are run on ORC and Parquet for obvious
> reasons.)
>
> We just closed a couple of very large Core Banking deals in Guiyang where
> we are replacing the entire Core Banking system for these banks from their
> current Oracle implementations – where they were having challenges scaling
> at a reasonable cost.  But we have many customers both in the US and China
> that are using EsgynDB for operational, BI and Analytics needs.  And now
> finally … OLTP.
>
> I know that this is sounding more like a commercial for Esgyn, but that is
> not my intent.  I would like to make you aware of Apache Trafodion as a
> solution to many of these issues that the community is facing.  We will
> provide full support for Trafodion with community involvement and hope that
> some of that involvement results in EsgynDB revenue that we can sustain the
> company on 😊.  I would like to encourage the community to look at
> Trafodion to address many of the concerns sighted below.
>
> “Allan Yang said that most of their customers want secondary index, even
> more than SQL. And for global strong consistent secondary index, we agree
> that the only safe way is to use transaction. Other 'local' solutions will
> be in trouble when splitting/merging.”
>
> “We talked about Phoenix, the problem for Phoenix is well known: not
> stable enough. We even had a user on the mailing-list said he/she will
> never use Phoenix again.”
>
> “Some guys said that the current feature set for 3.0.0 is not good enough
> to attract more users, especially for small companies. Only internal
> improvements, no users visible features. SQL and secondary index are very
> important.”
>
> “Then we back to SQL again. Alibaba said that most of their customers are
> migrate from old business, so they need 'full' SQL support. That's why they
> need Phoenix. And lots of small companies wants to run OLAP queries
> directly on the database, they do no want to use ETL. So maybe in the SQL
> proxy (planned above), we should delegate the OLAP queries to spark SQL or
> something else, rather than just rejecting them.”
>
> “And a Phoenix committer said that, the Phoenix community are currently
> re-evaluate the relationship with HBase, because when upgrading to HBase
> 2.1.x, lots of things are broken. They plan to break the tie between
> Phoenix and HBase, which means Phoenix plans to also run on other storage
> systems. Note: This is not on the meeting but personally, I think this
> maybe a good news, since Phoenix is not HBase only, we have more reasons to
> introduce our own SQL layer.”
>
> Rohit Jain
> CTO
> Esgyn
>
>
>
> -----Original Message-----
> From: Stack <st...@duboce.net>
> Sent: Friday, July 26, 2019 12:01 PM
> To: HBase Dev List <de...@hbase.apache.org>
> Cc: hbase-user <us...@hbase.apache.org>
> Subject: Re: The note of the round table meeting after HBaseConAsia 2019
>
>
>
> External
>
>
>
> Thanks for the thorough write-up Duo. Made for a good read....
>
> S
>
>
>
> On Fri, Jul 26, 2019 at 6:43 AM 张铎(Duo Zhang) <palomino219@gmail.com
> <ma...@gmail.com>> wrote:
>
>
>
> > The conclusion of the HBaseConAsia 2019 will be available later. And
>
> > here is the note of the round table meeting after the conference. A bit
> long...
>
> >
>
> > First we talked about splittable meta. At Xiaomi we have a cluster
>
> > which has nearly 200k regions and meta is very easy to overload and
>
> > can not recover. Anoop said we can try read replica, but agreed that
>
> > read replica can not solve all the problems, finally we still need to
> split meta.
>
> >
>
> > Then we talked about SQL. Allan Yang said that most of their customers
>
> > want secondary index, even more than SQL. And for global strong
>
> > consistent secondary index, we agree that the only safe way is to use
> transaction.
>
> > Other 'local' solutions will be in trouble when splitting/merging.
>
> > Xiaomi has an global secondary index solution, open source it?
>
> >
>
> > Then we back to SQL. We talked about Phoenix, the problem for Phoenix
>
> > is well known: not stable enough. We even had a user on the
>
> > mailing-list said he/she will never use Phoenix again. Alibaba and
>
> > Huawei both have their in-house SQL solution, and Huawei also talked
>
> > about it on HBaseConAsia 2019, they will try to open source it. And we
>
> > could introduce a SQL proxy in hbase-connector repo. No push down
>
> > support first, all logics are done at the proxy side, can optimize later.
>
> >
>
> > Some guys said that the current feature set for 3.0.0 is not good
>
> > enough to attract more users, especially for small companies. Only
>
> > internal improvements, no users visible features. SQL and secondary
>
> > index are very important.
>
> >
>
> > Yu Li talked about the CCSMap, we still want it to be release in
>
> > 3.0.0. One problem is the relationship with in memory compaction.
>
> > Theoretically they should have no conflicts but actually they have.
>
> > And Xiaomi guys mentioned that in memory compaction still has some
>
> > bugs, even for basic mode, the MVCC writePoint may be stuck and hang
>
> > the region server. And Jieshan Bi asked why not just use CCSMap to
>
> > replace CSLM. Yu Li said this is for better memory usage, the index and
> data could be placed together.
>
> >
>
> > Then we started to talk about the HBase on cloud. For now, it is a bit
>
> > difficult to deploy HBase on cloud as we need to deploy zookeeper and
>
> > HDFS first. Then we talked about the HBOSS and WAL
> abstraction(HBASE-209520.
>
> > Wellington said the HBOSS basicly works, it use s3a and zookeeper to
>
> > help simulating the operations of HDFS. We could introduce our own
> 'FileSystem'
>
> > interface, not the hadoop one, and we could remove the 'atomic renaming'
>
> > dependency so the 'FileSystem' implementation will be easier. And on
>
> > the WAL abstraction, Wellington said there are still some guys working
>
> > it, but now they focus on patching ratis, rather than abstracting the
>
> > WAL system first. We agreed that a better way is to abstract WAL
>
> > system at a level higher than FileSystem. so maybe we could even use
> Kafka to store the WAL.
>
> >
>
> > Then we talked about the FPGA usage for compaction at Alibaba. Jieshan
>
> > Bi said that in Huawei they offload the compaction to storage layer.
>
> > For open source solution, maybe we could offload the compaction to
>
> > spark, and then use something like bulkload to let region server load
>
> > the new HFiles. The problem for doing compaction inside region server
>
> > is the CPU cost and GC pressure. We need to scan every cell so the CPU
>
> > cost is high. Yu Li talked about their page based compaction in flink
>
> > state store, maybe it could also benefit HBase.
>
> >
>
> > Then it is the time for MOB. Huawei said MOD can not solve their problem.
>
> > We still need to read the data through RPC, and it will also introduce
>
> > pressures on the memstore, since the memstore is still a bit small,
>
> > comparing to MOB cell. And we will also flush a lot although there are
>
> > only a small number of MOB cells in the memstore, so we still need to
>
> > compact a lot. So maybe the suitable scenario for using MOB is that,
>
> > most of your data are still small, and a small amount of the data are
>
> > a bit larger, where MOD could increase the performance, and users do
>
> > not need to use another system to store the larger data.
>
> > Huawei said that they implement the logic at client side. If the data
>
> > is larger than a threshold, the client will go to another storage
>
> > system rather than HBase.
>
> > Alibaba said that if we want to support large blob, we need to
>
> > introduce streaming API.
>
> > And Kuaishou said that they do not use MOB, they just store data on
>
> > HDFS and the index in HBase, typical solution.
>
> >
>
> > Then we talked about which company to host the next year's
>
> > HBaseConAsia. It will be Tencent or Huawei, or both, probably in
>
> > Shenzhen. And since there is no HBaseCon in America any more(it is
>
> > called 'NoSQL Day'), maybe next year we could just call the conference
> HBaseCon.
>
> >
>
> > Then we back to SQL again. Alibaba said that most of their customers
>
> > are migrate from old business, so they need 'full' SQL support. That's
>
> > why they need Phoenix. And lots of small companies wants to run OLAP
>
> > queries directly on the database, they do no want to use ETL. So maybe
>
> > in the SQL proxy(planned above), we should delegate the OLAP queries
>
> > to spark SQL or something else, rather than just rejecting them.
>
> >
>
> > And a Phoenix committer said that, the Phoenix community are currently
>
> > re-evaluate the relationship with HBase, because when upgrading to
>
> > HBase 2.1.x, lots of things are broken. They plan to break the tie
>
> > between Phoenix and HBase, which means Phoenix plans to also run on
>
> > other storage systems.
>
> > Note: This is not on the meeting but personally, I think this maybe a
>
> > good news, since Phoenix is not HBase only, we have more reasons to
>
> > introduce our own SQL layer.
>
> >
>
> > Then we talked about Kudu. It is faster than HBase on scan. If we want
>
> > to increase the performance on scan, we should have larger block size,
>
> > but this will lead to a slower random read, so we need to trade-off.
>
> > The Kuaishou guys asked whether HBase could support storing HFile in
>
> > columnar format. The answer is no, as said above, it will slow random
> read.
>
> > But we could learn what google done in bigtable. We could write a copy
>
> > of the data in parquet format to another FileSystem, and user could
>
> > just scan the parquet file for better analysis performance. And if
>
> > they want the newest data, they could ask HBase for the newest data,
>
> > and it should be small. This is more like a solution, not only HBase
>
> > is involved. But at least we could introduce some APIs in HBase so
>
> > users can build the solution in their own environment. And if you do
>
> > not care the newest data, you could also use replication to replicate
>
> > the data to ES or other systems, and search there.
>
> >
>
> > And Didi talked about their problems using HBase. They use kylin so
>
> > they also have lots of regions, so meta is also a problem for them.
>
> > And the pressure on zookeeper is also a problem, as the replication
>
> > queues are stored on zk. And after 2.1, zookeeper is only used as an
>
> > external storage in replication implementation, so it is possible to
>
> > switch to other storages, such as etcd. But it is still a bit
>
> > difficult to store the data in a system table, as now we need to start
>
> > the replication system before WAL system, but  if we want to store the
>
> > replication data in a hbase table, obviously the WAL system must be
>
> > started before replication system, as we need the region of the system
>
> > online first, and it will write an open marker to WAL. We need to find a
> way to break the dead lock.
>
> > And they also mentioned that, the rsgroup feature also makes big znode
>
> > on zookeeper, as they have lots of tables. We have HBASE-22514 which
>
> > aims to solve the problem.
>
> > And last, they shared their experience when upgrading from 0.98 to 1.4.x.
>
> > they should be compatible but actually there are problems. They agreed
>
> > to post a blog about this.
>
> >
>
> > And the Flipkart guys said they will open source their test-suite,
>
> > which focus on the consistency(Jepsen?). This is a good news, hope we
>
> > could have another useful tool other than ITBLL.
>
> >
>
> > That's all. Thanks for reading.
>
> >
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk

Re: The note of the round table meeting after HBaseConAsia 2019

Posted by Andrew Purtell <ap...@apache.org>.
This is great, but in the future please refrain from borderline marketing
of a commercial product on these lists. This is not the appropriate venue
for that.

It is especially poor form to dump on a fellow open source project, as you
claim to be. This I think is the tell behind the commercial motivation.

Also I should point out, being pretty familiar with Phoenix in operation
where I work, and in my interactions with various Phoenix committers and
PMC, that the particular group of HBasers in that group appeared to share a
negative view - which I will not comment on, they are entitled to their
opinions, and more choice in SQL access to HBase is good! - that should not
be claimed to be universal or even representative.



On Thu, Aug 8, 2019 at 9:42 AM Rohit Jain <ro...@esgyn.com> wrote:

> Hi folks,
>
> This is a nice write-up of the round-table meeting at HBaseConAsia.  I
> would like to address the points I have pulled out from write-up (at the
> bottom of this message).
>
> Many in the HBase community may not be aware that besides Apache Phoenix,
> there has been a project called Apache Trafodion, contributed by
> Hewlett-Packard in 2015 that has now been top-level project for a while.
> Apache Trafodion is essentially technology from Tandem-Compaq-HP that
> started its OLTP / Operational journey as NonStop SQL effectively in the
> early 1990s.  Granted it is a C++ project, but it has 170+ patents as part
> of it that were contributed to Apache.  These are capabilities that still
> don’t exist in other databases.
>
> It is a full-fledged SQL relational database engine with the breadth of
> ANSI SQL support, including OLAP functions mentioned, and including many de
> facto standard functions from databases like Oracle.  You can go to the
> Apache Trafodion wiki to see the documentation as to what all is supported
> by Trafodion.
>
> When we introduced Apache Trafodion, we implemented a completely
> distributed transaction management capability right into the HBase engine
> using coprocessors, that is completely scalable with no bottlenecks
> what-so-ever.  We have made this infrastructure very efficient over time,
> e.g. reducing two-phase commit overhead for single region transactions.  We
> have presented this at HBaseCon.
>
> The engine also supports secondary indexes.  However, because of our
> Multi-dimensional Access Method patented technology the need to use a
> secondary index is substantially reduced.  All DDL and index updates are
> completely protected by ACID transactions.
>
> Probably because of our own inability to create excitement about the
> project, and potentially other reasons, we could not get community
> involvement as we were expecting.  That is why you may see that while we
> are maintaining the code base and introducing enhancements to it, much of
> our focus has shifted to the commercial product based on Apache Trafodion,
> namely EsgynDB.  But if the community involvement increases, we can
> certainly refresh Trafodion with some of the additional functionality we
> have added on the HBase side of the product.
>
> But let me be clear.  We are about 150 employees at Esgyn with 40 or so in
> the US, mostly in Milpitas, and the rest in Shanghai, Beijing, and
> Guiyang.  We cannot sustain the company on service revenue alone.  You have
> seen companies that tried to do that have not been successful, unless they
> have a way to leverage the open source project for a different business
> model – enhanced capabilities, Cloud services, etc.
>
> To that end we have added to EsgynDB complete Disaster Recovery,
> Point-in-Time, fuzzy Backup and Restore, Manageability via a Database
> Manager, Multi-tenancy, and a large number of other capabilities for High
> Availability scale-out production deployments.  EsgynDB also provides full
> BI and Analytics capabilities, again because of our heritage products
> supporting up to 250TB EDWs for HP and customers like Walmart competing
> with Teradata, leveraging Apache ORC and Parquet.  So yes, it can integrate
> with other storage engines as needed.
>
> However, in spite of all this, the pricing on EsgynDB is very competitive
> – in other words “cheap” compared to anything else with the same caliber of
> capabilities.
>
> We have demonstrated the capability of the product by running the TPC-C
> and TPC-DS (all 99 queries) benchmarks, especially at high concurrency
> which our product is especially well suited for, based on its architecture
> and patents.  (The TPC-DS benchmarks are run on ORC and Parquet for obvious
> reasons.)
>
> We just closed a couple of very large Core Banking deals in Guiyang where
> we are replacing the entire Core Banking system for these banks from their
> current Oracle implementations – where they were having challenges scaling
> at a reasonable cost.  But we have many customers both in the US and China
> that are using EsgynDB for operational, BI and Analytics needs.  And now
> finally … OLTP.
>
> I know this sounds more like a commercial for Esgyn, but that is
> not my intent.  I would like to make you aware of Apache Trafodion as a
> solution to many of the issues the community is facing.  We will
> provide full support for Trafodion with community involvement, and hope that
> some of that involvement results in EsgynDB revenue that we can sustain the
> company on 😊.  I would like to encourage the community to look at
> Trafodion to address many of the concerns cited below.
>
> “Allan Yang said that most of their customers want secondary indexes, even
> more than SQL. And for a globally strongly consistent secondary index, we
> agree that the only safe way is to use transactions. Other 'local'
> solutions will be in trouble when splitting/merging.”
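To illustrate why the transactional route is the safe one, here is a toy sketch (plain in-memory Python dicts, not HBase or Trafodion APIs; every name is illustrative): the data write and both index mutations succeed or roll back as one unit, so the global index can never point at a stale row.

```python
class TxnStore:
    """Toy model: a data table plus a global secondary index kept consistent."""

    def __init__(self):
        self.data = {}    # row_key -> value
        self.index = {}   # value -> row_key (the global secondary index)

    def put(self, row_key, value):
        """Apply the data write and both index mutations as one atomic unit.

        Without a transaction, a failure between the two writes leaves the
        index pointing at a stale (or missing) row -- the inconsistency the
        discussion says only transactions can rule out.
        """
        staged = []                              # (table, key, old) undo log
        try:
            old = self.data.get(row_key)
            staged.append(('data', row_key, old))
            self.data[row_key] = value
            if old is not None:                  # drop the stale index entry
                staged.append(('index', old, self.index.get(old)))
                self.index.pop(old, None)
            staged.append(('index', value, self.index.get(value)))
            self.index[value] = row_key
        except Exception:
            for table, key, prev in reversed(staged):   # roll back everything
                target = self.data if table == 'data' else self.index
                if prev is None:
                    target.pop(key, None)
                else:
                    target[key] = prev
            raise

    def lookup(self, value):
        """Index read: find the row key holding a given value."""
        return self.index.get(value)
```

A real system does this with a distributed transaction over the data and index regions; the point of the sketch is only that both tables change, or neither does.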
>
> “We talked about Phoenix. The problem with Phoenix is well known: it is
> not stable enough. We even had a user on the mailing list say he/she will
> never use Phoenix again.”
>
> “Some guys said that the current feature set for 3.0.0 is not good enough
> to attract more users, especially for small companies. Only internal
> improvements, no user-visible features. SQL and secondary indexes are very
> important.”
>
> “Then we came back to SQL again. Alibaba said that most of their customers
> are migrating from old businesses, so they need 'full' SQL support. That's
> why they need Phoenix. And lots of small companies want to run OLAP queries
> directly on the database; they do not want to use ETL. So maybe in the SQL
> proxy (planned above), we should delegate the OLAP queries to Spark SQL or
> something else, rather than just rejecting them.”
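As a rough sketch of that routing idea (a hypothetical classifier, not any existing proxy; the keyword list is a naive stand-in for what a real planner would decide from the query plan): point lookups stay on the key-value path, while statements that look analytical are delegated instead of rejected.

```python
import re

# Naive OLAP detector: aggregates, grouping, joins and window functions
# suggest an analytical query. A real proxy would inspect the parsed plan,
# not raw text; this is only to show the routing decision.
OLAP_HINTS = re.compile(
    r'\b(group\s+by|join|sum|avg|count|min|max|window|over)\b',
    re.IGNORECASE)

def route(sql: str) -> str:
    """Return the backend a statement should be delegated to."""
    return 'olap-engine' if OLAP_HINTS.search(sql) else 'kv-engine'
```

So a `SELECT ... WHERE k = ...` point read goes straight to the store, while a `GROUP BY` aggregate is handed off to something like Spark SQL.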
>
> “And a Phoenix committer said that the Phoenix community is currently
> re-evaluating its relationship with HBase, because when upgrading to HBase
> 2.1.x, lots of things broke. They plan to break the tie between Phoenix
> and HBase, which means Phoenix plans to also run on other storage systems.
> Note: This was not discussed at the meeting, but personally I think this
> may be good news: if Phoenix is no longer HBase-only, we have more reasons
> to introduce our own SQL layer.”
>
> Rohit Jain
> CTO
> Esgyn
>
>
>
> -----Original Message-----
> From: Stack <st...@duboce.net>
> Sent: Friday, July 26, 2019 12:01 PM
> To: HBase Dev List <de...@hbase.apache.org>
> Cc: hbase-user <us...@hbase.apache.org>
> Subject: Re: The note of the round table meeting after HBaseConAsia 2019
>
> Thanks for the thorough write-up Duo. Made for a good read....
>
> S
>
>
>
> On Fri, Jul 26, 2019 at 6:43 AM 张铎 (Duo Zhang) <palomino219@gmail.com> wrote:
>
>
>
> > The conclusion of the HBaseConAsia 2019 will be available later. And
>
> > here is the note of the round table meeting after the conference. A bit
> long...
>
> >
>
> > First we talked about splittable meta. At Xiaomi we have a cluster
> > which has nearly 200k regions, where meta is easily overloaded and
> > cannot recover. Anoop said we could try read replicas, but agreed that
> > read replicas cannot solve all the problems; ultimately we still need
> > to split meta.
>
> >
>
> > Then we talked about SQL. Allan Yang said that most of their customers
> > want secondary indexes, even more than SQL. And for a globally strongly
> > consistent secondary index, we agree that the only safe way is to use
> > transactions. Other 'local' solutions will be in trouble when
> > splitting/merging. Xiaomi has a global secondary index solution; should
> > it be open sourced?
>
> >
>
> > Then we came back to SQL. We talked about Phoenix; the problem with
> > Phoenix is well known: it is not stable enough. We even had a user on
> > the mailing list say he/she will never use Phoenix again. Alibaba and
> > Huawei both have their own in-house SQL solutions, and Huawei also
> > talked about theirs at HBaseConAsia 2019; they will try to open source
> > it. And we could introduce a SQL proxy in the hbase-connector repo. No
> > push-down support at first; all logic is done on the proxy side and can
> > be optimized later.
>
> >
>
> > Some guys said that the current feature set for 3.0.0 is not good
> > enough to attract more users, especially for small companies. Only
> > internal improvements, no user-visible features. SQL and secondary
> > indexes are very important.
>
> >
>
> > Yu Li talked about CCSMap; we still want it to be released in 3.0.0.
> > One problem is its relationship with in-memory compaction.
> > Theoretically they should not conflict, but in practice they do. And
> > the Xiaomi guys mentioned that in-memory compaction still has some
> > bugs: even in basic mode, the MVCC writePoint may get stuck and hang
> > the region server. And Jieshan Bi asked why not just use CCSMap to
> > replace CSLM. Yu Li said this is for better memory usage, as the index
> > and data could be placed together.
>
> >
>
> > Then we started to talk about HBase on the cloud. For now, it is a bit
> > difficult to deploy HBase on the cloud, as we need to deploy ZooKeeper
> > and HDFS first. Then we talked about HBOSS and the WAL abstraction
> > (HBASE-20952). Wellington said that HBOSS basically works; it uses S3A
> > and ZooKeeper to help simulate the operations of HDFS. We could
> > introduce our own 'FileSystem' interface, not the Hadoop one, and we
> > could remove the 'atomic renaming' dependency so the 'FileSystem'
> > implementation will be easier. And on the WAL abstraction, Wellington
> > said there are still some guys working on it, but for now they focus on
> > patching Ratis rather than abstracting the WAL system first. We agreed
> > that a better way is to abstract the WAL system at a level higher than
> > FileSystem, so maybe we could even use Kafka to store the WAL.
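A minimal sketch of what such an abstraction could look like (illustrative Python, not any actual HBase API; the in-memory class stands in for a Kafka-, HDFS- or Ratis-backed implementation): if the region server programs against append/sync/replay instead of file operations, the backing log becomes pluggable.

```python
from abc import ABC, abstractmethod

class WalProvider(ABC):
    """WAL abstraction above the file system: append edits, establish a
    durability barrier, and replay on recovery. Any backend that offers an
    ordered durable log can implement this."""

    @abstractmethod
    def append(self, edit: bytes) -> int: ...   # returns a sequence id

    @abstractmethod
    def sync(self) -> None: ...                 # durability barrier

    @abstractmethod
    def replay(self, from_seq: int = 0): ...    # yield (seq, edit) pairs

class InMemoryWal(WalProvider):
    """Stand-in implementation; a real one would write to Kafka/HDFS/Ratis."""

    def __init__(self):
        self._log = []
        self._synced = 0

    def append(self, edit: bytes) -> int:
        self._log.append(edit)
        return len(self._log) - 1

    def sync(self) -> None:
        self._synced = len(self._log)           # pretend everything is durable

    def replay(self, from_seq: int = 0):
        yield from enumerate(self._log[from_seq:], start=from_seq)
```

Note there is no rename or directory listing in the interface, which is exactly what makes a non-filesystem backend possible.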
>
> >
>
> > Then we talked about the use of FPGAs for compaction at Alibaba.
> > Jieshan Bi said that at Huawei they offload compaction to the storage
> > layer. For an open source solution, maybe we could offload compaction
> > to Spark, and then use something like bulkload to let the region server
> > load the new HFiles. The problem with doing compaction inside the
> > region server is the CPU cost and GC pressure. We need to scan every
> > cell, so the CPU cost is high. Yu Li talked about their page-based
> > compaction in the Flink state store; maybe it could also benefit HBase.
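The offloaded compaction step itself is essentially a sorted merge that keeps only the newest version of each cell. A toy sketch (tuples in place of real HFile cells; versioning simplified to one retained version per row, no delete markers):

```python
import heapq

def compact(hfiles):
    """Merge several sorted files of (row_key, timestamp, value) cells and
    keep only the newest version of each row -- the core of what an
    offloaded (e.g. Spark-side) compaction job would produce, before the
    output is handed back to the region server via something like bulkload.

    Each input must be sorted by (row_key, -timestamp), i.e. newest first
    within a row, so the merge preserves that order globally.
    """
    merged = heapq.merge(*hfiles, key=lambda c: (c[0], -c[1]))
    out, last_row = [], None
    for row, ts, value in merged:
        if row != last_row:           # first cell per row is the newest one
            out.append((row, ts, value))
            last_row = row
    return out
```

This is the CPU- and GC-heavy cell-by-cell scan the notes describe; moving it out of the region server moves exactly this loop.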
>
> >
>
> > Then it was time for MOB. Huawei said MOB cannot solve their problem.
> > We still need to read the data through RPC, and it also introduces
> > pressure on the memstore, since the memstore is still a bit small
> > compared to a MOB cell. And we will also flush a lot even though there
> > are only a small number of MOB cells in the memstore, so we still need
> > to compact a lot. So maybe the suitable scenario for using MOB is that
> > most of your data is still small and a small amount of it is a bit
> > larger, where MOB could improve performance and users do not need
> > another system to store the larger data. Huawei said that they
> > implement the logic on the client side: if the data is larger than a
> > threshold, the client goes to another storage system rather than HBase.
>
> > Alibaba said that if we want to support large blobs, we need to
> > introduce a streaming API. And Kuaishou said that they do not use MOB;
> > they just store the data on HDFS and the index in HBase, a typical
> > solution.
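A toy sketch of the client-side routing described above (plain dicts stand in for HBase and the external blob store; the threshold value and pointer format are made-up illustrations): small values live in HBase, large blobs go elsewhere with only a pointer kept in HBase, which also covers the Kuaishou index-in-HBase pattern.

```python
BLOB_THRESHOLD = 1 << 20   # 1 MiB cutoff; an assumption, tune per workload

class RoutingClient:
    """Values under the threshold go to HBase; larger blobs go to an
    external store (HDFS, S3, ...) and only a pointer stays in HBase."""

    def __init__(self):
        self.hbase = {}        # stand-in for the HBase table
        self.blob_store = {}   # stand-in for the external blob store

    def put(self, key, value: bytes):
        if len(value) <= BLOB_THRESHOLD:
            self.hbase[key] = value
        else:
            path = f"/blobs/{key}"                       # hypothetical location
            self.blob_store[path] = value
            self.hbase[key] = b"ptr:" + path.encode()    # index entry only

    def get(self, key) -> bytes:
        """Reads are transparent: follow the pointer when one is stored."""
        stored = self.hbase[key]
        if stored.startswith(b"ptr:"):
            return self.blob_store[stored[4:].decode()]
        return stored
```

The design point is that HBase never sees the large cell, so memstore flushes and compactions stay cheap regardless of blob size.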
>
> >
>
> > Then we talked about which company will host next year's HBaseConAsia.
> > It will be Tencent or Huawei, or both, probably in Shenzhen. And since
> > there is no HBaseCon in America any more (it is called 'NoSQL Day'
> > now), maybe next year we could just call the conference HBaseCon.
>
> >
>
> > Then we came back to SQL again. Alibaba said that most of their
> > customers are migrating from old businesses, so they need 'full' SQL
> > support. That's why they need Phoenix. And lots of small companies want
> > to run OLAP queries directly on the database; they do not want to use
> > ETL. So maybe in the SQL proxy (planned above), we should delegate the
> > OLAP queries to Spark SQL or something else, rather than just rejecting
> > them.
>
> >
>
> > And a Phoenix committer said that the Phoenix community is currently
> > re-evaluating its relationship with HBase, because when upgrading to
> > HBase 2.1.x, lots of things broke. They plan to break the tie between
> > Phoenix and HBase, which means Phoenix plans to also run on other
> > storage systems.
> > Note: This was not discussed at the meeting, but personally I think
> > this may be good news: if Phoenix is no longer HBase-only, we have more
> > reasons to introduce our own SQL layer.
>
> >
>
> > Then we talked about Kudu. It is faster than HBase on scans. If we want
> > to increase scan performance, we should use a larger block size, but
> > this will lead to slower random reads, so there is a trade-off. The
> > Kuaishou guys asked whether HBase could support storing HFiles in a
> > columnar format. The answer is no; as said above, it would slow down
> > random reads.
> > But we could learn from what Google has done with Bigtable. We could
> > write a copy of the data in Parquet format to another FileSystem, and
> > users could just scan the Parquet files for better analytics
> > performance. And if they want the newest data, they could ask HBase for
> > it, and it should be small. This is more of an overall solution; more
> > than just HBase is involved. But at least we could introduce some APIs
> > in HBase so users can build the solution in their own environment. And
> > if you do not care about the newest data, you could also use
> > replication to replicate the data to ES or other systems and search
> > there.
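The read path of that Bigtable-style split can be sketched as overlaying the small, fresh HBase delta on the bulk Parquet snapshot (toy dicts, illustrative only; the tombstone marker is an assumption about how deletes since the export would be represented):

```python
TOMBSTONE = object()   # marker for rows deleted since the snapshot was taken

def scan_with_freshness(snapshot: dict, delta: dict) -> dict:
    """Overlay the recent delta (served by HBase) on the bulk snapshot
    (e.g. Parquet files, good for scans): new and updated rows win, and
    tombstoned rows disappear from the result."""
    merged = dict(snapshot)
    for key, value in delta.items():
        if value is TOMBSTONE:
            merged.pop(key, None)
        else:
            merged[key] = value
    return merged
```

The snapshot side carries the bulk of the data and scans fast; the delta stays small because it only spans the window since the last export.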
>
> >
>
> > And Didi talked about their problems using HBase. They use Kylin, so
> > they also have lots of regions, and meta is also a problem for them.
> > The pressure on ZooKeeper is also a problem, as the replication queues
> > are stored on ZK. After 2.1, ZooKeeper is only used as external storage
> > in the replication implementation, so it is possible to switch to other
> > storage, such as etcd. But it is still a bit difficult to store the
> > data in a system table: today we need to start the replication system
> > before the WAL system, but if we want to store the replication data in
> > an HBase table, obviously the WAL system must be started before the
> > replication system, as we need the system table's region online first,
> > and opening it writes an open marker to the WAL. We need to find a way
> > to break the deadlock.
> > And they also mentioned that the rsgroup feature creates a big znode on
> > ZooKeeper, as they have lots of tables. We have HBASE-22514, which aims
> > to solve the problem.
> > And last, they shared their experience upgrading from 0.98 to 1.4.x.
> > The versions should be compatible, but in practice there were problems.
> > They agreed to post a blog about this.
>
> >
>
> > And the Flipkart guys said they will open source their test suite,
> > which focuses on consistency (Jepsen?). This is good news; hopefully we
> > will have another useful tool besides ITBLL.
>
> >
>
> > That's all. Thanks for reading.
>
> >
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk