Posted to dev@s2graph.apache.org by DO YUNG YOON <sh...@gmail.com> on 2016/02/01 17:59:20 UTC

Re: hbasecon2016 and berlinbuzzwords2016

Hi stack.

Thanks for letting us know. I will definitely submit talks to hbasecon2016
and hope to participate again this time. I will also check out Berlin
Buzzwords.

Hi folks,

I think the following improvements would be good candidates for sharing at
hbasecon this year.

1. HBase schema.
We have evolved from version 1 (at last year's hbasecon) to version 4 now.
I think sharing the problems that we encountered with each schema version
would be good.

2. Consistency.
At the last hbasecon, there was no consistency guarantee when contention
happened. Since then we have come up with a state machine that resolves
contention eventually. I think it is worthwhile to present the problem and
the solutions we have come up with for this topic.

3. Super node problem.
We have encountered the super node problem for read requests. Details are at
https://github.com/kakao/s2graph/issues/183. In short, user requests were
heavily skewed, which led to a hot spot on one region server. We solved this
problem by caching the future instead of the result (I don't know the formal
name for this, so let me know if any of you know it).

4. Bulk loading.
Before we used the bulk load feature, read response times became slow and
region servers got too busy flushing the memstore. The HBase 2.0 snapshot
branch has a great package called hbase-spark, which has a nice interface
over Spark. We have been using it to load bulk data into the production
cluster without any penalty on the service SLA.
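
To make the "caching the future instead of the result" idea from item 3
concrete, here is a minimal sketch in Python. It is not the S2Graph
implementation; the names FutureCache, slow_fetch and demo are hypothetical
and exist only for illustration. The point is that every request for a hot
key shares one in-flight backend read instead of issuing its own. This
pattern is often described as request coalescing or future memoization.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class FutureCache:
    """Caches one Future per key so that callers share a single fetch."""

    def __init__(self, fetch, executor):
        self._fetch = fetch        # hypothetical backend read (assumption)
        self._executor = executor
        self._lock = threading.Lock()
        self._futures = {}         # key -> Future of the backend read

    def get(self, key):
        # The Future is stored before the fetch completes, so requests that
        # arrive while the read is still running reuse it.
        with self._lock:
            future = self._futures.get(key)
            if future is None:
                future = self._executor.submit(self._fetch, key)
                self._futures[key] = future
        return future


def demo():
    calls = []                     # records every real backend read
    calls_lock = threading.Lock()

    def slow_fetch(key):
        # Stand-in for an expensive read against a hot row.
        with calls_lock:
            calls.append(key)
        time.sleep(0.2)
        return f"edges-of-{key}"

    with ThreadPoolExecutor(max_workers=8) as pool:
        cache = FutureCache(slow_fetch, pool)
        # 100 requests for the same key arrive while the read is in flight.
        futures = [cache.get("super-node") for _ in range(100)]
        results = [f.result() for f in futures]
    return results, len(calls)
```

This sketch never expires entries and keeps failed futures in the map; a
production cache would need a TTL and would have to evict futures that
completed with an error.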

Apart from system improvements, here is what I think is worth sharing at
conferences (maybe or maybe not at hbasecon).

1. Use cases.
People are not familiar with graph databases, and I get lots of questions
from developers about why they should bother to use one, so I think it is
worthwhile to present real-world use cases. I think the benefits of S2Graph
are a simpler data pipeline and free realtime recommendation. There are two
parts I can think of for now: one is the simplicity of the data pipeline and
the other is the actual recommendation quality compared to industry-standard
algorithms. I asked Kakao and they said it is OK to share the
click-through rate (CTR) of realtime recommendation powered by S2Graph, and
I think it would help motivate others to consider it for their use cases.

I would like to hear what others think.

Thanks.



On Sat, Jan 30, 2016 at 2:44 AM Stack <st...@duboce.net> wrote:

> Hey All:
>
> Have you all considered submitting talks to hbasecon2016? If not, you
> should (smile). See hbasecon.com for how. Would be great to have an update
> from you lot especially now you are in incubator.
>
> You should also consider submitting to berlin buzzwords. It's a great
> conference. See https://berlinbuzzwords.de/
>
> Thanks,
> St.Ack
>