Posted to dev@s2graph.apache.org by "DOYUNG YOON (JIRA)" <ji...@apache.org> on 2016/02/29 06:55:18 UTC

[jira] [Created] (S2GRAPH-50) Provide new HBase Storage Schema

DOYUNG YOON created S2GRAPH-50:
----------------------------------

Summary: Provide new HBase Storage Schema
Key: S2GRAPH-50
URL: https://issues.apache.org/jira/browse/S2GRAPH-50
Project: S2Graph
Issue Type: New Feature
Reporter: DOYUNG YOON
Assignee: DOYUNG YOON

I think we need to offer a choice between `Tall` and `Wide` row schemas for IndexEdge. The key difference between the two is as follows.

# Wide.

Adjacent edges are stored in a single row with wide columns, and we use a get request to fetch them. This is how IndexEdge is currently stored.

# Tall.

Adjacent edges are stored across multiple `consecutive` rows, and we use a scanner to scan through them.
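
To make the two layouts concrete, here is a minimal sketch in Python. It models a table as a map of row keys to columns rather than using the HBase client. `ToyTable`, the helper functions, and the `src|score|tgt` key encoding are all hypothetical and are not S2Graph's actual serialization format.

```python
from bisect import bisect_left


class ToyTable:
    """Toy stand-in for an HBase table (hypothetical, not the HBase API)."""

    def __init__(self):
        self.rows = {}  # row key (bytes) -> {qualifier (bytes): value (bytes)}

    def put(self, row, qualifier, value):
        self.rows.setdefault(row, {})[qualifier] = value

    def get(self, row):
        # Point lookup of a single row; columns are returned sorted by qualifier.
        return sorted(self.rows.get(row, {}).items())

    def scan(self, prefix):
        # Range scan over the consecutive rows that share a key prefix.
        keys = sorted(self.rows)
        i = bisect_left(keys, prefix)
        while i < len(keys) and keys[i].startswith(prefix):
            yield keys[i], self.rows[keys[i]]
            i += 1


def put_wide(table, src, score, tgt):
    # Wide: one row per source vertex, one column per adjacent edge.
    table.put(src, score + b"|" + tgt, b"")


def put_tall(table, src, score, tgt):
    # Tall: one row per edge; the source vertex is only a row-key prefix.
    table.put(src + b"|" + score + b"|" + tgt, b"e", b"")


def adjacent_wide(table, src):
    # A single get returns every adjacent edge of src.
    return [qualifier.split(b"|")[1] for qualifier, _ in table.get(src)]


def adjacent_tall(table, src):
    # A prefix scan walks the consecutive rows belonging to src.
    return [row.split(b"|")[2] for row, _ in table.scan(src + b"|")]


edges = [(b"a", b"001", b"x"), (b"a", b"002", b"y"), (b"b", b"001", b"z")]
wide, tall = ToyTable(), ToyTable()
for src, score, tgt in edges:
    put_wide(wide, src, score, tgt)
    put_tall(tall, src, score, tgt)
```

Both `adjacent_wide(wide, b"a")` and `adjacent_tall(tall, b"a")` return the same edges in the same order. Only the access pattern differs: one get on one row versus a scan over several rows.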

Once S2GRAPH-17 is resolved, I think the only thing we have to do is provide an `IndexEdgeSerializer` and `IndexEdgeDeserializer` for the Tall row schema on HBase. This should be a fairly simple task, since we already have all the primitives for it.

The hard part would be changing the client interface.

Currently, queries support `offset` and `limit` for pagination. If we use a scanner, there is no easy way to support `offset`.
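
The sketch below illustrates why `offset` is awkward with a scanner. It models a scan over a sorted list of row keys in Python and does not use the HBase client. The function names and the `src|score|tgt` row-key encoding are hypothetical. The cursor-style variant is only one possible direction for the client interface, not something this issue has settled on.

```python
from bisect import bisect_left
from itertools import islice


def scan(sorted_keys, start, prefix):
    # Yield row keys >= start that share the prefix, in key order.
    i = bisect_left(sorted_keys, start)
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        yield sorted_keys[i]
        i += 1


def page_by_offset(sorted_keys, prefix, offset, limit):
    # Emulated offset: the scanner still reads and discards `offset` rows,
    # so the cost of a page grows with how deep the page is.
    return list(islice(scan(sorted_keys, prefix, prefix), offset, offset + limit))


def page_by_cursor(sorted_keys, prefix, last_key, limit):
    # Cursor style (assumed alternative): resume just after the last row key
    # the client saw. Appending a zero byte gives the smallest key that is
    # strictly greater than last_key.
    start = prefix if last_key is None else last_key + b"\x00"
    return list(islice(scan(sorted_keys, start, prefix), limit))


keys = sorted([b"a|001|x", b"a|002|y", b"a|003|z", b"a|004|w", b"b|001|q"])
first_page = page_by_cursor(keys, b"a|", None, 2)
second_page = page_by_cursor(keys, b"a|", first_page[-1], 2)
```

Here `second_page` contains the same rows as `page_by_offset(keys, b"a|", 2, 2)`, but the cursor version starts the scan at the right row instead of skipping over the first page. The trade-off is that clients would have to pass a row key back instead of a numeric `offset`.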

I think it is worth trying the Tall row schema and benchmarking it against the Wide row schema. I also think this would be very beneficial for others who are interested in implementing other storage backends such as RocksDB or LevelDB (including myself).

I will follow up with benchmarks on both `Tall` and `Wide` rows, and then we can decide which schema should be the default. What do others think?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)