Posted to issues@drill.apache.org by "Aditya Kishore (JIRA)" <ji...@apache.org> on 2014/09/25 02:57:33 UTC

[jira] [Commented] (DRILL-1414) Move profile storage to DFS rather than using PStore

    [ https://issues.apache.org/jira/browse/DRILL-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14147194#comment-14147194 ] 

Aditya Kishore commented on DRILL-1414:
---------------------------------------

I have been thinking about a couple of ways to do this.

# Extend the {{org.apache.drill.exec.store.sys.PStore}} interface with two additional methods:
{code}
  public V getBlob(String key);
  public void putBlob(String key, V value);
{code}
Consumers can use these two methods to store large amounts of data that do not require frequent enumeration and are not suitable for storage on systems like Zookeeper. A particular PStore implementation could choose to store the blob data differently from the primary value. For example, an HBase PStore provider could store blobs in a different column family, while the Zookeeper PStore provider could store them on DFS (as this JIRA summary suggests).
The query profile can then be split into two parts: the small meta information about the query is stored with {{put()}}, while the fragment profiles are stored using {{putBlob()}}.
# Alternatively, we could handle this narrowly by just modifying {{org.apache.drill.exec.work.foreman.QueryStatus}} to split the profile and store its metadata separately from the individual query profile.
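
As a rough illustration of approach #1, here is a minimal, self-contained sketch. It is not the patch and not Drill's actual code: {{SketchPStore}}, {{InMemoryPStore}}, {{ProfileStoreDemo}} and the {{entries()}} method are hypothetical names made up for this example, and the in-memory maps merely stand in for whatever backing store a real provider would use (a separate column family, DFS, and so on).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for the PStore interface (assumption, not Drill's code).
interface SketchPStore<V> {
  V get(String key);
  void put(String key, V value);

  // Enumerates primary values only; blobs are deliberately left out.
  Iterable<Map.Entry<String, V>> entries();

  // The two proposed additions, for large values that are rarely enumerated.
  V getBlob(String key);
  void putBlob(String key, V value);
}

// In-memory provider that keeps blobs apart from primary values, mimicking a
// provider that stores them in a different column family or on DFS.
class InMemoryPStore<V> implements SketchPStore<V> {
  private final Map<String, V> primary = new ConcurrentHashMap<>();
  private final Map<String, V> blobs = new ConcurrentHashMap<>();

  @Override public V get(String key) { return primary.get(key); }
  @Override public void put(String key, V value) { primary.put(key, value); }
  @Override public Iterable<Map.Entry<String, V>> entries() { return primary.entrySet(); }
  @Override public V getBlob(String key) { return blobs.get(key); }
  @Override public void putBlob(String key, V value) { blobs.put(key, value); }
}

class ProfileStoreDemo {
  public static void main(String[] args) {
    SketchPStore<String> store = new InMemoryPStore<>();

    // Small meta info about the query goes through put() ...
    store.put("query-1", "state=COMPLETED");
    // ... while the bulky fragment profiles go through putBlob().
    store.putBlob("query-1", "fragment profiles (placeholder payload)");

    // Listing queries touches only the small primary values.
    for (Map.Entry<String, String> e : store.entries()) {
      System.out.println(e.getKey() + " -> " + e.getValue());
    }
    // The blob is fetched only when a single profile is opened.
    System.out.println(store.getBlob("query-1"));
  }
}
```

The point of the sketch is that enumeration never touches the blob data, so a listing of query profiles stays cheap even when the individual fragment profiles are large.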

I am inclined to go with approach #1, as it will allow any future consumer to reuse it easily. I already have a partial patch, excluding modifications to the Web UI, that I am currently testing. If I do not hear any concerns about approach #1, I'll post the patch for review shortly.

> Move profile storage to DFS rather than using PStore
> ----------------------------------------------------
>
>                 Key: DRILL-1414
>                 URL: https://issues.apache.org/jira/browse/DRILL-1414
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Jacques Nadeau
>            Assignee: Aditya Kishore
>             Fix For: 0.6.0
>
>
> PStores were really built for trivial configuration data, not large query profiles.  As such, we should move to using the DFS for storage of query profiles when in distributed mode.


