Posted to dev@zeppelin.apache.org by "Jesang Yoon (JIRA)" <ji...@apache.org> on 2016/07/08 00:40:10 UTC
[jira] [Created] (ZEPPELIN-1135) Provide a manifest for data & interface to use it
Jesang Yoon created ZEPPELIN-1135:
-------------------------------------
Summary: Provide a manifest for data & interface to use it
Key: ZEPPELIN-1135
URL: https://issues.apache.org/jira/browse/ZEPPELIN-1135
Project: Zeppelin
Issue Type: New Feature
Components: documentation, GUI, zeppelin-interpreter
Affects Versions: 0.7.0
Reporter: Jesang Yoon
Priority: Minor
While running mixed data analyses in Zeppelin on data from many sources (different URLs), my team has had trouble managing the many data source URLs and sharing them between teammates.
So I propose solving this problem by providing a "manifest of data and an interface to use it", and I would like to build consensus among contributors and the PPMC before writing and committing code.
h4. Pain points
* Files and resources tend to be scattered across various locations (HDFS, Web, etc.).
* It is a bit complicated to remember and identify the location of data, and to use a long URL for it.
* A URL alone is not enough to describe what the data contains.
h4. How to resolve it
# Define a format for a web-based document (XML/JSON/YAML) that contains a manifest (metadata) of the data the team can use.
#* Title of data
#* Location of data (URL)
#* Description of data
#* Tags of data (for search)
# Build a Zeppelin interface function to search and view the descriptions of the data defined in step 1.
# Build a Zeppelin interface function that returns the real location of the data found in step 2, for use with the load() functions of various interpreters.
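To make the three steps above more concrete, here is a minimal sketch in Python, assuming a JSON manifest. Nothing here is an existing Zeppelin API: the function names (load_manifest, search, resolve), the manifest entries, and the example URLs are all hypothetical placeholders. Only the four fields (title, location, description, tags) come from the proposal.

```python
import json

# Hypothetical manifest (step 1). The field names follow the proposal;
# the entries and URLs are made-up placeholders for illustration.
MANIFEST_JSON = """
[
  {
    "title": "web-logs",
    "url": "hdfs://namenode.example.com/data/logs/2016/web",
    "description": "Raw web server access logs",
    "tags": ["logs", "web"]
  },
  {
    "title": "user-profiles",
    "url": "http://data.example.com/exports/users.csv",
    "description": "Daily CSV export of user profiles",
    "tags": ["users", "csv"]
  }
]
"""


def load_manifest(text):
    """Parse the manifest document into a list of entry dicts."""
    return json.loads(text)


def search(manifest, keyword):
    """Step 2: return entries whose title, description, or tags contain the keyword."""
    needle = keyword.lower()
    return [
        entry for entry in manifest
        if needle in entry["title"].lower()
        or needle in entry["description"].lower()
        or any(needle in tag.lower() for tag in entry["tags"])
    ]


def resolve(manifest, title):
    """Step 3: return the real location (URL) of the entry with the given title."""
    for entry in manifest:
        if entry["title"] == title:
            return entry["url"]
    raise KeyError("no data set titled %r in manifest" % title)


manifest = load_manifest(MANIFEST_JSON)
for entry in search(manifest, "logs"):
    print(entry["title"], "-", entry["description"])
print(resolve(manifest, "user-profiles"))
```

In a notebook, the value returned by resolve() would be passed to whatever load call the interpreter provides, so that only the manifest needs updating when data moves. How the manifest itself is hosted and fetched is left open here.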
h4. Effects
* Teammates can share a single, clean source of information about the data.
* There is no need to track down and change every URL in the notebooks when the location of data changes.
* Data is easy to search for and use in analysis code.
Please review this idea and give comments :)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)