Posted to users@zeppelin.apache.org by Tw UxTLi51Nus <Tw...@posteo.co> on 2017/06/29 08:51:35 UTC

Notebook Storage and Git

Hi,

Not sure if I should write this on dev@, but I thought I'd give it a
try here first ...

I am using Zeppelin with Git-controlled notebook storage. However, I
find the "git client" integrated in Zeppelin quite rudimentary, so I do
most of the VCS stuff via the CLI.

Two things are bothering me:

1) the naming scheme
On the file system, the notebooks are stored under seemingly random names
(well, the folders are; the notebook files themselves are all named
note.json). Wouldn't it be better to reflect the notebook structure in
Zeppelin on the file system as well, e.g. so that a notebook named
"nbfolder1/nbfolder2/nb1" is stored at
"NOTEBOOK-STORAGE/nbfolder1/nbfolder2/nb1.json"?
Was this or something similar discussed or discarded at some point? If
discarded, why?

2) the notebooks contain the results
Because of this, the note.json files change whenever the notebook is
run again, even when the "code" itself has not changed, which makes
comparing diffs really difficult. Why not use a second file (e.g.
notebook_results.json) to store the results, and thus have a "clean"
notebook file to put into VC?
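
As a stopgap for point 2, something like the following could be run before
committing. It is only a rough sketch, not anything Zeppelin provides: it
assumes note.json has a top-level "paragraphs" list and that the
run-dependent fields are called "results", "dateStarted", "dateFinished"
and "status". Those key names are assumptions and should be checked against
your own files; strip_results and dump_stable are made-up helper names.

```python
import copy
import json

# Assumed per-paragraph keys that change on every run; adjust to your note.json.
VOLATILE_KEYS = ("results", "dateStarted", "dateFinished", "status")


def strip_results(note, volatile_keys=VOLATILE_KEYS):
    """Return a copy of a note dict without the run-dependent paragraph fields."""
    clean = copy.deepcopy(note)  # leave the caller's dict untouched
    for paragraph in clean.get("paragraphs", []):
        for key in volatile_keys:
            paragraph.pop(key, None)  # ignore keys that are not present
    return clean


def dump_stable(note):
    """Serialize with sorted keys and fixed indentation so diffs stay small."""
    return json.dumps(note, indent=2, sort_keys=True) + "\n"
```

The output of dump_stable(strip_results(note)) could be written to a separate
"clean" file that goes into Git, while the original note.json with the
results stays out of version control.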

Thanks,

-- 
Tw UxTLi51Nus
Email: TwUxTLi51Nus@posteo.co

Re: Notebook Storage and Git

Posted by moon soo Lee <mo...@apache.org>.
Please feel free to comment on the issues or the related pull requests, or
continue the discussion on the mailing list.

Thanks!
moon

On Fri, Jul 14, 2017 at 4:35 AM Tw UxTLi51Nus <Tw...@posteo.co>
wrote:

> Hi,
>
> > I think a Version Control System friendly notebook file format is an
> > interesting subject to discuss. The related issue is
> > https://issues.apache.org/jira/browse/ZEPPELIN-451
>
> Yeah, I think so too. Should I add a comment to the mentioned issue? Or
> where would you suggest starting this discussion?
>
> THX
>

Re: Notebook Storage and Git

Posted by Tw UxTLi51Nus <Tw...@posteo.co>.
Hi,

> I think a Version Control System friendly notebook file format is an
> interesting subject to discuss. The related issue is
> https://issues.apache.org/jira/browse/ZEPPELIN-451

Yeah, I think so too. Should I add a comment to the mentioned issue? Or
where would you suggest starting this discussion?

THX


-- 
Tw UxTLi51Nus
Email: TwUxTLi51Nus@posteo.co


Re: Notebook Storage and Git

Posted by moon soo Lee <mo...@apache.org>.
Hi,

There's a related issue for the naming scheme:
https://issues.apache.org/jira/browse/ZEPPELIN-2702
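
To make the naming proposal from the original mail concrete, the mapping
could look roughly like this. It is only an illustration of the idea, not
how Zeppelin stores notes and not necessarily what the issue above will end
up doing; note_file_path is a made-up helper, and the storage root is simply
passed in as a string.

```python
from pathlib import PurePosixPath


def note_file_path(storage_root, note_name):
    """Map a notebook name like 'a/b/nb1' to '<storage_root>/a/b/nb1.json'."""
    # Drop empty segments so leading or doubled slashes do not matter.
    parts = [p for p in note_name.split("/") if p]
    # Refuse names that are empty or would escape the storage root.
    if not parts or any(p in (".", "..") for p in parts):
        raise ValueError("invalid notebook name: %r" % note_name)
    *folders, leaf = parts
    return PurePosixPath(storage_root, *folders, leaf + ".json")
```

With this, note_file_path("NOTEBOOK-STORAGE", "nbfolder1/nbfolder2/nb1")
points at the nb1.json file from the example in the original mail. Two
notebooks with the same name in the same folder would collide under such a
scheme, which any real implementation would have to handle.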

I think a Version Control System friendly notebook file format is an
interesting subject to discuss. The related issue is
https://issues.apache.org/jira/browse/ZEPPELIN-451

Thanks,
moon
