Posted to dev@subversion.apache.org by Kevin Pilch-Bisson <ke...@pilch-bisson.net> on 2002/12/04 13:05:03 UTC

Re: setting a convention for property values

On Wed, Dec 04, 2002 at 01:56:25PM -0600, Nicholas Riley wrote:
> On Wed, Dec 04, 2002 at 12:54:12PM -0600, Karl Fogel wrote:
> > Nicholas Riley <nj...@uiuc.edu> writes:
> > > I have no objection to what you want to do with text properties, but I
> > > already have lots of binary properties in my repos (using a svn
> > > wrapper script for Mac resource forks).  If you declare that binary
> > > properties are the way forward, I need some way to fix my existing
> > > properties ;-) - preferably as automatic as possible, for example svn
> > > detecting binary properties the way it detects binary files and
> > > auto-encoding/decoding them would be nice.
> > 
> > Um, how have your binary properties been behaving over ra_dav? :-)
> > It should be corrupting them... see
> > 
> >    http://subversion.tigris.org/issues/show_bug.cgi?id=1015
> 
> I've only been using them over ra_local.  Whatever svn needs to do to
> base64-encode and -decode properties over DAV sounds reasonable, but
> I'd prefer it not be exposed to the client.  That's the cool thing
> about svn's file handling, it's transparent and most of the time the
> user doesn't have to care.  Properties should be similar, and binary
> metadata really will be needed eventually even if I'm the only person
> using it now.  Eventually I plan to add native svn support for the
> stuff I'm using wrapper scripts for at the moment (unless someone else
> beats me to it), but it won't happen until after 1.0 at this rate.
> 
The problem with this is that we need to know for absolutely certain that our
binary detection algorithm will always be right for this to work.  Otherwise
we need props on props to say whether they are binary.  Props need to either
be all binary, or not binary at all.

(With the possible exception that svn: props could be text and the rest
binary).

My personal opinion is that svn: props should be utf-8, lf line endings.  All
other props should be binary, and base64 encoded if that is what dav requires.
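
To make the detection worry concrete, here is a minimal Python sketch of
the kind of content-sniffing guess being discussed.  The function
looks_binary() and its rules (a NUL byte or invalid UTF-8 means binary)
are assumptions made up for illustration, not svn's actual algorithm.

```python
def looks_binary(data: bytes) -> bool:
    """Naive guess (hypothetical, not svn's real detection):
    a value is binary if it has a NUL byte or is not valid UTF-8."""
    if b"\x00" in data:
        return True
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


# Text stored as UTF-16 is full of NUL bytes, so it is misread as binary.
utf16_text = "hello".encode("utf-16-le")

# An opaque blob can consist only of printable ASCII, so it is misread as text.
ascii_blob = bytes([0x41, 0x42, 0x43, 0x44])

print(looks_binary(utf16_text))  # True, although the value is text
print(looks_binary(ascii_blob))  # False, although the value is opaque data
```

Any guess of this kind can be fooled in both directions, which is why the
kind of a property has to come from its name (or from props on props)
rather than from its content.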
-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kevin Pilch-Bisson                    http://www.pilch-bisson.net
     "Historically speaking, the presences of wheels in Unix
     has never precluded their reinvention." - Larry Wall
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Re: setting a convention for property values

Posted by Peter Davis <pe...@pdavis.cx>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I was just thinking about that, and I like it too.  Also, there should be some 
mechanism (the default, hopefully) to automatically decode based on the 
Content-Transfer-Encoding (base64).

When someone sets a binary property, by default, the client should 
automatically insert the headers and encode in Base64.  Propset, propget, and 
propedit should have options to allow manipulation of the raw encoded headers 
and body as well.

$ svn pset bin:icon -F icon.png foofile
$ svn pedit --raw bin:icon foofile
Content-Type: image/png   /* change to this */
Content-Transfer-Encoding: base64
AAAAdGhpcyBpcyBhIHRlc3QK...
$ svn ci ... other stuff
$ svn pget bin:icon foofile > icon.png

On second thought, why not do this for all properties internally, but by 
default the mime-type is "text/plain;charset=UTF-8" for all non-"bin:*"?  No 
need for a new fs schema, but it easily allows meta-metadata.  Or is this all 
too much to solve a presumably rare edge case?
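
A rough Python sketch of what the client might do on propset under this
idea.  wrap_property() is a hypothetical helper, not an existing svn API,
and the blank line between headers and body is an assumption borrowed
from MIME mail.

```python
import base64


def wrap_property(name: str, value: bytes, content_type: str = "") -> str:
    """Return 'headers + body' for a property value (hypothetical helper).

    bin:* values are Base64-encoded; everything else is stored as
    UTF-8 text with a text/plain default type.
    """
    if name.startswith("bin:"):
        ctype = content_type or "application/octet-stream"
        body = base64.b64encode(value).decode("ascii")
        return (
            f"Content-Type: {ctype}\n"
            "Content-Transfer-Encoding: base64\n"
            "\n"
            f"{body}\n"
        )
    ctype = content_type or "text/plain;charset=UTF-8"
    return f"Content-Type: {ctype}\n\n" + value.decode("utf-8")


# The sample payload used in this thread decodes to these 18 bytes.
print(wrap_property("bin:icon", b"\x00\x00\x00this is a test\n", "image/png"))
```

With --raw the user would see and edit this string directly; without it,
only the decoded value.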


On Wednesday 04 December 2002 16:06, Glenn A. Thompson wrote:
> I like the previously mentioned:
>
> bin:DoHickey
> Content-Type: application/octet-stream
> Content-Transfer-Encoding: base64
> AAAAdGhpcyBpcyBhIHRlc3QK
>
> People see/use this kind of stuff all the time in e-mail.
>
> It also allows non-SVN savvy programs or shell scripting languages to
> create property files which can be fed  to svn via -F arguments.

- -- 
Peter Davis
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE97p20hDAgUT1yirARAmLhAKCatNvIRgNSH+0yEKf3keoLrvQUjACeJBgm
TvJncMTy3TSdxRxiW/RinZA=
=eBYF
-----END PGP SIGNATURE-----


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: setting a convention for property values

Posted by "Glenn A. Thompson" <gt...@cdr.net>.
Hey,

Ben Collins-Sussman wrote:
> Karl Fogel <kf...@newton.ch.collab.net> writes:
> > Peter Davis <pe...@pdavis.cx> writes:
> > > How about:
> > >
> > >   bin:*                            = Always binary
> > >   [anything else, including svn:*] = Always text (UTF-8)

So now I have a decoded blob.  What is it?  How do I tell my client 
what it is?
I like the previously mentioned:

bin:DoHickey
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64
AAAAdGhpcyBpcyBhIHRlc3QK

People see/use this kind of stuff all the time in e-mail.

It also allows non-SVN savvy programs or shell scripting languages to 
create property files which can be fed  to svn via -F arguments.

As for the repos: it might be nice to formalize the "Content-xxxx:" 
name/value pairs.  But some crazy FS/repos implementor could parse 
the headers for "bin" properties and store the meta-metadata however 
they see fit.  Or, if additional property methods are wanted/needed 
(e.g. getPropHeader()) and the data is stored as a single header + blob 
string, parse the header at retrieval time to get the info required to 
support the new method(s).

Or perhaps a flag on propget that indicates headers only.  This flag 
would only apply to bin properties.  So proplist --verbose would list 
only the header for binary properties and the actual property for txt 
properties.
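
For the "single header + blob string" case, parsing at retrieval time
could look roughly like this Python sketch.  The function names
(split_property, get_prop_header, get_prop_value) are hypothetical, and
the parser accepts the headers with or without a blank separator line,
since both layouts have been shown in this thread.

```python
import base64
import re

# A header line looks like "Name: value"; Base64 body lines never contain ':'.
HEADER_RE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*):\s*(.*)$")


def split_property(stored: str):
    """Split a stored 'headers + body' string into (headers, body)."""
    headers = {}
    lines = stored.splitlines()
    i = 0
    while i < len(lines):
        match = HEADER_RE.match(lines[i])
        if not match:
            break
        headers[match.group(1).lower()] = match.group(2).strip()
        i += 1
    # Skip the optional blank line between headers and body.
    if i < len(lines) and lines[i] == "":
        i += 1
    return headers, "\n".join(lines[i:])


def get_prop_header(stored: str) -> dict:
    """Headers only, e.g. for a verbose listing of binary properties."""
    return split_property(stored)[0]


def get_prop_value(stored: str) -> bytes:
    """Decoded value, honouring Content-Transfer-Encoding: base64."""
    headers, body = split_property(stored)
    if headers.get("content-transfer-encoding", "").lower() == "base64":
        return base64.b64decode(body)
    return body.encode("utf-8")


stored = (
    "Content-Type: application/octet-stream\n"
    "Content-Transfer-Encoding: base64\n"
    "AAAAdGhpcyBpcyBhIHRlc3QK\n"
)
print(get_prop_header(stored))
print(get_prop_value(stored))
```

One caveat of this sketch: without a mandatory blank separator, a text
body whose first line happens to look like "word: something" would be
mistaken for a header, so a real format would want the separator required.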

Or configurable header filters, validators...  Hmm, sounds like 
fun to me :-)

gat imposter (Real gat off working.)




 





Re: setting a convention for property values

Posted by Ben Collins-Sussman <su...@collab.net>.
Karl Fogel <kf...@newton.ch.collab.net> writes:

> Peter Davis <pe...@pdavis.cx> writes:
> > How about:
> > 
> >   bin:*                            = Always binary
> >   [anything else, including svn:*] = Always text (UTF-8)
> > 
> > Simple and consistent with current practices.
> > 
> > Binary should always be Base64-encoded over DAV, and decoded by the client 
> > automatically.  Text should always be XML-escaped over DAV.
> 
> I like this way, because it makes the common case simple.
> 
> Have we a consensus? :-)

This is exactly what I was thinking today.  +1


Re: setting a convention for property values

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Peter Davis <pe...@pdavis.cx> writes:
> How about:
> 
>   bin:*                            = Always binary
>   [anything else, including svn:*] = Always text (UTF-8)
> 
> Simple and consistent with current practices.
> 
> Binary should always be Base64-encoded over DAV, and decoded by the client 
> automatically.  Text should always be XML-escaped over DAV.

I like this way, because it makes the common case simple.

Have we a consensus? :-)


Re: setting a convention for property values

Posted by Peter Davis <pe...@pdavis.cx>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wednesday 04 December 2002 12:29, Karl Fogel wrote:
>    bin:foooooo   /* Always binary. */
>    text:fooooo   /* Always text. */
>    svn:fooooo    /* All are text currently. */
>
> props, and any prop without a prefix is either rejected or, maybe,
> defaults to... What?  Bin?  Text?

How about:

  bin:*                            = Always binary
  [anything else, including svn:*] = Always text (UTF-8)

Simple and consistent with current practices.

Binary should always be Base64-encoded over DAV, and decoded by the client 
automatically.  Text should always be XML-escaped over DAV.
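
In Python-ish terms the rule is just a prefix check plus two encoders.
This is only a sketch of the encoding rule, with made-up helper names; it
does not show what ra_dav actually puts on the wire.

```python
import base64
from xml.sax.saxutils import escape, unescape


def is_binary_prop(name: str) -> bool:
    """bin:* is always binary; anything else (including svn:*) is text."""
    return name.startswith("bin:")


def encode_for_dav(name: str, value: bytes) -> str:
    """Base64 for binary props, XML-escaped UTF-8 text for the rest."""
    if is_binary_prop(name):
        return base64.b64encode(value).decode("ascii")
    return escape(value.decode("utf-8"))


def decode_from_dav(name: str, wire: str) -> bytes:
    """Inverse of encode_for_dav, done by the client automatically."""
    if is_binary_prop(name):
        return base64.b64decode(wire)
    return unescape(wire).encode("utf-8")


print(encode_for_dav("svn:log", b"a < b & c"))
print(encode_for_dav("bin:icon", b"\x00\x00\x00this is a test\n"))
```

The user never sees either encoding; propget hands back the original bytes
in both cases.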

- -- 
Peter Davis
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE97m+shDAgUT1yirARAptpAJ9PRrsrWhOLTvULn1lsrZ7Lj5V2FgCgj8LO
T6wdFl5qJZU/tvgQKyUcRrQ=
=DCNw
-----END PGP SIGNATURE-----



Re: setting a convention for property values

Posted by Florin Iucha <fl...@iucha.net>.
On Wed, Dec 04, 2002 at 06:12:54AM -0800, Kevin Pilch-Bisson wrote:
> On Wed, Dec 04, 2002 at 02:29:06PM -0600, Karl Fogel wrote:
> > Nicholas Riley <nj...@uiuc.edu> writes:
> > > How about creating a separate property namespace for binary properties
> > > then?  Or a separate mechanism?  Requiring that client developers
> > > somehow agree on a standard for encoding binary props seems like
> > > putting up -more- barriers, not less.
> > 
> > So we'd have something like
> > 
> >    bin:foooooo   /* Always binary. */
> >    text:fooooo   /* Always text. */
> >    svn:fooooo    /* All are text currently. */
> > 
> > props, and any prop without a prefix is either rejected or, maybe,
> > defaults to... What?  Bin?  Text?
> > 
> You know I like this option, but gsvn still wouldn't know how to display
> binary props.

It wouldn't know how to display arbitrary file contents either. This is
not a problem.

> What about text/plain:propname
> 
> I.e. encode the mime type in the propname?

"Easy things easy, hard things possible". The common case will be for
text.

AFAIK reiser4 intends to expose metadata/extended attributes as a
filesystem, so you could "cd" into the file and a "ls" would show you
the attribute names...

florin

-- 

"If it's not broken, let's fix it till it is."

41A9 2BDE 8E11 F1C5 87A6  03EE 34B3 E075 3B90 DFE4

Re: setting a convention for property values

Posted by Kevin Pilch-Bisson <ke...@pilch-bisson.net>.
On Wed, Dec 04, 2002 at 02:29:06PM -0600, Karl Fogel wrote:
> Nicholas Riley <nj...@uiuc.edu> writes:
> > How about creating a separate property namespace for binary properties
> > then?  Or a separate mechanism?  Requiring that client developers
> > somehow agree on a standard for encoding binary props seems like
> > putting up -more- barriers, not less.
> 
> So we'd have something like
> 
>    bin:foooooo   /* Always binary. */
>    text:fooooo   /* Always text. */
>    svn:fooooo    /* All are text currently. */
> 
> props, and any prop without a prefix is either rejected or, maybe,
> defaults to... What?  Bin?  Text?
> 
You know I like this option, but gsvn still wouldn't know how to display
binary props.

What about text/plain:propname

I.e. encode the mime type in the propname?
-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Kevin Pilch-Bisson                    http://www.pilch-bisson.net
     "Historically speaking, the presences of wheels in Unix
     has never precluded their reinvention." - Larry Wall
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Re: setting a convention for property values

Posted by Karl Fogel <kf...@newton.ch.collab.net>.
Nicholas Riley <nj...@uiuc.edu> writes:
> How about creating a separate property namespace for binary properties
> then?  Or a separate mechanism?  Requiring that client developers
> somehow agree on a standard for encoding binary props seems like
> putting up -more- barriers, not less.

So we'd have something like

   bin:foooooo   /* Always binary. */
   text:fooooo   /* Always text. */
   svn:fooooo    /* All are text currently. */

props, and any prop without a prefix is either rejected or, maybe,
defaults to... What?  Bin?  Text?


Re: setting a convention for property values

Posted by Nicholas Riley <nj...@uiuc.edu>.
On Wed, Dec 04, 2002 at 03:24:39PM -0600, Florin Iucha wrote:
> But if your object is that complex, you might want to turn it into a
> directory with files. That way, you will have more tools available to
> operate on your bits and we can avoid the property-editor-flamewar ;)

If there were support for opaque directories in svn, that would work
too. :-)

-- 
=Nicholas Riley <nj...@uiuc.edu> | <http://www.uiuc.edu/ph/www/njriley>
        Pablo Research Group, Department of Computer Science and
  Medical Scholars Program, University of Illinois at Urbana-Champaign


Re: setting a convention for property values

Posted by Florin Iucha <fl...@iucha.net>.
On Wed, Dec 04, 2002 at 03:08:11PM -0600, Jon Trowbridge wrote:
> On Wed, 2002-12-04 at 14:55, Nicholas Riley wrote:
> > How about creating a separate property namespace for binary properties
> > then?  Or a separate mechanism?  Requiring that client developers
> > somehow agree on a standard for encoding binary props seems like
> > putting up -more- barriers, not less.
> 
> In the long run, I think that namespacing properties is a losing
> proposition.
> 
> To be really useful and to interoperate, we need to know more than if a
> chunk of data is binary or not.  If we are going to have some metadata
> on our metadata, we should take it to its logical conclusion and have a
> way of attaching mime-type information to properties.  That way a client
> could at least theoretically do the right thing with the data --- like
> pop up an image viewer when you ask to look at the PNG that you've put
> in a property.
> 
> Even if binary properties are encoded in base64, the mime type should
> still be embedded in the prop in some systematic way.

You could look at $prop:mime-type to find information about $prop.
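
Something like this Python sketch, where the property list is modelled as
a plain dict and mime_type_of() is a hypothetical helper, not an svn API:

```python
from typing import Optional


def mime_type_of(props: dict, name: str) -> Optional[str]:
    """Look up the companion '<name>:mime-type' property, if any."""
    value = props.get(name + ":mime-type")
    return value if isinstance(value, str) else None


# Example property list for one file (names are made up).
props = {
    "icon": b"\x89PNG...",          # opaque value, elided here
    "icon:mime-type": "image/png",  # metadata about the "icon" prop
    "owner": "florin",
}

print(mime_type_of(props, "icon"))   # image/png
print(mime_type_of(props, "owner"))  # None
```

No new storage is needed; the companion is just another property.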

But if your object is that complex, you might want to turn it into a
directory with files. That way, you will have more tools available to
operate on your bits and we can avoid the property-editor-flamewar ;)

florin

-- 

"If it's not broken, let's fix it till it is."

41A9 2BDE 8E11 F1C5 87A6  03EE 34B3 E075 3B90 DFE4

Re: setting a convention for property values

Posted by Nicholas Riley <nj...@uiuc.edu>.
On Wed, Dec 04, 2002 at 03:08:11PM -0600, Jon Trowbridge wrote:
> In the long run, I think that namespacing properties is a losing
> proposition.

I agree.  I'm trying to keep the implementation suggestions simple
in the hope that if it's simple enough, someone will just implement
it soon and the problem will be 'solved' for a while at least.

> To be really useful and to interoperate, we need to know more than if a
> chunk of data is binary or not.  If we are going to have some metadata
> on our metadata, we should take it to its logical conclusion and have a
> way of attaching mime-type information to properties.  That way a client
> could at least theoretically do the right thing with the data --- like
> pop up an image viewer when you ask to look at the PNG that you've put
> in a property.

Yeah, this is the ideal, but would require changes to the repos
structure, right?

> Even if binary properties are encoded in base64, the mime type should
> still be embedded in the prop in some systematic way.

I agree - how about just having MIME headers at the beginning of the
binary block?  So it'd look something like this:

Content-Type: application/octet-stream
Content-Transfer-Encoding: base64

AAAAdGhpcyBpcyBhIHRlc3QK

One problem with that: retrieving the MIME info without fetching the
entire property would be a pain.  Similarly, if you want direct access
to property URLs via WebDAV, it would be a problem to pull that info
out of the body and stick it in the HTTP header.

But the ideal would be allowing attachment of arbitrary name/value
pairs, like properties, which -would- be constrained to being UTF-8.
This could get annoying if you had typed properties set on almost
every file because the metadata would be very repetitive, so it might
be nice to be able to define such properties with inheritance, or
repos-wide.  This line of reasoning was what scared me about the
complexity of the code required to implement such a scheme.  The
madness has to stop somewhere :-)

-- 
=Nicholas Riley <nj...@uiuc.edu> | <http://www.uiuc.edu/ph/www/njriley>
        Pablo Research Group, Department of Computer Science and
  Medical Scholars Program, University of Illinois at Urbana-Champaign


Re: setting a convention for property values

Posted by Jon Trowbridge <tr...@ximian.com>.
On Wed, 2002-12-04 at 14:55, Nicholas Riley wrote:
> How about creating a separate property namespace for binary properties
> then?  Or a separate mechanism?  Requiring that client developers
> somehow agree on a standard for encoding binary props seems like
> putting up -more- barriers, not less.

In the long run, I think that namespacing properties is a losing
proposition.

To be really useful and to interoperate, we need to know more than if a
chunk of data is binary or not.  If we are going to have some metadata
on our metadata, we should take it to its logical conclusion and have a
way of attaching mime-type information to properties.  That way a client
could at least theoretically do the right thing with the data --- like
pop up an image viewer when you ask to look at the PNG that you've put
in a property.

Even if binary properties are encoded in base64, the mime type should
still be embedded in the prop in some systematic way.

-JT




Re: setting a convention for property values

Posted by Nicholas Riley <nj...@uiuc.edu>.
On Wed, Dec 04, 2002 at 02:18:32PM -0600, Ben Collins-Sussman wrote:

> Jon is working on gsvn.  He gets some property values, and needs to
> display them, and these aren't svn: props.  What can he do?  He has no
> way to know if they're binary.  And if they're text, he has no idea
> how to decode them for display.  (Gnome requires that all displayed
> strings be UTF8.)  

So, declare that all text properties must be UTF-8.  Sounds reasonable.

> By not imposing any policy at all, we end up with an interoperability
> nightmare.  You may have many collaborators using one repository, but
> they all have different clients, and none of the clients have any idea
> how to share metadata with one another.  The subversion libraries are
> in a unique position to *create* interoperability by imposing UTF8/LF
> policy, just as we've done with log messages.

OTOH, this restricts what types of metadata can be stored about files.
The general tendency (see Microsoft's plans for future versions of
Windows, BeFS, Reiser4, ...) seems to be towards -more- metadata for
files, not less, and some of this is undoubtedly going to be binary in
nature and need versioning.

> I guess our assumption is that binary properties are truly rare
> things, and that client interoperability is worth "more" to us...

How about creating a separate property namespace for binary properties
then?  Or a separate mechanism?  Requiring that client developers
somehow agree on a standard for encoding binary props seems like
putting up -more- barriers, not less.

-- 
=Nicholas Riley <nj...@uiuc.edu> | <http://www.uiuc.edu/ph/www/njriley>
        Pablo Research Group, Department of Computer Science and
  Medical Scholars Program, University of Illinois at Urbana-Champaign


Re: setting a convention for property values

Posted by Ben Collins-Sussman <su...@collab.net>.
Kevin Pilch-Bisson <ke...@pilch-bisson.net> writes:

> My personal opinion is that svn: props should be utf-8, lf line
> endings.  All other props should be binary, and base64 encoded if
> that is what dav requires.

The problem with the status quo, as Jon Trowbridge puts it, is that
we're effectively slipping into the "X-windows fallacy" -- i.e. not
imposing enough policy.

Jon is working on gsvn.  He gets some property values, and needs to
display them, and these aren't svn: props.  What can he do?  He has no
way to know if they're binary.  And if they're text, he has no idea
how to decode them for display.  (Gnome requires that all displayed
strings be UTF8.)  

So we sat around and speculated about all these crazy ways to create
metadata for our metadata.  But after a while, we all realized that
that way lies madness.

By not imposing any policy at all, we end up with an interoperability
nightmare.  You may have many collaborators using one repository, but
they all have different clients, and none of the clients have any idea
how to share metadata with one another.  The subversion libraries are
in a unique position to *create* interoperability by imposing UTF8/LF
policy, just as we've done with log messages.
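
As a rough illustration of what imposing that policy means for a text
property value, here is a short Python sketch.  normalize_text_prop() is a
hypothetical name, and the caller is assumed to know the encoding of the
incoming bytes (for example from the user's locale).

```python
def normalize_text_prop(raw: bytes, source_encoding: str) -> bytes:
    """Convert a text property value to UTF-8 with LF line endings."""
    text = raw.decode(source_encoding)
    # Fold CRLF first, then any remaining bare CR, down to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")


# A Latin-1 value with mixed line endings, as a client might submit it.
submitted = "caf\u00e9\r\nsecond line\r".encode("latin-1")
print(normalize_text_prop(submitted, "latin-1"))
```

Every client then reads back the same bytes and can display them without
guessing the encoding.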

I guess our assumption is that binary properties are truly rare
things, and that client interoperability is worth "more" to us...




Re: setting a convention for property values

Posted by Nicholas Riley <nj...@uiuc.edu>.
On Wed, Dec 04, 2002 at 05:05:03AM -0800, Kevin Pilch-Bisson wrote:
> The problem with this is that we need to know for absolutely certain that our
> binary detection algorithm will always be right for this to work.  Otherwise
> we need props on props to say whether they are binary.  Props need to either
> be all binary, or not binary at all.

Yeah, you're right.

> My personal opinion is that svn: props should be utf-8, lf line
> endings.  All other props should be binary, and base64 encoded if
> that is what dav requires.

Sounds good to me, and someone needs to fix the ra_dav problems.

-- 
=Nicholas Riley <nj...@uiuc.edu> | <http://www.uiuc.edu/ph/www/njriley>
        Pablo Research Group, Department of Computer Science and
  Medical Scholars Program, University of Illinois at Urbana-Champaign
