Posted to dev@daffodil.apache.org by Mike Beckerle <mb...@tresys.com> on 2017/11/17 13:05:57 UTC

TIFF format - base+offset proposed features

This is lower priority than many things, but a while back we tried to create a DFDL schema for TIFF and failed.


DFDL features for base+offset location of elements were never designed; they were dropped in the haste to get DFDL v1.0 finished.


But I happened to ask Dave Sugar of the Tresys CADRE team whether the CADRE engine (binary data format description for wire-speed applications) could do TIFF. It turns out some work was done on this, and I was able to look at a prototype CADRE format specification for TIFF.


So the features for DFDL to support base + offset locations don't seem too daunting.


I created a wiki page to capture the design ideas here. There's more work to do on this. We really should write the DFDL schema for TIFF using these features to be sure we're happy with it. Thus far I've not taken the time to do that. I've just recorded the basic ideas.


https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74687382


Re: TIFF format - base+offset proposed features

Posted by Steve Lawrence <sl...@apache.org>.
Ok. In that case, it seems like a reasonable restriction to have. If
negative offsets are necessary, it's a backwards compatible change to
the specification, so probably not a big deal to loosen.


On 11/17/2017 08:58 AM, Mike Beckerle wrote:
> There may be a role for negative offsets.
> 
> 
> But the combination of a negative offset with the base position must always be 
> non-negative.
> 
> 
> Certainly in TIFF, the various structures can be in any order in the file. So 
> one will frequently combine an offset value that comes from the infoset with the 
> base to get a new position that is less than the current position. But I believe 
> there's no need to use negative offsets.
> 
> 
> But that's because in TIFF it is expressed in terms of offsets from a base that 
> is before ALL the objects in the file.
> 
> 
> One can imagine a situation where the offsets are "self relative". I.e., an 
> element sets the new dfdl:offsetBase to its own start position. Then if a data 
> element is before the current one, a negative offset would be needed.
> 
> 
> I don't know that this is required to express any actual format. It's 
> theoretically possible that a format might be described in this way.
> 
> 
> 
> 
> --------------------------------------------------------------------------------
> *From:* Steve Lawrence <sl...@apache.org>
> *Sent:* Friday, November 17, 2017 8:34:30 AM
> *To:* dev@daffodil.apache.org; Mike Beckerle
> *Subject:* Re: TIFF format - base+offset proposed features
> On 11/17/2017 08:05 AM, Mike Beckerle wrote:
>> This is lower priority than many things, but a while back we tried to create a DFDL schema for TIFF and failed.
>> 
>> 
>> DFDL features for base+offset location of elements never were designed. They were dropped in the haste to get DFDL v1.0 finished.
>> 
>> 
>> But I happened to ask Dave Sugar of the Tresys CADRE team whether the CADRE engine (a binary data format description - for wire-speed applications) could do TIFF? Turns out some work was done on this, and I was able to look at the format specification for TIFF (a prototype thereof for CADRE).
>> 
>> 
>> So the features for DFDL to support base + offset locations don't seem too daunting.
>> 
>> 
>> I created a wiki page to capture the design ideas here. There's more work to do on this. We really should write the DFDL schema for TIFF using these features to be sure we're happy with it. Thus far I've not taken the time to do that. I've just recorded the basic ideas.
>> 
>> 
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74687382
>> 
>> 
> 
> You specify a requirement that dfdl:offset be non-negative--could that
> be too restrictive? For example, I could imagine a (really dumb) file
> format where it essentially chains blobs together, with the position of
> the next blob relative to this one, e.g.:
> 
>    0: Blob-1, Next: +5
>    1:
>    2: Blob-4, Next: +2
>    3: Blob-3, Next: -1
>    4: Blob-5, End
>    5: Blob-2, Next: -2
> 
> I'm not sure supporting a format like this would be possible without
> negative offsets. If no format has something like this I'm not sure it
> matters and restricting to non-negative is completely reasonable, but I
> thought you could do something like this with TIFF? Maybe not?
> 
> - Steve
> 


Re: TIFF format - base+offset proposed features

Posted by Mike Beckerle <mb...@tresys.com>.
There may be a role for negative offsets.

But the combination of a negative offset with the base position must always be non-negative.


Certainly in TIFF, the various structures can be in any order in the file. So one will frequently combine an offset value that comes from the infoset with the base to get a new position that is less than the current position. But I believe there's no need to use negative offsets.


But that's because in TIFF it is expressed in terms of offsets from a base that is before ALL the objects in the file.
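
A minimal Python sketch of how TIFF offsets resolve, for illustration only. It assumes the published TIFF 6.0 layout: an 8-byte header (byte-order mark, magic number 42, offset of the first IFD), followed by IFDs made of 12-byte entries, with every offset counted from the start of the file. The helper names read_header and read_ifd_entries and the tiny hand-built file are made up for the example.

```python
import struct

def read_header(data: bytes):
    """Return (struct endianness prefix, offset of the first IFD)."""
    order = data[:2]
    if order == b"II":
        endian = "<"   # little-endian
    elif order == b"MM":
        endian = ">"   # big-endian
    else:
        raise ValueError("not a TIFF byte-order mark")
    magic, first_ifd = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    return endian, first_ifd

def read_ifd_entries(data: bytes, endian: str, ifd_offset: int):
    """Return ([(tag, type, count, value_or_offset), ...], next_ifd_offset)."""
    (count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    entries = []
    pos = ifd_offset + 2
    for _ in range(count):
        entries.append(struct.unpack_from(endian + "HHII", data, pos))
        pos += 12  # each IFD entry is 12 bytes
    (next_ifd,) = struct.unpack_from(endian + "I", data, pos)
    return entries, next_ifd

# Hand-built file (illustrative): header at 0, a 6-byte ASCII value at 8,
# and the IFD at 14, i.e. the IFD comes AFTER the data it points to.
payload = b"hello\x00"
header = b"II" + struct.pack("<HI", 42, 8 + len(payload))
entry = struct.pack("<HHII", 270, 2, len(payload), 8)  # tag 270, type 2 (ASCII)
tiff = header + payload + struct.pack("<H", 1) + entry + struct.pack("<I", 0)

endian, first_ifd = read_header(tiff)
entries, next_ifd = read_ifd_entries(tiff, endian, first_ifd)
tag, field_type, count, value_offset = entries[0]
value = tiff[value_offset:value_offset + count]
```

Here the value offset (8) is smaller than the position of the IFD being read (14), so the reader moves backward in the file, yet the offset itself is still non-negative because it is measured from the start of the file.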


One can imagine a situation where the offsets are "self relative". I.e., an element sets the new dfdl:offsetBase to its own start position. Then if a data element is before the current one, a negative offset would be needed.


I don't know that this is required to express any actual format. It's theoretically possible that a format might be described in this way.
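
The constraint on base plus offset can be sketched in a few lines of Python. This is only an illustration: resolve_position is a hypothetical helper, not a DFDL or Daffodil API, and it treats the base and the offset as plain integers.

```python
def resolve_position(base: int, offset: int) -> int:
    """Absolute position = base + offset.

    The offset may be negative, but the resulting position may not be.
    """
    position = base + offset
    if position < 0:
        raise ValueError(
            f"base {base} + offset {offset} lies before the start of the data"
        )
    return position

# TIFF-style: the base precedes every object, so offsets are never
# negative, even when the target is before the current position.
file_base = 0
current_position = 14
target = resolve_position(file_base, 8)

# Self-relative: the base is the element's own start position, so
# reaching earlier data requires a negative offset.
element_start = 14
earlier = resolve_position(element_start, -6)
```

Both calls land on position 8. Only the choice of base decides whether the offset has to be negative.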




________________________________
From: Steve Lawrence <sl...@apache.org>
Sent: Friday, November 17, 2017 8:34:30 AM
To: dev@daffodil.apache.org; Mike Beckerle
Subject: Re: TIFF format - base+offset proposed features

On 11/17/2017 08:05 AM, Mike Beckerle wrote:
> This is lower priority than many things, but a while back we tried to create a DFDL schema for TIFF and failed.
>
>
> DFDL features for base+offset location of elements never were designed. They were dropped in the haste to get DFDL v1.0 finished.
>
>
> But I happened to ask Dave Sugar of the Tresys CADRE team whether the CADRE engine (a binary data format description - for wire-speed applications) could do TIFF? Turns out some work was done on this, and I was able to look at the format specification for TIFF (a prototype thereof for CADRE).
>
>
> So the features for DFDL to support base + offset locations don't seem too daunting.
>
>
> I created a wiki page to capture the design ideas here. There's more work to do on this. We really should write the DFDL schema for TIFF using these features to be sure we're happy with it. Thus far I've not taken the time to do that. I've just recorded the basic ideas.
>
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74687382
>
>

You specify a requirement that dfdl:offset be non-negative--could that
be too restrictive? For example, I could imagine a (really dumb) file
format where it essentially chains blobs together, with the position of
the next blob relative to this one, e.g.:

  0: Blob-1, Next: +5
  1:
  2: Blob-4, Next: +2
  3: Blob-3, Next: -1
  4: Blob-5, End
  5: Blob-2, Next: -2

I'm not sure supporting a format like this would be possible without
negative offsets. If no format has something like this I'm not sure it
matters and restricting to non-negative is completely reasonable, but I
thought you could do something like this with TIFF? Maybe not?

- Steve

Re: TIFF format - base+offset proposed features

Posted by Steve Lawrence <sl...@apache.org>.
On 11/17/2017 08:05 AM, Mike Beckerle wrote:
> This is lower priority than many things, but a while back we tried to create a DFDL schema for TIFF and failed.
> 
> 
> DFDL features for base+offset location of elements never were designed. They were dropped in the haste to get DFDL v1.0 finished.
> 
> 
> But I happened to ask Dave Sugar of the Tresys CADRE team whether the CADRE engine (a binary data format description - for wire-speed applications) could do TIFF? Turns out some work was done on this, and I was able to look at the format specification for TIFF (a prototype thereof for CADRE).
> 
> 
> So the features for DFDL to support base + offset locations don't seem too daunting.
> 
> 
> I created a wiki page to capture the design ideas here. There's more work to do on this. We really should write the DFDL schema for TIFF using these features to be sure we're happy with it. Thus far I've not taken the time to do that. I've just recorded the basic ideas.
> 
> 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=74687382
> 
> 

You specify a requirement that dfdl:offset be non-negative--could that
be too restrictive? For example, I could imagine a (really dumb) file
format where it essentially chains blobs together, with the position of
the next blob relative to this one, e.g.:

  0: Blob-1, Next: +5
  1:
  2: Blob-4, Next: +2
  3: Blob-3, Next: -1
  4: Blob-5, End
  5: Blob-2, Next: -2

I'm not sure supporting a format like this would be possible without
negative offsets. If no format has something like this I'm not sure it
matters and restricting to non-negative is completely reasonable, but I
thought you could do something like this with TIFF? Maybe not?
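
A minimal Python sketch of walking the chained-blob layout above, assuming each "Next" value is relative to the slot that holds it. The slots table and walk_chain are hypothetical names used only for this illustration.

```python
# Slot index -> (blob name, relative offset to the next blob, or None at the end).
slots = {
    0: ("Blob-1", +5),
    2: ("Blob-4", +2),
    3: ("Blob-3", -1),
    4: ("Blob-5", None),
    5: ("Blob-2", -2),
}

def walk_chain(slots, start=0):
    """Follow the relative 'Next' offsets and return blob names in chain order."""
    order = []
    seen = set()
    position = start
    while True:
        if position < 0 or position not in slots:
            raise ValueError(f"offset leads to invalid slot {position}")
        if position in seen:
            raise ValueError(f"cycle detected at slot {position}")
        seen.add(position)
        name, next_offset = slots[position]
        order.append(name)
        if next_offset is None:
            return order
        # Self-relative: the new position is this slot plus the offset.
        position += next_offset

chain = walk_chain(slots)
```

The walk visits slots 0, 5, 3, 2, 4, so two of the four hops need a negative offset when the base is the current slot.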

- Steve