Posted to dev@directory.apache.org by Alex Karasulu <ao...@bellsouth.net> on 2005/02/24 05:33:47 UTC

[asn1] why use TLV objects at all?

Emmanuel,

I was just thinking about your position on object creation.  Namely the 
one that is against the creation of Tuple objects that represent TLVs.  
Your proposal to use pooling of these objects worries me a bit.  It just 
makes me think there would be a lot of synchronization overhead.  I may 
be wrong.

However I started thinking, "why create Tuples at all?" Follow my 
concepts here for a sec even though we have not been discussing these 
constructs: TupleProducers and TupleConsumers.  A producer simply emits 
callbacks to a consumer and they are bound to each other.  What if the 
callbacks did not pass in a Tuple as an argument but the components T, L 
and V of the Tuple instead.  A stub, which is like the parser you 
mentioned, tracks and changes state as an automaton to populate its 
properties appropriately with the stream of Tuple events.  The stub can 
be a TupleConsumer - really a tuple event consumer rather.  This would 
eliminate object creation overheads and populate the stub.
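
A minimal sketch of the idea above, assuming a hypothetical callback interface named TupleEventConsumer and a hand-written PersonStub; neither name comes from the existing code base. The producer passes T, L and V straight into the callbacks, and the stub keeps just enough state to know which property the current value belongs to.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical callback interface: the producer passes T, L and V directly,
// so no Tuple object is created per TLV.
interface TupleEventConsumer {
    void tag(int tag, boolean constructed);
    void length(int length);
    void value(ByteBuffer chunk);  // may arrive in several chunks
    void finish();                 // the current TLV is complete
}

// Hand-written stub that acts as a small automaton over the event stream.
// Assumed layout: SEQUENCE { INTEGER id, OCTET STRING name }.
class PersonStub implements TupleEventConsumer {
    private static final int INTEGER = 0x02;
    private static final int OCTET_STRING = 0x04;

    private int currentTag = -1;
    private final ByteArrayOutputStream nameBytes = new ByteArrayOutputStream();

    int id;
    String name;

    @Override public void tag(int tag, boolean constructed) {
        currentTag = constructed ? -1 : tag;  // only primitives carry a value
    }

    @Override public void length(int length) {
        // A real stub would validate the length here; this sketch ignores it.
    }

    @Override public void value(ByteBuffer chunk) {
        while (chunk.hasRemaining()) {
            int b = chunk.get() & 0xFF;
            if (currentTag == INTEGER) {
                id = (id << 8) | b;          // big-endian, non-negative only
            } else if (currentTag == OCTET_STRING) {
                nameBytes.write(b);
            }
        }
    }

    @Override public void finish() {
        if (currentTag == OCTET_STRING) {
            name = new String(nameBytes.toByteArray(), StandardCharsets.UTF_8);
        }
        currentTag = -1;
    }
}

class TupleEventDemo {
    public static void main(String[] args) {
        PersonStub stub = new PersonStub();
        // Events a producer might emit for 30 09 02 01 2A 04 04 41 6C 65 78.
        stub.tag(0x30, true);  stub.length(9);
        stub.tag(0x02, false); stub.length(1);
        stub.value(ByteBuffer.wrap(new byte[] {0x2A}));
        stub.finish();
        stub.tag(0x04, false); stub.length(4);
        stub.value(ByteBuffer.wrap("Alex".getBytes(StandardCharsets.UTF_8)));
        stub.finish();
        stub.finish();  // closes the SEQUENCE
        System.out.println(stub.id + " " + stub.name);
    }
}
```

The only per-TLV allocation left in this sketch is the value itself; the tag and length travel as plain ints.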

Thoughts?

    -Alex

Re: [asn1] why use TLV objects at all?

Posted by Alex Karasulu <ao...@bellsouth.net>.

Emmanuel Lecharny wrote:

>Hi all !
>
>
>On Thursday 24 February 2005 at 01:55 -0500, Alex Karasulu wrote:
>  
>
>>Alan D. Cabrera wrote:
>>
>>    
>>
>>>Alex Karasulu wrote:
>>>
>>>      
>>>
>>>>Alan D. Cabrera wrote:
>>>>
>>>>        
>>>>
>>>>>Alex Karasulu wrote:
>>>>>
>>>>>          
>>>>>
>>>>>>Emmanuel,
>>>>>>
>>>>>>I was just thinking about your position on object creation.  Namely 
>>>>>>the one that is against the creation of Tuple objects that 
>>>>>>represent TLVs.  Your proposal to use pooling of these objects 
>>>>>>worries me a bit.  It just makes me think there would be a lot of 
>>>>>>synchronization overhead.  I may be wrong.
>>>>>>            
>>>>>>
>
>No synchronisation: it's a local pool, each thread has its own pool. You
>won't have 10 000 threads, so it's ok.
>
>  
>
Awesome! That works really well - too bad I did not think of that.

>>>>>I was also concerned by this as it may require that you keep a rough 
>>>>>factor of 2 more memory, one for the Tuple structure of the message 
>>>>>and one for the POJO that you are creating.
>>>>>          
>>>>>
>
>TLVs are allocated already, so it's not a problem. The value part will be
>copied to stubs as the stub reads it, so this is the only variable part.
>You create the value while reading the PDU, and pass it to the stub. So
>memory consumption is just sizeof(stubs) + sizeof(data) + sizeof
>(preallocated TLV). It's really important that the memory footprint is
>somewhat static, even if big at the beginning. We are trading initial
>memory need against stability in the long term.
>
>  
>
>>>>It would be if we were collecting all PDU tuples to form a TLV tree.  
>>>>However the idea is to use and release whatever is allocated to the 
>>>>tuple.  In this case, the only time you have two copies of a datum 
>>>>(tuple value) is when you are holding on to the value long enough to 
>>>>set a stub's property using the tuple value.
>>>>        
>>>>
>
>We don't have two copies of a datum. When the TLV is a Primitive one, as
>soon as its data has been completely read, we can pass it to the
>POJO/Stub. We just keep a reference to it in the TLV. The only
>duplication which could occur is the T and L parts, but, again, it's not
>simply a duplication: TLVs are allocated from the beginning, and won't be
>released until you stop the server.
>
>  
>
>>>>Furthermore if we implement the strategy of streaming a large value 
>>>>to disk (say a JPEG photo) then the value is just a URI to access the 
>>>>stream later on.  This URI is what is set as the stub property 
>>>>value.  So in this case we don't have the double hit as mentioned 
>>>>above where a value is in memory in a Tuple and duplicated in the 
>>>>value of the stub property.
>>>>        
>>>>
>
>+1 for the URI. It could also be a sub-classed StreamedTLV, which has
>the same interface. The implementation of its getData method will handle
>the situation.
>
>  
>
>>>So we only keep a stack of tuples?
>>>      
>>>
>>You mean constructed tuples for nesting?  Depends on the stub.  I don't 
>>think even that may be needed.  Don't know for sure yet though.
>>    
>>
>
>Stub/POJO is just the final representation of the data. Obviously we can
>avoid all that TLV plumbing if we have a compiler that handles it. A
>depth-first decoding strategy is faster than a two-layer
>parser/lexer strategy, but it's much more complicated.
>
>
>  
>
>>>>>>However I started thinking, "why create Tuples at all?" Follow my 
>>>>>>concepts here for a sec even though we have not been discussing 
>>>>>>these constructs: TupleProducers and TupleConsumers.  A producer 
>>>>>>simply emits callbacks to a consumer and they are bound to each 
>>>>>>other.  What if the callbacks did not pass in a Tuple as an 
>>>>>>argument but the components T, L and V of the Tuple instead.  A 
>>>>>>stub, which is like the parser you mentioned, tracks and changes 
>>>>>>state as an automaton to populate its properties appropriately with 
>>>>>>the stream of Tuple events.  The stub can be a TupleConsumer - 
>>>>>>really a tuple event consumer rather.  This would eliminate object 
>>>>>>creation overheads and populate the stub. 
>>>>>>            
>>>>>>
>
>If you want to check L, you need to keep track of the Constructed
>TLVs. Primitive TLVs are not very important, we can discard them
>immediately, just keeping their V part. So keeping a stack of Constructed
>TLVs is just a matter of verifying lengths.
>
>  
>
>>>>>Could you not flatten it even further by making a compiler generated 
>>>>>stub act as both the producer and consumer?  This is the tack that I 
>>>>>am taking with my "smart" stubs.
>>>>>          
>>>>>
>
>Yes for sure. But then it will become difficult to track bad PDUs (I
>mean, PDUs in which the Lengths are not correct).
>
>
>
>  
>
>>>>I highly discourage this approach.  Reason being the nature of the 
>>>>relationship between ASN.1 and encodings.  As you know an ASN.1 spec 
>>>>can use any encoding.  Conventionally a protocol specifies an 
>>>>encoding and sticks to it so it seems to support your approach.  This 
>>>>however is not always the case and ASN.1 is being used in new ways 
>>>>where alternate encodings are being applied to different data 
>>>>structures based on the target: i.e. GSM network clients.  However 
>>>>these are not the strongest cases for why you should avoid this 
>>>>"smart" stub approach IMO.
>>>>        
>>>>
>>>Each stub is specific to a particular encoding.  It is the POJOs that 
>>>are used that are universal to the encodings.
>>>      
>>>
>>Ahh ok you mean there's a difference between the stub and a POJO.  I 
>>thought the pojo is the stub.  Or are you referring to some base class or 
>>POJI?
>>    
>>
>
>We should agree on terms, don't you think so? In my mind, a POJO is an
>instance of an ASN.1 path through a specific grammar (for example, a
>LdapBindResponse POJO, or a LdapSearch POJO). The stub is the class that
>feeds the POJO with Data. So the stub is the POJO producer/consumer (with
>or without TLV). wdyt ?
>
>  
>
Hmmm I was thinking pojo and stub were one and the same but it need not 
be.  So it's making sense now what Alan was referring to.

>  
>
>>>>The most important reason is to decouple the generation of encoding 
>>>>specific code from the stub compiler.  If you make your stubs 
>>>>"encoding aware" then you're adding some serious complexity to the stub 
>>>>compiler IMO.  Why do this when you can avoid it and gain the ability 
>>>>to swap out the encoding at runtime?
>>>>        
>>>>
>>>You have the ability to swap out encodings at runtime, you can just 
>>>switch stubs. 
>>>      
>>>
>
>Both of you are right, it's just a question of "how long will it take to
>write the compiler?" versus "do we really need a compiler at the
>moment?"
>
>  
>
For me I want a faster, better, easier to maintain LDAP runtime while 
consolidating the DER needs for Kerberos.  If we got BER and DER done 
tight this is what directory is after.  However I really want a generic 
stub compiler for the future and want to balance this.  At the end 
though Directory concerns will outweigh generic ASN.1.

>>So interface or base class is the same but concrete implementation is 
>>the stub for a particular encoding?
>>
>>    
>>
>>>>The way I like to visualize this is ... there is a common 
>>>>representation the stub compiler needs to work with.  Rather than 
>>>>read bytes from a stream it responds to tuple events as its input at 
>>>>a higher level.  Regardless of the type of encoding at the lowest 
>>>>level the stub compiler and the stubs it generates need not be 
>>>>aware.  It's sort of like the way javac works with the underlying 
>>>>runtime: the compiled code as byte codes are bound at runtime to the 
>>>>underlying native code to do the actual work using native code.  
>>>>Similarly here I'm recommending that the stub compiler generate a 
>>>>stub which deals only with TLVs and at runtime the source/target can 
>>>>be a BER, DER, PER binary stream or even a XER encoded ascii stream.
>>>>I think perhaps some of your concerns on the stub compiler side 
>>>>revolve around finding a tangible way for the antlr based stub 
>>>>compiler to generate code that deals with a TLV stream rather than a 
>>>>byte/char stream.  I too have this problem - it is not easy.  In this 
>>>>regard the approach of making the stub totally encoding aware may 
>>>>seem easier to do.
>>>>        
>>>>
>>>IIUC, PER does not use TLVs.  You need to know the structure of your 
>>>ASN1 object to decode the stream.
>>>
>>>Keep it simple.  We may as well remove the layer and generate protocol 
>>>specific stubs.
>>>      
>>>
>>If you're writing the stub compiler then it's your call.  However I'm 
>>still not convinced this is keeping it simple.  Have you already 
>>finished the parts of the compiler that can handle different encodings?
>>    
>>
>
>Alan is perfectly right. PER is really inseparable from the specific
>ASN.1 grammar it encodes. You need to know the semantics of the incoming
>data to decode it, because T and L are optional (I mean, not optional
>in a way you are allowed to skip them, but they depend on the grammar).
>So when writing an ASN.1 decoder for a specific grammar using a PER
>decoder, the layer approach is totally useless.  It's much more
>something like "give me the 5 following bits that I know represent the
>Value I'm reading" decoder. Quite complicated to implement, but it's the
>way it works. You need a compiler.
>
>  
>
Ok PER is the odd man out here.
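
To illustrate why a TLV layer does not fit here, the following is a rough sketch of the "give me the 5 following bits" style of decoding. BitReader is a hypothetical helper, not an existing class, and it covers only the bit extraction: a real PER decoder also deals with alignment, length determinants and extension markers. The point is that the caller must already know from the grammar how wide each field is, for example that a value constrained to 0..31 fits in five bits.

```java
// Hypothetical MSB-first bit reader. The field widths are not in the
// stream; the caller supplies them from its knowledge of the grammar.
final class BitReader {
    private final byte[] data;
    private int bitPos;  // absolute bit offset from the start of data

    BitReader(byte[] data) {
        this.data = data;
    }

    // Reads the next 'count' bits as an unsigned integer.
    int readBits(int count) {
        if (count < 0 || count > 31) {
            throw new IllegalArgumentException("count must be 0..31");
        }
        if (bitPos + count > data.length * 8) {
            throw new IllegalStateException("not enough bits left in the PDU");
        }
        int result = 0;
        for (int i = 0; i < count; i++) {
            int currentByte = data[bitPos >> 3] & 0xFF;
            int bit = (currentByte >> (7 - (bitPos & 7))) & 1;
            result = (result << 1) | bit;
            bitPos++;
        }
        return result;
    }
}

class BitReaderDemo {
    public static void main(String[] args) {
        // Bits: 10110 0111 0000000
        byte[] pdu = {(byte) 0b10110011, (byte) 0b10000000};
        BitReader in = new BitReader(pdu);
        int first = in.readBits(5);   // a field the grammar says is 5 bits wide
        int second = in.readBits(4);  // a following 4-bit field
        System.out.println(first + " " + second);
    }
}
```

Nothing in the byte stream marks where the first field ends and the second begins, which is why generated, grammar-specific code is needed.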

>If you have this PER enabled codec compiler, having the same BER/CER/DER
>enabled codec compiler is a piece of cake. No layers, no TLVs, just a
>compiler.
>
>This could also be a perfect new Apache project : Apache ASN.1 free
>compiler. (a SNACC GPLized, in a way !)
>  
>
ehem ASLized :-).

>How far are we from this target? 
>
>  
>
Alan would know that but the last I asked he's got some way to go.

>I also want to realize what is the cost of coding/decoding data against
>the cost of fetching/storing them in the database. If it's a 50/50
>ratio, we could have major performance improvement by first implementing
>the layered approach then the compiler one when it's ready. If it's a
>10/90, forget about layers. Let's focus on compiler on one side, and
>other performance issues on the other.
>
>wdyt?
>  
>
Like the staged approach: +1 for that.

Alex

Re: [asn1] why use TLV objects at all?

Posted by Enrique Rodriguez <er...@apache.org>.

Alex Karasulu wrote:
> Emmanuel Lecharny wrote:
> 
>>> I am not in favor of cleaning up old code that's going to be replaced 
>>> anyway.  What do you think of the idea of hand writing the stubs for 
>>> LDAP using our "new" runtime?
>>
>> Hum, well, do not tell that to Alex : it's not so much a cleaning
>> process ... Much more a wipe out process ;-)
>>
> I'm completely fine with garbage collecting - I think I pointed that out 
> in another email.  It's not my code here!  It's our code.  Code ownership 
> mentality is *NOT* something we want to promote here or anywhere else at 
> the ASF.
> Let's wipe it like apseda.  These are the days I've been praying for.  
> People who know what they're doing to come around and fix some obvious 
> problems.  I knew just how nasty the ASN.1 code was while writing it but 
> I just kept going instead of giving in to my rewrite instincts.  I had 
> no choice in this or we would not be able to graduate incubation around 
> now.

Yeah, my same two cents on the DER stuff Kerberos is using.  I can't 
wait to toss it.

As I mentioned to Emmanuel on chat, it is quite common for Kerberos to 
send bad structures to the ASN.1 decoder, so my only wish is for fail-fast.

I don't think I have anything else to add to the ASN.1 discussion, but 
when it's ready, I'll do the work to integrate it with Kerberos.  Oh, 
and the changepw protocol uses Kerberos infrastructure and has its own 
ASN.1 structures, so I'll port changepw, too.  Changepw is described in 
RFC 3244 and you can conceptually think of it as the update to kpasswd, 
the old Kerberos change password protocol.

-enrique

Re: [asn1] why use TLV objects at all?

Posted by Alex Karasulu <ao...@bellsouth.net>.

Emmanuel Lecharny wrote:

>>I am not in favor of cleaning up old code that's going to be replaced 
>>anyway.  What do you think of the idea of hand writing the stubs for 
>>LDAP using our "new" runtime?
>>    
>>
>
>Hum, well, do not tell that to Alex : it's not so much a cleaning
>process ... Much more a wipe out process ;-)
>
>  
>
I'm completely fine with garbage collecting - I think I pointed that out 
in another email.  It's not my code here!  It's our code.  Code ownership 
mentality is *NOT* something we want to promote here or anywhere else at 
the ASF. 

Let's wipe it like apseda.  These are the days I've been praying for.  
People who know what they're doing to come around and fix some obvious 
problems.  I knew just how nasty the ASN.1 code was while writing it but 
I just kept going instead of giving in to my rewrite instincts.  I had 
no choice in this or we would not be able to graduate incubation around now.

>+1 to hand write the stubs. I was thinking about an XSD -> POJO and
>on top of this a hand written stub.
>  
>
+1 yes I agree.

>Don't misunderstand what I'm saying about XSD -> POJO : it's a Write and
>Forget weapon, something that will cost two or three days, and can be
>put back in a trash bin when the compiler is ready to go by itself.
>
>Any other better ideas?
>  
>
That's not a bad thing.  We're proposing it for the current runtime - so 
why not do it with an intermediate XSD representation. 

Alex

Re: [asn1] why use TLV objects at all?

Posted by Emmanuel Lecharny <el...@iktek.com>.

> I am not in favor of cleaning up old code that's going to be replaced 
> anyway.  What do you think of the idea of hand writing the stubs for 
> LDAP using our "new" runtime?

Hum, well, do not tell that to Alex : it's not so much a cleaning
process ... Much more a wipe out process ;-)

+1 to hand write the stubs. I was thinking about an XSD -> POJO and
on top of this a hand written stub.

Don't misunderstand what I'm saying about XSD -> POJO : it's a Write and
Forget weapon, something that will cost two or three days, and can be
put back in a trash bin when the compiler is ready to go by itself.

Any other better ideas?

Re: [asn1] why use TLV objects at all?

Posted by "Alan D. Cabrera" <ad...@toolazydogs.com>.


Emmanuel Lecharny wrote:

>not yet. I'm just focusing on the plumbing under LDAP messages, as it
>was quite intricate in the current code, and as said on the WS :
>
>"The internal 0.2 release was the first successful attempt to produce a
>replacement for Snacc4J ... However the library does have performance
>problems ... Furthermore the code base is very difficult to maintain
>needing some reorganization. We hope to refactor the library so it is
>more efficient, and easier to maintain while reducing the number of
>dependencies it has. In the process we would like to introduce some new
>features and improvements which are listed below ..."
>
>So I tried to bring my 2 cents there.
>  
>

I am not in favor of cleaning up old code that's going to be replaced 
anyway.  What do you think of the idea of hand writing the stubs for 
LDAP using our "new" runtime?


Regards,
Alan

Re: [asn1] why use TLV objects at all?

Posted by Emmanuel Lecharny <el...@iktek.com>.

> > That, I feel we can do with a new runtime and hand coded stubs.  We 
> > can clock in an early win and use real-world server code to performance 
> > test our implementation.
> >
> Ok start writing the code.  I've said in several emails as a reminder 
> ... get a better runtime done and we'll gc() the existing runtime.  No 
> problem by me.
> 
> -Alex


Alex, You're Kool !

Re: [asn1] why use TLV objects at all?

Posted by Alex Karasulu <ao...@bellsouth.net>.

Alan D. Cabrera wrote:

>
>
> Alex Karasulu wrote:
>
>> Yeah guys our primary goals are obviously ApacheDS based.  Which 
>> includes BER and DER concerns for the protocols we support.  However 
>> that does not mean we cannot balance this to get a general Stub 
>> compiler project from this.  Plus you never know where life takes 
>> us.  We may want to build our own protocol and use it for something 
>> in ApacheDS and/or implement other ASN.1 based protocols ...
>>
>> I think we're more than capable of collaborating together to get the 
>> pie in the sky dream incrementally while reaching our Directory 
>> centric objectives.
>>
>> <snip/> 
>
>
>
> That, I feel we can do with a new runtime and hand coded stubs.  We 
> can clock in an early win and use real-world server code to performance 
> test our implementation.
>
Ok start writing the code.  I've said in several emails as a reminder 
... get a better runtime done and we'll gc() the existing runtime.  No 
problem by me.

-Alex

Re: [asn1] why use TLV objects at all?

Posted by "Alan D. Cabrera" <ad...@toolazydogs.com>.


Alex Karasulu wrote:

> Yeah guys our primary goals are obviously ApacheDS based.  Which 
> includes BER and DER concerns for the protocols we support.  However 
> that does not mean we cannot balance this to get a general Stub 
> compiler project from this.  Plus you never know where life takes us.  
> We may want to build our own protocol and use it for something in 
> ApacheDS and/or implement other ASN.1 based protocols ...
>
> I think we're more than capable of collaborating together to get the 
> pie in the sky dream incrementally while reaching our Directory 
> centric objectives.
>
> <snip/> 


That, I feel we can do with a new runtime and hand coded stubs.  We can 
clock in an early win and use real-world server code to performance test 
our implementation.


Regards,
Alan

Re: [asn1] why use TLV objects at all?

Posted by Alex Karasulu <ao...@bellsouth.net>.

Emmanuel Lecharny wrote:

<snip/>

>>IIUC, we already have a working version of ASN1 to get the ADS projects 
>>up and running. 
>>    
>>
>
>not yet. I'm just focusing on the plumbing under LDAP messages, as it
>was quite intricate in the current code, and as said on the WS :
>
>"The internal 0.2 release was the first successful attempt to produce a
>replacement for Snacc4J ... However the library does have performance
>problems ... Furthermore the code base is very difficult to maintain
>needing some reorganization. We hope to refactor the library so it is
>more efficient, and easier to maintain while reducing the number of
>dependencies it has. In the process we would like to introduce some new
>features and improvements which are listed below ..."
>
>So I tried to bring my 2 cents there.
>  
>
Oh and you did.  Instead of refactoring we can do a full rewrite hehe.  
That's the new mindset I've gained from our interactions.

>> It is my goal to write a compiler that generates ASTs 
>>which are translated into PO*Os and their family of encoders/decoders.  
>>The ASTs allow us to target Java, C#, C++, etc.  The ASTs also allow us 
>>to target new encodings when and if they come.  ASTs, I realize that 
>>this is blue sky dreaming, allow for coding optimizations specific to 
>>the protocol and target language, maybe even OS. 
>>    
>>
>
>Hem, I'm not so sure that ASTs could be helpful to optimize anything, as
>there is a 1 to 1 relation between the ASN.1 grammar of a protocol and
>the ?ER to (en/de)code. I may be wrong...
>
>Whatever, compilers are GOOD ! And having a compiler to generate ASN.1
>codecs is definitely a plus, especially if the underlying coding is
>PER.
>
>  
>
Yeah guys our primary goals are obviously ApacheDS based.  Which 
includes BER and DER concerns for the protocols we support.  However 
that does not mean we cannot balance this to get a general Stub compiler 
project from this.  Plus you never know where life takes us.  We may 
want to build our own protocol and use it for something in ApacheDS 
and/or implement other ASN.1 based protocols ...

I think we're more than capable of collaborating together to get the pie 
in the sky dream incrementally while reaching our Directory centric 
objectives.

<snip/>

    -Alex

Re: [asn1] why use TLV objects at all?

Posted by Emmanuel Lecharny <el...@iktek.com>.

> >I also want to realize what is the cost of coding/decoding data against
> >the cost of fetching/storing them in the database. If it's a 50/50
> >ratio, we could have major performance improvement by first implementing
> >the layered approach then the compiler one when it's ready. If it's a
> >10/90, forget about layers. Let's focus on compiler on one side, and
> >other performance issues on the other.
> >
> >wdyt?
> >  
> >
> 
> This may be one of our "misunderstandings".  What is our goal? 

I pretty much feel it as a communication problem, as I'm quite a
newcomer on this project, so I missed a lot of explicit and implicit
information. We have to write down all the implications of our choices
(our = community) in order to share the reasons we have made those
choices (be they good or bad). That takes time...


> IIUC, we already have a working version of ASN1 to get the ADS projects 
> up and running. 

not yet. I'm just focusing on the plumbing under LDAP messages, as it
was quite intricate in the current code, and as said on the WS :

"The internal 0.2 release was the first successful attempt to produce a
replacement for Snacc4J ... However the library does have performance
problems ... Furthermore the code base is very difficult to maintain
needing some reorganization. We hope to refactor the library so it is
more efficient, and easier to maintain while reducing the number of
dependencies it has. In the process we would like to introduce some new
features and improvements which are listed below ..."

So I tried to bring my 2 cents there.


>  It is my goal to write a compiler that generates ASTs 
> which are translated into PO*Os and their family of encoders/decoders.  
> The ASTs allow us to target Java, C#, C++, etc.  The ASTs also allow us 
> to target new encodings when and if they come.  ASTs, I realize that 
> this is blue sky dreaming, allow for coding optimizations specific to 
> the protocol and target language, maybe even OS. 

Hem, I'm not so sure that ASTs could be helpful to optimize anything, as
there is a 1 to 1 relation between the ASN.1 grammar of a protocol and
the ?ER to (en/de)code. I may be wrong...

Whatever, compilers are GOOD ! And having a compiler to generate ASN.1
codecs is definitely a plus, especially if the underlying coding is
PER.


> It's a big task and I welcome, no, *need*, any help that I can get.  To 
> that end, I hope that this goal is aligned with what the ADS community 
> needs.

It is aligned, as ASN.1 is widely used in LDAP, Kerberos, and many other
protocols we may have to deal with (even if RFCs are not very respectful
of ASN.1 syntax and semantics ;-)

But it can be extended to a point where it will become a standalone
project (the first free PER enabled compiler in the universe !)

> I am excited because it is clear that you have a thorough understanding 
> of ASN1 encodings and also seem to be well versed in compiler development.

I'm much more used to dealing with compilers, as it was one of my favorite
subjects, and still is. ASN.1 is something that I'm following from a
distance, as I worked for Marben a decade ago. Funny that those
X400/X500/ASN.1 were buried a little bit too quickly ! (well, X400,
R.I.P) 


So, we absolutely have to work together (not only both of us) on these
subjects, and I'm a volunteer to help as much as I can !

Re: [asn1] why use TLV objects at all?

Posted by "Alan D. Cabrera" <ad...@toolazydogs.com>.

Emmanuel Lecharny wrote:

>Hi all !
>
>
>Both of you are right, it's just a question of "how long will it take to
>write the compiler?" versus "do we really need a compiler at the
>moment?"
>

>Alan is perfectly right. PER is really inseparable from the specific
>ASN.1 grammar it encodes. You need to know the semantics of the incoming
>data to decode it, because T and L are optional (I mean, not optional
>in a way you are allowed to skip them, but they depend on the grammar).
>So when writing an ASN.1 decoder for a specific grammar using a PER
>decoder, the layer approach is totally useless.  It's much more
>something like "give me the 5 following bits that I know represent the
>Value I'm reading" decoder. Quite complicated to implement, but it's the
>way it works. You need a compiler.
>
>If you have this PER enabled codec compiler, having the same BER/CER/DER
>enabled codec compiler is a piece of cake. No layers, no TLVs, just a
>compiler.
>
>This could also be a perfect new Apache project : Apache ASN.1 free
>compiler. (a SNACC GPLized, in a way !)
>
>How far are we from this target? 
>
>I also want to realize what is the cost of coding/decoding data against
>the cost of fetching/storing them in the database. If it's a 50/50
>ratio, we could have major performance improvement by first implementing
>the layered approach then the compiler one when it's ready. If it's a
>10/90, forget about layers. Let's focus on compiler on one side, and
>other performance issues on the other.
>
>wdyt?
>  
>

This may be one of our "misunderstandings".  What is our goal? 

IIUC, we already have a working version of ASN1 to get the ADS projects 
up and running.  It is my goal to write a compiler that generates ASTs 
which are translated into PO*Os and their family of encoders/decoders.  
The ASTs allow us to target Java, C#, C++, etc.  The ASTs also allow us 
to target new encodings when and if they come.  ASTs, I realize that 
this is blue sky dreaming, allow for coding optimizations specific to 
the protocol and target language, maybe even OS.  

It's a big task and I welcome, no, *need*, any help that I can get.  To 
that end, I hope that this goal is aligned with what the ADS community 
needs.

I am excited because it is clear that you have a thorough understanding 
of ASN1 encodings and also seem to be well versed in compiler development.

Regards,
Alan

Re: [asn1] why use TLV objects at all?

Posted by Emmanuel Lecharny <el...@iktek.com>.

Hi all !

On Thursday 24 February 2005 at 01:55 -0500, Alex Karasulu wrote:
> Alan D. Cabrera wrote:
> 
> >
> >
> > Alex Karasulu wrote:
> >
> >> Alan D. Cabrera wrote:
> >>
> >>>
> >>>
> >>> Alex Karasulu wrote:
> >>>
> >>>> Emmanuel,
> >>>>
> >>>> I was just thinking about your position on object creation.  Namely 
> >>>> the one that is against the creation of Tuple objects that 
> >>>> represent TLVs.  Your proposal to use pooling of these objects 
> >>>> worries me a bit.  It just makes me think there would be a lot of 
> >>>> synchronization overhead.  I may be wrong.

No synchronisation: it's a local pool, each thread has its own pool. You
won't have 10 000 threads, so it's ok.
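
A minimal sketch of such a per-thread pool, built on java.lang.ThreadLocal. The class names PooledTlv and PooledTlvPool and the preallocation count of 16 are assumptions made for illustration, not taken from the code under discussion.

```java
import java.util.ArrayDeque;

// Hypothetical reusable TLV holder; the fields are illustrative.
final class PooledTlv {
    int tag;
    int length;
    byte[] value;

    void reset() {
        tag = 0;
        length = 0;
        value = null;
    }
}

// One pool per thread. A pool is never shared between threads, so
// acquire() and release() need no locking.
final class PooledTlvPool {
    private static final ThreadLocal<PooledTlvPool> LOCAL =
            ThreadLocal.withInitial(() -> new PooledTlvPool(16));

    private final ArrayDeque<PooledTlv> free = new ArrayDeque<>();

    private PooledTlvPool(int preallocated) {
        for (int i = 0; i < preallocated; i++) {
            free.push(new PooledTlv());
        }
    }

    // Returns the pool that belongs to the calling thread.
    static PooledTlvPool current() {
        return LOCAL.get();
    }

    PooledTlv acquire() {
        PooledTlv tlv = free.poll();
        return tlv != null ? tlv : new PooledTlv();  // grow if exhausted
    }

    void release(PooledTlv tlv) {
        tlv.reset();     // drop the value reference so it can be collected
        free.push(tlv);
    }
}

class PooledTlvDemo {
    public static void main(String[] args) {
        PooledTlvPool pool = PooledTlvPool.current();
        PooledTlv tlv = pool.acquire();
        tlv.tag = 0x30;
        pool.release(tlv);
        System.out.println(pool.acquire() == tlv);  // the instance is reused
    }
}
```

The trade-off is memory: every decoding thread carries its own preallocated objects, which is acceptable only while the thread count stays modest.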

> >>> I was also concerned by this as it may require that you keep a rough 
> >>> factor of 2 more memory, one for the Tuple structure of the message 
> >>> and one for the POJO that you are creating.

TLVs are allocated already, so it's not a problem. The value part will be
copied to stubs as the stub reads it, so this is the only variable part.
You create the value while reading the PDU, and pass it to the stub. So
memory consumption is just sizeof(stubs) + sizeof(data) + sizeof
(preallocated TLV). It's really important that the memory footprint is
somewhat static, even if big at the beginning. We are trading initial
memory need against stability in the long term.

> >> It would be if we were collecting all PDU tuples to form a TLV tree.  
> >> However the idea is to use and release whatever is allocated to the 
> >> tuple.  In this case, the only time you have two copies of a datum 
> >> (tuple value) is when you are holding on to the value long enough to 
> >> set a stub's property using the tuple value.

We don't have two copies of a datum. When the TLV is a Primitive one, as
soon as its data has been completely read, we can pass it to the
POJO/Stub. We just keep a reference to it in the TLV. The only
duplication which could occur is the T and L parts, but, again, it's not
simply a duplication: TLVs are allocated from the beginning, and won't be
released until you stop the server.

> >> Furthermore if we implement the strategy of streaming a large value 
> >> to disk (say a JPEG photo) then the value is just a URI to access the 
> >> stream later on.  This URI is what is set as the stub property 
> >> value.  So in this case we don't have the double hit as mentioned 
> >> above where a value is in memory in a Tuple and duplicated in the 
> >> value of the stub property.

+1 for the URI. It could also be a sub-classed StreamedTLV, which has
the same interface. The implementation of its getData method will handle
the situation.
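
As a rough sketch of the sub-classing idea, assuming a hypothetical TLV base class whose getData returns an InputStream (the real signature is not given in this thread), a StreamedTLV could read its value back from a spool file:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical base type: a TLV whose value is held in memory.
class TLV {
    final int tag;
    final long length;
    private final byte[] value;

    TLV(int tag, byte[] value) {
        this.tag = tag;
        this.length = value.length;
        this.value = value;
    }

    // For subclasses that keep the value somewhere else.
    protected TLV(int tag, long length) {
        this.tag = tag;
        this.length = length;
        this.value = null;
    }

    InputStream getData() {
        return new ByteArrayInputStream(value);
    }
}

// Same interface, but the value was spooled to disk while the PDU was
// being read (for example a large JPEG photo).
class StreamedTLV extends TLV {
    private final Path spoolFile;

    StreamedTLV(int tag, long length, Path spoolFile) {
        super(tag, length);
        this.spoolFile = spoolFile;
    }

    @Override
    InputStream getData() {
        try {
            return Files.newInputStream(spoolFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A stub that only calls getData does not need to know which of the two it was handed.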

> >
> >
> > So we only keep a stack of tuples?
> 
> You mean constructed tuples for nesting?  Depends on the stub.  I don't 
> think even that may be needed.  Don't know for sure yet though.

Stub/POJO is just the final representation of the data. Obviously we can
avoid all that TLV plumbing if we have a compiler that handles it. A
depth-first decoding strategy is faster than a two-layer
parser/lexer strategy, but it's much more complicated.

> >>>> However I started thinking, "why create Tuples at all?" Follow my 
> >>>> concepts here for a sec even though we have not been discussing 
> >>>> these constructs: TupleProducers and TupleConsumers.  A producer 
> >>>> simply emits callbacks to a consumer and they are bound to each 
> >>>> other.  What if the callbacks did not pass in a Tuple as an 
> >>>> argument but the components T, L and V of the Tuple instead.  A 
> >>>> stub, which is like the parser you mentioned, tracks and changes 
> >>>> state as an automaton to populate its properties appropriately with 
> >>>> the stream of Tuple events.  The stub can be a TupleConsumer - 
> >>>> really a tuple event consumer rather.  This would eliminate object 
> >>>> creation overheads and populate the stub. 

If you want to check L, you need to keep track of the Constructed
TLVs. Primitive TLVs are not very important, we can discard them
immediately, just keeping their V part. So keeping a stack of Constructed
TLVs is just a matter of verifying lengths.
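
One way to picture this bookkeeping is a stack holding, for each open Constructed TLV, the number of content bytes still expected. The LengthTracker class below is a hypothetical sketch that assumes definite-length encoding only; it fails fast as soon as a nested TLV overruns its parent.

```java
import java.util.ArrayDeque;

// Hypothetical length bookkeeping: one stack entry per open Constructed
// TLV, holding the content bytes that are still expected inside it.
final class LengthTracker {
    private final ArrayDeque<Integer> remaining = new ArrayDeque<>();

    // A Primitive TLV was read: charge its full size to the parent.
    void primitive(int headerSize, int contentLength) {
        charge(headerSize + contentLength);
        closeCompleted();
    }

    // A Constructed TLV header was read: charge its full size to the
    // parent, then open a frame for its own content.
    void constructed(int headerSize, int contentLength) {
        charge(headerSize + contentLength);
        remaining.push(contentLength);
        closeCompleted();
    }

    // Number of Constructed TLVs that are still open.
    int depth() {
        return remaining.size();
    }

    private void charge(int size) {
        if (remaining.isEmpty()) {
            return;  // top-level TLV, nothing encloses it
        }
        int left = remaining.pop() - size;
        if (left < 0) {
            throw new IllegalStateException(
                    "TLV overruns its enclosing TLV by " + (-left) + " bytes");
        }
        remaining.push(left);
    }

    private void closeCompleted() {
        while (!remaining.isEmpty() && remaining.peek() == 0) {
            remaining.pop();
        }
    }
}

class LengthTrackerDemo {
    public static void main(String[] args) {
        LengthTracker t = new LengthTracker();
        t.constructed(2, 9);  // 30 09
        t.primitive(2, 1);    // 02 01 2A
        t.primitive(2, 4);    // 04 04 41 6C 65 78
        System.out.println(t.depth());  // all frames closed
    }
}
```

A non-zero depth at the end of the PDU would indicate the opposite error, a Constructed TLV that announced more content than was sent.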

> >>> Could you not flatten it even further by making a compiler generated 
> >>> stub act as both the producer and consumer?  This is the tack that I 
> >>> am taking with my "smart" stubs.

Yes for sure. But then it will become difficult to track bad PDUs (I
mean, PDUs in which the Lengths are not correct).

> >> I highly discourage this approach.  Reason being the nature of the 
> >> relationship between ASN.1 and encodings.  As you know an ASN.1 spec 
> >> can use any encoding.  Conventionally a protocol specifies an 
> >> encoding and sticks to it so it seems to support your approach.  This 
> >> however is not always the case and ASN.1 is being used in new ways 
> >> where alternate encodings are being applied to different data 
> >> structures based on the target: i.e. GSM network clients.  However 
> >> these are not the strongest cases for why you should avoid this 
> >> "smart" stub approach IMO.
> >
> >
> > Each stub is specific to a particular encoding.  It is the POJOs that 
> > are used that are universal to the encodings.
> 
> Ahh ok you mean there's a difference between the stub and a POJO.  I 
> thought the pojo is the stub.  Or are you refering to some base class or 
> POJI?

We should agree on terms, don't you think? In my mind, a POJO is an
instance of an ASN.1 path through a specific grammar (for example, a
LdapBindResponse POJO, or a LdapSearch POJO). The stub is the class that
feeds the POJO with data. So the stub is the POJO producer/consumer (with
or without TLVs). wdyt?

> >> The most important reason is to decouple the generation of encoding 
> >> specific code from the stub compiler.  If you make your stubs 
> >> "encoding aware" then your adding some serious complexity to the stub 
> >> compiler IMO.  Why do this when you can avoid it and gain the ability 
> >> to swap out the encoding at runtime?
> >
> >
> > You have the ability to swap out encodings at runtime, you can just 
> > switch stubs. 

Both of you are right, it's just a question of "how long will it take to
write the compiler?" versus "do we really need a compiler at the
moment?"

> 
> So interface or base class is the same but concrete implementation is 
> the stub for a particular encoding?
> 
> >
> >> The way I like to visualize this is ... there is a common 
> >> representation the stub compiler needs to work with.  Rather than 
> >> read bytes from a stream it responds to tuple events as its input at 
> >> a higher level.  Regardless of the type of encoding at the lowest 
> >> level the stub compiler and the stubs it generates need not be 
> >> aware.  It's sort of like the way javac works with the underlying 
> >> runtime: the compiled code as byte codes are bound at runtime to the 
> >> underlying native code to do the actual work using native code.  
> >> Similarly here I'm recommending that the stub compiler generate a 
> >> stub which deals only with TLVs and at runtime the source/target can 
> >> be a BER, DER, PER binary stream or even a XER encoded ascii stream.
> >> I think perhaps some of your concerns on the stub compiler side 
> >> revolve around finding a tangible way for the antlr based stub 
> >> compiler to generate code that deals with a TLV stream rather than a 
> >> byte/char stream.  I too have this problem - it is not easy.  In this 
> >> regard the approach of making the stub totally encoding aware may 
> >> seem easier to do.
> >
> >
> > IIUC, PER does not use TLVs.  You need to know the structure of your 
> > ASN1 object to decode the stream.
> >
> > Keep it simple.  We may as well remove the layer and generate protocol 
> > specific stubs.
> 
> If you're writing the stub compiler then its your call.  However I'm 
> still not convinced this is keeping it simple.  Have you already 
> finished the parts of the compiler that can handle different encodings?

Alan is perfectly right. PER is really inseparable from the specific
ASN.1 grammar it encodes. You need to know the semantics of the incoming
data to decode it, because T and L are optional (I mean, not optional
in the sense that you are allowed to skip them, but in that they depend
on the grammar). So when writing an ASN.1 decoder for a specific grammar
using a PER decoder, the layered approach is totally useless. It's much
more of a "give me the next 5 bits, which I know represent the Value I'm
reading" decoder. Quite complicated to implement, but that's the way it
works. You need a compiler.
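
As a rough illustration of the "give me the next 5 bits" style, here is a
small Java sketch. It is not a PER implementation: BitReader and readBits
are names invented for this mail, and the field widths would in practice
come from the constraints in the ASN.1 grammar, which is exactly why
generated code is needed.

```java
/** Hypothetical bit-level reader; the caller must already know each field width. */
class BitReader {
    private final byte[] data;
    private int bitPos; // index of the next bit, counted from the MSB of data[0]

    BitReader(byte[] data) {
        this.data = data;
    }

    /** Reads the next n bits (1..31) as an unsigned value, most significant bit first. */
    int readBits(int n) {
        if (n < 1 || n > 31) {
            throw new IllegalArgumentException("n must be between 1 and 31");
        }
        if (bitPos + n > data.length * 8) {
            throw new IllegalStateException("not enough bits left");
        }
        int result = 0;
        for (int i = 0; i < n; i++) {
            int current = data[bitPos >> 3] & 0xFF;          // byte holding the bit
            int bit = (current >> (7 - (bitPos & 7))) & 1;   // pick the bit, MSB first
            result = (result << 1) | bit;
            bitPos++;
        }
        return result;
    }
}
```

Nothing in the byte stream says where one field ends and the next begins;
the reader only works because the calling code asks for the right widths
in the right order.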

If you have this PER enabled codec compiler, having the same BER/CER/DER
enabled codec compiler is a piece of cake. No layers, no TLVs, just a
compiler.

This could also be a perfect new Apache project : Apache ASN.1 free
compiler. (a SNACC GPLized, in a way !)

How far are we from this target? 

I also want to find out the cost of encoding/decoding data relative to
the cost of fetching/storing it in the database. If it's a 50/50
ratio, we could have a major performance improvement by first implementing
the layered approach, then the compiler one when it's ready. If it's
10/90, forget about layers. Let's focus on the compiler on one side, and
other performance issues on the other.

wdyt?

Re: [asn1] why use TLV objects at all?

Posted by Emmanuel Lecharny <el...@iktek.com>.

> >
> > You will not be able to get away from having at least a stack of Ls.
> 
> I thought this before but I think there may be ways around this :).

You may choose not to build an L stack, but you then accept that you can
no longer verify the correctness of Constructed TLVs.


> > POJOs are populated by applications and fed into the encoding specific 
> > stubs.  The decoder specific stubs produce fully populated POJOs.  The 
> > POJOs mimic the structure of the ASN1 description.
> >
> Ok this is different from SNACC4J where POJO is the stub.  Other 
> compilers do the same.  The POJO has machinery for populating itself.  
> This might not be the best approach in which case I like your's here 
> better.  Basically the stub is the high level decoder encoder.

+1 to Alan's "lord of the Stubs" approach.

One stub to rule them all (POJOs) ...

Re: [asn1] why use TLV objects at all?

Posted by Alex Karasulu <ao...@bellsouth.net>.

Alan D. Cabrera wrote:

>
>
> Alex Karasulu wrote:
>
>> Alan D. Cabrera wrote:
>>
>>> Alex Karasulu wrote:
>>>
>>>> Alan D. Cabrera wrote:
>>>>
>>>>> Alex Karasulu wrote:
>>>>>
>>>>>> Emmanuel,
>>>>>>
>>>>>> I was just thinking about your position on object creation.  
>>>>>> Namely the one that is against the creation of Tuple objects that 
>>>>>> represent TLVs.  Your proposal to use pooling of these objects 
>>>>>> worries me a bit.  It just makes me think there would be a lot of 
>>>>>> synchronization overhead.  I may be wrong.
>>>>>
>>>>>
>>>>> I was also concerned by this as it may require that you keep a 
>>>>> rough factor of 2 more memory, one for the Tuple structure of the 
>>>>> message and one for the POJO that you are creating.
>>>>
>>>>
>>>> It would be if we were collecting all PDU tuples to form a TLV 
>>>> tree.  However the idea is to use and release whatever is allocated 
>>>> to the tuple.  In this case, the only time you have two copies of a 
>>>> datum (tuple value) is when you are holding on to the value long 
>>>> enough to set a stub's property using with the tuple value.
>>>> Furthermore if we implement the strategy of streaming a large value 
>>>> to disk (say a JPEG photo) then the value is just a URI to access 
>>>> the stream later on.  This URI is what is set as the stub property 
>>>> value.  So in this case we don't have the double hit as mentioned 
>>>> above where a value is in memory in a Tuple and duplicated in the 
>>>> value of the stub property.
>>>
>>>
>>>
>>> So we only keep a stack of tuples?
>>
>>
>>
>> You mean constructed tuples for nesting?  Depends on the stub.  I 
>> don't think even that may be needed.  Don't know for sure yet though.
>
>
>
> You will not be able to get away from having at least a stack of Ls.

I thought this before but I think there may be ways around this :).

>
>>>>>> However I started thinking, "why create Tuples at all?" Follow my 
>>>>>> concepts here for a sec even though we have not been discussiong 
>>>>>> these constructs: TupleProducers and TupleConsumers.  A producer 
>>>>>> simply emits callbacks to a consumer and they are bound to each 
>>>>>> other.  What if the callbacks did not pass in a Tuple as an 
>>>>>> argument but the components T, L and V of the Tuple instead.  A 
>>>>>> stub, which is like the parser you mentioned, tracks and changes 
>>>>>> state as an automaton to populate its properties appropriately 
>>>>>> with the stream of Tuple events.  The stub can be a TupleConsumer 
>>>>>> - really a tuple event consumer rather.  This would eliminate 
>>>>>> object creation overheads and populate the stub. 
>>>>>
>>>>>
>>>>> Could you not flatten it even further by making a compiler 
>>>>> generated stub act as both the producer and consumer?  This is the 
>>>>> tack that I am taking with my "smart" stubs.
>>>>
>>>>
>>>> I highly discourage this approach.  Reason being the nature of the 
>>>> relationship between ASN.1 and encodings.  As you know an ASN.1 
>>>> spec can use any encoding.  Conventionally a protocol specifies an 
>>>> encoding and sticks to it so it seems to support your approach.  
>>>> This however is not always the case and ASN.1 is being used in new 
>>>> ways where alternate encodings are being applied to different data 
>>>> structures based on the target: i.e. GSM network clients.  However 
>>>> these are not the strongest cases for why you should avoid this 
>>>> "smart" stub approach IMO.
>>>
>>>
>>> Each stub is specific to a particular encoding.  It is the POJOs 
>>> that are used that are universal to the encodings.
>>
>>
>>
>> Ahh ok you mean there's a difference between the stub and a POJO.  I 
>> thought the pojo is the stub.  Or are you refering to some base class 
>> or POJI?
>
>
>
> POJOs are populated by applications and fed into the encoding specific 
> stubs.  The decoder specific stubs produce fully populated POJOs.  The 
> POJOs mimic the structure of the ASN1 description.
>
Ok this is different from SNACC4J where the POJO is the stub.  Other 
compilers do the same.  The POJO has machinery for populating itself.  
This might not be the best approach, in which case I like yours here 
better.  Basically the stub is the high-level decoder/encoder.

>>>> The most important reason is to decouple the generation of encoding 
>>>> specific code from the stub compiler.  If you make your stubs 
>>>> "encoding aware" then your adding some serious complexity to the 
>>>> stub compiler IMO.  Why do this when you can avoid it and gain the 
>>>> ability to swap out the encoding at runtime?
>>>
>>>
>>> You have the ability to swap out encodings at runtime, you can just 
>>> switch stubs. 
>>
>>
>> So interface or base class is the same but concrete implementation is 
>> the stub for a particular encoding?
>
>
>
> Yes, exactly, the compiler will generate encoding/decoding specific 
> stubs.
>
>>>> The way I like to visualize this is ... there is a common 
>>>> representation the stub compiler needs to work with.  Rather than 
>>>> read bytes from a stream it responds to tuple events as its input 
>>>> at a higher level.  Regardless of the type of encoding at the 
>>>> lowest level the stub compiler and the stubs it generates need not 
>>>> be aware.  It's sort of like the way javac works with the 
>>>> underlying runtime: the compiled code as byte codes are bound at 
>>>> runtime to the underlying native code to do the actual work using 
>>>> native code.  Similarly here I'm recommending that the stub 
>>>> compiler generate a stub which deals only with TLVs and at runtime 
>>>> the source/target can be a BER, DER, PER binary stream or even a 
>>>> XER encoded ascii stream.
>>>> I think perhaps some of your concerns on the stub compiler side 
>>>> revolve around finding a tangible way for the antlr based stub 
>>>> compiler to generate code that deals with a TLV stream rather than 
>>>> a byte/char stream.  I too have this problem - it is not easy.  In 
>>>> this regard the approach of making the stub totally encoding aware 
>>>> may seem easier to do.
>>>
>>>
>>> IIUC, PER does not use TLVs.  You need to know the structure of your 
>>> ASN1 object to decode the stream.
>>>
>>> Keep it simple.  We may as well remove the layer and generate 
>>> protocol specific stubs.
>>
>>
>> If you're writing the stub compiler then its your call.  However I'm 
>> still not convinced this is keeping it simple.  Have you already 
>> finished the parts of the compiler that can handle different encodings?
>

Re: [asn1] why use TLV objects at all?

Posted by "Alan D. Cabrera" <ad...@toolazydogs.com>.


Alex Karasulu wrote:

> Alan D. Cabrera wrote:
>
>> Alex Karasulu wrote:
>>
>>> Alan D. Cabrera wrote:
>>>
>>>> Alex Karasulu wrote:
>>>>
>>>>> Emmanuel,
>>>>>
>>>>> I was just thinking about your position on object creation.  
>>>>> Namely the one that is against the creation of Tuple objects that 
>>>>> represent TLVs.  Your proposal to use pooling of these objects 
>>>>> worries me a bit.  It just makes me think there would be a lot of 
>>>>> synchronization overhead.  I may be wrong.
>>>>
>>>> I was also concerned by this as it may require that you keep a 
>>>> rough factor of 2 more memory, one for the Tuple structure of the 
>>>> message and one for the POJO that you are creating.
>>>
>>> It would be if we were collecting all PDU tuples to form a TLV 
>>> tree.  However the idea is to use and release whatever is allocated 
>>> to the tuple.  In this case, the only time you have two copies of a 
>>> datum (tuple value) is when you are holding on to the value long 
>>> enough to set a stub's property using with the tuple value.
>>> Furthermore if we implement the strategy of streaming a large value 
>>> to disk (say a JPEG photo) then the value is just a URI to access 
>>> the stream later on.  This URI is what is set as the stub property 
>>> value.  So in this case we don't have the double hit as mentioned 
>>> above where a value is in memory in a Tuple and duplicated in the 
>>> value of the stub property.
>>
>>
>> So we only keep a stack of tuples?
>
>
> You mean constructed tuples for nesting?  Depends on the stub.  I 
> don't think even that may be needed.  Don't know for sure yet though.


You will not be able to get away from having at least a stack of Ls.

>>>>> However I started thinking, "why create Tuples at all?" Follow my 
>>>>> concepts here for a sec even though we have not been discussiong 
>>>>> these constructs: TupleProducers and TupleConsumers.  A producer 
>>>>> simply emits callbacks to a consumer and they are bound to each 
>>>>> other.  What if the callbacks did not pass in a Tuple as an 
>>>>> argument but the components T, L and V of the Tuple instead.  A 
>>>>> stub, which is like the parser you mentioned, tracks and changes 
>>>>> state as an automaton to populate its properties appropriately 
>>>>> with the stream of Tuple events.  The stub can be a TupleConsumer 
>>>>> - really a tuple event consumer rather.  This would eliminate 
>>>>> object creation overheads and populate the stub. 
>>>>
>>>> Could you not flatten it even further by making a compiler 
>>>> generated stub act as both the producer and consumer?  This is the 
>>>> tack that I am taking with my "smart" stubs.
>>>
>>> I highly discourage this approach.  Reason being the nature of the 
>>> relationship between ASN.1 and encodings.  As you know an ASN.1 spec 
>>> can use any encoding.  Conventionally a protocol specifies an 
>>> encoding and sticks to it so it seems to support your approach.  
>>> This however is not always the case and ASN.1 is being used in new 
>>> ways where alternate encodings are being applied to different data 
>>> structures based on the target: i.e. GSM network clients.  However 
>>> these are not the strongest cases for why you should avoid this 
>>> "smart" stub approach IMO.
>>
>> Each stub is specific to a particular encoding.  It is the POJOs that 
>> are used that are universal to the encodings.
>
>
> Ahh ok you mean there's a difference between the stub and a POJO.  I 
> thought the pojo is the stub.  Or are you refering to some base class 
> or POJI?


POJOs are populated by applications and fed into the encoding specific 
stubs.  The decoder specific stubs produce fully populated POJOs.  The 
POJOs mimic the structure of the ASN1 description.

>>> The most important reason is to decouple the generation of encoding 
>>> specific code from the stub compiler.  If you make your stubs 
>>> "encoding aware" then your adding some serious complexity to the 
>>> stub compiler IMO.  Why do this when you can avoid it and gain the 
>>> ability to swap out the encoding at runtime?
>>
>> You have the ability to swap out encodings at runtime, you can just 
>> switch stubs. 
>
> So interface or base class is the same but concrete implementation is 
> the stub for a particular encoding?


Yes, exactly, the compiler will generate encoding/decoding specific stubs.
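
A minimal Java sketch of that split, as I picture it. Everything here is an
assumption made for illustration: the Greeting POJO, the GreetingStub
interface and both stub classes are invented names, the BER-like stub only
handles ids 0..127 and short-form lengths, and the text stub uses a made-up
"id:name" format that merely stands in for a second encoding.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical POJO mirroring SEQUENCE { id INTEGER, name OCTET STRING }. */
class Greeting {
    final int id;
    final String name;

    Greeting(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

/** Hypothetical common interface; one implementation per encoding. */
interface GreetingStub {
    byte[] encode(Greeting g);

    Greeting decode(byte[] bytes);
}

/** BER-like stub limited to one-byte integers and short-form lengths. */
class BerGreetingStub implements GreetingStub {
    public byte[] encode(Greeting g) {
        byte[] name = g.name.getBytes(StandardCharsets.UTF_8);
        if (g.id < 0 || g.id > 127 || name.length > 120) {
            throw new IllegalArgumentException("outside the limits of this sketch");
        }
        byte[] out = new byte[7 + name.length];
        out[0] = 0x30;                      // SEQUENCE tag
        out[1] = (byte) (out.length - 2);   // content length
        out[2] = 0x02;                      // INTEGER tag
        out[3] = 1;
        out[4] = (byte) g.id;
        out[5] = 0x04;                      // OCTET STRING tag
        out[6] = (byte) name.length;
        System.arraycopy(name, 0, out, 7, name.length);
        return out;
    }

    public Greeting decode(byte[] b) {
        if (b.length < 7 || b[0] != 0x30 || b[2] != 0x02 || b[5] != 0x04
                || 7 + (b[6] & 0xFF) > b.length) {
            throw new IllegalArgumentException("not a Greeting in the expected form");
        }
        return new Greeting(b[4], new String(b, 7, b[6] & 0xFF, StandardCharsets.UTF_8));
    }
}

/** Stand-in for a second encoding: a made-up "id:name" text form. */
class TextGreetingStub implements GreetingStub {
    public byte[] encode(Greeting g) {
        return (g.id + ":" + g.name).getBytes(StandardCharsets.UTF_8);
    }

    public Greeting decode(byte[] b) {
        String s = new String(b, StandardCharsets.UTF_8);
        int colon = s.indexOf(':');
        return new Greeting(Integer.parseInt(s.substring(0, colon)), s.substring(colon + 1));
    }
}
```

The application only ever touches Greeting; picking a different GreetingStub
implementation at runtime is what "switching stubs" amounts to here.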

>>> The way I like to visualize this is ... there is a common 
>>> representation the stub compiler needs to work with.  Rather than 
>>> read bytes from a stream it responds to tuple events as its input at 
>>> a higher level.  Regardless of the type of encoding at the lowest 
>>> level the stub compiler and the stubs it generates need not be 
>>> aware.  It's sort of like the way javac works with the underlying 
>>> runtime: the compiled code as byte codes are bound at runtime to the 
>>> underlying native code to do the actual work using native code.  
>>> Similarly here I'm recommending that the stub compiler generate a 
>>> stub which deals only with TLVs and at runtime the source/target can 
>>> be a BER, DER, PER binary stream or even a XER encoded ascii stream.
>>> I think perhaps some of your concerns on the stub compiler side 
>>> revolve around finding a tangible way for the antlr based stub 
>>> compiler to generate code that deals with a TLV stream rather than a 
>>> byte/char stream.  I too have this problem - it is not easy.  In 
>>> this regard the approach of making the stub totally encoding aware 
>>> may seem easier to do.
>>
>> IIUC, PER does not use TLVs.  You need to know the structure of your 
>> ASN1 object to decode the stream.
>>
>> Keep it simple.  We may as well remove the layer and generate 
>> protocol specific stubs.
>
> If you're writing the stub compiler then its your call.  However I'm 
> still not convinced this is keeping it simple.  Have you already 
> finished the parts of the compiler that can handle different encodings?

Re: [asn1] why use TLV objects at all?

Posted by Alex Karasulu <ao...@bellsouth.net>.

Alan D. Cabrera wrote:

>
>
> Alex Karasulu wrote:
>
>> Alan D. Cabrera wrote:
>>
>>>
>>>
>>> Alex Karasulu wrote:
>>>
>>>> Emmanuel,
>>>>
>>>> I was just thinking about your position on object creation.  Namely 
>>>> the one that is against the creation of Tuple objects that 
>>>> represent TLVs.  Your proposal to use pooling of these objects 
>>>> worries me a bit.  It just makes me think there would be a lot of 
>>>> synchronization overhead.  I may be wrong.
>>>
>>>
>>>
>>>
>>> I was also concerned by this as it may require that you keep a rough 
>>> factor of 2 more memory, one for the Tuple structure of the message 
>>> and one for the POJO that you are creating.
>>
>>
>>
>> It would be if we were collecting all PDU tuples to form a TLV tree.  
>> However the idea is to use and release whatever is allocated to the 
>> tuple.  In this case, the only time you have two copies of a datum 
>> (tuple value) is when you are holding on to the value long enough to 
>> set a stub's property using with the tuple value.
>> Furthermore if we implement the strategy of streaming a large value 
>> to disk (say a JPEG photo) then the value is just a URI to access the 
>> stream later on.  This URI is what is set as the stub property 
>> value.  So in this case we don't have the double hit as mentioned 
>> above where a value is in memory in a Tuple and duplicated in the 
>> value of the stub property.
>
>
> So we only keep a stack of tuples?

You mean constructed tuples for nesting?  Depends on the stub.  I don't 
think even that may be needed.  Don't know for sure yet though.

>
>>>> However I started thinking, "why create Tuples at all?" Follow my 
>>>> concepts here for a sec even though we have not been discussiong 
>>>> these constructs: TupleProducers and TupleConsumers.  A producer 
>>>> simply emits callbacks to a consumer and they are bound to each 
>>>> other.  What if the callbacks did not pass in a Tuple as an 
>>>> argument but the components T, L and V of the Tuple instead.  A 
>>>> stub, which is like the parser you mentioned, tracks and changes 
>>>> state as an automaton to populate its properties appropriately with 
>>>> the stream of Tuple events.  The stub can be a TupleConsumer - 
>>>> really a tuple event consumer rather.  This would eliminate object 
>>>> creation overheads and populate the stub. 
>>>
>>>
>>>
>>>
>>> Could you not flatten it even further by making a compiler generated 
>>> stub act as both the producer and consumer?  This is the tack that I 
>>> am taking with my "smart" stubs.
>>
>>
>>
>> I highly discourage this approach.  Reason being the nature of the 
>> relationship between ASN.1 and encodings.  As you know an ASN.1 spec 
>> can use any encoding.  Conventionally a protocol specifies an 
>> encoding and sticks to it so it seems to support your approach.  This 
>> however is not always the case and ASN.1 is being used in new ways 
>> where alternate encodings are being applied to different data 
>> structures based on the target: i.e. GSM network clients.  However 
>> these are not the strongest cases for why you should avoid this 
>> "smart" stub approach IMO.
>
>
> Each stub is specific to a particular encoding.  It is the POJOs that 
> are used that are universal to the encodings.

Ahh ok you mean there's a difference between the stub and a POJO.  I 
thought the pojo is the stub.  Or are you referring to some base class or 
POJI?

>
>
>> The most important reason is to decouple the generation of encoding 
>> specific code from the stub compiler.  If you make your stubs 
>> "encoding aware" then your adding some serious complexity to the stub 
>> compiler IMO.  Why do this when you can avoid it and gain the ability 
>> to swap out the encoding at runtime?
>
>
> You have the ability to swap out encodings at runtime, you can just 
> switch stubs. 

So interface or base class is the same but concrete implementation is 
the stub for a particular encoding?

>
>> The way I like to visualize this is ... there is a common 
>> representation the stub compiler needs to work with.  Rather than 
>> read bytes from a stream it responds to tuple events as its input at 
>> a higher level.  Regardless of the type of encoding at the lowest 
>> level the stub compiler and the stubs it generates need not be 
>> aware.  It's sort of like the way javac works with the underlying 
>> runtime: the compiled code as byte codes are bound at runtime to the 
>> underlying native code to do the actual work using native code.  
>> Similarly here I'm recommending that the stub compiler generate a 
>> stub which deals only with TLVs and at runtime the source/target can 
>> be a BER, DER, PER binary stream or even a XER encoded ascii stream.
>> I think perhaps some of your concerns on the stub compiler side 
>> revolve around finding a tangible way for the antlr based stub 
>> compiler to generate code that deals with a TLV stream rather than a 
>> byte/char stream.  I too have this problem - it is not easy.  In this 
>> regard the approach of making the stub totally encoding aware may 
>> seem easier to do.
>
>
> IIUC, PER does not use TLVs.  You need to know the structure of your 
> ASN1 object to decode the stream.
>
> Keep it simple.  We may as well remove the layer and generate protocol 
> specific stubs.

If you're writing the stub compiler then it's your call.  However I'm 
still not convinced this is keeping it simple.  Have you already 
finished the parts of the compiler that can handle different encodings?

Alex

Re: [asn1] why use TLV objects at all?

Posted by "Alan D. Cabrera" <ad...@toolazydogs.com>.


Alex Karasulu wrote:

> Alan D. Cabrera wrote:
>
>>
>>
>> Alex Karasulu wrote:
>>
>>> Emmanuel,
>>>
>>> I was just thinking about your position on object creation.  Namely 
>>> the one that is against the creation of Tuple objects that represent 
>>> TLVs.  Your proposal to use pooling of these objects worries me a 
>>> bit.  It just makes me think there would be a lot of synchronization 
>>> overhead.  I may be wrong.
>>
>>
>>
>> I was also concerned by this as it may require that you keep a rough 
>> factor of 2 more memory, one for the Tuple structure of the message 
>> and one for the POJO that you are creating.
>
>
> It would be if we were collecting all PDU tuples to form a TLV tree.  
> However the idea is to use and release whatever is allocated to the 
> tuple.  In this case, the only time you have two copies of a datum 
> (tuple value) is when you are holding on to the value long enough to 
> set a stub's property using with the tuple value.
> Furthermore if we implement the strategy of streaming a large value to 
> disk (say a JPEG photo) then the value is just a URI to access the 
> stream later on.  This URI is what is set as the stub property value.  
> So in this case we don't have the double hit as mentioned above where 
> a value is in memory in a Tuple and duplicated in the value of the 
> stub property.

So we only keep a stack of tuples?

>>> However I started thinking, "why create Tuples at all?" Follow my 
>>> concepts here for a sec even though we have not been discussiong 
>>> these constructs: TupleProducers and TupleConsumers.  A producer 
>>> simply emits callbacks to a consumer and they are bound to each 
>>> other.  What if the callbacks did not pass in a Tuple as an argument 
>>> but the components T, L and V of the Tuple instead.  A stub, which 
>>> is like the parser you mentioned, tracks and changes state as an 
>>> automaton to populate its properties appropriately with the stream 
>>> of Tuple events.  The stub can be a TupleConsumer - really a tuple 
>>> event consumer rather.  This would eliminate object creation 
>>> overheads and populate the stub. 
>>
>>
>>
>> Could you not flatten it even further by making a compiler generated 
>> stub act as both the producer and consumer?  This is the tack that I 
>> am taking with my "smart" stubs.
>
>
> I highly discourage this approach.  Reason being the nature of the 
> relationship between ASN.1 and encodings.  As you know an ASN.1 spec 
> can use any encoding.  Conventionally a protocol specifies an encoding 
> and sticks to it so it seems to support your approach.  This however 
> is not always the case and ASN.1 is being used in new ways where 
> alternate encodings are being applied to different data structures 
> based on the target: i.e. GSM network clients.  However these are not 
> the strongest cases for why you should avoid this "smart" stub 
> approach IMO.

Each stub is specific to a particular encoding.  It is the POJOs that 
are used that are universal to the encodings.

> The most important reason is to decouple the generation of encoding 
> specific code from the stub compiler.  If you make your stubs 
> "encoding aware" then your adding some serious complexity to the stub 
> compiler IMO.  Why do this when you can avoid it and gain the ability 
> to swap out the encoding at runtime?

You have the ability to swap out encodings at runtime, you can just 
switch stubs.

> The way I like to visualize this is ... there is a common 
> representation the stub compiler needs to work with.  Rather than read 
> bytes from a stream it responds to tuple events as its input at a 
> higher level.  Regardless of the type of encoding at the lowest level 
> the stub compiler and the stubs it generates need not be aware.  It's 
> sort of like the way javac works with the underlying runtime: the 
> compiled code as byte codes are bound at runtime to the underlying 
> native code to do the actual work using native code.  Similarly here 
> I'm recommending that the stub compiler generate a stub which deals 
> only with TLVs and at runtime the source/target can be a BER, DER, PER 
> binary stream or even a XER encoded ascii stream.
> I think perhaps some of your concerns on the stub compiler side 
> revolve around finding a tangible way for the antlr based stub 
> compiler to generate code that deals with a TLV stream rather than a 
> byte/char stream.  I too have this problem - it is not easy.  In this 
> regard the approach of making the stub totally encoding aware may seem 
> easier to do.

IIUC, PER does not use TLVs.  You need to know the structure of your 
ASN1 object to decode the stream.

Keep it simple.  We may as well remove the layer and generate protocol 
specific stubs.


Regards,
Alan

Re: [asn1] why use TLV objects at all?

Posted by Alex Karasulu <ao...@bellsouth.net>.

Alan D. Cabrera wrote:

>
>
> Alex Karasulu wrote:
>
>> Emmanuel,
>>
>> I was just thinking about your position on object creation.  Namely 
>> the one that is against the creation of Tuple objects that represent 
>> TLVs.  Your proposal to use pooling of these objects worries me a 
>> bit.  It just makes me think there would be a lot of synchronization 
>> overhead.  I may be wrong.
>
>
> I was also concerned by this as it may require that you keep a rough 
> factor of 2 more memory, one for the Tuple structure of the message 
> and one for the POJO that you are creating.

It would be if we were collecting all PDU tuples to form a TLV tree.  
However the idea is to use and release whatever is allocated to the 
tuple.  In this case, the only time you have two copies of a datum 
(tuple value) is when you are holding on to the value long enough to set 
a stub's property with the tuple value. 

Furthermore if we implement the strategy of streaming a large value to 
disk (say a JPEG photo) then the value is just a URI to access the 
stream later on.  This URI is what is set as the stub property value.  
So in this case we don't have the double hit as mentioned above where a 
value is in memory in a Tuple and duplicated in the value of the stub 
property.
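
Here is a small Java sketch of what I mean, under stated assumptions:
SpilledValue, its methods and the threshold parameter are invented for this
mail, and for brevity the value arrives as a byte[] whereas a real decoder
would copy it from the input stream chunk by chunk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical value holder: kept in memory when small, behind a file URI when large. */
class SpilledValue {
    private final byte[] inMemory; // null when the value was written to disk
    private final URI location;    // null when the value is held in memory

    private SpilledValue(byte[] inMemory, URI location) {
        this.inMemory = inMemory;
        this.location = location;
    }

    /** Keeps values up to threshold bytes in memory and writes larger ones to a temp file. */
    static SpilledValue of(byte[] value, int threshold) {
        if (value.length <= threshold) {
            return new SpilledValue(value, null);
        }
        try {
            Path file = Files.createTempFile("tlv-value-", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, value);
            return new SpilledValue(null, file.toUri());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isOnDisk() {
        return location != null;
    }

    /** The URI that would be set as the stub property for a large value. */
    URI getLocation() {
        return location;
    }

    /** Returns the bytes, reading them back from disk only when asked. */
    byte[] getData() {
        if (inMemory != null) {
            return inMemory;
        }
        try {
            return Files.readAllBytes(Paths.get(location));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With something like this, a large value occupies memory only if the
application later calls getData().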

>
>> However I started thinking, "why create Tuples at all?" Follow my 
>> concepts here for a sec even though we have not been discussiong 
>> these constructs: TupleProducers and TupleConsumers.  A producer 
>> simply emits callbacks to a consumer and they are bound to each 
>> other.  What if the callbacks did not pass in a Tuple as an argument 
>> but the components T, L and V of the Tuple instead.  A stub, which is 
>> like the parser you mentioned, tracks and changes state as an 
>> automaton to populate its properties appropriately with the stream of 
>> Tuple events.  The stub can be a TupleConsumer - really a tuple event 
>> consumer rather.  This would eliminate object creation overheads and 
>> populate the stub. 
>
>
> Could you not flatten it even further by making a compiler generated 
> stub act as both the producer and consumer?  This is the tack that I 
> am taking with my "smart" stubs.

I highly discourage this approach.  Reason being the nature of the 
relationship between ASN.1 and encodings.  As you know an ASN.1 spec can 
use any encoding.  Conventionally a protocol specifies an encoding and 
sticks to it so it seems to support your approach.  This however is not 
always the case and ASN.1 is being used in new ways where alternate 
encodings are being applied to different data structures based on the 
target: e.g. GSM network clients.  However these are not the strongest 
cases for why you should avoid this "smart" stub approach IMO. 

The most important reason is to decouple the generation of encoding 
specific code from the stub compiler.  If you make your stubs "encoding 
aware" then you're adding some serious complexity to the stub compiler 
IMO.  Why do this when you can avoid it and gain the ability to swap out 
the encoding at runtime? 

The way I like to visualize this: there is a common representation 
that the stub compiler needs to work with.  Rather than reading bytes 
from a stream, a stub responds to tuple events as its input, at a 
higher level.  Neither the stub compiler nor the stubs it generates 
need to be aware of the encoding used at the lowest level.  It's sort 
of like the way javac relates to the underlying runtime: the compiled 
byte codes are bound at runtime to the underlying native code that does 
the actual work.  Similarly, I'm recommending that the stub compiler 
generate a stub which deals only with TLVs, so that at runtime the 
source/target can be a BER, DER or PER binary stream, or even an 
XER-encoded ASCII stream.
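
To make the idea concrete, here is a rough sketch of a stub that only 
sees tuple events.  All names here (TupleConsumer, PersonStub, 
ShortFormBerProducer) and the two-field message are made up for 
illustration and are not existing code; the toy producer handles only 
single-octet tags and short-form lengths.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical callback interface: T, L and V arrive as plain arguments,
// with V given as a slice of the caller's buffer, so no Tuple is allocated.
interface TupleConsumer {
    void tuple(int tag, int length, byte[] buffer, int valueOffset);
}

// Hypothetical generated stub for a two-field message. It knows nothing
// about the wire encoding; it only advances a small automaton per event.
class PersonStub implements TupleConsumer {
    private enum State { EXPECT_NAME, EXPECT_AGE, DONE }

    private State state = State.EXPECT_NAME;
    private String name;
    private int age;

    @Override
    public void tuple(int tag, int length, byte[] buffer, int valueOffset) {
        if (state == State.EXPECT_NAME && tag == 0x0C) {            // UTF8String
            name = new String(buffer, valueOffset, length, StandardCharsets.UTF_8);
            state = State.EXPECT_AGE;
        } else if (state == State.EXPECT_AGE && tag == 0x02 && length == 1) {
            age = buffer[valueOffset];                               // one-octet INTEGER only
            state = State.DONE;
        } else {
            throw new IllegalStateException("unexpected tag " + tag + " in state " + state);
        }
    }

    String getName() { return name; }
    int getAge() { return age; }
    boolean isComplete() { return state == State.DONE; }
}

// Toy producer for one encoding: single-octet tags and short-form lengths
// only. A producer for another encoding could drive the same stub.
class ShortFormBerProducer {
    static void produce(byte[] in, TupleConsumer consumer) {
        int pos = 0;
        while (pos < in.length) {
            int tag = in[pos++] & 0xFF;
            int length = in[pos++] & 0xFF;
            if (length > 0x7F || pos + length > in.length) {
                throw new IllegalArgumentException("unsupported or truncated TLV");
            }
            consumer.tuple(tag, length, in, pos);   // emit the event, no Tuple object
            pos += length;
        }
    }
}
```

Swapping the encoding then means binding a different producer to the 
same stub; the stub code itself would not change.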

I think perhaps some of your concerns on the stub compiler side revolve 
around finding a tangible way for the ANTLR-based stub compiler to 
generate code that deals with a TLV stream rather than a byte/char 
stream.  I have this problem too - it is not easy.  In this regard, 
making the stub totally encoding aware may seem easier to do.

    -Alex

Re: [asn1] why use TLV objects at all?

Posted by "Alan D. Cabrera" <ad...@toolazydogs.com>.


Alex Karasulu wrote:

> Emmanuel,
>
> I was just thinking about your position on object creation.  Namely 
> the one that is against the creation of Tuple objects that represent 
> TLVs.  Your proposal to use pooling of these objects worries me a 
> bit.  It just makes me think there would be a lot of synchronization 
> overhead.  I may be wrong.

I was also concerned by this, as it may require roughly twice the 
memory: one copy for the Tuple structure of the message and one for the 
POJO that you are creating.

> However I started thinking, "why create Tuples at all?" Follow my 
> concepts here for a sec even though we have not been discussing these 
> constructs: TupleProducers and TupleConsumers.  A producer simply 
> emits callbacks to a consumer and they are bound to each other.  What 
> if the callbacks did not pass in a Tuple as an argument but the 
> components T, L and V of the Tuple instead.  A stub, which is like the 
> parser you mentioned, tracks and changes state as an automaton to 
> populate its properties appropriately with the stream of Tuple 
> events.  The stub can be a TupleConsumer - really a tuple event 
> consumer rather.  This would eliminate object creation overheads and 
> populate the stub. 

Could you not flatten it even further by making a compiler generated 
stub act as both the producer and consumer?  This is the tack that I am 
taking with my "smart" stubs.


Regards,
Alan

Re: [asn1] why use TLV objects at all?

Posted by "Alan D. Cabrera" <ad...@toolazydogs.com>.


Alex Karasulu wrote:

> Enrique Rodriguez wrote:
>
>> Alex Karasulu wrote:
>>
>>> Emmanuel,
>>>
>>> I was just thinking about your position on object creation.  Namely 
>>> the one that is against the creation of Tuple objects that represent 
>>> TLVs.  Your proposal to use pooling of these objects worries me a 
>>> bit.  It just makes me think there would be a lot of synchronization 
>>> overhead.  I may be wrong.
>>>
>>> However I started thinking, "why create Tuples at all?" Follow my 
>>> concepts here for a sec even though we have not been discussing 
>>> these constructs: TupleProducers and TupleConsumers.  A producer 
>>> simply emits callbacks to a consumer and they are bound to each 
>>> other.  What if the callbacks did not pass in a Tuple as an argument 
>>> but the components T, L and V of the Tuple instead.
>>
>>
>>
>> Do you even need the L?  You have the V at this point.
>
>
> The V may be a URI to data held elsewhere while streaming big Tuples 
> to disk.  Also L may be of the indefinite form.

I don't understand what the URI has to do with the need for L.  As for 
the latter statement, I think you mean that the tuple may be a 
constructed encoding.  In that case, I still agree w/ Enrique in that we 
may not need the L.


Regards,
Alan

Re: [asn1] why use TLV objects at all?

Posted by Alex Karasulu <ao...@bellsouth.net>.

Enrique Rodriguez wrote:

> Alex Karasulu wrote:
>
>> Emmanuel,
>>
>> I was just thinking about your position on object creation.  Namely 
>> the one that is against the creation of Tuple objects that represent 
>> TLVs.  Your proposal to use pooling of these objects worries me a 
>> bit.  It just makes me think there would be a lot of synchronization 
>> overhead.  I may be wrong.
>>
>> However I started thinking, "why create Tuples at all?" Follow my 
>> concepts here for a sec even though we have not been discussing 
>> these constructs: TupleProducers and TupleConsumers.  A producer 
>> simply emits callbacks to a consumer and they are bound to each 
>> other.  What if the callbacks did not pass in a Tuple as an argument 
>> but the components T, L and V of the Tuple instead.
>
>
> Do you even need the L?  You have the V at this point.

The V may be a URI to data held elsewhere while streaming big Tuples to 
disk.  Also L may be of the indefinite form.
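
For reference, a small sketch of how the first length octets of a BER 
TLV can be read.  With the indefinite form (first octet 0x80) the 
extent of V is not known when L is seen; the value runs until the 
end-of-contents octets 00 00.  The class name BerLength and the 
INDEFINITE marker are hypothetical, and the sketch caps long-form 
lengths at four octets.

```java
// Hypothetical helper: decodes the length octets of a BER TLV.
final class BerLength {
    // Marker returned when the indefinite form (0x80) is encountered.
    static final int INDEFINITE = -1;

    static int decode(byte[] in, int offset) {
        int first = in[offset] & 0xFF;
        if (first < 0x80) {
            return first;                        // short form: 0..127
        }
        if (first == 0x80) {
            return INDEFINITE;                   // indefinite form, ended by 00 00
        }
        int count = first & 0x7F;                // long form: number of length octets
        if (count > 4) {
            throw new IllegalArgumentException("length too large for this sketch");
        }
        long length = 0;
        for (int i = 1; i <= count; i++) {
            length = (length << 8) | (in[offset + i] & 0xFF);
        }
        if (length > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("length too large for this sketch");
        }
        return (int) length;
    }
}
```

In the indefinite case a callback could only receive a marker like 
this for L, not a byte count.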

>
> -enrique
>
>
>> A stub, which is like the parser you mentioned, tracks and changes 
>> state as an automaton to populate its properties appropriately with 
>> the stream of Tuple events.  The stub can be a TupleConsumer - really 
>> a tuple event consumer rather.  This would eliminate object creation 
>> overheads and populate the stub.
>>
>> Thoughts?
>>
>>    -Alex
>
>

Re: [asn1] why use TLV objects at all?

Posted by Enrique Rodriguez <er...@apache.org>.

Alex Karasulu wrote:
> Emmanuel,
> 
> I was just thinking about your position on object creation.  Namely the 
> one that is against the creation of Tuple objects that represent TLVs.  
> Your proposal to use pooling of these objects worries me a bit.  It just 
> makes me think there would be a lot of synchronization overhead.  I may 
> be wrong.
> 
> However I started thinking, "why create Tuples at all?" Follow my 
> concepts here for a sec even though we have not been discussing these 
> constructs: TupleProducers and TupleConsumers.  A producer simply emits 
> callbacks to a consumer and they are bound to each other.  What if the 
> callbacks did not pass in a Tuple as an argument but the components T, L 
> and V of the Tuple instead.

Do you even need the L?  You have the V at this point.

-enrique


> A stub, which is like the parser you 
> mentioned, tracks and changes state as an automaton to populate its 
> properties appropriately with the stream of Tuple events.  The stub can 
> be a TupleConsumer - really a tuple event consumer rather.  This would 
> eliminate object creation overheads and populate the stub.
> 
> Thoughts?
> 
>    -Alex