Posted to user@uima.apache.org by Greg Holmberg <ho...@comcast.net> on 2010/01/29 19:40:34 UTC

UIMA-AS binary serialization

Hi UIMA users--

I see in the README for 2.3 that UIMA-AS uses a new, efficient binary  
serialization for remote services.

I couldn't find much information about it in the Async Scaleout docs.  It  
was briefly mentioned as a configuration option, but not described.

Is this the same format that is used to serialize to C++?

If not, where can I find more information?

Must the recipient re-constitute the CAS, or is it self-describing like  
XML and could be handled by a non-UIMA recipient?

Thanks,

Greg Holmberg


Re: UIMA-AS binary serialization

Posted by Eddie Epstein <ea...@gmail.com>.
On Fri, Jan 29, 2010 at 5:07 PM, Marshall Schor <ms...@schor.com> wrote:
>
>
> Greg Holmberg wrote:
>> Hi UIMA users--
>>
>> I see in the README for 2.3 that UIMA-AS uses a new, efficient binary
>> serialization for remote services.
>>
>> I couldn't find much information about it in the Async Scaleout docs.
>> It was briefly mentioned as a configuration option, but not described.
>>
>> Is this the same format that is used to serialize to C++?
>
> I believe it's similar but not exactly the same.

The data format is the same for Java and C++. It supports either byte
order and relies on the recipient to swap bytes if necessary.
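
To illustrate the "recipient swaps if necessary" idea, here is a minimal
sketch in plain Java. It is not the actual UIMA-AS wire format: the marker
value, the class name ByteOrderSketch, and the encode/decode helpers are
all made up for the example. The sender writes in whatever byte order it
likes, and the recipient inspects a leading marker to decide whether to
swap.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class ByteOrderSketch {
    // Hypothetical marker (an assumption, not taken from UIMA): the sender
    // writes this int first, in its own byte order.
    static final int MARKER = 0x01020304;

    // Sender side: write the marker and the payload in the sender's byte order.
    static byte[] encode(int[] values, ByteOrder senderOrder) {
        ByteBuffer buf = ByteBuffer.allocate(4 * (values.length + 1)).order(senderOrder);
        buf.putInt(MARKER);
        for (int v : values) {
            buf.putInt(v);
        }
        return buf.array();
    }

    // Recipient side: read the marker as big-endian; if it comes out
    // byte-reversed, the sender used little-endian, so switch before reading on.
    static int[] decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN);
        int marker = buf.getInt();
        if (marker == Integer.reverseBytes(MARKER)) {
            buf.order(ByteOrder.LITTLE_ENDIAN);
        } else if (marker != MARKER) {
            throw new IllegalArgumentException("unrecognized byte-order marker");
        }
        int[] out = new int[buf.remaining() / 4];
        for (int i = 0; i < out.length; i++) {
            out[i] = buf.getInt();
        }
        return out;
    }

    public static void main(String[] args) {
        int[] payload = {7, 42, -1};
        // Both round trips recover the original values, whichever order was sent.
        System.out.println(Arrays.toString(decode(encode(payload, ByteOrder.BIG_ENDIAN))));
        System.out.println(Arrays.toString(decode(encode(payload, ByteOrder.LITTLE_ENDIAN))));
    }
}
```

The point of this design is that the sender never pays for a swap; only a
recipient whose order differs does any extra work.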

However, the service wrapper for UIMACPP does not currently support
binary serialization, because the C++ framework does not generate the
same binary typecodes as Java when given the same input type system
descriptions. It should not be a large amount of work to fix this.
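
A toy sketch of why the typecodes matter, in plain Java. This is not how
either framework actually assigns codes: the registration-order scheme, the
class name TypeCodeSketch, and the type names Token and Sentence are
assumptions for illustration only. If two sides number the same types
differently, a code written by one side names a different type on the
other.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TypeCodeSketch {
    // Hypothetical scheme: number the types 1, 2, 3, ... in registration order.
    static Map<String, Integer> assignCodes(List<String> typeNames) {
        Map<String, Integer> codes = new LinkedHashMap<>();
        for (String name : typeNames) {
            codes.putIfAbsent(name, codes.size() + 1);
        }
        return codes;
    }

    // Reverse lookup: which type name does this side associate with a code?
    static String nameFor(Map<String, Integer> codes, int code) {
        for (Map.Entry<String, Integer> e : codes.entrySet()) {
            if (e.getValue() == code) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Same two types on both sides, but registered in a different order.
        Map<String, Integer> sender = assignCodes(Arrays.asList("Token", "Sentence"));
        Map<String, Integer> receiver = assignCodes(Arrays.asList("Sentence", "Token"));

        int onTheWire = sender.get("Token");
        System.out.println("sender wrote code " + onTheWire + " for Token");
        System.out.println("receiver reads it as " + nameFor(receiver, onTheWire));
    }
}
```

A binary format that sends only the integer codes, rather than type names
as XML does, is compact, but it only works when both ends derive identical
codes from the type system.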

I believe ActiveMQ supports message compression to help with network
bandwidth. Another thing that would help some applications is
implementing CAS projections, where only the parts of the CAS specified
by a service AE descriptor would be sent.

Eddie

Re: UIMA-AS binary serialization

Posted by Marshall Schor <ms...@schor.com>.

Greg Holmberg wrote:
> Hi UIMA users--
>
> I see in the README for 2.3 that UIMA-AS uses a new, efficient binary
> serialization for remote services.
>
> I couldn't find much information about it in the Async Scaleout docs. 
> It was briefly mentioned as a configuration option, but not described.
>
> Is this the same format that is used to serialize to C++?

I believe it's similar but not exactly the same.
>
> If not, where can I find more information?

Read the open source code :-)
>
>
> Must the recipient re-constitute the CAS, or is it self-describing
> like XML and could be handled by a non-UIMA recipient?

It's much lower level than XML, and requires the recipient to have an
*identical* type system definition.  See note on this topic from Edward
Epstein in the thread "Re: XMI parsing" from 1/27/2010 10:03 AM.
>
>
> Thanks,
>
> Greg Holmberg
>
>
>