Posted to dev@uima.apache.org by Richard Eckart de Castilho <ri...@gmail.com> on 2013/08/02 20:29:31 UTC
Re: How to use the new binary CAS (de)serialization?
Hi,
I'm still trying to use the new serialization methods but continue
running into problems.
Last time we discussed that I need to know the original type system
when I want to deserialize a format 6 binary CAS into a CAS.
So when I serialize the CAS now, I first write a header, then I
dump the type system into my output stream, and then the binary CAS
using
serializeWithCompression(cas, outputStream, cas.getTypeSystem());
When I read the data, I check for my header. If it is there, I
read the type system.
Now I wanted to call
deserializeCAS(cas, inputStream, typeSystem, null);
Unfortunately, that fails. The reason is that this signature of
deserializeCAS immediately uses BinaryCasSerDes6 to read
data from the input stream. However, serializeWithCompression
writes a header before the data that BinaryCasSerDes6 reads. This
header is read by deserializeCAS(cas, inputStream), but
with that signature I have no way of specifying the original
type system.
Of course I can copy the whole header checking code from CASImpl,
but I don't think that is a good solution. I think the
deserializeCAS methods that UIMA provides should either all deal
with the header that the serializeWithCompression methods write,
or none should.
Maybe a solution for this dilemma is something that could also
go into a 2.4.2 release.
Cheers,
-- Richard
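For illustration, the container layout described in the message above (a custom header, then the type system, then the binary CAS) could be sketched as follows. This is only a sketch using the JDK, not UIMA code: the marker bytes, the class name FramedCasContainer, and both method names are hypothetical, and the type system and the binary CAS are treated as opaque byte arrays standing in for whatever the real serializers produce.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class FramedCasContainer {
    // Hypothetical 4-byte marker for this sketch; it is not part of any UIMA format.
    static final byte[] MAGIC = "TSC1".getBytes(StandardCharsets.US_ASCII);

    /** Writes the marker, the length-prefixed type system bytes, then the payload. */
    static void write(OutputStream out, byte[] typeSystem, byte[] payload) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.write(MAGIC);
        data.writeInt(typeSystem.length);
        data.write(typeSystem);
        data.write(payload); // stand-in for the bytes of the binary CAS
        data.flush();
    }

    /**
     * Returns the embedded type system bytes, or null if the marker is absent.
     * Either way the stream is left where the binary payload is expected to start.
     */
    static byte[] readTypeSystemIfPresent(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(MAGIC.length);
        byte[] head = new byte[MAGIC.length];
        int n = 0;
        while (n < head.length) {
            int r = in.read(head, n, head.length - n);
            if (r < 0) {
                break; // stream is shorter than the marker
            }
            n += r;
        }
        if (n < head.length || !Arrays.equals(head, MAGIC)) {
            in.reset(); // no marker: rewind so the caller sees the untouched stream
            return null;
        }
        // DataInputStream does not buffer, so it consumes exactly the bytes asked for.
        DataInputStream data = new DataInputStream(in);
        byte[] typeSystem = new byte[data.readInt()];
        data.readFully(typeSystem);
        return typeSystem;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        write(buf, "types".getBytes(StandardCharsets.UTF_8), new byte[] {42, 43});
        InputStream in = new ByteArrayInputStream(buf.toByteArray());
        byte[] ts = readTypeSystemIfPresent(in);
        System.out.println(new String(ts, StandardCharsets.UTF_8)); // prints "types"
        System.out.println(in.read()); // prints 42, the first payload byte
    }
}
```

The mark/reset step means a stream without the custom header can still be handed on unchanged, at the cost of requiring a stream that supports mark/reset.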
Re: How to use the new binary CAS (de)serialization?
Posted by Richard Eckart de Castilho <ri...@gmail.com>.
I have it working by now. My last issue was the subtyping of
the document annotation, which was a problem in a unit test
that I wrote, but is unlikely to be a problem in actual use.
-- Richard
On 05.08.2013 at 17:40, Marshall Schor <ms...@schor.com> wrote:
> I think if you "pre-read" some info from a stream, and then pass that stream to
> the reinit (or other method of binary deserialization), it just continues
> reading from wherever the stream was positioned, so I think your approach ought
> to work...
>
> -Marshall
Re: How to use the new binary CAS (de)serialization?
Posted by Marshall Schor <ms...@schor.com>.
I think if you "pre-read" some info from a stream, and then pass that stream to
the reinit (or other method of binary deserialization), it just continues
reading from wherever the stream was positioned, so I think your approach ought
to work...
-Marshall
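The point about stream position can be checked with a small JDK-only sketch. The class PreReadDemo and its methods are hypothetical names, and the "consumer" is just a stand-in for a binary deserializer, not UIMA code. The second method shows one case where the approach can go wrong: if the pre-read goes through a buffering wrapper and the underlying stream is then passed on, the wrapper may already have consumed the bytes the deserializer needs.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

final class PreReadDemo {
    /** Stand-in for a binary deserializer: reports the first byte it sees, or -1 at end of stream. */
    static int firstByteSeenByConsumer(InputStream in) throws IOException {
        return in.read();
    }

    /** Pre-reads a 2-byte prefix directly from the stream, then hands the same stream on. */
    static int preReadDirectly(byte[] bytes) throws IOException {
        InputStream in = new ByteArrayInputStream(bytes);
        new DataInputStream(in).readFully(new byte[2]);
        return firstByteSeenByConsumer(in);
    }

    /** Pre-reads through a buffering wrapper, then hands on the underlying stream. */
    static int preReadThroughBuffer(byte[] bytes) throws IOException {
        InputStream in = new ByteArrayInputStream(bytes);
        InputStream buffered = new BufferedInputStream(in);
        // The wrapper fills its internal buffer, draining this small underlying stream.
        new DataInputStream(buffered).readFully(new byte[2]);
        return firstByteSeenByConsumer(in);
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = {10, 11, 20, 21};
        System.out.println(preReadDirectly(bytes));      // prints 20
        System.out.println(preReadThroughBuffer(bytes)); // prints -1
    }
}
```

When a buffering wrapper is used for the pre-read, passing that same wrapper to the deserializer avoids the problem.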
On 8/2/2013 2:34 PM, Richard Eckart de Castilho wrote:
> Hm, I just noticed that my problem analysis was not quite correct.
> BinaryCasSerDes6 is indeed able to handle the header… so my problem
> must be somewhere else.
>
> -- Richard
Re: How to use the new binary CAS (de)serialization?
Posted by Richard Eckart de Castilho <ri...@gmail.com>.
Hm, I just noticed that my problem analysis was not quite correct.
BinaryCasSerDes6 is indeed able to handle the header… so my problem
must be somewhere else.
-- Richard