Posted to user@lucy.apache.org by Knut Arne Bjørndal <kn...@easyconnect.no> on 2014/05/05 16:58:25 UTC

[lucy-user] Serialization error when inheriting from Lucy::Analysis::Normalizer

Hi

I'm implementing a custom normalization analyzer and it would be
convenient to let it inherit from the default implementation, but I'm
having trouble with the serialization and deserialization.

I override dump to add some extra data to the serialized object, which
seems to be the part with the problem.

Minimal test case:
package My::Normalizer;
use base qw( Lucy::Analysis::Normalizer );

sub dump {
  shift->SUPER::dump(@_);
}

1;

Adding this to a schema causes an exception to be thrown when the schema
is deserialized:
Can't downcast from Lucy::Object::CharBuf to Lucy::Object::BoolNum
	lucy_Normalizer_load at .../Normalizer.c line 151
	at /usr/lib/perl5/Lucy.pm line 239
	Lucy::Index::Indexer::new('Lucy::Index::Indexer', 'index',
'index.1246/', 'schema', 'Lucy::Plan::Schema=SCALAR(0x2e4fc90)',
'create', 1)

Looking at schema.json, a Lucy::Analysis::Normalizer is serialized to:
        {
          "_class": "Lucy::Analysis::Normalizer",
          "case_fold": true,
          "normalization_form": "NFKC",
          "strip_accents": false
        },
while my subclassed instance is serialized to:
        {
          "_class": "My::Normalizer",
          "case_fold": "1",
          "normalization_form": "NFKC",
          "strip_accents": "0"
        },

So it looks like the serializer doesn't correctly handle boolean values
when called from Perl?

-- 
Knut Arne Bjørndal, Tekniker Easy Connect AS - http://1890.no
E-post: knut.arne.bjorndal@easyconnect.no
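
The difference between the two dumps above can be reproduced outside Lucy.
The following is only a rough sketch: it uses the core JSON::PP module as a
stand-in and is not Lucy's own serializer. It assumes the overridden dump
ends up handing the flag back as an ordinary string scalar such as "1".

```perl
use strict;
use warnings;
use JSON::PP;

# canonical() sorts keys so the output is stable.
my $json = JSON::PP->new->canonical;

# A plain string scalar is encoded as a JSON string.
my $as_string = $json->encode({ case_fold => "1" });

# JSON::PP's dedicated boolean object is encoded as a JSON boolean.
my $as_bool = $json->encode({ case_fold => JSON::PP::true() });

print "$as_string\n";   # {"case_fold":"1"}
print "$as_bool\n";     # {"case_fold":true}
```

Perl has no native boolean type, so a serializer can only emit true/false
when it is handed an object that explicitly marks the value as a boolean.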


Re: [lucy-user] Serialization error when inheriting from Lucy::Analysis::Normalizer

Posted by Marvin Humphrey <ma...@rectangular.com>.
Hi,

On Mon, May 5, 2014 at 7:58 AM, Knut Arne Bjørndal
<kn...@easyconnect.no> wrote:

> Minimal test case:

Thanks for providing the test case!  It allowed me to isolate the problem
right away -- here's a quick fix:

https://github.com/rectang/lucy/commit/91b7a5fde2e18370c81a9189591fa25b36b3e963

> So it looks like the serializer doesn't correctly handle boolean values
> when called from perl?

Round-trip serialization of booleans is tricky for languages like Perl which
don't provide a dedicated boolean type.  We can probably make some general
improvements on that front, but in any case it's better for client code like
Normalizer#Load to be forgiving and call To_Bool() on whatever object is
present rather than insist on a specific boolean type.

Marvin Humphrey
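
The "forgiving" approach described above can be sketched in Perl as follows.
This is an illustration only, not the code from the linked commit: to_bool
is a hypothetical helper, JSON::PP stands in for Lucy's JSON handling, and
the coercion simply relies on Perl truthiness (so a string like "false"
would still count as true).

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: accept whatever the deserializer produced and
# reduce it to a plain 1 or 0 instead of requiring a boolean object.
sub to_bool {
    my ($value) = @_;
    return 0 unless defined $value;
    return $value ? 1 : 0;
}

my $json = JSON::PP->new;

# Dump as written for the stock class (real JSON booleans) ...
my $native = $json->decode('{"case_fold": true, "strip_accents": false}');
# ... and as written for the Perl subclass (stringified flags).
my $via_perl = $json->decode('{"case_fold": "1", "strip_accents": "0"}');

for my $dump ($native, $via_perl) {
    printf "case_fold=%d strip_accents=%d\n",
        to_bool($dump->{case_fold}), to_bool($dump->{strip_accents});
}
# Both iterations print: case_fold=1 strip_accents=0
```

Coercing at load time means either form of the dump is read back to the
same settings.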