Posted to dev@tika.apache.org by Nick Burch <ni...@alfresco.com> on 2012/02/08 19:06:58 UTC
Re: Sharing metadata logic between parsers

On Mon, 30 Jan 2012, Jukka Zitting wrote:
>> If we're doing that sort of thing, then I'd rather we put the logic 
>> onto the Property for that. The Property already has a type and the 
>> closed list of allowed values (as Strings, from the XMPDM 
>> specification). It would seem to me that the logic for going to/from 
>> channel numbers would best live with the strings themselves, on the 
>> property, rather than outside?
>
> I'm thinking of cases where such convenience methods could rely on
> more than just a single property. For example, a getNumberOfPages()
> convenience method could look something like this (with an extra
> getInt(String) helper method):
>
>    int getNumberOfPages() {
>        Integer pages = metadata.getInt(PagedText.N_PAGES);
>        if (pages == null) {
>            pages = metadata.getInt(MSOffice.PAGE_COUNT);
>        }

After thinking about this a bit, I decided that the best bet was to try to
implement the solution and see how it actually worked.

However, I've just found a snag with Jukka's proposed solution above. I
wanted to put the logic on XMPDM, as static methods, because I thought
that the logic should really live with the property definition.
Unfortunately, this doesn't work: XMPDM is an interface, so Java won't
let you add static methods to it. I tried putting the static method on
Metadata itself, but I really didn't like the look of it.

Property itself is final, as we don't want people to go around adding
extra types that the specifications don't support, so I can't simply
add a converter method onto that.

In r1242018 I've committed a couple of different possible ways to do the 
conversion, which do work, so people can have a look and a think. I could 
see the static class being attached optionally to the Property, for either 
explicit calling, or implicit as Ray described.

Based on how it looks now that I've actually coded something, I'm leaning
towards implicit conversion, with metadata.set(Property, thing) calling
methods on an optional converter attached to the property. (I guess Ray
would be in favour of this, as it's quite like his idea!).
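
As a rough sketch of the implicit-conversion idea only: every name below
(ValueConverter, SketchProperty, SketchMetadata, ChannelCountConverter) is a
hypothetical stand-in, not the real Tika Property/Metadata API and not what
was committed in r1242018. The "Mono"/"Stereo" strings are assumed examples
of the closed list of channel types, and only those two are handled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical: converts between a typed value and the stored string form.
interface ValueConverter<T> {
    String toStored(T value);
    T fromStored(String stored);
}

// Simplified stand-in for Property: final, with an optional converter.
final class SketchProperty<T> {
    private final String name;
    private final ValueConverter<T> converter; // null means "no converter"

    SketchProperty(String name, ValueConverter<T> converter) {
        this.name = name;
        this.converter = converter;
    }

    String getName() { return name; }
    ValueConverter<T> getConverter() { return converter; }
}

// Simplified stand-in for Metadata: set() converts implicitly when the
// property carries a converter, otherwise falls back to String.valueOf().
final class SketchMetadata {
    private final Map<String, String> values = new HashMap<>();

    <T> void set(SketchProperty<T> property, T value) {
        ValueConverter<T> c = property.getConverter();
        String stored = (c != null) ? c.toStored(value) : String.valueOf(value);
        values.put(property.getName(), stored);
    }

    String get(SketchProperty<?> property) {
        return values.get(property.getName());
    }

    // Returns null if nothing is stored or the property has no converter.
    <T> T getConverted(SketchProperty<T> property) {
        String stored = values.get(property.getName());
        ValueConverter<T> c = property.getConverter();
        return (stored == null || c == null) ? null : c.fromStored(stored);
    }
}

// Assumed mapping between channel counts and channel-type strings.
final class ChannelCountConverter implements ValueConverter<Integer> {
    @Override
    public String toStored(Integer channels) {
        switch (channels) {
            case 1: return "Mono";
            case 2: return "Stereo";
            default:
                throw new IllegalArgumentException("Unsupported channel count: " + channels);
        }
    }

    @Override
    public Integer fromStored(String stored) {
        if ("Mono".equals(stored)) return 1;
        if ("Stereo".equals(stored)) return 2;
        return null; // unknown or unmapped channel type
    }
}
```

With something like this, a parser would just call set(channelsProperty, 2)
and the string form would end up in the metadata, so neither the XMPDM
interface nor the main Metadata class needs per-property static helpers.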

Thoughts? Should I go ahead and try to implement my plan, so we can see if 
we like how it looks in real code and if there are any problems with doing 
it? Can anyone think of a cleaner way to do it? Can anyone think of a way 
to do what Jukka suggested, but without having to clutter up the main 
concrete class?

Cheers
Nick