Posted to dev@tika.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/04/21 00:22:58 UTC
[jira] [Created] (TIKA-1609) Leverage Google's LibPhonenumber for
enhanced phone number extraction and metadata modeling
Lewis John McGibbney created TIKA-1609:
------------------------------------------
Summary: Leverage Google's LibPhonenumber for enhanced phone number extraction and metadata modeling
Key: TIKA-1609
URL: https://issues.apache.org/jira/browse/TIKA-1609
Project: Tika
Issue Type: New Feature
Components: core
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Fix For: 1.9
Google's libphonenumber can give us comprehensive support for modeling phone number metadata properly in Tika.
During the development of this patch I realized three things, namely
* This is not a parser as such, since phone numbers are not mapped to any particular MIME type
* In addition, there can be many phone numbers per document, so this is most likely a Content Handler of sorts
* Tika's Metadata support is currently too restrictive to allow us to persist many complex objects e.g. String, Object. We need to expand Metadata support over and above String, String[].
https://github.com/googlei18n/libphonenumber/
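
To illustrate the Content Handler idea above, here is a minimal sketch, not the actual patch. It assumes a handler built on the JDK's SAX DefaultHandler that buffers character events and collects every phone number candidate in the document. The class name PhoneExtractingHandler and the regular expression are hypothetical; the regex is only a stand-in for the matching and validation that libphonenumber would perform.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: gathers phone number candidates from SAX text events.
class PhoneExtractingHandler extends DefaultHandler {
    // Assumption: a crude stand-in pattern. A real implementation would
    // delegate matching and validation to libphonenumber instead.
    private static final Pattern CANDIDATE =
            Pattern.compile("\\+?\\d[\\d\\s().-]{6,}\\d");

    private final StringBuilder text = new StringBuilder();
    private final List<String> numbers = new ArrayList<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Buffer all text so numbers split across events are still found.
        text.append(ch, start, length);
    }

    @Override
    public void endDocument() {
        Matcher m = CANDIDATE.matcher(text);
        while (m.find()) {
            // Keep only digits and a leading plus sign.
            numbers.add(m.group().replaceAll("[^+\\d]", ""));
        }
    }

    // Many numbers per document, returned in the multi-valued String form
    // that the String, String[] metadata model can already hold.
    String[] getNumbers() {
        return numbers.toArray(new String[0]);
    }
}
```

Because the result is a plain String[], it fits the existing String, String[] model. Persisting richer per-number objects (for example, a parsed number with its region) is where the expanded Metadata support described above would be needed.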
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)