Posted to dev@tika.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/10/08 22:50:34 UTC
[jira] [Created] (TIKA-1438) PhoneExtractingContentHandler to not
add individual MD entries for individual phone numbers
Lewis John McGibbney created TIKA-1438:
------------------------------------------
Summary: PhoneExtractingContentHandler to not add individual MD entries for individual phone numbers
Key: TIKA-1438
URL: https://issues.apache.org/jira/browse/TIKA-1438
Project: Tika
Issue Type: Bug
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Minor
Fix For: 1.7
Right now the PhoneExtractingContentHandler adds each phone number as an individual metadata entry. I feel that this is cumbersome.
For example, a webpage containing several phone numbers produces many fields of the same type, each with a different value!
I propose we reverse this and have one field with multiple values.
I would fully understand the current behaviour if we wished to augment the phone numbers further by associating a dialing code, country, carrier, etc. with each one. However, we are not currently doing this.
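To make the proposal concrete, here is a rough sketch of what "one field with multiple values" could look like. It is not the patch itself. Tika's Metadata class already supports multi-valued fields through add(name, value) and getValues(name), but the sketch below uses a plain-Java stand-in so that it runs without Tika on the classpath. The class names, the field name "phonenumbers", and the sample numbers are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PhoneMetadataSketch {

    // Hypothetical field name, chosen for this illustration only.
    static final String PHONE_FIELD = "phonenumbers";

    // Stand-in for a metadata store: each field name maps to a list of values.
    static final class MultiValuedMetadata {
        private final Map<String, List<String>> fields = new LinkedHashMap<>();

        // Appends a value to the named field, creating the field on first use.
        void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        // Returns the values stored for a field, or an empty list if absent.
        List<String> getValues(String name) {
            return fields.getOrDefault(name, List.of());
        }

        // Number of distinct field names currently stored.
        int fieldCount() {
            return fields.size();
        }
    }

    // Records every extracted number under the single shared field.
    static MultiValuedMetadata collect(List<String> numbers) {
        MultiValuedMetadata md = new MultiValuedMetadata();
        for (String number : numbers) {
            md.add(PHONE_FIELD, number);
        }
        return md;
    }

    public static void main(String[] args) {
        // Made-up sample numbers standing in for handler output.
        MultiValuedMetadata md = collect(List.of("555-0100", "555-0199"));
        System.out.println(md.fieldCount() + " field, values: "
                + md.getValues(PHONE_FIELD));
    }
}
```

With this layout a consumer reads every number from one known field instead of discovering an unknown number of separate entries.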
Patch coming for trunk.