Posted to dev@tika.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2014/10/08 22:50:34 UTC

[jira] [Created] (TIKA-1438) PhoneExtractingContentHandler to not add individual MD entries for individual phone numbers

Lewis John McGibbney created TIKA-1438:
------------------------------------------

             Summary: PhoneExtractingContentHandler to not add individual MD entries for individual phone numbers
                 Key: TIKA-1438
                 URL: https://issues.apache.org/jira/browse/TIKA-1438
             Project: Tika
          Issue Type: Bug
            Reporter: Lewis John McGibbney
            Assignee: Lewis John McGibbney
            Priority: Minor
             Fix For: 1.7


Right now the PhoneExtractingContentHandler adds each phone number as an individual metadata entry, which I feel is cumbersome.

For example, if a webpage contains several phone numbers, we end up with many fields of the same type, each holding a different value.
I propose we reverse this and have one field with multiple values.
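
To illustrate the difference, here is a minimal sketch using plain Java collections. It does not use Tika's Metadata class or the handler itself. The class name, the method names and the field names "phonenumber_N" and "phonenumbers" are assumptions made up for this illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a metadata store: field name -> list of values.
// This is NOT Tika's Metadata API, only a model of the two layouts.
class PhoneMetadataSketch {

    // Layout described as cumbersome: one field per phone number.
    // The "phonenumber_N" naming is an assumption for illustration.
    static Map<String, List<String>> oneFieldPerNumber(List<String> numbers) {
        Map<String, List<String>> md = new LinkedHashMap<>();
        for (int i = 0; i < numbers.size(); i++) {
            List<String> single = new ArrayList<>();
            single.add(numbers.get(i));
            md.put("phonenumber_" + i, single);
        }
        return md;
    }

    // Proposed layout: a single field holding every phone number.
    // The "phonenumbers" field name is likewise an assumption.
    static Map<String, List<String>> oneMultiValuedField(List<String> numbers) {
        Map<String, List<String>> md = new LinkedHashMap<>();
        if (!numbers.isEmpty()) {
            md.put("phonenumbers", new ArrayList<>(numbers));
        }
        return md;
    }

    public static void main(String[] args) {
        // Made-up sample values.
        List<String> found = Arrays.asList("555-0100", "555-0101", "555-0102");

        System.out.println("fields, one per number: " + oneFieldPerNumber(found).size());
        System.out.println("fields, multi-valued:   " + oneMultiValuedField(found).size());
        System.out.println("values in single field: " + oneMultiValuedField(found).get("phonenumbers"));
    }
}
```

With the second layout a consumer reads one known field and iterates over its values, rather than discovering an unknown number of field names.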

I would fully understand the current behaviour if we wished to augment the phone numbers further by associating dialing code, country, carrier, etc. However, we are not currently doing this.

Patch coming for trunk.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)