Posted to dev@tika.apache.org by "soomyung (JIRA)" <ji...@apache.org> on 2019/07/17 21:00:02 UTC

[jira] [Created] (TIKA-2909) Contributing HWP v5 Parser

soomyung created TIKA-2909:
------------------------------

             Summary: Contributing HWP v5 Parser
                 Key: TIKA-2909
                 URL: https://issues.apache.org/jira/browse/TIKA-2909
             Project: Tika
          Issue Type: New Feature
          Components: parser
    Affects Versions: 2.0
            Reporter: soomyung


I have written a parser for HWP v5. HWP is the file format of the Hancom Word Processor, which is very popular in South Korea. The parser provides the following features:

1. AutoDetectParser can detect the HWP v5 format.

2. Text extraction from HWP v5 files.

3. Metadata extraction from the summary information of HWP v5 files.
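
As a rough illustration of the first step behind detection (item 1), here is a minimal sketch that assumes HWP v5 files are stored as OLE2 compound documents, as the published format description states. It only checks the 8-byte OLE2 signature at the start of a stream; a real detector would still have to open the container and look for HWP-specific streams to tell HWP apart from other OLE2-based formats. The class and method names (Ole2MagicCheck, looksLikeOle2) are hypothetical and are not taken from the contributed parser.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper, not part of Tika or of the contributed parser.
final class Ole2MagicCheck {

    // Signature found at the start of every OLE2 compound file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    private Ole2MagicCheck() {
    }

    // Returns true if the stream starts with the OLE2 signature.
    // This is necessary but not sufficient for an HWP v5 file.
    static boolean looksLikeOle2(InputStream in) throws IOException {
        byte[] head = new byte[OLE2_MAGIC.length];
        int filled = 0;
        // read() may return fewer bytes than requested, so loop until
        // the buffer is full or the stream ends.
        while (filled < head.length) {
            int read = in.read(head, filled, head.length - filled);
            if (read < 0) {
                break;
            }
            filled += read;
        }
        return filled == head.length && Arrays.equals(head, OLE2_MAGIC);
    }
}
```

A caller would pass the first bytes of a candidate file to looksLikeOle2 and, only on a match, go on to inspect the container for HWP-specific content.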

 

I'll submit the contribution via GitHub.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)