Posted to issues@metron.apache.org by "Jon Zeolla (JIRA)" <ji...@apache.org> on 2016/11/14 15:13:00 UTC

[jira] [Updated] (METRON-545) Truncate fields larger than 32766 bytes

     [ https://issues.apache.org/jira/browse/METRON-545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jon Zeolla updated METRON-545:
------------------------------
    Summary: Truncate fields larger than 32766 bytes  (was: Truncate fields larger than 32766)

> Truncate fields larger than 32766 bytes
> ---------------------------------------
>
>                 Key: METRON-545
>                 URL: https://issues.apache.org/jira/browse/METRON-545
>             Project: Metron
>          Issue Type: Sub-task
>            Reporter: Jon Zeolla
>            Priority: Minor
>
> Lucene cannot index an individual term larger than 32766 bytes. With UTF-8 encoding, where a character takes up to 4 bytes, that guarantees room for only 8,191 characters in the worst case (pure ASCII text can reach 32,766 characters). Assuming that we cannot easily identify the field datatype per the intent of the user (string vs integer vs ...), we should truncate fields if they are larger than 32766 bytes.  This should be somewhat rare, but even when it occurs we can leverage the dual storage (HDFS and Lucene), integrity checking fields (METRON-544), and customizability of the UI (METRON-195) to retrieve the full original field value.
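
For illustration only, here is a minimal sketch of byte-based truncation as described in the issue. It is not Metron's implementation; the function name truncate_utf8 and the constant MAX_TERM_BYTES are hypothetical. The point it shows is that the cut has to be made on the encoded bytes rather than on the character count, and that a multi-byte character split by the cut must be dropped so the result remains valid UTF-8.

```python
# Hypothetical constant: the per-term byte limit cited in the issue.
MAX_TERM_BYTES = 32766


def truncate_utf8(value: str, max_bytes: int = MAX_TERM_BYTES) -> str:
    """Return the longest prefix of value whose UTF-8 encoding fits in max_bytes."""
    encoded = value.encode("utf-8")
    if len(encoded) <= max_bytes:
        # Already within the limit: return the value unchanged.
        return value
    # Cut on the byte boundary. If the cut lands inside a multi-byte
    # character, errors="ignore" drops the incomplete trailing bytes.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")
```

With this approach a field of 4-byte characters is cut to 8,191 characters (32,764 bytes), while an ASCII-only field keeps all 32,766 characters. The untruncated value would still need to be recovered from the other store, as the description notes.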



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)