Posted to issues@opennlp.apache.org by "Joern Kottmann (JIRA)" <ji...@apache.org> on 2017/01/16 14:37:26 UTC

[jira] [Closed] (OPENNLP-738) AbstractDataIndexer#sortAndMerge sets up callers for a NullPointerException

     [ https://issues.apache.org/jira/browse/OPENNLP-738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joern Kottmann closed OPENNLP-738.
----------------------------------
    Resolution: Won't Fix

Now an appropriate exception is thrown.

> AbstractDataIndexer#sortAndMerge sets up callers for a NullPointerException
> ---------------------------------------------------------------------------
>
>                 Key: OPENNLP-738
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-738
>             Project: OpenNLP
>          Issue Type: Bug
>            Reporter: Chris Lewis
>         Attachments: AbstractDataIndexer.java-NPE.patch
>
>
> In its constructor, the {{OnePassDataIndexer}} calls {{sortAndMerge}} of its parent class, {{AbstractDataIndexer}} (source file {{opennlp-tools/src/main/java/opennlp/tools/ml/model/AbstractDataIndexer.java}}). A quick read through the source of these two classes shows that the member variable {{contexts}} is initialized only by this method; otherwise it remains {{null}}. Note that when {{sort}} is {{true}} (which it is as called) and there are fewer than two events, the method returns early, leaving {{contexts}} uninitialized. Note also that {{getContexts}} exposes this variable, and that {{GIS.trainModel}} delegates to the {{trainModel}} method of {{GISTrainer}}. Line 263 attempts to read {{contexts.length}}; with fewer than two events in the stream, {{contexts}} is {{null}}, so this results in a {{NullPointerException}}.
> I'm not an expert in the algorithms relying on this code, but [some|http://comments.gmane.org/gmane.comp.apache.opennlp.user/564] [googling|http://blog.gmane.org/gmane.comp.apache.opennlp.user/month=20140501] shows a few incidents that lead back to this behavior, including at least the tickets OPENNLP-316 and OPENNLP-488. It may be that no use of this code can function correctly with fewer than 2 events, but I don't know that. As such, not being an expert on the natural constraints of the inputs to {{sortAndMerge}}, I'd like to suggest 2 possible improvements: 1) default {{contexts}} and the other private arrays that are set in the >= 2 path of this code to non-null values, or 2) throw an explicit {{IllegalArgumentException}} stating that >= 2 events are required for the calculation.
> The latter is not as desirable as the former (for which I've attached a patch), but at least it provides a targeted, unambiguous reason for why an exception is being thrown.
> Also I apologize for not specifying the version or component, as I'm not clear on how the project source is organized with respect to the published artifacts. This issue is present in trunk whose parent pom claims a version of {{1.6.1-SNAPSHOT}}.
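
The failure mode and the two proposed remedies described in the report can be illustrated with a minimal sketch. The classes below are hypothetical, simplified stand-ins and are not OpenNLP code: the names {{IndexerSketch}}, {{Unguarded}}, {{Defaulting}} and {{Strict}} are invented for illustration, and the body of {{sortAndMerge}} only mimics the reported early return rather than the real sorting and merging logic. The exact condition under which option 2 should throw is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; none of these classes are OpenNLP code.
class IndexerSketch {

    /** Reported shape: contexts is assigned only when there are two or more events. */
    static class Unguarded {
        protected int[][] contexts; // stays null if sortAndMerge returns early

        void sortAndMerge(List<int[]> events, boolean sort) {
            if (sort && events.size() < 2) {
                return; // early return: contexts is never assigned on this path
            }
            contexts = events.toArray(new int[0][]);
        }

        int[][] getContexts() {
            return contexts;
        }
    }

    /** Option 1 from the report: start from a non-null default. */
    static class Defaulting extends Unguarded {
        Defaulting() {
            contexts = new int[0][];
        }
    }

    /** Option 2 from the report: reject too few events with a targeted exception. */
    static class Strict extends Unguarded {
        @Override
        void sortAndMerge(List<int[]> events, boolean sort) {
            // Assumption: the check applies regardless of the sort flag.
            if (events.size() < 2) {
                throw new IllegalArgumentException(
                        "At least 2 events are required, got " + events.size());
            }
            super.sortAndMerge(events, sort);
        }
    }

    public static void main(String[] args) {
        List<int[]> oneEvent = new ArrayList<>();
        oneEvent.add(new int[] {1, 2, 3});

        Unguarded unguarded = new Unguarded();
        unguarded.sortAndMerge(oneEvent, true);
        // A caller reading unguarded.getContexts().length here would hit the NPE.
        System.out.println("unguarded contexts is null: " + (unguarded.getContexts() == null));

        Defaulting defaulting = new Defaulting();
        defaulting.sortAndMerge(oneEvent, true);
        System.out.println("defaulting contexts length: " + defaulting.getContexts().length);

        try {
            new Strict().sortAndMerge(oneEvent, true);
        } catch (IllegalArgumentException e) {
            System.out.println("strict: " + e.getMessage());
        }
    }
}
```

Under these assumptions, option 1 lets a caller proceed with an empty array, which may only move the failure further downstream, while option 2 fails immediately with a message that names the actual constraint.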


