Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2007/08/17 21:36:31 UTC
[jira] Created: (LUCENE-985) AAIOB thrown when length of termText
is longer than 16384 characters
AAIOB thrown when length of termText is longer than 16384 characters
--------------------------------------------------------------------
Key: LUCENE-985
URL: https://issues.apache.org/jira/browse/LUCENE-985
Project: Lucene - Java
Issue Type: Bug
Components: Index
Affects Versions: 2.3
Reporter: Michael McCandless
Assignee: Michael McCandless
Priority: Minor
Fix For: 2.3
DocumentsWriter has a max term length of 16384; if you cross that you
get an unfriendly AIOOB. We should fix to raise a clearer exception.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
[jira] Commented: (LUCENE-985) AIOOB thrown when length of termText
is longer than 16384 characters (ArrayIndexOutOfBoundsException)
Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520727 ]
Michael McCandless commented on LUCENE-985:
-------------------------------------------
> As a clarification point for people who stumble upon this issue
> years from now after encountering whatever exception we put in place
> of the current one...why is there a max termText length?
This is because DocumentsWriter packs the term text for each unique
term seen into a pool of char[] blocks of 16384 chars each (to avoid
GC overhead of each separate String). So, every time a new term is
seen, it puts it at the end of the current block; when there's not
enough space it allocates another block from the pool. So a given
term must fit entirely into a single block.
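(Illustration added for readers, not taken from the patch: a minimal Java sketch of the block-packing scheme described above, assuming a plain list of char[] blocks. The class and method names are hypothetical and are not Lucene's actual DocumentsWriter code. It also shows the kind of explicit length check that gives a clearer error than an ArrayIndexOutOfBoundsException.)

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of packing term text into fixed-size char[] blocks.
// Not Lucene's implementation; names and layout are assumptions.
final class CharBlockPoolSketch {
    // Block size in chars, as stated in the comment above.
    static final int BLOCK_SIZE = 16384;

    private final List<char[]> blocks = new ArrayList<>();
    private char[] current; // block currently being filled
    private int upto;       // next free position in the current block

    CharBlockPoolSketch() {
        nextBlock();
    }

    private void nextBlock() {
        current = new char[BLOCK_SIZE];
        blocks.add(current);
        upto = 0;
    }

    /** Copies the term into the pool and returns its global start offset. */
    int add(String term) {
        int len = term.length();
        // A term must fit entirely into one block, so reject it up front
        // with a descriptive message instead of overrunning the array.
        if (len > BLOCK_SIZE) {
            throw new IllegalArgumentException(
                "term length " + len + " exceeds the maximum of "
                + BLOCK_SIZE + " characters");
        }
        // Not enough room left in the current block: start a new one.
        if (len > BLOCK_SIZE - upto) {
            nextBlock();
        }
        term.getChars(0, len, current, upto);
        int start = (blocks.size() - 1) * BLOCK_SIZE + upto;
        upto += len;
        return start;
    }

    /** Reads back a term given its start offset and length. */
    String get(int start, int len) {
        return new String(blocks.get(start / BLOCK_SIZE), start % BLOCK_SIZE, len);
    }

    int blockCount() {
        return blocks.size();
    }
}
```

Because a term never spans two blocks, it can be read back with a single offset and length, which is why the block size doubles as the maximum term length in this sketch.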
[jira] Commented: (LUCENE-985) AIOOB thrown when length of termText
is longer than 16384 characters
Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520693 ]
Hoss Man commented on LUCENE-985:
---------------------------------
As a clarification point for people who stumble upon this issue years from now after encountering whatever exception we put in place of the current one...
why is there a max termText length?
[jira] Updated: (LUCENE-985) AIOOB thrown when length of termText
is longer than 16384 characters
Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-985:
--------------------------------------
Summary: AIOOB thrown when length of termText is longer than 16384 characters (was: AAIOB thrown when length of termText is longer than 16384 characters)
[jira] Updated: (LUCENE-985) AIOOB thrown when length of termText
is longer than 16384 characters (ArrayIndexOutOfBoundsException)
Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-985:
--------------------------------------
Attachment: LUCENE-985.patch
> I doubt anyone will have a problem with the limit. And if they hit
> the exception it is probably due to bad end-user input of some
> kind. I always run a token filter that leaves out any token larger
> than 250 characters or so, depending on the application. (It was
> quite accidental that I hit this AIOOBE.)
Agreed!
> That would also be a recommendation I think makes sense in the
> documentation people will look up when hitting the exception.
I've added a blurb in javadoc for IndexWriter.addDocument explaining
this limit.
Thanks for catching this, Karl!
[jira] Updated: (LUCENE-985) AIOOB thrown when length of termText
is longer than 16384 characters (ArrayIndexOutOfBoundsException)
Posted by "Hoss Man (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hoss Man updated LUCENE-985:
----------------------------
Description:
DocumentsWriter has a max term length of 16384; if you cross that you
get an unfriendly ArrayIndexOutOfBoundsException. We should fix to raise a clearer exception.
was:
DocumentsWriter has a max term length of 16384; if you cross that you
get an unfriendly AIOOB. We should fix to raise a clearer exception.
Summary: AIOOB thrown when length of termText is longer than 16384 characters (ArrayIndexOutOfBoundsException) (was: AIOOB thrown when length of termText is longer than 16384 characters)
(making summary longer to improve searchability of the exception for other people who may get bit by it)
[jira] Resolved: (LUCENE-985) AIOOB thrown when length of termText
is longer than 16384 characters (ArrayIndexOutOfBoundsException)
Posted by "Michael McCandless (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless resolved LUCENE-985.
---------------------------------------
Resolution: Fixed
[jira] Commented: (LUCENE-985) AIOOB thrown when length of termText
is longer than 16384 characters (ArrayIndexOutOfBoundsException)
Posted by "Karl Wettin (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520741 ]
Karl Wettin commented on LUCENE-985:
------------------------------------
I doubt anyone will have a problem with the limit. And if they hit the exception it is probably due to bad end-user input of some kind. I always run a token filter that leaves out any token larger than 250 characters or so, depending on the application. (It was quite accidental that I hit this AIOOBE.)
That would also be a recommendation I think makes sense in the documentation people will look up when hitting the exception.
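(Illustration added for readers: a rough Java sketch of the length-based filtering described above, written against a plain list of strings so it runs without Lucene on the classpath. The class name, method name, and the 250-character default are assumptions for the example only; in a real analyzer chain the same check would sit inside a token filter.)

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Lucene class: drops tokens above a length cap.
final class MaxLengthTokenFilterSketch {
    // Example cap taken from the "250 characters or so" mentioned above.
    static final int MAX_TOKEN_LENGTH = 250;

    /** Returns the tokens whose length does not exceed maxLength, in order. */
    static List<String> dropLongTokens(List<String> tokens, int maxLength) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (token.length() <= maxLength) {
                kept.add(token);
            }
        }
        return kept;
    }
}
```

With a cap this far below 16384, oversized tokens from bad input are discarded before they ever reach the indexer, so the term length limit is never hit.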