You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/03/18 14:01:51 UTC

[GitHub] [lucene] ywelsch opened a new pull request #752: LUCENE-10474: Avoid throwing StackOverflowError when creating RegExp

ywelsch opened a new pull request #752:
URL: https://github.com/apache/lucene/pull/752


   Creating a regular expression using the RegExp class can easily result
   in a StackOverflowError being thrown, for example when the input is
   larger than the maximum stack depth. Throwing a StackOverflowError
   isn't something a user would expect, and it isn't documented either.
   StackOverflowError is a user-unfriendly exception as it does not
   convey any intent that the user has done something wrong, but suggests
   a bug in the implementation.
   
   Instead of letting StackOverflowError bubble up, we now throw an
   IllegalArgumentException instead to clearly mark this as an input
   that the implementation can't handle.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] rmuir commented on pull request #752: LUCENE-10474: Avoid throwing StackOverflowError when creating RegExp

Posted by GitBox <gi...@apache.org>.
rmuir commented on pull request #752:
URL: https://github.com/apache/lucene/pull/752#issuecomment-1072449020


   As a library, we should throw the correct exception type, we shouldn't change it for fun. It is not correct to assume that this can only happen as result of union either.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] rmuir commented on pull request #752: LUCENE-10474: Avoid throwing StackOverflowError when creating RegExp

Posted by GitBox <gi...@apache.org>.
rmuir commented on pull request #752:
URL: https://github.com/apache/lucene/pull/752#issuecomment-1072528358


   I'm still -1 for the change. If you hit `StackOverFlowError`, really you should let the VM exit. There are no guarantees at this point.
   
   I don't care what OpenJDK does here, it is irrelevant to our situation. Because they have "special" mechanisms (annotations) available to them that we don't to provide more guarantees: See https://openjdk.java.net/jeps/270


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] msokolov commented on pull request #752: LUCENE-10474: Avoid throwing StackOverflowError when creating RegExp

Posted by GitBox <gi...@apache.org>.
msokolov commented on pull request #752:
URL: https://github.com/apache/lucene/pull/752#issuecomment-1072535834


   I agree with @rmuir - we should not be catching Error. The VM had to unwind the stack and who knows where we are now. If we could somehow detect the problem before it gets to that, then throwing IAE would make sense.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on pull request #752: LUCENE-10474: Avoid throwing StackOverflowError when creating RegExp

Posted by GitBox <gi...@apache.org>.
mikemccand commented on pull request #752:
URL: https://github.com/apache/lucene/pull/752#issuecomment-1072508590


   Hmm maybe we could we preserve the full `StackOverflowException` as the cause in the newly thrown `IllegalArgumentException`?  I don't like losing/suppressing that information from the caller.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] ywelsch commented on pull request #752: LUCENE-10474: Avoid throwing StackOverflowError when creating RegExp

Posted by GitBox <gi...@apache.org>.
ywelsch commented on pull request #752:
URL: https://github.com/apache/lucene/pull/752#issuecomment-1073949409


   > I'm still -1 for the change. If you hit StackOverFlowError, really you should let the VM exit. There are no guarantees at this point.
   
   That kind of argument makes it even more compelling for StackOverFlowErrors to be avoided in the first place by safeguarding Lucene's RegExp implementation or, if deemed technically too complex, putting a big fat banner on the RegExp class that it's unsafe to use for large inputs.
   
   I would be happy to hear everyone's thoughts here on alternative solutions. For example, how would you feel about computing and passing the stack depth through the `parseXYZ` methods, and aborting computation at a user-configurable limit (set to 500 for example in default constructor)?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] ywelsch commented on pull request #752: LUCENE-10474: Avoid throwing StackOverflowError when creating RegExp

Posted by GitBox <gi...@apache.org>.
ywelsch commented on pull request #752:
URL: https://github.com/apache/lucene/pull/752#issuecomment-1072464227


   > As a library, we should throw the correct exception type, we shouldn't change it for fun. It is not correct to assume that this can only happen as result of union either.
   
   I'm not sure what you're saying:
   
   - This is not changing it for fun. There is a proper explanation here you chose to ignore. Further, as outlined in the corresponding Lucene issue (https://issues.apache.org/jira/browse/LUCENE-10474) this patch follows the [approach taken by the JDK](https://github.com/openjdk/jdk/blob/cab4ff64541393a974ea91e35167668ef0036804/src/java.base/share/classes/java/util/regex/Pattern.java#L1441) to provide sensible exception behavior to users.
   - The patch does not assume it is only happening as result of union (the try / catch is in the RegExp constructor, not in the code that parses unions).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] ywelsch commented on pull request #752: LUCENE-10474: Avoid throwing StackOverflowError when creating RegExp

Posted by GitBox <gi...@apache.org>.
ywelsch commented on pull request #752:
URL: https://github.com/apache/lucene/pull/752#issuecomment-1072464227


   > As a library, we should throw the correct exception type, we shouldn't change it for fun. It is not correct to assume that this can only happen as result of union either.
   
   I'm not sure what you're saying:
   
   - This is not changing it for fun. There is a proper explanation here you chose to ignore. Further, as outlined in the corresponding Lucene issue (https://issues.apache.org/jira/browse/LUCENE-10474) this patch follows the [approach taken by the JDK](https://github.com/openjdk/jdk/blob/cab4ff64541393a974ea91e35167668ef0036804/src/java.base/share/classes/java/util/regex/Pattern.java#L1441) to provide sensible exception behavior to users.
   - The patch does not assume it is only happening as result of union (the try / catch is in the RegExp constructor, not in the code that parses unions).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] rmuir commented on pull request #752: LUCENE-10474: Avoid throwing StackOverflowError when creating RegExp

Posted by GitBox <gi...@apache.org>.
rmuir commented on pull request #752:
URL: https://github.com/apache/lucene/pull/752#issuecomment-1072449020


   As a library, we should throw the correct exception type, we shouldn't change it for fun. It is not correct to assume that this can only happen as result of union either.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] msokolov commented on pull request #752: LUCENE-10474: Avoid throwing StackOverflowError when creating RegExp

Posted by GitBox <gi...@apache.org>.
msokolov commented on pull request #752:
URL: https://github.com/apache/lucene/pull/752#issuecomment-1074111548


   If we can find a clean way to detect imminent stack overflow and throw an exc, that would be great. Maybe a member variable on RegExp would be less intrusive than adding parameters. My one concern is I'm unclear on how this class is maintained -- is it generated code? Maybe it was once generated, and is now manually updated? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org