Posted to dev@lucene.apache.org by "Nik V. Babichev (JIRA)" <ji...@apache.org> on 2011/08/19 15:30:27 UTC

[jira] [Created] (SOLR-2719) ReturnFields incorrect parse fields with hyphen

ReturnFields incorrect parse fields with hyphen 
------------------------------------------------

                 Key: SOLR-2719
                 URL: https://issues.apache.org/jira/browse/SOLR-2719
             Project: Solr
          Issue Type: Bug
          Components: search
    Affects Versions: 4.0
            Reporter: Nik V. Babichev


fl=my-hyphen-field in the query params is parsed as "my" instead of "my-hyphen-field".

OAS.search.ReturnFields uses the method getId() from OAS.search.QueryParsing,
which checks each character with "if (!Character.isJavaIdentifierPart(ch) && ch != '.')".
A hyphen is not a JavaIdentifierPart, so this check breaks out when the first "-" is found.

The problem can be solved by adding '-' to the check:
if (!Character.isJavaIdentifierPart(ch) && ch != '.' && ch != '-') break;

But I don't know how this would affect the project as a whole.
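
The truncation described in this report can be reproduced outside Solr. The following is a minimal sketch, not Solr's code: the class HyphenScanDemo and the method scanId are hypothetical names, and only the termination check itself is taken from the report.

```java
// Illustrative stand-in for the identifier scan described in the report.
// HyphenScanDemo and scanId are hypothetical names, not part of Solr.
final class HyphenScanDemo {

    // Scans from the start of s and stops at the first character
    // rejected by the check quoted in the report.
    static String scanId(String s) {
        int pos = 0;
        while (pos < s.length()) {
            char ch = s.charAt(pos);
            if (!Character.isJavaIdentifierPart(ch) && ch != '.') {
                break;
            }
            pos++;
        }
        return s.substring(0, pos);
    }

    public static void main(String[] args) {
        System.out.println(Character.isJavaIdentifierPart('-')); // false
        System.out.println(scanId("my-hyphen-field"));           // my
        System.out.println(scanId("my_field.name"));             // my_field.name
    }
}
```

Underscores and dots survive the scan, so only names containing characters outside that set (such as the hyphen) are cut short.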

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Updated] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Hoss Man (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hoss Man updated SOLR-2719:
---------------------------

         Priority: Blocker  (was: Major)
    Fix Version/s: 4.0
          Summary: REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases  (was: ReturnFields incorrect parse fields with hyphen )

Setting this as a blocker for 4.0, since it is a fairly serious regression for anyone using field names with "-" in them.
                
> REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-2719
>                 URL: https://issues.apache.org/jira/browse/SOLR-2719
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 4.0
>            Reporter: Nik V. Babichev
>            Priority: Blocker
>              Labels: field, fl, query, queryparser
>             Fix For: 4.0
>
>
> fl=my-hyphen-field in query params parsed as "my" instead of "my-hyphen-field".
> OAS.search.ReturnFields use method getId() from OAS.search.QueryParsing
> in which check chars "if (!Character.isJavaIdentifierPart(ch) && ch != '.')"
> Hyphen is not JavaIdentifierPart and this check break when first "-" is found.
> This problem solve by passing '-' to check:
> if (!Character.isJavaIdentifierPart(ch) && ch != '.' && ch != '-') break;
> But I don't know how it can affect on whole project.



[jira] [Commented] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Mark Miller (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222401#comment-13222401 ] 

Mark Miller commented on SOLR-2719:
-----------------------------------

bq. but I'd also propose field name validation at Solr startup,

+1 - rather than playing loosey goosey on what's a valid field name, we should doc and validate for it explicitly.
                


[jira] [Commented] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Luca Cavanna (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222394#comment-13222394 ] 

Luca Cavanna commented on SOLR-2719:
------------------------------------

Yonik, I see your point. On the other hand, the dash is a widely used character within field names. The regression is on the Solr behaviour, and I think it's pretty annoying from a user perspective.

Anyway, if that's the direction of the project, no problem. What matters more than anything else is consistency. We should document it somewhere, as you wrote, but I'd also propose field name validation at Solr startup, using the StrParser rules, so that Solr accepts only allowed field names and can guarantee proper behaviour for all of them.

What do you think?
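
A startup check along these lines might look roughly like the sketch below. It is an illustration under assumptions, not an existing Solr facility: the class FieldNameValidatorSketch and its methods are hypothetical, and the rule used (a Java identifier start character followed by Java identifier parts or '.') is an approximation of the StrParser behaviour discussed in this issue.

```java
// Hypothetical startup validator; not part of Solr.
// Assumed rule: first character is a Java identifier start, the rest are
// Java identifier parts or '.', approximating the getId check in this issue.
final class FieldNameValidatorSketch {

    static boolean isParseableFieldName(String name) {
        if (name == null || name.isEmpty()) {
            return false;
        }
        if (!Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            char ch = name.charAt(i);
            if (!Character.isJavaIdentifierPart(ch) && ch != '.') {
                return false;
            }
        }
        return true;
    }

    // Rejects the first field name that the assumed rule cannot parse.
    static void validateAll(String... fieldNames) {
        for (String name : fieldNames) {
            if (!isParseableFieldName(name)) {
                throw new IllegalArgumentException("Unsupported field name: " + name);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(isParseableFieldName("my_field"));        // true
        System.out.println(isParseableFieldName("my-hyphen-field")); // false
        System.out.println(isParseableFieldName("~idtest"));         // false
    }
}
```

Note that under this rule a name such as id$test passes the character check, because '$' counts as a Java identifier part; whether such a name is usable in fl depends on other parsing rules not modelled here.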
                


[jira] [Commented] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Ryan McKinley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221234#comment-13221234 ] 

Ryan McKinley commented on SOLR-2719:
-------------------------------------

I added the tests with @Ignore in revision: 1296434
                


[jira] [Issue Comment Edited] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Luca Cavanna (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219545#comment-13219545 ] 

Luca Cavanna edited comment on SOLR-2719 at 2/29/12 9:06 PM:
-------------------------------------------------------------

My patch isn't a fix but just a starting point: it adds a ReturnFieldsTest class which tests some of the new fl features. Some tests are of course failing. The biggest problem is the hyphen within the field name, which I guess is widely used. This could be corrected as suggested by Nik, but we have problems with other characters, even if less used within field names.

Solr doesn't validate field names, but now a lot of potential field names can't actually be used within the fl parameter, or even worse they break the query. Some of my test methods are intentionally weird, like the ~idtest or id$test, but those field names are both allowed by Solr. I'm afraid we might have the same problem with sorting since the QueryParsing#parseSort uses the same StrParser#getId method.

The main rule to identify the end of a field name in StrParser#getId seems to be the following:
{code}
if (!Character.isJavaIdentifierPart(ch) && ch != '.')
	break;
{code}
I guess it should be extended, and not just with the hyphen; in my opinion, the point here is not just to correct the hyphen regression. I think we should introduce consistency between fl and decide which characters Solr should accept within a field name. I mean, if Solr accepts everything, we'll always have this fl problem. What are your thoughts, guys?
                
                  


[jira] [Commented] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Luca Cavanna (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222443#comment-13222443 ] 

Luca Cavanna commented on SOLR-2719:
------------------------------------

How about trying to achieve both? I mean, are there many other places where we should do the same (adding the dash support)? QueryParsing#parseSort has the same problem. Anything else? I'm probably missing something.

Depending on where we need to add support for the dash for consistency, I would try to add support for the trailing dash here for backward compatibility (I have a patch almost ready), and work on validation as well.

                


[jira] [Commented] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222379#comment-13222379 ] 

Yonik Seeley commented on SOLR-2719:
------------------------------------

I've been saying for a while that using roughly Java identifiers for field names was best practice, but we should document it somewhere.

I don't think we should change StrParser.getId to be more permissive though - that will just cause more problems in the future (say, when we want to start adding infix and want a-b to be a minus b). There's not a regression in that specific code, since the function parser has never accepted "-" as part of a field name.
                


[jira] [Commented] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222408#comment-13222408 ] 

Yonik Seeley commented on SOLR-2719:
------------------------------------

bq. Yonik, I see your point. On the other hand, the dash is a widely used character within field names. The regression is on the Solr behaviour, and I think it's pretty annoying from a user perspective.

The easiest way to handle this would be this code in ReturnFields:
{code}

        // short circuit test for a really simple field name
        String key = null;
        String field = sp.getId(null);
        char ch = sp.ch();
{code}

Instead of using getId, we should hand-roll something that also accepts "-" as part of the field name.  That would leave function parser (and other users of getId) alone, but allow fieldnames with dashes in the fl param.
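
The separation proposed here (a strict getId shared by all callers, plus a more permissive reader used only for fl) can be sketched as follows. This is a toy model under assumptions, not the attached patch: MiniFieldParser, getFieldNameForFl, and rest are hypothetical names, and the real StrParser has more state and error handling.

```java
// Hypothetical miniature of a position-tracking parser; names are
// illustrative and do not correspond to Solr's StrParser API.
final class MiniFieldParser {
    private final String input;
    private int pos = 0;

    MiniFieldParser(String input) {
        this.input = input;
    }

    // Strict identifier: Java identifier characters plus '.',
    // matching the check quoted in this issue.
    String getId() {
        return read(false);
    }

    // Permissive variant meant only for the fl call site: also accepts '-'.
    String getFieldNameForFl() {
        return read(true);
    }

    // Returns the part of the input that has not been consumed yet.
    String rest() {
        return input.substring(pos);
    }

    private String read(boolean allowHyphen) {
        int start = pos;
        while (pos < input.length()) {
            char ch = input.charAt(pos);
            boolean ok = Character.isJavaIdentifierPart(ch)
                    || ch == '.'
                    || (allowHyphen && ch == '-');
            if (!ok) {
                break;
            }
            pos++;
        }
        return input.substring(start, pos);
    }

    public static void main(String[] args) {
        MiniFieldParser fn = new MiniFieldParser("a-b");
        System.out.println(fn.getId() + " | " + fn.rest());             // a | -b
        MiniFieldParser fl = new MiniFieldParser("my-field-name,score");
        System.out.println(fl.getFieldNameForFl() + " | " + fl.rest()); // my-field-name | ,score
    }
}
```

Because the strict method is untouched, a caller that later wants to treat "a-b" as "a minus b" still sees "a" followed by "-b", while the fl path reads the whole hyphenated name.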

bq. What matters more than anything else is consistency.

If we really want to go for consistency, then we should not accept "-" anywhere (rather than attempting to expand it to everywhere).
                


[jira] [Commented] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13222449#comment-13222449 ] 

Yonik Seeley commented on SOLR-2719:
------------------------------------

bq. How about trying to achieve both? I mean, are there many other places where we should do the same (adding the dash support)?

I think this "regression" is limited to "fl" since that code was changed to support pseudo-fields.

bq. QueryParsing#parseSort has the same problem.

I just tried trunk with "sort=a-b_s desc" and it seemed to work fine.
                


[jira] [Updated] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Yonik Seeley (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-2719:
-------------------------------

    Attachment: SOLR-2719.patch

Here's a simpler patch that tries to change less (only that first getId call).

I didn't go with the varargs version because it creates objects on every call.
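
For context on the allocation concern: in Java, a call to a varargs method normally makes the compiler build a fresh array at the call site, whereas a fixed-arity method does not (the JIT may sometimes optimize the array away, but that is not guaranteed). The sketch below only illustrates that difference; the class and method names are hypothetical and are not taken from any of the attached patches.

```java
// Hypothetical helpers contrasting a varargs signature with a fixed-arity one.
final class VarargsCostSketch {

    // Varargs form: a call such as isExtraVarargs(ch, '-', '_') is compiled
    // as isExtraVarargs(ch, new char[] {'-', '_'}), so each call builds an array.
    static boolean isExtraVarargs(char ch, char... extras) {
        for (char e : extras) {
            if (e == ch) {
                return true;
            }
        }
        return false;
    }

    // Fixed-arity form: the extra character is passed directly, with no array.
    static boolean isExtra(char ch, char extra) {
        return ch == extra;
    }

    public static void main(String[] args) {
        System.out.println(isExtraVarargs('-', '-', '_')); // true
        System.out.println(isExtra('-', '-'));             // true
        System.out.println(isExtraVarargs('x'));           // false
    }
}
```

In a parser that checks every character of every fl value, the fixed-arity form avoids relying on the JIT to remove those temporary arrays.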
                


[jira] [Assigned] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Yonik Seeley (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley reassigned SOLR-2719:
----------------------------------

    Assignee: Yonik Seeley
    


[jira] [Updated] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Luca Cavanna (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Cavanna updated SOLR-2719:
-------------------------------

    Attachment: SOLR-2719.patch

My patch isn't a fix but just a starting point: it adds a ReturnFieldsTest class which tests some of the new fl features. Some tests are of course failing. The biggest problem is the hyphen within the field name, which I guess is widely used. This could be corrected as suggested by Nik, but we have problems with other characters, even if less used within field names.

Solr doesn't validate field names, but now a lot of potential field names can't actually be used within the fl parameter, or even worse they break the query. Some of my test methods are intentionally weird, like the ~idtest or id$test, but those field names are both allowed by Solr. I'm afraid we might have the same problem with sorting since the QueryParsing#parseSort uses the same StrParser#getId method.

The main rule to identify the end of a field name seems to be the following:
{code}
if (!Character.isJavaIdentifierPart(ch) && ch != '.')
	break;
{code}

In my opinion, the point here is not just to correct the hyphen regression. I think we should introduce consistency between fl and decide which characters Solr accepts within a field name.

What are your thoughts, guys?
                
> REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-2719
>                 URL: https://issues.apache.org/jira/browse/SOLR-2719
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 4.0
>            Reporter: Nik V. Babichev
>            Priority: Blocker
>              Labels: field, fl, query, queryparser
>             Fix For: 4.0
>
>         Attachments: SOLR-2719.patch
>
>
> fl=my-hyphen-field in query params parsed as "my" instead of "my-hyphen-field".
> OAS.search.ReturnFields use method getId() from OAS.search.QueryParsing
> in which check chars "if (!Character.isJavaIdentifierPart(ch) && ch != '.')"
> Hyphen is not JavaIdentifierPart and this check break when first "-" is found.
> This problem solve by passing '-' to check:
> if (!Character.isJavaIdentifierPart(ch) && ch != '.' && ch != '-') break;
> But I don't know how it can affect on whole project.



[jira] [Commented] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Luca Cavanna (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13227608#comment-13227608 ] 

Luca Cavanna commented on SOLR-2719:
------------------------------------

Since this issue is a blocker for 4.0, it would be great to close it soon. Let me know if there's something more I can do to help!
                


[jira] [Updated] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Luca Cavanna (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luca Cavanna updated SOLR-2719:
-------------------------------

    Attachment: SOLR-2719.patch

Right, Yonik, I was misled by the
{code}
String field = sp.getId(null);
{code}
at the beginning of QueryParsing#parseSort, while the method to look at was getSimpleName. Sorting is ok (but I don't completely understand the sp.getId at the beginning).

I attached a new patch: I added a getFieldName method to StrParser, and getId now accepts a vararg parameter of allowed trailing chars.
I also made some changes to ReturnFields to make the code a little more readable to me; I hope it's the same for you guys. Tests pass.
I removed the weird tests about the trailing tilde and so on, and kept just the trailing dash test, which now works (removed @Ignore).
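
The vararg idea can be sketched roughly as follows. This is only an illustration, not the code from the attached patch: the class AllowedCharsDemo and the methods scanId and scanFieldName are hypothetical names, and it assumes the extra characters are simply accepted in addition to Java identifier parts and '.'.

```java
final class AllowedCharsDemo {

    // Hypothetical helper: scans an identifier, additionally accepting any
    // character passed in allowed. Not the signature used in the patch.
    static String scanId(String s, char... allowed) {
        int end = 0;
        while (end < s.length() && isAccepted(s.charAt(end), allowed)) {
            end++;
        }
        return s.substring(0, end);
    }

    // Base rule (identifier part or '.') plus the caller-supplied characters.
    private static boolean isAccepted(char ch, char... allowed) {
        if (Character.isJavaIdentifierPart(ch) || ch == '.') {
            return true;
        }
        for (char a : allowed) {
            if (a == ch) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical field-name variant: the same rule with '-' allowed.
    static String scanFieldName(String s) {
        return scanId(s, '-');
    }

    public static void main(String[] args) {
        System.out.println(scanId("my-field-name,score"));        // my
        System.out.println(scanFieldName("my-field-name,score")); // my-field-name
    }
}
```

With a shape like this, callers that need the strict rule keep calling scanId without extra arguments, while fl parsing can opt in to the hyphen.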

Let me know what you think.

I'm going to open a new issue to add field name validation.
                


[jira] [Commented] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Nik V. Babichev (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13128223#comment-13128223 ] 

Nik V. Babichev commented on SOLR-2719:
---------------------------------------

Is it that hard to fix?
How can I help fix it?
                


[jira] [Resolved] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Yonik Seeley (Resolved) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley resolved SOLR-2719.
--------------------------------

    Resolution: Fixed

Committed. I also added a note about recommended field naming to the schema.
                
> REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-2719
>                 URL: https://issues.apache.org/jira/browse/SOLR-2719
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 4.0
>            Reporter: Nik V. Babichev
>            Assignee: Yonik Seeley
>            Priority: Blocker
>              Labels: field, fl, query, queryparser
>             Fix For: 4.0
>
>         Attachments: SOLR-2719.patch, SOLR-2719.patch, SOLR-2719.patch
>
>
> fl=my-hyphen-field in query params parsed as "my" instead of "my-hyphen-field".
> OAS.search.ReturnFields use method getId() from OAS.search.QueryParsing
> in which check chars "if (!Character.isJavaIdentifierPart(ch) && ch != '.')"
> Hyphen is not JavaIdentifierPart and this check break when first "-" is found.
> This problem solve by passing '-' to check:
> if (!Character.isJavaIdentifierPart(ch) && ch != '.' && ch != '-') break;
> But I don't know how it can affect on whole project.



[jira] [Commented] (SOLR-2719) REGRESSION ReturnFields incorrect parse fields with hyphen - breaks existing "fl=my-field-name" type usecases

Posted by "Luca Cavanna (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228302#comment-13228302 ] 

Luca Cavanna commented on SOLR-2719:
------------------------------------

Thanks Yonik. I'm of course ok with your patch! 
I agree with your varargs comment; my code was also more generic than it should be. But maybe some other changes could be useful to make the code a little more readable here. Furthermore, if we can avoid copying and pasting the same code, it's better, I guess.
Anyway, the most important thing is closing this issue, up to you.

                