Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2012/05/08 11:25:51 UTC
[jira] [Created] (STANBOL-607) SolrYard should use quotes instead
of AND for multi word TextConstraints
Rupert Westenthaler created STANBOL-607:
-------------------------------------------
Summary: SolrYard should use quotes instead of AND for multi word TextConstraints
Key: STANBOL-607
URL: https://issues.apache.org/jira/browse/STANBOL-607
Project: Stanbol
Issue Type: Improvement
Reporter: Rupert Westenthaler
Assignee: Rupert Westenthaler
Priority: Minor
Fix For: 0.10.0-incubating
Currently, a text constraint for rdfs:label containing "The Book of Three" is encoded as follows:
(_\!@/rdfs\:label/:The) AND ((_\!@/rdfs\:label/:Book) AND (_\!@/rdfs\:label/:of) AND(_\!@/rdfs\:label/:Three))
However, Solr/Lucene allows quotes to be used for multi-word searches, so the correct way to encode this query would be:
((_\!@/rdfs\:label/:"The Book of Three"))
This needs to be fixed in org.apache.stanbol.entityhub.yard.solr.query.QueryUtils#encodeQueryValue(..).
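To make the difference between the two encodings concrete, here is a minimal sketch. It is not the actual QueryUtils code: the class and method names are hypothetical, the field prefix is simply copied from the examples above, the AND form is emitted flat rather than nested, and escaping of special characters inside the value is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the real SolrYard implementation. */
final class PhraseEncodingSketch {

    /** Current behaviour (simplified): one clause per word, joined with AND. */
    static String encodeAsAnd(String field, String value) {
        List<String> clauses = new ArrayList<>();
        for (String word : value.trim().split("\\s+")) {
            clauses.add("(" + field + word + ")");
        }
        return String.join(" AND ", clauses);
    }

    /** Proposed behaviour: a multi-word value becomes one quoted phrase. */
    static String encodeAsPhrase(String field, String value) {
        String trimmed = value.trim();
        boolean multiWord = trimmed.split("\\s+").length > 1;
        // Single words are left unquoted; multi-word values are quoted.
        return "(" + field + (multiWord ? "\"" + trimmed + "\"" : trimmed) + ")";
    }

    public static void main(String[] args) {
        // Field prefix as it appears in the examples above (assumed, not computed).
        String field = "_\\!@/rdfs\\:label/:";
        System.out.println(encodeAsAnd(field, "The Book of Three"));
        System.out.println(encodeAsPhrase(field, "The Book of Three"));
    }
}
```

The AND form matches documents that contain all four words anywhere in the label, while the quoted form only matches the words as a phrase in that order.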
NOTE: The impact of this change on wildcard queries needs further investigation.
E.g. take the query "Frankf* am Main", which is currently encoded as
(_\!@/rdfs\:label/:Frankf*) AND ((_\!@/rdfs\:label/:am) AND (_\!@/rdfs\:label/:main))
and would then result in
((_\!@/rdfs\:label/:"Frankf* am Main"))
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (STANBOL-607) SolrYard should use
quotes instead of AND for multi word TextConstraints
Posted by "Rupert Westenthaler (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/STANBOL-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271285#comment-13271285 ]
Rupert Westenthaler edited comment on STANBOL-607 at 5/10/12 5:42 AM:
----------------------------------------------------------------------
Regarding "Frankf* am Main":
Using wildcards within quoted phrases is not supported. So the best solution is to keep the tokenized form for TextConstraints that use wildcards (meaning separate clauses for tokens that contain '?' or '*').
That means that the above query would be best encoded as follows:
(_\!@/rdfs\:label/:frankf*) AND (_\!@/rdfs\:label/:"am Main")
NOTE: Until we switch to Solr 3.6+, tokens that contain wildcards need to be converted to lower case (see SOLR-2438). Because of that, "Frankf*" is changed to "frankf*".
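The rule described in this comment can be sketched as follows. This is an illustration under stated assumptions, not the committed fix: the class and method names are hypothetical, the field prefix is copied from the examples, value escaping is omitted, and the lower-casing of wildcard tokens is only the pre-Solr-3.6 workaround mentioned in the note.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical illustration only; not the real SolrYard implementation. */
final class WildcardAwareEncodingSketch {

    /** A token counts as a wildcard token if it contains '*' or '?'. */
    static boolean hasWildcard(String token) {
        return token.indexOf('*') >= 0 || token.indexOf('?') >= 0;
    }

    /**
     * Wildcard tokens get their own lower-cased clause; runs of consecutive
     * non-wildcard tokens are grouped into one (quoted) phrase clause.
     */
    static String encode(String field, String value) {
        List<String> clauses = new ArrayList<>();
        List<String> run = new ArrayList<>(); // pending non-wildcard words
        for (String token : value.trim().split("\\s+")) {
            if (hasWildcard(token)) {
                flush(field, run, clauses);
                clauses.add("(" + field + token.toLowerCase(Locale.ROOT) + ")");
            } else {
                run.add(token);
            }
        }
        flush(field, run, clauses);
        return String.join(" AND ", clauses);
    }

    /** Emits the pending words as one clause, quoted only if more than one word. */
    private static void flush(String field, List<String> run, List<String> clauses) {
        if (run.isEmpty()) {
            return;
        }
        String text = String.join(" ", run);
        clauses.add("(" + field + (run.size() > 1 ? "\"" + text + "\"" : text) + ")");
        run.clear();
    }

    public static void main(String[] args) {
        String field = "_\\!@/rdfs\\:label/:"; // assumed prefix from the examples
        System.out.println(encode(field, "Frankf* am Main"));
    }
}
```

For "Frankf* am Main" this sketch yields the encoding proposed above: a lower-cased wildcard clause ANDed with the quoted phrase "am Main".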
was (Author: rwesten):
Regarding "Frankf* am Main":
Using wildcards within quoted phrases is not supported. So the best solution is to keep the tokenized form for TextConstraints that use wildcards (meaning separate clauses for tokens that contain '?' or '*').
That means that the above query would be best encoded as follows:
(_\!@/rdfs\:label/:Frankf*) AND (_\!@/rdfs\:label/:"am Main")
[jira] [Updated] (STANBOL-607) SolrYard should use quotes instead
of AND for multi word TextConstraints
Posted by "Fabian Christ (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/STANBOL-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fabian Christ updated STANBOL-607:
----------------------------------
Affects Version/s: commons.web.base-0.10.0-incubating
Fix Version/s: commons.web.base-0.10.0-incubating
> SolrYard should use quotes instead of AND for multi word TextConstraints
> ------------------------------------------------------------------------
>
> Key: STANBOL-607
> URL: https://issues.apache.org/jira/browse/STANBOL-607
> Project: Stanbol
> Issue Type: Improvement
> Components: Entity Hub
> Affects Versions: 0.9.0-incubating, commons.web.base-0.10.0-incubating
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
> Priority: Minor
> Fix For: entityhub-0.10.0-incubating, commons.web.base-0.10.0-incubating
[jira] [Resolved] (STANBOL-607) SolrYard should use quotes instead
of AND for multi word TextConstraints
Posted by "Rupert Westenthaler (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/STANBOL-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rupert Westenthaler resolved STANBOL-607.
-----------------------------------------
Resolution: Fixed
fixed with #1337141
[jira] [Commented] (STANBOL-607) SolrYard should use quotes instead
of AND for multi word TextConstraints
Posted by "Rupert Westenthaler (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/STANBOL-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13271285#comment-13271285 ]
Rupert Westenthaler commented on STANBOL-607:
---------------------------------------------
Regarding "Frankf* am Main":
Using wildcards within quoted phrases is not supported. So the best solution is to keep the tokenized form for TextConstraints that use wildcards (meaning separate clauses for tokens that contain '?' or '*').
That means that the above query would be best encoded as follows:
(_\!@/rdfs\:label/:Frankf*) AND (_\!@/rdfs\:label/:"am Main")
[jira] [Updated] (STANBOL-607) SolrYard should use quotes instead
of AND for multi word TextConstraints
Posted by "Fabian Christ (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/STANBOL-607?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fabian Christ updated STANBOL-607:
----------------------------------
Component/s: Entity Hub
> SolrYard should use quotes instead of AND for multi word TextConstraints
> ------------------------------------------------------------------------
>
> Key: STANBOL-607
> URL: https://issues.apache.org/jira/browse/STANBOL-607
> Project: Stanbol
> Issue Type: Improvement
> Components: Entity Hub
> Affects Versions: 0.9.0-incubating
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
> Priority: Minor