Posted to dev@lucene.apache.org by "David Mark Nemeskey (Created) (JIRA)" <ji...@apache.org> on 2011/11/07 11:51:52 UTC
[jira] [Created] (LUCENE-3566) Parametrizing H1 and H2
Parametrizing H1 and H2
-----------------------
Key: LUCENE-3566
URL: https://issues.apache.org/jira/browse/LUCENE-3566
Project: Lucene - Java
Issue Type: Improvement
Components: core/search
Affects Versions: flexscoring branch
Reporter: David Mark Nemeskey
Assignee: David Mark Nemeskey
Priority: Minor
Fix For: flexscoring branch
The DFR normalizations {{H1}} and {{H2}} are parameter-free. This is in line with the [original article|http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.101.742], but not with the [thesis|http://theses.gla.ac.uk/1570/], where H2 accepts a {{c}} parameter, nor with [information-based models|http://dl.acm.org/citation.cfm?id=1835490], where H1 also accepts a {{c}} parameter.
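As a rough illustration of what parametrizing the two normalizations could look like, here is a minimal, standalone Java sketch. It is not the attached patch and does not use Lucene's classes. The class name {{DfrNormalizationSketch}}, the method signatures, and the exact placement of {{c}} (a multiplier on the length ratio) are assumptions made for illustration, based on the usual textbook forms tfn = tf * c * avgLen / docLen for H1 and tfn = tf * log2(1 + c * avgLen / docLen) for H2. With {{c = 1}}, H1 reduces to the parameter-free form.

```java
// Hypothetical sketch: names and signatures are illustrative, not Lucene's API.
final class DfrNormalizationSketch {
    private DfrNormalizationSketch() {}

    /** Base-2 logarithm. */
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /**
     * H1-style normalization with an assumed multiplier c:
     * tfn = tf * c * (avgLen / docLen). With c = 1 this is parameter-free H1.
     */
    static double h1(double tf, double docLen, double avgLen, double c) {
        return tf * c * (avgLen / docLen);
    }

    /**
     * H2-style normalization with an assumed multiplier c:
     * tfn = tf * log2(1 + c * avgLen / docLen).
     */
    static double h2(double tf, double docLen, double avgLen, double c) {
        return tf * log2(1.0 + c * (avgLen / docLen));
    }

    public static void main(String[] args) {
        double tf = 3.0, docLen = 50.0, avgLen = 100.0;
        // Default c = 1 reproduces the parameter-free behaviour.
        System.out.println("H1, c=1: " + h1(tf, docLen, avgLen, 1.0));
        System.out.println("H2, c=1: " + h2(tf, docLen, avgLen, 1.0));
        // A different c rescales the effect of document length.
        System.out.println("H1, c=2: " + h1(tf, docLen, avgLen, 2.0));
        System.out.println("H2, c=2: " + h2(tf, docLen, avgLen, 2.0));
    }
}
```

Keeping {{c}} as an explicit argument with a default of 1 at the call site means existing behaviour is unchanged unless a caller chooses to tune it.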
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
[jira] [Updated] (LUCENE-3566) Parametrizing H1 and H2
Posted by "David Mark Nemeskey (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Mark Nemeskey updated LUCENE-3566:
----------------------------------------
Attachment: LUCENE-3566.patch
Patch re-based on trunk.
> Parametrizing H1 and H2
> -----------------------
>
> Key: LUCENE-3566
> URL: https://issues.apache.org/jira/browse/LUCENE-3566
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 4.0
> Reporter: David Mark Nemeskey
> Assignee: David Mark Nemeskey
> Priority: Minor
> Labels: score
> Fix For: 4.0
>
> Attachments: LUCENE-3566.patch, LUCENE-3566.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
[jira] [Updated] (LUCENE-3566) Parametrizing H1 and H2
Posted by "Robert Muir (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-3566:
--------------------------------
Attachment: LUCENE-3566.patch
I thought we had done this already, but realized I forgot about it!
I added the Solr factory/parsing code to the patch. Will commit shortly.
[jira] [Commented] (LUCENE-3566) Parametrizing H1 and H2
Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145360#comment-13145360 ]
Robert Muir commented on LUCENE-3566:
-------------------------------------
+1, let's add these.
I didn't think H1 took params (the thesis says 'Therefore, the constant of C is 1 assuming H1', then defines it without C). Did the IB paper make a mistake?
Either way, it won't hurt anything to add the parameter; it's just confusing :)
[jira] [Updated] (LUCENE-3566) Parametrizing H1 and H2
Posted by "Robert Muir (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated LUCENE-3566:
--------------------------------
Affects Version/s: 4.0 (was: flexscoring branch)
Fix Version/s: 4.0 (was: flexscoring branch)
Editing the fix version to 4.0: since the flexscoring branch was merged, I think we can safely do any scoring improvements in mainline trunk.
[jira] [Commented] (LUCENE-3566) Parametrizing H1 and H2
Posted by "David Mark Nemeskey (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145502#comment-13145502 ]
David Mark Nemeskey commented on LUCENE-3566:
---------------------------------------------
bq. i didnt think H1 took params (the thesis says 'Therefore, the constant of C is 1 assuming H1', then defines it without C). did the IB paper make a mistake?
Good question. Perhaps it was a mistake; however, according to my colleague, who had experimented with the IB method in our own engine and proposed adding the parameter to Lucene, a well-chosen {{c}} can improve the results. Well, duh really; nevertheless, as long as we have defaults, it shouldn't be a problem. :)
[jira] [Resolved] (LUCENE-3566) Parametrizing H1 and H2
Posted by "Robert Muir (Resolved) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir resolved LUCENE-3566.
---------------------------------
Resolution: Fixed
Thanks David!
[jira] [Commented] (LUCENE-3566) Parametrizing H1 and H2
Posted by "Robert Muir (Commented) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13145509#comment-13145509 ]
Robert Muir commented on LUCENE-3566:
-------------------------------------
Yeah, I agree... maybe in the patch we can expose the parameter to the factory in Solr (DFRSimilarityFactory has a param-parsing method for Normalization that is reused by IB, too)?
[jira] [Updated] (LUCENE-3566) Parametrizing H1 and H2
Posted by "David Mark Nemeskey (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Mark Nemeskey updated LUCENE-3566:
----------------------------------------
Lucene Fields: New,Patch Available (was: New)
[jira] [Updated] (LUCENE-3566) Parametrizing H1 and H2
Posted by "David Mark Nemeskey (Updated) (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/LUCENE-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Mark Nemeskey updated LUCENE-3566:
----------------------------------------
Attachment: LUCENE-3566.patch
Patch.