You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-dev@lucene.apache.org by "Aleksander M. Stensby" <al...@integrasco.no> on 2008/12/29 15:50:11 UTC

Some "best practices" and questions

I'm not sure if this is the place to ask, so please feel free to correct  
me if I'm completely wrong.

I've been using Lucene for about three years or so now, and Solr for about  
6 months or so. I really love what you guys are doing for the community  
and I would like to help in whatever way I can.
I asked on the solr list about maybe it would be nice to have SolrJ  
support for handling the term vector component responses in a similar  
fashion to how facets are treated today, and thought that I could help by  
adding this functionality. Since I'm new to contributing (yes I have read  
the wiki etc, so I know the basics), but I just wanted to ask a few things:

- Regarding naming conventions, do you always use the _privateVariable vs.  
parameterVariable naming?
- When testing, I figured that I would add my tests to the  
SolrExampleTests.java class, so that I can run them against embedded solr,  
jetty etc etc. Is that the correct approach?
- Specifically for my contribution, dealing with the term vector response  
handler, is it okay to use something like:
  query.set(CommonParams.QT, "tvrh");
  in the test? or is this bad? i just figured that you dont want such  
parameters forced into the SolrQuery class... or? What do you all suggest?  
for highlighting and facets its straigtforward since you don't need to set  
any QT, and you can just say solrQuery.setHighlight(true); etc. But i  
guess it might be bad to force the handler into the SolrQuery class, if  
someone decides to not register the handler in their solrconfig.xml... Any  
thoughts here?

And how "done" should my patch be before i add it to jira?

Thanks, and sorry for all the "newbie" questions...

Best regards,
  Aleksander


-- 
Aleksander M. Stensby
Senior software developer
Integrasco A/S
www.integrasco.no

Please consider the environment before printing all or any of this e-mail

Re: Some "best practices" and questions

Posted by "Aleksander M. Stensby" <al...@integrasco.no>.
Thanks Grant!

On Mon, 29 Dec 2008 22:05:04 +0100, Grant Ingersoll <gs...@apache.org>  
wrote:

>
> On Dec 29, 2008, at 9:50 AM, Aleksander M. Stensby wrote:
>
>> I'm not sure if this is the place to ask, so please feel free to  
>> correct me if I'm completely wrong.
>>
>> I've been using Lucene for about three years or so now, and Solr for  
>> about 6 months or so. I really love what you guys are doing for the  
>> community and I would like to help in whatever way I can.
>> I asked on the solr list about maybe it would be nice to have SolrJ  
>> support for handling the term vector component responses in a similar  
>> fashion to how facets are treated today, and thought that I could help  
>> by adding this functionality. Since I'm new to contributing (yes I have  
>> read the wiki etc, so I know the basics), but I just wanted to ask a  
>> few things:
>>
>> - Regarding naming conventions, do you always use the _privateVariable  
>> vs. parameterVariable naming?
>
> I would say we generally prefer the Java conventions, which I believe  
> means the latter: privateVariable, parameterVariable.  I personally  
> don't like the _ prefix, but we aren't that much of sticklers over it.
>

Okay, I'm not used to the _ prefix either, but asked since the existing  
code in one of the classes I was changing used this convention.

>>
>> - When testing, I figured that I would add my tests to the  
>> SolrExampleTests.java class, so that I can run them against embedded  
>> solr, jetty etc etc. Is that the correct approach?
>
> Not sure why a unit test would need Jetty.  All you need to check, IMO,  
> is that the new methods create valid objects given valid inputs, and  
> also properly handle when no term vector info is present.  Otherwise,  
> you are testing all the serialization, etc.  Personally, I really don't  
> like the fact that our core tests have a dependency on the example  
> configuration.  I know the tests are useful, just don't think they  
> belong in the core.
>
> SolrQueryTest may be more appropriate, but feel free to add your own  
> standalone test.
>

I did not write any jetty based test, but was referring to the existing  
tests for SolrJ. I just looked at the existing tests, thats why I asked.  
I'll see if I can add a test to the SolrQueryTest then, though that class  
seem to only check that the correct parameters get set, and does not check  
any mocked response...

>>
>> - Specifically for my contribution, dealing with the term vector  
>> response handler,
>
> The TVRH is just an example of the use of the TermVectorComponent.  It  
> is intended that one would configure the TermVectorComponent on their  
> own ReqHandler, so as not to have to make an extra call to Solr.
>

I understand (but again based my question on the jetty / embedded driven  
unit tests which use the example configuration).

>> is it okay to use something like:
>> query.set(CommonParams.QT, "tvrh");
>> in the test? or is this bad? i just figured that you dont want such  
>> parameters forced into the SolrQuery class... or? What do you all  
>> suggest? for highlighting and facets its straigtforward since you don't  
>> need to set any QT, and you can just say solrQuery.setHighlight(true);  
>> etc. But i guess it might be bad to force the handler into the  
>> SolrQuery class, if someone decides to not register the handler in  
>> their solrconfig.xml... Any thoughts here?
>
> All solrQuery.setHightlight does is:
> this.set(HighlightParams.HIGHLIGHT, true);
> which assumes that highlighting is configured for that RequestHandler  
> (which it is by default)
>
> In your case, you could have:
> SolrQuery.setTermVectors(boolean)
>
> and it should add a param for TermVectorComponent.tv = true.  There is  
> no need to have the request handler involved.
>
>
Yes, setHighlight (and setFacets) were just examples of how I imagine we  
would like the termvectors to be handled aswell. I'm using  
TermVectorComponent.tv = true :) So all is good then!

>>
>>
>> And how "done" should my patch be before i add it to jira?
>
> I guess that depends on whether you want it committed or not  :-)  I'd  
> recommend an iterative approach.  You don't need to have it done for the  
> first time you put it up, but you should make sure you care and feed it  
> based on comments that you get back and post a new patch.
>
> At a minimum, I would say it should compile and all tests should still  
> pass, but that is not a hard and fast rule as it is OK to put up patches  
> where you are explicitly seeking feedback on the APIs.
>
> Since the TVC is something I wrote, I'll likely take a look at whatever  
> you put up and provide feedback at some point, but I'm pretty swamped at  
> the moment, so please don't take it wrong.  Of course, others are free  
> to help as they see fit.
>
> HTH,
> Grant
>

Thanks again, I will try to clean up my code and add a patch to jira and  
see if holds the expected quality for people to even consider it ;)

Cheers,
and a happy new year to you all!

  - Aleks


-- 
Aleksander M. Stensby
Senior software developer
Integrasco A/S
www.integrasco.no

Please consider the environment before printing all or any of this e-mail

Re: Some "best practices" and questions

Posted by Grant Ingersoll <gs...@apache.org>.
On Dec 29, 2008, at 9:50 AM, Aleksander M. Stensby wrote:

> I'm not sure if this is the place to ask, so please feel free to  
> correct me if I'm completely wrong.
>
> I've been using Lucene for about three years or so now, and Solr for  
> about 6 months or so. I really love what you guys are doing for the  
> community and I would like to help in whatever way I can.
> I asked on the solr list about maybe it would be nice to have SolrJ  
> support for handling the term vector component responses in a  
> similar fashion to how facets are treated today, and thought that I  
> could help by adding this functionality. Since I'm new to  
> contributing (yes I have read the wiki etc, so I know the basics),  
> but I just wanted to ask a few things:
>
> - Regarding naming conventions, do you always use the  
> _privateVariable vs. parameterVariable naming?

I would say we generally prefer the Java conventions, which I believe  
means the latter: privateVariable, parameterVariable.  I personally  
don't like the _ prefix, but we aren't that much of sticklers over it.

>
> - When testing, I figured that I would add my tests to the  
> SolrExampleTests.java class, so that I can run them against embedded  
> solr, jetty etc etc. Is that the correct approach?

Not sure why a unit test would need Jetty.  All you need to check,  
IMO, is that the new methods create valid objects given valid inputs,  
and also properly handle when no term vector info is present.   
Otherwise, you are testing all the serialization, etc.  Personally, I  
really don't like the fact that our core tests have a dependency on  
the example configuration.  I know the tests are useful, just don't  
think they belong in the core.

SolrQueryTest may be more appropriate, but feel free to add your own  
standalone test.

>
> - Specifically for my contribution, dealing with the term vector  
> response handler,

The TVRH is just an example of the use of the TermVectorComponent.  It  
is intended that one would configure the TermVectorComponent on their  
own ReqHandler, so as not to have to make an extra call to Solr.

> is it okay to use something like:
> query.set(CommonParams.QT, "tvrh");
> in the test? or is this bad? i just figured that you dont want such  
> parameters forced into the SolrQuery class... or? What do you all  
> suggest? for highlighting and facets its straigtforward since you  
> don't need to set any QT, and you can just say  
> solrQuery.setHighlight(true); etc. But i guess it might be bad to  
> force the handler into the SolrQuery class, if someone decides to  
> not register the handler in their solrconfig.xml... Any thoughts here?

All solrQuery.setHightlight does is:
this.set(HighlightParams.HIGHLIGHT, true);
which assumes that highlighting is configured for that RequestHandler  
(which it is by default)

In your case, you could have:
SolrQuery.setTermVectors(boolean)

and it should add a param for TermVectorComponent.tv = true.  There is  
no need to have the request handler involved.


>
>
> And how "done" should my patch be before i add it to jira?

I guess that depends on whether you want it committed or not  :-)  I'd  
recommend an iterative approach.  You don't need to have it done for  
the first time you put it up, but you should make sure you care and  
feed it based on comments that you get back and post a new patch.

At a minimum, I would say it should compile and all tests should still  
pass, but that is not a hard and fast rule as it is OK to put up  
patches where you are explicitly seeking feedback on the APIs.

Since the TVC is something I wrote, I'll likely take a look at  
whatever you put up and provide feedback at some point, but I'm pretty  
swamped at the moment, so please don't take it wrong.  Of course,  
others are free to help as they see fit.

HTH,
Grant