Posted to solr-user@lucene.apache.org by getagrip <ge...@web.de> on 2010/11/01 00:06:43 UTC

Design and Usage Questions

Hi,

I've got some basic usage / design questions.

1. The SolrJ wiki recommends using the same CommonsHttpSolrServer
    instance for all requests to avoid connection leaks.
    So if I create a singleton instance at application startup, can I
    safely use this instance for ALL queries/updates throughout my
    application without running into performance issues?

2. My system's documents are stored in a Subversion repository.
    For fast search results I want to periodically index new documents
    from the repository.

    What I get from the repository is a ByteArrayOutputStream. How can I
    pass this stream to Solr?

    I only see ways to pass files, but in my case writing the
    ByteArrayOutputStream to disk again makes no sense and would
    hurt performance.

3. Are there any disadvantages to using SolrJ over some other HTTP-based
    solution, e.g. creating and sending my own HTTP requests? Do I even
    have to use HTTP?
    I see that EmbeddedSolrServer exists. Are there any drawbacks to
    using that?

Any hints are welcome. Thanks!

Re: Design and Usage Questions

Posted by Xin Li <xi...@gmail.com>.

If you just want a quick way to query a Solr server, the Perl module
WebService::Solr is pretty good.


On Mon, Nov 1, 2010 at 4:56 PM, Lance Norskog <go...@gmail.com> wrote:

> Yes, you can write your own app to read the file with SVNkit and post
> it to the ExtractingRequestHandler. This would be easiest.

Re: Design and Usage Questions

Posted by Lance Norskog <go...@gmail.com>.

Yes, you can write your own app to read the file with SVNKit and post
it to the ExtractingRequestHandler. This would be easiest.
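
As a rough sketch of that approach (not taken from this thread): once
SVNKit has handed the application the file bytes, they can be POSTed as
the raw request body. The handler path /update/extract and the
literal.id and commit parameters follow Solr's example configuration;
the base URL, the document id, the assumption that the unique key field
is "id", and the class and method names are placeholders.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

final class ExtractPost {

    // Builds the request URL; assumes the handler is mapped at /update/extract
    // and that the schema's unique key field is called "id".
    static String extractUrl(String solrBase, String docId) {
        try {
            return solrBase + "/update/extract?literal.id="
                    + URLEncoder.encode(docId, "UTF-8") + "&commit=true";
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Sends the in-memory document as the raw POST body and returns the HTTP status.
    static int post(String url, byte[] content, String contentType) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", contentType);
        conn.setFixedLengthStreamingMode(content.length);
        try (OutputStream body = conn.getOutputStream()) {
            body.write(content);
        }
        return conn.getResponseCode();
    }
}
```

With a ByteArrayOutputStream named out, the call would look something
like ExtractPost.post(ExtractPost.extractUrl(base, id), out.toByteArray(),
"application/pdf"), so nothing is written to disk.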

On Mon, Nov 1, 2010 at 5:49 AM, getagrip <ge...@web.de> wrote:
> Ok, so if I did NOT use Solr_J I could PUSH a Stream to Solr somehow?
> I do not depend on Solr_J, any connection-method would suffice.



-- 
Lance Norskog
goksron@gmail.com

Re: Design and Usage Questions

Posted by getagrip <ge...@web.de>.

Ok, so if I did NOT use SolrJ, could I PUSH a stream to Solr somehow?
I do not depend on SolrJ; any connection method would suffice.

On 11/01/2010 03:23 AM, Lance Norskog wrote:
> 2.
> The SolrJ library handling of content streams is "pull", not "push".
> That is, you give it a reader and it pulls content when it feels like
> it. If your software to feed the connection wants to write the data,
> you have to either buffer the whole thing or do a dual-thread
> writer/reader pair.
>
> The easiest way to pull stuff from SVN is to use one of the web server
> apps. Solr takes a "stream.url" parameter. (Also stream.file.) Note
> that there is no outbound authentication supported; your web server
> has to be open (at least to the Solr instance).

Re: Design and Usage Questions

Posted by torin farmer <ge...@web.de>.

Hm, I do not have a web server set up, for security reasons. I use SVNKit
to connect to SVN via the "file://" protocol; what I get then is the
ByteArrayOutputStream. What would the buffer solution or the dual-thread
writer/reader pair look like?

-----Original Message-----
From: "Lance Norskog" <go...@gmail.com>
Sent: Nov 1, 2010 3:23:55 AM
To: solr-user@lucene.apache.org
Subject: Re: Design and Usage Questions

>2.
>The SolrJ library handling of content streams is "pull", not "push".
>That is, you give it a reader and it pulls content when it feels like
>it. If your software to feed the connection wants to write the data,
>you have to either buffer the whole thing or do a dual-thread
>writer/reader pair.
>
>The easiest way to pull stuff from SVN is to use one of the web server
>apps. Solr takes a "stream.url" parameter. (Also stream.file.) Note
>that there is no outbound authentication supported; your web server
>has to be open (at least to the Solr instance).
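
One possible shape for the dual-thread writer/reader pair, as a minimal
sketch using only the piped streams from java.io. The Producer interface,
the class name, the thread name and the 8192-byte pipe size are
illustrative assumptions, not SolrJ or SVNKit API. The code that wants to
"push" runs in its own thread and writes into the pipe, while the "pull"
side reads from the returned InputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

final class PipedHandoff {

    /** Hypothetical callback implemented by the code that wants to "push" bytes. */
    interface Producer {
        void writeTo(OutputStream out) throws IOException;
    }

    /**
     * Starts a writer thread that runs the producer and returns the read end,
     * which a "pull"-style consumer can read until end of stream.
     */
    static InputStream pullFrom(final Producer producer) throws IOException {
        final PipedInputStream readEnd = new PipedInputStream(8192);
        final PipedOutputStream writeEnd = new PipedOutputStream(readEnd);

        Thread writer = new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    // Blocks whenever the pipe is full, until the reader catches up.
                    producer.writeTo(writeEnd);
                } catch (IOException e) {
                    // Sketch only: real code should report this failure to the reader.
                    e.printStackTrace();
                } finally {
                    try {
                        writeEnd.close(); // signals end of stream to the reader
                    } catch (IOException ignored) {
                        // nothing useful to do at this point
                    }
                }
            }
        }, "svn-to-solr-writer");
        writer.setDaemon(true);
        writer.start();
        return readEnd;
    }
}
```

The pipe only holds a bounded number of bytes at a time, so a large
document never has to sit in memory completely. For documents that fit
comfortably in memory, plain buffering is simpler.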

Re: Design and Usage Questions

Posted by Lance Norskog <go...@gmail.com>.

2.
SolrJ's handling of content streams is "pull", not "push".
That is, you give it a reader and it pulls content when it feels like
it. If the software feeding the connection wants to write the data
itself, you have to either buffer the whole thing or use a dual-thread
writer/reader pair.
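
A minimal sketch of the "buffer the whole thing" option, assuming the
document already sits in a ByteArrayOutputStream as described in the
question. The class and method names are made up for illustration. The
bytes are copied once into an input stream that a pull-style consumer
can read whenever it likes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class BufferedHandoff {

    // Copies the collected bytes and exposes them as a readable stream.
    static ByteArrayInputStream asInputStream(ByteArrayOutputStream collected) {
        return new ByteArrayInputStream(collected.toByteArray());
    }

    public static void main(String[] args) {
        // Stand-in for the stream filled by the repository client.
        byte[] data = "hello from svn".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream fromRepository = new ByteArrayOutputStream();
        fromRepository.write(data, 0, data.length);

        // The consumer pulls from the buffered copy.
        ByteArrayInputStream in = asInputStream(fromRepository);
        byte[] buf = new byte[64];
        int n = in.read(buf, 0, buf.length);
        System.out.println(new String(buf, 0, n, StandardCharsets.UTF_8));
    }
}
```

In SolrJ, a stream like this could then be handed out by a custom
ContentStream implementation; that wiring depends on the SolrJ version
and is not shown here.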

The easiest way to pull stuff from SVN is to use one of the web server
apps. Solr takes a "stream.url" parameter. (Also stream.file.) Note
that there is no outbound authentication supported; your web server
has to be open (at least to the Solr instance).
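
For illustration, a request using stream.url might be put together as in
the sketch below. It assumes that remote streaming is enabled in
solrconfig.xml (the enableRemoteStreaming setting), that the extracting
handler is mapped at /update/extract, and that the unique key field is
"id". The host names, the document URL and the class and method names
are placeholders.

```java
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

final class StreamUrlRequest {

    // Builds a request that asks Solr to fetch the document itself from documentUrl.
    static String build(String solrBase, String docId, String documentUrl) {
        try {
            return solrBase + "/update/extract"
                    + "?literal.id=" + URLEncoder.encode(docId, "UTF-8")
                    + "&stream.url=" + URLEncoder.encode(documentUrl, "UTF-8")
                    + "&commit=true";
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Issues the request with a plain GET and returns the HTTP status code.
    static int send(String requestUrl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(requestUrl).openConnection();
        return conn.getResponseCode();
    }
}
```

Because Solr does the fetching, the document URL has to be reachable
from the Solr host without authentication, as noted above.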

-- 
Lance Norskog
goksron@gmail.com