Posted to dev@lucene.apache.org by Alessandro Benedetti <ab...@apache.org> on 2015/11/05 12:30:38 UTC

[SolrJ Clients] RequestWriter VS BinaryRequestWriter

Hi guys,
I was taking a look at the implementation details to understand how Solr
requests are written by the SolrJ APIs.
The interesting classes are:

*org.apache.solr.client.solrj.request.RequestWriter*

*org.apache.solr.client.solrj.impl.BinaryRequestWriter* ( wrong package ? )

I discovered that :

*CloudSolrClient* - uses the javabin format (*BinaryRequestWriter*)
*HttpSolrClient* and *LBHttpSolrClient* - use the *RequestWriter*
(which writes XML)

As a consequence, ConcurrentUpdateSolrClient uses the XML
RequestWriter as well.

Is there any reason for this?
I thought the javabin format was the most efficient for Solr requests.
Why is the XML RequestWriter still used as the default with those SolrClients?

Cheers

-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England

Re: [SolrJ Clients] RequestWriter VS BinaryRequestWriter

Posted by Alessandro Benedetti <ab...@apache.org>.
Hi Vincenzo,
based on what we have found, I would say CloudSolrClient is the most
efficient way to interact with a SolrCloud cluster.

ConcurrentUpdateSolrServer will be efficient for a single Solr instance,
but under the hood it uses the XML RequestWriter, even if you would
prefer the javabin one (which should be more efficient).

Cheers

On 6 November 2015 at 14:15, Vincenzo D'Amore <v....@gmail.com> wrote:

> Hi Alessandro,
>
> I have followed your same path, having a look at java source. I inherited
> an installation with CloudSolrServer (I still had solrcloud 4.8) but I was
> not sure it was the right choice instead of the (apparently) more appealing
> ConcurrentUpdateSolrClient.
>
> As far as I understood, ConcurrentUpdateSolrClient is rooted with older
> versions of solr, may be older than the cloud version.
> Because of ConcurrentUpdateSolrClient constructors signature, they don't
> accept a zookeeper client or host:port as parameter.
>
> On the other hand, well, I'm not sure that a concurrent client does a job
> better than the standard CloudSolrServer.
>
> Best,
> Vincenzo
>
>
> On Thu, Nov 5, 2015 at 12:30 PM, Alessandro Benedetti <
> abenedetti@apache.org
> > wrote:
>
> > Hi guys,
> > I was taking a look to the implementation details to understand how Solr
> > requests are written by SolrJ APIs.
> > The interesting classes are :
> >
> > *org.apache.solr.client.solrj.request.RequestWriter*
> >
> > *org.apache.solr.client.solrj.impl.BinaryRequestWriter* ( wrong package
> ? )
> >
> > I discovered that :
> >
> > *CloudSolrClient *- is using the javabin format ( *BinaryRequestWriter*)
> > *HttpSolrClient *and* LBHttpSolrClient* - are using the *RequestWriter* (
> > which writes xml)
> >
> > In consequence the ConcurrentUpdateSolrClient is using the xml
> > ResponseWriter as well.
> >
> > Is there any reason in this ?
> > I did know that the javabin  format is the most efficient for Solr
> > requests.
> > Why the xml RequestWriter is still used as default with those
> SolrClients ?
> >
> > Cheers
> >
> > --
> > --------------------------
> >
> > Benedetti Alessandro
> > Visiting card : http://about.me/alessandro_benedetti
> >
> > "Tyger, tyger burning bright
> > In the forests of the night,
> > What immortal hand or eye
> > Could frame thy fearful symmetry?"
> >
> > William Blake - Songs of Experience -1794 England
> >
>
>
>
> --
> Vincenzo D'Amore
> email: v.damore@gmail.com
> skype: free.dev
> mobile: +39 349 8513251
>



-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England

Re: [SolrJ Clients] RequestWriter VS BinaryRequestWriter

Posted by Erick Erickson <er...@gmail.com>.
And the other large benefit of CloudSolrClient is that it
routes documents directly to the correct leader, i.e. it does
the routing on the client rather than having the Solr
instances forward docs to the right leader. Indexing through
CloudSolrClient should scale more nearly linearly as the
number of shards increases.
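
As a rough illustration of the client-side routing described above, here is a toy sketch in plain Java. It is not SolrJ code: the class and method names are hypothetical, and where SolrCloud's default router hashes document ids with MurmurHash3 over per-shard hash ranges, this sketch substitutes String.hashCode() modulo the shard count to keep things short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of client-side document routing (hypothetical names, not SolrJ).
class ClientSideRoutingDemo {

    // Pick the shard index that owns a document id.
    // Assumption: hashCode() modulo shard count stands in for the real hash ranges.
    static int shardFor(String docId, int numShards) {
        return Math.floorMod(docId.hashCode(), numShards);
    }

    // Group ids per shard so each batch can be sent straight to that shard's leader.
    static List<List<String>> route(List<String> docIds, int numShards) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < numShards; i++) {
            batches.add(new ArrayList<>());
        }
        for (String id : docIds) {
            batches.get(shardFor(id, numShards)).add(id);
        }
        return batches;
    }

    public static void main(String[] args) {
        List<List<String>> batches = route(Arrays.asList("a1", "b2", "c3", "d4"), 2);
        for (int i = 0; i < batches.size(); i++) {
            System.out.println("shard " + i + " leader gets " + batches.get(i));
        }
    }
}
```

Because the client computes the target shard itself, no Solr node has to receive a document and forward it on, which is where the better scaling with more shards would come from.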

Best,
Erick

On Fri, Nov 6, 2015 at 6:39 AM, Shawn Heisey <ap...@elyograg.org> wrote:
> On 11/6/2015 7:15 AM, Vincenzo D'Amore wrote:
>> I have followed your same path, having a look at java source. I inherited
>> an installation with CloudSolrServer (I still had solrcloud 4.8) but I was
>> not sure it was the right choice instead of the (apparently) more appealing
>> ConcurrentUpdateSolrClient.
>>
>> As far as I understood, ConcurrentUpdateSolrClient is rooted with older
>> versions of solr, may be older than the cloud version.
>> Because of ConcurrentUpdateSolrClient constructors signature, they don't
>> accept a zookeeper client or host:port as parameter.
>>
>> On the other hand, well, I'm not sure that a concurrent client does a job
>> better than the standard CloudSolrServer.
>
> The concurrent client has one glaring flaw:  It puts all update requests
> into background threads, so any exceptions thrown by those requests are
> logged and ignored.  When you send an add or delete request, the client
> returns immediately to your program and indicates success (by not
> throwing an exception) ... even if the server you're talking to is
> completely offline.
>
> In a bulk insert situation, you might not care about error handling, but
> most people DO care about it.
>
> For most situations, you will want to use HttpSolrClient or
> CloudSolrClient, depending on whether the target is running SolrCloud.
>
> Thanks,
> Shawn
>

Re: [SolrJ Clients] RequestWriter VS BinaryRequestWriter

Posted by Shawn Heisey <ap...@elyograg.org>.
On 11/6/2015 7:15 AM, Vincenzo D'Amore wrote:
> I have followed your same path, having a look at java source. I inherited
> an installation with CloudSolrServer (I still had solrcloud 4.8) but I was
> not sure it was the right choice instead of the (apparently) more appealing
> ConcurrentUpdateSolrClient.
> 
> As far as I understood, ConcurrentUpdateSolrClient is rooted with older
> versions of solr, may be older than the cloud version.
> Because of ConcurrentUpdateSolrClient constructors signature, they don't
> accept a zookeeper client or host:port as parameter.
> 
> On the other hand, well, I'm not sure that a concurrent client does a job
> better than the standard CloudSolrServer.

The concurrent client has one glaring flaw:  It puts all update requests
into background threads, so any exceptions thrown by those requests are
logged and ignored.  When you send an add or delete request, the client
returns immediately to your program and indicates success (by not
throwing an exception) ... even if the server you're talking to is
completely offline.
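
The behaviour described above can be modelled with a minimal sketch in plain Java. This is not SolrJ code: OfflineServer and FireAndForgetClient are hypothetical stand-ins, and the sketch only assumes that updates are queued to a background thread and that failures there are recorded rather than rethrown.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class SwallowedErrorDemo {

    // Stand-in for a server that is completely offline: every request fails.
    static class OfflineServer {
        void send(String doc) {
            throw new IllegalStateException("connection refused: " + doc);
        }
    }

    // Hypothetical fire-and-forget client: add() only queues the work.
    static class FireAndForgetClient {
        private final ExecutorService pool = Executors.newSingleThreadExecutor();
        private final OfflineServer server;
        // Failures seen on the background thread; never thrown to the caller.
        final List<Throwable> swallowed = new CopyOnWriteArrayList<>();

        FireAndForgetClient(OfflineServer server) {
            this.server = server;
        }

        // Returns immediately; the request runs later on the background thread.
        void add(String doc) {
            pool.execute(() -> {
                try {
                    server.send(doc);
                } catch (RuntimeException e) {
                    handleError(e);
                }
            });
        }

        // Error hook: records the failure instead of rethrowing it.
        void handleError(Throwable t) {
            swallowed.add(t);
        }

        // Stops accepting work and waits for queued requests to finish.
        void close() {
            pool.shutdown();
            try {
                pool.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Sends two documents to the offline server and returns how many
    // failures the caller never saw as exceptions.
    static int run() {
        FireAndForgetClient client = new FireAndForgetClient(new OfflineServer());
        client.add("doc-1"); // no exception, although the server is down
        client.add("doc-2");
        client.close();
        return client.swallowed.size();
    }

    public static void main(String[] args) {
        System.out.println("failures hidden from the caller: " + run());
    }
}
```

In SolrJ itself, the concurrent client has an overridable handleError(Throwable) method that plays the role of the hook in this sketch; overriding it is one way to at least collect the failures.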

In a bulk insert situation, you might not care about error handling, but
most people DO care about it.

For most situations, you will want to use HttpSolrClient or
CloudSolrClient, depending on whether the target is running SolrCloud.

Thanks,
Shawn


Re: [SolrJ Clients] RequestWriter VS BinaryRequestWriter

Posted by Vincenzo D'Amore <v....@gmail.com>.
Hi Alessandro,

I followed the same path as you and had a look at the Java source. I inherited
an installation with CloudSolrServer (I still had SolrCloud 4.8) but I was
not sure it was the right choice compared with the (apparently) more appealing
ConcurrentUpdateSolrClient.

As far as I understand, ConcurrentUpdateSolrClient has its roots in older
versions of Solr, maybe older than SolrCloud itself: its constructors
don't accept a ZooKeeper client or host:port as a parameter.

On the other hand, well, I'm not sure that a concurrent client does a
better job than the standard CloudSolrServer.

Best,
Vincenzo


On Thu, Nov 5, 2015 at 12:30 PM, Alessandro Benedetti <abenedetti@apache.org
> wrote:

> Hi guys,
> I was taking a look to the implementation details to understand how Solr
> requests are written by SolrJ APIs.
> The interesting classes are :
>
> *org.apache.solr.client.solrj.request.RequestWriter*
>
> *org.apache.solr.client.solrj.impl.BinaryRequestWriter* ( wrong package ? )
>
> I discovered that :
>
> *CloudSolrClient *- is using the javabin format ( *BinaryRequestWriter*)
> *HttpSolrClient *and* LBHttpSolrClient* - are using the *RequestWriter* (
> which writes xml)
>
> In consequence the ConcurrentUpdateSolrClient is using the xml
> ResponseWriter as well.
>
> Is there any reason in this ?
> I did know that the javabin  format is the most efficient for Solr
> requests.
> Why the xml RequestWriter is still used as default with those SolrClients ?
>
> Cheers
>
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
>



-- 
Vincenzo D'Amore
email: v.damore@gmail.com
skype: free.dev
mobile: +39 349 8513251

RE: [SolrJ Clients] RequestWriter VS BinaryRequestWriter

Posted by Martin Gainty <mg...@hotmail.com>.

> Subject: Re: [SolrJ Clients] RequestWriter VS BinaryRequestWriter
> To: dev@lucene.apache.org
> From: apache@elyograg.org
> Date: Thu, 5 Nov 2015 08:41:16 -0700
> 
> On 11/5/2015 4:30 AM, Alessandro Benedetti wrote:
> > Hi guys,
> > I was taking a look to the implementation details to understand how
> > Solr requests are written by SolrJ APIs.
> > The interesting classes are :
> >
> > /org.apache.solr.client.solrj.request.RequestWriter/
> >
> > /org.apache.solr.client.solrj.impl.BinaryRequestWriter/( wrong package ? )
> 
> Even though RequestWriter is not written as an actual interface, and is
> itself an implementation, you can think of it as an interface. 
> BinaryRequestWriter is a very specific implementation, which is why it
> is in a package that includes "impl."
> 
> RequestWriter should probably be an interface or an abstract class (like
> SolrClient), with an XML and a binary implementation.
> 
> I believe that there is one primary reason that HttpSolrClient is still
> using the XML request writer and that RequestWriter is still an actual
> implementation rather than abstract class or an interface:  Inertia. 
> The HTTP client has been available for users for a very long time.  Even
> though the chances of causing a problem with existing user programs by
> switching the default writer is very small, it's much safer and easier
> to leave it at XML.

Martini> the conflict is over the /update endpoint, which for 3.x and lower versions is always XML
Martini> so if you have one active 3.x server you have no choice but to retain the XML RequestWriter
Martini> good point Shawn
> 
> We had an opportunity with the 5.0 release to make some of these
> changes, but a new major release has so much activity that it is easy to
> lose track of desired fixes, especially when the existing version works
> perfectly.  The Cloud client is using binary because it is a newer
> implementation and did not have years of history with XML to worry about.
> 
> I think that these changes should be made in trunk, so they will be
> there in 6.0 when that version is released. 
> 
> Thanks,
> Shawn
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
> 
 		 	   		  

Re: [SolrJ Clients] RequestWriter VS BinaryRequestWriter

Posted by Shawn Heisey <ap...@elyograg.org>.
On 11/5/2015 4:30 AM, Alessandro Benedetti wrote:
> Hi guys,
> I was taking a look to the implementation details to understand how
> Solr requests are written by SolrJ APIs.
> The interesting classes are :
>
> /org.apache.solr.client.solrj.request.RequestWriter/
>
> /org.apache.solr.client.solrj.impl.BinaryRequestWriter/( wrong package ? )

Even though RequestWriter is not written as an actual interface, and is
itself an implementation, you can think of it as an interface. 
BinaryRequestWriter is a very specific implementation, which is why it
is in a package that includes "impl."

RequestWriter should probably be an interface or an abstract class (like
SolrClient), with an XML and a binary implementation.
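
A minimal sketch of what such a split could look like, in plain Java. All names here (DocWriter, XmlDocWriter, BinaryDocWriter) are hypothetical, and the binary layout is a simple length-prefixed encoding chosen for illustration; it is not the javabin format and this is not how SolrJ is actually structured.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class WriterAbstractionDemo {

    // Hypothetical abstraction: one contract, several wire formats.
    interface DocWriter {
        String contentType();
        byte[] write(Map<String, String> doc);
    }

    // XML implementation (no escaping of special characters, for brevity).
    static class XmlDocWriter implements DocWriter {
        public String contentType() {
            return "application/xml";
        }

        public byte[] write(Map<String, String> doc) {
            StringBuilder sb = new StringBuilder("<add><doc>");
            for (Map.Entry<String, String> e : doc.entrySet()) {
                sb.append("<field name=\"").append(e.getKey()).append("\">")
                  .append(e.getValue()).append("</field>");
            }
            return sb.append("</doc></add>").toString().getBytes(StandardCharsets.UTF_8);
        }
    }

    // Binary implementation: field count, then length-prefixed name/value pairs.
    // Assumption: this layout is illustrative only and is NOT javabin.
    static class BinaryDocWriter implements DocWriter {
        public String contentType() {
            return "application/octet-stream";
        }

        public byte[] write(Map<String, String> doc) {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeInt(doc.size());
                for (Map.Entry<String, String> e : doc.entrySet()) {
                    out.writeUTF(e.getKey());
                    out.writeUTF(e.getValue());
                }
            } catch (IOException ex) {
                throw new UncheckedIOException(ex);
            }
            return bytes.toByteArray();
        }
    }

    public static void main(String[] args) {
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "1");
        doc.put("title", "hello");
        DocWriter[] writers = { new XmlDocWriter(), new BinaryDocWriter() };
        for (DocWriter w : writers) {
            System.out.println(w.contentType() + ": " + w.write(doc).length + " bytes");
        }
    }
}
```

With this shape, a client would depend only on the interface, and switching the default format would mean changing which implementation is constructed rather than touching the base class.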

I believe that there is one primary reason that HttpSolrClient is still
using the XML request writer and that RequestWriter is still an actual
implementation rather than abstract class or an interface:  Inertia. 
The HTTP client has been available for users for a very long time.  Even
though the chances of causing a problem with existing user programs by
switching the default writer are very small, it's much safer and easier
to leave it at XML.

We had an opportunity with the 5.0 release to make some of these
changes, but a new major release has so much activity that it is easy to
lose track of desired fixes, especially when the existing version works
perfectly.  The Cloud client is using binary because it is a newer
implementation and did not have years of history with XML to worry about.

I think that these changes should be made in trunk, so they will be
there in 6.0 when that version is released. 

Thanks,
Shawn




Re: [SolrJ Clients] RequestWriter VS BinaryRequestWriter

Posted by Mark Miller <ma...@gmail.com>.
At some point I filed an issue along the lines of "make all SolrCloud
communication via javabin by default".

That made CloudSolrClient start talking javabin by default, as
well as our internal usage of other clients (where we turned it on).

That issue did not include changing the defaults for the non-cloud clients.

If someone thinks that is a good idea, the upcoming 6.0 release would be a
good time to make the change.

- Mark

On Thu, Nov 5, 2015 at 6:30 AM Alessandro Benedetti <ab...@apache.org>
wrote:

> Hi guys,
> I was taking a look to the implementation details to understand how Solr
> requests are written by SolrJ APIs.
> The interesting classes are :
>
> *org.apache.solr.client.solrj.request.RequestWriter*
>
> *org.apache.solr.client.solrj.impl.BinaryRequestWriter* ( wrong package ?
> )
>
> I discovered that :
>
> *CloudSolrClient *- is using the javabin format ( *BinaryRequestWriter*)
> *HttpSolrClient *and* LBHttpSolrClient* - are using the *RequestWriter* (
> which writes xml)
>
> In consequence the ConcurrentUpdateSolrClient is using the xml
> ResponseWriter as well.
>
> Is there any reason in this ?
> I did know that the javabin  format is the most efficient for Solr
> requests.
> Why the xml RequestWriter is still used as default with those SolrClients ?
>
> Cheers
>
>
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
>
-- 
- Mark
about.me/markrmiller
