Posted to solr-user@lucene.apache.org by Michael Sokolov <ms...@safaribooksonline.com> on 2013/05/11 03:42:17 UTC

Solr 4.x/3.x update javabin incompatibility?

I upgraded one of my SolrJ clients to 4.2.0 and am testing it against a
3.4 server.  We generally use a BinaryRequestWriter (i.e. javabin).  With
the 3.4 SolrJ client, this caused update requests to be directed to
/update/javabin.  In 4.2, however, the format seems to be selected by
the Content-Type header rather than by a distinct URL path, and the 3.4
server isn't set up to understand the difference: in the standard config
that everyone seems to use, it interprets requests at /update as XML.
This leads to errors, naturally.
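
To make the mismatch concrete, here is a rough sketch of the two dispatch styles. It is a simplified model, not Solr's actual dispatch code: the class and method names are made up, and the "application/javabin" header value is an assumption used only for illustration.

```java
import java.util.Locale;

/** Simplified model of update dispatch; not Solr's actual code. */
final class UpdateDispatchSketch {

    enum Format { XML, JAVABIN }

    /** 3.x style (standard config assumed): the URL path alone selects the parser. */
    static Format dispatchByPath(String path) {
        return path.equals("/update/javabin") ? Format.JAVABIN : Format.XML;
    }

    /** 4.x style: a single /update path; the Content-Type header selects the parser. */
    static Format dispatchByContentType(String contentType) {
        // "application/javabin" is an assumed header value for illustration.
        String ct = contentType == null ? "" : contentType.toLowerCase(Locale.ROOT);
        return ct.startsWith("application/javabin") ? Format.JAVABIN : Format.XML;
    }

    public static void main(String[] args) {
        // A newer client posts javabin to /update; a path-based server picks XML.
        System.out.println(dispatchByPath("/update"));
        System.out.println(dispatchByPath("/update/javabin"));
        System.out.println(dispatchByContentType("application/javabin"));
    }
}
```

In this model a javabin body posted to /update is handed to the XML parser by the path-based server, which is the situation described above.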

By the way the errors themselves are quite difficult to comprehend, 
because of the format mismatch I guess -- you end up with a mysterious 
"Invalid version (expected 2, but 1) or the data in not in 'javabin' 
format" ... at least there is an indication this might be 
javabin-related in some way, maybe just by coincidence.  On the server 
side, the error you get is a bad UTF8 character - "Invalid UTF-8 middle 
byte 0xe0 (at char #1, byte #-1)" - since it is trying to parse javabin 
as XML.
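
A small sketch of why both error messages look the way they do. It assumes, based on the "expected 2" in the client-side message, that a javabin stream starts with a version byte of 2. The byte values and all names below are made up for illustration and are not taken from a captured request.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class PayloadSniffSketch {

    /** Assumption: a javabin stream begins with a version byte equal to 2. */
    static boolean looksLikeJavabin(byte[] body) {
        return body.length > 0 && body[0] == 2;
    }

    /** True only if the bytes decode as strict UTF-8, as a text parser requires. */
    static boolean isValidUtf8(byte[] body) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(body));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Made-up binary bytes: a version byte, then a UTF-8 lead byte (0xC3)
        // followed by 0xE0, which is not a valid continuation byte.
        byte[] binary = {0x02, (byte) 0xC3, (byte) 0xE0, 0x01};
        byte[] xml = "<add><doc/></add>".getBytes(StandardCharsets.UTF_8);
        System.out.println(looksLikeJavabin(binary) + " " + isValidUtf8(binary));
        System.out.println(looksLikeJavabin(xml) + " " + isValidUtf8(xml));
    }
}
```

A text parser fed the binary payload fails on the first byte sequence that is not legal UTF-8, and a binary reader fed text fails on the version check, which matches the two symptoms above.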

My question is: is this intentional?  It's unfortunate that we don't 
seem to be able to update the client and have it continue to work with 
(ie send updates to) the old servers.  We have a centralized client 
library that we share across a large number of different installations, 
and getting all the servers upgraded is going to take some time.  It 
would be really nice if we could decouple the upgrade efforts.

I guess as a workaround we can enforce that the clients send requests to
/update/javabin as they used to (new UpdateRequest("/update/javabin")),
but am I missing a straightforward out-of-the-box way?  I don't think I
saw anything about this in the release notes.

-Mike

Re: Solr 4.x/3.x update javabin incompatibility?

Posted by Michael Sokolov <ms...@safaribooksonline.com>.
Just a quick followup to this - I see now that this change is logged in 
jira as SOLR-2857.  It doesn't seem that anybody considered the 
backwards incompatibility of new client/old server there.  And it's not 
really clear to me how I can get SolrJ to emit requests to 
/update/javabin - is there any way to do it without patching SolrJ?


Re: Solr 4.x/3.x update javabin incompatibility?

Posted by Michael Sokolov <ms...@safaribooksonline.com>.
On 5/10/2013 10:18 PM, Shawn Heisey wrote:
> On 5/10/2013 7:42 PM, Michael Sokolov wrote:
>> My question is: is this intentional?  It's unfortunate that we don't
>> seem to be able to update the client and have it continue to work with
>> (ie send updates to) the old servers.  We have a centralized client
>> library that we share across a large number of different installations,
>> and getting all the servers upgraded is going to take some time.  It
>> would be really nice if we could decouple the upgrade efforts.
> I have a SolrJ 4.2.1 client that keeps both copies of my index up to
> date.  A single program runs and updates both, it's not multiple copies.
>   One of those copies is running 3.5.0 and the other has been upgraded to
> 4.2.1.  I haven't had any trouble with javabin.
OK - I couldn't believe this was a universal problem that had gone 
unnoticed for a year or more.  Must be something weird in our setup.  
I'm still trying to figure that out.  I had gone so far as intercepting 
the /update requests (in HttpClient) and rewriting them to 
/update/javabin.  But when I do that I am getting a different error when 
inserting documents (Unknown type 16 at 
org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:216)) 
which seems like it might be a clue.

I'll post back when I know more in case someone else ends up in this hole.

-Mike

Re: Solr 4.x/3.x update javabin incompatibility?

Posted by Michael Sokolov <ms...@safaribooksonline.com>.
On 5/11/2013 11:31 AM, Michael Sokolov wrote:
> On 5/11/2013 11:14 AM, Steve Rowe wrote:
>> On May 11, 2013 7:27 AM, "Michael Sokolov" 
>> <ms...@safaribooksonline.com>
>> wrote:
>>> If somebody grants me access to the wiki, I'd be happy to write 
>>> something
>> there to let people know about this issue.
>>
>> What's your wiki username?
>>
> sokolov
Thanks Steve  - I've updated these pages: 
http://wiki.apache.org/solr/javabin; http://wiki.apache.org/solr/Solrj


Re: Solr 4.x/3.x update javabin incompatibility?

Posted by Michael Sokolov <ms...@safaribooksonline.com>.
On 5/11/2013 11:14 AM, Steve Rowe wrote:
> On May 11, 2013 7:27 AM, "Michael Sokolov" <ms...@safaribooksonline.com>
> wrote:
>> If somebody grants me access to the wiki, I'd be happy to write something
> there to let people know about this issue.
>
> What's your wiki username?
>
sokolov

Re: Solr 4.x/3.x update javabin incompatibility?

Posted by Steve Rowe <sa...@gmail.com>.
On May 11, 2013 7:27 AM, "Michael Sokolov" <ms...@safaribooksonline.com>
wrote:
> If somebody grants me access to the wiki, I'd be happy to write something
there to let people know about this issue.

What's your wiki username?

Re: Solr 4.x/3.x update javabin incompatibility?

Posted by Shawn Heisey <so...@elyograg.org>.
On 5/11/2013 8:26 AM, Michael Sokolov wrote:
> try {
>     httpServer.request(new DirectXmlRequest("/admin.html", ""));
>     httpServer.setRequestWriter(new BinaryRequestWriter());
> } catch (SolrException e) {
>     // There is a backwards-compatibility issue w/javabin; use XML for now
>     // assume this is a 404 ??
>     getLogger().warn("Could not load /admin.html; assuming this is a "
>             + "< 4.0 server, we won't use javabin format");
> } ...
>
> I don't know if this is really an ideal detection mechanism. Probably
> this could be done better in Solr/J code somewhere.  At the very least
> there should be warnings in release notes and on the Wiki.  If
> somebody grants me access to the wiki, I'd be happy to write something
> there to let people know about this issue.

That looks like a reasonable way to go, I'll give it a try.  I do wonder
if there are any known "pure SolrJ" calls that will fail against a 3.x
server but not fail against a 4.x server, but I don't think there's
anything wrong with the way you've done it.

If the problem can be fixed within SolrJ so the app developer doesn't
have to worry about it at all, that would be the best way to go.

I found SOLR-3038 and put in my two cents.  I'm a little bit amused by
my own earlier comment on that issue, where I say that I'm not a java
programmer.  It was completely true at the time.  Now it's only a little
bit true. :)

Thanks,
Shawn


Re: Solr 4.x/3.x update javabin incompatibility?

Posted by Michael Sokolov <ms...@safaribooksonline.com>.
On 5/10/2013 11:39 PM, Shawn Heisey wrote:
> On 5/10/2013 8:56 PM, Michael Sokolov wrote:
>> On 5/10/2013 10:18 PM, Shawn Heisey wrote:
>>> I don't know why I'm not having any trouble. I'm certainly glad that
>>> I'm not, though! Thanks, Shawn
>> Shawn, one question - in your server setup do you have:
>>
>> _querySolr.setRequestWriter(new BinaryRequestWriter());
>>
>> ?  I didn't see that - it (used to be) the way you would request
>> javabin to be used to communicate with the server.
>>
>> If you don't have that, you are probably using XML format.  I guess we
>> could just do that, too, at least until all our servers have been
>> upgraded.
> I did a packet capture on both server versions; the updates that the
> client is sending are indeed in XML format.  I find myself a little
> surprised by this.  I'm not going to try to change it in production
> until I have upgraded everything and no longer have 3.5.0 in my network,
> but I will go ahead and try it on my dev server.
>
Just to wrap this up; what I ended up doing was to attempt to detect the 
server version when we start up a new connection, and only use javabin 
if we are reasonably positive we are talking to a 4.0+ server:

    try {
        httpServer.request(new DirectXmlRequest("/admin.html", ""));
        httpServer.setRequestWriter(new BinaryRequestWriter());
    } catch (SolrException e) {
        // There is a backwards-compatibility issue w/javabin; use XML for now
        // assume this is a 404 ??
        getLogger().warn("Could not load /admin.html; assuming this is a "
                + "< 4.0 server, we won't use javabin format");
    } ...

I don't know if this is really an ideal detection mechanism. Probably 
this could be done better in Solr/J code somewhere.  At the very least 
there should be warnings in release notes and on the Wiki.  If somebody 
grants me access to the wiki, I'd be happy to write something there to 
let people know about this issue.
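
For anyone who wants to try the same idea without depending on SolrJ classes, here is a stdlib-only sketch of the decision. It is an approximation of the approach above, not the code we run: the names (WriterChoiceSketch, chooseWriter, probeAdminHtml, baseUrl) are hypothetical, and treating any 2xx status as "4.0+ server" is an assumption.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class WriterChoiceSketch {

    enum Writer { XML, JAVABIN }

    /** Decision rule: use javabin only if the probe returned a 2xx status. */
    static Writer chooseWriter(int probeStatus) {
        return (probeStatus >= 200 && probeStatus < 300) ? Writer.JAVABIN : Writer.XML;
    }

    /** Returns the HTTP status of GET baseUrl + "/admin.html", or -1 on I/O failure. */
    static int probeAdminHtml(String baseUrl) {
        try {
            HttpURLConnection conn = (HttpURLConnection)
                    URI.create(baseUrl + "/admin.html").toURL().openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            try {
                return conn.getResponseCode();
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            return -1; // I/O failure: the caller falls back to XML
        }
    }

    public static void main(String[] args) {
        // Only the pure decision is exercised here; no network call is made.
        System.out.println(chooseWriter(200));
        System.out.println(chooseWriter(404));
        System.out.println(chooseWriter(-1));
    }
}
```

Keeping the decision separate from the probe makes the fallback easy to test: anything other than a clear success leaves the client on XML.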

-Mike

Re: Solr 4.x/3.x update javabin incompatibility?

Posted by Shawn Heisey <so...@elyograg.org>.
On 5/10/2013 8:56 PM, Michael Sokolov wrote:
> On 5/10/2013 10:18 PM, Shawn Heisey wrote:
>> I don't know why I'm not having any trouble. I'm certainly glad that
>> I'm not, though! Thanks, Shawn 
> Shawn, one question - in your server setup do you have:
>
> _querySolr.setRequestWriter(new BinaryRequestWriter());
>
> ?  I didn't see that - it (used to be) the way you would request
> javabin to be used to communicate with the server.
>
> If you don't have that, you are probably using XML format.  I guess we
> could just do that, too, at least until all our servers have been
> upgraded.

I did a packet capture on both server versions; the updates that the
client is sending are indeed in XML format.  I find myself a little
surprised by this.  I'm not going to try to change it in production
until I have upgraded everything and no longer have 3.5.0 in my network,
but I will go ahead and try it on my dev server.

I don't have a call to setRequestWriter or setParser.  My server log
shows that the responses are javabin, and the packet captures confirm
that as well:

From the 3.5.0 log:
May 10, 2013 9:04:07 PM org.apache.solr.core.SolrCore execute
INFO: [s0live] webapp=/solr path=/update
params={waitSearcher=true&wt=javabin&commit=true&softCommit=false&version=2}
status=0 QTime=5446

From the 4.2.1 log:
INFO  - 2013-05-10 21:04:02.276;
org.apache.solr.update.processor.LogUpdateProcessor; [s0live]
webapp=/solr path=/update
params={waitSearcher=true&commit=true&wt=javabin&version=2&softCommit=false}
{commit=} 0 1561
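
Note that wt names the response writer, so wt=javabin in these lines describes the response only and says nothing about the format of the request body. A small sketch for pulling the parameters out of a log line like the ones above (the class and method names are hypothetical, and the parsing assumes the params={...} layout shown here):

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LogParamsSketch {

    /** Extracts the key=value pairs between "params={" and the next "}". */
    static Map<String, String> parseParams(String logLine) {
        Map<String, String> params = new LinkedHashMap<>();
        int start = logLine.indexOf("params={");
        if (start < 0) return params;
        start += "params={".length();
        int end = logLine.indexOf('}', start);
        if (end < 0) return params;
        for (String pair : logLine.substring(start, end).split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0) params.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return params;
    }

    public static void main(String[] args) {
        String line = "INFO: [s0live] webapp=/solr path=/update "
                + "params={waitSearcher=true&wt=javabin&commit=true"
                + "&softCommit=false&version=2} status=0 QTime=5446";
        Map<String, String> p = parseParams(line);
        System.out.println("wt=" + p.get("wt") + ", version=" + p.get("version"));
    }
}
```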

Thanks,
Shawn


Re: Solr 4.x/3.x update javabin incompatibility?

Posted by Michael Sokolov <ms...@safaribooksonline.com>.
On 5/10/2013 10:18 PM, Shawn Heisey wrote:
> I don't know why I'm not having any trouble. I'm certainly glad that 
> I'm not, though! Thanks, Shawn 
Shawn, one question - in your server setup do you have:

_querySolr.setRequestWriter(new BinaryRequestWriter());

?  I didn't see that - it (used to be) the way you would request javabin 
to be used to communicate with the server.

If you don't have that, you are probably using XML format.  I guess we 
could just do that, too, at least until all our servers have been upgraded.

-Mike

Re: Solr 4.x/3.x update javabin incompatibility?

Posted by Shawn Heisey <so...@elyograg.org>.
On 5/10/2013 7:42 PM, Michael Sokolov wrote:
> My question is: is this intentional?  It's unfortunate that we don't
> seem to be able to update the client and have it continue to work with
> (ie send updates to) the old servers.  We have a centralized client
> library that we share across a large number of different installations,
> and getting all the servers upgraded is going to take some time.  It
> would be really nice if we could decouple the upgrade efforts.

I have a SolrJ 4.2.1 client that keeps both copies of my index up to
date.  A single program runs and updates both; it's not multiple copies.
One of those copies is running 3.5.0 and the other has been upgraded to
4.2.1.  I haven't had any trouble with javabin.

Here's a rundown of how I set up my server object.  It's more complex
than I remembered making it:

// class fields
	private static boolean firstInstance = true;
	private static final PoolingClientConnectionManager mgr =
	    new PoolingClientConnectionManager();
	private static final DefaultHttpClient httpClient =
	    new DefaultHttpClient(mgr);
	private HttpSolrServer _querySolr;

// in the constructor
	if (firstInstance)
	{
		firstInstance = false;
		mgr.setDefaultMaxPerRoute(25);
		mgr.setMaxTotal(1000);
	}

	serverBaseUrl = "http://" + serverHost + ":" +
	    serverPort + "/solr/";
	coreBaseUrl = serverBaseUrl + name + "/";
	_querySolr = new HttpSolrServer(coreBaseUrl, httpClient);

	_querySolr.setMaxRetries(1);
	_querySolr.setConnectionTimeout(15000);

I don't know why I'm not having any trouble.  I'm certainly glad that
I'm not, though!

Thanks,
Shawn