Posted to users@solr.apache.org by Richard <so...@okcoder.com> on 2022/09/26 00:21:02 UTC

Solr7 / 8 interop

Hello again Solr gurus!

We've been slowly progressing on our project to move our deployment from
v7 to v8 and I've hit a snag while doing some validation. I wanted to
see if this was an expected failure or a peculiarity of our setup.

tl;dr: Solr 8, when retrieving responses written by a Solr 7 overseer,
throws exceptions because the two versions (seem to) use different codecs
to save different objects. Is this an expected incompatibility in a
heterogeneous fleet, and are there others I may need to be wary of? Are we
doing something odd?

Details:

I did an initial deployment of our build and ran one of our
orchestration tools. It essentially: 

1. issues an async deletereplica call to the local solr instance (v8)
2. polls requeststatus for that async_id until it completes
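
For anyone reproducing this, the two steps above could be sketched roughly as follows. This is a minimal sketch, not the orchestration tool from this thread: the class name, the base URL, and the `fetchState` callback are all hypothetical, and it assumes the standard Collections API parameters (`action=DELETEREPLICA` with `async`, and `action=REQUESTSTATUS` with `requestid`). Fetching and parsing the HTTP response is left to the caller.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.function.Supplier;

/** Sketch of the submit-then-poll workflow; all names here are illustrative. */
final class AsyncCollectionsSketch {
    // Assumed base URL for the local node; adjust for the real deployment.
    static final String BASE = "http://localhost:8983/solr/admin/collections";

    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    /** Step 1: URI that submits DELETEREPLICA asynchronously under the given id. */
    static URI deleteReplica(String collection, String shard, String replica, String asyncId) {
        return URI.create(BASE + "?action=DELETEREPLICA"
                + "&collection=" + enc(collection)
                + "&shard=" + enc(shard)
                + "&replica=" + enc(replica)
                + "&async=" + enc(asyncId));
    }

    /** Step 2: URI that asks for the status of a previously submitted async request. */
    static URI requestStatus(String asyncId) {
        return URI.create(BASE + "?action=REQUESTSTATUS&requestid=" + enc(asyncId));
    }

    /**
     * Calls fetchState until it reports a terminal state or attempts run out.
     * fetchState is a caller-supplied hook that performs the REQUESTSTATUS call
     * and returns the reported state string (for example "running" or "completed").
     */
    static String pollUntilDone(Supplier<String> fetchState, int maxAttempts, long sleepMillis) {
        String state = "unknown";
        for (int i = 0; i < maxAttempts; i++) {
            state = fetchState.get();
            if (state.equals("completed") || state.equals("failed") || state.equals("notfound")) {
                return state;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
                return state;
            }
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println(deleteReplica("mycollection", "shard1", "core_node2", "my-async-id"));
        System.out.println(requestStatus("my-async-id"));
    }
}
```

In the mixed-version case described here, it is the second call that fails on the Solr 8 node, so a poller like this would see an error response rather than a state.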

The solr8 node logs contained the following while servicing
requeststatus:

org.apache.solr.common.SolrException: Exception deserializing response from Javabin
  at org.apache.solr.cloud.OverseerSolrResponseSerializer.deserialize(OverseerSolrResponseSerializer.java:72) ~[?:?]
  at org.apache.solr.handler.admin.CollectionsHandler$CollectionOperation.lambda$static$21(CollectionsHandler.java:908) ~[?:?]
  at org.apache.solr.handler.admin.CollectionsHandler$CollectionOperation.execute(CollectionsHandler.java:1463) ~[?:?]
  at org.apache.solr.handler.admin.CollectionsHandler.invokeAction(CollectionsHandler.java:285) ~[?:?]
  at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:257) ~[?:?]
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:216) ~[?:?]
  at org.apache.solr.servlet.HttpSolrCall.handleAdmin(HttpSolrCall.java:833) ~[?:?]
  at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(HttpSolrCall.java:797) ~[?:?]
  at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:542) ~[?:?]
  ...
Caused by: java.lang.RuntimeException: Invalid version (expected 2, but -84) or the data in not in 'javabin' format
  at org.apache.solr.common.util.JavaBinCodec._init(JavaBinCodec.java:213) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.initRead(JavaBinCodec.java:207) ~[?:?]
  at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:191) ~[?:?]
  at org.apache.solr.common.util.Utils.fromJavabin(Utils.java:188) ~[?:?]
  at org.apache.solr.cloud.OverseerSolrResponseSerializer.deserialize(OverseerSolrResponseSerializer.java:66) ~[?:?]
  ... 49 more

Looking at JavaBinCodec on 8.11.2 I see that, sure enough, we require
0x02 as the first byte of the serialized data at
/overseer/collection-map-completed. However, the same is true of the
7.7.3 JavaBinCodec. Initially I thought this might mean we had made a
change that altered how the data there got serialized.

My next step was to take an essentially clean release from the web for
8.11.2 and 7.7.3 and issue an async deletereplica to see what got
stored:

localhost $ ./zk.sh get /overseer/collection-map-completed/mn-solr8-async-deletereplica | xxd
00000000: 02c2 e027 7375 6363 6573 73a2 e037 3139  ...'success..719
00000010: 322e 3136 382e 312e 3137 393a 3839 3833  2.168.1.179:8983
00000020: 5f73 6f6c 72a1 e02e 7265 7370 6f6e 7365  _solr...response
00000030: 4865 6164 6572 a2e0 2673 7461 7475 7306  Header..&status.

localhost $ ./zk.sh get /overseer/collection-map-completed/mn-solr7-async-deletereplica | xxd
00000000: aced 0005 7372 002a 6f72 672e 6170 6163  ....sr.*org.apac
00000010: 6865 2e73 6f6c 722e 636c 6f75 642e 4f76  he.solr.cloud.Ov
00000020: 6572 7365 6572 536f 6c72 5265 7370 6f6e  erseerSolrRespon
00000030: 7365 4186 ae6d 5e29 0df0 0200 024a 000b  seA..m^).....J..

For reference: 0xACED is the magic number for Java's Serializable
output, and 0xAC is 172 as a uint8 and -84 as an int8, which matches the
RuntimeException we originally saw.
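
The byte arithmetic above can be checked with a few lines of Java. This is only an illustrative sketch: the `HeaderSniffer` class and its labels are made up for this example and are not part of Solr. It relies on Java's `byte` being signed (which is why 0xAC shows up as -84 in the exception message) and classifies a payload by its leading bytes the same way the two hex dumps were read by eye.

```java
/** Illustrative helper (not a Solr class) that classifies a payload by its first bytes. */
final class HeaderSniffer {

    static String classify(byte[] data) {
        // Java serialization streams start with the two-byte magic 0xACED.
        if (data.length >= 2 && (data[0] & 0xFF) == 0xAC && (data[1] & 0xFF) == 0xED) {
            return "java-serialization";
        }
        // The exception message says the javabin reader expects version byte 2.
        if (data.length >= 1 && data[0] == 2) {
            return "javabin-v2";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        byte first = (byte) 0xAC;
        System.out.println(first);        // signed view: -84
        System.out.println(first & 0xFF); // unsigned view: 172

        // First four bytes of each dump from the thread.
        byte[] solr8 = {0x02, (byte) 0xC2, (byte) 0xE0, 0x27};
        byte[] solr7 = {(byte) 0xAC, (byte) 0xED, 0x00, 0x05};
        System.out.println(classify(solr8)); // javabin-v2
        System.out.println(classify(solr7)); // java-serialization
    }
}
```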

In addition to the codec change (Solr 7 looks to use plain Java
serialization (0xaced) where Solr 8 uses JavaBinCodec (0x02)), it also
looks like a different object entirely is getting stored: Solr 7 keeps the
OverseerSolrResponse container, while Solr 8 stores only the NamedList
responseData.
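
The reason the container's class name is readable in the Solr 7 dump is that Java serialization writes a class descriptor into the stream: after the 0xACED magic and the 0x0005 version come 0x73 (object), 0x72 (class descriptor), and a length-prefixed class name. In the dump, 0x002a is 42, the length of "org.apache.solr.cloud.OverseerSolrResponse". A small sketch of that layout follows, using a hypothetical stand-in class rather than anything from Solr:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Illustrative only: shows where Java serialization puts the class name. */
final class SerializedContainerDemo {

    /** Hypothetical stand-in for a response wrapper; not Solr's class. */
    static final class ResponseContainer implements Serializable {
        private static final long serialVersionUID = 1L;
        final long elapsedTime = 11L;
    }

    static byte[] serialize(Serializable value) {
        try (ByteArrayOutputStream buf = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(value);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the top-level class name out of a Java-serialized object stream. */
    static String className(byte[] data) {
        if (data.length < 8) {
            throw new IllegalArgumentException("too short to be a serialized object");
        }
        int magic = ((data[0] & 0xFF) << 8) | (data[1] & 0xFF);
        if (magic != (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF)
                || data[4] != ObjectStreamConstants.TC_OBJECT
                || data[5] != ObjectStreamConstants.TC_CLASSDESC) {
            throw new IllegalArgumentException("not a Java-serialized object");
        }
        // Two-byte big-endian length, then the class name bytes.
        int len = ((data[6] & 0xFF) << 8) | (data[7] & 0xFF);
        return new String(data, 8, len, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new ResponseContainer());
        System.out.println(String.format("%02x%02x %02x%02x",
                bytes[0], bytes[1], bytes[2], bytes[3])); // aced 0005
        System.out.println(className(bytes));
    }
}
```

A reader that only understands javabin has no way to interpret such a stream, which is consistent with the failure on the very first byte.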

Is this an expected failure mode / version incompatibility, and is it
documented somewhere that I might not have seen? Or is this all
unexpected, meaning we've misconfigured something or are doing something
unusual? Happy to answer any questions if I've missed some important
details.

Appreciate any guidance or insight folks might have here. 
-r

Re: Solr7 / 8 interop

Posted by Richard <so...@okcoder.com>.
Oh wild, this is perfect and, after skimming the 8.5 notes, almost
certainly what's going on. Not sure how I missed it; I thought I'd
searched for "overseer" on the upgrade page.

No telling how long it would've taken me to turn this up otherwise as 
I'd crossed off the official docs as "checked." Super grateful for the 
pointer!

thanks
-r


Re: Solr7 / 8 interop

Posted by Mike Drob <md...@mdrob.com>.
This is likely the same situation described in the upgrade notes:
https://solr.apache.org/guide/8_11/solr-upgrade-notes.html#solr-8-5

Solr 8.5 introduces a change in the format used for the elements in the
Overseer queues and maps

Please let us know if that matches and helps

Mike
