Posted to oak-dev@jackrabbit.apache.org by "jorgeeflorez ." <jo...@gmail.com> on 2020/03/03 12:45:40 UTC

Cluster Info problem

Hi,
I am having some problems starting a node store on a server. I am getting:
DocumentStoreException: Configured cluster node id 123 already in use:
needs recovery and machineId/instanceId do not match:
mac:02421b0c73d3//home/ec2-user != mac:0242a5c0c5e5//home/ec2-user

I read the code of the ClusterNodeInfo class and I see that the algorithm
uses the lowest MAC address found on the machine. The problem is that the
server has a "docker0" interface (I don't know what it is for), and that
interface's MAC address changes every time (while always being the lowest).
For example (after running ifconfig -a):

docker0   Link encap:Ethernet  HWaddr 02:42:C5:C3:C5:E5

eth0      Link encap:Ethernet  HWaddr 06:C5:FB:D5:C2:C0

lo        Link encap:Local Loopback

Does anyone see alternatives other than disabling the lease check or
"uninstalling" docker0?

Thanks.

Jorge.
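
Editor's sketch of the selection described above, assuming (as the message
says) that the lowest MAC address on the machine wins. This is not the
actual ClusterNodeInfo code; the class and method names are hypothetical,
and the interface list is hard-coded from the ifconfig excerpt rather than
read from the operating system.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration only; not part of Oak.
class LowestMacSketch {

    // Turns "02:42:C5:C3:C5:E5" into "0242c5c3c5e5".
    static String normalize(String hwAddr) {
        return hwAddr.replace(":", "").toLowerCase(Locale.ROOT);
    }

    // Returns the lowest normalized MAC address, or null if the map is empty.
    // All values have 12 hex digits, so string order equals numeric order.
    static String lowestMac(Map<String, String> interfaces) {
        TreeSet<String> sorted = new TreeSet<>();
        for (String hwAddr : interfaces.values()) {
            sorted.add(normalize(hwAddr));
        }
        return sorted.isEmpty() ? null : sorted.first();
    }

    public static void main(String[] args) {
        // Interface name -> HWaddr, taken from the excerpt above.
        // "lo" has no hardware address and is therefore left out.
        Map<String, String> interfaces = new LinkedHashMap<>();
        interfaces.put("docker0", "02:42:C5:C3:C5:E5");
        interfaces.put("eth0", "06:C5:FB:D5:C2:C0");

        // docker0 starts with 02, eth0 with 06, so docker0 is selected.
        System.out.println("mac:" + lowestMac(interfaces));
    }
}
```

Because the docker0 address starts with 02 and the eth0 address with 06,
docker0 is always picked, and any change to its address changes the
resulting machine id.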

Re: Cluster Info problem

Posted by Julian Reschke <ju...@gmx.de>.

On 03.03.2020 18:45, jorgeeflorez . wrote:
> For now,
> if I use setLeaseCheckMode(LeaseCheckMode.DISABLED) in the document store,
> I should stop seeing that error, right?
> ...

You should be able to override the algorithm by setting the system property

   org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo.HWADDRESS

(12 hex digits, and make sure they differ for all cluster nodes :-)

Best regards, Julian
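
Editor's sketch of one way to apply the suggestion above from Java code.
Only the property name comes from the message; the class, the helper
methods and the example value are hypothetical. It assumes that the value
is written without separators (the machine id in the error message has
none) and that the property has to be set before the node store is
created. Whether a given Oak version honours the property is not checked
here.

```java
// Hypothetical illustration only; not part of Oak.
class HwAddressOverrideSketch {

    // Property name as quoted in the message above.
    static final String PROP =
            "org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo.HWADDRESS";

    // True if the value consists of exactly 12 hexadecimal digits.
    static boolean isTwelveHexDigits(String value) {
        return value != null && value.matches("[0-9a-fA-F]{12}");
    }

    // Validates the value and stores it as a JVM system property.
    static void pinHardwareAddress(String value) {
        if (!isTwelveHexDigits(value)) {
            throw new IllegalArgumentException(
                    "expected 12 hex digits without separators: " + value);
        }
        System.setProperty(PROP, value);
    }

    public static void main(String[] args) {
        // Example value only; use a different one on every cluster node.
        pinHardwareAddress("06C5FBF8CFC7");
        System.out.println(PROP + "=" + System.getProperty(PROP));
    }
}
```

The same property can be passed on the command line with the standard JVM
option -D<property name>=<value>, which avoids any question of ordering
inside the application.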

Re: Cluster Info problem

Posted by "jorgeeflorez ." <jo...@gmail.com>.

For now,
if I use setLeaseCheckMode(LeaseCheckMode.DISABLED) in the document store,
I should stop seeing that error, right?

Re: Cluster Info problem

Posted by "jorgeeflorez ." <jo...@gmail.com>.

It is done: https://issues.apache.org/jira/browse/OAK-8935

Re: Cluster Info problem

Posted by Julian Reschke <ju...@gmx.de>.

On 03.03.2020 15:48, jorgeeflorez . wrote:
>>
>> First step would be to upgrade to 1.24.0
>>
> We just upgraded from 1.5 last year. I guess I will not find much trouble
> upgrading from 1.12, right?

1.5 was an unstable branch, you shouldn't use that in production anyway.

1.24.0 has replaced 1.12.0 as latest stable release.

>> the output of "ifconfig -a" for that machine would be helpful.
>
> Here is the output, MAC changed and IP addresses hidden/changed.
>
> docker0   Link encap:Ethernet  HWaddr 02:4E:C5:4B:C5:E5
>            inet addr:xxx.xx.x.x  Bcast:xxx.xxx.xxx.xxx  Mask:xxx.xxx.x.x
>            UP BROADCAST MULTICAST  MTU:1500  Metric:1
>            RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>            TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>            collisions:0 txqueuelen:0
>            RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>
> eth0      Link encap:Ethernet  HWaddr 06:C5:FB:F8:CF:C7
>            inet addr:xxx.xx.xx.xx  Bcast:xxx.xxx.xxx.xxx  Mask:xxx.xxx.xxx.xx
>            inet6 addr: fe80::4c8:fafd:fec1:cfc0/64 Scope:Link
>            UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
>            RX packets:2694 errors:0 dropped:0 overruns:0 frame:0
>            TX packets:2721 errors:0 dropped:0 overruns:0 carrier:0
>            collisions:0 txqueuelen:1000
>            RX bytes:306659 (299.4 KiB)  TX bytes:343151 (335.1 KiB)
>
> lo        Link encap:Local Loopback
>            inet addr:127.0.0.1  Mask:255.0.0.0
>            inet6 addr: ::1/128 Scope:Host
>            UP LOOPBACK RUNNING  MTU:65536  Metric:1
>            RX packets:35 errors:0 dropped:0 overruns:0 frame:0
>            TX packets:35 errors:0 dropped:0 overruns:0 carrier:0
>            collisions:0 txqueuelen:1000
>            RX bytes:81477 (79.5 KiB)  TX bytes:81477 (79.5 KiB)
> ...

Ack. So yes, it probably would be good to extend the hack that we
introduced in https://issues.apache.org/jira/browse/OAK-3885 for that
interface.

Can you open a Jira ticket for this?

Best regards, Julian
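
Editor's sketch of what such a special case could look like: interfaces
whose name matches an ignore list are skipped before the lowest MAC address
is chosen. The "docker" prefix, the class and the method names are
assumptions made for illustration; this is not the code from OAK-3885 or
OAK-8935, and the interface data is hard-coded from the output above.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; not part of Oak.
class SkipVirtualInterfacesSketch {

    // Assumed ignore list; a real fix may use different criteria.
    static final List<String> IGNORED_PREFIXES = Arrays.asList("docker");

    // True if the interface name starts with one of the ignored prefixes.
    static boolean isIgnored(String interfaceName) {
        for (String prefix : IGNORED_PREFIXES) {
            if (interfaceName.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Lowest normalized MAC address among the interfaces that are not ignored.
    static Optional<String> lowestStableMac(Map<String, String> interfaces) {
        return interfaces.entrySet().stream()
                .filter(e -> !isIgnored(e.getKey()))
                .map(e -> e.getValue().replace(":", "").toLowerCase(Locale.ROOT))
                .min(Comparator.naturalOrder());
    }

    public static void main(String[] args) {
        // Interface name -> HWaddr, taken from the quoted output.
        Map<String, String> interfaces = new LinkedHashMap<>();
        interfaces.put("docker0", "02:4E:C5:4B:C5:E5");
        interfaces.put("eth0", "06:C5:FB:F8:CF:C7");

        // docker0 is skipped, so the eth0 address is used.
        System.out.println("mac:" + lowestStableMac(interfaces).orElse("none"));
    }
}
```

With docker0 excluded, the eth0 address is selected and the machine id no
longer depends on an address that changes.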

Re: Cluster Info problem

Posted by "jorgeeflorez ." <jo...@gmail.com>.

> First step would be to upgrade to 1.24.0

We just upgraded from 1.5 last year. I guess I will not find much trouble
upgrading from 1.12, right?

> the output of "ifconfig -a" for that machine would be helpful.

Here is the output, MAC changed and IP addresses hidden/changed.

docker0   Link encap:Ethernet  HWaddr 02:4E:C5:4B:C5:E5
          inet addr:xxx.xx.x.x  Bcast:xxx.xxx.xxx.xxx  Mask:xxx.xxx.x.x
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

eth0      Link encap:Ethernet  HWaddr 06:C5:FB:F8:CF:C7
          inet addr:xxx.xx.xx.xx  Bcast:xxx.xxx.xxx.xxx  Mask:xxx.xxx.xxx.xx
          inet6 addr: fe80::4c8:fafd:fec1:cfc0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:9001  Metric:1
          RX packets:2694 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2721 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:306659 (299.4 KiB)  TX bytes:343151 (335.1 KiB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:35 errors:0 dropped:0 overruns:0 frame:0
          TX packets:35 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:81477 (79.5 KiB)  TX bytes:81477 (79.5 KiB)

Re: Cluster Info problem

Posted by Julian Reschke <ju...@gmx.de>.

On 03.03.2020 15:15, jorgeeflorez . wrote:
> Hi Julian, thanks for your reply,
>
> a) I am using version 1.12.0.
> b) As far as I know, Oak is not running in Docker. It is running in the
> JVM installed on the Linux machine, but not inside Docker.
> c) I see... some things may add a "low" mac address to the machine,
> interfering with the cluster info.
>
> Jorge
> ...


First step would be to upgrade to 1.24.0. If that doesn't change things,
the output of "ifconfig -a" for that machine would be helpful.

Best regards, Julian

Re: Cluster Info problem

Posted by "jorgeeflorez ." <jo...@gmail.com>.

Hi Julian, thanks for your reply,

a) I am using version 1.12.0.
b) As far as I know, Oak is not running in Docker. It is running in the JVM
installed on the Linux machine, but not inside Docker.
c) I see... some things may add a "low" mac address to the machine,
interfering with the cluster info.

Jorge

Re: Cluster Info problem

Posted by Julian Reschke <ju...@gmx.de>.

On 03.03.2020 13:45, jorgeeflorez . wrote:
> Hi,
> I am having some problems starting a node store on a server. I am getting:
> DocumentStoreException: Configured cluster node id 123 already in use:
> needs recovery and machineId/instanceId do not match:
> mac:02421b0c73d3//home/ec2-user != mac:0242a5c0c5e5//home/ec2-user
>
> I read the code of the ClusterNodeInfo class and I see that the algorithm
> uses the lowest MAC address found on the machine. The problem is that the
> server has a "docker0" interface (I don't know what it is for), and that
> interface's MAC address changes every time (while always being the lowest).
> For example (after running ifconfig -a):
>
> docker0   Link encap:Ethernet  HWaddr 02:42:C5:C3:C5:E5
>
> eth0      Link encap:Ethernet  HWaddr 06:C5:FB:D5:C2:C0
>
> lo        Link encap:Local Loopback
>
> Does anyone see alternatives other than disabling the lease check or
> "uninstalling" docker0?
>
> Thanks.
>
> Jorge.

a) What Oak version are you on?

b) Is Oak running *inside* docker?

c) We've seen similar issues in the past
(https://issues.apache.org/jira/browse/OAK-3885), but not with Docker.
Maybe we should add yet another special case...

Best regards, Julian