Posted to user@zookeeper.apache.org by Koen De Groote <ko...@limecraft.com> on 2019/07/18 19:24:38 UTC

log files not being cleaned up despite purgeInterval

Greetings,

Working with Zookeeper version 3.4.13 in the official docker image.

I was under the impression that the setting "autopurge.purgeInterval=1"
meant that log files would be cleaned up every hour.
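For reference, the settings in question live in zoo.cfg and look roughly like this (illustrative values, not necessarily my exact config):

```
# zoo.cfg fragment (illustrative):
dataDir=/data
autopurge.snapRetainCount=3   # snapshots/txn logs to keep (minimum is 3)
autopurge.purgeInterval=1     # purge task interval in hours; 0 disables it
```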

Instead, I now find that months of these files are just sitting in their
directory, untouched.

So perhaps I'm wrong about that, but I'm not sure.

What I want is for these log files to stop accumulating, so that only the
most recent are kept. Is there a way to achieve this? Or are they merely
historical and safe to delete freely?

Kind regards,
Koen De Groote

Re: log files not being cleaned up despite purgeInterval

Posted by Koen De Groote <ko...@limecraft.com>.
I'll go ahead and make a ticket on the zookeeper-docker GitHub tracker with
my findings.

Thanks for your time.

Regards,
Koen


Re: log files not being cleaned up despite purgeInterval

Posted by Norbert Kalmar <nk...@cloudera.com.INVALID>.
Hello Koen,

Thanks for looking into this.
Unfortunately, we are not maintaining the docker image, and I'm also pretty
much out of ideas at this point, except that it looks like a permission
setting issue with the docker script to me.
I suggest you file a ticket here:
https://github.com/31z4/zookeeper-docker/issues
Or in the general docker forums. There are multiple issues posted already,
like this one for example:
https://forums.docker.com/t/cannot-get-zookeeper-to-work-running-in-docker-using-swarm-mode/27109

Sorry I couldn't be of any more help, I don't work much with docker :(

Regards,
Norbert


Re: log files not being cleaned up despite purgeInterval

Posted by Koen De Groote <ko...@limecraft.com>.
Hello again, Norbert,

I haven't been able to get it working yet, but did notice something else
concerning the zookeeper user getting that error for the directory not
being found.

After looking into it a bit more, these are my findings:

The zookeeper dockerfile can be found here:
https://github.com/31z4/zookeeper-docker/blob/master/3.4.14/Dockerfile

And the relevant part shows up at the very top:

ENV ZOO_CONF_DIR=/conf \
    ZOO_DATA_DIR=/data \
    ZOO_DATA_LOG_DIR=/datalog \
    ZOO_LOG_DIR=/logs \
    ZOO_TICK_TIME=2000 \
    ZOO_INIT_LIMIT=5 \
    ZOO_SYNC_LIMIT=2 \
    ZOO_AUTOPURGE_PURGEINTERVAL=0 \
    ZOO_AUTOPURGE_SNAPRETAINCOUNT=3 \
    ZOO_MAX_CLIENT_CNXNS=60

This sets these settings as environment variables inside the container.

First thing of note: These environment variables are only available to the
root user. The process does run as the zookeeper user, to which said
environment variables are not available.
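As a quick illustration (generic shell, not the image's actual scripts): a child process started with a scrubbed environment, the way user-switching tools often do, won't see variables exported in root's shell:

```shell
# Export a variable in the current (say, root's) shell...
export ZOO_AUTOPURGE_PURGEINTERVAL=1
# ...then start a child with an emptied environment, as a user switch can do.
env -i /bin/sh -c 'echo "purge interval: ${ZOO_AUTOPURGE_PURGEINTERVAL:-unset}"'
# prints: purge interval: unset
```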

As can be seen from the output of this command, it is indeed the zookeeper
user running the process:

bash-4.4# ps -a | grep zookeeper
    1 zookeepe  0:04 /usr/lib/jvm/java-1.8-openjdk/jre/bin/java
-Dzookeeper.log.dir=/logs -Dzookeeper.root.logger=INFO,CONSOLE -cp
/zookeeper-3.4.13/bin/../build/classes:/zookeeper-3.4.13/bin/../build/lib/*.jar:/zookeeper-3.4.13/bin/../lib/slf4j-log4j12-1.7.25.jar:/zookeeper-3.4.13/bin/../lib/slf4j-api-1.7.25.jar:/zookeeper-3.4.13/bin/../lib/netty-3.10.6.Final.jar:/zookeeper-3.4.13/bin/../lib/log4j-1.2.17.jar:/zookeeper-3.4.13/bin/../lib/jline-0.9.94.jar:/zookeeper-3.4.13/bin/../lib/audience-annotations-0.5.0.jar:/zookeeper-3.4.13/bin/../zookeeper-3.4.13.jar:/zookeeper-3.4.13/bin/../src/java/lib/*.jar:/conf:
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.local.only=false
org.apache.zookeeper.server.quorum.QuorumPeerMain /conf/zoo.cfg

Second thing of note: The third part of the dockerfile installs gosu, but
the user isn't actually switched to the zookeeper user at that point; this
happens later, in docker-entrypoint.sh. Only the gosu install itself is
verified to work at this stage.

Third thing of note: At the end of the dockerfile, this happens:

ENV PATH=$PATH:/$DISTRO_NAME/bin \
   ZOOCFGDIR=$ZOO_CONF_DIR

But again, this environment variable is only available to the root user and
not the zookeeper user.

Then, zkServer.sh is executed to start the process. The thing of note here
is the docker-entrypoint file:
https://github.com/31z4/zookeeper-docker/blob/master/3.4.14/docker-entrypoint.sh

It does in fact change the user to zookeeper, but doesn't carry along any
environment variables with it.

The part where it all goes wrong is when zkCleanup.sh calls zkEnv.sh to get
environment variables. Since the zookeeper user is the one running that
process, it won't actually see what it needs to see.

This part:


if [ "x$ZOOCFGDIR" = "x" ]
then
  if [ -e "${ZOOKEEPER_PREFIX}/conf" ]; then
    ZOOCFGDIR="$ZOOBINDIR/../conf"
  else
    ZOOCFGDIR="$ZOOBINDIR/../etc/zookeeper"
  fi
fi


This does not follow the directory layout we see at the start of the
dockerfile (the environment variables) at all.
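To make the mismatch concrete, here is what zkEnv.sh's fallback computes when ZOOCFGDIR is unset (a sketch mirroring the snippet above, not output captured from the image):

```shell
# With no ZOOCFGDIR exported, the fallback builds a path under the install
# tree rather than the /conf directory the dockerfile configured.
ZOOBINDIR=/zookeeper-3.4.13/bin
ZOOCFGDIR="$ZOOBINDIR/../conf"
echo "$ZOOCFGDIR"
# prints: /zookeeper-3.4.13/bin/../conf  (i.e. /zookeeper-3.4.13/conf, not /conf)
```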

The warning about the folder not being found is fixed if I perform this
first:

export ZOOCFGDIR="/conf"

But the cleanup still doesn't work. The script just finishes with no output
and the files are still there. The user is correct, zookeeper is owner of
the files and owner has write permissions.
There are no extended file attributes on the files either.

So I'm at my wit's end here.

For information: the command that the script generates and runs is this:

java -Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp
'/zookeeper-3.4.13/bin/../build/classes:/zookeeper-3.4.13/bin/../build/lib/*.jar:/zookeeper-3.4.13/bin/../lib/slf4j-log4j12-1.7.25.jar:/zookeeper-3.4.13/bin/../lib/slf4j-api-1.7.25.jar:/zookeeper-3.4.13/bin/../lib/netty-3.10.6.Final.jar:/zookeeper-3.4.13/bin/../lib/log4j-1.2.17.jar:/zookeeper-3.4.13/bin/../lib/jline-0.9.94.jar:/zookeeper-3.4.13/bin/../lib/audience-annotations-0.5.0.jar:/zookeeper-3.4.13/bin/../zookeeper-3.4.13.jar:/zookeeper-3.4.13/bin/../src/java/lib/*.jar:/conf:'
org.apache.zookeeper.server.PurgeTxnLog /data /data -n 3

Changing logger to TRACE offers no output either.
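As a stop-gap until the image is fixed, something along these lines could run from cron (a sketch only: it assumes the image defaults of /conf and /data discussed above, and echoes the command as a dry run):

```shell
# Hypothetical cron wrapper: export the config dir this image actually uses,
# then purge everything but the 3 newest snapshots/txn logs.
run_cleanup() {
    export ZOOCFGDIR=/conf
    # Drop the `echo` to execute for real.
    echo /zookeeper-3.4.13/bin/zkCleanup.sh /data -n 3
}
run_cleanup
# prints: /zookeeper-3.4.13/bin/zkCleanup.sh /data -n 3
```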

Re: log files not being cleaned up despite purgeInterval

Posted by Koen De Groote <ko...@limecraft.com>.
Performing "bash -ex ./zkCleanup.sh /data/version-2 -n 3" as root results
in the creation of another, empty version-2 folder inside the existing
version-2 folder.

As both root and zookeeper user I am able to create files in the
/data/version-2 directory inside the container.

The zookeeper user is indeed not the owner of anything in the zk/bin
folder (/zookeeper-3.4.13/bin). Executing zkCli.sh works, but creating a
file in there doesn't.

Permissions seem to be 0755 on all files and on the folder itself.

Just ran into what I think is the problem: the relative path to the zoo.cfg
file isn't correct.

I tried running just plain "./zkCleanup.sh" as the zookeeper user from
within the folder and it printed that it could not find the zoo.cfg file,
but the path it printed was basically "current_dir/../expected_cfg_dir",
which is one ".." too few.

Will check if this is due to a setting of mine.
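The relative-path arithmetic can be sketched like this (directory names are made up for the demo, not taken from the image):

```shell
# A layout like the distribution's: conf/ sits next to bin/.
mkdir -p /tmp/zkdemo/bin /tmp/zkdemo/conf
# From inside bin/, a single ".." reaches conf/ ...
(cd /tmp/zkdemo/bin && [ -d ../conf ] && echo "found from bin")
# ...but from one level deeper, the same single ".." no longer does.
mkdir -p /tmp/zkdemo/bin/sub
(cd /tmp/zkdemo/bin/sub && [ -d ../conf ] || echo "not found from bin/sub")
```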
On Fri, Jul 19, 2019 at 2:23 PM Norbert Kalmar <nk...@cloudera.com.invalid>
wrote:

> I would first check the permission on zkCleanup.sh and the bin folder.
> Sounds like zookeeper user has no access to the /zk/bin directory.
> That might also explain why they are not getting deleted by the zk instance.
>
> And I'm not sure in this one, but did you try giving the full path to the
> txn log files like bash -ex ./zkCleanup.sh /data/version-2 -n 3 as root?
> I think this script might be expecting the full path, including the
> version-2 directory.
>
> Regards,
> Norbert
>
> On Fri, Jul 19, 2019 at 2:00 PM Koen De Groote <
> koen.degroote@limecraft.com>
> wrote:
>
> > Hello Norbert,
> >
> > I've set up a new environment which then reached at least 4 *.log files.
> > All snapshots and log files are kept in /data/version-2/ (the default
> > for the image).
> >
> > I went into the zookeeper container and executed:
> >
> > bash -ex ./zkCleanup.sh /data -n 3
> >
> > As root, this changes nothing. There are still 4 *.log files.
> >
> > Changing to the zookeeper user, I get the following output:
> >
> > Path '/zookeeper-3.4.13/bin' does not exist.
> > Usage:
> > PurgeTxnLog dataLogDir [snapDir] -n count
> > dataLogDir -- path to the txn log directory
> > snapDir -- path to the snapshot directory
> > count -- the number of old snaps/logs you want to keep, value should be
> > greater than or equal to 3
> >
> > And the 4 *.log files still exist.
> > It also prints the usage, which suggests to me that something about the
> > input is wrong, even though the command is identical to the one run as
> > root, which did not produce this output.
> >
> > No actual error messages seem to be printed or logged anywhere.
> >
> > Not sure what to do next.
> >
> >
> >
> > On Fri, Jul 19, 2019 at 11:01 AM Norbert Kalmar
> > <nk...@cloudera.com.invalid> wrote:
> >
> > > Hi Koen,
> > >
> > > It should do just as you said. You can also set
> > > autopurge.snapRetainCount; by default it is set to 3, so if you didn't
> > > set anything, that is not the reason old logs are kept.
> > >
> > > As a plan B you could use zkCleanup.sh [snapshotDir] -n 3 to delete all
> > > except the last 3 log files. You can add this to a cron job.
> > >
> > > As for why the old log files are not getting deleted, it could be
> > > something related to the docker image, maybe a permission problem? Do
> > > you see any errors in the server log?
> > >
> > > Regards,
> > > Norbert
> > >
> > > On Thu, Jul 18, 2019 at 9:25 PM Koen De Groote <
> > > koen.degroote@limecraft.com>
> > > wrote:
> > >
> > > > Greetings,
> > > >
> > > > Working with Zookeeper version 3.4.13 in the official docker image.
> > > >
> > > > I was under the impression that the setting
> "autopurge.purgeInterval=1"
> > > > meant that log files would be cleaned up every hour.
> > > >
> > > > Instead, I now find that months of these files are just sitting in
> > their
> > > > directory, untouched.
> > > >
> > > > So perhaps I'm wrong about that, but I'm not sure.
> > > >
> > > > What I wish to achieve is that these log files stop accumulating and
> > keep
> > > > only the most recent. Is there a way to achieve this? Or are they
> > merely
> > > > historical and can they be deleted freely?
> > > >
> > > > Kind regards,
> > > > Koen De Groote
> > > >
> > >
> >
>

Re: log files not being cleaned up despite purgeInterval

Posted by Norbert Kalmar <nk...@cloudera.com.INVALID>.
I would first check the permissions on zkCleanup.sh and the bin folder.
It sounds like the zookeeper user has no access to the /zk/bin directory.
That might also explain why it is not getting deleted by the zk instance.

And I'm not sure about this one, but did you try giving the full path to
the txn log files, like bash -ex ./zkCleanup.sh /data/version-2 -n 3, as
root? I think this script might be expecting the full path, including the
version-2 directory.

Regards,
Norbert

On Fri, Jul 19, 2019 at 2:00 PM Koen De Groote <ko...@limecraft.com>
wrote:

> Hello Norbert,
>
> I've set up a new environment which then reached at least 4 *.log files.
> All snapshots and log files are kept in /data/version-2/ (the default for
> the image).
>
> I went into the zookeeper container and executed:
>
> bash -ex ./zkCleanup.sh /data -n 3
>
> As root, this changes nothing. There are still 4 *.log files.
>
> Changing to the zookeeper user, I get the following output:
>
> Path '/zookeeper-3.4.13/bin' does not exist.
> Usage:
> PurgeTxnLog dataLogDir [snapDir] -n count
> dataLogDir -- path to the txn log directory
> snapDir -- path to the snapshot directory
> count -- the number of old snaps/logs you want to keep, value should be
> greater than or equal to 3
>
> And the 4 *.log files still exist.
> It also prints the usage, which suggests to me that something about the
> input is wrong, even though the command is identical to the one run as
> root, which did not produce this output.
>
> No actual error messages seem to be printed or logged anywhere.
>
> Not sure what to do next.
>
>
>
> On Fri, Jul 19, 2019 at 11:01 AM Norbert Kalmar
> <nk...@cloudera.com.invalid> wrote:
>
> > Hi Koen,
> >
> > It should do just as you said. You can also set
> > autopurge.snapRetainCount; by default it is set to 3, so if you didn't
> > set anything, that is not the reason old logs are kept.
> >
> > As a plan B you could use zkCleanup.sh [snapshotDir] -n 3 to delete all
> > except the last 3 log files. You can add this to a cron job.
> >
> > As for why the old log files are not getting deleted, it could be
> > something related to the docker image, maybe a permission problem? Do
> > you see any errors in the server log?
> >
> > Regards,
> > Norbert
> >
> > On Thu, Jul 18, 2019 at 9:25 PM Koen De Groote <
> > koen.degroote@limecraft.com>
> > wrote:
> >
> > > Greetings,
> > >
> > > Working with Zookeeper version 3.4.13 in the official docker image.
> > >
> > > I was under the impression that the setting "autopurge.purgeInterval=1"
> > > meant that log files would be cleaned up every hour.
> > >
> > > Instead, I now find that months of these files are just sitting in
> their
> > > directory, untouched.
> > >
> > > So perhaps I'm wrong about that, but I'm not sure.
> > >
> > > What I wish to achieve is that these log files stop accumulating and
> keep
> > > only the most recent. Is there a way to achieve this? Or are they
> merely
> > > historical and can they be deleted freely?
> > >
> > > Kind regards,
> > > Koen De Groote
> > >
> >
>

Re: log files not being cleaned up despite purgeInterval

Posted by Koen De Groote <ko...@limecraft.com>.
Hello Norbert,

I've set up a new environment which then reached at least 4 *.log files.
All snapshots and log files are kept in /data/version-2/ (the default for
the image).

I went into the zookeeper container and executed:

bash -ex ./zkCleanup.sh /data -n 3

As root, this changes nothing. There are still 4 *.log files.

Changing to the zookeeper user, I get the following output:

Path '/zookeeper-3.4.13/bin' does not exist.
Usage:
PurgeTxnLog dataLogDir [snapDir] -n count
dataLogDir -- path to the txn log directory
snapDir -- path to the snapshot directory
count -- the number of old snaps/logs you want to keep, value should be
greater than or equal to 3

And the 4 *.log files still exist.
It also prints the usage, which suggests to me that something about the
input is wrong, even though the command is identical to the one run as
root, which did not produce this output.

No actual error messages seem to be printed or logged anywhere.

Not sure what to do next.



On Fri, Jul 19, 2019 at 11:01 AM Norbert Kalmar
<nk...@cloudera.com.invalid> wrote:

> Hi Koen,
>
> It should do just as you said. You can also set autopurge.snapRetainCount;
> by default it is set to 3, so if you didn't set anything, that is not the
> reason old logs are kept.
>
> As a plan B you could use zkCleanup.sh [snapshotDir] -n 3 to delete all
> except the last 3 log files. You can add this to a cron job.
>
> As for why the old log files are not getting deleted, it could be
> something related to the docker image, maybe a permission problem? Do you
> see any errors in the server log?
>
> Regards,
> Norbert
>
> On Thu, Jul 18, 2019 at 9:25 PM Koen De Groote <
> koen.degroote@limecraft.com>
> wrote:
>
> > Greetings,
> >
> > Working with Zookeeper version 3.4.13 in the official docker image.
> >
> > I was under the impression that the setting "autopurge.purgeInterval=1"
> > meant that log files would be cleaned up every hour.
> >
> > Instead, I now find that months of these files are just sitting in their
> > directory, untouched.
> >
> > So perhaps I'm wrong about that, but I'm not sure.
> >
> > What I wish to achieve is that these log files stop accumulating and keep
> > only the most recent. Is there a way to achieve this? Or are they merely
> > historical and can they be deleted freely?
> >
> > Kind regards,
> > Koen De Groote
> >
>

Re: log files not being cleaned up despite purgeInterval

Posted by Norbert Kalmar <nk...@cloudera.com.INVALID>.
Hi Koen,

It should do just as you said. You can also set autopurge.snapRetainCount;
by default it is set to 3, so if you didn't set anything, that is not the
reason old logs are kept.
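For reference, the relevant zoo.cfg settings look like this (the values
shown are illustrative; 3 is the default retain count):

```properties
# Run the purge task every 1 hour (0, the default, disables autopurge)
autopurge.purgeInterval=1
# Keep the 3 most recent snapshots and their corresponding txn logs
autopurge.snapRetainCount=3
```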

As a plan B you could use zkCleanup.sh [snapshotDir] -n 3 to delete all
except the last 3 log files. You can add this to a cron job.
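A sketch of the cron approach; the install path, schedule, and log file here
are assumptions to adapt to your setup:

```shell
# crontab entry: run the cleanup daily at 03:00, keeping the 3 newest
# snapshots/logs under the data dir (pass the data dir, not version-2)
0 3 * * * /zookeeper-3.4.13/bin/zkCleanup.sh /data -n 3 >> /var/log/zk-cleanup.log 2>&1
```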

As for why the old log files are not getting deleted, it could be
something related to the docker image, maybe a permission problem? Do you
see any errors in the server log?

Regards,
Norbert

On Thu, Jul 18, 2019 at 9:25 PM Koen De Groote <ko...@limecraft.com>
wrote:

> Greetings,
>
> Working with Zookeeper version 3.4.13 in the official docker image.
>
> I was under the impression that the setting "autopurge.purgeInterval=1"
> meant that log files would be cleaned up every hour.
>
> Instead, I now find that months of these files are just sitting in their
> directory, untouched.
>
> So perhaps I'm wrong about that, but I'm not sure.
>
> What I wish to achieve is that these log files stop accumulating and keep
> only the most recent. Is there a way to achieve this? Or are they merely
> historical and can they be deleted freely?
>
> Kind regards,
> Koen De Groote
>