Posted to dev@ignite.apache.org by Denis Magda <dm...@apache.org> on 2020/11/05 22:58:13 UTC

Why WAL archives enabled by default?

Folks,

In my understanding, you need the archives only for features such as PITR.
Considering that the PITR functionality is not provided in Ignite, why do
we have the archives enabled by default?

How about having this feature disabled by default to prevent the following
issues experienced by our users:
http://apache-ignite-users.70518.x6.nabble.com/WAL-and-WAL-Archive-volume-size-recommendation-td34458.html

-
Denis

Re: Why WAL archives enabled by default?

Posted by Ivan Daschinsky <iv...@gmail.com>.

Dmitriy, as far as I understand, Denis advised the user to "disable"
archiving by setting the WAL path and the WAL archive path to the same value.
I tried to explain that this measure does not prevent WAL segments needed
for recovery from being stored, but it does bring an additional performance
penalty.

> In older versions of Apache Ignite, WAL archive could contain valid records
> needed for recovery. If something was changed since then, my comment may be
> not valid.

Nothing changed since that time.


On Wed, Nov 11, 2020, 03:31 Dmitriy Pavlov <dp...@apache.org> wrote:

> In older versions of Apache Ignite, WAL archive could contain valid records
> needed for recovery. If something was changed since then, my comment may be
> not valid.
>
> We've discussed that before, that naming this directory as 'archive' was
> not the best possible option. The archive is often considered by users as
> something not needed and sometimes it was deleted.
>
> See also page related to internals and directory structure:
>
> https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood#IgnitePersistentStoreunderthehood-WALstructure
>
> So infinite storage of archive is definitely not necessary for vanilla
> open-source version, but archive itself is needed.
>
> Sincerely,
> Dmitriy Pavlov
>
> Wed, Nov 11, 2020 at 01:21, Raymond Wilson <ra...@trimble.com>:
>
> > Isn't the discussion here related to the WAL archive? If you disable that
> > don't you still have the WAL containing un-checkpointed changes?
> >
> > On Wed, Nov 11, 2020 at 11:01 AM Dmitriy Pavlov <dp...@apache.org>
> > wrote:
> >
> > > Hi Denis,
> > >
> > > the short answer here, Apache Ignite guarantees ACID, and for D-Durability
> > > it is required to save all changes in some WAL/Redo Log to have a safe way
> > > to recover from any hardware failures/disk outage.
> > >
> > > Should the user disable WAL, he/she could potentially lose durability.
> > >
> > > Sincerely,
> > > Dmitriy Pavlov
> > >
> > > Tue, Nov 10, 2020 at 09:57, ткаленко кирилл <tk...@yandex.ru>:
> > >
> > > > Hello guys again!
> > > >
> > > > Does anyone know why we are doing any calculation in
> > > > IgniteUtils#adjustedWalHistorySize at all?
> > > > Wouldn't it be easier to always take
> > > > DataStorageConfiguration#maxWalArchiveSize? It seems that the user can
> > > > easily do this himself by changing the value by 1 byte.
> > > >
> > > > 06.11.2020, 13:56, "Ivan Daschinsky" <iv...@gmail.com>:
> > > > > Alex, thanks for pointing that out. Shame that I missed it.
> > > > >
> > > > > Fri, Nov 6, 2020 at 13:45, Alex Plehanov <plehanov.alex@gmail.com>:
> > > > >
> > > > >>  Guys,
> > > > >>
> > > > >>  We already have FileWriteAheadLogManager#maxSegCountWithoutCheckpoint.
> > > > >>  A checkpoint is triggered if there are too many WAL segments without a
> > > > >>  checkpoint.
> > > > >>  Looks like you are talking about this feature.
> > > > >>
> > > > >>  Fri, Nov 6, 2020 at 13:21, Ivan Daschinsky <ivandasch@gmail.com>:
> > > > >>
> > > > >>  > Kirill and I privately discussed the proposed approach. As far as I
> > > > >>  > understand, Kirill suggests implementing some heuristic to force a
> > > > >>  > checkpoint in some cases, if the user has misconfigured the cluster by
> > > > >>  > mistake, in order to preserve the requested size of the WAL archive.
> > > > >>  > As for me, this approach is currently questionable, because it can
> > > > >>  > cause some performance problems. But as an option, it can be used and
> > > > >>  > should be switchable.
> > > > >>  >
> > > > >>  > Fri, Nov 6, 2020 at 12:36, Ivan Daschinsky <ivandasch@gmail.com>:
> > > > >>  >
> > > > >>  > > Kirill, how will your approach help if the user has tuned a cluster
> > > > >>  > > to do checkpoints rarely under load?
> > > > >>  > > No way.
> > > > >>  > >
> > > > >>  > > Fri, Nov 6, 2020 at 12:19, ткаленко кирилл <tkalkirill@yandex.ru>:
> > > > >>  > >
> > > > >>  > >> Ivan, I agree with you that the archive is primarily about
> > > > >>  > >> optimization.
> > > > >>  > >>
> > > > >>  > >> If the size of the archive is critical for the user, we have no
> > > > >>  > >> protection against this, we can always go beyond this limit.
> > > > >>  > >> Thus, the user needs to remember this and configure it in some way.
> > > > >>  > >>
> > > > >>  > >> I suggest not exceeding this limit and giving the user the expected
> > > > >>  > >> behavior. At the same time, the segments needed for recovery will
> > > > >>  > >> remain and there will be no data loss.
> > > > >>  > >>
> > > > >>  > >> 06.11.2020, 11:29, "Ivan Daschinsky" <iv...@gmail.com>:
> > > > >>  > >> > Guys, first of all, archiving is not for PITR at all, it is an
> > > > >>  > >> > optimization.
> > > > >>  > >> > If we disable archiving, we need to create a new file on every
> > > > >>  > >> > rollover. If we enable archiving, we reserve 10 (by default)
> > > > >>  > >> > segments filled with zeroes.
> > > > >>  > >> > We use mmap by default, so if we use the no-archiver approach:
> > > > >>  > >> > 1. We first create a new empty file.
> > > > >>  > >> > 2. We call sun.nio.ch.FileChannelImpl#map on it, which under the hood:
> > > > >>  > >> >    a. If the file is shorter than the WAL segment size, calls
> > > > >>  > >> >       sun.nio.ch.FileDispatcherImpl#truncate0, which under the hood
> > > > >>  > >> >       is just the truncate system call [1].
> > > > >>  > >> >    b. Then calls the mmap system call on this file via
> > > > >>  > >> >       sun.nio.ch.FileChannelImpl#map0, see [2].
> > > > >>  > >> > These manipulations are neither free nor cheap, so rollover will be
> > > > >>  > >> > much, much slower.
> > > > >>  > >> > If archiving is enabled, 10 segments are already preallocated at the
> > > > >>  > >> > moment the node starts.
> > > > >>  > >> >
> > > > >>  > >> > When archiving is enabled, the archiver just copies the previous
> > > > >>  > >> > preallocated segment and moves it to the archive directory.
> > > > >>  > >> > This archived segment is crucial for recovery. When new checkpoints
> > > > >>  > >> > finish, all segments eligible for truncation are just removed.
> > > > >>  > >> >
> > > > >>  > >> > If archiving is disabled, we also write WAL segments to the wal
> > > > >>  > >> > directory, and disabling archiving does not prevent segments from
> > > > >>  > >> > being stored if they are required for recovery.
> > > > >>  > >> >
> > > > >>  > >> >>> Before increasing the size of WAL archive (transferring to archive
> > > > >>  > >> >>> /rollOver, compression, decompression), we can make sure that there
> > > > >>  > >> >>> will be enough space in the archive and if there is no such, then we
> > > > >>  > >> >>> will try to clean it. We cannot delete those segments that are
> > > > >>  > >> >>> required for recovery (between the last two checkpoints) and
> > > > >>  > >> >>> reserved for example for historical rebalancing.
> > > > >>  > >> >
> > > > >>  > >> > First of all, compression/decompression is off-topic here.
> > > > >>  > >> > Secondly, WAL segments are required only if their idx is higher than
> > > > >>  > >> > the LAST checkpoint marker.
> > > > >>  > >> > Thirdly, archiving and rolling over can happen during a checkpoint,
> > > > >>  > >> > and we can accidentally break everything.
> > > > >>  > >> > Fourthly, I see no benefit in overcomplicating already complicated
> > > > >>  > >> > logic.
> > > > >>  > >> > This is basically a problem of misunderstanding and tuning.
> > > > >>  > >> > There are a lot of similar topics for almost every DB. [3]
> > > > >>  > >> >
> > > > >>  > >> > [1] -- https://man7.org/linux/man-pages/man2/ftruncate.2.html
> > > > >>  > >> > [2] -- https://man7.org/linux/man-pages/man2/mmap.2.html
> > > > >>  > >> > [3] -- https://www.google.com/search?q=pg_wal%2Fxlogtemp+no+space+left+on+device&oq=pg+wal+no
> > > > >>  > >> >
> > > > >>  > >> > Fri, Nov 6, 2020 at 10:42, ткаленко кирилл <tkalkirill@yandex.ru>:
> > > > >>  > >> >
> > > > >>  > >> >> Hi, Ivan!
> > > > >>  > >> >>
> > > > >>  > >> >> I have only described ideas. But here are a few more details.
> > > > >>  > >> >>
> > > > >>  > >> >> We can take care not to go beyond
> > > > >>  > >> >> DataStorageConfiguration#maxWalArchiveSize.
> > > > >>  > >> >>
> > > > >>  > >> >> Before increasing the size of WAL archive (transferring to archive
> > > > >>  > >> >> /rollOver, compression, decompression), we can make sure that there
> > > > >>  > >> >> will be enough space in the archive and if there is no such, then we
> > > > >>  > >> >> will try to clean it. We cannot delete those segments that are
> > > > >>  > >> >> required for recovery (between the last two checkpoints) and
> > > > >>  > >> >> reserved for example for historical rebalancing.
> > > > >>  > >> >>
> > > > >>  > >> >> We can receive a notification about the change of checkpoints and
> > > > >>  > >> >> the reservation / release of segments, thus we can know how many
> > > > >>  > >> >> segments we can delete right now.
> > > > >>  > >> >>
> > > > >>  > >> >> 06.11.2020, 09:53, "Ivan Daschinsky" <ivandasch@gmail.com>:
> > > > >>  > >> >> >>> For example, when trying to move a segment to the archive.
> > > > >>  > >> >> >
> > > > >>  > >> >> > We cannot do this, we will lose data. We can truncate an archived
> > > > >>  > >> >> > segment if and only if it is not required for recovery. If the last
> > > > >>  > >> >> > checkpoint marker points to a segment with a lower index, we cannot
> > > > >>  > >> >> > delete any segment with a higher index. So the only moment when we
> > > > >>  > >> >> > can remove or truncate segments is the finish of a checkpoint.
> > > > >>  > >> >> >
> > > > >>  > >> >> > Fri, Nov 6, 2020 at 09:46, ткаленко кирилл <tkalkirill@yandex.ru>:
> > > > >>  > >> >> >
> > > > >>  > >> >> >> Hello, everybody!
> > > > >>  > >> >> >>
> > > > >>  > >> >> >> As far as I know, the WAL archive is used for PITR (a GridGain
> > > > >>  > >> >> >> feature) and historical rebalancing.
> > > > >>  > >> >> >>
> > > > >>  > >> >> >> Facundo seems to have a problem with running out of space in the
> > > > >>  > >> >> >> directory (/opt/work/walarchive).
> > > > >>  > >> >> >> Currently, the WAL archive is cleared at the end of a checkpoint.
> > > > >>  > >> >> >> Potentially, a long transaction may prevent a checkpoint from
> > > > >>  > >> >> >> starting, so the WAL archive is not cleaned, which will lead to
> > > > >>  > >> >> >> such an error.
> > > > >>  > >> >> >> At the moment, the workaround I see is to increase the size of the
> > > > >>  > >> >> >> directory (/opt/work/walarchive) in k8s and to avoid long
> > > > >>  > >> >> >> transactions or anything similar that modifies data and runs for a
> > > > >>  > >> >> >> long time.
> > > > >>  > >> >> >>
> > > > >>  > >> >> >> And it is best to fix the logic of working with the WAL archive. I
> > > > >>  > >> >> >> think we should remove WAL archive cleanup from the end of the
> > > > >>  > >> >> >> checkpoint and do it on demand. For example, when trying to move a
> > > > >>  > >> >> >> segment to the archive.
> > > > >>  > >> >> >>
> > > > >>  > >> >> >> 06.11.2020, 01:58, "Denis Magda" <dm...@apache.org>:
> > > > >>  > >> >> >> > Folks,
> > > > >>  > >> >> >> >
> > > > >>  > >> >> >> > In my understanding, you need the archives only for features such
> > > > >>  > >> >> >> > as PITR. Considering, that the PITR functionality is not provided
> > > > >>  > >> >> >> > in Ignite why do we have the archives enabled by default?
> > > > >>  > >> >> >> >
> > > > >>  > >> >> >> > How about having this feature disabled by default to prevent the
> > > > >>  > >> >> >> > following issues experienced by our users:
> > > > >>  > >> >> >> > http://apache-ignite-users.70518.x6.nabble.com/WAL-and-WAL-Archive-volume-size-recommendation-td34458.html
> > > > >>  > >> >> >> >
> > > > >>  > >> >> >> > -
> > > > >>  > >> >> >> > Denis
> > > > >>  > >> >> >
> > > > >>  > >> >> > --
> > > > >>  > >> >> > Sincerely yours, Ivan Daschinskiy
> > > > >>  > >> >
> > > > >>  > >> > --
> > > > >>  > >> > Sincerely yours, Ivan Daschinskiy
> > > > >>  > >>
> > > > >>  > >
> > > > >>  > >
> > > > >>  > > --
> > > > >>  > > Sincerely yours, Ivan Daschinskiy
> > > > >>  > >
> > > > >>  >
> > > > >>  >
> > > > >>  > --
> > > > >>  > Sincerely yours, Ivan Daschinskiy
> > > > >>  >
> > > > >
> > > > > --
> > > > > Sincerely yours, Ivan Daschinskiy
> > > >
> > >
> >
> >
> > --
> > <http://www.trimble.com/>
> > Raymond Wilson
> > Solution Architect, Civil Construction Software Systems (CCSS)
> > 11 Birmingham Drive | Christchurch, New Zealand
> > +64-21-2013317 Mobile
> > raymond_wilson@trimble.com
> >
> > <
> >
> https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch
> > >
> >
>
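
The rollover steps described in the quoted message from Ivan (create an empty file, let the map call grow it, then mmap it) can be tried outside Ignite with the plain JDK. The sketch below is an illustration under assumptions, not the Ignite implementation: the class name, file names, and the 1 MiB segment size are hypothetical, and it relies on the JDK behavior described in the thread, where a READ_WRITE mapping of a shorter file extends the file first.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Stand-alone illustration of the two rollover paths; hypothetical names, not Ignite code. */
public class WalRolloverSketch {
    /** Hypothetical segment size, kept small for the demo. */
    public static final int SEGMENT_SIZE = 1 << 20;

    /** No-archiver path: create an empty file and map it, so the map call has to grow the file. */
    public static long createAndMap(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE_NEW,
            StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, SEGMENT_SIZE);
            buf.put(0, (byte)1); // touch the first page of the new mapping
            return ch.size();
        }
    }

    /** Preallocation path: write a zero-filled segment ahead of time, before it is needed. */
    public static long preallocate(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE_NEW,
            StandardOpenOption.WRITE)) {
            ByteBuffer zeroes = ByteBuffer.allocate(SEGMENT_SIZE);
            while (zeroes.hasRemaining())
                ch.write(zeroes);
            return ch.size();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("wal-sketch");
        System.out.println("mapped on demand: " + createAndMap(dir.resolve("lazy.wal")) + " bytes");
        System.out.println("preallocated:     " + preallocate(dir.resolve("prealloc.wal")) + " bytes");
    }
}
```

Both paths end with a file of the full segment size. The difference argued in the thread is when the cost is paid: at every rollover in the first case, once at node start in the second. The sketch does not measure that cost.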

Re: Why WAL archives enabled by default?

Posted by Dmitriy Pavlov <dp...@apache.org>.

In older versions of Apache Ignite, WAL archive could contain valid records
needed for recovery. If something was changed since then, my comment may be
not valid.

We've discussed that before, that naming this directory as 'archive' was
not the best possible option. The archive is often considered by users as
something not needed and sometimes it was deleted.

See also page related to internals and directory structure:
https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Persistent+Store+-+under+the+hood#IgnitePersistentStoreunderthehood-WALstructure

So infinite storage of archive is definitely not necessary for vanilla
open-source version, but archive itself is needed.

Sincerely,
Dmitriy Pavlov
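
A toy model may help to show why a size limit on the archive and "the archive itself is needed" can both hold, and why the limit can still be exceeded, as noted elsewhere in this thread. This is a sketch with hypothetical names and sizes, not Ignite code: it assumes cleanup drops the oldest segments first and never drops a segment that is still needed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a size-limited archive; hypothetical names, not Ignite code. */
public class ArchiveLimitSketch {
    /** One archived segment: its index and its size in bytes. */
    public static final class Segment {
        final long idx;
        final long bytes;

        public Segment(long idx, long bytes) {
            this.idx = idx;
            this.bytes = bytes;
        }
    }

    /**
     * Drops the oldest segments until the archive fits into maxBytes, but never drops
     * a segment whose index is greater than or equal to firstNeededIdx.
     *
     * @return total size of the archive after cleanup; may still exceed maxBytes
     */
    public static long cleanup(Deque<Segment> archive, long maxBytes, long firstNeededIdx) {
        long total = 0;
        for (Segment s : archive)
            total += s.bytes;

        while (total > maxBytes && !archive.isEmpty() && archive.peekFirst().idx < firstNeededIdx)
            total -= archive.pollFirst().bytes;

        return total;
    }

    public static void main(String[] args) {
        Deque<Segment> archive = new ArrayDeque<>();
        for (long i = 0; i < 6; i++)
            archive.addLast(new Segment(i, 64));

        // Limit is 128 bytes, but segments 2..5 are still needed, so 256 bytes remain.
        System.out.println(cleanup(archive, 128, 2)); // prints 256
    }
}
```

In this model the archive shrinks back under the limit only after a later checkpoint moves firstNeededIdx forward.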

Re: Why WAL archives enabled by default?

Posted by Raymond Wilson <ra...@trimble.com>.

Isn't the discussion here related to the WAL archive? If you disable that
don't you still have the WAL containing un-checkpointed changes?

> > >>  > >> >
> > >>  > >> > пт, 6 нояб. 2020 г. в 10:42, ткаленко кирилл <
> > tkalkirill@yandex.ru
> > >>  >:
> > >>  > >> >
> > >>  > >> >> Hi, Ivan!
> > >>  > >> >>
> > >>  > >> >> I have only described ideas. But here are a few more details.
> > >>  > >> >>
> > >>  > >> >> We can take care not to go beyond
> > >>  > >> >> DataStorageConfiguration#maxWalArchiveSize.
> > >>  > >> >>
> > >>  > >> >> Before increasing the size of WAL archive (transferring to
> > archive
> > >>  > >> >> /rollOver, compression, decompression), we can make sure that
> > >>  there
> > >>  > >> will be
> > >>  > >> >> enough space in the archive and if there is no such, then we
> > will
> > >>  > try
> > >>  > >> to
> > >>  > >> >> clean it. We cannot delete those segments that are required
> for
> > >>  > >> recovery
> > >>  > >> >> (between the last two checkpoints) and reserved for example
> for
> > >>  > >> historical
> > >>  > >> >> rebalancing.
> > >>  > >> >>
> > >>  > >> >> We can receive a notification about the change of checkpoints
> > and
> > >>  > the
> > >>  > >> >> reservation / release of segments, thus we can know how many
> > >>  > segments
> > >>  > >> we
> > >>  > >> >> can delete right now.
> > >>  > >> >>
> > >>  > >> >> 06.11.2020, 09:53, "Ivan Daschinsky" <iv...@gmail.com>:
> > >>  > >> >> >>> For example, when trying to move a segment to the
> archive.
> > >>  > >> >> >
> > >>  > >> >> > We cannot do this, we will lost data. We can truncate
> > archived
> > >>  > >> segment if
> > >>  > >> >> > and only if it is not required for recovery. If last
> > checkpoint
> > >>  > >> marker
> > >>  > >> >> > points to segment
> > >>  > >> >> > with lower index, we cannot delete any segment with higher
> > >>  index.
> > >>  > >> So the
> > >>  > >> >> > only moment where we can remove truncate segments is a
> > finish of
> > >>  > >> >> checkpoint.
> > >>  > >> >> >
> > >>  > >> >> > пт, 6 нояб. 2020 г. в 09:46, ткаленко кирилл <
> > >>  > tkalkirill@yandex.ru
> > >>  > >> >:
> > >>  > >> >> >
> > >>  > >> >> >> Hello, everybody!
> > >>  > >> >> >>
> > >>  > >> >> >> As far as I know, WAL archive is used for PITP(GridGain
> > >>  feature)
> > >>  > >> and
> > >>  > >> >> >> historical rebalancing.
> > >>  > >> >> >>
> > >>  > >> >> >> Facundo seems to have a problem with running out of
> > directory
> > >>  > >> >> >> (/opt/work/walarchive) space.
> > >>  > >> >> >> Currently, WAL archive is cleared at the end of
> checkpoint.
> > >>  > >> Potentially
> > >>  > >> >> >> long transaction may prevent checkpoint starting, thereby
> > not
> > >>  > >> cleaning
> > >>  > >> >> WAL
> > >>  > >> >> >> archive, which will lead to such an error.
> > >>  > >> >> >> At the moment, I see such a WA to increase size of
> directory
> > >>  > >> >> >> (/opt/work/walarchive) in k8s and avoid long transactions
> or
> > >>  > >> something
> > >>  > >> >> like
> > >>  > >> >> >> that that modifies data and runs for a long time.
> > >>  > >> >> >>
> > >>  > >> >> >> And it is best to fix the logic of working with WAL
> > archive. I
> > >>  > >> think we
> > >>  > >> >> >> should remove WAL archive cleanup from the end of the
> > >>  checkpoint
> > >>  > >> and
> > >>  > >> >> do it
> > >>  > >> >> >> on demand. For example, when trying to move a segment to
> the
> > >>  > >> archive.
> > >>  > >> >> >>
> > >>  > >> >> >> 06.11.2020, 01:58, "Denis Magda" <dm...@apache.org>:
> > >>  > >> >> >> > Folks,
> > >>  > >> >> >> >
> > >>  > >> >> >> > In my understanding, you need the archives only for
> > features
> > >>  > >> such as
> > >>  > >> >> >> PITR.
> > >>  > >> >> >> > Considering, that the PITR functionality is not provided
> > in
> > >>  > >> Ignite
> > >>  > >> >> why do
> > >>  > >> >> >> > we have the archives enabled by default?
> > >>  > >> >> >> >
> > >>  > >> >> >> > How about having this feature disabled by default to
> > prevent
> > >>  > the
> > >>  > >> >> >> following
> > >>  > >> >> >> > issues experienced by our users:
> > >>  > >> >> >> >
> > >>  > >> >> >>
> > >>  > >> >>
> > >>  > >>
> > >>  >
> > >>
> >
> http://apache-ignite-users.70518.x6.nabble.com/WAL-and-WAL-Archive-volume-size-recommendation-td34458.html
> > >>  > >> >> >> >
> > >>  > >> >> >> > -
> > >>  > >> >> >> > Denis
> > >>  > >> >> >
> > >>  > >> >> > --
> > >>  > >> >> > Sincerely yours, Ivan Daschinskiy
> > >>  > >> >
> > >>  > >> > --
> > >>  > >> > Sincerely yours, Ivan Daschinskiy
> > >>  > >>
> > >>  > >
> > >>  > >
> > >>  > > --
> > >>  > > Sincerely yours, Ivan Daschinskiy
> > >>  > >
> > >>  >
> > >>  >
> > >>  > --
> > >>  > Sincerely yours, Ivan Daschinskiy
> > >>  >
> > >
> > > --
> > > Sincerely yours, Ivan Daschinskiy
> >
>


-- 
Raymond Wilson
Solution Architect, Civil Construction Software Systems (CCSS)
11 Birmingham Drive | Christchurch, New Zealand
+64-21-2013317 Mobile
raymond_wilson@trimble.com

Re: Why WAL archives enabled by default?

Posted by Dmitriy Pavlov <dp...@apache.org>.

Hi Denis,

The short answer: Apache Ignite guarantees ACID, and for the D (durability)
all changes must be saved in a WAL/redo log so that there is a safe way to
recover from hardware failures or disk outages.

If the user disables the WAL, they could lose durability.

Sincerely,
Dmitriy Pavlov
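
To make the durability point concrete, here is a minimal sketch of the
write-ahead idea in plain Java. It is not Ignite's WAL implementation: the
class ToyWal, its record format (a 4-byte length followed by UTF-8 bytes)
and the method names are assumptions chosen for illustration. A change is
appended and forced to disk before it is acknowledged, and recovery replays
the complete records found in the log.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Toy write-ahead log (hypothetical, for illustration only). */
final class ToyWal {
    private ToyWal() {}

    /** Appends one length-prefixed record and forces it to disk before returning. */
    static void append(Path log, String record) {
        byte[] payload = record.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + payload.length);
        buf.putInt(payload.length).put(payload).flip();

        try (FileChannel ch = FileChannel.open(log, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            while (buf.hasRemaining())
                ch.write(buf);
            ch.force(true); // fsync: only after this may the change be acknowledged
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Replays every complete record; a torn tail left by a crash is ignored. */
    static List<String> replay(Path log) {
        List<String> records = new ArrayList<>();
        try {
            if (!Files.exists(log))
                return records;
            ByteBuffer buf = ByteBuffer.wrap(Files.readAllBytes(log));
            while (buf.remaining() >= Integer.BYTES) {
                int len = buf.getInt();
                if (len < 0 || len > buf.remaining())
                    break; // incomplete record at the end of the log
                byte[] payload = new byte[len];
                buf.get(payload);
                records.add(new String(payload, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("toy-wal", ".log");
        append(log, "put k1=v1");
        append(log, "put k2=v2");
        System.out.println(replay(log)); // prints [put k1=v1, put k2=v2]
        Files.delete(log);
    }
}
```

Without the force call, an acknowledged change could still be sitting in
the OS page cache when the machine fails, which is the loss of durability
described above.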

Tue, Nov 10, 2020 at 09:57, ткаленко кирилл <tk...@yandex.ru>:

> Hello guys again!
>
> Does anyone know why we are doing any calculation here
> IgniteUtils#adjustedWalHistorySize at all?
> Would it be easier to always take the
> DataStorageConfiguration#maxWalArchiveSize? It seems that the user can
> easily do this himself by changing the value by 1 byte.

Re: Why WAL archives enabled by default?

Posted by ткаленко кирилл <tk...@yandex.ru>.

Hello guys again!

Does anyone know why we do any calculation in IgniteUtils#adjustedWalHistorySize at all?
Wouldn't it be easier to always take DataStorageConfiguration#maxWalArchiveSize? It seems that the user can easily do this himself by changing the value by 1 byte.
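
For context on what the limit can and cannot guarantee, here is a rough
sketch of size-bounded archive cleanup as it is described in this thread:
old segments are removed until the archive fits the configured maximum, but
segments still needed for recovery are never touched. This is not Ignite's
code; ArchiveCleanupSketch, segmentsToDelete and firstRequiredIdx are
hypothetical names, and sizes are in arbitrary units.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of WAL archive cleanup bounded by a maximum size. */
final class ArchiveCleanupSketch {
    private ArchiveCleanupSketch() {}

    /**
     * Picks archived segments to delete, oldest first, until the archive fits the limit.
     *
     * @param segments segment index mapped to segment size
     * @param maxArchiveSize configured limit (stands in for maxWalArchiveSize)
     * @param firstRequiredIdx first segment still needed for recovery; it and all
     *        later segments are never deleted
     */
    static List<Long> segmentsToDelete(TreeMap<Long, Long> segments,
                                       long maxArchiveSize, long firstRequiredIdx) {
        long total = 0;
        for (long size : segments.values())
            total += size;

        List<Long> toDelete = new ArrayList<>();
        for (Map.Entry<Long, Long> e : segments.entrySet()) {
            // Stop once the archive fits, or once we reach a segment recovery depends on.
            if (total <= maxArchiveSize || e.getKey() >= firstRequiredIdx)
                break;
            toDelete.add(e.getKey());
            total -= e.getValue();
        }
        return toDelete;
    }

    public static void main(String[] args) {
        TreeMap<Long, Long> archive = new TreeMap<>();
        for (long idx = 0; idx < 5; idx++)
            archive.put(idx, 64L);

        // Recovery needs segments 2..4: only 0 and 1 can go, so the limit is still exceeded.
        System.out.println(segmentsToDelete(archive, 128L, 2L)); // prints [0, 1]
        // Recovery needs only segment 4: deleting 0..2 brings the archive down to the limit.
        System.out.println(segmentsToDelete(archive, 128L, 4L)); // prints [0, 1, 2]
    }
}
```

The first call shows why the configured size behaves as a soft limit: when
checkpoints are rare, the segments that must be kept can by themselves
exceed it.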

06.11.2020, 13:56, "Ivan Daschinsky" <iv...@gmail.com>:
> Alex, thanks for pointing that out. Shame that I missed it.

Re: Why WAL archives enabled by default?

Posted by Ivan Daschinsky <iv...@gmail.com>.

Alex, thanks for pointing that out. Shame that I missed it.
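
For readers unfamiliar with the mechanism Alex refers to, the idea of
forcing a checkpoint after too many WAL segments can be sketched roughly as
follows. This is only an illustration of the idea, not the logic in
FileWriteAheadLogManager; the class SegmentCheckpointTrigger, its callbacks
and the way the threshold is compared are assumptions.

```java
/** Hypothetical sketch: request a checkpoint when too many segments lack one. */
final class SegmentCheckpointTrigger {
    private final long maxSegCountWithoutCheckpoint;
    private final Runnable forceCheckpoint;
    private long lastCheckpointSegIdx;

    SegmentCheckpointTrigger(long maxSegCountWithoutCheckpoint, Runnable forceCheckpoint) {
        this.maxSegCountWithoutCheckpoint = maxSegCountWithoutCheckpoint;
        this.forceCheckpoint = forceCheckpoint;
    }

    /** Called when the WAL rolls over to a new segment; returns true if a checkpoint was requested. */
    boolean onRollover(long curSegIdx) {
        if (curSegIdx - lastCheckpointSegIdx >= maxSegCountWithoutCheckpoint) {
            forceCheckpoint.run();
            return true;
        }
        return false;
    }

    /** Called when a checkpoint finishes; segIdx is the segment its marker points to. */
    void onCheckpointFinished(long segIdx) {
        lastCheckpointSegIdx = segIdx;
    }

    public static void main(String[] args) {
        SegmentCheckpointTrigger trigger = new SegmentCheckpointTrigger(
            3, () -> System.out.println("checkpoint requested"));

        for (long idx = 1; idx <= 3; idx++)
            System.out.println("segment " + idx + ": " + trigger.onRollover(idx));

        trigger.onCheckpointFinished(3);
        System.out.println("segment 4: " + trigger.onRollover(4));
    }
}
```

Once a checkpoint finishes, older segments become eligible for removal, so
a trigger of this kind limits how far the archive can grow between
checkpoints.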

Fri, Nov 6, 2020 at 13:45, Alex Plehanov <pl...@gmail.com>:

> Guys,
>
> We already have FileWriteAheadLogManager#maxSegCountWithoutCheckpoint.
> A checkpoint is triggered if there are too many WAL segments without a
> checkpoint. It looks like you are talking about this feature.
>
> Fri, Nov 6, 2020 at 13:21, Ivan Daschinsky <iv...@gmail.com>:
>
> > Kirill and I privately discussed the proposed approach. As far as I
> > understand, Kirill suggests implementing a heuristic that forces a
> > checkpoint in some cases, when the user has misconfigured the cluster
> > by mistake, in order to keep the WAL archive within the requested size.
> > For now, in my opinion, this approach is questionable because it can
> > cause performance problems. But it could be offered as an option, and
> > it should be switchable.
> >
> > Fri, Nov 6, 2020 at 12:36, Ivan Daschinsky <iv...@gmail.com>:
> >
> > > Kirill, how will your approach help if the user has tuned the cluster
> > > to do checkpoints rarely under load?
> > > No way.
> > >
> > > Fri, Nov 6, 2020 at 12:19, ткаленко кирилл <tk...@yandex.ru>:
> > >
> > >> Ivan, I agree with you that the archive is primarily about
> > >> optimization.
> > >>
> > >> If the size of the archive is critical for the user, we have no
> > >> protection against exceeding it; we can always go beyond this limit.
> > >> Thus, the user needs to remember this and configure it in some way.
> > >>
> > >> I suggest that we not exceed this limit and give the user the expected
> > >> behavior. At the same time, the segments needed for recovery will
> > >> remain, and there will be no data loss.
> > >>
> > >> 06.11.2020, 11:29, "Ivan Daschinsky" <iv...@gmail.com>:
> > >> > Guys, first of all, archiving is not for PITR at all; it is an
> > >> > optimization. If we disable archiving, we need to create a new file
> > >> > on every rollover. If we enable archiving, we reserve 10 (by default)
> > >> > segments filled with zeroes.
> > >> > We use mmap by default, so with the no-archiver approach:
> > >> > 1. We first create a new empty file.
> > >> > 2. We call sun.nio.ch.FileChannelImpl#map on it, which under the hood:
> > >> >    a. If the file is shorter than the WAL segment size, calls
> > >> >       sun.nio.ch.FileDispatcherImpl#truncate0, which under the hood is
> > >> >       just the truncate system call [1].
> > >> >    b. Then calls the mmap system call on this file via
> > >> >       sun.nio.ch.FileChannelImpl#map0; see [2].
> > >> > These manipulations are neither free nor cheap, so rollover will be
> > >> > much, much slower.
> > >> > If archiving is enabled, 10 segments are already preallocated at node
> > >> > start.
> > >> >
> > >> > When archiving is enabled, the archiver just copies the previous
> > >> > preallocated segment and moves it to the archive directory.
> > >> > This archived segment is crucial for recovery. When new checkpoints
> > >> > finish, all segments eligible for truncation are simply removed.
> > >> >
> > >> > If archiving is disabled, we still write WAL segments to the wal
> > >> > directory, and disabling archiving doesn't prevent segments from being
> > >> > stored if they are required for recovery.
> > >> >
> > >> >>> Before increasing the size of WAL archive (transferring to archive
> > >> >>> /rollOver, compression, decompression), we can make sure that there
> > >> >>> will be enough space in the archive and if there is no such, then we
> > >> >>> will try to clean it. We cannot delete those segments that are
> > >> >>> required for recovery (between the last two checkpoints) and reserved
> > >> >>> for example for historical rebalancing.
> > >> >
> > >> > First of all, compression/decompression is off-topic here.
> > >> > Secondly, only WAL segments with an index higher than the LAST
> > >> > checkpoint marker are required.
> > >> > Thirdly, archiving and rollover can happen during a checkpoint, and we
> > >> > could accidentally break everything.
> > >> > Fourthly, I see no benefit in overcomplicating already complicated
> > >> > logic.
> > >> > This is basically a problem of misunderstanding and tuning.
> > >> > There are a lot of similar topics for almost every DB. [3]
> > >> >
> > >> > [1] -- https://man7.org/linux/man-pages/man2/ftruncate.2.html
> > >> > [2] -- https://man7.org/linux/man-pages/man2/mmap.2.html
> > >> > [3] -- https://www.google.com/search?q=pg_wal%2Fxlogtemp+no+space+left+on+device&oq=pg+wal+no
> > >> >
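
The rollover cost described in the message above can be observed with plain
java.nio. The sketch below is an illustration, not Ignite code: the class
SegmentMappingSketch, its method names and the 1 MiB demo segment size are
assumptions. On OpenJDK, mapping a READ_WRITE region that extends past the
end of the file makes the JDK grow the file first, which is the extra
truncate step; a preallocated, zero-filled segment already has the right
length and skips it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: mapping a fresh segment file versus a preallocated one. */
final class SegmentMappingSketch {
    /** Demo segment size (an assumption; real WAL segments are configured separately). */
    static final int SEGMENT_SIZE = 1 << 20;

    private SegmentMappingSketch() {}

    /** Maps segmentSize bytes of the file read-write and returns the file length afterwards. */
    static long mapAndGetLength(Path file, int segmentSize) {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // If the file is shorter than segmentSize, it is extended before mmap is called.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, segmentSize);
            buf.put(0, (byte) 1); // touch the mapping as a WAL writer would
            return ch.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Fills the file with zeroes up front, as done for preallocated segments. */
    static void preallocate(Path file, int segmentSize) {
        try {
            Files.write(file, new byte[segmentSize]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path fresh = Files.createTempFile("fresh-segment", ".wal");
        Path prepared = Files.createTempFile("prepared-segment", ".wal");
        fresh.toFile().deleteOnExit();
        prepared.toFile().deleteOnExit();

        System.out.println("fresh before map:    " + Files.size(fresh));
        System.out.println("fresh after map:     " + mapAndGetLength(fresh, SEGMENT_SIZE));

        preallocate(prepared, SEGMENT_SIZE);
        System.out.println("prepared before map: " + Files.size(prepared));
        System.out.println("prepared after map:  " + mapAndGetLength(prepared, SEGMENT_SIZE));
    }
}
```

The sketch only shows that the fresh file has to be grown at mapping time;
how much slower that makes a rollover depends on the file system and is not
measured here.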
> > >> > Fri, Nov 6, 2020 at 10:42, ткаленко кирилл <tkalkirill@yandex.ru>:
> > >> >
> > >> >>  Hi, Ivan!
> > >> >>
> > >> >>  I have only described ideas. But here are a few more details.
> > >> >>
> > >> >>  We can take care not to go beyond
> > >> >>  DataStorageConfiguration#maxWalArchiveSize.
> > >> >>
> > >> >>  Before increasing the size of the WAL archive (transferring to the
> > >> >>  archive/rollover, compression, decompression), we can make sure that
> > >> >>  there will be enough space in the archive, and if there is not, we
> > >> >>  will try to clean it. We cannot delete segments that are required for
> > >> >>  recovery (between the last two checkpoints) or reserved, for example,
> > >> >>  for historical rebalancing.
> > >> >>
> > >> >>  We can receive notifications about checkpoint changes and the
> > >> >>  reservation/release of segments, so we can know how many segments we
> > >> >>  can delete right now.
> > >> >>
> > >> >>  06.11.2020, 09:53, "Ivan Daschinsky" <iv...@gmail.com>:
> > >> >>  >>> For example, when trying to move a segment to the archive.
> > >> >>  >
> > >> >>  > We cannot do this, we will lost data. We can truncate archived
> > >> segment if
> > >> >>  > and only if it is not required for recovery. If last checkpoint
> > >> marker
> > >> >>  > points to segment
> > >> >>  > with lower index, we cannot delete any segment with higher
> index.
> > >> So the
> > >> >>  > only moment where we can remove truncate segments is a finish of
> > >> >>  checkpoint.
> > >> >>  >
> > >> >>  > пт, 6 нояб. 2020 г. в 09:46, ткаленко кирилл <
> > tkalkirill@yandex.ru
> > >> >:
> > >> >>  >
> > >> >>  >> Hello, everybody!
> > >> >>  >>
> > >> >>  >> As far as I know, WAL archive is used for PITP(GridGain
> feature)
> > >> and
> > >> >>  >> historical rebalancing.
> > >> >>  >>
> > >> >>  >> Facundo seems to have a problem with running out of directory
> > >> >>  >> (/opt/work/walarchive) space.
> > >> >>  >> Currently, WAL archive is cleared at the end of checkpoint.
> > >> Potentially
> > >> >>  >> long transaction may prevent checkpoint starting, thereby not
> > >> cleaning
> > >> >>  WAL
> > >> >>  >> archive, which will lead to such an error.
> > >> >>  >> At the moment, I see such a WA to increase size of directory
> > >> >>  >> (/opt/work/walarchive) in k8s and avoid long transactions or
> > >> something
> > >> >>  like
> > >> >>  >> that that modifies data and runs for a long time.
> > >> >>  >>
> > >> >>  >> And it is best to fix the logic of working with WAL archive. I
> > >> think we
> > >> >>  >> should remove WAL archive cleanup from the end of the
> checkpoint
> > >> and
> > >> >>  do it
> > >> >>  >> on demand. For example, when trying to move a segment to the
> > >> archive.
> > >> >>  >>
> > >> >>  >> 06.11.2020, 01:58, "Denis Magda" <dm...@apache.org>:
> > >> >>  >> > Folks,
> > >> >>  >> >
> > >> >>  >> > In my understanding, you need the archives only for features
> > >> such as
> > >> >>  >> PITR.
> > >> >>  >> > Considering, that the PITR functionality is not provided in
> > >> Ignite
> > >> >>  why do
> > >> >>  >> > we have the archives enabled by default?
> > >> >>  >> >
> > >> >>  >> > How about having this feature disabled by default to prevent
> > the
> > >> >>  >> following
> > >> >>  >> > issues experienced by our users:
> > >> >>  >> >
> > >> >>  >>
> > >> >>
> > >>
> >
> http://apache-ignite-users.70518.x6.nabble.com/WAL-and-WAL-Archive-volume-size-recommendation-td34458.html
> > >> >>  >> >
> > >> >>  >> > -
> > >> >>  >> > Denis
> > >> >>  >
> > >> >>  > --
> > >> >>  > Sincerely yours, Ivan Daschinskiy
> > >> >
> > >> > --
> > >> > Sincerely yours, Ivan Daschinskiy
> > >>
> > >
> > >
> > > --
> > > Sincerely yours, Ivan Daschinskiy
> > >
> >
> >
> > --
> > Sincerely yours, Ivan Daschinskiy
> >
>


-- 
Sincerely yours, Ivan Daschinskiy

Re: Why WAL archives enabled by default?

Posted by Alex Plehanov <pl...@gmail.com>.

Guys,

We already have FileWriteAheadLogManager#maxSegCountWithoutCheckpoint.
A checkpoint is triggered if there are too many WAL segments without a checkpoint.
It looks like this is the feature you are talking about.
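
The behavior mentioned above can be sketched roughly as follows. This is only an illustration, not the code of FileWriteAheadLogManager: the class SegmentCheckpointTrigger, its method names and the callback are hypothetical, and only the name maxSegCountWithoutCheckpoint comes from the thread.

```java
/**
 * Illustrative sketch only (hypothetical class, not Ignite code):
 * force a checkpoint when too many WAL segments were rolled over
 * since the last finished checkpoint.
 */
final class SegmentCheckpointTrigger {
    /** Threshold; the name mirrors the field mentioned in the thread. */
    private final int maxSegCountWithoutCheckpoint;

    /** Hypothetical callback that asks the node to start a checkpoint. */
    private final Runnable forceCheckpoint;

    /** Index of the WAL segment that was current when the last checkpoint finished. */
    private long lastCheckpointSegIdx;

    SegmentCheckpointTrigger(int maxSegCountWithoutCheckpoint, Runnable forceCheckpoint) {
        this.maxSegCountWithoutCheckpoint = maxSegCountWithoutCheckpoint;
        this.forceCheckpoint = forceCheckpoint;
    }

    /** Called when a checkpoint finishes. */
    void onCheckpointFinished(long currentSegIdx) {
        lastCheckpointSegIdx = currentSegIdx;
    }

    /** Called on every rollover to a new segment. */
    void onSegmentRollover(long newSegIdx) {
        // Fires on each rollover past the threshold until a checkpoint finishes.
        if (newSegIdx - lastCheckpointSegIdx >= maxSegCountWithoutCheckpoint)
            forceCheckpoint.run();
    }
}
```

A forced checkpoint of this kind trades some throughput for a bounded number of segments that have to be kept for recovery.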

Fri, Nov 6, 2020 at 13:21, Ivan Daschinsky <iv...@gmail.com>:

> Kirill and I discussed the proposed approach privately. As far as I
> understand, Kirill suggests implementing a heuristic that forces a
> checkpoint in some cases, so that the requested WAL archive size is
> preserved even if the user has misconfigured the cluster by mistake.
> For now, I find this approach questionable, because it can cause
> performance problems. But it could be offered as an option,
> and it should be switchable.

Re: Why WAL archives enabled by default?

Posted by Ivan Daschinsky <iv...@gmail.com>.

Kirill and I discussed the proposed approach privately. As far as I understand,
Kirill suggests implementing a heuristic that forces a checkpoint in some
cases, so that the requested WAL archive size is preserved even if the user
has misconfigured the cluster by mistake.
For now, I find this approach questionable, because it can cause
performance problems. But it could be offered as an option,
and it should be switchable.

Fri, Nov 6, 2020 at 12:36, Ivan Daschinsky <iv...@gmail.com>:

> Kirill, how will your approach help if a user has tuned the cluster to do
> checkpoints rarely under load?
> No way.


-- 
Sincerely yours, Ivan Daschinskiy

Re: Why WAL archives enabled by default?

Posted by Ivan Daschinsky <iv...@gmail.com>.

Kirill, how will your approach help if a user has tuned the cluster to do
checkpoints rarely under load?
No way.

Fri, Nov 6, 2020 at 12:19, ткаленко кирилл <tk...@yandex.ru>:

> Ivan, I agree with you that the archive is primarily an optimization.
>
> If the size of the archive is critical for the user, we currently offer no
> protection: the archive can always grow beyond the configured limit.
> Thus, the user has to keep this in mind and configure the cluster accordingly.
>
> I suggest that we never exceed this limit, which is the behavior the user
> expects. The segments needed for recovery will still be kept, so
> there will be no data loss.


-- 
Sincerely yours, Ivan Daschinskiy

Re: Why WAL archives enabled by default?

Posted by ткаленко кирилл <tk...@yandex.ru>.

Ivan, I agree with you that the archive is primarily an optimization.

If the size of the archive is critical for the user, we currently offer no protection: the archive can always grow beyond the configured limit.
Thus, the user has to keep this in mind and configure the cluster accordingly.

I suggest that we never exceed this limit, which is the behavior the user expects. The segments needed for recovery will still be kept, so there will be no data loss.
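
A minimal sketch of this idea, under stated assumptions: the class ArchiveBudget and all of its members are hypothetical and are not part of Ignite. The limit stands in for DataStorageConfiguration#maxWalArchiveSize, and keepFromIdx stands for the lowest segment index that is still required for recovery or reserved.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: keep the archive within a size limit without deleting needed segments. */
final class ArchiveBudget {
    /** Size limit in bytes (stands in for maxWalArchiveSize). */
    private final long maxBytes;

    /** Archived segments as {index, sizeBytes}, oldest first. */
    private final Deque<long[]> segments = new ArrayDeque<>();

    /** Total size of archived segments. */
    private long usedBytes;

    /** Segments with index >= keepFromIdx must not be deleted. */
    private long keepFromIdx;

    ArchiveBudget(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Notification about a finished checkpoint or a changed reservation. */
    void onCheckpointOrReservationChange(long keepFromIdx) {
        this.keepFromIdx = keepFromIdx;
    }

    /**
     * Frees space by dropping the oldest deletable segments, then adds the new one.
     * Returns false if the limit cannot be honored without deleting a needed segment.
     */
    boolean tryArchive(long idx, long sizeBytes) {
        while (usedBytes + sizeBytes > maxBytes
            && !segments.isEmpty()
            && segments.peekFirst()[0] < keepFromIdx)
            usedBytes -= segments.pollFirst()[1];

        if (usedBytes + sizeBytes > maxBytes)
            return false;

        segments.addLast(new long[] {idx, sizeBytes});
        usedBytes += sizeBytes;

        return true;
    }

    long usedBytes() {
        return usedBytes;
    }
}
```

What the caller should do when tryArchive returns false (wait, force a checkpoint, or fail) is exactly the open question in this thread, so the sketch leaves it out.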

06.11.2020, 11:29, "Ivan Daschinsky" <iv...@gmail.com>:
> Guys, first of all, archiving is not for PITR at all; it is an optimization.
> If we disable archiving, we need to create a new file on every rollover. If we
> enable archiving, we reserve 10 (by default) segments filled with zeroes.
> We use mmap by default, so with the no-archiver approach:
> 1. We first create a new empty file.
> 2. We call sun.nio.ch.FileChannelImpl#map on it, which under the hood:
>    a. If the file is shorter than the WAL segment size,
>       calls sun.nio.ch.FileDispatcherImpl#truncate0, which is just
>       the ftruncate system call [1].
>    b. Then calls sun.nio.ch.FileChannelImpl#map0, which issues the mmap
>       system call on this file [2].
> These manipulations are neither free nor cheap, so rollover will be much
> slower.
> If archiving is enabled, 10 segments are already preallocated when the node
> starts.
>
> When archiving is enabled, the archiver just copies the previous preallocated
> segment and moves it to the archive directory.
> This archived segment is crucial for recovery. When a new checkpoint
> finishes, all segments eligible for truncation are simply removed.
>
> If archiving is disabled, we still write WAL segments to the wal directory,
> and disabling archiving doesn't prevent segments from being stored if they
> are required for recovery.
>
>>> Before increasing the size of WAL archive (transferring to archive
>>> /rollOver, compression, decompression), we can make sure that there will be
>>> enough space in the archive and if there is no such, then we will try to
>>> clean it. We cannot delete those segments that are required for recovery
>>> (between the last two checkpoints) and reserved for example for historical
>>> rebalancing.
>
> First of all, compression/decompression is off-topic here.
> Secondly, WAL segments are required only if their index is higher than that
> of the LAST checkpoint marker.
> Thirdly, archiving and rollover can happen during a checkpoint, so we could
> accidentally break everything.
> Fourthly, I see no benefit in overcomplicating already complicated logic.
> This is basically a problem of misunderstanding and tuning.
> There are a lot of similar topics for almost every DB. [3]
>
> [1] -- https://man7.org/linux/man-pages/man2/ftruncate.2.html
> [2] -- https://man7.org/linux/man-pages/man2/mmap.2.html
> [3] --
> https://www.google.com/search?q=pg_wal%2Fxlogtemp+no+space+left+on+device&oq=pg+wal+no
>
> --
> Sincerely yours, Ivan Daschinskiy

Re: Why WAL archives enabled by default?

Posted by Ivan Daschinsky <iv...@gmail.com>.

Guys, first of all, archiving is not for PITR at all; it is an optimization.
If we disable archiving, we need to create a new file on every rollover. If we
enable archiving, we reserve 10 (by default) segments filled with zeroes.
We use mmap by default, so with the no-archiver approach:
1. We first create a new empty file.
2. We call sun.nio.ch.FileChannelImpl#map on it, which under the hood:
   a. If the file is shorter than the WAL segment size,
      calls sun.nio.ch.FileDispatcherImpl#truncate0, which is just
      the ftruncate system call [1].
   b. Then calls sun.nio.ch.FileChannelImpl#map0, which issues the mmap
      system call on this file [2].
These manipulations are neither free nor cheap, so rollover will be much
slower.
If archiving is enabled, 10 segments are already preallocated when the node
starts.
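
The two steps above can be reproduced with the public NIO API. This is a small sketch, not Ignite code: the class SegmentMapDemo and the demo segment size are made up for illustration, and the growth of the file during map() is the OpenJDK behavior described above (observed on Linux), not something the FileChannel contract is assumed to guarantee everywhere.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo: create an empty file and map it, as in the no-archiver rollover. */
final class SegmentMapDemo {
    /** Demo segment size; an arbitrary value, not Ignite's default. */
    static final int SEGMENT_SIZE = 1 << 20;

    /** Creates an empty temp file, maps {@code size} bytes of it and returns the resulting file size. */
    static long mapFreshSegment(int size) {
        try {
            // Step 1: a new, empty file.
            Path file = Files.createTempFile("wal-segment", ".tmp");
            file.toFile().deleteOnExit();

            try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Step 2: mapping a region past the end of the file makes
                // OpenJDK extend the file first and then map it.
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, size);

                buf.put(0, (byte)1); // Touch the first page of the mapping.

                return ch.size();
            }
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("File size after map: " + mapFreshSegment(SEGMENT_SIZE));
    }
}
```

Timing mapFreshSegment in a loop against reusing an already allocated file is one way to measure the rollover cost being discussed.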

When archiving is enabled, the archiver just copies the previous preallocated
segment and moves it to the archive directory.
This archived segment is crucial for recovery. When a new checkpoint
finishes, all segments eligible for truncation are simply removed.

If archiving is disabled, we still write WAL segments to the wal directory, and
disabling archiving doesn't prevent segments from being stored if they are
required for recovery.

>> Before increasing the size of WAL archive (transferring to archive
>> /rollOver, compression, decompression), we can make sure that there will be
>> enough space in the archive and if there is no such, then we will try to
>> clean it. We cannot delete those segments that are required for recovery
>> (between the last two checkpoints) and reserved for example for historical
>> rebalancing.

First of all, compression/decompression is off-topic here.
Secondly, WAL segments are required only if their index is higher than that
of the LAST checkpoint marker.
Thirdly, archiving and rollover can happen during a checkpoint, so we could
accidentally break everything.
Fourthly, I see no benefit in overcomplicating already complicated logic.
This is basically a problem of misunderstanding and tuning.
There are a lot of similar topics for almost every DB. [3]



[1] -- https://man7.org/linux/man-pages/man2/ftruncate.2.html
[2] -- https://man7.org/linux/man-pages/man2/mmap.2.html
[3] --
https://www.google.com/search?q=pg_wal%2Fxlogtemp+no+space+left+on+device&oq=pg+wal+no

Fri, 6 Nov 2020 at 10:42, ткаленко кирилл <tk...@yandex.ru>:

> Hi, Ivan!
>
> I have only described ideas. But here are a few more details.
>
> We can take care not to go beyond
> DataStorageConfiguration#maxWalArchiveSize.
>
> Before increasing the size of WAL archive (transferring to archive
> /rollOver, compression, decompression), we can make sure that there will be
> enough space in the archive and if there is no such, then we will try to
> clean it. We cannot delete those segments that are required for recovery
> (between the last two checkpoints) and reserved for example for historical
> rebalancing.
>
> We can receive a notification about the change of checkpoints and the
> reservation / release of segments, thus we can know how many segments we
> can delete right now.
>
> 06.11.2020, 09:53, "Ivan Daschinsky" <iv...@gmail.com>:
> >>>  For example, when trying to move a segment to the archive.
> >
> > We cannot do this, we will lost data. We can truncate archived segment if
> > and only if it is not required for recovery. If last checkpoint marker
> > points to segment
> > with lower index, we cannot delete any segment with higher index. So the
> > only moment where we can remove truncate segments is a finish of
> checkpoint.
> >
> > Fri, 6 Nov 2020 at 09:46, ткаленко кирилл <tk...@yandex.ru>:
> >
> >>  Hello, everybody!
> >>
> >>  As far as I know, WAL archive is used for PITP(GridGain feature) and
> >>  historical rebalancing.
> >>
> >>  Facundo seems to have a problem with running out of directory
> >>  (/opt/work/walarchive) space.
> >>  Currently, WAL archive is cleared at the end of checkpoint. Potentially
> >>  long transaction may prevent checkpoint starting, thereby not cleaning
> WAL
> >>  archive, which will lead to such an error.
> >>  At the moment, I see such a WA to increase size of directory
> >>  (/opt/work/walarchive) in k8s and avoid long transactions or something
> like
> >>  that that modifies data and runs for a long time.
> >>
> >>  And it is best to fix the logic of working with WAL archive. I think we
> >>  should remove WAL archive cleanup from the end of the checkpoint and
> do it
> >>  on demand. For example, when trying to move a segment to the archive.
> >>
> >>  06.11.2020, 01:58, "Denis Magda" <dm...@apache.org>:
> >>  > Folks,
> >>  >
> >>  > In my understanding, you need the archives only for features such as
> >>  PITR.
> >>  > Considering, that the PITR functionality is not provided in Ignite
> why do
> >>  > we have the archives enabled by default?
> >>  >
> >>  > How about having this feature disabled by default to prevent the
> >>  following
> >>  > issues experienced by our users:
> >>  >
> >>
> http://apache-ignite-users.70518.x6.nabble.com/WAL-and-WAL-Archive-volume-size-recommendation-td34458.html
> >>  >
> >>  > -
> >>  > Denis
> >
> > --
> > Sincerely yours, Ivan Daschinskiy
>


-- 
Sincerely yours, Ivan Daschinskiy

Re: Why WAL archives enabled by default?

Posted by ткаленко кирилл <tk...@yandex.ru>.

Hi, Ivan!

I have only described ideas. But here are a few more details.

We can take care not to go beyond DataStorageConfiguration#maxWalArchiveSize.

Before increasing the size of the WAL archive (moving a segment to the archive/rollover, compression, decompression), we can make sure that there will be enough space in the archive, and if there is not, we will try to clean it. We cannot delete segments that are required for recovery (between the last two checkpoints) or that are reserved, for example, for historical rebalancing.

We can receive notifications about checkpoint changes and the reservation/release of segments, so we know how many segments we can delete right now.
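
A rough sketch of this idea, only to illustrate the proposal. It is not existing Ignite code: the class, its methods and the way the "lowest required index" is delivered are all hypothetical, and sizes are plain numbers rather than real files.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ArchiveBudgetSketch {
    /** Size limit, playing the role of DataStorageConfiguration#maxWalArchiveSize. */
    private final long maxArchiveSize;

    /** Archived segments as {index, size} pairs, oldest first. */
    private final Deque<long[]> segments = new ArrayDeque<>();

    /** Current total size of the archive. */
    private long curSize;

    /** Lowest segment index still needed (recovery or reservation); updated by notifications. */
    private long lowestRequiredIdx;

    ArchiveBudgetSketch(long maxArchiveSize) {
        this.maxArchiveSize = maxArchiveSize;
    }

    /** Hypothetical callback for "checkpoint changed" or "segment reserved/released" events. */
    void onRequiredIdxChanged(long idx) {
        lowestRequiredIdx = idx;
    }

    /**
     * Tries to add a segment to the archive. If it does not fit, the oldest segments that
     * are no longer required are dropped first. Returns false if there is still no room.
     */
    boolean tryArchive(long idx, long size) {
        while (curSize + size > maxArchiveSize && !segments.isEmpty()
            && segments.peekFirst()[0] < lowestRequiredIdx)
            curSize -= segments.pollFirst()[1];

        if (curSize + size > maxArchiveSize)
            return false; // Nothing more can be deleted safely.

        segments.addLast(new long[] {idx, size});
        curSize += size;

        return true;
    }

    long size() {
        return curSize;
    }

    public static void main(String[] args) {
        ArchiveBudgetSketch archive = new ArchiveBudgetSketch(30);

        for (long idx = 0; idx < 3; idx++)
            archive.tryArchive(idx, 10);

        System.out.println(archive.tryArchive(3, 10)); // false: every segment is still required

        archive.onRequiredIdxChanged(2); // segments 0 and 1 are no longer required

        System.out.println(archive.tryArchive(3, 10)); // true: segment 0 was dropped to make room
    }
}
```

Note that when nothing can be deleted safely, the sketch can only refuse the new segment; what the node should do in that case is left open here.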

06.11.2020, 09:53, "Ivan Daschinsky" <iv...@gmail.com>:
>>>  For example, when trying to move a segment to the archive.
>
> We cannot do this, we will lost data. We can truncate archived segment if
> and only if it is not required for recovery. If last checkpoint marker
> points to segment
> with lower index, we cannot delete any segment with higher index. So the
> only moment where we can remove truncate segments is a finish of checkpoint.
>
> Fri, 6 Nov 2020 at 09:46, ткаленко кирилл <tk...@yandex.ru>:
>
>>  Hello, everybody!
>>
>>  As far as I know, WAL archive is used for PITP(GridGain feature) and
>>  historical rebalancing.
>>
>>  Facundo seems to have a problem with running out of directory
>>  (/opt/work/walarchive) space.
>>  Currently, WAL archive is cleared at the end of checkpoint. Potentially
>>  long transaction may prevent checkpoint starting, thereby not cleaning WAL
>>  archive, which will lead to such an error.
>>  At the moment, I see such a WA to increase size of directory
>>  (/opt/work/walarchive) in k8s and avoid long transactions or something like
>>  that that modifies data and runs for a long time.
>>
>>  And it is best to fix the logic of working with WAL archive. I think we
>>  should remove WAL archive cleanup from the end of the checkpoint and do it
>>  on demand. For example, when trying to move a segment to the archive.
>>
>>  06.11.2020, 01:58, "Denis Magda" <dm...@apache.org>:
>>  > Folks,
>>  >
>>  > In my understanding, you need the archives only for features such as
>>  PITR.
>>  > Considering, that the PITR functionality is not provided in Ignite why do
>>  > we have the archives enabled by default?
>>  >
>>  > How about having this feature disabled by default to prevent the
>>  following
>>  > issues experienced by our users:
>>  >
>>  http://apache-ignite-users.70518.x6.nabble.com/WAL-and-WAL-Archive-volume-size-recommendation-td34458.html
>>  >
>>  > -
>>  > Denis
>
> --
> Sincerely yours, Ivan Daschinskiy

Re: Why WAL archives enabled by default?

Posted by Ivan Daschinsky <iv...@gmail.com>.

>> For example, when trying to move a segment to the archive.
We cannot do this; we will lose data. We can truncate an archived segment if
and only if it is not required for recovery. If the last checkpoint marker
points to a segment with a lower index, we cannot delete any segment with a
higher index. So the only moment when we can remove (truncate) segments is
the finish of a checkpoint.
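
The rule above can be written down as a small sketch. This is an illustration only, not Ignite code; the class and method names are hypothetical, and segment reservations (e.g. for historical rebalancing) are deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WalTruncationSketch {
    /**
     * Returns the archived segment indexes that may be removed once a checkpoint has finished:
     * only those strictly below the segment that the last checkpoint marker points to.
     */
    static List<Long> removable(List<Long> archivedIdx, long lastCpMarkerSegIdx) {
        List<Long> res = new ArrayList<>();

        for (long idx : archivedIdx) {
            if (idx < lastCpMarkerSegIdx)
                res.add(idx);
        }

        return res;
    }

    public static void main(String[] args) {
        List<Long> archived = Arrays.asList(3L, 4L, 5L, 6L, 7L);

        // The last checkpoint marker points into segment 5, so 5, 6 and 7 must be kept.
        System.out.println(removable(archived, 5)); // prints [3, 4]
    }
}
```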

Fri, 6 Nov 2020 at 09:46, ткаленко кирилл <tk...@yandex.ru>:

> Hello, everybody!
>
> As far as I know, WAL archive is used for PITP(GridGain feature) and
> historical rebalancing.
>
> Facundo seems to have a problem with running out of directory
> (/opt/work/walarchive) space.
> Currently, WAL archive is cleared at the end of checkpoint. Potentially
> long transaction may prevent checkpoint starting, thereby not cleaning WAL
> archive, which will lead to such an error.
> At the moment, I see such a WA to increase size of directory
> (/opt/work/walarchive) in k8s and avoid long transactions or something like
> that that modifies data and runs for a long time.
>
> And it is best to fix the logic of working with WAL archive. I think we
> should remove WAL archive cleanup from the end of the checkpoint and do it
> on demand. For example, when trying to move a segment to the archive.
>
>
> 06.11.2020, 01:58, "Denis Magda" <dm...@apache.org>:
> > Folks,
> >
> > In my understanding, you need the archives only for features such as
> PITR.
> > Considering, that the PITR functionality is not provided in Ignite why do
> > we have the archives enabled by default?
> >
> > How about having this feature disabled by default to prevent the
> following
> > issues experienced by our users:
> >
> http://apache-ignite-users.70518.x6.nabble.com/WAL-and-WAL-Archive-volume-size-recommendation-td34458.html
> >
> > -
> > Denis
>


-- 
Sincerely yours, Ivan Daschinskiy

Re: Why WAL archives enabled by default?

Posted by ткаленко кирилл <tk...@yandex.ru>.

Hello, everybody!

As far as I know, the WAL archive is used for PITR (a GridGain feature) and historical rebalancing.

Facundo seems to have a problem with running out of space in the directory (/opt/work/walarchive).
Currently, the WAL archive is cleared at the end of a checkpoint. A potentially long transaction may prevent a checkpoint from starting, so the WAL archive is not cleaned, which leads to such an error.
At the moment, the workaround I see is to increase the size of the directory (/opt/work/walarchive) in k8s and to avoid long transactions or anything similar that modifies data and runs for a long time.

But it would be best to fix the logic of working with the WAL archive. I think we should remove WAL archive cleanup from the end of the checkpoint and do it on demand, for example, when trying to move a segment to the archive.


06.11.2020, 01:58, "Denis Magda" <dm...@apache.org>:
> Folks,
>
> In my understanding, you need the archives only for features such as PITR.
> Considering, that the PITR functionality is not provided in Ignite why do
> we have the archives enabled by default?
>
> How about having this feature disabled by default to prevent the following
> issues experienced by our users:
> http://apache-ignite-users.70518.x6.nabble.com/WAL-and-WAL-Archive-volume-size-recommendation-td34458.html
>
> -
> Denis