Posted to dev@cassandra.apache.org by Sreenivasulu Nallapati <sr...@gmail.com> on 2019/02/08 07:34:18 UTC

Commit-log structure changes - versions

Hello folks,

I am exploring the CDC option to move data from Cassandra to Hive on a
periodic basis.
While exploring this option, I heard that the internal
commit-log structure changes from version to version. Is this correct?

As per this link
<http://cassandra.apache.org/doc/latest/architecture/storage_engine.html#sstable-versions>,
sstables have changed multiple times across versions.
I want to understand the commit-log internal structure as well. Has
the commit-log file structure changed between Cassandra
versions? If so, can someone please point me to the docs/change log?

Please help me understand more about this. Thanks in advance.

Thanks
Sreeni

Re: Commit-log structure changes - versions

Posted by Joshua McKenzie <jm...@apache.org>.
You'll probably see the bulk of changes in
CommitLogReader.java:CommitLogFormat
<https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java#L483>.
I tried to limit the dependencies on any internals of the
CommitLogDescriptor when I was refactoring for C-8844, so aside from some
metadata that should hopefully remain consistent (id, filename, compression
flag, etc), you shouldn't see a ton of drift in terms of changes inside the
reader itself.

On Sat, Feb 9, 2019 at 11:22 PM Sreenivasulu Nallapati <
sreeni.nallapati@gmail.com> wrote:

> Hi Jay,
> Thanks for your response.
> So in the future, if the commit-log structure
> (CommitLogDescriptor.java
> <https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/commitlog/CommitLogDescriptor.java#L69>)
> changes, will CommitLogReader be updated to read the new format as well?
>
> Thanks
> Sreeni

Re: Commit-log structure changes - versions

Posted by Sreenivasulu Nallapati <sr...@gmail.com>.
Hi Jay,
Thanks for your response.
So in the future, if the commit-log structure
(CommitLogDescriptor.java
<https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/commitlog/CommitLogDescriptor.java#L69>)
changes, will CommitLogReader be updated to read the new format as well?

Thanks
Sreeni

On Sat, Feb 9, 2019 at 5:13 AM jay.zhuang@yahoo.com.INVALID
<ja...@yahoo.com.invalid> wrote:

> Hi,
> Yes, the commit-log format may change. Here is the current version of the
> commit log:
> https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/commitlog/CommitLogDescriptor.java#L69
> You need to look through the git log to find the changes.
> However, if you use the Cassandra library to read the files, CommitLogReader
> can read both the current and previous versions of the commit log:
> https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java#L168
> If you implement a reader yourself (perhaps in another language), you should
> do something similar.
> Thanks,
> Jay

Re: Commit-log structure changes - versions

Posted by "jay.zhuang@yahoo.com.INVALID" <ja...@yahoo.com.INVALID>.
Hi,
Yes, the commit-log format may change. Here is the current version of the
commit log:
https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/commitlog/CommitLogDescriptor.java#L69
You need to look through the git log to find the changes.
However, if you use the Cassandra library to read the files, CommitLogReader
can read both the current and previous versions of the commit log:
https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/commitlog/CommitLogReader.java#L168
If you implement a reader yourself (perhaps in another language), you should
do something similar.
Thanks,
Jay
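
For anyone writing their own reader, the "do something similar" advice could
look roughly like the sketch below: check the segment's format version before
parsing and skip anything the reader does not know. This is only an
illustration, not Cassandra code. SegmentName and VersionGate are hypothetical
names, the CommitLog-<version>-<id>.log file-name convention is an assumption
that should be checked against CommitLogDescriptor in your Cassandra version,
and the version numbers are placeholders.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Cassandra): version and id taken from a segment file name. */
final class SegmentName {
    // ASSUMPTION: segments are named CommitLog-<version>-<id>.log.
    // Verify this against CommitLogDescriptor for the Cassandra version in use.
    private static final Pattern NAME = Pattern.compile("CommitLog-(\\d+)-(\\d+)\\.log");

    final int version;
    final long id;

    private SegmentName(int version, long id) {
        this.version = version;
        this.id = id;
    }

    /** Returns empty when the name does not match the assumed convention. */
    static Optional<SegmentName> parse(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            return Optional.empty();
        }
        try {
            return Optional.of(new SegmentName(Integer.parseInt(m.group(1)),
                                               Long.parseLong(m.group(2))));
        } catch (NumberFormatException e) {
            // Digits that overflow int/long are treated as an unknown name.
            return Optional.empty();
        }
    }
}

/** Hypothetical gate: accept only the format versions this reader was written for. */
final class VersionGate {
    private final Set<Integer> supported;

    VersionGate(Set<Integer> supported) {
        this.supported = new HashSet<>(supported);
    }

    /** True only if the name parses and its version is in the supported set. */
    boolean canRead(String fileName) {
        return SegmentName.parse(fileName)
                          .map(s -> supported.contains(s.version))
                          .orElse(false);
    }
}

class VersionGateDemo {
    public static void main(String[] args) {
        // Placeholder version numbers; use the ones your reader really handles.
        VersionGate gate = new VersionGate(Set.of(6, 7));
        for (String name : new String[] {"CommitLog-6-1549611258000.log",
                                         "CommitLog-9-1.log",
                                         "notes.txt"}) {
            System.out.println(name + " readable: " + gate.canRead(name));
        }
    }
}
```

Refusing unknown versions up front means an upgraded cluster makes the consumer
stop with a clear signal instead of misparsing a changed format.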