Posted to dev@tephra.apache.org by Varun Rao <va...@gmail.com> on 2019/01/22 23:44:42 UTC

Reading tephra snapshots

Hello,

We would like to read Tephra snapshot files from an HDFS directory. Is there
already a tool for this, and if not, is it possible to build one? The main
method in HDFSTransactionStateStorage.java looks as if it implements a CLI
tool that reads a transaction state snapshot or transaction log from HDFS.
I have left out extraneous code, but the for loop below iterates through a
list of files (each pointing to a transaction state snapshot or transaction
log) and prints each snapshot to stdout.

1) Does this CLI tool already exist, or is it on the roadmap?
2) Is it safe to call this main method to read snapshots from an HDFS
directory?

// TODO move this out as a separate command line tool
private enum CLIMode { SNAPSHOT, TXLOG };

/**
 * Reads a transaction state snapshot or transaction log from HDFS and
 * prints the entries to stdout.
 *
 * Supports the following options:
 *    -s    read snapshot state (defaults to the latest)
 *    -l    read a transaction log
 *    [filename]  reads the given file
 * @param args
 */
public static void main(String[] args) {

    // .........

    for (String file : filenames) {
        Path path = new Path(file);
        TransactionSnapshot snapshot = storage.readSnapshotFile(path);
        printSnapshot(snapshot);
    }
}

Thanks very much

Re: Reading tephra snapshots

Posted by Andreas Neumann <an...@apache.org>.
Hi Varun,

For debugging purposes, you should only need to read the last snapshot and
the transaction log that follows it. Older snapshots should not be needed.
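
As a rough illustration of that selection step, here is a minimal sketch that
picks the newest snapshot and the logs that follow it from a directory
listing. It does not use the Tephra API. The class SnapshotFilePicker and its
methods are hypothetical, and the file naming ("snapshot.<timestamp>" and
"txlog.<timestamp>") and the rule that a log "follows" a snapshot when its
timestamp is at or after the snapshot's are assumptions that should be checked
against the actual directory contents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Tephra.
final class SnapshotFilePicker {
    // Assumed naming convention; verify against your HDFS snapshot directory.
    static final String SNAPSHOT_PREFIX = "snapshot.";
    static final String LOG_PREFIX = "txlog.";

    // Returns the numeric suffix after the prefix, or -1 if the name does not match.
    static long timestampOf(String name, String prefix) {
        if (!name.startsWith(prefix)) {
            return -1L;
        }
        try {
            return Long.parseLong(name.substring(prefix.length()));
        } catch (NumberFormatException e) {
            return -1L;
        }
    }

    // Picks the snapshot file with the highest timestamp, if there is one.
    static Optional<String> latestSnapshot(List<String> names) {
        return names.stream()
            .filter(n -> timestampOf(n, SNAPSHOT_PREFIX) >= 0)
            .max(Comparator.comparingLong((String n) -> timestampOf(n, SNAPSHOT_PREFIX)));
    }

    // Returns log files with a timestamp at or after snapshotTime, oldest first.
    static List<String> logsSince(List<String> names, long snapshotTime) {
        List<String> logs = new ArrayList<>();
        for (String n : names) {
            long ts = timestampOf(n, LOG_PREFIX);
            if (ts >= 0 && ts >= snapshotTime) {
                logs.add(n);
            }
        }
        logs.sort(Comparator.comparingLong((String n) -> timestampOf(n, LOG_PREFIX)));
        return logs;
    }

    public static void main(String[] args) {
        // Made-up listing for illustration only.
        List<String> names = Arrays.asList(
            "snapshot.100", "txlog.90", "snapshot.300", "txlog.300", "txlog.450");
        Optional<String> latest = latestSnapshot(names);
        if (latest.isPresent()) {
            long time = timestampOf(latest.get(), SNAPSHOT_PREFIX);
            System.out.println("snapshot to read: " + latest.get());
            System.out.println("logs to replay:   " + logsSince(names, time));
        } else {
            System.out.println("no snapshot found");
        }
    }
}
```

The selected names would then be turned into Path objects and passed to
whatever reads the files, for example the readSnapshotFile call quoted in the
original question.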

Tephra does not have a CLI tool for that yet. CDAP (which uses Tephra) has a
similar tool, though it is a little more complex because it takes the
snapshot on the server side and inspects it on the client side. See:
- https://github.com/cdapio/cdap/blob/develop/cdap-data-fabric/src/main/java/co/cask/cdap/data2/transaction/TransactionManagerDebuggerMain.java
- https://github.com/cdapio/cdap/blob/develop/cdap-app-fabric/src/main/java/co/cask/cdap/gateway/handlers/TransactionHttpHandler.java#L97

A tool like that would be a great contribution for Tephra.

Cheers -Andreas
