You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Tony Foerster (Code Review)" <ge...@cloudera.org> on 2018/07/13 14:37:47 UTC

[kudu-CR] WIP Add error handling to KuduBackup DONT BUILD KuduBackup will attempt to backup every table. If any one table fails the entire job will fail. All exceptions are logged when they happen.

Tony Foerster has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10941


Change subject: WIP Add error handling to KuduBackup DONT_BUILD KuduBackup will attempt to backup every table. If any one table fails the entire job will fail. All exceptions are logged when they happen.
......................................................................

WIP Add error handling to KuduBackup
DONT_BUILD
KuduBackup will attempt to backup every table. If any one table fails the
entire job will fail. All exceptions are logged when they happen.

A backup attempt can have 3 results:
  - Success - table was backed up
  - Exists - table backup 'metadata' directory already existed. Assuming
  this was a retry.
  - Failure - table failed to backup. All data

At the end of the job lists of failures, successes, and existing backups
are printed out.

A lot of this code is still assuming the 'batch' backup mode. I think a lot of this will
change when we start doing the incremental backups.

TODO -
- if 'overwrite' flag is specified, overwrite the 'metadata'
folder. Not sure yet how to handle this on retries or if it's even a
good idea.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
4 files changed, 208 insertions(+), 46 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/1
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 1
Gerrit-Owner: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to Backup and Restore

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/10941 )

Change subject: Add error handling to Backup and Restore
......................................................................


Patch Set 6:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala:

http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@35
PS6, Line 35:                                      private val cause: Throwable = None.orNull)
nit: Just None.orNull is a bit unusual. Just null should work.


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@46
PS6, Line 46:     // TODO: Check if a vailid-looking result is in the directory and don't retry
nit: s/vailid/valid


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@57
PS6, Line 57:           if (metadata.getToMs >= options.timestampMs) {
I think we should only consider the case where getToMs == options.timestampMs right now. In that case we can skip the full backup. Otherwise we should fail saying a backup for a different time already exists (that should automatically happen if you fall through here). Then the user can delete it or provide a new path for the new backup. 

Because we don't support any idea of incremental yet, that seams the most straightforward to me.


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@111
PS6, Line 111:     val fs   = hPath.getFileSystem(conf)
nit: extra spaces


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala:

http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@90
PS6, Line 90:     val conf  = session.sparkContext.hadoopConfiguration
nit: extra spaces. I think a formatter is aligning the =. This should be fixed in general.


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@103
PS6, Line 103:                                context: KuduContext): Try[KuduTable] = Try {
nit: explicit return type.


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@115
PS6, Line 115:   private def dropTable(tableName: String, kuduClient: KuduClient): Try[Unit] =
Where is this used? We definitely don't want the restore process dropping tables. I suppose we may want to drop a created table if it's restore failed for some reason.


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@136
PS6, Line 136:         s"${failures.length} were note restored, see previous logs for details")
nit: s/note/not



-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 6
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>
Gerrit-Comment-Date: Fri, 20 Jul 2018 20:42:30 +0000
Gerrit-HasComments: Yes

[kudu-CR] WIP Add error handling to KuduBackup DONT BUILD KuduBackup will attempt to backup every table. If any one table fails the entire job will fail. All exceptions are logged when they happen.

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Tony Foerster has posted comments on this change. ( http://gerrit.cloudera.org:8080/10941 )

Change subject: WIP Add error handling to KuduBackup DONT_BUILD KuduBackup will attempt to backup every table. If any one table fails the entire job will fail. All exceptions are logged when they happen.
......................................................................


Patch Set 1:

Ignore the 'KuduRestore' changes, that's even more half-baked than the KuduBackup (and not tested)


-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 1
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>
Gerrit-Comment-Date: Fri, 13 Jul 2018 14:43:19 +0000
Gerrit-HasComments: No

[kudu-CR] WIP Add error handling to KuduBackup DONT BUILD KuduBackup will attempt to backup every table. If any one table fails the entire job will fail.

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/10941 )

Change subject: WIP Add error handling to KuduBackup DONT_BUILD KuduBackup will attempt to backup every table. If any one table fails the entire job will fail.
......................................................................


Patch Set 2:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/10941/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10941/2//COMMIT_MSG@7
PS2, Line 7: WIP Add error handling to KuduBackup
Nit: This subject is really long. Shows up as 4 lines on gerrit.


http://gerrit.cloudera.org:8080/#/c/10941/2//COMMIT_MSG@21
PS2, Line 21: - if 'overwrite' flag is specified, overwrite the 'metadata'
I think we can keep it simple and let users manually delete files they don't want for now.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/BackupStatus.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/BackupStatus.scala:

http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/BackupStatus.scala@5
PS2, Line 5: class BackupStatus(val tableName: String, val isSuccess: Boolean) {
Since their isn't a ton of custom functionality perhaps using Scala's built in Try directly is good enough and more lightweight.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala:

http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@39
PS2, Line 39:   def main(args: Array[String]): Unit = {
Is there a reason you moved main to the top of the class? I think we should try to keep the layout between KuduBackup and KuduRestore similar since the processes are so similar.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@56
PS2, Line 56:       throw new Exception(
Nit. Use a more specific exception.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@57
PS2, Line 57:         s"Failed backing up at least one table. See previous error messages for details. Failed tables: ${failures.mkString(",")}")
Instead of "at least one". Include the count that failed out of the total submitted.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@70
PS2, Line 70:         // @TODO: do incremental backup goes here
I am not sure this is how we will handle incremental. I am thinking we would pass an argument requesting incremental vs full. In the case of a full backup I think we can rely on Sparks SaveMode semantics to detect existing data. That covers more cases because any data existing, not just the metadata thats successful, we should likely fail on.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@74
PS2, Line 74:                                  context <- Try(new KuduContext(options.kuduMasterAddresses, session.sparkContext));
Why did this context get move here? I think it can be shared for all uses in the application.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@87
PS2, Line 87:               case Success(_) => log.error(s"Removed partial backup directory: $tablePath")
Maybe info level


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@96
PS2, Line 96:   private def writeMetadata(options: KuduBackupOptions, session: SparkSession, tablePath: Path, table: KuduTable) = {
Nit: add explicit return type

Could this method just be merged into writeTableMetadata?


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@127
PS2, Line 127:     val conf = session.sparkContext.hadoopConfiguration
Now that we do this 3 line preamble to get the filesystem and hpath a few times, maybe we should move these methods to a utility?


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@135
PS2, Line 135:   private def getTableMetadata(options: KuduBackupOptions, session: SparkSession, tablePath: Path, table: KuduTable) = Try {
What is this for? It looks like it writes the metadata.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala:

http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@66
PS2, Line 66:           BackupSuccess(t)
Should the be RestoreSuccess/RestoreFailure? Since their isn't a ton of custom functionality perhaps using Scala's built in Try directly is good enough and more lightweight.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@79
PS2, Line 79:   def readTableMetadata(path: Path, session: SparkSession): Try[TableMetadataPB] = Try {
Nit: Explicit return type.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@91
PS2, Line 91:   private def getTable(metadata: TableMetadataPB, options: KuduRestoreOptions, context: KuduContext): Try[KuduTable] = Try {
Nit: Explicit return type

Maybe getOrCreateTable to be more descriptive?


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
File java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala:

http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala@85
PS2, Line 85:         val backupRows = kuduContext.kuduRDD(ss.sparkContext, s"$tableName").collect
nit: accidental indent


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala@140
PS2, Line 140:   test("Backup Failure When No Connection to Kudu") {
This could fail because the path is not found. Need to validate why it failed.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala@147
PS2, Line 147:   test("Restore Throws Exception When Path Not Found") {
Need to fill this out.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala@151
PS2, Line 151:   test("Restore Throws Exception When Table Already Exists") {
Need to fill this out.



-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 2
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 20:11:34 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#12).

Change subject: Add error handling to Backup and Restore
......................................................................

Add error handling to Backup and Restore

A backup/restore will be attempted on every,regardless of any one table
failure.  If any one table fails the job will exit with a non-zero code.

If a table is successfully backed up in the current job or the table
metadata has a timestamp equaling the user-supplied timestamp, the table
will be marked successful.

If a table fails to back up any partial data will be removed. If a table fails
to restore and a restore Kudu table has already been created the
`-restore` table will not be removed. This should probably be done in a
later commit.

A list of failed/success full backups or restores will be printed at the
end of each job.

Spark SaveMode overwrite/append semantics are not handled as part of this
commit.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/TableMetadata.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
5 files changed, 342 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/12
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 12
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#10).

Change subject: Add error handling to Backup and Restore
......................................................................

Add error handling to Backup and Restore

A backup/restore will be attempted on every table.  If any one table
fails the job will exit with a non-zero code.

If a table is successfully backed up in the current job or the table
metadata has a timestamp up to the current user supplied timestamp, the table
will be marked successful.

If a table fails to back up any partial data will be removed. If a table fails
to restore and a restore Kudu table has already been created, it will not be
removed.

A list of failed/success full backups or restores will be printed at the
end of each job.

Spark SaveMode overwrite/append semantics are not handled as part of this
commit.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/TableMetadata.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
5 files changed, 344 insertions(+), 114 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/10
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 10
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#8).

Change subject: Add error handling to Backup and Restore
......................................................................

Add error handling to Backup and Restore

A backup/restore will be attempted on every table.  If any one table
fails the job will exit with a non-zero code.

If a table is successfully backed up in the current job or the table
metadata has a timestamp up to the current user supplied timestamp, the table
will be marked successful.
If a table fails to back up any partial data will be removed. If a table fails
to restore and a restore Kudu table has already been created, it will not be
removed.

Spark SaveMode overwrite/append semantics are not handled as part of this
commit.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
A java/kudu-backup/out/test/resources/log4j.properties
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
6 files changed, 369 insertions(+), 145 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/8
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 8
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to KuduBackup

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#4).

Change subject: Add error handling to KuduBackup
......................................................................

Add error handling to KuduBackup

KuduBackup will attempt to backup every table. If any one table fails the
entire job will fail.

If a table is was already backed up to the specified timestamp or the
table is backed up, the table is marked successful.
If a table fails to back up any partial data will be removed.

At the end of the job if any table has failed to backup/restore an
exception is thrown.

Spark SaveMode overwrite/append semantics are not handled as part of this commit.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
5 files changed, 335 insertions(+), 126 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/4
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 4
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#11).

Change subject: Add error handling to Backup and Restore
......................................................................

Add error handling to Backup and Restore

A backup/restore will be attempted on every,regardless of any one table
failure.  If any one table fails the job will exit with a non-zero code.

If a table is successfully backed up in the current job or the table
metadata has a timestamp equaling the user-supplied timestamp, the table
will be marked successful.

If a table fails to back up any partial data will be removed. If a table fails
to restore and a restore Kudu table has already been created the
`-restore` table will not be removed. This should probably be done in a
later commit.

A list of failed/success full backups or restores will be printed at the
end of each job.

Spark SaveMode overwrite/append semantics are not handled as part of this
commit.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/TableMetadata.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
5 files changed, 343 insertions(+), 109 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/11
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 11
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Tony Foerster has restored this change. ( http://gerrit.cloudera.org:8080/10941 )

Change subject: Add error handling to Backup and Restore
......................................................................


Restored

Restoring after fixes
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: restore
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 9
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to KuduBackup

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Tony Foerster has posted comments on this change. ( http://gerrit.cloudera.org:8080/10941 )

Change subject: Add error handling to KuduBackup
......................................................................


Patch Set 4: Code-Review-1

(16 comments)

http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/BackupStatus.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/BackupStatus.scala:

http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/BackupStatus.scala@5
PS2, Line 5: 
> Since their isn't a ton of custom functionality perhaps using Scala's built
I've removed this for now, will tackle it at a later time. Currently just throw an exception and tell the user to look back through the logs for failure.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala:

http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@39
PS2, Line 39: @InterfaceStability.Unstable
> Is there a reason you moved main to the top of the class? I think we should
I've got the 'scalafmt' plugin in intellij and I did the formatting out of reflex. I'll fix the formatting/order


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@56
PS2, Line 56:         .map { metadata =>
> Nit. Use a more specific exception.
Done


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@57
PS2, Line 57:           if (metadata.getToMs >= options.timestampMs) {
> Instead of "at least one". Include the count that failed out of the total s
Done


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@70
PS2, Line 70:   }
> I am not sure this is how we will handle incremental. I am thinking we woul
I've changed this to check the timestamp in the metadata to see if the current timestamp is already backed up. If it is, mark it success. If it's not, the incremental can be handled there. 
If we failed when the metadata exists, and in a yarn job some tables succeeded and there was a retry, the tables that succeeded the first time would be marked failed on the retry.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@74
PS2, Line 74:                           t: String,
> Why did this context get move here? I think it can be shared for all uses i
Done


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@87
PS2, Line 87:       }
> Maybe info level
Done


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@96
PS2, Line 96:             log.error(s"Failed to remove path: $tablePath", ex)
> Nit: add explicit return type
merged


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@127
PS2, Line 127:         session.sqlContext.createDataFrame(rdd, sparkSchema(table.getSchema))
> Now that we do this 3 line preamble to get the filesystem and hpath a few t
It's used twice, once in restore and once in backup (third time was removed). I'd like to leave it for now since the utility class + number of parameters in the method would be a little more complex than the few lines it takes right now.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@135
PS2, Line 135:     }
> What is this for? It looks like it writes the metadata.
removed.


http://gerrit.cloudera.org:8080/#/c/10941/4/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala:

http://gerrit.cloudera.org:8080/#/c/10941/4/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@21
PS4, Line 21: import java.nio.file.{ Path, Paths }
Is there a standard formatting tool you guys use? This is inconsistent with the other files, I'll fix this.


http://gerrit.cloudera.org:8080/#/c/10941/1/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala:

PS1: 
Ignore this file. Restore will be more consistent with Backup.


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala:

http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@66
PS2, Line 66:           val writeOptions =
> Should the be RestoreSuccess/RestoreFailure? Since their isn't a ton of cus
Done


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@79
PS2, Line 79:       }
> Nit: Explicit return type.
Done


http://gerrit.cloudera.org:8080/#/c/10941/2/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@91
PS2, Line 91:     val hPath = new HPath(path.resolve(TableMetadata.MetadataFileName).toString)
> Nit: Explicit return type
Done


http://gerrit.cloudera.org:8080/#/c/10941/4/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
File java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala:

http://gerrit.cloudera.org:8080/#/c/10941/4/java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala@94
PS4, Line 94:     tmpDir = Files.createTempDirectory("backup")
The 'backup' name doesn't make sense in this more global context. Will fix.



-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 4
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>
Gerrit-Comment-Date: Thu, 19 Jul 2018 19:31:28 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Tony Foerster has abandoned this change. ( http://gerrit.cloudera.org:8080/10941 )

Change subject: Add error handling to Backup and Restore
......................................................................


Abandoned

I was considering this a sort of incremental change, but looking at it I think it needs more work. I'll open a new request after reworking.
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 9
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Tony Foerster has posted comments on this change. ( http://gerrit.cloudera.org:8080/10941 )

Change subject: Add error handling to Backup and Restore
......................................................................


Patch Set 9:

Build failure is not related to this changeset (OptionalSSL/TestRpc.TestClientConnectionMetrics/0)


-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 9
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>
Gerrit-Comment-Date: Mon, 23 Jul 2018 20:17:35 +0000
Gerrit-HasComments: No

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#13).

Change subject: Add error handling to Backup and Restore
......................................................................

Add error handling to Backup and Restore

A backup/restore will be attempted on every,regardless of any one table
failure.  If any one table fails the job will exit with a non-zero code.

If a table is successfully backed up in the current job or the table
metadata has a timestamp equaling the user-supplied timestamp, the table
will be marked successful.

If a table fails to back up any partial data will be removed. If a table fails
to restore and a restore Kudu table has already been created the
`-restore` table will not be removed. This should probably be done in a
later commit.

A list of failed/success full backups or restores will be printed at the
end of each job.

Spark SaveMode overwrite/append semantics are not handled as part of this
commit.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/TableMetadata.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
5 files changed, 343 insertions(+), 110 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/13
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 13
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#7).

Change subject: Add error handling to Backup and Restore
......................................................................

Add error handling to Backup and Restore

A backup/restore will be attempted on every table.  If any one table
fails the job will exit with a non-zero code.

If a table is successfully backed up in the current job or the table
metadata has a timestamp up to the current user supplied timestamp, the table
will be marked successful.
If a table fails to back up any partial data will be removed. If a table fails
to restore and a restore Kudu table has already been created, it will not be
removed.

Spark SaveMode overwrite/append semantics are not handled as part of this
commit.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
5 files changed, 354 insertions(+), 153 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/7
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 7
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#9).

Change subject: Add error handling to Backup and Restore
......................................................................

Add error handling to Backup and Restore

A backup/restore will be attempted on every table.  If any one table
fails the job will exit with a non-zero code.

If a table is successfully backed up in the current job or the table
metadata has a timestamp up to the current user supplied timestamp, the table
will be marked successful.
If a table fails to back up any partial data will be removed. If a table fails
to restore and a restore Kudu table has already been created, it will not be
removed.

Spark SaveMode overwrite/append semantics are not handled as part of this
commit.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
5 files changed, 346 insertions(+), 145 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/9
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 9
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] WIP Add error handling to KuduBackup DONT BUILD KuduBackup will attempt to backup every table. If any one table fails the entire job will fail.

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#2).

Change subject: WIP Add error handling to KuduBackup DONT_BUILD KuduBackup will attempt to backup every table. If any one table fails the entire job will fail.
......................................................................

WIP Add error handling to KuduBackup
DONT_BUILD
KuduBackup will attempt to backup every table. If any one table fails the
entire job will fail.

A backup attempt can have 2 results:
  - Success - table was backed up or an existing backup is in the output
  dir. Incremental backup to be implemented
  - Failure - table failed to backup. Any partial data will be removed

At the end of the job lists of successes and failures are printed out.
An exception is thrown if there are any table failures.

TODO -
- if 'overwrite' flag is specified, overwrite the 'metadata'
folder. Not sure yet how to handle this on retries or if it's even a
good idea.
- more tests

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
A java/kudu-backup/out/test/resources/log4j.properties
A java/kudu-backup/src/main/scala/org/apache/kudu/backup/BackupStatus.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
6 files changed, 232 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/2
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 2
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to KuduBackup

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#3).

Change subject: Add error handling to KuduBackup
......................................................................

Add error handling to KuduBackup

KuduBackup will attempt to backup every table. If any one table fails the
entire job will fail.

If a table is was already backed up to the specified timestamp or the
table is backed up, the table is marked successful.
If a table fails to back up any partial data will be removed.

At the end of the job if any table has failed to backup/restore an
exception is thrown.

Spark SaveMode overwrite/append semantics are not handled as part of this commit.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
A java/kudu-backup/out/test/resources/log4j.properties
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
6 files changed, 358 insertions(+), 126 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/3
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 3
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#5).

Change subject: Add error handling to Backup and Restore
......................................................................

Add error handling to Backup and Restore

A backup/restore will be attempted on every table.  If any one table
fails job will exit with a non-zero code.

If a table is successfully backed up in the current job or the table
metadata has a timestamp up to the current user supplied timestamp, the table
will be marked successful.
If a table fails to back up any partial data will be removed. If a table fails
to restore and a restore table has already been created, it will not be
removed.

At the end of the job if any table has failed to backup/restore an
exception is thrown and the job exits with a non-zero code.

Spark SaveMode overwrite/append semantics are not handled as part of this
commit.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
5 files changed, 368 insertions(+), 159 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/5
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 5
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#6).

Change subject: Add error handling to Backup and Restore
......................................................................

Add error handling to Backup and Restore

A backup/restore will be attempted on every table.  If any one table
fails the job will exit with a non-zero code.

If a table is successfully backed up in the current job or the table
metadata has a timestamp up to the current user supplied timestamp, the table
will be marked successful.
If a table fails to back up any partial data will be removed. If a table fails
to restore and a restore Kudu table has already been created, it will not be
removed.

Spark SaveMode overwrite/append semantics are not handled as part of this
commit.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
5 files changed, 368 insertions(+), 159 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/6
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 6
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#14).

Change subject: Add error handling to Backup and Restore
......................................................................

Add error handling to Backup and Restore

A backup/restore will be attempted on every,regardless of any one table
failure.  If any one table fails the job will exit with a non-zero code.

If a table is successfully backed up in the current job or the table
metadata has a timestamp equaling the user-supplied timestamp, the table
will be marked successful.

If a table fails to back up any partial data will be removed. If a table fails
to restore and a restore Kudu table has already been created the
`-restore` table will not be removed. This should probably be done in a
later commit.

A list of failed/success full backups or restores will be printed at the
end of each job.

Spark SaveMode overwrite/append semantics are not handled as part of this
commit.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/TableMetadata.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
5 files changed, 384 insertions(+), 116 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/14
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 14
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Tony Foerster has posted comments on this change. ( http://gerrit.cloudera.org:8080/10941 )

Change subject: Add error handling to Backup and Restore
......................................................................


Patch Set 8:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala:

http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@35
PS6, Line 35:                                      private val cause: Throwable = null)
> nit: Just None.orNull is a bit unusual. Just null should work.
Done


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@46
PS6, Line 46:     // TODO: Make parallel so each table isn't process serially?
> nit: s/vailid/valid
Done


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@57
PS6, Line 57:             // Backup for this timestamp has already been completed.
> I think we should only consider the case where getToMs == options.timestamp
Done


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@111
PS6, Line 111:     val out = fs.create(hPath, /* overwrite= */ false)
> nit: extra spaces
Done


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala:

http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@90
PS6, Line 90:     val conf = session.sparkContext.hadoopConfiguration
> nit: extra spaces. I think a formatter is aligning the =. This should be fi
right it's my formatter...


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@103
PS6, Line 103:                                context: KuduContext): Try[KuduTable] = Try {
> nit: explicit return type.
?? This has an explicit return type


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@115
PS6, Line 115:   def main(args: Array[String]): Unit = {
> Where is this used? We definitely don't want the restore process dropping t
It was used when removing partial "${tableName}-restore" tables. I've removed that part of the code for now but this was left behind. Fixed.


http://gerrit.cloudera.org:8080/#/c/10941/6/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@136
PS6, Line 136: 
> nit: s/note/not
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 8
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>
Gerrit-Comment-Date: Mon, 23 Jul 2018 19:36:21 +0000
Gerrit-HasComments: Yes

[kudu-CR] Add error handling to Backup and Restore

Posted by "Grant Henke (Code Review)" <ge...@cloudera.org>.
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/10941 )

Change subject: Add error handling to Backup and Restore
......................................................................


Patch Set 6:

I think you are auto-reformatting all the files you edit.  Using an auto-formatter for Scala is something I support, but it should be enabled and standardized as a separate patch. Until then please, disable it and keep the formatting changes to lines you are changing as a part of this feature.


-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 6
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>
Gerrit-Comment-Date: Fri, 20 Jul 2018 20:44:40 +0000
Gerrit-HasComments: No

[kudu-CR] Add error handling to Backup and Restore

Posted by "Tony Foerster (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10941

to look at the new patch set (#15).

Change subject: Add error handling to Backup and Restore
......................................................................

Add error handling to Backup and Restore

A backup/restore will be attempted on every,regardless of any one table
failure.  If any one table fails the job will exit with a non-zero code.

If a table is successfully backed up in the current job or the table
metadata has a timestamp equaling the user-supplied timestamp, the table
will be marked successful.

If a table fails to back up any partial data will be removed. If a table fails
to restore and a restore Kudu table has already been created the
`-restore` table will not be removed. This should probably be done in a
later commit.

A list of failed/success full backups or restores will be printed at the
end of each job.

Spark SaveMode overwrite/append semantics are not handled as part of this
commit.

Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
---
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
M java/kudu-backup/src/main/scala/org/apache/kudu/backup/TableMetadata.scala
M java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
5 files changed, 384 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/10941/15
-- 
To view, visit http://gerrit.cloudera.org:8080/10941
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I104fd604e12b70fcff9beace71ed4ef96b54d691
Gerrit-Change-Number: 10941
Gerrit-PatchSet: 15
Gerrit-Owner: Tony Foerster <an...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tony Foerster <an...@gmail.com>