Posted to common-issues@hadoop.apache.org by "Steve Loughran (JIRA)" <ji...@apache.org> on 2017/06/02 20:24:05 UTC

[jira] [Commented] (HADOOP-13786) Add S3Guard committer for zero-rename commits to consistent S3 endpoints

    [ https://issues.apache.org/jira/browse/HADOOP-13786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035358#comment-16035358 ] 

Steve Loughran commented on HADOOP-13786:
-----------------------------------------

Patch 030: evolution based on integration testing in Spark (and so using its commit workflow), with the InconsistentAmazonS3Client enabled and s3guard both on and off.

* the _SUCCESS marker contains more information & diagnostics
* various bits of tuning (making cleanup resilient to inconsistencies between list results and what is actually in the store)
* docs

It's in sync with commit 0fbb4aa in [https://github.com/hortonworks-spark/cloud-integration], as is [the documentation|https://github.com/hortonworks-spark/cloud-integration/blob/master/cloud-committer/src/main/site/markdown/index.md].

The core integration tests are working; more are always welcome. I plan to scale things up and create one or more tests designed to work on large clusters. This is all just querying data, but it adds validation of the data from the _SUCCESS marker, which is new.

Example printout of the success marker data:
{code}
2017-06-02 19:59:19,780 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO  s3.S3AOperations (Logging.scala:logInfo(54)) - success data at s3a://hwdev-steve-new/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/_SUCCESS : SuccessData{committer='PartitionedStagingCommitter', hostname='HW13176.cotham.uk', description='Task committer attempt_20170602195913_0000_m_000000_0', date='Fri Jun 02 19:59:17 BST 2017', filenames=[/cloud-integration/DELAY_LISTING_ME/S3ACommitDataframeSuite/dataframe-committer/partitioned/orc/part-00000-f22d488c-dad0-4fa5-8ca4-8d00b058c77c-c000.snappy.orc]}
2017-06-02 19:59:19,781 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO  s3.S3AOperations (Logging.scala:logInfo(54)) - Metrics:
  S3guard_metadatastore_put_path_latency50thPercentileLatency = 548156
  S3guard_metadatastore_put_path_latency75thPercentileLatency = 548156
  S3guard_metadatastore_put_path_latency90thPercentileLatency = 548156
  S3guard_metadatastore_put_path_latency95thPercentileLatency = 548156
  S3guard_metadatastore_put_path_latency99thPercentileLatency = 548156
  S3guard_metadatastore_put_path_latencyNumOps = 1
  committer_bytes_committed = 384
  committer_commits_aborted = 0
  committer_commits_completed = 1
  committer_commits_created = 1
  committer_commits_failed = 0
  committer_commits_reverted = 0
  committer_jobs_completed = 1
  committer_jobs_failed = 0
  committer_tasks_completed = 1
  committer_tasks_failed = 0
  directories_created = 1
  directories_deleted = 0
  fake_directories_deleted = 6
  files_copied = 0
  files_copied_bytes = 0
  files_created = 0
  files_deleted = 2
  ignored_errors = 1
  object_continue_list_requests = 0
  object_copy_requests = 0
  object_delete_requests = 2
  object_list_requests = 5
  object_metadata_requests = 8
  object_multipart_aborted = 0
  object_put_bytes = 384
  object_put_bytes_pending = 0
  object_put_requests = 2
  object_put_requests_active = 0
  object_put_requests_completed = 2
  op_copy_from_local_file = 0
  op_exists = 2
  op_get_file_status = 4
  op_glob_status = 0
  op_is_directory = 0
  op_is_file = 0
  op_list_files = 0
  op_list_located_status = 0
  op_list_status = 0
  op_mkdirs = 0
  op_rename = 0
  s3guard_metadatastore_initialization = 0
  s3guard_metadatastore_put_path_request = 2
  stream_aborted = 0
  stream_backward_seek_operations = 0
  stream_bytes_backwards_on_seek = 0
  stream_bytes_discarded_in_abort = 0
  stream_bytes_read = 0
  stream_bytes_read_in_close = 0
  stream_bytes_skipped_on_seek = 0
  stream_close_operations = 0
  stream_closed = 0
  stream_forward_seek_operations = 0
  stream_opened = 0
  stream_read_exceptions = 0
  stream_read_fully_operations = 0
  stream_read_operations = 0
  stream_read_operations_incomplete = 0
  stream_seek_operations = 0
  stream_write_block_uploads = 0
  stream_write_block_uploads_aborted = 0
  stream_write_block_uploads_active = 0
  stream_write_block_uploads_committed = 0
  stream_write_block_uploads_data_pending = 0
  stream_write_block_uploads_pending = 0
  stream_write_failures = 0
  stream_write_total_data = 0
  stream_write_total_time = 0

2017-06-02 19:59:19,782 [ScalaTest-main-running-S3ACommitDataframeSuite] INFO  s3.S3AOperations (Logging.scala:logInfo(54)) - Diagnostics:
  fs.s3a.committer.magic.enabled = true
  fs.s3a.metadatastore.authoritative = false
  fs.s3a.metadatastore.impl = org.apache.hadoop.fs.s3a.s3guard.LocalMetadataStore
{code}
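
A dump like the one above can be checked mechanically. What follows is only a minimal sketch in Python, not the test code from the patch or from the cloud-integration repository: it assumes the metrics arrive as plain "name = integer" lines, and the function names and the choice of which counters must be zero are illustrative assumptions, picked because a zero-rename commit should not rename or copy anything.

```python
import re

# Matches lines such as "  op_rename = 0". Timestamped log lines and the
# dotted "fs.s3a..." diagnostics entries do not match and are skipped.
METRIC_LINE = re.compile(r"^\s*([A-Za-z0-9_]+)\s*=\s*(-?\d+)\s*$")


def parse_metrics(log_text):
    """Collect every 'name = integer' line of a dump into a dict."""
    metrics = {}
    for line in log_text.splitlines():
        match = METRIC_LINE.match(line)
        if match:
            metrics[match.group(1)] = int(match.group(2))
    return metrics


def zero_rename_violations(metrics):
    """Return the counters, from an assumed list, that are non-zero.

    A counter missing from the dump is treated as zero here; a stricter
    check might want to report it instead.
    """
    must_be_zero = (
        "op_rename",
        "files_copied",
        "object_copy_requests",
        "committer_commits_failed",
        "committer_tasks_failed",
    )
    return [name for name in must_be_zero if metrics.get(name, 0) != 0]


# A few lines copied from the dump above.
sample = """
  committer_commits_completed = 1
  committer_commits_failed = 0
  committer_tasks_failed = 0
  files_copied = 0
  object_copy_requests = 0
  op_rename = 0
"""
metrics = parse_metrics(sample)
violations = zero_rename_violations(metrics)
```

For the run logged above, the violations list would be empty, since op_rename, files_copied, object_copy_requests and the failure counters are all 0.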



> Add S3Guard committer for zero-rename commits to consistent S3 endpoints
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-13786
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13786
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-13786-HADOOP-13345-001.patch, HADOOP-13786-HADOOP-13345-002.patch, HADOOP-13786-HADOOP-13345-003.patch, HADOOP-13786-HADOOP-13345-004.patch, HADOOP-13786-HADOOP-13345-005.patch, HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-006.patch, HADOOP-13786-HADOOP-13345-007.patch, HADOOP-13786-HADOOP-13345-009.patch, HADOOP-13786-HADOOP-13345-010.patch, HADOOP-13786-HADOOP-13345-011.patch, HADOOP-13786-HADOOP-13345-012.patch, HADOOP-13786-HADOOP-13345-013.patch, HADOOP-13786-HADOOP-13345-015.patch, HADOOP-13786-HADOOP-13345-016.patch, HADOOP-13786-HADOOP-13345-017.patch, HADOOP-13786-HADOOP-13345-018.patch, HADOOP-13786-HADOOP-13345-019.patch, HADOOP-13786-HADOOP-13345-020.patch, HADOOP-13786-HADOOP-13345-021.patch, HADOOP-13786-HADOOP-13345-022.patch, HADOOP-13786-HADOOP-13345-023.patch, HADOOP-13786-HADOOP-13345-024.patch, HADOOP-13786-HADOOP-13345-025.patch, HADOOP-13786-HADOOP-13345-026.patch, HADOOP-13786-HADOOP-13345-027.patch, HADOOP-13786-HADOOP-13345-028.patch, HADOOP-13786-HADOOP-13345-028.patch, HADOOP-13786-HADOOP-13345-029.patch, HADOOP-13786-HADOOP-13345-030.patch, objectstore.pdf, s3committer-master.zip
>
>
> A goal of this code is "support O(1) commits to S3 repositories in the presence of failures". Implement it, including whatever is needed to demonstrate the correctness of the algorithm. (That is, assuming that s3guard provides a consistent view of the presence/absence of blobs, show that we can commit directly.)
> I consider us free to expose the blobstore-ness of the s3 output streams (i.e. data is not visible until the close()), if we need to use that to allow us to abort commit operations.
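
The "not visible until the close()" property in the description can be pictured with a toy in-memory model. This is a sketch under stated assumptions, not the algorithm in the patch and not the S3 API: ToyBlobStore and its methods are hypothetical names, and the model only shows why delaying visibility lets a commit be completed or aborted without any rename.

```python
class ToyBlobStore:
    """In-memory stand-in for an object store with delayed visibility."""

    def __init__(self):
        self.visible = {}   # key -> bytes: what a reader can see
        self.pending = {}   # upload id -> (key, bytes): hidden until completed
        self._next_id = 0

    def start_upload(self, key, data):
        """Stage data for a key; nothing becomes visible yet."""
        upload_id = self._next_id
        self._next_id += 1
        self.pending[upload_id] = (key, data)
        return upload_id

    def complete(self, upload_id):
        """Commit: the staged data appears at its final key, with no rename."""
        key, data = self.pending.pop(upload_id)
        self.visible[key] = data

    def abort(self, upload_id):
        """Abort: the staged data is discarded and never becomes visible."""
        self.pending.pop(upload_id, None)


store = ToyBlobStore()
kept = store.start_upload("out/part-00000", b"data from the winning attempt")
dropped = store.start_upload("out/part-00001", b"data from a failed attempt")

before_commit = sorted(store.visible)   # nothing is visible yet
store.complete(kept)                    # commit one upload
store.abort(dropped)                    # abort the other
after_commit = sorted(store.visible)    # only the committed key is visible
```

In this model the cost of a commit depends on the number of pending uploads to complete, not on the amount of data written, which is the property the zero-rename approach relies on.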


