You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/03/30 22:20:16 UTC

[GitHub] [druid] zachjsh opened a new pull request #9588: Allow Cloud SegmentKillers to be instantiated without segment bucket or path

zachjsh opened a new pull request #9588: Allow Cloud SegmentKillers to be instantiated without segment bucket or path
URL: https://github.com/apache/druid/pull/9588
 
 
   ### Description
   
   This change fixes a bug that was introduced that causes ingestion
   to fail if data is ingested from one of the supported cloud storages
   (Azure, Google, S3), and the user is using another type of storage
   for deep storage. In this case the all segment killer implementations
   are instantiated. A change recently made forced a dependency between
   the supported cloud storage type SegmentKiller classes and the
   deep storage configuration for that storage type being set, which
   forced the deep storage bucket and prefix to be non-null. This caused
   a NullPointerException to be thrown when instantiating the
   SegmentKiller classes during ingestion.
   
   To fix this issue, the respective deep storage segment configs for the
   cloud storage types supported in druid are now allowed to have nullable
   bucket and prefix configurations
   
   This PR has:
   - [ ] been self-reviewed.
      - [ ] using the [concurrency checklist](https://github.com/apache/druid/blob/master/dev/code-review/concurrency.md) (Remove this item if the PR doesn't have any relation to concurrency.)
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson edited a comment on issue #9588: Allow Cloud Deep Storage configs without segment bucket or path specified

Posted by GitBox <gi...@apache.org>.
jihoonson edited a comment on issue #9588: Allow Cloud Deep Storage configs without segment bucket or path specified
URL: https://github.com/apache/druid/pull/9588#issuecomment-606306324
 
 
   Please regard my comment as a non-blocker. The approach of this PR looks good to me as a short term solution.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on issue #9588: Allow Cloud Deep Storage configs without segment bucket or path specified

Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9588: Allow Cloud Deep Storage configs without segment bucket or path specified
URL: https://github.com/apache/druid/pull/9588#issuecomment-606306324
 
 
   Please regard my comment as not a blocker. The approach of this PR looks good to me as a short term solution.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis merged pull request #9588: Allow Cloud Deep Storage configs without segment bucket or path specified

Posted by GitBox <gi...@apache.org>.
clintropolis merged pull request #9588: Allow Cloud Deep Storage configs without segment bucket or path specified
URL: https://github.com/apache/druid/pull/9588
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] zachjsh commented on issue #9588: Allow Cloud Deep Storage configs without segment bucket or path specified

Posted by GitBox <gi...@apache.org>.
zachjsh commented on issue #9588: Allow Cloud Deep Storage configs without segment bucket or path specified
URL: https://github.com/apache/druid/pull/9588#issuecomment-606301116
 
 
   Manual Tests ran:
   
   1. Start local druid cluster with S3 deep strorage, and ingest data from google storage. 
   
   2. Start local druid cluster with S3 deep strorage, and ingest data from azure storage. 
   
   Both tests succeeded with these changes, and failed without

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson commented on issue #9588: Allow Cloud Deep Storage configs without segment bucket or path specified

Posted by GitBox <gi...@apache.org>.
jihoonson commented on issue #9588: Allow Cloud Deep Storage configs without segment bucket or path specified
URL: https://github.com/apache/druid/pull/9588#issuecomment-606303285
 
 
   As I commented on a [similar PR](https://github.com/apache/druid/pull/9296#pullrequestreview-351220321), I'm wondering whether it's better to split the inputSources and the deep storage types into different extensions. I think it's better to split them because 
   
   1) the configurations will be still non-nulls and thus you don't have to add null checks everywhere which  is less error-prone.
   2) the error will be still thrown during the initialization instead of when a relevant method is called while the cluster is running. This could matter because it would be nice to fail fast when some configuration is accidentally wrong so that users don't have to restart the cluster whenever they find wrong configurations.
   3) if we unify all extension inputSources as a separate extension, users can just choose the new extension to use all input source types which is better than listing all different extensions in the load list.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org