You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/04/06 21:49:58 UTC

[GitHub] [druid] clintropolis opened a new pull request #9630: check paths used for shuffle intermediary data manager get and delete

clintropolis opened a new pull request #9630: check paths used for shuffle intermediary data manager get and delete
URL: https://github.com/apache/druid/pull/9630
 
 
   ### Description
   This PR fixes an issue spotted by LGTM where strings provided by an HTTP request to the `ShuffleResource` used by native batch indexing were not checked, allowing paths to be supplied to delete, and to a lesser extent, read files outside of the intended storage directory.
   
   Both of the fix areas are flagged with: https://lgtm.com/rules/5970070/
   
   The added tests would fail prior to this PR and serve to illustrate the bug. I moved logic of `DataSchema. validateDatasourceName` and `RandomIdUtils` into `druid-core` under `TaskIdUtils` so that this logic could be recycled to ensure that the `supervisorTaskId` are chill to use.
   
   <hr>
   
   This PR has:
   - [ ] been self-reviewed.
   - [ ] added documentation for new or modified features or behaviors.
   - [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
   - [ ] added or updated version, license, or notice information in [licenses.yaml](https://github.com/apache/druid/blob/master/licenses.yaml)
   - [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
   - [ ] added unit tests or modified existing tests to cover new code paths.
   - [ ] added integration tests.
   - [ ] been tested in a test Druid cluster.
   
   <!-- Check the items by putting "x" in the brackets for the done things. Not all of these items apply to every PR. Remove the items which are not done or not relevant to the PR. None of the items from the checklist above are strictly necessary, but it would be very helpful if you at least self-review the PR. -->
   
   <hr>
   
   ##### Key changed/added classes in this PR
    * `MyFoo`
    * `OurBar`
    * `TheirBaz`
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] lgtm-com[bot] commented on issue #9630: check paths used for shuffle intermediary data manager get and delete

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on issue #9630: check paths used for shuffle intermediary data manager get and delete
URL: https://github.com/apache/druid/pull/9630#issuecomment-610344733
 
 
   This pull request **fixes 2 alerts** when merging d120c288fecaab124e1953f373ea6c2b9e842d1a into 79522f3e25f434caa2e7e56aa97a064a91af805f - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-5947963af448f01faa00550304f9ca50fe070b42)
   
   **fixed alerts:**
   
   * 2 for Uncontrolled data used in path expression

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] jihoonson merged pull request #9630: check paths used for shuffle intermediary data manager get and delete

Posted by GitBox <gi...@apache.org>.
jihoonson merged pull request #9630: check paths used for shuffle intermediary data manager get and delete
URL: https://github.com/apache/druid/pull/9630
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] ccaominh commented on a change in pull request #9630: check paths used for shuffle intermediary data manager get and delete

Posted by GitBox <gi...@apache.org>.
ccaominh commented on a change in pull request #9630: check paths used for shuffle intermediary data manager get and delete
URL: https://github.com/apache/druid/pull/9630#discussion_r404495536
 
 

 ##########
 File path: indexing-service/src/main/java/org/apache/druid/indexing/worker/IntermediaryDataManager.java
 ##########
 @@ -336,6 +337,7 @@ long addSegment(String supervisorTaskId, String subTaskId, DataSegment segment,
   @Nullable
   public File findPartitionFile(String supervisorTaskId, String subTaskId, Interval interval, int partitionId)
   {
+    TaskIdUtils.validateId("supervisorTaskId", supervisorTaskId);
 
 Review comment:
   What do you think about moving the validation up the stack to right when the external data enters druid (i.e., in the API resource layer like `ShuffleResource.getPartition()/deletePartitions()`)? For example, in the future, there may be clients of the lower level code that do not need to do the validation.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] lgtm-com[bot] commented on issue #9630: check paths used for shuffle intermediary data manager get and delete

Posted by GitBox <gi...@apache.org>.
lgtm-com[bot] commented on issue #9630: check paths used for shuffle intermediary data manager get and delete
URL: https://github.com/apache/druid/pull/9630#issuecomment-610078841
 
 
   This pull request **fixes 2 alerts** when merging c009e2ee043421990b21e225c536f3ca08baa88c into fc2897da1d27e5846ee5b6ad10ffb45ee65f0be5 - [view on LGTM.com](https://lgtm.com/projects/g/apache/druid/rev/pr-f58b695a06c92bfdcab668e6f3c839cac117dd7f)
   
   **fixed alerts:**
   
   * 2 for Uncontrolled data used in path expression

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org


[GitHub] [druid] clintropolis commented on a change in pull request #9630: check paths used for shuffle intermediary data manager get and delete

Posted by GitBox <gi...@apache.org>.
clintropolis commented on a change in pull request #9630: check paths used for shuffle intermediary data manager get and delete
URL: https://github.com/apache/druid/pull/9630#discussion_r404501766
 
 

 ##########
 File path: indexing-service/src/main/java/org/apache/druid/indexing/worker/IntermediaryDataManager.java
 ##########
 @@ -336,6 +337,7 @@ long addSegment(String supervisorTaskId, String subTaskId, DataSegment segment,
   @Nullable
   public File findPartitionFile(String supervisorTaskId, String subTaskId, Interval interval, int partitionId)
   {
+    TaskIdUtils.validateId("supervisorTaskId", supervisorTaskId);
 
 Review comment:
   It seems safer to me to be here where the problem is possible so that any future potential caller wouldn't need to worry about forgetting to do this and avoid the possibility.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org