Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2021/03/19 18:11:51 UTC

[GitHub] [druid] techdocsmith opened a new pull request #11016: Update security overview with additional recommendations

techdocsmith opened a new pull request #11016:
URL: https://github.com/apache/druid/pull/11016


   Update security overview with additional recommendations and improved security
   
   ### Description
   
   Adds best practice to clarify Druid behavior and necessary measures to run Druid in a secure environment.
   
   This PR has:
   - [x] been self-reviewed.
   
   cc: @2bethere , @suneet-s 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org

[GitHub] [druid] 2bethere commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

2bethere commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r600873201



##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.

Review comment:
       Actually, I think any authenticated user is expected to act in a non-malicious way. If you are worried about those users, you should build another layer on top of Druid.





[GitHub] [druid] techdocsmith commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

techdocsmith commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r600917529



##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.

Review comment:
       What can a read-only user do to attack? Asking for my own knowledge. If I can't write anything, I can't submit malicious code or the like, right? (I may be naive on this front.)





[GitHub] [druid] techdocsmith commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

techdocsmith commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r600913738



##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.
+
+
+Based on this expectation, Druid operates according to the following principles:
+
+From the inner most layer:
+1. Druid processes run within the system user context. They have access to the local files granted to the specified system user.
+2. The Druid ingestion system can create new processes to execute tasks. Those tasks inherit the user of their parent process. This means that any user authorized to submit an ingestion task can use the ingestion task permissions to read or write any local files that the Druid process has access to.
+
+> Note: Only grant the permission to submit ingestion tasks to trusted users because they can read and write to local file system.
+
+Within the cluster:
+1. Druid assumes it operates on an isolated, protected network where no reachable IP within the network is under adversary control. It is the responsibility of system implementers to setup firewalls and other methods of protection.

Review comment:
       I think you may need to secure both incoming and outgoing connections, no? If Druid runs as a user that has permissions, it may have access to other network resources. This almost raises the question of whether we need a security whitepaper-type doc with illustrative examples of the best practices. (Not the goal of this PR.)
   
   Here's my attempt to clarify FWIW:
   > Druid assumes it operates on an isolated, protected network where no reachable IP within the network is under adversary control. When you implement Druid, take care to set up firewalls and other security measures to secure both inbound and outbound connections.





[GitHub] [druid] 2bethere commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

2bethere commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r600875125



##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.

Review comment:
       ```suggestion
   Users with write privileges to any data source are considered Druid administrators. Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
   ```
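
   For illustration only, and not part of the PR: the "do not run Druid as the root user" recommendation could be enforced by a small pre-start check. The sketch below assumes a Unix host; the function names `running_as_root` and `check_unprivileged` are hypothetical and do not come from Druid.

```python
import os


def running_as_root(euid: int) -> bool:
    """Return True when the given effective user ID is the root account (0)."""
    return euid == 0


def check_unprivileged() -> None:
    """Abort if the current process runs as root (hypothetical pre-start guard)."""
    # os.geteuid() is only available on Unix-like systems.
    if running_as_root(os.geteuid()):
        raise SystemExit("Refusing to start: run this service as an unprivileged user.")
```

   A launcher script could call `check_unprivileged()` before starting the service, so a misconfigured deployment fails early rather than running with root's file access.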





[GitHub] [druid] techdocsmith commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

techdocsmith commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r610215140



##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Do not expose the Druid Console without authentication on untrusted networks. Authenticated Druid Console users have the same permissions as the OS user running the Druid Console process.
+* Use an API gateway to restrict access from untrusted networks, create an allow list of specific APIs that your users need to access, and implement account lockout and throttling features.
+* When possible, use firewall and other network layer filtering to only expose Druid services and ports specifically required for your use case. For example, only expose Broker ports to downstream applications that execute queries. You can limit access to a specific IP address or IP range to further tighten and enhance security.

Review comment:
       Now that we have grouped these, I think it is OK to keep them as separate points.





[GitHub] [druid] jihoonson commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

jihoonson commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r610919733



##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +322,31 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Within Druid's trust model there users can have different authorization levels:
+- Users with resource write permissions can are allowed to anything that the druid process can do.
+- Authenticated read only users can execute queries against resources to which they have permissions.
+- An authenticated user without any permissions is allowed to execute queries that don't require access to a resource.
+
+Additionally, Druid operates according to the following principles:
+
+From the inner most layer:
+1. Druid processes have the same access to the local files granted to the specified system user running the process.
+2. The Druid ingestion system can create new processes to execute tasks. Those tasks inherit the user of their parent process. This means that any user authorized to submit an ingestion task can use the ingestion task permissions to read or write any local files or external resources that the Druid process has access to.
+
+> Note: Only grant the permission to submit ingestion tasks to trusted users because they can  act as the Druid process.

Review comment:
       Here and [this figure](https://github.com/apache/druid/pull/11016/files#diff-3e8eb443238c8a04b52e7691033cfa2b8bd133611434b548c5cd9eaa3a8a72c3R189) talk about the permission to submit ingestion tasks. This seems ambiguous to me. Maybe it would be better to say the `DATASOURCE WRITE` permission instead.

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +24,112 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
+The following recommendations apply to the Druid cluster setup:
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrators have the same OS permissions as the Unix user account running Druid. See [Authentication and authorization model](security-user-auth.md#authentication-and-authorization-model). If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Enable authorization and do not expose the Druid Console without authorization enabled. If authorization is not enabled, any user that has access to the web console has the same privileges as the operating system user that runs the Druid Console process.
+* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.
+* * Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.
+
+The following recommendations apply the network where Druid runs:
+* Enable TLS to encrypt communication within the cluster.
+* Use an API gateway to:
+  - Restrict access from untrusted networks
+  - Create an allow list of specific APIs that your users need to access
+  - Implement account lockout and throttling features.
+* When possible, use firewall and other network layer filtering to only expose Druid services and ports specifically required for your use case. For example, only expose Broker ports to downstream applications that execute queries. You can limit access to a specific IP address or IP range to further tighten and enhance security.
+
+The following recommendation applies to Druids authorization and authentication model:
+* Only grant `WRITE` permissions to any `DATASOURCE` to trusted users. Druid's trust model assumes those users have the same privileges as the operating system user that runs the Druid Console process. 
+* Only grant `STATE READ`, `STATE WRITE`, and `DATASOURCE WRITE` permissions to highly-trusted users. These permissions allows users to access resources on behalf of the Druid server process regardless of the datasource.

Review comment:
       I think `CONFIG WRITE` should be an administrator-ish permission too, as you can update dynamic system configs such as lookups with it. I'm not sure about `STATE READ` though. What bad things can happen when a malicious user has the `STATE READ` permission?

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +24,112 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices

Review comment:
       Suggest linking these configs somewhere in this section.
   
   - Ingestion security configs: https://github.com/apache/druid/blob/master/docs/configuration/index.md#ingestion-security-configuration
   - JDBC connections security configs: https://github.com/apache/druid/blob/master/docs/configuration/index.md#jdbc-connections-to-external-databases

##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +322,31 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Within Druid's trust model there users can have different authorization levels:
+- Users with resource write permissions can are allowed to anything that the druid process can do.

Review comment:
       Should `can` also be eliminated? Like, `Users with resource write permissions are allowed to do`





[GitHub] [druid] 2bethere commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

2bethere commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r600917724



##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.
+
+
+Based on this expectation, Druid operates according to the following principles:
+
+From the inner most layer:
+1. Druid processes run within the system user context. They have access to the local files granted to the specified system user.
+2. The Druid ingestion system can create new processes to execute tasks. Those tasks inherit the user of their parent process. This means that any user authorized to submit an ingestion task can use the ingestion task permissions to read or write any local files that the Druid process has access to.
+
+> Note: Only grant the permission to submit ingestion tasks to trusted users because they can read and write to local file system.
+
+Within the cluster:
+1. Druid assumes it operates on an isolated, protected network where no reachable IP within the network is under adversary control. It is the responsibility of system implementers to setup firewalls and other methods of protection.
+2. Druid supports TLS encryption for network traffic, including API calls and data transfers.

Review comment:
       Ha, I see. Maybe this:
   ```suggestion
   2. Druid assuming network traffic within the cluster is encrypted, including API calls and data transfers. By default, this is implemented via TLS.
   ```





[GitHub] [druid] techdocsmith commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

techdocsmith commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r600922595



##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.
+
+
+Based on this expectation, Druid operates according to the following principles:
+
+From the inner most layer:
+1. Druid processes run within the system user context. They have access to the local files granted to the specified system user.
+2. The Druid ingestion system can create new processes to execute tasks. Those tasks inherit the user of their parent process. This means that any user authorized to submit an ingestion task can use the ingestion task permissions to read or write any local files that the Druid process has access to.
+
+> Note: Only grant the permission to submit ingestion tasks to trusted users because they can read and write to local file system.
+
+Within the cluster:
+1. Druid assumes it operates on an isolated, protected network where no reachable IP within the network is under adversary control. It is the responsibility of system implementers to setup firewalls and other methods of protection.
+2. Druid supports TLS encryption for network traffic, including API calls and data transfers.

Review comment:
       ```suggestion
   Druid assumes network traffic within the cluster is encrypted, including API calls and data transfers. The default encryption implementation uses TLS.
   ```





[GitHub] [druid] suneet-s merged pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

suneet-s merged pull request #11016:
URL: https://github.com/apache/druid/pull/11016


   



[GitHub] [druid] techdocsmith commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

techdocsmith commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r600907928



##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Do not expose the Druid Console without authentication on untrusted networks. Authenticated Druid Console users have the same permissions as the OS user running the Druid Console process.
+* Use an API gateway to restrict access from untrusted networks, create an allow list of specific APIs that your users need to access, and implement account lockout and throttling features.

Review comment:
       Everything in that list should be in the API gateway. Might be clearer with sub-bullets.
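
   For illustration only, and not part of the PR: the allow-list item could look like the deny-by-default check below. The set of permitted requests is an assumption about what query-only users need, and the names `ALLOWED_REQUESTS` and `is_allowed` are hypothetical. A real gateway would express this in its own configuration format.

```python
# Hypothetical allow list: (HTTP method, path) pairs the gateway forwards.
ALLOWED_REQUESTS = {
    ("POST", "/druid/v2/sql"),  # SQL queries against the Broker
    ("POST", "/druid/v2"),      # native queries against the Broker
    ("GET", "/status/health"),  # health check
}


def is_allowed(method: str, path: str) -> bool:
    """Return True only for exact allow-list matches; everything else is denied."""
    # Normalize the method's case and drop a trailing slash before matching.
    normalized = path.rstrip("/") or "/"
    return (method.upper(), normalized) in ALLOWED_REQUESTS
```

   With this set, a query request such as `POST /druid/v2/sql` passes, while a task submission such as `POST /druid/indexer/v1/task` is rejected because it is not listed.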





[GitHub] [druid] suneet-s commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

suneet-s commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r605294614



##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.

Review comment:
       @techdocsmith It is possible that a security flaw can be exploited where a read only user elevates their permissions. This would be a very serious attack vector, so having a security model that says a read only user should not act maliciously seems counter-intuitive to me.
   
   All security flaws are because someone is acting maliciously. The question then is what are the levels of defense in the system. That's what I thought a security trust model is.





[GitHub] [druid] suneet-s commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

suneet-s commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r605295249



##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +321,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.
+
+
+Based on this expectation, Druid operates according to the following principles:
+
+From the inner most layer:
+1. Druid processes run within the system user context. They have access to the local files granted to the specified system user.
+2. The Druid ingestion system can create new processes to execute tasks. Those tasks inherit the user of their parent process. This means that any user authorized to submit an ingestion task can use the ingestion task permissions to read or write any local files or external resources that the Druid process has access to.
+
+> Note: Only grant the permission to submit ingestion tasks to trusted users because they can read and write to local file system.

Review comment:
       It's more than just the filesystem; they can make network calls as well.
   
   ```suggestion
   > Note: Only grant the permission to submit ingestion tasks to trusted users because they can act as the Druid process.
   ```





[GitHub] [druid] techdocsmith commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

techdocsmith commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r600913738



##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.
+
+
+Based on this expectation, Druid operates according to the following principles:
+
+From the inner most layer:
+1. Druid processes run within the system user context. They have access to the local files granted to the specified system user.
+2. The Druid ingestion system can create new processes to execute tasks. Those tasks inherit the user of their parent process. This means that any user authorized to submit an ingestion task can use the ingestion task permissions to read or write any local files that the Druid process has access to.
+
+> Note: Only grant the permission to submit ingestion tasks to trusted users because they can read and write to local file system.
+
+Within the cluster:
+1. Druid assumes it operates on an isolated, protected network where no reachable IP within the network is under adversary control. It is the responsibility of system implementers to setup firewalls and other methods of protection.

Review comment:
       ```suggestion
   1. Druid assumes it operates on an isolated, protected network where no reachable IP within the network is under adversary control. When you implement Druid, take care to set up firewalls and other security measures to secure both inbound and outbound connections.
   ```
   
   I think you need to secure both incoming and outgoing connections, no? The user that Druid runs as may have permissions to access other network resources. This almost raises the question of whether we need a security whitepaper-type doc with illustrative examples of the best practices (not the goal of this PR).





[GitHub] [druid] suneet-s commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

suneet-s commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r610331091



##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +24,112 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
+The following recommendations apply to the Druid cluster setup:
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrators have the same OS permissions as the Unix user account running Druid. See [Authentication and authorization model](security-user-auth.md#authentication-and-authorization-model). If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Enable authorization and do not expose the Druid Console without authorization enabled. If authorization is not enabled, any user that has access to the web console has the same privileges as the operating system user that runs the Druid Console process.
+* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.
+* * Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.
+
+The following recommendations apply the network where Druid runs:

Review comment:
       ```suggestion
   The following recommendations apply to the network where Druid runs:
   ```

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +24,112 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
+The following recommendations apply to the Druid cluster setup:
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrators have the same OS permissions as the Unix user account running Druid. See [Authentication and authorization model](security-user-auth.md#authentication-and-authorization-model). If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Enable authorization and do not expose the Druid Console without authorization enabled. If authorization is not enabled, any user that has access to the web console has the same privileges as the operating system user that runs the Druid Console process.
+* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.
+* * Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.

Review comment:
       ```suggestion
   * Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.
   ```

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +24,112 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
+The following recommendations apply to the Druid cluster setup:
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrators have the same OS permissions as the Unix user account running Druid. See [Authentication and authorization model](security-user-auth.md#authentication-and-authorization-model). If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Enable authorization and do not expose the Druid Console without authorization enabled. If authorization is not enabled, any user that has access to the web console has the same privileges as the operating system user that runs the Druid Console process.
+* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.
+* * Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.
+
+The following recommendations apply the network where Druid runs:
+* Enable TLS to encrypt communication within the cluster.
+* Use an API gateway to:
+  - Restrict access from untrusted networks
+  - Create an allow list of specific APIs that your users need to access
+  - Implement account lockout and throttling features.
+* When possible, use firewall and other network layer filtering to only expose Druid services and ports specifically required for your use case. For example, only expose Broker ports to downstream applications that execute queries. You can limit access to a specific IP address or IP range to further tighten and enhance security.

Review comment:
       The previous bullet and this one make the same point, just with different technologies. I think we should collapse them and tell operators they can use an API gateway, a firewall, or something else.
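
       To illustrate why the two bullets amount to the same control, here is a minimal Python sketch of the check that either a gateway or a network filter would enforce: accept a request only if it comes from an allowed client range and targets an allowed API path. The names `ALLOWED_CLIENTS`, `ALLOWED_PATHS`, and `is_allowed` are hypothetical, and the address range and path list are assumptions chosen for illustration, not a recommended policy.

```python
import ipaddress

# Hypothetical policy: only downstream query clients on this subnet may
# reach the cluster, and only the Broker query and health endpoints.
ALLOWED_CLIENTS = ipaddress.ip_network("10.0.0.0/24")
ALLOWED_PATHS = frozenset({"/druid/v2", "/druid/v2/sql", "/status/health"})


def is_allowed(client_ip: str, path: str) -> bool:
    """Return True only if both the source address and the path are allowed."""
    if ipaddress.ip_address(client_ip) not in ALLOWED_CLIENTS:
        return False
    # Treat "/druid/v2/" and "/druid/v2" as the same path.
    normalized = path.rstrip("/") or "/"
    return normalized in ALLOWED_PATHS
```

       With this policy, a query client on the allowed subnet can reach the SQL endpoint, while the same client is refused a task-submission path such as `/druid/indexer/v1/task`. Whether the check runs in an API gateway or a firewall rule is an implementation choice.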

##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +322,31 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Within Druid's trust model there users can have different authorization levels:
+- Users with resource write permissions can are allowed to anything that the druid process can do.

Review comment:
       ```suggestion
   - Users with resource write permissions are allowed to do anything that the Druid process can do.
   ```

##########
File path: docs/operations/security-overview.md
##########
@@ -136,37 +193,37 @@ The following steps walk through a sample setup procedure:
 
 > The default Coordinator API port is 8081 for non-TLS connections and 8281 for secured connections.
 
-1. Create a user by issuing a POST request to `druid-ext/basic-security/authentication/db/MyBasicMetadataAuthenticator/users/<USERNAME>`, replacing USERNAME with the new username. For example: 
+1. Create a user by issuing a POST request to `druid-ext/basic-security/authentication/db/MyBasicMetadataAuthenticator/users/<USERNAME>`, replacing USERNAME with the *new* username you are trying to create. For example: 

Review comment:
       This example talks about using the Druid basic security extension. The rest of this doc talks about principles that apply across Druid. It feels like we want to split this into separate pages.
   
   Since this example already existed on this page, I don't think we need to move it in this PR, but perhaps in a follow-up one.





[GitHub] [druid] suneet-s commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

suneet-s commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r600702094



##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Do not expose the Druid Console without authentication on untrusted networks. Authenticated Druid Console users have the same permissions as the OS user running the Druid Console process.
+* Use an API gateway to restrict access from untrusted networks, create an allow list of specific APIs that your users need to access, and implement account lockout and throttling features.

Review comment:
       What is an account lockout? Are these throttling features something that should be implemented outside of Druid?
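
       For context on the question above: account lockout usually means temporarily refusing logins for an account after a run of failed attempts. The following is a minimal Python sketch of that idea as it might be kept in a layer in front of Druid. `LockoutTracker`, `MAX_FAILURES`, and `LOCKOUT_SECONDS` are hypothetical names and values chosen for illustration; nothing here is a Druid API.

```python
import time

MAX_FAILURES = 5         # assumed threshold of consecutive failed logins
LOCKOUT_SECONDS = 300.0  # assumed lock duration


class LockoutTracker:
    """Tracks failed logins per user and locks the account after too many."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._failures = {}      # user -> consecutive failure count
        self._locked_until = {}  # user -> clock value when the lock expires

    def is_locked(self, user):
        return self._clock() < self._locked_until.get(user, float("-inf"))

    def record_failure(self, user):
        count = self._failures.get(user, 0) + 1
        self._failures[user] = count
        if count >= MAX_FAILURES:
            # Start the lock and reset the counter for the next window.
            self._locked_until[user] = self._clock() + LOCKOUT_SECONDS
            self._failures[user] = 0

    def record_success(self, user):
        # A successful login clears the failure streak.
        self._failures.pop(user, None)
```

       A gateway would call `is_locked` before forwarding credentials and `record_failure` or `record_success` after the authentication result comes back.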

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Do not expose the Druid Console without authentication on untrusted networks. Authenticated Druid Console users have the same permissions as the OS user running the Druid Console process.
+* Use an API gateway to restrict access from untrusted networks, create an allow list of specific APIs that your users need to access, and implement account lockout and throttling features.
+* When possible, use firewall and other network layer filtering to only expose Druid services and ports specifically required for your use case. For example, only expose Broker ports to downstream applications that execute queries. You can limit access to a specific IP address or IP range to further tighten and enhance security.
+* Only grant `STATE READ`, `STATE WRITE`, and `DATASOURCE WRITE` permissions to highly-trusted users. These permissions allows users to access resources on behalf of the Druid server process regardless of the datasource.
+* If your Druid client application allows less-trusted users to control the input source or firehose of an ingestion task, validate the URLs from the users. It is possible to point unchecked URLs to other locations and resources within your network or local file system.
+* Enable TLS to encrypt communication within the cluster.
+* You should only grant `WRITE` permissions to any `DATASOURCE` to trusted users. Druid's trust model assumes those users have the same privileges as the operating system user that runs the Druid process.
+* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.
+* Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.
 
+## Authentication and Authorization
 
-This document gives you an overview of security features in Druid and how to configure them, and some best practices for securing Druid.
+You can configure authentication and authorization to control access to the the Druid APIs. The first step is enabling TLS for the cluster nodes. This is essential to ensure passwords and authentication tokens are encrypted over the network.
+Then configure users, roles, and permissions, as described in the following sections. 
 
+The configuration settings mentioned below are primarily located in the `common.runtime.properties` file. Note that you need to make the configuration changes on all Druid server in the cluster.
 
-## Best practices
+## Enable TLS
 
-* Do not expose the Druid Console without authentication on untrusted networks. Access to the console effectively confers access the file system on the installation machine, via file browsers in the UI. You should use an API gateway that restricts who can connect from untrusted networks, allow list the specific APIs that your users need to access, and implements account lockout and throttling features.
-* You should only grant `WRITE` permissions to a `DATASOURCE` to trusted users. Druid assumes that these users have the same privileges as the operating system user that runs the Druid process. 
-* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.  
-* Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.
-* Run Druid as an unprivileged Unix user on the installation machine (not root).
-   > This is an important point! Administrator users on Druid have the same permission as the Unix user account it is running under. If the Druid process is running under the root user account in the OS, then Administrator users on Druid can read/write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+Enabling TLS encrypts the traffic between external clients and the Druid cluster and traffic between services within the cluster.
 
-You can configure authentication and authorization to control access to the the Druid APIs. The first step is enabling TLS for the cluster nodes. Then configure users, roles, and permissions, as described in the following sections. 
+### Generating keys
+Before you enable TLS in Druid, generate the keystore and truststore. When one Druid process, e.g. Broker, contacts another Druid process , e.g. Historical, the first service is a client for the second service, considered the server.
 
-The configuration settings mentioned below are primarily located in the `common.runtime.properties` file. Note that you need to make the configuration changes on each Druid server in the cluster. 
+The client uses a trustStore that contains certificates trusted by the client. For example, the Broker.
 
+The server uses a keyStore that contains private keys and certificate chain used to securely identify itself.
 
-## Enable TLS
+The following example demonstrates how to use Java keytool to generate the keyStore for the server and then create a trustStore to trust the key for the client:
 
-The first step in securing Druid is enabling TLS. You can enable TLS to secure external client connections to Druid as well as connections between cluster nodes. 
+1. Generate the keyStore with Java keytool:

Review comment:
       nit: Then the spelling file won't need keytool to be added to it
   ```suggestion
   1. Generate the keyStore with Java `keytool`:
   ```

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Do not expose the Druid Console without authentication on untrusted networks. Authenticated Druid Console users have the same permissions as the OS user running the Druid Console process.

Review comment:
       I think this should say authorization instead of authentication. Authenticated users who only have READ DATASOURCE privileges can use the web console without access to the files on the Druid server.
   
   ```suggestion
   * Do not expose the Druid Console without authorization enabled. If authorization is not enabled, any user that has access to the web console will have the same permissions as the OS user running the Druid process.
   ```

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices

Review comment:
       Thanks for putting these best practices together. Reading through them, I made a bunch of comments. Here is some structure that I think will help.
   
   We should break this into a few sections:
   1. Druid cluster setup
   1a. Run as a non-root user
   1b. Disable JavaScript
   1c. Enable authentication and authorization
   2. Network setup
   2a. Enable TLS
   2b. Limit access to the cluster, etc.
   3. Authentication and authorization
   3a. Points about authentication and authorization, minimum privileges, etc.

##########
File path: docs/operations/security-overview.md
##########
@@ -136,37 +189,37 @@ The following steps walk through a sample setup procedure:
 
 > The default Coordinator API port is 8081 for non-TLS connections and 8281 for secured connections.
 
-1. Create a user by issuing a POST request to `druid-ext/basic-security/authentication/db/MyBasicMetadataAuthenticator/users/<USERNAME>`, replacing USERNAME with the new username. For example: 
+1. Create a user by issuing a POST request to `druid-ext/basic-security/authentication/db/MyBasicMetadataAuthenticator/users/<USERNAME>`, replacing USERNAME with the *new* username you are trying to create. For example: 
   ```
-   curl -u admin:password -XPOST https://my-coordinator-ip:8281/druid-ext/basic-security/authentication/db/basic/users/myname
+   curl -u admin:password1 -XPOST https://my-coordinator-ip:8281/druid-ext/basic-security/authentication/db/basic/users/myname
   ```
   >  If you have TLS enabled, be sure to adjust the curl command accordingly. For example, if your Druid servers use self-signed certificates, you may choose to include the `insecure` curl option to forgo certificate checking for the curl command. 
 2. Add a credential for the user by issuing a POST to `druid-ext/basic-security/authentication/db/MyBasicMetadataAuthenticator/users/<USERNAME>/credentials`. For example:
     ```
-    curl -u admin:password -H'Content-Type: application/json' -XPOST --data-binary @pass.json https://my-coordinator-ip:8281/druid-ext/basic-security/authentication/db/basic/users/myname/credentials
+    curl -u admin:password1 -H'Content-Type: application/json' -XPOST --data-binary @pass.json https://my-coordinator-ip:8281/druid-ext/basic-security/authentication/db/basic/users/myname/credentials
     ```
     The password is conveyed in the `pass.json` file in the following form:
    	```
    	{
-      "password": "password"
+      "password": "myname_password"
     }
     ```
 2. For each authenticator user you create, create a corresponding authorizer user by issuing a POST request to `druid-ext/basic-security/authorization/db/MyBasicMetadataAuthorizer/users/<USERNAME>`. For example: 
 	```
-	curl -u admin:password -XPOST https://my-coordinator-ip:8281/druid-ext/basic-security/authorization/db/basic/users/myname
+	curl -u admin:password1 -XPOST https://my-coordinator-ip:8281/druid-ext/basic-security/authorization/db/basic/users/myname
 	```
 3. Create authorizer roles to control permissions by issuing a POST request to `druid-ext/basic-security/authorization/db/MyBasicMetadataAuthorizer/roles/<ROLENAME>`. For example: 
 	```
-   curl -u admin:password -XPOST https://my-coordinator-ip:8281/druid-ext/basic-security/authorization/db/basic/roles/myrole
+   curl -u admin:password1 -XPOST https://my-coordinator-ip:8281/druid-ext/basic-security/authorization/db/basic/roles/myrole
    ```
 4. Assign roles to users by issuing a POST request to `druid-ext/basic-security/authorization/db/MyBasicMetadataAuthorizer/users/<USERNAME>/roles/<ROLENAME>`. For example: 
 	```
-	curl -u admin:password -XPOST https://my-coordinator-ip:8281/druid-ext/basic-security/authorization/db/basic/users/myname/roles/myrole | jq
+	curl -u admin:password1 -XPOST https://my-coordinator-ip:8281/druid-ext/basic-security/authorization/db/basic/users/myname/roles/myrole | jq
 	```
 5. Finally, attach permissions to the roles to control how they can interact with Druid at `druid-ext/basic-security/authorization/db/MyBasicMetadataAuthorizer/roles/<ROLENAME>/permissions`. 
 	For example: 
 	```
-	curl -u admin:password -H'Content-Type: application/json' -XPOST --data-binary @perms.json https://my-coordinator-ip:8281/druid-ext/basic-security/authorization/db/basic/roles/myrole/permissions
+	curl -u admin:password1 -H'Content-Type: application/json' -XPOST --data-binary @perms.json https://my-coordinator-ip:8281/druid-ext/basic-security/authorization/db/basic/roles/myrole/permissions

Review comment:
       why the change to `password1`?

##########
File path: docs/operations/security-overview.md
##########
@@ -93,18 +136,28 @@ The following takes you through sample configuration steps for enabling basic au
 
 1. Add the `druid-basic-security` extension to `druid.extensions.loadList` in `common.runtime.properties`. For the quickstart installation, for example, the properties file is at `conf/druid/cluster/_common`:
    ```
-   druid.extensions.loadList=["druid-basic-security", "druid-histogram", "druid-datasketches", "druid-kafka-indexing-service"]
+   druid.extensions.loadList=["druid-basic-security", "druid-histogram", "druid-datasketches", "druid-kafka-indexing-service", "imply-utility-belt"]

Review comment:
       ```suggestion
      druid.extensions.loadList=["druid-basic-security", "druid-histogram", "druid-datasketches", "druid-kafka-indexing-service"]
   ```

##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.
+
+
+Based on this expectation, Druid operates according to the following principles:
+
+From the inner most layer:
+1. Druid processes run within the system user context. They have access to the local files granted to the specified system user.
+2. The Druid ingestion system can create new processes to execute tasks. Those tasks inherit the user of their parent process. This means that any user authorized to submit an ingestion task can use the ingestion task permissions to read or write any local files that the Druid process has access to.
+

Review comment:
       When a Druid cluster is configured to talk to an external resource (for example, S3 or an internal service), the user submitting the task also has permission to talk to that external resource.

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.

Review comment:
       > Druid administrator
   
   Is there documentation anywhere that talks about who a Druid administrator is? I think users coming from other systems will not be aware that someone who has write privileges to any datasource is considered an administrator. This is a really important fact that an operator should be aware of when thinking about securing their Druid cluster.

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Do not expose the Druid Console without authentication on untrusted networks. Authenticated Druid Console users have the same permissions as the OS user running the Druid Console process.
+* Use an API gateway to restrict access from untrusted networks, create an allow list of specific APIs that your users need to access, and implement account lockout and throttling features.
+* When possible, use firewall and other network layer filtering to only expose Druid services and ports specifically required for your use case. For example, only expose Broker ports to downstream applications that execute queries. You can limit access to a specific IP address or IP range to further tighten and enhance security.
+* Only grant `STATE READ`, `STATE WRITE`, and `DATASOURCE WRITE` permissions to highly-trusted users. These permissions allows users to access resources on behalf of the Druid server process regardless of the datasource.
+* If your Druid client application allows less-trusted users to control the input source or firehose of an ingestion task, validate the URLs from the users. It is possible to point unchecked URLs to other locations and resources within your network or local file system.
+* Enable TLS to encrypt communication within the cluster.
+* You should only grant `WRITE` permissions to any `DATASOURCE` to trusted users. Druid's trust model assumes those users have the same privileges as the operating system user that runs the Druid process.
+* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.
+* Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.
 
+## Authentication and Authorization
 
-This document gives you an overview of security features in Druid and how to configure them, and some best practices for securing Druid.
+You can configure authentication and authorization to control access to the the Druid APIs. The first step is enabling TLS for the cluster nodes. This is essential to ensure passwords and authentication tokens are encrypted over the network.
+Then configure users, roles, and permissions, as described in the following sections. 
 
+The configuration settings mentioned below are primarily located in the `common.runtime.properties` file. Note that you need to make the configuration changes on all Druid server in the cluster.
 
-## Best practices
+## Enable TLS
 
-* Do not expose the Druid Console without authentication on untrusted networks. Access to the console effectively confers access the file system on the installation machine, via file browsers in the UI. You should use an API gateway that restricts who can connect from untrusted networks, allow list the specific APIs that your users need to access, and implements account lockout and throttling features.
-* You should only grant `WRITE` permissions to a `DATASOURCE` to trusted users. Druid assumes that these users have the same privileges as the operating system user that runs the Druid process. 
-* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.  
-* Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.
-* Run Druid as an unprivileged Unix user on the installation machine (not root).
-   > This is an important point! Administrator users on Druid have the same permission as the Unix user account it is running under. If the Druid process is running under the root user account in the OS, then Administrator users on Druid can read/write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+Enabling TLS encrypts the traffic between external clients and the Druid cluster and traffic between services within the cluster.
 
-You can configure authentication and authorization to control access to the the Druid APIs. The first step is enabling TLS for the cluster nodes. Then configure users, roles, and permissions, as described in the following sections. 
+### Generating keys
+Before you enable TLS in Druid, generate the keystore and truststore. When one Druid process, e.g. Broker, contacts another Druid process , e.g. Historical, the first service is a client for the second service, considered the server.
 
-The configuration settings mentioned below are primarily located in the `common.runtime.properties` file. Note that you need to make the configuration changes on each Druid server in the cluster. 
+The client uses a trustStore that contains certificates trusted by the client. For example, the Broker.
 
+The server uses a keyStore that contains private keys and certificate chain used to securely identify itself.
 
-## Enable TLS
+The following example demonstrates how to use Java keytool to generate the keyStore for the server and then create a trustStore to trust the key for the client:

Review comment:
       nit: Since `keytool` is a command-line tool, perhaps it should be in backticks. Maybe we should also capitalize KeyStore, since that's how it shows up when I search Google?
   
   ```suggestion
   The following example demonstrates how to use Java `keytool` to generate the KeyStore for the server and then create a trustStore to trust the key for the client:
   ```

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Do not expose the Druid Console without authentication on untrusted networks. Authenticated Druid Console users have the same permissions as the OS user running the Druid Console process.
+* Use an API gateway to restrict access from untrusted networks, create an allow list of specific APIs that your users need to access, and implement account lockout and throttling features.
+* When possible, use firewall and other network layer filtering to only expose Druid services and ports specifically required for your use case. For example, only expose Broker ports to downstream applications that execute queries. You can limit access to a specific IP address or IP range to further tighten and enhance security.
+* Only grant `STATE READ`, `STATE WRITE`, and `DATASOURCE WRITE` permissions to highly-trusted users. These permissions allows users to access resources on behalf of the Druid server process regardless of the datasource.
+* If your Druid client application allows less-trusted users to control the input source or firehose of an ingestion task, validate the URLs from the users. It is possible to point unchecked URLs to other locations and resources within your network or local file system.
+* Enable TLS to encrypt communication within the cluster.
+* You should only grant `WRITE` permissions to any `DATASOURCE` to trusted users. Druid's trust model assumes those users have the same privileges as the operating system user that runs the Druid process.
+* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.
+* Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.

Review comment:
       This should probably be higher in the list.

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Do not expose the Druid Console without authentication on untrusted networks. Authenticated Druid Console users have the same permissions as the OS user running the Druid Console process.
+* Use an API gateway to restrict access from untrusted networks, create an allow list of specific APIs that your users need to access, and implement account lockout and throttling features.
+* When possible, use firewall and other network layer filtering to only expose Druid services and ports specifically required for your use case. For example, only expose Broker ports to downstream applications that execute queries. You can limit access to a specific IP address or IP range to further tighten and enhance security.

Review comment:
       I think this point should be collapsed with the previous point. They're both talking about restricting access to API endpoints via different tools: an API gateway or a firewall.
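
       To illustrate why the two points belong together, here is a minimal sketch of a combined check a gateway might apply: a trusted IP range plus an allow list of API paths. This is not part of Druid. The function name, the `10.0.0.0/8` range, and the choice of allowed prefix are assumptions made for the example, and a real gateway or firewall would express the same rule in its own configuration.

```python
import ipaddress

# Hypothetical allow list: only these path prefixes are reachable through the gateway.
# "/druid/v2/" is used here as an example of exposing only Broker query endpoints.
ALLOWED_PATH_PREFIXES = ("/druid/v2/",)

# Hypothetical trusted client range; replace with the ranges for your own network.
ALLOWED_NETWORKS = (ipaddress.ip_network("10.0.0.0/8"),)


def is_request_allowed(client_ip: str, path: str) -> bool:
    """Return True only if the client IP is trusted and the path is allow-listed."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed addresses are rejected outright.
        return False

    # Reject parent-directory segments so a prefix match cannot be bypassed.
    if ".." in path.split("/"):
        return False

    in_network = any(addr in net for net in ALLOWED_NETWORKS)
    path_ok = any(path.startswith(prefix) for prefix in ALLOWED_PATH_PREFIXES)
    return in_network and path_ok
```

       With these assumed values, a query request from inside the trusted range passes, while a task submission to the Overlord API or any request from outside the range is refused.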

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Do not expose the Druid Console without authentication on untrusted networks. Authenticated Druid Console users have the same permissions as the OS user running the Druid Console process.
+* Use an API gateway to restrict access from untrusted networks, create an allow list of specific APIs that your users need to access, and implement account lockout and throttling features.
+* When possible, use firewall and other network layer filtering to only expose Druid services and ports specifically required for your use case. For example, only expose Broker ports to downstream applications that execute queries. You can limit access to a specific IP address or IP range to further tighten and enhance security.
+* Only grant `STATE READ`, `STATE WRITE`, and `DATASOURCE WRITE` permissions to highly-trusted users. These permissions allows users to access resources on behalf of the Druid server process regardless of the datasource.
+* If your Druid client application allows less-trusted users to control the input source or firehose of an ingestion task, validate the URLs from the users. It is possible to point unchecked URLs to other locations and resources within your network or local file system.
+* Enable TLS to encrypt communication within the cluster.
+* You should only grant `WRITE` permissions to any `DATASOURCE` to trusted users. Druid's trust model assumes those users have the same privileges as the operating system user that runs the Druid process.
+* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.

Review comment:
       This overlaps with the point about authorization

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Do not expose the Druid Console without authentication on untrusted networks. Authenticated Druid Console users have the same permissions as the OS user running the Druid Console process.
+* Use an API gateway to restrict access from untrusted networks, create an allow list of specific APIs that your users need to access, and implement account lockout and throttling features.
+* When possible, use firewall and other network layer filtering to only expose Druid services and ports specifically required for your use case. For example, only expose Broker ports to downstream applications that execute queries. You can limit access to a specific IP address or IP range to further tighten and enhance security.
+* Only grant `STATE READ`, `STATE WRITE`, and `DATASOURCE WRITE` permissions to highly-trusted users. These permissions allows users to access resources on behalf of the Druid server process regardless of the datasource.
+* If your Druid client application allows less-trusted users to control the input source or firehose of an ingestion task, validate the URLs from the users. It is possible to point unchecked URLs to other locations and resources within your network or local file system.
+* Enable TLS to encrypt communication within the cluster.
+* You should only grant `WRITE` permissions to any `DATASOURCE` to trusted users. Druid's trust model assumes those users have the same privileges as the operating system user that runs the Druid process.
+* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.
+* Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.
 
+## Authentication and Authorization
 
-This document gives you an overview of security features in Druid and how to configure them, and some best practices for securing Druid.
+You can configure authentication and authorization to control access to the the Druid APIs. The first step is enabling TLS for the cluster nodes. This is essential to ensure passwords and authentication tokens are encrypted over the network.
+Then configure users, roles, and permissions, as described in the following sections. 
 
+The configuration settings mentioned below are primarily located in the `common.runtime.properties` file. Note that you need to make the configuration changes on all Druid server in the cluster.
 
-## Best practices
+## Enable TLS
 
-* Do not expose the Druid Console without authentication on untrusted networks. Access to the console effectively confers access the file system on the installation machine, via file browsers in the UI. You should use an API gateway that restricts who can connect from untrusted networks, allow list the specific APIs that your users need to access, and implements account lockout and throttling features.
-* You should only grant `WRITE` permissions to a `DATASOURCE` to trusted users. Druid assumes that these users have the same privileges as the operating system user that runs the Druid process. 
-* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.  
-* Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.
-* Run Druid as an unprivileged Unix user on the installation machine (not root).
-   > This is an important point! Administrator users on Druid have the same permission as the Unix user account it is running under. If the Druid process is running under the root user account in the OS, then Administrator users on Druid can read/write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+Enabling TLS encrypts the traffic between external clients and the Druid cluster and traffic between services within the cluster.
 
-You can configure authentication and authorization to control access to the the Druid APIs. The first step is enabling TLS for the cluster nodes. Then configure users, roles, and permissions, as described in the following sections. 
+### Generating keys
+Before you enable TLS in Druid, generate the keystore and truststore. When one Druid process, e.g. Broker, contacts another Druid process , e.g. Historical, the first service is a client for the second service, considered the server.
 
-The configuration settings mentioned below are primarily located in the `common.runtime.properties` file. Note that you need to make the configuration changes on each Druid server in the cluster. 
+The client uses a trustStore that contains certificates trusted by the client. For example, the Broker.
 
+The server uses a keyStore that contains private keys and certificate chain used to securely identify itself.
 
-## Enable TLS
+The following example demonstrates how to use Java keytool to generate the keyStore for the server and then create a trustStore to trust the key for the client:
 
-The first step in securing Druid is enabling TLS. You can enable TLS to secure external client connections to Druid as well as connections between cluster nodes. 
+1. Generate the keyStore with Java keytool:
+```
+$> keytool -keystore keystore.jks -alias druid -genkey -keyalg RSA
+```
+2. Export a public certificate:
+```
+$> keytool -export -alias druid -keystore keystore.jks -rfc -file public.cert
+```
+3. Create the trustStore:
+```
+$> keytool -import -file public.cert -alias druid -keystore truststore.jks
+```
 
-The configuration steps are: 
+Druid uses Jetty as its embedded web server. See [Configuring SSL/TLS KeyStores
+](https://www.eclipse.org/jetty/documentation/jetty-11/operations-guide/index.html#og-keystore) from the Jetty documentation.
 
-1. Enable TLS by adding `druid.enableTlsPort=true` to `common.runtime.properties` on each node in the Druid cluster.
-2. Disable the non-TLS port by setting `druid.enablePlaintextPort` to `false`. 
-2. Follow the steps in [Understanding Certificates and Keys](https://www.eclipse.org/jetty/documentation/current/configuring-ssl.html#understanding-certificates-and-keys) to generate or import a key and certificate. 
-3. Configure the keystore and truststore settings in `common.runtime.properties`. The file should look something like this: 
-  ```
-  druid.enablePlaintextPort=false
-  druid.enableTlsPort=true
-  
-  druid.server.https.keyStoreType=jks
-  druid.server.https.keyStorePath=sample-keystore.jks
-  druid.server.https.keyStorePassword=secret123 # replace with your own password
-  druid.server.https.certAlias=druid 
-  
-  druid.client.https.protocol=TLSv1.2
-  druid.client.https.trustStoreType=jks
-  druid.client.https.trustStorePath=sample-truststore.jks
-  druid.client.https.trustStorePassword=secret123  # replace with your own password
-
-  ``` 
-4. Add the `simple-client-sslcontext` extension to `druid.extensions.loadList` in `common.runtime.properties`. This enables TLS for Druid nodes acting as clients.
-5. Restart the cluster.
 
+   > WARNING: Do not use use self-signed certificates for production environments. Instead, rely on your current public key infrastructure to generate and distribute trusted keys.
+   
+
+
+### Update Druid TLS configurations
+Edit `common.runtime.properties` for all Druid services on all nodes. Add or update the following TLS options. Restart the cluster when you are finished.
+
+```
+# Turn on TLS globally
+druid.enableTlsPort=true
+
+# Disable non-TLS communicatoins
+druid.enablePlaintextPort=false
+
+# For Druid processes acting as a client
+# Load simple-client-sslcontext to enable client side TLS
+# Add the following to extension load list
+druid.extensions.loadList=[......., "simple-client-sslcontext"]
+
+# Setup client side TLS
+druid.client.https.protocol=TLSv1.2
+druid.client.https.trustStoreType=jks
+druid.client.https.trustStorePath=truststore.jks # replace with correct turstStore file
+druid.client.https.trustStorePassword=secret123  # replace with your own password
+
+# Setup server side TLS
+druid.server.https.keyStoreType=jks
+druid.server.https.keyStorePath=imply-keystore.jks # replace with correct keyStore file

Review comment:
       ```suggestion
   druid.server.https.keyStorePath=my-keystore.jks # replace with correct keyStore file
   ```
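
A rough companion to the TLS options in the hunk above: a minimal Python sketch that checks a `common.runtime.properties` text for the three settings the section asks for. The function names `parse_properties` and `check_tls` are hypothetical helpers, not part of Druid. The sketch strips trailing `# ...` text because the sample annotates values that way; that is an assumption of the sketch, since a Java properties loader treats `#` as a comment marker only at the start of a line.

```python
def parse_properties(text):
    """Parse key=value lines; drop blank lines and '#' comments (sketch assumption)."""
    props = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_tls(props):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if props.get("druid.enableTlsPort") != "true":
        problems.append("druid.enableTlsPort should be true")
    if props.get("druid.enablePlaintextPort") != "false":
        problems.append("druid.enablePlaintextPort should be false")
    if "simple-client-sslcontext" not in props.get("druid.extensions.loadList", ""):
        problems.append("simple-client-sslcontext is missing from druid.extensions.loadList")
    return problems


SAMPLE = """
# Turn on TLS globally
druid.enableTlsPort=true
druid.enablePlaintextPort=false
druid.extensions.loadList=["druid-basic-security", "simple-client-sslcontext"]
druid.client.https.protocol=TLSv1.2
"""

for problem in check_tls(parse_properties(SAMPLE)):
    print(problem)
```

Running a check like this on every node before the restart is one way to catch a node that still has the plaintext port enabled.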

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Do not expose the Druid Console without authentication on untrusted networks. Authenticated Druid Console users have the same permissions as the OS user running the Druid Console process.
+* Use an API gateway to restrict access from untrusted networks, create an allow list of specific APIs that your users need to access, and implement account lockout and throttling features.
+* When possible, use firewall and other network layer filtering to only expose Druid services and ports specifically required for your use case. For example, only expose Broker ports to downstream applications that execute queries. You can limit access to a specific IP address or IP range to further tighten and enhance security.
+* Only grant `STATE READ`, `STATE WRITE`, and `DATASOURCE WRITE` permissions to highly-trusted users. These permissions allow users to access resources on behalf of the Druid server process regardless of the datasource.
+* If your Druid client application allows less-trusted users to control the input source or firehose of an ingestion task, validate the URLs from the users. It is possible to point unchecked URLs to other locations and resources within your network or local file system.
+* Enable TLS to encrypt communication within the cluster.
+* Only grant `WRITE` permissions on any `DATASOURCE` to trusted users. Druid's trust model assumes those users have the same privileges as the operating system user that runs the Druid process.
+* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.
+* Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.
 
+## Authentication and Authorization
 
-This document gives you an overview of security features in Druid and how to configure them, and some best practices for securing Druid.
+You can configure authentication and authorization to control access to the Druid APIs. The first step is enabling TLS for the cluster nodes. This is essential to ensure passwords and authentication tokens are encrypted over the network.
+Then configure users, roles, and permissions, as described in the following sections. 
 
+The configuration settings mentioned below are primarily located in the `common.runtime.properties` file. Note that you need to make the configuration changes on all Druid servers in the cluster.
 

Review comment:
       It looks like this title section is accidentally duplicated. More info seems to be written starting on line 121 
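
For the best-practice bullet in the hunk above about validating user-supplied URLs for an ingestion input source, here is a minimal Python sketch of an allow-list check a client application might run before building a task spec. The function name, the allowed scheme, and the host `data.example.com` are all hypothetical; the sketch only illustrates rejecting local-file and internal-network targets, not any Druid API.

```python
from urllib.parse import urlparse

# Hypothetical allow list: only HTTPS, only hosts the application owns.
ALLOWED_SCHEMES = {"https"}
ALLOWED_HOSTS = {"data.example.com"}


def is_allowed_input_uri(uri):
    """Return True only if the URI uses an allowed scheme and an allowed host."""
    parsed = urlparse(uri)
    if parsed.scheme not in ALLOWED_SCHEMES:
        return False
    host = parsed.hostname  # excludes any userinfo and port
    if host is None:
        return False
    return host in ALLOWED_HOSTS


for candidate in (
    "https://data.example.com/events.json",
    "file:///etc/passwd",
    "http://169.254.169.254/latest/meta-data",
):
    print(candidate, is_allowed_input_uri(candidate))
```

An allow list is used here rather than a deny list because the set of internal addresses a task could reach is usually harder to enumerate than the set of sources users legitimately need.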

##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.

Review comment:
       I think it's reasonable to expect administrators to be trusted users, but the requirement that read-only users act in a non-malicious way is not necessarily true.

##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.
+
+
+Based on this expectation, Druid operates according to the following principles:
+
+From the innermost layer:
+1. Druid processes run within the system user context. They have access to the local files granted to the specified system user.
+2. The Druid ingestion system can create new processes to execute tasks. Those tasks inherit the user of their parent process. This means that any user authorized to submit an ingestion task can use the ingestion task permissions to read or write any local files that the Druid process has access to.
+
+> Note: Only grant the permission to submit ingestion tasks to trusted users because they can read and write to the local file system.
+
+Within the cluster:
+1. Druid assumes it operates on an isolated, protected network where no reachable IP within the network is under adversary control. It is the responsibility of system implementers to set up firewalls and other methods of protection.
+2. Druid supports TLS encryption for network traffic, including API calls and data transfers.

Review comment:
       I don't think TLS encryption is part of the trust model.

##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.
+
+
+Based on this expectation, Druid operates according to the following principles:
+
+From the innermost layer:
+1. Druid processes run within the system user context. They have access to the local files granted to the specified system user.
+2. The Druid ingestion system can create new processes to execute tasks. Those tasks inherit the user of their parent process. This means that any user authorized to submit an ingestion task can use the ingestion task permissions to read or write any local files that the Druid process has access to.
+
+> Note: Only grant the permission to submit ingestion tasks to trusted users because they can read and write to the local file system.
+
+Within the cluster:
+1. Druid assumes it operates on an isolated, protected network where no reachable IP within the network is under adversary control. It is the responsibility of system implementers to set up firewalls and other methods of protection.
+2. Druid supports TLS encryption for network traffic, including API calls and data transfers.
+3. Druid assumes auxiliary services such as the metadata store and ZooKeeper nodes are not under adversary control.
+
+Cluster to deep storage:
+1. Druid does not make assumptions about the security for deep storage. It follows the system's native security policies to authenticate and authorize with deep storage.
+2. Druid does not encrypt files for deep storage. Instead, it relies on the storage system's native encryption capabilities to ensure compatibility with encryption schemes across all storage types.
+
+Cluster to client:
+1. Druid authenticates with the client based on the configured authenticator.
+2. Druid only executes queries when an authorizer grants permission. The default configuration is `allowAll authorizer`.

Review comment:
       Slight suggestion since it's not just queries, but even things like loading data, reading configuration, changing state, etc. The authorizer must grant permissions for any of these actions to be performed.
   
   ```suggestion
   2. Druid only performs actions when an authorizer grants permission. The default configuration is `allowAll authorizer`.
   ```
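
To illustrate the point in the suggestion above, that every action (not only queries) is gated on a granted permission, here is a small Python sketch of a permission check over resource type, resource name, and action. The `Permission` class and `is_authorized` function are hypothetical names for illustration; this is not Druid's authorizer implementation, and matching the name as a regular expression is an assumption of the sketch.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Permission:
    resource_type: str   # e.g. "DATASOURCE", "STATE", "CONFIG"
    name_pattern: str    # regular expression matched against the resource name
    action: str          # "READ" or "WRITE"


def is_authorized(granted, resource_type, resource_name, action):
    """Allow the action only if some granted permission covers it."""
    return any(
        p.resource_type == resource_type
        and p.action == action
        and re.fullmatch(p.name_pattern, resource_name) is not None
        for p in granted
    )


# A read-only role limited to one datasource (hypothetical example data).
read_only = [Permission("DATASOURCE", "wikipedia", "READ")]

print(is_authorized(read_only, "DATASOURCE", "wikipedia", "READ"))
print(is_authorized(read_only, "DATASOURCE", "wikipedia", "WRITE"))
print(is_authorized(read_only, "STATE", "STATE", "READ"))
```

The check denies by default: anything not explicitly granted, including state reads and datasource writes, is refused.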

##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.
+
+
+Based on this expectation, Druid operates according to the following principles:
+
+From the innermost layer:
+1. Druid processes run within the system user context. They have access to the local files granted to the specified system user.
+2. The Druid ingestion system can create new processes to execute tasks. Those tasks inherit the user of their parent process. This means that any user authorized to submit an ingestion task can use the ingestion task permissions to read or write any local files that the Druid process has access to.
+
+> Note: Only grant the permission to submit ingestion tasks to trusted users because they can read and write to the local file system.
+
+Within the cluster:
+1. Druid assumes it operates on an isolated, protected network where no reachable IP within the network is under adversary control. It is the responsibility of system implementers to set up firewalls and other methods of protection.

Review comment:
       This is a little vague. Is the operator supposed to protect incoming connections to Druid or outgoing or both?
   
   As it's written it seems like the expectation is that restrictions should be set up for both incoming and outgoing connections. Most systems I know of primarily concern themselves with incoming connections, but I'm not a security expert.

##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Do not expose the Druid Console without authentication on untrusted networks. Authenticated Druid Console users have the same permissions as the OS user running the Druid Console process.
+* Use an API gateway to restrict access from untrusted networks, create an allow list of specific APIs that your users need to access, and implement account lockout and throttling features.
+* When possible, use firewall and other network layer filtering to only expose Druid services and ports specifically required for your use case. For example, only expose Broker ports to downstream applications that execute queries. You can limit access to a specific IP address or IP range to further tighten and enhance security.
+* Only grant `STATE READ`, `STATE WRITE`, and `DATASOURCE WRITE` permissions to highly-trusted users. These permissions allow users to access resources on behalf of the Druid server process regardless of the datasource.

Review comment:
       This should be the point right after "enable authentication". It should be something like "Enable authorization because ..."
   
   Typing out this comment made me realize there's a big overlap between this point and line 39. We'll want to put those 2 points closer to each other I think





[GitHub] [druid] techdocsmith commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

techdocsmith commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r600911873



##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.
+
+
+Based on this expectation, Druid operates according to the following principles:
+
+From the innermost layer:
+1. Druid processes run within the system user context. They have access to the local files granted to the specified system user.
+2. The Druid ingestion system can create new processes to execute tasks. Those tasks inherit the user of their parent process. This means that any user authorized to submit an ingestion task can use the ingestion task permissions to read or write any local files that the Druid process has access to.
+
+> Note: Only grant the permission to submit ingestion tasks to trusted users because they can read and write to the local file system.
+
+Within the cluster:
+1. Druid assumes it operates on an isolated, protected network where no reachable IP within the network is under adversary control. It is the responsibility of system implementers to set up firewalls and other methods of protection.
+2. Druid supports TLS encryption for network traffic, including API calls and data transfers.

Review comment:
       @2bethere , @suneet-s TLS is "A" trust model if not "Druid's" model. We have already mentioned in Best Practices section to Enable TLS. So maybe we are covered on this front?





[GitHub] [druid] techdocsmith commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

techdocsmith commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r610947449



##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +24,112 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
+The following recommendations apply to the Druid cluster setup:
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrators have the same OS permissions as the Unix user account running Druid. See [Authentication and authorization model](security-user-auth.md#authentication-and-authorization-model). If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Enable authorization and do not expose the Druid Console without authorization enabled. If authorization is not enabled, any user that has access to the web console has the same privileges as the operating system user that runs the Druid Console process.
+* Grant users the minimum permissions necessary to perform their functions. For instance, do not allow users who only need to query data to write to data sources or view state.
+* Disable JavaScript, as noted in the [Security section](https://druid.apache.org/docs/latest/development/javascript.html#security) of the JavaScript guide.
+
+The following recommendations apply to the network where Druid runs:
+* Enable TLS to encrypt communication within the cluster.
+* Use an API gateway to:
+  - Restrict access from untrusted networks
+  - Create an allow list of specific APIs that your users need to access
+  - Implement account lockout and throttling features.
+* When possible, use firewall and other network layer filtering to only expose Druid services and ports specifically required for your use case. For example, only expose Broker ports to downstream applications that execute queries. You can limit access to a specific IP address or IP range to further tighten and enhance security.

Review comment:
       These are grouped now.
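
The last network bullet in the hunk above mentions limiting access to a specific IP address or range. As a minimal sketch of that idea at the application layer, the Python below checks a client address against an allow list using the standard `ipaddress` module. The network `10.20.0.0/16` and the function name are hypothetical; in practice this filtering would normally live in a firewall or API gateway rather than in application code.

```python
import ipaddress

# Hypothetical allow list: only this internal range may reach the Broker port.
ALLOWED_NETWORKS = [ipaddress.ip_network("10.20.0.0/16")]


def is_allowed_client(address):
    """Return True if the client address falls inside an allowed network."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in ALLOWED_NETWORKS)


for client in ("10.20.3.4", "192.168.1.5"):
    print(client, is_allowed_client(client))
```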





[GitHub] [druid] 2bethere commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

2bethere commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r600872151



##########
File path: docs/operations/security-overview.md
##########
@@ -23,66 +23,109 @@ title: "Security overview"
   -->
 
 
-## Overview
+
+This document provides an overview of Apache Druid security features, configuration instructions, and some best practices to secure Druid.
 
 By default, security features in Druid are disabled, which simplifies the initial deployment experience. However, security features must be configured in a production deployment. These features include TLS, authentication, and authorization.
 
-To implement Druid security, you configure authenticators and authorizers. Authenticators control the way user identities are verified, while authorizers map the authenticated users (via user roles) to the datasources they are permitted to access. Consequently, implementing Druid security also involves considering your datasource scheme, since that scheme represents the granularity at which data access permissions are allocated. 
 
-The following graphic depicts the course of request through the authentication process: 
+## Best practices
 
 
-![Druid security check flow](../assets/security-model-1.png "Druid security check flow") 
+* Run Druid as an unprivileged Unix user. Do not run Druid as the root user.
+   > **WARNING!** \
+   Druid administrator users have the same OS permissions as the Unix user account running Druid. If the Druid process is running under the OS root user account, then Druid administrators can read or write all files that the root account has access to, including sensitive files such as `/etc/passwd`.
+* Enable authentication to the Druid cluster for production environments and other environments that can be accessed by untrusted networks.
+* Do not expose the Druid Console without authentication on untrusted networks. Authenticated Druid Console users have the same permissions as the OS user running the Druid Console process.
+* Use an API gateway to restrict access from untrusted networks, create an allow list of specific APIs that your users need to access, and implement account lockout and throttling features.

Review comment:
       Yes, those are features provided by API gateways. 





[GitHub] [druid] 2bethere commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

2bethere commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r600874102



##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +318,29 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Like all other security systems, trust is the foundation of the Druid security model. Druid administrators and read-only users are trusted users. Therefore, they are not expected to act maliciously.
+
+
+Based on this expectation, Druid operates according to the following principles:
+
+From the innermost layer:
+1. Druid processes run within the system user context. They have access to the local files granted to the specified system user.
+2. The Druid ingestion system can create new processes to execute tasks. Those tasks inherit the user of their parent process. This means that any user authorized to submit an ingestion task can use the ingestion task permissions to read or write any local files that the Druid process has access to.
+
+> Note: Only grant the permission to submit ingestion tasks to trusted users because they can read and write to the local file system.
+
+Within the cluster:
+1. Druid assumes it operates on an isolated, protected network where no reachable IP within the network is under adversary control. It is the responsibility of system implementers to set up firewalls and other methods of protection.
+2. Druid supports TLS encryption for network traffic, including API calls and data transfers.

Review comment:
       Any suggestions on how we describe that the network traffic is protected?





[GitHub] [druid] jihoonson commented on a change in pull request #11016: Update security overview with additional recommendations

Posted by GitBox <gi...@apache.org>.

jihoonson commented on a change in pull request #11016:
URL: https://github.com/apache/druid/pull/11016#discussion_r610919733



##########
File path: docs/operations/security-overview.md
##########
@@ -264,3 +322,31 @@ As an alternative to using the basic metadata authenticator, as shown in the pre
 
 
 Congratulations, you have configured permissions for user-assigned roles in Druid!
+
+
+## Druid security trust model
+Within Druid's trust model, users can have different authorization levels:
+- Users with resource write permissions are allowed to do anything that the Druid process can do.
+- Authenticated read only users can execute queries against resources to which they have permissions.
+- An authenticated user without any permissions is allowed to execute queries that don't require access to a resource.
+
+Additionally, Druid operates according to the following principles:
+
+From the innermost layer:
+1. Druid processes have the same access to the local files granted to the specified system user running the process.
+2. The Druid ingestion system can create new processes to execute tasks. Those tasks inherit the user of their parent process. This means that any user authorized to submit an ingestion task can use the ingestion task permissions to read or write any local files or external resources that the Druid process has access to.
+
+> Note: Only grant the permission to submit ingestion tasks to trusted users because they can act as the Druid process.

Review comment:
       Here and [this figure](https://github.com/apache/druid/blob/e3eb18abcd3c8b2d71f51abf9d55a94bee6d10ea/docs/assets/security-model-2.png) talk about the permission to submit ingestion tasks. This seems ambiguous to me. Maybe it would be better to say the `DATASOURCE WRITE` permission instead.



