Posted to dev@flink.apache.org by Shuyi Chen <su...@gmail.com> on 2018/01/03 23:43:19 UTC

Re: Flink-Yarn-Kerberos integration

Thanks a lot for the clarification, Eron. That's very helpful. Currently,
we are more concerned about 1) data access, but will get to 2) and 3)
eventually.

I was thinking of doing the following:
1) extend the current HadoopModule to use and refresh DTs as suggested in the YARN
Application Security docs.
2) I found the current SecurityModule interface might be enough for
supporting other security mechanisms. However, the loading of security
modules is hard-coded, not configuration-based. I think we can extend
SecurityUtils to load modules from configuration. Then we can implement our
own security mechanism in our internal repo and have Flink jobs load it at
runtime.
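To illustrate point 2, here is a minimal sketch of configuration-driven module loading. `SecurityModule`, `NoOpModule`, and `ConfigurableSecurityLoader` are hypothetical stand-ins, not Flink's actual classes; the idea is only that module class names come from configuration and are instantiated reflectively instead of being hard-coded.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Flink's SecurityModule interface.
interface SecurityModule {
    void install() throws Exception;
}

// Example module that an internal repo could ship and reference by name.
class NoOpModule implements SecurityModule {
    public void install() { /* e.g. perform a Kerberos or internal AuthN login here */ }
}

public class ConfigurableSecurityLoader {
    // Instead of a hard-coded module list, read fully-qualified class names
    // from configuration and instantiate them via reflection.
    static List<SecurityModule> loadModules(String commaSeparatedClassNames) throws Exception {
        List<SecurityModule> modules = new ArrayList<>();
        for (String name : commaSeparatedClassNames.split(",")) {
            Class<?> clazz = Class.forName(name.trim());
            modules.add((SecurityModule) clazz.getDeclaredConstructor().newInstance());
        }
        return modules;
    }

    public static void main(String[] args) throws Exception {
        // In Flink this value would come from the cluster configuration;
        // it is hard-coded here only for the sketch.
        List<SecurityModule> modules = loadModules("NoOpModule");
        for (SecurityModule m : modules) {
            m.install();
        }
        System.out.println("installed " + modules.size() + " module(s)");
    }
}
```

A job could then point the configuration at a module class shipped in its own jar, which the cluster loads at runtime without any Flink code change.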

Please let me know your comments. Thanks a lot.

On Fri, Dec 22, 2017 at 3:05 PM, Eron Wright <er...@gmail.com> wrote:

> I agree that it is reasonable to use Hadoop DTs as you describe.  That
> approach is even recommended in YARN's documentation (see Securing
> Long-lived YARN Services on the YARN Application Security page).   But one
> of the goals of Kerberos integration is to support Kerberized data access
> for connectors other than HDFS, such as Kafka, Cassandra, and
> Elasticsearch.   So your second point makes sense too, suggesting a general
> architecture for managing secrets (DTs, keytabs, certificates, OAuth
> tokens, etc.) within the cluster.
>
> There are quite a few aspects to Flink security, including:
> 1. data access (e.g. how a connector authenticates to a data source)
> 2. service authorization and network security (e.g. how a Flink cluster
> protects itself from unauthorized access)
> 3. multi-user support (e.g. multi-user Flink clusters, RBAC)
>
> I mention these aspects to clarify your point about AuthN, which I took to
> be related to (1).   Do tell if I misunderstood.
>
> Eron
>
>
> On Wed, Dec 20, 2017 at 11:21 AM, Shuyi Chen <su...@gmail.com> wrote:
>
> > Hi community,
> >
> > We are working on secure Flink on YARN. The current Flink-Yarn-Kerberos
> > integration requires each container of a job to log in to Kerberos via
> > keytab every, say, 24 hours, and does not use any Hadoop delegation token
> > mechanism except when localizing the container. As I fixed the current
> > Flink-Yarn-Kerberos integration (FLINK-8275) and tried to add more
> > features (FLINK-7860), I developed some concerns about the current
> > implementation. It can pose a scalability issue for the KDC, e.g., if the
> > YARN cluster is restarted and tens of thousands of containers suddenly
> > DDoS the KDC.
> >
> > I would like to propose improving the current Flink-YARN-Kerberos
> > integration along the following lines:
> > 1) the AppMaster (JobManager) periodically authenticates with the KDC and
> > obtains all required DTs for the job.
> > 2) all other TM or TE containers periodically retrieve the new DTs from the
> > AppMaster (either through a secure HDFS folder or a secure Akka channel).
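The two steps quoted above could be sketched roughly as follows. `TokenProvider` and `TokenStore` are hypothetical stand-ins: a real provider would log in to the KDC via keytab and obtain Hadoop delegation tokens, and a real store would be the secure HDFS folder or secure Akka channel — neither of which this snippet implements. The point is only the shape of the flow: one renewal loop in the AppMaster, with containers polling the store instead of hitting the KDC.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical: fetches fresh (serialized) delegation tokens from the KDC.
interface TokenProvider {
    byte[] fetchTokens();
}

// Hypothetical: the distribution channel containers read from
// (a secure HDFS folder or a secure Akka channel in the proposal).
class TokenStore {
    private final Map<String, byte[]> tokens = new ConcurrentHashMap<>();
    void publish(byte[] serializedTokens) { tokens.put("latest", serializedTokens); }
    byte[] latest() { return tokens.get("latest"); } // what a TM container would poll
}

public class AppMasterTokenRenewer {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    // Step 1: only the AppMaster talks to the KDC, on a fixed schedule.
    // Step 2: containers read refreshed tokens from the store, so a cluster
    // restart does not turn tens of thousands of containers loose on the KDC.
    void start(TokenProvider provider, TokenStore store, long periodSeconds) {
        scheduler.scheduleAtFixedRate(
            () -> store.publish(provider.fetchTokens()),
            0, periodSeconds, TimeUnit.SECONDS);
    }

    void stop() { scheduler.shutdownNow(); }
}
```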
> >
> > Also, we want to extend Flink to support pluggable AuthN mechanisms,
> > because we have our own internal AuthN mechanism. We would like to add
> > support in Flink for authenticating periodically to our internal AuthN
> > service as well, e.g., through dynamic class loading, and to use a similar
> > mechanism to distribute the credentials from the AppMaster to the containers.
> >
> > I would like to get comments and feedback. I can also write a design doc
> > or create a FLIP if needed. Thanks a lot.
> >
> > Shuyi
> >
> >
> >
> > --
> > "So you have to trust that the dots will somehow connect in your future."
> >
>



-- 
"So you have to trust that the dots will somehow connect in your future."

Re: Flink-Yarn-Kerberos integration

Posted by Shuyi Chen <su...@gmail.com>.
Thanks a lot, Eron. I'll draft a proposal and share it with the community.

On Thu, Jan 18, 2018 at 4:18 PM, Eron Wright <er...@gmail.com> wrote:


Re: Flink-Yarn-Kerberos integration

Posted by Eron Wright <er...@gmail.com>.
I would suggest that you draft a proposal that lays out your goals and the
technical challenges that you perceive.  Then the community can provide
some feedback on potential solutions to those challenges, culminating in a
concrete improvement proposal.

Thanks

On Wed, Jan 17, 2018 at 7:29 PM, Shuyi Chen <su...@gmail.com> wrote:


Re: Flink-Yarn-Kerberos integration

Posted by Shuyi Chen <su...@gmail.com>.
Ping, any comments?  Thanks a lot.

Shuyi

On Wed, Jan 3, 2018 at 3:43 PM, Shuyi Chen <su...@gmail.com> wrote:
