Posted to common-dev@hadoop.apache.org by Ayush Saxena <ay...@gmail.com> on 2021/10/11 06:38:17 UTC

Re: [DISCUSS] Add remote port information to HDFS audit log

Hey
I am not sure whether we can directly go and change this. Any changes to Audit Log format are considered incompatible.

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output

-Ayush

> On 10-Oct-2021, at 7:57 PM, tom lee <to...@gmail.com> wrote:
> 
> Hi all,
> 
> In our production environment, we occasionally encounter a problem where a
> user submits an abnormal computation task that floods the NameNode with
> requests. The NameNode's queueTime and processingTime then rise sharply,
> leaving a large backlog of tasks.
> 
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based
> on metrics and audit logs. Currently, IP and UGI are recorded in audit
> logs, but there is no port information, so it is difficult to locate
> specific processes sometimes. Therefore, I propose that we add the port
> information to the audit log, so that we can easily track the upstream
> process.
> 
> Some projects, such as HBase and Alluxio, already include port information
> in their audit logs. I think HDFS audit logs should include it as well.
> 
> I submitted a PR (https://github.com/apache/hadoop/pull/3538), which has
> been tested in our test environment and takes effect for both RPC and HTTP
> requests. I look forward to your feedback on possible problems and
> suggestions for changes. I will actively update the PR.
> 
> Best Regards,
> Tom

Re: [DISCUSS] Add remote port information to HDFS audit log

Posted by tom lee <to...@gmail.com>.

Thanks @Takanobu for your reply. I will make it configurable and disabled by
default.
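
For readers following along, here is a minimal sketch of what "configurable and off by default" could look like. It is written in Python rather than the NameNode's Java, and the function name, the `include_remote_port` flag, and the reduced field list are all assumptions made for illustration; it is not the code in the PR.

```python
def format_audit_line(allowed, ugi, ip, cmd, src,
                      remote_port=None, include_remote_port=False):
    """Build a tab-separated key=value audit line.

    The port is appended to the ip field only when the (hypothetical)
    include_remote_port flag is switched on, so the default output is
    unchanged.
    """
    address = f"/{ip}"
    if include_remote_port and remote_port is not None:
        address = f"{address}:{remote_port}"
    fields = [
        f"allowed={str(allowed).lower()}",
        f"ugi={ugi}",
        f"ip={address}",
        f"cmd={cmd}",
        f"src={src}",
    ]
    return "\t".join(fields)


# Same request, formatted with the flag off (default) and on.
default_line = format_audit_line(True, "alice", "10.0.0.1", "open", "/tmp/f",
                                 remote_port=50123)
with_port = format_audit_line(True, "alice", "10.0.0.1", "open", "/tmp/f",
                              remote_port=50123, include_remote_port=True)
print(default_line)
print(with_port)
```

With the flag left at its default, existing consumers would see the same `ip=` value as before; only deployments that opt in would see the `:port` suffix.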

Takanobu Asanuma <ta...@apache.org> wrote on Wed, Oct 13, 2021 at 3:34 PM:

> I think many users parse audit logs in their own way, and they will be
> affected if the format is changed. So I agree with Masatake's suggestion.
> - Takanobu
>
> On Mon, Oct 11, 2021 at 18:19, tom lee <to...@gmail.com> wrote:
>
>> Thanks @Masatake Iwasaki <iw...@oss.nttdata.co.jp> for your
>> suggestion. This is a good idea.
>>
>> Masatake Iwasaki <iw...@oss.nttdata.co.jp> wrote on Mon, Oct 11, 2021 at 3:26 PM:
>>
>> > > I am not sure whether we can directly go and change this. Any changes
>> > > to Audit Log format are considered incompatible.
>> > >
>> > > https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
>> >
>> > Adding a field for caller context seemed to be accepted since it is
>> > optional feature disabled by default.
>> >
>> > https://github.com/apache/hadoop/blob/rel/release-3.3.1/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java#L8480-L8498
>> >
>> > If we need to add fields, making it optional might be an option.
>> >
>> > Masatake Iwasaki
>> >
>> > On 2021/10/11 16:09, tom lee wrote:
>> > > However, adding port is to modify the internal content of the IP
>> > > field, which has little impact on the overall layout.
>> > >
>> > > In our cluster, we parse the audit log through Vector and send the
>> > > data to Kafka, which is unaffected.
>> > >
>> > > tom lee <to...@gmail.com> wrote on Mon, Oct 11, 2021 at 2:44 PM:
>> > >
>> > >> Thank Ayush for reminding me. I also have similar concerns, so I
>> > >> published this discussion, hoping to let the members of the
>> > >> community know about this matter and then give suggestions.
>> > >>
>> > >> Ayush Saxena <ay...@gmail.com> wrote on Mon, Oct 11, 2021 at 2:38 PM:
>> > >>
>> > >>> Hey
>> > >>> I am not sure whether we can directly go and change this. Any
>> > >>> changes to Audit Log format are considered incompatible.
>> > >>>
>> > >>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
>> > >>>
>> > >>> -Ayush
>> > >>>
>> > >>> On 10-Oct-2021, at 7:57 PM, tom lee <to...@gmail.com> wrote:
>> > >>>
>> > >>> Hi all,
>> > >>>
>> > >>> In our production environment, we occasionally encounter a problem
>> > >>> where a user submits an abnormal computation task, causing a sudden
>> > >>> flood of requests, which causes the queueTime and processingTime of
>> > >>> the Namenode to rise very high, causing a large backlog of tasks.
>> > >>>
>> > >>> We usually locate and kill specific Spark, Flink, or MapReduce tasks
>> > >>> based on metrics and audit logs. Currently, IP and UGI are recorded
>> > >>> in audit logs, but there is no port information, so it is difficult
>> > >>> to locate specific processes sometimes. Therefore, I propose that we
>> > >>> add the port information to the audit log, so that we can easily
>> > >>> track the upstream process.
>> > >>>
>> > >>> Currently, some projects contain port information in audit logs,
>> > >>> such as Hbase and Alluxio. I think it is also necessary to add port
>> > >>> information for HDFS audit logs.
>> > >>>
>> > >>> I submitted a PR(https://github.com/apache/hadoop/pull/3538), which
>> > >>> has been tested in our test environment, and both RPC and HTTP are
>> > >>> in effect. I look forward to your discussion on possible problems
>> > >>> and suggestions for modification. I will actively update the PR.
>> > >>>
>> > >>> Best Regards,
>> > >>> Tom

Re: [DISCUSS] Add remote port information to HDFS audit log

Posted by Takanobu Asanuma <ta...@apache.org>.

I think many users parse audit logs in their own way, and they will be
affected if the format is changed. So I agree with Masatake's suggestion.
- Takanobu

On Mon, Oct 11, 2021 at 18:19, tom lee <to...@gmail.com> wrote:

> Thanks @Masatake Iwasaki <iw...@oss.nttdata.co.jp> for your
> suggestion. This is a good idea.
>
> Masatake Iwasaki <iw...@oss.nttdata.co.jp> wrote on Mon, Oct 11, 2021 at 3:26 PM:
>
> > > I am not sure whether we can directly go and change this. Any changes
> > > to Audit Log format are considered incompatible.
> > >
> > > https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
> >
> > Adding a field for caller context seemed to be accepted since it is
> > optional feature disabled by default.
> >
> > https://github.com/apache/hadoop/blob/rel/release-3.3.1/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java#L8480-L8498
> >
> > If we need to add fields, making it optional might be an option.
> >
> > Masatake Iwasaki
> >
> > On 2021/10/11 16:09, tom lee wrote:
> > > However, adding port is to modify the internal content of the IP field,
> > > which has little impact on the overall layout.
> > >
> > > In our cluster, we parse the audit log through Vector and send the data
> > > to Kafka, which is unaffected.
> > >
> > > tom lee <to...@gmail.com> wrote on Mon, Oct 11, 2021 at 2:44 PM:
> > >
> > >> Thank Ayush for reminding me. I also have similar concerns, so I
> > >> published this discussion, hoping to let the members of the community
> > >> know about this matter and then give suggestions.
> > >>
> > >> Ayush Saxena <ay...@gmail.com> wrote on Mon, Oct 11, 2021 at 2:38 PM:
> > >>
> > >>> Hey
> > >>> I am not sure whether we can directly go and change this. Any changes
> > >>> to Audit Log format are considered incompatible.
> > >>>
> > >>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
> > >>>
> > >>> -Ayush
> > >>>
> > >>> On 10-Oct-2021, at 7:57 PM, tom lee <to...@gmail.com> wrote:
> > >>>
> > >>> Hi all,
> > >>>
> > >>> In our production environment, we occasionally encounter a problem
> > >>> where a user submits an abnormal computation task, causing a sudden
> > >>> flood of requests, which causes the queueTime and processingTime of
> > >>> the Namenode to rise very high, causing a large backlog of tasks.
> > >>>
> > >>> We usually locate and kill specific Spark, Flink, or MapReduce tasks
> > >>> based on metrics and audit logs. Currently, IP and UGI are recorded
> > >>> in audit logs, but there is no port information, so it is difficult
> > >>> to locate specific processes sometimes. Therefore, I propose that we
> > >>> add the port information to the audit log, so that we can easily
> > >>> track the upstream process.
> > >>>
> > >>> Currently, some projects contain port information in audit logs, such
> > >>> as Hbase and Alluxio. I think it is also necessary to add port
> > >>> information for HDFS audit logs.
> > >>>
> > >>> I submitted a PR(https://github.com/apache/hadoop/pull/3538), which
> > >>> has been tested in our test environment, and both RPC and HTTP are in
> > >>> effect. I look forward to your discussion on possible problems and
> > >>> suggestions for modification. I will actively update the PR.
> > >>>
> > >>> Best Regards,
> > >>> Tom

Re: [DISCUSS] Add remote port information to HDFS audit log

Posted by tom lee <to...@gmail.com>.

Thanks @Masatake Iwasaki <iw...@oss.nttdata.co.jp> for your
suggestion. This is a good idea.

Masatake Iwasaki <iw...@oss.nttdata.co.jp> wrote on Mon, Oct 11, 2021 at 3:26 PM:

> > I am not sure whether we can directly go and change this. Any changes to
> > Audit Log format are considered incompatible.
> >
> > https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
>
> Adding a field for caller context seemed to be accepted since it is
> optional feature disabled by default.
>
> https://github.com/apache/hadoop/blob/rel/release-3.3.1/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java#L8480-L8498
>
> If we need to add fields, making it optional might be an option.
>
> Masatake Iwasaki
>
> On 2021/10/11 16:09, tom lee wrote:
> > However, adding port is to modify the internal content of the IP field,
> > which has little impact on the overall layout.
> >
> > In our cluster, we parse the audit log through Vector and send the data
> > to Kafka, which is unaffected.
> >
> > tom lee <to...@gmail.com> wrote on Mon, Oct 11, 2021 at 2:44 PM:
> >
> >> Thank Ayush for reminding me. I also have similar concerns, so I
> >> published this discussion, hoping to let the members of the community
> >> know about this matter and then give suggestions.
> >>
> >> Ayush Saxena <ay...@gmail.com> wrote on Mon, Oct 11, 2021 at 2:38 PM:
> >>
> >>> Hey
> >>> I am not sure whether we can directly go and change this. Any changes
> >>> to Audit Log format are considered incompatible.
> >>>
> >>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
> >>>
> >>> -Ayush
> >>>
> >>> On 10-Oct-2021, at 7:57 PM, tom lee <to...@gmail.com> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> In our production environment, we occasionally encounter a problem
> >>> where a user submits an abnormal computation task, causing a sudden
> >>> flood of requests, which causes the queueTime and processingTime of
> >>> the Namenode to rise very high, causing a large backlog of tasks.
> >>>
> >>> We usually locate and kill specific Spark, Flink, or MapReduce tasks
> >>> based on metrics and audit logs. Currently, IP and UGI are recorded in
> >>> audit logs, but there is no port information, so it is difficult to
> >>> locate specific processes sometimes. Therefore, I propose that we add
> >>> the port information to the audit log, so that we can easily track the
> >>> upstream process.
> >>>
> >>> Currently, some projects contain port information in audit logs, such
> >>> as Hbase and Alluxio. I think it is also necessary to add port
> >>> information for HDFS audit logs.
> >>>
> >>> I submitted a PR(https://github.com/apache/hadoop/pull/3538), which
> >>> has been tested in our test environment, and both RPC and HTTP are in
> >>> effect. I look forward to your discussion on possible problems and
> >>> suggestions for modification. I will actively update the PR.
> >>>
> >>> Best Regards,
> >>> Tom

Re: [DISCUSS] Add remote port information to HDFS audit log

Posted by Masatake Iwasaki <iw...@oss.nttdata.co.jp>.

> I am not sure whether we can directly go and change this. Any changes to Audit Log format are considered incompatible.
> 
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output

Adding a field for caller context seems to have been accepted because it is an optional feature that is disabled by default.
https://github.com/apache/hadoop/blob/rel/release-3.3.1/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java#L8480-L8498

If we need to add fields, making it optional might be an option.

Masatake Iwasaki

On 2021/10/11 16:09, tom lee wrote:
> However, adding port is to modify the internal content of the IP field,
> which has little impact on the overall layout.
> 
> In our cluster, we parse the audit log through Vector and send the data to
> Kafka, which is unaffected.
> 
> tom lee <to...@gmail.com> wrote on Mon, Oct 11, 2021 at 2:44 PM:
> 
>> Thank Ayush for reminding me. I also have similar concerns, so I published
>> this discussion, hoping to let the members of the community know about this
>> matter and then give suggestions.
>>
>> Ayush Saxena <ay...@gmail.com> wrote on Mon, Oct 11, 2021 at 2:38 PM:
>>
>>> Hey
>>> I am not sure whether we can directly go and change this. Any changes to
>>> Audit Log format are considered incompatible.
>>>
>>>
>>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
>>>
>>> -Ayush
>>>
>>> On 10-Oct-2021, at 7:57 PM, tom lee <to...@gmail.com> wrote:
>>>
>>> Hi all,
>>>
>>> In our production environment, we occasionally encounter a problem where a
>>> user submits an abnormal computation task, causing a sudden flood of
>>> requests, which causes the queueTime and processingTime of the Namenode to
>>> rise very high, causing a large backlog of tasks.
>>>
>>> We usually locate and kill specific Spark, Flink, or MapReduce tasks based
>>> on metrics and audit logs. Currently, IP and UGI are recorded in audit
>>> logs, but there is no port information, so it is difficult to locate
>>> specific processes sometimes. Therefore, I propose that we add the port
>>> information to the audit log, so that we can easily track the upstream
>>> process.
>>>
>>> Currently, some projects contain port information in audit logs, such as
>>> Hbase and Alluxio. I think it is also necessary to add port information
>>> for HDFS audit logs.
>>>
>>> I submitted a PR(https://github.com/apache/hadoop/pull/3538), which has
>>> been tested in our test environment, and both RPC and HTTP are in effect.
>>> I look forward to your discussion on possible problems and suggestions for
>>> modification. I will actively update the PR.
>>>
>>> Best Regards,
>>> Tom
>>>
>>>
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org

Re: [DISCUSS] Add remote port information to HDFS audit log

Posted by tom lee <to...@gmail.com>.

However, adding the port only modifies the content of the IP field,
which has little impact on the overall layout.

In our cluster, we parse the audit log through Vector and send the data to
Kafka, which is unaffected.
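
To make the trade-off concrete, here is a rough Python sketch of two ways a consumer might read the IP field. The sample lines, the reduced field set, and the `STRICT_IP` pattern are assumptions for illustration (IPv4 only); they are not taken from the PR or from our Vector configuration.

```python
import re

# Hypothetical strict pattern: expects the ip field to hold a bare IPv4 address.
STRICT_IP = re.compile(r"^/(\d{1,3}(?:\.\d{1,3}){3})$")


def parse_fields(line):
    """Split a tab-separated key=value audit line into a dict."""
    return dict(field.split("=", 1) for field in line.split("\t"))


def split_ip_port(value):
    """Return (ip, port); port is None when the field carries no port.

    Assumes IPv4; an IPv6 address would need different handling.
    """
    host = value.lstrip("/")
    ip, sep, port = host.partition(":")
    return ip, int(port) if sep else None


old_line = "allowed=true\tugi=alice\tip=/10.0.0.1\tcmd=open\tsrc=/tmp/f"
new_line = "allowed=true\tugi=alice\tip=/10.0.0.1:50123\tcmd=open\tsrc=/tmp/f"

old_ip = parse_fields(old_line)["ip"]
new_ip = parse_fields(new_line)["ip"]

# The number of fields is the same in both lines, so a key=value split keeps working.
print(len(parse_fields(old_line)), len(parse_fields(new_line)))
# A tolerant reader handles both forms of the ip field.
print(split_ip_port(old_ip), split_ip_port(new_ip))
# A strict validator accepts only the old form.
print(bool(STRICT_IP.match(old_ip)), bool(STRICT_IP.match(new_ip)))
```

A pipeline that only splits on tabs and `=` sees the same set of fields either way, but anything that validates the IP value itself would stop matching once the port is appended.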

tom lee <to...@gmail.com> wrote on Mon, Oct 11, 2021 at 2:44 PM:

> Thank Ayush for reminding me. I also have similar concerns, so I published
> this discussion, hoping to let the members of the community know about this
> matter and then give suggestions.
>
> Ayush Saxena <ay...@gmail.com> wrote on Mon, Oct 11, 2021 at 2:38 PM:
>
>> Hey
>> I am not sure whether we can directly go and change this. Any changes to
>> Audit Log format are considered incompatible.
>>
>>
>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
>>
>> -Ayush
>>
>> On 10-Oct-2021, at 7:57 PM, tom lee <to...@gmail.com> wrote:
>>
>> Hi all,
>>
>> In our production environment, we occasionally encounter a problem where a
>> user submits an abnormal computation task, causing a sudden flood of
>> requests, which causes the queueTime and processingTime of the Namenode to
>> rise very high, causing a large backlog of tasks.
>>
>> We usually locate and kill specific Spark, Flink, or MapReduce tasks based
>> on metrics and audit logs. Currently, IP and UGI are recorded in audit
>> logs, but there is no port information, so it is difficult to locate
>> specific processes sometimes. Therefore, I propose that we add the port
>> information to the audit log, so that we can easily track the upstream
>> process.
>>
>> Currently, some projects contain port information in audit logs, such as
>> Hbase and Alluxio. I think it is also necessary to add port information
>> for HDFS audit logs.
>>
>> I submitted a PR(https://github.com/apache/hadoop/pull/3538), which has
>> been tested in our test environment, and both RPC and HTTP are in effect.
>> I look forward to your discussion on possible problems and suggestions for
>> modification. I will actively update the PR.
>>
>> Best Regards,
>> Tom
>>
>>

Re: [DISCUSS] Add remote port information to HDFS audit log

Posted by tom lee <to...@gmail.com>.

Thanks, Ayush, for the reminder. I have similar concerns, which is why I
started this discussion: to let the community know about the change and
gather suggestions.

Ayush Saxena <ay...@gmail.com> wrote on Mon, Oct 11, 2021 at 2:38 PM:

> Hey
> I am not sure whether we can directly go and change this. Any changes to
> Audit Log format are considered incompatible.
>
>
> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/Compatibility.html#Audit_Log_Output
>
> -Ayush
>
> On 10-Oct-2021, at 7:57 PM, tom lee <to...@gmail.com> wrote:
>
> Hi all,
>
> In our production environment, we occasionally encounter a problem where a
> user submits an abnormal computation task, causing a sudden flood of
> requests, which causes the queueTime and processingTime of the Namenode to
> rise very high, causing a large backlog of tasks.
>
> We usually locate and kill specific Spark, Flink, or MapReduce tasks based
> on metrics and audit logs. Currently, IP and UGI are recorded in audit
> logs, but there is no port information, so it is difficult to locate
> specific processes sometimes. Therefore, I propose that we add the port
> information to the audit log, so that we can easily track the upstream
> process.
>
> Currently, some projects contain port information in audit logs, such as
> Hbase and Alluxio. I think it is also necessary to add port information for
> HDFS audit logs.
>
> I submitted a PR(https://github.com/apache/hadoop/pull/3538), which has
> been tested in our test environment, and both RPC and HTTP are in effect. I
> look forward to your discussion on possible problems and suggestions for
> modification. I will actively update the PR.
>
> Best Regards,
> Tom
>
>
