Posted to dev@flume.apache.org by Attila Simon <sa...@cloudera.com> on 2016/08/17 15:39:55 UTC

Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/
-----------------------------------------------------------

Review request for Flume.


Bugs: FLUME-2954
    https://issues.apache.org/jira/browse/FLUME-2954


Repository: flume-git


Description
-------

--------------------------------------------------------------------------------
flume-ng-channel                              ---
  flume-jdbc-channel                          ---
    JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
    JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
  flume-kafka-channel                         ---
    KafkaChannel#230 #253                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------
flume-ng-configuration                        ---
  FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-core                                 ---
  SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
  GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
  LoggerSink#95                               <- fail data: on purpose <KEPT>
  AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
  MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
  BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-embedded-agent                       ---
  EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sinks                                ---
  flume-hive-sink                             ---
    HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
        It may contain private data
        (the URI string may contain a password) and it is
        logged extensively within this module
        (appears in HiveSink#298 #342 #400 #403 #428,
        HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...).
        HiveEndPoint is also attached to exception logs.
  flume-ng-hbase-sink                         ---
    AsyncHBaseSink#641                        <- safe data: error details get logged in case of failure <KEPT>
  flume-ng-kafka-sink                         ---
    KafkaSink#179                             <- fail data: log whole message <REMOVED>
    KafkaSink#304                             <- fail properties <REMOVED>
  flume-ng-morphline-solr-sink                ---
    BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
    MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sources                              ---
  flume-kafka-source                          ---
    KafkaSource#247                           <- fail data: log whole message <DRIVE BY PROPERTY>
  flume-twitter-source                        ---
    TwitterSource#110-113                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------


Diffs
-----

  conf/flume-env.ps1.template 8bf535a 
  conf/flume-env.sh.template c8b660f 
  flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
  flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 90e3288 
  flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
  flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
  flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
  flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
  flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
  flume-ng-doc/sphinx/FlumeUserGuide.rst fde9ff7 
  flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
  flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
  flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
  flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
  flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 

Diff: https://reviews.apache.org/r/51182/diff/


Testing
-------

Compiles, site builds, all unit tests pass, and the distribution target handles the system properties as expected:
bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
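
For readers who want to see how the two `-D` flags above could be consumed, here is a minimal sketch. It is not the code under review: the class name `LogFlagsSketch` and its method names are hypothetical, and only the two property names are taken from the command line above. It assumes the flags are plain JVM system properties that default to off.

```java
/**
 * Hypothetical helper, not the class added by this review request.
 * Only the two property names come from the command line above.
 */
final class LogFlagsSketch {
    static final String PRINT_CONFIG = "org.apache.flume.log.printconfig";
    static final String RAW_DATA = "org.apache.flume.log.rawdata";

    private LogFlagsSketch() {}

    /** True only when the JVM was started with -Dorg.apache.flume.log.rawdata=true. */
    static boolean allowRawData() {
        // Boolean.getBoolean reads a system property; an unset property yields false.
        return Boolean.getBoolean(RAW_DATA);
    }

    /** True only when the JVM was started with -Dorg.apache.flume.log.printconfig=true. */
    static boolean allowPrintConfig() {
        return Boolean.getBoolean(PRINT_CONFIG);
    }

    public static void main(String[] args) {
        System.out.println("raw data allowed: " + allowRawData());
        System.out.println("print config allowed: " + allowPrintConfig());
    }
}
```

Running the sketch with and without the extra properties mirrors the manual test described above: both flags stay off unless they are explicitly set to `true`.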


Thanks,

Attila Simon


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Denes Arvay <de...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review146079
-----------------------------------------------------------




conf/flume-env.ps1.template (line 23)
<https://reviews.apache.org/r/51182/#comment212455>

    I'd add the original `$JAVA_OPTS` to this line so that uncommenting it does not overwrite parameters that were already added.
    (`$JAVA_OPTS="$JAVA_OPTS -Dorg...."`)



conf/flume-env.sh.template (line 29)
<https://reviews.apache.org/r/51182/#comment212456>

    same as in `flume-env.ps1.template` line 23
    (`export JAVA_OPTS="$JAVA_OPTS -Dorg..."`)


- Denes Arvay


On Aug. 17, 2016, 5:39 p.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 17, 2016, 5:39 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (URI string may contain password) as it is
>         excessively logged within this module.
>         Appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
>         HiveEndPoint is also attached to exception logs as well
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 90e3288 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 1c15f1e 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> compiles, site builds, all unit test passes, distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Denes Arvay <de...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review146175
-----------------------------------------------------------


Ship it!




Ship It!

- Denes Arvay


On Aug. 19, 2016, 9:25 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 19, 2016, 9:25 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (URI string may contain password) as it is
>         excessively logged within this module.
>         Appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
>         HiveEndPoint is also attached to exception logs as well
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 90e3288 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 1c15f1e 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> compiles, site builds, all unit test passes, distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Balázs Donát Bessenyei <be...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review146645
-----------------------------------------------------------


Ship it!




Ship It!

- Balázs Donát Bessenyei


On Aug. 24, 2016, 11:58 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 24, 2016, 11:58 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (URI string may contain password) as it is
>         excessively logged within this module.
>         Appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
>         HiveEndPoint is also attached to exception logs as well
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 90e3288 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 5e677c6 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> compiles, site builds, all unit test passes, distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.

> On Aug. 26, 2016, 7:20 p.m., Mike Percy wrote:
> > flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java, line 178
> > <https://reviews.apache.org/r/51182/diff/7/?file=1486710#file1486710line178>
> >
> >     nit: Rather than having to call logger.isTraceEnabled() twice, I think this would be a little better written as:
> >     
> >     if (logger.isTraceEnabled()) {
> >       if (log raw data) {
> >         logger.trace("...");
> >       } else {
> >         logger.trace("...");
> >       }
> >     }

fixed


- Attila


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review147012
-----------------------------------------------------------


On Aug. 28, 2016, 11:57 p.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 28, 2016, 11:57 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (URI string may contain password) as it is
>         excessively logged within this module.
>         Appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
>         HiveEndPoint is also attached to exception logs as well
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> compiles, site builds, all unit tests pass, distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review147012
-----------------------------------------------------------




flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java (line 177)
<https://reviews.apache.org/r/51182/#comment213962>

    nit: Rather than having to call logger.isTraceEnabled() twice, I think this would be a little better written as:
    
    if (logger.isTraceEnabled()) {
      if (log raw data) {
        logger.trace("...");
      } else {
        logger.trace("...");
      }
    }
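
A runnable sketch of the nesting suggested above, for illustration only. The TraceLog interface is a hypothetical stand-in for the logger used in KafkaSink, the message texts are invented, and the property name is the one passed on the command line in the Testing section of the review request.

```java
/** Hypothetical stand-in for the two logger calls discussed in the review. */
interface TraceLog {
  boolean isTraceEnabled();
  void trace(String message);
}

final class TraceGuardSketch {
  // Property name as used in the Testing section of the review request.
  static final String LOG_RAWDATA_PROP = "org.apache.flume.log.rawdata";

  private TraceGuardSketch() {}

  /** Checks the trace level once, then picks the raw or the redacted message. */
  static void traceEvent(TraceLog logger, String topic, String body) {
    if (logger.isTraceEnabled()) {
      if (Boolean.getBoolean(LOG_RAWDATA_PROP)) {
        // Raw data was explicitly allowed, so the event body is logged.
        logger.trace("event for topic " + topic + ": " + body);
      } else {
        // Default: only non-sensitive metadata is logged.
        logger.trace("event for topic " + topic + " (body hidden)");
      }
    }
  }
}
```

With this shape the level check happens once per call, and the raw-data decision is only evaluated when trace logging is actually on.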


- Mike Percy


On Aug. 26, 2016, 4:27 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 26, 2016, 4:27 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (URI string may contain password) as it is
>         excessively logged within this module.
>         Appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
>         HiveEndPoint is also attached to exception logs as well
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> compiles, site builds, all unit tests pass, distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.

> On Aug. 26, 2016, 7:17 p.m., Mike Percy wrote:
> > flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java, line 39
> > <https://reviews.apache.org/r/51182/diff/7/?file=1486705#file1486705line39>
> >
> >     spurious import reorganization

fixed


> On Aug. 26, 2016, 7:17 p.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java, line 62
> > <https://reviews.apache.org/r/51182/diff/7/?file=1486709#file1486709line62>
> >
> >     How about we rename this to allowLogRawData()

renamed


> On Aug. 26, 2016, 7:17 p.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java, line 71
> > <https://reviews.apache.org/r/51182/diff/7/?file=1486709#file1486709line71>
> >
> >     Let's get rid of these public static setters and just have the public getters call Boolean.getBoolean(PROP).
> >     
> >     I don't think there is any real need to cache this value. While System.getProperty() ultimately calls a synchronized method on the System properties Hashtable, I think that lock is unlikely to be contended and would be elided by modern CPUs (Haswell) in modern JDKs (jdk8u20, see https://bugs.openjdk.java.net/browse/JDK-8031320 ... although I haven't tested it).
> >     
> >     So rather than add public methods and a complicated implementation right now, let's go with the simple implementation, and an interface that will allow us to optimize the impl later if needed.

fixed


> On Aug. 26, 2016, 7:17 p.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java, line 31
> > <https://reviews.apache.org/r/51182/diff/7/?file=1486709#file1486709line31>
> >
> >     How about we name this class LogPrivacyUtil?

renamed


> On Aug. 26, 2016, 7:17 p.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java, line 90
> > <https://reviews.apache.org/r/51182/diff/7/?file=1486709#file1486709line90>
> >
> >     How about renaming this method to allowLogConfiguration()

renamed


- Attila


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review146997
-----------------------------------------------------------


On Aug. 28, 2016, 11:57 p.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 28, 2016, 11:57 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (URI string may contain password) as it is
>         excessively logged within this module.
>         Appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
>         HiveEndPoint is also attached to exception logs as well
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> compiles, site builds, all unit tests pass, distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review146997
-----------------------------------------------------------




flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java (line 32)
<https://reviews.apache.org/r/51182/#comment213944>

    spurious import reorganization



flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java (line 31)
<https://reviews.apache.org/r/51182/#comment213945>

    How about we name this class LogPrivacyUtil?



flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java (line 62)
<https://reviews.apache.org/r/51182/#comment213946>

    How about we rename this to allowLogRawData()



flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java (line 71)
<https://reviews.apache.org/r/51182/#comment213957>

    Let's get rid of these public static setters and just have the public getters call Boolean.getBoolean(PROP).
    
    I don't think there is any real need to cache this value. While System.getProperty() ultimately calls a synchronized method on the System properties Hashtable, I think that lock is unlikely to be contended and would be elided by modern CPUs (Haswell) in modern JDKs (jdk8u20, see https://bugs.openjdk.java.net/browse/JDK-8031320 ... although I haven't tested it).
    
    So rather than add public methods and a complicated implementation right now, let's go with the simple implementation, and an interface that will allow us to optimize the impl later if needed.
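
A minimal sketch of what the simple implementation could look like, not the committed code. The class carries a "Sketch" suffix to make that clear. The method names follow the ones proposed in this thread (allowLogPrintConfig() is the name from a later review round), the constant LOG_RAWDATA_PROP is an assumed name, and the property values are the ones used on the command line in the Testing section.

```java
/** Illustrative utility: no cached state, no setters, only getters. */
final class LogPrivacyUtilSketch {
  // Property names as used in the Testing section of the review request.
  static final String LOG_RAWDATA_PROP = "org.apache.flume.log.rawdata";
  static final String LOG_PRINTCONFIG_PROP = "org.apache.flume.log.printconfig";

  private LogPrivacyUtilSketch() {}

  /** True only if the raw-data system property is set to "true" (case-insensitive). */
  static boolean allowLogRawData() {
    return Boolean.getBoolean(LOG_RAWDATA_PROP);
  }

  /** True only if the print-config system property is set to "true" (case-insensitive). */
  static boolean allowLogPrintConfig() {
    return Boolean.getBoolean(LOG_PRINTCONFIG_PROP);
  }
}
```

Because every call re-reads the system property, callers only depend on the two getters, and a cached implementation could be swapped in later without changing them.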



flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java (line 90)
<https://reviews.apache.org/r/51182/#comment213947>

    How about renaming this method to allowLogConfiguration()


- Mike Percy


On Aug. 26, 2016, 4:27 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 26, 2016, 4:27 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (URI string may contain password) as it is
>         excessively logged within this module.
>         Appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
>         HiveEndPoint is also attached to exception logs as well
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> compiles, site builds, all unit tests pass, distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.

> On Aug. 29, 2016, 8:09 a.m., Mike Percy wrote:
> > flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java, line 373
> > <https://reviews.apache.org/r/51182/diff/7-8/?file=1486702#file1486702line373>
> >
> >     for consistency, add logger.isDebugEnabled()

thanks for spotting, indeed it should have been like that. fixed


> On Aug. 29, 2016, 8:09 a.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java, line 71
> > <https://reviews.apache.org/r/51182/diff/8/?file=1487629#file1487629line71>
> >
> >     nit: This is weird camel caps. Should be allowLogRawData()

fixed


> On Aug. 29, 2016, 8:09 a.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java, line 80
> > <https://reviews.apache.org/r/51182/diff/8/?file=1487629#file1487629line80>
> >
> >     nit: fix camel caps: allowLogPrintConfig()

fixed


> On Aug. 29, 2016, 8:09 a.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java, line 46
> > <https://reviews.apache.org/r/51182/diff/8/?file=1487629#file1487629line46>
> >
> >     nit: the indentation here is weird, can you just line up the indentation with the opening parenthesis? Something like
> >     
> >           logger.warn("Logging of configuration details of the agent has been turned on by " +
> >                       "setting {} to true. Please use this setting with extra caution as it may result " +
> >                       "in logging of private data. This setting is not recommended in " +
> >                       "production environments.",
> >                       LOG_PRINTCONFIG_PROP);
> >                       
> >     Same with the other log messages in this file.

fixed


- Attila


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review147119
-----------------------------------------------------------


On Aug. 29, 2016, 8:41 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 29, 2016, 8:41 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (URI string may contain password) as it is
>         excessively logged within this module.
>         Appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
>         HiveEndPoint is also attached to exception logs as well
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> compiles, site builds, all unit tests pass, distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review147119
-----------------------------------------------------------



Noticed a couple more minor things


flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java (line 373)
<https://reviews.apache.org/r/51182/#comment214220>

    for consistency, add logger.isDebugEnabled()
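
A guard of the kind requested here might look like the following minimal sketch. It is not the FlumeConfiguration code under review: the class and method names are hypothetical, java.util.logging is used only to keep the example self-contained, and isLoggable(Level.FINE) stands in for the logger.isDebugEnabled() call mentioned above. The property name is the one passed with -D in the Testing section of this request.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration; not part of the patch under review.
final class GuardedConfigLogging {
  private static final Logger LOGGER =
      Logger.getLogger(GuardedConfigLogging.class.getName());

  // Property name as used on the command line in the Testing section.
  static final String LOG_PRINTCONFIG_PROP = "org.apache.flume.log.printconfig";

  private GuardedConfigLogging() {}

  // True only when the debug-equivalent level is enabled AND the opt-in property is "true".
  static boolean shouldLogConfig(Logger logger) {
    return logger.isLoggable(Level.FINE) && Boolean.getBoolean(LOG_PRINTCONFIG_PROP);
  }

  // The message is built and written only if both checks pass.
  static void logConfig(String config) {
    if (shouldLogConfig(LOGGER)) {
      LOGGER.fine("Configuration: " + config);
    }
  }
}
```

Checking the level first keeps the common path cheap, and the property check keeps configuration details out of the log unless the operator has opted in.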



flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java (line 46)
<https://reviews.apache.org/r/51182/#comment214219>

    nit: the indentation here is weird, can you just line up the indentation with the opening parenthesis? Something like
    
          logger.warn("Logging of configuration details of the agent has been turned on by " +
                      "setting {} to true. Please use this setting with extra caution as it may result " +
                      "in logging of private data. This setting is not recommended in " +
                      "production environments.",
                      LOG_PRINTCONFIG_PROP);
                      
    Same with the other log messages in this file.



flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java (line 71)
<https://reviews.apache.org/r/51182/#comment214217>

    nit: This is weird camel caps. Should be allowLogRawData()



flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java (line 80)
<https://reviews.apache.org/r/51182/#comment214218>

    nit: fix camel caps: allowLogPrintConfig()
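
For readers without the diff at hand, a utility with the two method names suggested above could be sketched roughly as follows. This is an assumption-laden illustration, not the LogPrivacyUtil source: the class name is hypothetical, LOG_RAWDATA_PROP is named by analogy with the LOG_PRINTCONFIG_PROP constant quoted earlier in this review, the property strings are the ones from the Testing section, and reading the property on every call (rather than caching it) is a choice made here for simplicity.

```java
// Hypothetical sketch; the real class in the patch may differ.
final class LogPrivacySketch {
  // Property strings as passed with -D in the Testing section.
  static final String LOG_RAWDATA_PROP = "org.apache.flume.log.rawdata";
  static final String LOG_PRINTCONFIG_PROP = "org.apache.flume.log.printconfig";

  private LogPrivacySketch() {}

  // True when raw event data may appear in log messages.
  static boolean allowLogRawData() {
    return Boolean.getBoolean(LOG_RAWDATA_PROP);
  }

  // True when configuration details may appear in log messages.
  static boolean allowLogPrintConfig() {
    return Boolean.getBoolean(LOG_PRINTCONFIG_PROP);
  }
}
```

Boolean.getBoolean returns false when the property is unset, so both kinds of logging stay off by default in this sketch.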


- Mike Percy


On Aug. 28, 2016, 4:57 p.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 28, 2016, 4:57 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may expose private data (the URI
>         string may contain a password), and it is
>         logged extensively within this module
>         (appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...).
>         HiveEndPoint is also attached to exception logs.
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details get logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole message <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> Compiles, site builds, all unit tests pass, and the distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review147122
-----------------------------------------------------------


Ship it!

- Mike Percy


On Aug. 29, 2016, 1:41 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 29, 2016, 1:41 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may expose private data (the URI
>         string may contain a password), and it is
>         logged extensively within this module
>         (appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...).
>         HiveEndPoint is also attached to exception logs.
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details get logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole message <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> Compiles, site builds, all unit tests pass, and the distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review147186
-----------------------------------------------------------


Ship it!




Looks good. After the package name change, the imports were out of order. I will fix those on commit.

- Mike Percy


On Aug. 29, 2016, 2:32 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 29, 2016, 2:32 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may expose private data (the URI
>         string may contain a password), and it is
>         logged extensively within this module
>         (appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...).
>         HiveEndPoint is also attached to exception logs.
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details get logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole message <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/LogPrivacyUtil.java PRE-CREATION 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> Compiles, site builds, all unit tests pass, and the distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/
-----------------------------------------------------------

(Updated Aug. 29, 2016, 9:32 a.m.)


Review request for Flume.


Changes
-------

Moved LogPrivacyUtil to flume-ng-configuration. "mvn clean install -DskipTests" passes.


Bugs: FLUME-2954
    https://issues.apache.org/jira/browse/FLUME-2954


Repository: flume-git


Description
-------

--------------------------------------------------------------------------------
flume-ng-channel                              ---
  flume-jdbc-channel                          ---
    JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
    JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
  flume-kafka-channel                         ---
    KafkaChannel#230 #253                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------
flume-ng-configuration                        ---
  FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-core                                 ---
  SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
  GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
  LoggerSink#95                               <- fail data: on purpose <KEPT>
  AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
  MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
  BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-embedded-agent                       ---
  EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sinks                                ---
  flume-hive-sink                             ---
    HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
        It may expose private data (the URI
        string may contain a password), and it is
        logged extensively within this module
        (appears in HiveSink#298 #342 #400 #403 #428,
        HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...).
        HiveEndPoint is also attached to exception logs.
  flume-ng-hbase-sink                         ---
    AsyncHBaseSink#641                        <- safe data: error details get logged in case of failure <KEPT>
  flume-ng-kafka-sink                         ---
    KafkaSink#179                             <- fail data: log whole message <REMOVED>
    KafkaSink#304                             <- fail properties <REMOVED>
  flume-ng-morphline-solr-sink                ---
    BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
    MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sources                              ---
  flume-kafka-source                          ---
    KafkaSource#247                           <- fail data: log whole message <DRIVE BY PROPERTY>
  flume-twitter-source                        ---
    TwitterSource#110-113                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------


Diffs (updated)
-----

  conf/flume-env.ps1.template 8bf535a 
  conf/flume-env.sh.template c8b660f 
  flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
  flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
  flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
  flume-ng-configuration/src/main/java/org/apache/flume/conf/LogPrivacyUtil.java PRE-CREATION 
  flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
  flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
  flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
  flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
  flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
  flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
  flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
  flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
  flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 

Diff: https://reviews.apache.org/r/51182/diff/


Testing
-------

Compiles, site builds, all unit tests pass, and the distribution target handles the system properties as expected:
bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
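
To illustrate what a call site driven by the rawdata property could look like, here is a small sketch. It is not taken from the diff: the class and method names are hypothetical, and the message wording is invented for the example. Only the property name comes from the command line above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical call-site illustration; not part of the patch.
final class RawDataLogExample {
  // Property name as passed with -D in the command above.
  static final String LOG_RAWDATA_PROP = "org.apache.flume.log.rawdata";

  private RawDataLogExample() {}

  // Builds a log message that contains the event body only when the
  // operator has explicitly opted in via the system property.
  static String describeEvent(byte[] body) {
    if (Boolean.getBoolean(LOG_RAWDATA_PROP)) {
      return "Event body: " + new String(body, StandardCharsets.UTF_8);
    }
    return "Event received (" + body.length + " bytes, body not logged)";
  }
}
```

Running the agent with and without -Dorg.apache.flume.log.rawdata=true, as in the test above, would exercise the two branches of such a check.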


Thanks,

Attila Simon


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Mike Percy <mp...@apache.org>.

> On Aug. 29, 2016, 2:06 a.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java, line 31
> > <https://reviews.apache.org/r/51182/diff/9/?file=1487764#file1487764line31>
> >
> >     Hrm. Why is this in the sdk module? I don't think clients will ever use it. Only people writing sources, channels, and sinks will have to use it. I think this class should be in the core module. If we move it there, we can also add the annotations to this class.

Scratch that, this class is used by the flume-ng-configuration module as well. Let's put this class in there instead of sdk.


- Mike


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review147123
-----------------------------------------------------------


On Aug. 29, 2016, 1:41 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 29, 2016, 1:41 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may expose private data (the URI
>         string may contain a password), and it is
>         logged extensively within this module
>         (appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...).
>         HiveEndPoint is also attached to exception logs.
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details get logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole message <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> compiles, site builds, all unit tests pass, distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.

> On Aug. 29, 2016, 9:06 a.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java, line 31
> > <https://reviews.apache.org/r/51182/diff/9/?file=1487764#file1487764line31>
> >
> >     Hrm. Why is this in the sdk module? I don't think clients will ever use it. Only people writing sources, channels, and sinks will have to use it. I think this class should be in the core module. If we move it there, we can also add the annotations to this class.
> 
> Mike Percy wrote:
>     Scratch that, this class is used by the flume-ng-configuration module as well. Let's put this class in there instead of sdk.

Moved to the config module, since it is a kind of configuration. My original idea was indeed flume-ng-core, but because the config module also uses the class, that would have created a circular dependency.


- Attila


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review147123
-----------------------------------------------------------


On Aug. 29, 2016, 9:32 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 29, 2016, 9:32 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (URI string may contain password) as it is
>         excessively logged within this module.
>         Appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
>         HiveEndPoint is also attached to exception logs as well
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/LogPrivacyUtil.java PRE-CREATION 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> compiles, site builds, all unit tests pass, distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review147123
-----------------------------------------------------------




flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java (line 31)
<https://reviews.apache.org/r/51182/#comment214228>

    Hrm. Why is this in the sdk module? I don't think clients will ever use it. Only people writing sources, channels, and sinks will have to use it. I think this class should be in the core module. If we move it there, we can also add the annotations to this class.


- Mike Percy


On Aug. 29, 2016, 1:41 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 29, 2016, 1:41 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (URI string may contain password) as it is
>         excessively logged within this module.
>         Appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
>         HiveEndPoint is also attached to exception logs as well
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> compiles, site builds, all unit tests pass, distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/
-----------------------------------------------------------

(Updated Aug. 29, 2016, 8:41 a.m.)


Review request for Flume.


Changes
-------

"mvn clean install -DskipTests" passes


Bugs: FLUME-2954
    https://issues.apache.org/jira/browse/FLUME-2954


Repository: flume-git


Description
-------

--------------------------------------------------------------------------------
flume-ng-channel                              ---
  flume-jdbc-channel                          ---
    JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
    JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
  flume-kafka-channel                         ---
    KafkaChannel#230 #253                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------
flume-ng-configuration                        ---
  FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-core                                 ---
  SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
  GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
  LoggerSink#95                               <- fail data: on purpose <KEPT>
  AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
  MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
  BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-embedded-agent                       ---
  EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sinks                                ---
  flume-hive-sink                             ---
    HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
        It may contain private data
        (URI string may contain password) as it is
        excessively logged within this module.
        Appears in HiveSink#298 #342 #400 #403 #428,
        HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
        HiveEndPoint is also attached to exception logs as well
  flume-ng-hbase-sink                         ---
    AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
  flume-ng-kafka-sink                         ---
    KafkaSink#179                             <- fail data: log whole message <REMOVED>
    KafkaSink#304                             <- fail properties <REMOVED>
  flume-ng-morphline-solr-sink                ---
    BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
    MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sources                              ---
  flume-kafka-source                          ---
    KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
  flume-twitter-source                        ---
    TwitterSource#110-113                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------


Diffs (updated)
-----

  conf/flume-env.ps1.template 8bf535a 
  conf/flume-env.sh.template c8b660f 
  flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
  flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
  flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
  flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
  flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
  flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
  flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
  flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
  flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
  flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java PRE-CREATION 
  flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
  flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
  flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 

Diff: https://reviews.apache.org/r/51182/diff/


Testing
-------

compiles, site builds, all unit tests pass, distribution target handles the system properties as expected:
bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
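
For readers following the thread, the two system properties on the command line above could gate log output roughly as in the sketch below. This is only an illustration, not the LogPrivacyUtil class from the patch: the class name LogPrivacySketch, its method names, and the describeEvent helper are hypothetical. Only the two property names are taken from the test command.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; NOT the LogPrivacyUtil class added by this review.
final class LogPrivacySketch {
  // Property names copied from the test command line in this thread.
  static final String LOG_RAWDATA_PROP = "org.apache.flume.log.rawdata";
  static final String LOG_PRINTCONFIG_PROP = "org.apache.flume.log.printconfig";

  private LogPrivacySketch() {}

  // True only when the JVM was started with -Dorg.apache.flume.log.rawdata=true.
  static boolean allowLogRawData() {
    return Boolean.getBoolean(LOG_RAWDATA_PROP);
  }

  // True only when the JVM was started with -Dorg.apache.flume.log.printconfig=true.
  static boolean allowLogPrintConfig() {
    return Boolean.getBoolean(LOG_PRINTCONFIG_PROP);
  }

  // Assumed helper: builds the text a component might log for an event body,
  // hiding the payload unless raw data logging was explicitly enabled.
  static String describeEvent(byte[] body) {
    if (allowLogRawData()) {
      return "Event body: " + new String(body, StandardCharsets.UTF_8);
    }
    return "Event body: <hidden, " + body.length + " bytes>";
  }

  public static void main(String[] args) {
    byte[] body = "secret".getBytes(StandardCharsets.UTF_8);
    System.out.println(describeEvent(body));
  }
}
```

With neither property set, both checks return false, so raw data and configuration stay out of the logs unless an operator opts in at startup.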


Thanks,

Attila Simon


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/
-----------------------------------------------------------

(Updated Aug. 28, 2016, 11:57 p.m.)


Review request for Flume.


Changes
-------

mvn clean install -DskipTests passed


Bugs: FLUME-2954
    https://issues.apache.org/jira/browse/FLUME-2954


Repository: flume-git


Description
-------

--------------------------------------------------------------------------------
flume-ng-channel                              ---
  flume-jdbc-channel                          ---
    JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
    JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
  flume-kafka-channel                         ---
    KafkaChannel#230 #253                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------
flume-ng-configuration                        ---
  FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-core                                 ---
  SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
  GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
  LoggerSink#95                               <- fail data: on purpose <KEPT>
  AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
  MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
  BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-embedded-agent                       ---
  EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sinks                                ---
  flume-hive-sink                             ---
    HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
        It may contain private data
        (URI string may contain password) as it is
        excessively logged within this module.
        Appears in HiveSink#298 #342 #400 #403 #428,
        HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
        HiveEndPoint is also attached to exception logs as well
  flume-ng-hbase-sink                         ---
    AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
  flume-ng-kafka-sink                         ---
    KafkaSink#179                             <- fail data: log whole message <REMOVED>
    KafkaSink#304                             <- fail properties <REMOVED>
  flume-ng-morphline-solr-sink                ---
    BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
    MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sources                              ---
  flume-kafka-source                          ---
    KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
  flume-twitter-source                        ---
    TwitterSource#110-113                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------


Diffs (updated)
-----

  conf/flume-env.ps1.template 8bf535a 
  conf/flume-env.sh.template c8b660f 
  flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
  flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
  flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
  flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
  flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
  flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
  flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
  flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
  flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
  flume-ng-sdk/src/main/java/org/apache/flume/util/LogPrivacyUtil.java PRE-CREATION 
  flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
  flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
  flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 

Diff: https://reviews.apache.org/r/51182/diff/


Testing
-------

compiles, site builds, all unit tests pass, distribution target handles the system properties as expected:
bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)


Thanks,

Attila Simon


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/
-----------------------------------------------------------

(Updated Aug. 26, 2016, 11:27 a.m.)


Review request for Flume.


Changes
-------

added InterfaceAudience to javadocs


Bugs: FLUME-2954
    https://issues.apache.org/jira/browse/FLUME-2954


Repository: flume-git


Description
-------

--------------------------------------------------------------------------------
flume-ng-channel                              ---
  flume-jdbc-channel                          ---
    JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
    JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
  flume-kafka-channel                         ---
    KafkaChannel#230 #253                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------
flume-ng-configuration                        ---
  FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-core                                 ---
  SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
  GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
  LoggerSink#95                               <- fail data: on purpose <KEPT>
  AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
  MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
  BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-embedded-agent                       ---
  EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sinks                                ---
  flume-hive-sink                             ---
    HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
        It may contain private data
        (URI string may contain password) as it is
        excessively logged within this module.
        Appears in HiveSink#298 #342 #400 #403 #428,
        HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
        HiveEndPoint is also attached to exception logs as well
  flume-ng-hbase-sink                         ---
    AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
  flume-ng-kafka-sink                         ---
    KafkaSink#179                             <- fail data: log whole message <REMOVED>
    KafkaSink#304                             <- fail properties <REMOVED>
  flume-ng-morphline-solr-sink                ---
    BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
    MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sources                              ---
  flume-kafka-source                          ---
    KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
  flume-twitter-source                        ---
    TwitterSource#110-113                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------


Diffs (updated)
-----

  conf/flume-env.ps1.template 8bf535a 
  conf/flume-env.sh.template c8b660f 
  flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
  flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
  flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
  flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
  flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
  flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
  flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
  flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
  flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
  flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
  flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
  flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
  flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 

Diff: https://reviews.apache.org/r/51182/diff/


Testing
-------

compiles, site builds, all unit tests pass, distribution target handles the system properties as expected:
bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)


Thanks,

Attila Simon


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/
-----------------------------------------------------------

(Updated Aug. 26, 2016, 11:19 a.m.)


Review request for Flume.


Changes
-------

Thanks, Mike, for the review. Providing the English wording instead of only saying that it is wrong was very useful.


Bugs: FLUME-2954
    https://issues.apache.org/jira/browse/FLUME-2954


Repository: flume-git


Description
-------

--------------------------------------------------------------------------------
flume-ng-channel                              ---
  flume-jdbc-channel                          ---
    JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
    JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
  flume-kafka-channel                         ---
    KafkaChannel#230 #253                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------
flume-ng-configuration                        ---
  FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-core                                 ---
  SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
  GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
  LoggerSink#95                               <- fail data: on purpose <KEPT>
  AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
  MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
  BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-embedded-agent                       ---
  EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sinks                                ---
  flume-hive-sink                             ---
    HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
        It may contain private data
        (the URI string may contain a password), and it is
        logged extensively within this module.
        Appears in HiveSink#298 #342 #400 #403 #428,
        HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...
        HiveEndPoint is also attached to exception logs.
  flume-ng-hbase-sink                         ---
    AsyncHBaseSink#641                        <- safe data: error details get logged in case of failure <KEPT>
  flume-ng-kafka-sink                         ---
    KafkaSink#179                             <- fail data: log whole message <REMOVED>
    KafkaSink#304                             <- fail properties <REMOVED>
  flume-ng-morphline-solr-sink                ---
    BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
    MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sources                              ---
  flume-kafka-source                          ---
    KafkaSource#247                           <- fail data: log whole message <DRIVE BY PROPERTY>
  flume-twitter-source                        ---
    TwitterSource#110-113                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------


Diffs (updated)
-----

  conf/flume-env.ps1.template 8bf535a 
  conf/flume-env.sh.template c8b660f 
  flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
  flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
  flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
  flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
  flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
  flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
  flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
  flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
  flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
  flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
  flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
  flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
  flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 

Diff: https://reviews.apache.org/r/51182/diff/


Testing
-------

compiles, site builds, all unit tests pass, distribution target handles the system properties as expected:
bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)


Thanks,

Attila Simon


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.

> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > conf/flume-env.ps1.template, line 21
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483687#file1483687line21>
> >
> >     Consider using this slightly different wording, here and in the flume-env.sh.template file:
> >     
> >     Let Flume write raw event data and configuration information to its log files for debugging purposes. Enabling these flags is not recommended in production, as it may result in logging sensitive user information or encryption secrets.

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java, line 609
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483689#file1483689line609>
> >
> >     how about: @param defaultValue default value, null if no default

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java, line 316
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483691#file1483691line316>
> >
> >     nit: s/Initial-configuration/Initial configuration/

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java, line 373
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483691#file1483691line373>
> >
> >     I don't think this log line is useful by itself. This part should also go inside the LogRawDataUtil statement below.

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-doc/sphinx/FlumeUserGuide.rst, line 240
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483696#file1483696line240>
> >
> >     Consider the following wording instead:
> >     
> >     Logging the raw stream of data flowing through the ingest pipeline is not desired behaviour in
> >     many production environments because this may result in leaking sensitive data or security related configurations, such as secret keys, to Flume log files. By default, Flume will not log such information. On the other hand, if the data pipeline is broken, Flume will attempt to provide clues for debugging the problem.
> >     
> >     One way to debug problems with event pipelines is to set up an additional `Memory Channel` connected to a `Logger Sink`, which will output all event data to the Flume logs.
> >     
> >     In some situations, however, this approach is insufficient.

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-doc/sphinx/FlumeUserGuide.rst, line 249
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483696#file1483696line249>
> >
> >     Consider the following wording:
> >     
> >     In order to enable logging of event- and configuration-related data, some Java system properties must be set in addition to log4j properties.
> >     
> >     To enable configuration-related logging, set the Java system property ``-Dorg.apache.flume.log.printconfig=true``. This can either be passed on the command line or by setting this in the JAVA_OPTS variable in *flume-env.sh*.
> >     
> >     To enable data logging, set the Java system property ``-Dorg.apache.flume.log.rawdata=true`` in the same way described above. For most components, the log4j logging level must also be set to DEBUG or TRACE to make event-specific logging appear in the Flume logs.

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-doc/sphinx/FlumeUserGuide.rst, line 256
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483696#file1483696line256>
> >
> >     Consider the following wording:
> >     
> >     Here is an example of enabling both configuration logging and raw data logging while also setting the Log4j loglevel to DEBUG for console output:

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java, line 25
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483698#file1483698line25>
> >
> >     Consider this wording:
> >     
> >     Utility class to help any Flume component determine whether logging potentially sensitive information is allowed or not.

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java, line 28
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483698#file1483698line28>
> >
> >     Please add InterfaceAudience and stability annotations for Public / Evolving

Both InterfaceAudience and InterfaceStability would introduce a circular dependency between flume-ng-sdk and flume-ng-core.


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java, line 32
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483698#file1483698line32>
> >
> >     Because this is a public API available to all plugin writers, please expose these as static methods instead of static constants.

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java, line 51
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483698#file1483698line51>
> >
> >     s/in log/in the log/

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java, line 45
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483698#file1483698line45>
> >
> >     Why put this in a static block? It might be useful to allow people to set these system properties via remote debugging if needed. Otherwise if they are using MemoryChannel and something is stuck there may be no way to debug it.

Remote debugging also requires setting some system properties first (to allow attaching the debugger), so -Dorg.apache.flume.log.printconfig=true can be set at the same time. For tests, you can call System.setProperty("org.apache.flume.log.printconfig", "true"), or whatever the test requires, before the very first reference to LogRawDataUtil. For even more flexibility, I changed it the way you requested.
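
For illustration, the static-method approach discussed in the two comments above could look roughly like the sketch below. Only the two system property names come from this thread; the class name RawDataLoggingFlags and both method names are hypothetical and are not taken from the actual LogRawDataUtil patch. The point is that each call re-reads the system property, so a value set later (for example from a debugger or a test) takes effect without a restart.

```java
/**
 * Hypothetical sketch, NOT the real LogRawDataUtil from the patch.
 * It exposes the two flags as static methods that consult the system
 * properties on every call instead of caching them in a static block.
 */
final class RawDataLoggingFlags {

  // Property names as used in this review thread.
  static final String LOG_RAWDATA_PROP = "org.apache.flume.log.rawdata";
  static final String LOG_PRINTCONFIG_PROP = "org.apache.flume.log.printconfig";

  private RawDataLoggingFlags() {
    // utility class, no instances
  }

  /** True only if the raw-data property is currently set to "true". */
  static boolean allowLogRawData() {
    // Boolean.getBoolean reads the named system property at call time.
    return Boolean.getBoolean(LOG_RAWDATA_PROP);
  }

  /** True only if the print-config property is currently set to "true". */
  static boolean allowLogPrintConfig() {
    return Boolean.getBoolean(LOG_PRINTCONFIG_PROP);
  }

  public static void main(String[] args) {
    // The first result depends on whether -Dorg.apache.flume.log.rawdata was passed.
    System.out.println("raw data logging allowed: " + allowLogRawData());
    // Flipping the property at runtime is picked up by the next call.
    System.setProperty(LOG_RAWDATA_PROP, "true");
    System.out.println("raw data logging allowed: " + allowLogRawData());
  }
}
```

With static constants initialised in a static block, the second call in main would still report the value seen at class-load time.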


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java, line 53
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483698#file1483698line53>
> >
> >     nit: to the log file.

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java, line 178
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483699#file1483699line178>
> >
> >     here we are checking if trace is enabled but then logging at debug level. we should also log at trace

fixed
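
A minimal sketch of the pattern this comment asks for, where the level that is checked and the level that is logged at are the same. It uses java.util.logging only to stay self-contained, with Level.FINEST standing in for trace; Flume's own logging API is different, and the class and method names here are hypothetical rather than taken from KafkaSink.

```java
import java.nio.charset.StandardCharsets;
import java.util.logging.Level;
import java.util.logging.Logger;

/**
 * Hypothetical sketch, not code from KafkaSink: raw event data is logged
 * only when the finest ("trace") level is enabled AND raw-data logging has
 * been explicitly allowed through the system property from this thread.
 */
final class GuardedEventLogging {

  static final String LOG_RAWDATA_PROP = "org.apache.flume.log.rawdata";

  private GuardedEventLogging() {
    // utility class, no instances
  }

  /** Both conditions must hold before any event body is written to the log. */
  static boolean shouldLogRawEvent(Logger logger) {
    return logger.isLoggable(Level.FINEST) && Boolean.getBoolean(LOG_RAWDATA_PROP);
  }

  /** Checks at FINEST and logs at FINEST, so the guard and the call agree. */
  static void logEventBody(Logger logger, byte[] body) {
    if (shouldLogRawEvent(logger)) {
      logger.log(Level.FINEST, "Event body: {0}",
          new String(body, StandardCharsets.UTF_8));
    }
  }
}
```

Checking one level and logging at another either hides the message when only the checked level is enabled in a narrower configuration, or builds the message needlessly, which is why the two should match.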


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java, line 39
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483700#file1483700line39>
> >
> >     Please avoid rearranging imports here and elsewhere unless necessary, or unless they are out of alphabetical order within a single grouping.
> >     
> >     Often, people's IDEs disagree on the order of the imports (IntelliJ, NetBeans, and Eclipse each have their own favorite grouping order) so I've found rearranging these things to be very noisy and pointless.

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java, line 48
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483700#file1483700line48>
> >
> >     Please avoid reformatting these comments unless you need to change them. Same with the changes elsewhere.

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java, line 57
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483700#file1483700line57>
> >
> >     Personally I think it hurts readability wrapping this. Let's just keep the changes to the required minimum for the scope of this patch.

fixed


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java, line 124
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483700#file1483700line124>
> >
> >     +1 to removing this trailing whitespace

thanks


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java, line 263
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483702#file1483702line263>
> >
> >     why trace here and debug above?

fixed: both now log at trace


> On Aug. 26, 2016, 1:14 a.m., Mike Percy wrote:
> > flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java, line 110
> > <https://reviews.apache.org/r/51182/diff/5/?file=1483703#file1483703line110>
> >
> >     wrap this in a configuration logging check?

If configuration logging is enabled, everything that was in the properties is logged (including these), so I would like to avoid printing this redundant information here. Logging here would be desirable only if a component changes or extends its configuration, so that it is no longer a full copy of the properties file.
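
The rule of thumb in this reply can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, nothing here is taken from TwitterSource, and only the property name comes from the thread.

```java
import java.util.Properties;

/**
 * Hypothetical sketch of the reasoning above: a component re-logs its
 * configuration only when configuration logging is enabled AND its
 * effective settings differ from what was already printed globally.
 */
final class ComponentConfigLogging {

  static final String LOG_PRINTCONFIG_PROP = "org.apache.flume.log.printconfig";

  private ComponentConfigLogging() {
    // utility class, no instances
  }

  /**
   * @param original  properties as read from the configuration file
   * @param effective properties the component actually uses after applying
   *                  its own defaults or overrides
   */
  static boolean shouldLogEffectiveConfig(Properties original, Properties effective) {
    // An unchanged copy would only repeat the global configuration dump.
    return Boolean.getBoolean(LOG_PRINTCONFIG_PROP) && !effective.equals(original);
  }
}
```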


- Attila


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review146895
-----------------------------------------------------------


On Aug. 26, 2016, 11:19 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 26, 2016, 11:19 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (the URI string may contain a password), and it is
>         logged extensively within this module.
>         Appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...
>         HiveEndPoint is also attached to exception logs.
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details get logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole message <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 684120f 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 7e207aa 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> compiles, site builds, all unit tests pass, distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Mike Percy <mp...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review146895
-----------------------------------------------------------



Thanks for the patch. Good stuff. I have mostly minor feedback.


conf/flume-env.ps1.template (line 21)
<https://reviews.apache.org/r/51182/#comment213724>

    Consider using this slightly different wording, here and in the flume-env.sh.template file:
    
    Let Flume write raw event data and configuration information to its log files for debugging purposes. Enabling these flags is not recommended in production, as it may result in logging sensitive user information or encryption secrets.



flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java (line 591)
<https://reviews.apache.org/r/51182/#comment213723>

    how about: @param defaultValue default value, null if no default



flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java (line 316)
<https://reviews.apache.org/r/51182/#comment213722>

    nit: s/Initial-configuration/Initial configuration/



flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java (line 373)
<https://reviews.apache.org/r/51182/#comment213683>

    I don't think this log line is useful by itself. This part should also go inside the LogRawDataUtil statement below.



flume-ng-doc/sphinx/FlumeUserGuide.rst (line 240)
<https://reviews.apache.org/r/51182/#comment213686>

    Consider the following wording instead:
    
    Logging the raw stream of data flowing through the ingest pipeline is not desired behaviour in
    many production environments because this may result in leaking sensitive data or security related configurations, such as secret keys, to Flume log files. By default, Flume will not log such information. On the other hand, if the data pipeline is broken, Flume will attempt to provide clues for debugging the problem.
    
    One way to debug problems with event pipelines is to set up an additional `Memory Channel` connected to a `Logger Sink`, which will output all event data to the Flume logs.
    
    In some situations, however, this approach is insufficient.



flume-ng-doc/sphinx/FlumeUserGuide.rst (line 249)
<https://reviews.apache.org/r/51182/#comment213695>

    Consider the following wording:
    
    In order to enable logging of event- and configuration-related data, some Java system properties must be set in addition to log4j properties.
    
    To enable configuration-related logging, set the Java system property ``-Dorg.apache.flume.log.printconfig=true``. This can either be passed on the command line or by setting this in the JAVA_OPTS variable in *flume-env.sh*.
    
    To enable data logging, set the Java system property ``-Dorg.apache.flume.log.rawdata=true`` in the same way described above. For most components, the log4j logging level must also be set to DEBUG or TRACE to make event-specific logging appear in the Flume logs.



flume-ng-doc/sphinx/FlumeUserGuide.rst (line 256)
<https://reviews.apache.org/r/51182/#comment213696>

    Consider the following wording:
    
    Here is an example of enabling both configuration logging and raw data logging while also setting the Log4j loglevel to DEBUG for console output:



flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java (line 25)
<https://reviews.apache.org/r/51182/#comment213710>

    Consider this wording:
    
    Utility class to help any Flume component determine whether logging potentially sensitive information is allowed or not.



flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java (line 28)
<https://reviews.apache.org/r/51182/#comment213704>

    Please add InterfaceAudience and stability annotations for Public / Evolving



flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java (line 32)
<https://reviews.apache.org/r/51182/#comment213703>

    Because this is a public API available to all plugin writers, please expose these as static methods instead of static constants.



flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java (line 45)
<https://reviews.apache.org/r/51182/#comment213706>

    Why put this in a static block? It might be useful to allow people to set these system properties via remote debugging if needed. Otherwise if they are using MemoryChannel and something is stuck there may be no way to debug it.



flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java (line 51)
<https://reviews.apache.org/r/51182/#comment213701>

    s/in log/in the log/



flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java (line 53)
<https://reviews.apache.org/r/51182/#comment213699>

    nit: to the log file.



flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java (line 177)
<https://reviews.apache.org/r/51182/#comment213711>

    here we are checking if trace is enabled but then logging at debug level. we should also log at trace



flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java (line 30)
<https://reviews.apache.org/r/51182/#comment213714>

    Please avoid rearranging imports here and elsewhere unless necessary, or unless they are out of alphabetical order within a single grouping.
    
    Often, people's IDEs disagree on the order of the imports (IntelliJ, NetBeans, and Eclipse each have their own favorite grouping order) so I've found rearranging these things to be very noisy and pointless.



flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java (line 39)
<https://reviews.apache.org/r/51182/#comment213715>

    Please avoid reformatting these comments unless you need to change them. Same with the changes elsewhere.



flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java (line 47)
<https://reviews.apache.org/r/51182/#comment213717>

    Personally I think it hurts readability wrapping this. Let's just keep the changes to the required minimum for the scope of this patch.



flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java (line 114)
<https://reviews.apache.org/r/51182/#comment213720>

    +1 to removing this trailing whitespace



flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java (line 251)
<https://reviews.apache.org/r/51182/#comment213718>

    why trace here and debug above?



flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java 
<https://reviews.apache.org/r/51182/#comment213721>

    wrap this in a configuration logging check?


- Mike Percy


On Aug. 24, 2016, 4:58 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 24, 2016, 4:58 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/
-----------------------------------------------------------

(Updated Aug. 24, 2016, 11:58 a.m.)


Review request for Flume.


Bugs: FLUME-2954
    https://issues.apache.org/jira/browse/FLUME-2954


Repository: flume-git


Description
-------

--------------------------------------------------------------------------------
flume-ng-channel                              ---
  flume-jdbc-channel                          ---
    JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
    JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
  flume-kafka-channel                         ---
    KafkaChannel#230 #253                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------
flume-ng-configuration                        ---
  FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-core                                 ---
  SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
  GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
  LoggerSink#95                               <- fail data: on purpose <KEPT>
  AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
  MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
  BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-embedded-agent                       ---
  EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sinks                                ---
  flume-hive-sink                             ---
    HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
        It may contain private data
        (the URI string may contain a password) and it is
        logged extensively within this module.
        Appears in HiveSink#298 #342 #400 #403 #428,
        HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...
        HiveEndPoint is also attached to exception logs.
  flume-ng-hbase-sink                         ---
    AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
  flume-ng-kafka-sink                         ---
    KafkaSink#179                             <- fail data: log whole message <REMOVED>
    KafkaSink#304                             <- fail properties <REMOVED>
  flume-ng-morphline-solr-sink                ---
    BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
    MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sources                              ---
  flume-kafka-source                          ---
    KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
  flume-twitter-source                        ---
    TwitterSource#110-113                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------


Diffs (updated)
-----

  conf/flume-env.ps1.template 8bf535a 
  conf/flume-env.sh.template c8b660f 
  flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
  flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 90e3288 
  flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
  flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
  flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
  flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
  flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
  flume-ng-doc/sphinx/FlumeUserGuide.rst 5e677c6 
  flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
  flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
  flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
  flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
  flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 

Diff: https://reviews.apache.org/r/51182/diff/


Testing
-------

Compiles, site builds, all unit tests pass, and the distribution target handles the system properties as expected:
bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)


Thanks,

Attila Simon


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.

> On Aug. 24, 2016, 10:17 a.m., Balázs Donát Bessenyei wrote:
> > flume-ng-doc/sphinx/FlumeUserGuide.rst, lines 246-247
> > <https://reviews.apache.org/r/51182/diff/4/?file=1483597#file1483597line246>
> >
> >     nit: still a few spelling errors

fixed


> On Aug. 24, 2016, 10:17 a.m., Balázs Donát Bessenyei wrote:
> > flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java, line 180
> > <https://reviews.apache.org/r/51182/diff/4/?file=1483600#file1483600line180>
> >
> >     With LogRawDataUtil (raw data logging enabled) this could be logged

added LogRawDataUtil
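
For readers following the thread, the kind of guard being discussed for event data can be sketched roughly as below. This is a minimal sketch and not the LogRawDataUtil class from the patch: the class name `RawDataGuardSketch` and the method `describeBody` are hypothetical, and only the system property name is taken from the Testing section of the review request.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; NOT the LogRawDataUtil class from the patch.
public class RawDataGuardSketch {
  // Property name as passed with -D in the Testing section of the request.
  public static final String RAWDATA_PROP = "org.apache.flume.log.rawdata";

  // Returns the text a log statement could emit for an event body.
  // The payload itself appears only when the operator opted in explicitly.
  public static String describeBody(byte[] body) {
    // Boolean.getBoolean is true only if the system property equals "true".
    if (Boolean.getBoolean(RAWDATA_PROP)) {
      return "Event body: " + new String(body, StandardCharsets.UTF_8);
    }
    return "Event body: <" + body.length + " bytes, raw data logging disabled>";
  }

  public static void main(String[] args) {
    System.out.println(describeBody("hello".getBytes(StandardCharsets.UTF_8)));
  }
}
```

With this shape the default is the safe one: unless the property is set at startup, only the size of the payload reaches the log.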


> On Aug. 24, 2016, 10:17 a.m., Balázs Donát Bessenyei wrote:
> > flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java, line 305
> > <https://reviews.apache.org/r/51182/diff/4/?file=1483600#file1483600line305>
> >
> >     With using LogRawDataUtil (configuration setting) I guess, this could be logged.

Added LogRawDataUtil. Since the properties for KafkaProducer are built dynamically (Flume adds, renames and alters them during startup), it is useful to log them so that the modifications can be compared with the properties logged at startup.
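
A guard of this kind for configuration properties could look roughly like the following. It is a sketch under stated assumptions, not code from the patch: `ConfigLogGuardSketch` and `describeConfig` are hypothetical names, the sample property is made up for the demo, and only the system property name comes from the Testing section of the review request.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; NOT the LogRawDataUtil class from the patch.
public class ConfigLogGuardSketch {
  // Property name as passed with -D in the Testing section of the request.
  public static final String PRINTCONFIG_PROP = "org.apache.flume.log.printconfig";

  public static boolean isConfigLoggingEnabled() {
    // True only if the system property is present and equals "true".
    return Boolean.getBoolean(PRINTCONFIG_PROP);
  }

  // Returns the text a log statement could emit for the producer properties.
  public static String describeConfig(Map<String, String> props) {
    if (isConfigLoggingEnabled()) {
      // TreeMap gives a stable, sorted order so two dumps are easy to compare.
      return "Producer properties: " + new TreeMap<>(props);
    }
    return "Producer properties: <hidden; set -D" + PRINTCONFIG_PROP + "=true to print>";
  }

  public static void main(String[] args) {
    Map<String, String> props = new TreeMap<>();
    props.put("bootstrap.servers", "localhost:9092"); // sample value for the demo
    System.out.println(describeConfig(props));
  }
}
```

Sorting the keys is a design choice made here for the sketch: it keeps the dump comparable with a configuration printed at startup, which is the use case described above.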


> On Aug. 24, 2016, 10:17 a.m., Balázs Donát Bessenyei wrote:
> > flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java, line 374
> > <https://reviews.apache.org/r/51182/diff/4/?file=1483600#file1483600line374>
> >
> >     I guess we should use the LogRawDataUtil here

this happens right before KafkaSink#304 and does the same: "logs out the kafkaProps field". So removing this line eliminates a duplicate.


> On Aug. 24, 2016, 10:17 a.m., Balázs Donát Bessenyei wrote:
> > flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java, line 142
> > <https://reviews.apache.org/r/51182/diff/4/?file=1483602#file1483602line142>
> >
> >     {}, event seems to be missing

event was added


> On Aug. 24, 2016, 10:17 a.m., Balázs Donát Bessenyei wrote:
> > flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java, lines 110-113
> > <https://reviews.apache.org/r/51182/diff/4/?file=1483604#file1483604line110>
> >
> >     With LogRawDataUtil config logging, this could be logged

All properties for the agent, including the ones above, are logged if LogRawDataUtil.LOG_PRINTCONFIG was set at startup:

2016-08-24 13:44:06,803 (conf-file-poller-0) [DEBUG - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:316)] Initial-configuration: AgentConfiguration[a1]
SOURCES: {r1={ parameters:{port=44444, channels=c1, type=netcat, bind=localhost} }}
CHANNELS: {c1={ parameters:{transactionCapacity=100, capacity=1000, type=memory} }}
SINKS: {kafka={ parameters:{kafka.doesntexists=1} }, k1={ parameters:{type=logger, channel=c1} }} 

So logging them here again would be a clear duplicate.


- Attila


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review146634
-----------------------------------------------------------


On Aug. 24, 2016, 11:58 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 24, 2016, 11:58 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Balázs Donát Bessenyei <be...@cloudera.com>.

> On Aug. 24, 2016, 10:17 a.m., Balázs Donát Bessenyei wrote:
> >

Otherwise, LGTM


- Balázs Donát


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review146634
-----------------------------------------------------------


On Aug. 24, 2016, 8:46 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 24, 2016, 8:46 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Balázs Donát Bessenyei <be...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review146634
-----------------------------------------------------------




flume-ng-doc/sphinx/FlumeUserGuide.rst (lines 246 - 247)
<https://reviews.apache.org/r/51182/#comment213200>

    nit: still a few spelling errors



flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 
<https://reviews.apache.org/r/51182/#comment213196>

    With LogRawDataUtil (raw data logging enabled) this could be logged



flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 
<https://reviews.apache.org/r/51182/#comment213195>

    With using LogRawDataUtil (configuration setting) I guess, this could be logged.



flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java (line 367)
<https://reviews.apache.org/r/51182/#comment213194>

    I guess we should use the LogRawDataUtil here



flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java (line 140)
<https://reviews.apache.org/r/51182/#comment213198>

    {}, event seems to be missing



flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java 
<https://reviews.apache.org/r/51182/#comment213199>

    With LogRawDataUtil config logging, this could be logged


- Balázs Donát Bessenyei


On Aug. 24, 2016, 8:46 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 24, 2016, 8:46 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has an URI field.            <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (URI string may contain password) as it is
>         excessively logged within this module.
>         Appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...)
>         HiveEndPoint is also attached to exception logs as well
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details gets logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 90e3288 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 5e677c6 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> Compiles, site builds, all unit tests pass, and the distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/
-----------------------------------------------------------

(Updated Aug. 24, 2016, 8:46 a.m.)


Review request for Flume.


Bugs: FLUME-2954
    https://issues.apache.org/jira/browse/FLUME-2954


Repository: flume-git


Description
-------

--------------------------------------------------------------------------------
flume-ng-channel                              ---
  flume-jdbc-channel                          ---
    JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
    JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
  flume-kafka-channel                         ---
    KafkaChannel#230 #253                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------
flume-ng-configuration                        ---
  FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-core                                 ---
  SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
  GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
  LoggerSink#95                               <- fail data: on purpose <KEPT>
  AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
  MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
  BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-embedded-agent                       ---
  EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sinks                                ---
  flume-hive-sink                             ---
    HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
        It may contain private data
        (the URI string may contain a password) and it is
        logged extensively within this module
        (appears in HiveSink#298 #342 #400 #403 #428,
        HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...).
        HiveEndPoint is also attached to exception logs.
  flume-ng-hbase-sink                         ---
    AsyncHBaseSink#641                        <- safe data: error details get logged in case of failure <KEPT>
  flume-ng-kafka-sink                         ---
    KafkaSink#179                             <- fail data: log whole message <REMOVED>
    KafkaSink#304                             <- fail properties <REMOVED>
  flume-ng-morphline-solr-sink                ---
    BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
    MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sources                              ---
  flume-kafka-source                          ---
    KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
  flume-twitter-source                        ---
    TwitterSource#110-113                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------


Diffs (updated)
-----

  conf/flume-env.ps1.template 8bf535a 
  conf/flume-env.sh.template c8b660f 
  flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
  flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 90e3288 
  flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
  flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
  flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
  flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
  flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
  flume-ng-doc/sphinx/FlumeUserGuide.rst 5e677c6 
  flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
  flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
  flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
  flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
  flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 

Diff: https://reviews.apache.org/r/51182/diff/


Testing
-------

Compiles, site builds, all unit tests pass, and the distribution target handles the system properties as expected:
bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
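
For readers without the diff at hand, the two system properties in the command above could be consumed roughly as follows. This is only a sketch: RawDataLogGate and its methods are hypothetical names, not the LogRawDataUtil added by this patch (its source is not quoted in this thread), and it assumes both flags are off when the property is absent.

```java
import java.util.Map;

// Hypothetical helper for illustration; NOT the patch's LogRawDataUtil.
final class RawDataLogGate {
    // Property names as they appear on the test command line.
    static final String RAWDATA_PROP = "org.apache.flume.log.rawdata";
    static final String PRINTCONFIG_PROP = "org.apache.flume.log.printconfig";

    private RawDataLogGate() {}

    // Boolean.getBoolean returns true only if the property is set to "true"
    // (case-insensitive), so a missing property means "off" (assumed default).
    static boolean rawDataEnabled() {
        return Boolean.getBoolean(RAWDATA_PROP);
    }

    static boolean printConfigEnabled() {
        return Boolean.getBoolean(PRINTCONFIG_PROP);
    }

    // Returns the full configuration only when explicitly allowed;
    // otherwise returns a placeholder that reveals just the entry count.
    static String describeConfig(Map<String, String> config) {
        return printConfigEnabled()
                ? config.toString()
                : "<" + config.size() + " properties hidden>";
    }
}
```

The sketch reads the property on every call rather than caching it in a constant, which keeps it easy to exercise with and without the extra properties.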


Thanks,

Attila Simon


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/
-----------------------------------------------------------

(Updated Aug. 19, 2016, 9:25 a.m.)


Review request for Flume.


Bugs: FLUME-2954
    https://issues.apache.org/jira/browse/FLUME-2954


Repository: flume-git


Description
-------

--------------------------------------------------------------------------------
flume-ng-channel                              ---
  flume-jdbc-channel                          ---
    JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
    JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
  flume-kafka-channel                         ---
    KafkaChannel#230 #253                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------
flume-ng-configuration                        ---
  FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-core                                 ---
  SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
  GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
  LoggerSink#95                               <- fail data: on purpose <KEPT>
  AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
  MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
  BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-embedded-agent                       ---
  EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sinks                                ---
  flume-hive-sink                             ---
    HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
        It may contain private data
        (the URI string may contain a password) and it is
        logged extensively within this module
        (appears in HiveSink#298 #342 #400 #403 #428,
        HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...).
        HiveEndPoint is also attached to exception logs.
  flume-ng-hbase-sink                         ---
    AsyncHBaseSink#641                        <- safe data: error details get logged in case of failure <KEPT>
  flume-ng-kafka-sink                         ---
    KafkaSink#179                             <- fail data: log whole message <REMOVED>
    KafkaSink#304                             <- fail properties <REMOVED>
  flume-ng-morphline-solr-sink                ---
    BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
    MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sources                              ---
  flume-kafka-source                          ---
    KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
  flume-twitter-source                        ---
    TwitterSource#110-113                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------


Diffs (updated)
-----

  conf/flume-env.ps1.template 8bf535a 
  conf/flume-env.sh.template c8b660f 
  flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
  flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 90e3288 
  flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
  flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
  flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
  flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
  flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
  flume-ng-doc/sphinx/FlumeUserGuide.rst 1c15f1e 
  flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
  flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
  flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
  flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
  flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 

Diff: https://reviews.apache.org/r/51182/diff/


Testing
-------

Compiles, site builds, all unit tests pass, and the distribution target handles the system properties as expected:
bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)


Thanks,

Attila Simon


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/
-----------------------------------------------------------

(Updated Aug. 17, 2016, 5:39 p.m.)


Review request for Flume.


Bugs: FLUME-2954
    https://issues.apache.org/jira/browse/FLUME-2954


Repository: flume-git


Description
-------

--------------------------------------------------------------------------------
flume-ng-channel                              ---
  flume-jdbc-channel                          ---
    JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
    JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
  flume-kafka-channel                         ---
    KafkaChannel#230 #253                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------
flume-ng-configuration                        ---
  FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-core                                 ---
  SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
  GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
  LoggerSink#95                               <- fail data: on purpose <KEPT>
  AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
  MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
  BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-embedded-agent                       ---
  EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sinks                                ---
  flume-hive-sink                             ---
    HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
        It may contain private data
        (the URI string may contain a password) and it is
        logged extensively within this module
        (appears in HiveSink#298 #342 #400 #403 #428,
        HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...).
        HiveEndPoint is also attached to exception logs.
  flume-ng-hbase-sink                         ---
    AsyncHBaseSink#641                        <- safe data: error details get logged in case of failure <KEPT>
  flume-ng-kafka-sink                         ---
    KafkaSink#179                             <- fail data: log whole message <REMOVED>
    KafkaSink#304                             <- fail properties <REMOVED>
  flume-ng-morphline-solr-sink                ---
    BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
    MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
--------------------------------------------------------------------------------
flume-ng-sources                              ---
  flume-kafka-source                          ---
    KafkaSource#247                           <- fail data: log whole <DRIVE BY PROPERTY>
  flume-twitter-source                        ---
    TwitterSource#110-113                     <- fail properties <REMOVED>
--------------------------------------------------------------------------------


Diffs (updated)
-----

  conf/flume-env.ps1.template 8bf535a 
  conf/flume-env.sh.template c8b660f 
  flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
  flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 90e3288 
  flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
  flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
  flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
  flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
  flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
  flume-ng-doc/sphinx/FlumeUserGuide.rst 1c15f1e 
  flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
  flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
  flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
  flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
  flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
  flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 

Diff: https://reviews.apache.org/r/51182/diff/


Testing
-------

Compiles, site builds, all unit tests pass, and the distribution target handles the system properties as expected:
bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)


Thanks,

Attila Simon


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.

> On Aug. 17, 2016, 4:13 p.m., Balázs Donát Bessenyei wrote:
> > flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java, line 179
> > <https://reviews.apache.org/r/51182/diff/1/?file=1476843#file1476843line179>
> >
> >     Some logging could be helpful, I guess (probably dropping the eventBody should be enough)

Logging that an event arrived, plus essentially two headers, for each processed event is something a LoggerSink can also do if really needed (during debugging). On the other hand, counters may be a better candidate than logs for monitoring such a metric.
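
To illustrate the counter idea: a per-event log line can be replaced by a thread-safe counter that monitoring reads on demand. The class below is a hypothetical stand-in written for this reply, not an existing Flume class and not what KafkaSink does today.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: count processed events instead of logging each one.
final class EventCounterSketch {
    private final AtomicLong processed = new AtomicLong();

    // Called once per processed event; the body is deliberately not inspected
    // or logged, so no raw data can end up in the log.
    void onEvent(byte[] body) {
        processed.incrementAndGet();
    }

    // Value a metrics reporter could poll.
    long processedCount() {
        return processed.get();
    }
}
```

A counter keeps the "how many events went through" signal without writing any event content to the log.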


> On Aug. 17, 2016, 4:13 p.m., Balázs Donát Bessenyei wrote:
> > flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java, line 139
> > <https://reviews.apache.org/r/51182/diff/1/?file=1476845#file1476845line139>
> >
> >     Can we just drop the "event" part?

Logging that an event arrived for each processed event is something a LoggerSink can also do if really needed (during debugging). On the other hand, counters may be a better candidate than logs for monitoring such a metric.


> On Aug. 17, 2016, 4:13 p.m., Balázs Donát Bessenyei wrote:
> > flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java, line 98
> > <https://reviews.apache.org/r/51182/diff/1/?file=1476833#file1476833line98>
> >
> >     Can we just skip the context? The "Initializing JDBC Channel provider" could be helpful.

Although the Application and AbstractConfigurationProvider classes provide related information, I added this back.


> On Aug. 17, 2016, 4:13 p.m., Balázs Donát Bessenyei wrote:
> > flume-ng-doc/sphinx/FlumeUserGuide.rst, lines 240-247
> > <https://reviews.apache.org/r/51182/diff/1/?file=1476840#file1476840line240>
> >
> >     nit: spelling errors + sometimes unclear meanings

I corrected some. Could you be a bit more specific?


> On Aug. 17, 2016, 4:13 p.m., Balázs Donát Bessenyei wrote:
> > flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java, line 244
> > <https://reviews.apache.org/r/51182/diff/1/?file=1476846#file1476846line244>
> >
> >     If LogRawDataUtil.LOG_RAWDATA is false, we have no logging. What if the Topic and Partition remain in the logs?

fixed and moved to trace level
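
The shape of that fix, as a sketch only (the actual diff is not quoted here): the topic and partition are always part of the message, and the event body is appended only when raw data logging is enabled. KafkaLogMessageSketch and describe are hypothetical names, and the rawDataEnabled parameter stands in for the LogRawDataUtil.LOG_RAWDATA check mentioned in the comment.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of building a log message that never drops the
// non-sensitive parts (topic, partition) when raw data logging is off.
final class KafkaLogMessageSketch {
    private KafkaLogMessageSketch() {}

    static String describe(String topic, int partition, byte[] body, boolean rawDataEnabled) {
        StringBuilder sb = new StringBuilder();
        sb.append("topic=").append(topic).append(", partition=").append(partition);
        if (rawDataEnabled) {
            // Only reached when the operator explicitly opted in to raw data.
            sb.append(", body=").append(new String(body, StandardCharsets.UTF_8));
        }
        return sb.toString();
    }
}
```

The resulting string would then be passed to the logger at trace level.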


- Attila


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review145992
-----------------------------------------------------------




Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Attila Simon <sa...@cloudera.com>.

> On Aug. 17, 2016, 4:13 p.m., Balázs Donát Bessenyei wrote:
> > flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java, line 139
> > <https://reviews.apache.org/r/51182/diff/1/?file=1476845#file1476845line139>
> >
> >     Can we just drop the "event" part?
> 
> Attila Simon wrote:
>     Logging that an event arrived for each processed event is something a LoggerSink can also do if really needed (during debugging). On the other hand, counters may be a better candidate than logs for monitoring such a metric.

fixed in version 4


> On Aug. 17, 2016, 4:13 p.m., Balázs Donát Bessenyei wrote:
> > flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java, line 179
> > <https://reviews.apache.org/r/51182/diff/1/?file=1476843#file1476843line179>
> >
> >     Some logging could be helpful, I guess (probably dropping the eventBody should be enough)
> 
> Attila Simon wrote:
>     Logging that an event arrived, plus essentially two headers, for each processed event is something a LoggerSink can also do if really needed (during debugging). On the other hand, counters may be a better candidate than logs for monitoring such a metric.

fixed in version 4


- Attila


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review145992
-----------------------------------------------------------


On Aug. 24, 2016, 8:46 a.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 24, 2016, 8:46 a.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (the URI string may contain a password)
>         and it is logged extensively within this module
>         (appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...).
>         HiveEndPoint is also attached to exception logs.
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details get logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole message <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 90e3288 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst 5e677c6 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> Compiles, site builds, all unit tests pass, and the distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>


Re: Review Request 51182: FLUME-2954: make raw data appearing in log messages explicit

Posted by Balázs Donát Bessenyei <be...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51182/#review145992
-----------------------------------------------------------




flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 
<https://reviews.apache.org/r/51182/#comment212342>

    Can we just skip the context? The "Initializing JDBC Channel provider" could be helpful.



flume-ng-doc/sphinx/FlumeUserGuide.rst (lines 240 - 247)
<https://reviews.apache.org/r/51182/#comment212344>

    nit: spelling errors + sometimes unclear meanings



flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 
<https://reviews.apache.org/r/51182/#comment212346>

    Some logging could be helpful, I guess (probably dropping the eventBody should be enough)



flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java 
<https://reviews.apache.org/r/51182/#comment212348>

    Can we just drop the "event" part?



flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java (line 244)
<https://reviews.apache.org/r/51182/#comment212349>

    If LogRawDataUtil.LOG_RAWDATA is false, we have no logging. What if the Topic and Partition remain in the logs?
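
    One way to read this suggestion: always log the topic and partition, and append the message body only when raw-data logging is enabled. The sketch below is an assumption-laden illustration, not the patch's code. The internals of LogRawDataUtil are not shown in this thread, so the class RawDataLoggingSketch and its describe method are hypothetical, and the flag is read directly from the org.apache.flume.log.rawdata system property mentioned in the Testing section.

```java
// Hypothetical illustration only: stands in for the guarded log call
// discussed for KafkaSource; it does not reproduce LogRawDataUtil.
public class RawDataLoggingSketch {
  // System property name taken from the Testing section of the request.
  static final String RAWDATA_PROP = "org.apache.flume.log.rawdata";

  // Builds the log message. Topic and partition are always included;
  // the body is appended only when raw-data logging is switched on.
  static String describe(String topic, int partition, String body,
                         boolean logRawData) {
    StringBuilder sb = new StringBuilder("Message received:");
    sb.append(" topic=").append(topic);
    sb.append(", partition=").append(partition);
    if (logRawData) {
      sb.append(", body=").append(body);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Boolean.getBoolean returns true only if the property is set to "true".
    boolean logRawData = Boolean.getBoolean(RAWDATA_PROP);
    System.out.println(describe("events", 3, "sensitive payload", logRawData));
  }
}
```

    Run with -Dorg.apache.flume.log.rawdata=true to include the body; without the property, only the metadata is printed.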


- Balázs Donát Bessenyei


On Aug. 17, 2016, 3:39 p.m., Attila Simon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51182/
> -----------------------------------------------------------
> 
> (Updated Aug. 17, 2016, 3:39 p.m.)
> 
> 
> Review request for Flume.
> 
> 
> Bugs: FLUME-2954
>     https://issues.apache.org/jira/browse/FLUME-2954
> 
> 
> Repository: flume-git
> 
> 
> Description
> -------
> 
> --------------------------------------------------------------------------------
> flume-ng-channel                              ---
>   flume-jdbc-channel                          ---
>     JdbcChannelProviderImpl#98                <- fail properties <REMOVED>
>     JdbcChannelProviderImpl#261 #431          <- fail properties: jdbc url might include password <KEPT><FOLLOWUP IN JIRA>
>   flume-kafka-channel                         ---
>     KafkaChannel#230 #253                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> flume-ng-configuration                        ---
>   FlumeConfiguration#315 #372                 <- fail properties <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-core                                 ---
>   SyslogAvroEventSerializer#150               <- fail data: SyslogEvent.message gets logged <DRIVE BY PROPERTY>
>   GangliaServer#224 #245                      <- safe data: only flume component metrics data <KEPT>
>   LoggerSink#95                               <- fail data: on purpose <KEPT>
>   AvroSource#347                              <- fail data: log whole message <DRIVE BY PROPERTY>
>   MultiportSyslogTCPSource#360                <- fail data: log whole message <DRIVE BY PROPERTY>
>   BLOBHandler#70                              <- fail data: logs http request headers <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-embedded-agent                       ---
>   EmbeddedAgent#155                           <- fail properties: printing all config <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sinks                                ---
>   flume-hive-sink                             ---
>     HiveEndPoint has a URI field.             <- fail properties <KEPT><FOLLOWUP IN JIRA>
>         It may contain private data
>         (the URI string may contain a password)
>         and it is logged extensively within this module
>         (appears in HiveSink#298 #342 #400 #403 #428,
>         HiveWriter#210 #319 #330 #337 #353 #365 #368 #407...).
>         HiveEndPoint is also attached to exception logs.
>   flume-ng-hbase-sink                         ---
>     AsyncHBaseSink#641                        <- safe data: error details get logged in case of failure <KEPT>
>   flume-ng-kafka-sink                         ---
>     KafkaSink#179                             <- fail data: log whole message <REMOVED>
>     KafkaSink#304                             <- fail properties <REMOVED>
>   flume-ng-morphline-solr-sink                ---
>     BlobHandler#98 #113                       <- fail data: log http request headers <DRIVE BY PROPERTY>
>     MorphlineSink#139                         <- fail data: logs event <DRIVE BY PROPERTY>
> --------------------------------------------------------------------------------
> flume-ng-sources                              ---
>   flume-kafka-source                          ---
>     KafkaSource#247                           <- fail data: log whole message <DRIVE BY PROPERTY>
>   flume-twitter-source                        ---
>     TwitterSource#110-113                     <- fail properties <REMOVED>
> --------------------------------------------------------------------------------
> 
> 
> Diffs
> -----
> 
>   conf/flume-env.ps1.template 8bf535a 
>   conf/flume-env.sh.template c8b660f 
>   flume-ng-channels/flume-jdbc-channel/src/main/java/org/apache/flume/channel/jdbc/impl/JdbcChannelProviderImpl.java 845b794 
>   flume-ng-channels/flume-kafka-channel/src/main/java/org/apache/flume/channel/kafka/KafkaChannel.java 90e3288 
>   flume-ng-configuration/src/main/java/org/apache/flume/conf/FlumeConfiguration.java 9b3a434 
>   flume-ng-core/src/main/java/org/apache/flume/source/AvroSource.java 8b9b956 
>   flume-ng-core/src/main/java/org/apache/flume/source/MultiportSyslogTCPSource.java b9f2438 
>   flume-ng-core/src/main/java/org/apache/flume/source/http/BLOBHandler.java e24d4c6 
>   flume-ng-core/src/test/java/org/apache/flume/serialization/SyslogAvroEventSerializer.java 05af3b1 
>   flume-ng-doc/sphinx/FlumeUserGuide.rst fde9ff7 
>   flume-ng-embedded-agent/src/main/java/org/apache/flume/agent/embedded/EmbeddedAgent.java ad3e138 
>   flume-ng-sdk/src/main/java/org/apache/flume/util/LogRawDataUtil.java PRE-CREATION 
>   flume-ng-sinks/flume-ng-kafka-sink/src/main/java/org/apache/flume/sink/kafka/KafkaSink.java 9453546 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/BlobHandler.java ca7614a 
>   flume-ng-sinks/flume-ng-morphline-solr-sink/src/main/java/org/apache/flume/sink/solr/morphline/MorphlineSink.java f7a73f3 
>   flume-ng-sources/flume-kafka-source/src/main/java/org/apache/flume/source/kafka/KafkaSource.java 90e4715 
>   flume-ng-sources/flume-twitter-source/src/main/java/org/apache/flume/source/twitter/TwitterSource.java f5c8328 
> 
> Diff: https://reviews.apache.org/r/51182/diff/
> 
> 
> Testing
> -------
> 
> Compiles, site builds, all unit tests pass, and the distribution target handles the system properties as expected:
> bin/flume-ng agent --conf conf --conf-file ../../../../../flume-conf/flume-log.conf --name a1 -Dflume.root.logger=DEBUG,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true (with and without the extra properties)
> 
> 
> Thanks,
> 
> Attila Simon
> 
>