Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/04/02 17:59:07 UTC

[GitHub] [airflow] millin opened a new pull request #15163: Fix exception caused by missing keys in the ES Record

millin opened a new pull request #15163:
URL: https://github.com/apache/airflow/pull/15163


   Optional LogRecord attributes (such as `exc_text`) cannot be added to `log_format`, because formatting then raises an exception.
   This happens because Elasticsearch drops keys with null values from the record, so the formatter cannot find them.
   
   Configurations to reproduce:
   ```
   [logging]
   remote_logging = True
   log_format = [%%(asctime)s] {%%(filename)s:%%(lineno)d} %%(levelname)s - %%(message)s - %%(exc_text)s
   
   [elasticsearch]
   json_format = True
   json_fields = asctime, filename, lineno, levelname, message, exc_text
   ```
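
The failure mode described above can be reproduced with plain %-style formatting, outside Airflow. This is only a sketch: the `es_hit` dictionary and the `try_format` helper are hypothetical names chosen for illustration, and the format string is the `log_format` from the configuration with the config-file `%%` escapes written as single `%`.

```python
# Illustration only: plain %-style string formatting, not Airflow code.
LOG_FORMAT = (
    "[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s"
    " - %(message)s - %(exc_text)s"
)

# Hypothetical document as it might come back from Elasticsearch:
# exc_text was null when the record was written, so the key is absent.
es_hit = {
    "asctime": "2020-12-24 19:25:00,962",
    "filename": "taskinstance.py",
    "lineno": 851,
    "levelname": "INFO",
    "message": "some random stuff",
}


def try_format(fmt, fields):
    """Return the formatted line, or None if a referenced key is missing."""
    try:
        return fmt % fields
    except KeyError:
        return None


# The format string references exc_text, which the record lacks.
missing = try_format(LOG_FORMAT, es_hit)

# Supplying an empty value for the dropped key makes formatting succeed.
filled = try_format(LOG_FORMAT, {**es_hit, "exc_text": ""})
```

Here `missing` is `None` because the lookup of `exc_text` fails, while `filled` is a complete log line that ends with an empty `exc_text` value.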
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] github-actions[bot] commented on pull request #15163: Fix exception caused by missing keys in the ES Record

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #15163:
URL: https://github.com/apache/airflow/pull/15163#issuecomment-817205934


   The PR is likely OK to be merged with just a subset of tests for the default Python and Database versions, without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full test matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest master or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] kaxil commented on a change in pull request #15163: Fix exception caused by missing keys in the ES Record

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #15163:
URL: https://github.com/apache/airflow/pull/15163#discussion_r610939364



##########
File path: tests/providers/elasticsearch/log/test_es_task_handler.py
##########
@@ -253,7 +253,9 @@ def test_set_context_w_json_format_and_write_stdout(self):
 
     def test_read_with_json_format(self):
         ts = pendulum.now()
-        formatter = logging.Formatter('[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s')
+        formatter = logging.Formatter(
+            '[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s - %(exc_text)s'
+        )

Review comment:
       This test passes without the changes in airflow/providers/elasticsearch/log/es_task_handler.py.
   
   Ideally, the test should fail without the actual fix in the PR.







[GitHub] [airflow] kaxil commented on a change in pull request #15163: Fix exception caused by missing keys in the ES Record

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #15163:
URL: https://github.com/apache/airflow/pull/15163#discussion_r606438807



##########
File path: airflow/providers/elasticsearch/log/es_task_handler.py
##########
@@ -207,7 +207,7 @@ def _format_msg(self, log_line):
         if self.json_format:
             try:
                 # pylint: disable=protected-access
-                return self.formatter._style.format(_ESJsonLogFmt(**log_line.to_dict()))
+                return self.formatter._style.format(_ESJsonLogFmt(self.json_fields, **log_line.to_dict()))

Review comment:
       @millin Thanks for the PR. Can you please also add a test case, or update an existing test, in https://github.com/apache/airflow/blob/master/tests/providers/elasticsearch/log/test_es_task_handler.py?
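
The diff above passes `self.json_fields` into `_ESJsonLogFmt`. One way such a helper could use that argument is sketched below, assuming it pre-fills every configured field with an empty string before applying the values that Elasticsearch returned. `JsonLogFmtSketch` and the sample `hit` are hypothetical; this is not presented as the merged implementation.

```python
import logging


class JsonLogFmtSketch:
    """Hypothetical stand-in for the helper that receives json_fields."""

    def __init__(self, json_fields, **kwargs):
        # Assumption: default every configured field to an empty string,
        # so keys dropped by Elasticsearch still resolve when formatting.
        for field in json_fields:
            setattr(self, field, "")
        # Values actually present in the document override the defaults.
        self.__dict__.update(kwargs)


json_fields = ["asctime", "filename", "lineno", "levelname", "message", "exc_text"]

# Sample document without exc_text, mirroring the test body in this PR.
hit = {
    "asctime": "2020-12-24 19:25:00,962",
    "filename": "taskinstance.py",
    "lineno": 851,
    "levelname": "INFO",
    "message": "some random stuff",
}

formatter = logging.Formatter(
    "[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s - %(exc_text)s"
)

# Same call shape as in the diff; _style is a private attribute of Formatter.
# pylint: disable=protected-access
line = formatter._style.format(JsonLogFmtSketch(json_fields, **hit))
```

An empty-string default only suits `%s` placeholders. A field formatted with `%d`, such as `lineno`, would still fail if it were missing from the document.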







[GitHub] [airflow] millin commented on pull request #15163: Fix exception caused by missing keys in the ES Record

Posted by GitBox <gi...@apache.org>.
millin commented on pull request #15163:
URL: https://github.com/apache/airflow/pull/15163#issuecomment-815624162


   @kaxil I have no idea why the Quarantined tests failed. It seems this is not related to my changes.





[GitHub] [airflow] kaxil merged pull request #15163: Fix exception caused by missing keys in the ES Record

Posted by GitBox <gi...@apache.org>.
kaxil merged pull request #15163:
URL: https://github.com/apache/airflow/pull/15163


   





[GitHub] [airflow] millin commented on a change in pull request #15163: Fix exception caused by missing keys in the ES Record

Posted by GitBox <gi...@apache.org>.
millin commented on a change in pull request #15163:
URL: https://github.com/apache/airflow/pull/15163#discussion_r606630498



##########
File path: airflow/providers/elasticsearch/log/es_task_handler.py
##########
@@ -207,7 +207,7 @@ def _format_msg(self, log_line):
         if self.json_format:
             try:
                 # pylint: disable=protected-access
-                return self.formatter._style.format(_ESJsonLogFmt(**log_line.to_dict()))
+                return self.formatter._style.format(_ESJsonLogFmt(self.json_fields, **log_line.to_dict()))

Review comment:
       Ok, I'll try







[GitHub] [airflow] millin commented on a change in pull request #15163: Fix exception caused by missing keys in the ES Record

Posted by GitBox <gi...@apache.org>.
millin commented on a change in pull request #15163:
URL: https://github.com/apache/airflow/pull/15163#discussion_r608489666



##########
File path: airflow/providers/elasticsearch/log/es_task_handler.py
##########
@@ -207,7 +207,7 @@ def _format_msg(self, log_line):
         if self.json_format:
             try:
                 # pylint: disable=protected-access
-                return self.formatter._style.format(_ESJsonLogFmt(**log_line.to_dict()))
+                return self.formatter._style.format(_ESJsonLogFmt(self.json_fields, **log_line.to_dict()))

Review comment:
       I have updated the test







[GitHub] [airflow] kaxil commented on a change in pull request #15163: Fix exception caused by missing keys in the ES Record

Posted by GitBox <gi...@apache.org>.
kaxil commented on a change in pull request #15163:
URL: https://github.com/apache/airflow/pull/15163#discussion_r611096501



##########
File path: tests/providers/elasticsearch/log/test_es_task_handler.py
##########
@@ -253,7 +253,9 @@ def test_set_context_w_json_format_and_write_stdout(self):
 
     def test_read_with_json_format(self):
         ts = pendulum.now()
-        formatter = logging.Formatter('[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s')
+        formatter = logging.Formatter(
+            '[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s - %(exc_text)s'
+        )

Review comment:
       Oh yeah -- it failed for me now -- I might have missed something before







[GitHub] [airflow] millin commented on a change in pull request #15163: Fix exception caused by missing keys in the ES Record

Posted by GitBox <gi...@apache.org>.
millin commented on a change in pull request #15163:
URL: https://github.com/apache/airflow/pull/15163#discussion_r611035166



##########
File path: tests/providers/elasticsearch/log/test_es_task_handler.py
##########
@@ -253,7 +253,9 @@ def test_set_context_w_json_format_and_write_stdout(self):
 
     def test_read_with_json_format(self):
         ts = pendulum.now()
-        formatter = logging.Formatter('[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s')
+        formatter = logging.Formatter(
+            '[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s - %(exc_text)s'
+        )

Review comment:
       It failed for me without these fixes:
   
   ```
   tests/providers/elasticsearch/log/test_es_task_handler.py:253 (TestElasticsearchTaskHandler.test_read_with_json_format)
   [2020-12-24 19:25:00,962] {taskinstance.py:851} INFO - some random stuff -  != some random stuff
   
   Expected :some random stuff
   Actual   :[2020-12-24 19:25:00,962] {taskinstance.py:851} INFO - some random stuff - 
   
   self = <tests.providers.elasticsearch.log.test_es_task_handler.TestElasticsearchTaskHandler testMethod=test_read_with_json_format>
   
       def test_read_with_json_format(self):
           ts = pendulum.now()
           formatter = logging.Formatter(
               '[%(asctime)s] {%(filename)s:%(lineno)d} %(levelname)s - %(message)s - %(exc_text)s'
           )
           self.es_task_handler.formatter = formatter
           self.es_task_handler.json_format = True
       
           self.body = {
               'message': self.test_message,
               'log_id': f'{self.DAG_ID}-{self.TASK_ID}-2016_01_01T00_00_00_000000-1',
               'offset': 1,
               'asctime': '2020-12-24 19:25:00,962',
               'filename': 'taskinstance.py',
               'lineno': 851,
               'levelname': 'INFO',
           }
           self.es_task_handler.set_context(self.ti)
           self.es.index(index=self.index_name, doc_type=self.doc_type, body=self.body, id=id)
       
           logs, _ = self.es_task_handler.read(
               self.ti, 1, {'offset': 0, 'last_log_timestamp': str(ts), 'end_of_log': False}
           )
   >       assert "[2020-12-24 19:25:00,962] {taskinstance.py:851} INFO - some random stuff - " == logs[0][0][1]
   E       AssertionError: assert '[2020-12-24 19:25:00,962] {taskinstance.py:851} INFO - some random stuff - ' == 'some random stuff'
   
   log/test_es_task_handler.py:277: AssertionError
   ```



