You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/04/02 00:52:57 UTC

[GitHub] [airflow] whynick1 opened a new pull request #8049: Mysql to gcs

whynick1 opened a new pull request #8049: Mysql to gcs
URL: https://github.com/apache/airflow/pull/8049
 
 
   ---
   Issue link: [AIRFLOW-7117](https://issues.apache.org/jira/browse/AIRFLOW-7117)
   
   Make sure to mark the boxes below before creating PR: [x]
   
   - [x] Description above provides context of the change
   - [x] Unit tests coverage for changes (not needed for documentation changes)
   - [x] Commits follow "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)"
   - [x] Relevant documentation is updated including usage instructions.
   - [x] I will engage committers as explained in [Contribution Workflow Example](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#contribution-workflow-example).
   
   ---
   In case of fundamental code change, Airflow Improvement Proposal ([AIP](https://cwiki.apache.org/confluence/display/AIRFLOW/Airflow+Improvements+Proposals)) is needed.
   In case of a new dependency, check compliance with the [ASF 3rd Party License Policy](https://www.apache.org/legal/resolved.html#category-x).
   In case of backwards incompatible changes please leave a note in [UPDATING.md](https://github.com/apache/airflow/blob/master/UPDATING.md).
   Read the [Pull Request Guidelines](https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst#pull-request-guidelines) for more information.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#issuecomment-608169012
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=h1) Report
   > Merging [#8049](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/340e947d5db0ae1a5d5ca2d464fb78779acadc69&el=desc) will **decrease** coverage by `0.23%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/8049/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #8049      +/-   ##
   ==========================================
   - Coverage   87.25%   87.01%   -0.24%     
   ==========================================
     Files         935      937       +2     
     Lines       45384    45555     +171     
   ==========================================
   + Hits        39601    39641      +40     
   - Misses       5783     5914     +131     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...low/providers/google/cloud/operators/sql\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9zcWxfdG9fZ2NzLnB5) | `93.91% <100.00%> (+1.25%)` | :arrow_up: |
   | [airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==) | `44.44% <0.00%> (-55.56%)` | :arrow_down: |
   | [airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==) | `52.94% <0.00%> (-47.06%)` | :arrow_down: |
   | [airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==) | `47.18% <0.00%> (-45.08%)` | :arrow_down: |
   | [...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==) | `69.69% <0.00%> (-25.26%)` | :arrow_down: |
   | [airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5) | `50.98% <0.00%> (-23.53%)` | :arrow_down: |
   | [airflow/providers/google/ads/hooks/ads.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Fkcy9ob29rcy9hZHMucHk=) | `71.79% <0.00%> (-6.78%)` | :arrow_down: |
   | [airflow/models/renderedtifields.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMvcmVuZGVyZWR0aWZpZWxkcy5weQ==) | `92.85% <0.00%> (-2.98%)` | :arrow_down: |
   | [airflow/www/app.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `92.40% <0.00%> (-1.89%)` | :arrow_down: |
   | [airflow/www/utils.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdXRpbHMucHk=) | `80.85% <0.00%> (-1.02%)` | :arrow_down: |
   | ... and [15 more](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=footer). Last update [340e947...ec32f57](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#issuecomment-608169012
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=h1) Report
   > Merging [#8049](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/340e947d5db0ae1a5d5ca2d464fb78779acadc69&el=desc) will **decrease** coverage by `0.23%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/8049/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #8049      +/-   ##
   ==========================================
   - Coverage   87.25%   87.01%   -0.24%     
   ==========================================
     Files         935      937       +2     
     Lines       45384    45555     +171     
   ==========================================
   + Hits        39601    39641      +40     
   - Misses       5783     5914     +131     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...low/providers/google/cloud/operators/sql\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9zcWxfdG9fZ2NzLnB5) | `93.91% <100.00%> (+1.25%)` | :arrow_up: |
   | [airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==) | `44.44% <0.00%> (-55.56%)` | :arrow_down: |
   | [airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==) | `52.94% <0.00%> (-47.06%)` | :arrow_down: |
   | [airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==) | `47.18% <0.00%> (-45.08%)` | :arrow_down: |
   | [...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==) | `69.69% <0.00%> (-25.26%)` | :arrow_down: |
   | [airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5) | `50.98% <0.00%> (-23.53%)` | :arrow_down: |
   | [airflow/providers/google/ads/hooks/ads.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Fkcy9ob29rcy9hZHMucHk=) | `71.79% <0.00%> (-6.78%)` | :arrow_down: |
   | [airflow/models/renderedtifields.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMvcmVuZGVyZWR0aWZpZWxkcy5weQ==) | `92.85% <0.00%> (-2.98%)` | :arrow_down: |
   | [airflow/www/app.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `92.40% <0.00%> (-1.89%)` | :arrow_down: |
   | [airflow/www/utils.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdXRpbHMucHk=) | `80.85% <0.00%> (-1.02%)` | :arrow_down: |
   | ... and [15 more](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=footer). Last update [340e947...ec32f57](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#issuecomment-608169012
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=h1) Report
   > Merging [#8049](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/340e947d5db0ae1a5d5ca2d464fb78779acadc69&el=desc) will **decrease** coverage by `0.23%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/8049/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #8049      +/-   ##
   ==========================================
   - Coverage   87.25%   87.01%   -0.24%     
   ==========================================
     Files         935      937       +2     
     Lines       45384    45555     +171     
   ==========================================
   + Hits        39601    39641      +40     
   - Misses       5783     5914     +131     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...low/providers/google/cloud/operators/sql\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9zcWxfdG9fZ2NzLnB5) | `93.91% <100.00%> (+1.25%)` | :arrow_up: |
   | [airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==) | `44.44% <0.00%> (-55.56%)` | :arrow_down: |
   | [airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==) | `52.94% <0.00%> (-47.06%)` | :arrow_down: |
   | [airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==) | `47.18% <0.00%> (-45.08%)` | :arrow_down: |
   | [...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==) | `69.69% <0.00%> (-25.26%)` | :arrow_down: |
   | [airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5) | `50.98% <0.00%> (-23.53%)` | :arrow_down: |
   | [airflow/providers/google/ads/hooks/ads.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Fkcy9ob29rcy9hZHMucHk=) | `71.79% <0.00%> (-6.78%)` | :arrow_down: |
   | [airflow/models/renderedtifields.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMvcmVuZGVyZWR0aWZpZWxkcy5weQ==) | `92.85% <0.00%> (-2.98%)` | :arrow_down: |
   | [airflow/www/app.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `92.40% <0.00%> (-1.89%)` | :arrow_down: |
   | [airflow/www/utils.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdXRpbHMucHk=) | `80.85% <0.00%> (-1.02%)` | :arrow_down: |
   | ... and [15 more](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=footer). Last update [340e947...ec32f57](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#discussion_r402558274
 
 

 ##########
 File path: airflow/providers/google/cloud/operators/sql_to_gcs.py
 ##########
 @@ -251,18 +252,20 @@ def _get_col_type_dict(self):
     def _write_local_schema_file(self, cursor):
         """
         Takes a cursor, and writes the BigQuery schema for the results to a
-        local file system.
+        local file system. Schema for database will be read from cursor if
+        not specified.
 
         :return: A dictionary where key is a filename to be used as an object
             name in GCS, and values are file handles to local files that
             contains the BigQuery schema fields in .json format.
         """
-        schema = [self.field_to_bigquery(field) for field in cursor.description]
+        schema = self.schema or [self.field_to_bigquery(field) for field in cursor.description]
 
         self.log.info('Using schema for %s', self.schema_filename)
         self.log.debug("Current schema: %s", schema)
         tmp_schema_file_handle = NamedTemporaryFile(delete=True)
-        tmp_schema_file_handle.write(json.dumps(schema, sort_keys=True).encode('utf-8'))
+        schema_json = json.dumps(schema, sort_keys=True) if isinstance(schema, list) else schema
+        tmp_schema_file_handle.write(schema_json.encode('utf-8'))
 
 Review comment:
   LGTM

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#discussion_r402214684
 
 

 ##########
 File path: airflow/providers/google/cloud/operators/sql_to_gcs.py
 ##########
 @@ -251,18 +252,20 @@ def _get_col_type_dict(self):
     def _write_local_schema_file(self, cursor):
         """
         Takes a cursor, and writes the BigQuery schema for the results to a
-        local file system.
+        local file system. Schema for database will be read from cursor if
+        not specified.
 
         :return: A dictionary where key is a filename to be used as an object
             name in GCS, and values are file handles to local files that
             contains the BigQuery schema fields in .json format.
         """
-        schema = [self.field_to_bigquery(field) for field in cursor.description]
+        schema = self.schema or [self.field_to_bigquery(field) for field in cursor.description]
 
         self.log.info('Using schema for %s', self.schema_filename)
         self.log.debug("Current schema: %s", schema)
         tmp_schema_file_handle = NamedTemporaryFile(delete=True)
-        tmp_schema_file_handle.write(json.dumps(schema, sort_keys=True).encode('utf-8'))
+        schema_json = json.dumps(schema, sort_keys=True) if isinstance(schema, list) else schema
+        tmp_schema_file_handle.write(schema_json.encode('utf-8'))
 
 Review comment:
   It looks overly complicated. What do you think about the following code?
   ```python
           if self.shcema:
               schema_json = self.schema
               self.log.info("Using user schema")
           else:
               self.log.info("Starts generating schema")
               schema = self.schema or [self.field_to_bigquery(field) for field in cursor.description]            
               tmp_schema_file_handle = NamedTemporaryFile(delete=True)
               schema_json = json.dumps(schema, sort_keys=True)
   
           self.log.info('Using schema for %s', self.schema_filename)
           self.log.debug("Current schema: %s", schema)
   
           tmp_schema_file_handle.write(schema_json.encode('utf-8'))
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] whynick1 commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
whynick1 commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#discussion_r402507732
 
 

 ##########
 File path: airflow/providers/google/cloud/operators/sql_to_gcs.py
 ##########
 @@ -251,18 +252,20 @@ def _get_col_type_dict(self):
     def _write_local_schema_file(self, cursor):
         """
         Takes a cursor, and writes the BigQuery schema for the results to a
-        local file system.
+        local file system. Schema for database will be read from cursor if
+        not specified.
 
         :return: A dictionary where key is a filename to be used as an object
             name in GCS, and values are file handles to local files that
             contains the BigQuery schema fields in .json format.
         """
-        schema = [self.field_to_bigquery(field) for field in cursor.description]
+        schema = self.schema or [self.field_to_bigquery(field) for field in cursor.description]
 
         self.log.info('Using schema for %s', self.schema_filename)
         self.log.debug("Current schema: %s", schema)
         tmp_schema_file_handle = NamedTemporaryFile(delete=True)
-        tmp_schema_file_handle.write(json.dumps(schema, sort_keys=True).encode('utf-8'))
+        schema_json = json.dumps(schema, sort_keys=True) if isinstance(schema, list) else schema
+        tmp_schema_file_handle.write(schema_json.encode('utf-8'))
 
 Review comment:
   Thanks for the review! I like the structure.
   Just to be clear, `self.schema` could be either `str` or `list`, and we don't want to call `json.dumps()` on a string. How about this edited version? 
   ```
           if self.shcema:
               self.log.info("Using user schema")
               schema = self.schema
           else:
               self.log.info("Starts generating schema")
               schema = [self.field_to_bigquery(field) for field in cursor.description]            
           
           if isinstance(schema, list):
               schema = json.dumps(schema, sort_keys=True)
   
           self.log.info('Using schema for %s', self.schema_filename)
           self.log.debug("Current schema: %s", schema)
   
           tmp_schema_file_handle = NamedTemporaryFile(delete=True)
           tmp_schema_file_handle.write(schema.encode('utf-8'))
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] codecov-io commented on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
codecov-io commented on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#issuecomment-608169012
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=h1) Report
   > Merging [#8049](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/340e947d5db0ae1a5d5ca2d464fb78779acadc69&el=desc) will **decrease** coverage by `0.23%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/8049/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #8049      +/-   ##
   ==========================================
   - Coverage   87.25%   87.01%   -0.24%     
   ==========================================
     Files         935      937       +2     
     Lines       45384    45555     +171     
   ==========================================
   + Hits        39601    39641      +40     
   - Misses       5783     5914     +131     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...low/providers/google/cloud/operators/sql\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9zcWxfdG9fZ2NzLnB5) | `93.91% <100.00%> (+1.25%)` | :arrow_up: |
   | [airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==) | `44.44% <0.00%> (-55.56%)` | :arrow_down: |
   | [airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==) | `52.94% <0.00%> (-47.06%)` | :arrow_down: |
   | [airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==) | `47.18% <0.00%> (-45.08%)` | :arrow_down: |
   | [...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==) | `69.69% <0.00%> (-25.26%)` | :arrow_down: |
   | [airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5) | `50.98% <0.00%> (-23.53%)` | :arrow_down: |
   | [airflow/providers/google/ads/hooks/ads.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Fkcy9ob29rcy9hZHMucHk=) | `71.79% <0.00%> (-6.78%)` | :arrow_down: |
   | [airflow/models/renderedtifields.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMvcmVuZGVyZWR0aWZpZWxkcy5weQ==) | `92.85% <0.00%> (-2.98%)` | :arrow_down: |
   | [airflow/www/app.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `92.40% <0.00%> (-1.89%)` | :arrow_down: |
   | [airflow/www/utils.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdXRpbHMucHk=) | `80.85% <0.00%> (-1.02%)` | :arrow_down: |
   | ... and [15 more](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=footer). Last update [340e947...ec32f57](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#discussion_r402214684
 
 

 ##########
 File path: airflow/providers/google/cloud/operators/sql_to_gcs.py
 ##########
 @@ -251,18 +252,20 @@ def _get_col_type_dict(self):
     def _write_local_schema_file(self, cursor):
         """
         Takes a cursor, and writes the BigQuery schema for the results to a
-        local file system.
+        local file system. Schema for database will be read from cursor if
+        not specified.
 
         :return: A dictionary where key is a filename to be used as an object
             name in GCS, and values are file handles to local files that
             contains the BigQuery schema fields in .json format.
         """
-        schema = [self.field_to_bigquery(field) for field in cursor.description]
+        schema = self.schema or [self.field_to_bigquery(field) for field in cursor.description]
 
         self.log.info('Using schema for %s', self.schema_filename)
         self.log.debug("Current schema: %s", schema)
         tmp_schema_file_handle = NamedTemporaryFile(delete=True)
-        tmp_schema_file_handle.write(json.dumps(schema, sort_keys=True).encode('utf-8'))
+        schema_json = json.dumps(schema, sort_keys=True) if isinstance(schema, list) else schema
+        tmp_schema_file_handle.write(schema_json.encode('utf-8'))
 
 Review comment:
   It looks overly complicated. What do you think about the following code?
   ```
           if self.shcema:
               schema_json = self.schemaa
           else:
               schema = self.schema or [self.field_to_bigquery(field) for field in cursor.description]            
               tmp_schema_file_handle = NamedTemporaryFile(delete=True)
               schema_json = json.dumps(schema, sort_keys=True)
   
           self.log.info('Using schema for %s', self.schema_filename)
           self.log.debug("Current schema: %s", schema)
   
           tmp_schema_file_handle.write(schema_json.encode('utf-8'))
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#discussion_r402211566
 
 

 ##########
 File path: airflow/providers/google/cloud/operators/sql_to_gcs.py
 ##########
 @@ -25,6 +25,7 @@
 from tempfile import NamedTemporaryFile
 
 import unicodecsv as csv
+from six import string_types
 
 Review comment:
   We don't use six on master branch anymore.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj merged pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
mik-laj merged pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] whynick1 commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
whynick1 commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#discussion_r402494163
 
 

 ##########
 File path: airflow/providers/google/cloud/operators/sql_to_gcs.py
 ##########
 @@ -25,6 +25,7 @@
 from tempfile import NamedTemporaryFile
 
 import unicodecsv as csv
+from six import string_types
 
 Review comment:
   Woops, I will remove that! Good catch.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#issuecomment-608169012
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=h1) Report
   > Merging [#8049](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/340e947d5db0ae1a5d5ca2d464fb78779acadc69&el=desc) will **decrease** coverage by `0.23%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/8049/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #8049      +/-   ##
   ==========================================
   - Coverage   87.25%   87.01%   -0.24%     
   ==========================================
     Files         935      937       +2     
     Lines       45384    45555     +171     
   ==========================================
   + Hits        39601    39641      +40     
   - Misses       5783     5914     +131     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...low/providers/google/cloud/operators/sql\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9zcWxfdG9fZ2NzLnB5) | `93.91% <100.00%> (+1.25%)` | :arrow_up: |
   | [airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==) | `44.44% <0.00%> (-55.56%)` | :arrow_down: |
   | [airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==) | `52.94% <0.00%> (-47.06%)` | :arrow_down: |
   | [airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==) | `47.18% <0.00%> (-45.08%)` | :arrow_down: |
   | [...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==) | `69.69% <0.00%> (-25.26%)` | :arrow_down: |
   | [airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5) | `50.98% <0.00%> (-23.53%)` | :arrow_down: |
   | [airflow/providers/google/ads/hooks/ads.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Fkcy9ob29rcy9hZHMucHk=) | `71.79% <0.00%> (-6.78%)` | :arrow_down: |
   | [airflow/models/renderedtifields.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMvcmVuZGVyZWR0aWZpZWxkcy5weQ==) | `92.85% <0.00%> (-2.98%)` | :arrow_down: |
   | [airflow/www/app.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `92.40% <0.00%> (-1.89%)` | :arrow_down: |
   | [airflow/www/utils.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdXRpbHMucHk=) | `80.85% <0.00%> (-1.02%)` | :arrow_down: |
   | ... and [15 more](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=footer). Last update [340e947...ec32f57](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#issuecomment-608169012
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=h1) Report
   > Merging [#8049](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/340e947d5db0ae1a5d5ca2d464fb78779acadc69&el=desc) will **decrease** coverage by `0.23%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/8049/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #8049      +/-   ##
   ==========================================
   - Coverage   87.25%   87.01%   -0.24%     
   ==========================================
     Files         935      937       +2     
     Lines       45384    45555     +171     
   ==========================================
   + Hits        39601    39641      +40     
   - Misses       5783     5914     +131     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...low/providers/google/cloud/operators/sql\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9zcWxfdG9fZ2NzLnB5) | `93.91% <100.00%> (+1.25%)` | :arrow_up: |
   | [airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==) | `44.44% <0.00%> (-55.56%)` | :arrow_down: |
   | [airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==) | `52.94% <0.00%> (-47.06%)` | :arrow_down: |
   | [airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==) | `47.18% <0.00%> (-45.08%)` | :arrow_down: |
   | [...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==) | `69.69% <0.00%> (-25.26%)` | :arrow_down: |
   | [airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5) | `50.98% <0.00%> (-23.53%)` | :arrow_down: |
   | [airflow/providers/google/ads/hooks/ads.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Fkcy9ob29rcy9hZHMucHk=) | `71.79% <0.00%> (-6.78%)` | :arrow_down: |
   | [airflow/models/renderedtifields.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMvcmVuZGVyZWR0aWZpZWxkcy5weQ==) | `92.85% <0.00%> (-2.98%)` | :arrow_down: |
   | [airflow/www/app.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `92.40% <0.00%> (-1.89%)` | :arrow_down: |
   | [airflow/www/utils.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdXRpbHMucHk=) | `80.85% <0.00%> (-1.02%)` | :arrow_down: |
   | ... and [15 more](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=footer). Last update [340e947...ec32f57](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] whynick1 commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
whynick1 commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#discussion_r402507732
 
 

 ##########
 File path: airflow/providers/google/cloud/operators/sql_to_gcs.py
 ##########
 @@ -251,18 +252,20 @@ def _get_col_type_dict(self):
     def _write_local_schema_file(self, cursor):
         """
         Takes a cursor, and writes the BigQuery schema for the results to a
-        local file system.
+        local file system. Schema for database will be read from cursor if
+        not specified.
 
         :return: A dictionary where key is a filename to be used as an object
             name in GCS, and values are file handles to local files that
             contains the BigQuery schema fields in .json format.
         """
-        schema = [self.field_to_bigquery(field) for field in cursor.description]
+        schema = self.schema or [self.field_to_bigquery(field) for field in cursor.description]
 
         self.log.info('Using schema for %s', self.schema_filename)
         self.log.debug("Current schema: %s", schema)
         tmp_schema_file_handle = NamedTemporaryFile(delete=True)
-        tmp_schema_file_handle.write(json.dumps(schema, sort_keys=True).encode('utf-8'))
+        schema_json = json.dumps(schema, sort_keys=True) if isinstance(schema, list) else schema
+        tmp_schema_file_handle.write(schema_json.encode('utf-8'))
 
 Review comment:
   Thanks for the review! I like the structure.
   Just to be clear, `self.schema` could be either `str` or `list`, and we don't want to call `json.dumps()` on a string. How about this edited version? @mik-laj 
   ```
           if self.shcema:
               self.log.info("Using user schema")
               schema = self.schema
           else:
               self.log.info("Starts generating schema")
               schema = [self.field_to_bigquery(field) for field in cursor.description]            
           
           if isinstance(schema, list):
               schema = json.dumps(schema, sort_keys=True)
   
           self.log.info('Using schema for %s', self.schema_filename)
           self.log.debug("Current schema: %s", schema)
   
           tmp_schema_file_handle = NamedTemporaryFile(delete=True)
           tmp_schema_file_handle.write(schema.encode('utf-8'))
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#discussion_r402214684
 
 

 ##########
 File path: airflow/providers/google/cloud/operators/sql_to_gcs.py
 ##########
 @@ -251,18 +252,20 @@ def _get_col_type_dict(self):
     def _write_local_schema_file(self, cursor):
         """
         Takes a cursor, and writes the BigQuery schema for the results to a
-        local file system.
+        local file system. Schema for database will be read from cursor if
+        not specified.
 
         :return: A dictionary where key is a filename to be used as an object
             name in GCS, and values are file handles to local files that
             contains the BigQuery schema fields in .json format.
         """
-        schema = [self.field_to_bigquery(field) for field in cursor.description]
+        schema = self.schema or [self.field_to_bigquery(field) for field in cursor.description]
 
         self.log.info('Using schema for %s', self.schema_filename)
         self.log.debug("Current schema: %s", schema)
         tmp_schema_file_handle = NamedTemporaryFile(delete=True)
-        tmp_schema_file_handle.write(json.dumps(schema, sort_keys=True).encode('utf-8'))
+        schema_json = json.dumps(schema, sort_keys=True) if isinstance(schema, list) else schema
+        tmp_schema_file_handle.write(schema_json.encode('utf-8'))
 
 Review comment:
   It looks overly complicated. What do you think about the following code?
   ```python
           if self.shcema:
               schema_json = self.schemaa
           else:
               schema = self.schema or [self.field_to_bigquery(field) for field in cursor.description]            
               tmp_schema_file_handle = NamedTemporaryFile(delete=True)
               schema_json = json.dumps(schema, sort_keys=True)
   
           self.log.info('Using schema for %s', self.schema_filename)
           self.log.debug("Current schema: %s", schema)
   
           tmp_schema_file_handle.write(schema_json.encode('utf-8'))
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#issuecomment-608169012
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=h1) Report
   > Merging [#8049](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/340e947d5db0ae1a5d5ca2d464fb78779acadc69&el=desc) will **decrease** coverage by `0.23%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/8049/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #8049      +/-   ##
   ==========================================
   - Coverage   87.25%   87.01%   -0.24%     
   ==========================================
     Files         935      937       +2     
     Lines       45384    45555     +171     
   ==========================================
   + Hits        39601    39641      +40     
   - Misses       5783     5914     +131     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...low/providers/google/cloud/operators/sql\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9zcWxfdG9fZ2NzLnB5) | `93.91% <100.00%> (+1.25%)` | :arrow_up: |
   | [airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==) | `44.44% <0.00%> (-55.56%)` | :arrow_down: |
   | [airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==) | `52.94% <0.00%> (-47.06%)` | :arrow_down: |
   | [airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==) | `47.18% <0.00%> (-45.08%)` | :arrow_down: |
   | [...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==) | `69.69% <0.00%> (-25.26%)` | :arrow_down: |
   | [airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5) | `50.98% <0.00%> (-23.53%)` | :arrow_down: |
   | [airflow/providers/google/ads/hooks/ads.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Fkcy9ob29rcy9hZHMucHk=) | `71.79% <0.00%> (-6.78%)` | :arrow_down: |
   | [airflow/models/renderedtifields.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMvcmVuZGVyZWR0aWZpZWxkcy5weQ==) | `92.85% <0.00%> (-2.98%)` | :arrow_down: |
   | [airflow/www/app.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `92.40% <0.00%> (-1.89%)` | :arrow_down: |
   | [airflow/www/utils.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdXRpbHMucHk=) | `80.85% <0.00%> (-1.02%)` | :arrow_down: |
   | ... and [15 more](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=footer). Last update [340e947...ec32f57](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] mik-laj commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
mik-laj commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#discussion_r402214684
 
 

 ##########
 File path: airflow/providers/google/cloud/operators/sql_to_gcs.py
 ##########
 @@ -251,18 +252,20 @@ def _get_col_type_dict(self):
     def _write_local_schema_file(self, cursor):
         """
         Takes a cursor, and writes the BigQuery schema for the results to a
-        local file system.
+        local file system. Schema for database will be read from cursor if
+        not specified.
 
         :return: A dictionary where key is a filename to be used as an object
             name in GCS, and values are file handles to local files that
             contains the BigQuery schema fields in .json format.
         """
-        schema = [self.field_to_bigquery(field) for field in cursor.description]
+        schema = self.schema or [self.field_to_bigquery(field) for field in cursor.description]
 
         self.log.info('Using schema for %s', self.schema_filename)
         self.log.debug("Current schema: %s", schema)
         tmp_schema_file_handle = NamedTemporaryFile(delete=True)
-        tmp_schema_file_handle.write(json.dumps(schema, sort_keys=True).encode('utf-8'))
+        schema_json = json.dumps(schema, sort_keys=True) if isinstance(schema, list) else schema
+        tmp_schema_file_handle.write(schema_json.encode('utf-8'))
 
 Review comment:
   It looks overly complicated. What do you think about the following code?
   ```python
           if self.shcema:
               schema_json = self.schema
               self.log.info("Using user schema")
           else:
               self.log.info("Starting generating schema")
               schema = self.schema or [self.field_to_bigquery(field) for field in cursor.description]            
               tmp_schema_file_handle = NamedTemporaryFile(delete=True)
               schema_json = json.dumps(schema, sort_keys=True)
   
           self.log.info('Using schema for %s', self.schema_filename)
           self.log.debug("Current schema: %s", schema)
   
           tmp_schema_file_handle.write(schema_json.encode('utf-8'))
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on issue #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#issuecomment-608169012
 
 
   # [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=h1) Report
   > Merging [#8049](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=desc) into [master](https://codecov.io/gh/apache/airflow/commit/340e947d5db0ae1a5d5ca2d464fb78779acadc69&el=desc) will **decrease** coverage by `0.23%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/airflow/pull/8049/graphs/tree.svg?width=650&height=150&src=pr&token=WdLKlKHOAU)](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #8049      +/-   ##
   ==========================================
   - Coverage   87.25%   87.01%   -0.24%     
   ==========================================
     Files         935      937       +2     
     Lines       45384    45555     +171     
   ==========================================
   + Hits        39601    39641      +40     
   - Misses       5783     5914     +131     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [...low/providers/google/cloud/operators/sql\_to\_gcs.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Nsb3VkL29wZXJhdG9ycy9zcWxfdG9fZ2NzLnB5) | `93.91% <100.00%> (+1.25%)` | :arrow_up: |
   | [airflow/kubernetes/volume\_mount.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZV9tb3VudC5weQ==) | `44.44% <0.00%> (-55.56%)` | :arrow_down: |
   | [airflow/kubernetes/volume.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3ZvbHVtZS5weQ==) | `52.94% <0.00%> (-47.06%)` | :arrow_down: |
   | [airflow/kubernetes/pod\_launcher.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3BvZF9sYXVuY2hlci5weQ==) | `47.18% <0.00%> (-45.08%)` | :arrow_down: |
   | [...viders/cncf/kubernetes/operators/kubernetes\_pod.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvY25jZi9rdWJlcm5ldGVzL29wZXJhdG9ycy9rdWJlcm5ldGVzX3BvZC5weQ==) | `69.69% <0.00%> (-25.26%)` | :arrow_down: |
   | [airflow/kubernetes/refresh\_config.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9rdWJlcm5ldGVzL3JlZnJlc2hfY29uZmlnLnB5) | `50.98% <0.00%> (-23.53%)` | :arrow_down: |
   | [airflow/providers/google/ads/hooks/ads.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9wcm92aWRlcnMvZ29vZ2xlL2Fkcy9ob29rcy9hZHMucHk=) | `71.79% <0.00%> (-6.78%)` | :arrow_down: |
   | [airflow/models/renderedtifields.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy9tb2RlbHMvcmVuZGVyZWR0aWZpZWxkcy5weQ==) | `92.85% <0.00%> (-2.98%)` | :arrow_down: |
   | [airflow/www/app.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvYXBwLnB5) | `92.40% <0.00%> (-1.89%)` | :arrow_down: |
   | [airflow/www/utils.py](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree#diff-YWlyZmxvdy93d3cvdXRpbHMucHk=) | `80.85% <0.00%> (-1.02%)` | :arrow_down: |
   | ... and [15 more](https://codecov.io/gh/apache/airflow/pull/8049/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=footer). Last update [340e947...ec32f57](https://codecov.io/gh/apache/airflow/pull/8049?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [airflow] whynick1 commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload

Posted by GitBox <gi...@apache.org>.
whynick1 commented on a change in pull request #8049: [AIRFLOW-7117] Honor self.schema in sql_to_gcs as schema to upload
URL: https://github.com/apache/airflow/pull/8049#discussion_r402507732
 
 

 ##########
 File path: airflow/providers/google/cloud/operators/sql_to_gcs.py
 ##########
 @@ -251,18 +252,20 @@ def _get_col_type_dict(self):
     def _write_local_schema_file(self, cursor):
         """
         Takes a cursor, and writes the BigQuery schema for the results to a
-        local file system.
+        local file system. Schema for database will be read from cursor if
+        not specified.
 
         :return: A dictionary where key is a filename to be used as an object
             name in GCS, and values are file handles to local files that
             contains the BigQuery schema fields in .json format.
         """
-        schema = [self.field_to_bigquery(field) for field in cursor.description]
+        schema = self.schema or [self.field_to_bigquery(field) for field in cursor.description]
 
         self.log.info('Using schema for %s', self.schema_filename)
         self.log.debug("Current schema: %s", schema)
         tmp_schema_file_handle = NamedTemporaryFile(delete=True)
-        tmp_schema_file_handle.write(json.dumps(schema, sort_keys=True).encode('utf-8'))
+        schema_json = json.dumps(schema, sort_keys=True) if isinstance(schema, list) else schema
+        tmp_schema_file_handle.write(schema_json.encode('utf-8'))
 
 Review comment:
   I like the structure. Just to be clear, `self.schema` could be either `str` or `list`, and we don't want to call `json.dumps()` on a string. How about this edited version? 
   ```
           if self.shcema:
               self.log.info("Using user schema")
               schema = self.schema
           else:
               self.log.info("Starts generating schema")
               schema = [self.field_to_bigquery(field) for field in cursor.description]            
           
           if isinstance(schema, list):
               schema = json.dumps(schema, sort_keys=True)
   
           self.log.info('Using schema for %s', self.schema_filename)
           self.log.debug("Current schema: %s", schema)
   
           tmp_schema_file_handle = NamedTemporaryFile(delete=True)
           tmp_schema_file_handle.write(schema.encode('utf-8'))
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services