Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/02/24 07:26:22 UTC

[GitHub] [airflow] DreamyWen opened a new issue #14413: support chinese character in xcom output

DreamyWen opened a new issue #14413:
URL: https://github.com/apache/airflow/issues/14413


   **Description**
   
   Apache Airflow version:
   1.10.14
   
   Kubernetes version (if you are using kubernetes) (use kubectl version):
   
   Environment:
   
   System Version: macOS 11.1 (20C69)
   Version: Darwin 20.2.0
   Python 3.7
   
   
   **Use case / motivation**
   In some scenarios, I want to use the BashOperator's output in my application for later use.
   The relevant DAG snippet looks like this:
   ```
   run_analysis = BashOperator(
       task_id='start_run_analysis_task',
       bash_command='cd /Users/saith/test_run && /Users/saith/anaconda3/envs/airflow/bin/python '
                    '/Users/saith/test_run/bash_chn.py {{ dag_run.conf["key1"] if dag_run else "" }} {{ dag_run.conf["key2"] if dag_run else "" }}',
       xcom_push=True,
       dag=dag,
   )
   ```
   The script /Users/saith/test_run/bash_chn.py outputs a string such as '中文测试123'.
   
   I debugged the DAG in my IDE and found that in xcom.py, json.dumps is called with its default options, which escape Chinese characters into Unicode escape sequences such as '\u6d4b\u8bd5\u4e2d\u65871'.
   ```
       def serialize_value(value):
           # TODO: "pickling" has been deprecated and JSON is preferred.
           # "pickling" will be removed in Airflow 2.0.
           if conf.getboolean('core', 'enable_xcom_pickling'):
               return pickle.dumps(value)
   
           try:
               return json.dumps(value).encode('UTF-8')
           except ValueError:
               log.error("Could not serialize the XCOM value into JSON. "
                         "If you are using pickles instead of JSON "
                         "for XCOM, then you need to enable pickle "
                         "support for XCOM in your airflow config.")
               raise
   ```
   
   The solution is simple. I changed that call to:
   ```
   json.dumps(value, ensure_ascii=False).encode('UTF-8')
   ```
   It works fine. So it would be better to add an option to XCom serialization, such as ensure_ascii=False.
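
To illustrate the difference, here is a minimal standalone sketch using only the standard `json` module (it does not import Airflow, and the variable names are made up for this example). It compares the default behaviour with `ensure_ascii=False` and checks that both forms decode back to the same string:

```python
import json

value = "中文测试123"

# Default behaviour: non-ASCII characters are escaped as \uXXXX sequences,
# so the stored bytes are pure ASCII and hard to read.
escaped = json.dumps(value).encode("UTF-8")

# With ensure_ascii=False the characters are kept as-is and then
# encoded as UTF-8 bytes.
readable = json.dumps(value, ensure_ascii=False).encode("UTF-8")

print(escaped)
print(readable.decode("UTF-8"))

# Both representations decode to the same Python string, so the change
# only affects how the value looks when stored, not what is read back.
assert json.loads(escaped.decode("UTF-8")) == value
assert json.loads(readable.decode("UTF-8")) == value
```

Since both forms round-trip identically, the change should only matter for how the stored value is displayed.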
   
   **Are you willing to submit a PR?**
   
   **Related Issues**
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] uranusjr commented on issue #14413: support chinese character in xcom table

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #14413:
URL: https://github.com/apache/airflow/issues/14413#issuecomment-859576073


   This should be fine since [Airflow already mandates the database must use UTF-8](https://airflow.apache.org/docs/apache-airflow/stable/howto/set-up-database.html). Would you be interested in submitting a pull request? Aside from the suggested fix, we’ll also need a test to ensure we don’t escape non-ASCII in `XCom.serialize_value()`.
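
A rough sketch of what such a test might look like follows. It is not Airflow's actual test code: the `serialize_value` function below is a hypothetical stand-in that only mirrors the proposed `json.dumps(value, ensure_ascii=False).encode('UTF-8')` call, and the test name is invented for illustration. A real test would presumably call `XCom.serialize_value()` instead.

```python
import json


def serialize_value(value):
    """Hypothetical stand-in for XCom.serialize_value() with the proposed fix."""
    return json.dumps(value, ensure_ascii=False).encode("UTF-8")


def test_serialize_value_keeps_non_ascii():
    serialized = serialize_value("中文测试123")

    # No \uXXXX escape sequences should appear in the stored bytes.
    assert b"\\u" not in serialized

    # The stored bytes are the UTF-8 encoding of the quoted JSON string.
    assert serialized == '"中文测试123"'.encode("UTF-8")

    # Deserializing gives back the original value.
    assert json.loads(serialized.decode("UTF-8")) == "中文测试123"
```

With pytest, a function named like this would be collected automatically; it can also be called directly.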

