Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/06/15 07:39:14 UTC

[GitHub] [airflow] yene opened a new issue #9299: Unable to store xcom because of MySQL BLOB type limitation

yene opened a new issue #9299:
URL: https://github.com/apache/airflow/issues/9299


   
   **Apache Airflow version**:
   Airflow 1.10.10, 2020-04-09
   
   **Environment**:
   - MySQL: 5.7.30
   
   **What happened**:
   Storing a message larger than 65,535 bytes resulted in a MySQL error.
   ```
   (_mysql_exceptions.DataError) (1406, "Data too long for column 'value' at row 1")
   ```
   
   **What you expected to happen**:
   
   No error, since this amount of data is still within the range of acceptable payload sizes.
   
   **How to reproduce it**:
   
   
   **Anything else we need to know**:
   SQLAlchemy's `sa.PickleType()` uses BLOB on MySQL, which limits xcom.value to 65,535 bytes.
   
   `sa.PickleType()` uses BYTEA in Postgres, which has a much higher maximum size.
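
To see whether a given value would run into this limit, its pickled size can be measured before it is pushed. This is a minimal sketch, not Airflow code: `MYSQL_BLOB_MAX_BYTES` and `fits_in_mysql_blob` are hypothetical names, and it assumes the value is stored as a plain pickle. The exact size depends on the pickle protocol in use.

```python
import pickle

# MySQL's BLOB type holds at most 2**16 - 1 bytes.
MYSQL_BLOB_MAX_BYTES = 65_535


def fits_in_mysql_blob(value) -> bool:
    """Return True if the pickled value fits into a MySQL BLOB column."""
    return len(pickle.dumps(value)) <= MYSQL_BLOB_MAX_BYTES


small = "x" * 1_000
large = "x" * 70_000

print(fits_in_mysql_blob(small))  # small payload fits
print(fits_in_mysql_blob(large))  # pickled size exceeds the BLOB limit
```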
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] vietvudanh edited a comment on issue #9299: Unable to store xcom because of MySQL BLOB type limitation

Posted by GitBox <gi...@apache.org>.
vietvudanh edited a comment on issue #9299:
URL: https://github.com/apache/airflow/issues/9299#issuecomment-969971272


   I do not use XCom for data exchange but still get this error, since the output is stored in `return_value` anyway.
   This is kind of annoying. Is there a config option that makes Airflow truncate the output to the maximum size before inserting it into the database?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] vikramkoka edited a comment on issue #9299: Unable to store xcom because of MySQL BLOB type limitation

Posted by GitBox <gi...@apache.org>.
vikramkoka edited a comment on issue #9299:
URL: https://github.com/apache/airflow/issues/9299#issuecomment-752188981


   Just as a note, the maximum size of an XCom value is defined as 49344 bytes.
   In the code, this is defined as `MAX_XCOM_SIZE = 49344`.
   
   However, this limit does not seem to be applied consistently. In practice, the effective limit seems to depend on the database column size.
   One way to bypass this limitation is to use a custom XCom backend for persistence.
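
The idea behind a custom XCom backend can be sketched without Airflow installed. The class below is a hypothetical stand-in, not the Airflow API: it keeps small values inline and writes large ones to a file, storing only a reference. In a real deployment such a class would, as far as I know, subclass `airflow.models.xcom.BaseXCom` and be selected through the `xcom_backend` option; the real method signatures differ between Airflow versions, so treat the names and signatures here as assumptions.

```python
import json
import os
import tempfile
import uuid

MAX_XCOM_SIZE = 49344  # bytes, the limit quoted in the comment above
PREFIX = "file://"     # hypothetical marker for values stored outside the DB


class FileOffloadXCom:
    """Stand-in for a custom XCom backend (hypothetical, not Airflow code)."""

    # Directory that plays the role of external storage (e.g. object store).
    storage_dir = tempfile.mkdtemp(prefix="xcom_")

    @staticmethod
    def serialize_value(value) -> bytes:
        data = json.dumps(value).encode("utf-8")
        if len(data) <= MAX_XCOM_SIZE:
            return data  # small enough to live in the database column
        # Too large: write the payload to a file and store only its path.
        path = os.path.join(FileOffloadXCom.storage_dir, f"{uuid.uuid4().hex}.json")
        with open(path, "wb") as fh:
            fh.write(data)
        return json.dumps(PREFIX + path).encode("utf-8")

    @staticmethod
    def deserialize_value(stored: bytes):
        value = json.loads(stored.decode("utf-8"))
        # Caveat of this sketch: a genuine string value that starts with
        # PREFIX would be mistaken for a reference.
        if isinstance(value, str) and value.startswith(PREFIX):
            with open(value[len(PREFIX):], "rb") as fh:
                return json.loads(fh.read().decode("utf-8"))
        return value


big = list(range(20_000))
stored = FileOffloadXCom.serialize_value(big)
print(len(stored) <= MAX_XCOM_SIZE)
print(FileOffloadXCom.deserialize_value(stored) == big)
```

With this design the database row only ever holds a short reference for large payloads, so the column type no longer bounds the payload size.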








[GitHub] [airflow] eladkal commented on issue #9299: Unable to store xcom because of MySQL BLOB type limitation

Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #9299:
URL: https://github.com/apache/airflow/issues/9299#issuecomment-922647064


   I'm closing this as won't fix.
   You can set alternative xcom backend if needed.





[GitHub] [airflow] potiuk edited a comment on issue #9299: Unable to store xcom because of MySQL BLOB type limitation

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on issue #9299:
URL: https://github.com/apache/airflow/issues/9299#issuecomment-970284297


   ```
   result = self.subprocess_hook.run_command(
   ```
   
   SubProcessHook (run_command):
   
   ```
               self.log.info('Output:')
               line = ''
               for raw_line in iter(self.sub_process.stdout.readline, b''):
                   line = raw_line.decode(output_encoding).rstrip()
                   self.log.info("%s", line)
   
               self.sub_process.wait()
   
               self.log.info('Command exited with return code %s', self.sub_process.returncode)
   
           return SubprocessResult(exit_code=self.sub_process.returncode, output=line)
   ```
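
The effect of the loop quoted above can be reproduced outside Airflow. This is a minimal standalone sketch, not the hook itself: `last_output_line` is a hypothetical helper that mirrors the read loop, and the child command is an arbitrary example. Because `line` is overwritten on every iteration, only the last line of output is returned.

```python
import subprocess
import sys


def last_output_line(cmd, output_encoding="utf-8"):
    """Run cmd and return the last line it wrote to stdout/stderr."""
    line = ""
    with subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
    ) as proc:
        for raw_line in iter(proc.stdout.readline, b""):
            # Each iteration replaces the previous value of `line`.
            line = raw_line.decode(output_encoding).rstrip()
        proc.wait()
    return line


cmd = [sys.executable, "-c", "print('first'); print('second'); print('last')"]
print(last_output_line(cmd))
```

A script whose final output line is very long therefore produces a correspondingly large `return_value`, regardless of how short the earlier lines were.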
   





[GitHub] [airflow] superina5 commented on issue #9299: Unable to store xcom because of MySQL BLOB type limitation

Posted by GitBox <gi...@apache.org>.
superina5 commented on issue #9299:
URL: https://github.com/apache/airflow/issues/9299#issuecomment-707804015


   Experienced the same issue, upvote +1





[GitHub] [airflow] vietvudanh commented on issue #9299: Unable to store xcom because of MySQL BLOB type limitation

Posted by GitBox <gi...@apache.org>.
vietvudanh commented on issue #9299:
URL: https://github.com/apache/airflow/issues/9299#issuecomment-969971272


   I do not use XCom for data exchange but still get this error. It is because the value stored under the key `return_value` from my BashOperator's output is too big: over 65,535 bytes.
   This is kind of annoying. Is there a config option that makes Airflow truncate the output to the maximum size before inserting it into the database?
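
Independently of any Airflow setting, a task can truncate its own output before returning it. This is a minimal sketch, assuming the value is UTF-8 text; `truncate_utf8` is a hypothetical helper, not an Airflow function. It cuts at a byte limit and drops any multi-byte character that would be split by the cut.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Truncate text so that its UTF-8 encoding is at most max_bytes long."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" discards a trailing, incomplete multi-byte sequence.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


long_line = "é" * 40_000  # 80,000 bytes in UTF-8
short = truncate_utf8(long_line, 49_000)
print(len(short.encode("utf-8")))
```

Serialization adds some overhead on top of the raw text, so the byte limit should leave headroom below the column size.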





[GitHub] [airflow] potiuk commented on issue #9299: Unable to store xcom because of MySQL BLOB type limitation

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #9299:
URL: https://github.com/apache/airflow/issues/9299#issuecomment-752240505


   I agree with @vikramkoka: XCom is not supposed to store large amounts of data, and MAX_XCOM_SIZE is the maximum you can expect. Rather than increasing the limits, I'd turn this into an issue to verify that MAX_XCOM_SIZE is applied consistently everywhere. The limitation is not due to the field size; it exists because of how frequently, and in what way, Airflow accesses the XCom table. Allowing much bigger XCom fields might open up scenarios where different components start throttling each other and the whole system becomes unusable. This is a dangerous path to go down.
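
A consistent check of this kind could look like the guard below. It is a sketch under assumptions, not the Airflow implementation: `XComTooLargeError` and `serialize_checked` are hypothetical names, and JSON is used as the serialization format purely for illustration.

```python
import json

MAX_XCOM_SIZE = 49344  # bytes, the value quoted in this thread


class XComTooLargeError(ValueError):
    """Hypothetical error raised when a serialized value exceeds the limit."""


def serialize_checked(value) -> bytes:
    """Serialize value and refuse it if it is larger than MAX_XCOM_SIZE."""
    data = json.dumps(value).encode("utf-8")
    if len(data) > MAX_XCOM_SIZE:
        raise XComTooLargeError(
            f"XCom value is {len(data)} bytes; limit is {MAX_XCOM_SIZE} bytes"
        )
    return data


print(serialize_checked({"rows": 3}))
```

Failing early with an explicit message makes the limit independent of the database backend, instead of surfacing as a column-size error on MySQL only.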











[GitHub] [airflow] vietvudanh commented on issue #9299: Unable to store xcom because of MySQL BLOB type limitation

Posted by GitBox <gi...@apache.org>.
vietvudanh commented on issue #9299:
URL: https://github.com/apache/airflow/issues/9299#issuecomment-970103822


   > It's a problem with your Bash script. By default, the Bash operator stores the LAST LINE of your output to XCom: see the documentation http://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/operators/bash/index.html#module-airflow.operators.bash
   > 
   > If the last line your Bash script prints is 65k long then yes, this is the problem you will run into.
   
   I have read the source code for [1.10](http://airflow.apache.org/docs/apache-airflow/1.10.12/_modules/airflow/operators/bash_operator.html#BashOperator.execute) and [latest](https://airflow.apache.org/docs/apache-airflow/stable/_modules/airflow/operators/bash.html#BashOperator.execute).
   
   It looks like the `return last line` behaviour is only correct for version `1`, not `2` (I am using `v2.1.1`).
   





[GitHub] [airflow] vietvudanh commented on issue #9299: Unable to store xcom because of MySQL BLOB type limitation

Posted by GitBox <gi...@apache.org>.
vietvudanh commented on issue #9299:
URL: https://github.com/apache/airflow/issues/9299#issuecomment-971360272


   > SubProcessHook
   
   Thank you. After looking into SubProcessHook I realized the problem: it was indeed my script.





[GitHub] [airflow] yene commented on issue #9299: Unable to store xcom because of MySQL BLOB type limitation

Posted by GitBox <gi...@apache.org>.
yene commented on issue #9299:
URL: https://github.com/apache/airflow/issues/9299#issuecomment-753805099


   In my opinion, the MySQL column type should not limit users. Maybe VARBINARY would fit?
   





[GitHub] [airflow] boring-cyborg[bot] commented on issue #9299: Unable to store xcom because of MySQL BLOB type limitation

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #9299:
URL: https://github.com/apache/airflow/issues/9299#issuecomment-643959128


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] eladkal closed issue #9299: Unable to store xcom because of MySQL BLOB type limitation

Posted by GitBox <gi...@apache.org>.
eladkal closed issue #9299:
URL: https://github.com/apache/airflow/issues/9299


   





[GitHub] [airflow] potiuk commented on issue #9299: Unable to store xcom because of MySQL BLOB type limitation

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #9299:
URL: https://github.com/apache/airflow/issues/9299#issuecomment-969996013


   It's a problem with your Bash script. By default, the Bash operator stores the LAST LINE of your output to XCom: see the documentation http://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/operators/bash/index.html#module-airflow.operators.bash
   
   If the last line your Bash script prints is 65k long then yes, this is the problem you will run into.




