You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/12/03 08:42:07 UTC

[GitHub] [airflow] rmanvar-indeed opened a new issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

rmanvar-indeed opened a new issue #12776:
URL: https://github.com/apache/airflow/issues/12776


   <!--
   
   Welcome to Apache Airflow!  For a smooth issue process, try to answer the following questions.
   Don't worry if they're not all applicable; just try to include what you can :-)
   
   If you need to include code snippets or logs, please put them in fenced code
   blocks.  If they're super-long, please use the details tag like
   <details><summary>super-long log</summary> lots of stuff </details>
   
   Please delete these comment blocks before submitting the issue.
   
   -->
   
   **Description**
   
   Update source_code field of dag_code table to MEDIUMTEXT
   <!-- A short description of your feature -->
   
   **Use case / motivation**
   
   Lot of dags exceed the limit of 65K characters limit giving error `"Data too long for column 'source_code' at row 1"` when enabling webserver to fetch dag_code from db. 
   <!-- What do you want to happen?
   
   Rather than telling us how you might implement this solution, try to take a
   step back and describe what you are trying to achieve.
   
   -->
   
   **Related Issues**
   
   <!-- Is there currently another issue associated with this? -->
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-737989380


   I think this shoudl be solved for 2.0 (marked is as such) @rmanvar-indeed  - do you plan to contribute fix for that back to Airlfow? 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] boring-cyborg[bot] commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-737753953


   Thanks for opening your first issue here! Be sure to follow the issue template!
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] rmanvar-indeed commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
rmanvar-indeed commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-737931160


   Do the migration files at https://github.com/apache/airflow/tree/master/airflow/migrations/versions include non-sql databases too? How can we apply only for sql if so


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-737937789


   Again -- this is only a problem for MySQL -- PostgreSQL's TEXT type can store 4gb 
   
   https://www.postgresql.org/docs/13/datatype-character.html
   
   So your migration only needs to change it for MySQL.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] rmanvar-indeed commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
rmanvar-indeed commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-738601784


   Is it possible to add this update for previous / upcoming versions of Airflow like 1.10.12 and 1.10.13 ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-738658932


   We are just about to release 1.10.14 (this morning) so not very likely - maybe 1.10.15 if we have one


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-738001608


   > Yup, planning to create a MR for the fix. Might take few days tho as I don't have the airflow dev setup.
   
   Takes literally few minutes with Breeze: https://github.com/apache/airflow/blob/master/BREEZE.rst 
   
   "It's a Breeze to contribute to Airflow!"


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] rmanvar-indeed commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
rmanvar-indeed commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-737928014


   Currently it's TEXT which has limit of 65K characters. MEDIUMTEXT will have 16777K characters limit. 
   
   yup was primarily aiming for sql, haven't use Airflow with other databases so need to explore them to be sure. 
   
   I tried updating column type from TEXT to MEDIUMTEXT of a running db and airflow setup and sql updated it without re-creating table and no errors in Airflow logs as well. ( Tho not many dags were running in the setup, will do a re-test to be sure )
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] rmanvar-indeed commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
rmanvar-indeed commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-738003669


   Awesome. Thanks :smile:  will take a look.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk closed issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
potiuk closed issue #12776:
URL: https://github.com/apache/airflow/issues/12776


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
ashb commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-737883554


   What is the current column type?
   
   This only applies to mysql, not other DBs I think


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-739809481


   HEy @rmanvar-indeed - did you have a chance to work on it?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] rmanvar-indeed commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
rmanvar-indeed commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-737999473


   Yup, planning to create a MR for the fix. Might take few days tho as I don't have the airflow dev setup. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] rmanvar-indeed edited a comment on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
rmanvar-indeed edited a comment on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-737931160


   Do the migration files at https://github.com/apache/airflow/tree/master/airflow/migrations/versions include non-sql databases too? How can we apply only for sql if so
   EDIT: nvm looks like it can handle only sql changes too. could spot some code like `if conn.dialect.name not in ('sqlite'):`, so might be on similar lines


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-738660170


   But we are releasing 2.0rc next week and hopefully final 2.0 version  after that, and if you are considering running Airflow in PROD we HIGHLY recommend migrating to 2.0. And for that one - it will happen not sooner than Tuesday!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-738661059


   If you have any questions/etc. you can always ask me on airflow's slack (we have #breeze channel there)


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] rmanvar-indeed commented on issue #12776: Update source_code field of dag_code table to MEDIUMTEXT

Posted by GitBox <gi...@apache.org>.
rmanvar-indeed commented on issue #12776:
URL: https://github.com/apache/airflow/issues/12776#issuecomment-740323266


   Thank for the MR!
   I had written the alembic migrations, but couldn't figure a good way to add tests. 
   Easiest way seemed to add a big dag exceeding 65Kb limit in example dags but that didn't seem ideal. 
   Tried adding test case in test_dagcode on the lines of: 
   ```
       def test_storing_large_dag_file_in_db(self):
           """Test if large dag exceeding 65Kb size can be added to db"""
           with TemporaryDirectory(prefix='airflow_dagcode_dir') as tmp_dir:
               with NamedTemporaryFile(dir=tmp_dir, mode='w', suffix='.py') as f:
                   f.write(get_large_test_dag())
                   f.flush()
                   f.close()
   
                   import os
                   os.listdir(tmp_dir)
                   dag_bag = DagBag(tmp_dir)
   
                   for dag in dag_bag.dags.values():
                       DagCode(dag.fileloc, get_large_test_dag()).sync_to_db()
                       dag.sync_to_db()
   
   ```
   but this wasn't writing dag code to db as I had expected. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org