Posted to dev@airflow.apache.org by Jarek Potiuk <Ja...@polidea.com> on 2020/11/11 18:18:46 UTC

Re: Proposal: Change default to LocalExecutor and (eventually) remove SequentialExecutor

How about this? Are we doing it for 2.0 :)? Changing the default / failing
hard when SequentialExecutor is used with Postgres/MySQL? WDYT?

J.


On Sat, Oct 24, 2020 at 10:01 AM Jarek Potiuk <Ja...@polidea.com>
wrote:

> One more proposal on that. Why don't we make Airflow fail hard when
> Sequential Executor is used with Postgres/MySQL?
>
> I think we might avoid some confusion.
>
> We had a long discussion with Kaxil in which (after 2 years of working
> with Airflow) I was (wrongly) almost 100% sure that Postgres/MySQL
> already used LocalExecutor by default (because 1.5 years ago we configured
> it like that for our system tests), and I had not realized that this is not
> the default.
>
> I do not think there is any benefit to using SequentialExecutor with
> Postgres/MySQL now, so we can simply fail hard if it is set for those, with
> a "Please change to LocalExecutor" message.
>
> This might be a 2.0-only change.
>
> J.
>
>
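A minimal sketch of the fail-hard check proposed above. Everything here is an
assumption for illustration: `validate_executor` and `SERVER_BACKENDS` are
hypothetical names, not existing Airflow code, and the backend is guessed from
the scheme of the SQLAlchemy connection string.

```python
from urllib.parse import urlsplit

# Hypothetical helper, not part of Airflow: it only illustrates the proposal.
SERVER_BACKENDS = {"postgresql", "postgres", "mysql"}


def validate_executor(executor: str, sql_alchemy_conn: str) -> None:
    """Raise if SequentialExecutor is combined with a Postgres/MySQL backend."""
    # The scheme may carry a driver suffix, e.g. "postgresql+psycopg2".
    backend = urlsplit(sql_alchemy_conn).scheme.split("+")[0].lower()
    if executor == "SequentialExecutor" and backend in SERVER_BACKENDS:
        raise ValueError(
            f"SequentialExecutor is not supported with {backend}. "
            "Please change to LocalExecutor."
        )
```

With this sketch, SequentialExecutor on SQLite and LocalExecutor on
Postgres/MySQL both pass, and only the combination discussed in the thread
is rejected.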
>
>
> On Mon, Oct 12, 2020 at 1:35 AM Daniel Imberman <da...@gmail.com>
> wrote:
>
>> +1 to the general notes of the convo, not much to add.
>>
>>
>> On Wed, Oct 7, 2020 at 6:05 AM, Kaxil Naik <ka...@gmail.com> wrote:
>>
>> As long as we make sure LocalExecutor works fine with SQLite, I am fine
>> with that. But if we find any issues with making SQLite work with
>> LocalExecutor, we should keep the SequentialExecutor, since it lets new
>> users easily start Airflow without having to worry about DB setup.
>>
>> Regards,
>> Kaxil
>>
>> On Wed, Oct 7, 2020, 10:36 Jarek Potiuk <Ja...@polidea.com> wrote:
>>
>>> Right - if we make SQLite work with LocalExecutor, there is no reason
>>> to keep Sequential Executor :).
>>>
>>> J.
>>>
>>>
>>>
>>> On Wed, Oct 7, 2020 at 11:26 AM Ash Berlin-Taylor <as...@apache.org>
>>> wrote:
>>>
>>>> Oh good point.
>>>>
>>>> I'll take a look -- I think our "don't use SQLite from more than one
>>>> process" rule is over-zealous, as SQLite has built-in locking and can be
>>>> used by multiple processes at the same time, with a few caveats.
>>>>
>>>> http://www.sqlite.org/draft/faq.html#q5
>>>>
>>>> Multiple processes can have the same database open at the same time.
>>>> Multiple processes can be doing a SELECT at the same time. But only one
>>>> process can be making changes to the database at any moment in time,
>>>> however.
>>>>
>>>> SQLite uses reader/writer locks to control access to the database.
>>>> (Under Win95/98/ME which lacks support for reader/writer locks, a
>>>> probabilistic simulation is used instead.) But use caution: this locking
>>>> mechanism might not work correctly if the database file is kept on an NFS
>>>> filesystem. This is because fcntl() file locking is broken on many NFS
>>>> implementations. You should avoid putting SQLite database files on NFS if
>>>> multiple processes might try to access the file at the same time. On
>>>> Windows, Microsoft's documentation says that locking may not work under FAT
>>>> filesystems if you are not running the Share.exe daemon. People who have a
>>>> lot of experience with Windows tell me that file locking of network files
>>>> is very buggy and is not dependable. If what they say is true, sharing an
>>>> SQLite database between two or more Windows machines might cause unexpected
>>>> problems.
>>>>
>>>> We are aware of no other *embedded* SQL database engine that supports
>>>> as much concurrency as SQLite. SQLite allows multiple processes to have the
>>>> database file open at once, and for multiple processes to read the database
>>>> at once. When any process wants to write, it must lock the entire database
>>>> file for the duration of its update. But that normally only takes a few
>>>> milliseconds. Other processes just wait on the writer to finish then
>>>> continue about their business. Other embedded SQL database engines
>>>> typically only allow a single process to connect to the database at once.
>>>>
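The single-writer behaviour described in the FAQ excerpt above can be observed
with Python's standard `sqlite3` module. This is only a sketch under stated
assumptions: two independent connections in one process stand in for two
processes, and `timeout=0` is chosen so that a blocked writer fails at once
instead of waiting (by default Python waits up to 5 seconds for the lock).
`demo_single_writer` is a made-up name.

```python
import os
import sqlite3
import tempfile


def demo_single_writer() -> list:
    """Show SQLite's single-writer locking using two independent connections."""
    events = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # isolation_level=None: we issue BEGIN/COMMIT ourselves.
        # timeout=0: fail immediately instead of waiting for the lock.
        writer = sqlite3.connect(path, timeout=0, isolation_level=None)
        other = sqlite3.connect(path, timeout=0, isolation_level=None)
        try:
            writer.execute("CREATE TABLE t (v INTEGER)")
            writer.execute("BEGIN IMMEDIATE")  # take the write lock
            writer.execute("INSERT INTO t VALUES (1)")
            # Reads from the other connection still work during the write.
            rows = other.execute("SELECT count(*) FROM t").fetchall()
            events.append(("read", rows[0][0]))
            try:
                other.execute("INSERT INTO t VALUES (2)")
            except sqlite3.OperationalError as exc:
                # A second writer is refused while the lock is held.
                events.append(("blocked", str(exc)))
            writer.execute("COMMIT")  # release the write lock
            other.execute("INSERT INTO t VALUES (2)")
            rows = other.execute("SELECT count(*) FROM t").fetchall()
            events.append(("rows", rows[0][0]))
        finally:
            writer.close()
            other.close()
    return events
```

With a non-zero timeout the second writer would simply wait for the first to
finish, which is the "other processes just wait" case from the FAQ.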
>>>>
>>>> So it's doable, we've just been overly cautious in the past.
>>>>
>>>> You're right that the change isn't just removing the executor though! I
>>>> think it's worth it overall, though.
>>>>
>>>> -ash
>>>>
>>>> On Oct 7 2020, at 10:07 am, Jarek Potiuk <Ja...@polidea.com>
>>>> wrote:
>>>>
>>>> How about sqlite? I believe it only runs with Sequential Executor?
>>>>
>>>> On Wed, Oct 7, 2020 at 10:59 AM Ash Berlin-Taylor <as...@apache.org>
>>>> wrote:
>>>>
>>>> Hi everyone,
>>>>
>>>> I've just had a thought: the sequential executor gives an all-around
>>>> pretty bad experience (it blocks the scheduler, and you'll see "scheduler
>>>> stopped heartbeating" messages if your task run takes a while).
>>>>
>>>> So I'd like to propose we change the default executor to LocalExecutor
>>>> -- to do this we should probably change the default number of
>>>> slots/processes from 16 to the number of CPUs.
>>>>
>>>> Thoughts?
>>>>
>>>> None of this has to happen for 2.0 (I don't have time to do it), but
>>>> just wanted to suggest it.
>>>>
>>>> -ash
>>>>
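A small sketch of what "default to the number of CPUs" could look like. The
function name `default_parallelism` is hypothetical and not taken from Airflow;
the fallback of 16 is the current default mentioned above.

```python
import os


def default_parallelism(fallback: int = 16) -> int:
    """Return the CPU count, or `fallback` if it cannot be determined."""
    # os.cpu_count() may return None on platforms where the count is unknown.
    return os.cpu_count() or fallback
```

Keeping a fallback means the value stays usable even where the CPU count is
not reported.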
>>>>
>>>>
>>>> --
>>>>
>>>> Jarek Potiuk
>>>> Polidea <https://www.polidea.com/> | Principal Software Engineer
>>>>
>>>> M: +48 660 796 129 <+48660796129>
>>>> [image: Polidea] <https://www.polidea.com/>
>>>>
>>>>

-- 

Jarek Potiuk
Polidea <https://www.polidea.com/> | Principal Software Engineer

M: +48 660 796 129 <+48660796129>
[image: Polidea] <https://www.polidea.com/>

Re: Proposal: Change default to LocalExecutor and (eventually) remove SequentialExecutor

Posted by Deng Xiaodong <xd...@gmail.com>.
Sounds good to me to enforce LocalExecutor (and failing hard is OK with me)
when (and ONLY when) it’s confirmed that the user is using MySQL/Postgres.


XD
