Posted to user@ignite.apache.org by Chris Software <so...@gmail.com> on 2019/09/11 13:51:42 UTC

Re: Ignite Map-Reduce Deadlocking, Running in SYS pool

Ilya,

When will the removal of IGFS be officially announced, and when will a
schedule for ending support be published?

Thank you,

Chris

On Tue, Aug 27, 2019 at 1:47 PM Chris Software <so...@gmail.com>
wrote:

> I see.  Thank you.
>
> On Tue, Aug 27, 2019 at 12:30 PM Ilya Kasnacheev <
> ilya.kasnacheev@gmail.com> wrote:
>
>> Hello!
>>
>> This looks like a mistake. However, we are going to drop IGFS, so a fix
>> is unlikely.
>>
>> The practical recommendation is to increase the number of threads in the
>> system thread pool to a large value.
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> On Tue, Aug 27, 2019 at 00:34, Chris Software <so...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> I am working on a project and we have run into two related problems
>>> while running map-reduce on the Ignite File System (IGFS) cache.
>>>
>>> We were originally on Ignite 2.6 but upgraded to 2.7.5 in an
>>> unsuccessful bid to resolve the problem.
>>>
>>> We have a deadlock in our map-reduce process and have reproduced it at
>>> https://github.com/csteppp/ignite-deadlock-issue in
>>> https://github.com/csteppp/ignite-deadlock-issue/blob/master/src/test/java/testignite/MapReduceIgniteTest.java.
>>>
>>>
>>> If you run the test (mvn test), it deadlocks and hangs.  The test
>>> creates two IgfsTasks and sets the SYS thread pool size to 2 for
>>> demonstration purposes.  Each IgfsTask sleeps and then writes to a file.
>>> This causes a deadlock because:
>>> 1.  Each IgfsTask runs in the SYS pool.
>>> 2.  The IGFS write needs a separate thread from the same SYS pool.
>>> 3.  If no threads are free, the write never runs and the whole system
>>> hangs.
>>>
>>> First, shouldn't executeAsync execute the task in the PUBLIC pool?
>>> Using the SYS pool seems unnecessarily risky: we found that a deadlock
>>> *locks up an entire cluster of many Ignite nodes*.  How do I get it to
>>> use the PUBLIC pool?  Also, since it is using the SYS pool, it actually
>>> seems to execute the task on the client.  This is not obvious in this
>>> test, but in my real cluster of 30 nodes, the client seems to be doing
>>> this work, which is a problem.
>>>
>>> Second, is it bad form to open a file within a map-reduce?  Even using
>>> the public pool will not solve the inherent deadlock here--that one thread
>>> is depending on another thread in the same thread pool.  That's an inherent
>>> risk.  In our real process we open the file because we are performing file
>>> transformations in the IgfsTask, and then writing the results out to temp
>>> files in the cluster.  In the end, we collate all the temp files.  Is there
>>> a better approach, or a safe way to open a file and write to it from within
>>> a reduce?
>>>
>>> Thank you for your time!
>>>
>>> Chris
>>>
>>>
>>>
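
The deadlock described in the quoted message (tasks that hold every thread
in a pool while waiting for work queued on that same pool) can be reproduced
without Ignite. The following is only a minimal sketch using plain
java.util.concurrent; it is not Ignite or IGFS code, and the class and method
names (PoolStarvationDemo, run, sharedPool) are hypothetical. It assumes the
SYS pool behaves like a fixed-size executor, which is a simplification.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolStarvationDemo {
    /**
     * Runs `tasks` outer tasks on taskPool. Each one submits a nested "write"
     * to writePool and blocks on its result. Returns true if every outer task
     * finishes within one second, false if the run appears to be starved.
     */
    static boolean run(ExecutorService taskPool, ExecutorService writePool, int tasks) {
        CountDownLatch allStarted = new CountDownLatch(tasks);
        List<Future<String>> results = new ArrayList<>();
        for (int i = 0; i < tasks; i++) {
            final int id = i;
            results.add(taskPool.submit(() -> {
                // Stand-in for the sleep in the original test: wait until
                // every outer task is occupying a thread.
                allStarted.countDown();
                allStarted.await();
                // Nested work, analogous to the IGFS write in the report.
                Future<String> write = writePool.submit(() -> "written-" + id);
                return write.get(); // blocks until the nested work has run
            }));
        }
        try {
            for (Future<String> f : results) {
                f.get(1, TimeUnit.SECONDS);
            }
            return true;
        } catch (TimeoutException e) {
            return false; // nested work never got a thread
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            // Interrupt any blocked workers so the JVM can exit.
            taskPool.shutdownNow();
            writePool.shutdownNow();
        }
    }

    /** Outer tasks and nested writes share one fixed-size pool. */
    static boolean sharedPool(int threads, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        return run(pool, pool, tasks);
    }

    public static void main(String[] args) {
        // Two tasks on a shared pool of two threads: both threads block.
        System.out.println("shared pool of 2 completes: " + sharedPool(2, 2));
        // A larger shared pool leaves threads free for the nested writes.
        System.out.println("shared pool of 4 completes: " + sharedPool(4, 2));
        // A dedicated pool for nested writes also avoids the starvation.
        System.out.println("separate pools complete: "
            + run(Executors.newFixedThreadPool(2), Executors.newFixedThreadPool(1), 2));
    }
}
```

In this sketch the first case times out, while the other two complete.
Enlarging the pool corresponds to the workaround suggested in the reply
above, but it only raises the number of concurrent tasks needed to exhaust
the pool. Moving the nested work to a different pool removes the circular
wait in this toy model; whether IGFS offers an equivalent option is not
established in this thread.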

Re: Ignite Map-Reduce Deadlocking, Running in SYS pool

Posted by Denis Magda <dm...@apache.org>.

Chris,

Thanks for the reminder. We'll update the docs next week. Sorry for
wasting your time.

Btw, what was the task you tried to solve with IGFS? We might find an
alternate API or solution with Ignite or recommend another project.

-
Denis


On Wed, Sep 11, 2019 at 10:21 AM Chris Software <so...@gmail.com>
wrote:

> I respectfully suggest you update your public documentation ASAP, as
> people (like my team) are developing new software now, using IGFS,
> expecting that it will continue to be supported.  Please don't wait until
> you release 2.8.
>
> On Wed, Sep 11, 2019 at 11:07 AM Ilya Kasnacheev <
> ilya.kasnacheev@gmail.com> wrote:
>
>> Hello!
>>
>> https://issues.apache.org/jira/browse/IGNITE-11942
>>
>> The vote has already passed; IGFS should not be in 2.8.
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>

Re: Ignite Map-Reduce Deadlocking, Running in SYS pool

Posted by Chris Software <so...@gmail.com>.

I respectfully suggest you update your public documentation ASAP, as people
(like my team) are developing new software now, using IGFS, expecting that
it will continue to be supported.  Please don't wait until you release
2.8.

Re: Ignite Map-Reduce Deadlocking, Running in SYS pool

Posted by Ilya Kasnacheev <il...@gmail.com>.

Hello!

https://issues.apache.org/jira/browse/IGNITE-11942

The vote has already passed; IGFS should not be in 2.8.

Regards,
-- 
Ilya Kasnacheev

