Posted to users@airavata.apache.org by Jarett DeAngelis <ja...@bioteam.net> on 2016/12/28 22:13:53 UTC

phantom job

So, I keep seeing this pop up in my log, and whenever it does, the middleware grinds to a halt and will not process any more jobs:

2016-12-28 14:56:03,058 [pool-3-thread-49] ERROR org.apache.airavata.api.server.handler.AiravataServerHandler  - ls_snakemake_003a47d0-0d2c-474a-b734-9f4bf2903a03
org.apache.airavata.registry.cpi.AppCatalogException: javax.persistence.NoResultException: Query "SELECT p FROM ApplicationInterface p WHERE p.interfaceID =:param0" selected no result, but expected unique result.

This “ls_snakemake” job is months old and is associated with an interface, module and deployment which no longer exist, which is why this error is getting thrown. What I don’t understand is what is “reminding” Airavata of this job and causing it to get hung up. Last week, I found references to it in the experiment catalog database in the EXPERIMENT table, and deleting these allowed Airavata to continue. Now it is happening again and the references are gone. How do I eliminate all references to this job from all of Airavata such that it stops getting hung up on this phantom job?

Thanks,
Jarett

Re: phantom job

Posted by Jarett DeAngelis <ja...@bioteam.net>.
Yeah, I poked through everything that had any references to “param” (which was the prefix I used for parameters in my Snakemake exploration) and deleted all of them in those tables. As of today the gateway is still functioning normally, so I think I’m okay there.

Thanks,
Jarett

> On Dec 29, 2016, at 10:05 AM, Christie, Marcus Aaron <ma...@iu.edu> wrote:
> 
> 
> Jarett,
> 
> I’m a little out of the loop on the issue you’ve run into, but I’ll chime in in the hope that I can help.
> 
> EXPERIMENT_SUMMARY is a DB View that joins together the EXPERIMENT and USER_CONFIGURATION_DATA tables with the LATEST_EXPERIMENT_STATUS view (which in turn is a self-joined view of the EXPERIMENT_STATUS table).
> 
> So you might also want to check the EXPERIMENT, USER_CONFIGURATION_DATA and EXPERIMENT_STATUS tables.
> 
> 
> Marcus
> 
> 


Re: phantom job

Posted by "Christie, Marcus Aaron" <ma...@iu.edu>.
Jarett,

I’m a little out of the loop on the issue you’ve run into, but I’ll chime in in the hope that I can help.

EXPERIMENT_SUMMARY is a DB View that joins together the EXPERIMENT and USER_CONFIGURATION_DATA tables with the LATEST_EXPERIMENT_STATUS view (which in turn is a self-joined view of the EXPERIMENT_STATUS table).

So you might also want to check the EXPERIMENT, USER_CONFIGURATION_DATA and EXPERIMENT_STATUS tables.


Marcus
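
The view relationship described above can be sketched in a few lines. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for MariaDB, the table and view names come from this thread, and the columns are a simplified, assumed subset rather than the real Airavata schema. It shows that a DELETE against the joined view is rejected (SQLite reports its own error instead of MariaDB's ERROR 1288), while deleting the matching rows from the base tables makes them disappear from the view.

```python
import sqlite3

# In-memory SQLite stands in for MariaDB. Table and view names follow the
# thread; the column lists are a simplified, assumed subset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EXPERIMENT (EXPERIMENT_ID TEXT PRIMARY KEY, EXECUTION_ID TEXT);
CREATE TABLE USER_CONFIGURATION_DATA (EXPERIMENT_ID TEXT, RESOURCE_HOST_ID TEXT);
CREATE TABLE EXPERIMENT_STATUS (EXPERIMENT_ID TEXT, STATE TEXT, TIME_OF_STATE_CHANGE TEXT);

-- Self-join that keeps only the newest status row per experiment.
CREATE VIEW LATEST_EXPERIMENT_STATUS AS
  SELECT s1.* FROM EXPERIMENT_STATUS s1
  LEFT JOIN EXPERIMENT_STATUS s2
    ON s1.EXPERIMENT_ID = s2.EXPERIMENT_ID
   AND s1.TIME_OF_STATE_CHANGE < s2.TIME_OF_STATE_CHANGE
  WHERE s2.EXPERIMENT_ID IS NULL;

CREATE VIEW EXPERIMENT_SUMMARY AS
  SELECT e.EXPERIMENT_ID, e.EXECUTION_ID, s.STATE, u.RESOURCE_HOST_ID
  FROM EXPERIMENT e
  LEFT JOIN USER_CONFIGURATION_DATA u ON u.EXPERIMENT_ID = e.EXPERIMENT_ID
  LEFT JOIN LATEST_EXPERIMENT_STATUS s ON s.EXPERIMENT_ID = e.EXPERIMENT_ID;
""")

# Made-up demo rows (not the real records from the thread).
conn.execute("INSERT INTO EXPERIMENT VALUES ('exp1', 'ls_snakemake_demo')")
conn.execute("INSERT INTO EXPERIMENT VALUES ('exp2', 'other_app_demo')")
conn.execute("INSERT INTO USER_CONFIGURATION_DATA VALUES ('exp1', 'host_demo')")
conn.executemany("INSERT INTO EXPERIMENT_STATUS VALUES (?, ?, ?)", [
    ("exp1", "CREATED", "2016-10-31 10:57:09"),
    ("exp1", "LAUNCHED", "2016-10-31 10:57:10"),
    ("exp2", "COMPLETED", "2016-11-02 10:22:52"),
])

# Deleting through the view is rejected because a view holds no rows itself.
try:
    conn.execute("DELETE FROM EXPERIMENT_SUMMARY WHERE EXECUTION_ID LIKE '%snake%'")
    view_delete_failed = False
except sqlite3.OperationalError:
    view_delete_failed = True


def delete_experiments_like(conn, pattern):
    """Delete matching experiments from the base tables, child rows first."""
    ids = [row[0] for row in conn.execute(
        "SELECT EXPERIMENT_ID FROM EXPERIMENT WHERE EXECUTION_ID LIKE ?", (pattern,))]
    with conn:  # commits on success, rolls back if any statement fails
        for table in ("EXPERIMENT_STATUS", "USER_CONFIGURATION_DATA", "EXPERIMENT"):
            conn.executemany(
                f"DELETE FROM {table} WHERE EXPERIMENT_ID = ?", [(i,) for i in ids])
    return ids


removed = delete_experiments_like(conn, "%snake%")
remaining = conn.execute("SELECT EXPERIMENT_ID FROM EXPERIMENT_SUMMARY").fetchall()
```

On a real deployment there may be further child tables and foreign-key constraints that this sketch does not model, so a backup and a SELECT preview of the affected rows are worth doing before any DELETE.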


On Dec 28, 2016, at 6:07 PM, Jarett DeAngelis <ja...@bioteam.net> wrote:

I found more culprits in EXPERIMENT_SUMMARY, but:

MariaDB [Airavata_production_expcatalog]> select * from EXPERIMENT_SUMMARY where EXECUTION_ID like '%snake%';
+---------------------------------------------------+-----------------------------------------------------+--------------------------+------------------+---------------------------------------------------+-----------------+---------------------+-------------+-----------+----------------------------------------------------------------+----------------------+
| EXPERIMENT_ID                                     | PROJECT_ID                                          | GATEWAY_ID               | USER_NAME        | EXECUTION_ID                                      | EXPERIMENT_NAME | CREATION_TIME       | DESCRIPTION | STATE     | RESOURCE_HOST_ID                                               | TIME_OF_STATE_CHANGE |
+---------------------------------------------------+-----------------------------------------------------+--------------------------+------------------+---------------------------------------------------+-----------------+---------------------+-------------+-----------+----------------------------------------------------------------+----------------------+
| 12345_f870dd35-e6d4-4053-a92b-302820dd129e        | Project1_3c40da06-4bef-4326-8097-ccf44859285a       | usda_ars_science_gateway | jarett.deangelis | ls_snakemake_003a47d0-0d2c-474a-b734-9f4bf2903a03 | 12345           | 2016-11-02 10:22:14 |             | COMPLETED | login.scinet.ars.usda.gov_b43d44c7-6d73-4321-b88d-62f3e7768d0c | 2016-11-02 10:22:52  |
| 2345_aa3b31ea-6f2c-48e4-8b15-6a39ef23330c         | Project1_3c40da06-4bef-4326-8097-ccf44859285a       | usda_ars_science_gateway | jarett.deangelis | ls_snakemake_003a47d0-0d2c-474a-b734-9f4bf2903a03 | 2345            | 2016-11-02 10:24:46 |             | COMPLETED | login.scinet.ars.usda.gov_b43d44c7-6d73-4321-b88d-62f3e7768d0c | 2016-11-02 10:25:24  |
| asdfasdfasdf_6727643c-1749-45df-86c4-1a9ecb2fd1a3 | Project1_3c40da06-4bef-4326-8097-ccf44859285a       | usda_ars_science_gateway | jarett.deangelis | ls_snakemake_003a47d0-0d2c-474a-b734-9f4bf2903a03 | asdfasdfasdf    | 2016-10-21 16:08:33 |             | LAUNCHED  | login.scinet.ars.usda.gov_b43d44c7-6d73-4321-b88d-62f3e7768d0c | 2016-10-21 16:08:44  |
| qwerty_68f5e215-d998-47c1-b702-c66346889689       | DefaultProject_dfd1a414-dbdf-41c8-ad9f-6092af16cf3d | usda_ars_science_gateway | jarett.deangelis | ls_snakemake_003a47d0-0d2c-474a-b734-9f4bf2903a03 | qwerty          | 2016-10-31 10:57:09 | asdf        | LAUNCHED  | login.scinet.ars.usda.gov_b43d44c7-6d73-4321-b88d-62f3e7768d0c | 2016-10-31 10:57:09  |
| qwerty_68f5e215-d998-47c1-b702-c66346889689       | DefaultProject_dfd1a414-dbdf-41c8-ad9f-6092af16cf3d | usda_ars_science_gateway | jarett.deangelis | ls_snakemake_003a47d0-0d2c-474a-b734-9f4bf2903a03 | qwerty          | 2016-10-31 10:57:09 | asdf        | CREATED   | login.scinet.ars.usda.gov_b43d44c7-6d73-4321-b88d-62f3e7768d0c | 2016-10-31 10:57:09  |
+---------------------------------------------------+-----------------------------------------------------+--------------------------+------------------+---------------------------------------------------+-----------------+---------------------+-------------+-----------+----------------------------------------------------------------+----------------------+
5 rows in set (0.01 sec)

MariaDB [Airavata_production_expcatalog]> delete from EXPERIMENT_SUMMARY where EXECUTION_ID like '%snake%';
ERROR 1288 (HY000): The target table EXPERIMENT_SUMMARY of the DELETE is not updatable


Not sure what I can do with that.

Jarett


Re: phantom job

Posted by Jarett DeAngelis <ja...@bioteam.net>.
I found more culprits in EXPERIMENT_SUMMARY, but:

MariaDB [Airavata_production_expcatalog]> select * from EXPERIMENT_SUMMARY where EXECUTION_ID like '%snake%';
+---------------------------------------------------+-----------------------------------------------------+--------------------------+------------------+---------------------------------------------------+-----------------+---------------------+-------------+-----------+----------------------------------------------------------------+----------------------+
| EXPERIMENT_ID                                     | PROJECT_ID                                          | GATEWAY_ID               | USER_NAME        | EXECUTION_ID                                      | EXPERIMENT_NAME | CREATION_TIME       | DESCRIPTION | STATE     | RESOURCE_HOST_ID                                               | TIME_OF_STATE_CHANGE |
+---------------------------------------------------+-----------------------------------------------------+--------------------------+------------------+---------------------------------------------------+-----------------+---------------------+-------------+-----------+----------------------------------------------------------------+----------------------+
| 12345_f870dd35-e6d4-4053-a92b-302820dd129e        | Project1_3c40da06-4bef-4326-8097-ccf44859285a       | usda_ars_science_gateway | jarett.deangelis | ls_snakemake_003a47d0-0d2c-474a-b734-9f4bf2903a03 | 12345           | 2016-11-02 10:22:14 |             | COMPLETED | login.scinet.ars.usda.gov_b43d44c7-6d73-4321-b88d-62f3e7768d0c | 2016-11-02 10:22:52  |
| 2345_aa3b31ea-6f2c-48e4-8b15-6a39ef23330c         | Project1_3c40da06-4bef-4326-8097-ccf44859285a       | usda_ars_science_gateway | jarett.deangelis | ls_snakemake_003a47d0-0d2c-474a-b734-9f4bf2903a03 | 2345            | 2016-11-02 10:24:46 |             | COMPLETED | login.scinet.ars.usda.gov_b43d44c7-6d73-4321-b88d-62f3e7768d0c | 2016-11-02 10:25:24  |
| asdfasdfasdf_6727643c-1749-45df-86c4-1a9ecb2fd1a3 | Project1_3c40da06-4bef-4326-8097-ccf44859285a       | usda_ars_science_gateway | jarett.deangelis | ls_snakemake_003a47d0-0d2c-474a-b734-9f4bf2903a03 | asdfasdfasdf    | 2016-10-21 16:08:33 |             | LAUNCHED  | login.scinet.ars.usda.gov_b43d44c7-6d73-4321-b88d-62f3e7768d0c | 2016-10-21 16:08:44  |
| qwerty_68f5e215-d998-47c1-b702-c66346889689       | DefaultProject_dfd1a414-dbdf-41c8-ad9f-6092af16cf3d | usda_ars_science_gateway | jarett.deangelis | ls_snakemake_003a47d0-0d2c-474a-b734-9f4bf2903a03 | qwerty          | 2016-10-31 10:57:09 | asdf        | LAUNCHED  | login.scinet.ars.usda.gov_b43d44c7-6d73-4321-b88d-62f3e7768d0c | 2016-10-31 10:57:09  |
| qwerty_68f5e215-d998-47c1-b702-c66346889689       | DefaultProject_dfd1a414-dbdf-41c8-ad9f-6092af16cf3d | usda_ars_science_gateway | jarett.deangelis | ls_snakemake_003a47d0-0d2c-474a-b734-9f4bf2903a03 | qwerty          | 2016-10-31 10:57:09 | asdf        | CREATED   | login.scinet.ars.usda.gov_b43d44c7-6d73-4321-b88d-62f3e7768d0c | 2016-10-31 10:57:09  |
+---------------------------------------------------+-----------------------------------------------------+--------------------------+------------------+---------------------------------------------------+-----------------+---------------------+-------------+-----------+----------------------------------------------------------------+----------------------+
5 rows in set (0.01 sec)

MariaDB [Airavata_production_expcatalog]> delete from EXPERIMENT_SUMMARY where EXECUTION_ID like '%snake%';
ERROR 1288 (HY000): The target table EXPERIMENT_SUMMARY of the DELETE is not updatable


Not sure what I can do with that.

Jarett

> On Dec 28, 2016, at 5:24 PM, Suresh Marru <sm...@apache.org> wrote:
> 
> Hi Jarett,
> 
> Can you remove the entries of ls_snakemake in the application catalog database as well? I believe Eroma was able to reproduce the issue and we will work on a fix. Sorry for the trouble.
> 
> Suresh
> 


Re: phantom job

Posted by Jarett DeAngelis <ja...@bioteam.net>.
Hi Suresh,

There was no ls_snakemake in the application catalog, but I have at least temporarily fixed the problem. What resolved the issue was also looking through the EXPERIMENT_INPUT table for anything matching the string “param” (as seen where it’s looking for p.interfaceID =:param0, for example) and deleting all instances of those. Notably, none of those records contained the string ls_snakemake, so I’m not sure where it got it from. However, getting rid of those seems to have un-jammed RabbitMQ again for now.

Jarett
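
A search like the one described above (looking for a string across the tables of the experiment catalog) can be scripted rather than done table by table. The following is a minimal sketch, not an Airavata tool: it uses an in-memory SQLite database as a stand-in, the `find_references` helper is hypothetical, and the demo schema (including the `INPUT_NAME` and `INPUT_VALUE` columns) is assumed for illustration. On MariaDB the list of tables and columns would come from `information_schema` instead of `sqlite_master` and `PRAGMA table_info`.

```python
import sqlite3


def find_references(conn, needle):
    """Return {table_name: matching_row_count} for every table in which
    at least one column contains `needle` as a substring."""
    hits = {}
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        if not columns:
            continue
        # instr() does a literal substring match, so "_" and "%" in the
        # needle are not treated as wildcards the way LIKE would treat them.
        where = " OR ".join(f'instr(CAST("{c}" AS TEXT), ?) > 0' for c in columns)
        count = conn.execute(
            f'SELECT COUNT(*) FROM "{table}" WHERE {where}',
            [needle] * len(columns)).fetchone()[0]
        if count:
            hits[table] = count
    return hits


# Demo data with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EXPERIMENT (EXPERIMENT_ID TEXT, EXECUTION_ID TEXT);
CREATE TABLE EXPERIMENT_INPUT (EXPERIMENT_ID TEXT, INPUT_NAME TEXT, INPUT_VALUE TEXT);
CREATE TABLE PROJECT (PROJECT_ID TEXT);
INSERT INTO EXPERIMENT VALUES ('exp1', 'other_app_demo');
INSERT INTO EXPERIMENT_INPUT VALUES ('exp1', 'param0', 'a.txt');
INSERT INTO EXPERIMENT_INPUT VALUES ('exp1', 'param1', 'b.txt');
INSERT INTO PROJECT VALUES ('Project1_demo');
""")

param_hits = find_references(conn, "param")
snake_hits = find_references(conn, "ls_snakemake")
print(param_hits, snake_hits)
```

The helper only reports where matches are; reviewing those rows before deleting anything keeps unrelated records that happen to contain the same string from being removed.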

> On Dec 28, 2016, at 5:24 PM, Suresh Marru <sm...@apache.org> wrote:
> 
> Hi Jarett,
> 
> Can you remove the entries of ls_snakemake in the application catalog database as well? I believe Eroma was able to reproduce the issue and we will work on a fix. Sorry for the trouble.
> 
> Suresh
> 


Re: phantom job

Posted by Suresh Marru <sm...@apache.org>.
Hi Jarett,

Can you remove the entries of ls_snakemake in the application catalog database as well? I believe Eroma was able to reproduce the issue and we will work on a fix. Sorry for the trouble.

Suresh

> On Dec 28, 2016, at 5:13 PM, Jarett DeAngelis <ja...@bioteam.net> wrote:
> 
> So, I keep seeing this pop up in my log, and whenever it does, the middleware grinds to a halt and will not process any more jobs:
> 
> 2016-12-28 14:56:03,058 [pool-3-thread-49] ERROR org.apache.airavata.api.server.handler.AiravataServerHandler  - ls_snakemake_003a47d0-0d2c-474a-b734-9f4bf2903a03
> org.apache.airavata.registry.cpi.AppCatalogException: javax.persistence.NoResultException: Query "SELECT p FROM ApplicationInterface p WHERE p.interfaceID =:param0" selected no result, but expected unique result.
> 
> This “ls_snakemake” job is months old and is associated with an interface, module and deployment which no longer exist, which is why this error is getting thrown. What I don’t understand is what is “reminding” Airavata of this job and causing it to get hung up. Last week, I found references to it in the experiment catalog database in the EXPERIMENT table, and deleting these allowed Airavata to continue. Now it is happening again and the references are gone. How do I eliminate all references to this job from all of Airavata such that it stops getting hung up on this phantom job?
> 
> Thanks,
> Jarett