Posted to dev@ctakes.apache.org by "Hari, Sekhar" <se...@cgi.com> on 2015/04/22 13:21:43 UTC

Request for help:: NCBO Ontology Extraction Tool for i2b2

Hello there -

Introducing myself:


My name is Sekhar Hari, and I am responsible for bioinformatics products and solutions at CGI, a Canadian company. In this capacity, I am also responsible for developing software to identify potential adverse events and serious adverse events in healthcare settings.


I have been trying to extract and process a few ontologies with the latest version of the NCBO Ontology Extraction Tool and load them into i2b2, but with no luck. I can extract the staging file and load it into the i2b2 staging table. However, when I run the edu.harvard.i2b2.ncbo.extraction.NCBOOntologyProcessAll program, it always fails with a "GC overhead limit exceeded" error. I tried increasing the JVM memory to 8GB, but with no result. My hardware resources are limited at present, and I can't increase the JVM memory size beyond 8GB.

As I have a demo for a large hospital coming up soon, in the interest of time, would you be kind enough to extract and process the following ontologies, and upload the final metadata file here? http://i2b2.bioontology.org/

Ontology IDs:

1. WHO-ART
2. OAE
3. SSE
4. OVAE

The user-guide that I was following is attached.

Many thanks in advance.

Regards,
Sekhar H.

RE: Request for help:: NCBO Ontology Extraction Tool for i2b2

Posted by "Savova, Guergana" <Gu...@childrens.harvard.edu>.
Hi Sekhar,
You'd want to be on the i2b2 mailing list, not the cTAKES mailing list.
--Guergana


RE: Request for help:: NCBO Ontology Extraction Tool for i2b2

Posted by "Miller, Timothy" <Ti...@childrens.harvard.edu>.
Sekhar,
You seem to be on the wrong email list.
Tim



RE: Request for help:: NCBO Ontology Extraction Tool for i2b2

Posted by "Hari, Sekhar" <se...@cgi.com>.
Hello there - Any luck with extracting and processing these ontologies, particularly OAE, SSE, and OVAE?

Many thanks,
Sekhar H.


RE: Request for help:: NCBO Ontology Extraction Tool for i2b2

Posted by "Hari, Sekhar" <se...@cgi.com>.
I checked only the four ontologies that I mentioned in my email. On this site - http://i2b2.bioontology.org/ - I see that you have submitted a number of final metadata files for different ontologies. I am not familiar enough with the Extraction and Processing programs to modify them; hence I asked the group in the hope that somebody can extract and process the final metadata files for these ontologies.

WHO-ART:
For this one, the Processing program dies with a "GC overhead limit exceeded" error exactly when the output file reaches 11GB (with pathFormat set to 'Medium'; it dies at 9.4GB with pathFormat 'Short'). The Extraction program worked very well.
I contacted Lori, and here is what he had to say:
"The problem with WHO-ART is that it's circular... I don't have a solution for this problem.
Traverse down one of the AV Block paths / Retinal Oedema / Fungal... / Thyroid... / Aspiration / and AV Block again... it goes on and on..."
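
For reference, a cycle-safe traversal would stop exactly that kind of infinite expansion. The sketch below is illustrative only (the class and method names are hypothetical, not the actual NCBOOntologyProcessAll code); it keeps a set of the concepts already on the current path and prunes any edge that would revisit one of them.

import java.util.*;

// Illustrative sketch (hypothetical names, not the i2b2 code): expand an
// ontology hierarchy into i2b2-style root-to-concept paths while pruning
// cycles such as the WHO-ART "AV Block -> ... -> AV Block" loop above.
public class CycleSafePathBuilder {

    // child concept -> its parent concepts, as loaded from the staging table
    private final Map<String, Set<String>> parents = new HashMap<>();

    public void addEdge(String child, String parent) {
        parents.computeIfAbsent(child, k -> new HashSet<>()).add(parent);
    }

    // Every root-to-concept path for the given concept; cyclic branches are skipped.
    public List<String> buildPaths(String concept) {
        List<String> result = new ArrayList<>();
        expand(concept, new ArrayDeque<>(), new HashSet<>(), result);
        return result;
    }

    private void expand(String concept, Deque<String> path, Set<String> onPath,
                        List<String> result) {
        if (!onPath.add(concept)) {
            return; // this concept is already on the current path: cycle detected
        }
        path.push(concept);
        Set<String> p = parents.getOrDefault(concept, Collections.emptySet());
        if (p.isEmpty()) {
            result.add("\\" + String.join("\\", path)); // reached a root: emit the path
        } else {
            for (String parent : p) {
                expand(parent, path, onPath, result);
            }
        }
        path.pop();
        onPath.remove(concept);
    }

    public static void main(String[] args) {
        CycleSafePathBuilder b = new CycleSafePathBuilder();
        b.addEdge("AV Block", "Cardiac disorders");      // acyclic branch up to a root
        b.addEdge("AV Block", "Thyroid disorder");
        b.addEdge("Thyroid disorder", "Aspiration");
        b.addEdge("Aspiration", "AV Block");             // the circular branch
        // Prints "\Cardiac disorders\AV Block"; the circular branch is pruned silently.
        b.buildPaths("AV Block").forEach(System.out::println);
    }
}

A real run would also carry concept codes and path depth, but the visited-set idea is what keeps the output from growing without bound.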

OAE, SSE, OVAE:
For these three, the problem is different. There is no "GC overhead limit" error, but when the Extraction program runs, it throws a Java NullPointerException after each page. Lori asked me to modify the program. Below is Lori's response:

"I see the problem
 
My code assumes the following format for each concept:
 
Example from ICD9:
<class><properties>
<tuiCollection><tui type="http://bioportal.bioontology.org/ontologies/umls/tui">T061</tui></tuiCollection>
<notationCollection><notation type="http://www.w3.org/2004/02/skos/core#notation">83.72</notation></notationCollection>
<cuiCollection><cui type="http://bioportal.bioontology.org/ontologies/umls/cui">C0185466</cui></cuiCollection>
<prefLabelCollection><prefLabel type="http://www.w3.org/2004/02/skos/core#prefLabel">Recession of tendon</prefLabel></prefLabelCollection>
 
It's expecting to see <notationCollection> to obtain the basecode of the term.
 
In your case, there is no <notationCollection> entry, which is why you are seeing null pointers.
 
It does have <prefixIRICollection><prefixIRI type="http://data.bioontology.org/metadata/prefixIRI">OAE:0001620</prefixIRI> (which I assume is the basecode).
 
Your problem is going to need a custom solution that, unfortunately, I don't have the bandwidth for. I can tell you where and how to modify the code to fit your needs. Let me know if you need assistance in modifying the code."
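
The direction Lori points to (fall back to the <prefixIRI> value when a concept has no <notationCollection>) could look roughly like the sketch below. It is illustrative only, written against the XML layout quoted above; the class and method names are hypothetical and not part of the i2b2 extraction code.

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Illustrative sketch (hypothetical names): take the basecode from <notation>
// when it exists, otherwise fall back to <prefixIRI> such as "OAE:0001620".
public class BasecodeResolver {

    static String resolveBasecode(Document concept) {
        String notation = firstText(concept, "notation");
        if (notation != null) {
            return notation;                        // ICD9-style concepts: e.g. 83.72
        }
        return firstText(concept, "prefixIRI");     // OAE/SSE/OVAE-style concepts
    }

    private static String firstText(Document doc, String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            return null;                            // element absent in this concept
        }
        String text = nodes.item(0).getTextContent();
        return (text == null || text.trim().isEmpty()) ? null : text.trim();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<class><prefixIRICollection>"
                + "<prefixIRI type=\"http://data.bioontology.org/metadata/prefixIRI\">"
                + "OAE:0001620</prefixIRI></prefixIRICollection></class>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(resolveBasecode(doc));   // prints OAE:0001620
    }
}

The fallback itself is the only new idea here; it would still need to be wired into wherever the extraction code reads the basecode today.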

Thanks,
Sekhar H.


Re: Request for help:: NCBO Ontology Extraction Tool for i2b2

Posted by Pei Chen <ch...@apache.org>.
Sekhar,
Is it happening with all of the ontologies you mentioned, or just one?  Those
ontologies do not seem very big or deep.  Did you notice anything in the logs
suggesting that something in the ontology has some sort of circular reference
or is causing an infinite loop?
I think Lori from i2b2 may be better at answering this since this isn't
exactly cTAKES related...
--Pei

