Posted to dev@stanbol.apache.org by Anuj Kumar <an...@formcept.com> on 2012/05/07 18:47:56 UTC

[Accepted] FORMCEPT Proposal for Early adopter programme

Hello Everyone,

We are excited to become a part of the Early Adopter Programme with our
medical records use case. As a part of this programme we will be working
on the analysis of medical records and their overall management. You can
read the details of our proposal here:
http://wiki.iks-project.eu/index.php/Formcept_Proposal

The life science demo that has been set up by the Stanbol community will
bootstrap the project, and we will be building on top of the datasets
mentioned by Rupert in his previous email [1]. I have been following the
Stanbol project since its inception and I am looking forward to working
with the team.

Regards,
Anuj

[1]
http://mail-archives.apache.org/mod_mbox/incubator-stanbol-dev/201203.mbox/%3CCAA7LAO0mDmYxGH0HN3=OAMxMU9C8Ho9yS+Yry_N4JAX_3OMHdw@mail.gmail.com%3E

On Thu, 2012-03-01 at 11:29 +0530, Anuj Kumar wrote:

> Hello Everyone,
> 
> We at FORMCEPT are interested in joining the Early Adopter Programme
> with two of our use cases-
> 
> 1. Profile Analysis - Analysis of user profile, extraction of skills
> and matching it to the right job description
> 2. Analysis of Medical Records - Analysis of various medical reports
> and documents adhering to Clinical Document Architecture
> 
> We started FORMCEPT (http://www.formcept.com) in the month of
> September last year at Bangalore, India. We have built a big data
> analysis stack that can accept data in any format, from any source and
> has the capability of making it useful for everyone. We have our
> profile analysis and job matching service nearing completion and we
> have started work on analysis of medical records.
> 
> Our stack is built entirely on top of open source projects such as
> Hadoop, HBase, Solr, OpenNLP and UIMA. Our knowledge base is built
> on top of DBpedia dumps and allows you to extend it with your
> domain-specific knowledge. Our user interface is based on HTML5, CSS3
> and JavaScript. We have been in touch with the Stanbol community and
> see good integration points with our stack.
> 
> As a part of this proposal, we will be evaluating (but not limited to)
> the following components-
> 
> 1. Enhancement Engines (Adding more for the above use cases)
> 2. Ontology Manager
> 3. Entityhub
> 4. Keyword Linking Engine
> 
> Please let us know if it will be useful for the Stanbol community.
> 
> Thanks,
> Anuj
> Co-Founder,
> FORMCEPT,
> Bangalore, India

Re: [Implemented] FORMCEPT Proposal for Early adopter programme

Posted by Anuj Kumar <an...@formcept.com>.

Hello Everyone,

This is a follow-up email on our implementation of an enhancement engine
for medical entities. We have posted the details of our enhancement
engine and some results here:
http://formcept.com/blog/healthcare-stanbol/ and also on the IKS blog.

Please take a look and share your feedback.

Regards,
Anuj

On Sun, 2012-07-01 at 21:50 +0530, Anuj Kumar wrote:

> Hello Everyone,
> 
> We have implemented an enhancement engine for the medical domain as a
> part of our proposal. The demo is hosted at:
> http://demo.formcept.com:8080/enhancer
> We are currently annotating drugs, diseases, chemical compounds and a
> few other medical and related domains.
> 
> The enhancement engine uses FORMCEPT's knowledge graph, which is built
> on top of the DBpedia 3.6 dumps, so thanks to the DBpedia project as
> well. We not only annotate the entities but also find their broader
> categories along with the relevant hierarchy. As of now, we record the
> hierarchy as a path under the SKOS_BROADER annotation. A typical
> annotation looks like this:
> 
>       "broader": "Health->Diseases and disorders->Symptoms->Symptoms and signs: Digestive system and abdomen->Vomiting",
>       "comment": "Vomiting (known medically as emesis and informally as throwing up and a number of other terms) is the forceful expulsion of the contents of one's stomach through the mouth and sometimes the nose. Vomiting may result from many causes, ranging from gastritis or poisoning to brain tumors, or elevated intracranial pressure. The feeling that one is about to vomit is called nausea, which usually precedes, but does not always lead to, vomiting.",
>       "created": "2012-07-01T12:23:58.495Z",
>       "creator": "org.formcept.engine.enhancer.FCHealthCareEnhancer",
>       "end": 768,
>       "extracted-from": "urn:content-item-sha1-09055dcdcb07a20b3b30f64079a9a2779600f801",
>       "selected-text": "vomiting",
>       "start": 760,
>       "type": "Health"
> 
> We are working on benchmarking the overall performance. We will post
> the details soon.
> 
> Please give it a try and provide your feedback.
> 
> Regards,
> Anuj
> PS: If there is any change in the URL, we will update the community.

[Implemented] FORMCEPT Proposal for Early adopter programme

Posted by Anuj Kumar <an...@formcept.com>.

Hello Everyone,

We have implemented an enhancement engine for the medical domain as a
part of our proposal. The demo is hosted at:
http://demo.formcept.com:8080/enhancer
We are currently annotating drugs, diseases, chemical compounds and a
few other medical and related domains.
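
For anyone who wants to try the demo from a script, here is a minimal
sketch of how a request could be prepared. It assumes the demo follows
the usual Apache Stanbol Enhancer convention of accepting plain text via
HTTP POST and returning the enhancements in the format named in the
Accept header; the function name and the sample sentence are
hypothetical, and the request is only built here, not sent.

```python
from urllib.request import Request

# Demo URL as given in this message; it may change or go offline.
ENHANCER_URL = "http://demo.formcept.com:8080/enhancer"


def build_enhancer_request(text, url=ENHANCER_URL, accept="application/json"):
    """Prepare (but do not send) a POST request carrying plain text.

    Assumption: the endpoint takes the raw text as the request body and
    picks the response serialization from the Accept header.
    """
    return Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": accept,
        },
        method="POST",
    )


# Hypothetical sample input.
request = build_enhancer_request("The patient reported nausea and vomiting.")
print(request.get_method(), request.full_url)
```

To actually send it, pass the request to urllib.request.urlopen() and
read the response body; that step is left out so the sketch runs without
network access.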

The enhancement engine uses FORMCEPT's knowledge graph, which is built
on top of the DBpedia 3.6 dumps, so thanks to the DBpedia project as
well. We not only annotate the entities but also find their broader
categories along with the relevant hierarchy. As of now, we record the
hierarchy as a path under the SKOS_BROADER annotation. A typical
annotation looks like this:

      "broader": "Health->Diseases and disorders->Symptoms->Symptoms and signs: Digestive system and abdomen->Vomiting",
      "comment": "Vomiting (known medically as emesis and informally as throwing up and a number of other terms) is the forceful expulsion of the contents of one's stomach through the mouth and sometimes the nose. Vomiting may result from many causes, ranging from gastritis or poisoning to brain tumors, or elevated intracranial pressure. The feeling that one is about to vomit is called nausea, which usually precedes, but does not always lead to, vomiting.",
      "created": "2012-07-01T12:23:58.495Z",
      "creator": "org.formcept.engine.enhancer.FCHealthCareEnhancer",
      "end": 768,
      "extracted-from": "urn:content-item-sha1-09055dcdcb07a20b3b30f64079a9a2779600f801",
      "selected-text": "vomiting",
      "start": 760,
      "type": "Health"

We are working on benchmarking the overall performance. We will post the
details soon.
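
The annotation shown above packs the category hierarchy into a single
"broader" string with "->" between levels. As a rough sketch of how a
consumer might read it, the snippet below splits that path into levels
and checks that the start/end offsets agree with the selected text. It
assumes the fields are wrapped in one JSON object (the braces are added
here and are not part of the excerpt), and the function names are
hypothetical.

```python
import json

# A subset of the fields from the annotation above. Wrapping them in
# braces to form a JSON object is an assumption made for this sketch.
RAW = """{
  "broader": "Health->Diseases and disorders->Symptoms->Symptoms and signs: Digestive system and abdomen->Vomiting",
  "selected-text": "vomiting",
  "start": 760,
  "end": 768,
  "type": "Health"
}"""


def broader_path(annotation):
    """Split the 'broader' string into its category levels, root first."""
    return [level.strip() for level in annotation["broader"].split("->")]


def span_matches_text(annotation):
    """Check that end - start equals the length of the selected text."""
    span_length = annotation["end"] - annotation["start"]
    return span_length == len(annotation["selected-text"])


annotation = json.loads(RAW)
levels = broader_path(annotation)
print(levels)
print(span_matches_text(annotation))
```

Splitting on "->" only works as long as no category label contains that
sequence itself, which holds for this example but is not guaranteed in
general.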
 
Please give it a try and provide your feedback.

Regards,
Anuj
PS: If there is any change in the URL, we will update the community.


Re: [Accepted] FORMCEPT Proposal for Early adopter programme

Posted by Anuj Kumar <an...@gmail.com>.

Thanks Rupert. I will take a look at these datasets and the recent
additions as well. We are currently evaluating the existing datasets,
and there are a few more listed in the research paper that you mentioned
earlier.

Once the integration is done, I will definitely share the statistics
regarding the performance of the enhancement engine.

Regards,
Anuj

On Wed, May 9, 2012 at 2:49 PM, Rupert Westenthaler <
rupert.westenthaler@gmail.com> wrote:

> Hi Anuj,
>
> That's great to hear! It is good to have another member in the Stanbol
> community who is interested in the life science domain.
>
> While I currently have very little time to further improve the health
> demo, I am generally very interested in this domain.
>
> Note also that with the recent additions to the Entityhub Indexing
> tools (STANBOL-593, STANBOL-592, STANBOL-590) it would now be possible
> to further improve the eHealth demo (e.g. by using LDPath statements
> on the IndexingSource to collect information from the different
> datasets).
>
> In addition, there are still some interesting data sources that are
> not yet included in the demo, such as:
>
> * Linked Clinical (http://linkedct.org/)
> * GeneDB (http://www.genedb.org/Homepage)
>
> Suat also had some additional eHealth-related data sources that one
> could try to incorporate.
>
> So if you have any problems, questions or suggestions, feel free to
> ask. As "medical records" are typically not publicly available, I
> would be especially interested if you could share information about
> the performance of the Stanbol Enhancer (e.g. recall, precision) on
> real data sets.
>
> best
> Rupert
>
> --
> | Rupert Westenthaler             rupert.westenthaler@gmail.com
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen
>

Re: [Accepted] FORMCEPT Proposal for Early adopter programme

Posted by Rupert Westenthaler <ru...@gmail.com>.

Hi Anuj,

That's great to hear! It is good to have another member in the Stanbol
community who is interested in the life science domain.

While I currently have very little time to further improve the health
demo, I am generally very interested in this domain.

Note also that with the recent additions to the Entityhub Indexing
tools (STANBOL-593, STANBOL-592, STANBOL-590) it would now be possible
to further improve the eHealth demo (e.g. by using LDPath statements
on the IndexingSource to collect information from the different
datasets).

In addition, there are still some interesting data sources that are
not yet included in the demo, such as:

* Linked Clinical (http://linkedct.org/)
* GeneDB (http://www.genedb.org/Homepage)

Suat also had some additional eHealth-related data sources that one
could try to incorporate.

So if you have any problems, questions or suggestions, feel free to
ask. As "medical records" are typically not publicly available, I
would be especially interested if you could share information about
the performance of the Stanbol Enhancer (e.g. recall, precision) on
real data sets.
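
To make the requested metrics concrete, here is a small sketch of how
precision and recall could be computed for annotation output. It assumes
each annotation is reduced to a (start, end, type) tuple and that only
exact matches against a hand-labelled gold set count as correct; the
function name and all sample tuples are hypothetical, not real results.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) using exact-match comparison.

    precision = correct predictions / all predictions
    recall    = correct predictions / all gold annotations
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical gold annotations: (start offset, end offset, type).
gold = {
    (760, 768, "Health"),
    (120, 127, "Drug"),
    (300, 308, "Disease"),
    (512, 520, "Disease"),
}

# Hypothetical engine output: two exact matches and one spurious hit.
predicted = {
    (760, 768, "Health"),
    (120, 127, "Drug"),
    (410, 415, "Drug"),
}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Exact matching is the strictest choice; an evaluation could also count
overlapping spans as partial matches, which would raise both numbers.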

best
Rupert





-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen