Posted to dev@stanbol.apache.org by Bertrand Delacretaz <bd...@apache.org> on 2012/07/12 14:49:29 UTC

GSoC disambiguation project, any news? (was: GSoC project accepted)

Hi,

On Tue, Apr 24, 2012 at 9:22 AM, Bertrand Delacretaz
<bd...@apache.org> wrote:
> ...According to [1], Kritarth Anand's GSoC entity disambiguation project
> has been accepted, congrats!...

How's that project going forward? I don't remember seeing any
discussions about it here, did I miss something?

-Bertrand

Re: GSoC disambiguation project, any news? (was: GSoC project accepted)

Posted by Bertrand Delacretaz <bd...@apache.org>.
Hi,

On Fri, Jul 13, 2012 at 9:19 AM, Rupert Westenthaler
<ru...@gmail.com> wrote:
...
> [1] https://github.com/kritarthanand/Disambiguation-Stanbol/pull/1
...

So, is https://github.com/kritarthanand/Disambiguation-Stanbol where
the code of this GSoC project is happening? Why not in Stanbol's
repository? Was that ever discussed here? Could commit events be sent
to our commits list so that we can follow? Did Kritarth file an iCLA
so that the code can easily move here? (lots of questions, I agree ;-)

>...It is critical to discuss your ideas with
> the community. Especially to get more feedback on how to apply those
> ideas to Apache Stanbol....

Yes. As a former GSoC mentor, I'm *very* disappointed to see that this
community hasn't been made aware of the progress of this project.

GSoC is as much about learning to work with a community as it is about
writing code. Looks like the latter is working, as for the
former...not at all.

I hope student and mentor can change course for the second half of
GSoC and make sure things happen in the open, on this list, as the
project progresses. A simple rule for that is to avoid any direct
email between student and mentor, and use this list instead (unless
discussing private or administrative things of course).

"If it didn't happen on the dev list, it didn't happen".

-Bertrand

Re: GSoC disambiguation project, any news? (was: GSoC project accepted)

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Kritarth, all

On Thu, Jul 12, 2012 at 4:51 PM, kritarth anand
<kr...@gmail.com> wrote:
> Hi Bertrand,
>
> The project is going well.
>
> We now have a working version of an entity disambiguation engine, which is
> based on a simple algorithm. It works well for very simple cases. It does
> require some code cleanup, and I will be sharing it (and my mid-term report)
> as an update with you in a day or two. Rupert is reviewing it as of now.
>

Kritarth, can you have a look at my pull request [1]? I needed to make
some adaptations to make it work with the API changes to the Entityhub
introduced by STANBOL-673. With those changes the engine runs fine in
the Stanbol trunk.

There are still some limitations and issues, but it is worth a try (I
recommend downloading and installing one of the bigger DBpedia indexes
for testing).

Kritarth, can you please provide a short how-to for installing, running,
and testing your engine?


[1] https://github.com/kritarthanand/Disambiguation-Stanbol/pull/1

> The initial part of my project was mainly familiarizing myself with Stanbol,
> getting background on entity disambiguation, getting a simple version
> running, and doing some reading to get ideas about the possible choice of
> algorithm. I had issues with those but was mainly interacting one-on-one
> with Rupert and Anuj.
>
> However, for the later part of my project I will be making decisions on
> which algorithms to choose and on the many concerns related to them, and
> therefore I am hoping to interact a lot more with the entire Stanbol
> community to get their views and feedback. I am looking forward to it.
>

I completely agree with that. It is critical to discuss your ideas with
the community. Especially to get more feedback on how to apply those
ideas to Apache Stanbol.

For that I see two things that need discussion/feedback from the community:

* Most research papers use Wikipedia/DBpedia as test data, but
Stanbol users more often use company- or domain-specific
controlled vocabularies.
* How can we (or do we need to) adapt/improve Stanbol to collect/provide
the information needed by those algorithms?


Next Steps:

* Improve the current Engine in a 2nd iteration (I think we should
create a Jira issue for that)
* Discuss other disambiguation possibilities here on the list and
select one to be implemented


Kritarth, thanks for your work so far
best
Rupert

> Kritarth



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen

Re: GSoC disambiguation project, any news? (was: GSoC project accepted)

Posted by kritarth anand <kr...@gmail.com>.
Hi Bertrand,

The project is going well.

We now have a working version of an entity disambiguation engine, which is
based on a simple algorithm. It works well for very simple cases. It does
require some code cleanup, and I will be sharing it (and my mid-term report)
as an update with you in a day or two. Rupert is reviewing it as of now.

The initial part of my project was mainly familiarizing myself with Stanbol,
getting background on entity disambiguation, getting a simple version
running, and doing some reading to get ideas about the possible choice of
algorithm. I had issues with those but was mainly interacting one-on-one
with Rupert and Anuj.

However, for the later part of my project I will be making decisions on
which algorithms to choose and on the many concerns related to them, and
therefore I am hoping to interact a lot more with the entire Stanbol
community to get their views and feedback. I am looking forward to it.

Kritarth
