Posted to dev@tika.apache.org by Tyler Bui-Palsulich <tp...@apache.org> on 2018/06/24 18:11:37 UTC

Re: Welcome Thejan Wijesinghe as an Apache Tika PMC and committer!

Welcome, Thejan! It's great to have you on board!

Tyler

(Catching up on some old email.)

On Thu, May 10, 2018, 4:52 PM Thejan Wijesinghe <th...@apache.org> wrote:

> Hi Chris Mattmann,
>
> Thank you for the invitation.
>
> Hi everyone,
>
> First of all, I should say that I am very excited to be on board. Being a
> PMC member in Tika is a huge accomplishment, because Tika is one of those
> TLPs in Apache with a history of more than 10 years.
>
> I’m currently a final-year undergraduate at the Univ. of Moratuwa, Sri
> Lanka. I have a keen interest in information retrieval, data science, and
> machine learning. Because Tika is one of the key technologies used in many
> information retrieval applications, I got the opportunity to work with it
> a couple of years back, but I never got the chance to use it for an
> industry-level application until my internship. During my internship, I
> worked with a startup in SL to build their own cognitive platform, where I
> had to use some of the Apache technologies such as Kafka, Solr, Superset
> (incubating), and Tika. We successfully completed the initial version of
> the platform, and I still work as an external consultant for the same
> project.
>
> However, becoming a committer to Apache Tika was one of the life goals I
> set when I was selected as the Google Summer of Code intern at Apache Tika
> in 2017. My project was “Supporting Image-to-Text (Image Captioning) in
> Tika for Image MIME Types” [1]. It was an amazing project idea by Thamme
> Gowda, to which lots of people paid a great deal of attention. I was
> mentored by Chris Mattmann and Thamme Gowda. I feel very lucky to have met
> these two people in my life because, if not for them, I don’t think I
> would ever have found the guidance to become a PMC member or a committer.
>
> Most of my contributions are related to enhancing ML-based capabilities in
> Tika. I have many future plans to improve the Tika-dl module, including
> adding a parser with NMT-based translation, a sentiment parser, and a
> dl4j-based captioning parser. I would also love to improve Tika’s
> capabilities in MIME type detection and language detection. Other than
> that, I would love to clean up some of the parsers in Tika. Our code base
> is quite a big one that has evolved over many years, and I have seen
> instances where some of the parsers are not in their appropriate place. As
> an example, we have an age recognizer parser in the Tika-nlp module while
> the sentiment parser is under the Tika-parsers module.
>
> I know that’s quite a lot of plans for Tika, but I have nothing to be
> afraid of, because I have an entire lifetime to accomplish them.
>
> [1] https://issues.apache.org/jira/browse/TIKA-2262
>
> Thanks and Best Regards,
> ThejanW
>
>
> On Tue, May 8, 2018 at 12:10 AM Chris Mattmann <ma...@apache.org>
> wrote:
>
> > Welcome to Thejan Wijesinghe who has joined as a new Tika PMC member and
> > committer!
> >
> > Please say a bit about yourself…thanks!
> >
> > Cheers,
> > Chris