Posted to dev@opennlp.apache.org by Anthony Beylerian <an...@gmail.com> on 2016/06/07 16:36:12 UTC

Profiler for OpenNLP

Hello,

We are currently working on an experimental author profiler that we think
could be added to the toolkit.

The profiler aims to detect the gender and age range of an author.
Later we hope to add personality aspects such as:
[extroverted, stable, agreeable, conscientious]

We would like the team's opinion on the matter.
An initial code drop can be found here [1]. If someone is willing to
contribute or collaborate on it with us, please let us know.

Thanks!

[1] https://github.com/beylerian/profiler

Re: Profiler for OpenNLP

Posted by Anthony Beylerian <an...@gmail.com>.
Hello,

Thank you very much for your interest.

We are planning to implement some of the features listed here [1].
However, due to the breadth of approaches, any suggestions or hints based on
your experience are of course welcome.

[1] : http://www.ripublication.com/ijaer16/ijaerv11n5_24.pdf

On Wed, Jun 8, 2016 at 8:14 PM, Kostas Perifanos <kostas.perifanos@gmail.com> wrote:

> Hi,
>
> very interesting, I have done some work in the field and I would like to
> contribute as well
>
> On Wed, Jun 8, 2016 at 4:29 AM, Madhawa Kasun Gunasekara <
> madhawa30@gmail.com> wrote:
>
> > +1
> > I would like to contribute.
> >
> > Thanks,
> > Madhawa
> >
> > Madhawa
> >
> > On Wed, Jun 8, 2016 at 1:26 AM, Tommaso Teofili <tommaso.teofili@gmail.com> wrote:
> >
> > > +1 that sounds quite interesting.
> > >
> > > Regards,
> > > Tommaso
> > >
> > > On Tue, Jun 7, 2016 at 8:03 PM, Mattmann, Chris A (3980) <
> > > chris.a.mattmann@jpl.nasa.gov> wrote:
> > >
> > > > We would love to have this part of Apache Tika. You can take a look
> > > > at the existing NER/NLP stuff integrated like in GeoTopicParser as
> > > > an example and yes please file a JIRA issue:
> > > >
> > > > http://issues.apache.org/jira/browse/TIKA
> > > >
> > > > I would be happy to work with you to make it happen.
> > > >
> > > > See: http://github.com/apache/tika/#contributing-via-github
> > > >
> > > > For guidance.
> > > >
> > > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > Chris Mattmann, Ph.D.
> > > > Chief Architect
> > > > Instrument Software and Science Data Systems Section (398)
> > > > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > > > Office: 168-519, Mailstop: 168-527
> > > > Email: chris.a.mattmann@nasa.gov
> > > > WWW:  http://sunset.usc.edu/~mattmann/
> > > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > > Director, Information Retrieval and Data Science Group (IRDS)
> > > > Adjunct Associate Professor, Computer Science Department
> > > > University of Southern California, Los Angeles, CA 90089 USA
> > > > WWW: http://irds.usc.edu/
> > > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On 6/7/16, 9:36 AM, "Anthony Beylerian" <anthony.beylerian@gmail.com
> >
> > > > wrote:
> > > >
> > > > >Hello,
> > > > >
> > > > > >We are currently working on an experimental author profiler that we think
> > > > > >could be added to the toolkit.
> > > > >
> > > > >The profiler aims to detect the gender and age range of an author.
> > > > >Later we hope to add personality aspects such as:
> > > > >[extroverted, stable, agreeable, conscientious]
> > > > >
> > > > >We would like the teams' opinion on the matter.
> > > > >An initial code drop can be found here[1] if someone is willing to
> > > > >contribute/collaborate on it with us please let us know.
> > > > >
> > > > >Thanks!
> > > > >
> > > > >[1] https://github.com/beylerian/profiler
> > > >
> > >
> >
>

Re: Profiler for OpenNLP

Posted by Kostas Perifanos <ko...@gmail.com>.
Hi,

Very interesting. I have done some work in the field and would like to
contribute as well.

On Wed, Jun 8, 2016 at 4:29 AM, Madhawa Kasun Gunasekara <
madhawa30@gmail.com> wrote:

> +1
> I would like to contribute.
>
> Thanks,
> Madhawa
>
> Madhawa
>
> On Wed, Jun 8, 2016 at 1:26 AM, Tommaso Teofili <tommaso.teofili@gmail.com> wrote:
>
> > +1 that sounds quite interesting.
> >
> > Regards,
> > Tommaso
> >
> > On Tue, Jun 7, 2016 at 8:03 PM, Mattmann, Chris A (3980) <
> > chris.a.mattmann@jpl.nasa.gov> wrote:
> >
> > > We would love to have this part of Apache Tika. You can take a look
> > > at the existing NER/NLP stuff integrated like in GeoTopicParser as
> > > an example and yes please file a JIRA issue:
> > >
> > > http://issues.apache.org/jira/browse/TIKA
> > >
> > > I would be happy to work with you to make it happen.
> > >
> > > See: http://github.com/apache/tika/#contributing-via-github
> > >
> > > For guidance.
> > >
> > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > Chris Mattmann, Ph.D.
> > > Chief Architect
> > > Instrument Software and Science Data Systems Section (398)
> > > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > > Office: 168-519, Mailstop: 168-527
> > > Email: chris.a.mattmann@nasa.gov
> > > WWW:  http://sunset.usc.edu/~mattmann/
> > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > > Director, Information Retrieval and Data Science Group (IRDS)
> > > Adjunct Associate Professor, Computer Science Department
> > > University of Southern California, Los Angeles, CA 90089 USA
> > > WWW: http://irds.usc.edu/
> > > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > >
> > > On 6/7/16, 9:36 AM, "Anthony Beylerian" <an...@gmail.com> wrote:
> > >
> > > >Hello,
> > > >
> > > >We are currently working on an experimental author profiler that we think
> > > >could be added to the toolkit.
> > > >
> > > >The profiler aims to detect the gender and age range of an author.
> > > >Later we hope to add personality aspects such as:
> > > >[extroverted, stable, agreeable, conscientious]
> > > >
> > > >We would like the teams' opinion on the matter.
> > > >An initial code drop can be found here[1] if someone is willing to
> > > >contribute/collaborate on it with us please let us know.
> > > >
> > > >Thanks!
> > > >
> > > >[1] https://github.com/beylerian/profiler
> > >
> >
>

Re: Profiler for OpenNLP

Posted by Madhawa Kasun Gunasekara <ma...@gmail.com>.
+1
I would like to contribute.

Thanks,
Madhawa

Madhawa

On Wed, Jun 8, 2016 at 1:26 AM, Tommaso Teofili <to...@gmail.com> wrote:

> +1 that sounds quite interesting.
>
> Regards,
> Tommaso
>
> On Tue, Jun 7, 2016 at 8:03 PM, Mattmann, Chris A (3980) <
> chris.a.mattmann@jpl.nasa.gov> wrote:
>
> > We would love to have this part of Apache Tika. You can take a look
> > at the existing NER/NLP stuff integrated like in GeoTopicParser as
> > an example and yes please file a JIRA issue:
> >
> > http://issues.apache.org/jira/browse/TIKA
> >
> > I would be happy to work with you to make it happen.
> >
> > See: http://github.com/apache/tika/#contributing-via-github
> >
> > For guidance.
> >
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Chris Mattmann, Ph.D.
> > Chief Architect
> > Instrument Software and Science Data Systems Section (398)
> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > Office: 168-519, Mailstop: 168-527
> > Email: chris.a.mattmann@nasa.gov
> > WWW:  http://sunset.usc.edu/~mattmann/
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Director, Information Retrieval and Data Science Group (IRDS)
> > Adjunct Associate Professor, Computer Science Department
> > University of Southern California, Los Angeles, CA 90089 USA
> > WWW: http://irds.usc.edu/
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> > On 6/7/16, 9:36 AM, "Anthony Beylerian" <an...@gmail.com> wrote:
> >
> > >Hello,
> > >
> > >We are currently working on an experimental author profiler that we think
> > >could be added to the toolkit.
> > >
> > >The profiler aims to detect the gender and age range of an author.
> > >Later we hope to add personality aspects such as:
> > >[extroverted, stable, agreeable, conscientious]
> > >
> > >We would like the teams' opinion on the matter.
> > >An initial code drop can be found here[1] if someone is willing to
> > >contribute/collaborate on it with us please let us know.
> > >
> > >Thanks!
> > >
> > >[1] https://github.com/beylerian/profiler
> >
>

Re: Profiler for OpenNLP

Posted by Tommaso Teofili <to...@gmail.com>.
+1 that sounds quite interesting.

Regards,
Tommaso

On Tue, Jun 7, 2016 at 8:03 PM, Mattmann, Chris A (3980) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> We would love to have this part of Apache Tika. You can take a look
> at the existing NER/NLP stuff integrated like in GeoTopicParser as
> an example and yes please file a JIRA issue:
>
> http://issues.apache.org/jira/browse/TIKA
>
> I would be happy to work with you to make it happen.
>
> See: http://github.com/apache/tika/#contributing-via-github
>
> For guidance.
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattmann@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Director, Information Retrieval and Data Science Group (IRDS)
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> WWW: http://irds.usc.edu/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> On 6/7/16, 9:36 AM, "Anthony Beylerian" <an...@gmail.com> wrote:
>
> >Hello,
> >
> >We are currently working on an experimental author profiler that we think
> >could be added to the toolkit.
> >
> >The profiler aims to detect the gender and age range of an author.
> >Later we hope to add personality aspects such as:
> >[extroverted, stable, agreeable, conscientious]
> >
> >We would like the teams' opinion on the matter.
> >An initial code drop can be found here[1] if someone is willing to
> >contribute/collaborate on it with us please let us know.
> >
> >Thanks!
> >
> >[1] https://github.com/beylerian/profiler
>

Re: Profiler for OpenNLP

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
We would love to have this as part of Apache Tika. You can take a look
at the existing NER/NLP integrations, such as GeoTopicParser, as
an example, and yes, please file a JIRA issue:

http://issues.apache.org/jira/browse/TIKA 

I would be happy to work with you to make it happen.

See: http://github.com/apache/tika/#contributing-via-github 

For guidance.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

On 6/7/16, 9:36 AM, "Anthony Beylerian" <an...@gmail.com> wrote:

>Hello,
>
>We are currently working on an experimental author profiler that we think
>could be added to the toolkit.
>
>The profiler aims to detect the gender and age range of an author.
>Later we hope to add personality aspects such as:
>[extroverted, stable, agreeable, conscientious]
>
>We would like the teams' opinion on the matter.
>An initial code drop can be found here[1] if someone is willing to
>contribute/collaborate on it with us please let us know.
>
>Thanks!
>
>[1] https://github.com/beylerian/profiler
