Posted to dev@any23.apache.org by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov> on 2015/03/27 15:07:07 UTC

Re: GSOC RDF Microformats Support

Hi Remzi - thanks! You may want to consider this as a Tika or
Any23 project since Nutch delegates its parsing to Tika (and
Any23 uses Tika [and vice versa] to handle micro formats).

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Remzi Düzağaç <re...@gmail.com>
Reply-To: "dev@nutch.apache.org" <de...@nutch.apache.org>
Date: Friday, March 27, 2015 at 5:07 AM
To: "dev@nutch.apache.org" <de...@nutch.apache.org>
Subject: GSOC RDF Microformats Support

>Hi Guys,
>
>
>I have sent a proposal to GSoC. I would like to add RDF microformat
>support to Nutch. I kindly ask for your support. Would anyone
>volunteer to be my mentor on this topic?
>
>
>Thank you very much
>


Re: GSOC RDF Microformats Support

Posted by Remzi Düzağaç <re...@gmail.com>.
Hi Chris,

Sorry for the late answer; I couldn't write because I was sick last week.
I have checked the links. To do this job I will need to use them, and I will.
I also need a mentor for the GSoC project. Would you consider
being my mentor?

On Sat, Mar 28, 2015 at 4:53 AM, Mattmann, Chris A (3980) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> Hi Remzi,
>
> Sure!
>
> Check out this 5-minute guide to writing a parser in Tika:
>
> https://tika.apache.org/1.7/parser_guide.html
>
>
> OK, so then check out Any23:
>
> http://any23.apache.org/
>
> It has support for parsing RDF Microformats. So, you
> may want to create a MicroformatsParser in Tika; then
> if it’s supported in Tika, it will in turn be available
> in Nutch and its parse-tika plugin if you upgrade it to
> the latest version of Tika.
>
> You can see how to do this here:
>
> http://s.apache.org/fsY
>
> Cheers and best of luck - hope that’s enough to get
> your proposal kicked off.
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattmann@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
>
> -----Original Message-----
> From: Remzi Düzağaç <re...@gmail.com>
> Reply-To: "dev@nutch.apache.org" <de...@nutch.apache.org>
> Date: Friday, March 27, 2015 at 7:22 AM
> To: dev <de...@nutch.apache.org>
> Cc: "dev@tika.apache.org" <de...@tika.apache.org>, "dev@any23.apache.org"
> <de...@any23.apache.org>
> Subject: Re: GSOC RDF Microformats Support
>
> >Hi Chris,
> >
> >
> >Thanks for your feedback.
> >I was planning to use Any23 and Tika, but I don't have a detailed grasp of
> >either project. I guess I'm going to need to dive into both.
> >
> >
> >I would appreciate it if you could guide me.
> >
> >
> >thanks
> >
> >On Fri, Mar 27, 2015 at 4:07 PM, Mattmann, Chris A (3980)
> ><ch...@jpl.nasa.gov> wrote:
> >
> >Hi Remzi - thanks! You may want to consider this as a Tika or
> >Any23 project since Nutch delegates its parsing to Tika (and
> >Any23 uses Tika [and vice versa] to handle micro formats).
> >
> >Cheers,
> >Chris
> >
> >++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >Chris Mattmann, Ph.D.
> >Chief Architect
> >Instrument Software and Science Data Systems Section (398)
> >NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >Office: 168-519, Mailstop: 168-527
> >Email: chris.a.mattmann@nasa.gov
> >WWW:  http://sunset.usc.edu/~mattmann/
> >++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >Adjunct Associate Professor, Computer Science Department
> >University of Southern California, Los Angeles, CA 90089 USA
> >++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> >
> >
> >
> >
> >
> >-----Original Message-----
> >From: Remzi Düzağaç <re...@gmail.com>
> >Reply-To: "dev@nutch.apache.org" <de...@nutch.apache.org>
> >Date: Friday, March 27, 2015 at 5:07 AM
> >To: "dev@nutch.apache.org" <de...@nutch.apache.org>
> >Subject: GSOC RDF Microformats Support
> >
> >>Hi Guys,
> >>
> >>
> >>I have sent a proposal to GSoC. I would like to add RDF microformat
> >>support to Nutch. I kindly ask for your support. Would anyone
> >>volunteer to be my mentor on this topic?
> >>
> >>
> >>Thank you very much
> >>
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
>
>

Re: GSOC RDF Microformats Support

Posted by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov>.
Hi Remzi,

Sure!

Check out this 5-minute guide to writing a parser in Tika:

https://tika.apache.org/1.7/parser_guide.html


OK, so then check out Any23:

http://any23.apache.org/

It has support for parsing RDF Microformats. So, you
may want to create a MicroformatsParser in Tika; then
if it’s supported in Tika, it will in turn be available
in Nutch and its parse-tika plugin if you upgrade it to
the latest version of Tika.
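
To make concrete what a microformats parser would extract, here is a minimal sketch in Python using only the standard library. It is not Tika or Any23 code and does not use their APIs; the class name HCardSketchParser, the helper extract_hcard, the small property set, and the sample markup are all assumptions made for illustration. It only collects the text of a few hCard-style properties found inside an element marked class="vcard".

```python
from html.parser import HTMLParser

# Hypothetical subset of hCard property class names, chosen for illustration.
HCARD_PROPERTIES = {"fn", "org", "email", "url"}

# Void elements never get an end tag, so they are kept off the element stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HCardSketchParser(HTMLParser):
    """Collects (property, text) pairs found inside class="vcard" elements."""

    def __init__(self):
        super().__init__()
        self.properties = []      # extracted (property, text) pairs
        self._stack = []          # per open element: matched property or None
        self._vcard_depth = None  # stack depth at which the current vcard opened

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._vcard_depth is None and "vcard" in classes:
            self._vcard_depth = len(self._stack)
        prop = None
        if self._vcard_depth is not None:
            # Take the first class name that is a known property, if any.
            prop = next((c for c in classes if c in HCARD_PROPERTIES), None)
        self._stack.append(prop)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._stack:
            return
        self._stack.pop()
        # Leaving the element that opened the vcard ends the vcard scope.
        if self._vcard_depth is not None and len(self._stack) <= self._vcard_depth:
            self._vcard_depth = None

    def handle_data(self, data):
        text = data.strip()
        if text and self._stack and self._stack[-1]:
            self.properties.append((self._stack[-1], text))


def extract_hcard(html):
    """Return the (property, text) pairs found in the given HTML string."""
    parser = HCardSketchParser()
    parser.feed(html)
    parser.close()
    return parser.properties
```

A parser wrapped in Tika would have to emit such pairs as metadata or RDF statements; this sketch stops at a plain list so that the extraction step itself stays visible.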

You can see how to do this here:

http://s.apache.org/fsY

Cheers and best of luck - hope that’s enough to get
your proposal kicked off.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Remzi Düzağaç <re...@gmail.com>
Reply-To: "dev@nutch.apache.org" <de...@nutch.apache.org>
Date: Friday, March 27, 2015 at 7:22 AM
To: dev <de...@nutch.apache.org>
Cc: "dev@tika.apache.org" <de...@tika.apache.org>, "dev@any23.apache.org"
<de...@any23.apache.org>
Subject: Re: GSOC RDF Microformats Support

>Hi Chris,
>
>
>Thanks for your feedback.
>I was planning to use Any23 and Tika, but I don't have a detailed grasp of
>either project. I guess I'm going to need to dive into both.
>
>
>I would appreciate it if you could guide me.
>
>
>thanks
>
>On Fri, Mar 27, 2015 at 4:07 PM, Mattmann, Chris A (3980)
><ch...@jpl.nasa.gov> wrote:
>
>Hi Remzi - thanks! You may want to consider this as a Tika or
>Any23 project since Nutch delegates its parsing to Tika (and
>Any23 uses Tika [and vice versa] to handle micro formats).
>
>Cheers,
>Chris
>
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Chris Mattmann, Ph.D.
>Chief Architect
>Instrument Software and Science Data Systems Section (398)
>NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>Office: 168-519, Mailstop: 168-527
>Email: chris.a.mattmann@nasa.gov
>WWW:  http://sunset.usc.edu/~mattmann/
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Adjunct Associate Professor, Computer Science Department
>University of Southern California, Los Angeles, CA 90089 USA
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
>
>-----Original Message-----
>From: Remzi Düzağaç <re...@gmail.com>
>Reply-To: "dev@nutch.apache.org" <de...@nutch.apache.org>
>Date: Friday, March 27, 2015 at 5:07 AM
>To: "dev@nutch.apache.org" <de...@nutch.apache.org>
>Subject: GSOC RDF Microformats Support
>
>>Hi Guys,
>>
>>
>>I have sent a proposal to GSoC. I would like to add RDF microformat
>>support to Nutch. I kindly ask for your support. Would anyone
>>volunteer to be my mentor on this topic?
>>
>>
>>Thank you very much
>>
>
>
>
>
>
>
>
>
>
>
>


Re: GSOC RDF Microformats Support

Posted by Remzi Düzağaç <re...@gmail.com>.
Hi Chris,

Thanks for your feedback.
I was planning to use Any23 and Tika, but I don't have a detailed grasp of
either project. I guess I'm going to need to dive into both.
I would appreciate it if you could guide me.

thanks

On Fri, Mar 27, 2015 at 4:07 PM, Mattmann, Chris A (3980) <
chris.a.mattmann@jpl.nasa.gov> wrote:

> Hi Remzi - thanks! You may want to consider this as a Tika or
> Any23 project since Nutch delegates its parsing to Tika (and
> Any23 uses Tika [and vice versa] to handle micro formats).
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattmann@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
>
> -----Original Message-----
> From: Remzi Düzağaç <re...@gmail.com>
> Reply-To: "dev@nutch.apache.org" <de...@nutch.apache.org>
> Date: Friday, March 27, 2015 at 5:07 AM
> To: "dev@nutch.apache.org" <de...@nutch.apache.org>
> Subject: GSOC RDF Microformats Support
>
> >Hi Guys,
> >
> >
> >I have sent a proposal to GSoC. I would like to add RDF microformat
> >support to Nutch. I kindly ask for your support. Would anyone
> >volunteer to be my mentor on this topic?
> >
> >
> >Thank you very much
> >
>
>
