Posted to user@nutch.apache.org by American Jeff Bowden <jl...@houseofdistraction.com> on 2005/08/24 23:04:44 UTC

Re: [Nutch-general] RE: RSS Feed Parser

Where can I obtain the source of commons-feedparser-0.6-fork.jar?  It 
doesn't appear to be in commons svn or on the feedparser site.

Chris Mattmann wrote:

>Hi Zaheed,
>
> Thanks for the nice comments. I went ahead and wrote an HTML page that
>summarizes what I sent to Zaheed with respect to installing the parse-rss
>plugin. You can find the short guide here:
>
>http://www-scf.usc.edu/~mattmann/parse-rss-install.html
>
>
>Thanks,
>  Chris
>
>
>______________________________________________
>Chris A. Mattmann
>Chris.Mattmann@jpl.nasa.gov 
>Staff Member
>Modeling and Data Management Systems Section (387)
>Data Management Systems and Technologies Group
>
>_________________________________________________
>Jet Propulsion Laboratory            Pasadena, CA
>Office: 171-266B                        Mailstop:  171-246
>_______________________________________________________
>
>Disclaimer:  The opinions presented within are my own and do not reflect
>those of either NASA, JPL, or the California Institute of Technology.
>
>
>
>  
>
>>-----Original Message-----
>>From: Zaheed Haque [mailto:zaheed.haque@gmail.com]
>>Sent: Thursday, August 11, 2005 11:49 AM
>>To: nutch-user@lucene.apache.org
>>Subject: RSS Feed Parser
>>
>>Hello:
>>
>>I am really hoping that Chris Mattmann's RSS parser will make it into the
>>0.7 release.
>>
>>http://issues.apache.org/jira/browse/NUTCH-30
>>
>>I got it working from last night's SVN. I believe newbie users like me
>>would benefit very much from having it as part of the distribution. +1
>>for this plugin!
>>
>>Thanks Chris for solving my problem!!
>>--
>>Best Regards
>>Zaheed Haque
>>    
>>
>
>
>
>_______________________________________________
>Nutch-general mailing list
>Nutch-general@lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/nutch-general
>  
>


RE: [Nutch-general] RE: RSS Feed Parser

Posted by Chris Mattmann <ch...@jpl.nasa.gov>.
Hi Jeff,
 
  Yup, that's correct.

Thanks,
 Chris




> -----Original Message-----
> From: American Jeff Bowden [mailto:jlb@houseofdistraction.com]
> Sent: Thursday, August 25, 2005 12:37 PM
> To: user@nutch.org
> Cc: chris.mattmann@jpl.nasa.gov
> Subject: Re: [Nutch-general] RE: RSS Feed Parser
> 
> I notice that build.xml still creates commons-feedparser-0.5.0-RC1.jar
> but I'll assume you're just renaming it manually to -0.6-fork.
> 
> Thanks.
> 
> 


Re: [Nutch-general] RE: RSS Feed Parser

Posted by American Jeff Bowden <jl...@houseofdistraction.com>.
I notice that build.xml still creates commons-feedparser-0.5.0-RC1.jar 
but I'll assume you're just renaming it manually to -0.6-fork.

Thanks.
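
For anyone who would rather script the manual rename than do it by hand, here is a minimal Python sketch. It assumes the build leaves commons-feedparser-0.5.0-RC1.jar in some output directory and that the copy should land in a lib directory under the -0.6-fork name. The function name publish_fork_jar and both directory arguments are hypothetical and are not part of the feedparser or Nutch build.

```python
import shutil
from pathlib import Path


def publish_fork_jar(build_dir, dest_dir,
                     built_name="commons-feedparser-0.5.0-RC1.jar",
                     fork_name="commons-feedparser-0.6-fork.jar"):
    """Copy the jar produced by the build into dest_dir under the fork name.

    build_dir and dest_dir are assumptions for this sketch; point them at
    wherever your build writes its jar and wherever the plugin expects it.
    """
    source = Path(build_dir) / built_name
    if not source.is_file():
        raise FileNotFoundError(f"expected build output at {source}")
    target = Path(dest_dir) / fork_name
    # Create the destination directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    # copy2 keeps the original jar in place and preserves its timestamps.
    shutil.copy2(source, target)
    return target
```

Copying rather than moving leaves the original build output untouched, so a later rebuild and re-copy behaves the same way.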


Chris Mattmann wrote:

>Hi Jeff,
>
> Okay, here is the link to commons-feedparser source that includes my
>modifications:
>
>http://www-scf.usc.edu/~mattmann/feedparser-0.6-fork-src.zip
>
>
>Thanks!
>
>Cheers,
>  Chris
>
>


RE: [Nutch-general] RE: RSS Feed Parser

Posted by Chris Mattmann <ch...@jpl.nasa.gov>.
Hi Jeff,

 Okay, here is the link to commons-feedparser source that includes my
modifications:

http://www-scf.usc.edu/~mattmann/feedparser-0.6-fork-src.zip


Thanks!

Cheers,
  Chris





> -----Original Message-----
> From: Jeff Bowden [mailto:jlb@houseofdistraction.com]
> Sent: Wednesday, August 24, 2005 10:45 PM
> To: user@nutch.org; chris.mattmann@jpl.nasa.gov
> Subject: Re: [Nutch-general] RE: RSS Feed Parser
> 
> Yes please, that would be great.  I couldn't even figure out where to
> find the 0.6 version of feedparser, much less your patches to it.
> 


Re: [Nutch-general] RE: RSS Feed Parser

Posted by Jeff Bowden <jl...@houseofdistraction.com>.
Yes please, that would be great.  I couldn't even figure out where to 
find the 0.6 version of feedparser, much less your patches to it.

Chris Mattmann wrote:

>Hi Jeff,
>
>   commons-feedparser-fork is a branch I made of the feedparser 0.6 code
>base. It removes some of the jar files that ship with the standard
>feedparser 0.6 distribution and that conflicted with the jar files in
>Nutch's lib directory. Specifically, I changed it so that the core jaxen
>libraries the feed parser relies on use jdom rather than dom4j (see the
>postings on the Nutch list around March 2005 between John X, Stefan G. and
>me). This required changing about 9 or 10 of the feedparser source files to
>use the jdom Node classes rather than the dom4j ones.
>
>If you like, I can put up a link to the feedparser forked code on my
>website, and post the link to the list.
>
>Thanks,
>  Chris
>
>
>


Re: [Nutch-general] RE: RSS Feed Parser

Posted by Chris Mattmann <ch...@jpl.nasa.gov>.
Hi Jeff,

   commons-feedparser-fork is a branch I made of the feedparser 0.6 code
base. It removes some of the jar files that ship with the standard
feedparser 0.6 distribution and that conflicted with the jar files in
Nutch's lib directory. Specifically, I changed it so that the core jaxen
libraries the feed parser relies on use jdom rather than dom4j (see the
postings on the Nutch list around March 2005 between John X, Stefan G. and
me). This required changing about 9 or 10 of the feedparser source files to
use the jdom Node classes rather than the dom4j ones.
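
To make the kind of jar conflict described above concrete, here is a minimal Python sketch that compares two lists of jar file names and reports artifacts that appear on both sides with different versions. It is only an illustration, not how the fork was actually produced. The "name-version.jar" naming convention, the helper names split_jar and find_conflicts, and the file names in the usage note are all assumptions made up for this example.

```python
import re

# Assumed convention for this sketch: "<name>-<version>.jar", where the
# version starts with a digit. Real jar names do not always follow this.
_JAR_RE = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def split_jar(filename):
    """Return (artifact name, version), or (stem, None) if no version is found."""
    match = _JAR_RE.match(filename)
    if match:
        return match.group("name"), match.group("version")
    stem = filename[:-4] if filename.endswith(".jar") else filename
    return stem, None


def find_conflicts(host_jars, plugin_jars):
    """List (name, host version, plugin version) for artifacts shipped by
    both sides in different versions."""
    host = dict(split_jar(jar) for jar in host_jars)
    conflicts = []
    for jar in plugin_jars:
        name, version = split_jar(jar)
        if name in host and host[name] != version:
            conflicts.append((name, host[name], version))
    return sorted(conflicts, key=lambda conflict: conflict[0])
```

With made-up inputs such as find_conflicts(["jaxen-1.1.jar", "jdom-1.0.jar"], ["jaxen-1.0.jar", "dom4j-1.4.jar"]), the only artifact reported is jaxen, because it is the only name present in both lists with differing versions. Pass it the real listings of Nutch's lib directory and the feedparser distribution to check an actual setup.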

If you like, I can put up a link to the feedparser forked code on my
website, and post the link to the list.

Thanks,
  Chris



On 8/24/05 2:04 PM, "American Jeff Bowden" <jl...@houseofdistraction.com>
wrote:

> Where can I obtain the source of commons-feedparser-0.6-fork.jar?  It
> doesn't appear to be in commons svn or on the feedparser site.
> 