Posted to user@nutch.apache.org by lewis john mcgibbney <le...@apache.org> on 2013/03/12 23:43:05 UTC

[WELCOME] Feng Lu as Apache Nutch PMC and Committer

Hi Everyone,

On behalf of the Nutch PMC I would like to announce and welcome Feng Lu on
board as PMC and Committer on the project.
Amongst others, Feng has been an important part of the Nutch development
over the last while and we would like to welcome him.

@Feng,
Please feel free to say a bit about yourself, your involvement and use case
for Nutch or anything else.

Thank you very much.
Lewis

RE: [WELCOME] Feng Lu as Apache Nutch PMC and Committer

Posted by Markus Jelsma <ma...@openindex.io>.
Feng Lu, welcome! :)

 
 
-----Original message-----
> From:Julien Nioche <li...@gmail.com>
> Sent: Mon 18-Mar-2013 13:23
> To: user@nutch.apache.org
> Cc: dev@nutch.apache.org
> Subject: Re: [WELCOME] Feng Lu as Apache Nutch PMC and Committer
> 
> Hi Feng, 
> 
> Congratulations on becoming a committer and welcome! 
>  
> [...]
> 
>  
>> A question that has troubled me for a long time is what the goal of
>> Nutch 1.x is. Is Nutch 1.x just a transitional version on the way to Nutch 2.x, or
>> can they coexist because Nutch 1.x has a different data processing method
>> from Nutch 2.x?
> 
> The latter. It's not so much the processing method that differs, as they are very similar, but the way the data are stored.
>  
>> As Julien said, Nutch 1.x is great for batch processing and
>> 2.x for large scale processing.
> 
> Hmm, I don't think I said that. Both are batch orientated and 1.x is probably better at large scale processing than 2.x (at least currently).
>  
>> Perhaps, with more and more people using NoSQL as
>> their back-end DB, the developers should focus more on the development of
>> Nutch 2.x, to ensure its stability and improve its functionality.
> 
> IMHO it's not that the developers should focus on this or that. I see it more as an evolutionary process where things get improved because they are used in the first place or get derelict and abandoned if there is no interest from users. If, as you say, people prefer to have a SQL backend instead of the sequential HDFS data structures then there will be more contributions and as a result 2.x will be improved.
> 
> Julien
>  
> 
> -- 
> Open Source Solutions for Text Engineering
> 
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
> http://twitter.com/digitalpebble
> 

Re: [WELCOME] Feng Lu as Apache Nutch PMC and Committer

Posted by kiran chitturi <ch...@gmail.com>.
> IMHO it's not that the developers *should* focus on this or that. I see it
> more as an evolutionary process where things get improved because they are
> used in the first place or get derelict and abandoned if there is no
> interest from users. If, as you say, people prefer to have a SQL backend
> instead of the sequential HDFS data structures then there will be more
> contributions and as a result 2.x will be improved.
>
> Julien
>
Yes, this is a great point for Nutch 2.x. I think there are a lot of potential
users for the SQL store in 2.x, since 2.x started supporting backends for
Nutch. This was my initial idea when I started out, but I settled on 1.6
in the end :)


>
> --
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
> http://twitter.com/digitalpebble
>



-- 
Kiran Chitturi

<http://www.linkedin.com/in/kiranchitturi>

Re: [WELCOME] Feng Lu as Apache Nutch PMC and Committer

Posted by Julien Nioche <li...@gmail.com>.
Hi Feng,

Congratulations on becoming a committer and welcome!

[...]



> A question that has troubled me for a long time is what the goal of
> Nutch 1.x is. Is Nutch 1.x just a transitional version on the way to Nutch 2.x, or
> can they coexist because Nutch 1.x has a different data processing method
> from Nutch 2.x?


The latter. It's not so much the processing method that differs, as they are
very similar, but the way the data are stored.


> As Julien said, Nutch 1.x is great for batch processing and
> 2.x for large scale processing.


Hmm, I don't think I said that. Both are batch orientated and 1.x is
probably better at large scale processing than 2.x (at least currently).


> Perhaps, with more and more people using NoSQL as
> their back-end DB, the developers should focus more on the development of
> Nutch 2.x, to ensure its stability and improve its functionality.


IMHO it's not that the developers *should* focus on this or that. I see it
more as an evolutionary process where things get improved because they are
used in the first place or get derelict and abandoned if there is no
interest from users. If, as you say, people prefer to have a SQL backend
instead of the sequential HDFS data structures then there will be more
contributions and as a result 2.x will be improved.

Julien


-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble

Re: [WELCOME] Feng Lu as Apache Nutch PMC and Committer

Posted by feng lu <am...@gmail.com>.
Thanks a lot to everyone for inviting me.

I'm a software engineer in China, and I have been using Apache Nutch for three
years. In our team, I am mainly responsible for modifying Nutch 1.x to suit
the requirements of our database, MongoDB. I have also written a simple database
abstraction layer, like Apache Gora, to adapt to different databases. In the
process, I found that I like @user, @dev and @jira more and more,
because in these places I can get help from others, and others can
get help from me. Finally, I am also very pleased to make some contributions
to Apache Nutch.

A question that has troubled me for a long time is what the goal of
Nutch 1.x is. Is Nutch 1.x just a transitional version on the way to Nutch 2.x, or
can they coexist because Nutch 1.x has a different data processing method
from Nutch 2.x? As Julien said, Nutch 1.x is great for batch processing and
2.x for large scale processing. Perhaps, with more and more people using NoSQL as
their back-end DB, the developers should focus more on the development of
Nutch 2.x, to ensure its stability and improve its functionality.

Best Regards
Feng

Re: [WELCOME] Feng Lu as Apache Nutch PMC and Committer

Posted by kiran chitturi <ch...@gmail.com>.
Congrats Feng. Welcome onboard.


On Tue, Mar 12, 2013 at 6:43 PM, lewis john mcgibbney <le...@apache.org> wrote:

> Hi Everyone,
>
> On behalf of the Nutch PMC I would like to announce and welcome Feng Lu on
> board as PMC and Committer on the project.
> Amongst others, Feng has been an important part of the Nutch development
> over the last while and we would like to welcome him.
>
> @Feng,
> Please feel free to say a bit about yourself, your involvement and use case
> for Nutch or anything else.
>
> Thank you very much.
> Lewis
>



-- 
Kiran Chitturi

<http://www.linkedin.com/in/kiranchitturi>