Posted to dev@hc.apache.org by Thorsten Scherler <th...@juntadeandalucia.es> on 2008/02/21 09:16:46 UTC

Apache Droids using Norbert

Hi all,

I wrote some basic documentation for Droids (so far only about
installing and running it). It can be found in the trunk [1], but for
your convenience I have also added it to my Apache space:
http://people.apache.org/~thorsten/droids/

I will now write some more about extending Droids, and I may find the
time to write a basic Solr droid (I expect there will be some interest
from the Apache Solr community).

BTW, the current version incorporates Norbert - the (no)robots.txt
parser - which was mentioned by Roland Weber and is part of the
HttpComponents sandbox.

Is there still interest in hosting Droids and sponsoring it in
incubation, as mentioned in [2]?

Hope you enjoy.

[1] http://svn.apache.org/repos/asf/labs/droids/trunk/
[2] http://labs.markmail.org/message/qu72r7scsbvifcsu?q=norbert

salu2
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@hc.apache.org
For additional commands, e-mail: dev-help@hc.apache.org


Re: [Proposal] Apache Droids as subproject of HttpComponents (was Re: Apache Droids using Norbert)

Posted by Thorsten Scherler <th...@apache.org>.
On Wed, 2008-08-27 at 13:32 +0200, Thorsten Scherler wrote:
> On Mon, 2008-02-25 at 13:35 +0100, Thorsten Scherler wrote:
> > On Sat, 2008-02-23 at 09:16 +0100, Roland Weber wrote:
> > > Hi Thorsten,
> > > 
> > > > Is there are still interest to host Droids and sponsor it in incubation
> > > > as mentioned in [2]?
> > > 
> > > Definitely :-) Meanwhile, we have created a charter [a] and Droids fits
> > > the "build upon and extend" clause. We will expect it to work well with
> > > HttpClient 4 eventually, but that doesn't have to be exclusive.
> ...
> > 
> > I guess shortly I will finish the docu and a simple example (that I
> > develop for the documentation) and we can then decide how we will
> > procedure.
> 
> I consider the current development stand of Apache Droids as stable
> after enhancing the multi-thread feature and finishing the default
> implementation. The default droid and worker is a simple crawler that
> scraps a webpage and saves the resources to disk. 
> 
> It is highly extensible and in my current work project I am using 5
> different droids that are running very smooth. This droids are extending
> the default implementation and are adding business specific logic.
> 
> Some of this droids are connecting to another web site to get some data
> and invoke then the parsing of this data to generate an internal
> representation of them and send them to an Apache Solr server. Another
> droids simply crawls a file system to edit specific files with the
> result of a database query. 
> 
> I like to start the move to the incubator with Droids with the
> HttpComponents project sponsoring it.

Here are some links:
docu: http://people.apache.org/~thorsten/droids/
svn: https://svn.apache.org/repos/asf/labs/droids/trunk

salu2

> 
> WDYT?
> 
> salu2
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions




Re: [Proposal] Apache Droids as subproject of HttpComponents (was Re: Apache Droids using Norbert)

Posted by Erik Abele <er...@apache.org>.
On Aug 27, 2008, at 19:35:52, Oleg Kalnichevski wrote:

> Thorsten Scherler wrote:
>> On Mon, 2008-02-25 at 13:35 +0100, Thorsten Scherler wrote:
>>> On Sat, 2008-02-23 at 09:16 +0100, Roland Weber wrote:
>>>> Hi Thorsten,
>>>>
>>>>> Is there are still interest to host Droids and sponsor it in  
>>>>> incubation
>>>>> as mentioned in [2]?
>>>> Definitely :-) Meanwhile, we have created a charter [a] and  
>>>> Droids fits
>>>> the "build upon and extend" clause. We will expect it to work  
>>>> well with
>>>> HttpClient 4 eventually, but that doesn't have to be exclusive.
>> ...
>>> I guess shortly I will finish the docu and a simple example (that I
>>> develop for the documentation) and we can then decide how we will
>>> procedure.
>> I consider the current development stand of Apache Droids as stable
>> after enhancing the multi-thread feature and finishing the default
>> implementation. The default droid and worker is a simple crawler that
>> scraps a webpage and saves the resources to disk. It is highly  
>> extensible and in my current work project I am using 5
>> different droids that are running very smooth. This droids are  
>> extending
>> the default implementation and are adding business specific logic.
>> Some of this droids are connecting to another web site to get some  
>> data
>> and invoke then the parsing of this data to generate an internal
>> representation of them and send them to an Apache Solr server.  
>> Another
>> droids simply crawls a file system to edit specific files with the
>> result of a database query. I like to start the move to the  
>> incubator with Droids with the
>> HttpComponents project sponsoring it.
>
> +1 to HC sponsoring Droids' incubation.

+1 here too.

> I am willing to participate in the incubation process and help  
> integrate Droids into HC.

Unfortunately I won't be able to help with the incubation process  
myself but I'm certainly happy to foster the final integration into HC  
if it turns out to be the best home for Droids...

Cheers,
Erik




Re: [Proposal] Apache Droids as subproject of HttpComponents (was Re: Apache Droids using Norbert)

Posted by Paul Fremantle <pz...@gmail.com>.
I will also help with incubation if you like.

Paul

On Mon, Sep 1, 2008 at 8:17 AM, ant elder <an...@gmail.com> wrote:
> On Wed, Aug 27, 2008 at 6:35 PM, Oleg Kalnichevski <ol...@apache.org> wrote:
>
>> Thorsten Scherler wrote:
>>
>>> On Mon, 2008-02-25 at 13:35 +0100, Thorsten Scherler wrote:
>>>
>>>> On Sat, 2008-02-23 at 09:16 +0100, Roland Weber wrote:
>>>>
>>>>> Hi Thorsten,
>>>>>
>>>>>  Is there are still interest to host Droids and sponsor it in incubation
>>>>>> as mentioned in [2]?
>>>>>>
>>>>> Definitely :-) Meanwhile, we have created a charter [a] and Droids fits
>>>>> the "build upon and extend" clause. We will expect it to work well with
>>>>> HttpClient 4 eventually, but that doesn't have to be exclusive.
>>>>>
>>>> ...
>>>
>>>> I guess shortly I will finish the docu and a simple example (that I
>>>> develop for the documentation) and we can then decide how we will
>>>> procedure.
>>>>
>>>
>>> I consider the current development stand of Apache Droids as stable
>>> after enhancing the multi-thread feature and finishing the default
>>> implementation. The default droid and worker is a simple crawler that
>>> scraps a webpage and saves the resources to disk.
>>> It is highly extensible and in my current work project I am using 5
>>> different droids that are running very smooth. This droids are extending
>>> the default implementation and are adding business specific logic.
>>>
>>> Some of this droids are connecting to another web site to get some data
>>> and invoke then the parsing of this data to generate an internal
>>> representation of them and send them to an Apache Solr server. Another
>>> droids simply crawls a file system to edit specific files with the
>>> result of a database query.
>>> I like to start the move to the incubator with Droids with the
>>> HttpComponents project sponsoring it.
>>>
>>>
>> +1 to HC sponsoring Droids' incubation. I am willing to participate in the
>> incubation process and help integrate Droids into HC.
>>
>> Cheers
>>
>> Oleg
>>
>>
> Sounds good to me.
>
>   ...ant
>



-- 
Paul Fremantle
Co-Founder and CTO, WSO2
Apache Synapse PMC Chair
OASIS WS-RX TC Co-chair

blog: http://pzf.fremantle.org
paul@wso2.com

"Oxygenating the Web Service Platform", www.wso2.com



Re: [Proposal] Apache Droids as subproject of HttpComponents (was Re: Apache Droids using Norbert)

Posted by ant elder <an...@gmail.com>.
On Wed, Aug 27, 2008 at 6:35 PM, Oleg Kalnichevski <ol...@apache.org> wrote:

> Thorsten Scherler wrote:
>
>> On Mon, 2008-02-25 at 13:35 +0100, Thorsten Scherler wrote:
>>
>>> On Sat, 2008-02-23 at 09:16 +0100, Roland Weber wrote:
>>>
>>>> Hi Thorsten,
>>>>
>>>>  Is there are still interest to host Droids and sponsor it in incubation
>>>>> as mentioned in [2]?
>>>>>
>>>> Definitely :-) Meanwhile, we have created a charter [a] and Droids fits
>>>> the "build upon and extend" clause. We will expect it to work well with
>>>> HttpClient 4 eventually, but that doesn't have to be exclusive.
>>>>
>>> ...
>>
>>> I guess shortly I will finish the docu and a simple example (that I
>>> develop for the documentation) and we can then decide how we will
>>> procedure.
>>>
>>
>> I consider the current development stand of Apache Droids as stable
>> after enhancing the multi-thread feature and finishing the default
>> implementation. The default droid and worker is a simple crawler that
>> scraps a webpage and saves the resources to disk.
>> It is highly extensible and in my current work project I am using 5
>> different droids that are running very smooth. This droids are extending
>> the default implementation and are adding business specific logic.
>>
>> Some of this droids are connecting to another web site to get some data
>> and invoke then the parsing of this data to generate an internal
>> representation of them and send them to an Apache Solr server. Another
>> droids simply crawls a file system to edit specific files with the
>> result of a database query.
>> I like to start the move to the incubator with Droids with the
>> HttpComponents project sponsoring it.
>>
>>
> +1 to HC sponsoring Droids' incubation. I am willing to participate in the
> incubation process and help integrate Droids into HC.
>
> Cheers
>
> Oleg
>
>
Sounds good to me.

   ...ant

Re: [Proposal] Apache Droids as subproject of HttpComponents (was Re: Apache Droids using Norbert)

Posted by Oleg Kalnichevski <ol...@apache.org>.
Thorsten Scherler wrote:
> On Mon, 2008-02-25 at 13:35 +0100, Thorsten Scherler wrote:
>> On Sat, 2008-02-23 at 09:16 +0100, Roland Weber wrote:
>>> Hi Thorsten,
>>>
>>>> Is there are still interest to host Droids and sponsor it in incubation
>>>> as mentioned in [2]?
>>> Definitely :-) Meanwhile, we have created a charter [a] and Droids fits
>>> the "build upon and extend" clause. We will expect it to work well with
>>> HttpClient 4 eventually, but that doesn't have to be exclusive.
> ...
>> I guess shortly I will finish the docu and a simple example (that I
>> develop for the documentation) and we can then decide how we will
>> procedure.
> 
> I consider the current development stand of Apache Droids as stable
> after enhancing the multi-thread feature and finishing the default
> implementation. The default droid and worker is a simple crawler that
> scraps a webpage and saves the resources to disk. 
> 
> It is highly extensible and in my current work project I am using 5
> different droids that are running very smooth. This droids are extending
> the default implementation and are adding business specific logic.
> 
> Some of this droids are connecting to another web site to get some data
> and invoke then the parsing of this data to generate an internal
> representation of them and send them to an Apache Solr server. Another
> droids simply crawls a file system to edit specific files with the
> result of a database query. 
> 
> I like to start the move to the incubator with Droids with the
> HttpComponents project sponsoring it.
> 

+1 to HC sponsoring Droids' incubation. I am willing to participate in 
the incubation process and help integrate Droids into HC.

Cheers

Oleg


> WDYT?
> 
> salu2




[Proposal] Apache Droids as subproject of HttpComponents (was Re: Apache Droids using Norbert)

Posted by Thorsten Scherler <th...@juntadeandalucia.es>.
On Mon, 2008-02-25 at 13:35 +0100, Thorsten Scherler wrote:
> On Sat, 2008-02-23 at 09:16 +0100, Roland Weber wrote:
> > Hi Thorsten,
> > 
> > > Is there are still interest to host Droids and sponsor it in incubation
> > > as mentioned in [2]?
> > 
> > Definitely :-) Meanwhile, we have created a charter [a] and Droids fits
> > the "build upon and extend" clause. We will expect it to work well with
> > HttpClient 4 eventually, but that doesn't have to be exclusive.
...
> 
> I guess shortly I will finish the docu and a simple example (that I
> develop for the documentation) and we can then decide how we will
> procedure.

I consider the current development state of Apache Droids stable
after enhancing the multi-threading feature and finishing the default
implementation. The default droid and worker form a simple crawler that
scrapes a web page and saves the resources to disk.

It is highly extensible, and in my current work project I am using 5
different droids that are running very smoothly. These droids extend
the default implementation and add business-specific logic.

Some of these droids connect to another web site to fetch data, then
parse that data to generate an internal representation of it and send
it to an Apache Solr server. Another droid simply crawls a file system
to edit specific files with the result of a database query.

I would like to start moving Droids to the incubator, with the
HttpComponents project sponsoring it.

WDYT?

salu2
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions




Re: Apache Droids using Norbert

Posted by Thorsten Scherler <th...@juntadeandalucia.es>.
On Sat, 2008-02-23 at 09:16 +0100, Roland Weber wrote:
> Hi Thorsten,
> 
> > Is there are still interest to host Droids and sponsor it in incubation
> > as mentioned in [2]?
> 
> Definitely :-) Meanwhile, we have created a charter [a] and Droids fits
> the "build upon and extend" clause. We will expect it to work well with
> HttpClient 4 eventually, but that doesn't have to be exclusive.

So far the Spring-based version does not use HttpClient, but this is
just a matter of time, since the former version used HttpClient as
well.

I guess I will shortly finish the docs and a simple example (that I am
developing for the documentation), and we can then decide how we will
proceed.

salu2

> cheers,
>    Roland
> 
> [a] http://hc.apache.org/charter.html
> 
> > 
> > Hope you enjoy.
> > 
> > [1] http://svn.apache.org/repos/asf/labs/droids/trunk/
> > [2] http://labs.markmail.org/message/qu72r7scsbvifcsu?q=norbert
> > 
> > salu2
> 
> 
> 
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions




Re: Apache Droids using Norbert

Posted by Roland Weber <os...@dubioso.net>.
Hi Thorsten,

> Is there are still interest to host Droids and sponsor it in incubation
> as mentioned in [2]?

Definitely :-) Meanwhile, we have created a charter [a] and Droids fits
the "build upon and extend" clause. We will expect it to work well with
HttpClient 4 eventually, but that doesn't have to be exclusive.

cheers,
   Roland

[a] http://hc.apache.org/charter.html

> 
> Hope you enjoy.
> 
> [1] http://svn.apache.org/repos/asf/labs/droids/trunk/
> [2] http://labs.markmail.org/message/qu72r7scsbvifcsu?q=norbert
> 
> salu2




Re: Apache Droids using Norbert

Posted by Thorsten Scherler <th...@juntadeandalucia.es>.
On Thu, 2008-02-21 at 12:30 +0100, Oleg Kalnichevski wrote:
> On Thu, 2008-02-21 at 09:16 +0100, Thorsten Scherler wrote:
> > Hi all,
> > 
> > I wrote some basic documentation for droids (till now just about install
> > and run). It can be found in the trunk [1] but for your comfort I added
> > it as well to my apache space:
> > http://people.apache.org/~thorsten/droids/
> > 
> > I will now write some more lines about extending droids and may find the
> > time to write a basic solr droid (I expect there will be some interest
> > from the Apache Solr community).
> > 
> > BTW the current version is incorporating Norbert - the (no)robots.txt
> > parser - which was mentioned by Roland Weber and is part of the sandbox
> > of HttpComponents.
> > 
> > Is there are still interest to host Droids and sponsor it in incubation
> > as mentioned in [2]?
> > 
> 
> Absolutely. I also would like to see Norbert eventually absorbed into
> Droids, as it does not seem to have a future as a standalone component.

I actually did that already because I had to patch the code:
http://svn.apache.org/repos/asf/labs/droids/trunk/src/core/java/org/apache/http/norobots/

In the end I created a helper class (UrlHelper) and a method
(findRobotsUrl), which would fit better in the above package.
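
For illustration only, here is a minimal sketch of what a method like
findRobotsUrl could look like, assuming it simply derives the robots.txt
location from the scheme, host and port of a page URL. The class name
RobotsUrlSketch and the method body are assumptions made for this
example; the actual UrlHelper code in the Droids trunk may differ.

```java
import java.net.URI;

// Hypothetical stand-in for the UrlHelper class mentioned above; not the Droids code.
final class RobotsUrlSketch {

    // Assumed behaviour: robots.txt sits at the root of the site that served the page,
    // so resolving the absolute path "/robots.txt" against the page URI keeps the
    // scheme, host and port and drops the page's own path and query.
    static URI findRobotsUrl(URI page) {
        return page.resolve("/robots.txt");
    }

    public static void main(String[] args) {
        URI page = URI.create("http://people.apache.org/~thorsten/droids/");
        System.out.println(findRobotsUrl(page));
    }
}
```

A crawler would typically fetch this URL once per host and hand the
response to the robots.txt parser before requesting any other page on
that host.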

I will now write more about extending the default droid jar with custom
components and how to use it in a Java app. After that I will
concentrate on creating a roadmap.

Thanks for your interest and fast reply.

salu2

> 
> Oleg
> 
> > Hope you enjoy.
> > 
> > [1] http://svn.apache.org/repos/asf/labs/droids/trunk/
> > [2] http://labs.markmail.org/message/qu72r7scsbvifcsu?q=norbert
> > 
> > salu2
> 
> 
> 
-- 
Thorsten Scherler                                 thorsten.at.apache.org
Open Source Java                      consulting, training and solutions




Re: Apache Droids using Norbert

Posted by Oleg Kalnichevski <ol...@apache.org>.
On Thu, 2008-02-21 at 09:16 +0100, Thorsten Scherler wrote:
> Hi all,
> 
> I wrote some basic documentation for droids (till now just about install
> and run). It can be found in the trunk [1] but for your comfort I added
> it as well to my apache space:
> http://people.apache.org/~thorsten/droids/
> 
> I will now write some more lines about extending droids and may find the
> time to write a basic solr droid (I expect there will be some interest
> from the Apache Solr community).
> 
> BTW the current version is incorporating Norbert - the (no)robots.txt
> parser - which was mentioned by Roland Weber and is part of the sandbox
> of HttpComponents.
> 
> Is there are still interest to host Droids and sponsor it in incubation
> as mentioned in [2]?
> 

Absolutely. I also would like to see Norbert eventually absorbed into
Droids, as it does not seem to have a future as a standalone component.

Oleg

> Hope you enjoy.
> 
> [1] http://svn.apache.org/repos/asf/labs/droids/trunk/
> [2] http://labs.markmail.org/message/qu72r7scsbvifcsu?q=norbert
> 
> salu2




Re: Apache Droids using Norbert

Posted by Erik Abele <er...@codefaktor.de>.
On 21.02.2008, at 09:16, Thorsten Scherler wrote:

> Hi all,
>
> I wrote some basic documentation for droids (till now just about  
> install
> and run). It can be found in the trunk [1] but for your comfort I  
> added
> it as well to my apache space:
> http://people.apache.org/~thorsten/droids/

Nice!

> ...
> BTW the current version is incorporating Norbert - the (no)robots.txt
> parser - which was mentioned by Roland Weber and is part of the  
> sandbox
> of HttpComponents.
>
> Is there are still interest to host Droids and sponsor it in  
> incubation
> as mentioned in [2]?

IMHO it would be a perfect fit so a preliminary +1 from here.

Cheers,
Erik

