Posted to user@nutch.apache.org by isidro <is...@gmail.com> on 2012/08/03 03:13:31 UTC

Is it possible to know how long it takes to download an amount of data with Nutch?

Hi, 

Is it possible to know how long it takes to download a given amount of data
with Nutch?

I want to know how long it took to download the first 5 GB.

Isidro



--
View this message in context: http://lucene.472066.n3.nabble.com/Is-it-posible-to-know-how-long-it-takes-to-download-an-amount-of-data-with-nutch-tp3998936.html
Sent from the Nutch - User mailing list archive at Nabble.com.

Re: Is it possible to know how long it takes to download an amount of data with Nutch?

Posted by Mathijs Homminga <ma...@kalooga.com>.
Hm, I'm not into segments these days, but I think you can use the SegmentReader to dump the rows from a segment. I guess you can take it from there.


On Aug 4, 2012, at 6:57 , isidro <is...@gmail.com> wrote:

> Release 1.5.1 - 07/02/2012


Re: Is it possible to know how long it takes to download an amount of data with Nutch?

Posted by isidro <is...@gmail.com>.
Release 1.5.1 - 07/02/2012

On Fri, Aug 3, 2012 at 10:53 PM, Mathijs Homminga-3 [via Lucene] <
ml-node+s472066n3999165h15@n3.nabble.com> wrote:

> What version of Nutch are you using?

Re: Is it possible to know how long it takes to download an amount of data with Nutch?

Posted by Mathijs Homminga <ma...@kalooga.com>.
What version of Nutch are you using?

On Aug 4, 2012, at 5:36 , isidro <is...@gmail.com> wrote:

> Hi,
> 
> Where can I get the content size and the fetch time for each fetched file?
> 
> Isidro


Re: Is it possible to know how long it takes to download an amount of data with Nutch?

Posted by isidro <is...@gmail.com>.
Hi,

Where can I get the content size and the fetch time for each fetched file?

Isidro


On Thu, Aug 2, 2012 at 11:49 PM, Mathijs Homminga-3 [via Lucene] <
ml-node+s472066n3998947h57@n3.nabble.com> wrote:

> Hi,
>
> Unless you monitor the counters of a job while it's running: no.
> However, you could, in theory, replay the fetch/download by looking at the
> fetch times, sum the content size and see when it hits the 5G. But you have
> to write your own tool for that.
>
> Mathijs

Re: Is it possible to know how long it takes to download an amount of data with Nutch?

Posted by Mathijs Homminga <ma...@kalooga.com>.
Hi,

Unless you monitor the counters of a job while it's running: no.
However, you could, in theory, replay the fetch/download after the fact: order the fetched records by fetch time, sum their content sizes, and see at which fetch time the running total hits 5 GB. But you have to write your own tool for that.
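
A minimal sketch of such a replay tool, in Python. It is not part of Nutch. It assumes you have already extracted a (fetch time, content size in bytes) pair for every fetched document by some other means; the names FetchRecord and time_to_download are hypothetical, and "5 G" is read here as 5 * 1024**3 bytes. Fetch times only mark when each fetch was recorded, so the result is an approximation.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

# Hypothetical record type: (fetch time, content size in bytes).
FetchRecord = Tuple[datetime, int]


def time_to_download(records: Iterable[FetchRecord],
                     target_bytes: int) -> Optional[timedelta]:
    """Replay the fetches in time order and return the time elapsed between
    the first fetch and the fetch at which the running total of content
    sizes reaches target_bytes. Returns None if the target is never reached."""
    ordered = sorted(records, key=lambda record: record[0])
    if not ordered:
        return None
    start = ordered[0][0]
    total = 0
    for fetched_at, size in ordered:
        total += size
        if total >= target_bytes:
            return fetched_at - start
    return None


if __name__ == "__main__":
    # Made-up sample data, only to show the call.
    t0 = datetime(2012, 8, 3, 3, 0, 0)
    sample = [
        (t0 + timedelta(hours=2), 3 * 1024**3),
        (t0, 1 * 1024**3),
        (t0 + timedelta(hours=1), 2 * 1024**3),
    ]
    print(time_to_download(sample, 5 * 1024**3))
```

The records are sorted first because a segment does not necessarily list documents in the order they were fetched. Elapsed time is measured from the first fetch time, not from the start of the job, so job startup time is not included.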

Mathijs

On 3 aug. 2012, at 03:13, isidro <is...@gmail.com> wrote:

> Hi, 
> 
> Is it possible to know how long it takes to download an amount of data with
> Nutch?
> 
> I want to know how long it took to download the first 5 G.
> 
> Isidro