Posted to user@nutch.apache.org by Luis Armando Roca Fumero <lr...@uclv.edu.cu> on 2013/10/17 20:52:17 UTC

crawling with Nutch 2.2.1

Hello friends,
I configured Nutch 2.2.1 to crawl the web page http://intranet.uclv.edu.cu.
Running this command produced the output shown below: ./bin/crawl urls crawlId http://localhost:8983/solr/ 3
I would like to know whether I did something wrong, because it looks like something is not working properly. I have attached the config files too.
Please reply; this is my third mail and I have not received any answers or suggestions from this mailing list.
Thanks in advance,
Luis Armando



root@solr1:/opt/apache-nutch-2.2.1/runtime/local# ./bin/crawl urls crawlId http://localhost:8983/solr/ 3
InjectorJob: starting at 2013-10-17 18:43:13
InjectorJob: Injecting urlDir: urls
InjectorJob: Using class org.apache.gora.memory.store.MemStore as the Gora storage class.
InjectorJob: total number of urls rejected by filters: 0
InjectorJob: total number of urls injected after normalization and filtering: 1
Injector: finished at 2013-10-17 18:43:15, elapsed: 00:00:02
Thu Oct 17 18:43:15 UTC 2013 : Iteration 1 of 3
Generating batchId
Generating a new fetchlist
GeneratorJob: starting at 2013-10-17 18:43:16
GeneratorJob: Selecting best-scoring urls due for fetch.
GeneratorJob: starting
GeneratorJob: filtering: false
GeneratorJob: normalizing: false
GeneratorJob: topN: 50000
GeneratorJob: finished at 2013-10-17 18:43:19, time elapsed: 00:00:02
GeneratorJob: generated batch id: 1382035395-32147
Fetching :
FetcherJob: starting
FetcherJob: batchId: 1382035395-32147
Fetcher: Your 'http.agent.name' value should be listed first in 'http.robots.agents' property.
FetcherJob: threads: 50
FetcherJob: parsing: false
FetcherJob: resuming: false
FetcherJob : timelimit set for : 1382046200181
Using queue mode : byHost
Fetcher: threads: 50
QueueFeeder finished: total 0 records. Hit by time limit :0
-finishing thread FetcherThread0, activeThreads=0
-finishing thread FetcherThread1, activeThreads=0
-finishing thread FetcherThread2, activeThreads=0
-finishing thread FetcherThread3, activeThreads=0
-finishing thread FetcherThread4, activeThreads=0
-finishing thread FetcherThread6, activeThreads=0
-finishing thread FetcherThread5, activeThreads=0
-finishing thread FetcherThread7, activeThreads=0
-finishing thread FetcherThread8, activeThreads=1
-finishing thread FetcherThread9, activeThreads=0
-finishing thread FetcherThread10, activeThreads=0
-finishing thread FetcherThread11, activeThreads=0
-finishing thread FetcherThread12, activeThreads=0
-finishing thread FetcherThread13, activeThreads=0
-finishing thread FetcherThread15, activeThreads=0
-finishing thread FetcherThread14, activeThreads=0
-finishing thread FetcherThread16, activeThreads=0
-finishing thread FetcherThread17, activeThreads=0
-finishing thread FetcherThread18, activeThreads=0
-finishing thread FetcherThread19, activeThreads=0
-finishing thread FetcherThread20, activeThreads=0
-finishing thread FetcherThread21, activeThreads=0
-finishing thread FetcherThread23, activeThreads=0
-finishing thread FetcherThread22, activeThreads=0
-finishing thread FetcherThread24, activeThreads=0
-finishing thread FetcherThread26, activeThreads=0
-finishing thread FetcherThread25, activeThreads=0
-finishing thread FetcherThread27, activeThreads=0
-finishing thread FetcherThread28, activeThreads=0
-finishing thread FetcherThread29, activeThreads=0
-finishing thread FetcherThread30, activeThreads=0
-finishing thread FetcherThread31, activeThreads=0
-finishing thread FetcherThread32, activeThreads=0
-finishing thread FetcherThread33, activeThreads=0
-finishing thread FetcherThread34, activeThreads=0
-finishing thread FetcherThread35, activeThreads=0
-finishing thread FetcherThread36, activeThreads=0
-finishing thread FetcherThread38, activeThreads=0
-finishing thread FetcherThread37, activeThreads=0
-finishing thread FetcherThread39, activeThreads=0
-finishing thread FetcherThread40, activeThreads=0
-finishing thread FetcherThread41, activeThreads=0
-finishing thread FetcherThread42, activeThreads=0
-finishing thread FetcherThread43, activeThreads=0
-finishing thread FetcherThread44, activeThreads=0
-finishing thread FetcherThread45, activeThreads=0
-finishing thread FetcherThread46, activeThreads=0
-finishing thread FetcherThread47, activeThreads=0
-finishing thread FetcherThread48, activeThreads=0
Fetcher: throughput threshold: -1
Fetcher: throughput threshold sequence: 5
-finishing thread FetcherThread49, activeThreads=0
0/0 spinwaiting/active, 0 pages, 0 errors, 0.0 0 pages/s, 0 0 kb/s, 0 URLs in 0 queues
-activeThreads=0
FetcherJob: done
Parsing :
ParserJob: starting
ParserJob: resuming:    false
ParserJob: forced reparse:      false
ParserJob: batchId:     1382035395-32147
ParserJob: success
CrawlDB update for crawlId
DbUpdaterJob: starting
DbUpdaterJob: done
Indexing crawlId on SOLR index -> http://localhost:8983/solr/
SolrIndexerJob: starting
SolrIndexerJob: done.
SOLR dedup -> http://localhost:8983/solr/

The Universidad Central "Marta Abreu" de Las Villas on its 60th anniversary. Founded on November 30, 1952. Visit us at: http://www.uclv.edu.cu
Take part in Universidad 2014, February 10 to 14, 2014. Havana, Cuba. http://www.congresouniversidad.cu/



Re: crawling with Nutch 2.2.1

Posted by Honza Bouchner <ja...@gmail.com>.
Hi, I have had good experience with the combination of Solr 4.0 and Nutch 2.2.1.

Jan


2013/10/17 Luis Armando Roca Fumero <lr...@uclv.edu.cu>

> I need to integrate Nutch with Solr 4.4.0. Do you think that Nutch 1.7
> works well with Solr 4.4.0?

RE: crawling with Nutch 2.2.1

Posted by Luis Armando Roca Fumero <lr...@uclv.edu.cu>.
I need to integrate Nutch with Solr 4.4.0. Do you think that Nutch 1.7 works well with Solr 4.4.0?
________________________________________
From: Julien Nioche [lists.digitalpebble@gmail.com]
Sent: Thursday, October 17, 2013 02:01 p.m.
To: user@nutch.apache.org
Subject: Re: crawling with Nutch 2.2.1

Memstore is certainly not persistent across jobs. Try using a different
backend like HBase or Cassandra (see tutorials on the wiki) or switch to
Nutch 1.x



Re: crawling with Nutch 2.2.1

Posted by Julien Nioche <li...@gmail.com>.
MemStore is certainly not persistent across jobs. Try using a different
backend like HBase or Cassandra (see the tutorials on the wiki) or switch to
Nutch 1.x.
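
A minimal sketch of the backend switch suggested above, assuming HBase is already installed and running, and that the gora-hbase dependency has been enabled in ivy/ivy.xml before rebuilding with "ant runtime". The property belongs inside the <configuration> element of conf/nutch-site.xml; the description text is only illustrative.

```xml
<!-- conf/nutch-site.xml: keep crawl data in HBase instead of the in-memory MemStore -->
<property>
  <name>storage.data.store.class</name>
  <value>org.apache.gora.hbase.store.HBaseStore</value>
  <description>Gora storage class used by the Nutch jobs (assumes an HBase setup)</description>
</property>
```

conf/gora.properties would normally carry the matching line gora.datastore.default=org.apache.gora.hbase.store.HBaseStore. With a persistent store, the URL injected by InjectorJob should still be available when GeneratorJob and FetcherJob run, rather than the fetcher reporting 0 records.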


-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble