Posted to user@nutch.apache.org by al...@aim.com on 2011/01/28 00:00:43 UTC

nutch crawl command takes 98% of cpu

Hello,

I run the crawl command with -depth 7 -topN -1 on my Linux box (1.5 Mbps internet connection, AMD 3.1 GHz processor, 4 GB memory, Fedora Linux 14, Nutch 1.2). After 1-2 days Nutch takes 98% of the CPU. My seed file includes about 3500 domains, and I set fetch.external links to false.

Is this normal? If not, what can be done to improve it?

Thanks.
Alex.

Re: nutch crawl command takes 98% of cpu

Posted by Markus Jelsma <ma...@openindex.io>.
Hi,

There is no -noParse option so your fetcher might actually fetch and parse,
depending on the parse option in your Nutch config. Parsing usually takes a
lot of CPU.

Cheers,

>  Hello,
> 
> At this time, I am using step by step crawling. In depth 4 nutch-1.2
> started taking all CPU i.e., command bin/nutch fetch $s4 took all CPU
> after fetching for about 1 day .
> 
> Thanks.
> Alex.

Re: nutch crawl command takes 98% of cpu

Posted by al...@aim.com.

 Hello,

At this time, I am crawling step by step. At depth 4, nutch-1.2 started taking all the CPU, i.e. the command bin/nutch fetch $s4 took all the CPU after fetching for about 1 day.

Thanks.
Alex.

 


Re: nutch crawl command takes 98% of cpu

Posted by al...@aim.com.
Hello,

Which version is this patch applicable to?

Thanks.
Alex.

 

 


 

 

-----Original Message-----
From: Alexis <al...@gmail.com>
To: user <us...@nutch.apache.org>
Sent: Tue, Feb 8, 2011 9:59 am
Subject: Re: nutch crawl command takes 98% of cpu


Hi,

Thanks for all the feedback. It looks like there is not much you can
do if you give the FLV parser corrupted data. From a practical point
of view, this is extremely annoying: the stuck parse takes up all the
CPU resources and prevents other threads from doing their work until
the TIMEOUT occurs, kills the thread and frees up the CPU.

This happens when an FLV file is truncated (because the
http.content.limit property is lower than the file's content-length,
in bytes). So the suggestion is to hint to the parser that it is
likely to get stuck, and to skip parsing when the downloaded content
size does not match the content-length header.

Besides, I often see errors in the HTML parser when the content is
truncated (https://issues.apache.org/jira/browse/TIKA-307). So
skipping the parse does no harm: it saves time and avoids errors.

I created the issue here: https://issues.apache.org/jira/browse/NUTCH-965
See attached patch.

Alexis.




Re: nutch crawl command takes 98% of cpu

Posted by Alexis <al...@gmail.com>.
Hi,

Thanks for all the feedback. It looks like there is not much you can
do if you give the FLV parser corrupted data. From a practical point
of view, this is extremely annoying: the stuck parse takes up all the
CPU resources and prevents other threads from doing their work until
the TIMEOUT occurs, kills the thread and frees up the CPU.

This happens when an FLV file is truncated (because the
http.content.limit property is lower than the file's content-length,
in bytes). So the suggestion is to hint to the parser that it is
likely to get stuck, and to skip parsing when the downloaded content
size does not match the content-length header.
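
A minimal sketch of that check, for illustration only. This is not the
NUTCH-965 patch: the class and method names are hypothetical, and it
assumes the caller has the raw Content-Length header value and the
number of bytes actually stored. It only flags the case where fewer
bytes arrived than were announced, which is the truncation case
described above.

```java
class TruncationCheck {
    /**
     * Returns true when the Content-Length header announces more bytes
     * than were actually downloaded. A missing or unparsable header
     * yields false, because nothing can be concluded from it.
     */
    static boolean isTruncated(String contentLengthHeader, int downloadedBytes) {
        if (contentLengthHeader == null) {
            return false;
        }
        try {
            long declared = Long.parseLong(contentLengthHeader.trim());
            return declared > downloadedBytes;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical case: a 1 MB file cut off at a 64 KB content limit.
        byte[] content = new byte[65536];
        System.out.println(isTruncated("1048576", content.length)); // true
        System.out.println(isTruncated("65536", content.length));   // false
    }
}
```

A caller could skip the parse, or mark the document as truncated,
whenever isTruncated(...) returns true.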

Besides, I often see errors in the HTML parser when the content is
truncated (https://issues.apache.org/jira/browse/TIKA-307). So
skipping the parse does no harm: it saves time and avoids errors.

I created the issue here: https://issues.apache.org/jira/browse/NUTCH-965
See attached patch.

Alexis.

On Mon, Feb 7, 2011 at 12:00 PM, Ken Krugler
<kk...@transpac.com> wrote:
> Hi Kirby & others,
>
> On Jan 31, 2011, at 4:39pm, Kirby Bohling wrote:
>
>> On Sat, Jan 29, 2011 at 9:03 AM, Ken Krugler
>> <kk...@transpac.com> wrote:
>>>
>>> Some comments below.
>>>
>>> On Jan 29, 2011, at 5:55am, Julien Nioche wrote:
>>>
>>>> Hi,
>>>>
>>>> This shows the state of the various threads within a Java process. Most
>>>> of
>>>> them seem to be busy parsing zip archives with Tika. The interesting
>>>> part
>>>> is
>>>> that the main thread is at the Generation step :
>>>>
>>>> *  at org.apache.nutch.crawl.Generator.generate(Generator.java:431)
>>>>  at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)
>>>> *
>>>> with the "Thread-415331" normalizing the URLs as part of the generation.
>>>>
>>>> So why do we see threads busy at parsing these archives? I think this is
>>>> a
>>>> result of the Timeout mechanism (
>>>> https://issues.apache.org/jira/browse/NUTCH-696) used for the parsing.
>>>> Before it, we used to have the parsing step loop on a single document
>>>> and
>>>> never complete. Thanks to Andrzej's patch, the parsing is done in
>>>> separate threads which are abandoned if more than X seconds have
>>>> passed (default 30, I think). Obviously these threads are still
>>>> lurking around in the background and consuming CPU.
>>>>
>>>> This is an issue when calling the Crawl command only. When using the
>>>> separate commands for the various steps, the runaway threads die with
>>>> the
>>>> main process, however since the Crawl uses a single process, these
>>>> timeout
>>>> threads keep going.
>>>>
>>>> Am not an expert in multithreading and don't have an idea of whether
>>>> these
>>>> threads could be killed somehow. Andrzej, any clue?
>>>
>>> This is a fundamental problem with run-away threads - there is no safe,
>>> reliable way to kill them off.
>>>
>>> And if you parse enough documents, you will run into a number that
>>> currently
>>> cause Tika to hang. Zip files for sure, but we ran into the same issue
>>> with
>>> FLV files.
>>>
>>> Over in Tika-land, Jukka has a patch that fires up a child JVM and runs
>>> parsers there. See https://issues.apache.org/jira/browse/TIKA-416
>>>
>>> -- Ken
>>>
>>
>> All,
>>
>>  Just an observation, but the general approach to this problem is to
>> use Thread.interrupt().  Virtually all code in the JDK treats the
>> thread being interrupted as a request to cancel.  Java Concurrency in
>> Practice (JCIP) has a whole chapter on this topic (Chapter 7).  IMHO,
>> any general purpose library code that swallows "InterruptedException"
>> and isn't implementing the Thread cancellation policy has a bug in it
>> (the cancellation policy can only be implemented by the owner of the
>> thread, unless the library is a task/thread library it cannot be
>> implementing the cancellation policy).  Any place you see:
>
> [snip]
>
>> One exception is that
>> sockets read/write operations don't operate this way, the socket must
>> be closed to interrupt a read/write, the approach JCIP suggests is to
>> tie the socket and thread in such a way that interrupt() closes the
>> sockets that would be reading/writing inside that thread.
>
> Excellent input, as I need to solve some issues with needing to abort HTTP
> requests.
>
> [snip]
>
>> Not sure exactly what the problems inside of Tika are, but getting it
>> to respect interruption would be a wonderful thing for everybody that
>> uses it.  The problem might be getting all underlying libraries it
>> uses to do so.
>
> Yes, that's exactly the issue in the cases I've seen. The libraries used to
> do the actual parsing can get caught in loops, when processing unexpected
> data. There's no checks for interrupt, e.g. it's code that is walking some
> data structure, and doesn't realize that it's in a loop (e.g. offset to next
> chunk is set to zero, so the same chunk is endlessly reprocessed).
>
> Occasionally we can get the underlying libraries to fix issues, but each new
> release has the potential for new and exciting hangs.
>
> That's why Jukka went down the admittedly hard-core and heavy-weight path of
> providing an option to run parses in a child JVM.
>
> If there's another solution, we'd love to hear about it :)
>
> Thanks,
>
> -- Ken
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://bixolabs.com
> e l a s t i c   w e b   m i n i n g
>
>
>
>
>
>

Re: nutch crawl command takes 98% of cpu

Posted by Ken Krugler <kk...@transpac.com>.
Hi Kirby & others,

On Jan 31, 2011, at 4:39pm, Kirby Bohling wrote:

> On Sat, Jan 29, 2011 at 9:03 AM, Ken Krugler
> <kk...@transpac.com> wrote:
>> Some comments below.
>>
>> On Jan 29, 2011, at 5:55am, Julien Nioche wrote:
>>
>>> Hi,
>>>
>>> This shows the state of the various threads within a Java process.  
>>> Most of
>>> them seem to be busy parsing zip archives with Tika. The  
>>> interesting part
>>> is
>>> that the main thread is at the Generation step :
>>>
>>> *  at org.apache.nutch.crawl.Generator.generate(Generator.java:431)
>>>  at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)
>>> *
>>> with the "Thread-415331" normalizing the URLs as part of the  
>>> generation.
>>>
>>> So why do we see threads busy at parsing these archives? I think  
>>> this is a
>>> result of the Timeout mechanism (
>>> https://issues.apache.org/jira/browse/NUTCH-696) used for the  
>>> parsing.
>>> Before it, we used to have the parsing step loop on a single  
>>> document and
>>> never complete. Thanks to Andrzej's patch, the parsing is done in
>>> separate threads which are abandoned if more than X seconds have
>>> passed (default 30, I think). Obviously these threads are still
>>> lurking around in the background and consuming CPU.
>>>
>>> This is an issue when calling the Crawl command only. When using the
>>> separate commands for the various steps, the runaway threads die  
>>> with the
>>> main process, however since the Crawl uses a single process, these  
>>> timeout
>>> threads keep going.
>>>
>>> Am not an expert in multithreading and don't have an idea of  
>>> whether these
>>> threads could be killed somehow. Andrzej, any clue?
>>
>> This is a fundamental problem with run-away threads - there is no  
>> safe,
>> reliable way to kill them off.
>>
>> And if you parse enough documents, you will run into a number that  
>> currently
>> cause Tika to hang. Zip files for sure, but we ran into the same  
>> issue with
>> FLV files.
>>
>> Over in Tika-land, Jukka has a patch that fires up a child JVM and  
>> runs
>> parsers there. See https://issues.apache.org/jira/browse/TIKA-416
>>
>> -- Ken
>>
>
> All,
>
>  Just an observation, but the general approach to this problem is to
> use Thread.interrupt().  Virtually all code in the JDK treats the
> thread being interrupted as a request to cancel.  Java Concurrency in
> Practice (JCIP) has a whole chapter on this topic (Chapter 7).  IMHO,
> any general purpose library code that swallows "InterruptedException"
> and isn't implementing the Thread cancellation policy has a bug in it
> (the cancellation policy can only be implemented by the owner of the
> thread, unless the library is a task/thread library it cannot be
> implementing the cancellation policy).  Any place you see:

[snip]

> One exception is that
> sockets read/write operations don't operate this way, the socket must
> be closed to interrupt a read/write, the approach JCIP suggests is to
> tie the socket and thread in such a way that interrupt() closes the
> sockets that would be reading/writing inside that thread.

Excellent input, as I need to solve some issues with needing to abort  
HTTP requests.

[snip]

> Not sure exactly what the problems inside of Tika are, but getting it
> to respect interruption would be a wonderful thing for everybody that
> uses it.  The problem might be getting all underlying libraries it
> uses to do so.

Yes, that's exactly the issue in the cases I've seen. The libraries
used to do the actual parsing can get caught in loops when processing
unexpected data. There are no checks for interrupt, e.g. it's code that
is walking some data structure and doesn't realize that it's in a
loop (e.g. the offset to the next chunk is set to zero, so the same
chunk is endlessly reprocessed).
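
To make that failure mode concrete, here is a small sketch. The chunk
format is invented for illustration (each chunk starts with a 4-byte
big-endian length covering the whole chunk) and is not taken from any
real parser. The walker refuses a length that would not advance the
offset, and it checks the interrupt status on every iteration so the
owning thread can cancel it.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.CancellationException;

class ChunkWalker {
    /** Counts chunks in a buffer of length-prefixed chunks (hypothetical format). */
    static int countChunks(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data); // big-endian by default
        int offset = 0;
        int chunks = 0;
        while (offset + 4 <= data.length) {
            // Cooperative cancellation: stop if the owner interrupted this thread.
            if (Thread.currentThread().isInterrupted()) {
                throw new CancellationException("parse cancelled");
            }
            int length = buf.getInt(offset);
            // A length below the header size would leave the offset stuck or
            // move it backwards; a length past the end means truncated data.
            if (length < 4 || length > data.length - offset) {
                throw new IllegalArgumentException(
                        "corrupt chunk length " + length + " at offset " + offset);
            }
            chunks++;
            offset += length;
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] good = ByteBuffer.allocate(16).putInt(0, 8).putInt(8, 8).array();
        System.out.println(countChunks(good)); // 2
        try {
            countChunks(new byte[8]); // length field is zero
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Without the length check, the zero-length chunk in the second call
would be reprocessed forever, which is the kind of hang described
above.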

Occasionally we can get the underlying libraries to fix issues, but  
each new release has the potential for new and exciting hangs.

That's why Jukka went down the admittedly hard-core and heavy-weight  
path of providing an option to run parses in a child JVM.

If there's another solution, we'd love to hear about it :)

Thanks,

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g






Re: nutch crawl command takes 98% of cpu

Posted by Andrzej Bialecki <ab...@getopt.org>.
On 2/1/11 1:39 AM, Kirby Bohling wrote:
> On Sat, Jan 29, 2011 at 9:03 AM, Ken Krugler
> <kk...@transpac.com>  wrote:
>> Some comments below.
>>
>> On Jan 29, 2011, at 5:55am, Julien Nioche wrote:
>>
>>> Hi,
>>>
>>> This shows the state of the various threads within a Java process. Most of
>>> them seem to be busy parsing zip archives with Tika. The interesting part
>>> is
>>> that the main thread is at the Generation step :
>>>
>>> *  at org.apache.nutch.crawl.Generator.generate(Generator.java:431)
>>>   at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)
>>> *
>>> with the "Thread-415331" normalizing the URLs as part of the generation.
>>>
>>> So why do we see threads busy at parsing these archives? I think this is a
>>> result of the Timeout mechanism (
>>> https://issues.apache.org/jira/browse/NUTCH-696) used for the parsing.
>>> Before it, we used to have the parsing step loop on a single document and
>>> never complete. Thanks to Andrzej's patch, the parsing is done in
>>> separate threads which are abandoned if more than X seconds have
>>> passed (default 30, I think). Obviously these threads are still
>>> lurking around in the background and consuming CPU.
>>>
>>> This is an issue when calling the Crawl command only. When using the
>>> separate commands for the various steps, the runaway threads die with the
>>> main process, however since the Crawl uses a single process, these timeout
>>> threads keep going.
>>>
>>> Am not an expert in multithreading and don't have an idea of whether these
>>> threads could be killed somehow. Andrzej, any clue?
>>
>> This is a fundamental problem with run-away threads - there is no safe,
>> reliable way to kill them off.
>>
>> And if you parse enough documents, you will run into a number that currently
>> cause Tika to hang. Zip files for sure, but we ran into the same issue with
>> FLV files.
>>
>> Over in Tika-land, Jukka has a patch that fires up a child JVM and runs
>> parsers there. See https://issues.apache.org/jira/browse/TIKA-416
>>
>> -- Ken
>>
>
> All,
>
>    Just an observation, but the general approach to this problem is to
> use Thread.interrupt().  Virtually all code in the JDK treats the
> thread being interrupted as a request to cancel.  Java Concurrency in
> Practice (JCIP) has a whole chapter on this topic (Chapter 7).  IMHO,
> any general purpose library code that swallows "InterruptedException"
> and isn't implementing the Thread cancellation policy has a bug in it
> (the cancellation policy can only be implemented by the owner of the
> thread, unless the library is a task/thread library it cannot be
> implementing the cancellation policy).  Any place you see:
>
> catch (InterruptedException ex) {
> // Ignore
> }
>
> Just plan on having a hard to track down bug at some point in the
> future.  At the very least, just reset the interruption status like
> so:
>
> catch (InterruptedException ex) {
>     // Resetting the interruption to avoid losing the cancellation request.
>     Thread.currentThread().interrupt();
> //  Twiddle any state necessary to bail out in a timely manner...
> }
>
>    The problem with using the interruption status as cancellation
> approach is that it fails if there is a bug anywhere in any library
> that swallows the InterruptedException (in many ways it is similar to
> a data race).  It is a fundamental problem with threading (there is no
> way to share memory space and have a reliable cancel that a bug can't
> subvert, an infinite loop while holding a lock is the canonical
> example of the problem, killing the thread could lead to an invariant
> being invalid).
>
>     One trivial and simple way if you control the creation of Threads
> is to override "Thread.interrupt", and record that the interrupt
> method was called (and thus cancellation of the thread/work was
> requested), and at the top of the outer most loop check if the cancel
> was set, bail out.  That assumes at some point you do in fact get back
> to the top of the loop.  If you're stuck in an inner loop, fix the
> inner loop that is stuck to respect cancellation/interruption.
>
>    There are several gotchas dealing with interruptions.  Most blocking
> APIs inside of Java respect cancellation (they throw
> InterruptedException if isInterrupted() is true, rather than start a
> potentially blocking operation, and will wake up and throw the
> exception if interrupted in the middle of it).  One exception is that
> sockets read/write operations don't operate this way, the socket must
> be closed to interrupt a read/write, the approach JCIP suggests is to
> tie the socket and thread in such a way that interrupt() closes the
> sockets that would be reading/writing inside that thread.
>
> I believe that the NIO code does as long as the Channel is an
> InterruptibleChannel, which the stock network implementations should
> be.  A blocked Selector.select() can be woken by calling wakeup() on
> the selector, in an analogous way to closing the socket.
>
> Not sure exactly what the problems inside of Tika are, but getting it
> to respect interruption would be a wonderful thing for everybody that
> uses it.  The problem might be getting all underlying libraries it
> uses to do so.
>
> Kirby

That was very informative and useful, thanks for explaining it.

-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


Re: nutch crawl command takes 98% of cpu

Posted by Kirby Bohling <ki...@gmail.com>.
On Sat, Jan 29, 2011 at 9:03 AM, Ken Krugler
<kk...@transpac.com> wrote:
> Some comments below.
>
> On Jan 29, 2011, at 5:55am, Julien Nioche wrote:
>
>> Hi,
>>
>> This shows the state of the various threads within a Java process. Most of
>> them seem to be busy parsing zip archives with Tika. The interesting part
>> is
>> that the main thread is at the Generation step :
>>
>> *  at org.apache.nutch.crawl.Generator.generate(Generator.java:431)
>>  at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)
>> *
>> with the "Thread-415331" normalizing the URLs as part of the generation.
>>
>> So why do we see threads busy at parsing these archives? I think this is a
>> result of the Timeout mechanism (
>> https://issues.apache.org/jira/browse/NUTCH-696) used for the parsing.
>> Before it, we used to have the parsing step loop on a single document and
>> never complete. Thanks to Andrzej's patch, the parsing is done in
>> separate threads which are abandoned if more than X seconds have
>> passed (default 30, I think). Obviously these threads are still
>> lurking around in the background and consuming CPU.
>>
>> This is an issue when calling the Crawl command only. When using the
>> separate commands for the various steps, the runaway threads die with the
>> main process, however since the Crawl uses a single process, these timeout
>> threads keep going.
>>
>> Am not an expert in multithreading and don't have an idea of whether these
>> threads could be killed somehow. Andrzej, any clue?
>
> This is a fundamental problem with run-away threads - there is no safe,
> reliable way to kill them off.
>
> And if you parse enough documents, you will run into a number that currently
> cause Tika to hang. Zip files for sure, but we ran into the same issue with
> FLV files.
>
> Over in Tika-land, Jukka has a patch that fires up a child JVM and runs
> parsers there. See https://issues.apache.org/jira/browse/TIKA-416
>
> -- Ken
>

All,

  Just an observation, but the general approach to this problem is to
use Thread.interrupt().  Virtually all code in the JDK treats the
thread being interrupted as a request to cancel.  Java Concurrency in
Practice (JCIP) has a whole chapter on this topic (Chapter 7).  IMHO,
any general purpose library code that swallows "InterruptedException"
and isn't implementing the Thread cancellation policy has a bug in it
(the cancellation policy can only be implemented by the owner of the
thread, unless the library is a task/thread library it cannot be
implementing the cancellation policy).  Any place you see:

catch (InterruptedException ex) {
// Ignore
}

Just plan on having a hard to track down bug at some point in the
future.  At the very least, just reset the interruption status like
so:

catch (InterruptedException ex) {
    // Reset the interruption status to avoid losing the cancellation request.
    Thread.currentThread().interrupt();
    // Twiddle any state necessary to bail out in a timely manner...
}

  The problem with using the interruption status as the cancellation
approach is that it fails if there is a bug anywhere in any library
that swallows the InterruptedException (in many ways it is similar to
a data race).  It is a fundamental problem with threading: there is no
way to share memory space and have a reliable cancel that a bug can't
subvert.  An infinite loop while holding a lock is the canonical
example of the problem, since killing the thread could leave an
invariant invalid.

   One simple way, if you control the creation of Threads, is to
override "Thread.interrupt" and record that the interrupt method was
called (and thus that cancellation of the thread/work was requested).
At the top of the outermost loop, check whether the cancel was set
and, if so, bail out.  That assumes at some point you do in fact get
back to the top of the loop.  If you're stuck in an inner loop, fix
the inner loop that is stuck to respect cancellation/interruption.
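
A rough sketch of that idea, under stated assumptions: the class name
and the sleep standing in for real work are made up for illustration.
The inner method deliberately swallows InterruptedException, as a buggy
library might, yet the worker still stops because the request was
recorded in a separate flag.

```java
class CancellableWorker extends Thread {
    // Set by interrupt(); survives even if library code swallows the exception.
    private volatile boolean cancelRequested = false;

    @Override
    public void interrupt() {
        cancelRequested = true;
        super.interrupt();
    }

    @Override
    public void run() {
        while (!cancelRequested) { // checked at the top of the outermost loop
            doOneUnitOfWork();
        }
    }

    // Stands in for library code that loses the interrupt status.
    private void doOneUnitOfWork() {
        try {
            Thread.sleep(10);
        } catch (InterruptedException swallowed) {
            // Ignored on purpose to simulate the bug.
        }
    }

    /** Starts a worker, interrupts it, and reports whether it stopped. */
    static boolean stopsAfterInterrupt() {
        CancellableWorker worker = new CancellableWorker();
        worker.start();
        try {
            Thread.sleep(50);
            worker.interrupt();
            worker.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopsAfterInterrupt());
    }
}
```

As noted above, this only helps if control returns to the top of the
loop; it does nothing for a unit of work that never returns.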

  There are several gotchas dealing with interruptions.  Most blocking
APIs inside of Java respect cancellation (they throw
InterruptedException if isInterrupted() is true, rather than start a
potentially blocking operation, and will wake up and throw the
exception if interrupted in the middle of it).  One exception is that
socket read/write operations don't operate this way: the socket must
be closed to interrupt a read/write.  The approach JCIP suggests is to
tie the socket and thread together in such a way that interrupt()
closes the sockets that would be reading/writing inside that thread.
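
A sketch along those lines, written from memory of the pattern rather
than copied from the book; the class name and the loopback demo are
assumptions for illustration. The thread owns its socket, and
interrupt() closes it so that a blocked read() fails with an
IOException and the thread can exit.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class SocketReaderThread extends Thread {
    private final Socket socket;

    SocketReaderThread(Socket socket) {
        this.socket = socket;
    }

    @Override
    public void interrupt() {
        try {
            socket.close(); // makes a blocked read() throw an IOException
        } catch (IOException ignored) {
            // nothing useful to do if close itself fails
        } finally {
            super.interrupt();
        }
    }

    @Override
    public void run() {
        try (InputStream in = socket.getInputStream()) {
            byte[] buf = new byte[1024];
            while (in.read(buf) != -1) {
                // process the data here
            }
        } catch (IOException e) {
            // socket closed, possibly by interrupt(): let the thread exit
        }
    }

    /** Blocks a reader on a loopback connection, interrupts it, reports whether it exited. */
    static boolean stopsWhenInterrupted() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket peer = server.accept()) {
            SocketReaderThread reader = new SocketReaderThread(client);
            reader.start();
            Thread.sleep(100); // give the reader time to block in read()
            reader.interrupt();
            reader.join(2000);
            return !reader.isAlive();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("reader stopped: " + stopsWhenInterrupted());
    }
}
```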

I believe that the NIO code does as long as the Channel is an
InterruptibleChannel, which the stock network implementations should
be.  A blocked Selector.select() can be woken by calling wakeup() on
the selector, in an analogous way to closing the socket.

Not sure exactly what the problems inside of Tika are, but getting it
to respect interruption would be a wonderful thing for everybody that
uses it.  The problem might be getting all underlying libraries it
uses to do so.

Kirby

Re: nutch crawl command takes 98% of cpu

Posted by Ken Krugler <kk...@transpac.com>.
Some comments below.

On Jan 29, 2011, at 5:55am, Julien Nioche wrote:

> Hi,
>
> This shows the state of the various threads within a Java process.  
> Most of
> them seem to be busy parsing zip archives with Tika. The interesting  
> part is
> that the main thread is at the Generation step :
>
> *  at org.apache.nutch.crawl.Generator.generate(Generator.java:431)
>  at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)
> *
> with the "Thread-415331" normalizing the URLs as part of the  
> generation.
>
> So why do we see threads busy at parsing these archives? I think  
> this is a
> result of the Timeout mechanism (
> https://issues.apache.org/jira/browse/NUTCH-696) used for the parsing.
> Before it, we used to have the parsing step loop on a single  
> document and
> never complete. Thanks to Andrzej's patch, the parsing is done in
> separate threads which are abandoned if more than X seconds have
> passed (default 30, I think). Obviously these threads are still
> lurking around in the background and consuming CPU.
>
> This is an issue when calling the Crawl command only. When using the
> separate commands for the various steps, the runaway threads die  
> with the
> main process, however since the Crawl uses a single process, these  
> timeout
> threads keep going.
>
> Am not an expert in multithreading and don't have an idea of whether  
> these
> threads could be killed somehow. Andrzej, any clue?

This is a fundamental problem with run-away threads - there is no  
safe, reliable way to kill them off.
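
A small sketch of why the timeout alone does not free the CPU. It is
not the Nutch code: the class name, the 200 ms timeout and the busy
loop standing in for a stuck parser are all assumptions. Future.cancel(true)
only sets the worker's interrupt flag, so a task that never checks the
flag keeps spinning after the caller has given up on it.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

class RunawayParseDemo {
    /** Returns true if the task is still running shortly after cancel(true). */
    static boolean stillRunningAfterCancel(boolean honoursInterrupt) {
        AtomicBoolean running = new AtomicBoolean(false);
        AtomicBoolean stopForDemo = new AtomicBoolean(false); // lets the demo end
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parser");
            t.setDaemon(true);
            return t;
        });
        Future<?> parse = pool.submit(() -> {
            running.set(true);
            try {
                while (!stopForDemo.get()) { // stands in for a parser stuck in a loop
                    if (honoursInterrupt && Thread.currentThread().isInterrupted()) {
                        break;
                    }
                }
            } finally {
                running.set(false);
            }
        });
        try {
            parse.get(200, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            parse.cancel(true); // only sets the interrupt flag on the worker
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        sleepQuietly(200);
        boolean stillRunning = running.get();
        stopForDemo.set(true);
        pool.shutdown();
        return stillRunning;
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("ignores interrupt, still running: " + stillRunningAfterCancel(false));
        System.out.println("checks interrupt, still running: " + stillRunningAfterCancel(true));
    }
}
```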

And if you parse enough documents, you will run into a number that  
currently cause Tika to hang. Zip files for sure, but we ran into the  
same issue with FLV files.

Over in Tika-land, Jukka has a patch that fires up a child JVM and  
runs parsers there. See https://issues.apache.org/jira/browse/TIKA-416

-- Ken

> Would be interesting from a Tika point of view to know what  
> documents caused
> this? Alex is there a trace of the URLs in your logs? Could be  
> something
> like the content being trimmed and causing the parser to go in a loop,
> anyway it would be good to identify the source of the problem.
>
> I have to admit that I am not a big fan of the one-in-all Crawl  
> command, one
> way to alleviate the problem would be not to use it and call the  
> separate
> commands individually, which has also the merit of giving a better  
> idea of
> what goes under the bonnet. I'd rather we shipped a nice and tidy  
> shell
> script to achieve the same goals as the Crawl command, it will also  
> replace
> the numerous and somewhat faulty scripts that can be found on the  
> Wiki. It
> seems that this is a feature that people often request or comment on.
>
> Any thoughts?
>
> Alex, would you mind opening an issue on JIRA for this? Would be  
> great if
> you could see if the URLS causing the parsing to loop could be found  
> in the
> logs and if the same issue can be reproduced with the latest version  
> of
> Tika.
>
> Thanks
>
> Julien
>
>
> On 28 January 2011 21:53, <al...@aim.com> wrote:
>
>> Hello,
>>
>> I did jstack and the result is below.  Could you please let me know  
>> how to
>> interpret it?
>>
>> ----------------------------------------------------------------
>>
>>
>>
>> 2011-01-28 13:46:50
>> Full thread dump OpenJDK Server VM (19.0-b06 mixed mode):
>>
>> "Attach Listener" daemon prio=10 tid=0x6cb21800 nid=0x1e95 waiting on
>> condition [0x00000000]
>>  java.lang.Thread.State: RUNNABLE
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "SpillThread" daemon prio=10 tid=0x6053c400 nid=0x1e18 waiting on  
>> condition
>> [0x6c3ad000]
>>  java.lang.Thread.State: WAITING (parking)
>>   at sun.misc.Unsafe.park(Native Method)
>>   - parking to wait for  <0x7f9a8768> (a
>> java.util.concurrent.locks.AbstractQueuedSynchronizer 
>> $ConditionObject)
>>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java: 
>> 186)
>>   at
>> java.util.concurrent.locks.AbstractQueuedSynchronizer 
>> $ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>>   at
>> org.apache.hadoop.mapred.MapTask$MapOutputBuffer 
>> $SpillThread.run(MapTask.java:1169)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "communication thread" daemon prio=10 tid=0x607bd400 nid=0x1e17 waiting on condition [0x6c8ad000]
>>  java.lang.Thread.State: TIMED_WAITING (sleeping)
>>   at java.lang.Thread.sleep(Native Method)
>>   at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:529)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-415331" prio=10 tid=0x6cb96c00 nid=0x175f runnable [0x6c2ba000]
>>  java.lang.Thread.State: RUNNABLE
>>   at org.apache.oro.text.regex.Perl5Matcher.__matchUnicodeClass(Unknown Source)
>>   at org.apache.oro.text.regex.Perl5Matcher.__repeat(Unknown Source)
>>   at org.apache.oro.text.regex.Perl5Matcher.__match(Unknown Source)
>>   at org.apache.oro.text.regex.Perl5Matcher.__match(Unknown Source)
>>   at org.apache.oro.text.regex.Perl5Matcher.__match(Unknown Source)
>>   at org.apache.oro.text.regex.Perl5Matcher.__tryExpression(Unknown Source)
>>   at org.apache.oro.text.regex.Perl5Matcher.__interpret(Unknown Source)
>>   at org.apache.oro.text.regex.Perl5Matcher.contains(Unknown Source)
>>   at org.apache.oro.text.regex.Util.substitute(Unknown Source)
>>   at org.apache.oro.text.regex.Util.substitute(Unknown Source)
>>   at org.apache.nutch.net.urlnormalizer.basic.BasicURLNormalizer.substituteUnnecessaryRelativePaths(BasicURLNormalizer.java:166)
>>   at org.apache.nutch.net.urlnormalizer.basic.BasicURLNormalizer.normalize(BasicURLNormalizer.java:125)
>>   at org.apache.nutch.net.URLNormalizers.normalize(URLNormalizers.java:286)
>>   at org.apache.nutch.crawl.URLPartitioner.getPartition(URLPartitioner.java:69)
>>   at org.apache.nutch.crawl.URLPartitioner.getPartition(URLPartitioner.java:36)
>>   at org.apache.nutch.crawl.Generator$Selector.getPartition(Generator.java:217)
>>   at org.apache.nutch.crawl.Generator$Selector.getPartition(Generator.java:109)
>>   at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
>>   at org.apache.nutch.crawl.Generator$Selector.map(Generator.java:212)
>>   at org.apache.nutch.crawl.Generator$Selector.map(Generator.java:109)
>>   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-414136" daemon prio=10 tid=0x609f8000 nid=0x207b runnable [0x61fad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78fc22d0> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-398562" daemon prio=10 tid=0x611fa000 nid=0x5977 runnable [0x629fe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78f9f6c8> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-390129" daemon prio=10 tid=0x613b1800 nid=0x237d runnable [0x61ffe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78f88f10> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-378882" daemon prio=10 tid=0x62aa2c00 nid=0x6fa6 runnable [0x66cfe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78eaafe0> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-359578" daemon prio=10 tid=0x61c13400 nid=0x1989 runnable [0x621fe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78e82af8> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-350484" daemon prio=10 tid=0x62370000 nid=0x6f36 runnable [0x6325c000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78e5c968> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-349732" daemon prio=10 tid=0x6230f400 nid=0x6be3 runnable [0x632fe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78e4a820> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-347768" daemon prio=10 tid=0x62215c00 nid=0x6327 runnable [0x629ad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78e38340> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-330391" daemon prio=10 tid=0x62b7bc00 nid=0x15ca runnable [0x66e5c000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78e21b58> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-326848" daemon prio=10 tid=0x62d8d800 nid=0x586 runnable [0x632ad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78e0dc38> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-320314" daemon prio=10 tid=0x62fa5c00 nid=0x755c runnable [0x66ead000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78df7c88> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-297230" daemon prio=10 tid=0x634f7000 nid=0x6ec4 runnable [0x6585c000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78dd0b80> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-292864" daemon prio=10 tid=0x63581400 nid=0x5b80 runnable [0x658fe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78d54910> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-290745" daemon prio=10 tid=0x635d5400 nid=0x520a runnable [0x658ad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78d41a60> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-274085" daemon prio=10 tid=0x63cf0000 nid=0x7b1 runnable [0x66cad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78cef510> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-251630" daemon prio=10 tid=0x649d9800 nid=0x1a26 runnable [0x66c5c000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78cd2588> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-247585" daemon prio=10 tid=0x64937800 nid=0x7e96 runnable [0x67a5c000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78ccf420> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-245511" daemon prio=10 tid=0x64c7d000 nid=0x7579 runnable [0x670ad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78caaee0> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-241092" daemon prio=10 tid=0x642a8800 nid=0x61c1 runnable [0x670fe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78c8c898> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-231748" daemon prio=10 tid=0x6430f000 nid=0x3862 runnable [0x66efe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78c6ec58> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-230102" daemon prio=10 tid=0x64319800 nid=0x3124 runnable [0x6705c000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78c5cb10> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-221512" daemon prio=10 tid=0x6475c400 nid=0xabc runnable [0x676fe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78c426f0> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-218763" daemon prio=10 tid=0x64a74400 nid=0x7c9d runnable [0x6765c000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78c30510> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-217142" daemon prio=10 tid=0x64c76000 nid=0x7567 runnable [0x67aad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78c0a908> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-217132" daemon prio=10 tid=0x64cf8000 nid=0x755d runnable [0x676ad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78c0ab48> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-210743" daemon prio=10 tid=0x64ebd400 nid=0x588d runnable [0x684ad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78bef768> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-210232" daemon prio=10 tid=0x64ea1000 nid=0x564a runnable [0x6845c000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78bd8d10> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-209224" daemon prio=10 tid=0x64ee9800 nid=0x51b9 runnable [0x67afe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78bc4e60> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-200609" daemon prio=10 tid=0x6524e000 nid=0x2a23 runnable [0x69b5c000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78a7eae0> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-197978" daemon prio=10 tid=0x65112000 nid=0x1e87 runnable [0x6b65c000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.CRC32.updateBytes(Native Method)
>>   at java.util.zip.CRC32.update(CRC32.java:62)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:242)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-194629" daemon prio=10 tid=0x6545a000 nid=0xe8b runnable [0x684fe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78a3c970> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-193077" daemon prio=10 tid=0x65469800 nid=0x744 runnable [0x6975c000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78a29fa0> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-189394" daemon prio=10 tid=0x656e6000 nid=0x757b runnable [0x697ad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78a01328> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-183163" daemon prio=10 tid=0x65af9000 nid=0x59a6 runnable [0x697fe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x789f8a18> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-178890" daemon prio=10 tid=0x65a63400 nid=0x677c runnable [0x69bad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x789d1368> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-176810" daemon prio=10 tid=0x65f27000 nid=0x2d74 runnable [0x6b6ad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x789bd268> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-175412" daemon prio=10 tid=0x65b19400 nid=0x274c runnable [0x69bfe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x789aa498> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-170353" daemon prio=10 tid=0x65958c00 nid=0x10cf runnable [0x6a45c000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78991418> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-168754" daemon prio=10 tid=0x65ec0400 nid=0xa24 runnable [0x6a4ad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x7897dca0> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-165271" daemon prio=10 tid=0x65e05c00 nid=0x783c runnable [0x6a4fe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x78966738> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-162717" daemon prio=10 tid=0x65d43400 nid=0x6c27 runnable [0x6bcad000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x7894f760> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-162446" daemon prio=10 tid=0x65d38400 nid=0x6af9 runnable [0x6b6fe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x7893c8a8> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-160981" daemon prio=10 tid=0x6605fc00 nid=0x6444 runnable [0x6bc5c000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x789296d0> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-158465" daemon prio=10 tid=0x65fbf800 nid=0x58a8 runnable [0x6bf7b000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x789056e8> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-156032" daemon prio=10 tid=0x66237800 nid=0x4dd6 runnable [0x6bcfe000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x788e6c90> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Thread-143423" daemon prio=10 tid=0x66a19c00 nid=0x1239 runnable [0x6c123000]
>>  java.lang.Thread.State: RUNNABLE
>>   at java.util.zip.Inflater.inflateBytes(Native Method)
>>   at java.util.zip.Inflater.inflate(Inflater.java:255)
>>   - locked <0x780ee910> (a java.util.zip.ZStreamRef)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>>   at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>>   at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>>   at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>>   at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>>   at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>>   at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Java2D Disposer" daemon prio=10 tid=0x6a55cc00 nid=0x2b21 in Object.wait() [0x6c174000]
>>  java.lang.Thread.State: WAITING (on object monitor)
>>   at java.lang.Object.wait(Native Method)
>>   - waiting on <0x766749c8> (a java.lang.ref.ReferenceQueue$Lock)
>>   at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
>>   - locked <0x766749c8> (a java.lang.ref.ReferenceQueue$Lock)
>>   at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
>>   at sun.java2d.Disposer.run(Disposer.java:143)
>>   at java.lang.Thread.run(Thread.java:636)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Low Memory Detector" daemon prio=10 tid=0xb76a6000 nid=0x18cc runnable [0x00000000]
>>  java.lang.Thread.State: RUNNABLE
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "CompilerThread1" daemon prio=10 tid=0xb76a4400 nid=0x18cb waiting on condition [0x00000000]
>>  java.lang.Thread.State: RUNNABLE
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "CompilerThread0" daemon prio=10 tid=0xb76a2400 nid=0x18ca waiting on condition [0x00000000]
>>  java.lang.Thread.State: RUNNABLE
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Signal Dispatcher" daemon prio=10 tid=0xb76a0c00 nid=0x18c9 runnable [0x00000000]
>>  java.lang.Thread.State: RUNNABLE
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Finalizer" daemon prio=10 tid=0xb7691000 nid=0x18c8 in Object.wait() [0x6d77d000]
>>  java.lang.Thread.State: WAITING (on object monitor)
>>   at java.lang.Object.wait(Native Method)
>>   - waiting on <0x75e1a3e8> (a java.lang.ref.ReferenceQueue$Lock)
>>   at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
>>   - locked <0x75e1a3e8> (a java.lang.ref.ReferenceQueue$Lock)
>>   at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
>>   at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "Reference Handler" daemon prio=10 tid=0xb768f800 nid=0x18c7 in Object.wait() [0x6d3bc000]
>>  java.lang.Thread.State: WAITING (on object monitor)
>>   at java.lang.Object.wait(Native Method)
>>   - waiting on <0x75da44f0> (a java.lang.ref.Reference$Lock)
>>   at java.lang.Object.wait(Object.java:502)
>>   at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
>>   - locked <0x75da44f0> (a java.lang.ref.Reference$Lock)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "main" prio=10 tid=0xb7606400 nid=0x18c3 waiting on condition [0xb77b8000]
>>  java.lang.Thread.State: TIMED_WAITING (sleeping)
>>   at java.lang.Thread.sleep(Native Method)
>>   at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:1282)
>>   at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1251)
>>   at org.apache.nutch.crawl.Generator.generate(Generator.java:526)
>>   at org.apache.nutch.crawl.Generator.generate(Generator.java:431)
>>   at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)
>>
>>  Locked ownable synchronizers:
>>   - None
>>
>> "VM Thread" prio=10 tid=0xb768bc00 nid=0x18c6 runnable
>>
>> "GC task thread#0 (ParallelGC)" prio=10 tid=0xb760d800 nid=0x18c4 runnable
>>
>> "GC task thread#1 (ParallelGC)" prio=10 tid=0xb760ec00 nid=0x18c5 runnable
>>
>> "VM Periodic Task Thread" prio=10 tid=0xb76a7c00 nid=0x18cd waiting on condition
>>
>> JNI global references: 1699
>>
>>
>>
>> --
> *
> *Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com

--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g






Re: nutch crawl command takes 98% of cpu

Posted by al...@aim.com.
Hello,

The crawl was in the generate stage when I took that dump, so I ran jstack again during the fetch step. The results are below. I have added zip to my crawl-urlfilter.txt file, so it should not be handling .zip files at all, yet the parser threads are still busy in the ZIP code.
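
One possible explanation, which the dump alone cannot confirm, is that a suffix rule only inspects the URL text and not the content that comes back. The following is a minimal sketch in Python (not Nutch's own code), assuming a hypothetical exclusion rule and made-up URLs; the real crawl-urlfilter.txt may list different extensions.

```python
import re

# Hypothetical exclusion rule written in the style of a crawl-urlfilter.txt
# suffix line; the extension list is an assumption, not the poster's file.
SUFFIX_RULE = re.compile(r"\.(gif|jpg|png|css|zip|gz|exe)$", re.IGNORECASE)


def is_excluded(url):
    """Return True when the suffix rule matches, i.e. the URL would be skipped."""
    return SUFFIX_RULE.search(url) is not None


# Made-up URLs for illustration only.
EXAMPLE_URLS = [
    "http://example.com/files/archive.zip",
    "http://example.com/files/archive.zip?session=abc",
    "http://example.com/download.php?id=42",
]

if __name__ == "__main__":
    for url in EXAMPLE_URLS:
        verdict = "excluded" if is_excluded(url) else "still fetched"
        print(f"{verdict:13} {url}")
```

Only the first URL ends in .zip, so only that one is rejected. If URLs like the other two return ZIP content, a suffix rule of this kind would let them through and the archive would still reach the parser.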

Thanks.
Alex.
----
2011-01-31 13:12:19
Full thread dump OpenJDK Server VM (19.0-b06 mixed mode):

"Thread-685455" daemon prio=10 tid=0x5a66e000 nid=0x78ac runnable [0x00000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"FetcherThread" daemon prio=10 tid=0x5fe21400 nid=0x22eb waiting on condition [0x6c269000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0xa94c1df8> (a java.util.concurrent.FutureTask$Sync)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:257)
    at java.util.concurrent.FutureTask.get(FutureTask.java:119)
    at org.apache.nutch.parse.ParseUtil.runParser(ParseUtil.java:159)
    at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:87)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:879)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:647)

   Locked ownable synchronizers:
    - None

"FetcherThread" daemon prio=10 tid=0x5fe20400 nid=0x22ea sleeping[0x621ad000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:575)

   Locked ownable synchronizers:
    - None

"FetcherThread" daemon prio=10 tid=0x601b6000 nid=0x22e9 sleeping[0x6c85c000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:575)

   Locked ownable synchronizers:
    - None

"FetcherThread" daemon prio=10 tid=0x60169c00 nid=0x22e8 waiting on condition [0x6caad000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:575)

   Locked ownable synchronizers:
    - None

"FetcherThread" daemon prio=10 tid=0x601fac00 nid=0x22e7 waiting on condition [0x6c216000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:575)

   Locked ownable synchronizers:
    - None

"FetcherThread" daemon prio=10 tid=0x5fe09c00 nid=0x22e6 sleeping[0x6c8fe000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:575)

   Locked ownable synchronizers:
    - None

"FetcherThread" daemon prio=10 tid=0x5fe2f000 nid=0x22e5 runnable [0x6c3fe000]
   java.lang.Thread.State: RUNNABLE
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:310)
    - locked <0xa896c3d8> (a java.net.SocksSocketImpl)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:176)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:163)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
    at java.net.Socket.connect(Socket.java:546)
    at org.apache.nutch.protocol.http.HttpResponse.<init>(HttpResponse.java:97)
    at org.apache.nutch.protocol.http.Http.getResponse(Http.java:64)
    at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:224)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:628)

   Locked ownable synchronizers:
    - None

"FetcherThread" daemon prio=10 tid=0x60176800 nid=0x22e4 sleeping[0x6c1c5000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:575)

   Locked ownable synchronizers:
    - None

"FetcherThread" daemon prio=10 tid=0x60369800 nid=0x22e3 sleeping[0x6c35c000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:575)

   Locked ownable synchronizers:
    - None

"FetcherThread" daemon prio=10 tid=0x601b6c00 nid=0x22e2 sleeping[0x6cafe000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:575)

   Locked ownable synchronizers:
    - None

"QueueFeeder" daemon prio=10 tid=0x6039d000 nid=0x22e1 waiting on condition [0x6ca5c000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.nutch.fetcher.Fetcher$QueueFeeder.run(Fetcher.java:500)

   Locked ownable synchronizers:
    - None

"SpillThread" daemon prio=10 tid=0x60174400 nid=0x22e0 waiting on condition [0x6c3ad000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x7fa3c8d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1169)

   Locked ownable synchronizers:
    - None

"communication thread" daemon prio=10 tid=0x602fa800 nid=0x22df waiting on condition [0x6c8ad000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:529)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-415395" prio=10 tid=0x6cb96c00 nid=0x22db waiting on condition [0x6c2ba000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:1034)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)

   Locked ownable synchronizers:
    - None

"Attach Listener" daemon prio=10 tid=0x6cb21800 nid=0x1e95 waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"Thread-414136" daemon prio=10 tid=0x609f8000 nid=0x207b runnable [0x61fad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78fb8e70> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-398562" daemon prio=10 tid=0x611fa000 nid=0x5977 runnable [0x629fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78f9ce10> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-390129" daemon prio=10 tid=0x613b1800 nid=0x237d runnable [0x61ffe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78f86670> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-378882" daemon prio=10 tid=0x62aa2c00 nid=0x6fa6 runnable [0x66cfe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78ea8780> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-359578" daemon prio=10 tid=0x61c13400 nid=0x1989 runnable [0x621fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78e802d0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-350484" daemon prio=10 tid=0x62370000 nid=0x6f36 runnable [0x6325c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78e5a140> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-349732" daemon prio=10 tid=0x6230f400 nid=0x6be3 runnable [0x632fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78e47ff8> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-347768" daemon prio=10 tid=0x62215c00 nid=0x6327 runnable [0x629ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78e35b18> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-330391" daemon prio=10 tid=0x62b7bc00 nid=0x15ca runnable [0x66e5c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78e1f330> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-326848" daemon prio=10 tid=0x62d8d800 nid=0x586 runnable [0x632ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78e0b410> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-320314" daemon prio=10 tid=0x62fa5c00 nid=0x755c runnable [0x66ead000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78df5460> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-297230" daemon prio=10 tid=0x634f7000 nid=0x6ec4 runnable [0x6585c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78dce3b8> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-292864" daemon prio=10 tid=0x63581400 nid=0x5b80 runnable [0x658fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78d52148> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-290745" daemon prio=10 tid=0x635d5400 nid=0x520a runnable [0x658ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78d3f298> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-274085" daemon prio=10 tid=0x63cf0000 nid=0x7b1 runnable [0x66cad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78cecd68> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-251630" daemon prio=10 tid=0x649d9800 nid=0x1a26 runnable [0x66c5c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78ccfde0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-247585" daemon prio=10 tid=0x64937800 nid=0x7e96 runnable [0x67a5c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78cccc78> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-245511" daemon prio=10 tid=0x64c7d000 nid=0x7579 runnable [0x670ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78ca8738> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-241092" daemon prio=10 tid=0x642a8800 nid=0x61c1 runnable [0x670fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c8a0f0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-231748" daemon prio=10 tid=0x6430f000 nid=0x3862 runnable [0x66efe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c6c4b0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-230102" daemon prio=10 tid=0x64319800 nid=0x3124 runnable [0x6705c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c5a368> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-221512" daemon prio=10 tid=0x6475c400 nid=0xabc runnable [0x676fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c3ff48> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-218763" daemon prio=10 tid=0x64a74400 nid=0x7c9d runnable [0x6765c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c2dd68> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-217142" daemon prio=10 tid=0x64c76000 nid=0x7567 runnable [0x67aad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c08160> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-217132" daemon prio=10 tid=0x64cf8000 nid=0x755d runnable [0x676ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c083a0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-210743" daemon prio=10 tid=0x64ebd400 nid=0x588d runnable [0x684ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78becfc0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-210232" daemon prio=10 tid=0x64ea1000 nid=0x564a runnable [0x6845c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78bd6568> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-209224" daemon prio=10 tid=0x64ee9800 nid=0x51b9 runnable [0x67afe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78bc26b8> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-200609" daemon prio=10 tid=0x6524e000 nid=0x2a23 runnable [0x69b5c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78a7c338> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-197978" daemon prio=10 tid=0x65112000 nid=0x1e87 runnable [0x6b65c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78a55810> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-194629" daemon prio=10 tid=0x6545a000 nid=0xe8b runnable [0x684fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78a3a1c8> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-193077" daemon prio=10 tid=0x65469800 nid=0x744 runnable [0x6975c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78a27908> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-189394" daemon prio=10 tid=0x656e6000 nid=0x757b runnable [0x697ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x789fec90> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-183163" daemon prio=10 tid=0x65af9000 nid=0x59a6 runnable [0x697fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x789f6380> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-178890" daemon prio=10 tid=0x65a63400 nid=0x677c runnable [0x69bad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x789cecd0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-176810" daemon prio=10 tid=0x65f27000 nid=0x2d74 runnable [0x6b6ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x789babd0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-175412" daemon prio=10 tid=0x65b19400 nid=0x274c runnable [0x69bfe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x789a7e00> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-170353" daemon prio=10 tid=0x65958c00 nid=0x10cf runnable [0x6a45c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x7898eda0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-168754" daemon prio=10 tid=0x65ec0400 nid=0xa24 runnable [0x6a4ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x7897b628> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-165271" daemon prio=10 tid=0x65e05c00 nid=0x783c runnable [0x6a4fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x789640c0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-162717" daemon prio=10 tid=0x65d43400 nid=0x6c27 runnable [0x6bcad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x7894d0e8> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-162446" daemon prio=10 tid=0x65d38400 nid=0x6af9 runnable [0x6b6fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x7893a230> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-160981" daemon prio=10 tid=0x6605fc00 nid=0x6444 runnable [0x6bc5c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78927058> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-158465" daemon prio=10 tid=0x65fbf800 nid=0x58a8 runnable [0x6bf7b000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78903070> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-156032" daemon prio=10 tid=0x66237800 nid=0x4dd6 runnable [0x6bcfe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x788e4618> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-143423" daemon prio=10 tid=0x66a19c00 nid=0x1239 runnable [0x6c123000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x780ee8d8> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Java2D Disposer" daemon prio=10 tid=0x6a55cc00 nid=0x2b21 in Object.wait() [0x6c174000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x76674990> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
    - locked <0x76674990> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
    at sun.java2d.Disposer.run(Disposer.java:143)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Low Memory Detector" daemon prio=10 tid=0xb76a6000 nid=0x18cc runnable [0x00000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"CompilerThread1" daemon prio=10 tid=0xb76a4400 nid=0x18cb waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"CompilerThread0" daemon prio=10 tid=0xb76a2400 nid=0x18ca waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"Signal Dispatcher" daemon prio=10 tid=0xb76a0c00 nid=0x18c9 runnable [0x00000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"Finalizer" daemon prio=10 tid=0xb7691000 nid=0x18c8 in Object.wait() [0x6d77d000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x75e1a3b0> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
    - locked <0x75e1a3b0> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

   Locked ownable synchronizers:
    - None

"Reference Handler" daemon prio=10 tid=0xb768f800 nid=0x18c7 in Object.wait() [0x6d3bc000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x75da44d0> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:502)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
    - locked <0x75da44d0> (a java.lang.ref.Reference$Lock)

   Locked ownable synchronizers:
    - None

"main" prio=10 tid=0xb7606400 nid=0x18c3 waiting on condition [0xb77b8000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:1282)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1251)
    at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1107)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:133)

   Locked ownable synchronizers:
    - None

"VM Thread" prio=10 tid=0xb768bc00 nid=0x18c6 runnable 

"GC task thread#0 (ParallelGC)" prio=10 tid=0xb760d800 nid=0x18c4 runnable 

"GC task thread#1 (ParallelGC)" prio=10 tid=0xb760ec00 nid=0x18c5 runnable 

"VM Periodic Task Thread" prio=10 tid=0xb76a7c00 nid=0x18cd waiting on condition 

JNI global references: 1629

----

-----Original Message-----
From: Julien Nioche <li...@gmail.com>
To: user <us...@nutch.apache.org>
Sent: Sat, Jan 29, 2011 5:56 am
Subject: Re: nutch crawl command takes 98% of cpu


Hi,

This shows the state of the various threads within a Java process. Most of
them seem to be busy parsing zip archives with Tika. The interesting part is
that the main thread is at the Generation step:

  at org.apache.nutch.crawl.Generator.generate(Generator.java:431)
  at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)

with "Thread-415331" normalizing the URLs as part of the generation.
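
To get this kind of overview without reading every stack by hand, the dump
can be summarised by counting RUNNABLE threads per top stack frame. The
following is a minimal sketch, assuming raw jstack output (not the
">"-quoted, line-wrapped form seen in mail); the class and method names
DumpSummary and runnableByTopFrame are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

final class DumpSummary {
    /** Counts RUNNABLE threads in a jstack dump, grouped by their top stack frame. */
    static Map<String, Integer> runnableByTopFrame(String dump) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String[] lines = dump.split("\\R");
        for (int i = 0; i + 1 < lines.length; i++) {
            if (!lines[i].contains("java.lang.Thread.State: RUNNABLE")) {
                continue;
            }
            String next = lines[i + 1].trim();
            if (!next.startsWith("at ")) {
                continue; // JVM-internal threads are listed without frames
            }
            String frame = next.substring(3);
            int paren = frame.indexOf('(');
            if (paren >= 0) {
                frame = frame.substring(0, paren); // drop "(File.java:123)"
            }
            counts.merge(frame, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java DumpSummary <jstack-output-file>");
            return;
        }
        String dump = new String(Files.readAllBytes(Paths.get(args[0])));
        runnableByTopFrame(dump).forEach(
            (frame, n) -> System.out.println(n + "  " + frame));
    }
}
```

On a dump like the one in this thread, most of the count would be expected
to land on java.util.zip.Inflater.inflateBytes, which is the top frame of
the zip-parsing threads.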

So why do we see threads busy parsing these archives? I think this is a
result of the timeout mechanism
(https://issues.apache.org/jira/browse/NUTCH-696) used for the parsing.
Before it, the parsing step could loop on a single document and never
complete. Thanks to Andrzej's patch, the parsing is done in separate
threads, which are abandoned if more than X seconds have passed (default 30,
I think). Obviously these threads are still lurking around in the background
and consuming CPU.
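
The effect can be reproduced outside Nutch. The sketch below is a generic
illustration of a timed Future, not the NUTCH-696 code; the names
ParseTimeoutSketch and survivesCancel are hypothetical. It assumes a task
that loops forever, waits for it with a timeout, then cancels it. The
cancel only sets the interrupt flag, so a task that never checks that flag
keeps running.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

final class ParseTimeoutSketch {
    /**
     * Starts a looping task, gives up on it after 100 ms, cancels it, and
     * reports whether its thread is still running 500 ms later.
     */
    static boolean survivesCancel(boolean honoursInterrupt) {
        AtomicBoolean release = new AtomicBoolean(false); // lets the sketch clean up
        ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "parse-sketch");
            t.setDaemon(true);
            return t;
        });
        Future<?> parse = pool.submit(() -> {
            // Stand-in for a parser stuck in a loop on bad input.
            while (!release.get()) {
                if (honoursInterrupt && Thread.currentThread().isInterrupted()) {
                    return;
                }
            }
        });
        try {
            try {
                parse.get(100, TimeUnit.MILLISECONDS);
                return false; // the task finished within the time limit
            } catch (TimeoutException e) {
                parse.cancel(true); // only sets the thread's interrupt flag
            }
            pool.shutdown();
            return !pool.awaitTermination(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            release.set(true); // stop the stand-in loop so the sketch does not spin on
        }
    }

    public static void main(String[] args) {
        System.out.println("ignores interrupt, still running: " + survivesCancel(false));
        System.out.println("checks interrupt, still running:  " + survivesCancel(true));
    }
}
```

A parser blocked in a native call or a tight loop behaves like the first
case: the caller moves on after the timeout, but the thread stays RUNNABLE.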

This is an issue only when calling the Crawl command. When using the
separate commands for the various steps, the runaway threads die with the
main process. However, since Crawl runs every step in a single process,
these timed-out threads keep going.

I am not an expert in multithreading and don't know whether these
threads could be killed somehow. Andrzej, any clue?

It would be interesting from a Tika point of view to know which documents
caused this. Alex, is there a trace of the URLs in your logs? It could be
something like the content being trimmed and causing the parser to go into
a loop; either way, it would be good to identify the source of the problem.

I have to admit that I am not a big fan of the all-in-one Crawl command. One
way to alleviate the problem would be not to use it and to call the separate
commands individually, which also has the merit of giving a better idea of
what goes on under the bonnet. I'd rather we shipped a nice and tidy shell
script to achieve the same goals as the Crawl command; it would also replace
the numerous and somewhat faulty scripts that can be found on the Wiki. It
seems that this is a feature that people often request or comment on.
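
As a rough illustration of the step-by-step approach (written in Java here
rather than as the shell script proposed above), each step can be launched
as its own OS process, so any runaway parsing thread ends when that process
exits. This is only a sketch under assumptions: the class
StepwiseCrawlSketch and its methods are hypothetical, paths are local, and
parsing is assumed to happen inside the fetch step (otherwise a separate
"parse" step would go between fetch and updatedb).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

final class StepwiseCrawlSketch {
    /** Runs one command as its own OS process and waits for it to exit. */
    static void run(List<String> command) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command).inheritIO().start();
        int exit = p.waitFor();
        if (exit != 0) {
            throw new IllegalStateException(command + " exited with " + exit);
        }
    }

    /** Segment directories are named by timestamp, so the greatest name is the newest. */
    static String newestSegment(Path segmentsDir) {
        try (Stream<Path> children = Files.list(segmentsDir)) {
            return children.filter(Files::isDirectory)
                    .map(Path::toString)
                    .max(String::compareTo)
                    .orElseThrow(() -> new IllegalStateException("no segment in " + segmentsDir));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** One generate/fetch/updatedb round, each step in a separate process. */
    static void oneRound(String nutch, String crawlDb, Path segmentsDir)
            throws IOException, InterruptedException {
        run(Arrays.asList(nutch, "generate", crawlDb, segmentsDir.toString()));
        String segment = newestSegment(segmentsDir);
        run(Arrays.asList(nutch, "fetch", segment));
        run(Arrays.asList(nutch, "updatedb", crawlDb, segment));
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 4) {
            System.err.println("usage: java StepwiseCrawlSketch <bin/nutch> <crawldb> <segments> <depth>");
            return;
        }
        int depth = Integer.parseInt(args[3]);
        for (int i = 0; i < depth; i++) {
            oneRound(args[0], args[1], Paths.get(args[2]));
        }
    }
}
```

The seed URLs are assumed to have been injected into the crawldb beforehand
with the inject command.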

Any thoughts?

Alex, would you mind opening an issue on JIRA for this? It would be great if
you could see whether the URLs causing the parsing to loop can be found in
the logs, and whether the same issue can be reproduced with the latest
version of Tika.

Thanks

Julien


On 28 January 2011 21:53, <al...@aim.com> wrote:

> Hello,
>
> I did jstack and the result is below.  Could you please let me know how to
> interpret it?
>
> ----------------------------------------------------------------
>
>
>
>  2011-01-28 13:46:50
> Full thread dump OpenJDK Server VM (19.0-b06 mixed mode):
>
> "Attach Listener" daemon prio=10 tid=0x6cb21800 nid=0x1e95 waiting on
> condition [0x00000000]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>    - None
>
> "SpillThread" daemon prio=10 tid=0x6053c400 nid=0x1e18 waiting on condition
> [0x6c3ad000]
>   java.lang.Thread.State: WAITING (parking)
>    at sun.misc.Unsafe.park(Native Method)
>    - parking to wait for  <0x7f9a8768> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>    at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>    at
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1169)
>
>   Locked ownable synchronizers:
>    - None
>
> "communication thread" daemon prio=10 tid=0x607bd400 nid=0x1e17 waiting on
> condition [0x6c8ad000]
>   java.lang.Thread.State: TIMED_WAITING (sleeping)
>    at java.lang.Thread.sleep(Native Method)
>    at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:529)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-415331" prio=10 tid=0x6cb96c00 nid=0x175f runnable [0x6c2ba000]
>   java.lang.Thread.State: RUNNABLE
>    at org.apache.oro.text.regex.Perl5Matcher.__matchUnicodeClass(Unknown
> Source)
>    at org.apache.oro.text.regex.Perl5Matcher.__repeat(Unknown Source)
>    at org.apache.oro.text.regex.Perl5Matcher.__match(Unknown Source)
>    at org.apache.oro.text.regex.Perl5Matcher.__match(Unknown Source)
>    at org.apache.oro.text.regex.Perl5Matcher.__match(Unknown Source)
>    at org.apache.oro.text.regex.Perl5Matcher.__tryExpression(Unknown
> Source)
>    at org.apache.oro.text.regex.Perl5Matcher.__interpret(Unknown Source)
>    at org.apache.oro.text.regex.Perl5Matcher.contains(Unknown Source)
>    at org.apache.oro.text.regex.Util.substitute(Unknown Source)
>    at org.apache.oro.text.regex.Util.substitute(Unknown Source)
>    at
> org.apache.nutch.net.urlnormalizer.basic.BasicURLNormalizer.substituteUnnecessaryRelativePaths(BasicURLNormalizer.java:166)
>    at
> org.apache.nutch.net.urlnormalizer.basic.BasicURLNormalizer.normalize(BasicURLNormalizer.java:125)
>    at
> org.apache.nutch.net.URLNormalizers.normalize(URLNormalizers.java:286)
>    at
> org.apache.nutch.crawl.URLPartitioner.getPartition(URLPartitioner.java:69)
>    at
> org.apache.nutch.crawl.URLPartitioner.getPartition(URLPartitioner.java:36)
>    at
> org.apache.nutch.crawl.Generator$Selector.getPartition(Generator.java:217)
>    at
> org.apache.nutch.crawl.Generator$Selector.getPartition(Generator.java:109)
>    at
> org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
>    at org.apache.nutch.crawl.Generator$Selector.map(Generator.java:212)
>    at org.apache.nutch.crawl.Generator$Selector.map(Generator.java:109)
>    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>    at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-414136" daemon prio=10 tid=0x609f8000 nid=0x207b runnable
> [0x61fad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78fc22d0> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-398562" daemon prio=10 tid=0x611fa000 nid=0x5977 runnable
> [0x629fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78f9f6c8> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-390129" daemon prio=10 tid=0x613b1800 nid=0x237d runnable
> [0x61ffe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78f88f10> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-378882" daemon prio=10 tid=0x62aa2c00 nid=0x6fa6 runnable
> [0x66cfe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78eaafe0> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-359578" daemon prio=10 tid=0x61c13400 nid=0x1989 runnable
> [0x621fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78e82af8> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-350484" daemon prio=10 tid=0x62370000 nid=0x6f36 runnable
> [0x6325c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78e5c968> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-349732" daemon prio=10 tid=0x6230f400 nid=0x6be3 runnable
> [0x632fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78e4a820> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-347768" daemon prio=10 tid=0x62215c00 nid=0x6327 runnable
> [0x629ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78e38340> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-330391" daemon prio=10 tid=0x62b7bc00 nid=0x15ca runnable
> [0x66e5c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78e21b58> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-326848" daemon prio=10 tid=0x62d8d800 nid=0x586 runnable
> [0x632ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78e0dc38> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-320314" daemon prio=10 tid=0x62fa5c00 nid=0x755c runnable
> [0x66ead000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78df7c88> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-297230" daemon prio=10 tid=0x634f7000 nid=0x6ec4 runnable
> [0x6585c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78dd0b80> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-292864" daemon prio=10 tid=0x63581400 nid=0x5b80 runnable [0x658fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78d54910> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-290745" daemon prio=10 tid=0x635d5400 nid=0x520a runnable [0x658ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78d41a60> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-274085" daemon prio=10 tid=0x63cf0000 nid=0x7b1 runnable [0x66cad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78cef510> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-251630" daemon prio=10 tid=0x649d9800 nid=0x1a26 runnable [0x66c5c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78cd2588> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-247585" daemon prio=10 tid=0x64937800 nid=0x7e96 runnable [0x67a5c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78ccf420> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-245511" daemon prio=10 tid=0x64c7d000 nid=0x7579 runnable [0x670ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78caaee0> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-241092" daemon prio=10 tid=0x642a8800 nid=0x61c1 runnable [0x670fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c8c898> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-231748" daemon prio=10 tid=0x6430f000 nid=0x3862 runnable [0x66efe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c6ec58> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-230102" daemon prio=10 tid=0x64319800 nid=0x3124 runnable [0x6705c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c5cb10> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-221512" daemon prio=10 tid=0x6475c400 nid=0xabc runnable [0x676fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c426f0> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-218763" daemon prio=10 tid=0x64a74400 nid=0x7c9d runnable [0x6765c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c30510> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-217142" daemon prio=10 tid=0x64c76000 nid=0x7567 runnable [0x67aad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c0a908> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-217132" daemon prio=10 tid=0x64cf8000 nid=0x755d runnable [0x676ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c0ab48> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-210743" daemon prio=10 tid=0x64ebd400 nid=0x588d runnable [0x684ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78bef768> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-210232" daemon prio=10 tid=0x64ea1000 nid=0x564a runnable [0x6845c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78bd8d10> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-209224" daemon prio=10 tid=0x64ee9800 nid=0x51b9 runnable [0x67afe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78bc4e60> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-200609" daemon prio=10 tid=0x6524e000 nid=0x2a23 runnable [0x69b5c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78a7eae0> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-197978" daemon prio=10 tid=0x65112000 nid=0x1e87 runnable [0x6b65c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.CRC32.updateBytes(Native Method)
>    at java.util.zip.CRC32.update(CRC32.java:62)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:242)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-194629" daemon prio=10 tid=0x6545a000 nid=0xe8b runnable [0x684fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78a3c970> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-193077" daemon prio=10 tid=0x65469800 nid=0x744 runnable [0x6975c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78a29fa0> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-189394" daemon prio=10 tid=0x656e6000 nid=0x757b runnable
> [0x697ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78a01328> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-183163" daemon prio=10 tid=0x65af9000 nid=0x59a6 runnable
> [0x697fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x789f8a18> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-178890" daemon prio=10 tid=0x65a63400 nid=0x677c runnable
> [0x69bad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x789d1368> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-176810" daemon prio=10 tid=0x65f27000 nid=0x2d74 runnable
> [0x6b6ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x789bd268> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-175412" daemon prio=10 tid=0x65b19400 nid=0x274c runnable
> [0x69bfe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x789aa498> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-170353" daemon prio=10 tid=0x65958c00 nid=0x10cf runnable
> [0x6a45c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78991418> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-168754" daemon prio=10 tid=0x65ec0400 nid=0xa24 runnable
> [0x6a4ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x7897dca0> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-165271" daemon prio=10 tid=0x65e05c00 nid=0x783c runnable
> [0x6a4fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78966738> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-162717" daemon prio=10 tid=0x65d43400 nid=0x6c27 runnable
> [0x6bcad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x7894f760> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-162446" daemon prio=10 tid=0x65d38400 nid=0x6af9 runnable
> [0x6b6fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x7893c8a8> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-160981" daemon prio=10 tid=0x6605fc00 nid=0x6444 runnable
> [0x6bc5c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x789296d0> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-158465" daemon prio=10 tid=0x65fbf800 nid=0x58a8 runnable
> [0x6bf7b000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x789056e8> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-156032" daemon prio=10 tid=0x66237800 nid=0x4dd6 runnable
> [0x6bcfe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x788e6c90> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-143423" daemon prio=10 tid=0x66a19c00 nid=0x1239 runnable
> [0x6c123000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x780ee910> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Java2D Disposer" daemon prio=10 tid=0x6a55cc00 nid=0x2b21 in Object.wait()
> [0x6c174000]
>   java.lang.Thread.State: WAITING (on object monitor)
>    at java.lang.Object.wait(Native Method)
>    - waiting on <0x766749c8> (a java.lang.ref.ReferenceQueue$Lock)
>    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
>    - locked <0x766749c8> (a java.lang.ref.ReferenceQueue$Lock)
>    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
>    at sun.java2d.Disposer.run(Disposer.java:143)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Low Memory Detector" daemon prio=10 tid=0xb76a6000 nid=0x18cc runnable
> [0x00000000]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>    - None
>
> "CompilerThread1" daemon prio=10 tid=0xb76a4400 nid=0x18cb waiting on
> condition [0x00000000]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>    - None
>
> "CompilerThread0" daemon prio=10 tid=0xb76a2400 nid=0x18ca waiting on
> condition [0x00000000]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>    - None
>
> "Signal Dispatcher" daemon prio=10 tid=0xb76a0c00 nid=0x18c9 runnable
> [0x00000000]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>    - None
>
> "Finalizer" daemon prio=10 tid=0xb7691000 nid=0x18c8 in Object.wait()
> [0x6d77d000]
>   java.lang.Thread.State: WAITING (on object monitor)
>    at java.lang.Object.wait(Native Method)
>    - waiting on <0x75e1a3e8> (a java.lang.ref.ReferenceQueue$Lock)
>    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
>    - locked <0x75e1a3e8> (a java.lang.ref.ReferenceQueue$Lock)
>    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
>    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)
>
>   Locked ownable synchronizers:
>    - None
>
> "Reference Handler" daemon prio=10 tid=0xb768f800 nid=0x18c7 in
> Object.wait() [0x6d3bc000]
>   java.lang.Thread.State: WAITING (on object monitor)
>    at java.lang.Object.wait(Native Method)
>    - waiting on <0x75da44f0> (a java.lang.ref.Reference$Lock)
>    at java.lang.Object.wait(Object.java:502)
>    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
>    - locked <0x75da44f0> (a java.lang.ref.Reference$Lock)
>
>   Locked ownable synchronizers:
>    - None
>
> "main" prio=10 tid=0xb7606400 nid=0x18c3 waiting on condition [0xb77b8000]
>   java.lang.Thread.State: TIMED_WAITING (sleeping)
>    at java.lang.Thread.sleep(Native Method)
>    at
> org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:1282)
>    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1251)
>    at org.apache.nutch.crawl.Generator.generate(Generator.java:526)
>    at org.apache.nutch.crawl.Generator.generate(Generator.java:431)
>    at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)
>
>   Locked ownable synchronizers:
>    - None
>
> "VM Thread" prio=10 tid=0xb768bc00 nid=0x18c6 runnable
>
> "GC task thread#0 (ParallelGC)" prio=10 tid=0xb760d800 nid=0x18c4 runnable
>
> "GC task thread#1 (ParallelGC)" prio=10 tid=0xb760ec00 nid=0x18c5 runnable
>
> "VM Periodic Task Thread" prio=10 tid=0xb76a7c00 nid=0x18cd waiting on
> condition
>
> JNI global references: 1699
>
>
>
> --
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com


 

Re: nutch crawl command takes 98% of cpu

Posted by Julien Nioche <li...@gmail.com>.
Hi,

This shows the state of the various threads within a Java process. Most of
them seem to be busy parsing zip archives with Tika. The interesting part is
that the main thread is at the generation step:

  at org.apache.nutch.crawl.Generator.generate(Generator.java:431)
  at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)

with the "Thread-415331" normalizing the URLs as part of the generation.
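
For anyone who wants to check that reading of the dump mechanically, here is a rough Java sketch that counts how many thread entries contain a given stack frame. The names ThreadDumpTally and countThreadsWithFrame are made up for illustration and are not part of Nutch or the JDK. It assumes each thread entry starts with the quoted thread name, as in the jstack output quoted below, and it strips mail-style "> " quoting first.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only.
final class ThreadDumpTally {

    /** Counts thread entries in a jstack dump whose stack mentions the given frame text. */
    static int countThreadsWithFrame(List<String> dumpLines, String frame) {
        int count = 0;
        boolean inThread = false; // true once a thread header has been seen
        boolean matched = false;  // true once the current thread has been counted
        for (String raw : dumpLines) {
            // Drop any leading "> " mail quoting, then surrounding whitespace.
            String line = raw.replaceFirst("^(>\\s?)+", "").trim();
            if (line.startsWith("\"")) {
                // A new thread entry: headers begin with the quoted thread name.
                inThread = true;
                matched = false;
            } else if (inThread && !matched && line.contains(frame)) {
                count++;
                matched = true;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Tiny made-up sample in the same shape as the quoted dump.
        List<String> sample = Arrays.asList(
            "> \"Thread-1\" daemon prio=10 runnable",
            ">    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)",
            "> \"Thread-2\" daemon prio=10 runnable",
            ">    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)",
            "> \"main\" prio=10 waiting on condition",
            ">    at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)");
        System.out.println(countThreadsWithFrame(sample, "TikaParser.getParse"));
    }
}
```

Run against a saved dump, the same method with "TikaParser.getParse" as the frame would give the number of threads still inside the Tika parser.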

So why do we see threads busy at parsing these archives? I think this is a
result of the Timeout mechanism (
https://issues.apache.org/jira/browse/NUTCH-696) used for the parsing.
Before it, we used to have the parsing step loop on a single document and
never complete. Thanks to Andrzej's patch, the parsing is done in separate
threads which are abandoned if more than X seconds have passed (default 30
I think). Obviously these threads are still lurking around in the background
and consuming CPU.
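
The effect described above can be reproduced outside Nutch. The following is a minimal sketch, not the NUTCH-696 code: it runs a task in its own thread, waits on it with a timeout, and then checks whether the worker is still alive once the caller has given up. The class name AbandonedParseDemo, the busy loop standing in for a stuck parser, and the 200 ms timeout are all assumptions made for the demo.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class, for illustration only.
final class AbandonedParseDemo {

    /** Returns true if the worker thread is still running after the caller's timeout expired. */
    static boolean workerSurvivesTimeout(long timeoutMillis) {
        // The stop flag exists only so the demo can shut its own worker down.
        AtomicBoolean stop = new AtomicBoolean(false);
        FutureTask<String> task = new FutureTask<>(() -> {
            while (!stop.get()) {
                // Stand-in for a parser looping on bad input: burns CPU.
            }
            return "parsed";
        });
        Thread worker = new Thread(task, "parse-worker");
        worker.setDaemon(true);
        worker.start();

        boolean aliveAfterTimeout = false;
        try {
            task.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // The caller moves on here, but nothing has stopped the worker.
            aliveAfterTimeout = worker.isAlive();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            stop.set(true); // demo cleanup only
        }
        try {
            worker.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return aliveAfterTimeout;
    }

    public static void main(String[] args) {
        System.out.println("worker alive after timeout: " + workerSurvivesTimeout(200));
    }
}
```

Without the cleanup flag, the worker would keep spinning for as long as the JVM lives, which matches the many RUNNABLE parse threads in the dump.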

This is an issue only when using the Crawl command. When using the
separate commands for the various steps, the runaway threads die with the
main process. However, since Crawl uses a single process, these timed-out
threads keep going.

I am not an expert in multithreading and don't know whether these
threads could be killed somehow. Andrzej, any clue?
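
On the question of stopping such threads: in Java, FutureTask.cancel(true) only interrupts the worker thread, and interruption is cooperative, so it helps only if the running code checks the interrupt flag or blocks in an interruptible call (Thread.stop() is deprecated). The sketch below shows the cooperative case. The class name CooperativeCancelDemo and the polling loop are assumptions for illustration, and whether the Tika and commons-compress code in the dump reacts to interruption is not established in this thread.

```java
import java.util.concurrent.FutureTask;

// Hypothetical demo class, for illustration only.
final class CooperativeCancelDemo {

    /** Cancels a busy loop that polls the interrupt flag and reports whether its thread ended. */
    static boolean cancelStopsPollingLoop() {
        FutureTask<Void> task = new FutureTask<>(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                // Simulated CPU-bound work that checks for interruption on each pass.
            }
            return null;
        });
        Thread worker = new Thread(task, "cooperative-worker");
        worker.setDaemon(true);
        worker.start();

        // Interrupts the worker if the task is already running; if it has not
        // started yet, the cancelled task is simply skipped by the worker.
        task.cancel(true);

        try {
            worker.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped after cancel: " + cancelStopsPollingLoop());
    }
}
```

A loop that never looks at the interrupt flag would ignore the cancel and keep running, so this approach depends on the parsing code cooperating.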

It would be interesting from a Tika point of view to know which documents
caused this. Alex, is there a trace of the URLs in your logs? It could be
something like the content being trimmed and causing the parser to go into a
loop. Either way, it would be good to identify the source of the problem.

I have to admit that I am not a big fan of the all-in-one Crawl command. One
way to alleviate the problem would be not to use it and to call the separate
commands individually, which also has the merit of giving a better idea of
what goes on under the bonnet. I'd rather we shipped a nice and tidy shell
script to achieve the same goals as the Crawl command. It would also replace
the numerous and somewhat faulty scripts that can be found on the Wiki. It
seems that this is a feature that people often request or comment on.

Any thoughts?

Alex, would you mind opening an issue on JIRA for this? It would be great if
you could check whether the URLs causing the parsing to loop can be found in
the logs and whether the same issue can be reproduced with the latest version
of Tika.

Thanks

Julien


On 28 January 2011 21:53, <al...@aim.com> wrote:

> Hello,
>
> I did jstack and the result is below.  Could you please let me know how to
> interpret it?
>
> ----------------------------------------------------------------
>
>
>
>  2011-01-28 13:46:50
> Full thread dump OpenJDK Server VM (19.0-b06 mixed mode):
>
> "Attach Listener" daemon prio=10 tid=0x6cb21800 nid=0x1e95 waiting on
> condition [0x00000000]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>    - None
>
> "SpillThread" daemon prio=10 tid=0x6053c400 nid=0x1e18 waiting on condition
> [0x6c3ad000]
>   java.lang.Thread.State: WAITING (parking)
>    at sun.misc.Unsafe.park(Native Method)
>    - parking to wait for  <0x7f9a8768> (a
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>    at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>    at
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1169)
>
>   Locked ownable synchronizers:
>    - None
>
> "communication thread" daemon prio=10 tid=0x607bd400 nid=0x1e17 waiting on
> condition [0x6c8ad000]
>   java.lang.Thread.State: TIMED_WAITING (sleeping)
>    at java.lang.Thread.sleep(Native Method)
>    at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:529)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-415331" prio=10 tid=0x6cb96c00 nid=0x175f runnable [0x6c2ba000]
>   java.lang.Thread.State: RUNNABLE
>    at org.apache.oro.text.regex.Perl5Matcher.__matchUnicodeClass(Unknown
> Source)
>    at org.apache.oro.text.regex.Perl5Matcher.__repeat(Unknown Source)
>    at org.apache.oro.text.regex.Perl5Matcher.__match(Unknown Source)
>    at org.apache.oro.text.regex.Perl5Matcher.__match(Unknown Source)
>    at org.apache.oro.text.regex.Perl5Matcher.__match(Unknown Source)
>    at org.apache.oro.text.regex.Perl5Matcher.__tryExpression(Unknown
> Source)
>    at org.apache.oro.text.regex.Perl5Matcher.__interpret(Unknown Source)
>    at org.apache.oro.text.regex.Perl5Matcher.contains(Unknown Source)
>    at org.apache.oro.text.regex.Util.substitute(Unknown Source)
>    at org.apache.oro.text.regex.Util.substitute(Unknown Source)
>    at
> org.apache.nutch.net.urlnormalizer.basic.BasicURLNormalizer.substituteUnnecessaryRelativePaths(BasicURLNormalizer.java:166)
>    at
> org.apache.nutch.net.urlnormalizer.basic.BasicURLNormalizer.normalize(BasicURLNormalizer.java:125)
>    at
> org.apache.nutch.net.URLNormalizers.normalize(URLNormalizers.java:286)
>    at
> org.apache.nutch.crawl.URLPartitioner.getPartition(URLPartitioner.java:69)
>    at
> org.apache.nutch.crawl.URLPartitioner.getPartition(URLPartitioner.java:36)
>    at
> org.apache.nutch.crawl.Generator$Selector.getPartition(Generator.java:217)
>    at
> org.apache.nutch.crawl.Generator$Selector.getPartition(Generator.java:109)
>    at
> org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
>    at org.apache.nutch.crawl.Generator$Selector.map(Generator.java:212)
>    at org.apache.nutch.crawl.Generator$Selector.map(Generator.java:109)
>    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>    at
> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-414136" daemon prio=10 tid=0x609f8000 nid=0x207b runnable
> [0x61fad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78fc22d0> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-398562" daemon prio=10 tid=0x611fa000 nid=0x5977 runnable
> [0x629fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78f9f6c8> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-390129" daemon prio=10 tid=0x613b1800 nid=0x237d runnable
> [0x61ffe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78f88f10> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-378882" daemon prio=10 tid=0x62aa2c00 nid=0x6fa6 runnable
> [0x66cfe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78eaafe0> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-359578" daemon prio=10 tid=0x61c13400 nid=0x1989 runnable
> [0x621fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78e82af8> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-350484" daemon prio=10 tid=0x62370000 nid=0x6f36 runnable
> [0x6325c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78e5c968> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-349732" daemon prio=10 tid=0x6230f400 nid=0x6be3 runnable
> [0x632fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78e4a820> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-347768" daemon prio=10 tid=0x62215c00 nid=0x6327 runnable
> [0x629ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78e38340> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-330391" daemon prio=10 tid=0x62b7bc00 nid=0x15ca runnable
> [0x66e5c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78e21b58> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-326848" daemon prio=10 tid=0x62d8d800 nid=0x586 runnable
> [0x632ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78e0dc38> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-320314" daemon prio=10 tid=0x62fa5c00 nid=0x755c runnable
> [0x66ead000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78df7c88> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-297230" daemon prio=10 tid=0x634f7000 nid=0x6ec4 runnable
> [0x6585c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78dd0b80> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-292864" daemon prio=10 tid=0x63581400 nid=0x5b80 runnable
> [0x658fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78d54910> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-290745" daemon prio=10 tid=0x635d5400 nid=0x520a runnable
> [0x658ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78d41a60> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-274085" daemon prio=10 tid=0x63cf0000 nid=0x7b1 runnable
> [0x66cad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78cef510> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-251630" daemon prio=10 tid=0x649d9800 nid=0x1a26 runnable
> [0x66c5c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78cd2588> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-247585" daemon prio=10 tid=0x64937800 nid=0x7e96 runnable
> [0x67a5c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78ccf420> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-245511" daemon prio=10 tid=0x64c7d000 nid=0x7579 runnable
> [0x670ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78caaee0> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-241092" daemon prio=10 tid=0x642a8800 nid=0x61c1 runnable
> [0x670fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c8c898> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-231748" daemon prio=10 tid=0x6430f000 nid=0x3862 runnable
> [0x66efe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c6ec58> (a java.util.zip.ZStreamRef)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at
> org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at
> org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-230102" daemon prio=10 tid=0x64319800 nid=0x3124 runnable [0x6705c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c5cb10> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-221512" daemon prio=10 tid=0x6475c400 nid=0xabc runnable [0x676fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c426f0> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-218763" daemon prio=10 tid=0x64a74400 nid=0x7c9d runnable [0x6765c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c30510> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-217142" daemon prio=10 tid=0x64c76000 nid=0x7567 runnable [0x67aad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c0a908> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-217132" daemon prio=10 tid=0x64cf8000 nid=0x755d runnable [0x676ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78c0ab48> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-210743" daemon prio=10 tid=0x64ebd400 nid=0x588d runnable [0x684ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78bef768> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-210232" daemon prio=10 tid=0x64ea1000 nid=0x564a runnable [0x6845c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78bd8d10> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-209224" daemon prio=10 tid=0x64ee9800 nid=0x51b9 runnable [0x67afe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78bc4e60> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-200609" daemon prio=10 tid=0x6524e000 nid=0x2a23 runnable [0x69b5c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78a7eae0> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-197978" daemon prio=10 tid=0x65112000 nid=0x1e87 runnable [0x6b65c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.CRC32.updateBytes(Native Method)
>    at java.util.zip.CRC32.update(CRC32.java:62)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:242)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-194629" daemon prio=10 tid=0x6545a000 nid=0xe8b runnable [0x684fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78a3c970> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-193077" daemon prio=10 tid=0x65469800 nid=0x744 runnable [0x6975c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78a29fa0> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-189394" daemon prio=10 tid=0x656e6000 nid=0x757b runnable [0x697ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78a01328> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-183163" daemon prio=10 tid=0x65af9000 nid=0x59a6 runnable [0x697fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x789f8a18> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-178890" daemon prio=10 tid=0x65a63400 nid=0x677c runnable [0x69bad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x789d1368> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-176810" daemon prio=10 tid=0x65f27000 nid=0x2d74 runnable [0x6b6ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x789bd268> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-175412" daemon prio=10 tid=0x65b19400 nid=0x274c runnable [0x69bfe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x789aa498> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-170353" daemon prio=10 tid=0x65958c00 nid=0x10cf runnable [0x6a45c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78991418> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-168754" daemon prio=10 tid=0x65ec0400 nid=0xa24 runnable [0x6a4ad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x7897dca0> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-165271" daemon prio=10 tid=0x65e05c00 nid=0x783c runnable [0x6a4fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x78966738> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-162717" daemon prio=10 tid=0x65d43400 nid=0x6c27 runnable [0x6bcad000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x7894f760> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-162446" daemon prio=10 tid=0x65d38400 nid=0x6af9 runnable [0x6b6fe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x7893c8a8> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-160981" daemon prio=10 tid=0x6605fc00 nid=0x6444 runnable [0x6bc5c000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x789296d0> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-158465" daemon prio=10 tid=0x65fbf800 nid=0x58a8 runnable [0x6bf7b000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x789056e8> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-156032" daemon prio=10 tid=0x66237800 nid=0x4dd6 runnable [0x6bcfe000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x788e6c90> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Thread-143423" daemon prio=10 tid=0x66a19c00 nid=0x1239 runnable [0x6c123000]
>   java.lang.Thread.State: RUNNABLE
>    at java.util.zip.Inflater.inflateBytes(Native Method)
>    at java.util.zip.Inflater.inflate(Inflater.java:255)
>    - locked <0x780ee910> (a java.util.zip.ZStreamRef)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
>    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
>    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
>    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
>    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
>    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
>    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
>    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Java2D Disposer" daemon prio=10 tid=0x6a55cc00 nid=0x2b21 in Object.wait() [0x6c174000]
>   java.lang.Thread.State: WAITING (on object monitor)
>    at java.lang.Object.wait(Native Method)
>    - waiting on <0x766749c8> (a java.lang.ref.ReferenceQueue$Lock)
>    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
>    - locked <0x766749c8> (a java.lang.ref.ReferenceQueue$Lock)
>    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
>    at sun.java2d.Disposer.run(Disposer.java:143)
>    at java.lang.Thread.run(Thread.java:636)
>
>   Locked ownable synchronizers:
>    - None
>
> "Low Memory Detector" daemon prio=10 tid=0xb76a6000 nid=0x18cc runnable [0x00000000]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>    - None
>
> "CompilerThread1" daemon prio=10 tid=0xb76a4400 nid=0x18cb waiting on condition [0x00000000]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>    - None
>
> "CompilerThread0" daemon prio=10 tid=0xb76a2400 nid=0x18ca waiting on condition [0x00000000]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>    - None
>
> "Signal Dispatcher" daemon prio=10 tid=0xb76a0c00 nid=0x18c9 runnable [0x00000000]
>   java.lang.Thread.State: RUNNABLE
>
>   Locked ownable synchronizers:
>    - None
>
> "Finalizer" daemon prio=10 tid=0xb7691000 nid=0x18c8 in Object.wait() [0x6d77d000]
>   java.lang.Thread.State: WAITING (on object monitor)
>    at java.lang.Object.wait(Native Method)
>    - waiting on <0x75e1a3e8> (a java.lang.ref.ReferenceQueue$Lock)
>    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
>    - locked <0x75e1a3e8> (a java.lang.ref.ReferenceQueue$Lock)
>    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
>    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)
>
>   Locked ownable synchronizers:
>    - None
>
> "Reference Handler" daemon prio=10 tid=0xb768f800 nid=0x18c7 in Object.wait() [0x6d3bc000]
>   java.lang.Thread.State: WAITING (on object monitor)
>    at java.lang.Object.wait(Native Method)
>    - waiting on <0x75da44f0> (a java.lang.ref.Reference$Lock)
>    at java.lang.Object.wait(Object.java:502)
>    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
>    - locked <0x75da44f0> (a java.lang.ref.Reference$Lock)
>
>   Locked ownable synchronizers:
>    - None
>
> "main" prio=10 tid=0xb7606400 nid=0x18c3 waiting on condition [0xb77b8000]
>   java.lang.Thread.State: TIMED_WAITING (sleeping)
>    at java.lang.Thread.sleep(Native Method)
>    at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:1282)
>    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1251)
>    at org.apache.nutch.crawl.Generator.generate(Generator.java:526)
>    at org.apache.nutch.crawl.Generator.generate(Generator.java:431)
>    at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)
>
>   Locked ownable synchronizers:
>    - None
>
> "VM Thread" prio=10 tid=0xb768bc00 nid=0x18c6 runnable
>
> "GC task thread#0 (ParallelGC)" prio=10 tid=0xb760d800 nid=0x18c4 runnable
>
> "GC task thread#1 (ParallelGC)" prio=10 tid=0xb760ec00 nid=0x18c5 runnable
>
> "VM Periodic Task Thread" prio=10 tid=0xb76a7c00 nid=0x18cd waiting on condition
>
> JNI global references: 1699
>
>
>
> --
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com

Re: nutch crawl command takes 98% of cpu

Posted by al...@aim.com.
Hello,

I ran jstack and the output is below. Could you please let me know how to interpret it?

----------------------------------------------------------------



2011-01-28 13:46:50
Full thread dump OpenJDK Server VM (19.0-b06 mixed mode):

"Attach Listener" daemon prio=10 tid=0x6cb21800 nid=0x1e95 waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"SpillThread" daemon prio=10 tid=0x6053c400 nid=0x1e18 waiting on condition [0x6c3ad000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x7f9a8768> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer$SpillThread.run(MapTask.java:1169)

   Locked ownable synchronizers:
    - None

"communication thread" daemon prio=10 tid=0x607bd400 nid=0x1e17 waiting on condition [0x6c8ad000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.mapred.Task$TaskReporter.run(Task.java:529)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-415331" prio=10 tid=0x6cb96c00 nid=0x175f runnable [0x6c2ba000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.oro.text.regex.Perl5Matcher.__matchUnicodeClass(Unknown Source)
    at org.apache.oro.text.regex.Perl5Matcher.__repeat(Unknown Source)
    at org.apache.oro.text.regex.Perl5Matcher.__match(Unknown Source)
    at org.apache.oro.text.regex.Perl5Matcher.__match(Unknown Source)
    at org.apache.oro.text.regex.Perl5Matcher.__match(Unknown Source)
    at org.apache.oro.text.regex.Perl5Matcher.__tryExpression(Unknown Source)
    at org.apache.oro.text.regex.Perl5Matcher.__interpret(Unknown Source)
    at org.apache.oro.text.regex.Perl5Matcher.contains(Unknown Source)
    at org.apache.oro.text.regex.Util.substitute(Unknown Source)
    at org.apache.oro.text.regex.Util.substitute(Unknown Source)
    at org.apache.nutch.net.urlnormalizer.basic.BasicURLNormalizer.substituteUnnecessaryRelativePaths(BasicURLNormalizer.java:166)
    at org.apache.nutch.net.urlnormalizer.basic.BasicURLNormalizer.normalize(BasicURLNormalizer.java:125)
    at org.apache.nutch.net.URLNormalizers.normalize(URLNormalizers.java:286)
    at org.apache.nutch.crawl.URLPartitioner.getPartition(URLPartitioner.java:69)
    at org.apache.nutch.crawl.URLPartitioner.getPartition(URLPartitioner.java:36)
    at org.apache.nutch.crawl.Generator$Selector.getPartition(Generator.java:217)
    at org.apache.nutch.crawl.Generator$Selector.getPartition(Generator.java:109)
    at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
    at org.apache.nutch.crawl.Generator$Selector.map(Generator.java:212)
    at org.apache.nutch.crawl.Generator$Selector.map(Generator.java:109)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)

   Locked ownable synchronizers:
    - None

"Thread-414136" daemon prio=10 tid=0x609f8000 nid=0x207b runnable [0x61fad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78fc22d0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-398562" daemon prio=10 tid=0x611fa000 nid=0x5977 runnable [0x629fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78f9f6c8> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-390129" daemon prio=10 tid=0x613b1800 nid=0x237d runnable [0x61ffe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78f88f10> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-378882" daemon prio=10 tid=0x62aa2c00 nid=0x6fa6 runnable [0x66cfe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78eaafe0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-359578" daemon prio=10 tid=0x61c13400 nid=0x1989 runnable [0x621fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78e82af8> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-350484" daemon prio=10 tid=0x62370000 nid=0x6f36 runnable [0x6325c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78e5c968> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-349732" daemon prio=10 tid=0x6230f400 nid=0x6be3 runnable [0x632fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78e4a820> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-347768" daemon prio=10 tid=0x62215c00 nid=0x6327 runnable [0x629ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78e38340> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-330391" daemon prio=10 tid=0x62b7bc00 nid=0x15ca runnable [0x66e5c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78e21b58> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-326848" daemon prio=10 tid=0x62d8d800 nid=0x586 runnable [0x632ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78e0dc38> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-320314" daemon prio=10 tid=0x62fa5c00 nid=0x755c runnable [0x66ead000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78df7c88> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-297230" daemon prio=10 tid=0x634f7000 nid=0x6ec4 runnable [0x6585c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78dd0b80> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-292864" daemon prio=10 tid=0x63581400 nid=0x5b80 runnable [0x658fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78d54910> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-290745" daemon prio=10 tid=0x635d5400 nid=0x520a runnable [0x658ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78d41a60> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-274085" daemon prio=10 tid=0x63cf0000 nid=0x7b1 runnable [0x66cad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78cef510> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-251630" daemon prio=10 tid=0x649d9800 nid=0x1a26 runnable [0x66c5c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78cd2588> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-247585" daemon prio=10 tid=0x64937800 nid=0x7e96 runnable [0x67a5c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78ccf420> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-245511" daemon prio=10 tid=0x64c7d000 nid=0x7579 runnable [0x670ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78caaee0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-241092" daemon prio=10 tid=0x642a8800 nid=0x61c1 runnable [0x670fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c8c898> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-231748" daemon prio=10 tid=0x6430f000 nid=0x3862 runnable [0x66efe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c6ec58> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-230102" daemon prio=10 tid=0x64319800 nid=0x3124 runnable [0x6705c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c5cb10> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-221512" daemon prio=10 tid=0x6475c400 nid=0xabc runnable [0x676fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c426f0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-218763" daemon prio=10 tid=0x64a74400 nid=0x7c9d runnable [0x6765c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c30510> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-217142" daemon prio=10 tid=0x64c76000 nid=0x7567 runnable [0x67aad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c0a908> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-217132" daemon prio=10 tid=0x64cf8000 nid=0x755d runnable [0x676ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78c0ab48> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-210743" daemon prio=10 tid=0x64ebd400 nid=0x588d runnable [0x684ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78bef768> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-210232" daemon prio=10 tid=0x64ea1000 nid=0x564a runnable [0x6845c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78bd8d10> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-209224" daemon prio=10 tid=0x64ee9800 nid=0x51b9 runnable [0x67afe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78bc4e60> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-200609" daemon prio=10 tid=0x6524e000 nid=0x2a23 runnable [0x69b5c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78a7eae0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-197978" daemon prio=10 tid=0x65112000 nid=0x1e87 runnable [0x6b65c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.CRC32.updateBytes(Native Method)
    at java.util.zip.CRC32.update(CRC32.java:62)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:242)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-194629" daemon prio=10 tid=0x6545a000 nid=0xe8b runnable [0x684fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78a3c970> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-193077" daemon prio=10 tid=0x65469800 nid=0x744 runnable [0x6975c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78a29fa0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-189394" daemon prio=10 tid=0x656e6000 nid=0x757b runnable [0x697ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78a01328> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-183163" daemon prio=10 tid=0x65af9000 nid=0x59a6 runnable [0x697fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x789f8a18> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-178890" daemon prio=10 tid=0x65a63400 nid=0x677c runnable [0x69bad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x789d1368> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-176810" daemon prio=10 tid=0x65f27000 nid=0x2d74 runnable [0x6b6ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x789bd268> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-175412" daemon prio=10 tid=0x65b19400 nid=0x274c runnable [0x69bfe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x789aa498> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-170353" daemon prio=10 tid=0x65958c00 nid=0x10cf runnable [0x6a45c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78991418> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-168754" daemon prio=10 tid=0x65ec0400 nid=0xa24 runnable [0x6a4ad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x7897dca0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-165271" daemon prio=10 tid=0x65e05c00 nid=0x783c runnable [0x6a4fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x78966738> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-162717" daemon prio=10 tid=0x65d43400 nid=0x6c27 runnable [0x6bcad000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x7894f760> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-162446" daemon prio=10 tid=0x65d38400 nid=0x6af9 runnable [0x6b6fe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x7893c8a8> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-160981" daemon prio=10 tid=0x6605fc00 nid=0x6444 runnable [0x6bc5c000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x789296d0> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-158465" daemon prio=10 tid=0x65fbf800 nid=0x58a8 runnable [0x6bf7b000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x789056e8> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-156032" daemon prio=10 tid=0x66237800 nid=0x4dd6 runnable [0x6bcfe000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x788e6c90> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Thread-143423" daemon prio=10 tid=0x66a19c00 nid=0x1239 runnable [0x6c123000]
   java.lang.Thread.State: RUNNABLE
    at java.util.zip.Inflater.inflateBytes(Native Method)
    at java.util.zip.Inflater.inflate(Inflater.java:255)
    - locked <0x780ee910> (a java.util.zip.ZStreamRef)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.read(ZipArchiveInputStream.java:235)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.skip(ZipArchiveInputStream.java:261)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.closeEntry(ZipArchiveInputStream.java:302)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextZipEntry(ZipArchiveInputStream.java:112)
    at org.apache.commons.compress.archivers.zip.ZipArchiveInputStream.getNextEntry(ZipArchiveInputStream.java:188)
    at org.apache.tika.parser.pkg.PackageExtractor.unpack(PackageExtractor.java:177)
    at org.apache.tika.parser.pkg.PackageExtractor.parse(PackageExtractor.java:93)
    at org.apache.tika.parser.pkg.PackageParser.parse(PackageParser.java:61)
    at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:95)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:18)
    at org.apache.nutch.parse.ParseCallable.call(ParseCallable.java:7)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Java2D Disposer" daemon prio=10 tid=0x6a55cc00 nid=0x2b21 in Object.wait() [0x6c174000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x766749c8> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
    - locked <0x766749c8> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
    at sun.java2d.Disposer.run(Disposer.java:143)
    at java.lang.Thread.run(Thread.java:636)

   Locked ownable synchronizers:
    - None

"Low Memory Detector" daemon prio=10 tid=0xb76a6000 nid=0x18cc runnable [0x00000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"CompilerThread1" daemon prio=10 tid=0xb76a4400 nid=0x18cb waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"CompilerThread0" daemon prio=10 tid=0xb76a2400 nid=0x18ca waiting on condition [0x00000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"Signal Dispatcher" daemon prio=10 tid=0xb76a0c00 nid=0x18c9 runnable [0x00000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
    - None

"Finalizer" daemon prio=10 tid=0xb7691000 nid=0x18c8 in Object.wait() [0x6d77d000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x75e1a3e8> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:133)
    - locked <0x75e1a3e8> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:149)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

   Locked ownable synchronizers:
    - None

"Reference Handler" daemon prio=10 tid=0xb768f800 nid=0x18c7 in Object.wait() [0x6d3bc000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0x75da44f0> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:502)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
    - locked <0x75da44f0> (a java.lang.ref.Reference$Lock)

   Locked ownable synchronizers:
    - None

"main" prio=10 tid=0xb7606400 nid=0x18c3 waiting on condition [0xb77b8000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.mapred.JobClient.monitorAndPrintJob(JobClient.java:1282)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1251)
    at org.apache.nutch.crawl.Generator.generate(Generator.java:526)
    at org.apache.nutch.crawl.Generator.generate(Generator.java:431)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:127)

   Locked ownable synchronizers:
    - None

"VM Thread" prio=10 tid=0xb768bc00 nid=0x18c6 runnable 

"GC task thread#0 (ParallelGC)" prio=10 tid=0xb760d800 nid=0x18c4 runnable 

"GC task thread#1 (ParallelGC)" prio=10 tid=0xb760ec00 nid=0x18c5 runnable 

"VM Periodic Task Thread" prio=10 tid=0xb76a7c00 nid=0x18cd waiting on condition 

JNI global references: 1699
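
A dump of this size is easier to read once the threads are grouped by state and top stack frame. The following is a minimal sketch of such a grouping, not a Nutch or JDK tool: the class name DumpSummary and the method countByTopFrame are made up for illustration, and it assumes the plain-text layout that jstack printed above (a "java.lang.Thread.State:" line followed by "at ..." frames).

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Nutch or the JDK): groups the threads of a
// jstack text dump by thread state plus top stack frame and counts each group.
class DumpSummary {

    private static final String STATE_PREFIX = "java.lang.Thread.State:";

    static Map<String, Integer> countByTopFrame(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String state = null;       // state of the thread currently being read
        boolean pending = false;   // true until the first "at ..." frame is seen
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith(STATE_PREFIX)) {
                if (pending) {
                    // Previous thread had a state line but no Java frames.
                    counts.merge(state + " (no Java frames)", 1, Integer::sum);
                }
                state = line.substring(STATE_PREFIX.length()).trim();
                pending = true;
            } else if (pending && line.startsWith("at ")) {
                // First frame after the state line is the top of the stack.
                counts.merge(state + " " + line.substring(3), 1, Integer::sum);
                pending = false;
            }
        }
        if (pending) {
            counts.merge(state + " (no Java frames)", 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Reads a dump from standard input and prints "count<TAB>group".
        List<String> lines = new BufferedReader(new InputStreamReader(System.in))
                .lines().collect(Collectors.toList());
        countByTopFrame(lines).forEach((group, n) -> System.out.println(n + "\t" + group));
    }
}
```

Assuming a JDK that can launch single source files (Java 11 or later), something like "jstack <pid> | java DumpSummary.java" would print one line per group. The Thread-NNN parse threads listed above would all fall into a single RUNNABLE group topped by java.util.zip.Inflater.inflateBytes(Native Method), reached through Tika's PackageParser.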

-----Original Message-----
From: Julien Nioche <li...@gmail.com>
To: user <us...@nutch.apache.org>
Sent: Fri, Jan 28, 2011 6:01 am
Subject: Re: nutch crawl command takes 98% of cpu


That's assuming that the problem comes from the parsing.
Alex, can you either run jstack on the process to see what it is hanging on
or do as Chris suggested?
Note that it is not recommended to upgrade to Tika 0.8 if you want to
process PDF docs because of an issue which will be resolved in the next Tika
release.
Another solution - if the problem comes from flv files and you are not
interested in them - is to add a URLFilter which will prevent such files from
being fetched.
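
If parsing is indeed the cause, the FutureTask and ParseCallable frames in the dump suggest that each parse runs as a task on its own thread. A minimal sketch, assuming the caller waits for the result with a timeout and then cancels the task, shows why such a thread can keep using the CPU after the timeout: cancel(true) only interrupts the thread, and a CPU-bound loop that never checks the interrupt flag carries on. The class and method names are hypothetical, and the spin loop is only a stand-in for a parser stuck on corrupt input.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class, not Nutch code.
class TimeoutDemo {

    // Returns true if the worker thread is still alive shortly after the
    // timed-out task has been cancelled.
    static boolean aliveAfterCancel(long timeoutMillis) {
        final AtomicBoolean release = new AtomicBoolean(false);

        // Stand-in for a parse that spins on the CPU and ignores interrupts.
        Callable<Void> stuckParse = () -> {
            while (!release.get()) {
                // busy loop; never checks Thread.interrupted()
            }
            return null;
        };

        FutureTask<Void> task = new FutureTask<>(stuckParse);
        Thread worker = new Thread(task, "stuck-parse-demo");
        worker.setDaemon(true);
        worker.start();

        try {
            try {
                task.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Marks the task cancelled and sets the interrupt flag,
                // but cannot force the loop above to stop.
                task.cancel(true);
            }
            Thread.sleep(200);
            boolean alive = worker.isAlive();
            release.set(true);   // let the demo thread finish
            worker.join(1000);
            return alive;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("worker alive after cancel: " + aliveAfterCancel(100));
    }
}
```

The caller gets control back after the timeout, yet the worker keeps running until the loop ends on its own. That would be consistent with a dump in which many old parse threads are still RUNNABLE inside Inflater while the main thread has already moved on to the next step.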

Julien

On 28 January 2011 03:32, Alexis <al...@gmail.com> wrote:

> Hi,
>
> I ran into the same issue as well with Nutch 1.2. You could fix it by
> upgrading the version of tika parser to at least 0.8. The lib can be
> found in the plugins/parse-tika/ directory of your Nutch release.
>
> This has already been mentioned twice in the mailing-list: See
> http://lucene.472066.n3.nabble.com/Full-CPU-usage-td1976780.html
>
> I hope this will help you out.
>
> Alexis
>
> On Fri, Jan 28, 2011 at 1:01 AM, Chris Woolum <cw...@moonvalley.com>
> wrote:
> > If you are looking at the tasktracker control panel, what does it show?
> > The link is http://localhost:50030
> >
> >
> > -----Original Message-----
> > From: alxsss@aim.com [mailto:alxsss@aim.com]
> > Sent: Thursday, January 27, 2011 3:01 PM
> > To: user@nutch.apache.org
> > Subject: nutch crawl command takes 98% of cpu
> >
> > Hello,
> >
> > I run crawl command with -depth 7 -topN -1 on my linux box with 1.5Mps
> > internet, amd 3.1ghz processor,  4GB memory, Fedora Linux 14, nutch 1.2.
> > After 1-2 days nutch takes 98% of cpu. My seed file includes about 3500
> > domains and I put fetch.external links to false.
> >
> > Is this normal? If not, what can be done to improve it?
> >
> > Thanks.
> > Alex.
> >
>



-- 
*
*Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com


Re: nutch crawl command takes 98% of cpu

Posted by Julien Nioche <li...@gmail.com>.
That's assuming that the problem comes from the parsing.
Alex, can you either run jstack on the process to see what it is hanging on
or do as Chris suggested?
Note that it is not recommended to upgrade to Tika 0.8 if you want to
process PDF docs because of an issue which will be resolved in the next Tika
release.
Another solution - if the problem comes from flv files and you are not
interested in them - is to add a URLFilter which will prevent such files from
being fetched.
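
The filtering idea can be sketched as a small suffix check. This is only an illustration of the logic, not the Nutch URLFilter plugin interface: the class name SuffixUrlFilter and the choice of suffixes are assumptions (.zip is included because the dump in this thread shows threads inside Tika's PackageParser). In a stock Nutch setup a rule like this would more likely be added to the regex URL filter's configuration file than written as custom code.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone sketch of a suffix-based URL filter.
class SuffixUrlFilter {

    // Rejects URLs ending in .flv or .zip (any case), optionally followed
    // by a query string. The suffix list is an assumption for this example.
    private static final Pattern REJECT =
            Pattern.compile("\\.(flv|zip)(\\?.*)?$", Pattern.CASE_INSENSITIVE);

    // Returns the URL unchanged to keep it, or null to drop it.
    static String filter(String url) {
        return REJECT.matcher(url).find() ? null : url;
    }

    public static void main(String[] args) {
        String[] samples = {
            "http://example.com/video.flv",
            "http://example.com/archive.ZIP?dl=1",
            "http://example.com/page.html"
        };
        for (String url : samples) {
            System.out.println(url + " -> " + (filter(url) == null ? "rejected" : "kept"));
        }
    }
}
```

Rejecting such URLs before the fetch means the files never reach the parser, which sidesteps the problem without changing the Tika version.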

Julien

On 28 January 2011 03:32, Alexis <al...@gmail.com> wrote:

> Hi,
>
> I ran into the same issue as well with Nutch 1.2. You could fix it by
> upgrading the version of tika parser to at least 0.8. The lib can be
> found in the plugins/parse-tika/ directory of your Nutch release.
>
> This has already been mentioned twice in the mailing-list: See
> http://lucene.472066.n3.nabble.com/Full-CPU-usage-td1976780.html
>
> I hope this will help you out.
>
> Alexis
>
> On Fri, Jan 28, 2011 at 1:01 AM, Chris Woolum <cw...@moonvalley.com>
> wrote:
> > If you are looking at the tasktracker control panel, what does it show?
> > The link is http://localhost:50030
> >
> >
> > -----Original Message-----
> > From: alxsss@aim.com [mailto:alxsss@aim.com]
> > Sent: Thursday, January 27, 2011 3:01 PM
> > To: user@nutch.apache.org
> > Subject: nutch crawl command takes 98% of cpu
> >
> > Hello,
> >
> > I run crawl command with -depth 7 -topN -1 on my linux box with 1.5Mps
> > internet, amd 3.1ghz processor,  4GB memory, Fedora Linux 14, nutch 1.2.
> > After 1-2 days nutch takes 98% of cpu. My seed file includes about 3500
> > domains and I put fetch.external links to false.
> >
> > Is this normal? If not, what can be done to improve it?
> >
> > Thanks.
> > Alex.
> >
>



-- 
*
*Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com

Re: nutch crawl command takes 98% of cpu

Posted by Alexis <al...@gmail.com>.
Hi,

I ran into the same issue with Nutch 1.2. You could fix it by
upgrading the Tika parser to version 0.8 or later. The library can be
found in the plugins/parse-tika/ directory of your Nutch release.

This has already been mentioned twice on the mailing list; see
http://lucene.472066.n3.nabble.com/Full-CPU-usage-td1976780.html

I hope this will help you out.

Alexis

On Fri, Jan 28, 2011 at 1:01 AM, Chris Woolum <cw...@moonvalley.com> wrote:
> If you are looking at the tasktracker control panel, what does it show?
> The link is http://localhost:50030
>
>
> -----Original Message-----
> From: alxsss@aim.com [mailto:alxsss@aim.com]
> Sent: Thursday, January 27, 2011 3:01 PM
> To: user@nutch.apache.org
> Subject: nutch crawl command takes 98% of cpu
>
> Hello,
>
> I run crawl command with -depth 7 -topN -1 on my linux box with 1.5Mps
> internet, amd 3.1ghz processor,  4GB memory, Fedora Linux 14, nutch 1.2.
> After 1-2 days nutch takes 98% of cpu. My seed file includes about 3500
> domains and I put fetch.external links to false.
>
> Is this normal? If not, what can be done to improve it?
>
> Thanks.
> Alex.
>

RE: nutch crawl command takes 98% of cpu

Posted by Chris Woolum <cw...@moonvalley.com>.
If you are looking at the JobTracker web interface, what does it show?
The link is http://localhost:50030


-----Original Message-----
From: alxsss@aim.com [mailto:alxsss@aim.com] 
Sent: Thursday, January 27, 2011 3:01 PM
To: user@nutch.apache.org
Subject: nutch crawl command takes 98% of cpu

Hello,

I run the crawl command with -depth 7 -topN -1 on my Linux box (1.5 Mbps
internet, AMD 3.1 GHz processor, 4 GB memory, Fedora Linux 14, Nutch 1.2).
After 1-2 days Nutch takes 98% of the CPU. My seed file includes about 3500
domains and I set fetch.external links to false.

Is this normal? If not, what can be done to improve it?

Thanks.
Alex.