Posted to solr-user@lucene.apache.org by "Hutchins, Jonathan" <jh...@webmd.net> on 2014/04/25 19:56:01 UTC

DIH issues with 4.7.1

I recently upgraded from 4.6.1 to 4.7.1 and have found that the DIH
process that we are using takes 4x as long to complete.  The only odd
thing I notice is that, when I enable debug logging for the DataImportHandler
process, each SQL query in the new version appears to open
a new connection through JdbcDataSource (log:
http://pastebin.com/JKh4gpmu).  Were there any changes that would affect
the speed of running a full import?

Thanks!

- Jonathan Hutchins



Re: DIH issues with 4.7.1

Posted by Mark Miller <ma...@gmail.com>.
> due to things like NTP, etc.

The full sentence is very important. NTP is not the only way for this to happen - you also have leap seconds, daylight saving time, internet clock sync, a whole host of things that affect currentTimeMillis and not nanoTime. nanoTime is without question the way to go to even hope for monotonicity.
-- 
Mark Miller
about.me/markrmiller





Re: DIH issues with 4.7.1

Posted by Walter Underwood <wu...@wunderwood.org>.
NTP works very hard to keep the clock positive monotonic. But nanoTime is intended for elapsed time measurement anyway, so it is the right choice.

You can get some pretty fun clock behavior by running on virtual machines, like in AWS. And some system real time clocks don't tick during a leap second. And Windows system clocks are probably still hopeless.

If you want to run the clock backwards, you don't need NTP; you can set it with "date".
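
A timeout check is one place where this matters. The following is a minimal sketch, not code from Solr: the class name DeadlineSketch and its methods are invented for illustration. It builds the deadline from System.nanoTime(), so resetting the wall clock with "date" should not move it, and it compares by subtraction, which is what the System.nanoTime() javadoc recommends to stay correct if the counter overflows.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
final class DeadlineSketch {
    private final long deadlineNanos;

    DeadlineSketch(long timeout, TimeUnit unit) {
        // nanoTime() has an arbitrary origin; only differences are meaningful.
        this.deadlineNanos = System.nanoTime() + unit.toNanos(timeout);
    }

    // Compare by subtraction rather than with "<" or ">",
    // so the result stays correct even if the counter wraps around.
    boolean expired() {
        return System.nanoTime() - deadlineNanos >= 0;
    }

    public static void main(String[] args) throws InterruptedException {
        DeadlineSketch deadline = new DeadlineSketch(100, TimeUnit.MILLISECONDS);
        System.out.println("expired right away? " + deadline.expired());
        Thread.sleep(150);
        System.out.println("expired after sleeping? " + deadline.expired());
    }
}
```

The same check written with System.currentTimeMillis() would depend on nobody touching the system clock while the timeout is running.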

wunder


--
Walter Underwood
wunder@wunderwood.org




Re: DIH issues with 4.7.1

Posted by Mark Miller <ma...@gmail.com>.
My answer remains the same. I guess if you want more precise terminology, nanoTime will generally be monotonic and currentTimeMillis will not be, due to things like NTP, etc. You want monotonicity for measuring elapsed times.
-- 
Mark Miller
about.me/markrmiller





Re: DIH issues with 4.7.1

Posted by Walter Underwood <wu...@wunderwood.org>.
NTP should slew the clock rather than jump it. I haven't checked recently, but that is how it worked in the '90s when I was organizing the NTP hierarchy at HP.

It only does step changes if the clock is really wrong. That is most likely at reboot, when other daemons aren't running yet.

wunder


--
Walter Underwood
wunder@wunderwood.org




Re: DIH issues with 4.7.1

Posted by Mark Miller <ma...@gmail.com>.
System.currentTimeMillis can jump around due to NTP, etc. If you are trying to count elapsed time, you don’t want to use a method that can jump around with the results.
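
As a rough illustration of counting elapsed time this way, here is a minimal sketch. The class and method names (ElapsedTimeSketch, timeMillis) are made up for this example and are not taken from Solr, and the lambda syntax assumes Java 8 or later. It takes both readings from System.nanoTime() and only converts the difference to milliseconds at the end.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only.
final class ElapsedTimeSketch {

    // Runs the task and returns roughly how long it took, in milliseconds.
    // Both readings come from System.nanoTime(), so a wall-clock adjustment
    // in the middle of the task should not distort the result.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        long elapsedNanos = System.nanoTime() - start;
        return TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
    }

    public static void main(String[] args) {
        long took = timeMillis(() -> {
            try {
                Thread.sleep(50); // stand-in for real work, e.g. one import batch
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("Task took about " + took + " ms");
    }
}
```

Note that a nanoTime() value is only useful as a difference against another nanoTime() value from the same JVM; it is not a timestamp.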
-- 
Mark Miller
about.me/markrmiller


Re: DIH issues with 4.7.1

Posted by YouPeng Yang <yy...@gmail.com>.
Hi Rafał Kuć,
  I got it: the point is that many operating systems measure time in units of
tens of milliseconds, and System.currentTimeMillis() simply relies on the
operating system's clock.
  In my case, I just run DIH from a crontab. Is there any possibility of running
into that problem? I really cannot picture what situation would lead to
it.


Thanks very much.



Re: DIH issues with 4.7.1

Posted by YouPeng Yang <yy...@gmail.com>.
Hi Mark Miller,
  Sorry to pull you into this discussion.
  I noticed that you reported this issue in
https://issues.apache.org/jira/browse/SOLR-5734, following
https://issues.apache.org/jira/browse/SOLR-5721, but it only happened with
ZooKeeper.
  If I just run DIH with JdbcDataSource, I do not think I will hit that
problem.
  Please give me some hints.

 >> Bonus: here is the last mail I sent about the problem:
   I have just compared versions 4.6.0 and 4.7.1.
I noticed that the time in the getConnection function is taken with
System.nanoTime() in 4.7.1, while 4.6.0 used System.currentTimeMillis().
  I am curious about the reason for the change and its benefit. Is it
necessary?
   I have read SOLR-5734,
https://issues.apache.org/jira/browse/SOLR-5734
   I did some googling about the difference between currentTimeMillis and
nanoTime, but I still cannot figure it out.
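
The thread does not show the getConnection code itself, so the following is only a hypothetical sketch of one way such a change can go wrong: switching the clock source to System.nanoTime() while leaving a timeout constant in milliseconds. The names ConnectionAgeSketch, IDLE_LIMIT_MILLIS, isStaleMixedUnits, and isStale are invented for illustration, and the 10-second limit is an assumption. Whether this matches what SOLR-5954 describes should be checked against the issue itself.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; not the actual DIH code.
final class ConnectionAgeSketch {
    // Assumed idle limit: 10 seconds, expressed in milliseconds.
    static final long IDLE_LIMIT_MILLIS = 10_000L;

    // Mixed units: the readings are nanoseconds, the limit is milliseconds,
    // so anything idle for more than 10,000 ns (10 microseconds) looks stale.
    static boolean isStaleMixedUnits(long lastUsedNanos, long nowNanos) {
        return nowNanos - lastUsedNanos > IDLE_LIMIT_MILLIS;
    }

    // Consistent units: convert the limit to nanoseconds before comparing.
    static boolean isStale(long lastUsedNanos, long nowNanos) {
        return nowNanos - lastUsedNanos > TimeUnit.MILLISECONDS.toNanos(IDLE_LIMIT_MILLIS);
    }

    public static void main(String[] args) {
        long lastUsed = System.nanoTime();
        long now = lastUsed + TimeUnit.MILLISECONDS.toNanos(5); // pretend 5 ms passed
        System.out.println("mixed units, stale? " + isStaleMixedUnits(lastUsed, now));  // true
        System.out.println("consistent units, stale? " + isStale(lastUsed, now));       // false
    }
}
```

With the mixed-unit check, a connection that was used 5 ms ago is already treated as expired, which would look like a new connection being opened for each query.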

Thank you very much.



Re: DIH issues with 4.7.1

Posted by Rafał Kuć <r....@solr.pl>.
Hello!

Look at the javadocs for both. The granularity of
System.currentTimeMillis() depends on the operating system, so it may
happen that calls to that method made 1 millisecond apart
still return the same value. This is not the case with
System.nanoTime() -
http://docs.oracle.com/javase/7/docs/api/java/lang/System.html
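
One way to see the granularity on a particular machine is to spin until each clock reports a new value and look at the size of the step. This is only a rough sketch: the class name ClockStepSketch and its methods are invented here, and the numbers vary by operating system, hardware, and JVM. The javadoc also notes that nanoTime() has nanosecond precision but not necessarily nanosecond resolution.

```java
// Hypothetical probe, for illustration only.
final class ClockStepSketch {

    // Busy-waits until System.currentTimeMillis() returns a different value
    // and returns the size of that step in milliseconds.
    static long observedMillisStep() {
        long start = System.currentTimeMillis();
        long next;
        do {
            next = System.currentTimeMillis();
        } while (next == start);
        return next - start;
    }

    // Same idea for System.nanoTime(); the result is in nanoseconds.
    static long observedNanoStep() {
        long start = System.nanoTime();
        long next;
        do {
            next = System.nanoTime();
        } while (next == start);
        return next - start;
    }

    public static void main(String[] args) {
        System.out.println("currentTimeMillis step: " + observedMillisStep() + " ms");
        System.out.println("nanoTime step: " + observedNanoStep() + " ns");
    }
}
```

A single reading can be inflated if the thread is descheduled during the loop, so running it several times and taking the smallest value gives a better picture.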

-- 
Regards,
 Rafał Kuć
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


> Hi
>    I have just compared versions 4.6.0 and 4.7.1.
> I noticed that the time in the getConnection function is taken with
> System.nanoTime() in 4.7.1, while 4.6.0 uses System.currentTimeMillis().
>   I am curious about the reason for the change and its benefit. Is it
> necessary?
>    I have read SOLR-5734,
> https://issues.apache.org/jira/browse/SOLR-5734
>    I googled the difference between currentTimeMillis and nanoTime, but
> still cannot figure it out.




> 2014-04-26 2:24 GMT+08:00 Shawn Heisey <so...@elyograg.org>:

>> On 4/25/2014 11:56 AM, Hutchins, Jonathan wrote:
>>
>>> I recently upgraded from 4.6.1 to 4.7.1 and have found that the DIH
>>> process that we are using takes 4x as long to complete.  The only odd
>>> thing I notice is when I enable debug logging for the dataimporthandler
>>> process, it appears that in the new version each sql query is resulting in
>>> a new connection opened through jdbcdatasource (log:
>>> http://pastebin.com/JKh4gpmu).  Were there any changes that would affect
>>> the speed of running a full import?
>>>
>>
>> This is most likely the problem you are experiencing:
>>
>> https://issues.apache.org/jira/browse/SOLR-5954
>>
>> The fix will be in the new 4.8 version.  The release process for 4.8 is
>> underway right now.  A second release candidate was required yesterday.  If
>> no further problems are encountered, the release should be made around the
>> middle of next week.  If problems are encountered, the release will be
>> delayed.
>>
>> Here's something very important that has been mentioned before:  Solr 4.8
>> will require Java 7.  Previously, Java 6 was required.  Java 7u55 (the
>> current release from Oracle as I write this) is recommended as a minimum.
>>
>> If a 4.7.3 version is built, this is a fix that we should backport.
>>
>> Thanks,
>> Shawn
>>
>>


Re: DIH issues with 4.7.1

Posted by YouPeng Yang <yy...@gmail.com>.
Hi
   I have just compared versions 4.6.0 and 4.7.1.
I noticed that the time in the getConnection function is taken with
System.nanoTime() in 4.7.1, while 4.6.0 uses System.currentTimeMillis().
  I am curious about the reason for the change and its benefit. Is it
necessary?
   I have read SOLR-5734,
https://issues.apache.org/jira/browse/SOLR-5734
   I googled the difference between currentTimeMillis and nanoTime, but
still cannot figure it out.




2014-04-26 2:24 GMT+08:00 Shawn Heisey <so...@elyograg.org>:

> On 4/25/2014 11:56 AM, Hutchins, Jonathan wrote:
>
>> I recently upgraded from 4.6.1 to 4.7.1 and have found that the DIH
>> process that we are using takes 4x as long to complete.  The only odd
>> thing I notice is when I enable debug logging for the dataimporthandler
>> process, it appears that in the new version each sql query is resulting in
>> a new connection opened through jdbcdatasource (log:
>> http://pastebin.com/JKh4gpmu).  Were there any changes that would affect
>> the speed of running a full import?
>>
>
> This is most likely the problem you are experiencing:
>
> https://issues.apache.org/jira/browse/SOLR-5954
>
> The fix will be in the new 4.8 version.  The release process for 4.8 is
> underway right now.  A second release candidate was required yesterday.  If
> no further problems are encountered, the release should be made around the
> middle of next week.  If problems are encountered, the release will be
> delayed.
>
> Here's something very important that has been mentioned before:  Solr 4.8
> will require Java 7.  Previously, Java 6 was required.  Java 7u55 (the
> current release from Oracle as I write this) is recommended as a minimum.
>
> If a 4.7.3 version is built, this is a fix that we should backport.
>
> Thanks,
> Shawn
>
>

Re: DIH issues with 4.7.1

Posted by Shawn Heisey <so...@elyograg.org>.
On 4/25/2014 11:56 AM, Hutchins, Jonathan wrote:
> I recently upgraded from 4.6.1 to 4.7.1 and have found that the DIH
> process that we are using takes 4x as long to complete.  The only odd
> thing I notice is when I enable debug logging for the dataimporthandler
> process, it appears that in the new version each sql query is resulting in
> a new connection opened through jdbcdatasource (log:
> http://pastebin.com/JKh4gpmu).  Were there any changes that would affect
> the speed of running a full import?

This is most likely the problem you are experiencing:

https://issues.apache.org/jira/browse/SOLR-5954
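
For readers wondering how a timing change could lead to a new connection per query, here is a hypothetical sketch of an idle-timeout check. It is not the Solr source, and whether SOLR-5954 has exactly this cause should be confirmed in the JIRA issue. All names (IdleTimeoutExample, isStale, isStaleMixedUnits, the 10-second limit) are assumptions chosen for illustration. The point is only that a difference of System.nanoTime() readings must be compared against a limit expressed in nanoseconds.

```java
import java.util.concurrent.TimeUnit;

public class IdleTimeoutExample {

    // Hypothetical idle limit of 10 seconds, expressed in nanoseconds so it can be
    // compared directly with a difference of System.nanoTime() readings.
    static final long IDLE_TIMEOUT_NANOS = TimeUnit.SECONDS.toNanos(10);

    // The same limit written as a millisecond count. Comparing it with a
    // nanosecond difference is the unit mismatch being illustrated.
    static final long IDLE_TIMEOUT_MILLIS = 10 * 1000;

    /** Consistent check: both sides are in nanoseconds. */
    static boolean isStale(long lastUsedNanos, long nowNanos) {
        return nowNanos - lastUsedNanos > IDLE_TIMEOUT_NANOS;
    }

    /** Mismatched check: nanosecond difference against a millisecond count. */
    static boolean isStaleMixedUnits(long lastUsedNanos, long nowNanos) {
        return nowNanos - lastUsedNanos > IDLE_TIMEOUT_MILLIS;
    }

    public static void main(String[] args) {
        long lastUsed = 0L;
        long oneMsLater = lastUsed + TimeUnit.MILLISECONDS.toNanos(1);
        System.out.println(isStale(lastUsed, oneMsLater));           // false
        System.out.println(isStaleMixedUnits(lastUsed, oneMsLater)); // true
    }
}
```

With the mismatched check, a connection idle for a single millisecond already counts as stale, so a caller that reopens stale connections would reconnect on practically every query.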

The fix will be in the new 4.8 version.  The release process for 4.8 is 
underway right now.  A second release candidate was required yesterday.  
If no further problems are encountered, the release should be made 
around the middle of next week.  If problems are encountered, the 
release will be delayed.

Here's something very important that has been mentioned before:  Solr 
4.8 will require Java 7.  Previously, Java 6 was required.  Java 7u55 
(the current release from Oracle as I write this) is recommended as a 
minimum.

If a 4.7.3 version is built, this is a fix that we should backport.

Thanks,
Shawn


Re: DIH issues with 4.7.1

Posted by Alan Woodward <al...@flax.co.uk>.
Hi Jonathan,

It's a known bug: https://issues.apache.org/jira/browse/SOLR-5954.  It'll be fixed in 4.8, which is being voted on now.

Alan Woodward
www.flax.co.uk


On 25 Apr 2014, at 18:56, Hutchins, Jonathan wrote:

> I recently upgraded from 4.6.1 to 4.7.1 and have found that the DIH
> process that we are using takes 4x as long to complete.  The only odd
> thing I notice is when I enable debug logging for the dataimporthandler
> process, it appears that in the new version each sql query is resulting in
> a new connection opened through jdbcdatasource (log:
> http://pastebin.com/JKh4gpmu).  Were there any changes that would affect
> the speed of running a full import?
> 
> Thanks!
> 
> - Jonathan Hutchins
> 
>