You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucenenet.apache.org by pvginkel <gi...@git.apache.org> on 2017/05/08 11:44:33 UTC

[GitHub] lucenenet pull request #205: Made FSDirectory stale files set synchronized.

GitHub user pvginkel opened a pull request:

    https://github.com/apache/lucenenet/pull/205

    Made FSDirectory stale files set synchronized.

    `FSDirectory` has a hash set to track stale files. In the original Lucene source at e.g. https://github.com/apache/lucene-solr/blob/b7047694256a8e5808b9a9ddf480643ff768d8a9/lucene/core/src/java/org/apache/lucene/store/FSDirectory.java#L126, this is a synchronized set. In the .NET version, it's not, causing unsynchronized access to this set.
    
    Changed to a `Lucene.Net.Support.ConcurrentHashSet<>`.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/pvginkel/lucenenet synchronized-stale-files

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucenenet/pull/205.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #205
    
----

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] lucenenet issue #205: Made FSDirectory stale files set synchronized.

Posted by pvginkel <gi...@git.apache.org>.
Github user pvginkel commented on the issue:

    https://github.com/apache/lucenenet/pull/205
  
    I am currently working around this using reflection, so for me it's fine.
    However, this is causing real problems. I wrote a prototype which is
    indexing using 4 threads, a thread that commits every 5 seconds and a
    reopen thread every .5 seconds, at roughly 1500 documents per second,
    indexing all of Wikipedia. It crashes because of this problem every single
    run.
    
    On Mon, May 8, 2017, 14:25 Shad Storhaug <no...@github.com> wrote:
    
    > @synhershko <https://github.com/synhershko>
    >
    > Not without cancelling the vote and starting over (for the 3rd time) - we
    > have only 12 hours left on the current vote. I suspect there will be
    > several issues such as this that will need patching, so we should probably
    > wait until the next release to address them.
    >
    > @pvginkel <https://github.com/pvginkel>
    >
    > Good find. Keep them coming :).
    >
    > —
    > You are receiving this because you were mentioned.
    >
    >
    > Reply to this email directly, view it on GitHub
    > <https://github.com/apache/lucenenet/pull/205#issuecomment-299852696>, or mute
    > the thread
    > <https://github.com/notifications/unsubscribe-auth/AAymJu37xJlX_IYmMo1FHG8q1B-ART4Lks5r3wm_gaJpZM4NT1mj>
    > .
    >



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] lucenenet issue #205: Made FSDirectory stale files set synchronized.

Posted by synhershko <gi...@git.apache.org>.
Github user synhershko commented on the issue:

    https://github.com/apache/lucenenet/pull/205
  
    This is a great test - and as it happens many bugs in Apache Lucene (Java), even very severe ones, were caught using a randomized testing framework, which by definition is not deterministic.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] lucenenet issue #205: Made FSDirectory stale files set synchronized.

Posted by NightOwl888 <gi...@git.apache.org>.
Github user NightOwl888 commented on the issue:

    https://github.com/apache/lucenenet/pull/205
  
    > OK. So I should remove the `[Ignore]` attribute then?
    
    Yes. We should only use the `[Ignore]` attribute in cases where the test takes > 30 minutes and must be run manually, or was ignored in Lucene for some reason. Otherwise we want the tests to complain if there is something wrong with the software.
    
    > The only question I still have is about the timeout. In my tests, if it would test, it would test rather quickly. At the moment I have the timeout on 5 seconds, but it should be fine to lower this to e.g. 1 second or .5 seconds. Should I do this?
    
    I would say use your best judgement to set the timeout where it is long enough to fail at least 50% of the time on most systems. 5 seconds isn't too bad considering a full test run takes around an hour. I don't even consider them "long running tests" unless they are upwards of 1 minute.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] lucenenet issue #205: Made FSDirectory stale files set synchronized.

Posted by synhershko <gi...@git.apache.org>.
Github user synhershko commented on the issue:

    https://github.com/apache/lucenenet/pull/205
  
    Good catch, thanks!
    
    @NightOwl888 can we get this to the beta version before it's released?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] lucenenet issue #205: Made FSDirectory stale files set synchronized.

Posted by pvginkel <gi...@git.apache.org>.
Github user pvginkel commented on the issue:

    https://github.com/apache/lucenenet/pull/205
  
    OK. So I should remove the `[Ignore]` attribute then?
    
    The only question I still have is about the timeout. In my tests, if it would test, it would test rather quickly. At the moment I have the timeout on 5 seconds, but it should be fine to lower this to e.g. 1 second or .5 seconds. Should I do this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] lucenenet issue #205: Made FSDirectory stale files set synchronized.

Posted by NightOwl888 <gi...@git.apache.org>.
Github user NightOwl888 commented on the issue:

    https://github.com/apache/lucenenet/pull/205
  
    No big deal - I'd say the majority of the tests we have are non-deterministic - certainly the randomization makes them a bit unpredictable. If we don't get any false negatives or false positives, I'd say this is fine. So long as we have a way to detect the problem so it can be addressed if it pops up again in a future port.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] lucenenet issue #205: Made FSDirectory stale files set synchronized.

Posted by pvginkel <gi...@git.apache.org>.
Github user pvginkel commented on the issue:

    https://github.com/apache/lucenenet/pull/205
  
    I have, but this is a problematic one. This depends on a race condition, which is non-deterministic by definition.
    
    The test case I created runs a few threads for a few seconds and fails when the exception is thrown in one of the threads. Fun thing is that we're running until some timeout and then cancelling the test. If we wouldn't the test could theoretically run indefinitely, never hitting the race condition. However, since we run for some time interval, it could also be that the problem is still there, but it just wasn't hit.
    
    I've ignored this test case because:
    
    * It's non deterministic, so the fact that it succeeds doesn't mean that the problem isn't there, making the test case useless;
    * It prolongs running the test suite since we have to run for a non trivial amount of time. I've set this to 5 seconds.
    
    This all being said: the test case is there so you can try it if you'd like. On my 8 core workstation, it fails most of the time with the `HashSet<>` and succeeds all of the time with the `ConcurrentHashSet<>`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] lucenenet issue #205: Made FSDirectory stale files set synchronized.

Posted by NightOwl888 <gi...@git.apache.org>.
Github user NightOwl888 commented on the issue:

    https://github.com/apache/lucenenet/pull/205
  
    > I am currently working around this using reflection, so for me it's fine.
    
    Peter, could you please post your workaround here in case other users need it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] lucenenet issue #205: Made FSDirectory stale files set synchronized.

Posted by pvginkel <gi...@git.apache.org>.
Github user pvginkel commented on the issue:

    https://github.com/apache/lucenenet/pull/205
  
    Sure, no problem:
    
        private void ReplaceStaleFiles(FSDirectory fsDirectory)
        {
            // Work around for https://github.com/apache/lucenenet/pull/205.
    
            var field = typeof(FSDirectory).GetField("StaleFiles", BindingFlags.Instance | BindingFlags.NonPublic);
    
            field.SetValue(fsDirectory, new ConcurrentHashSet<string>());
        }


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] lucenenet issue #205: Made FSDirectory stale files set synchronized.

Posted by pvginkel <gi...@git.apache.org>.
Github user pvginkel commented on the issue:

    https://github.com/apache/lucenenet/pull/205
  
    I've removed the `[Ignore]` attribute and left the timeout as is.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] lucenenet issue #205: Made FSDirectory stale files set synchronized.

Posted by NightOwl888 <gi...@git.apache.org>.
Github user NightOwl888 commented on the issue:

    https://github.com/apache/lucenenet/pull/205
  
    @synhershko 
    
    Not without cancelling the vote and starting over (for the 3rd time) - we have only 12 hours left on the current vote. I suspect there will be several issues such as this that will need patching, so we should probably wait until the next release to address them.
    
    @pvginkel 
    
    Good find. Keep them coming :).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] lucenenet issue #205: Made FSDirectory stale files set synchronized.

Posted by NightOwl888 <gi...@git.apache.org>.
Github user NightOwl888 commented on the issue:

    https://github.com/apache/lucenenet/pull/205
  
    @pvginkel 
    
    Could you add a test to this PR in the [TestDirectory class](https://github.com/pvginkel/lucenenet/blob/synchronized-stale-files/src/Lucene.Net.Tests/Store/TestDirectory.cs) that fails with the current HashSet and passes with the ConcurrentHashSet? Not that I don't believe you fixed it, but for future porting efforts it would help if we don't repeat the same bugs over and over again.
    
    Please mark the test with the `[LuceneNetSpecific]` attribute.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] lucenenet pull request #205: Made FSDirectory stale files set synchronized.

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/lucenenet/pull/205


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---