You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@accumulo.apache.org by milleruntime <gi...@git.apache.org> on 2017/04/28 14:22:09 UTC
[GitHub] accumulo pull request #255: ACCUMULO-4365: Fix trace test in ShellServerIT
GitHub user milleruntime opened a pull request:
https://github.com/apache/accumulo/pull/255
ACCUMULO-4365: Fix trace test in ShellServerIT
This test has been frustrating for some time now... After spending some time looking into a potential bug (possibly one similar to ACCUMULO-4191) I don't think a bug actually exists.
This test looked for "sendMutations" traces generated from the insert command but ACCUMULO-1755 changed the way mutations were handled in TabletServerBatchWriter. The multithreaded nature of mutations are now too unpredictable and time sensitive to reliably look for "sendMutations" traces here.
Aside from ignoring the test, this is the best solution I can come up with...
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/milleruntime/accumulo ACCUMULO-4365
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/accumulo/pull/255.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #255
----
commit 18f99f38ecfc54c2b062a8f9c08477fc65f91a11
Author: Mike Miller <mm...@apache.org>
Date: 2017-04-28T14:12:12Z
ACCUMULO-4365: Fix trace test in ShellServerIT
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
Thanks @billierinaldi good idea with the tracer.span.min.ms parameter. I will give a try see what we get!
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
Shell trace output:
<pre>
Trace started at 2017/04/28 10:49:35.021
Time Start Service@Location Name
25863+0 shell@canx2um-78286w.infosec.tycho.ncsc.mil shell:root
1+2 tserver@0.0.0.0 listLocalUsers
3+4499 master@0.0.0.0 beginFateOperation
1+4508 shell@canx2um-78286w.infosec.tycho.ncsc.mil client:executeFateOperation
4+4512 master@0.0.0.0 executeFateOperation
3+4517 master@0.0.0.0 CreateTable
1+4520 master@0.0.0.0 CreateTable
13+4525 master@0.0.0.0 SetupPermissions
3+4541 master@0.0.0.0 PopulateZookeeper
11+4544 master@0.0.0.0 PopulateZookeeper
3+4559 master@0.0.0.0 ChooseDir
13+4565 master@0.0.0.0 CreateDir
12+4566 master@0.0.0.0 mkdir
5+4571 master@0.0.0.0 ClientNamenodeProtocol#mkdirs
3+4581 master@0.0.0.0 PopulateMetadata
1+4582 tserver@0.0.0.0 update
1+4582 tserver@0.0.0.0 wal
4+4586 master@0.0.0.0 FinishCreateTable
1+4518 shell@canx2um-78286w.infosec.tycho.ncsc.mil client:waitForFateOperation
88+4519 master@0.0.0.0 waitForFateOperation
1+4610 shell@canx2um-78286w.infosec.tycho.ncsc.mil client:finishFateOperation
3+4611 master@0.0.0.0 finishFateOperation
1+4623 tserver@0.0.0.0 listLocalUsers
56+8014 shell@canx2um-78286w.infosec.tycho.ncsc.mil close
46+8015 shell@canx2um-78286w.infosec.tycho.ncsc.mil BinMutations 1
44+8016 shell@canx2um-78286w.infosec.tycho.ncsc.mil binMutations
2+8043 shell@canx2um-78286w.infosec.tycho.ncsc.mil client:startScan
1+8046 tserver@0.0.0.0 startScan
1+8058 tserver@0.0.0.0 startScan
1+8058 tserver@0.0.0.0 metadata tablets read ahead 3
9+8061 shell@canx2um-78286w.infosec.tycho.ncsc.mil org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter 1
9+8061 shell@canx2um-78286w.infosec.tycho.ncsc.mil sendMutations
1+8066 shell@canx2um-78286w.infosec.tycho.ncsc.mil client:update
3+8067 tserver@0.0.0.0 update
1+8068 tserver@0.0.0.0 wal
1+8069 tserver@0.0.0.0 commit
1+8071 tserver@0.0.0.0 listLocalUsers
4+11506 tserver@0.0.0.0 getTableConfiguration
4+11513 tserver@0.0.0.0 getTableConfiguration
4+11519 tserver@0.0.0.0 getTableConfiguration
4+11524 tserver@0.0.0.0 getTableConfiguration
5+11540 shell@canx2um-78286w.infosec.tycho.ncsc.mil scan
5+11540 shell@canx2um-78286w.infosec.tycho.ncsc.mil scan:location
2+11542 tserver@0.0.0.0 startScan
2+11542 tserver@0.0.0.0 tablet read ahead 1
3+17872 master@0.0.0.0 beginFateOperation
4+17876 master@0.0.0.0 executeFateOperation
3+17881 master@0.0.0.0 DeleteTable
3+17884 master@0.0.0.0 DeleteTable
5+17999 master@0.0.0.0 CleanUp
3+18001 master@0.0.0.0 scan
3+18001 master@0.0.0.0 scan:location
2+18002 tserver@0.0.0.0 startScan
2+18002 tserver@0.0.0.0 metadata tablets read ahead 8
4+18058 master@0.0.0.0 CleanUp
3+18059 master@0.0.0.0 scan
3+18059 master@0.0.0.0 scan:location
1+18059 master@0.0.0.0 client:startScan
1+18060 tserver@0.0.0.0 startScan
1+18060 tserver@0.0.0.0 metadata tablets read ahead 1
40+18062 master@0.0.0.0 CleanUp
3+18063 master@0.0.0.0 batch scanner 25- 1
1+18064 tserver@0.0.0.0 startMultiScan
1+18064 tserver@0.0.0.0 metadata tablets read ahead 2
2+18067 master@0.0.0.0 scan
2+18067 master@0.0.0.0 scan:location
1+18067 tserver@0.0.0.0 startScan
1+18067 tserver@0.0.0.0 metadata tablets read ahead 3
4+18069 master@0.0.0.0 close
2+18076 master@0.0.0.0 scan
2+18076 master@0.0.0.0 scan:location
1+18077 tserver@0.0.0.0 startScan
1+18077 tserver@0.0.0.0 metadata tablets read ahead 4
5+18079 master@0.0.0.0 delete
3+18080 master@0.0.0.0 ClientNamenodeProtocol#delete
229+17880 master@0.0.0.0 waitForFateOperation
2+18110 master@0.0.0.0 finishFateOperation
1+23194 tserver@0.0.0.0 listLocalUsers
The following spans are not rooted (probably due to a parent span of length 0ms):
4+18069 master@0.0.0.0 org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter 1
4+18069 master@0.0.0.0 sendMutations
3+18070 tserver@0.0.0.0 wal
3+18070 tserver@0.0.0.0 update
</pre>
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
FAILURE running with Maven:
<pre>
Trace started at 2017/04/28 11:06:53.041
Time Start Service@Location Name
2251+0 shell@canx2um-78286w.infosec.tycho.ncsc.mil shell:root
2+1 master@0.0.0.0 beginFateOperation
2+3 master@0.0.0.0 executeFateOperation
1+6 master@0.0.0.0 CreateTable
1+7 master@0.0.0.0 CreateTable
5+9 master@0.0.0.0 SetupPermissions
1+16 master@0.0.0.0 PopulateZookeeper
5+17 master@0.0.0.0 PopulateZookeeper
2+24 master@0.0.0.0 ChooseDir
37+6 master@0.0.0.0 waitForFateOperation
2+43 master@0.0.0.0 finishFateOperation
1009+49 shell@canx2um-78286w.infosec.tycho.ncsc.mil close
5+49 shell@canx2um-78286w.infosec.tycho.ncsc.mil BinMutations 1
2+49 shell@canx2um-78286w.infosec.tycho.ncsc.mil binMutations
1+50 tserver@0.0.0.0 startScan
1+50 tserver@0.0.0.0 metadata tablets read ahead 5
2+1058 tserver@0.0.0.0 getTableConfiguration
2+1061 tserver@0.0.0.0 getTableConfiguration
2+1064 tserver@0.0.0.0 getTableConfiguration
3+1066 tserver@0.0.0.0 getTableConfiguration
2+1100 shell@canx2um-78286w.infosec.tycho.ncsc.mil scan
2+1100 shell@canx2um-78286w.infosec.tycho.ncsc.mil scan:location
2+1100 tserver@0.0.0.0 startScan
1+1101 tserver@0.0.0.0 tablet read ahead 8
2+1103 master@0.0.0.0 beginFateOperation
2+1105 master@0.0.0.0 executeFateOperation
2+1108 master@0.0.0.0 DeleteTable
1+1110 master@0.0.0.0 DeleteTable
2+1221 master@0.0.0.0 CleanUp
1+1222 master@0.0.0.0 scan
1+1222 master@0.0.0.0 scan:location
1+1222 tserver@0.0.0.0 startScan
1+1222 tserver@0.0.0.0 metadata tablets read ahead 3
19+1223 master@0.0.0.0 CleanUp
1+1223 master@0.0.0.0 batch scanner 525- 1
1+1225 master@0.0.0.0 close
1+1226 master@0.0.0.0 scan
1+1226 master@0.0.0.0 scan:location
138+1107 master@0.0.0.0 waitForFateOperation
1+1246 master@0.0.0.0 finishFateOperation
The following spans are not rooted (probably due to a parent span of length 0ms):
1+29 master@0.0.0.0 PopulateMetadata
1+29 tserver@0.0.0.0 prep
1+29 tserver@0.0.0.0 update
3+32 master@0.0.0.0 FinishCreateTable
</pre>
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by ctubbsii <gi...@git.apache.org>.
Github user ctubbsii commented on the issue:
https://github.com/apache/accumulo/pull/255
Don't forget to close the JIRA if this is done.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by ctubbsii <gi...@git.apache.org>.
Github user ctubbsii commented on the issue:
https://github.com/apache/accumulo/pull/255
The reason this is borked due to the thread pools is probably due to the fact that the parent span id is stored in a thread local variable. If we can create spans by explicitly passing the parent (constructor or static method), that would probably fix the issue. The more complicated fix is to use a trace-aware threadpool or executor service.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
Created a quick attempt at a static parent span to pass to the threads in 7444888ff8582bc31d0c61444103cc4dc9ff362a
Doesn't seem to work though.... whomp whomp
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by ctubbsii <gi...@git.apache.org>.
Github user ctubbsii commented on the issue:
https://github.com/apache/accumulo/pull/255
Also, GitHub appears broke right now... so if you see the same comment repeated 4x sometime later, it's because I tried to submit it as a review comment a few times first, but it didn't work.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by joshelser <gi...@git.apache.org>.
Github user joshelser commented on the issue:
https://github.com/apache/accumulo/pull/255
> Could it be because we have the traces in 2 separate thread pools and htrace gets confused:
> Or because we change the name of the thread (wtf??) in TabletServerBatchWriter.SendTask.send()
What little I knew about HTrace and it's implementation is lost in the wind, but both seem plausible. May require some new spelunking into their codebase (heck, I don't even know what version we're using of theirs anymore).
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
Merged into 1.7, 1.8 and master
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by joshelser <gi...@git.apache.org>.
Github user joshelser commented on the issue:
https://github.com/apache/accumulo/pull/255
Aha!
> The following spans are not rooted (probably due to a parent span of length 0ms):
I don't think this test is even validating the correct thing. That span is supposed to be under..
```
1010+168 shell@canx2um-78286w.infosec.tycho.ncsc.mil close
2+169 shell@canx2um-78286w.infosec.tycho.ncsc.mil BinMutations 1
2+169 shell@canx2um-78286w.infosec.tycho.ncsc.mil binMutations
```
The test is really just verifying that we non-rooted spans are still logged (which is definitely not the intent). It seems more like ACCUMULO-1755 broke the span hierarchy in the TSBW.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
> If we can create spans by explicitly passing the parent (constructor or static method)
I'm not quite sure exactly what you mean... instead of this:
`Span span = Trace.start("sendMutations");
...
span.stop();`
We would have something like:
`Span child = Span.trace(parent, "sendMutations");
...
child.stop() `
Wouldn't the child still be local?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
I think we got it! Looking good with 22c9481
I will run the IT a few more times just to make sure this is squashed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo pull request #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime closed the pull request at:
https://github.com/apache/accumulo/pull/255
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
Now why is it rooted correctly when run manually in the shell????
<pre>
shell@canx2um-78286w.infosec.tycho.ncsc.mil close
46+8015 shell@canx2um-78286w.infosec.tycho.ncsc.mil BinMutations 1
44+8016 shell@canx2um-78286w.infosec.tycho.ncsc.mil binMutations
2+8043 shell@canx2um-78286w.infosec.tycho.ncsc.mil client:startScan
1+8046 tserver@0.0.0.0 startScan
1+8058 tserver@0.0.0.0 startScan
1+8058 tserver@0.0.0.0 metadata tablets read ahead 3
9+8061 shell@canx2um-78286w.infosec.tycho.ncsc.mil org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter 1
9+8061 shell@canx2um-78286w.infosec.tycho.ncsc.mil sendMutations
</pre>
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by joshelser <gi...@git.apache.org>.
Github user joshelser commented on the issue:
https://github.com/apache/accumulo/pull/255
> Now why is it rooted correctly when run manually in the shell????
So, the hierarchy definitely seems goofed to me. MutationWriter+sendMutations are contained beneath BinMutations (i'd expected MutationWriter to be at the same level as BinMutations, beneath `close`). Maybe BinMutations should just be some "BinAndWrite" span (guessing that's the new thread which was introduced).
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by billierinaldi <gi...@git.apache.org>.
Github user billierinaldi commented on the issue:
https://github.com/apache/accumulo/pull/255
I figure there are two possibilities, 1) the sendMutations span isn't always created, which would be bad, or 2) the span is created, but it's dropped sometimes because it's of length 0ms. It could be the latter, since I see sendMutations was 1ms for one of the sample traces posted. We could eliminate the second possibility by configuring the span receiver not to drop 0ms spans for this test. This can be done by setting tracer.span.min.ms to 0 (default value is 1) in the accumulo conf.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
Could it be because we have the traces in 2 separate thread pools and htrace gets confused:
In org/apache/accumulo/core/client/impl/TabletServerBatchWriter.java
BinMutations:
`binningThreadPool.execute(Trace.wrap(new Runnable() {`
And sendMutations:
`sendThreadPool.submit(Trace.wrap(new SendTask(server)));`
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
Or because we change the name of the thread (wtf??) in TabletServerBatchWriter.SendTask.send() https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/client/impl/TabletServerBatchWriter.java#L856
before starting the "sendMutations" trace??
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by joshelser <gi...@git.apache.org>.
Github user joshelser commented on the issue:
https://github.com/apache/accumulo/pull/255
> The multithreaded nature of mutations are now too unpredictable and time sensitive to reliably look for "sendMutations" traces here.
Do you have more specifics as to why this is?
ACCUMULO-1755 added that single-thread that will bin and then send mutations. It seems odd that we would lose the sendMutations span (as that code hasn't really changed, just how it gets invoked).
Do you have the actual trace hierarchy handy from a test failure? Maybe it would ring a bell for @billierinaldi. I recall debugging weird tracing with her in a starbucks once upon a time.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by ctubbsii <gi...@git.apache.org>.
Github user ctubbsii commented on the issue:
https://github.com/apache/accumulo/pull/255
Instead of long javadoc, could just leave old behavior in, commented-out with comment line above it explaining that the span doesn't always appear (isn't always rooted properly).
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
With the span min set to 0, I am consistently seeing the Test pass and the output:
<pre> 0+2268 master@0.0.0.0 org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter 1
0+2268 master@0.0.0.0 sendMutations</pre>
The span taking <1ms would explain why this test was previously failing a lot on my machine
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
> Do you have more specifics as to why this is?
No, only experience with getting different traces and your and Shawn Walkers comments on [PR94](https://github.com/apache/accumulo/pull/94)
> ACCUMULO-1755 added that single-thread that will bin and then send mutations. It seems odd that we would lose the sendMutations span (as that code hasn't really changed, just how it gets invoked).
I don't think sendMutations get lost, just that its a race condition we don't have control over. Running the trace commands in the shell works fine. And you guys seemed to put a lot of design into ACCUMULO-1755.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
SUCCESS running with Maven:
<pre>
Trace started at 2017/04/28 11:01:58.382
Time Start Service@Location Name
2375+0 shell@canx2um-78286w.infosec.tycho.ncsc.mil shell:root
2+101 master@0.0.0.0 beginFateOperation
2+107 master@0.0.0.0 executeFateOperation
2+110 master@0.0.0.0 CreateTable
1+109 shell@canx2um-78286w.infosec.tycho.ncsc.mil client:waitForFateOperation
47+113 master@0.0.0.0 waitForFateOperation
2+160 master@0.0.0.0 finishFateOperation
1010+168 shell@canx2um-78286w.infosec.tycho.ncsc.mil close
2+169 shell@canx2um-78286w.infosec.tycho.ncsc.mil BinMutations 1
2+169 shell@canx2um-78286w.infosec.tycho.ncsc.mil binMutations
1+169 tserver@0.0.0.0 startScan
3+1178 tserver@0.0.0.0 getTableConfiguration
3+1181 tserver@0.0.0.0 getTableConfiguration
2+1184 tserver@0.0.0.0 getTableConfiguration
3+1186 tserver@0.0.0.0 getTableConfiguration
2+1219 shell@canx2um-78286w.infosec.tycho.ncsc.mil scan
2+1219 shell@canx2um-78286w.infosec.tycho.ncsc.mil scan:location
1+1220 tserver@0.0.0.0 startScan
1+1220 tserver@0.0.0.0 tablet read ahead 8
2+1222 master@0.0.0.0 beginFateOperation
2+1224 master@0.0.0.0 executeFateOperation
3+1227 master@0.0.0.0 DeleteTable
1+1230 master@0.0.0.0 DeleteTable
3+1342 master@0.0.0.0 CleanUp
2+1343 master@0.0.0.0 scan
2+1343 master@0.0.0.0 scan:location
21+1345 master@0.0.0.0 CleanUp
1+1345 master@0.0.0.0 batch scanner 516- 1
1+1347 master@0.0.0.0 scan
1+1347 master@0.0.0.0 scan:location
1+1347 tserver@0.0.0.0 startScan
1+1348 master@0.0.0.0 close
1+1349 master@0.0.0.0 scan
1+1349 master@0.0.0.0 scan:location
1+1349 tserver@0.0.0.0 startScan
1+1349 tserver@0.0.0.0 metadata tablets read ahead 3
142+1227 master@0.0.0.0 waitForFateOperation
2+1369 master@0.0.0.0 finishFateOperation
The following spans are not rooted (probably due to a parent span of length 0ms):
8+114 master@0.0.0.0 SetupPermissions
2+124 master@0.0.0.0 PopulateZookeeper
8+126 master@0.0.0.0 PopulateZookeeper
2+136 master@0.0.0.0 ChooseDir
1+142 master@0.0.0.0 PopulateMetadata
1+142 tserver@0.0.0.0 prep
1+142 tserver@0.0.0.0 update
4+145 master@0.0.0.0 FinishCreateTable
1+1348 master@0.0.0.0 sendMutations
1+1348 master@0.0.0.0 org.apache.accumulo.core.client.impl.TabletServerBatchWriter$MutationWriter 1
</pre>
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by milleruntime <gi...@git.apache.org>.
Github user milleruntime commented on the issue:
https://github.com/apache/accumulo/pull/255
So this is the guy causing us problems in Tracer.java:
`private static final ThreadLocal<Span> currentSpan = new ThreadLocal<Span>() {..}`
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by joshelser <gi...@git.apache.org>.
Github user joshelser commented on the issue:
https://github.com/apache/accumulo/pull/255
> I don't think sendMutations get lost, just that its a race condition we don't have control over.
Correct me if I'm wrong: but if we just don't understand the problem, I'm not in favor of removing this check. If we know definitively what the issue is and accept why it is something we cannot fix, that's a different story.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] accumulo issue #255: ACCUMULO-4365: Fix trace test in ShellServerIT
Posted by joshelser <gi...@git.apache.org>.
Github user joshelser commented on the issue:
https://github.com/apache/accumulo/pull/255
> So this is the guy causing us problems in Tracer.java:
Ah, so that ThreadLocal is designed so that we can transparently pick up the "current" thread. When we spawn a new thread, we need to pass along the current spanId to that child. I'm pretty sure we already have wrappers for `Runnable` to do this -- maybe they just weren't used?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---