Posted to solr-user@lucene.apache.org by Dave Seltzer <ds...@tveyes.com> on 2013/11/22 00:03:07 UTC

Periodic Slowness on Solr Cloud

I'm doing some performance testing against an 8-node Solr cloud cluster,
and I'm noticing some periodic slowness.


http://farm4.staticflickr.com/3668/10985410633_23e26c7681_o.png

I'm doing random test searches against an Alias Collection made up of four
smaller (monthly) collections. Like this:

MasterCollection
|- Collection201308
|- Collection201309
|- Collection201310
|- Collection201311
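
For reference, an alias of this shape is normally defined with the CREATEALIAS action of the Solr Collections API. The sketch below only builds the request URL and does not send it. The host, port, and the helper name create_alias_url are assumptions for illustration; none of them come from this setup.

```python
from urllib.parse import urlencode

# Hypothetical base URL: the thread does not say where Solr is listening.
SOLR_BASE = "http://localhost:8983/solr"


def create_alias_url(alias, collections, base=SOLR_BASE):
    """Build a Collections API CREATEALIAS request URL (nothing is sent)."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        # The alias points at a comma-separated list of collections.
        "collections": ",".join(collections),
    }
    # Keep commas unescaped so the URL stays readable.
    return f"{base}/admin/collections?{urlencode(params, safe=',')}"


monthly = [
    "Collection201308",
    "Collection201309",
    "Collection201310",
    "Collection201311",
]
alias_url = create_alias_url("MasterCollection", monthly)
print(alias_url)
```

Searching the alias fans out to all four collections, while updates would still be sent to the newest collection directly.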

The last collection is constantly updated. New documents are being added at
the rate of about 3 documents per second.

I believe the slowness may be due to NRT, but I'm not sure. How should I
investigate this?

If the slowness is related to NRT, how can I alleviate the issue without
disabling NRT?

Thanks Much!

-Dave

Re: Periodic Slowness on Solr Cloud

Posted by Dave Seltzer <ds...@tveyes.com>.
Wow. That is one noisy command!

Full output is below. The grepped output looks like:

[solr@searchtest07 ~]$ java -XX:+PrintFlagsFinal -version | grep -i -E 'heapsize|permsize|version'
    uintx AdaptivePermSizeWeight                    = 20              {product}
    uintx ErgoHeapSizeLimit                         = 0               {product}
    uintx HeapSizePerGCThread                       = 87241520        {product}
    uintx InitialHeapSize                          := 447247104       {product}
    uintx LargePageHeapSizeThreshold                = 134217728       {product}
    uintx MaxHeapSize                              := 7157579776      {product}
    uintx MaxPermSize                               = 85983232        {pd product}
    uintx PermSize                                  = 21757952        {pd product}
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)

It looks like Java is correctly determining that this is in fact a
"server." It seems to start with a default Xmx of 25% of the RAM, or around 7GB.
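
To make the arithmetic explicit, here is a small sketch that parses a few lines of the grepped output and converts the byte counts. The parse_flags helper is hypothetical. The step from max heap to total RAM assumes the default heap was derived as RAM / MaxRAMFraction, with MaxRAMFraction = 4 as shown in the full output, so the RAM figure is an estimate and not a measured value.

```python
import re

# Lines copied from the grepped PrintFlagsFinal output in this message.
FLAGS_OUTPUT = """\
    uintx InitialHeapSize                          := 447247104       {product}
    uintx MaxHeapSize                              := 7157579776      {product}
    uintx MaxPermSize                               = 85983232        {pd product}
"""

GIB = 1024 ** 3
MAX_RAM_FRACTION = 4  # value of MaxRAMFraction in the full output


def parse_flags(text):
    """Return {flag_name: int_value} for numeric flags in PrintFlagsFinal text."""
    flags = {}
    # Each line looks like: <type> <name> = <value> or <type> <name> := <value>
    pattern = r"^\s*\w+\s+(\w+)\s+:?=\s+(\d+)"
    for match in re.finditer(pattern, text, re.MULTILINE):
        flags[match.group(1)] = int(match.group(2))
    return flags


flags = parse_flags(FLAGS_OUTPUT)
max_heap_gib = flags["MaxHeapSize"] / GIB
# Assumption: default max heap = physical RAM / MaxRAMFraction.
implied_ram_gib = max_heap_gib * MAX_RAM_FRACTION

print(f"max heap ~ {max_heap_gib:.2f} GiB")
print(f"implied RAM ~ {implied_ram_gib:.1f} GiB")
```

The ":=" marker in the output means the value was changed from the compiled-in default, here by the JVM's own ergonomics and not by a command-line option.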

So, in addition to tweaking GC, I'm going to increase Xmx. Any advice as to
how much memory should go to the heap and how much should go to the OS disk
cache? Should I split it 50/50?

Again. Many Thanks.

-Dave


------------------------------------ Full output from printflags ------------------------------------------
[solr@searchtest07 ~]$ java -XX:+PrintFlagsFinal -version
[Global flags]
    uintx AdaptivePermSizeWeight                    = 20
 {product}
    uintx AdaptiveSizeDecrementScaleFactor          = 4
{product}
    uintx AdaptiveSizeMajorGCDecayTimeScale         = 10
 {product}
    uintx AdaptiveSizePausePolicy                   = 0
{product}
    uintx AdaptiveSizePolicyCollectionCostMargin    = 50
 {product}
    uintx AdaptiveSizePolicyInitializingSteps       = 20
 {product}
    uintx AdaptiveSizePolicyOutputInterval          = 0
{product}
    uintx AdaptiveSizePolicyWeight                  = 10
 {product}
    uintx AdaptiveSizeThroughPutPolicy              = 0
{product}
    uintx AdaptiveTimeWeight                        = 25
 {product}
     bool AdjustConcurrency                         = false
{product}
     bool AggressiveOpts                            = false
{product}
     intx AliasLevel                                = 3               {C2
product}
     bool AlignVector                               = false           {C2
product}
     intx AllocateInstancePrefetchLines             = 1
{product}
     intx AllocatePrefetchDistance                  = 192
{product}
     intx AllocatePrefetchInstr                     = 0
{product}
     intx AllocatePrefetchLines                     = 4
{product}
     intx AllocatePrefetchStepSize                  = 64
 {product}
     intx AllocatePrefetchStyle                     = 1
{product}
     bool AllowJNIEnvProxy                          = false
{product}
     bool AllowNonVirtualCalls                      = false
{product}
     bool AllowParallelDefineClass                  = false
{product}
     bool AllowUserSignalHandlers                   = false
{product}
     bool AlwaysActAsServerClassMachine             = false
{product}
     bool AlwaysCompileLoopMethods                  = false
{product}
     bool AlwaysLockClassLoader                     = false
{product}
     bool AlwaysPreTouch                            = false
{product}
     bool AlwaysRestoreFPU                          = false
{product}
     bool AlwaysTenure                              = false
{product}
     bool AssertOnSuspendWaitFailure                = false
{product}
     intx Atomics                                   = 0
{product}
     intx AutoBoxCacheMax                           = 128             {C2
product}
    uintx AutoGCSelectPauseMillis                   = 5000
 {product}
     intx BCEATraceLevel                            = 0
{product}
     intx BackEdgeThreshold                         = 100000          {pd
product}
     bool BackgroundCompilation                     = true            {pd
product}
    uintx BaseFootPrintEstimate                     = 268435456
{product}
     intx BiasedLockingBulkRebiasThreshold          = 20
 {product}
     intx BiasedLockingBulkRevokeThreshold          = 40
 {product}
     intx BiasedLockingDecayTime                    = 25000
{product}
     intx BiasedLockingStartupDelay                 = 4000
 {product}
     bool BindGCTaskThreadsToCPUs                   = false
{product}
     bool BlockLayoutByFrequency                    = true            {C2
product}
     intx BlockLayoutMinDiamondPercentage           = 20              {C2
product}
     bool BlockLayoutRotateLoops                    = true            {C2
product}
     bool BranchOnRegister                          = false           {C2
product}
     bool BytecodeVerificationLocal                 = false
{product}
     bool BytecodeVerificationRemote                = true
 {product}
     bool C1OptimizeVirtualCallProfiling            = true            {C1
product}
     bool C1ProfileBranches                         = true            {C1
product}
     bool C1ProfileCalls                            = true            {C1
product}
     bool C1ProfileCheckcasts                       = true            {C1
product}
     bool C1ProfileInlinedCalls                     = true            {C1
product}
     bool C1ProfileVirtualCalls                     = true            {C1
product}
     bool C1UpdateMethodData                        = true            {C1
product}
     intx CICompilerCount                           = 2
{product}
     bool CICompilerCountPerCPU                     = false
{product}
     bool CITime                                    = false
{product}
     bool CMSAbortSemantics                         = false
{product}
    uintx CMSAbortablePrecleanMinWorkPerIteration   = 100
{product}
     intx CMSAbortablePrecleanWaitMillis            = 100
{manageable}
    uintx CMSBitMapYieldQuantum                     = 10485760
 {product}
    uintx CMSBootstrapOccupancy                     = 50
 {product}
     bool CMSClassUnloadingEnabled                  = false
{product}
    uintx CMSClassUnloadingMaxInterval              = 0
{product}
     bool CMSCleanOnEnter                           = true
 {product}
     bool CMSCompactWhenClearAllSoftRefs            = true
 {product}
    uintx CMSConcMarkMultiple                       = 32
 {product}
     bool CMSConcurrentMTEnabled                    = true
 {product}
    uintx CMSCoordinatorYieldSleepCount             = 10
 {product}
     bool CMSDumpAtPromotionFailure                 = false
{product}
    uintx CMSExpAvgFactor                           = 50
 {product}
     bool CMSExtrapolateSweep                       = false
{product}
    uintx CMSFullGCsBeforeCompaction                = 0
{product}
    uintx CMSIncrementalDutyCycle                   = 10
 {product}
    uintx CMSIncrementalDutyCycleMin                = 0
{product}
     bool CMSIncrementalMode                        = false
{product}
    uintx CMSIncrementalOffset                      = 0
{product}
     bool CMSIncrementalPacing                      = true
 {product}
    uintx CMSIncrementalSafetyFactor                = 10
 {product}
    uintx CMSIndexedFreeListReplenish               = 4
{product}
     intx CMSInitiatingOccupancyFraction            = -1
 {product}
     intx CMSInitiatingPermOccupancyFraction        = -1
 {product}
     intx CMSIsTooFullPercentage                    = 98
 {product}
   double CMSLargeCoalSurplusPercent                = 0.950000
 {product}
   double CMSLargeSplitSurplusPercent               = 1.000000
 {product}
     bool CMSLoopWarn                               = false
{product}
    uintx CMSMaxAbortablePrecleanLoops              = 0
{product}
     intx CMSMaxAbortablePrecleanTime               = 5000
 {product}
    uintx CMSOldPLABMax                             = 1024
 {product}
    uintx CMSOldPLABMin                             = 16
 {product}
    uintx CMSOldPLABNumRefills                      = 4
{product}
    uintx CMSOldPLABReactivityFactor                = 2
{product}
     bool CMSOldPLABResizeQuicker                   = false
{product}
    uintx CMSOldPLABToleranceFactor                 = 4
{product}
     bool CMSPLABRecordAlways                       = true
 {product}
    uintx CMSParPromoteBlocksToClaim                = 16
 {product}
     bool CMSParallelRemarkEnabled                  = true
 {product}
     bool CMSParallelSurvivorRemarkEnabled          = true
 {product}
     bool CMSPermGenPrecleaningEnabled              = true
 {product}
    uintx CMSPrecleanDenominator                    = 3
{product}
    uintx CMSPrecleanIter                           = 3
{product}
    uintx CMSPrecleanNumerator                      = 2
{product}
     bool CMSPrecleanRefLists1                      = true
 {product}
     bool CMSPrecleanRefLists2                      = false
{product}
     bool CMSPrecleanSurvivors1                     = false
{product}
     bool CMSPrecleanSurvivors2                     = true
 {product}
    uintx CMSPrecleanThreshold                      = 1000
 {product}
     bool CMSPrecleaningEnabled                     = true
 {product}
     bool CMSPrintChunksInDump                      = false
{product}
     bool CMSPrintObjectsInDump                     = false
{product}
    uintx CMSRemarkVerifyVariant                    = 1
{product}
     bool CMSReplenishIntermediate                  = true
 {product}
    uintx CMSRescanMultiple                         = 32
 {product}
    uintx CMSRevisitStackSize                       = 1048576
{product}
    uintx CMSSamplingGrain                          = 16384
{product}
     bool CMSScavengeBeforeRemark                   = false
{product}
    uintx CMSScheduleRemarkEdenPenetration          = 50
 {product}
    uintx CMSScheduleRemarkEdenSizeThreshold        = 2097152
{product}
    uintx CMSScheduleRemarkSamplingRatio            = 5
{product}
   double CMSSmallCoalSurplusPercent                = 1.050000
 {product}
   double CMSSmallSplitSurplusPercent               = 1.100000
 {product}
     bool CMSSplitIndexedFreeListBlocks             = true
 {product}
     intx CMSTriggerPermRatio                       = 80
 {product}
     intx CMSTriggerRatio                           = 80
 {product}
     intx CMSWaitDuration                           = 2000
 {manageable}
    uintx CMSWorkQueueDrainThreshold                = 10
 {product}
     bool CMSYield                                  = true
 {product}
    uintx CMSYieldSleepCount                        = 0
{product}
     intx CMSYoungGenPerWorker                      = 67108864        {pd
product}
    uintx CMS_FLSPadding                            = 1
{product}
    uintx CMS_FLSWeight                             = 75
 {product}
    uintx CMS_SweepPadding                          = 1
{product}
    uintx CMS_SweepTimerThresholdMillis             = 10
 {product}
    uintx CMS_SweepWeight                           = 75
 {product}
     bool CheckJNICalls                             = false
{product}
     bool ClassUnloading                            = true
 {product}
     intx ClearFPUAtPark                            = 0
{product}
     bool ClipInlining                              = true
 {product}
    uintx CodeCacheExpansionSize                    = 65536           {pd
product}
    uintx CodeCacheFlushingMinimumFreeSpace         = 1536000
{product}
    uintx CodeCacheMinimumFreeSpace                 = 512000
 {product}
     bool CollectGen0First                          = false
{product}
     bool CompactFields                             = true
 {product}
     intx CompilationPolicyChoice                   = 0
{product}
     intx CompilationRepeat                         = 0               {C1
product}
ccstrlist CompileCommand                            =
{product}
    ccstr CompileCommandFile                        =
{product}
ccstrlist CompileOnly                               =
{product}
     intx CompileThreshold                          = 10000           {pd
product}
     bool CompilerThreadHintNoPreempt               = true
 {product}
     intx CompilerThreadPriority                    = -1
 {product}
     intx CompilerThreadStackSize                   = 0               {pd
product}
    uintx ConcGCThreads                             = 0
{product}
     intx ConditionalMoveLimit                      = 3               {C2
pd product}
     bool ConvertSleepToYield                       = true            {pd
product}
     bool ConvertYieldToSleep                       = false
{product}
     bool CreateMinidumpOnCrash                     = false
{product}
     bool CriticalJNINatives                        = true
 {product}
     bool DTraceAllocProbes                         = false
{product}
     bool DTraceMethodProbes                        = false
{product}
     bool DTraceMonitorProbes                       = false
{product}
     bool Debugging                                 = false
{product}
    uintx DefaultMaxRAMFraction                     = 4
{product}
     intx DefaultThreadPriority                     = -1
 {product}
     intx DeferPollingPageLoopCount                 = -1
 {product}
     intx DeferThrSuspendLoopCount                  = 4000
 {product}
     bool DeoptimizeRandom                          = false
{product}
     bool DisableAttachMechanism                    = false
{product}
     bool DisableExplicitGC                         = false
{product}
     bool DisplayVMOutputToStderr                   = false
{product}
     bool DisplayVMOutputToStdout                   = false
{product}
     bool DoEscapeAnalysis                          = true            {C2
product}
     bool DontCompileHugeMethods                    = true
 {product}
     bool DontYieldALot                             = false           {pd
product}
     bool DumpSharedSpaces                          = false
{product}
     bool EagerXrunInit                             = false
{product}
     intx EliminateAllocationArraySizeLimit         = 64              {C2
product}
     bool EliminateAllocations                      = true            {C2
product}
     bool EliminateLocks                            = true            {C2
product}
     bool EliminateNestedLocks                      = true            {C2
product}
     intx EmitSync                                  = 0
{product}
     bool EnableTracing                             = false
{product}
    uintx ErgoHeapSizeLimit                         = 0
{product}
    ccstr ErrorFile                                 =
{product}
    ccstr ErrorReportServer                         =
{product}
     bool EstimateArgEscape                         = true
 {product}
     bool ExplicitGCInvokesConcurrent               = false
{product}
     bool ExplicitGCInvokesConcurrentAndUnloadsClasses  = false
{product}
     bool ExtendedDTraceProbes                      = false
{product}
     bool FLSAlwaysCoalesceLarge                    = false
{product}
    uintx FLSCoalescePolicy                         = 2
{product}
   double FLSLargestBlockCoalesceProximity          = 0.990000
 {product}
     bool FailOverToOldVerifier                     = true
 {product}
     bool FastTLABRefill                            = true
 {product}
     intx FenceInstruction                          = 0               {ARCH
product}
     intx FieldsAllocationStyle                     = 1
{product}
     bool FilterSpuriousWakeups                     = true
 {product}
     bool ForceNUMA                                 = false
{product}
     bool ForceTimeHighResolution                   = false
{product}
     intx FreqInlineSize                            = 325             {pd
product}
   double G1ConcMarkStepDurationMillis              = 10.000000
{product}
    uintx G1ConcRSHotCardLimit                      = 4
{product}
    uintx G1ConcRSLogCacheSize                      = 10
 {product}
     intx G1ConcRefinementGreenZone                 = 0
{product}
     intx G1ConcRefinementRedZone                   = 0
{product}
     intx G1ConcRefinementServiceIntervalMillis     = 300
{product}
    uintx G1ConcRefinementThreads                   = 0
{product}
     intx G1ConcRefinementThresholdStep             = 0
{product}
     intx G1ConcRefinementYellowZone                = 0
{product}
    uintx G1ConfidencePercent                       = 50
 {product}
    uintx G1HeapRegionSize                          = 0
{product}
    uintx G1HeapWastePercent                        = 10
 {product}
    uintx G1MixedGCCountTarget                      = 8
{product}
     intx G1RSetRegionEntries                       = 0
{product}
    uintx G1RSetScanBlockSize                       = 64
 {product}
     intx G1RSetSparseRegionEntries                 = 0
{product}
     intx G1RSetUpdatingPauseTimePercent            = 10
 {product}
     intx G1RefProcDrainInterval                    = 10
 {product}
    uintx G1ReservePercent                          = 10
 {product}
    uintx G1SATBBufferEnqueueingThresholdPercent    = 60
 {product}
     intx G1SATBBufferSize                          = 1024
 {product}
     intx G1UpdateBufferSize                        = 256
{product}
     bool G1UseAdaptiveConcRefinement               = true
 {product}
    uintx GCDrainStackTargetSize                    = 64
 {product}
    uintx GCHeapFreeLimit                           = 2
{product}
    uintx GCLockerEdenExpansionPercent              = 5
{product}
     bool GCLockerInvokesConcurrent                 = false
{product}
    uintx GCLogFileSize                             = 0
{product}
    uintx GCPauseIntervalMillis                     = 0
{product}
    uintx GCTaskTimeStampEntries                    = 200
{product}
    uintx GCTimeLimit                               = 98
 {product}
    uintx GCTimeRatio                               = 99
 {product}
    uintx HeapBaseMinAddress                        = 2147483648      {pd
product}
     bool HeapDumpAfterFullGC                       = false
{manageable}
     bool HeapDumpBeforeFullGC                      = false
{manageable}
     bool HeapDumpOnOutOfMemoryError                = false
{manageable}
    ccstr HeapDumpPath                              =
{manageable}
    uintx HeapFirstMaximumCompactionCount           = 3
{product}
    uintx HeapMaximumCompactionInterval             = 20
 {product}
    uintx HeapSizePerGCThread                       = 87241520
 {product}
     bool IgnoreUnrecognizedVMOptions               = false
{product}
     bool IncrementalInline                         = true            {C2
product}
    uintx InitialCodeCacheSize                      = 2555904         {pd
product}
    uintx InitialHeapSize                          := 447247104
{product}
    uintx InitialRAMFraction                        = 64
 {product}
    uintx InitialSurvivorRatio                      = 8
{product}
     intx InitialTenuringThreshold                  = 7
{product}
    uintx InitiatingHeapOccupancyPercent            = 45
 {product}
     bool Inline                                    = true
 {product}
     intx InlineSmallCode                           = 1000            {pd
product}
     bool InsertMemBarAfterArraycopy                = true            {C2
product}
     intx InteriorEntryAlignment                    = 16              {C2
pd product}
     intx InterpreterProfilePercentage              = 33
 {product}
     bool JNIDetachReleasesMonitors                 = true
 {product}
     bool JavaMonitorsInStackTrace                  = true
 {product}
     intx JavaPriority10_To_OSPriority              = -1
 {product}
     intx JavaPriority1_To_OSPriority               = -1
 {product}
     intx JavaPriority2_To_OSPriority               = -1
 {product}
     intx JavaPriority3_To_OSPriority               = -1
 {product}
     intx JavaPriority4_To_OSPriority               = -1
 {product}
     intx JavaPriority5_To_OSPriority               = -1
 {product}
     intx JavaPriority6_To_OSPriority               = -1
 {product}
     intx JavaPriority7_To_OSPriority               = -1
 {product}
     intx JavaPriority8_To_OSPriority               = -1
 {product}
     intx JavaPriority9_To_OSPriority               = -1
 {product}
     bool LIRFillDelaySlots                         = false           {C1
pd product}
    uintx LargePageHeapSizeThreshold                = 134217728
{product}
    uintx LargePageSizeInBytes                      = 0
{product}
     bool LazyBootClassLoader                       = true
 {product}
     intx LiveNodeCountInliningCutoff               = 20000           {C2
product}
     bool LoadExecStackDllInVMThread                = true
 {product}
     intx LoopOptsCount                             = 43              {C2
product}
     intx LoopUnrollLimit                           = 60              {C2
pd product}
     intx LoopUnrollMin                             = 4               {C2
product}
     bool LoopUnswitching                           = true            {C2
product}
     bool ManagementServer                          = false
{product}
    uintx MarkStackSize                             = 4194304
{product}
    uintx MarkStackSizeMax                          = 536870912
{product}
     intx MarkSweepAlwaysCompactCount               = 4
{product}
    uintx MarkSweepDeadRatio                        = 1
{product}
     intx MaxBCEAEstimateLevel                      = 5
{product}
     intx MaxBCEAEstimateSize                       = 150
{product}
    uintx MaxDirectMemorySize                       = 0
{product}
     bool MaxFDLimit                                = true
 {product}
    uintx MaxGCMinorPauseMillis                     =
18446744073709551615{product}
    uintx MaxGCPauseMillis                          =
18446744073709551615{product}
    uintx MaxHeapFreeRatio                          = 70
 {product}
    uintx MaxHeapSize                              := 7157579776
 {product}
     intx MaxInlineLevel                            = 9
{product}
     intx MaxInlineSize                             = 35
 {product}
     intx MaxJavaStackTraceDepth                    = 1024
 {product}
     intx MaxJumpTableSize                          = 65000           {C2
product}
     intx MaxJumpTableSparseness                    = 5               {C2
product}
     intx MaxLabelRootDepth                         = 1100            {C2
product}
     intx MaxLoopPad                                = 11              {C2
product}
    uintx MaxNewSize                                =
18446744073709486080{product}
     intx MaxNodeLimit                              = 75000           {C2
product}
    uintx MaxPermHeapExpansion                      = 5439488
{product}
    uintx MaxPermSize                               = 85983232        {pd
product}
 uint64_t MaxRAM                                    = 137438953472    {pd
product}
    uintx MaxRAMFraction                            = 4
{product}
     intx MaxRecursiveInlineLevel                   = 1
{product}
     intx MaxTenuringThreshold                      = 15
 {product}
     intx MaxTrivialSize                            = 6
{product}
     intx MaxVectorSize                             = 16              {C2
product}
     bool MethodFlushing                            = true
 {product}
     intx MinCodeCacheFlushingInterval              = 30
 {product}
    uintx MinHeapDeltaBytes                         = 196608
 {product}
    uintx MinHeapFreeRatio                          = 40
 {product}
     intx MinInliningThreshold                      = 250
{product}
     intx MinJumpTableSize                          = 18              {C2
product}
    uintx MinPermHeapExpansion                      = 327680
 {product}
    uintx MinRAMFraction                            = 2
{product}
    uintx MinSurvivorRatio                          = 3
{product}
    uintx MinTLABSize                               = 2048
 {product}
     intx MonitorBound                              = 0
{product}
     bool MonitorInUseLists                         = false
{product}
     intx MultiArrayExpandLimit                     = 6               {C2
product}
     bool MustCallLoadClassInternal                 = false
{product}
     intx NUMAChunkResizeWeight                     = 20
 {product}
    uintx NUMAInterleaveGranularity                 = 2097152
{product}
     intx NUMAPageScanRate                          = 256
{product}
     intx NUMASpaceResizeRate                       = 1073741824
 {product}
     bool NUMAStats                                 = false
{product}
    ccstr NativeMemoryTracking                      = off
{product}
     intx NativeMonitorFlags                        = 0
{product}
     intx NativeMonitorSpinLimit                    = 20
 {product}
     intx NativeMonitorTimeout                      = -1
 {product}
     bool NeedsDeoptSuspend                         = false           {pd
product}
     bool NeverActAsServerClassMachine              = false           {pd
product}
     bool NeverTenure                               = false
{product}
     intx NewRatio                                  = 2
{product}
    uintx NewSize                                   = 1310720
{product}
    uintx NewSizeThreadIncrease                     = 5320            {pd
product}
     intx NmethodSweepCheckInterval                 = 5
{product}
     intx NmethodSweepFraction                      = 16
 {product}
     intx NodeLimitFudgeFactor                      = 1000            {C2
product}
    uintx NumberOfGCLogFiles                        = 0
{product}
     intx NumberOfLoopInstrToAlign                  = 4               {C2
product}
     intx ObjectAlignmentInBytes                    = 8
{lp64_product}
    uintx OldPLABSize                               = 1024
 {product}
    uintx OldPLABWeight                             = 50
 {product}
    uintx OldSize                                   = 5439488
{product}
     bool OmitStackTraceInFastThrow                 = true
 {product}
ccstrlist OnError                                   =
{product}
ccstrlist OnOutOfMemoryError                        =
{product}
     intx OnStackReplacePercentage                  = 140             {pd
product}
     bool OptimizeFill                              = true            {C2
product}
     bool OptimizePtrCompare                        = true            {C2
product}
     bool OptimizeStringConcat                      = true            {C2
product}
     bool OptoBundling                              = false           {C2
pd product}
     intx OptoLoopAlignment                         = 16              {pd
product}
     bool OptoScheduling                            = false           {C2
pd product}
    uintx PLABWeight                                = 75
 {product}
     bool PSChunkLargeArrays                        = true
 {product}
     intx ParGCArrayScanChunk                       = 50
 {product}
    uintx ParGCDesiredObjsFromOverflowList          = 20
 {product}
     bool ParGCTrimOverflow                         = true
 {product}
     bool ParGCUseLocalOverflow                     = false
{product}
     intx ParallelGCBufferWastePct                  = 10
 {product}
    uintx ParallelGCThreads                         = 8
{product}
     bool ParallelGCVerbose                         = false
{product}
    uintx ParallelOldDeadWoodLimiterMean            = 50
 {product}
    uintx ParallelOldDeadWoodLimiterStdDev          = 80
 {product}
     bool ParallelRefProcBalancingEnabled           = true
 {product}
     bool ParallelRefProcEnabled                    = false
{product}
     bool PartialPeelAtUnsignedTests                = true            {C2
product}
     bool PartialPeelLoop                           = true            {C2
product}
     intx PartialPeelNewPhiDelta                    = 0               {C2
product}
    uintx PausePadding                              = 1
{product}
     intx PerBytecodeRecompilationCutoff            = 200
{product}
     intx PerBytecodeTrapLimit                      = 4
{product}
     intx PerMethodRecompilationCutoff              = 400
{product}
     intx PerMethodTrapLimit                        = 100
{product}
     bool PerfAllowAtExitRegistration               = false
{product}
     bool PerfBypassFileSystemCheck                 = false
{product}
     intx PerfDataMemorySize                        = 32768
{product}
     intx PerfDataSamplingInterval                  = 50
 {product}
    ccstr PerfDataSaveFile                          =
{product}
     bool PerfDataSaveToFile                        = false
{product}
     bool PerfDisableSharedMem                      = false
{product}
     intx PerfMaxStringConstLength                  = 1024
 {product}
    uintx PermGenPadding                            = 3
{product}
    uintx PermMarkSweepDeadRatio                    = 5
{product}
    uintx PermSize                                  = 21757952        {pd
product}
     intx PreInflateSpin                            = 10              {pd
product}
     bool PreferInterpreterNativeStubs              = false           {pd
product}
     intx PrefetchCopyIntervalInBytes               = 576
{product}
     intx PrefetchFieldsAhead                       = 1
{product}
     intx PrefetchScanIntervalInBytes               = 576
{product}
     bool PreserveAllAnnotations                    = false
{product}
    uintx PretenureSizeThreshold                    = 0
{product}
     bool PrintAdaptiveSizePolicy                   = false
{product}
     bool PrintCMSInitiationStatistics              = false
{product}
     intx PrintCMSStatistics                        = 0
{product}
     bool PrintClassHistogram                       = false
{manageable}
     bool PrintClassHistogramAfterFullGC            = false
{manageable}
     bool PrintClassHistogramBeforeFullGC           = false
{manageable}
     bool PrintCommandLineFlags                     = false
{product}
     bool PrintCompilation                          = false
{product}
     bool PrintConcurrentLocks                      = false
{manageable}
     intx PrintFLSCensus                            = 0
{product}
     intx PrintFLSStatistics                        = 0
{product}
     bool PrintFlagsFinal                          := true
 {product}
     bool PrintFlagsInitial                         = false
{product}
     bool PrintGC                                   = false
{manageable}
     bool PrintGCApplicationConcurrentTime          = false
{product}
     bool PrintGCApplicationStoppedTime             = false
{product}
     bool PrintGCCause                              = false
{product}
     bool PrintGCDateStamps                         = false
{manageable}
     bool PrintGCDetails                            = false
{manageable}
     bool PrintGCTaskTimeStamps                     = false
{product}
     bool PrintGCTimeStamps                         = false
{manageable}
     bool PrintHeapAtGC                             = false
{product rw}
     bool PrintHeapAtGCExtended                     = false
{product rw}
     bool PrintHeapAtSIGBREAK                       = true
 {product}
     bool PrintJNIGCStalls                          = false
{product}
     bool PrintJNIResolving                         = false
{product}
     bool PrintOldPLAB                              = false
{product}
     bool PrintOopAddress                           = false
{product}
     bool PrintPLAB                                 = false
{product}
     bool PrintParallelOldGCPhaseTimes              = false
{product}
     bool PrintPromotionFailure                     = false
{product}
     bool PrintReferenceGC                          = false
{product}
     bool PrintRevisitStats                         = false
{product}
     bool PrintSafepointStatistics                  = false
{product}
     intx PrintSafepointStatisticsCount             = 300
{product}
     intx PrintSafepointStatisticsTimeout           = -1
 {product}
     bool PrintSharedSpaces                         = false
{product}
     bool PrintStringTableStatistics                = false
{product}
     bool PrintTLAB                                 = false
{product}
     bool PrintTenuringDistribution                 = false
{product}
     bool PrintTieredEvents                         = false
{product}
     bool PrintVMOptions                            = false
{product}
     bool PrintVMQWaitTime                          = false
{product}
     bool PrintWarnings                             = true
 {product}
    uintx ProcessDistributionStride                 = 4
{product}
     bool ProfileInterpreter                        = true            {pd
product}
     bool ProfileIntervals                          = false
{product}
     intx ProfileIntervalsTicks                     = 100
{product}
     intx ProfileMaturityPercentage                 = 20
 {product}
     bool ProfileVM                                 = false
{product}
     bool ProfilerPrintByteCodeStatistics           = false
{product}
     bool ProfilerRecordPC                          = false
{product}
    uintx PromotedPadding                           = 3
{product}
     intx QueuedAllocationWarningCount              = 0
{product}
     bool RangeCheckElimination                     = true
 {product}
     intx ReadPrefetchInstr                         = 0               {ARCH
product}
     bool ReassociateInvariants                     = true            {C2
product}
     bool ReduceBulkZeroing                         = true            {C2
product}
     bool ReduceFieldZeroing                        = true            {C2
product}
     bool ReduceInitialCardMarks                    = true            {C2
product}
     bool ReduceSignalUsage                         = false
{product}
     intx RefDiscoveryPolicy                        = 0
{product}
     bool ReflectionWrapResolutionErrors            = true
 {product}
     bool RegisterFinalizersAtInit                  = true
 {product}
     bool RelaxAccessControlCheck                   = false
{product}
     bool RequireSharedSpaces                       = false
{product}
    uintx ReservedCodeCacheSize                     = 50331648        {pd
product}
     bool ResizeOldPLAB                             = true
 {product}
     bool ResizePLAB                                = true
 {product}
     bool ResizeTLAB                                = true            {pd
product}
     bool RestoreMXCSROnJNICalls                    = false
{product}
     bool RewriteBytecodes                          = true            {pd
product}
     bool RewriteFrequentPairs                      = true            {pd
product}
     intx SafepointPollOffset                       = 256             {C1
pd product}
     intx SafepointSpinBeforeYield                  = 2000
 {product}
     bool SafepointTimeout                          = false
{product}
     intx SafepointTimeoutDelay                     = 10000
{product}
     bool ScavengeBeforeFullGC                      = true
 {product}
     intx SelfDestructTimer                         = 0
{product}
    uintx SharedDummyBlockSize                      = 536870912
{product}
    uintx SharedMiscCodeSize                        = 4194304
{product}
    uintx SharedMiscDataSize                        = 6291456
{product}
    uintx SharedReadOnlySize                        = 10485760
 {product}
    uintx SharedReadWriteSize                       = 14680064
 {product}
     bool ShowMessageBoxOnError                     = false
{product}
     intx SoftRefLRUPolicyMSPerMB                   = 1000
 {product}
     bool SplitIfBlocks                             = true            {C2
product}
     intx StackRedPages                             = 1               {pd
product}
     intx StackShadowPages                          = 20              {pd
product}
     bool StackTraceInThrowable                     = true
 {product}
     intx StackYellowPages                          = 2               {pd
product}
     bool StartAttachListener                       = false
{product}
     intx StarvationMonitorInterval                 = 200
{product}
     bool StressLdcRewrite                          = false
{product}
    uintx StringTableSize                           = 60013
{product}
     bool SuppressFatalErrorMessage                 = false
{product}
    uintx SurvivorPadding                           = 3
{product}
     intx SurvivorRatio                             = 8
{product}
     intx SuspendRetryCount                         = 50
 {product}
     intx SuspendRetryDelay                         = 5
{product}
     intx SyncFlags                                 = 0
{product}
    ccstr SyncKnobs                                 =
{product}
     intx SyncVerbose                               = 0
{product}
    uintx TLABAllocationWeight                      = 35
 {product}
    uintx TLABRefillWasteFraction                   = 64
 {product}
    uintx TLABSize                                  = 0
{product}
     bool TLABStats                                 = true
 {product}
    uintx TLABWasteIncrement                        = 4
{product}
    uintx TLABWasteTargetPercent                    = 1
{product}
     intx TargetPLABWastePct                        = 10
 {product}
     intx TargetSurvivorRatio                       = 50
 {product}
    uintx TenuredGenerationSizeIncrement            = 20
 {product}
    uintx TenuredGenerationSizeSupplement           = 80
 {product}
    uintx TenuredGenerationSizeSupplementDecay      = 2
{product}
     intx ThreadPriorityPolicy                      = 0
{product}
     bool ThreadPriorityVerbose                     = false
{product}
    uintx ThreadSafetyMargin                        = 52428800
 {product}
     intx ThreadStackSize                           = 1024            {pd
product}
    uintx ThresholdTolerance                        = 10
 {product}
     intx Tier0BackedgeNotifyFreqLog                = 10
 {product}
     intx Tier0InvokeNotifyFreqLog                  = 7
{product}
     intx Tier0ProfilingStartPercentage             = 200
{product}
     intx Tier23InlineeNotifyFreqLog                = 20
 {product}
     intx Tier2BackEdgeThreshold                    = 0
{product}
     intx Tier2BackedgeNotifyFreqLog                = 14
 {product}
     intx Tier2CompileThreshold                     = 0
{product}
     intx Tier2InvokeNotifyFreqLog                  = 11
 {product}
     intx Tier3BackEdgeThreshold                    = 60000
{product}
     intx Tier3BackedgeNotifyFreqLog                = 13
 {product}
     intx Tier3CompileThreshold                     = 2000
 {product}
     intx Tier3DelayOff                             = 2
{product}
     intx Tier3DelayOn                              = 5
{product}
     intx Tier3InvocationThreshold                  = 200
{product}
     intx Tier3InvokeNotifyFreqLog                  = 10
 {product}
     intx Tier3LoadFeedback                         = 5
{product}
     intx Tier3MinInvocationThreshold               = 100
{product}
     intx Tier4BackEdgeThreshold                    = 40000
{product}
     intx Tier4CompileThreshold                     = 15000
{product}
     intx Tier4InvocationThreshold                  = 5000
 {product}
     intx Tier4LoadFeedback                         = 3
{product}
     intx Tier4MinInvocationThreshold               = 600
{product}
     bool TieredCompilation                         = false           {pd
product}
     intx TieredCompileTaskTimeout                  = 50
 {product}
     intx TieredRateUpdateMaxTime                   = 25
 {product}
     intx TieredRateUpdateMinTime                   = 1
{product}
     intx TieredStopAtLevel                         = 4
{product}
     bool TimeLinearScan                            = false           {C1
product}
     bool TraceBiasedLocking                        = false
{product}
     bool TraceClassLoading                         = false
{product rw}
     bool TraceClassLoadingPreorder                 = false
{product}
     bool TraceClassResolution                      = false
{product}
     bool TraceClassUnloading                       = false
{product rw}
     bool TraceDynamicGCThreads                     = false
{product}
     bool TraceGen0Time                             = false
{product}
     bool TraceGen1Time                             = false
{product}
    ccstr TraceJVMTI                                =
{product}
     bool TraceLoaderConstraints                    = false
{product rw}
     bool TraceMonitorInflation                     = false
{product}
     bool TraceParallelOldGCTasks                   = false
{product}
     intx TraceRedefineClasses                      = 0
{product}
     bool TraceSafepointCleanupTime                 = false
{product}
     bool TraceSuspendWaitFailures                  = false
{product}
     intx TrackedInitializationLimit                = 50              {C2
product}
     bool TransmitErrorReport                       = false
{product}
     intx TypeProfileMajorReceiverPercent           = 90              {C2
product}
     intx TypeProfileWidth                          = 2
{product}
     intx UnguardOnExecutionViolation               = 0
{product}
     bool UnlinkSymbolsALot                         = false
{product}
     bool Use486InstrsOnly                          = false           {ARCH
product}
     bool UseAES                                    = true
 {product}
     bool UseAESIntrinsics                          = true
 {product}
     intx UseAVX                                    = 0               {ARCH
product}
     bool UseAdaptiveGCBoundary                     = false
{product}
     bool UseAdaptiveGenerationSizePolicyAtMajorCollection  = true
   {product}
     bool UseAdaptiveGenerationSizePolicyAtMinorCollection  = true
   {product}
     bool UseAdaptiveNUMAChunkSizing                = true
 {product}
     bool UseAdaptiveSizeDecayMajorGCCost           = true
 {product}
     bool UseAdaptiveSizePolicy                     = true
 {product}
     bool UseAdaptiveSizePolicyFootprintGoal        = true
 {product}
     bool UseAdaptiveSizePolicyWithSystemGC         = false
{product}
     bool UseAddressNop                             = true            {ARCH
product}
     bool UseAltSigs                                = false
{product}
     bool UseAutoGCSelectPolicy                     = false
{product}
     bool UseBiasedLocking                          = true
 {product}
     bool UseBimorphicInlining                      = true            {C2
product}
     bool UseBoundThreads                           = true
 {product}
     bool UseCMSBestFit                             = true
 {product}
     bool UseCMSCollectionPassing                   = true
 {product}
     bool UseCMSCompactAtFullCollection             = true
 {product}
     bool UseCMSInitiatingOccupancyOnly             = false
{product}
     bool UseCodeCacheFlushing                      = true
 {product}
     bool UseCompiler                               = true
 {product}
     bool UseCompilerSafepoints                     = true
 {product}
     bool UseCompressedOops                        := true
 {lp64_product}
     bool UseConcMarkSweepGC                        = false
{product}
     bool UseCondCardMark                           = false           {C2
product}
     bool UseCountLeadingZerosInstruction           = false           {ARCH
product}
     bool UseCounterDecay                           = true
 {product}
     bool UseDivMod                                 = true            {C2
product}
     bool UseDynamicNumberOfGCThreads               = false
{product}
     bool UseFPUForSpilling                         = false           {C2
product}
     bool UseFastAccessorMethods                    = false
{product}
     bool UseFastEmptyMethods                       = false
{product}
     bool UseFastJNIAccessors                       = true
 {product}
     bool UseFastStosb                              = true            {ARCH
product}
     bool UseG1GC                                   = false
{product}
     bool UseGCLogFileRotation                      = false
{product}
     bool UseGCOverheadLimit                        = true
 {product}
     bool UseGCTaskAffinity                         = false
{product}
     bool UseHeavyMonitors                          = false
{product}
     bool UseHugeTLBFS                              = false
{product}
     bool UseInlineCaches                           = true
 {product}
     bool UseInterpreter                            = true
 {product}
     bool UseJumpTables                             = true            {C2
product}
     bool UseLWPSynchronization                     = true
 {product}
     bool UseLargePages                             = false           {pd
product}
     bool UseLargePagesIndividualAllocation         = false           {pd
product}
     bool UseLinuxPosixThreadCPUClocks              = true
 {product}
     bool UseLockedTracing                          = false
{product}
     bool UseLoopCounter                            = true
 {product}
     bool UseLoopPredicate                          = true            {C2
product}
     bool UseMaximumCompactionOnSystemGC            = true
 {product}
     bool UseMembar                                 = false           {pd
product}
     bool UseNUMA                                   = false
{product}
     bool UseNUMAInterleaving                       = false
{product}
     bool UseNewLongLShift                          = false           {ARCH
product}
     bool UseOSErrorReporting                       = false           {pd
product}
     bool UseOldInlining                            = true            {C2
product}
     bool UseOnStackReplacement                     = true            {pd
product}
     bool UseOnlyInlinedBimorphic                   = true            {C2
product}
     bool UseOprofile                               = false
{product}
     bool UseOptoBiasInlining                       = true            {C2
product}
     bool UsePPCLWSYNC                              = true
 {product}
     bool UsePSAdaptiveSurvivorSizePolicy           = true
 {product}
     bool UseParNewGC                               = false
{product}
     bool UseParallelGC                            := true
 {product}
     bool UseParallelOldGC                          = true
 {product}
     bool UsePerfData                               = true
 {product}
     bool UsePopCountInstruction                    = true
 {product}
     bool UseRDPCForConstantTableBase               = false           {C2
product}
     bool UseSHM                                    = false
{product}
     intx UseSSE                                    = 4
{product}
     bool UseSSE42Intrinsics                        = true
 {product}
     bool UseSerialGC                               = false
{product}
     bool UseSharedSpaces                           = false
{product}
     bool UseSignalChaining                         = true
 {product}
     bool UseSplitVerifier                          = true
 {product}
     bool UseStoreImmI16                            = false           {ARCH
product}
     bool UseStringCache                            = false
{product}
     bool UseSuperWord                              = true            {C2
product}
     bool UseTLAB                                   = true            {pd
product}
     bool UseThreadPriorities                       = true            {pd
product}
     bool UseTypeProfile                            = true
 {product}
     bool UseUnalignedLoadStores                    = true            {ARCH
product}
     bool UseVMInterruptibleIO                      = false
{product}
     bool UseVectoredExceptions                     = false           {pd
product}
     bool UseXMMForArrayCopy                        = true
 {product}
     bool UseXmmI2D                                 = false           {ARCH
product}
     bool UseXmmI2F                                 = false           {ARCH
product}
     bool UseXmmLoadAndClearUpper                   = true            {ARCH
product}
     bool UseXmmRegToRegMoveAll                     = true            {ARCH
product}
     bool VMThreadHintNoPreempt                     = false
{product}
     intx VMThreadPriority                          = -1
 {product}
     intx VMThreadStackSize                         = 1024            {pd
product}
     intx ValueMapInitialSize                       = 11              {C1
product}
     intx ValueMapMaxLoopSize                       = 8               {C1
product}
     intx ValueSearchLimit                          = 1000            {C2
product}
     bool VerifyMergedCPBytecodes                   = true
 {product}
     intx WorkAroundNPTLTimedWaitHang               = 1
{product}
    uintx YoungGenerationSizeIncrement              = 20
 {product}
    uintx YoungGenerationSizeSupplement             = 80
 {product}
    uintx YoungGenerationSizeSupplementDecay        = 8
{product}
    uintx YoungPLABSize                             = 4096
 {product}
     bool ZeroTLAB                                  = false
{product}
     intx hashCode                                  = 0
{product}
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
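
A dump this long is easier to read one flag at a time. The following is a minimal sketch, not part of the original exchange, that pulls one value out of PrintFlagsFinal output and converts it to GiB. The helper names flag_value and bytes_to_gib are made up for illustration, and the sample line is the MaxHeapSize entry from the grepped output earlier in this thread.

```shell
#!/bin/sh
# Convert a byte count (as printed by -XX:+PrintFlagsFinal) to GiB.
bytes_to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# Read PrintFlagsFinal output on stdin and print the value of one flag.
# Unwrapped lines look like: "uintx MaxHeapSize := 7157579776 {product}"
flag_value() {
    awk -v name="$1" '$2 == name { print $4; exit }'
}

# Sample line taken from the output in this thread. To query a real JVM,
# replace the printf with: java -XX:+PrintFlagsFinal -version 2>/dev/null
max=$(printf '%s\n' '    uintx MaxHeapSize := 7157579776 {product}' | flag_value MaxHeapSize)
echo "MaxHeapSize = $(bytes_to_gib "$max") GiB"   # prints: MaxHeapSize = 6.67 GiB
```

So with no -Xmx given, this JVM settled on a maximum heap of roughly 6.67 GiB, about a quarter of the 28GB in each server.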


On Fri, Nov 22, 2013 at 12:33 PM, Raymond Wiker <rw...@gmail.com> wrote:

> You mentioned earlier that you are not setting -Xms/-Xmx; the values
> actually in use would then depend on the Java version, whether you're
> running 32- or 64-bit Java, whether Java thinks your machines are
> "servers", and whether you have specified the "-server" flag – and possibly
> a few other things.
>
> What do you get if you run the command below?
>
> java -XX:+PrintFlagsFinal -version
>
> (Ref:
> http://stackoverflow.com/questions/3428251/is-there-a-default-xmx-setting-for-java-1-5
> for details; I "stole" the incantation above from that location, but there
> are more complete examples of how it could be used there.)
>
> Note: you need to adjust the command line so that it uses the same java
> version as the one you're using, and also add whatever JRE-modifying
> parameters that you use when starting Solr.
>
> On 22 Nov 2013, at 18:12 , Dave Seltzer <ds...@tveyes.com> wrote:
>
> > Thanks so much Shawn,
> >
> > I think you (and others) are completely right about this being heap and GC
> > related. I just did a test while not indexing data and the same periodic
> > slowness was observable.
> >
> > On to GC/Memory Tuning!
>
>

Re: Periodic Slowness on Solr Cloud

Posted by Raymond Wiker <rw...@gmail.com>.
You mentioned earlier that you are not setting -Xms/-Xmx; the values actually in use would then depend on the Java version, whether you're running 32- or 64-bit Java, whether Java thinks your machines are "servers", and whether you have specified the "-server" flag – and possibly a few other things.

What do you get if you run the command below?

java -XX:+PrintFlagsFinal -version

(Ref: http://stackoverflow.com/questions/3428251/is-there-a-default-xmx-setting-for-java-1-5 for details; I "stole" the incantation above from that location, but there are more complete examples of how it could be used there.)

Note: adjust the command line so that it runs the same Java version that Solr runs under, and add any JVM options that you pass when starting Solr.

On 22 Nov 2013, at 18:12 , Dave Seltzer <ds...@tveyes.com> wrote:

> Thanks so much Shawn,
> 
> I think you (and others) are completely right about this being heap and GC
> related. I just did a test while not indexing data and the same periodic
> slowness was observable.
> 
> On to GC/Memory Tuning!


Re: Periodic Slowness on Solr Cloud

Posted by Dave Seltzer <ds...@tveyes.com>.
Thanks so much Shawn,

I think you (and others) are completely right about this being heap and GC
related. I just did a test while not indexing data and the same periodic
slowness was observable.

On to GC/Memory Tuning!

Many Thanks!

-Dave


On Fri, Nov 22, 2013 at 12:09 PM, Shawn Heisey <so...@elyograg.org> wrote:

> On 11/22/2013 10:01 AM, Shawn Heisey wrote:
>
>> You can see how much the max heap is in the Solr admin UI dashboard -
>> it'll be the right-most number on the JVM-Memory graph.  On my 64-bit linux
>> development machine with 16GB of RAM, it looks like Java defaults to a 4GB
>> max heap.  I have the heap size manually set to 7GB for Solr on that
>> machine.  The 6GB heap you have mentioned might not be enough, or it might
>> be more than you need.  It all depends on the kind of queries you are doing
>> and exactly how Solr is configured.
>>
>
> Followup: I would also recommend starting with my garbage collection
> settings.  This wiki page is linked on the wiki page I've already given you.
>
> http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning
>
> You might need a script to start Solr.  There is also a redhat-specific
> init script on that wiki page.  I haven't included any instructions for
> installing it.  Someone who already knows about init scripts won't have
> much trouble getting it working on a redhat-derived OS, and someone who
> doesn't will need extensive instructions or an install script, neither of
> which has been written.
>
> Thanks,
> Shawn
>

Re: Periodic Slowness on Solr Cloud

Posted by Shawn Heisey <so...@elyograg.org>.
On 11/22/2013 2:17 PM, Dave Seltzer wrote:
> So I made a few changes, but I still seem to be dealing with this pesky
> periodic slowness.
>
> Changes:
> 1) I'm now only forcing commits every 5 minutes. This was done by
> specifying commitWithin=300000 when doing document adds.
> 2) I'm specifying an -Xmx12g to force the java heap to take more memory
> 3) I'm using the GC configuration parameters from the wiki (
> http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning)

<snip>

> I'm still seeing the same periodic slowness about every 3.5 minutes. This
> slowness occurs whether or not I'm indexing content, so it appears to be
> unrelated to my commit schedule.

It sounds like your heap isn't too small.  Try reducing it to 5GB, then 
to 4GB after some testing, so more memory gets used by the OS disk 
cache.  I would also recommend trying perhaps 100 threads on your test 
app rather than 200.  Work your way up until you find the point where it 
just can't handle the load.
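
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch using the figures mentioned in this thread (28GB of RAM and 44GB of index data per server). It is an upper bound only: it assumes everything outside the Java heap is available to the disk cache, which ignores the OS itself and any other processes. The function name cache_pct is made up for illustration.

```shell
#!/bin/sh
# Rough upper bound on how much of the per-server index can sit in the OS
# disk cache for a given Java heap size. Figures are from this thread.
RAM_GB=28
INDEX_GB=44

# Percentage of the index that fits in the memory left over after the heap.
cache_pct() {
    heap_gb=$1
    echo $(( (RAM_GB - heap_gb) * 100 / INDEX_GB ))
}

for heap in 12 5 4; do
    echo "heap ${heap}GB: up to $(( RAM_GB - heap ))GB for cache, ~$(cache_pct "$heap")% of the index"
done
```

With a 12GB heap at most about 36% of the index can be cached; at 5GB that rises to about 52%, and at 4GB to about 54%.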

> See the most recent graph here:
> http://farm4.staticflickr.com/3819/10999523464_328814e358_o.png
>
> To keep things consistent I'm still testing with 200 threads. When I test
> with 10 threads everything is much faster, but I still get the same
> periodic slowness.
>
> One thing I've noticed is that while Java is aware of the 12 gig heap, Solr
> doesn't seem to be using much of it. The system panel of the Web UI shows
> 11.5GB of JVM-Memory available, but only 2.11GB in use.

The memory usage in the admin UI is an instantaneous snapshot.  If you 
use jvisualvm or jconsole (included in the Java JDK) to get a graph of 
memory usage, you'll see it change over time.  As Java allocates 
objects, memory usage increases until it's using all the heap.  Some 
amount of that allocation will be objects that are no longer in use -- 
garbage.  Then garbage collection will kick in and memory usage will 
drop down to however much is actually in use in the particular memory 
pool that's being collected.  This is what people often refer to as the 
sawtooth pattern.

Here are a couple of screenshots.  The jconsole program is running on 
Windows 7, Solr is running on Linux.  One screenshot is the graph, the 
other is the VM summary where you can see that Solr has been running for 
nearly 8 days.  This is one of my production Solr servers, so some of 
the parameters are slightly different than what's on my wiki:

https://dl.dropboxusercontent.com/u/97770508/solr-jconsole.png
https://dl.dropboxusercontent.com/u/97770508/solr-jconsole-summary.png

If you do not have a GUI installed on the actual Solr machine, you'll 
need to use remote JMX to connect jconsole.  In the init script on my 
wiki page, you can see JMX options.  With those, you can tell a remote 
jconsole to use server.example.com:8686 instead of a local PID.  You can 
use any port you want that's not already in use instead of 8686.  
Running jconsole with -interval=1 will make the graph update once a 
second; the default is every 4 seconds.
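
For reference, a minimal sketch of what remote JMX options typically look like. These are the standard HotSpot system properties, not a copy of the options in the wiki init script, and the variable names are made up for illustration. Authentication and SSL are switched off here, which is only reasonable on a trusted test network.

```shell
#!/bin/sh
# Standard HotSpot remote-JMX system properties (an assumption-labeled sketch,
# not taken from the wiki init script). Port 8686 matches the example above;
# any free port works.
# WARNING: no authentication and no SSL. Use only on a trusted network.
JMX_PORT=8686
JMX_OPTS="-Dcom.sun.management.jmxremote"
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.port=$JMX_PORT"
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.ssl=false"
JMX_OPTS="$JMX_OPTS -Dcom.sun.management.jmxremote.authenticate=false"

# Print the options so they can be added to the Solr startup command.
echo "$JMX_OPTS"

# Then, on the workstation (hostname is a placeholder):
#   jconsole -interval=1 server.example.com:8686
```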

You can also hit reload on the dashboard page to see how memory usage 
changes over time, but it's not as useful as a graph.  Memory usage will 
not change by much if you are not actively querying or indexing.
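
If a GUI is not convenient at all, jstat (also shipped with the JDK) can print the same kind of information as text. The sketch below is only one way to do it: watch_gc and full_gc_events are made-up helper names, and finding Solr with pgrep -f start.jar assumes it was started from start.jar and that only one such process exists.

```shell
#!/bin/sh
# Print one line of heap-utilization percentages per interval.
# Usage: watch_gc <pid> [interval_ms]
watch_gc() {
    pid=$1
    interval_ms=${2:-1000}
    jstat -gcutil "$pid" "$interval_ms"
}

# Filter jstat -gcutil output on stdin and print only the samples where the
# FGC (full GC count) column changed. The column is located by its header
# name because its position differs between Java versions.
full_gc_events() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) if ($i == "FGC") col = i; next }
         col && NR > 2 && $col != prev { print "full GC count now " $col ": " $0 }
         col { prev = $col }'
}

# Example (requires a running JVM):
#   watch_gc "$(pgrep -f start.jar)" 1000 | full_gc_events
```

Comparing the timestamps of those lines with the slow periods on the response-time graph shows whether the two line up.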

Thanks,
Shawn


Re: Periodic Slowness on Solr Cloud

Posted by Dave Seltzer <ds...@tveyes.com>.
So I made a few changes, but I still seem to be dealing with this pesky
periodic slowness.

Changes:
1) I'm now only forcing commits every 5 minutes. This was done by
specifying commitWithin=300000 when doing document adds.
2) I'm specifying an -Xmx12g to force the java heap to take more memory
3) I'm using the GC configuration parameters from the wiki (
http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning)
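
For anyone following along, change 1 amounts to something like the sketch below. It is illustrative only: the host, port, and the "id" and "title" fields are assumptions, as are the helper names update_url and add_docs. Only the collection name and the commitWithin value come from this thread.

```shell
#!/bin/sh
# Assumptions: Solr listening on localhost:8983, JSON accepted at /update,
# and a schema with "id" and "title" fields. Adjust for your own setup.
SOLR_URL="http://localhost:8983/solr"
COLLECTION="Collection201311"
COMMIT_WITHIN_MS=$(( 5 * 60 * 1000 ))   # 5 minutes = 300000 ms

# Build the update URL carrying the commitWithin parameter.
update_url() {
    echo "${SOLR_URL}/${COLLECTION}/update?commitWithin=${COMMIT_WITHIN_MS}"
}

# Post a JSON array of documents. Solr is asked to commit them within the
# deadline, so the client never sends an explicit commit.
add_docs() {
    curl -s -H 'Content-Type: application/json' --data-binary "$1" "$(update_url)"
}

# Example (requires a running Solr):
#   add_docs '[{"id": "doc-1", "title": "hello"}]'
```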

The new startup args are:
-DzkRun
-Xmx12g
-XX:+AggressiveOpts
-XX:+UseLargePages
-XX:+ParallelRefProcEnabled
-XX:+CMSParallelRemarkEnabled
-XX:CMSMaxAbortablePrecleanTime=6000
-XX:CMSTriggerPermRatio=80
-XX:CMSInitiatingOccupancyFraction=70
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSFullGCsBeforeCompaction=1
-XX:PretenureSizeThreshold=64m
-XX:+CMSScavengeBeforeRemark
-XX:+UseConcMarkSweepGC
-XX:MaxTenuringThreshold=8
-XX:TargetSurvivorRatio=90
-XX:SurvivorRatio=4
-XX:NewRatio=3

I'm still seeing the same periodic slowness about every 3.5 minutes. This
slowness occurs whether or not I'm indexing content, so it appears to be
unrelated to my commit schedule.

See the most recent graph here:
http://farm4.staticflickr.com/3819/10999523464_328814e358_o.png

To keep things consistent I'm still testing with 200 threads. When I test
with 10 threads everything is much faster, but I still get the same
periodic slowness.

One thing I've noticed is that while Java is aware of the 12 gig heap, Solr
doesn't seem to be using much of it. The system panel of the Web UI shows
11.5GB of JVM-Memory available, but only 2.11GB in use.

Screenshot: http://farm4.staticflickr.com/3822/10999509515_72a9013ec7_o.jpg

So I've told Java to use more memory. Do I need to tell Solr to use more as
well?

Thanks everyone!

-Dave



On Fri, Nov 22, 2013 at 12:09 PM, Shawn Heisey <so...@elyograg.org> wrote:

> On 11/22/2013 10:01 AM, Shawn Heisey wrote:
>
>> You can see how much the max heap is in the Solr admin UI dashboard -
>> it'll be the right-most number on the JVM-Memory graph.  On my 64-bit linux
>> development machine with 16GB of RAM, it looks like Java defaults to a 4GB
>> max heap.  I have the heap size manually set to 7GB for Solr on that
>> machine.  The 6GB heap you have mentioned might not be enough, or it might
>> be more than you need.  It all depends on the kind of queries you are doing
>> and exactly how Solr is configured.
>>
>
> Followup: I would also recommend starting with my garbage collection
> settings.  This wiki page is linked on the wiki page I've already given you.
>
> http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning
>
> You might need a script to start Solr.  There is also a redhat-specific
> init script on that wiki page.  I haven't included any instructions for
> installing it.  Someone who already knows about init scripts won't have
> much trouble getting it working on a redhat-derived OS, and someone who
> doesn't will need extensive instructions or an install script, neither of
> which has been written.
>
> Thanks,
> Shawn
>
>

Re: Periodic Slowness on Solr Cloud

Posted by Shawn Heisey <so...@elyograg.org>.
On 11/22/2013 10:01 AM, Shawn Heisey wrote:
> You can see how much the max heap is in the Solr admin UI dashboard - 
> it'll be the right-most number on the JVM-Memory graph.  On my 64-bit 
> linux development machine with 16GB of RAM, it looks like Java 
> defaults to a 4GB max heap.  I have the heap size manually set to 7GB 
> for Solr on that machine.  The 6GB heap you have mentioned might not 
> be enough, or it might be more than you need.  It all depends on the 
> kind of queries you are doing and exactly how Solr is configured.

Followup: I would also recommend starting with my garbage collection 
settings.  This wiki page is linked on the wiki page I've already given you.

http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning

You might need a script to start Solr.  There is also a redhat-specific 
init script on that wiki page.  I haven't included any instructions for 
installing it.  Someone who already knows about init scripts won't have 
much trouble getting it working on a redhat-derived OS, and someone who 
doesn't will need extensive instructions or an install script, neither 
of which has been written.
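
Whatever settings are used, GC logging makes it possible to see what effect they have. The following sketch is not from the wiki page; it uses HotSpot logging flags from the Java 7 era (several of them appear in the PrintFlagsFinal output elsewhere in this thread), and the log path and variable name are placeholders.

```shell
#!/bin/sh
# GC logging options for a Java 7 HotSpot JVM. The log path is hypothetical;
# pick any location the Solr user can write to.
GC_LOG_FILE=/var/log/solr/gc.log
GC_LOG_OPTS="-verbose:gc -Xloggc:$GC_LOG_FILE"
GC_LOG_OPTS="$GC_LOG_OPTS -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
GC_LOG_OPTS="$GC_LOG_OPTS -XX:+PrintGCApplicationStoppedTime"

# Print the options so they can be appended to the Solr startup command.
echo "$GC_LOG_OPTS"
```

The date stamps in the resulting log can then be matched against the times at which queries slow down.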

Thanks,
Shawn


Re: Periodic Slowness on Solr Cloud

Posted by Shawn Heisey <so...@elyograg.org>.
On 11/22/2013 8:13 AM, Dave Seltzer wrote:
> Regarding memory: Including duplicate data in shard replicas the entire
> index is 350GB. Each server hosts a total of 44GB of data. Each server has
> 28GB of memory. I haven't been setting -Xmx or -Xms, in the hopes that Java
> would take the memory it needs and leave the rest to the OS for cache.

That's not how Java works.  Java has a min heap and max heap setting.  
If you (or the auto-detected settings) tell it that the max heap is 4GB, 
it will only ever use slightly more than 4GB of RAM.  If the app needs 
more than that, this will lead to terrible performance and/or out of 
memory errors.

You can see how much the max heap is in the Solr admin UI dashboard - 
it'll be the right-most number on the JVM-Memory graph.  On my 64-bit 
linux development machine with 16GB of RAM, it looks like Java defaults 
to a 4GB max heap.  I have the heap size manually set to 7GB for Solr on 
that machine.  The 6GB heap you have mentioned might not be enough, or 
it might be more than you need.  It all depends on the kind of queries 
you are doing and exactly how Solr is configured.

If it were me, I'd want between 48 and 64GB of memory on each server to go 
with its 44GB of index data.  Whether you really need that much is very dependent 
on your exact requirements, index makeup, and queries.  To support the 
high query load you're sending, it probably is a requirement.  More 
memory is likely to help performance, but I can't guarantee it without 
looking a lot deeper into your setup, and that's difficult to do via email.

One thing I can tell you about checking performance - see how much of 
your 70% CPU usage is going to I/O wait.  If it's more than a few 
percent, more memory might help.  First try increasing the max heap by 1 
or 2GB.
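
To put a number on I/O wait, one option on Linux is to sample the aggregate "cpu" line of /proc/stat twice and compare the counters. The sketch below does only the arithmetic; the function name and the sample lines are invented for illustration, and tools such as top, vmstat, or iostat report the same figure as "wa" or "%iowait".

```python
def iowait_percent(cpu_line_before, cpu_line_after):
    """Return the percentage of CPU time spent in iowait between two samples.

    Each argument is the aggregate "cpu" line from /proc/stat on Linux:
    cpu user nice system idle iowait irq softirq steal [guest guest_nice]
    """
    def fields(line):
        # Use the first eight counters; guest time is already included in user.
        return [int(x) for x in line.split()[1:9]]

    before, after = fields(cpu_line_before), fields(cpu_line_after)
    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas)
    if total == 0:
        return 0.0
    # iowait is the fifth counter (index 4).
    return 100.0 * deltas[4] / total


# Made-up sample lines, for illustration only.
before = "cpu  1000 0 500 8000 400 0 100 0 0 0"
after = "cpu  1700 0 600 8100 500 0 100 0 0 0"
print(iowait_percent(before, after))  # 10.0
```

On a live server, read the first line of /proc/stat, wait a few seconds while the test is running, read it again, and pass both lines in.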

> Given that I'll never need to serve 200 concurrent connections in
> production, do you think my servers need more memory?
> Should I be tinkering with -Xmx and -Xms?

If you'll never need to serve that many, test with a lower number.  Make 
it higher than you'll need, but not a lot higher. The test with 200 
connections isn't a bad idea -- you do want to stress test things way 
beyond your actual requirements, but you'll also want to see how it does 
with a more realistic load.

-Xms and -Xmx are the min/max heap settings I just mentioned.  IMHO you should 
set at least the max heap.  If you want to handle a high load, it's a 
good idea to set the min heap to the same value as the max heap, so that 
it doesn't need to worry about hitting limits in order to allocate 
additional memory.  It'll eventually allocate the max heap anyway.
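
As a minimal sketch of what setting both values looks like for the Jetty example (which is started with "java -jar start.jar"), the helper below builds the command line with -Xms and -Xmx. The helper name is hypothetical, and the 6g default is only a placeholder based on the heap size mentioned in this thread, not a recommendation.

```python
def solr_start_command(min_heap="6g", max_heap="6g"):
    """Build a command line that starts the Jetty example with an explicit heap.

    The 6g defaults are placeholders taken from the heap size discussed
    in this thread; pick values that suit your own index and queries.
    """
    return ["java", f"-Xms{min_heap}", f"-Xmx{max_heap}", "-jar", "start.jar"]


print(" ".join(solr_start_command()))
# java -Xms6g -Xmx6g -jar start.jar
```

Setting both to the same value follows the advice above for high-load installs; the list form can be handed directly to subprocess if you script the startup.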

> Regarding commits: My end-users want new data to be made available quickly.
> Thankfully I'm only inserting between 1 and 3 documents per second so the
> change-rate isn't crazy.
>
> Should I just slow down my commit frequency, and depend on soft-commits? If
> I do this, will the commits take even longer?
> Given 1000 documents, is it generally faster to do 10 commits of 100, or 1
> commit of 1000?

Fewer commits is always better.  The amount of time they take isn't 
strongly affected by the number of new documents, unless there are a LOT 
of them.  Figure out the maximum amount of time (in milliseconds) that 
you think people are willing to wait for new data to become visible.  
Use that as your autoSoftCommit interval, or as the 
commitWithin parameter on your indexing requests.  Set your autoCommit 
interval to around five minutes, as described on the wiki page I 
linked.  If you are using auto settings and/or commitWithin, then you 
will never need to send an explicit commit command.  Reducing commit 
frequency is one of the first things you'll want to try.  Frequent 
commits use a *lot* of I/O and CPU resources.
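
To illustrate the commitWithin approach, the sketch below builds an update request that carries commitWithin (in milliseconds) as a URL parameter and sends no explicit commit. The function name, host, and document fields are hypothetical; the collection name is the one from the original post. It only builds the URL and JSON body, which would then be POSTed with a Content-Type of application/json.

```python
import json
from urllib.parse import urlencode


def build_update_request(base_url, collection, docs, max_wait_seconds):
    """Return (url, body) for an update that relies on commitWithin.

    commitWithin is given in milliseconds; no explicit commit is sent.
    """
    params = urlencode({"commitWithin": int(max_wait_seconds * 1000)})
    url = f"{base_url}/{collection}/update?{params}"
    body = json.dumps(docs)
    return url, body


# Hypothetical host and document fields, for illustration only.
url, body = build_update_request(
    "http://localhost:8983/solr", "Collection201311",
    [{"id": "doc-1", "title": "example"}], max_wait_seconds=15)
print(url)
# http://localhost:8983/solr/Collection201311/update?commitWithin=15000
```

The 15 second value is just an example from the 10-15 second range; use whatever maximum delay your users can tolerate.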

Although there are exceptions, most installs rarely NEED commits to 
happen more often than about once a minute, and longer intervals are 
often perfectly acceptable.  Even in situations where a higher frequency 
is required, 10-15 seconds is often good enough.  Getting sub-second 
commit times is *possible*, but usually requires significant hardware 
investment or changing the config in a way that is detrimental to query 
performance.

Thanks,
Shawn


Re: Periodic Slowness on Solr Cloud

Posted by Dave Seltzer <ds...@tveyes.com>.
Hi Shawn,

Wow! Thank you for your considered reply!

I'm going to dig into these issues, but I have a few questions:

Regarding memory: Including duplicate data in shard replicas the entire
index is 350GB. Each server hosts a total of 44GB of data. Each server has
28GB of memory. I haven't been setting -Xmx or -Xms, in the hopes that Java
would take the memory it needs and leave the rest to the OS for cache.

Given that I'll never need to serve 200 concurrent connections in
production, do you think my servers need more memory?
Should I be tinkering with -Xmx and -Xms?

Regarding commits: My end-users want new data to be made available quickly.
Thankfully I'm only inserting between 1 and 3 documents per second so the
change-rate isn't crazy.

Should I just slow down my commit frequency, and depend on soft-commits? If
I do this, will the commits take even longer?
Given 1000 documents, is it generally faster to do 10 commits of 100, or 1
commit of 1000?

Thanks so much!

-D



On Fri, Nov 22, 2013 at 2:27 AM, Shawn Heisey <so...@elyograg.org> wrote:

> On 11/21/2013 6:41 PM, Dave Seltzer wrote:
> > In digging a little deeper and looking at the config I see that
> > <nrtMode>true</nrtMode> is commented out.  I believe this is the default
> > setting. So I don't know if NRT is enabled or not. Maybe just a red herring.
>
> I had never seen this setting before.  The default is true.  SolrCloud
> requires that it be set to true.  Looks like it's a new parameter in
> 4.5, added by SOLR-4909.  From what I can tell reading the issue,
> turning it off effectively disables soft commits.
>
> https://issues.apache.org/jira/browse/SOLR-4909
>
> You've said that you are adding about 3 documents per second, but you
> haven't said anything about how often you are doing commits.  Erick's
> question basically boils down to this:  How quickly after indexing do
> you expect the changes to be visible on a search, and how often are you
> doing commits?
>
> Generally speaking (and ignoring the fact that nrtMode now exists), NRT
> is not something you enable, it's something you try to achieve, by using
> soft commits quickly and often, and by adjusting the configuration to
> make the commits go faster.
>
> If you are trying to keep the interval between indexing and document
> visibility down to less than a few seconds (especially if it's less than
> one second), then you are trying to achieve NRT.
>
> There's a lot of information on the following wiki page about
> performance problems.  This specific link is to the last part of that
> page, which deals with slow commits:
>
> http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_commits
>
> > I don't know what Garbage Collector we're using. In this test I'm running
> > Solr 4.5.1 using Jetty from the example directory.
>
> If you aren't using any tuning parameters beyond setting the max heap,
> then you are using the default parallel collector.  It's a poor choice
> for Solr unless your heap is very small.  At 6GB, yours isn't very
> small.  It's not particularly huge either, but not small.
>
> > The CPU on the 8 nodes all stay around 70% use during the test. The nodes
> > have 28GB of RAM. Java is using about 6GB and the rest is being used by OS
> > cache.
>
> How big is your index?  If it's larger than about 30 GB, you probably
> need more memory.  If it's much larger than about 40 GB, you definitely
> need more memory.
>
> > To perform the test we're running 200 concurrent threads in JMeter. The
> > threads hit HAProxy which loadbalances the requests among the nodes. Each
> > query is for a random word out of a list of about 10,000 words. Some of the
> > queries have faceting turned on.
>
> That's a pretty high query load.  If you want to get anywhere near top
> performance out of it, you'll want to have enough memory to fit your
> entire index into RAM.  You'll also need to reduce the load introduced
> by indexing.  A large part of the load from indexing comes from commits.
>
> > Because we're heavily loading the system the queries are returning quite
> > slowly. For a simple search, the average response time was 300ms. The peak
> > response time was 11,000ms. The spikes in latency seem to occur about every
> > 2.5 minutes.
>
> I would bet that you're having one or both of the following issues:
>
> 1) Garbage collection issues from one or more of the following:
>  a) Heap too small.
>  b) Using the default GC instead of CMS with tuning.
> 2) General performance issues from one or more of the following:
>  a) Not enough cache memory for your index size.
>  b) Too-frequent commits.
>  c) Commits taking a lot of time and resources due to cache warming.
>
> With a high query and index load, any problems become magnified.
>
> > I haven't spent that much time messing with SolrConfig, so most of the
> > settings are the out-of-the-box defaults.
>
> The defaults are very good for small to medium indexes and low to medium
> query load.  If you have a big index and/or high query load, you'll
> generally need to tune.
>
> Thanks,
> Shawn
>
>

Re: Periodic Slowness on Solr Cloud

Posted by Shawn Heisey <so...@elyograg.org>.
On 11/21/2013 6:41 PM, Dave Seltzer wrote:
> In digging a little deeper and looking at the config I see that
> <nrtMode>true</nrtMode> is commented out.  I believe this is the default
> setting. So I don't know if NRT is enabled or not. Maybe just a red herring.

I had never seen this setting before.  The default is true.  SolrCloud
requires that it be set to true.  Looks like it's a new parameter in
4.5, added by SOLR-4909.  From what I can tell reading the issue,
turning it off effectively disables soft commits.

https://issues.apache.org/jira/browse/SOLR-4909

You've said that you are adding about 3 documents per second, but you
haven't said anything about how often you are doing commits.  Erick's
question basically boils down to this:  How quickly after indexing do
you expect the changes to be visible on a search, and how often are you
doing commits?

Generally speaking (and ignoring the fact that nrtMode now exists), NRT
is not something you enable, it's something you try to achieve, by using
soft commits quickly and often, and by adjusting the configuration to
make the commits go faster.

If you are trying to keep the interval between indexing and document
visibility down to less than a few seconds (especially if it's less than
one second), then you are trying to achieve NRT.

There's a lot of information on the following wiki page about
performance problems.  This specific link is to the last part of that
page, which deals with slow commits:

http://wiki.apache.org/solr/SolrPerformanceProblems#Slow_commits

> I don't know what Garbage Collector we're using. In this test I'm running
> Solr 4.5.1 using Jetty from the example directory.

If you aren't using any tuning parameters beyond setting the max heap,
then you are using the default parallel collector.  It's a poor choice
for Solr unless your heap is very small.  At 6GB, yours isn't very
small.  It's not particularly huge either, but not small.
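
One way to confirm which collector a JVM picks is to run "java -XX:+PrintCommandLineFlags -version" and look for the -XX:+Use...GC switch in the first line of output. The sketch below only parses such a line; the function name is made up, and the sample string is illustrative rather than captured from these servers (its heap values are copied from the PrintFlagsFinal output posted earlier in this thread).

```python
def enabled_gc_flags(flags_output):
    """Pick the -XX:+Use...GC switches out of PrintCommandLineFlags output."""
    return [tok for tok in flags_output.split()
            if tok.startswith("-XX:+Use") and tok.endswith("GC")]


# Illustrative sample only, not real output from the servers in this thread.
sample = ("-XX:InitialHeapSize=447247104 -XX:MaxHeapSize=7157579776 "
          "-XX:+PrintCommandLineFlags -XX:+UseCompressedOops -XX:+UseParallelGC")
print(enabled_gc_flags(sample))  # ['-XX:+UseParallelGC']
```

If the list contains only -XX:+UseParallelGC, the JVM is on the default parallel collector described above.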

> The CPU on the 8 nodes all stay around 70% use during the test. The nodes
> have 28GB of RAM. Java is using about 6GB and the rest is being used by OS
> cache.

How big is your index?  If it's larger than about 30 GB, you probably
need more memory.  If it's much larger than about 40 GB, you definitely
need more memory.

> To perform the test we're running 200 concurrent threads in JMeter. The
> threads hit HAProxy which loadbalances the requests among the nodes. Each
> query is for a random word out of a list of about 10,000 words. Some of the
> queries have faceting turned on.

That's a pretty high query load.  If you want to get anywhere near top
performance out of it, you'll want to have enough memory to fit your
entire index into RAM.  You'll also need to reduce the load introduced
by indexing.  A large part of the load from indexing comes from commits.

> Because we're heavily loading the system the queries are returning quite
> slowly. For a simple search, the average response time was 300ms. The peak
> response time was 11,000ms. The spikes in latency seem to occur about every
> 2.5 minutes.

I would bet that you're having one or both of the following issues:

1) Garbage collection issues from one or more of the following:
 a) Heap too small.
 b) Using the default GC instead of CMS with tuning.
2) General performance issues from one or more of the following:
 a) Not enough cache memory for your index size.
 b) Too-frequent commits.
 c) Commits taking a lot of time and resources due to cache warming.

With a high query and index load, any problems become magnified.
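
To help tell these causes apart, it can be useful to measure how regular the latency spikes are and compare that period with the commit interval or with the pauses in a GC log. The sketch below is one hedged way to do that from (timestamp, latency) samples such as a JMeter results file; the function name, threshold, and data are all invented for illustration.

```python
from statistics import median


def spike_period_seconds(samples, threshold_ms):
    """Return the median gap in seconds between the starts of latency spikes.

    samples: list of (timestamp_seconds, latency_ms) sorted by time.
    A spike starts when latency rises above threshold_ms after being
    at or below it. Returns None if fewer than two spikes are found.
    """
    starts = []
    in_spike = False
    for ts, latency in samples:
        if latency > threshold_ms and not in_spike:
            starts.append(ts)
        in_spike = latency > threshold_ms
    if len(starts) < 2:
        return None
    gaps = [b - a for a, b in zip(starts, starts[1:])]
    return median(gaps)


# Synthetic data: 300ms baseline with an 11,000ms spike every 150 seconds.
samples = [(t, 11000 if t % 150 == 0 else 300) for t in range(0, 600, 10)]
print(spike_period_seconds(samples, threshold_ms=1000))  # 150
```

A period that matches the autoCommit or autoSoftCommit interval points toward commits and cache warming; one that lines up with long pauses in the GC log points toward the heap and collector.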

> I haven't spent that much time messing with SolrConfig, so most of the
> settings are the out-of-the-box defaults.

The defaults are very good for small to medium indexes and low to medium
query load.  If you have a big index and/or high query load, you'll
generally need to tune.

Thanks,
Shawn


Re: Periodic Slowness on Solr Cloud

Posted by Dave Seltzer <ds...@tveyes.com>.
Lots of questions. Okay.

In digging a little deeper and looking at the config I see that
<nrtMode>true</nrtMode> is commented out.  I believe this is the default
setting. So I don't know if NRT is enabled or not. Maybe just a red herring.

I don't know what Garbage Collector we're using. In this test I'm running
Solr 4.5.1 using Jetty from the example directory.

CPU usage on all 8 nodes stays around 70% during the test. The nodes
have 28GB of RAM. Java is using about 6GB and the rest is being used by OS
cache.

To perform the test we're running 200 concurrent threads in JMeter. The
threads hit HAProxy which loadbalances the requests among the nodes. Each
query is for a random word out of a list of about 10,000 words. Some of the
queries have faceting turned on.

Because we're heavily loading the system the queries are returning quite
slowly. For a simple search, the average response time was 300ms. The peak
response time was 11,000ms. The spikes in latency seem to occur about every
2.5 minutes.

I haven't spent that much time messing with SolrConfig, so most of the
settings are the out-of-the-box defaults.

Where should I start to look?

Thanks so much!

-Dave





On Thu, Nov 21, 2013 at 6:53 PM, Mark Miller <ma...@gmail.com> wrote:

> Yes, more details…
>
> Solr version, which garbage collector, how does heap usage look, cpu, etc.
>
> - Mark
>
> On Nov 21, 2013, at 6:46 PM, Erick Erickson <er...@gmail.com> wrote:
>
> > How real time is NRT? In particular, what are your commit settings?
> >
> > And can you characterize "periodic slowness"? Queries that usually
> > take 500ms now take 10s? Or 1s? How often? How are you measuring?
> >
> > Details matter, a lot...
> >
> > Best,
> > Erick
> >
> >
> >
> >
> > On Thu, Nov 21, 2013 at 6:03 PM, Dave Seltzer <ds...@tveyes.com> wrote:
> >
> >> I'm doing some performance testing against an 8-node Solr cloud cluster,
> >> and I'm noticing some periodic slowness.
> >>
> >>
> >> http://farm4.staticflickr.com/3668/10985410633_23e26c7681_o.png
> >>
> >> I'm doing random test searches against an Alias Collection made up of four
> >> smaller (monthly) collections. Like this:
> >>
> >> MasterCollection
> >> |- Collection201308
> >> |- Collection201309
> >> |- Collection201310
> >> |- Collection201311
> >>
> >> The last collection is constantly updated. New documents are being added at
> >> the rate of about 3 documents per second.
> >>
> >> I believe the slowness may be due to NRT, but I'm not sure. How should I
> >> investigate this?
> >>
> >> If the slowness is related to NRT, how can I alleviate the issue without
> >> disabling NRT?
> >>
> >> Thanks Much!
> >>
> >> -Dave
> >>

Re: Periodic Slowness on Solr Cloud

Posted by Mark Miller <ma...@gmail.com>.
Yes, more details…

Solr version, which garbage collector, how does heap usage look, cpu, etc.

- Mark

On Nov 21, 2013, at 6:46 PM, Erick Erickson <er...@gmail.com> wrote:

> How real time is NRT? In particular, what are your commit settings?
> 
> And can you characterize "periodic slowness"? Queries that usually
> take 500ms now take 10s? Or 1s? How often? How are you measuring?
> 
> Details matter, a lot...
> 
> Best,
> Erick
> 
> 
> 
> 
> On Thu, Nov 21, 2013 at 6:03 PM, Dave Seltzer <ds...@tveyes.com> wrote:
> 
>> I'm doing some performance testing against an 8-node Solr cloud cluster,
>> and I'm noticing some periodic slowness.
>> 
>> 
>> http://farm4.staticflickr.com/3668/10985410633_23e26c7681_o.png
>> 
>> I'm doing random test searches against an Alias Collection made up of four
>> smaller (monthly) collections. Like this:
>> 
>> MasterCollection
>> |- Collection201308
>> |- Collection201309
>> |- Collection201310
>> |- Collection201311
>> 
>> The last collection is constantly updated. New documents are being added at
>> the rate of about 3 documents per second.
>> 
>> I believe the slowness may be due to NRT, but I'm not sure. How should I
>> investigate this?
>> 
>> If the slowness is related to NRT, how can I alleviate the issue without
>> disabling NRT?
>> 
>> Thanks Much!
>> 
>> -Dave
>> 


Re: Periodic Slowness on Solr Cloud

Posted by Erick Erickson <er...@gmail.com>.
How real time is NRT? In particular, what are your commit settings?

And can you characterize "periodic slowness"? Queries that usually
take 500ms now take 10s? Or 1s? How often? How are you measuring?

Details matter, a lot...

Best,
Erick




On Thu, Nov 21, 2013 at 6:03 PM, Dave Seltzer <ds...@tveyes.com> wrote:

> I'm doing some performance testing against an 8-node Solr cloud cluster,
> and I'm noticing some periodic slowness.
>
>
> http://farm4.staticflickr.com/3668/10985410633_23e26c7681_o.png
>
> I'm doing random test searches against an Alias Collection made up of four
> smaller (monthly) collections. Like this:
>
> MasterCollection
> |- Collection201308
> |- Collection201309
> |- Collection201310
> |- Collection201311
>
> The last collection is constantly updated. New documents are being added at
> the rate of about 3 documents per second.
>
> I believe the slowness may be due to NRT, but I'm not sure. How should I
> investigate this?
>
> If the slowness is related to NRT, how can I alleviate the issue without
> disabling NRT?
>
> Thanks Much!
>
> -Dave
>