Posted to issues@hive.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/02/01 08:34:00 UTC

[jira] [Work logged] (HIVE-25746) Compaction Failure Counter counted incorrectly

     [ https://issues.apache.org/jira/browse/HIVE-25746?focusedWorklogId=718484&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-718484 ]

ASF GitHub Bot logged work on HIVE-25746:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Feb/22 08:33
            Start Date: 01/Feb/22 08:33
    Worklog Time Spent: 10m 
      Work Description: vcsomor commented on pull request #2974:
URL: https://github.com/apache/hive/pull/2974#issuecomment-1026592281


   > The only thing that gave me pause was that we call getOrCreateCounter every time the metric is incremented, since this means that all the counters must be iterated through... it should be fast but on the other hand I don't think there's a limit to the number of counters that can be created... what do you think?
   
   @klcopp I deliberately put the `getOrCreateCounter` call there because most cycles don't need it (the success case is far more likely). The counters only have to be iterated through when there is a failure. Furthermore, the Initiator/Cleaner cycles can run long, so such `getOrCreateCounter` calls are negligible compared to the whole cycle time.
   
   What do you think?
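
A minimal sketch of the trade-off discussed above, assuming a hypothetical map-backed registry rather than Hive's actual metrics implementation: the counter lookup sits only on the failure path, so successful cycles never pay for it. `LazyCounterSketch`, `COUNTERS`, `LOOKUPS` and `runCycle` are illustrative names, and the hash-map lookup here is an assumption that may differ from how the real registry finds a counter.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

public class LazyCounterSketch {
  // Hypothetical stand-in for a metrics registry; not Hive's Metrics class.
  static final ConcurrentMap<String, AtomicLong> COUNTERS = new ConcurrentHashMap<>();
  // Counts how often the registry is consulted, for illustration only.
  static final AtomicLong LOOKUPS = new AtomicLong();

  static AtomicLong getOrCreateCounter(String name) {
    LOOKUPS.incrementAndGet();
    return COUNTERS.computeIfAbsent(name, k -> new AtomicLong());
  }

  // Runs one cycle; the counter is looked up only if the work fails.
  static void runCycle(Runnable work, String failureCounterName) {
    try {
      work.run();
    } catch (RuntimeException e) {
      getOrCreateCounter(failureCounterName).incrementAndGet();
    }
  }

  public static void main(String[] args) {
    String name = "compaction_cleaner_failure_counter";
    for (int i = 0; i < 5; i++) {
      runCycle(() -> { }, name); // successful cycles: no lookup
    }
    runCycle(() -> { throw new IllegalStateException("simulated failure"); }, name);
    System.out.println("lookups=" + LOOKUPS.get() + ", failures=" + COUNTERS.get(name).get());
  }
}
```

In this sketch, five successful cycles followed by one failing cycle result in a single registry lookup and a failure count of one.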



Issue Time Tracking
-------------------

    Worklog Id:     (was: 718484)
    Time Spent: 2h 10m  (was: 2h)

> Compaction Failure Counter counted incorrectly
> ----------------------------------------------
>
>                 Key: HIVE-25746
>                 URL: https://issues.apache.org/jira/browse/HIVE-25746
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>    Affects Versions: 4.0.0
>            Reporter: Viktor Csomor
>            Assignee: Viktor Csomor
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> The metrics below are counted incorrectly when an exception occurs.
> - {{compaction_initiator_failure_counter}}
> - {{compaction_cleaner_failure_counter}}
> Reasoning:
> The {{Initiator}}/{{Cleaner}} class creates a list of {{CompletableFuture}} instances; any exception thrown inside their {{Runnable}} is wrapped in a {{RuntimeException}}. 
> The code snippet below waits for all cleaners to complete (the Initiator does this similarly).
> {code:java}
>         try {
>            ....
>             for (CompactionInfo compactionInfo : readyToClean) {
>               cleanerList.add(CompletableFuture.runAsync(CompactorUtil.ThrowingRunnable.unchecked(() ->
>                       clean(compactionInfo, cleanerWaterMark, metricsEnabled)), cleanerExecutor));
>             }
>             CompletableFuture.allOf(cleanerList.toArray(new CompletableFuture[0])).join();
>           }
>         } catch (Throwable t) {
>           // the lock timeout on AUX lock, should be ignored.
>           if (metricsEnabled && handle != null) {
>             failuresCounter.inc();
>           }
> {code}
> If {{CompletableFuture#join}} throws an exception, the failure counter is incremented.
> Docs:
> {code}
>     /**
>      * Returns the result value when complete, or throws an
>      * (unchecked) exception if completed exceptionally. To better
>      * conform with the use of common functional forms, if a
>      * computation involved in the completion of this
>      * CompletableFuture threw an exception, this method throws an
>      * (unchecked) {@link CompletionException} with the underlying
>      * exception as its cause.
>      *
>      * @return the result value
>      * @throws CancellationException if the computation was cancelled
>      * @throws CompletionException if this future completed
>      * exceptionally or a completion computation threw an exception
>      */
>     public T join() {
>         Object r;
>         return reportJoin((r = result) == null ? waitingGet(false) : r);
>     }
> {code}
> (!) Let's suppose we have 10 cleaners and the 2nd throws an exception. The {{catch}} block will be entered and the {{failuresCounter}} will be incremented once. If any of the remaining cleaners also fails, the counter won't be incremented again. 
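
The scenario above can be reproduced with a small, self-contained sketch. It is not the Hive code: `FailureCountSketch`, `sampleTasks`, `countWithSingleCatch` and `countPerTask` are hypothetical names, the tasks run on the common pool rather than a cleaner executor, and the per-task variant is only one possible way to count every failure, not necessarily the approach taken in the pull request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.atomic.AtomicLong;

public class FailureCountSketch {

  // Ten simulated cleaners; the 2nd and the 7th throw.
  static List<Runnable> sampleTasks() {
    List<Runnable> tasks = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
      final boolean fails = (i == 1 || i == 6);
      tasks.add(() -> {
        if (fails) {
          throw new IllegalStateException("simulated cleaner failure");
        }
      });
    }
    return tasks;
  }

  // Mirrors the reported pattern: a single catch around allOf(...).join().
  static long countWithSingleCatch(List<Runnable> tasks) {
    AtomicLong failures = new AtomicLong();
    List<CompletableFuture<Void>> futures = new ArrayList<>();
    for (Runnable task : tasks) {
      futures.add(CompletableFuture.runAsync(task));
    }
    try {
      CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    } catch (CompletionException e) {
      // join() throws once, regardless of how many futures failed.
      failures.incrementAndGet();
    }
    return failures.get();
  }

  // Assumed alternative: each future records its own failure.
  static long countPerTask(List<Runnable> tasks) {
    AtomicLong failures = new AtomicLong();
    List<CompletableFuture<Void>> futures = new ArrayList<>();
    for (Runnable task : tasks) {
      futures.add(CompletableFuture.runAsync(task).exceptionally(t -> {
        failures.incrementAndGet();
        return null;
      }));
    }
    // Every future now completes normally, so join() does not throw.
    CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    return failures.get();
  }

  public static void main(String[] args) {
    System.out.println("single catch: " + countWithSingleCatch(sampleTasks())); // prints 1
    System.out.println("per task:     " + countPerTask(sampleTasks()));         // prints 2
  }
}
```

With two failing tasks, the single-catch variant reports one failure while the per-task variant reports two, which is the discrepancy described in this issue.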


