Posted to commits@buildstream.apache.org by GitBox <gi...@apache.org> on 2021/12/07 09:16:10 UTC

[GitHub] [buildstream] gtristan commented on issue #1541: 1.6.3 performance question

gtristan commented on issue #1541:
URL: https://github.com/apache/buildstream/issues/1541#issuecomment-987722194


   @devcurmudgeon I *suspect* that this is an inevitable inefficiency when not using any persistent local artifact cache and will be similar in bst-1 and master (explained in more depth at the end of this comment). Log files might help to identify whether this is the case or not.
   
   > bst master uses the same order. I believe this was discussed in an issue before and doing the "right thing" would make the bst overhead bigger which the people doing the work thought would hurt performance rather than help.
   > 
   > The issue is that when an element is not found in the artifact cache, it's added to the source queue and its build dependencies are added to the artifact queue. Then jobs are taken from the queues in a fixed order regardless of how "urgent" they are (add to that that the artifact and source queues both compete for fetcher resources)
   
   I would say that bst-1 and master diverge quite significantly because of dynamic scheduling, and I would disagree that urgency of element processing is ignored. We do consider the original depth sort and try to process deeper elements first in both bst-1 and master (this is imperfect, but not at all indifferent to urgency). To better calculate urgency we ought to have some knowledge of how *heavy* elements are (in terms of resource consumption to build, potential artifact size, and potential build times).
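   
   To illustrate what "deeper elements first" means, here is a minimal sketch. It is not BuildStream's scheduler code: the `depths` helper and the element names are hypothetical, and it assumes "depth" means the longest dependency chain from the top-level target down to an element.

```python
def depths(target, deps):
    """Longest distance from the target to each element it depends on.

    `deps` maps an element name to the names of its build dependencies.
    Hypothetical helper for illustration; assumes the graph is acyclic.
    """
    depth = {target: 0}
    stack = [target]
    while stack:
        node = stack.pop()
        for dep in deps.get(node, []):
            # Keep the longest path seen so far and revisit on improvement.
            if depth.get(dep, -1) < depth[node] + 1:
                depth[dep] = depth[node] + 1
                stack.append(dep)
    return depth


# Made-up project: app -> lib -> runtime, and app -> runtime directly.
deps = {"app": ["lib", "runtime"], "lib": ["runtime"], "runtime": []}
depth = depths("app", deps)

# Process the deepest elements first, since everything else waits on them.
order = sorted(depth, key=lambda name: -depth[name])
print(order)  # ['runtime', 'lib', 'app']
```

   Note that this ordering knows nothing about how *heavy* each element is, which is exactly the missing information mentioned above.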
   
   Your observation of pull/fetch queues competing for the same "fetcher" resource is interesting. For anyone not familiar with the term *fetcher resource*, @abderrahim is referring to the number of *fetch* jobs which the user has allowed buildstream to perform concurrently. bst treats this as *downloading stuff*, and earlier queues get priority. One side effect of this is that any potential artifact downloads are performed before any sources are ever downloaded (while this does not affect the `fetch -> build` transition, it could be worthwhile to share the fetcher resource better in the interest of building something earlier).
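   
   As a toy model of that priority behaviour (not the actual scheduler; the `schedule` function, queue names and job names are made up for illustration), consider a shared pool of fetcher tokens handed out in queue order:

```python
def schedule(queues, fetcher_tokens):
    """Start jobs from the earliest queue first until tokens run out.

    `queues` is a list of (queue_name, pending_jobs) in pipeline order.
    Hypothetical model of a shared download resource, for illustration only.
    """
    started = []
    for queue_name, jobs in queues:
        while jobs and fetcher_tokens > 0:
            started.append((queue_name, jobs.pop(0)))
            fetcher_tokens -= 1
    return started


# The pull (artifact) queue comes before the fetch (source) queue.
queues = [("pull", ["a", "b", "c"]), ("fetch", ["x", "y"])]
started = schedule(queues, fetcher_tokens=4)
print(started)
# [('pull', 'a'), ('pull', 'b'), ('pull', 'c'), ('fetch', 'x')]
```

   With four tokens, the pull queue takes three of them before the fetch queue gets any, so source downloads only proceed with whatever is left over.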
   
   **However**, I don't believe this is particularly relevant to the case which @devcurmudgeon is referring to. What is described there would clearly be a bug, and I hope it is not actually happening (I believe this did regress in the past and was fixed; I don't believe that we would have left bst-1 behind, and I'm sure I've seen this work properly in recent months).
   
   Considering only the fetch and build queues for example: yes, elements do technically *pass through* the fetch queue before they ever hit the build queue, but that does *not* mean that all elements must be fetched before any element ever gets built (although perhaps this is not what is being reported?).
   
   > Hi while building fdsdk, i noticed that on a many core machine, bst 1.6.3 downloads lots of sources, and then only starts building after those jobs are done. this seems clearly suboptimal. has this been fixed on mainline?
   
   Can you clarify whether *lots* means *all* in this context?
   
   In order for an element to start building:
   * its build dependency artifacts must be present in the local cache
   * its source code must be present in the local cache
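
   The two conditions above can be sketched as a simple readiness check. This is only an illustration: the `Element` class and `can_start_building` function are hypothetical stand-ins, not BuildStream's real API.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for an element; not BuildStream's real class."""
    name: str
    build_deps: list = field(default_factory=list)
    source_cached: bool = False    # source code present in the local cache
    artifact_cached: bool = False  # built or pulled artifact present locally


def can_start_building(element):
    """Return True once both conditions from the list above hold."""
    deps_ready = all(dep.artifact_cached for dep in element.build_deps)
    return deps_ready and element.source_cached


# The base runtime artifact is still downloading, so the app cannot build
# even though its own source was fetched already.
runtime = Element("base-runtime")
app = Element("app", build_deps=[runtime], source_cached=True)
blocked = can_start_building(app)
print(blocked)  # False

# Once the runtime artifact lands in the local cache, the app is ready.
runtime.artifact_cached = True
ready = can_start_building(app)
print(ready)  # True
```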
   
   One explanation for it taking a while for a build to start is that the base runtime artifact (an `import` element) is rather large and will take more time to download than source code modules, so it will be impossible to start building anything until that runtime is downloaded. This latency will be accentuated if you are not leveraging a local cache on the build machine in between builds: regardless of whether the base runtime artifact is available on an artifact cache server or not, the whole thing needs to be downloaded every time before anything can be built. This is unfortunately typical when building in CI, in cases where you have not set up your build machines to retain a local cache that persists between builds.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@buildstream.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org