
Posted to issues@solr.apache.org by "Mark Robert Miller (Jira)" <ji...@apache.org> on 2021/07/26 00:29:00 UTC

[jira] [Commented] (SOLR-15509) Issues to potentially improve JSON faceting and Stats performance with an SQL performance focus..

    [ https://issues.apache.org/jira/browse/SOLR-15509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17386974#comment-17386974 ] 

Mark Robert Miller commented on SOLR-15509:
-------------------------------------------

So the general focus here is on SQL / streaming expression performance at scale.

The elephant in the room on that front is the ridiculously poor performance of 'tuples' in Java.

You are generally talking about large numbers of numerics. But as objects. Widened to the largest type. Scattered around on the heap. Getting moved by GC. Boxing, unboxing. You name it.
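
To make that cost concrete, here is a minimal sketch contrasting a boxed, map-per-tuple layout with a plain primitive column. It is not Solr's tuple code; the class name TupleLayouts, its method names, and the "id" and "price" fields are all hypothetical, chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Solr's tuple implementation.
public class TupleLayouts {

    // Boxed layout: one Map per tuple, every numeric stored as a Double object.
    public static List<Map<String, Object>> boxedTuples(int n) {
        List<Map<String, Object>> tuples = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            Map<String, Object> tuple = new HashMap<>();
            tuple.put("id", (double) i);  // int widened to double, then boxed
            tuple.put("price", i * 0.5);  // boxed into a separate heap object
            tuples.add(tuple);
        }
        return tuples;
    }

    // Each value costs a hash lookup, a cast, and an unboxing call.
    public static double sumBoxed(List<Map<String, Object>> tuples, String field) {
        double sum = 0;
        for (Map<String, Object> tuple : tuples) {
            sum += ((Number) tuple.get(field)).doubleValue();
        }
        return sum;
    }

    // Columnar layout: one contiguous primitive array per field.
    public static double[] priceColumn(int n) {
        double[] prices = new double[n];
        for (int i = 0; i < n; i++) {
            prices[i] = i * 0.5;
        }
        return prices;
    }

    // A sequential scan over primitives: no boxing, no pointer chasing.
    public static double sumPrimitive(double[] column) {
        double sum = 0;
        for (double v : column) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000;
        System.out.println(sumBoxed(boxedTuples(n), "price"));
        System.out.println(sumPrimitive(priceColumn(n)));
    }
}
```

Both paths compute the same sum. The difference is in what the JVM has to allocate, chase, and later garbage collect along the way: the boxed version creates a map plus two Double objects per tuple, while the columnar version is a single array.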

I don't focus on that here, even though it's a performance nightmare at scale, because:
 * Solr's advantage, and the reason this stuff is currently viable, is what gets pushed down to the search engine, in particular stats and JSON facets.
 * Solving this in current Java is a large, less than satisfying endeavor. Good luck with basic on-heap Java in its current form. Many other systems do a much, much better job, but through lots of effort, Unsafe, and plenty of duct tape. But this is their focus, the core of their systems (Storm, Flink, Spark, Cassandra, etc.). Trying to solve it as some kind of optional, non-core work on a non-core offshoot of Solr is ... not likely a good use of time.
 * Things coming in Java down the line will make it a less herculean lift, one that will fit right in as a non-optional, first-class effort that can align with similar work in the rest of Solr. Value types. MemorySegment. Hardware-accelerated Vectors, I hope :) Basically everything that's being done to address the fact that Java sucks at tuples.

So doing something good, valuable, and right on that front is further out, which is why the focus here is on the shorter term: the currently viable, "advantage Solr" path.

> Issues to potentially improve JSON faceting and Stats performance with an SQL performance focus..
> -------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-15509
>                 URL: https://issues.apache.org/jira/browse/SOLR-15509
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Mark Robert Miller
>            Assignee: Mark Robert Miller
>            Priority: Minor
>              Labels: RobustSQL
>



