Posted to issues@metamodel.apache.org by "Justin Lim Wei Kit (JIRA)" <ji...@apache.org> on 2016/10/31 07:09:00 UTC

[jira] [Updated] (METAMODEL-1128) Metamodel's Elasticsearch doesn't return more than 10 documents(rows)

     [ https://issues.apache.org/jira/browse/METAMODEL-1128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Justin Lim Wei Kit updated METAMODEL-1128:
------------------------------------------
    Priority: Major  (was: Critical)

> Metamodel's Elasticsearch doesn't return more than 10 documents(rows)
> ---------------------------------------------------------------------
>
>                 Key: METAMODEL-1128
>                 URL: https://issues.apache.org/jira/browse/METAMODEL-1128
>             Project: Apache MetaModel
>          Issue Type: Bug
>    Affects Versions: 4.5.4
>         Environment: Elasticsearch 1.4.4 Windows Service
>            Reporter: Justin Lim Wei Kit
>
> Hi guys,
> I've just tested MetaModel's Elasticsearch support using ElasticSearchRestDataContext and ran into an issue: selecting all rows returns only the top 10 rows.
> This causes aggregations and sorting to fail as well, since MetaModel tries to fetch all rows from ES and then sort/aggregate them, but only manages to fetch the top 10 rows.
> The reason is that, by default, ES returns only 10 rows unless explicitly told to return more. Quoting the ES docs ^[1]^:
> bq. size defaults to 10.
> As far as I know, MetaModel doesn't explicitly tell ES to return more rows.
> There are two possible solutions to the problem. The first is to set a very large size in the query, e.g. "&size=BIGNUMBER".
> However, setting the size larger than index.max_result_window won't work, and users with more than BIGNUMBER documents will still have issues ^[1]^:
> bq. Note that from + size can not be more than the index.max_result_window index setting which defaults to 10,000.
> Another solution, which might be harder to implement, is a sliced scroll ^[2]^:
> {quote}
> Sliced Scroll
> For scroll queries that return a lot of documents it is possible to split the scroll in multiple slices which can be consumed independently:
> {quote}
> Let me know your thoughts,
> Thank you.
> [1] https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-from-size.html
> [2] https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html#sliced-scroll
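
To illustrate the two options described in the issue, here is a minimal Python sketch (not MetaModel code). page_requests plans from/size pages that stay within index.max_result_window, and scroll_all follows a scroll cursor until a page comes back empty. The function names, the send transport callable, and the page sizes are hypothetical; the request shapes follow the current Elasticsearch docs linked in [1] and [2] and may differ on older versions such as 1.4.4.

```python
MAX_RESULT_WINDOW = 10_000  # default for index.max_result_window, per the quoted docs


def page_requests(total_rows, page_size=1000, max_window=MAX_RESULT_WINDOW):
    """Yield (from, size) pairs that cover total_rows without from + size
    ever exceeding max_window. Rows beyond the window are not reachable
    this way, which is the limitation the issue points out."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    reachable = min(total_rows, max_window)
    offset = 0
    while offset < reachable:
        size = min(page_size, reachable - offset)
        yield offset, size
        offset += size


def scroll_all(send, index, page_size=1000, keep_alive="1m"):
    """Yield every hit from `index` by following the scroll cursor.

    `send(method, path, body)` is a hypothetical transport callable that
    performs the HTTP request and returns the decoded JSON response.
    """
    # The first request opens the scroll context and returns the first page.
    response = send(
        "POST",
        f"/{index}/_search?scroll={keep_alive}",
        {"size": page_size, "query": {"match_all": {}}},
    )
    # An empty page of hits signals that the cursor is exhausted.
    while response["hits"]["hits"]:
        yield from response["hits"]["hits"]
        response = send(
            "POST",
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": response["_scroll_id"]},
        )
```

With 25 rows and a page size of 10, page_requests yields (0, 10), (10, 10), (20, 5). Anything beyond the result window is only reachable through the scroll path. A sliced scroll would additionally put a slice object in the first request body so that several cursors can be consumed independently; that part is left out of this sketch.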



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)