Posted to dev@impala.apache.org by funes79 <gi...@git.apache.org> on 2017/01/27 14:45:33 UTC

[GitHub] incubator-impala pull request #1: Branch 2.8.0

GitHub user funes79 opened a pull request:

    https://github.com/apache/incubator-impala/pull/1

    Branch 2.8.0

    I would like to register my first pull request for Impala. We have been using it in production for almost 3 years.
    I would like to suggest an improvement to the behaviour of compute incremental stats.
    We have a very large table, initially migrated from another cluster, and we had to create stats on the table. Compute incremental stats failed (was skipped) after 4 hours, and in that time, based on HDFS reads, almost 90% of the table had been scanned. Unfortunately Impala didn't store the partition statistics (daily partitions), so when I checked the stats, everything showed false. The performance of compute stats is also very poor: it looks like it scans the table partition by partition, and if a partition is small (on one node), the other nodes stay idle.
    Two improvements I would suggest:
     - write the calculated stats immediately after each partition's stats are gathered
     - if the table has a large number of partitions (3 years, 1000 partitions), scan in parallel at least as many partitions as there are Impala daemons configured.
    
    Thanks
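
A minimal sketch of the two suggestions above as a client-side workaround, not a change to Impala itself: it issues one COMPUTE INCREMENTAL STATS statement per daily partition and runs as many at once as there are daemons. The table name `events`, the partition column `day`, and the `run_statement` callable are hypothetical. Whether concurrent statements against one table actually run in parallel depends on the Impala version and is not verified here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from datetime import date, timedelta


def daily_partitions(start, days):
    """Yield ISO date strings for `days` consecutive daily partitions."""
    for offset in range(days):
        yield (start + timedelta(days=offset)).isoformat()


def stats_statement(table, partition_column, value):
    """Build one per-partition statement (table and column names are hypothetical)."""
    return (
        f"COMPUTE INCREMENTAL STATS {table} "
        f"PARTITION ({partition_column}='{value}')"
    )


def compute_stats_in_parallel(statements, run_statement, num_daemons):
    """Run up to `num_daemons` statements at once.

    `run_statement` is a caller-supplied callable (hypothetical here) that
    submits one statement to Impala and returns when it finishes. Each
    statement is independent, so a failure affects only its own partition.
    """
    done, failed = [], []
    with ThreadPoolExecutor(max_workers=num_daemons) as pool:
        futures = {pool.submit(run_statement, s): s for s in statements}
        for future in as_completed(futures):
            statement = futures[future]
            try:
                future.result()
                done.append(statement)
            except Exception as exc:  # record the failure and keep going
                failed.append((statement, exc))
    return done, failed


if __name__ == "__main__":
    statements = [
        stats_statement("events", "day", value)
        for value in daily_partitions(date(2017, 1, 1), 3)
    ]
    # print() stands in for a real client call in this sketch.
    done, failed = compute_stats_in_parallel(statements, print, num_daemons=2)
```

Because each partition is handled by its own statement, a run that is interrupted after several hours would only need to be repeated for the partitions listed in `failed` or not yet attempted.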

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-impala branch-2.8.0

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-impala/pull/1.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1
    
----
commit 2423d23f8a84f4b38d2250ae0598207aeda243b2
Author: Jim Apple <jb...@apache.org>
Date:   2017-01-06T23:53:24Z

    Update VERSION to begin release candidate testing
    
    Change-Id: I0fcec577babba0929600d540936bb154a42dee50

commit 95e9479c12a3ba6fdfed25ae88467c8ba4622ad2
Author: Jim Apple <jb...@apache.org>
Date:   2017-01-05T16:19:28Z

    Add disclaimer to docs: Cloudera-specific info still present.
    
    While we are working on excising it, we don't want users to be
    confused about what the manual is intended to describe.
    
    Change-Id: I7740189fd7ff7f22d8471f037e190d9923521936
    Reviewed-on: http://gerrit.cloudera.org:8080/5610
    Reviewed-by: Tim Armstrong <ta...@cloudera.com>
    Tested-by: Impala Public Jenkins

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

Re: [GitHub] incubator-impala pull request #1: Branch 2.8.0

Posted by Henry Robinson <he...@cloudera.com>.
Ah, good catch, thanks.

On 31 January 2017 at 10:08, Jim Apple <jb...@cloudera.com> wrote:

> I answered on the PR with the wiki page instructions on how to file
> bugs and patches. The PR didn't actually have any new code and the
> verb tense made me think this might be a feature request, not a patch.



-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679

Re: [GitHub] incubator-impala pull request #1: Branch 2.8.0

Posted by Jim Apple <jb...@cloudera.com>.
I answered on the PR with the wiki page instructions on how to file
bugs and patches. The PR didn't actually have any new code and the
verb tense made me think this might be a feature request, not a patch.

On Tue, Jan 31, 2017 at 9:57 AM, Henry Robinson <he...@cloudera.com> wrote:
> Have you reached out to the author with suggestions for how to contribute
> their patch? Looks like a patch we might want to consider.

Re: [GitHub] incubator-impala pull request #1: Branch 2.8.0

Posted by Henry Robinson <he...@cloudera.com>.
Have you reached out to the author with suggestions for how to contribute
their patch? Looks like a patch we might want to consider.

On 31 January 2017 at 09:34, Jim Apple <jb...@cloudera.com> wrote:

> After replying, I asked on http://infra.chat about how to close this.
> The right answer is apparently to ask in infra.chat or file an Apache
> infra ticket.

-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679

Re: [GitHub] incubator-impala pull request #1: Branch 2.8.0

Posted by Jim Apple <jb...@cloudera.com>.
After replying, I asked on http://infra.chat about how to close this.
The right answer is apparently to ask in infra.chat or file an Apache
infra ticket.

On Tue, Jan 31, 2017 at 9:33 AM, Humbedooh <gi...@git.apache.org> wrote:
> Github user Humbedooh closed the pull request at:
>
>     https://github.com/apache/incubator-impala/pull/1

[GitHub] incubator-impala pull request #1: Branch 2.8.0

Posted by Humbedooh <gi...@git.apache.org>.
Github user Humbedooh closed the pull request at:

    https://github.com/apache/incubator-impala/pull/1

