You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@impala.apache.org by Thomas Tauber-Marshall <tm...@cloudera.com> on 2018/01/02 20:04:23 UTC

Re: New Impala Contributors: IMPALA-6296

On Fri, Dec 29, 2017 at 11:12 AM Manaswini Maharana <mm...@cloudera.com>
wrote:

> I've pushed the initial changes to -
> https://gerrit.cloudera.org/#/c/8923/
>
> Steps that I've followed to make this contribution - As this is my first,
> I want to elaborate a little bit to ensure I'm covering the bases right.
>
> 1. After the following the extensive contribution guide
> <https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala>,
> setup the development environment on machine with Ubuntu 16.04 OS, 15 GB
> memory, Intel® Core™ i7-8550U CPU @ 1.80GHz × 8  & 512 GB SSD
> 2. Modified the build process to avoiding building against -so
> (IMPALA-5365 & IMPALA-3926)
> 3. Reproduced the issue. Attached Impalad error log.
> 4. Put in the code fix - added a condition check to ensure
> 'emit_perf_map_' evaluates to true before the dependent
> object(perf_map_lock_) is asserted using DCHECK.
> 5. Unit tested the changes using Impala-shell. No daemon crash encountered
> this time.
> 6. Loaded and ran the "be" unit test cases to ensure the change did not
> break other stuff. Attached the results.
>
>
> Questions I have:
> 1. Is it recommened to run more end-to-end unit tests or does it depend on
> the intensity of the change ? Like here I've limited it to just running be
> tests, is that ok?
>

In general, you're expected to run any tests that may reasonably be
affected by the change, and of course the more testing the better. This
change affects basically any query that uses codegen, which is a
significant portion of our tests, so running the full suite would probably
be a good idea (eg bin/run-all-tests.sh)

You may also want to try out the Jenkins jobs that we provide for testing
patches, as described here:
https://cwiki.apache.org/confluence/display/IMPALA/Using+Gerrit+to+submit+and+review+patches#UsingGerrittosubmitandreviewpatches-Verifyingapatch(opentoallImpalacontributors)


> 2. As per Tim's comment, I should see *.asm files in codegen-modules after
> the fix, but I don't see any. Am I missing anything ?
>

You may need to manually create the directory $IMPALA_HOME/codegen-modules/


>
> Thanks & appreciate your input,
> Mansi
>
>
> On Fri, Dec 29, 2017 at 12:43 PM, Manaswini Maharana <
> mmaharana@cloudera.com> wrote:
>
>> Thank you!
>>
>> On Fri, Dec 29, 2017 at 11:47 AM, Jim Apple <jb...@cloudera.com> wrote:
>>
>>> Done
>>> On Fri, Dec 29, 2017 at 7:12 AM, Manaswini Maharana
>>> <mm...@cloudera.com> wrote:
>>> > username: mmaharana
>>> >
>>> > On Fri, Dec 29, 2017 at 10:07 AM, Jin Chul Kim <ji...@gmail.com>
>>> wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I guess she doesn't have a privilege to change assignee. The
>>> permission
>>> >> should be provided by somebody who has an admin privilege.
>>> >> @Mansi, please share your username.
>>> >>
>>> >> Best regards,
>>> >> Jinchul
>>> >>
>>> >> 2017-12-30 0:02 GMT+09:00 Vincent Tran <vt...@cloudera.com>:
>>> >>
>>> >> > You should be able to just simply "assign to me" if you are signed
>>> into
>>> >> ADD
>>> >> > Jira.
>>> >> >
>>> >> > On Dec 29, 2017 9:54 AM, "Manaswini Maharana" <
>>> mmaharana@cloudera.com>
>>> >> > wrote:
>>> >> >
>>> >> > > Good morning, Team - any update on this?
>>> >> > >
>>> >> > > - Mansi
>>> >> > >
>>> >> > >
>>> >> > > On Thu, Dec 28, 2017 at 12:32 AM, Manaswini Maharana <
>>> >> > > mmaharana@cloudera.com
>>> >> > > > wrote:
>>> >> > >
>>> >> > > > Hello team, I'd like to contribute to Impala and would like to
>>> start
>>> >> it
>>> >> > > > with this (IMPALA-6296) jira. Can I please have it assigned to
>>> me?
>>> >> > > >
>>> >> > > > My development environment is ready. I'm on ubuntu 16.04 so yes
>>> some
>>> >> > > > challenges to set up the environment especially with the
>>> >> > LD_LIBRARY_PATH
>>> >> > > > issues. So far I'm able to work past it and might ask for
>>> another
>>> >> pair
>>> >> > of
>>> >> > > > eyes to ensure it follows standard. The builds are going fine
>>> and
>>> >> runs
>>> >> > a
>>> >> > > > bit longer as I've disabled the -so. Apart from setup, I'm was
>>> able
>>> >> to
>>> >> > > > reproduce the issue and might have been ready with the fix
>>> which I'd
>>> >> > like
>>> >> > > > to discuss  once I'm officially assigned the jira.
>>> >> > > >
>>> >> > > > Thanks Tim, for the instructions, it was very helpful.
>>> >> > > >
>>> >> > > >
>>> >> > > > Thanks!
>>> >> > > > Mansi
>>> >> > > >
>>> >> > > >
>>> >> > > >
>>> >> > > > On Tue, Dec 12, 2017 at 8:24 PM, Tim Armstrong <
>>> >> > tarmstrong@cloudera.com>
>>> >> > > > wrote:
>>> >> > > >
>>> >> > > >> If you'd like to contribute a patch to Impala, but aren't sure
>>> what
>>> >> > > >> you want to work on, you can look at Impala's newbie issues:
>>> >> > > >> https://issues.apache.org/jira/issues/?filter=12341668. You
>>> can
>>> >> find
>>> >> > > >> detailed instructions on submitting patches at
>>> >> > > >> https://cwiki.apache.org/confluence/display/IMPALA/
>>> >> > > Contributing+to+Impala
>>> >> > > >> .
>>> >> > > >> This is a walkthrough of a ticket a new contributor could take
>>> on,
>>> >> > > >> with hopefully enough detail to get you going but not so much
>>> to
>>> >> take
>>> >> > > >> away the fun.
>>> >> > > >>
>>> >> > > >> How can we solve
>>> https://issues.apache.org/jira/browse/IMPALA-6296,
>>> >> > > >> "DCheck in CodegenSymbolEmitter when --asm_module_dir is set on
>>> >> debug
>>> >> > > >> build
>>> >> > > >> "?
>>> >> > > >> First, make sure you have your development environment set up.
>>> Let's
>>> >> > > >> see if we can reproduce the issue.
>>> >> > > >>
>>> >> > > >> Create a directory for the codegen modules then start
>>> impala-server
>>> >> > > >> $ mkdir codegen-modules
>>> >> > > >> $ start-impala-cluster.py --impalad_args="--opt_module_d
>>> >> > > >> ir=codegen-modules
>>> >> > > >> --unopt_module_dir=codegen-modules --asm_module_dir=codegen-
>>> >> modules"
>>> >> > > >>
>>> >> > > >> Now run a query through Impala. I chose this query because it
>>> uses
>>> >> > > >> Impala's
>>> >> > > >> runtime code generation ("codegen") to improve performance
>>> >> > > >> $ bin/impala-shell.sh -q "select count(*) from tpch.lineitem"
>>> >> > > >>
>>> >> > > >> After a few seconds, you should see an error from Impala
>>> crashing:
>>> >> > > >> Error communicating with impalad: TSocket read 0 bytes
>>> >> > > >> Could not execute command: select count(*) from tpch.lineitem
>>> >> > > >>
>>> >> > > >> Attempting to run any further queries will fail because of the
>>> >> crashed
>>> >> > > >> Impala daemons:
>>> >> > > >> $ impala-shell.sh -q "select count(*) from tpch.lineitem";
>>> >> > > >> Starting Impala Shell without Kerberos authentication
>>> >> > > >> Error connecting: TTransportException, Could not connect to
>>> >> > > >> localhost:21000
>>> >> > > >> Not connected to Impala, could not execute queries.
>>> >> > > >>
>>> >> > > >> If you look at logs/cluster/impalad.ERROR you will see a
>>> message
>>> >> > > >> explaining
>>> >> > > >> the crash - we hit a DCHECK (a.k.a. an assertion):
>>> >> > > >>
>>> >> > > >> F1212 17:10:07.642722  9138 codegen-symbol-emitter.cc:90] Check
>>> >> > failed:
>>> >> > > >> perf_map_.find(obj.getData().data()) != perf_map_.end()
>>> >> > > >> *** Check failure stack trace: ***
>>> >> > > >>     @          0x3be16ed  google::LogMessage::Fail()
>>> >> > > >>     @          0x3be2f92  google::LogMessage::SendToLog()
>>> >> > > >>     @          0x3be10c7  google::LogMessage::Flush()
>>> >> > > >>     @          0x3be468e  google::LogMessageFatal::~
>>> >> LogMessageFatal()
>>> >> > > >>     @          0x1b354f1
>>> >> > > >> impala::CodegenSymbolEmitter::NotifyFreeingObject()
>>> >> > > >>     @          0x375b791  llvm::MCJIT::NotifyFreeingObject()
>>> >> > > >>     @          0x375b811  llvm::MCJIT::~MCJIT()
>>> >> > > >>     @          0x375bfc9  llvm::MCJIT::~MCJIT()
>>> >> > > >>     @          0x1b265ba  std::default_delete<>::operator()()
>>> >> > > >>     @          0x1b23e21  std::unique_ptr<>::reset()
>>> >> > > >>     @          0x1b11198  impala::LlvmCodeGen::Close()
>>> >> > > >>     @          0x182194d
>>> impala::RuntimeState::ReleaseResources()
>>> >> > > >>     @          0x1893a35
>>> impala::FragmentInstanceState::Close()
>>> >> > > >>     @          0x1890d58  impala::FragmentInstanceState::Exec()
>>> >> > > >>     @          0x1879afa  impala::QueryState::ExecFInstance()
>>> >> > > >>     @          0x18783bc
>>> >> > > >> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
>>> >> > > >>     @          0x187a739
>>> >> > > >> _ZN5boost6detail8function26void_function_obj_invoker0IZN6imp
>>> >> > > >> ala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_
>>> >> > > 15function_bufferE
>>> >> > > >>     @          0x17c7200  boost::function0<>::operator()()
>>> >> > > >>     @          0x1abdf8b  impala::Thread::SuperviseThread()
>>> >> > > >>     @          0x1ac6b16  boost::_bi::list4<>::operator()<>()
>>> >> > > >>     @          0x1ac6a59  boost::_bi::bind_t<>::operator()()
>>> >> > > >>     @          0x1ac6a1c  boost::detail::thread_data<>::run()
>>> >> > > >>     @          0x2d6b08a  thread_proxy
>>> >> > > >>     @     0x7fcb5ff146ba  start_thread
>>> >> > > >>     @     0x7fcb5fc4a3dd  clone
>>> >> > > >>
>>> >> > > >> It looks like something is wrong with the code - 'perf_map_'
>>> doesn't
>>> >> > > have
>>> >> > > >> anything in it. If we look a few lines above, it looks like
>>> >> > 'perf_map_'
>>> >> > > is
>>> >> > > >> only filled in if 'emit_perf_map_' is true.
>>> >> > > >>
>>> >> > > >>   if (emit_perf_map_) {
>>> >> > > >>     lock_guard<SpinLock> perf_map_lock(perf_map_lock_);
>>> >> > > >>     DCHECK(perf_map_.find(obj.getData().data()) ==
>>> >> perf_map_.end());
>>> >> > > >>     perf_map_[obj.getData().data()] =
>>> std::move(perf_map_entries);
>>> >> > > >>     WritePerfMapLocked();
>>> >> > > >>   }
>>> >> > > >>
>>> >> > > >> Maybe we should also be checking 'emit_perf_map_' before the
>>> DCHECK?
>>> >> > > >>
>>> >> > > >> After you have fixed this bug, you should be able to run the
>>> query
>>> >> > > without
>>> >> > > >> Impala crashing and you should be able to inspect the assembly
>>> >> > generated
>>> >> > > >> by
>>> >> > > >> Impala in codegen-modules/*.asm
>>> >> > > >>
>>> >> > > >> Have fun, and don't be afraid to ask dev@impala.apache.org is
>>> you
>>> >> > have
>>> >> > > >> any questions!
>>> >> > > >>
>>> >> > > >
>>> >> > > >
>>> >> > >
>>> >> >
>>> >>
>>>
>>
>>
>

Re: New Impala Contributors: IMPALA-6296

Posted by Thomas Tauber-Marshall <tm...@cloudera.com>.
On Wed, Jan 3, 2018 at 10:06 AM Manaswini Maharana <mm...@cloudera.com>
wrote:

> Thanks for the pointers, Thomas. I was able to run the Jenkin tests, but
> not sure how to manually add the links to Gerrit as recommended. Do I  add
> it through the reply text box as shown below?
>

Yes, you can just post it as a comment on the review using the "Reply"
button


>
> - Mansi
> ​
>
> On Tue, Jan 2, 2018 at 3:04 PM, Thomas Tauber-Marshall <
> tmarshall@cloudera.com> wrote:
>
>> On Fri, Dec 29, 2017 at 11:12 AM Manaswini Maharana <
>> mmaharana@cloudera.com>
>> wrote:
>>
>> > I've pushed the initial changes to -
>> > https://gerrit.cloudera.org/#/c/8923/
>> >
>> > Steps that I've followed to make this contribution - As this is my
>> first,
>> > I want to elaborate a little bit to ensure I'm covering the bases right.
>> >
>> > 1. After the following the extensive contribution guide
>>
> > <
>> https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala
>> >,
>
>
>> > setup the development environment on machine with Ubuntu 16.04 OS, 15 GB
>> > memory, Intel® Core™ i7-8550U CPU @ 1.80GHz × 8  & 512 GB SSD
>> > 2. Modified the build process to avoiding building against -so
>> > (IMPALA-5365 & IMPALA-3926)
>> > 3. Reproduced the issue. Attached Impalad error log.
>> > 4. Put in the code fix - added a condition check to ensure
>> > 'emit_perf_map_' evaluates to true before the dependent
>> > object(perf_map_lock_) is asserted using DCHECK.
>> > 5. Unit tested the changes using Impala-shell. No daemon crash
>> encountered
>> > this time.
>> > 6. Loaded and ran the "be" unit test cases to ensure the change did not
>> > break other stuff. Attached the results.
>> >
>> >
>> > Questions I have:
>> > 1. Is it recommened to run more end-to-end unit tests or does it depend
>> on
>> > the intensity of the change ? Like here I've limited it to just running
>> be
>> > tests, is that ok?
>> >
>>
>> In general, you're expected to run any tests that may reasonably be
>> affected by the change, and of course the more testing the better. This
>> change affects basically any query that uses codegen, which is a
>> significant portion of our tests, so running the full suite would probably
>> be a good idea (eg bin/run-all-tests.sh)
>>
>> You may also want to try out the Jenkins jobs that we provide for testing
>> patches, as described here:
>>
>> https://cwiki.apache.org/confluence/display/IMPALA/Using+Gerrit+to+submit+and+review+patches#UsingGerrittosubmitandreviewpatches-Verifyingapatch(opentoallImpalacontributors)
>>
>>
>> > 2. As per Tim's comment, I should see *.asm files in codegen-modules
>> after
>> > the fix, but I don't see any. Am I missing anything ?
>> >
>>
>> You may need to manually create the directory
>> $IMPALA_HOME/codegen-modules/
>>
>>
>> >
>> > Thanks & appreciate your input,
>> > Mansi
>> >
>> >
>> > On Fri, Dec 29, 2017 at 12:43 PM, Manaswini Maharana <
>> > mmaharana@cloudera.com> wrote:
>> >
>> >> Thank you!
>> >>
>> >> On Fri, Dec 29, 2017 at 11:47 AM, Jim Apple <jb...@cloudera.com>
>> wrote:
>> >>
>> >>> Done
>> >>> On Fri, Dec 29, 2017 at 7:12 AM, Manaswini Maharana
>> >>> <mm...@cloudera.com> wrote:
>> >>> > username: mmaharana
>> >>> >
>> >>> > On Fri, Dec 29, 2017 at 10:07 AM, Jin Chul Kim <ji...@gmail.com>
>> >>> wrote:
>> >>> >
>> >>> >> Hi,
>> >>> >>
>> >>> >> I guess she doesn't have a privilege to change assignee. The
>> >>> permission
>> >>> >> should be provided by somebody who has an admin privilege.
>> >>> >> @Mansi, please share your username.
>> >>> >>
>> >>> >> Best regards,
>> >>> >> Jinchul
>> >>> >>
>> >>> >> 2017-12-30 0:02 GMT+09:00 Vincent Tran <vt...@cloudera.com>:
>> >>> >>
>> >>> >> > You should be able to just simply "assign to me" if you are
>> signed
>> >>> into
>> >>> >> ADD
>> >>> >> > Jira.
>> >>> >> >
>> >>> >> > On Dec 29, 2017 9:54 AM, "Manaswini Maharana" <
>> >>> mmaharana@cloudera.com>
>> >>> >> > wrote:
>> >>> >> >
>> >>> >> > > Good morning, Team - any update on this?
>> >>> >> > >
>> >>> >> > > - Mansi
>> >>> >> > >
>> >>> >> > >
>> >>> >> > > On Thu, Dec 28, 2017 at 12:32 AM, Manaswini Maharana <
>> >>> >> > > mmaharana@cloudera.com
>> >>> >> > > > wrote:
>> >>> >> > >
>> >>> >> > > > Hello team, I'd like to contribute to Impala and would like
>> to
>> >>> start
>> >>> >> it
>> >>> >> > > > with this (IMPALA-6296) jira. Can I please have it assigned
>> to
>> >>> me?
>> >>> >> > > >
>> >>> >> > > > My development environment is ready. I'm on ubuntu 16.04 so
>> yes
>> >>> some
>> >>> >> > > > challenges to set up the environment especially with the
>> >>> >> > LD_LIBRARY_PATH
>> >>> >> > > > issues. So far I'm able to work past it and might ask for
>> >>> another
>> >>> >> pair
>> >>> >> > of
>> >>> >> > > > eyes to ensure it follows standard. The builds are going fine
>> >>> and
>> >>> >> runs
>> >>> >> > a
>> >>> >> > > > bit longer as I've disabled the -so. Apart from setup, I'm
>> was
>> >>> able
>> >>> >> to
>> >>> >> > > > reproduce the issue and might have been ready with the fix
>> >>> which I'd
>> >>> >> > like
>> >>> >> > > > to discuss  once I'm officially assigned the jira.
>> >>> >> > > >
>> >>> >> > > > Thanks Tim, for the instructions, it was very helpful.
>> >>> >> > > >
>> >>> >> > > >
>> >>> >> > > > Thanks!
>> >>> >> > > > Mansi
>> >>> >> > > >
>> >>> >> > > >
>> >>> >> > > >
>> >>> >> > > > On Tue, Dec 12, 2017 at 8:24 PM, Tim Armstrong <
>> >>> >> > tarmstrong@cloudera.com>
>> >>> >> > > > wrote:
>> >>> >> > > >
>> >>> >> > > >> If you'd like to contribute a patch to Impala, but aren't
>> sure
>> >>> what
>> >>> >> > > >> you want to work on, you can look at Impala's newbie issues:
>> >>> >> > > >> https://issues.apache.org/jira/issues/?filter=12341668. You
>> >>> can
>> >>> >> find
>> >>> >> > > >> detailed instructions on submitting patches at
>> >>> >> > > >> https://cwiki.apache.org/confluence/display/IMPALA/
>> >>> >> > > Contributing+to+Impala
>> >>> >> > > >> .
>> >>> >> > > >> This is a walkthrough of a ticket a new contributor could
>> take
>> >>> on,
>> >>> >> > > >> with hopefully enough detail to get you going but not so
>> much
>> >>> to
>> >>> >> take
>> >>> >> > > >> away the fun.
>> >>> >> > > >>
>> >>> >> > > >> How can we solve
>> >>> https://issues.apache.org/jira/browse/IMPALA-6296,
>> >>> >> > > >> "DCheck in CodegenSymbolEmitter when --asm_module_dir is
>> set on
>> >>> >> debug
>> >>> >> > > >> build
>> >>> >> > > >> "?
>> >>> >> > > >> First, make sure you have your development environment set
>> up.
>> >>> Let's
>> >>> >> > > >> see if we can reproduce the issue.
>> >>> >> > > >>
>> >>> >> > > >> Create a directory for the codegen modules then start
>> >>> impala-server
>> >>> >> > > >> $ mkdir codegen-modules
>> >>> >> > > >> $ start-impala-cluster.py --impalad_args="--opt_module_d
>> >>> >> > > >> ir=codegen-modules
>> >>> >> > > >> --unopt_module_dir=codegen-modules --asm_module_dir=codegen-
>> >>> >> modules"
>> >>> >> > > >>
>> >>> >> > > >> Now run a query through Impala. I chose this query because
>> it
>> >>> uses
>> >>> >> > > >> Impala's
>> >>> >> > > >> runtime code generation ("codegen") to improve performance
>> >>> >> > > >> $ bin/impala-shell.sh -q "select count(*) from
>> tpch.lineitem"
>> >>> >> > > >>
>> >>> >> > > >> After a few seconds, you should see an error from Impala
>> >>> crashing:
>> >>> >> > > >> Error communicating with impalad: TSocket read 0 bytes
>> >>> >> > > >> Could not execute command: select count(*) from
>> tpch.lineitem
>> >>> >> > > >>
>> >>> >> > > >> Attempting to run any further queries will fail because of
>> the
>> >>> >> crashed
>> >>> >> > > >> Impala daemons:
>> >>> >> > > >> $ impala-shell.sh -q "select count(*) from tpch.lineitem";
>> >>> >> > > >> Starting Impala Shell without Kerberos authentication
>> >>> >> > > >> Error connecting: TTransportException, Could not connect to
>> >>> >> > > >> localhost:21000
>> >>> >> > > >> Not connected to Impala, could not execute queries.
>> >>> >> > > >>
>> >>> >> > > >> If you look at logs/cluster/impalad.ERROR you will see a
>> >>> message
>> >>> >> > > >> explaining
>> >>> >> > > >> the crash - we hit a DCHECK (a.k.a. an assertion):
>> >>> >> > > >>
>> >>> >> > > >> F1212 17:10:07.642722  9138 codegen-symbol-emitter.cc:90]
>> Check
>> >>> >> > failed:
>> >>> >> > > >> perf_map_.find(obj.getData().data()) != perf_map_.end()
>> >>> >> > > >> *** Check failure stack trace: ***
>> >>> >> > > >>     @          0x3be16ed  google::LogMessage::Fail()
>> >>> >> > > >>     @          0x3be2f92  google::LogMessage::SendToLog()
>> >>> >> > > >>     @          0x3be10c7  google::LogMessage::Flush()
>> >>> >> > > >>     @          0x3be468e  google::LogMessageFatal::~
>> >>> >> LogMessageFatal()
>> >>> >> > > >>     @          0x1b354f1
>> >>> >> > > >> impala::CodegenSymbolEmitter::NotifyFreeingObject()
>> >>> >> > > >>     @          0x375b791  llvm::MCJIT::NotifyFreeingObject()
>> >>> >> > > >>     @          0x375b811  llvm::MCJIT::~MCJIT()
>> >>> >> > > >>     @          0x375bfc9  llvm::MCJIT::~MCJIT()
>> >>> >> > > >>     @          0x1b265ba
>> std::default_delete<>::operator()()
>> >>> >> > > >>     @          0x1b23e21  std::unique_ptr<>::reset()
>> >>> >> > > >>     @          0x1b11198  impala::LlvmCodeGen::Close()
>> >>> >> > > >>     @          0x182194d
>> >>> impala::RuntimeState::ReleaseResources()
>> >>> >> > > >>     @          0x1893a35
>> >>> impala::FragmentInstanceState::Close()
>> >>> >> > > >>     @          0x1890d58
>> impala::FragmentInstanceState::Exec()
>> >>> >> > > >>     @          0x1879afa
>> impala::QueryState::ExecFInstance()
>> >>> >> > > >>     @          0x18783bc
>> >>> >> > > >> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
>> >>> >> > > >>     @          0x187a739
>> >>> >> > > >> _ZN5boost6detail8function26void_function_obj_invoker0IZN6imp
>> >>> >> > > >> ala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_
>> >>> >> > > 15function_bufferE
>> >>> >> > > >>     @          0x17c7200  boost::function0<>::operator()()
>> >>> >> > > >>     @          0x1abdf8b  impala::Thread::SuperviseThread()
>> >>> >> > > >>     @          0x1ac6b16
>> boost::_bi::list4<>::operator()<>()
>> >>> >> > > >>     @          0x1ac6a59  boost::_bi::bind_t<>::operator()()
>> >>> >> > > >>     @          0x1ac6a1c
>> boost::detail::thread_data<>::run()
>> >>> >> > > >>     @          0x2d6b08a  thread_proxy
>> >>> >> > > >>     @     0x7fcb5ff146ba  start_thread
>> >>> >> > > >>     @     0x7fcb5fc4a3dd  clone
>> >>> >> > > >>
>> >>> >> > > >> It looks like something is wrong with the code - 'perf_map_'
>> >>> doesn't
>> >>> >> > > have
>> >>> >> > > >> anything in it. If we look a few lines above, it looks like
>> >>> >> > 'perf_map_'
>> >>> >> > > is
>> >>> >> > > >> only filled in if 'emit_perf_map_' is true.
>> >>> >> > > >>
>> >>> >> > > >>   if (emit_perf_map_) {
>> >>> >> > > >>     lock_guard<SpinLock> perf_map_lock(perf_map_lock_);
>> >>> >> > > >>     DCHECK(perf_map_.find(obj.getData().data()) ==
>> >>> >> perf_map_.end());
>> >>> >> > > >>     perf_map_[obj.getData().data()] =
>> >>> std::move(perf_map_entries);
>> >>> >> > > >>     WritePerfMapLocked();
>> >>> >> > > >>   }
>> >>> >> > > >>
>> >>> >> > > >> Maybe we should also be checking 'emit_perf_map_' before the
>> >>> DCHECK?
>> >>> >> > > >>
>> >>> >> > > >> After you have fixed this bug, you should be able to run the
>> >>> query
>> >>> >> > > without
>> >>> >> > > >> Impala crashing and you should be able to inspect the
>> assembly
>> >>> >> > generated
>> >>> >> > > >> by
>> >>> >> > > >> Impala in codegen-modules/*.asm
>> >>> >> > > >>
>> >>> >> > > >> Have fun, and don't be afraid to ask dev@impala.apache.org
>> is
>> >>> you
>> >>> >> > have
>> >>> >> > > >> any questions!
>> >>> >> > > >>
>> >>> >> > > >
>> >>> >> > > >
>> >>> >> > >
>> >>> >> >
>> >>> >>
>> >>>
>> >>
>> >>
>> >
>>
>

Re: New Impala Contributors: IMPALA-6296

Posted by Manaswini Maharana <mm...@cloudera.com>.
Thanks for the pointers, Thomas. I was able to run the Jenkin tests, but
not sure how to manually add the links to Gerrit as recommended. Do I  add
it through the reply text box as shown below?

- Mansi
​

On Tue, Jan 2, 2018 at 3:04 PM, Thomas Tauber-Marshall <
tmarshall@cloudera.com> wrote:

> On Fri, Dec 29, 2017 at 11:12 AM Manaswini Maharana <
> mmaharana@cloudera.com>
> wrote:
>
> > I've pushed the initial changes to -
> > https://gerrit.cloudera.org/#/c/8923/
> >
> > Steps that I've followed to make this contribution - As this is my first,
> > I want to elaborate a little bit to ensure I'm covering the bases right.
> >
> > 1. After the following the extensive contribution guide
> > <https://cwiki.apache.org/confluence/display/IMPALA/Contribu
> ting+to+Impala>,
> > setup the development environment on machine with Ubuntu 16.04 OS, 15 GB
> > memory, Intel® Core™ i7-8550U CPU @ 1.80GHz × 8  & 512 GB SSD
> > 2. Modified the build process to avoiding building against -so
> > (IMPALA-5365 & IMPALA-3926)
> > 3. Reproduced the issue. Attached Impalad error log.
> > 4. Put in the code fix - added a condition check to ensure
> > 'emit_perf_map_' evaluates to true before the dependent
> > object(perf_map_lock_) is asserted using DCHECK.
> > 5. Unit tested the changes using Impala-shell. No daemon crash
> encountered
> > this time.
> > 6. Loaded and ran the "be" unit test cases to ensure the change did not
> > break other stuff. Attached the results.
> >
> >
> > Questions I have:
> > 1. Is it recommened to run more end-to-end unit tests or does it depend
> on
> > the intensity of the change ? Like here I've limited it to just running
> be
> > tests, is that ok?
> >
>
> In general, you're expected to run any tests that may reasonably be
> affected by the change, and of course the more testing the better. This
> change affects basically any query that uses codegen, which is a
> significant portion of our tests, so running the full suite would probably
> be a good idea (eg bin/run-all-tests.sh)
>
> You may also want to try out the Jenkins jobs that we provide for testing
> patches, as described here:
> https://cwiki.apache.org/confluence/display/IMPALA/Using+
> Gerrit+to+submit+and+review+patches#UsingGerrittosubmitand
> reviewpatches-Verifyingapatch(opentoallImpalacontributors)
>
>
> > 2. As per Tim's comment, I should see *.asm files in codegen-modules
> after
> > the fix, but I don't see any. Am I missing anything ?
> >
>
> You may need to manually create the directory $IMPALA_HOME/codegen-modules/
>
>
> >
> > Thanks & appreciate your input,
> > Mansi
> >
> >
> > On Fri, Dec 29, 2017 at 12:43 PM, Manaswini Maharana <
> > mmaharana@cloudera.com> wrote:
> >
> >> Thank you!
> >>
> >> On Fri, Dec 29, 2017 at 11:47 AM, Jim Apple <jb...@cloudera.com>
> wrote:
> >>
> >>> Done
> >>> On Fri, Dec 29, 2017 at 7:12 AM, Manaswini Maharana
> >>> <mm...@cloudera.com> wrote:
> >>> > username: mmaharana
> >>> >
> >>> > On Fri, Dec 29, 2017 at 10:07 AM, Jin Chul Kim <ji...@gmail.com>
> >>> wrote:
> >>> >
> >>> >> Hi,
> >>> >>
> >>> >> I guess she doesn't have a privilege to change assignee. The
> >>> permission
> >>> >> should be provided by somebody who has an admin privilege.
> >>> >> @Mansi, please share your username.
> >>> >>
> >>> >> Best regards,
> >>> >> Jinchul
> >>> >>
> >>> >> 2017-12-30 0:02 GMT+09:00 Vincent Tran <vt...@cloudera.com>:
> >>> >>
> >>> >> > You should be able to just simply "assign to me" if you are signed
> >>> into
> >>> >> ADD
> >>> >> > Jira.
> >>> >> >
> >>> >> > On Dec 29, 2017 9:54 AM, "Manaswini Maharana" <
> >>> mmaharana@cloudera.com>
> >>> >> > wrote:
> >>> >> >
> >>> >> > > Good morning, Team - any update on this?
> >>> >> > >
> >>> >> > > - Mansi
> >>> >> > >
> >>> >> > >
> >>> >> > > On Thu, Dec 28, 2017 at 12:32 AM, Manaswini Maharana <
> >>> >> > > mmaharana@cloudera.com
> >>> >> > > > wrote:
> >>> >> > >
> >>> >> > > > Hello team, I'd like to contribute to Impala and would like to
> >>> start
> >>> >> it
> >>> >> > > > with this (IMPALA-6296) jira. Can I please have it assigned to
> >>> me?
> >>> >> > > >
> >>> >> > > > My development environment is ready. I'm on ubuntu 16.04 so
> yes
> >>> some
> >>> >> > > > challenges to set up the environment especially with the
> >>> >> > LD_LIBRARY_PATH
> >>> >> > > > issues. So far I'm able to work past it and might ask for
> >>> another
> >>> >> pair
> >>> >> > of
> >>> >> > > > eyes to ensure it follows standard. The builds are going fine
> >>> and
> >>> >> runs
> >>> >> > a
> >>> >> > > > bit longer as I've disabled the -so. Apart from setup, I'm was
> >>> able
> >>> >> to
> >>> >> > > > reproduce the issue and might have been ready with the fix
> >>> which I'd
> >>> >> > like
> >>> >> > > > to discuss  once I'm officially assigned the jira.
> >>> >> > > >
> >>> >> > > > Thanks Tim, for the instructions, it was very helpful.
> >>> >> > > >
> >>> >> > > >
> >>> >> > > > Thanks!
> >>> >> > > > Mansi
> >>> >> > > >
> >>> >> > > >
> >>> >> > > >
> >>> >> > > > On Tue, Dec 12, 2017 at 8:24 PM, Tim Armstrong <
> >>> >> > tarmstrong@cloudera.com>
> >>> >> > > > wrote:
> >>> >> > > >
> >>> >> > > >> If you'd like to contribute a patch to Impala, but aren't
> sure
> >>> what
> >>> >> > > >> you want to work on, you can look at Impala's newbie issues:
> >>> >> > > >> https://issues.apache.org/jira/issues/?filter=12341668. You
> >>> can
> >>> >> find
> >>> >> > > >> detailed instructions on submitting patches at
> >>> >> > > >> https://cwiki.apache.org/confluence/display/IMPALA/
> >>> >> > > Contributing+to+Impala
> >>> >> > > >> .
> >>> >> > > >> This is a walkthrough of a ticket a new contributor could
> take
> >>> on,
> >>> >> > > >> with hopefully enough detail to get you going but not so much
> >>> to
> >>> >> take
> >>> >> > > >> away the fun.
> >>> >> > > >>
> >>> >> > > >> How can we solve
> >>> https://issues.apache.org/jira/browse/IMPALA-6296,
> >>> >> > > >> "DCheck in CodegenSymbolEmitter when --asm_module_dir is set
> on
> >>> >> debug
> >>> >> > > >> build
> >>> >> > > >> "?
> >>> >> > > >> First, make sure you have your development environment set
> up.
> >>> Let's
> >>> >> > > >> see if we can reproduce the issue.
> >>> >> > > >>
> >>> >> > > >> Create a directory for the codegen modules then start
> >>> impala-server
> >>> >> > > >> $ mkdir codegen-modules
> >>> >> > > >> $ start-impala-cluster.py --impalad_args="--opt_module_d
> >>> >> > > >> ir=codegen-modules
> >>> >> > > >> --unopt_module_dir=codegen-modules --asm_module_dir=codegen-
> >>> >> modules"
> >>> >> > > >>
> >>> >> > > >> Now run a query through Impala. I chose this query because it
> >>> uses
> >>> >> > > >> Impala's
> >>> >> > > >> runtime code generation ("codegen") to improve performance
> >>> >> > > >> $ bin/impala-shell.sh -q "select count(*) from tpch.lineitem"
> >>> >> > > >>
> >>> >> > > >> After a few seconds, you should see an error from Impala
> >>> crashing:
> >>> >> > > >> Error communicating with impalad: TSocket read 0 bytes
> >>> >> > > >> Could not execute command: select count(*) from tpch.lineitem
> >>> >> > > >>
> >>> >> > > >> Attempting to run any further queries will fail because of
> the
> >>> >> crashed
> >>> >> > > >> Impala daemons:
> >>> >> > > >> $ impala-shell.sh -q "select count(*) from tpch.lineitem";
> >>> >> > > >> Starting Impala Shell without Kerberos authentication
> >>> >> > > >> Error connecting: TTransportException, Could not connect to
> >>> >> > > >> localhost:21000
> >>> >> > > >> Not connected to Impala, could not execute queries.
> >>> >> > > >>
> >>> >> > > >> If you look at logs/cluster/impalad.ERROR you will see a
> >>> message
> >>> >> > > >> explaining
> >>> >> > > >> the crash - we hit a DCHECK (a.k.a. an assertion):
> >>> >> > > >>
> >>> >> > > >> F1212 17:10:07.642722  9138 codegen-symbol-emitter.cc:90]
> Check
> >>> >> > failed:
> >>> >> > > >> perf_map_.find(obj.getData().data()) != perf_map_.end()
> >>> >> > > >> *** Check failure stack trace: ***
> >>> >> > > >>     @          0x3be16ed  google::LogMessage::Fail()
> >>> >> > > >>     @          0x3be2f92  google::LogMessage::SendToLog()
> >>> >> > > >>     @          0x3be10c7  google::LogMessage::Flush()
> >>> >> > > >>     @          0x3be468e  google::LogMessageFatal::~
> >>> >> LogMessageFatal()
> >>> >> > > >>     @          0x1b354f1
> >>> >> > > >> impala::CodegenSymbolEmitter::NotifyFreeingObject()
> >>> >> > > >>     @          0x375b791  llvm::MCJIT::NotifyFreeingObject()
> >>> >> > > >>     @          0x375b811  llvm::MCJIT::~MCJIT()
> >>> >> > > >>     @          0x375bfc9  llvm::MCJIT::~MCJIT()
> >>> >> > > >>     @          0x1b265ba  std::default_delete<>::operato
> r()()
> >>> >> > > >>     @          0x1b23e21  std::unique_ptr<>::reset()
> >>> >> > > >>     @          0x1b11198  impala::LlvmCodeGen::Close()
> >>> >> > > >>     @          0x182194d
> >>> impala::RuntimeState::ReleaseResources()
> >>> >> > > >>     @          0x1893a35
> >>> impala::FragmentInstanceState::Close()
> >>> >> > > >>     @          0x1890d58  impala::FragmentInstanceState:
> :Exec()
> >>> >> > > >>     @          0x1879afa  impala::QueryState::ExecFInsta
> nce()
> >>> >> > > >>     @          0x18783bc
> >>> >> > > >> _ZZN6impala10QueryState15StartFInstancesEvENKUlvE_clEv
> >>> >> > > >>     @          0x187a739
> >>> >> > > >> _ZN5boost6detail8function26void_function_obj_invoker0IZN6imp
> >>> >> > > >> ala10QueryState15StartFInstancesEvEUlvE_vE6invokeERNS1_
> >>> >> > > 15function_bufferE
> >>> >> > > >>     @          0x17c7200  boost::function0<>::operator()()
> >>> >> > > >>     @          0x1abdf8b  impala::Thread::SuperviseThread()
> >>> >> > > >>     @          0x1ac6b16  boost::_bi::list4<>::operator(
> )<>()
> >>> >> > > >>     @          0x1ac6a59  boost::_bi::bind_t<>::operator()()
> >>> >> > > >>     @          0x1ac6a1c  boost::detail::thread_data<>::
> run()
> >>> >> > > >>     @          0x2d6b08a  thread_proxy
> >>> >> > > >>     @     0x7fcb5ff146ba  start_thread
> >>> >> > > >>     @     0x7fcb5fc4a3dd  clone
> >>> >> > > >>
> >>> >> > > >> It looks like something is wrong with the code - 'perf_map_'
> >>> doesn't
> >>> >> > > have
> >>> >> > > >> anything in it. If we look a few lines above, it looks like
> >>> >> > 'perf_map_'
> >>> >> > > is
> >>> >> > > >> only filled in if 'emit_perf_map_' is true.
> >>> >> > > >>
> >>> >> > > >>   if (emit_perf_map_) {
> >>> >> > > >>     lock_guard<SpinLock> perf_map_lock(perf_map_lock_);
> >>> >> > > >>     DCHECK(perf_map_.find(obj.getData().data()) ==
> >>> >> perf_map_.end());
> >>> >> > > >>     perf_map_[obj.getData().data()] =
> >>> std::move(perf_map_entries);
> >>> >> > > >>     WritePerfMapLocked();
> >>> >> > > >>   }
> >>> >> > > >>
> >>> >> > > >> Maybe we should also be checking 'emit_perf_map_' before the
> >>> DCHECK?
> >>> >> > > >>
> >>> >> > > >> After you have fixed this bug, you should be able to run the
> >>> query
> >>> >> > > without
> >>> >> > > >> Impala crashing and you should be able to inspect the
> assembly
> >>> >> > generated
> >>> >> > > >> by
> >>> >> > > >> Impala in codegen-modules/*.asm
> >>> >> > > >>
> >>> >> > > >> Have fun, and don't be afraid to ask dev@impala.apache.org
> is
> >>> you
> >>> >> > have
> >>> >> > > >> any questions!
> >>> >> > > >>
> >>> >> > > >
> >>> >> > > >
> >>> >> > >
> >>> >> >
> >>> >>
> >>>
> >>
> >>
> >
>