Posted to dev@couchdb.apache.org by "Sean Geoghegan (JIRA)" <ji...@apache.org> on 2009/11/11 06:08:28 UTC

[jira] Created: (COUCHDB-567) Erlang View with Reduce Fails on Large Number of documents

Erlang View with Reduce Fails on Large Number of documents
----------------------------------------------------------

                 Key: COUCHDB-567
                 URL: https://issues.apache.org/jira/browse/COUCHDB-567
             Project: CouchDB
          Issue Type: Bug
    Affects Versions: 0.10
            Reporter: Sean Geoghegan


I have been having a problem with running Erlang views over a large dataset.  Whenever the indexer goes to checkpoint its progress, the following error occurs:

** Last message in was {'EXIT',<0.2220.0>,
                        {function_clause,
                         [{couch_view_updater,view_insert_doc_query_results,
                           [{doc,<<"73956fdca62c384849a3313e6c48b7ed">>,...
                           [],
                           [{{view,0,
                                 [<<"_temp">>],
                                 <<"...">>,
                                 {btree,<0.2218.0>,
                                     {1565615,{341,[0]}},
                                     #Fun<couch_btree.3.83553141>,
                                     #Fun<couch_btree.4.30790806>,
                                     #Fun<couch_view.less_json_keys.2>,
                                     #Fun<couch_view_group.11.46347864>},
                                 [{<<"_temp">>,
                                   <<"...">>}]},
                             []}],
                           [],[]]},
                      {couch_view_updater,view_insert_query_results,4},
                      {couch_view_updater,process_doc,4},
                      {couch_view_updater,'-update/2-fun-0-',6},
                      {couch_btree,stream_kv_node2,7},
                      {couch_btree,stream_kp_node,6},
                      {couch_btree,fold,5},
                      {couch_view_updater,update,2}]]},

This problem occurs regardless of what the map and reduce functions do. It seems to be based on the time it takes to generate the view, or on whatever causes the checkpoints to get written out.

I did some investigation into the problem by adding a lot of LOG_INFO statements throughout the code.  I was able to determine the following:
  
   * The Erlang view server process is held by the view updater for the entire duration of the indexing.
   * However, after the first checkpoint is hit and the progress is written out, a reduce call is made to the Erlang view server. Once this completes, the view server is released back to the cache using ret_os_process.
   * When the next reduce cycle occurs, the same Erlang view server is returned by get_os_process, but it is first sent a reset message, which clears all the functions in the view server's state.
   * When the next map cycle starts, the view updater uses the same handle to the Erlang view server that it had in the beginning. It assumes that the server's state is unchanged, but the server has been reset, so it holds no view functions.  This causes the above error when the updater then attempts to write out the result of a view function which doesn't exist in the server.

I was able to fix this problem by modifying line 139 of couch_view_updater.erl from this:

   {[], Group2, ViewEmptyKeyValues, []}

to this:

 {[], Group2#group{query_server=nil}, ViewEmptyKeyValues, []}

This removes the view updater's handle to the Erlang server proc, forcing it to get or create a new one for each map cycle and to set up the view functions within the server again.  I don't know if this is the right way to do it, or if it has any bad side effects, but it does at least prevent the crash and allows the indexing to complete correctly.
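
The sequence described above can be sketched with a toy Python model. This is not CouchDB code. The classes ViewServer and ProcessCache, the function build_index, and its drop_handle flag are hypothetical names invented for illustration. Only get_os_process, ret_os_process, and the reset step come from the report. The model assumes a single view and that every checkpoint triggers a reduce that borrows the same server from the cache.

```python
class ViewServer:
    """Toy stand-in for one Erlang view server process (hypothetical)."""

    def __init__(self):
        self.map_funs = []

    def reset(self):
        # Clears every loaded function, as the report describes.
        self.map_funs = []

    def add_fun(self, fun):
        self.map_funs.append(fun)

    def map_doc(self, doc):
        # One result list per loaded map function.
        return [fun(doc) for fun in self.map_funs]


class ProcessCache:
    """Toy cache that always hands back the same server, reset first."""

    def __init__(self):
        self.server = ViewServer()

    def get_os_process(self):
        self.server.reset()
        return self.server

    def ret_os_process(self, server):
        pass  # the server simply stays in the cache


def build_index(docs, map_funs, cache, batch_size, drop_handle):
    """Map docs in batches; each batch ends with a checkpoint and a reduce."""
    handle = None
    totals = []
    for start in range(0, len(docs), batch_size):
        if handle is None:
            # Acquire a server and load the view functions into it.
            handle = cache.get_os_process()
            for fun in map_funs:
                handle.add_fun(fun)
        emitted = []
        for doc in docs[start:start + batch_size]:
            results = handle.map_doc(doc)
            if len(results) != len(map_funs):
                # Empty results against a non-empty view list.
                raise RuntimeError("map results do not match the view list")
            emitted.extend(results[0])  # single view assumed
        # Checkpoint: the reduce borrows the server, which resets it.
        reducer = cache.get_os_process()
        totals.append(sum(emitted))  # stand-in for the real reduce call
        cache.ret_os_process(reducer)
        if drop_handle:
            handle = None  # models setting query_server=nil
    return sum(totals)


def word_count(doc):
    # Emits one value per word in the document.
    return [1 for _ in doc.split()]
```

With drop_handle=False the second batch fails, because the handle kept by the updater points at a server that the reduce step has already reset. With drop_handle=True the functions are reloaded before every map cycle and the run completes, at the cost of one extra setup per checkpoint.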

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (COUCHDB-567) Erlang View with Reduce Fails on Large Number of documents

Posted by "Paul Joseph Davis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776332#action_12776332 ] 

Paul Joseph Davis commented on COUCHDB-567:
-------------------------------------------

Sean,

Good detective work. If you can provide a script to reproduce this, that'd be most helpful. Hopefully I can dig into this tomorrow or Thursday.



[jira] Commented: (COUCHDB-567) Erlang View with Reduce Fails on Large Number of documents

Posted by "Sean Geoghegan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776811#action_12776811 ] 

Sean Geoghegan commented on COUCHDB-567:
----------------------------------------

Sorry, that should be 0.10.0 of course.

> Erlang View with Reduce Fails on Large Number of documents
> ----------------------------------------------------------
>
>                 Key: COUCHDB-567
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-567
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 0.10
>            Reporter: Sean Geoghegan
>         Attachments: generate-data.rb, view.erl
>
>


[jira] Commented: (COUCHDB-567) Erlang View with Reduce Fails on Large Number of documents

Posted by "Paul Joseph Davis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776861#action_12776861 ] 

Paul Joseph Davis commented on COUCHDB-567:
-------------------------------------------

I didn't realize the number of checkpoints was important. I ported your test over directly, and then reduced the numbers to make it not take so long to run. I'm testing on the 0.10.1 branch now to see if I can spot the fix.

You're right that we can't backport all of the changes to 0.10.x, but hopefully the cause isn't too deep. I had it in my head to check couch_view_updater this afternoon, but somewhere along the line I blanked on that.

More info shortly.



[jira] Commented: (COUCHDB-567) Erlang View with Reduce Fails on Large Number of documents

Posted by "Paul Joseph Davis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776798#action_12776798 ] 

Paul Joseph Davis commented on COUCHDB-567:
-------------------------------------------

Sean,

I added a port of your Ruby test to the erlang_views.js test in Futon and can't get trunk to reproduce this error you're having. Can you try svn up'ing and seeing if the error is triggered? I also did some logging and locally the process exchange with couch_query_servers goes as it should.



[jira] Commented: (COUCHDB-567) Erlang View with Reduce Fails on Large Number of documents

Posted by "Sean Geoghegan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776831#action_12776831 ] 

Sean Geoghegan commented on COUCHDB-567:
----------------------------------------

Paul,

I tried your test case on 0.10.0 and it passed until I increased the number of words in each document to 1000 (at 100, checkpoints weren't triggered, so the error didn't occur).  So you might want to increase that number.

I also tried with trunk and it works fine: no errors with either number of words in each document. So that is good.  I think the fix might have been in couch_view_updater.erl, which is where I had to make my initial change to get it working.  There appear to be a lot of changes in this file between 0.10.0 and trunk, so backporting is probably not realistic.

Thanks for your help with this.



[jira] Resolved: (COUCHDB-567) Erlang View with Reduce Fails on Large Number of documents

Posted by "Paul Joseph Davis (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paul Joseph Davis resolved COUCHDB-567.
---------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.10.1
         Assignee: Paul Joseph Davis

Fixed as of 835281.

Sean, can you try this on for size? This is only in the 0.10.x branch so you'll have to pull that from SVN and build. The 0.10.1 release will be in a day or two so this is mostly just a remote check that it's fixed.

Too tired to write anything of interest, so here's the commit message:

Fixes COUCHDB-567 error with ErlView reduces.

Apparently we never tested ErlView reductions on 0.10.x? As far as I can tell they never should have worked. It was exactly as Sean Geoghegan described, in that the interleaved calls to reduce were trouncing the mapper state.

Trunk uses the two process update scheme so wouldn't be affected by this trouncing. This patch is a stop gap to make ErlViews work. I've tested with the Futon test patch and an updated query_server_spec.rb to revalidate things are working.

Fixing this bug has made it quite apparent that the query server specs need to be drastically rethought. I spent quite a bit of time tracking down that the specs were actually testing that the subprocess died. ErlViews causing the host process to die would be very bad. In the future I'd like to see script/response sets and file structures for function definitions.


> Erlang View with Reduce Fails on Large Number of documents
> ----------------------------------------------------------
>
>                 Key: COUCHDB-567
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-567
>             Project: CouchDB
>          Issue Type: Bug
>    Affects Versions: 0.10
>            Reporter: Sean Geoghegan
>            Assignee: Paul Joseph Davis
>             Fix For: 0.10.1
>
>         Attachments: generate-data.rb, view.erl
>
>
> I have been having a problem with running Erlang views over a large dataset.  Whenever the indexer goes to checkpoint it's process the following error occurs:
> ** Last message in was {'EXIT',<0.2220.0>,
>                         {function_clause,
>                          [{couch_view_updater,view_insert_doc_query_results,
>                            [{doc,<<"73956fdca62c384849a3313e6c48b7ed">>,...
>                            [],
>                            [{{view,0,
>                                  [<<"_temp">>],
>                                  <<"...">>,
>                                  {btree,<0.2218.0>,
>                                      {1565615,{341,[0]}},
>                                      #Fun<couch_btree.3.83553141>,
>                                      #Fun<couch_btree.4.30790806>,
>                                      #Fun<couch_view.less_json_keys.2>,
>                                      #Fun<couch_view_group.11.46347864>},
>                                  [{<<"_temp">>,
>                                    <<"...">>}]},
>                              []}],
>                            [],[]]},
>                       {couch_view_updater,view_insert_query_results,4},
>                       {couch_view_updater,process_doc,4},
>                       {couch_view_updater,'-update/2-fun-0-',6},
>                       {couch_btree,stream_kv_node2,7},
>                       {couch_btree,stream_kp_node,6},
>                       {couch_btree,fold,5},
>                       {couch_view_updater,update,2}]]},
> This problem occurs regardless of the functionality of the map and reduce functions, it seems to based on the time it takes to generate, or whatever causes the checkpoints to get written out.
> I did some investigation into the problem by adding alot of LOG_INFO statements throughout the code.  I was able to determine the following:
>   
>    * the Erlang View process is being held on to by the view updater for the entire duration of the indexing, 
>    * however after the first checkpoint is hit and the progress is written out, a reduce call is made to the erlang view server, once this completes the view server is released back to the cache using ret_os_process. 
>    * when the next reduce cycle occurs the same erlang view server is returned by get_os_process but it is first sent a reset message which clears all the functions in the view servers state.
>    * when the next map cycles starts the view updater uses the same handle to the erlang view server it had in the beginning. It assumes that the servers state is the same however it has been reset so there are no view functions in the view server.  This causes the above error when it then attempts to write out the result of a view function which doesn't exist in the server.
> I was able to fix this problem by modifying line 139 of couch_view_updater.erl from this:
>    {[], Group2, ViewEmptyKeyValues, []}
> to this:
>  {[], Group2#group{query_server=nil}, ViewEmptyKeyValues, []}
> This removes the view updater's handle to the Erlang server process, forcing it to get or create a new one for each map cycle and to set up the view functions within the server again. I don't know if this is the right way to do it, or if it has any bad side effects, but it does at least prevent the crash and allows the indexing to complete correctly.
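
The sequence reported in the bullets above can be imitated with a small toy model. The Python sketch below is not CouchDB code: ToyViewServer, ToyPool, insert_results and run_indexing are hypothetical names, and the pool only mimics the reported behaviour of get_os_process and ret_os_process (a reset on checkout that clears the server's functions). It assumes the map call returns one result list per loaded function, so a reset server returns an empty list; that seems consistent with the empty list passed alongside a non-empty view list in the trace, but it is an assumption.

```python
class ToyViewServer:
    """Stand-in for a view server process that holds loaded map functions."""

    def __init__(self):
        self.map_funs = []

    def reset(self):
        # Clears every loaded function, as the reset message is reported to do.
        self.map_funs = []

    def add_fun(self, fun):
        self.map_funs.append(fun)

    def map_doc(self, doc):
        # One result list per loaded function; none are loaded after a reset.
        return [fun(doc) for fun in self.map_funs]


class ToyPool:
    """Hands out a single shared server and resets it on every checkout."""

    def __init__(self):
        self._server = ToyViewServer()

    def get(self):
        self._server.reset()
        return self._server

    def ret(self, server):
        pass  # the server simply stays in the pool for the next get()


def insert_results(results, views):
    # Assumption: each view must receive exactly one result list.
    if len(results) != len(views):
        raise ValueError(
            "got %d result lists for %d views" % (len(results), len(views)))
    return list(zip(views, results))


def run_indexing(docs, checkpoint_every, drop_handle_at_checkpoint):
    pool = ToyPool()
    views = ["_temp"]
    map_fun = lambda doc: [(doc["word"], 1)]
    handle = None
    indexed = 0
    for i, doc in enumerate(docs, start=1):
        if handle is None:
            # Fresh checkout: the updater loads its map function.
            handle = pool.get()
            handle.add_fun(map_fun)
        insert_results(handle.map_doc(doc), views)
        indexed += 1
        if i % checkpoint_every == 0:
            # The reduce at the checkpoint borrows the same server; the
            # reset on checkout wipes the map function behind the handle.
            reducer = pool.get()
            pool.ret(reducer)
            if drop_handle_at_checkpoint:
                handle = None  # analogous to setting query_server=nil
    return indexed
```

With ten documents, run_indexing(docs, 3, False) raises on the fourth document because the handle kept by the updater now points at a server with no functions. Passing True for the last argument, or using a checkpoint interval larger than the document count, lets the run complete.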

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (COUCHDB-567) Erlang View with Reduce Fails on Large Number of documents

Posted by "Sean Geoghegan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776297#action_12776297 ] 

Sean Geoghegan commented on COUCHDB-567:
----------------------------------------

Actually, my fix breaks when the reduce function is removed.

I'm not sure how to go about fixing this, but I can probably provide a test case if needed.



[jira] Commented: (COUCHDB-567) Erlang View with Reduce Fails on Large Number of documents

Posted by "Paul Joseph Davis (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776817#action_12776817 ] 

Paul Joseph Davis commented on COUCHDB-567:
-------------------------------------------

Sean,

I don't see anything in the SVN logs that affected couch_os_process.erl, couch_native_process.erl or couch_query_servers.erl that's not in the about-to-be-released 0.10.1, and I'm pretty sure that all of these commits were in 0.10.0. If you can't reproduce on trunk, you should be able to just copy that JS test file to your Futon directory and see if it can be triggered on 0.10.0, though that would be most odd.



[jira] Commented: (COUCHDB-567) Erlang View with Reduce Fails on Large Number of documents

Posted by "Sean Geoghegan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776870#action_12776870 ] 

Sean Geoghegan commented on COUCHDB-567:
----------------------------------------

I don't know if the number of checkpoints is important, but with only 100 elements in the words array I wasn't getting any checkpoints triggered, so the error wasn't happening.  There might be some number under 1000 that fails; I can find it if needed.



[jira] Commented: (COUCHDB-567) Erlang View with Reduce Fails on Large Number of documents

Posted by "Sean Geoghegan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/COUCHDB-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776810#action_12776810 ] 

Sean Geoghegan commented on COUCHDB-567:
----------------------------------------

Thanks Paul, I'll try that out.  Note that I only experienced the error on version 0.10.0 (on both Linux and Windows).  I'll confirm whether it occurs on the trunk version.



[jira] Updated: (COUCHDB-567) Erlang View with Reduce Fails on Large Number of documents

Posted by "Sean Geoghegan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/COUCHDB-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Geoghegan updated COUCHDB-567:
-----------------------------------

    Attachment: view.erl
                generate-data.rb

I've attached a Ruby script that generates some test data, and a map and reduce function for that data which trigger this error.
