You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "kevingurney (via GitHub)" <gi...@apache.org> on 2023/03/14 19:24:10 UTC

[GitHub] [arrow] kevingurney opened a new pull request, #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

kevingurney opened a new pull request, #34563:
URL: https://github.com/apache/arrow/pull/34563

   ### Rationale for this change
   
   <!--
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
   -->
   
   This pull request is a follow up to [this mailing list discussion](https://lists.apache.org/thread/2kgxbs54dw4wvcwrthzvb1ljqcvnrv7h) about integrating [`mathworks/libmexclass`](https://github.com/mathworks/libmexclass/) with the MATLAB Interface to Arrow code base.
   
   We've spent the last few months working on building `libmexclass` from scratch in order to ease development of the MATLAB Interface to Arrow. `libmexclass` essentially provides a way to connect MATLAB classes with corresponding C++ classes using an approach inspired by the [Proxy Design Pattern](https://en.wikipedia.org/wiki/Proxy_pattern).
   
   Our hope is that using `libmexclass` will enable us to more easily build out an object-oriented MATLAB Interface to Arrow memory by wrapping corresponding Arrow C++ classes and "proxying" method calls on these MATLAB objects to the underlying Arrow C++ objects.
   
   ### What changes are included in this PR?
   
   1. Modifications were made to the CMake build system for the MATLAB interface to use `libmexclass` under the hood. This includes the addition of a new build flag `-D MATLAB_ARROW_INTERFACE = ON | OFF` which toggles building the new code that uses `libmexclass` under the hood.
   2. To illustrate the basic usage of `libmexclass`, we have added one new MATLAB class `arrow.array.Float64Array`. This class allows users to construct an Arrow array with logical type `Float64` from a MATLAB `double` array with zero data copies. Under the hood, a `Proxy` wraps and bounds the lifetime of the underlying Arrow C++ `Float64Array` object. In addition, this `Proxy` is responsible for delegating method calls on an `arrow.array.Float64Array` to the corresponding Arrow C++ `Float64Array`.
   
   ### Are these changes tested?
   
   Yes, these changes have been tested on Linux, macOS, and Windows.
   
   1. We've modified the MATLAB CI GitHub Actions workflow (`.github/workflows/matlab.yml`) to build the new  `arrow.array.Float64Array` code using `libmexclass`. This includes passing `-D MATLAB_ARROW_INTERFACE=ON` to the `cmake` command call in `ci/scripts/matlab_build.sh`.
   2. We've added a new basic MATLAB test `test/arrow/array/tFloat64Array.m` which tests for successful construction of an `arrow.array.Float64Array`.
   3. We've confirmed that the `Dev` CI workflow linting checks are all passing and appropriate Apache license headers have been added.
   4. We've manually tested creation, deletion, and assignment of multiple `arrow.array.Float64Array` instances on Linux, macOS, and Windows with a variety of different MATLAB `double` arrays.
   
   ### Are there any user-facing changes?
   
   Yes, there is now a public class named `arrow.array.Float64Array` which is added to the MATLAB Path.
   
   Included below is a simple example of creating two different `arrow.array.Float64Array` objects in MATLAB:
   
   ```matlab
   >> A = arrow.array.Float64Array([1, 2, 3])            
   
   A = 
   
   [
     1,
     2,
     3
   ]
   
   >> random = arrow.array.Float64Array(rand(1, 10, 100))
   
   random = 
   
   [
     0.6311887342690112,
     0.355073651878849,
     0.9970032716066477,
     0.22417149898312716,
     0.6524510729686149,
     0.6049906419082594,
     0.38724543148313495,
     0.14218715929050407,
     0.025134985710203117,
     0.4211122537652413,
     ...
     0.6228027906591304,
     0.7966246853083961,
     0.74587490154065,
     0.12553623135481973,
     0.8223940067590204,
     0.02515050142850217,
     0.41442888092403163,
     0.7314074679729372,
     0.7813740002759628,
     0.367285915131369
   ]
   
   ```
   
   **Note**: This is an early stage PR, so the naming scheme `arrow.array.<Type>Array` might change in the future.
   
   ### Future Directions
   
   1. Currently, the "old" `featherread`/`featherwrite` code is still being built by CMake and installed to the specified `CMAKE_INSTALL_PREFIX`. This slows down the build process and complicates the build system logic. In addition, these Feather functions only support reading and writing a subset of Feather V1 files. We should considering disabling building of this legacy code by default or removing it entirely. In the long term, when we have more Arrow types in MATLAB (e.g. `arrow.Table`, `arrow.Schema`, `arrow.RecordBatch`, etc.) we should consider re-implementing this functionality in terms of the new APIs.
   2. We would like to start adding more numeric array classes like (`arrow.array.UInt8Array`, `arrow.array.Int64Array`, etc.).
   3. We only added one very basic test for `arrow.array.Float64Array` in this pull request. We should add a lot more tests as the APIs develop to test things like indexing, copying, slicing, etc.
   4. We don't have any documentation for `arrow.array.Float64Array` right now. In general, we should start adding detailed documentation for the new APIs as we start to implement them.
   5. Lots more! This is just the beginning of building out the MATLAB Interface to Arrow APIs. We plan on creating GitHub issues for tracking work as we go.
   
   ### Notes
   
   1. Creating `libmexclass` and integrating it with the Arrow code base was a team effort! Thank you to @sreeharihegden, @lafiona, @sgilmore10, @jhughes-mw, and others at @MathWorks for their help with this pull request!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1147943325


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   @kou - we are working on implementing these changes, but we wanted to check with you about integrating your suggested C++ template approach into `mathworks/libmexclass`.
   
   From a copyright attribution standpoint, would you like to make a pull request to `libmexclass` yourself with these changes (**Note**: this would require you to sign a [MathWorks CLA](https://github.com/mathworks/libmexclass/blob/main/CONTRIBUTING.md))? Or, would you prefer if we just made the changes ourselves (without any further explicit attribution)?
   
   Let us know what makes the most sense to you.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1175596555


##########
matlab/CMakeLists.txt:
##########
@@ -317,6 +324,51 @@ matlab_add_mex(R2018a
 
 target_include_directories(mexcall PRIVATE ${CPP_SOURCE_DIR})
 
+# ARROW_SHARED_LIB
+# On Windows, this will be ARROW_HOME/bin/arrow.dll and on Linux and macOS, it is the arrow.so/dylib in the newly built arrow_shared library.
+if(NOT Arrow_FOUND)
+  message(STATUS "ARROW_SHARED_LIB will be set using IMPORTED_LOCATION value when building."
+  )
+  get_target_property(ARROW_SHARED_LIB arrow_shared IMPORTED_LOCATION)
+else()
+  # If not building Arrow, ARROW_SHARED_LIB derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake would be non-empty.
+  message(STATUS "ARROW_SHARED_LIB: ${ARROW_SHARED_LIB}")
+endif()
+
+# ARROW_LINK_LIB
+# On Windows, we use the arrow.lib for linking arrow_matlab against the Arrow C++ library.
+# The location of arrow.lib is previously saved in IMPORTED_IMPLIB.
+if(WIN32)
+  # If not building Arrow, IMPORTED_IMPLIB will be empty.
+  # Then set ARROW_LINK_LIB to ARROW_IMPORT_LIB which would have been derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake. This will avoid the ARROW_LINK_LIB set to NOTFOUND error.
+  # The ARROW_IMPORT_LIB should be ARROW_HOME/lib/arrow.lib on Windows.
+  if(NOT Arrow_FOUND)
+    message(STATUS "ARROW_LINK_LIB will be set using IMPORTED_IMPLIB value when building."
+    )
+    get_target_property(ARROW_LINK_LIB arrow_shared IMPORTED_IMPLIB)
+  else()
+    set(ARROW_LINK_LIB "${ARROW_IMPORT_LIB}")
+    message(STATUS "Setting ARROW_LINK_LIB to ARROW_IMPORT_LIB: ${ARROW_IMPORT_LIB}, which is derived from the ARROW_HOME provided."
+    )
+  endif()
+else()
+  # On Linux and macOS, it is the arrow.so/dylib in the newly built arrow_shared library used for linking.
+  # On Unix, this is the same as ARROW_SHARED_LIB.
+  message(STATUS "Setting ARROW_LINK_LIB to ARROW_SHARED_LIB as they are same on Unix.")
+  set(ARROW_LINK_LIB "${ARROW_SHARED_LIB}")
+endif()
+
+# ARROW_INCLUDE_DIR should be set so that header files in the include directory can be found correctly.
+# The value of ARROW_INCLUDE_DIR should be ARROW_HOME/include on all platforms.
+if(NOT Arrow_FOUND)
+  message(STATUS "ARROW_INCLUDE_DIR will be set using INTERFACE_INCLUDE_DIRECTORIES value when building."
+  )
+  get_target_property(ARROW_INCLUDE_DIR arrow_shared INTERFACE_INCLUDE_DIRECTORIES)
+else()
+  # If not building Arrow, ARROW_INCLUDE_DIR derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake would be non-empty.
+  message(STATUS "ARROW_INCLUDE_DIR: ${ARROW_INCLUDE_DIR}")
+endif()
+

Review Comment:
   No problem at all! Admittedly, the logic is a bit confusing since we currently support building from source.
   
   Ideally, it would be great if we could refactor the build system in the future to use `FetchContent` when building from source (instead of `ExternalProject_Add`). However, it seems like `FetchContent` isn't currently supported based on #32234.
   
   Thanks for letting us know that `arrow_shared` is deprecated. We agree that we should eventually move to using `Arrow::arrow_shared`. This is something that we can work on for a future pull request.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1140561514


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   > libmexclass.so can be shared from multiple MATALB modules. (Module A and module B can use the same libmexclass.so.)
   
   Correct!
   
   > libmexclassclient.so can't be shared from multiple MATAB modules. (Each module must build its libmexclassclient.so.)
   
   Correct!
   
   > It seems that we can just use C++ template feature for it instead of macros when we build libmexclient.so in our CMake.
   
   In the current architecture using C++ templates to instantiate the correct client subclass of `libmexclass:proxy::Factory` wouldn't be possible. This is because the client is *not* the one who builds / defines the MEX gateway function. Instead, the [`mex_gateway_function.cpp` "template code"](https://github.com/mathworks/libmexclass/blob/main/libmexclass/cpp/source/libmexclass/mex/mex_gateway_function.cpp) is "injected" with the name of the client's `Factory` subclass and header file name. In short, the client does *not* write the MEX gateway function code - they just "customize it" using `target_compile_definitions`.
   
   Don't hesitate to let me know if this is still unclear in any way.
   
   > BTW, who does instantiate MexFunction defined in mex_gateway_function.cpp?
   
   Same as my previous comment above. The `libmexclass` build system (i.e. *not* the client) is responsible for taking the `mex_gateway_function.cpp` "template code" and "injecting" the client's `Factory` subclass and header file name during the build process. This results in a single binary file `gateway.mexa64` that can be called from MATLAB as `gateway` like any other MATLAB function (note - this is [how MEX functions work, in general](https://www.mathworks.com/help/matlab/matlab_external/c-mex-functions.html)).
   
   > "The client code" is [mex_gateway_function.cpp](https://github.com/mathworks/libmexclass/blob/484ade97176c611cbb7056b941df99f1f3a6a0dc/libmexclass/cpp/source/libmexclass/mex/mex_gateway_function.cpp), right? Or our CMake code?
   
   Sorry for the confusion. I think I used "client code" to mean two different things in my previous comment. To clarify - there is the "client C++ `Proxy` and `Factory` code" - which is built into `libmexclassclient.so`. Then, there is also the "client CMake code" - which I intended to mean the Arrow-MATLAB CMakeLists.txt.
   
   I hope this clears things up.
   
   > "a few macros" are C/C++'s (C preprocessor's) macros, right? Or CMake's macro?
   
   CMake macros.
   
   > "the appropriate target and link dependencies" is for CMake, right?
   
   Correct. This is for CMake targets.
   
   > Calling add_library(), target_link_libraries() and so on for libmexclient.so and gateway.mexa64 are OK for me. Because it's one of natural CMake usages.
   
   Excellent! This sounds good.
   
   > "a few helper macros" are CMake's macros, right?
   
   Yep, these would be CMake helper macros which would hopefully make it easy to connect `libmexclass` targets with client-defined targets/options.
   
    > Can we use C++ template feature instead of the approach? I may provide a code example once I understand who instantiates MexFunction.
   
   See my previous answer to this. With the current approach, the client does *not* write the source code for the MEX gateway function. Instead, they simply use the predefined "template code". If we wanted to use C++ templates for this, then the client would have to be responsible for writing the MEX gateway code from scratch, as well as setting up the appropriate CMake options on the MEX target. This seems like it may be a more error prone approach and would lead to each client rewriting the exact same MEX gateway code (aside from the one line where they use their subclass of `Factory`).
   
   > I don't have a strong opinion. I think that the latter approach (installing libmexclass and use it as a CMake package) is better because multiple MATLAB modules can share one libmexclass installation. But I'm OK with the former approach (using FetchContent).
   
   Thanks for providing your input on this. Your reasoning for the package approach being more preferable makes a lot of sense. On the other hand, it seems like the `FetchContent` approach might be a bit simpler to set up.
   
   I think, if it is OK, we could start with the `FetchContent` approach to hopefully unblock development more quickly and then consider adding package support as a future improvement to `libmexclass`. It seems like the advantages of the package approach might also be more pertinent in the future when we may have multiple clients of `libmexclass` loaded into MATLAB at the same time.
   
   ---
   
   I hope I answered all of your questions clearly. If anything else is still not making sense, please let me know!
   
   If the approach of using a combination of `FetchContent` and the "template MEX function code" (i.e. [`mex_gateway_function.cpp`](https://github.com/mathworks/libmexclass/blob/main/libmexclass/cpp/source/libmexclass/mex/mex_gateway_function.cpp)) with "code injection" via `target_compile_definitions` seems like a reasonable approach, then we would be happy to start working on implementing that in `libmexclass`.
   
   Thanks again!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1164681643


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   **Update**: We've spent several days worth of effort trying to get to the root cause of what's causing the crash and have determined the following:
   
   1. The crash doesn't appear to be related to a breaking change in the Arrow C++ libraries. What we thought we were seeing before was actually a false positive. It turns out that we were accidentally building in `Debug` mode when qualifying the changes on Windows, which was triggering the crash. The crash occurs regardless of whether a new or old version of the Arrow C++ libraries are used. In addition, the crash does **not** occur when building in `Release` mode.
   
   2. Unfortunately, it seems like the crash is being caused by some kind of memory corruption. We stepped through the code in the Visual Studio debugger and the call to `array->ToString()` in `float64_array.cc` returns the expected `std::string` value representing the contents of the `arrow::Array`. However, right after `ToString()` returns, the returned `std::string` value memory becomes filled with random characters, and then, MATLAB crashes. At this point, we have yet to identify the source of the memory corruption.
   
   3. We tried replacing the call `array->ToString()` with other `arrow::Array` APIs, and MATLAB did not crash. It's not clear why `ToString()` seems to be showing the memory corruption.
    
   4. We have been experimenting with instrumenting the code using [`AddressSanitizer`](https://github.com/google/sanitizers/wiki/AddressSanitizer) to determine the root cause of the apparent memory corruption.
   
   ---
   
   In order to unblock this pull request, it may make sense to treat this as an independent issue. Since the crash doesn't seem to occur in `Release` mode on Windows, the crash shouldn't prevent further progress on the MATLAB Interface to Arrow for now.
   
   We will work on addressing the remaining comments on this PR while trying to debug this crash further in parallel.
   
   We apologize for the delay in determining the root cause of this issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1170609078


##########
matlab/CMakeLists.txt:
##########


Review Comment:
   **Update**: I've captured the desire to investigate lowering the minimum required CMake version in #35221.
   
   Marking this conversation as resolved for now. Feel free to re-open this conversation if you feel that this is a more pressing issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1174189787


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,109 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# ----------------------------------
+# Configure libmexclass FetchContent
+# ----------------------------------
+
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_FETCH_CONTENT_NAME libmexclass)
+# TODO: Consider using SSH URL for the Git Repository when
+# libmexclass is accessible for CI without permission issues.
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_FETCH_CONTENT_GIT_REPOSITORY "https://github.com/mathworks/libmexclass.git")
+# Use a specific Git commit hash to avoid libmexclass version changing unexpectedly.
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_FETCH_CONTENT_GIT_TAG "44c15d0")
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_FETCH_CONTENT_SOURCE_SUBDIR "libmexclass/cpp")
+
+# ------------------------------------------
+# Configure libmexclass Client Proxy Library
+# ------------------------------------------
+
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_LIBRARY_NAME arrowproxy)
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_LIBRARY_ROOT_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp")
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy")
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy")
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_LIBRARY_INCLUDE_DIRS ${MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_LIBRARY_ROOT_INCLUDE_DIR}
+                                                               ${MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_INCLUDE_DIR}
+                                                               ${MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_FACTORY_INCLUDE_DIR}
+                                                               ${ARROW_INCLUDE_DIRS})

Review Comment:
   Can we remove `ARROW_INCLUDE_DIRS`?
   
   ```suggestion
                                                                  ${MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_FACTORY_INCLUDE_DIR})
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou merged pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou merged PR #34563:
URL: https://github.com/apache/arrow/pull/34563


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1148088727


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   I don't need a copyright attribution for this! So could you do it yourself without any further explicit attribution?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1164681643


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   **Update**: We've spent several days worth of effort trying to get to the root cause of what's causing the crash and have determined the following:
   
   1. The crash doesn't appear to be related to a breaking change in the Arrow C++ libraries. What we thought we were seeing before was actually a false positive. It turns out that we were accidentally building in `Debug` mode when qualifying the changes on Windows, which was triggering the crash. The crash occurs regardless of whether a new or old version of the Arrow C++ libraries are used. In addition, the crash does **not** occur when building in `Release` mode.
   
   2. Unfortunately, it seems like the crash is being caused by some kind of memory corruption. We stepped through the code in the Visual Studio debugger and the call to `array->ToString()` in `float64_array.cc` returns the expected `std::string` value representing the contents of the `arrow::Array`. However, right after `ToString()` returns, the returned `std::string` value memory becomes filled with random characters, and then, MATLAB crashes. At this point, we have yet to identify the source of the memory corruption.
    
   3. We have been experimenting with instrumenting the code using [`AddressSanitizer`](https://github.com/google/sanitizers/wiki/AddressSanitizer) to determine the root cause of the apparent memory corruption.
   
   ---
   
   In order to unblock this pull request, it may make sense to treat this as an independent issue. Since the crash doesn't seem to occur in `Release` mode on Windows, the crash shouldn't prevent further progress on the MATLAB Interface to Arrow for now.
   
   We will work on addressing the remaining comments on this PR while trying to debug this crash further in parallel.
   
   We apologize for the delay on getting to the root cause of this issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1174189613


##########
matlab/CMakeLists.txt:
##########
@@ -317,6 +324,51 @@ matlab_add_mex(R2018a
 
 target_include_directories(mexcall PRIVATE ${CPP_SOURCE_DIR})
 
+# ARROW_SHARED_LIB
+# On Windows, this will be ARROW_HOME/bin/arrow.dll and on Linux and macOS, it is the arrow.so/dylib in the newly built arrow_shared library.
+if(NOT Arrow_FOUND)
+  message(STATUS "ARROW_SHARED_LIB will be set using IMPORTED_LOCATION value when building."
+  )
+  get_target_property(ARROW_SHARED_LIB arrow_shared IMPORTED_LOCATION)
+else()
+  # If not building Arrow, ARROW_SHARED_LIB derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake would be non-empty.
+  message(STATUS "ARROW_SHARED_LIB: ${ARROW_SHARED_LIB}")
+endif()
+
+# ARROW_LINK_LIB
+# On Windows, we use the arrow.lib for linking arrow_matlab against the Arrow C++ library.
+# The location of arrow.lib is previously saved in IMPORTED_IMPLIB.
+if(WIN32)
+  # If not building Arrow, IMPORTED_IMPLIB will be empty.
+  # Then set ARROW_LINK_LIB to ARROW_IMPORT_LIB which would have been derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake. This will avoid the ARROW_LINK_LIB set to NOTFOUND error.
+  # The ARROW_IMPORT_LIB should be ARROW_HOME/lib/arrow.lib on Windows.
+  if(NOT Arrow_FOUND)
+    message(STATUS "ARROW_LINK_LIB will be set using IMPORTED_IMPLIB value when building."
+    )
+    get_target_property(ARROW_LINK_LIB arrow_shared IMPORTED_IMPLIB)
+  else()
+    set(ARROW_LINK_LIB "${ARROW_IMPORT_LIB}")
+    message(STATUS "Setting ARROW_LINK_LIB to ARROW_IMPORT_LIB: ${ARROW_IMPORT_LIB}, which is derived from the ARROW_HOME provided."
+    )
+  endif()
+else()
+  # On Linux and macOS, it is the arrow.so/dylib in the newly built arrow_shared library used for linking.
+  # On Unix, this is the same as ARROW_SHARED_LIB.
+  message(STATUS "Setting ARROW_LINK_LIB to ARROW_SHARED_LIB as they are same on Unix.")
+  set(ARROW_LINK_LIB "${ARROW_SHARED_LIB}")
+endif()
+
+# ARROW_INCLUDE_DIR should be set so that header files in the include directory can be found correctly.
+# The value of ARROW_INCLUDE_DIR should be ARROW_HOME/include on all platforms.
+if(NOT Arrow_FOUND)
+  message(STATUS "ARROW_INCLUDE_DIR will be set using INTERFACE_INCLUDE_DIRECTORIES value when building."
+  )
+  get_target_property(ARROW_INCLUDE_DIR arrow_shared INTERFACE_INCLUDE_DIRECTORIES)
+else()
+  # If not building Arrow, ARROW_INCLUDE_DIR derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake would be non-empty.
+  message(STATUS "ARROW_INCLUDE_DIR: ${ARROW_INCLUDE_DIR}")
+endif()
+

Review Comment:
   Ah, sorry. I misread the implementation.
   I thought all individual target properties are used directly for building a library because https://github.com/apache/arrow/pull/34563/files#diff-282bdaf7afd3f3d7974e3ab41857a65e3eea6a566107ea7d209ba3fec72e2e77R43 uses `ARROW_INCLUDE_DIRS`.
   But I noticed that `ARROW_INCLUDE_DIRS` is only used directly and we can remove it because `arrow_shared` is specified: https://github.com/apache/arrow/pull/34563/files#diff-282bdaf7afd3f3d7974e3ab41857a65e3eea6a566107ea7d209ba3fec72e2e77R81
   
   FYI: `arrow_shared` is deprecated in the current `find_package(Arrow)`. `Arrow::arrow_shred` is recommended. We can use `Arrow::arrow_shared` instead of `arrow_shared` for `ExternalProject_Add()`-ed Apache Arrow C++. But it'll be out-of-scope of this pull request.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sreeharihegden commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "sreeharihegden (via GitHub)" <gi...@apache.org>.
sreeharihegden commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1175680908


##########
matlab/CMakeLists.txt:
##########
@@ -317,6 +324,51 @@ matlab_add_mex(R2018a
 
 target_include_directories(mexcall PRIVATE ${CPP_SOURCE_DIR})
 
+# ARROW_SHARED_LIB
+# On Windows, this will be ARROW_HOME/bin/arrow.dll and on Linux and macOS, it is the arrow.so/dylib in the newly built arrow_shared library.
+if(NOT Arrow_FOUND)
+  message(STATUS "ARROW_SHARED_LIB will be set using IMPORTED_LOCATION value when building."
+  )
+  get_target_property(ARROW_SHARED_LIB arrow_shared IMPORTED_LOCATION)
+else()
+  # If not building Arrow, ARROW_SHARED_LIB derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake would be non-empty.
+  message(STATUS "ARROW_SHARED_LIB: ${ARROW_SHARED_LIB}")
+endif()
+
+# ARROW_LINK_LIB
+# On Windows, we use the arrow.lib for linking arrow_matlab against the Arrow C++ library.
+# The location of arrow.lib is previously saved in IMPORTED_IMPLIB.
+if(WIN32)
+  # If not building Arrow, IMPORTED_IMPLIB will be empty.
+  # Then set ARROW_LINK_LIB to ARROW_IMPORT_LIB which would have been derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake. This will avoid the ARROW_LINK_LIB set to NOTFOUND error.
+  # The ARROW_IMPORT_LIB should be ARROW_HOME/lib/arrow.lib on Windows.
+  if(NOT Arrow_FOUND)
+    message(STATUS "ARROW_LINK_LIB will be set using IMPORTED_IMPLIB value when building."
+    )
+    get_target_property(ARROW_LINK_LIB arrow_shared IMPORTED_IMPLIB)
+  else()
+    set(ARROW_LINK_LIB "${ARROW_IMPORT_LIB}")
+    message(STATUS "Setting ARROW_LINK_LIB to ARROW_IMPORT_LIB: ${ARROW_IMPORT_LIB}, which is derived from the ARROW_HOME provided."
+    )
+  endif()
+else()
+  # On Linux and macOS, it is the arrow.so/dylib in the newly built arrow_shared library used for linking.
+  # On Unix, this is the same as ARROW_SHARED_LIB.
+  message(STATUS "Setting ARROW_LINK_LIB to ARROW_SHARED_LIB as they are same on Unix.")
+  set(ARROW_LINK_LIB "${ARROW_SHARED_LIB}")
+endif()
+
+# ARROW_INCLUDE_DIR should be set so that header files in the include directory can be found correctly.
+# The value of ARROW_INCLUDE_DIR should be ARROW_HOME/include on all platforms.
+if(NOT Arrow_FOUND)
+  message(STATUS "ARROW_INCLUDE_DIR will be set using INTERFACE_INCLUDE_DIRECTORIES value when building."
+  )
+  get_target_property(ARROW_INCLUDE_DIR arrow_shared INTERFACE_INCLUDE_DIRECTORIES)
+else()
+  # If not building Arrow, ARROW_INCLUDE_DIR derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake would be non-empty.
+  message(STATUS "ARROW_INCLUDE_DIR: ${ARROW_INCLUDE_DIR}")
+endif()
+

Review Comment:
   I have captured this in #35309. We can probably mark this as resolved now.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1164681643


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   **Update**: We've spent several days worth of effort trying to get to the root cause of what's causing the crash and have determined the following:
   
   1. The crash doesn't appear to be related to a breaking change in the Arrow C++ libraries. What we thought we were seeing before was actually a false positive. It turns out that we were accidentally building in `Debug` mode when qualifying the changes on Windows, which was triggering the crash. The crash occurs regardless of whether a new or old version of the Arrow C++ libraries are used. In addition, the crash does **not** occur when building in `Release` mode.
   
   2. Unfortunately, it seems like the crash is being caused by some kind of memory corruption. We stepped through the code in the Visual Studio debugger and the call to `array->ToString()` in `float64_array.cc` returns the expected `std::string` value representing the contents of the `arrow::Array`. However, right after `ToString()` returns, the returned `std::string` value memory becomes filled with random characters, and then, MATLAB crashes. At this point, we have yet to identify the source of the memory corruption.
   
   3. We tried replacing the call `array->ToString()` with other `arrow::Array` APIs, and MATLAB did not crash. It's not clear why `ToString()` seems to be showing the memory corruption.
    
   4. We have been experimenting with instrumenting the code using [`AddressSanitizer`](https://github.com/google/sanitizers/wiki/AddressSanitizer) to determine the root cause of the apparent memory corruption.
   
   ---
   
   In order to unblock this pull request, it may make sense to treat this as an independent issue. Since the crash doesn't seem to occur in `Release` mode on Windows, the crash shouldn't prevent further progress on the MATLAB Interface to Arrow for now.
   
   We will work on addressing the remaining comments on this PR while trying to debug this crash further in parallel.
   
   We apologize for the delay on getting to the root cause of this issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1164799795


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   > However, right after `ToString()` returns, the returned `std::string` value memory becomes filled with random characters, and then, MATLAB crashes.
   
   How did you try it?
   `std::cout << array->ToString() << std::endl;` creates a temporary `std::string`. So I think that you can't touch the `std::string` in `Float64Array::Print`. (I think that you can't refer the `std::string` without specifying its address explicitly.)
   
   > We have been experimenting with instrumenting the code using [`AddressSanitizer`](https://github.com/google/sanitizers/wiki/AddressSanitizer) to determine the root cause of the apparent memory corruption.
   
   Could you share the log from `AddressSanitizer`?
   
   
   
   > In order to unblock this pull request, it may make sense to treat this as an independent issue.
   
   I agree with you.
   
   > We apologize for the delay on getting to the root cause of this issue.
   
   No problem! Thanks for working on this!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1160999973


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   **Update**: It turns out that building against an older version (i.e. [`4b31aa4`](https://github.com/apache/arrow/commit/4b31aa4f4d76b629e62f2223cd13906f6c911ccb)) of the Arrow C++ libraries (i.e. `arrow.dll`) prevents the crash from occurring.
   
   We haven't yet identified exactly which commit to the Arrow C++ libraries introduced the crash, but we will investigate this further (this may be time consuming because the Windows builds are fairly slow right now).
   
   Once we identify the commit to the Arrow C++ libraries that introduced the breaking change, we can determine whether we need to update the source code for the MATLAB Interface to Arrow to react to any API changes (in the case that the changes were intended).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1166816899


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   Thank you for very much for sharing your thoughts, @kou!
   
   > How did you try it?
   std::cout << array->ToString() << std::endl; creates a temporary std::string. So I think that you can't touch the std::string in Float64Array::Print. (I think that you can't refer the std::string without specifying its address explicitly.)
   
   We actually did try modifying the code to copy the resulting `std::string` returned by `array->ToString()` before displaying it using `std::cout`. Unfortunately, the memory corruption still seems to occur even when the data is copied.
   
   > Could you share the log from AddressSanitizer?
   
   So far, we've only had access to a Linux machine with `AddressSanitizer`, and, for some reason, the memory corruption doesn't appear to be occurring on Linux (i.e. `AddressSanitizer` didn't display any output).
   
   Once we have a Windows build working with `AddressSanitizer` enabled, we will be happy to share the output!
   
   > I agree with you.
   
   Excellent! We will work on addressing the remaining feedback on this pull request to keep things moving forward.
   
   ---
   
   **Update**: We've made a bit more progress on simplifying the reproduction steps for this issue.
   
   This is reproducible by creating a [basic C++ MEX function](https://www.mathworks.com/help/matlab/matlab_external/c-mex-functions.html) which creates an `arrow::Array` and calls `ToString` on it, without using `libmexclass` or `CMake`/[`FindMatlab`](https://cmake.org/cmake/help/latest/module/FindMatlab.html) (i.e. compile the MEX gateway function using the [`mex`](https://www.mathworks.com/help/matlab/ref/mex.html) command in MATLAB):
   
   ```matlab
   >> mex -v -g COMPFLAGS='$COMPFLAGS /std:c++17' test.cpp -IC:\path\to\arrow\debug\install\include -larrow -LC:path\to\arrow\debug\install\lib
   ```
   
   **Note**: when using the `mex` command above, we linked the MEX gateway against a `Debug` mode build of the Arrow C++ libraries built using Visual Studio 2019.
   
   Interestingly, this memory corruption is **not** reproducible with a simple standalone C++ program (i.e. no MATLAB or MEX).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] sreeharihegden commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "sreeharihegden (via GitHub)" <gi...@apache.org>.
sreeharihegden commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1171912145


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   **Update**: Regarding erroring when building the MATLAB Interface to Arrow in `Debug` mode on Windows, this doesn't look straightforward to implement as we had thought initially as the error needs to be reported during the CMake `Build` step rather than the `Configure` step (as the build type can be known only after the `build` command has been issued). 
   
   Since, the problem here is not directly related to this PR and not be blocking it, I have created a new issue #35239 to report an error when building the MATLAB Interface to Arrow in `Debug` mode on Windows, so that we can work on it separately.
   
   I hope this resolves all feedback related to this issue.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1160962568


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   **Update:** The main refactoring changes to `libmexclass` have been [merged into the `main` development branch of `mathworks/libmexclass`](https://github.com/mathworks/libmexclass/pull/55).
   
   The new `libmexclass` APIs appear to be working with the code for the MATLAB Interface to Arrow on Linux and macOS as expected: https://github.com/mathworks/arrow/actions/runs/4639877088.
   
   Unfortunately, it is taking us some time to debug an issue that appears to be specific to Windows.
   
   When creating an instance of `arrow.array.Float64Array` on Windows, MATLAB shuts down with no crash dump or stack trace. We suspect that this may have something to do with the way we are linking `arrowproxy.dll` against `arrow.dll`, but we aren't sure yet.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1156466175


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   Thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1167214786


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   **Update**: We think we've identified the issue!
   
   ---
   
   ## Overview
   
   After a lot of debugging and research, we realized that when the Arrow C++ libraries (i.e. `arrow.dll`) are built in `Debug` mode, they are linked against a `Debug` version of the [MSVC Runtime](https://learn.microsoft.com/en-us/cpp/c-runtime-library/crt-library-features) (i.e. `VCRUNTIME140D.dll` - note the **D** for **D**ebug in the DLL name).
   
   At the same time, `mexclass.dll`, `arrowproxy.dll`, and `gateway.mexw64` are all built by default against a non-`Debug` version of the MSVC Runtime (i.e. `VCRUNTIME140.dll` - note the lack of **D** in the DLL name).
   
   ---
   
   ### Checking the MSVC Runtime using `dumpbin`
   
   We were able to verify the different MSVC Runtime versions in use by the various binaries by running `dumpbin` in an instance of `x64 Native Tools Command Prompt for VS 2019`:
   
   ```shell
   $ dumpbin /dependents arrow.dll
   Microsoft (R) COFF/PE Dumper Version 14.29.30137.0
   Copyright (C) Microsoft Corporation.  All rights reserved.
   
   
   Dump of file arrow.dll
   
   File Type: DLL
   
     Image has the following dependencies:
   
       KERNEL32.dll
       SHELL32.dll
       ole32.dll
       MSVCP140D.dll
       VCRUNTIME140D.dll
       VCRUNTIME140_1D.dll
       ucrtbased.dll
   
     Summary
   
           1000 .00cfg
          33000 .data
           6000 .idata
         142000 .pdata
         606000 .rdata
          29000 .reloc
           1000 .rsrc
        117D000 .text
           1000 .tls
   
   ```
   
   Note the presence of `VCRUNTIME140D.dll` in the `dumpbin` output for `arrow.dll` above.
   
   ---
   
   ### Incompatibility of Multiple Different MSVC Runtimes
   
   Unfortunately, there are no guarantees that one version of the MSVC Runtime will be compatible with another. This limitation is documented in the MSVC documentation page ["C runtime (CRT) and C++ standard library (STL) `.lib` files"](https://learn.microsoft.com/en-us/cpp/c-runtime-library/crt-library-features?view=msvc-170#what-problems-exist-if-an-application-uses-more-than-one-crt-version).
   
   Included below is a relevant excerpt from the documentation:
   
   > A single process may load multiple EXE and DLL images, each with its own CRT. Each of those CRTs may use a different allocator, may have different internal structure layouts, and may use different storage arrangements. It means allocated memory, CRT resources, or classes passed across a DLL boundary can cause problems in memory management, internal static usage, or layout interpretation. For example, if a class is allocated in one DLL but passed to and deleted by another, which CRT deallocator is used? The errors caused can range from the subtle to the immediately fatal, and therefore direct transfer of such resources is discouraged.
   
   It turns out that the case being described above is exactly what was happening when calling `arrow->ToString()` (defined in `arrow.dll`) from `Float64Array::Print` (defined in `arrowproxy.dll`).
   
   Calling `ToString()` on an `arrow::Array` instance crosses over into `arrow.dll` (which uses a completely separate MSVC Runtime from `arrowproxy.dll` when built in `Debug` mode). When `ToString()` allocates memory for a `std::string` within the heap space of the MSVC Runtime used by `arrow.dll` and then returns that `std::string` to `arrowproxy.dll`, undefined behavior is invoked, which eventually leads to the memory corruption we observed.
   
   ---
   
   ### Fixing the Issue
   
   To fix the memory corruption, we modified `matlab/tools/cmake/BuildMatlabArrowInterface.cmake` so that all of the client `libmexclass` binaries (i.e. `mexclass.dll`, `arrowproxy.dll`, and `gateway.mexw64`) link against a `Debug` version of the MSVC Runtime - just like `arrow.dll` does. We were able to accomplish this by [setting the `MSVC_RUNTIME_LIBRARY` target property to `"MultiThreadedDebugDLL"` (i.e. `/MDd`)](https://cmake.org/cmake/help/latest/prop_tgt/MSVC_RUNTIME_LIBRARY.html) to match the `Debug` build of `arrow.dll`.
   
   This excerpt from the [MSVC documentation](https://learn.microsoft.com/en-us/cpp/c-runtime-library/crt-library-features?view=msvc-170#what-problems-exist-if-an-application-uses-more-than-one-crt-version) alludes to this fix:
   
   > It's also possible to avoid some of these issues if all of the images in your process use the same dynamically loaded version of the CRT. To ensure that all components use the same DLL version of the CRT, build them by using the /MD option, and use the same compiler toolset and property settings.
   
   The MSVC documentation also provides more information about the potential issues that can arise when passing resources between DLLs in the page ["Potential errors passing CRT objects across DLL boundaries"](https://learn.microsoft.com/en-us/cpp/c-runtime-library/potential-errors-passing-crt-objects-across-dll-boundaries).
   
   The [common advice we've seen around the Internet](https://stackoverflow.com/questions/22797418/how-do-i-safely-pass-objects-especially-stl-objects-to-and-from-a-dll) relating to this topic is that there is no straightforward, "safe" way to export APIs from a DLL which return STL objects (e.g. `std::string` returned by `arrow::Array::ToString`). My guess is that other community members have already thought through these limitations of the Arrow C++ APIs which return STL objects before. However, I think it may be worth adding a note / warning to the ["Debug Builds" section of the "Building on Windows" documentation for the Arrow C++ libraries](https://arrow.apache.org/docs/developers/cpp/windows.html#debug-builds) about these potential MSVC Runtime compatibility issues so that others hopefully won't have to spend as much effort debugging issues like this in the future. *We would be happy to follow up with a separate pull request to enhance the documentation with this information!*
   
   ---
   
   ### Open Questions
   
   Although we have identified the source of the memory corruption, there is still an open question about whether it is "safe" to build `mexclass.dll`, `arrowproxy.dll`, and `gateway.mexw64` against `VCRUNTIME140D.dll`.
   
   MEX functions are eventually loaded into the MATLAB process address space when called, so it isn't immediately clear whether this will interact as expected in all cases with the MSVC Runtime being used by MATLAB (i.e. `matlab.exe`). To get a better understanding of the potential implications here, it would make sense for us to consult with some other colleagues at MathWorks who have more domain expertise in this area.
   
   It may also make sense to consider modifying the [`CMakeLists.txt` for `mathworks/libmexclass`](https://github.com/mathworks/libmexclass/blob/main/libmexclass/cpp/CMakeLists.txt) to automatically link against a `Debug` MSVC Runtime when building a client project in `Debug` mode. If this isn't desirable as a default behavior, it may still be worth ading a note about potential MSVC Runtime compatibility issues to the `libmexclass` documentation.
   
   ---
   
   ### Moving Forward
   
   Putting the open questions about the safety of linking against a `Debug` MSVC Runtime in a MEX function to the side, it seems reasonable to treat this as a non-pressing issue since the vast majority of MATLAB users won't be building the MATLAB Interface to Arrow against a `Debug` build of `arrow.dll`. In addition, the fix we have identified (i.e. build the client `libmexclass` libraries with `/MDd`) appears to be at least a partial workaround for community members who do want to use a `Debug` build of `arrow.dll`.
   
   We apologize again for taking some time to get to the bottom of this. It was quite surprising that this is what ended up being the source of the memory corruption. In order to get to the root cause, we first needed to narrow down the reproduction steps and get a better understanding of the differences between `Release` and `Debug` builds. We wanted to ensure this memory corruption wasn't indicative of a deeper issue in the implementation of `libmexclass` before building more code on top of it within the upstream `apache/arrow` code base.
   
   Thank you to everyone in the community for your patience while debugging this! We should now be able to concentrate more of our energy on pushing the MATLAB interface forward since this doesn't seem to be a blocking issue. Please don't hesitate to let us know if you have questions about any of this!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1169128816


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   > It makes sense. Please go ahead!
   
   Great! Will do!
   
   > I think that all MEX related libraries and client libraries should use the same build mode as MATLAB itself. If MATLAB is built with Release, others should use Release too. If MATLAB is built with Debug, others should use Debug too. If different mode is specified (MATLAB uses Release but Debug is specified to client libraries), we should report an error such as "You can't build this with Debug because MATLAB is built with Release. You must use Release".
   
   Thanks for sharing your thoughts, @kou!
   
   I think you are most likely correct about all the libraries requiring the same build mode as MATLAB. As you suggested, erroring in this case seems like it might be helpful. We can follow up with a change to add the necessary logic to error appropriately.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1167280545


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   Thanks for your report!
   
   Mixing MSVC Runtimes is the root cause makes sense.
   
   
   
   > However, I think it may be worth adding a note / warning to the ["Debug Builds" section of the "Building on Windows" documentation for the Arrow C++ libraries](https://arrow.apache.org/docs/developers/cpp/windows.html#debug-builds) about these potential MSVC Runtime compatibility issues so that others hopefully won't have to spend as much effort debugging issues like this in the future. _We would be happy to follow up with a separate pull request to enhance the documentation with this information!_
   
   It makes sense. Please go ahead!
   
   > Open Questions
   
   I think that all MEX related libraries and client libraries should use the same build mode as MATLAB itself. If MATLAB is built with `Release`, others should use `Release` too. If MATLAB is built with `Debug`, others should use `Debug` too. If different mode is specified (MATLAB uses `Release` but `Debug` is specified to client libraries), we should report an error such as "You can't build this with Debug because MATLAB is built with Release. You must use Release".



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] assignUser commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "assignUser (via GitHub)" <gi...@apache.org>.
assignUser commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1140929844


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   I am taking a backseat on the C++ design decisions but overall this sounds reasonable to me! Thanks for the detailed explanations @kevingurney !



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1142184546


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   Thanks for sharing this code, @kou!
   
   This seem like a nice approach. The fact that clients only have to write essentially 1 line of "unique" code (because the rest is just standard C++ MEX code) is compelling.
   
   Since it seems like we have converged on a reasonable path forward, I'll start working on making the necessary changes to mathworks/libmexclass to support this new design. I'll follow up on this pull request when the changes are ready.
   
   Thank you again for all your help!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1149811879


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   Just for reference, we are working on the CMake build system refactor here: https://github.com/mathworks/libmexclass/tree/54.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1171962459


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   Yes, it make sense.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1170626032


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   **Update**: I've captured the idea of adding a warning to the ["Debug Builds" section of the "Building on Windows" documentation for the Arrow C++ libraries](https://arrow.apache.org/docs/developers/cpp/windows.html#debug-builds) about potential MSVC Runtime compatibility issues in #35224.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on PR #34563:
URL: https://github.com/apache/arrow/pull/34563#issuecomment-1516824849

   I believe we've addressed all the feedback on this pull request at this point. Thank you for all the helpful suggestions!
   
   The refactored `libmexclass` code have been integrated and all CI checks are passing.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #34563:
URL: https://github.com/apache/arrow/pull/34563#issuecomment-1468699288

   * Closes: #33854


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1156454060


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   For reference, [this](https://github.com/mathworks/arrow/tree/GH-33854-libmexclass-refactor) is the branch we are actively working on at the moment. We will merge the changes in from this branch as soon as we get things working as expected.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] assignUser commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "assignUser (via GitHub)" <gi...@apache.org>.
assignUser commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1137933983


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   FetchContent would probably also alleviate this by allowing libmex access to variables from this cmake and this project access to the libmex targets at configure time. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] assignUser commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "assignUser (via GitHub)" <gi...@apache.org>.
assignUser commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1137933983


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   FetchContent would probably also alleviate this by allowing libmex access to variables from this cmake and this project access to the libmex targets at compile time. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1137910630


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   It seems that libmexclass's build process uses files in this repository.
   Can we reconsider this approach? I think that this approach has the following problems:
   
   * We can't use CMake targets to link. (We need to extract library path, include path and so on manually from CMake targets as you did.)
   * It's difficult to control how to build our sources in our CMake.
   
   I'm not familiar with libmexclass but how about the following approach?
   
   * libmexclass provides a library and a CMake package that don't depend on external files (files in our repository)
   * The CMake package provides a helper function to use the libmexclass library easily (Or the CMake package provides a CMake target)
   * We build our source files with the libmexclass library in our CMake configurations (`target_link_libraries()` and so on) instead of `ExternalProject_Add()`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1137743156


##########
matlab/CMakeLists.txt:
##########
@@ -202,13 +207,12 @@ endif()
 
 option(MATLAB_BUILD_TESTS "Build the C++ tests for the MATLAB interface" OFF)
 
-# Grab CMAKE Modules from the CPP interface.
-set(CPP_CMAKE_MODULES "${CMAKE_SOURCE_DIR}/../cpp/cmake_modules")
-if(EXISTS "${CPP_CMAKE_MODULES}")
-  set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CPP_CMAKE_MODULES})
-endif()
+# Add tools/cmake directory to the CMAKE_MODULE_PATH.
+set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_SOURCE_DIR}/tools/cmake)

Review Comment:
   How about using `list(PREPEND)`?
   
   https://cmake.org/cmake/help/latest/command/list.html#prepend
   
   ```suggestion
   list(PREPEND CMAKE_MODULE_PATH ${CMAKE_SOURCE_DIR}/tools/cmake)
   ```



##########
matlab/CMakeLists.txt:
##########
@@ -32,8 +32,13 @@ function(build_arrow)
     message(SEND_ERROR "Error: unrecognized arguments: ${ARG_UNPARSED_ARGUMENTS}")
   endif()
 
-  # If Arrow needs to be built, the default location will be within the build tree.
-  set(ARROW_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/arrow_ep-prefix")
+  # If Arrow build available, set ARROW_PREFIX to the ARROW_HOME specified with cmake.
+  if(Arrow_FOUND)
+    set(ARROW_PREFIX "${ARROW_HOME}")
+  else()
+    # If Arrow needs to be built, the default location will be within the build tree.
+    set(ARROW_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/arrow_ep-prefix")
+  endif()

Review Comment:
   Do we need this change?
   If we use system Arrow C++, we don't link to `arrow_shared_for_gtest`. So all `arrow_shared_for_gtest` related properties are ignored.



##########
matlab/CMakeLists.txt:
##########
@@ -202,13 +207,12 @@ endif()
 
 option(MATLAB_BUILD_TESTS "Build the C++ tests for the MATLAB interface" OFF)
 
-# Grab CMAKE Modules from the CPP interface.
-set(CPP_CMAKE_MODULES "${CMAKE_SOURCE_DIR}/../cpp/cmake_modules")
-if(EXISTS "${CPP_CMAKE_MODULES}")
-  set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CPP_CMAKE_MODULES})
-endif()
+# Add tools/cmake directory to the CMAKE_MODULE_PATH.
+set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_SOURCE_DIR}/tools/cmake)
 
-set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_SOURCE_DIR}/cmake_modules)
+# Get find_package(Arrow) by setting Arrow_DIR to CPP interface ArrowConfig.cmake
+# TODO: Make sure this works on all platforms submitting
+set(CMAKE_PREFIX_PATH "${ARROW_HOME}/lib/cmake/Arrow")

Review Comment:
   If we want to use `CMAKE_PREFIX_PATH`, this should be `set(CMAKE_PREFIX_PATH "${ARROW_HOME}")`.
   But using `Arrow_ROOT` or `Arrow_DIR` is better than `CMAKE_PREFIX_PATH` because `CMAKE_PREFIX_PATH` affects all  `find_package(XXX)`:
   
   ```cmake
   if(ARROW_HOME AND NOT Arrow_ROOT)
     set(Arrow_ROOT "${ARROW_HOME}")
   end
   ```
   
   See also: https://cmake.org/cmake/help/latest/command/find_package.html#config-mode-search-procedure



##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   It seems that libmexclass's build process uses files in this repository.
   Can we reconsider this approach? I think that this approach has the following problems:
   
   * We can't use CMake targets to link. (We need to extract library path, include path and so on manually from CMake targets as you did.)
   * It's difficult to control how to build our sources in our CMake.
   
   I'm familiar with libmexclass but how about the following approach?
   
   * libmexclass provides a library and a CMake package that don't depend on external files (files in our repository)
   * The CMake package provides a helper function to use the libmexclass library easily (Or the CMake package provides a CMake target)
   * We build our source files with the libmexclass library in our CMake configurations (`target_link_libraries()` and so on) instead of `ExternalProject_Add()`



##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")
+
+# -------
+# Build
+# -------
+
+# Build libmexclass as an external project.
+include(ExternalProject)
+ExternalProject_Add(
+    libmexclass
+    # TODO: Consider using SSH URL for the Git Repository when
+    # libmexclass is accessible for CI without permission issues.
+    GIT_REPOSITORY https://github.com/mathworks/libmexclass.git
+    GIT_TAG main
+    SOURCE_SUBDIR libmexclass/cpp
+    CMAKE_CACHE_ARGS "-D CUSTOM_PROXY_FACTORY_INCLUDE_DIR:STRING=${CUSTOM_PROXY_FACTORY_INCLUDE_DIR}"
+                     "-D CUSTOM_PROXY_FACTORY_SOURCES:STRING=${CUSTOM_PROXY_FACTORY_SOURCES}"
+                     "-D CUSTOM_PROXY_SOURCES:STRING=${CUSTOM_PROXY_SOURCES}"
+                     "-D CUSTOM_PROXY_INCLUDE_DIR:STRING=${CUSTOM_PROXY_INCLUDE_DIR}"
+                     "-D CUSTOM_PROXY_LINK_LIBRARIES:STRING=${CUSTOM_PROXY_LINK_LIBRARIES}"
+                     "-D CUSTOM_PROXY_RUNTIME_LIBRARIES:STRING=${CUSTOM_PROXY_RUNTIME_LIBRARIES}"
+                     "-D CUSTOM_PROXY_FACTORY_HEADER_FILENAME:STRING=${CUSTOM_PROXY_FACTORY_HEADER_FILENAME}"
+                     "-D CUSTOM_PROXY_FACTORY_CLASS_NAME:STRING=${CUSTOM_PROXY_FACTORY_CLASS_NAME}"
+    INSTALL_COMMAND ${CMAKE_COMMAND} --build . --target install

Review Comment:
   This may be redundant.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1170567263


##########
matlab/CMakeLists.txt:
##########
@@ -202,13 +207,12 @@ endif()
 
 option(MATLAB_BUILD_TESTS "Build the C++ tests for the MATLAB interface" OFF)
 
-# Grab CMAKE Modules from the CPP interface.
-set(CPP_CMAKE_MODULES "${CMAKE_SOURCE_DIR}/../cpp/cmake_modules")
-if(EXISTS "${CPP_CMAKE_MODULES}")
-  set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CPP_CMAKE_MODULES})
-endif()
+# Add tools/cmake directory to the CMAKE_MODULE_PATH.
+set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_SOURCE_DIR}/tools/cmake)

Review Comment:
   Good idea! Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1170596271


##########
matlab/CMakeLists.txt:
##########
@@ -202,13 +207,12 @@ endif()
 
 option(MATLAB_BUILD_TESTS "Build the C++ tests for the MATLAB interface" OFF)
 
-# Grab CMAKE Modules from the CPP interface.
-set(CPP_CMAKE_MODULES "${CMAKE_SOURCE_DIR}/../cpp/cmake_modules")
-if(EXISTS "${CPP_CMAKE_MODULES}")
-  set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CPP_CMAKE_MODULES})
-endif()
+# Add tools/cmake directory to the CMAKE_MODULE_PATH.
+set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_SOURCE_DIR}/tools/cmake)
 
-set(CMAKE_MODULE_PATH ${CMAKE_MODULE_PATH} ${CMAKE_SOURCE_DIR}/cmake_modules)
+# Get find_package(Arrow) by setting Arrow_DIR to CPP interface ArrowConfig.cmake
+# TODO: Make sure this works on all platforms submitting
+set(CMAKE_PREFIX_PATH "${ARROW_HOME}/lib/cmake/Arrow")

Review Comment:
   This is an excellent point.
   
   We weren't aware that `find_package` supported setting `<PackageName>_ROOT` to configure the Config mode search behavior.
   
   Fixed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] ursabot commented on pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "ursabot (via GitHub)" <gi...@apache.org>.
ursabot commented on PR #34563:
URL: https://github.com/apache/arrow/pull/34563#issuecomment-1521215498

   Benchmark runs are scheduled for baseline = 966a8040151ebe6276d14ee0c6e98b085daab887 and contender = 9009dd76e4e9a2f4f13340ebf4173e71813b359b. 9009dd76e4e9a2f4f13340ebf4173e71813b359b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/5b253ab0dd4f4fbb8aad4b84f005be47...35f73ccdaad54f2589c3a31a17356c6f/)
   [Failed] [test-mac-arm](https://conbench.ursa.dev/compare/runs/86f2e9b0cc0d40d79aad54fa43b54163...41a1ba0588af400db7b128651af94ead/)
   [Finished :arrow_down:0.77% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/3beb50a496a04aa59282196a8bd0f1f1...fa1505643b254760814b30eed7cebfca/)
   [Finished :arrow_down:0.48% :arrow_up:0.03%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/f866ffa3aeb247dbb21d27e84a12d23e...59f8a160cabe424fa56e3e705f3fccd6/)
   Buildkite builds:
   [Finished] [`9009dd76` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2775)
   [Failed] [`9009dd76` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2809)
   [Finished] [`9009dd76` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2773)
   [Finished] [`9009dd76` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2800)
   [Finished] [`966a8040` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/2774)
   [Failed] [`966a8040` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/2808)
   [Finished] [`966a8040` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/2772)
   [Finished] [`966a8040` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/2799)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on PR #34563:
URL: https://github.com/apache/arrow/pull/34563#issuecomment-1520731522

   It looks like the CI failure for `Python / AMD64 macOS 12 Python 3 (pull_request)` is unrelated and is [failing in `main` too](https://github.com/apache/arrow/actions/runs/4788422093/jobs/8515004817).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1142726984


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   Thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] assignUser commented on pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "assignUser (via GitHub)" <gi...@apache.org>.
assignUser commented on PR #34563:
URL: https://github.com/apache/arrow/pull/34563#issuecomment-1470985049

   Missed the review comment box :facepalm: :
   
   This looks like a great addition with a lot of effort behind it, thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1167214786


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   **Update**: We think we've identified the issue!
   
   ---
   
   ### Overview
   
   After a lot of debugging and research, we realized that when the Arrow C++ libraries (i.e. `arrow.dll`) are built in `Debug` mode, they are linked against a `Debug` version of the [MSVC Runtime](https://learn.microsoft.com/en-us/cpp/c-runtime-library/crt-library-features) (i.e. `VCRUNTIME140D.dll` - note the **D** for **D**ebug in the DLL name).
   
   At the same time, `mexclass.dll`, `arrowproxy.dll`, and `gateway.mexw64` are all built by default against a non-`Debug` version of the MSVC Runtime (i.e. `VCRUNTIME140.dll` - note the lack of **D** in the DLL name).
   
   ---
   
   ### Checking the MSVC Runtime using `dumpbin`
   
   We were able to verify the different MSVC Runtime versions in use by the various binaries by running `dumpbin` in an instance of `x64 Native Tools Command Prompt for VS 2019`:
   
   ```shell
   $ dumpbin /dependents arrow.dll
   Microsoft (R) COFF/PE Dumper Version 14.29.30137.0
   Copyright (C) Microsoft Corporation.  All rights reserved.
   
   
   Dump of file arrow.dll
   
   File Type: DLL
   
     Image has the following dependencies:
   
       KERNEL32.dll
       SHELL32.dll
       ole32.dll
       MSVCP140D.dll
       VCRUNTIME140D.dll
       VCRUNTIME140_1D.dll
       ucrtbased.dll
   
     Summary
   
           1000 .00cfg
          33000 .data
           6000 .idata
         142000 .pdata
         606000 .rdata
          29000 .reloc
           1000 .rsrc
        117D000 .text
           1000 .tls
   
   ```
   
   Note the presence of `VCRUNTIME140D.dll` in the `dumpbin` output for `arrow.dll` above.
   
   ---
   
   ### Incompatibility of Multiple Different MSVC Runtimes
   
   Unfortunately, there are no guarantees that one version of the MSVC Runtime will be compatible with another. This limitation is documented in the MSVC documentation page ["C runtime (CRT) and C++ standard library (STL) `.lib` files"](https://learn.microsoft.com/en-us/cpp/c-runtime-library/crt-library-features?view=msvc-170#what-problems-exist-if-an-application-uses-more-than-one-crt-version).
   
   Included below is a relevant excerpt from the documentation:
   
   > A single process may load multiple EXE and DLL images, each with its own CRT. Each of those CRTs may use a different allocator, may have different internal structure layouts, and may use different storage arrangements. It means allocated memory, CRT resources, or classes passed across a DLL boundary can cause problems in memory management, internal static usage, or layout interpretation. For example, if a class is allocated in one DLL but passed to and deleted by another, which CRT deallocator is used? The errors caused can range from the subtle to the immediately fatal, and therefore direct transfer of such resources is discouraged.
   
   It turns out that the case being described above is exactly what was happening when calling `arrow->ToString()` (defined in `arrow.dll`) from `Float64Array::Print` (defined in `arrowproxy.dll`).
   
   Calling `ToString()` on an `arrow::Array` instance crosses over into `arrow.dll` (which uses a completely separate MSVC Runtime from `arrowproxy.dll` when built in `Debug` mode). When `ToString()` allocates memory for a `std::string` within the heap space of the MSVC Runtime used by `arrow.dll` and then returns that `std::string` to `arrowproxy.dll`, undefined behavior is invoked, which eventually leads to the memory corruption we observed.
   
   ---
   
   ### Fixing the Issue
   
   To fix the memory corruption, we modified `matlab/tools/cmake/BuildMatlabArrowInterface.cmake` so that all of the client `libmexclass` binaries (i.e. `mexclass.dll`, `arrowproxy.dll`, and `gateway.mexw64`) link against a `Debug` version of the MSVC Runtime - just like `arrow.dll` does. We were able to accomplish this by [setting the `MSVC_RUNTIME_LIBRARY` target property to `"MultiThreadedDebugDLL"` (i.e. `/MDd`)](https://cmake.org/cmake/help/latest/prop_tgt/MSVC_RUNTIME_LIBRARY.html) to match the `Debug` build of `arrow.dll`.
   
   This excerpt from the [MSVC documentation](https://learn.microsoft.com/en-us/cpp/c-runtime-library/crt-library-features?view=msvc-170#what-problems-exist-if-an-application-uses-more-than-one-crt-version) alludes to this fix:
   
   > It's also possible to avoid some of these issues if all of the images in your process use the same dynamically loaded version of the CRT. To ensure that all components use the same DLL version of the CRT, build them by using the /MD option, and use the same compiler toolset and property settings.
   
   The MSVC documentation also provides more information about the potential issues that can arise when passing resources between DLLs in the page ["Potential errors passing CRT objects across DLL boundaries"](https://learn.microsoft.com/en-us/cpp/c-runtime-library/potential-errors-passing-crt-objects-across-dll-boundaries).
   
   The [common advice we've seen around the Internet](https://stackoverflow.com/questions/22797418/how-do-i-safely-pass-objects-especially-stl-objects-to-and-from-a-dll) relating to this topic is that there is no straightforward, "safe" way to export APIs from a DLL which return STL objects (e.g. `std::string` returned by `arrow::Array::ToString`). My guess is that other community members have already thought through these limitations of the Arrow C++ APIs which return STL objects before. However, I think it may be worth adding a note / warning to the ["Debug Builds" section of the "Building on Windows" documentation for the Arrow C++ libraries](https://arrow.apache.org/docs/developers/cpp/windows.html#debug-builds) about these potential MSVC Runtime compatibility issues so that others hopefully won't have to spend as much effort debugging issues like this in the future. *We would be happy to follow up with a separate pull request to enhance the documentation with this information!*
   
   ---
   
   ### Open Questions
   
   Although we have identified the source of the memory corruption, there is still an open question about whether it is "safe" to build `mexclass.dll`, `arrowproxy.dll`, and `gateway.mexw64` against `VCRUNTIME140D.dll`.
   
   MEX functions are eventually loaded into the MATLAB process address space when called, so it isn't immediately clear whether this will interact as expected in all cases with the MSVC Runtime being used by MATLAB (i.e. `matlab.exe`). To get a better understanding of the potential implications here, it would make sense for us to consult with some other colleagues at MathWorks who have more domain expertise in this area.
   
   It may also make sense to consider modifying the [`CMakeLists.txt` for `mathworks/libmexclass`](https://github.com/mathworks/libmexclass/blob/main/libmexclass/cpp/CMakeLists.txt) to automatically link against a `Debug` MSVC Runtime when building a client project in `Debug` mode. If this isn't desirable as a default behavior, it may still be worth ading a note about potential MSVC Runtime compatibility issues to the `libmexclass` documentation.
   
   ---
   
   ### Moving Forward
   
   Putting the open questions about the safety of linking against a `Debug` MSVC Runtime in a MEX function to the side, it seems reasonable to treat this as a non-pressing issue since the vast majority of MATLAB users won't be building the MATLAB Interface to Arrow against a `Debug` build of `arrow.dll`. In addition, the fix we have identified (i.e. build the client `libmexclass` libraries with `/MDd`) appears to be at least a partial workaround for community members who do want to use a `Debug` build of `arrow.dll`.
   
   We apologize again for taking some time to get to the bottom of this. It was quite surprising that this is what ended up being the source of the memory corruption. In order to get to the root cause, we first needed to narrow down the reproduction steps and get a better understanding of the differences between `Release` and `Debug` builds. We wanted to ensure this memory corruption wasn't indicative of a deeper issue in the implementation of `libmexclass` before building more code on top of it within the upstream `apache/arrow` code base.
   
   Thank you to everyone in the community for your patience while debugging this! We should now be able to concentrate more of our energy on pushing the MATLAB interface forward since this doesn't seem to be a blocking issue. Please don't hesitate to let us know if you have questions about any of this!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1171700533


##########
matlab/CMakeLists.txt:
##########
@@ -32,8 +32,13 @@ function(build_arrow)
     message(SEND_ERROR "Error: unrecognized arguments: ${ARG_UNPARSED_ARGUMENTS}")
   endif()
 
-  # If Arrow needs to be built, the default location will be within the build tree.
-  set(ARROW_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/arrow_ep-prefix")
+  # If Arrow build available, set ARROW_PREFIX to the ARROW_HOME specified with cmake.
+  if(Arrow_FOUND)
+    set(ARROW_PREFIX "${ARROW_HOME}")
+  else()
+    # If Arrow needs to be built, the default location will be within the build tree.
+    set(ARROW_PREFIX "${CMAKE_CURRENT_BINARY_DIR}/arrow_ep-prefix")
+  endif()

Review Comment:
   This is a good point. We probably don't need to be setting all of these properties on the `arrow_shared_for_gtest` target if we aren't going to end up actually using the `arrow_shared_for_gtest` target in the end.
   
   I took a first pass at refactoring this.
   
   Eventually, it would probably make sense to stop using this "bundled GoogleTest" approach because it adds quite a bit of complexity to the build system. I think it would be preferable if we used `FetchContent` to build GoogleTest from upstream source when an existing GoogleTest installation isn't found by `find_package(GTest)`.
   
   Marking this as resolved.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1170561896


##########
matlab/CMakeLists.txt:
##########


Review Comment:
   1. We are now using `FetchContent` to integrate the `libmexclass` code into the CMake build system for the MATLAB interface (see [this pull request for adding `FetchContent` support to `mathworks/libmexclass`](https://github.com/mathworks/libmexclass/pull/55)). Thank you @assignUser for this helpful suggestion!
   
   
   2. Regarding the high minimum CMake version - this is an excellent question!
          
        I did a quick search through the history of changes to the `CMakeLists.txt` and found [this comment](https://github.com/apache/arrow/pull/10305#discussion_r657176867) discussing the rationale for using version `3.20` as the minimum. It looks like the main reason was to avoid some bugs that were present in [`FindMatlab`](https://cmake.org/cmake/help/latest/module/FindMatlab.html) with older versions of CMake. That being said, we agree that this is a pretty high minimum, and it may make sense to reconsider this decision and do a thorough examination of whether older versions might be OK to use.
   
        Since it may take some time to get access to development environments with older versions of CMake, it may make sense to capture this as a separate issue, so that we don't block this pull request.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1173944277


##########
matlab/CMakeLists.txt:
##########
@@ -317,6 +324,51 @@ matlab_add_mex(R2018a
 
 target_include_directories(mexcall PRIVATE ${CPP_SOURCE_DIR})
 
+# ARROW_SHARED_LIB
+# On Windows, this will be ARROW_HOME/bin/arrow.dll and on Linux and macOS, it is the arrow.so/dylib in the newly built arrow_shared library.
+if(NOT Arrow_FOUND)
+  message(STATUS "ARROW_SHARED_LIB will be set using IMPORTED_LOCATION value when building."
+  )
+  get_target_property(ARROW_SHARED_LIB arrow_shared IMPORTED_LOCATION)
+else()
+  # If not building Arrow, ARROW_SHARED_LIB derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake would be non-empty.
+  message(STATUS "ARROW_SHARED_LIB: ${ARROW_SHARED_LIB}")
+endif()
+
+# ARROW_LINK_LIB
+# On Windows, we use the arrow.lib for linking arrow_matlab against the Arrow C++ library.
+# The location of arrow.lib is previously saved in IMPORTED_IMPLIB.
+if(WIN32)
+  # If not building Arrow, IMPORTED_IMPLIB will be empty.
+  # Then set ARROW_LINK_LIB to ARROW_IMPORT_LIB which would have been derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake. This will avoid the ARROW_LINK_LIB set to NOTFOUND error.
+  # The ARROW_IMPORT_LIB should be ARROW_HOME/lib/arrow.lib on Windows.
+  if(NOT Arrow_FOUND)
+    message(STATUS "ARROW_LINK_LIB will be set using IMPORTED_IMPLIB value when building."
+    )
+    get_target_property(ARROW_LINK_LIB arrow_shared IMPORTED_IMPLIB)
+  else()
+    set(ARROW_LINK_LIB "${ARROW_IMPORT_LIB}")
+    message(STATUS "Setting ARROW_LINK_LIB to ARROW_IMPORT_LIB: ${ARROW_IMPORT_LIB}, which is derived from the ARROW_HOME provided."
+    )
+  endif()
+else()
+  # On Linux and macOS, it is the arrow.so/dylib in the newly built arrow_shared library used for linking.
+  # On Unix, this is the same as ARROW_SHARED_LIB.
+  message(STATUS "Setting ARROW_LINK_LIB to ARROW_SHARED_LIB as they are same on Unix.")
+  set(ARROW_LINK_LIB "${ARROW_SHARED_LIB}")
+endif()
+
+# ARROW_INCLUDE_DIR should be set so that header files in the include directory can be found correctly.
+# The value of ARROW_INCLUDE_DIR should be ARROW_HOME/include on all platforms.
+if(NOT Arrow_FOUND)
+  message(STATUS "ARROW_INCLUDE_DIR will be set using INTERFACE_INCLUDE_DIRECTORIES value when building."
+  )
+  get_target_property(ARROW_INCLUDE_DIR arrow_shared INTERFACE_INCLUDE_DIRECTORIES)
+else()
+  # If not building Arrow, ARROW_INCLUDE_DIR derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake would be non-empty.
+  message(STATUS "ARROW_INCLUDE_DIR: ${ARROW_INCLUDE_DIR}")
+endif()
+

Review Comment:
   Thanks for the suggestion, @kou!
   
   Just to clarify, are you suggesting that we should use `Arrow::arrow_shared` in the `else` clauses where `Arrow_FOUND` is `true` for printing debug messages? Or are you suggesting, more generally, that we don't set imported target properties separately on `arrow_shared`?
   
   For reference, when building Arrow from source using `ExternalProject_Add` (i.e. when `find_package(Arrow)` fails to find a valid Arrow installation), the imported target that is created from the `ExternalProject_Add` build artifacts is called `arrow_shared`. If we start using `Arrow::arrow_shared`, there would then be two possible different targets: (1) `Arrow::arrow_shared` and (2) `arrow_shared`, which might complicate the logic of the build system (although, that doesn't necessarily mean we shouldn't make this change - but, it seems like a consideration worth keeping in mind).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1173944509


##########
ci/scripts/matlab_build.sh:
##########
@@ -25,6 +25,6 @@ source_dir=${base_dir}/matlab
 build_dir=${base_dir}/matlab/build
 install_dir=${base_dir}/matlab/install
 
-cmake -S ${source_dir} -B ${build_dir} -G Ninja -D MATLAB_BUILD_TESTS=ON -D CMAKE_INSTALL_PREFIX=${install_dir} -D MATLAB_ADD_INSTALL_DIR_TO_SEARCH_PATH=OFF
+cmake -S ${source_dir} -B ${build_dir} -G Ninja -D MATLAB_ARROW_INTERFACE=ON -D MATLAB_BUILD_TESTS=ON -D CMAKE_INSTALL_PREFIX=${install_dir} -D MATLAB_ADD_INSTALL_DIR_TO_SEARCH_PATH=OFF

Review Comment:
   This seems like a nice change. Thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1175711873


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,109 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# ----------------------------------
+# Configure libmexclass FetchContent
+# ----------------------------------
+
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_FETCH_CONTENT_NAME libmexclass)
+# TODO: Consider using SSH URL for the Git Repository when
+# libmexclass is accessible for CI without permission issues.
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_FETCH_CONTENT_GIT_REPOSITORY "https://github.com/mathworks/libmexclass.git")
+# Use a specific Git commit hash to avoid libmexclass version changing unexpectedly.
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_FETCH_CONTENT_GIT_TAG "44c15d0")
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_FETCH_CONTENT_SOURCE_SUBDIR "libmexclass/cpp")
+
+# ------------------------------------------
+# Configure libmexclass Client Proxy Library
+# ------------------------------------------
+
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_LIBRARY_NAME arrowproxy)
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_LIBRARY_ROOT_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp")
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy")
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy")
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_LIBRARY_INCLUDE_DIRS ${MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_LIBRARY_ROOT_INCLUDE_DIR}
+                                                               ${MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_INCLUDE_DIR}
+                                                               ${MATLAB_ARROW_LIBMEXCLASS_CLIENT_PROXY_FACTORY_INCLUDE_DIR}
+                                                               ${ARROW_INCLUDE_DIRS})

Review Comment:
   Good idea!
   
   We didn't realize that calling `target_link_libraries` also [propagates "usage requirements"](https://cmake.org/cmake/help/latest/manual/cmake-buildsystem.7.html#target-usage-requirements) (i.e. the `INTERFACE_INCLUDE_DIRECTORIES` from `arrow_shared` will automatically be added to the `INCLUDE_DIRECTORIES` for the `libmexclass` client proxy library).
   
   While investigating this, we also realized that the [`libmexclass` code is currently adding the include directories from the `INTERFACE_INCLUDE_DIRECTORIES` of all `LINK_LIBRARIES` passed to `libmexclass_client_add_proxy_library` manually](https://github.com/mathworks/libmexclass/blob/44c15d08cef2c9397c8a9e5fd6d649214f0f372f/libmexclass/cpp/CMakeLists.txt#L147).
   
   We've captured this as a separate issue in mathworks/libmexclass#57.
   
   Marking this as resolved.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1149602656


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   Thanks for clarifying! We will go ahead and make these changes on our own, in that case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1156333113


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   A quick update:
   
   At this point, most of the major refactoring work to `libmexclass` is complete. We have the [example client in `mathworks/libmexclass`](https://github.com/mathworks/libmexclass/blob/54/example/CMakeLists.txt) building and the [CI workflow in mathworks/libmexclass is passing](https://github.com/mathworks/libmexclass/actions/runs/4568570671/jobs/8063792570).
   
   Now, we will start modifying the `CMakeLists.txt` for the MATLAB interface to use the new `libmexclass` infrastructure.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1141021576


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   Thanks! I think that I understand.
   
   > this is [how MEX functions work, in general](https://www.mathworks.com/help/matlab/matlab_external/c-mex-functions.html)
   
   This link is helpful. This is what I want to know!
   
   How about providing the following `libmexclass/mex.h` by `libmexclass`:
   
   ```cpp
   #include "libmexclass/action/Action.h"
   #include "libmexclass/action/Factory.h"
   #include "libmexclass/mex/Arguments.h"
   #include "libmexclass/mex/State.h"
   #include "libmexclass/proxy/Factory.h"
   
   
   #include "mex.hpp"
   #include "mexAdapter.hpp"
   
   namespace libmexclass {
       namespace mex {
           template <typename ProxyFactory>
           CallMexFunction((matlab::mex::ArgumentList outputs, matlab::mex::ArgumentList inputs) {
               // Wrap the inputs and outputs.
               Arguments output_arguments{outputs.begin()};
               Arguments input_arguments{inputs.begin()};
   
               // Store the inputs, outputs, and MATLAB Engine pointer in a libmexclass::mex::State instance.
               State state{input_arguments, output_arguments, getEngine()};
   
               // Create a polymorphic proxy::Factory instance.
               // NOTE: Clients provide the implementation for CustomProxyFactory.
               std::shared_ptr<proxy::Factory> proxy_factory = std::make_shared<ProxyFactory>();
   
               // Create an action::Factory.
               action::Factory action_factory{proxy_factory};
   
               // Create an Action.
               std::unique_ptr<action::Action> action = action_factory.makeAction(state);
   
               // Execute the Action.
               action->execute();
           }
       }
   }
   ```
   
   and `apache/arrow` has only the following small C++ code?
   
   ```cpp
   #include <libmexclass/mex.h>
   
   #include <arrow/matlab/proxy/factory.h>
   
   class MexFunction : public matlab::mex::Function {
       public:
       void operator()(matlab::mex::ArgumentList outputs, matlab::mex::ArgumentList inputs) {
         libmexclass::mex::CallMexFunction<arrow::matlab::proxy::Factory>(outputs, inputs);
       }
   };
   ```
   
   This approach doesn't use (a bit) tricky header/code injections by C preprocessor. It just uses a standard C++ feature instead of injection by C preprocessor. 
   
   If the approach doesn't make sense, I'm OK with the current "template code" approach.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1170597716


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")
+
+# -------
+# Build
+# -------
+
+# Build libmexclass as an external project.
+include(ExternalProject)
+ExternalProject_Add(
+    libmexclass
+    # TODO: Consider using SSH URL for the Git Repository when
+    # libmexclass is accessible for CI without permission issues.
+    GIT_REPOSITORY https://github.com/mathworks/libmexclass.git
+    GIT_TAG main
+    SOURCE_SUBDIR libmexclass/cpp
+    CMAKE_CACHE_ARGS "-D CUSTOM_PROXY_FACTORY_INCLUDE_DIR:STRING=${CUSTOM_PROXY_FACTORY_INCLUDE_DIR}"
+                     "-D CUSTOM_PROXY_FACTORY_SOURCES:STRING=${CUSTOM_PROXY_FACTORY_SOURCES}"
+                     "-D CUSTOM_PROXY_SOURCES:STRING=${CUSTOM_PROXY_SOURCES}"
+                     "-D CUSTOM_PROXY_INCLUDE_DIR:STRING=${CUSTOM_PROXY_INCLUDE_DIR}"
+                     "-D CUSTOM_PROXY_LINK_LIBRARIES:STRING=${CUSTOM_PROXY_LINK_LIBRARIES}"
+                     "-D CUSTOM_PROXY_RUNTIME_LIBRARIES:STRING=${CUSTOM_PROXY_RUNTIME_LIBRARIES}"
+                     "-D CUSTOM_PROXY_FACTORY_HEADER_FILENAME:STRING=${CUSTOM_PROXY_FACTORY_HEADER_FILENAME}"
+                     "-D CUSTOM_PROXY_FACTORY_CLASS_NAME:STRING=${CUSTOM_PROXY_FACTORY_CLASS_NAME}"
+    INSTALL_COMMAND ${CMAKE_COMMAND} --build . --target install

Review Comment:
   Since we are now using the refactored `libmexclass` APIs (which directly expose the targets to the client `CMakeLists.txt` using `FetchContent`), an `INSTALL_COMMAND` is no longer being used. So, I think this comment may no longer apply to the latest version of the code.
   
   Marking this as resolved.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] assignUser commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "assignUser (via GitHub)" <gi...@apache.org>.
assignUser commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1170700031


##########
matlab/CMakeLists.txt:
##########


Review Comment:
   Nice thanks!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1173206067


##########
matlab/CMakeLists.txt:
##########
@@ -317,6 +324,51 @@ matlab_add_mex(R2018a
 
 target_include_directories(mexcall PRIVATE ${CPP_SOURCE_DIR})
 
+# ARROW_SHARED_LIB
+# On Windows, this will be ARROW_HOME/bin/arrow.dll and on Linux and macOS, it is the arrow.so/dylib in the newly built arrow_shared library.
+if(NOT Arrow_FOUND)
+  message(STATUS "ARROW_SHARED_LIB will be set using IMPORTED_LOCATION value when building."
+  )
+  get_target_property(ARROW_SHARED_LIB arrow_shared IMPORTED_LOCATION)
+else()
+  # If not building Arrow, ARROW_SHARED_LIB derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake would be non-empty.
+  message(STATUS "ARROW_SHARED_LIB: ${ARROW_SHARED_LIB}")
+endif()
+
+# ARROW_LINK_LIB
+# On Windows, we use the arrow.lib for linking arrow_matlab against the Arrow C++ library.
+# The location of arrow.lib is previously saved in IMPORTED_IMPLIB.
+if(WIN32)
+  # If not building Arrow, IMPORTED_IMPLIB will be empty.
+  # Then set ARROW_LINK_LIB to ARROW_IMPORT_LIB which would have been derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake. This will avoid the ARROW_LINK_LIB set to NOTFOUND error.
+  # The ARROW_IMPORT_LIB should be ARROW_HOME/lib/arrow.lib on Windows.
+  if(NOT Arrow_FOUND)
+    message(STATUS "ARROW_LINK_LIB will be set using IMPORTED_IMPLIB value when building."
+    )
+    get_target_property(ARROW_LINK_LIB arrow_shared IMPORTED_IMPLIB)
+  else()
+    set(ARROW_LINK_LIB "${ARROW_IMPORT_LIB}")
+    message(STATUS "Setting ARROW_LINK_LIB to ARROW_IMPORT_LIB: ${ARROW_IMPORT_LIB}, which is derived from the ARROW_HOME provided."
+    )
+  endif()
+else()
+  # On Linux and macOS, it is the arrow.so/dylib in the newly built arrow_shared library used for linking.
+  # On Unix, this is the same as ARROW_SHARED_LIB.
+  message(STATUS "Setting ARROW_LINK_LIB to ARROW_SHARED_LIB as they are same on Unix.")
+  set(ARROW_LINK_LIB "${ARROW_SHARED_LIB}")
+endif()
+
+# ARROW_INCLUDE_DIR should be set so that header files in the include directory can be found correctly.
+# The value of ARROW_INCLUDE_DIR should be ARROW_HOME/include on all platforms.
+if(NOT Arrow_FOUND)
+  message(STATUS "ARROW_INCLUDE_DIR will be set using INTERFACE_INCLUDE_DIRECTORIES value when building."
+  )
+  get_target_property(ARROW_INCLUDE_DIR arrow_shared INTERFACE_INCLUDE_DIRECTORIES)
+else()
+  # If not building Arrow, ARROW_INCLUDE_DIR derived from ARROW_PREFIX set to the ARROW_HOME specified with cmake would be non-empty.
+  message(STATUS "ARROW_INCLUDE_DIR: ${ARROW_INCLUDE_DIR}")
+endif()
+

Review Comment:
   Could you use the `Arrow::arrow_shared` CMake target (or `Arrow::arrow_static`) instead of separated library path, include path and so on?
   The `Arrow::arrow_shared` CMake target has all information (library path, include path and so on) we need.



##########
ci/scripts/matlab_build.sh:
##########
@@ -25,6 +25,6 @@ source_dir=${base_dir}/matlab
 build_dir=${base_dir}/matlab/build
 install_dir=${base_dir}/matlab/install
 
-cmake -S ${source_dir} -B ${build_dir} -G Ninja -D MATLAB_BUILD_TESTS=ON -D CMAKE_INSTALL_PREFIX=${install_dir} -D MATLAB_ADD_INSTALL_DIR_TO_SEARCH_PATH=OFF
+cmake -S ${source_dir} -B ${build_dir} -G Ninja -D MATLAB_ARROW_INTERFACE=ON -D MATLAB_BUILD_TESTS=ON -D CMAKE_INSTALL_PREFIX=${install_dir} -D MATLAB_ADD_INSTALL_DIR_TO_SEARCH_PATH=OFF

Review Comment:
   How about adding newlines for readability?
   
   ```suggestion
   cmake \
     -S ${source_dir} \
     -B ${build_dir} \
     -G Ninja \
     -D MATLAB_ARROW_INTERFACE=ON \
     -D MATLAB_BUILD_TESTS=ON \
     -D CMAKE_INSTALL_PREFIX=${install_dir} \
     -D MATLAB_ADD_INSTALL_DIR_TO_SEARCH_PATH=OFF
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] assignUser commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "assignUser (via GitHub)" <gi...@apache.org>.
assignUser commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1137882276


##########
matlab/CMakeLists.txt:
##########


Review Comment:
   You define a (for arrow) high cmake minimum version of 3.20
   
   - Do you actually use any features that require this version (on first glance I don't think so?)
   - Have you considered using FetchContent instead of ExternalProject? 
   
   [FetchContent](https://cmake.org/cmake/help/latest/module/FetchContent.html) basically gets  the content and uses it with `add_subdirectory` this means that it gets configured at the same time as your project -> targets are available vs at build time with ep.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kou (via GitHub)" <gi...@apache.org>.
kou commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1139913862


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   Thanks for explaining the `libmexclass` build system!
   
   Can I confirm my understandings?
   
   * `libmexclass.so` can be shared from multiple MATALB modules. (Module A and module B can use the same `libmexclass.so`.)
   * `libmexclassclient.so` can't be shared from multiple MATAB modules. (Each module must build its `libmexclassclient.so`.)
   
   > One thing worth explicitly noting, is that `target_compile_definitions` is used to "dynamically" build a MEX binary from [predefined "template" code](https://github.com/mathworks/libmexclass/blob/484ade97176c611cbb7056b941df99f1f3a6a0dc/libmexclass/cpp/source/libmexclass/mex/mex_gateway_function.cpp#L62). Essentially, we use these compiler definitions to inform the compiler what the appropriate [header file](https://github.com/mathworks/libmexclass/blob/484ade97176c611cbb7056b941df99f1f3a6a0dc/libmexclass/cpp/source/libmexclass/mex/mex_gateway_function.cpp#L7) is for the client's subclass of `libmexclass::proxy::Factory` and what the [name of their subclass is](https://github.com/mathworks/libmexclass/blob/484ade97176c611cbb7056b941df99f1f3a6a0dc/libmexclass/cpp/source/libmexclass/mex/mex_gateway_function.cpp#L62).
   
   It seems that we can just use C++ template feature for it instead of macros when we build `libmexclient.so` in our CMake.
   
   BTW, who does instantiate `MexFunction` defined in `mex_gateway_function.cpp`?
   
   > are we OK if the client code still has to call a few macros to set up the appropriate target and link dependencies between the client shared library (i.e. their `Factory` and `Proxy` code) and the MEX `gateway` target?
   
   "The client code" is [`mex_gateway_function.cpp`](https://github.com/mathworks/libmexclass/blob/484ade97176c611cbb7056b941df99f1f3a6a0dc/libmexclass/cpp/source/libmexclass/mex/mex_gateway_function.cpp), right? Or our CMake code?
   
   "a few macros" are C/C++'s (C preprocessor's) macros, right? Or CMake's macro? 
   
   "the appropriate target and link dependencies" is for CMake, right?
   
   Calling `add_library()`, `target_link_libraries()` and so on for `libmexclient.so` and `gateway.mexa64` are OK for me. Because it's one of natural CMake usages.
   
   > Our hope is that we could potentially preserve some of this simplicity by providing a few helper macros.
   
   "a few helper macros" are CMake's macros, right?
   
   +1
   
   > Are we OK with continuing to build the MEX `gateway` from a [predefined "template" source file](https://github.com/mathworks/libmexclass/blob/484ade97176c611cbb7056b941df99f1f3a6a0dc/libmexclass/cpp/source/libmexclass/mex/mex_gateway_function.cpp#L62) and using "code injection" (i.e. `target_compile_definitions`) to inform the compiler what header file and class name to use for the client subclass of `libmexclass::proxy::Factory`?
   
   Can we use C++ template feature instead of the approach? I may provide a code example once I understand who instantiates `MexFunction`. 
   
   > it seems like there are two different approaches being suggested here. One is to use `FetchContent` to essentially "embed" the build system of `libmexclass` into the client build system (i.e. make the targets and macros available). The other approach seems to be creating a CMake package and using it in conjunction with `find_package`. 
   
   Right.
   
   > Are there any strong opinions on which approach is preferable? At first glance, it seems like using `FetchContent` might be a bit simpler to setup.
   
   I don't have a strong opinion. I think that the latter approach (installing `libmexclass` and use it as a CMake package) is better because multiple MATLAB modules can share one `libmexclass` installation. But I'm OK with the former approach (using `FetchContent`).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kevingurney commented on a diff in pull request #34563: GH-33854: [MATLAB] Add basic libmexclass integration code to MATLAB interface

Posted by "kevingurney (via GitHub)" <gi...@apache.org>.
kevingurney commented on code in PR #34563:
URL: https://github.com/apache/arrow/pull/34563#discussion_r1139036861


##########
matlab/tools/cmake/BuildMatlabArrowInterface.cmake:
##########
@@ -0,0 +1,80 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+# -------
+# Config
+# -------
+
+# Build configuration for libmexclass.
+set(CUSTOM_PROXY_FACTORY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy;${CMAKE_SOURCE_DIR}/src/cpp")
+set(CUSTOM_PROXY_FACTORY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/proxy/factory.cc")
+set(CUSTOM_PROXY_SOURCES "${CMAKE_SOURCE_DIR}/src/cpp/arrow/matlab/array/proxy/float64_array.cc")
+set(CUSTOM_PROXY_INCLUDE_DIR "${CMAKE_SOURCE_DIR}/src/cpp;${ARROW_INCLUDE_DIR}")
+set(CUSTOM_PROXY_LINK_LIBRARIES ${ARROW_LINK_LIB})
+# On Windows, arrow.dll must be installed regardless of
+# whether Arrow_FOUND is true or false. Therefore, we explicitly
+# copy ARROW_SHARED_LIB to the installation folder +libmexclass/+proxy.
+set(CUSTOM_PROXY_RUNTIME_LIBRARIES ${ARROW_SHARED_LIB})
+set(CUSTOM_PROXY_FACTORY_HEADER_FILENAME "factory.h")
+set(CUSTOM_PROXY_FACTORY_CLASS_NAME "arrow::matlab::proxy::Factory")

Review Comment:
   Thank you @kou and @assignUser for all of your feedback! It's all very helpful.
   
   ---
   
   @assignUser - thanks for suggesting the use of `FetchContent`. I can see how using it would make `libmexclass` more "tightly integrated" with client CMake code by directly exposing targets that can be manipulated by the client.
   
   However, I want to make sure we are all on the same page before starting any refactoring efforts on the `libmexclass` build system. To start, I thought it might be helpful to clarify the current "architecture" of the `libmexclass` build system.
   
   The way `libmexclass` currently works is by building 3 separate binaries (one of which is named `libmexclass.so`).
   
   The 3 binaries and their link dependencies look like the following:
   
   1. `libmexclass.so` -- depends on --> MATLAB shared libraries
   2. `libmexclassclient.so` -- depends on --> `libmexclass.so` AND custom client link libraries (e.g. `libarrow.so`)
   3. `gateway.mexa64` -- depends on --> `libmexclass.so` AND `libmexclassclient.so`
   
   `libmexclassclient.so` is a shared library that is built directly from the specified client code (e.g. `float64_array.cc`). The "semantic" use of this library is to provide a concrete subclass of `libmexclass::proxy::Factory` to the MEX gateway, so that `Proxy` instances can be created appropriately. To be clear - `libmexclassclient.so` is the only binary that is actually built directly from code that the client CMake project provides. The rest of the binaries are built from code in the `libmexclass` project.
   
   One thing worth explicitly noting, is that `target_compile_definitions` is used to "dynamically" build a MEX binary from [predefined "template" code](https://github.com/mathworks/libmexclass/blob/484ade97176c611cbb7056b941df99f1f3a6a0dc/libmexclass/cpp/source/libmexclass/mex/mex_gateway_function.cpp#L62). Essentially, we use these compiler definitions to inform the compiler what the appropriate [header file](https://github.com/mathworks/libmexclass/blob/484ade97176c611cbb7056b941df99f1f3a6a0dc/libmexclass/cpp/source/libmexclass/mex/mex_gateway_function.cpp#L7) is for the client's subclass of `libmexclass::proxy::Factory` and what the [name of their subclass is](https://github.com/mathworks/libmexclass/blob/484ade97176c611cbb7056b941df99f1f3a6a0dc/libmexclass/cpp/source/libmexclass/mex/mex_gateway_function.cpp#L62).
   
   Hopefully, this helps somewhat to clear up what is going on under the hood when building a CMake project that is using `libmexclass`.
   
   ---
   
   In terms of moving forward, a few questions:
   
   1. Assuming we refactor the `libmexclass` build system so that its targets can be used via `FetchContent`, are we OK if the client code still has to call a few macros to set up the appropriate target and link dependencies between the client shared library (i.e. their `Factory` and `Proxy` code) and the MEX `gateway` target? The original design of the build system for `libmexclass` was designed to optimize for simplicity of use (which happens to be at expense of some control in this case). Our hope is that we could potentially preserve some of this simplicity by providing a few helper macros.
   2. Are we OK with continuing to build the MEX `gateway` from a [predefined "template" source file](https://github.com/mathworks/libmexclass/blob/484ade97176c611cbb7056b941df99f1f3a6a0dc/libmexclass/cpp/source/libmexclass/mex/mex_gateway_function.cpp#L62) and using "code injection" (i.e. `target_compile_definitions`) to inform the compiler what header file and class name to use for the client subclass of `libmexclass::proxy::Factory`? If we aren't OK with this, then we would need the client to author the MEX gateway source code entirely on their own, which would be more error prone and would probably require us to modify the `libmexclass` APIs.
   3. It's possible I am misunderstanding, but it seems like there are two different approaches being suggested here. One is to use `FetchContent` to essentially "embed" the build system of `libmexclass` into the client build system (i.e. make the targets and macros available). The other approach seems to be creating a CMake package and using it in conjunction with `find_package`. Are there any strong opinions on which approach is preferable? At first glance, it seems like using `FetchContent` might be a bit simpler to setup.
   
   ---
   
   If anything is unclear, please don't hesitate to let me know.
   
   Thank you very much for thinking through this with us!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org