Posted to dev@doris.apache.org by 陈明雨 <mo...@163.com> on 2020/07/14 14:23:20 UTC

[Proposal] Modify the code structure of FE to maven multi-module structure

**Motivation**




We have recently introduced the Spark Load feature, which needs to upload the SparkDpp runtime jar to the Spark cluster during a load. Because the current FE code base is a single module, the SparkDpp runtime jar cannot be compiled separately, so for now we upload the entire palo-fe.jar to the Spark cluster. This is a temporary workaround, not a proper solution.




Secondly, palo-fe.jar does not contain other third-party libraries that it depends on. If the SparkDpp library depends on other third-party libraries (such as Roaring Bitmap), the jar package of this third-party library also needs to be uploaded. However, it is difficult to ensure that all dependent third-party libraries are uploaded, resulting in exceptions such as ClassNotFoundException when Spark jobs are running.




Therefore, I decided to build the FE code using maven's multi-module approach.




**New code structure**




```
fe/
├── pom.xml           // parent pom
├── fe-core
│   ├── pom.xml     // module pom
│   ├── src
│   │   ├── main
│   │   └── test
├── spark-dpp
│   ├── pom.xml     // module pom
│   ├── src
```

The new `fe/` directory is the parent directory of all sub-modules.

`fe-core/` is the original `fe/` directory.

Both `fe-core/` and `spark-dpp/` are now sub-modules.




**How to build**




The first PR will only change the directory layout, without any code changes, so the build method and output are exactly the same as before.




**Conflicts with other ongoing PRs**




This change will affect all FE-related PRs that have not yet been merged. These PRs will have conflicts that cannot be resolved on the GitHub web page. Here is a way to resolve them:




1. Make sure that all your code changes have been pushed to your own GitHub repo, on a branch such as `my_dev_branch`.

2. Check out the master branch with the new FE code structure (`upstream-apache` points to https://github.com/apache/incubator-doris):

    `git checkout -b my_new_dev_branch upstream-apache/master`

3. Pull the code from your own GitHub repo:

    `git pull https://github.com/your/incubator-doris.git my_dev_branch`

    If your code had no conflicts before, there should be none now, and the pull will create a merge commit.

4. Push this new branch to your GitHub repo and open a new PR:

    `git push origin my_new_dev_branch`
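
To see why the pull is expected to merge cleanly, the steps above can be tried in throwaway local repositories. The following is only a sketch under stated assumptions: the repo names `upstream`, `fork` and `checkout`, the file `Foo.java` and its contents are hypothetical stand-ins, the move is reduced to a single `git mv`, and `--no-rebase` is added because recent Git versions may ask for an explicit merge or rebase choice when branches have diverged. It relies on Git's rename detection to carry a change made under the old path over to the moved file.

```shell
#!/bin/sh
# Sketch: simulate the FE directory move and the merge steps locally.
# Repo names, file names and contents are hypothetical stand-ins.
set -eu

tmp=$(mktemp -d)
cd "$tmp"

# Identity for the throwaway commits.
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com
export GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

# "upstream" stands in for the Apache repo with the old single-module layout.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
mkdir -p upstream/fe/src/main
printf 'line1\nline2\nline3\n' > upstream/fe/src/main/Foo.java
git -C upstream add .
git -C upstream commit -q -m "old layout"

# "fork" stands in for your own repo, with unmerged work on my_dev_branch.
git clone -q upstream fork
git -C fork checkout -q -b my_dev_branch
printf 'line1\nline2 changed\nline3\n' > fork/fe/src/main/Foo.java
git -C fork commit -q -am "my change"

# Upstream then moves the sources into fe/fe-core/ without editing them.
mkdir upstream/fe/fe-core
git -C upstream mv fe/src fe/fe-core/src
git -C upstream commit -q -m "move FE sources into fe-core"

# The steps from the list: branch off the new master, then pull the old branch.
git clone -q upstream checkout
cd checkout
git checkout -q -b my_new_dev_branch origin/master
git pull -q --no-rebase --no-edit ../fork my_dev_branch

# The change made under the old path now lives under the new path.
cat fe/fe-core/src/main/Foo.java
```

In a real checkout the last step would be followed by `git push origin my_new_dev_branch`, as in the list above. Files that were edited on both sides can still conflict and need manual resolution.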




**Work to do**




[ ] Change the FE code base structure to maven multi-module.

[ ] Extract SparkDpp code to new sub-module.

[ ] Change `fe_plugins/` to be a new sub-module.




--

Best Regards
陈明雨 Mingyu Chen

Email:
chenmingyu@apache.org