Merged
222 changes: 145 additions & 77 deletions docs/source/developers/java/building.rst
@@ -47,25 +47,156 @@ repository:
$ cd arrow
$ git submodule update --init --recursive

Basic Installation
------------------
Arrow Java modules can be compiled with any of the following tools:

- Maven
- Docker Compose
- Archery

Building Java Modules
---------------------

To build the default modules, go to the project root and execute:

Maven
~~~~~

.. code-block::

$ cd arrow/java
$ export JAVA_HOME=<absolute path to your java home>
$ java --version
$ mvn clean install
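
Before running Maven, it can help to confirm that ``JAVA_HOME`` points at a real JDK. The following is a minimal sketch, not part of the Arrow build: ``is_java_home`` is a hypothetical helper that only checks for an executable ``bin/java`` under the given directory.

.. code-block:: shell

    # Hypothetical helper (not part of Arrow): succeed only if the given
    # directory looks like a JDK home, i.e. it contains an executable bin/java.
    is_java_home() {
        [ -n "$1" ] && [ -x "$1/bin/java" ]
    }

    if is_java_home "${JAVA_HOME:-}"; then
        echo "JAVA_HOME looks usable: $JAVA_HOME"
    else
        echo "JAVA_HOME is unset or does not contain bin/java" >&2
    fi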

Building JNI Libraries on Linux
-------------------------------
Docker compose
~~~~~~~~~~~~~~

.. code-block::

$ cd arrow/java
$ export JAVA_HOME=<absolute path to your java home>
$ java --version
$ docker-compose run debian-java

Archery
~~~~~~~

.. code-block::

$ cd arrow/java
$ export JAVA_HOME=<absolute path to your java home>
$ java --version
$ archery docker run debian-java

Building JNI Libraries (.dylib / .so)
-------------------------------------

First, we need to build the `C++ shared libraries`_ that the JNI bindings will use.
We can build these manually, or we can use `Archery`_ to build them in a Docker container
(this requires installing Docker, Docker Compose, and Archery).

Note: If you are building on Apple Silicon, be sure to use a JDK version that was compiled
for that architecture. See, for example, the `Azul JDK <https://www.azul.com/downloads/?os=macos&architecture=arm-64-bit&package=jdk>`_.
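
One way to check which architecture a JDK was built for is to compare the machine architecture with the JDK's ``os.arch`` property. The sketch below is illustrative only: ``jdk_arch`` is a hypothetical helper that parses the output of ``java -XshowSettings:properties -version``, and the check is skipped when ``java`` is not on the ``PATH``. Note that the two tools name the same architecture differently (``uname -m`` typically reports ``arm64`` on Apple Silicon, while a native JDK reports ``aarch64``).

.. code-block:: shell

    # Hypothetical helper: read `java -XshowSettings:properties -version`
    # output on stdin and print the value of the os.arch property.
    jdk_arch() {
        sed -n 's/^[[:space:]]*os\.arch = //p' | head -n 1
    }

    # Print both values side by side; the JDK writes its settings to stderr.
    if command -v java >/dev/null 2>&1; then
        echo "machine: $(uname -m), JDK: $(java -XshowSettings:properties -version 2>&1 | jdk_arch)"
    fi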

Maven
~~~~~

- To build only the C Data Interface library:
Member

If we itemize this subsection, we need to indent the following ``.. code-block::`` so that it belongs to the list item.

Contributor Author

Changed


.. code-block::

$ cd arrow/java
$ export JAVA_HOME=<absolute path to your java home>
$ java --version
$ mvn clean generate-resources -Pgenerate-cdata-dylib_so -N
$ ls -latr ../java-dist/lib
|__ libarrow_cdata_jni.dylib

- To build other JNI libraries:

No options are available with Maven.

CMake
~~~~~

- To build only the JNI C Data Interface library:

.. code-block::

$ cd arrow
$ mkdir -p java-dist java-jni
$ cmake \
-S java \
-B java-jni \
-DARROW_JAVA_JNI_ENABLE_C=ON \
-DARROW_JAVA_JNI_ENABLE_DEFAULT=OFF \
-DBUILD_TESTING=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=java-dist/lib
$ cmake --build java-jni --target install --config Release
$ ls -latr java-dist/lib
|__ libarrow_cdata_jni.dylib

- To build JNI ORC & Gandiva libraries:

.. code-block::

$ cd arrow
$ brew bundle --file=cpp/Brewfile
Homebrew Bundle complete! 25 Brewfile dependencies now installed.
$ brew uninstall aws-sdk-cpp
(We can't use aws-sdk-cpp installed by Homebrew because it has
an issue: https://github.com/aws/aws-sdk-cpp/issues/1809 )
$ export JAVA_HOME=<absolute path to your java home>
$ mkdir -p java-dist cpp-jni
$ cmake \
-S cpp \
-B cpp-jni \
-DARROW_CSV=ON \
-DARROW_DATASET=ON \
-DARROW_DEPENDENCY_SOURCE=BUNDLED \
-DARROW_DEPENDENCY_USE_SHARED=OFF \
-DARROW_FILESYSTEM=ON \
-DARROW_GANDIVA=ON \
-DARROW_GANDIVA_JAVA=ON \
-DARROW_GANDIVA_STATIC_LIBSTDCPP=ON \
-DARROW_JNI=ON \
-DARROW_ORC=ON \
-DARROW_PARQUET=ON \
-DARROW_PLASMA=ON \
-DARROW_PLASMA_JAVA_CLIENT=ON \
-DARROW_S3=ON \
-DARROW_USE_CCACHE=ON \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_LIBDIR=lib \
-DCMAKE_INSTALL_PREFIX=java-dist \
-DCMAKE_UNITY_BUILD=ON
$ cmake --build cpp-jni --target install --config Release
$ ls -latr java-dist/lib
|__ libarrow_orc_jni.dylib
|__ libgandiva_jni.dylib

- To build only the Dataset library:

.. code-block::

$ cd arrow
$ mkdir -p java-dist java-jni
$ cmake \
-S java \
-B java-jni \
-DARROW_JAVA_JNI_ENABLE_DATASET=ON \
-DARROW_JAVA_JNI_ENABLE_DEFAULT=OFF \
-DBUILD_TESTING=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=java-dist/lib \
-DCMAKE_PREFIX_PATH=$PWD/java-dist
$ cmake --build java-jni --target install --config Release
$ ls -latr java-dist/lib
|__ libarrow_dataset_jni.dylib
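
Instead of inspecting ``ls`` output by hand, the expected libraries can be checked with a small script. This is a sketch under assumptions: ``missing_jni_libs`` is a hypothetical helper, the library names are the ones listed in the steps above, and the extension is assumed to be ``.dylib`` on macOS and ``.so`` everywhere else.

.. code-block:: shell

    # Hypothetical helper: print each expected library that is missing.
    # Usage: missing_jni_libs <dir> <extension> <name>...
    missing_jni_libs() {
        lib_dir=$1
        lib_ext=$2
        shift 2
        for name in "$@"; do
            [ -f "$lib_dir/lib$name.$lib_ext" ] || echo "lib$name.$lib_ext"
        done
    }

    # Pick the extension from the platform (assumption: .so unless macOS).
    case "$(uname -s)" in
        Darwin) ext=dylib ;;
        *) ext=so ;;
    esac

    # Prints nothing when every library is present in java-dist/lib.
    missing_jni_libs java-dist/lib "$ext" \
        arrow_cdata_jni arrow_dataset_jni arrow_orc_jni gandiva_jni

Run it from the ``arrow`` directory after the CMake steps; any name it prints corresponds to a build that did not install its library.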

Archery
~~~~~~~

.. code-block::

$ cd arrow
@@ -77,85 +208,22 @@ We can build these manually or we can use `Archery`_ to build them using a Docke
|__ libgandiva_jni.so
|__ libplasma_java.so

Building JNI Libraries on MacOS
-------------------------------
Note: If you are building on Apple Silicon, be sure to use a JDK version that was compiled for that architecture. See, for example, the `Azul JDK <https://www.azul.com/downloads/?os=macos&architecture=arm-64-bit&package=jdk>`_.
Building Java JNI Modules
-------------------------

First, you need to build Apache Arrow C++:
- To compile the JNI bindings, use the ``arrow-c-data`` Maven profile:

.. code-block::

$ cd arrow
$ brew bundle --file=cpp/Brewfile
Homebrew Bundle complete! 25 Brewfile dependencies now installed.
$ brew uninstall aws-sdk-cpp
(We can't use aws-sdk-cpp installed by Homebrew because it has
an issue: https://github.com/aws/aws-sdk-cpp/issues/1809 )
$ export JAVA_HOME=<absolute path to your java home>
$ mkdir -p java-dist cpp-jni
$ cmake \
-S cpp \
-B cpp-jni \
-DARROW_CSV=ON \
-DARROW_DATASET=ON \
-DARROW_DEPENDENCY_USE_SHARED=OFF \
-DARROW_FILESYSTEM=ON \
-DARROW_GANDIVA=ON \
-DARROW_GANDIVA_JAVA=ON \
-DARROW_GANDIVA_STATIC_LIBSTDCPP=ON \
-DARROW_JNI=ON \
-DARROW_ORC=ON \
-DARROW_PARQUET=ON \
-DARROW_PLASMA=ON \
-DARROW_PLASMA_JAVA_CLIENT=ON \
-DARROW_S3=ON \
-DAWSSDK_SOURCE=BUNDLED \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_LIBDIR=lib \
-DCMAKE_INSTALL_PREFIX=java-dist \
-DCMAKE_UNITY_BUILD=ON \
-Dre2_SOURCE=BUNDLED
$ cmake --build cpp-jni --target install
$ ls -latr ../java-dist/lib
|__ libarrow_orc_jni.dylib
|__ libgandiva_jni.dylib
|__ libplasma_java.dylib

Then, you can build JNI libraries:

.. code-block::

$ mkdir -p java-jni
$ cmake \
-S java \
-B java-jni \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=java-dist/lib \
-DCMAKE_PREFIX_PATH=java-dist
$ cmake --build java-jni --target install
$ ls -latr ../java-dist/lib
|__ libarrow_cdata_jni.dylib
|__ libarrow_dataset_jni.dylib

To build other JNI libraries:
.. code-block::

$ cd arrow/java
$ mvn -Darrow.c.jni.dist.dir=<absolute path to your arrow folder>/java-dist/lib -Parrow-c-data clean install

Building Arrow JNI Modules
--------------------------
- To compile the JNI bindings for ORC / Gandiva / Dataset, use the ``arrow-jni`` Maven profile:

To compile the JNI bindings, use the ``arrow-c-data`` Maven profile:
.. code-block::

.. code-block::

$ cd arrow/java
$ mvn -Darrow.c.jni.dist.dir=<absolute path to your arrow folder>/java-dist/lib -Parrow-c-data clean install

To compile the JNI bindings for ORC / Gandiva / Dataset, use the ``arrow-jni`` Maven profile:

.. code-block::

$ cd arrow/java
$ mvn -Darrow.cpp.build.dir=<absolute path to your arrow folder>/java-dist/lib -Parrow-jni clean install
$ cd arrow/java
$ mvn -Darrow.cpp.build.dir=<absolute path to your arrow folder>/java-dist/lib -Parrow-jni clean install
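
Both Maven properties above expect an absolute path to ``java-dist/lib``. As a hedged sketch, that path can be derived from the checkout location rather than typed by hand; ``arrow_dist_lib`` is a hypothetical helper, and it assumes the libraries were installed under ``<arrow root>/java-dist/lib`` as in the earlier steps.

.. code-block:: shell

    # Hypothetical helper: print the absolute path of <arrow root>/java-dist/lib.
    # Usage: arrow_dist_lib <path to arrow checkout>
    arrow_dist_lib() {
        root=$(CDPATH= cd -- "$1" && pwd) || return 1
        echo "$root/java-dist/lib"
    }

    # From arrow/java, the checkout root is the parent directory, e.g.:
    #   mvn -Darrow.c.jni.dist.dir="$(arrow_dist_lib ..)" -Parrow-c-data clean install
    #   mvn -Darrow.cpp.build.dir="$(arrow_dist_lib ..)" -Parrow-jni clean install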

IDE Configuration
=================
64 changes: 64 additions & 0 deletions java/pom.xml
@@ -905,6 +905,70 @@
</build>
</profile>

<profile>
<id>generate-cdata-dylib_so</id>
<properties>
<arrow.c.jni.dist.dir>java-dist/lib</arrow.c.jni.dist.dir>
</properties>
<build>
<plugins>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
<version>3.1.0</version>
<executions>
<execution>
<id>cdatadir</id>
<phase>generate-resources</phase>
<goals>
<goal>exec</goal>
</goals>
<configuration>
<executable>mkdir</executable>
<commandlineArgs>-p java-dist java-jni</commandlineArgs>
<workingDirectory>../</workingDirectory>
</configuration>
</execution>
<execution>
<id>cdatadefine</id>
<phase>generate-resources</phase>
<goals>
<goal>exec</goal>
</goals>
<configuration>
<executable>cmake</executable>
<commandlineArgs>
-S java
-B java-jni
-DARROW_JAVA_JNI_ENABLE_C=ON
-DARROW_JAVA_JNI_ENABLE_DEFAULT=OFF
-DBUILD_TESTING=OFF
-DCMAKE_BUILD_TYPE=Release
-DCMAKE_INSTALL_PREFIX=${arrow.c.jni.dist.dir}
</commandlineArgs>
<workingDirectory>../</workingDirectory>
</configuration>
</execution>
<execution>
<id>cdatabuild</id>
<phase>generate-resources</phase>
<goals>
<goal>exec</goal>
</goals>
<configuration>
<executable>cmake</executable>
<commandlineArgs>
--build java-jni --target install --config Release
</commandlineArgs>
<workingDirectory>../</workingDirectory>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
</profile>

</profiles>

</project>