---
title: Deploy Apache Flink on Google Cloud C4A (Arm-based Axion VMs)

minutes_to_complete: 30

who_is_this_for: This learning path is intended for software developers deploying and optimizing Apache Flink workloads on Linux/Arm64 environments, specifically using Google Cloud C4A virtual machines powered by Axion processors.

learning_objectives:
- Provision an Arm-based SUSE SLES virtual machine on Google Cloud (C4A with Axion processors)
- Install Apache Flink on a SUSE Arm64 (C4A) instance
- Validate Flink functionality by starting the Flink cluster and running a simple baseline job (e.g., WordCount) on the Arm64 VM
- Benchmark Flink performance using internal JMH-based micro-benchmarks on the Arm64 (AArch64) architecture

prerequisites:
- A [Google Cloud Platform (GCP)](https://cloud.google.com/free) account with billing enabled
- Basic familiarity with [Apache Flink](https://flink.apache.org/) and its runtime environment

author: Pareena Verma

##### Tags
skilllevels: Introductory
subjects: Databases
cloud_service_providers: Google Cloud

armips:
- Neoverse

tools_software_languages:
- Flink
- Java
- Maven

operatingsystems:
- Linux

# ================================================================================
# FIXED, DO NOT MODIFY
# ================================================================================
further_reading:
- resource:
title: Google Cloud documentation
link: https://cloud.google.com/docs
type: documentation

- resource:
title: Flink documentation
link: https://nightlies.apache.org/flink/flink-docs-lts/
type: documentation

- resource:
title: Flink Performance Tool
link: https://github.com/apache/flink-benchmarks/tree/master?tab=readme-ov-file#flink-benchmarks
type: documentation

weight: 1
layout: "learningpathall"
learning_path_main_page: "yes"
---
---
# ================================================================================
# FIXED, DO NOT MODIFY THIS FILE
# ================================================================================
weight: 21 # Set to always be larger than the content in this path to be at the end of the navigation.
title: "Next Steps" # Always the same, html page title.
layout: "learningpathall" # All files under learning paths have this same wrapper for Hugo processing.
---
---
title: Getting started with Apache Flink on Google Axion C4A (Arm Neoverse-V2)

weight: 2

layout: "learningpathall"
---

## Google Axion C4A Arm instances in Google Cloud

Google Axion C4A is a family of Arm-based virtual machines built on Google’s custom Axion CPU, which implements Arm Neoverse-V2 cores. Designed for high-performance, energy-efficient computing, these virtual machines offer strong performance for modern cloud workloads such as CI/CD pipelines, microservices, media processing, and general-purpose applications.

The C4A series provides a cost-effective alternative to x86 virtual machines while leveraging the scalability and performance benefits of the Arm architecture in Google Cloud.

To learn more about Google Axion, refer to the [Introducing Google Axion Processors, our new Arm-based CPUs](https://cloud.google.com/blog/products/compute/introducing-googles-new-arm-based-cpu) blog.

## Apache Flink

[Apache Flink](https://flink.apache.org/) is an open-source, distributed **stream and batch data processing framework** developed under the [Apache Software Foundation](https://www.apache.org/).

Flink is designed for **high-performance, low-latency, and stateful computations** on both unbounded (streaming) and bounded (batch) data. It provides a robust runtime and APIs in **Java**, **Scala**, and **Python** for building scalable, fault-tolerant data processing pipelines.

Flink is widely used for **real-time analytics**, **event-driven applications**, **data pipelines**, and **machine learning workloads**. It integrates seamlessly with popular systems such as **Apache Kafka**, **Hadoop**, and various **cloud storage services**.

To learn more, visit the [Apache Flink official website](https://flink.apache.org/) and explore the [documentation](https://nightlies.apache.org/flink/flink-docs-release-2.1/).
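
As a quick illustration of the programming model, here is a minimal word-count sketch in Java, assuming the Flink 2.x DataStream API and its `fromData` source method; the class name and sample strings are illustrative only:

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class WordCountSketch {
    public static void main(String[] args) throws Exception {
        // Entry point for any Flink job: the execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromData("hello flink", "hello arm")              // bounded sample source
           .flatMap((String line, Collector<Tuple2<String, Integer>> out) -> {
               for (String word : line.split("\\s+")) {
                   out.collect(Tuple2.of(word, 1));           // emit (word, 1) pairs
               }
           })
           .returns(Types.TUPLE(Types.STRING, Types.INT))     // lambdas need explicit type info
           .keyBy(t -> t.f0)                                  // group by word
           .sum(1)                                            // running count per word
           .print();

        env.execute("WordCount sketch");
    }
}
```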
---
title: Apache Flink Baseline Testing on Google Axion C4A Arm Virtual Machine
weight: 5

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Apache Flink Baseline Testing on GCP SUSE VM
This guide explains how to perform **baseline testing** for Apache Flink after installation on a **GCP SUSE VM**. Baseline testing ensures that the Flink cluster is operational, the environment is correctly configured, and basic jobs run successfully.

### Download and Extract Maven
Before running Flink jobs, ensure that **Java** and **Maven** are installed on your VM.
Download Maven and extract it:

```console
cd /opt
sudo wget https://archive.apache.org/dist/maven/maven-3/3.8.6/binaries/apache-maven-3.8.6-bin.tar.gz
sudo tar -xvzf apache-maven-3.8.6-bin.tar.gz
sudo mv apache-maven-3.8.6 /opt/maven
```
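
You can also verify the downloaded archive against its published checksum, ideally before extracting. A quick sketch, assuming the standard Apache archive layout where the `.sha512` file contains only the hash:

```console
cd /opt
sudo wget https://archive.apache.org/dist/maven/maven-3/3.8.6/binaries/apache-maven-3.8.6-bin.tar.gz.sha512
echo "$(cat apache-maven-3.8.6-bin.tar.gz.sha512)  apache-maven-3.8.6-bin.tar.gz" | sha512sum -c -
```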

### Set Environment Variables
Configure the environment so Maven commands are recognized system-wide:

```console
echo "export M2_HOME=/opt/maven" >> ~/.bashrc
echo "export PATH=\$M2_HOME/bin:\$PATH" >> ~/.bashrc
source ~/.bashrc
```
Verify the Maven installation:

```console
mvn -version
```
At this point, both Java and Maven are installed and ready to use.

### Start the Flink Cluster
Before starting the Flink cluster, allow inbound traffic on TCP port 8081 (the Flink Web UI port) with a firewall rule in your GCP project.
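
One way to do this is with the gcloud CLI. A minimal sketch — the rule name and `<YOUR_IP>` placeholder are illustrative, and you should restrict the source range to your own address rather than opening the port to the internet:

```console
gcloud compute firewall-rules create allow-flink-ui \
  --allow=tcp:8081 \
  --source-ranges=<YOUR_IP>/32 \
  --description="Allow access to the Flink Web UI"
```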

Start the Flink cluster using the provided startup script:

```console
cd $FLINK_HOME
./bin/start-cluster.sh
```

You should see output similar to:
```output
Starting cluster.
[INFO] 1 instance(s) of standalonesession are already running on lpprojectsusearm64.
Starting standalonesession daemon on host lpprojectsusearm64.
Starting taskexecutor daemon on host lpprojectsusearm64.
```

Verify that the JobManager and TaskManager processes are running:

```console
jps
```

You should see output similar to:
```output
21723 StandaloneSessionClusterEntrypoint
2621 Jps
2559 TaskManagerRunner
```

### Access the Flink Web UI

Open the Flink Web UI in a browser:

```console
http://<VM_IP>:8081
```

- A successfully loaded dashboard confirms that the cluster is running and the Web UI is reachable over the network.
- This serves as the baseline for network and UI validation.
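
If a browser is not handy, you can also confirm reachability from your workstation through Flink's REST API; the `/overview` endpoint returns a small cluster-status JSON document:

```console
curl -s http://<VM_IP>:8081/overview
```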

![Flink Dashboard alt-text#center](images/flink-dashboard.png "Figure 1: Flink Dashboard")

### Run a Simple Example Job
Execute a sample streaming job to verify that Flink can run tasks correctly:

```console
cd $FLINK_HOME
./bin/flink run examples/streaming/WordCount.jar
```

- Monitor the job in the Web UI or check console logs.
- Confirm that the job completes successfully.
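
By default the job runs on built-in sample data. Assuming the standard parameters of the bundled example, you can also point it at your own input; the paths below are illustrative, and the results are written as files under the output directory:

```console
echo "arm flink arm" > /tmp/wordcount-input.txt
./bin/flink run examples/streaming/WordCount.jar \
  --input /tmp/wordcount-input.txt \
  --output /tmp/wordcount-output
find /tmp/wordcount-output -type f -exec cat {} +
```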

![Flink Dashboard alt-text#center](images/wordcount.png "Figure 2: Word Count Job")

Flink baseline testing has been completed. You can now proceed to Flink benchmarking.
---
title: Apache Flink Benchmarking
weight: 6

### FIXED, DO NOT MODIFY
layout: learningpathall
---


## Apache Flink Benchmarking
This guide provides step-by-step instructions to set up and run **Apache Flink Benchmarks** on a **GCP SUSE VM**. It covers cloning the repository, building the benchmarks, exploring the JAR, and listing available benchmarks.

### Clone the Repository
Start by cloning the official Flink benchmarks repository. This repository contains all the benchmark definitions and example jobs.

```console
cd ~
git clone https://github.com/apache/flink-benchmarks.git
cd flink-benchmarks
```

### Build the Benchmarks with Maven
Use Maven to compile the benchmarks and generate the benchmark JAR. Skip tests to save time.

```console
mvn clean package -DskipTests
```
- **mvn clean package** → Cleans previous builds and packages the project.
- **-DskipTests** → Skips running the unit tests to shorten the build.

After this step, the target directory will contain the compiled **benchmarks.jar**.

### Explore the JAR Contents
Verify the generated files inside the `target` directory:

```console
cd target
ls
```
You should see an output similar to:

```output
benchmark-0.1.jar classes generated-test-sources maven-status protoc-plugins
benchmarks.jar generated-sources maven-archiver protoc-dependencies test-classes
```
- **benchmarks.jar** → The main benchmark JAR file used to run the Flink benchmarks.

### List Available Benchmarks
To view all the benchmarks included in the JAR:

```console
java -jar benchmarks.jar -l
```
- `-l` → Lists all benchmarks packaged in the JAR.
- This helps you identify which benchmarks you want to execute on your VM.

### Run Selected Benchmarks
While the Flink benchmarking project includes multiple suites for state backends, windowing, checkpointing, and scheduler performance, this Learning Path focuses on the Remote Channel Throughput benchmark to evaluate network and I/O performance.

**Remote Channel Throughput**: Measures the data transfer rate between remote channels in Flink, helping to evaluate network and I/O performance.
```console
java -jar benchmarks.jar org.apache.flink.benchmark.RemoteChannelThroughputBenchmark.remoteRebalance
```
You should see an output similar to:
```output

Result "org.apache.flink.benchmark.RemoteChannelThroughputBenchmark.remoteRebalance":
10536.511 ±(99.9%) 60.121 ops/ms [Average]
(min, avg, max) = (10289.593, 10536.511, 10687.736), stdev = 89.987
CI (99.9%): [10476.390, 10596.633] (assumes normal distribution)

# Run complete. Total time: 00:25:14
Benchmark (mode) Mode Cnt Score Error Units
RemoteChannelThroughputBenchmark.remoteRebalance ALIGNED thrpt 30 17445.341 ± 153.256 ops/ms
RemoteChannelThroughputBenchmark.remoteRebalance DEBLOAT thrpt 30 10536.511 ± 60.121 ops/ms
```
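
Because flink-benchmarks is built on JMH, the standard JMH runner options apply (the `-l` flag used earlier is one of them). For a quicker run with machine-readable results you can reduce forks and iterations and emit JSON — the flag values below are illustrative, and full-length runs give more reliable numbers:

```console
java -jar benchmarks.jar org.apache.flink.benchmark.RemoteChannelThroughputBenchmark.remoteRebalance \
  -f 1 -wi 2 -i 5 -rf json -rff remote-rebalance.json
```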

### Flink Benchmark Metrics Explained

- **Run Count**: Total benchmark iterations executed; a higher count improves reliability.
- **Average Throughput**: Mean operations per millisecond across all iterations.
- **Standard Deviation**: Variation from the average throughput; smaller means more consistent.
- **Confidence Interval (99.9%)**: Range in which the true average throughput lies with 99.9% certainty.
- **Min Throughput**: Lowest throughput observed, showing worst-case performance.
- **Max Throughput**: Highest throughput observed, showing best-case performance.
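
As a worked example using the Arm64 DEBLOAT run above: with a standard deviation of 89.987 over 30 iterations, the standard error is 89.987 / √30 ≈ 16.43 ops/ms. Multiplying by the t-value for 99.9% confidence with 29 degrees of freedom (≈ 3.66) gives a half-width of ≈ 60.1 ops/ms, which matches the reported 10536.511 ± 60.121.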

### Benchmark summary on x86_64
To compare the benchmark results across architectures, the same benchmark was run on a `c4-standard-4` (4 vCPUs, 15 GB memory) x86_64 VM in GCP running SUSE:

| Benchmark | Mode | Count | Score (ops/ms) | Error (±) | Min | Max | Stdev | CI (99.9%) | Units |
|---------------------------------------------------|---------|-------|----------------|-----------|------------|------------|---------|------------------------|--------|
| RemoteChannelThroughputBenchmark.remoteRebalance | ALIGNED | 30 | 24873.046 | 892.673 | 11195.028 | 12425.761 | 421.057 | [11448.649, 12011.275] | ops/ms |
| RemoteChannelThroughputBenchmark.remoteRebalance | DEBLOAT | 30 | 11729.962 | 281.313 | 11195.028 | 12425.761 | 421.057 | [11448.649, 12011.275] | ops/ms |

### Benchmark summary on Arm64
Results from the earlier run on the `c4a-standard-4` (4 vCPU, 16 GB memory) Arm64 VM in GCP (SUSE):

| Benchmark | Mode | Count | Score (ops/ms) | Error (±) | Min | Max | Stdev | CI (99.9%) | Units |
|---------------------------------------------------|---------|-------|----------------|-----------|-----------|-----------|---------|------------------------|--------|
| RemoteChannelThroughputBenchmark.remoteRebalance | ALIGNED | 30 | 17445.341 | 153.256 | 10289.593 | 10687.736 | 89.987 | [10476.390, 10596.633] | ops/ms |
| RemoteChannelThroughputBenchmark.remoteRebalance | DEBLOAT | 30 | 10536.511 | 60.121 | 10289.593 | 10687.736 | 89.987 | [10476.390, 10596.633] | ops/ms |

### Apache Flink performance benchmarking comparison on Arm64 and x86_64

- The **ALIGNED mode** achieved an average throughput of **17,445 ops/ms**, the higher of the two modes on the Arm64 VM.
- The **DEBLOAT mode** achieved an average throughput of **10,537 ops/ms**; buffer debloating deliberately limits in-flight data, so somewhat lower throughput is expected.
- The benchmark confirms that the **Arm64 architecture** handles Flink's remote channel throughput workloads efficiently.
- Overall, the average throughput across both modes is approximately **13,991 ops/ms**, indicating strong baseline performance for Arm64 deployments.
---
title: Install Apache Flink
weight: 4

### FIXED, DO NOT MODIFY
layout: learningpathall
---

## Install Apache Flink on GCP VM
This guide walks you through installing **Apache Flink** and its required dependencies on a **Google Cloud Platform (GCP) SUSE Arm64 Virtual Machine (VM)**. By the end of this section, you will have a fully configured Flink environment ready for job execution and benchmarking.

### Update the System and Install Java
Before installing Flink, ensure your system packages are up to date and Java is installed.

```console
sudo zypper refresh
sudo zypper update -y
sudo zypper install -y java-17-openjdk java-17-openjdk-devel
```
This step ensures you have the latest system updates and the Java runtime needed to execute Flink applications.
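
You can confirm the Java installation before moving on:

```console
java -version
```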

### Download Apache Flink Binary
Next, download the pre-built binary package for **Apache Flink** from the official Apache mirror.

```console
cd /opt
sudo wget https://dlcdn.apache.org/flink/flink-2.1.0/flink-2.1.0-bin-scala_2.12.tgz
```
This command retrieves the official Flink binary distribution for installation on your VM.

{{% notice Note %}}
Flink 2.0.0 introduced the Disaggregated State Management architecture, which enables more efficient resource utilization in cloud-native environments and delivers high-performance real-time processing while minimizing resource overhead.
For details, see the [Flink 2.0.0 release announcement](https://flink.apache.org/2025/03/24/apache-flink-2.0.0-a-new-era-of-real-time-data-processing/).

The [Arm Ecosystem Dashboard](https://developer.arm.com/ecosystem-dashboard/) lists Flink 2.0.0 as the minimum recommended version on Arm platforms.
{{% /notice %}}

### Extract the Downloaded Archive
Extract the downloaded `.tgz` archive to make the Flink files accessible for configuration.

```console
sudo tar -xvzf flink-2.1.0-bin-scala_2.12.tgz
```
After extraction, you will have a directory named `flink-2.1.0` under `/opt`.

**Rename the extracted directory for convenience:**
For easier access and management, rename the extracted Flink directory to a simple name like `/opt/flink`.

```console
sudo mv flink-2.1.0 /opt/flink
```
This makes future references to your Flink installation path simpler and more consistent.

### Configure Environment Variables
Set the environment variables so the Flink commands are recognized system-wide. This ensures you can run `flink` from any terminal session.

```console
echo "export FLINK_HOME=/opt/flink" >> ~/.bashrc
echo "export PATH=\$FLINK_HOME/bin:\$PATH" >> ~/.bashrc
```

Additionally, create a dedicated log directory for Flink and assign proper permissions:
```console
sudo mkdir -p /opt/flink/log
sudo chown -R $(whoami):$(id -gn) /opt/flink/log
sudo chmod -R 755 /opt/flink/log
```

**Apply the changes:**

```console
source ~/.bashrc
```

### Verify the Installation
To confirm that Flink has been installed correctly, check its version:

```console
flink -v
```

You should see an output similar to:

```output
Version: 2.1.0, Commit ID: 4cb6bd3
```
This confirms that Apache Flink has been installed and is ready for use.