The service is an example of a Java app interacting with an ONNX model, using the Apache Camel KServe component.
- The Java app exposes REST endpoints via the Apache Camel Undertow component
- the ONNX model is hosted on an NVIDIA Triton Inference Server (locally)
- the latest Apache Camel KServe component is used for the integration
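As a rough sketch of the wiring (the endpoint URIs, options, and processor names here are indicative, not taken from the repo; check the camel-kserve documentation for your Camel version):

```java
import org.apache.camel.Exchange;
import org.apache.camel.builder.RouteBuilder;

public class InferenceRoute extends RouteBuilder {

    @Override
    public void configure() {
        // Undertow exposes the REST endpoint; camel-kserve calls Triton over gRPC.
        from("undertow:http://0.0.0.0:8080/fill-mask")
                .process(this::buildInferRequest)  // sentence -> ModelInferRequest
                .to("kserve:infer?modelName=bert-base-uncased&modelVersion=1&target=localhost:8001")
                .process(this::extractTopWords);   // ModelInferResponse -> top-5 words
    }

    private void buildInferRequest(Exchange exchange) {
        // tokenize the sentence and build the V2 inference request ...
    }

    private void extractTopWords(Exchange exchange) {
        // read the float scores and keep the 5 best words ...
    }
}
```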
More detail about KServe can also be found in my article about the Java apache-camel-kserve component.
The NVIDIA Triton Inference Server was selected because it is a host for pre-trained models that:
- accepts ONNX models
- supports the standard V2 Inference Protocol
- scales easily, with deployment in Kubernetes-native frameworks such as KServe (formerly KFServing)
Both models are pre-trained and retrieved from Hugging Face; they are used here for learning purposes only. Both need to be downloaded from Hugging Face and placed into the right folder to work. The models are then served by Triton, started via docker-compose.
I've used the bert-base-uncased model, which finds possible words to replace the "[MASK]" in a sentence. For example, we pass "Have a nice [MASK]" and expect "day" among the top outputs (completing "have a nice day").
- First, download the ONNX file and put it in triton-server/models/bert-base-uncased/1.
- Then, download vocab.txt and put it into sentiment-service/src/main/resources.
- Finally, tokenizer.json is used by the line HuggingFaceTokenizer.newInstance("bert-base-uncased") (no action needed).
Model Input using Java
For the input, I've used the HuggingFaceTokenizer class, provided by ai.djl and Hugging Face. There's an implementation for BERT that makes our life easy! It converts the words into ids that the model can understand.
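For illustration, a minimal sketch of the tokenization step (the variable names are mine):

```java
import ai.djl.huggingface.tokenizers.Encoding;
import ai.djl.huggingface.tokenizers.HuggingFaceTokenizer;
import java.util.Arrays;

public class TokenizeExample {
    public static void main(String[] args) {
        // Fetches the tokenizer definition for bert-base-uncased automatically.
        HuggingFaceTokenizer tokenizer = HuggingFaceTokenizer.newInstance("bert-base-uncased");
        Encoding encoding = tokenizer.encode("Have a nice [MASK]");

        long[] inputIds = encoding.getIds();                // ids the model understands
        long[] attentionMask = encoding.getAttentionMask(); // 1 for every real token

        // Prints: [[CLS], have, a, nice, [MASK], [SEP]]
        System.out.println(Arrays.toString(encoding.getTokens()));
    }
}
```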
Model Output using Java
The output is trickier. The server response comes back as a flat array of floats: for every token position in the input, there is one score per word in the vocabulary, indicating how likely that word is at that position. So if we pass in the sentence "have a nice [MASK]" and the vocabulary had only 10 words, the response would be 60 floats: the tokenizer turns the sentence into 6 tokens (the 4 words plus the special [CLS] start and [SEP] end tokens), and each of the 6 positions gets 10 scores. (The real bert-base-uncased vocabulary has 30,522 entries.)
In this service, we read only the scores at the [MASK] position, sort them, and return the top 5 words.
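A minimal sketch of that selection step, assuming the response has already been unpacked into a flat float array laid out row-major as [token position][vocabulary index] (all names here are mine):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class MaskTop5 {

    /**
     * Picks the 5 most likely vocabulary words for the [MASK] position.
     *
     * @param flatScores  the raw floats from Triton, one score per
     *                    (token position, vocabulary word) pair
     * @param inputIds    the token ids sent to the model
     * @param maskTokenId the id of the [MASK] token (103 for bert-base-uncased)
     * @param vocab       the words from vocab.txt, in order
     */
    public static List<String> top5(float[] flatScores, long[] inputIds,
                                    long maskTokenId, List<String> vocab) {
        // Find which token position holds the [MASK] token.
        int maskPos = 0;
        for (int i = 0; i < inputIds.length; i++) {
            if (inputIds[i] == maskTokenId) {
                maskPos = i;
                break;
            }
        }

        // Row-major layout: the scores for the mask position start here.
        int offset = maskPos * vocab.size();

        // Sort vocabulary indices by their score at the mask position, descending.
        return IntStream.range(0, vocab.size())
                .boxed()
                .sorted((a, b) -> Float.compare(flatScores[offset + b], flatScores[offset + a]))
                .limit(5)
                .map(vocab::get)
                .collect(Collectors.toList());
    }
}
```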
I've used the pjxcharya/onnx-sentiment-model model, which accepts a sentence and returns 2 values: the likelihood that the sentence's sentiment is good and the likelihood that it is bad.
- First, download the ONNX file and put it in triton-server/models/sentiment/1.
- There is no tokenizer.json for this model, but the same one as before works here: HuggingFaceTokenizer.newInstance("bert-base-uncased") (no action needed).
Model Input using Java
For the input, I've used the HuggingFaceTokenizer class, provided by ai.djl and Hugging Face. The BERT implementation works here too, converting the words into ids that the model can understand.
Model Output using Java
As explained, the response is only 2 floats: one for the likelihood that the sentence has a bad sentiment and one for a good one.
In this service, we compare the 2 values and simply return the outcome, good or bad.
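A minimal sketch of that comparison (the index order is my assumption; check the model's output metadata to confirm which slot holds which label):

```java
public class SentimentMapper {

    // scores[0] = bad, scores[1] = good -- an assumption; verify against
    // the model's output definition on Hugging Face / Triton metadata.
    public static String toSentiment(float[] scores) {
        return scores[1] > scores[0] ? "good" : "bad";
    }
}
```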
A docker-compose file is provided to easily start the Triton server locally with the pre-trained models loaded. To start only the Triton server, run docker-compose up triton-server --build.
A Postman collection is provided that includes REST calls both for the Triton server directly and for our service.