Spring AI with Ollama

Introduction to Spring AI with Ollama

Welcome to a comprehensive guide on leveraging Spring AI with Ollama to develop AI-driven applications in Java. This tutorial covers everything from setting up Ollama locally and configuring your development environment to building an application that uses large language models for text generation.

Setting Up Ollama

Installing Ollama

To use Ollama locally, you must first install it on your machine. This involves downloading the Ollama software from the official repository and configuring it to run without Docker. Follow the installation instructions provided in the Ollama README file available on their official GitHub page.
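On Linux, for example, the official one-line install script can be used (macOS and Windows installers are available from the Ollama download page):

```shell
# Official Ollama install script for Linux; downloads and installs the latest release
curl -fsSL https://ollama.com/install.sh | sh

# Verify the installation
ollama --version
```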

Ollama download link: https://ollama.com/download

Downloading the Mistral Model

After setting up Ollama, you can download the Mistral model directly from the Ollama management interface. Mistral is designed for a broad range of applications, offering robust text generation capabilities.

# Run this command in a terminal after installing Ollama.
# On the first run, the model is downloaded before it starts.
# There are many models; let's start with 'mistral'.
ollama run mistral

Once the model is running, you can confirm the server is up by opening http://localhost:11434 in a browser, which responds with a plain-text status message.
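You can also check the server from the command line; the root endpoint returns a status message, and /api/generate serves completions (assuming the default port 11434):

```shell
# The root endpoint reports whether the server is up
curl http://localhost:11434

# Request a non-streaming completion from the local Ollama API
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```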

Exploring Other Available Models

Ollama supports several models, each with unique characteristics tailored to different tasks:

  • Orca-mini: Ideal for small-scale, quick-response applications.
  • Llama2: Suitable for more demanding tasks requiring deeper context understanding.
  • Llama3: Meta's successor to Llama2 and a more capable openly available LLM.
  • Custom models: Developed for specialized tasks; check the Ollama repository for more details.

To read more about all available models, see the Ollama models library.
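A few Ollama CLI commands are handy when trying these models out (note that each model downloads several gigabytes on first use):

```shell
# Download a model without starting an interactive session
ollama pull llama3
# List the models available locally
ollama list
# Remove a model you no longer need
ollama rm orca-mini
```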

Integrating Ollama with Spring AI: Maven and Gradle Dependencies
Maven Dependency Configuration

To integrate Ollama with Spring AI in a Maven project, you need to add the following dependency to your pom.xml file. This dependency will include the necessary Spring AI Ollama libraries in your project.
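The starter's coordinates are shown below; the version is omitted on the assumption that it is managed by the Spring AI BOM or a parent POM:

```xml
<!-- Add this inside the <dependencies> section of your pom.xml -->
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
</dependency>
```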

Gradle Dependency Configuration

For Gradle-based projects, you will include the Spring AI Ollama dependency in your build.gradle file. This allows Gradle to manage the library and its associated dependencies.

// Add this to the dependencies block of your build.gradle
dependencies {
    implementation 'org.springframework.ai:spring-ai-ollama-spring-boot-starter'
}
Repository Configuration

Since Spring AI artifacts are typically published in the Spring Milestone and Snapshot repositories, you might need to add these repositories to your build file if the dependencies are not found in Maven Central. Here's how you can do it:

<!-- For Maven, add this to your pom.xml -->
<repositories>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones Repository</name>
        <url>https://repo.spring.io/milestone</url>
    </repository>
</repositories>

// For Gradle, add this to your build.gradle
repositories {
    mavenCentral()
    maven { url 'https://repo.spring.io/milestone' }
}
Example pom.xml Configuration

A minimal pom.xml for such a project might look like the following (the group, artifact, and Spring Boot version values are illustrative; the Spring AI starter version is assumed to be managed through the Spring AI BOM):

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>3.2.5</version>
        <relativePath /> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.codekatha</groupId>
    <artifactId>spring-ai-backend</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <description>Spring AI backend</description>
    <dependencies>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
        </dependency>
    </dependencies>
    <repositories>
        <repository>
            <id>spring-milestones</id>
            <name>Spring Milestones</name>
            <url>https://repo.spring.io/milestone</url>
        </repository>
    </repositories>
</project>

These configurations ensure that your project is set up to utilize the latest Spring AI capabilities with Ollama, allowing you to build and run AI-enhanced applications seamlessly.

Application Properties Configuration

Add an application.properties file under the src/main/resources directory to enable and configure the Ollama chat model:

# Base URL where the Ollama API server is running
spring.ai.ollama.base-url=http://localhost:11434

# The model name to be used for text generation
spring.ai.ollama.chat.options.model=mistral

# The temperature setting controls the creativity of the model's responses
spring.ai.ollama.chat.options.temperature=0.7

Explanation of properties:

  • spring.ai.ollama.base-url: This property specifies the base URL where the Ollama API server is hosted. For local development, it might be http://localhost:11434. Replace this with your actual server URL if different.
  • spring.ai.ollama.chat.options.model: This property sets the model name to be used for generating text responses. In this case, it's set to mistral, which is one of the supported models.
  • spring.ai.ollama.chat.options.temperature: The temperature parameter controls the randomness of the model's output. A value of 0.7 means that the model will produce fairly creative responses without being too random. Lower values make the output more focused and deterministic, while higher values make it more random and creative.

These properties will create an OllamaChatModel implementation that you can inject into your classes and use to generate text responses based on input prompts.

Sample Controller

Here is an example of a simple @RestController class that uses the chat model for text generation.

package codeKatha;

import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class ChatController {

    // Injecting OllamaChatModel via constructor
    private final OllamaChatModel chatModel;

    public ChatController(OllamaChatModel chatModel) {
        this.chatModel = chatModel;
    }

    /**
     * Endpoint to generate text based on a given prompt.
     * @param prompt the input prompt to generate text
     * @return generated text response from the model
     */
    @GetMapping("/generateText")
    public String generateText(@RequestParam String prompt) {
        // Call the model with the prompt and return the generated content
        return chatModel.call(new Prompt(prompt)).getResult().getOutput().getContent();
    }

    /**
     * Endpoint to generate a streaming response for a given message.
     * For this you may need to add the Reactor (Flux) dependency.
     * @param message the input message to generate a response
     * @return a Flux stream of ChatResponse containing the generated responses
     */
    @GetMapping("/generateStream")
    public Flux<ChatResponse> generateStream(@RequestParam(value = "message", defaultValue = "Tell me a joke") String message) {
        // Create a prompt from the user message and return a stream of responses
        Prompt prompt = new Prompt(message);
        return chatModel.stream(prompt);
    }
}
For the /generateText API you do not need anything extra. For the /generateStream API, however, you may need to add the Reactor (Flux) and web dependencies to your pom.xml, along with the following configuration:
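Once the application is running (on port 8080 by default), the two endpoints can be exercised with curl:

```shell
# Simple one-shot generation
curl "http://localhost:8080/generateText?prompt=What%20is%20Spring%20AI"

# Streaming generation; responses arrive as a stream of ChatResponse chunks
curl "http://localhost:8080/generateStream?message=Tell%20me%20a%20joke"
```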



package codeKatha;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestClient;

@Configuration
public class RestClientConfig {

    @Bean
    public RestClient.Builder restClientBuilder() {
        return RestClient.builder();
    }
}

Integrating Spring AI with Ollama offers powerful capabilities for developing AI-powered applications. This guide provides the foundational knowledge needed to implement LLMs in your Java projects, enabling you to harness the potential of AI in your software solutions.

Also Read: Spring Boot with OpenAI (ChatGPT)

