Blog

Connecting Your Flutter App to a Local LLM with Ollama and Dart

Privacy-first AI is a game-changer. Learn how to connect your Flutter application to a local Ollama server for private, low-latency, and cost-free LLM power.

Posted on: 2026-03-22


As developers, we often reach for APIs like OpenAI or Gemini. But what if your app needs to work offline? What if user privacy is paramount? What if you want to experiment without worrying about token costs?

Enter Ollama. In this tutorial, we’ll learn how to bridge a Flutter application to a local Ollama server using Dart.

Why Go Local?

  1. Privacy: prompts and responses never leave the user’s machine.
  2. Offline support: the app keeps working without an internet connection.
  3. Zero cost: experiment freely with no metered token bills.
  4. Low latency: no round trip to a remote data center.

Prerequisites

  1. Ollama Installed: Download it from ollama.com.
  2. A Model Downloaded: Run ollama pull llama3 (or any other model you prefer).
  3. Flutter SDK: A working Flutter development environment.

Setting Up the Ollama Client in Dart

Ollama exposes a simple REST API on port 11434. We can use the http package in Dart to communicate with it.
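Before writing any Dart, it helps to see the shapes involved. With "stream": false, a single JSON object comes back; the values below are illustrative. The request body sent to POST /api/generate:

```json
{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}
```

And the response (abridged — Ollama also includes timing metadata such as total_duration):

```json
{
  "model": "llama3",
  "response": "Sunlight scatters off air molecules...",
  "done": true
}
```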

First, add the dependency to your pubspec.yaml:

dependencies:
  http: ^1.2.0

The API Service

Create a service to handle the communication:

import 'dart:convert';
import 'package:http/http.dart' as http;

class OllamaService {
  final String baseUrl = 'http://localhost:11434/api';

  Future<String> generateResponse(String model, String prompt) async {
    final response = await http.post(
      Uri.parse('$baseUrl/generate'),
      headers: {'Content-Type': 'application/json'},
      body: jsonEncode({
        'model': model,
        'prompt': prompt,
        'stream': false, // For simplicity, we'll start with a non-streamed response
      }),
    );

    if (response.statusCode == 200) {
      final data = jsonDecode(response.body) as Map<String, dynamic>;
      return data['response'] as String;
    } else {
      throw Exception(
          'Ollama request failed: ${response.statusCode} ${response.body}');
    }
  }
}

Handling Streaming Responses

For a modern AI experience, you’ll want tokens to appear as they’re generated. Ollama streams its output as newline-delimited JSON, one object per token, so we split the byte stream into lines before decoding each one. Add this method to OllamaService (it reuses baseUrl):

Stream<String> streamResponse(String model, String prompt) async* {
  final client = http.Client();
  try {
    final request = http.Request('POST', Uri.parse('$baseUrl/generate'))
      ..headers['Content-Type'] = 'application/json'
      ..body = jsonEncode({
        'model': model,
        'prompt': prompt,
        'stream': true,
      });

    final response = await client.send(request);

    if (response.statusCode != 200) {
      throw Exception('Ollama request failed: ${response.statusCode}');
    }

    // Ollama streams newline-delimited JSON, so split the byte stream
    // into lines before decoding; a raw chunk may contain several
    // JSON objects (or a partial one).
    final lines = response.stream
        .transform(utf8.decoder)
        .transform(const LineSplitter());

    await for (final line in lines) {
      if (line.trim().isEmpty) continue;
      final data = jsonDecode(line) as Map<String, dynamic>;
      yield data['response'] as String;
      if (data['done'] == true) break;
    }
  } finally {
    client.close();
  }
}
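The line-by-line decoding can be exercised in isolation. Here is a small, hypothetical helper (not part of the service above) that parses a buffer of Ollama-style newline-delimited JSON into its response tokens:

```dart
import 'dart:convert';

/// Extracts the 'response' field from each line of newline-delimited JSON,
/// the framing Ollama uses for streamed replies.
List<String> parseNdjsonTokens(String buffer) {
  return const LineSplitter()
      .convert(buffer)
      .where((line) => line.trim().isNotEmpty)
      .map((line) => jsonDecode(line) as Map<String, dynamic>)
      .map((data) => data['response'] as String)
      .toList();
}

void main() {
  const sample = '{"response":"Hel","done":false}\n'
      '{"response":"lo","done":true}\n';
  print(parseNdjsonTokens(sample).join()); // Hello
}
```

Pulling the parsing out like this also makes the framing logic trivially unit-testable, without a running server.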

Networking Gotcha: localhost vs. Real IP

If you’re testing on a physical mobile device (Android/iOS) while Ollama is running on your Mac/PC, localhost won’t work — on the device, it points at the device itself.

  1. Find your computer’s local IP (e.g., 192.168.1.50).
  2. Set the OLLAMA_HOST environment variable to 0.0.0.0:11434 before starting Ollama so it listens on all interfaces, not just loopback.
  3. Update your baseUrl to use the computer’s IP.

On the Android emulator you can skip all of this: the special alias 10.0.2.2 maps to the host machine’s localhost, so http://10.0.2.2:11434 reaches a default Ollama install. Also note that Android blocks cleartext HTTP by default, so you may need to allow it for debug builds (e.g., android:usesCleartextTraffic="true").
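
On macOS/Linux, step 2 looks like this (a minimal sketch; on Windows, set the variable via System Properties or setx instead):

```shell
# Make Ollama listen on all interfaces instead of loopback only
export OLLAMA_HOST=0.0.0.0:11434

# Then start the server from the same shell:
# ollama serve   (commented out here; run it manually)
echo "OLLAMA_HOST=$OLLAMA_HOST"
```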

Integrating with the UI

In your Flutter UI, use a TextEditingController to capture the user’s input and a StreamBuilder to display the response. Two gotchas: create the stream once when the user submits (not inside build, or every rebuild restarts the request), and accumulate the tokens, since each stream event is a single token rather than the full reply:

// In your State class — created once when the user submits the prompt:
String _reply = '';
late final Stream<String> _responseStream = _ollamaService
    .streamResponse('llama3', _promptController.text)
    .map((token) => _reply += token); // each event = the full reply so far

// In build():
StreamBuilder<String>(
  stream: _responseStream,
  builder: (context, snapshot) {
    if (snapshot.hasError) {
      return Text('Error: ${snapshot.error}');
    }
    if (snapshot.hasData) {
      return Text(snapshot.data!);
    }
    return const CircularProgressIndicator();
  },
)

Conclusion

By bridging Flutter and Ollama, you’ve unlocked a world of private, high-performance AI. Whether you’re building a desktop assistant or a mobile tool, local LLMs provide a powerful alternative to cloud APIs.