On-Premise AI Solutions with Ollama: Factory-Internal Predictive Maintenance

Amazeng Technical Team
11 min read
Ollama · Local AI · Predictive Maintenance · Llama · Edge AI · Data Privacy · Industrial AI

Introduction

In industrial facilities, data privacy and real-time decision making are critical requirements. While cloud-based AI services like AWS Bedrock or OpenAI offer powerful capabilities, they require sensitive production data to leave the facility. Ollama addresses this issue by enabling local Large Language Model (LLM) deployment.

In this article, we examine how to implement AI-powered predictive maintenance applications using Ollama with our ZMA Data Acquisition and GDT Digital Transmitter devices, without any cloud connection.

What is Ollama?

Ollama is an open-source application that enables local deployment of large language models such as Llama, Mistral, and Phi.

Key Features

Fully Local: All data stays within the facility
No Internet Required: Complete offline operation
Open Source Models: Llama 3, Mistral 7B, Phi-3
Easy API: Compatible with the OpenAI API (see the sketch below)
Low Cost: No per-token API fees
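
Because the API is OpenAI-compatible, existing OpenAI client code can simply point at a local Ollama server. A minimal sketch, assuming the openai Python package is installed and Ollama is listening on its default port 11434 (the api_key is required by the client but ignored by Ollama):

from openai import OpenAI

# Point the standard OpenAI client at the local Ollama server
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama3.2:3b",
    messages=[{"role": "user", "content": "Summarize: motor RMS vibration rose from 1.2 to 3.4 m/s2 in 48 hours."}],
)
print(response.choices[0].message.content)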

Cloud AI vs Ollama Comparison

Feature               AWS Bedrock                OpenAI GPT-4           Ollama (Local)
Data Privacy          Data leaves facility       Data leaves facility   Fully local
Internet Dependency   Required                   Required               Not required
Latency               200-500 ms                 300-800 ms             50-200 ms
Cost                  $0.00025-0.01/1K tokens    $0.01-0.06/1K tokens   Hardware only
Customization         Limited                    No                     Full control
Model Size            Large (100B+)              Very large (1.7T)      Small-Medium (7B-70B)
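
Latency varies widely with hardware and model size, so for a local deployment it is worth measuring directly. A quick sketch using the ollama Python client (the model name assumes the 3B model used later in this article):

import time
import ollama

# Time one short generation round trip on the local server
start = time.perf_counter()
ollama.generate(model="llama3.2:3b", prompt="Say OK.")
print(f"Local round trip: {(time.perf_counter() - start) * 1000:.0f} ms")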

System Architecture

┌──────────────────────────────────────────────────┐
│              Factory (Fully Offline)             │
│                                                  │
│  ┌─────────┐    Modbus TCP    ┌──────────────┐   │
│  │ ZMA-4   │◄────────────────►│              │   │
│  │ Devices │                  │  Linux PC    │   │
│  └─────────┘                  │  (AI Server) │   │
│                               │              │   │
│  ┌─────────┐    Modbus TCP    │ - Ollama     │   │
│  │ GDT     │◄────────────────►│ - Python     │   │
│  │ Trans.  │                  │ - PostgreSQL │   │
│  └─────────┘                  │              │   │
│                               │ Models:      │   │
│                               │ • Llama 3.2  │   │
│                               │ • Mistral 7B │   │
│                               │ • Phi-3      │   │
│                               └──────────────┘   │
│                                                  │
└──────────────────────────────────────────────────┘

Ollama Installation and Setup

1. Server Installation (Ubuntu 22.04)

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Start Ollama service
sudo systemctl start ollama
sudo systemctl enable ollama

# Download models
ollama pull llama3.2:3b      # 3B parameter model (~2GB)
ollama pull mistral:7b       # 7B parameter model (~4.1GB)
ollama pull phi3:medium      # 14B parameter model (~7.9GB)

# Verify installation
ollama list

2. Python Integration

pip install ollama requests pymodbus numpy scipy psycopg2-binary pandas
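
With the packages installed, a quick smoke test confirms the local server responds. A minimal sketch, assuming the llama3.2:3b model pulled above:

# Smoke test: one generation round trip against the local Ollama server
import ollama

resp = ollama.generate(model="llama3.2:3b", prompt="Reply with the single word OK.")
print(resp["response"])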

Application 1: Anomaly Detection from Loadcell Data

Scenario

The GDT Digital Transmitter reads weight from four loadcells on a tank. An imbalanced load (one loadcell reading much higher or lower than the others) may indicate:

  • Tank tilt
  • Loadcell failure
  • Foundation problem

AI-Powered Detection

anomaly_detector.py

#!/usr/bin/env python3

import json
import struct
import time

import ollama
# pymodbus 2.x import path; on pymodbus 3.x use: from pymodbus.client import ModbusTcpClient
from pymodbus.client.sync import ModbusTcpClient

class LoadcellAnomalyDetector:
    def __init__(self):
        self.gdt_client = ModbusTcpClient('192.168.1.100', port=502)
        self.gdt_client.connect()

        # Use lightweight model (3B)
        self.model = "llama3.2:3b"

    @staticmethod
    def _to_float(registers):
        """Combine two 16-bit Modbus registers into an IEEE-754 float.

        Assumes big-endian word order; adjust to match the device register map.
        """
        return struct.unpack('>f', struct.pack('>HH', *registers))[0]

    def read_loadcells(self):
        """Read 4 loadcell values from GDT"""
        result = self.gdt_client.read_holding_registers(10, 8, unit=1)

        if not result.isError():
            lc1 = self._to_float(result.registers[0:2])
            lc2 = self._to_float(result.registers[2:4])
            lc3 = self._to_float(result.registers[4:6])
            lc4 = self._to_float(result.registers[6:8])

            return {
                "lc1": round(lc1, 3),
                "lc2": round(lc2, 3),
                "lc3": round(lc3, 3),
                "lc4": round(lc4, 3),
                "total": round(lc1 + lc2 + lc3 + lc4, 3)
            }
        return None

    def analyze_with_ai(self, data):
        """Analyze loadcell values with Ollama"""
        prompt = f"""
You are an industrial automation expert. Analyze the following loadcell readings:

Loadcell 1 (Front-Left):  {data['lc1']} mV/V
Loadcell 2 (Front-Right): {data['lc2']} mV/V
Loadcell 3 (Rear-Left):   {data['lc3']} mV/V
Loadcell 4 (Rear-Right):  {data['lc4']} mV/V
Total Weight: {data['total']} mV/V

Normal state: All 4 loadcells should read similar values (±5% tolerance).

Task: Detect any anomalies and provide:
1. Is there an anomaly? (Yes/No)
2. If yes, which loadcell?
3. Probable cause
4. Recommended action

Respond in JSON with exactly these keys: anomaly (Yes/No), problem, probable_cause, recommended_action, severity.
"""

        response = ollama.generate(
            model=self.model,
            prompt=prompt,
            format='json'  # Force JSON output
        )

        result = json.loads(response['response'])
        return result

    def run(self):
        """Main monitoring loop"""
        print(f"🤖 AI Anomaly Detector started (Model: {self.model})")

        while True:
            data = self.read_loadcells()

            if data:
                print(f"\n📊 Loadcells: {data}")

                # AI analysis
                analysis = self.analyze_with_ai(data)

                if analysis.get('anomaly') == 'Yes':
                    print(f"⚠️  ANOMALY DETECTED!")
                    print(f"   Problem: {analysis.get('problem')}")
                    print(f"   Cause: {analysis.get('probable_cause')}")
                    print(f"   Action: {analysis.get('recommended_action')}")
                else:
                    print("✓ Normal operation")

            time.sleep(10)  # Check every 10 seconds

if __name__ == "__main__":
    detector = LoadcellAnomalyDetector()
    detector.run()

Example Output:

{
  "anomaly": "Yes",
  "problem": "Loadcell 3 reading is 15% lower than others",
  "probable_cause": "Possible loadcell drift or loose connection",
  "recommended_action": "Check Loadcell 3 wiring and recalibrate if necessary",
  "severity": "Medium"
}
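
One practical note: format='json' forces syntactically valid JSON, but it does not guarantee the model used the exact keys the monitoring loop expects. A small guard with retries, using the key set requested in the prompt above (the retry count is an illustrative choice):

# Re-ask the model until it returns the keys requested in the prompt
REQUIRED_KEYS = {"anomaly", "problem", "probable_cause", "recommended_action", "severity"}

def analyze_with_schema_check(detector, data, retries=2):
    for _ in range(retries + 1):
        result = detector.analyze_with_ai(data)
        if REQUIRED_KEYS <= result.keys():
            return result
    raise ValueError("Model did not return the expected JSON keys")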

Application 2: Predictive Maintenance from Vibration Data

Scenario

ZMA-4 collects vibration data from a motor via an accelerometer (4-20 mA output). The goal is to predict motor failure by analyzing this data with AI.

Feature Extraction + AI Analysis

vibration_monitor.py

import json
import struct
import time

import numpy as np
from scipy import signal
# pymodbus 2.x import path; on pymodbus 3.x use: from pymodbus.client import ModbusTcpClient
from pymodbus.client.sync import ModbusTcpClient
import ollama

class VibrationAnalyzer:
    def __init__(self):
        self.zma_client = ModbusTcpClient('192.168.1.50', port=502)
        self.zma_client.connect()
        self.model = "mistral:7b"

    @staticmethod
    def _to_float(registers):
        """Combine two 16-bit Modbus registers into an IEEE-754 float.

        Assumes big-endian word order; adjust to match the device register map.
        """
        return struct.unpack('>f', struct.pack('>HH', *registers))[0]

    def read_vibration_data(self, samples=1000):
        """Read vibration samples from ZMA.

        Note: polling registers over Modbus TCP cannot reliably reach
        1 kHz; the real rate depends on network and device latency.
        For true high-rate sampling, use device-side buffering.
        """
        data = []
        for _ in range(samples):
            result = self.zma_client.read_holding_registers(0, 2, unit=1)
            value = self._to_float(result.registers)
            data.append(value)
            time.sleep(0.001)  # target ~1 kHz

        return np.array(data)

    def extract_features(self, vibration_data):
        """Extract time- and frequency-domain features"""
        # One-sided (real) FFT avoids negative-frequency bins
        spectrum = np.abs(np.fft.rfft(vibration_data))
        freqs = np.fft.rfftfreq(len(vibration_data), d=0.001)  # assumes 1 ms sample period

        # Peak detection (height threshold is application-specific)
        peaks, _ = signal.find_peaks(spectrum, height=100)
        # Dominant frequency = the peak with the largest magnitude
        if len(peaks) > 0:
            dominant_freq = freqs[peaks[np.argmax(spectrum[peaks])]]
        else:
            dominant_freq = 0.0

        # Statistical features
        rms = np.sqrt(np.mean(vibration_data ** 2))
        peak = np.max(np.abs(vibration_data))
        features = {
            "rms": rms,
            "peak": peak,
            "crest_factor": peak / rms,
            "dominant_freq": dominant_freq,
            "spectral_energy": np.sum(spectrum ** 2)
        }

        return features

    def predict_with_ai(self, features):
        """Predictive maintenance analysis with AI"""
        prompt = f"""
You are a mechanical engineer expert in predictive maintenance. Analyze the following vibration data from an industrial motor:

RMS Vibration: {features['rms']:.3f} m/s²
Peak Vibration: {features['peak']:.3f} m/s²
Crest Factor: {features['crest_factor']:.2f}
Dominant Frequency: {features['dominant_freq']:.1f} Hz
Spectral Energy: {features['spectral_energy']:.2e}

Reference values:
- Normal RMS: 0.5-2.0 m/s²
- Warning RMS: 2.0-5.0 m/s²
- Critical RMS: >5.0 m/s²
- Normal Crest Factor: 3-4
- Motor rotation: 1500 RPM (25 Hz)

Task: Provide predictive maintenance assessment:
1. Motor condition (Normal/Warning/Critical)
2. Probable fault type (if any)
3. Estimated time to failure (days)
4. Recommended action

Respond in JSON with exactly these keys: condition, fault_type, estimated_ttf_days, confidence, recommended_action, evidence.
"""

        response = ollama.generate(
            model=self.model,
            prompt=prompt,
            format='json'
        )

        return json.loads(response['response'])

if __name__ == "__main__":
    analyzer = VibrationAnalyzer()

    # Read vibration data
    vibration = analyzer.read_vibration_data(1000)

    # Extract features
    features = analyzer.extract_features(vibration)
    print(f"📊 Features: {features}")

    # AI prediction
    prediction = analyzer.predict_with_ai(features)
    print(f"\n🔮 AI Prediction:")
    print(json.dumps(prediction, indent=2))

Example Output:

{
  "condition": "Warning",
  "fault_type": "Bearing wear (outer race)",
  "estimated_ttf_days": 15,
  "confidence": 0.78,
  "recommended_action": "Schedule bearing replacement within 2 weeks. Increase monitoring frequency to daily.",
  "evidence": "Elevated RMS (3.2 m/s²) and high-frequency components at 250 Hz indicate outer race fault."
}
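
The architecture diagram includes PostgreSQL for keeping a history of assessments. A minimal sketch of persisting each AI result, assuming a local database; the database name, credentials, and table schema here are illustrative:

import json
import psycopg2

# Connection parameters are placeholders for a local PostgreSQL instance
conn = psycopg2.connect(dbname="maintenance", user="edge", password="secret", host="localhost")

# e.g., the dict returned by predict_with_ai()
prediction = {"condition": "Warning", "estimated_ttf_days": 15}

with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS predictions (
            id     SERIAL PRIMARY KEY,
            ts     TIMESTAMPTZ DEFAULT now(),
            source TEXT,
            result JSONB
        )
    """)
    cur.execute(
        "INSERT INTO predictions (source, result) VALUES (%s, %s)",
        ("motor-01", json.dumps(prediction)),
    )

Storing the raw JSON in a JSONB column keeps the schema flexible as prompts evolve.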

Raspberry Pi 5 + Ollama: Edge AI Solution

Hardware Configuration

Components:

  • Raspberry Pi 5 (8GB RAM)
  • NVMe SSD (256GB) via PCIe
  • PoE+ HAT (for Ethernet power)
  • Industrial case with fan

Performance:

Model            Inference Speed    RAM Usage
Llama 3.2 (3B)   ~15 tokens/sec     2.5 GB
Phi-3 (3.8B)     ~12 tokens/sec     3.2 GB
Mistral 7B       ~6 tokens/sec      5.8 GB

Setup Script

#!/bin/bash
# install_edge_ai.sh

# Install Ollama on Raspberry Pi 5
curl -fsSL https://ollama.com/install.sh | sh

# Download lightweight models
ollama pull llama3.2:3b
ollama pull phi3:mini

# Install Python dependencies
pip3 install ollama pymodbus numpy pandas

# Configure systemd service (run this script as root)
cat > /etc/systemd/system/edge-ai.service <<EOF
[Unit]
Description=Edge AI Monitoring
After=network.target ollama.service

[Service]
Type=simple
User=pi
WorkingDirectory=/opt/edge-ai
ExecStart=/usr/bin/python3 /opt/edge-ai/monitor.py
Restart=always

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable edge-ai
systemctl start edge-ai
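
The service expects /opt/edge-ai/monitor.py, where the article's detector classes slot in. A minimal skeleton (the poll interval and model choice are illustrative):

#!/usr/bin/env python3
# /opt/edge-ai/monitor.py - skeleton for the systemd service above
import time
import ollama

MODEL = "llama3.2:3b"  # lightweight model suited to the Pi 5

def main():
    while True:
        # Replace this health check with real Modbus reads and
        # analyze_with_ai() as shown in Application 1
        resp = ollama.generate(model=MODEL, prompt="Health check: reply OK.")
        print(resp["response"])
        time.sleep(60)

if __name__ == "__main__":
    main()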

Cost Comparison: Ollama vs Cloud AI

Scenario: 10 devices, continuous monitoring

Monthly Costs:

Solution         Hardware           Compute            Total
AWS Bedrock      -                  $450/month         $450/month
OpenAI API       -                  $820/month         $820/month
Ollama (Local)   $800 (one-time)    $5 (electricity)   $5/month

ROI: Compared with AWS Bedrock, the $800 hardware pays for itself in under two months ($800 ÷ $445 saved per month ≈ 1.8 months).

Conclusion

Ollama enables powerful AI applications in industrial facilities while maintaining complete data privacy. It is a particularly good fit for:

Factories with confidential production data
Facilities without reliable internet
Low-latency decision making (no cloud round trip)
Cost-sensitive projects

Our ZMA and GDT products can be combined with Ollama to build predictive maintenance systems entirely locally.