On-Premise AI Solutions with Ollama: Factory-Internal Predictive Maintenance

Amazeng Technical Team
11 min read
Ollama · Local AI · Predictive Maintenance · Llama · Edge AI · Data Privacy · Industrial AI

Introduction

In industrial facilities, data privacy and real-time decision making are critical requirements. While cloud-based AI services like AWS Bedrock or OpenAI offer powerful capabilities, they require sensitive production data to leave the facility. Ollama addresses this issue by enabling local Large Language Model (LLM) deployment.

In this article, we examine how to implement AI-powered predictive maintenance applications using Ollama with our ZMA Data Acquisition and GDT Digital Transmitter devices, without any cloud connection.

What is Ollama?

Ollama is an open-source application that enables local deployment of large language models such as Llama, Mistral, and Phi.

Key Features

Fully Local: All data stays within the facility
No Internet Required: Complete offline operation
Open Source Models: Llama 3, Mistral 7B, Phi-3
Easy API: Compatible with the OpenAI API (see the sketch below)
Low Cost: No per-token API fees
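
Because the API is OpenAI-compatible, existing OpenAI client code can simply point at a local Ollama server. A minimal sketch, assuming the openai Python package is installed and Ollama is listening on its default port 11434 (the api_key is required by the client but ignored by Ollama):

from openai import OpenAI

# Point the standard OpenAI client at the local Ollama server
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama3.2:3b",
    messages=[{"role": "user", "content": "Summarize: motor RMS vibration rose from 1.2 to 3.4 m/s2 in 48 hours."}],
)
print(response.choices[0].message.content)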

Cloud AI vs Ollama Comparison

Feature               AWS Bedrock                OpenAI GPT-4           Ollama (Local)
Data Privacy          Data leaves facility       Data leaves facility   Fully local
Internet Dependency   Required                   Required               Not required
Latency               200-500 ms                 300-800 ms             50-200 ms
Cost                  $0.00025-0.01/1K tokens    $0.01-0.06/1K tokens   Hardware only
Customization         Limited                    No                     Full control
Model Size            Large (100B+)              Very large (1.7T)      Small-Medium (7B-70B)
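
Latency varies widely with hardware and model size, so for a local deployment it is worth measuring directly. A quick sketch using the ollama Python client (the model name assumes the 3B model used later in this article):

import time
import ollama

# Time one short generation round trip on the local server
start = time.perf_counter()
ollama.generate(model="llama3.2:3b", prompt="Say OK.")
print(f"Local round trip: {(time.perf_counter() - start) * 1000:.0f} ms")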

System Architecture

┌──────────────────────────────────────────────────┐
│              Factory (Fully Offline)             │
│                                                  │
│  ┌─────────┐    Modbus TCP    ┌──────────────┐   │
│  │ ZMA-4   │◄────────────────►│              │   │
│  │ Devices │                  │  Linux PC    │   │
│  └─────────┘                  │  (AI Server) │   │
│                               │              │   │
│  ┌─────────┐    Modbus TCP    │ - Ollama     │   │
│  │ GDT     │◄────────────────►│ - Python     │   │
│  │ Trans.  │                  │ - PostgreSQL │   │
│  └─────────┘                  │              │   │
│                               │ Models:      │   │
│                               │ • Llama 3.2  │   │
│                               │ • Mistral 7B │   │
│                               │ • Phi-3      │   │
│                               └──────────────┘   │
│                                                  │
└──────────────────────────────────────────────────┘

Ollama Installation and Setup

1. Server Installation (Ubuntu 22.04)

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Start Ollama service
sudo systemctl start ollama
sudo systemctl enable ollama

# Download models
ollama pull llama3.2:3b      # 3B parameter model (~2GB)
ollama pull mistral:7b       # 7B parameter model (~4.1GB)
ollama pull phi3:medium      # 14B parameter model (~7.9GB)

# Verify installation
ollama list

2. Python Integration

pip install ollama requests pymodbus numpy scipy psycopg2-binary pandas
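
With the packages installed, a quick smoke test confirms the local server responds. A minimal sketch, assuming the llama3.2:3b model pulled above:

# Smoke test: one generation round trip against the local Ollama server
import ollama

resp = ollama.generate(model="llama3.2:3b", prompt="Reply with the single word OK.")
print(resp["response"])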

Application 1: Anomaly Detection from Loadcell Data

Scenario

The GDT Digital Transmitter reads weight from four loadcells on a tank. An imbalanced load (one loadcell reading much higher or lower than the others) may indicate:

  • Tank tilt
  • Loadcell failure
  • Foundation problem

AI-Powered Detection

anomaly_detector.py

#!/usr/bin/env python3

import json
import struct
import time

import ollama
# pymodbus 2.x import path; on pymodbus 3.x use: from pymodbus.client import ModbusTcpClient
from pymodbus.client.sync import ModbusTcpClient

class LoadcellAnomalyDetector:
    def __init__(self):
        self.gdt_client = ModbusTcpClient('192.168.1.100', port=502)
        self.gdt_client.connect()

        # Use lightweight model (3B)
        self.model = "llama3.2:3b"

    @staticmethod
    def _to_float(registers):
        """Combine two 16-bit Modbus registers into an IEEE-754 float.

        Assumes big-endian word order; adjust to match the device register map.
        """
        return struct.unpack('>f', struct.pack('>HH', *registers))[0]

    def read_loadcells(self):
        """Read 4 loadcell values from GDT"""
        result = self.gdt_client.read_holding_registers(10, 8, unit=1)

        if not result.isError():
            lc1 = self._to_float(result.registers[0:2])
            lc2 = self._to_float(result.registers[2:4])
            lc3 = self._to_float(result.registers[4:6])
            lc4 = self._to_float(result.registers[6:8])

            return {
                "lc1": round(lc1, 3),
                "lc2": round(lc2, 3),
                "lc3": round(lc3, 3),
                "lc4": round(lc4, 3),
                "total": round(lc1 + lc2 + lc3 + lc4, 3)
            }
        return None

    def analyze_with_ai(self, data):
        """Analyze loadcell values with Ollama"""
        prompt = f"""
You are an industrial automation expert. Analyze the following loadcell readings:

Loadcell 1 (Front-Left):  {data['lc1']} mV/V
Loadcell 2 (Front-Right): {data['lc2']} mV/V
Loadcell 3 (Rear-Left):   {data['lc3']} mV/V
Loadcell 4 (Rear-Right):  {data['lc4']} mV/V
Total Weight: {data['total']} mV/V

Normal state: All 4 loadcells should read similar values (±5% tolerance).

Task: Detect any anomalies and provide:
1. Is there an anomaly? (Yes/No)
2. If yes, which loadcell?
3. Probable cause
4. Recommended action

Respond in JSON with exactly these keys: anomaly (Yes/No), problem, probable_cause, recommended_action, severity.
"""

        response = ollama.generate(
            model=self.model,
            prompt=prompt,
            format='json'  # Force JSON output
        )

        result = json.loads(response['response'])
        return result

    def run(self):
        """Main monitoring loop"""
        print(f"🤖 AI Anomaly Detector started (Model: {self.model})")

        while True:
            data = self.read_loadcells()

            if data:
                print(f"\n📊 Loadcells: {data}")

                # AI analysis
                analysis = self.analyze_with_ai(data)

                if analysis.get('anomaly') == 'Yes':
                    print(f"⚠️  ANOMALY DETECTED!")
                    print(f"   Problem: {analysis.get('problem')}")
                    print(f"   Cause: {analysis.get('probable_cause')}")
                    print(f"   Action: {analysis.get('recommended_action')}")
                else:
                    print("✓ Normal operation")

            time.sleep(10)  # Check every 10 seconds

if __name__ == "__main__":
    detector = LoadcellAnomalyDetector()
    detector.run()

Example Output:

{
  "anomaly": "Yes",
  "problem": "Loadcell 3 reading is 15% lower than others",
  "probable_cause": "Possible loadcell drift or loose connection",
  "recommended_action": "Check Loadcell 3 wiring and recalibrate if necessary",
  "severity": "Medium"
}
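
One practical note: format='json' forces syntactically valid JSON, but it does not guarantee the model used the exact keys the monitoring loop expects. A small guard with retries, using the key set requested in the prompt above (the retry count is an illustrative choice):

# Re-ask the model until it returns the keys requested in the prompt
REQUIRED_KEYS = {"anomaly", "problem", "probable_cause", "recommended_action", "severity"}

def analyze_with_schema_check(detector, data, retries=2):
    for _ in range(retries + 1):
        result = detector.analyze_with_ai(data)
        if REQUIRED_KEYS <= result.keys():
            return result
    raise ValueError("Model did not return the expected JSON keys")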

Application 2: Predictive Maintenance from Vibration Data

Scenario

ZMA-4 collects vibration data from a motor via an accelerometer (4-20 mA output). The goal is to predict motor failure by analyzing this data with AI.

Feature Extraction + AI Analysis

vibration_monitor.py

import json
import struct
import time

import numpy as np
from scipy import signal
# pymodbus 2.x import path; on pymodbus 3.x use: from pymodbus.client import ModbusTcpClient
from pymodbus.client.sync import ModbusTcpClient
import ollama

class VibrationAnalyzer:
    def __init__(self):
        self.zma_client = ModbusTcpClient('192.168.1.50', port=502)
        self.zma_client.connect()
        self.model = "mistral:7b"

    @staticmethod
    def _to_float(registers):
        """Combine two 16-bit Modbus registers into an IEEE-754 float.

        Assumes big-endian word order; adjust to match the device register map.
        """
        return struct.unpack('>f', struct.pack('>HH', *registers))[0]

    def read_vibration_data(self, samples=1000):
        """Read vibration samples from ZMA.

        Note: polling registers over Modbus TCP cannot reliably reach
        1 kHz; the real rate depends on network and device latency.
        For true high-rate sampling, use device-side buffering.
        """
        data = []
        for _ in range(samples):
            result = self.zma_client.read_holding_registers(0, 2, unit=1)
            value = self._to_float(result.registers)
            data.append(value)
            time.sleep(0.001)  # target ~1 kHz

        return np.array(data)

    def extract_features(self, vibration_data):
        """Extract time- and frequency-domain features"""
        # One-sided (real) FFT avoids negative-frequency bins
        spectrum = np.abs(np.fft.rfft(vibration_data))
        freqs = np.fft.rfftfreq(len(vibration_data), d=0.001)  # assumes 1 ms sample period

        # Peak detection (height threshold is application-specific)
        peaks, _ = signal.find_peaks(spectrum, height=100)
        # Dominant frequency = the peak with the largest magnitude
        if len(peaks) > 0:
            dominant_freq = freqs[peaks[np.argmax(spectrum[peaks])]]
        else:
            dominant_freq = 0.0

        # Statistical features
        rms = np.sqrt(np.mean(vibration_data ** 2))
        peak = np.max(np.abs(vibration_data))
        features = {
            "rms": rms,
            "peak": peak,
            "crest_factor": peak / rms,
            "dominant_freq": dominant_freq,
            "spectral_energy": np.sum(spectrum ** 2)
        }

        return features

    def predict_with_ai(self, features):
        """Predictive maintenance analysis with AI"""
        prompt = f"""
You are a mechanical engineer expert in predictive maintenance. Analyze the following vibration data from an industrial motor:

RMS Vibration: {features['rms']:.3f} m/s²
Peak Vibration: {features['peak']:.3f} m/s²
Crest Factor: {features['crest_factor']:.2f}
Dominant Frequency: {features['dominant_freq']:.1f} Hz
Spectral Energy: {features['spectral_energy']:.2e}

Reference values:
- Normal RMS: 0.5-2.0 m/s²
- Warning RMS: 2.0-5.0 m/s²
- Critical RMS: >5.0 m/s²
- Normal Crest Factor: 3-4
- Motor rotation: 1500 RPM (25 Hz)

Task: Provide predictive maintenance assessment:
1. Motor condition (Normal/Warning/Critical)
2. Probable fault type (if any)
3. Estimated time to failure (days)
4. Recommended action

Respond in JSON with exactly these keys: condition, fault_type, estimated_ttf_days, confidence, recommended_action, evidence.
"""

        response = ollama.generate(
            model=self.model,
            prompt=prompt,
            format='json'
        )

        return json.loads(response['response'])

if __name__ == "__main__":
    analyzer = VibrationAnalyzer()

    # Read vibration data
    vibration = analyzer.read_vibration_data(1000)

    # Extract features
    features = analyzer.extract_features(vibration)
    print(f"📊 Features: {features}")

    # AI prediction
    prediction = analyzer.predict_with_ai(features)
    print(f"\n🔮 AI Prediction:")
    print(json.dumps(prediction, indent=2))

Example Output:

{
  "condition": "Warning",
  "fault_type": "Bearing wear (outer race)",
  "estimated_ttf_days": 15,
  "confidence": 0.78,
  "recommended_action": "Schedule bearing replacement within 2 weeks. Increase monitoring frequency to daily.",
  "evidence": "Elevated RMS (3.2 m/s²) and high-frequency components at 250 Hz indicate outer race fault."
}
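
The architecture diagram includes PostgreSQL for keeping a history of assessments. A minimal sketch of persisting each AI result, assuming a local database; the database name, credentials, and table schema here are illustrative:

import json
import psycopg2

# Connection parameters are placeholders for a local PostgreSQL instance
conn = psycopg2.connect(dbname="maintenance", user="edge", password="secret", host="localhost")

# e.g., the dict returned by predict_with_ai()
prediction = {"condition": "Warning", "estimated_ttf_days": 15}

with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS predictions (
            id     SERIAL PRIMARY KEY,
            ts     TIMESTAMPTZ DEFAULT now(),
            source TEXT,
            result JSONB
        )
    """)
    cur.execute(
        "INSERT INTO predictions (source, result) VALUES (%s, %s)",
        ("motor-01", json.dumps(prediction)),
    )

Storing the raw JSON in a JSONB column keeps the schema flexible as prompts evolve.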

Raspberry Pi 5 + Ollama: Edge AI Solution

Hardware Configuration

Components:

  • Raspberry Pi 5 (8GB RAM)
  • NVMe SSD (256GB) via PCIe
  • PoE+ HAT (for Ethernet power)
  • Industrial case with fan

Performance:

Model            Inference Speed    RAM Usage
Llama 3.2 (3B)   ~15 tokens/sec     2.5 GB
Phi-3 (3.8B)     ~12 tokens/sec     3.2 GB
Mistral 7B       ~6 tokens/sec      5.8 GB

Setup Script

#!/bin/bash
# install_edge_ai.sh

# Install Ollama on Raspberry Pi 5
curl -fsSL https://ollama.com/install.sh | sh

# Download lightweight models
ollama pull llama3.2:3b
ollama pull phi3:mini

# Install Python dependencies
pip3 install ollama pymodbus numpy pandas

# Configure systemd service (run this script as root)
cat > /etc/systemd/system/edge-ai.service <<EOF
[Unit]
Description=Edge AI Monitoring
After=network.target ollama.service

[Service]
Type=simple
User=pi
WorkingDirectory=/opt/edge-ai
ExecStart=/usr/bin/python3 /opt/edge-ai/monitor.py
Restart=always

[Install]
WantedBy=multi-user.target
EOF

systemctl daemon-reload
systemctl enable edge-ai
systemctl start edge-ai
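
The service expects /opt/edge-ai/monitor.py, where the article's detector classes slot in. A minimal skeleton (the poll interval and model choice are illustrative):

#!/usr/bin/env python3
# /opt/edge-ai/monitor.py - skeleton for the systemd service above
import time
import ollama

MODEL = "llama3.2:3b"  # lightweight model suited to the Pi 5

def main():
    while True:
        # Replace this health check with real Modbus reads and
        # analyze_with_ai() as shown in Application 1
        resp = ollama.generate(model=MODEL, prompt="Health check: reply OK.")
        print(resp["response"])
        time.sleep(60)

if __name__ == "__main__":
    main()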

Cost Comparison: Ollama vs Cloud AI

Scenario: 10 devices, continuous monitoring

Monthly Costs:

Solution         Hardware           Compute            Total
AWS Bedrock      -                  $450/month         $450/month
OpenAI API       -                  $820/month         $820/month
Ollama (Local)   $800 (one-time)    $5 (electricity)   $5/month

ROI: Compared with AWS Bedrock, the $800 hardware pays for itself in under two months ($800 ÷ $445 saved per month ≈ 1.8 months).

Conclusion

Ollama enables powerful AI applications in industrial facilities while maintaining complete data privacy. It is a particularly good fit for:

Factories with confidential production data
Facilities without reliable internet
Low-latency decision making (no cloud round trip)
Cost-sensitive projects

Our ZMA and GDT products can be combined with Ollama to build predictive maintenance systems entirely locally.