Real-Time Decisioning Demo | Manufacturing Predictive Maintenance NBA
Stage 1: The Architecture
Unified operational context for predictive maintenance next-best-action
Five tiers form a continuous loop: ingest, context, decide, act, learn. SCADA telemetry, historian data, MES schedules, ERP inventory, CMMS work orders, and quality events feed Redis through RDI and Redis Feature Form. Redis RAM handles the hot operational path. Redis Flex holds the broader asset history, embeddings, maintenance graph, and long-tail context. The maintenance workbench renders the next best action before a minor anomaly becomes an unplanned outage.
Data Sources
SCADA / PLC: Live sensor telemetry, alarms, machine state
Historian: Vibration, temperature, pressure, cycle history
MES: Production schedule, line priority, WIP state
ERP / EAM: Spare parts, suppliers, maintenance cost
CMMS + Kafka: Work orders, technician status, quality events

Ingest Layer
Redis Data Integration (RDI): Synchronizes ERP, CMMS, MES, and master asset records into the operational context layer
Redis Feature Form: Serves online features from telemetry, maintenance history, and offline reliability models with train-serve parity

Unified Context Layer
Redis RAM: Hot asset state, live alarms, technician and line availability
Redis Flex: Warm maintenance history, embeddings, failure signatures, and asset graph
Feature Store: RUL, anomaly, downtime, quality, and risk features
Redis Context Retriever: Assembles the Asset 360 (equipment state, maintenance history, and failure signals) and exposes it as structured MCP tools for the decision engine

Decision Engine
Eligibility Rules: Safety, production, labor, and SLA constraints
NBA Ranker: Uptime, cost, quality, and service weighting
Vector Search: Match current signature to historical failure patterns
Policy Arbitration: Balances throughput, risk, compliance, and maintenance urgency

Output Surfaces
Reliability Workbench: Next best action, failure context, dispatch guidance
Maintenance App: Technician assignment, checklist, parts reservation
Plant Ops Console: Line slowdown, reroute, batch completion guidance
Supplier / ERP: Parts order, SLA escalation, procurement trigger

Learn: Maintenance outcomes, repairs, false positives, downtime, and quality impact flow back through Redis Streams or Kafka into offline models, which are then redeployed into Redis.
Decision Target: <18 ms
North Star: Uptime + quality + cost
Decision Surface: Maintenance + plant ops
Stage 2: Asset Signal
A packaging line starts drifting toward failure
At 2:14 PM, filler motor M-204 on Line 7 shows rising vibration, increasing amperage draw, and a subtle temperature drift. The line is currently running a high-priority beverage batch scheduled for same-day shipment. The plant has a narrow window to decide whether to keep running, slow down, dispatch a technician, or reserve parts before the anomaly turns into downtime and scrap.
Live Asset Event
Line 7 · Filler Motor M-204
Packaging plant | Critical line asset | 11-year-old motor | Running high-priority SKU
AT-RISK SIGNAL
Event: anomaly_cluster_detected
Time: 2:14 PM local shift
Current line state: Running at 96% design speed
Vibration delta: +27% over trailing baseline
Temperature delta: +9.2°F over expected envelope
Amperage trend: Rising for 41 minutes
Production priority: Same-day retail replenishment batch
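To make the "delta over trailing baseline" signal concrete, here is a minimal sketch of how such a percentage could be computed. The function name, the window contents, and the mm/s readings are hypothetical and chosen only for illustration; the demo does not state how its baseline is derived.

```python
from statistics import mean


def percent_delta(current: float, baseline_window: list[float]) -> float:
    """Percent change of the current reading against the mean of a trailing window."""
    if not baseline_window:
        raise ValueError("baseline window must not be empty")
    baseline = mean(baseline_window)
    if baseline == 0:
        raise ValueError("baseline mean is zero; percent delta is undefined")
    return (current - baseline) / baseline * 100.0


# Hypothetical vibration readings in mm/s (not taken from the demo).
trailing_window = [4.0, 4.1, 3.9, 4.0]
current_reading = 5.08

delta = percent_delta(current_reading, trailing_window)
print(f"Vibration delta: {delta:+.0f}% over trailing baseline")
```

A real deployment would likely use a longer window and a robust statistic such as a median, but the shape of the calculation is the same.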
Why This Moment Matters
The plant does not need a dashboard alert. It needs a decision. If the team waits for the alarm threshold, the likely outcome is unplanned downtime, scrap, missed shipment windows, and emergency maintenance.
The challenge is that the right action depends on more than telemetry. The platform has to weigh technician availability, spare parts, line criticality, batch completion time, and the historical failure signature of this exact asset family.
The opportunity: with the right context, the plant can intervene earlier and choose the lowest-cost, lowest-risk action that preserves throughput.
Without Redis: telemetry, maintenance history, parts, and production priorities sit in different systems, so the plant either overreacts or reacts too late.
Stage 3: Ingest
Telemetry, maintenance, and operations data flow into Redis
RDI synchronizes ERP, CMMS, MES, and asset master data from the operational systems. Redis Feature Form pulls online features from streaming telemetry, historian records, and offline reliability models. Redis becomes the live operational context layer for the maintenance moment, not a replacement for the systems of record.
Redis Data Integration (RDI) · Redis Feature Form
Source Systems → Redis
SCADA / PLC: Live vibration, motor current, temperature, speed, alarms, machine state
Historian: Sensor history, control loop behavior, trending and baseline envelopes
MES: Work order priority, batch status, SKU criticality, line schedule, changeover windows
ERP / EAM: Spare parts, supplier lead times, maintenance cost, asset BOM, procurement rules
CMMS: Maintenance history, technician skills, work orders, MTBF, unresolved issues
Kafka or Redis Streams + Quality Events: Scrap spikes, operator notes, upstream/downstream constraints, line interruptions
Pipeline Status
ERP + CMMS sync: Sub-second to seconds
Telemetry ingestion: Streaming
Feature parity: 100%
Cold-start fallback: <3%
Decision dependencies: Served from Redis context layer
Custom integration code: Minimized through RDI
Additive architecture: MES, ERP, CMMS, SCADA, and the historian stay in place. Redis is the serving layer that makes them act together inside the maintenance decision window.
Stage 4: Context
The asset 360 assembles in real time
Redis assembles durable asset context and live operational context in the same response path. The right maintenance action depends on both: what this motor and line have done historically, and what the plant can tolerate right now.
Redis RAM · Redis Flex · Redis Context Retriever
Historical Asset Context
Asset family: M-200 fill head motor series
Last bearing replacement: 14 months ago
Work orders in past year: 3 corrective, 2 preventive
Historical match: 89% similarity to pre-bearing-failure signature
Remaining useful life: 18 to 36 operating hours
Quality impact correlation: High when vibration exceeds current threshold
Live Operational Context
Current batch completion: 82% complete
Technician availability: 1 skilled tech free in 17 minutes
Spare bearing in stock: Yes, on-site crib A-12
Alternative line capacity: Available after changeover in 46 minutes
Current OEE impact: Projected -3.8 pts if slowdown starts now
Downtime risk if no action: High within this shift
Context signal: Redis Context Retriever assembles the Asset 360 — equipment state, maintenance history, and failure signals — so the decision engine has exactly what it needs. The best next action is not simply “dispatch maintenance.” It is the action that minimizes total business harm across uptime, product quality, labor, and fulfillment commitments.
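As a rough illustration of combining durable and live context in one response path, the sketch below merges two lookups into a single Asset 360 object. Plain dictionaries stand in for the hot (Redis RAM) and warm (Redis Flex) tiers; the class, function, and field names are assumptions made for this example and are not the Redis Context Retriever API.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Asset360:
    """Combined view of one asset: durable history plus live operating state."""
    asset_id: str
    historical: dict[str, Any] = field(default_factory=dict)
    live: dict[str, Any] = field(default_factory=dict)

    def missing(self, required: list[str]) -> list[str]:
        """Return the required fields that neither tier supplied."""
        merged = {**self.historical, **self.live}
        return [key for key in required if key not in merged]


def assemble_asset_360(asset_id: str,
                       hot_store: dict[str, dict[str, Any]],
                       warm_store: dict[str, dict[str, Any]]) -> Asset360:
    """Read both tiers for one asset and return a single context object."""
    return Asset360(
        asset_id=asset_id,
        historical=dict(warm_store.get(asset_id, {})),
        live=dict(hot_store.get(asset_id, {})),
    )


# In-memory stand-ins for the two tiers; values mirror the panels above.
warm_store = {"M-204": {"signature_similarity": 0.89, "rul_hours": (18, 36)}}
hot_store = {"M-204": {"batch_completion": 0.82, "technician_eta_min": 17,
                       "spare_bearing_in_stock": True}}

context = assemble_asset_360("M-204", hot_store, warm_store)
print(context.missing(["rul_hours", "technician_eta_min", "alt_line_capacity"]))
```

Checking for missing fields before ranking is one way to decide when a cold-start fallback is needed.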
Stage 5: Feature Serving
Predictive maintenance features hydrate in milliseconds
Redis Feature Form serves online features from Redis RAM and Redis Flex. Reliability, quality, and operations features all arrive with the same definitions used to train the models offline. No train-serve skew. No fan-out at decision time.
Redis Feature Form · Redis Feature Store
anomaly_cluster_score
Aggregated anomaly score over multi-sensor window for this asset
0.93 · 0.3 ms
failure_signature_embedding
Vector match between current telemetry pattern and historical failure modes
0.89 similarity · 0.5 ms
rul_hours_prediction
Estimated remaining useful life under current operating load
18-36 hrs · 0.4 ms
line_criticality_score
Business importance of the current run given production schedule and service commitments
0.88 · 0.2 ms
quality_loss_risk
Probability of scrap or defect increase if the asset keeps running at current speed
0.64 · 0.3 ms
repair_readiness_score
Availability of labor, parts, tools, and allowable maintenance window
0.81 · 0.4 ms
Feature Serving Performance
Features Hydrated: 198
P99 Lookup: 2.4 ms
Train / Serve Parity: 100%
Hot-path storage: RAM + Flex
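For readers unfamiliar with the P99 figure, the sketch below shows one common way to compute a percentile from latency samples, the nearest-rank method. The sample latencies are hypothetical and were chosen so the arithmetic is easy to follow; they are not measurements from the demo.

```python
import math


def percentile_nearest_rank(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical lookup latencies in milliseconds: 98 fast reads and 2 slow ones.
latencies_ms = [0.3] * 98 + [2.4, 6.0]

p50 = percentile_nearest_rank(latencies_ms, 50)
p99 = percentile_nearest_rank(latencies_ms, 99)
print(f"P50 = {p50} ms, P99 = {p99} ms")
```

The example shows why tail percentiles matter for a decision budget: the median is dominated by fast reads, while P99 exposes the slow ones.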
Stage 6: Ranking
Feasible actions are scored and arbitrated, and the top three are surfaced
Vector search matches the current signature against historical failures. Eligibility rules account for safety, labor, and production constraints. The ranker balances uptime, quality, maintenance cost, and fulfillment risk to recommend the best next action for the plant right now.
Redis Search · NBA Ranker · Policy Arbitration
4 feasible actions evaluated
Safety and labor constraints applied
Top 3 surfaced to reliability workbench
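The signature-matching step can be pictured as a nearest-neighbor lookup over embeddings. The sketch below is a brute-force stand-in using cosine similarity; a production system would use an indexed vector search instead. The three-dimensional vectors and the failure-mode names are hypothetical, and cosine similarity is an assumed metric, since the demo does not say which one it uses.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def best_match(query: list[float],
               library: dict[str, list[float]]) -> tuple[str, float]:
    """Return the (name, similarity) of the library entry closest to the query."""
    scored = ((name, cosine_similarity(query, vec)) for name, vec in library.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical failure-signature embeddings (real ones would be much wider).
failure_library = {
    "bearing_wear": [0.9, 0.8, 0.3],
    "misalignment": [0.2, 0.9, 0.1],
    "electrical_fault": [0.1, 0.2, 0.9],
}
current_signature = [0.8, 0.7, 0.4]

name, score = best_match(current_signature, failure_library)
print(f"Closest historical pattern: {name} (similarity {score:.2f})")
```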
#1 Winner · NEXT BEST ACTION
Slow line to 82% and dispatch technician in 17 minutes
Finish the current high-priority batch with reduced stress on the asset, reserve the on-site bearing, and route the technician to intervene inside the current shift before hard failure.
NBA score: 0.95
#2 Conservative · SCHEDULE PROTECT
Complete batch at full speed, inspect during next changeover
Protects immediate output but accepts elevated scrap and downtime probability if the anomaly accelerates before the scheduled window.
NBA score: 0.78
#3 Aggressive · STOP NOW
Immediate shutdown and corrective maintenance
Best for safety or severe quality scenarios, but not optimal for the current asset state because the plant still has a lower-cost intervention path available.
NBA score: 0.69
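The flow described in this stage (filter by eligibility, score, surface the top three) can be sketched as a weighted sum. Everything numeric here is an assumption: the weights, the per-objective scores, and the ineligible "hot_swap_while_running" candidate are hypothetical, and the output is not meant to reproduce the NBA scores shown above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    eligible: bool            # result of safety, labor, and production rules
    scores: dict[str, float]  # per-objective score in [0, 1], higher is better


# Hypothetical objective weights; they sum to 1.0.
WEIGHTS = {"uptime": 0.4, "quality": 0.3, "cost": 0.2, "fulfillment": 0.1}


def rank_actions(actions: list[Action],
                 weights: dict[str, float],
                 top_n: int = 3) -> list[tuple[str, float]]:
    """Drop ineligible actions, score the rest, and return the top_n by score."""
    feasible = [a for a in actions if a.eligible]
    scored = [
        (a.name, sum(weights[k] * a.scores[k] for k in weights))
        for a in feasible
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


candidates = [
    Action("slow_and_dispatch", True,
           {"uptime": 0.9, "quality": 0.9, "cost": 0.8, "fulfillment": 0.9}),
    Action("finish_then_inspect", True,
           {"uptime": 0.6, "quality": 0.5, "cost": 0.7, "fulfillment": 1.0}),
    Action("stop_now", True,
           {"uptime": 0.4, "quality": 0.9, "cost": 0.5, "fulfillment": 0.2}),
    Action("reroute_to_alt_line", True,
           {"uptime": 0.5, "quality": 0.7, "cost": 0.3, "fulfillment": 0.4}),
    Action("hot_swap_while_running", False,  # fails a safety rule
           {"uptime": 1.0, "quality": 1.0, "cost": 1.0, "fulfillment": 1.0}),
]

top_three = rank_actions(candidates, WEIGHTS)
for name, score in top_three:
    print(f"{name}: {score:.2f}")
```

Applying eligibility before scoring means an unsafe action can never win, however well it scores on the other objectives.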
Stage 7: Business Impact
The value is avoided downtime, protected quality, and better maintenance timing
Predictive maintenance only matters if it changes the operating decision. The real value comes from making the right intervention at the right moment, not simply raising more alerts. Better maintenance decisioning protects uptime, reduces scrap, and lowers emergency repair cost.
Decision Economics
Projected downtime avoided: 2.5 to 4.0 hours
Scrap / rework exposure avoided: $18K to $35K per event
Emergency maintenance premium: Reduced by pre-positioning labor and parts
OTIF shipment risk: Protected for same-day retail replenishment
Maintenance team productivity: Higher because dispatches are ranked, not reactive
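The ranges above can be combined into a rough per-event value once a plant supplies its own cost of downtime. The sketch below does that arithmetic; the $10,000 per hour downtime cost is a placeholder assumption, not a figure from this demo, and should be replaced with the plant's actual number.

```python
def avoided_cost_range(downtime_hours: tuple[float, float],
                       scrap_usd: tuple[float, float],
                       downtime_cost_per_hour_usd: float) -> tuple[float, float]:
    """Low and high estimates of avoided cost per event, in USD."""
    low = downtime_hours[0] * downtime_cost_per_hour_usd + scrap_usd[0]
    high = downtime_hours[1] * downtime_cost_per_hour_usd + scrap_usd[1]
    return low, high


# Downtime and scrap ranges come from the panel above.
# The hourly downtime cost is a hypothetical placeholder.
low, high = avoided_cost_range(
    downtime_hours=(2.5, 4.0),
    scrap_usd=(18_000, 35_000),
    downtime_cost_per_hour_usd=10_000,
)
print(f"Avoided cost per event: ${low:,.0f} to ${high:,.0f}")
```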
Key insight: this is not a dashboard modernization story. It is a real-time operational decisioning story. Redis helps the plant decide when to act, how to act, and whether the best action is to keep running, slow down, reroute, or intervene now.
Per-Event Outcome
Late reaction, fragmented systems: unplanned stop
Redis-powered next-best-action: controlled intervention
At plant scale, even modest improvements in intervention timing compound into higher OEE, lower maintenance cost, and fewer service failures.
Stage 8: Outcome
Same reliability workbench. Different decision layer.
The UI does not need to be reinvented. The difference is what the operations and maintenance teams can decide with it. Without Redis, the workbench shows disconnected alerts and history. With Redis, it shows the best next action with the operational context needed to trust it.
Generic Reliability View
Line 7 anomaly
Alert: High vibration (static threshold crossed)
What the operator sees
Alarm history: No parts or labor context (View)
Batch status (MES): Separate screen (Switch)
Maintenance records (CMMS): Separate system (Search)
Alerts, not actions · Late intervention risk · High operator burden
Redis-Powered Workbench
Line 7 anomaly
Next best action: Slow to 82%
Dispatch tech in 17 min · bearing in stock
Why this is the right action
18-36 hrs RUL: Failure pattern match 89% (Risk)
Batch 82% complete: Protect same-day shipment (Throughput)
Technician + part ready: Lowest-disruption repair path (Act)
Context assembled in Redis
Maintenance confidence: 95%
Telemetry, production priority, repair readiness, and historical signature all point to the same next action.
Decision time: <18 ms · Confidence: 95% · Downtime risk: lower
Stage 9: The Architecture, Proven
One live decision loop for predictive maintenance and plant operations
SCADA, historian, MES, ERP, CMMS, and quality systems stay in place. RDI and Redis Feature Form make them operational. Redis RAM and Redis Flex serve the unified asset context. The decisioning stack returns the next best maintenance action in milliseconds, while the line is still running and the plant still has choices.
Decision Latency: <18 ms
Outcome: More uptime, less scrap
North Star: Uptime + quality + cost