Application of the Isolation k-means Method in Quality Control of Food Production

Modern systems monitoring production processes, especially in the food industry, generate enormous amounts of real-time data. Efficient detection of anomalies — deviations from the norm that may indicate machine failures, operator errors, contamination, or other quality threats — is one of the key challenges in ensuring production safety and continuity. In this context, data analysis methods capable of identifying non-standard patterns without prior labeling of data as correct or incorrect become particularly important.

One Promising Method: Isolation k-means

One promising method is Isolation k-means — an approach that combines the advantages of classical k-means clustering with the isolation mechanism characteristic of anomaly detection techniques such as Isolation Forest.

Isolation k-means is a hybrid data analysis method based on two fundamental assumptions.
First, it uses the k-means algorithm to group data into clusters — identifying natural structures and recurring patterns within a dataset.
Second, it analyzes the distance between data points and the centers (centroids) of their assigned clusters — assuming that data points lying far from a cluster center are potential anomalies.

Moreover, the algorithm can apply additional isolation criteria to separate points that are not only distant but also rarely co-occur with other observations.

This method becomes particularly valuable in the food industry, where detecting irregularities in real time can prevent serious problems such as product contamination, improper temperature conditions, or incorrect ingredient proportions. Thanks to its flexibility and ability to operate on unlabeled data, Isolation k-means can be effectively used to monitor production parameters, identify deviations from standard process profiles, and support decision-making by operators and quality engineers.

The following sections describe how the Isolation k-means method can be adapted to time-series data in food production and what benefits its application may bring to real-world quality control systems.

Unsupervised Learning

In data science and machine learning, we distinguish between supervised and unsupervised learning.
In supervised learning, the algorithm is trained on examples that a human has already labeled with the correct answer, whereas in unsupervised learning it must find structure in the data without such labels.

Unsupervised learning makes it possible to work with unlabeled data. The algorithm does not need previously prepared labels, that is, information indicating which data are correct (normal) and which are faulty (anomalies).

For example, in classical supervised learning, each reading must be labeled beforehand: a temperature of 60 °C is labeled as correct, while a reading above 95 °C is labeled as too high.

In practice — especially in food production — we often do not have such critical thresholds. It is difficult to pre-define which cases are anomalies, particularly when the production line frequently changes its assortment, leading to very short production batches.

That is precisely why algorithms operating on unlabeled data, that is, according to the principle of unsupervised learning, are flexible and well suited to this type of production environment. These algorithms learn patterns directly from the data, independently identifying what is “typical” and what deviates significantly from it.


The k-means Clustering Method

k-means clustering is one of the most popular exploratory data analysis methods in machine learning.
It groups objects based on their features and is used, for example, in customer behavior analysis, image segmentation, genetic data exploration, and anomaly detection.

It is a myth that clustering can only be done in two-dimensional (2D) or three-dimensional (3D) space. In practice, data often have a dozen, dozens, or even hundreds of features (e.g., sensor data from a production line), and k-means still works — just without visualization, since analysis takes place in multidimensional space.

Why mention this?
Because research is often conducted for a client who usually needs to see the division, which leads to generating cluster plots. When the studied phenomenon has, for example, 14 features, techniques such as PCA (Principal Component Analysis) or nonlinear methods like t-SNE (t-distributed Stochastic Neighbor Embedding) are used to project those 14 features onto 2–3 derived dimensions that can be plotted.

In production practice, using k-means clustering might look as follows:
We have a sample of meat with slightly lower temperature and 23 other measured features. These features place it in Cluster A. Another sample, with different values of those 24 features, falls into Cluster B. We also have Clusters C, D, and E.

In practical terms, this clustering does not necessarily mean anything — it simply reflects differences between products on the production line. Products can be assigned to clusters due to minimal variations in feature magnitudes. The process may be conducted for purely technical reasons.

The k-means method belongs to the group of unsupervised learning algorithms, meaning it does not require pre-labeled data. The algorithm independently finds structures (clusters) within the dataset.

The main goal of k-means is to divide a dataset into k non-overlapping groups (clusters) so that points within the same cluster are as similar as possible to each other and as different as possible from points in other clusters (e.g., groups of meat samples).
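The grouping described above can be sketched with Lloyd's algorithm, the usual iterative procedure behind k-means. This is a minimal, dependency-free illustration rather than a production implementation: the function name `kmeans`, the random initialization, and the sample values are assumptions made for the example, and a library such as scikit-learn would normally be used instead.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Minimal Lloyd's algorithm: returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct observations picked at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing: converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


# Hypothetical, already-scaled samples with two features each.
samples = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.2),  # one tight group
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),  # a second tight group
]
centroids, labels = kmeans(samples, k=2)
print(labels)
```

The cluster numbers themselves carry no meaning; only which samples share a label matters.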

The quality of clustering is measured using the Sum of Squared Errors (SSE), also known as Within-Cluster Sum of Squares (WCSS) — the sum of squared distances between each point and its cluster centroid. The smaller the SSE, the tighter the clusters — meaning the points are closer to their centers.
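As a small worked illustration of this measure, the sketch below computes the SSE for a clustering that is assumed to be already given. The helper name `wcss` and the four points are hypothetical and chosen so that the arithmetic is easy to follow.

```python
import math


def wcss(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Hypothetical clustering result: two clusters in a two-feature space.
points = [(1.0, 1.0), (1.0, 3.0), (6.0, 5.0), (8.0, 5.0)]
labels = [0, 0, 1, 1]
centroids = [(1.0, 2.0), (7.0, 5.0)]

sse = wcss(points, labels, centroids)
print(sse)  # every point lies exactly 1.0 from its centroid, so the SSE is 4.0
```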

In practice, this means that meat samples from one production batch can be significantly distinguished from each other. This can be useful for analyzing events related to, for example, bacterial contamination or harmful impurities in products.

The Isolation k-means Anomaly Detection Method

Isolation k-means combines two approaches:

clustering (k-means) for identifying typical data patterns, and
isolation of anomalies similar to methods such as Isolation Forest, focusing on identifying points that significantly deviate from “normal” groups (clusters).

Initially, data are grouped using the classical k-means algorithm. Centroids — cluster centers representing the most frequent patterns — are created.

For each data point, the distance from its assigned centroid is calculated. The greater the distance, the less the point fits its cluster.
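The distance step can be sketched as follows. The centroids are assumed to come from an earlier k-means run on scaled data; their values, the observations, and the helper name `nearest_centroid` are made up for illustration.

```python
import math

# Hypothetical centroids, assumed to come from an earlier k-means fit.
centroids = [(0.0, 0.0), (4.0, 4.0)]


def nearest_centroid(point, centroids):
    """Return (cluster index, distance) for the closest centroid."""
    distances = [math.dist(point, c) for c in centroids]
    idx = min(range(len(centroids)), key=distances.__getitem__)
    return idx, distances[idx]


observations = [(0.3, 0.4), (4.0, 3.0), (10.0, 12.0)]
for obs in observations:
    cluster, dist = nearest_centroid(obs, centroids)
    print(obs, "-> cluster", cluster, "distance", round(dist, 2))
```

The third observation is assigned to the second cluster like any other point, but its distance is an order of magnitude larger than the others, which is the signal the method relies on.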

For example, a quality-monitoring system on a packaging line may detect five different clusters of sausages. It’s still the same product (one SKU), but the system has found subgroups. Isolation k-means evaluates how isolated each point is from others. Points very distant from any subgroup are treated as potential anomalies.

Although this is an unsupervised method, it is possible to adjust anomaly detection sensitivity, for instance by setting a 95th percentile distance threshold, above which points are considered anomalies, or by using a normalized distance scale such as Z-score.
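Both sensitivity rules can be sketched on a list of centroid distances. The distances are hypothetical, and the cut-off of three standard deviations for the Z-score rule is an assumption chosen for the example, not a value taken from the method itself.

```python
import statistics

# Hypothetical distances from points to their assigned centroids.
distances = [0.4, 0.5, 0.6, 0.5, 0.7, 0.3, 0.6, 0.5, 0.4, 0.6,
             0.5, 0.7, 0.4, 0.6, 0.5, 0.3, 0.5, 0.6, 0.4, 3.9]

# Percentile rule: the 95th percentile of the observed distances is the cut-off.
# quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
threshold = statistics.quantiles(distances, n=100, method="inclusive")[94]
flagged_by_percentile = [i for i, d in enumerate(distances) if d > threshold]

# Z-score rule: flag distances more than 3 standard deviations above the mean
# (the value 3 is an assumed, adjustable sensitivity setting).
mean = statistics.fmean(distances)
std = statistics.pstdev(distances)
z_scores = [(d - mean) / std for d in distances]
flagged_by_z = [i for i, z in enumerate(z_scores) if z > 3.0]

print(flagged_by_percentile, flagged_by_z)
```

Note that a percentile rule always flags roughly the top 5% of points, even when nothing is wrong, whereas a Z-score rule flags nothing if all distances are similar. Which behavior is preferable depends on how costly false alarms are on a given line.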

How to Prepare Data for Isolation k-means

Because the method is based on k-means, data must be properly prepared — just as in classical clustering.

Feature standardization (scaling): all features should have similar scales (e.g., mean = 0, standard deviation = 1). Otherwise, features with larger numerical values will dominate the distance metric. Alternatively, normalization can be used to transform feature values into a defined range (e.g., −1 to 1).

Mathematical models usually interpret numerically large values as more significant than smaller ones. For instance, in the case of meat, a temperature of 4 °C would be considered less important than a humidity of 62%.
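The effect of scaling can be sketched on two hypothetical columns, one with values near 4 and one with values near 62. The function names `standardize` and `scale_to_range` and the readings are assumptions made for the example.

```python
import statistics


def standardize(column):
    """Z-score scaling: the result has mean 0 and standard deviation 1."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # a constant column would give std = 0
    return [(x - mean) / std for x in column]


def scale_to_range(column, low=-1.0, high=1.0):
    """Min-max normalization into the interval [low, high]."""
    lo, hi = min(column), max(column)
    return [low + (x - lo) * (high - low) / (hi - lo) for x in column]


# Hypothetical readings: temperature in °C and relative humidity in percent.
temperature = [3.5, 4.0, 4.5, 4.0]
humidity = [60.0, 62.0, 64.0, 62.0]

temp_z = standardize(temperature)
hum_z = standardize(humidity)
print(temp_z)
print(hum_z)
print(scale_to_range(humidity))
```

After standardization the two columns become identical, because each reading deviates from its own mean by the same number of standard deviations. Neither feature dominates the distance any longer. A real pipeline would also need to handle constant columns, which this sketch does not.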

Feature selection: choose variables that meaningfully represent process “normality.”
In food production, these may include temperature, production time, concentration levels, pressure, and humidity.
In industrial environments, many parameters (e.g., motor power or water usage) may have no relation to product quality and can unnecessarily clutter analysis.

Use of historical data:
The dataset should contain as much data as possible so that k-means can learn typical clusters.
In streaming (real-time) measurements, sample windows should include at least 100 products.

Food-production sensor data usually take the form of time series. They can be transformed into feature vectors using rolling windows, moving averages, standard deviations, etc.
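One possible way to build such feature vectors is sketched here. The choice of statistics (mean, standard deviation, minimum, maximum), the window length, and the readings are illustrative assumptions.

```python
import statistics


def rolling_features(series, window):
    """Turn a 1-D time series into (mean, std, min, max) vectors per window."""
    features = []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]
        features.append((
            statistics.fmean(chunk),
            statistics.pstdev(chunk),
            min(chunk),
            max(chunk),
        ))
    return features


# Hypothetical temperature readings (°C) sampled at a fixed interval.
readings = [72.0, 72.0, 72.0, 72.0, 75.0, 78.0]
vectors = rolling_features(readings, window=3)
for v in vectors:
    print(v)
```

Each resulting tuple is one observation that can be scaled and passed to the clustering step. A stable stretch yields a standard deviation of zero, while the rising tail of the series shows up as a higher mean and a larger spread.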

Unlike classical k-means clustering, which assigns every point (product) to some cluster, Isolation k-means allows that certain points may not belong to any cluster — they are then interpreted as anomalies.

Practical Example: Isolation k-means in a Dairy

We have a dairy production line manufacturing yogurt. The process is strictly controlled and includes several key technological parameters that must remain within defined ranges to ensure safety and final product quality.

We measure six key features:

Milk pasteurization temperature (°C)
Fermentation time (minutes)
Fat content (%)
Protein content (%)
Final product pH
Number of live bacterial cultures (log CFU/ml)

Data from sensors and analyzers are collected in real time in transactional form. Each batch of 40 yogurts has a unique production number.
Each batch can thus be represented as a point in a six-dimensional feature space.
We do not analyze each feature separately, but treat them as one coherent observation — a complete process profile for that batch.

The k-means algorithm groups many earlier correct batches into clusters representing typical parameter configurations. For each cluster, a centroid is created.

When a new batch is produced, its profile (six features) is compared to the centroids.
If it lies significantly far from the nearest cluster (e.g., very short fermentation time combined with high pH and low bacterial culture count), it is treated as a potential anomaly.
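The comparison of a new batch with the centroids might look like the sketch below. Every number in it is invented for illustration and is not a real process limit: the centroids, the per-feature scales, and the threshold are assumed to have been derived beforehand from historical batches, and the names `score_batch` and `scaled_distance` are hypothetical.

```python
import math

FEATURES = ["pasteurization_temp_c", "fermentation_min", "fat_pct",
            "protein_pct", "ph", "cultures_log_cfu_ml"]

# Assumed per-feature standard deviations estimated from historical batches.
SCALE = [1.0, 15.0, 0.2, 0.2, 0.1, 0.3]

# Assumed centroids from k-means on earlier accepted batches (original units).
CENTROIDS = [
    [85.0, 360.0, 3.0, 3.5, 4.4, 8.0],
    [90.0, 300.0, 1.5, 4.0, 4.3, 8.2],
]

# Assumed cut-off in scaled units, e.g. a high percentile of past distances.
THRESHOLD = 3.0


def scaled_distance(batch, centroid):
    """Euclidean distance after dividing each feature difference by its scale."""
    return math.sqrt(sum(((b - c) / s) ** 2
                         for b, c, s in zip(batch, centroid, SCALE)))


def score_batch(batch):
    """Return (nearest cluster, distance, is_anomaly) for one batch profile."""
    distances = [scaled_distance(batch, c) for c in CENTROIDS]
    nearest = min(range(len(CENTROIDS)), key=distances.__getitem__)
    return nearest, distances[nearest], distances[nearest] > THRESHOLD


typical = [85.5, 365.0, 3.1, 3.5, 4.4, 8.1]
# Short fermentation combined with high pH and a low culture count.
suspect = [85.0, 240.0, 3.0, 3.5, 4.8, 7.0]

print(score_batch(typical))
print(score_batch(suspect))
```

The suspect batch is still matched to the first cluster, since that is the closest one, but its scaled distance is far above the threshold, so it is reported as a potential anomaly.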

In streaming mode, data are analyzed continuously.
Thanks to the ability to quickly compute distances to centroids, Isolation k-means can immediately identify batches with unusual profiles — before the product reaches packaging or distribution.
When combined with an alert system, operators can react instantly — stopping the line or conducting additional laboratory tests.
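A streaming check of this kind could be sketched as a small class that keeps a sliding window of recent centroid distances and compares each new distance with a percentile of that window. The class name `StreamingThreshold`, the 95th percentile, and the decision to keep flagged values out of the history are assumptions made for this example. The window of 100 follows the minimum sample size mentioned earlier.

```python
import statistics
from collections import deque


class StreamingThreshold:
    """Flags a distance that exceeds a percentile of the recent history."""

    def __init__(self, window=100, percentile=95):
        self.history = deque(maxlen=window)
        self.percentile = percentile

    def update(self, distance):
        """Return True if `distance` is anomalous relative to the window."""
        is_anomaly = False
        # No decision is made until the window is full.
        if len(self.history) == self.history.maxlen:
            cut_points = statistics.quantiles(
                self.history, n=100, method="inclusive")
            is_anomaly = distance > cut_points[self.percentile - 1]
        # Assumed design choice: flagged values are not stored, so that
        # outliers do not inflate the threshold for later batches.
        if not is_anomaly:
            self.history.append(distance)
        return is_anomaly


detector = StreamingThreshold(window=100)
# Hypothetical stream: 100 ordinary distances, then one far-off batch.
stream = [0.5 + 0.01 * (i % 10) for i in range(100)] + [4.0]
alerts = [i for i, d in enumerate(stream) if detector.update(d)]
print(alerts)
```

Keeping flagged values out of the history makes the threshold stable, but it also means the detector adapts slowly if the process legitimately shifts, for example after a recipe change. In that case the window would need to be reset or refilled.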

Advantages of the Isolation k-means Method

Full feature profile, not just individual limits.
Many issues arise from combinations of parameters that individually fall within limits but together form an unnatural case.
No need for predefined thresholds or quality standards.
This is especially useful for short production series. Production anomalies are rare and difficult to define in advance.
Simplicity and speed.
Isolation k-means is relatively simple and efficient, allowing real-time use in quality monitoring systems.
Robustness to scale differences.
After appropriate feature standardization or normalization, the algorithm operates stably even with varying measurement units.

Summary

This article demonstrated how the Isolation k-means method can be successfully applied to anomaly detection in food production processes.
Combining classical clustering with isolation analysis allows for a more effective quality assessment approach than traditional threshold-based methods.

By analyzing complete process feature profiles rather than individual values, the method can identify subtle yet potentially critical deviations.
Its flexibility, lack of dependence on predefined limits, and real-time applicability make it particularly attractive for production environments where reliability and rapid response are crucial.

In the era of digital transformation in the food industry, methods like Isolation k-means may become the foundation of intelligent quality control and predictive maintenance systems — supporting consumer safety and process efficiency.

THE DATA SCIENCE LIBRARY

Wojciech Moszczyński