Applying machine learning to data acquired across the product lifecycle to build manufacturing intelligence that yields positive impacts in all aspects of manufacturing.

I specifically identified the main process conditions affecting the production of defective products in injection molding and die-casting manufacturing. The research is summarized in the following table:

Project Synopsis
Title: Explainable Machine Learning for fault detection in the cyclic manufacturing industry
  • Ensemble learning models can predict product quality in the injection molding and die-casting processes.
  • Rule-based explanations and rule lattices are developed for interpreting the ensemble models.
  • The methods generate decision rules and hierarchical visualizations to better understand the process conditions affecting the quality of the final products.
Period: August 2020 - December 2022
  • Ensemble learning
  • Explainable Machine Learning
  • Rule-based classifiers
  • Time series analysis
  • Injection Molding Process
  • Die-casting Process
  • Product quality prediction
  • Academic writing

Research details

Smart manufacturing can be defined as the conversion of data acquired across the product lifecycle into manufacturing intelligence with the purpose of yielding positive impacts in all aspects of manufacturing [1]. With the introduction of ubiquitous technologies, such as IoT sensor networks, data collection in manufacturing production lines is becoming a de facto task in the Industry 4.0 era. Thus, applying machine learning techniques that use these data effectively is now a competitive advantage that aids several areas, such as equipment supervision and product quality control.

One of the main research projects in which I have been involved over the last three years belongs to smart manufacturing technologies, more specifically, quality prediction in cyclic manufacturing. Cyclic manufacturing processes are those in which parts are produced by repeating the same sequence of steps, which forms the manufacturing cycle. The data generated in these processes describe the state of the process for each cycle [2]. I mainly worked with two cyclic processes: injection molding and die-casting.

Machine learning for quality prediction

The general processing framework for generating quality prediction models in die-casting manufacturing is presented in Figure 1. The process is identical for injection molding. Process condition data are collected directly from casting and molding machines, such as Arburg or TBC. The collected time series data are preprocessed, and suitable features are extracted to serve as input for machine learning models. After that, predictive models are trained and used for quality prediction.

Figure 1. Traditional machine learning process for quality prediction in cyclic processes.

Injection molding and die-casting data

Because of the cyclic nature of the process, the collected process condition data take the form of time series. Therefore, we preprocess the data in two steps:

  1. Lot mapping. Each time series is mapped to a lot number using the timestamp recorded for each process condition. This is shown in Figure 1 (b): each color represents a lot, and each time series of that lot represents the recorded process conditions for that lot of products.
  2. Feature extraction from each lot. We summarize each process condition by extracting summary statistics and process control parameters.

With the previous steps, we construct the training data used to train the quality prediction models.
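As a minimal sketch of the two preprocessing steps above, the mapping and feature extraction could look like the following. The lot start times, the lot identifiers, the condition name `cylinder_temp`, and the chosen statistics are all hypothetical placeholders, not values from the real dataset, and process control parameters are left out.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# Hypothetical lot start times (in seconds) and lot identifiers, sorted by time.
lot_starts = [0, 100, 200]
lot_ids = ["lot-A", "lot-B", "lot-C"]

# Hypothetical readings: (timestamp, process condition name, value).
readings = [
    (5, "cylinder_temp", 290.0),
    (50, "cylinder_temp", 292.0),
    (120, "cylinder_temp", 295.0),
    (180, "cylinder_temp", 297.0),
    (210, "cylinder_temp", 291.0),
]


def assign_lot(timestamp):
    """Step 1: return the lot whose start time is the latest one not after timestamp."""
    index = bisect_right(lot_starts, timestamp) - 1
    if index < 0:
        raise ValueError("timestamp precedes the first lot")
    return lot_ids[index]


def extract_features(readings):
    """Step 2: group readings by (lot, condition) and summarize each group."""
    groups = {}
    for timestamp, condition, value in readings:
        groups.setdefault((assign_lot(timestamp), condition), []).append(value)

    features = {}
    for (lot, condition), values in groups.items():
        row = features.setdefault(lot, {})
        row[f"{condition}_mean"] = mean(values)
        row[f"{condition}_std"] = pstdev(values)  # population std, defined for one value
        row[f"{condition}_min"] = min(values)
        row[f"{condition}_max"] = max(values)
    return features


features = extract_features(readings)
for lot, row in features.items():
    print(lot, row)
```

Each lot becomes one row of the training set, with one group of columns per process condition.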

Tree ensembles for quality prediction

One of my areas of expertise in machine learning is the application of ensemble models to common problems in different domains. Ensemble learning algorithms combine the predictions of several simpler models to improve accuracy and reduce variance relative to each base model. The most widely used models of this type are tree ensembles, which use decision trees as base models. There are two main types of tree ensembles: boosting and bagging. Figure 2 presents a graphical representation of the difference between the two. On the one hand, boosting algorithms build the base trees sequentially, and each subsequent tree attempts to correct the errors of the previously trained trees. On the other hand, bagging algorithms build all the base trees in parallel, and each tree is independent of the others.

Figure 2. Graphical representation of (a) boosting and (b) bagging ensemble algorithms.

After forming the training set (see Figure 1 (c) and (d)), we trained several tree ensembles using widely used algorithms such as XGBoost, LightGBM and Random Forests. The downside of ensembles is that their results are hard to interpret, mainly because of the size of the ensemble and its inner mechanisms [3]. Therefore, even if we obtain accurate quality predictions, we would also like to understand the decision mechanisms of the predictive model in order to identify the main process conditions that affect the production of defective products.
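To make the difference between the two strategies concrete, here is a toy sketch in plain Python. It is an assumption-laden simplification: it uses one-split regression trees (stumps) on a single made-up feature, squared-error loss, and a fixed learning rate. It is not the XGBoost, LightGBM, or Random Forest implementation, and the data are invented.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree (stump) that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        if not right:
            continue  # a split must leave samples on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((y - left_mean) ** 2 for y in left)
        error += sum((y - right_mean) ** 2 for y in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    if best is None:  # all x values identical: predict the mean
        avg = sum(ys) / len(ys)
        return lambda x: avg
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_bagging(xs, ys, n_trees, rng):
    """Bagging: independent stumps on bootstrap samples, predictions averaged."""
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: sum(tree(x) for tree in trees) / len(trees)


def fit_boosting(xs, ys, n_trees, learning_rate=0.5):
    """Boosting: each stump is fit to the residuals left by the previous ones."""
    base = sum(ys) / len(ys)
    residuals = [y - base for y in ys]
    trees = []
    for _ in range(n_trees):
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        residuals = [r - learning_rate * tree(x) for x, r in zip(xs, residuals)]
    return lambda x: base + learning_rate * sum(tree(x) for tree in trees)


def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Invented data: a three-level step function that a single stump cannot fit.
xs = list(range(10))
ys = [1.0] * 3 + [3.0] * 3 + [6.0] * 4

bagged = fit_bagging(xs, ys, n_trees=25, rng=random.Random(0))
boost_1 = fit_boosting(xs, ys, n_trees=1)
boost_20 = fit_boosting(xs, ys, n_trees=20)
print("bagging MSE:", mse(bagged, xs, ys))
print("boosting MSE, 1 tree:", mse(boost_1, xs, ys))
print("boosting MSE, 20 trees:", mse(boost_20, xs, ys))
```

The bagging loop could run in any order or in parallel because no tree depends on another, while the boosting loop is inherently sequential because each tree needs the residuals of its predecessors.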

Explainable machine learning for process conditions

Providing meaningful explanations for tree ensemble models is my main contribution to the smart manufacturing field. As presented in my main portfolio project, I specialize in extracting simplified rulesets from tree ensembles. These rulesets minimize the loss in performance with respect to the original ensemble and provide smaller models in the form of decision rules that resemble the way humans reason. Thus, the first step in explaining an ensemble model is to extract a simplified and accurate ruleset that is easier to understand than the complex ensemble.

After that step, I presented two different methods that build on the ruleset to explain the main process conditions that affect the production of defective products.
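The starting point of any such extraction is that every root-to-leaf path of a decision tree can be read as an IF-THEN rule. The sketch below shows only that reading step for a single hand-written tree. It is not the simplification method described above: the tree structure, the feature names, and the 50.0 threshold are hypothetical, and combining and pruning rules across a whole ensemble is not covered.

```python
# Hypothetical tree: internal nodes test "feature <= threshold",
# leaves hold a class label. Names and the 50.0 threshold are invented.
tree = {
    "feature": "cylinder_zone2_mean",
    "threshold": 294.972,
    "left": {"label": "normal"},
    "right": {
        "feature": "opening_force_mean",
        "threshold": 50.0,
        "left": {"label": "normal"},
        "right": {"label": "defect"},
    },
}


def extract_rules(node, conditions=()):
    """Return one (conditions, label) rule per root-to-leaf path."""
    if "label" in node:
        return [(list(conditions), node["label"])]
    feature, threshold = node["feature"], node["threshold"]
    left = extract_rules(node["left"], conditions + ((feature, "<=", threshold),))
    right = extract_rules(node["right"], conditions + ((feature, ">", threshold),))
    return left + right


rules = extract_rules(tree)
for conditions, label in rules:
    body = " AND ".join(f"{feature} {op} {value}" for feature, op, value in conditions)
    print(f"IF {body} THEN {label}")
```

Applied to every tree of an ensemble, this yields a large and redundant pool of rules, which is why a simplification step is needed before the rules become readable.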

Rule-based explanations for the injection molding manufacturing

In [4], I introduced a method for enhancing the explanations provided by rule-based classifiers. Figure 3 introduces the main elements of the rule-based explanations for injection molding manufacturing.

  1. Rule performance metrics, such as coverage and confidence, show globally what percentage of the training data is covered by the rule and how accurate it is.
  2. The rule conditions specify which process conditions compose the rule. Each includes the split value, which helps to understand how the values of that process condition affect the probability of producing defective products.
  3. The feature importance of each process condition provides a quantifiable measure of how much each feature contributed when constructing the predictive model.
  4. Two visual plots, Partial Dependence Plots (PDP) and Individual Conditional Expectation (ICE) plots, show how the values of each process condition affect the output of the predictions.
  5. The decision of the rule, either defective or normal product.
  6. A 3D interaction PDP that includes the distribution of the samples used to create the rule.
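The PDP and ICE plots in item 4 rest on a simple computation, sketched here under stated assumptions. The function `toy_model` is a hypothetical stand-in for a trained ensemble, and the feature names, the rows, and the 50.0 threshold are invented for illustration. An ICE curve sweeps one feature over a grid for a single sample, and the PDP is the point-wise average of the ICE curves.

```python
def toy_model(row):
    """Hypothetical stand-in for a trained ensemble's defect probability."""
    score = 0.1
    if row["cylinder_zone2_mean"] > 294.972:
        score += 0.5
    if row["opening_force_mean"] > 50.0:  # invented threshold
        score += 0.3
    return score


def ice_curves(model, rows, feature, grid):
    """One curve per row: predictions as `feature` sweeps over `grid`."""
    curves = []
    for row in rows:
        curve = []
        for value in grid:
            modified = dict(row)  # copy so the original sample is untouched
            modified[feature] = value
            curve.append(model(modified))
        curves.append(curve)
    return curves


def partial_dependence(model, rows, feature, grid):
    """PDP: the point-wise average of the ICE curves."""
    curves = ice_curves(model, rows, feature, grid)
    return [sum(column) / len(curves) for column in zip(*curves)]


# Invented samples and grid.
rows = [
    {"cylinder_zone2_mean": 290.0, "opening_force_mean": 40.0},
    {"cylinder_zone2_mean": 296.0, "opening_force_mean": 60.0},
]
grid = [290.0, 300.0]

ice = ice_curves(toy_model, rows, "cylinder_zone2_mean", grid)
pdp = partial_dependence(toy_model, rows, "cylinder_zone2_mean", grid)
print("ICE:", ice)
print("PDP:", pdp)
```

Plotting each ICE curve as a thin line and the PDP as a thick one gives the usual combined view of per-sample and average effects.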

Interpreting the rule presented in Figure 3 is much simpler than understanding an accurate tree ensemble. In this case, we can see that the main process conditions that increase the probability of defective products are “Cylinder heating zone 2” and “Opening force”. When the mean value of “Cylinder heating zone 2” goes beyond 294.972, the probability of defective products increases. Moreover, the rule combining this process condition with “Opening force” covers around 30% of the training samples and is 90.3% accurate when making predictions.
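Coverage and confidence, as used above, can be computed directly from the training data. The following is a sketch on ten invented samples: the feature names and the 50.0 threshold for the opening force are assumptions, so the resulting numbers are not the 30% and 90.3% reported for the real rule.

```python
OPERATORS = {"<=": lambda a, b: a <= b, ">": lambda a, b: a > b}


def rule_matches(conditions, row):
    """True if the sample satisfies every condition of the rule."""
    return all(OPERATORS[op](row[feature], value) for feature, op, value in conditions)


def coverage_and_confidence(conditions, rule_label, rows, labels):
    """Coverage: share of samples the rule fires on.
    Confidence: share of those samples whose label equals the rule's decision."""
    covered = [label for row, label in zip(rows, labels) if rule_matches(conditions, row)]
    coverage = len(covered) / len(rows)
    confidence = covered.count(rule_label) / len(covered) if covered else 0.0
    return coverage, confidence


# Hypothetical rule; only the 294.972 split value comes from the text.
rule = [("cylinder_zone2_mean", ">", 294.972), ("opening_force_mean", ">", 50.0)]

# Invented training samples: (zone 2 mean temperature, opening force mean).
values = [(296, 60), (297, 55), (295, 52), (298, 70), (290, 60),
          (291, 40), (296, 45), (289, 30), (293, 65), (292, 35)]
rows = [{"cylinder_zone2_mean": t, "opening_force_mean": f} for t, f in values]
labels = ["defect"] * 3 + ["normal"] * 7

coverage, confidence = coverage_and_confidence(rule, "defect", rows, labels)
print(f"coverage={coverage:.2f}, confidence={confidence:.2f}")
```

Here the rule fires on four of the ten samples, and three of those four are defective.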

Figure 3. Template of rule-based explanations as introduced in [4].

Rule visualizations for the die-casting manufacturing

The second method, which I introduced in [5], is oriented toward understanding the hierarchical relationships between process conditions in a rule-based classifier and how they interact to produce defective or normal products. This method is based on a mathematical framework known as Formal Concept Analysis (FCA). The method produces rule lattices called RuleLat, as shown in Figure 4. The lattice resembles a decision tree, although the nature and functionality of the two are different. It is annotated with the coverage and confidence of the rules, and it uses visual aids such as the thickness of the edges and the size of the nodes, which are proportional to the number of instances covered by conditions and rules.

In the example presented here, we can see that conditions a14 (Cooling time), a18 (Temperature) and a23 (High speed time) are the main process conditions affecting the production of goods with imprint defects in die-casting manufacturing. These three conditions interact with other process conditions to form most of the rules that predict defective products.
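For readers unfamiliar with FCA, the sketch below derives the formal concepts of a tiny context in which the objects are rules and the attributes are the conditions they use. It illustrates only the general FCA derivation with a brute-force enumeration, not the RuleLat construction from [5]. The rule names r1 to r3 and their condition sets are invented, reusing the labels a14, a18 and a23 purely as attribute names.

```python
from itertools import combinations

# Hypothetical formal context: rule (object) -> conditions (attributes) it uses.
context = {
    "r1": {"a14", "a18"},
    "r2": {"a14", "a23"},
    "r3": {"a14", "a18", "a23"},
}
attributes = sorted(set().union(*context.values()))


def extent(attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(obj for obj, has in context.items() if attrs <= has)


def intent(objs):
    """Attributes shared by every object in objs."""
    if not objs:
        return frozenset(attributes)
    return frozenset(set.intersection(*(set(context[obj]) for obj in objs)))


def formal_concepts():
    """Brute force: close every attribute subset and keep the distinct results."""
    concepts = set()
    for size in range(len(attributes) + 1):
        for subset in combinations(attributes, size):
            objs = extent(frozenset(subset))
            concepts.add((objs, intent(objs)))
    return concepts


concepts = formal_concepts()
# Larger extents sit higher in the lattice, so print from top to bottom.
for objs, attrs in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(objs), sorted(attrs))
```

The top concept contains a14 alone because every rule uses it, which mirrors how a condition shared by many rules ends up near the top of a lattice. The enumeration is exponential in the number of attributes, so it is suitable only for small illustrative contexts.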

Figure 4. Example of RuleLat model as introduced in [5].


Machine learning models applied in the smart manufacturing field are an integral part of taking advantage of data to improve the yield in many aspects of manufacturing processes. However, in practice, the final users of the models need a way to understand the decision mechanisms of accurate predictive models.

Explainable machine learning is becoming an integral part of the adoption of machine learning in practical manufacturing applications, and here I introduced two methods that attempt to increase the applicability of machine learning models to product quality control.


[1] Tao, Fei, Qinglin Qi, Ang Liu, and Andrew Kusiak. “Data-driven smart manufacturing.” Journal of Manufacturing Systems 48 (2018): 157-169.

[2] Kozjek, Dominik, David Kralj, and Peter Butala. “Interpretative identification of the faulty conditions in a cyclic manufacturing process.” Journal of Manufacturing Systems 43 (2017): 214-224.

[3] Obregon, Josue, and Jae-Yoon Jung. “Explanation of ensemble models.” In Human-Centered Artificial Intelligence, pp. 51-72. Academic Press, 2022.

[4] Obregon, Josue, Jihoon Hong, and Jae-Yoon Jung. “Rule-based explanations based on ensemble machine learning for detecting sink mark defects in the injection moulding process.” Journal of Manufacturing Systems 60 (2021): 392-405.

[5] Obregon, Josue, and Jae-Yoon Jung. “Rule-based visualization of faulty process conditions in the die-casting manufacturing.” Journal of Intelligent Manufacturing (2022): 1-17.