Applied Science and Convergence Technology 2024; 33(6): 181-188
Published online November 30, 2024
https://doi.org/10.5757/ASCT.2024.33.6.181
Copyright © The Korean Vacuum Society.
Younji Leea, Jeong Eun Choia, Surin Ana, and Sang Jeen Hongb,*
aDepartment of Electronics Engineering, Myongji University, Yongin 17058, Republic of Korea
bDepartment of Semiconductor Engineering, Myongji University, Yongin 17058, Republic of Korea
Correspondence to: samhong@mju.ac.kr
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc-nd/4.0/) which permits non-commercial use, distribution and reproduction in any medium without alteration, provided that the original work is properly cited.
This paper proposes an equipment intelligence study that builds an integrated database system to collect equipment and sensor data generated during real-time processes, which are in turn used to diagnose abnormalities in semiconductor manufacturing equipment. The integrated database system leverages edge computing to convert data collected simultaneously from equipment and sensors into standardized communication protocols on a single central server. The constructed system not only detects equipment abnormalities in advance but also helps identify their exact cause by collecting optical emission spectroscopy sensor data together with equipment data, such as voltage, pressure, and gas flow, generated during the SiO2 deposition process. The system determines which of the collected data are meaningful and feeds them into a fault detection and classification algorithm for analysis. This research is promising for greatly improving the efficiency and yield of the overall continuous semiconductor manufacturing process.
Keywords: Fault detection and classification, Machine learning, Integrated database system, Equipment intelligence system
For decades, the semiconductor manufacturing industry has collected and stored large amounts of data to increase process efficiency and productivity. With the advent of the Fourth Industrial Revolution, the electronics manufacturing industry has developed automated production systems with human-led data management to control complex processes and products [1]. However, the operation of various processes and equipment can lead to unpredictable errors. To address this issue, the Industry 4.0 concept is applied to analyze processes using big data, data mining, and deep learning technologies [2,3]. For example, work has been conducted to improve efficiency using various machine learning techniques. These methods aim to eliminate extraneous signal data and address issues related to the identification of defects caused by noisy data generated during operation [4,5].
The semiconductor manufacturing industry has developed for more than 50 years under the paradigm of increasing the number of transistors on individual chips according to Moore's Law; however, chip technology has gradually approached its physical limit as device sizes have decreased [6,7]. Numerous efforts have been made to overcome these technical limitations, and semiconductor technologies are being taken in various directions beyond Moore's Law toward 'More Moore', 'More than Moore', and 'Beyond complementary metal–oxide–semiconductor' [8]. The progress in semiconductor development has brought an increase in the number of different materials and technologies and, accordingly, greater difficulty in process and equipment development [9]. As a result, the manufacturing process is monitored to ensure that defects are discovered; however, the volume of data generated in the semiconductor manufacturing process is so large that identifying defects and solving issues to achieve the desired yield is challenging. Owing to the complexity of the process, manufacturing can be affected by numerous variables that lead to device failure and product rejection. Because fine errors in individual parts or processes are directly related to productivity, sensitive control is required during the manufacturing process [10]. To prevent such minor failures and errors, a system is required that monitors process and equipment data in real time and predicts and diagnoses preliminary symptoms of equipment malfunction [11].
Recently in the semiconductor manufacturing sector, the analysis of manufacturing data collected from equipment, such as status and external sensor data, has become of even greater interest. Many semiconductor equipment vendors such as Applied Materials [12], KLA Corporation [13], and ASML [14] are investing in the development of intelligent systems to improve process efficiency and yield by utilizing data analysis and machine learning technology. Intel, a leading semiconductor manufacturer, uses data analysis and machine learning to optimize its processes and increase its production yield [15]. These companies are developing semiconductor intelligence systems that collect and analyze data from various sources, including equipment sensors, to identify patterns and anomalies in the data and are conducting research and development to predict equipment failures and optimize process parameters to increase yields. However, research on such equipment intelligence has encountered difficulties in detecting and classifying errors and diagnosing anomalies owing to the limited amount of data available for security reasons. In addition, massive computing resources are required to select meaningful information from the large body of real-time data generated during the manufacturing process. This limitation imposes high costs and large capital investments, thus making it difficult for general research institutes and companies to conduct research in this area. Furthermore, the wide range of variables used for diagnosis and interpretation necessitates a heavy workload which can lead to data errors. These errors reduce the reliability of the data, thus making it difficult to maintain the accuracy and stability of the process abnormality diagnosis using algorithms. Because of these problems, current research on intelligent systems for semiconductor equipment is insufficient.
To address these limitations, we present an approach for diagnosing abnormal conditions in manufacturing processes using meaningful data that affect the process by implementing an integrated database system capable of collecting equipment and sensor data simultaneously. Using data acquired in real time, we aim to maintain normal conditions by reducing the number of unnoticed or undetected anomalies. Through this research, we build an integrated database system that can simultaneously collect equipment and sensor data that affect the manufacturing process, thereby increasing data reliability. In addition, we construct a model that selects data with a high correlation to the process using domain knowledge to improve the interpretation efficiency and accuracy. In this study, we used 300 mm plasma-enhanced chemical vapor deposition (PECVD) advanced production tools. The employed PECVD system consists of six process chambers with a cluster control computer, and a specific chamber for three-dimensional NAND (negative-AND) dielectric deposition was used. The fault scenario involves the degradation of the mass flow controller (MFC) for SiH4 gas, which is the most important chemical reactant for silicon oxide deposition in PECVD. Experiments on the MFC failure scenario were performed, and optical emission spectroscopy (OES) data were collected to acquire plasma chemistry information. The collected plasma data were fed into a machine-learning algorithm to determine the abnormal state of the equipment. Section 2 describes the implementation of an integrated database system for collecting data from semiconductor manufacturing equipment and sensors. Section 3 focuses on the methods used to detect anomalies and classify the causes of equipment degradation, providing detailed information on the equipment and sensors used in the process, as well as the modeling methods and results.
Section 4 presents the results of the equipment abnormality detection and classification using data collected from the implemented database. Finally, conclusions and suggestions for future research are discussed.
Modern manufacturing processes require services that analyze and visualize data provided by data platforms, such as deep learning and digital twins. These data platforms are critical components of modern computing and can store, organize, and retrieve vast amounts of data. Such a platform comprises three modules: data collection, storage, and calculation [16]. The data collection module collects data from various sources, including sensors, machines, and other devices. The collected data may be presented in different formats, such as analog or digital signals, and may have different characteristics, such as frequency and amplitude. Specialized hardware and software, such as data acquisition systems (DASs) and data logging devices, are used for collection and processing. Database systems are responsible for collecting and storing data and must be shared and accessed by multiple users or applications simultaneously. Database management systems (DBMSs), such as MySQL, Oracle, SQL Server, and MariaDB, are commonly used to manage and operate the databases. Each DBMS has unique characteristics and applications, allowing users to select the most suitable one for a specific use. Several types of DBMSs are available, including hierarchical, network, relational, object-oriented, and object-relational. Among these, relational DBMSs, including MySQL, are currently the most widely used and store data in tables [17]. The structured query language (SQL) is used to query databases in DBMSs and can be used to search for or insert the desired data. In this study, a structured table-type database using MySQL was implemented to store and manage the collected data. A well-designed database schema helps the database remain up-to-date and relevant over time [18]. In the schema designed for this database system, all collected equipment and sensor data were integrated into one table so that the data generated during the entire process could be collected and analyzed.
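As a concrete illustration of the single-table design described above, the following sketch creates a miniature version of such an integrated table. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and every table and column name (process_log, rf_forward, oes_si_297, and so on) is an illustrative assumption rather than the authors' actual schema.

```python
import sqlite3

# Miniature stand-in for the integrated equipment + sensor table;
# SQLite is used here in place of MySQL, and all names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE process_log (
        ts          TEXT    NOT NULL,   -- collection timestamp (1 s interval)
        run_id      INTEGER NOT NULL,   -- fault-scenario run number
        rf_forward  REAL,               -- equipment data: forward RF power (W)
        pressure    REAL,               -- chamber pressure (mTorr)
        sih4_flow   REAL,               -- SiH4 MFC actual flow (sccm)
        oes_si_297  REAL                -- OES intensity near the Si 297.1 nm peak
    )
""")
conn.execute(
    "INSERT INTO process_log VALUES (?, ?, ?, ?, ?, ?)",
    ("2024-01-01T00:00:00", 1, 400.0, 2250.0, 200.0, 1532.7),
)
row = conn.execute(
    "SELECT sih4_flow FROM process_log WHERE run_id = 1"
).fetchone()
print(row[0])  # 200.0
```

Keeping every signal in one wide table, as the text describes, lets a single query retrieve a time-aligned row of equipment and sensor values.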
The SEMI equipment communications standard/generic equipment model (SECS/GEM) communication protocol, which is used to transmit and receive data from semiconductor equipment, consists of the SECS-I, SECS-II, and high-speed message service (HSMS) protocols based on SEMI standards. The HSMS protocol defines message transmission [19], and the SECS-II protocol defines the message content structure, as listed in Table I [20]. The PECVD deposition equipment connected to the database system uses HSMS communication in the SECS/GEM protocol, and the external sensors use Modbus-based protocols over RS-232 and TCP/IP communication.
Table I. Different types of device data format.

| Device | DBMS model | Data size | Description |
|---|---|---|---|
| RF generator | DATA4,5 | 2 bytes | Load position |
| | DATA6 | 1 byte | Power mode |
| | DATA7 | 1 byte | Comm mode |
| | DATA8 | 1 byte | RF power ON/OFF |
| | DATA10,11 | 2 bytes | Tune position |
| | DATA14,15 | 2 bytes | Forward power |
| | DATA16,17 | 2 bytes | Reflect power |
| | DATA18,19 | 2 bytes | VDC voltage |
| | DATA22,23 | 2 bytes | User RF set |
| RF matching unit (RF matcher) | DATA0,1 | 0x06 | Preset1 C1 (load) |
| | DATA2,3 | 0x06 | Preset1 C2 (tune) |
| | DATA0 | 0x0D | RF OFF/ON and alarm |
| | DATA1,2 | 0x0D | C1 (load) position status |
| | DATA3,4 | 0x0D | C2 (tune) position status |
| | DATA5,6 | 0x0D | C3 (aux) position status |
| | DATA7,8 | 0x0D | Output VDC status (H/W opt.) |
| | DATA9,10 | 0x0D | Output voltage (H/W opt.) |
| | DATA11,12 | 0x0D | Output current (H/W opt.) |
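The HSMS messages carrying such device data share a fixed 10-byte header defined by SEMI E37. The sketch below parses that header in Python; the example byte values (session ID, system bytes) are arbitrary assumptions chosen for illustration.

```python
import struct

def parse_hsms_header(header: bytes):
    """Parse the 10-byte HSMS message header (SEMI E37).

    Layout: 2-byte session ID, a stream byte whose most significant bit
    is the W-bit (reply expected), a function byte, PType, SType, and
    4 system bytes, all big-endian.
    """
    session_id, stream_b, function, ptype, stype, system = struct.unpack(
        ">HBBBBI", header
    )
    return {
        "session_id": session_id,
        "w_bit": bool(stream_b & 0x80),   # reply expected?
        "stream": stream_b & 0x7F,
        "function": function,
        "ptype": ptype,                   # 0 = SECS-II encoding
        "stype": stype,                   # 0 = data message
        "system_bytes": system,
    }

# Example: an S1F13 (establish communications request) header with the
# W-bit set; the session ID and system bytes are arbitrary.
hdr = parse_hsms_header(bytes([0x00, 0x01, 0x81, 0x0D, 0x00, 0x00,
                               0x00, 0x00, 0x00, 0x2A]))
print(hdr["stream"], hdr["function"], hdr["w_bit"])  # 1 13 True
```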
Transmitting and receiving the generated data requires an edge computing system that can connect multiple communication models [21]. In this study, a database communication program was designed, as shown in Fig. 1, to collect equipment and sensor data. An edge-computing device provided by Drimsys (Seoul, Korea) was used to build an equipment data collection system.
PECVD is widely used to deposit thin films of various materials on a substrate using plasma-enhanced chemical reactions. The properties of the thin films can be controlled by adjusting process variables, such as the gas flow rate, power, pressure, and temperature, to obtain the desired results. The oxide deposition process in this study used a 300 mm capacitively coupled plasma PECVD system. The PECVD system consists of a vacuum chamber; an MFC for delivering and metering gas; an RF plasma generator/matcher; and a power supply. The plasma generated during the process directly affects the characteristics of the thin film; therefore, it is important to ensure the quality and reliability of the final product. Although existing plasma processes were monitored through changes in equipment parameters, diagnosis and analysis of the internal plasma state were not performed, as they were not considered necessary for the objectives of previous research. However, smaller devices require increasingly sensitive control to generate stable plasma, which, in turn, places more significance on analyzing the plasma data generated during the process. As a result, the use of sensors to diagnose the plasma status has increased, and various studies have been conducted to collect and analyze data using noncontact sensors that do not affect the process [22,23]. Therefore, this study used an OES sensor to confirm and diagnose the plasma state by measuring the light emitted from excited species in the plasma, which provides information on plasma properties such as composition, electron temperature, and density.
Over time, PECVD equipment consisting of various modules and sensors may deteriorate or corrode and cause problems such as incorrect input values and performance degradation. In particular, an MFC is used to control the flow rate of the precursor gas, which is essential for achieving the desired film characteristics. Changes in the relevant parts owing to corrosion and minor failures can directly affect the plasma process and thereby the thin-film characteristics. Therefore, this study focused on investigating the impact of MFC abnormalities on the PECVD process and conducted experiments to collect data from when abnormalities occurred and when the system operated normally. We created a scenario to simulate an aged MFC by depositing an oxide film using the PECVD equipment. SiH4 gas used in the oxide deposition process was selected as the target gas for the MFC fault scenario. The experiment was conducted assuming that the SiH4 MFC had drifted owing to aging; the normal range was set at ±2 sccm around the setpoint, with the abnormal ranges specified in Table II. Only the SiH4 gas flow was changed, while the remaining parameters were maintained at their normal values.
Table II. SiO2 deposition SiH4 gas fault scenario.

| Class | Pressure (mTorr) | SiH4 (sccm) | N2O (sccm) | N2 (sccm) | Power (W) | Temp. (°C) |
|---|---|---|---|---|---|---|
| 0 (Normal) | 2,250 | 200 | 4,000 | 3,000 | 400 | 360 |
| 1 (Abnormal) | 2,250 | 202–215 | 4,000 | 3,000 | 400 | 360 |
| 2 (Abnormal) | 2,250 | 186–198 | 4,000 | 3,000 | 400 | 360 |
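The class labels of Table II can be reproduced with a small labeling function. A minimal sketch, assuming the normal window is the 200 ± 2 sccm setpoint band and that readings above or below it fall into classes 1 and 2, respectively (the exact boundary handling is an illustrative assumption):

```python
def label_sih4_flow(flow_sccm: float) -> int:
    """Label a SiH4 flow reading per the Table II scenario.

    Assumed boundaries: normal is the 200 +/- 2 sccm window; readings
    above it are class 1 (increased flow), below it class 2 (decreased
    flow). Boundary handling is an illustrative assumption.
    """
    if 198.0 <= flow_sccm <= 202.0:
        return 0  # normal
    return 1 if flow_sccm > 202.0 else 2

print([label_sih4_flow(f) for f in (200.0, 210.0, 190.0)])  # [0, 1, 2]
```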
To analyze and prepare the data for modeling, a data selection process was performed to extract relevant features and exclude irrelevant or noisy data. During this process, the equipment condition data and OES wavelength peaks that have a significant impact on the oxide deposition process were identified, and missing or noisy data were excluded. The OES data, comprising a broad spectrum of emissions, were first screened to select the chemical species relevant to the process. The OES peaks were selected by referring to the National Institute of Standards and Technology atomic spectra database and other papers [24–27].
The data selection considered variables that affect the deposition rate and quality of the thin films. Finally, RF power (W), pressure (mTorr), temperature (°C), and gas flow rate (sccm) were selected as process parameters. The details of the selected parameters for the OES sensor are summarized in Table III, and the selected data can be found in the attached table. Data preprocessing, an essential step in the modeling process, organizes raw data to produce data suitable for analysis; this step has a significant impact on the accuracy and efficiency of the modeling results. The first step in data preprocessing involves identifying and correcting or removing errors, noisy data, or missing data from the collected process data. Additionally, a data format conversion was performed to ensure that the data were in a format applicable to modeling. In this study, preprocessing was performed using domain knowledge, including knowledge of the PECVD process, to ensure the selection of appropriate data and the removal of irrelevant or redundant data. We also performed data normalization to ensure that all variables were within the same range to avoid bias.
Table III. Selected OES key wavelengths of SiO2 deposition.

| Species | Selected OES peak wavelength |
|---|---|
| Si | 297.1 nm |
| O2 | 296.8 nm |
| N | 336.7 nm |
| N2 | 315.5, 353.7, 357.7, 371.0, 375.4, 380.4, 391.1, 394.3, 399.5, and 405.9 nm |
| SiH | 414.3 nm |
| Hα | 656.2 nm |
| Hβ | 486.1 nm |
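The screening and normalization steps described above can be sketched as follows. The 0.5 nm matching tolerance and the sample spectra are illustrative assumptions, and only a subset of the Table III wavelengths is listed:

```python
# Keep only spectrometer channels near key wavelengths (a subset of
# Table III), then min-max normalize intensities; tolerance and sample
# data are illustrative assumptions.
KEY_WAVELENGTHS_NM = [297.1, 296.8, 336.7, 414.3, 656.2, 486.1]

def select_channels(wavelengths, tol_nm=0.5):
    """Return indices of channels within tol_nm of any key wavelength."""
    return [i for i, w in enumerate(wavelengths)
            if any(abs(w - k) <= tol_nm for k in KEY_WAVELENGTHS_NM)]

def minmax(values):
    """Scale a list of intensities to the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]

wl = [296.8, 297.1, 300.0, 414.3, 500.0, 656.2]
print(select_channels(wl))          # [0, 1, 3, 5]
print(minmax([10.0, 20.0, 30.0]))   # [0.0, 0.5, 1.0]
```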
Modeling was performed using an ensemble algorithm to perform fault detection and classification (FDC). An ensemble algorithm combines the predictions of multiple models for improved accuracy and robustness. Each decision tree in the ensemble splits the data based on specific criteria, such as information gain or Gini impurity; this process continues until the data are split into sufficiently small subsets, and predictions are made for each subset. The individual trees are constructed using a random subset of the training data and a random subset of the features, which helps reduce overfitting and improve the diversity of the ensemble. This approach reduces the impact of individual model errors and improves the overall accuracy and stability. That is, whereas a decision tree performs a single learning function, an ensemble combines multiple learners to make a decision. Ensemble learning includes many different types of algorithms, but all work on the basic principle of combining predictions from multiple models. The most commonly used ensemble approaches are voting, bagging, boosting, and stacking [28].
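The voting principle common to these ensemble methods can be sketched in a few lines. The per-model predictions below are illustrative, and the tie-breaking rule (smallest class label wins) is an assumption:

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model class predictions by hard voting, the basic
    principle shared by ensemble methods. Ties go to the smallest
    class label (an illustrative choice)."""
    n_samples = len(predictions_per_model[0])
    combined = []
    for i in range(n_samples):
        votes = Counter(p[i] for p in predictions_per_model)
        top = max(votes.values())
        combined.append(min(c for c, v in votes.items() if v == top))
    return combined

# Three models disagree on the second sample; voting recovers the majority.
preds = [[0, 1, 2], [0, 1, 2], [0, 2, 2]]
print(majority_vote(preds))  # [0, 1, 2]
```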
Ensemble models are tree-based algorithms encompassing a diverse range of models, such as Random Forest, AdaBoost, Extra Trees, gradient boosting, and CatBoost. The core principle of ensemble models revolves around decision trees that utilize a set of rules to make predictions about the target variables. In particular, ensemble models are a popular choice for FDC because of their superior performance compared with single machine learning models, and their intuitive nature further enhances their attractiveness for FDC applications. This paper presents a comparative analysis of the training time and accuracy associated with applying various machine learning algorithms for FDC to enhance the analysis and diagnosis of faults in industrial processes. The evaluation compared various machine learning outcomes, encompassing conventional classification algorithms such as decision trees and support vector machines, along with more advanced ensemble algorithms. Notably, the ensemble algorithms exhibited outstanding classification performance. Table IV demonstrates the performance of several algorithms when applied to the acquired process data. Consequently, the Extra Trees and CatBoost models were employed for FDC, as they exhibited the most favorable performance.
Table IV. Comparison of the models.

| No. | Model | Train time (s) | Accuracy (%) |
|---|---|---|---|
| 1 | HistGradientBoostingClassifier | 12.47 | 84.70 |
| 2 | XGBClassifier | 6.34 | 84.14 |
| 3 | CatBoostClassifier | 7.20 | 84.14 |
Extra Trees, a variant of the Random Forest model, introduces increased randomness through the random partitioning of features within the forest of trees. This increased randomness helps capture a wider range of feature interactions and improve the ability of the model to handle complex nonlinear relationships within the data. In addition, the ensemble method employed for Extra Trees, which involves generating multiple models using the Random Forest and tree algorithms, contributes to more accurate predictions. By combining predictions from multiple models, ensemble approaches can help reduce the impact of individual model biases and improve overall prediction performance [29].
CatBoost is a boosting-based model that improves on the slow learning speed and overfitting problems of existing boosting models. It introduces a new method for processing categorical data to mitigate the target leakage problem and modifies the existing gradient boosting algorithm [30,31]. In addition, all trees share the same symmetric structure, a concept adopted from [32]. Therefore, if the dataset contains two or more redundant variables that can clearly distinguish between classes, these variables can be combined into one to reduce the prediction time.
In the modeling step, ensemble algorithms, namely Extra Trees and CatBoost, were implemented to detect and classify failures in the preprocessed data. The objective of the modeling was to classify the equipment and sensor data into three classes, namely, normal (0) and two abnormal causes (1, 2), based on the 53 failure scenario tests performed. The data collected from the equipment and sensors were applied to the model. An ensemble algorithm was used to develop a model that classifies the normal and abnormal states of the equipment, and another model using the same algorithm was implemented to classify the causes of the abnormal states. We used domain knowledge to develop a model that considered various factors, such as data balance, accuracy, and computational efficiency. The model was trained on 80 % of the preprocessed data, and the remaining 20 % were used for testing. The model was further validated by cross-validation to ensure the stability and robustness of the results.
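The 80:20 split used in the modeling step can be sketched as a small stand-in for library split utilities; the fixed seed and toy data are illustrative assumptions.

```python
import random

def train_test_split(samples, labels, test_ratio=0.2, seed=42):
    """Shuffle and split paired data 80:20, as in the modeling step.
    A minimal stand-in for library split utilities."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)  # deterministic shuffle for repeatability
    n_test = int(len(idx) * test_ratio)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([samples[i] for i in train_idx], [labels[i] for i in train_idx],
            [samples[i] for i in test_idx], [labels[i] for i in test_idx])

X = [[float(i)] for i in range(10)]
y = [i % 3 for i in range(10)]
X_tr, y_tr, X_te, y_te = train_test_split(X, y)
print(len(X_tr), len(X_te))  # 8 2
```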
The performance and predictive ability of the model were evaluated using a variety of well-known metrics, including the confusion matrix, accuracy, precision, recall, and F1-score. These evaluation criteria are widely used in the machine learning field to evaluate the effectiveness of classification models and are also utilized in this study to evaluate the ensemble models. The confusion matrix is a fundamental tool for evaluating classification models. It comprises a table that maps the predicted class labels to the actual class labels, providing a detailed record of the model predictions and the types of prediction outcomes. True positive (TP) and true negative (TN) predictions in the confusion matrix represent the correct predictions made by the model, whereas false positive (FP) and false negative (FN) predictions represent deviations from the true values. The confusion matrix serves as the basis for calculating the other evaluation metrics and provides important information regarding the performance of ensemble models, thus enabling the calculation of accuracy, precision, recall, and F1-score.
The accuracy is a measure of how well the predicted data align with the actual data. This metric calculates the ratio of the correct predictions to the total number of predictions made by the model.
Higher accuracy values indicate a higher prediction accuracy and are often used when evaluating models. However, if the data in a model are imbalanced, the accuracy may not be appropriate for evaluation. In this case, it is important to consider precision, recall, and the F1-score, along with accuracy, to comprehensively evaluate the performance of the model.
By considering a balanced approach incorporating precision, recall, and F1-score, we can achieve a comprehensive assessment of the effectiveness of the model beyond a reliance on accuracy. This multicriteria evaluation approach increases the understanding of the strengths and weaknesses of the model and contributes to a more robust evaluation [26]. Therefore, in this study, precision, recall, and F1-score are used in addition to accuracy to evaluate the performance of the model.
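The metrics above can be computed directly from the confusion-matrix counts. A minimal sketch for the binary case, with illustrative labels:

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from raw labels via
    the confusion-matrix counts (TP, TN, FP, FN) described above."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)  # harmonic mean of P and R
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

m = binary_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(m["accuracy"], m["recall"])  # 0.6 0.6666666666666666
```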
In semiconductor manufacturing processes, it is essential to monitor and collect data from various equipment and sensors in real time to ensure stable and accurate production. However, the collection of data from multiple devices using different communication protocols and data formats can be challenging. To address this challenge, edge computing technology has been utilized to collect data from different devices and sensors and integrate them into a single system. To establish the database for this study, we collaborated with manufacturing companies to address security issues pertaining to interlocking equipment and the configuration of data communication and sensors. A comprehensive system configuration was designed, and an integrated database system, shown in Fig. 2, was constructed. The parameters for this database system were selected through a meticulous process analysis, and significant software work was undertaken to configure the form of the database, including the design and optimization of tables, columns, and relationships.
The SECS communication protocol, a standard communication protocol in the semiconductor industry, was used to transmit and receive data for the database. The HSMS communication protocol was used during SECS/GEM communication in the edge computing devices, which allowed faster data transmission and reception compared with conventional TCP/IP-based Ethernet or Modbus communication. To ensure accurate and efficient data analysis, the data collection times of the equipment and sensors were synchronized, and a database system integrating the equipment and sensor data was implemented. Data were collected every second to provide real-time monitoring and accurate data analysis. The use of edge computing technology and synchronized data collection enables efficient and effective data management. The implemented database system allows real-time monitoring of data processing and collection without loss. This approach reduces the time required to collect data for analysis because it eliminates the need to time-synchronize each piece of equipment and each device afterward.
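The timestamp-keyed integration described above can be sketched as a simple join of two 1 Hz streams; the record layout and field names are illustrative assumptions.

```python
# Two 1 Hz streams keyed by the synchronized timestamp (here an integer
# second); the field names and values are illustrative assumptions.
equipment = {0: {"rf_w": 400.0}, 1: {"rf_w": 401.2}}
sensor    = {0: {"si_297": 1510.0}, 1: {"si_297": 1523.4}}

def merge_streams(a, b):
    """Join two {timestamp: fields} streams, keeping only timestamps
    present in both so every integrated row is complete."""
    return {t: {**a[t], **b[t]} for t in sorted(a.keys() & b.keys())}

merged = merge_streams(equipment, sensor)
print(merged[1])  # {'rf_w': 401.2, 'si_297': 1523.4}
```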
The MFC defect scenario experiment for the oxide deposition process was conducted 53 times, and the film thickness and OES intensity were measured according to the changes in the MFC gas flow rate. The average thickness was obtained by measuring 49 points on the wafer using a reflectometer, and both the average over all points and the average excluding the wafer edge were examined. The results show that the thickness non-uniformity was below 5 %, except at the edges. We confirmed that the thickness changed proportionally with the SiH4 gas flow rate. By analyzing the change in OES intensity with the SiH4 flow rate, a clear change in intensity was observed as the gas flow rate increased or decreased. Figure 3 shows that the intensity of the N-related peak increased with the SiH4 gas flow rate. This trend suggests that the intensity of the N-related peak increases as the amount of SiH4 gas increases because SiO2 is generated during the reaction between SiH4 and N2O.
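The across-wafer uniformity check can be sketched as below, using the common (max − min)/(2 × mean) definition of percent non-uniformity; whether the authors used this exact formula is an assumption.

```python
def thickness_uniformity_pct(points):
    """Percent non-uniformity of a thickness map, using the common
    (max - min) / (2 * mean) definition; that the authors used this
    exact formula is an assumption."""
    mean = sum(points) / len(points)
    return (max(points) - min(points)) / (2.0 * mean) * 100.0

# Illustrative 49-point map clustered near 1,000 A: well under 5 %.
pts = [1000.0 + (i % 7) * 5.0 for i in range(49)]
print(round(thickness_uniformity_pct(pts), 2))  # 1.48
```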
The analysis of the oxide deposition process led to the selection of variables related to this process. These variables include sensor data that identified the chemical species related to the process based on the signals received during the SiO2 process. In addition, variables related to the RF power output voltage and current, which are closely related to the characteristics and thickness of the thin film, were selected. Heat-related variables and equipment state parameters related to pressure and temperature were also included, along with variables related to the setpoints and actual flow values of the MFC gas. Furthermore, the parameters related to the remote plasma system were included to monitor the plasma state. The core database parameters selected consisted of 72 equipment data points and 9 generator and matcher parameters, totaling 90 data points related to manufacturing equipment. In the case of sensor data, 81 wavelengths related to the process were selected by the OES and analyzed, which were confirmed to improve the model performance.
Real-time data were collected during the oxide deposition failure scenario to detect and classify abnormal conditions caused by equipment aging. After the preprocessing step, the data were split in an 8:2 training-to-testing ratio to conduct a comprehensive evaluation of the FDC model. Prior to applying the algorithm, the hyperparameters were fine-tuned using a grid search to optimize the performance of the model. The performance of the final learned model was evaluated by identifying various indicators such as accuracy, recall, precision, and F1-score, as summarized in Table V.
Table V. Evaluation of the fault detection model.

| Metric (%) | Extra Trees (train) | Extra Trees (test) | CatBoost (train) | CatBoost (test) | Improved Extra Trees (train) | Improved Extra Trees (test) | Improved CatBoost (train) | Improved CatBoost (test) |
|---|---|---|---|---|---|---|---|---|
| Accuracy | 97.6 | 80.4 | 98.5 | 83.0 | 94.4 | 83.8 | 98.1 | 86.0 |
| Recall | 98.9 | 89.3 | 98.8 | 85.5 | 95.4 | 88.6 | 97.6 | 85.3 |
| Precision | 96.8 | 77.2 | 98.4 | 82.8 | 94.1 | 82.1 | 98.8 | 87.9 |
| F1-score | 97.8 | 82.8 | 98.6 | 84.1 | 94.7 | 85.2 | 98.2 | 86.6 |
The accuracy metric measures the overall accuracy of the model predictions, whereas the recall measures the model's ability to accurately identify anomalous processes. Precision measures the ability of the model to minimize false positives, and the F1-score represents the harmonic mean of precision and recall. After evaluating the final learned models, the Extra Trees and CatBoost models achieved test accuracies of 80.4 and 83.0 %, respectively. Moreover, additional preprocessing was performed on the OES data to increase the accuracy of the anomaly detection. Further modeling was performed by selecting a total of 79 parameters related to the process gases (i.e., SiH4, N2O, and N2), which are important parameters affecting the process, from a dataset consisting of a total of 624 wavelengths. Improved accuracies of 83.8 and 86.0 % were confirmed, as shown in Table V.
Table VI. Evaluation of the fault classification model.

| Metric (%) | Extra Trees (train) | Extra Trees (test) | CatBoost (train) | CatBoost (test) | Improved Extra Trees (train) | Improved Extra Trees (test) | Improved CatBoost (train) | Improved CatBoost (test) |
|---|---|---|---|---|---|---|---|---|
| Accuracy | 98.6 | 79.9 | 96.3 | 83.0 | 95.6 | 85.0 | 96.5 | 87.3 |
| Recall | 98.6 | 79.9 | 96.3 | 83.0 | 95.6 | 85.0 | 96.5 | 87.3 |
| Precision | 98.6 | 82.2 | 96.4 | 83.6 | 95.9 | 86.5 | 96.5 | 87.6 |
| F1-score | 98.6 | 79.7 | 96.4 | 83.0 | 95.6 | 85.0 | 96.5 | 87.3 |
Using these results, we evaluated the performance of the machine learning model in detecting abnormalities within the selected parameters. The evaluation confirmed that the CatBoost model achieved an accuracy of over 85 %, showing excellent performance among the anomaly detection models. In addition, to further verify the accuracy of the CatBoost model, the classification results predicted by the final learned model were dimensionally reduced using the t-SNE algorithm. This allowed us to visualize the results easily and better understand the performance of the CatBoost model. The results are presented in Fig. 4 and confirm the high accuracy of the anomaly detection model. These results demonstrate the effectiveness of the machine learning model in detecting and classifying anomalous processes using the selected parameters. In particular, we show that data from an external OES sensor can detect defects in the MFC parts of the equipment in real time to identify abnormal conditions, and that domain knowledge can be used to select parameters that affect the process and to preprocess the data. The use of preprocessed, process-relevant data was confirmed to help improve model performance.
The OES data, preprocessed to remove noise from the entire spectrum, were likewise applied to the fault classification model. Here, the preprocessed OES data were used as input; class 0 was assigned to the normal scenario, and classes 1 and 2 were assigned to the changed scenarios in which the SiH4 gas flow was increased or decreased, respectively. A classification algorithm was used to classify the abnormal causes; among the algorithms compared, Extra Trees and CatBoost were confirmed to have the highest performance. As with the anomaly-detection model, the training and testing data were split in an 8 : 2 ratio, and the hyperparameters were tuned using the grid search method. The performance of the final model was evaluated using the accuracy, recall, precision, and F1-score metrics, as detailed in Table VI. The Extra Trees model achieved a test-data accuracy of 79.9 %, whereas the CatBoost model achieved 83.0 %.
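The 8 : 2 split and grid search procedure described above can be sketched as below. Synthetic data stands in for the preprocessed OES spectra, and the parameter grid is a minimal illustrative assumption; the paper does not list the hyperparameter ranges it searched.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(1)
# Synthetic stand-in: 79 features per sample; class 0 = normal,
# classes 1/2 = increased/decreased SiH4 flow (labels are illustrative)
X = np.vstack([rng.normal(loc=c, scale=1.0, size=(100, 79)) for c in range(3)])
y = np.repeat([0, 1, 2], 100)

# 8 : 2 train/test split, stratified so class proportions are preserved
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Grid search over a small hypothetical hyperparameter grid
grid = GridSearchCV(
    ExtraTreesClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [None, 10]},
    scoring="accuracy",
    cv=3,
)
grid.fit(X_tr, y_tr)
test_acc = grid.best_estimator_.score(X_te, y_te)
```

The same pattern applies to CatBoost by substituting `catboost.CatBoostClassifier` as the estimator; `GridSearchCV` only requires the scikit-learn fit/predict interface.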
A confusion matrix visualization of these results is presented. To increase the accuracy of this model, the 79 wavelengths that affect the SiO2 process were selected and applied, similar to the preprocessing performed for the anomaly-detection model. The preprocessed data were used with each algorithm, and the hyperparameters were fine-tuned using the grid search method. The performance of the final trained model was evaluated based on accuracy, recall, precision, and F1-score; the results are presented in Table VI.
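Domain-knowledge wavelength selection of the kind described above can be sketched as below. The specific 79 wavelengths used in this study are not listed in this section, so the target emission lines (e.g., Si* 288.2 nm, N2 337.1 nm, SiH* 414.3 nm), the spectrometer axis, and the ±2 nm window here are illustrative assumptions.

```python
import numpy as np

# Hypothetical emission lines (nm) for species relevant to SiH4/N2O/N2 plasmas
TARGET_LINES = [288.2, 337.1, 391.4, 414.3, 486.1, 656.3, 777.4]

def select_wavelengths(wavelengths, targets=TARGET_LINES, tol=2.0):
    """Return indices of spectrometer channels within `tol` nm of a target line."""
    wavelengths = np.asarray(wavelengths)
    mask = np.zeros(wavelengths.shape, dtype=bool)
    for line in targets:
        mask |= np.abs(wavelengths - line) <= tol
    return np.flatnonzero(mask)

# Example: a 624-channel spectrometer assumed to span 200-900 nm
axis = np.linspace(200.0, 900.0, 624)
idx = select_wavelengths(axis)
# Restrict a batch of spectra (rows = shots) to the selected channels
reduced = np.random.rand(10, 624)[:, idx]
```

Restricting the model input to channels near known emission lines of the process species discards noise-dominated channels and shrinks the feature space, which is the mechanism by which the preprocessing shortens training and improves accuracy in the text.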
The CatBoost model achieved the highest test-data accuracy of 87.3 %, proving the most effective at classifying abnormal states. To visualize the results of the final model, t-SNE analysis was performed on the model predictions. As shown in Fig. 5, the t-SNE results clearly distinguish normal data (class 0) from abnormal data (classes 1 and 2), further confirming the high accuracy of the model. In particular, the t-SNE analysis shows a distinct cluster for each data class, further demonstrating the effectiveness of the CatBoost model.
In our previous studies, tree-based ensemble algorithms, including bagging and boosting, demonstrated excellent performance when applied to OES data for semiconductor process analysis [33,34]. In this study, we again observed the strong performance of ensemble algorithms among the various machine learning algorithms applied, including basic classification algorithms, and conducted a final comparison of the two best performers. These two algorithms, Extra Trees and CatBoost, were designed to address the drawbacks of traditional tree-based ensemble models by reducing overfitting and enhancing predictive performance. In particular, CatBoost, which performed best overall, targets the limitations of boosting algorithms, which are commonly associated with high performance but are susceptible to overfitting; it has been shown to perform exceptionally well, improve learning speed, and successfully mitigate overfitting.
In this study, CatBoost demonstrated outstanding performance when applying OES data to FDC in the PECVD process, highlighting its role in addressing the model generalization difficulties caused by overfitting on OES data. Moreover, because OES data consist of numerous variables, including noisy channels, they pose limitations for big data analysis in mass-production environments. This study enhanced the efficiency of OES big data analysis by implementing an integrated database system, and demonstrated that domain knowledge-based preprocessing of the OES data not only reduces the data acquisition time but also shortens the learning time and improves the performance of the machine learning FDC models. Ultimately, the combination of an integrated database system and machine learning is expected to enhance FDC performance for semiconductor plasma equipment using OES data in the big data environment of mass production.
In this research, we demonstrated a practical way to include 300 mm manufacturing equipment status identification data as well as additional sensory data. Note that RF-related data are also included in the sensory DAS for a better understanding of the in situ process monitoring capability, and that this methodology can also be applied to etching systems, although this paper deals only with PECVD equipment. Questions may be raised about practical use given the limited experimental conditions shown here; however, high-volume semiconductor manufacturing environments employing 300 mm equipment likewise run very limited process recipes on a designated tool. In a university research environment, further assessment of real-world applications is impractical because of limited research budgets and resources, but the approach demonstrated in this paper still conceptually holds for expansion to real-world applications. Finally, because the approach relies on 300 mm manufacturing equipment suppliers and chipmakers, it may not be suitable for small and mid-sized enterprises.
The demand for highly integrated semiconductor devices has increased the complexity of the technologies and equipment configurations used in manufacturing processes, particularly as the sizes and structures of semiconductor devices become smaller and more intricate. This ongoing complexity necessitates a variety of measurement and inspection tools to analyze and improve process quality and yield. However, as the number of equipment components and sensors grows alongside equipment complexity, so do errors and false alarms.
To address this issue, this paper proposes an equipment intelligence system to monitor and diagnose abnormalities in semiconductor processing equipment in real time and to return subsequent processes to a normal state of operation. Previous studies have encountered challenges in detecting and classifying anomalies owing to limited data availability and data security concerns. Currently, the communication protocol formats and data collection periods of the various equipment component suppliers in the manufacturing industry are not uniform. To alleviate these concerns, an anomaly detection and classification approach is proposed in which sensors are integrated into an equipment database system. This system uses edge computing to convert data into standardized communication protocols and collects the various process variables on a central server. To demonstrate the practical feasibility of the proposed method, equipment aging problems that can cause subtle control errors, particularly in the MFC, were simulated. Abnormalities during the deposition process were also simulated, and real-time data from the OES sensors attached to the equipment were collected. FDC models were developed using the preprocessed OES data, achieving high accuracies of over 85 %. In conclusion, this paper establishes an integrated database system for semiconductor equipment and sensors to enable efficient analysis, early error detection, and accurate cause identification. This system is expected to significantly improve the process efficiency and yield of semiconductor manufacturing through real-time data analysis, enable a rapid response to abnormalities, and improve overall equipment efficiency.
However, if the analysis is performed using data from multiple sensors, the cause of an abnormality can be diagnosed with greater accuracy than when only a single OES sensor and the equipment data are used. Errors that occur during the process lead to yield problems; therefore, additional systems that can respond immediately to abnormalities are required. In semiconductor processing, continuous processes are performed using multiple pieces of equipment rather than a single tool, so a database must include data from multiple devices. A database system that integrates the various types of equipment and sensors, used in conjunction with a platform that leverages artificial intelligence, should be of great help in solving the yield problem. Therefore, future research should address the implementation of an intelligent system that builds an integrated database covering two or more sensors along with the semiconductor equipment and links that system to an AI platform.
This research was supported by the National Research Council of Science and Technology under the Plasma E.I. (Grant ID: 1711121944, CRC-20-01-NFRI).
The authors declare no conflicts of interest.