An Evolutionary Method of Neural Network in System Identification
Shuming T. Wang1, Chi-Yen Shen1, Yu-Ju Chen2, Chuo-Yean Chang3, Rey-Chue Hwang1, *
1Department of Electrical Engineering, I-Shou University, Kaohsiung City, Taiwan, R.O.C.
2Department of Information Management, Cheng Shiu University, Kaohsiung City, Taiwan, R.O.C.
3Department of Electrical Engineering, Cheng Shiu University, Kaohsiung City, Taiwan, R.O.C.
To cite this article:
Shuming T. Wang, Chi-Yen Shen, Yu-Ju Chen, Chuo-Yean Chang, Rey-Chue Hwang. An Evolutionary Method of Neural Network in System Identification. International Journal of Intelligent Information Systems. Vol. 5, No. 5, 2016, pp. 75-81. doi: 10.11648/j.ijiis.20160505.14
Received: September 19, 2016; Accepted: October 12, 2016; Published: October 20, 2016
Abstract: This paper presents an evolutionary method for calculating the important degree (ID) of each input variable of a well-trained neural network (NN). The importance of each input variable can be ranked according to its ID value. Several linear and nonlinear system identification problems were first studied and simulated. The simulation results show that the proposed method is promising and accurate for estimating a system’s parameters. In other words, the proposed method could be used for data mining in real applications. To verify this inference, the evaporation process of a thin film, a real industrial application, was also studied. Again, the results show that the proposed method has advantages and potential in the area of data mining.
Keywords: Evolutionary, Important Degree, Neural Network, System Identification
1. Introduction
It is well known that system identification is a method for identifying the mathematical model of an unknown system from measurements of the system’s inputs and outputs. It has been applied in many areas, such as industrial processes, control systems, economic forecasting, and social science. Many classes of system identification have been studied, such as Volterra and Wiener series, NARMAX models, and neural networks (NN) [3-7]. Generally, steps such as model hypothesis, parameter estimation, and system verification are indispensable for identifying an unknown system.
Recently, data mining has been widely used in signal processing and system identification. It is the process of analyzing data from different perspectives and extracting hidden information from a large database. It helps the researcher grasp useful information that might otherwise be ignored or missed during data analysis. The tasks of data mining generally fall into four classes: classification, clustering, regression, and association rule learning [8-14]. Among these, regression analysis consists of graphic and analytic methods. It aims to obtain the relationship between the response variable (output) and the predictor variables (inputs) of an unknown system, that is, to express the response variable as a function of the predictor variables. The major uses of regression methodology include model specification and parameter estimation. Model specification is a very important step in system identification. Its objective is to assess the relative value of each predictor variable for predicting the response. Basically, all relevant variables in the database must be analyzed to find which variables correlate most strongly with the response. Thus, parameter estimation of the model is necessary for regression analysis.
Owing to its powerful learning and adaptive capabilities, the NN model has been widely studied in the area of data mining, and many research articles have been published [16-22]. NN can handle many linear and nonlinear modeling problems. By learning from the data provided, NN is able to generate a mapping between input and output pairs, bypassing complicated statistical analysis steps. In general, the applications of NN models in data mining can be classified into four major categories: (1) Feed-forward NN, which is mainly used for prediction and pattern recognition; (2) Feedback NN, which is mainly used for associative memory and optimization calculation; (3) Self-organization NN, which is mainly used for cluster analysis; (4) Random NN, which has advantages in associative memory and image processing.
In this research, NN was studied in comparison with regression analysis. The relative value of each predictor variable (input) for predicting the response (output) was assessed using the NN technique. The supervised-learning feed-forward NN was the main model studied, and the error back-propagation (BP) learning algorithm was used as its training rule. The details of the NN structure and its learning algorithm are described in Section 2. Section 3 presents the relevant experiments and results. Finally, a conclusion is given in Section 4.
2. NN Model and Its Learning Algorithm
The NN structure commonly known as the multi-layered feed-forward net is used in this study. A three-layered feed-forward NN model, as shown in Figure 1, is the selected topology. Each layer is connected to the layer above it in a feed-forward manner, in the sense that there is no feedback from the same layer or from a layer above. Every connection has a multiplying weight associated with it. Training is equivalent to finding the proper weights for all connections such that a desired output is generated for a given input set.
In the NN model, the nonlinear activation function of all nodes is the sigmoid function. Its mathematical form is given by the following equation, and its graph is shown in Figure 2.

$$f(net_j) = \frac{1}{1 + e^{-net_j}}$$

where $net_j = \sum_i w_{ji}\, o_i$, $w_{ji}$ is the strength of the connection between node j and node i in the layer below, and $o_i$ is the value of node i.
Here, we use a simple three-layered NN model to describe the evolutionary method we propose for calculating the important degree (ID) of each input variable of a well-trained NN [24-25]. From Figure 1, the hidden nodes’ outputs ($h_j$) and the network output (Y) can be expressed as the following nonlinear functions of the inputs ($x_i$).

$$h_j = f\Big(\sum_i w_{ji}\, x_i + \theta_j\Big)$$

$$Y = f\Big(\sum_j v_j\, h_j + \theta_Y\Big)$$

$w_{ji}$ is the strength of the connection between hidden node j and input node i, and $v_j$ is the strength of the connection between hidden node j and output node Y. $\theta_j$ and $\theta_Y$ are bias terms.
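The forward computation described above can be illustrated with a minimal NumPy sketch. The array names (`W` for input-to-hidden weights, `b_h` and `b_y` for the bias terms, `v` for hidden-to-output weights) and the random values are assumptions chosen for illustration, not values from this study.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid activation used by every node."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, W, b_h, v, b_y):
    """Forward pass of a three-layered feed-forward NN.

    X: (N, m) inputs, W: (n_hidden, m), b_h: (n_hidden,),
    v: (n_hidden,), b_y: scalar bias of the output node.
    """
    H = sigmoid(X @ W.T + b_h)   # hidden node outputs, shape (N, n_hidden)
    Y = sigmoid(H @ v + b_y)     # network output, shape (N,)
    return H, Y

# Hypothetical 3-4-1 network with random weights (illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(5, 3))
W = rng.normal(size=(4, 3))
b_h = np.zeros(4)
v = rng.normal(size=4)
b_y = 0.0
H, Y = forward(X, W, b_h, v, b_y)
```

Because the output node is also a sigmoid, `Y` always lies strictly between 0 and 1, so targets would need to be scaled into that range before training.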
According to the inferences of Eqs. (5), (6) and (7), the important degree (ID) and the percentage important degree (PID) of input x_i to output Y are defined by
where, NT is the total number of input and m is the number (category) of input variables.
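The sketch below does not implement the ID and PID definitions of this paper. As a loudly labeled stand-in, it computes a common gradient-based sensitivity: the mean absolute partial derivative of Y with respect to each input over the NT data points, followed by a percentage normalization. The function name `sensitivity_id` and all weights are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sensitivity_id(X, W, b_h, v, b_y):
    """Stand-in importance measure (NOT the paper's ID formula).

    Returns the mean |dY/dx_i| over the rows of X and its percentage share.
    """
    H = sigmoid(X @ W.T + b_h)               # (NT, n_hidden)
    Y = sigmoid(H @ v + b_y)                 # (NT,)
    # Chain rule with the sigmoid derivative f' = f (1 - f):
    # dY/dx_i = Y(1-Y) * sum_j v_j * H_j(1-H_j) * W_ji
    dY_dnet = (Y * (1.0 - Y))[:, None]       # (NT, 1)
    dH = H * (1.0 - H) * v                   # (NT, n_hidden)
    grads = dY_dnet * (dH @ W)               # (NT, m)
    ident = np.abs(grads).mean(axis=0)       # one value per input
    pid = 100.0 * ident / ident.sum()        # percentage share
    return ident, pid

# Hypothetical 3-4-1 network in which input 3 is disconnected.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 3))
W = rng.normal(size=(4, 3))
W[:, 2] = 0.0                                # third input has no effect
b_h = np.zeros(4)
v = rng.normal(size=4)
b_y = 0.0
ident, pid = sensitivity_id(X, W, b_h, v, b_y)
```

In this toy case the disconnected input receives a score of exactly zero, which is the qualitative behavior one would expect from any reasonable importance measure.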
3. Experiments and Results
3.1. Linear and Nonlinear Regression Models
The aim of regression is to express the response variable as a function of the predictor variables. In linear regression, the response variable is assumed to have a linear relationship with the predictor variables. Denoting the response variable by Y and the m predictor variables by x1, x2, ..., xm, the linear relationship takes the form

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_m x_m + \varepsilon$$

In this expression, $\beta_0, \beta_1, \ldots, \beta_m$ are unknown parameters to be determined, and $\varepsilon$ is the noise term.
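For comparison with the NN approach, the unknown parameters of a linear model can be estimated by ordinary least squares. The following is a minimal sketch using `numpy.linalg.lstsq`; the true coefficients, noise level and sample size are hypothetical values chosen only to make the example runnable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: intercept 0.5 and three predictors (assumed values).
beta_true = np.array([0.5, 1.0, 0.1, 0.01])
X = rng.uniform(-3, 3, size=(500, 3))
noise = rng.normal(0.0, 0.01, size=500)
y = beta_true[0] + X @ beta_true[1:] + noise

# Design matrix with a leading column of ones for the intercept term.
A = np.column_stack([np.ones(len(X)), X])

# Least-squares estimate of (intercept, coefficient 1, ..., coefficient m).
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
```

With low noise and 500 samples, `beta_hat` should land very close to `beta_true`; for a linear system this gives a direct baseline against which input-importance estimates can be checked.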
Nonlinear regression is a form of regression analysis in which the response variable is modeled by a function which is a nonlinear combination of the predictor variables. For instance, a nonlinear function is given as
In this expression, the relationship between Y and (x1, x2) is nonlinear. In general, it is very difficult to obtain an exact closed-form expression for the unknown model in nonlinear regression analysis. The data are usually fitted by a method of successive approximations. Existing adaptive methods are very successful in linear regression. Applying these methods to nonlinear regression, however, requires several simplifications, and only an approximate model can be obtained.
3.2. NN vs. Linear Regression Model
A linear regression model is given as 
The ratios of the coefficients among the input variables are x1:x2 = 10:1, x1:x3 = 100:1, and x2:x3 = 10:1. This means that the importance of x1 to Y is 10 times that of x2 and 100 times that of x3, provided x1, x2 and x3 have the same statistical distribution.
In our studies, Eq. (12) was considered first, and 500 points were generated from the equation. We then assume this linear model is an unknown system that needs to be identified. In the identification process, five variables, x1, x2, x3, x4 and x5, are collected and assumed to be the possible influencing factors. The five variables are uniformly distributed random numbers with mean zero and variance three, so their values lie in the range [-3, 3]. Table 1 lists the statistics of (x1, x2, x3, x4, x5).
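As a quick consistency check on the stated statistics, recall that a uniform distribution on [a, b] has mean (a+b)/2 and variance (b-a)^2/12. The short derivation below, added for illustration, shows that mean zero and variance three correspond to a symmetric interval with half-width 3.

```latex
\mathrm{E}[x] = \frac{a+b}{2} = 0 \;\Rightarrow\; a = -b,
\qquad
\mathrm{Var}[x] = \frac{(b-a)^2}{12} = \frac{(2b)^2}{12} = \frac{b^2}{3} = 3
\;\Rightarrow\; b = 3 .
```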
An NN of size m-4-1 was used to model this linear equation. The number of input nodes (m) depends on the variables used. In this study, 400 points were used for training and 100 points for testing. Table 2 lists the testing mean absolute errors (MAE) of the NN models with different input variables. The MAE values indicate that the NN learned this unknown system well.
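The training setup can be sketched as follows. This is a minimal full-batch back-propagation loop for an m-4-1 sigmoid network, not the authors' implementation; the coefficients (1, 0.1, 0.01), learning rate, epoch count, weight initialization and min-max target scaling are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bp(X, t, n_hidden=4, lr=0.5, epochs=2000, seed=0):
    """Full-batch BP on a three-layered sigmoid NN; returns weights and MSE history."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.normal(0.0, 0.5, size=(n_hidden, m))   # input-to-hidden weights
    b_h = np.zeros(n_hidden)
    v = rng.normal(0.0, 0.5, size=n_hidden)        # hidden-to-output weights
    b_y = 0.0
    history = []
    for _ in range(epochs):
        H = sigmoid(X @ W.T + b_h)                 # forward pass
        Y = sigmoid(H @ v + b_y)
        err = Y - t
        history.append(float(np.mean(err ** 2)))
        delta_y = err * Y * (1.0 - Y)              # output-node error signal
        delta_h = np.outer(delta_y, v) * H * (1.0 - H)  # back-propagated signal
        v -= lr * (H.T @ delta_y) / n              # gradient-descent updates
        b_y -= lr * float(delta_y.mean())
        W -= lr * (delta_h.T @ X) / n
        b_h -= lr * delta_h.mean(axis=0)
    return (W, b_h, v, b_y), history

# Hypothetical linear system with coefficient ratios 10:1 and 100:1.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(500, 3))
y = X @ np.array([1.0, 0.1, 0.01])
t = (y - y.min()) / (y.max() - y.min())            # scale targets to [0, 1]

params, history = train_bp(X[:400], t[:400])       # 400 training points
W, b_h, v, b_y = params
test_pred = sigmoid(sigmoid(X[400:] @ W.T + b_h) @ v + b_y)
test_mae = float(np.mean(np.abs(test_pred - t[400:])))  # 100 testing points
```

The targets are scaled because a sigmoid output node can only produce values between 0 and 1; the MAE here is therefore in scaled units.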
The IDi value of each input variable in the well-trained NN models is then calculated. Table 3 lists the ID and PID values calculated for the NN models with three, four and five inputs, respectively. The results show a clear pattern. For the size 3-4-1 NN, the PID ratio of (x1, x2, x3) is (1:0.110:0.010). For the size 4-4-1 NN, the PID ratio of (x1, x2, x3, x4) is (1:0.113:0.011:0.0017). For the size 5-4-1 NN, the PID ratio of (x1, x2, x3, x4, x5) is (1:0.111:0.0107:0.00024:0.00135). The PID ratios of (x1, x2, x3) in all three NN models are very close to the ratio (1:0.1:0.01) of the coefficients of (x1, x2, x3) in the original linear equation (12). The ID and PID values of x4 and x5 are much smaller than those of the other variables. This shows that the nonlinear NN model is capable of evaluating the important degree of each input variable to the output, and that x4 and x5 can be treated as useless terms or noise because of their low ID and PID values.
To further verify the validity of the proposed method, 500 points were regenerated. Here x1, x2, x3, x4 and x5 are uniformly distributed random numbers with different ranges. Table 4 lists the statistics of (x1, x2, x3, x4, x5).
From the statistics of Table 4, we conclude that the average absolute value of x2 should be 10 times that of x1, and the average absolute value of x3 should be 5 times that of x1. Multiplying by the coefficient of each variable in Eq. (12), the important degrees of x1, x2 and x3 to Y should be in the ratio 1:1:0.05. Table 5 lists the testing mean absolute errors (MAE) of the NN models with different input variables. The MAE values again indicate that the NN learned the system well.
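The expected ratio quoted above follows from multiplying each coefficient ratio by the corresponding ratio of average absolute input values. A small sketch of that arithmetic, using only the ratios stated in the text (the variable names are illustrative):

```python
import numpy as np

# Coefficient ratio of (x1, x2, x3) in the linear equation, as stated: 1 : 0.1 : 0.01
coef_ratio = np.array([1.0, 0.1, 0.01])

# Ratio of average absolute values of (x1, x2, x3) from Table 4: 1 : 10 : 5
mean_abs_ratio = np.array([1.0, 10.0, 5.0])

# Expected contribution of each input, normalized so that x1 equals 1.
expected = coef_ratio * mean_abs_ratio
expected = expected / expected[0]   # approximately 1 : 1 : 0.05
```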
Again, we calculated the ID and PID values from three well-trained NN models. Table 6 lists the ID and PID values calculated for the NN models with three, four and five inputs, respectively. For the size 3-4-1 NN model, the PID ratio of (x1, x2, x3) is (1:0.982:0.049). For the size 4-4-1 NN model, the PID ratio of (x1, x2, x3, x4) is (1:0.990:0.0498:0.0027). For the size 5-4-1 NN, the PID ratio of (x1, x2, x3, x4, x5) is (1:0.985:0.0475:0.0006:0.00305). The ratios of (x1, x2, x3) in all three NN models are very close to the expected ratio (1:1:0.05) for the original linear equation. The ID and PID values of x4 and x5 are much smaller than those of x1, x2 and x3.
As in the previous study, this shows that the nonlinear NN model is able to evaluate the important degree of each input variable to the output, and x4 and x5 are inferred to be useless terms or noise because of their low ID and PID values.
3.3. NN vs. Nonlinear Regression Model
In this section, nonlinear system identification by the NN model is studied. A nonlinear equation is given by
500 points were generated from the equation. x1, x2, x3, x4 and x5 are assumed to be the possible influencing factors. All five variables are uniformly distributed random numbers with the statistics listed in Table 7.
In our study, 400 points were used for NN training and 100 points for testing. Table 8 presents the MAEs of the NN models with different input variables. It shows that the NN learned the system well.
Table 9 presents the ID and PID values calculated for the NN models with three, four and five inputs, respectively. The ratios of ID and PID of (x1, x2, x3) in the three NN models are almost equal. As in the linear case, the ID and PID values of x4 and x5 are much smaller than those of x1, x2 and x3. In other words, the variables (x1, x2, x3) can be regarded as the truly useful inputs to Y, and (x4, x5) can be inferred to be noise terms.
In fact, unlike linear modeling, nonlinear modeling gives no clear prior information about the important degree of each input variable to the output. However, the PID results in Table 9 show that the important degrees of x1 and x2 are almost the same. This is consistent with Eq. (13) when x1 and x2 have the same statistical distribution.
As in the previous study, a second nonlinear system is identified by the NN model. A nonlinear equation is given by
1000 points were generated from the equation, and the system was then assumed to be unknown. 800 points were used for NN training and 200 points for testing. Four variables, x1, x2, x3 and x4, are assumed to be the possible influencing factors of the system output y. The four input variables are uniformly distributed random numbers with the statistics listed in Table 10.
An NN of size 4-5-1 was used to identify the model. Table 11 presents the simulation results, including the training and test MAEs of the NN and the ID and PID values of (x1, x2, x3, x4) for the four-input NN model.
Table 11 (columns: Training MAE, Test MAE)
In this study, because the ranges of (x1, x2, x3, x4) are all [0,1], the angles of the sine function lie within [0, π/2]. From the nonlinear Eq. (14), it can be observed that the important degrees of (x1, x2, x3) should be in the ratio (1:0.5:0.33333). From the simulation results, the PID ratio of (x1, x2, x3, x4) is (1:0.5320:0.3209:0.0117). The PID ratio of (x1, x2, x3) is very close to the ratio (1:0.5:0.33333) of the coefficients of (x1, x2, x3) in the original nonlinear equation. The ID and PID values of x4 are very small. Thus, x4 can be treated as a useless term or noise.
In the above studies of NN versus nonlinear regression models, the simulation results show that the NN model is also capable of obtaining the important degree of each input variable to the system output. To further support this view, the identification of a real industrial system is studied in the following section.
3.4. NN vs. Industrial System
The touch panel (TP) has become an indispensable part of many electronic appliances such as computers, mobile phones, and ticket vending machines. Usually, a thin plate with a coating film is attached to the TP screen as a decorative and protective film. The coating process is carried out by an evaporator under vacuum. Before the evaporation process begins, the relevant control parameters of the evaporator must be set precisely to ensure that the whole coating process completes successfully. These control parameters can therefore be treated as the influencing factors of the evaporation process. Transmittance is an important optical property of TP film. Its value is highly correlated with the influencing factors of the evaporation process, and the relationship between transmittance and these factors is very complex and nonlinear. The aim of the study in this section is to use NN to find the real influencing factors of the TP film’s transmittance.
In our research, TP film with a two-layer coating is studied. The coating targets are Cr and a second material. The data collected for the experiments include the value of the quartz crystal deposition monitor (x1), the rotation speed of the holder (x2), the substrate position at which the panel is placed (x3), the Cr thickness (x4), the thickness of the other coating material (x5), and the TP transmittance (y). The complex relationship between y and its possible influencing factors (x1, x2, x3, x4, x5) is to be learned by the NN model. To fairly demonstrate the feasibility of the NN model and the proposed computational technique, two data sets, named Set-1 and Set-2, were randomly re-organized from the original database. For each data set, 100 points are used for NN training and 44 points for testing. In addition to MAE, the mean absolute percentage error (MAPE) is used to measure NN performance.
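The two error measures can be written compactly as follows. This is a generic sketch of the standard MAE and MAPE definitions; the function names and sample values are illustrative and are not data from the experiments.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true)))

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be nonzero)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * float(np.mean(np.abs((y_pred - y_true) / y_true)))

# Illustrative values only (not measurements from the evaporation process).
y_true = [100.0, 200.0]
y_pred = [110.0, 190.0]
example_mae = mae(y_true, y_pred)     # (10 + 10) / 2 = 10
example_mape = mape(y_true, y_pred)   # (10% + 5%) / 2 = 7.5
```

MAPE is scale-free, which makes it convenient for comparing performance across data sets, but it is undefined when a true value equals zero.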
Table 12 (inputs: x1, x2, x3, x4, x5)
Table 12 lists the MAEs and MAPEs of the transmittance estimates produced by a five-input NN model of size 5-6-1. The MAPEs indicate that the NN model was trained well. The ID and PID values of all inputs are calculated and listed in Table 13. The PID values show that the Cr thickness (x4) and the thickness of the other coating material (x5) are the two most important influencing factors of TP transmittance. The value of the quartz crystal deposition monitor (x1) and the rotation speed of the holder (x2) also have some impact on transmittance. Among the five input variables, the substrate position (x3) can be ignored because of its small PID value.
To check this conclusion, the transmittance estimation by the NN model was repeated with different input combinations. Table 14 lists the MAEs and MAPEs obtained with different four-input combinations, and Table 15 lists those obtained with different three-input combinations. The results in these tables support the conclusion drawn from the ID and PID values. In other words, the evolutionary NN technique we propose is promising for real applications of system identification.
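The input-combination check described above can be sketched as a loop over input subsets. To keep the example short and deterministic, the sketch below uses an ordinary least-squares fit as a stand-in for the NN model and synthetic data with made-up coefficients; neither is taken from the experiments. Only the 100/44 split mirrors the text.

```python
import itertools
import numpy as np

def holdout_mae(X_train, y_train, X_test, y_test):
    """Fit least squares with intercept (stand-in for the NN); return test MAE."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    pred = np.column_stack([np.ones(len(X_test)), X_test]) @ coef
    return float(np.mean(np.abs(pred - y_test)))

def subset_scores(X_train, y_train, X_test, y_test, size):
    """Test MAE for every combination of `size` input columns."""
    scores = {}
    for cols in itertools.combinations(range(X_train.shape[1]), size):
        idx = list(cols)
        scores[cols] = holdout_mae(X_train[:, idx], y_train,
                                   X_test[:, idx], y_test)
    return scores

# Synthetic data (hypothetical): column index 2 has no influence on y.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(144, 5))
y = (0.2 * X[:, 0] + 0.2 * X[:, 1] + 2.0 * X[:, 3] + 2.0 * X[:, 4]
     + rng.normal(0.0, 0.01, size=144))

# 100 training points and 44 testing points.
scores = subset_scores(X[:100], y[:100], X[100:], y[100:], size=4)
best = min(scores, key=scores.get)   # four-input subset with the lowest test MAE
```

In this toy setting the best four-input subset is the one that omits the irrelevant column, which is the same kind of evidence the tables are used for: removing an unimportant input should cost little accuracy, while removing an important one should cost a lot.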
Table 14 (four-input combinations): inputs x1, x2, x4, x5; inputs x1, x3, x4, x5; inputs x2, x3, x4, x5.
Table 15 (three-input combinations): inputs x1, x4, x5; inputs x2, x4, x5; inputs x3, x4, x5.
4. Conclusion
This research presents an evolutionary NN technique for system identification. Linear systems, nonlinear systems, and a real industrial system were studied and simulated. The results show that the proposed technique is feasible and advantageous for system identification. The ID and PID measures can extract the inputs that are useful and important to the system output. In other words, this study proposes a new system identification technique based on supervised NN, and this technique has potential in the area of data mining with large databases.
In this research, the activation function of all NN nodes is the sigmoid function, and the ID and PID definitions are derived from the characteristics of the sigmoid function. It is therefore expected that an NN model whose node transfer function has the same characteristics as the sigmoid function would have a similar capability in data mining applications. For instance, the hyperbolic tangent function is also a candidate.
It is well known that an NN trained with the BP learning algorithm can become trapped in a local minimum during learning. A poorly trained NN model may then suffer from generalization and accuracy problems, and the ID and PID values calculated from such a model may be incorrect. Thus, we emphasize that the NN model must be trained well to ensure that correct ID and PID information is obtained.
Acknowledgements
This research was supported by the Ministry of Science and Technology, Taiwan, R.O.C., under Contract No. MOST-105-2221-E-214-041.