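Abstract
The back-propagation algorithm is currently the most widely used learning method in neural networks. By choosing the network architecture appropriately, we can solve many problems accurately and reasonably. However, back-propagation also suffers from slow learning in many applications. Many researchers are therefore working from various angles to improve its learning efficiency. In our study, we argue that the error saturation condition produced by the steepest descent rule greatly reduces the learning speed of back-propagation. We therefore analyze why nodes in the output layer and the hidden layers of a neural network fall into the error saturation condition. We first propose an Error Saturation Prevention method (ESP method) for the output layer to keep its nodes from error saturation, and we also apply this method to hidden-layer nodes to adjust their learning amounts. In addition, for the activation function used by each node, we propose a learning method that adjusts the function's temperature parameter to help prevent error saturation. With these methods, we not only improve learning efficiency by preventing error saturation but also preserve the semantic meaning of the energy function. Finally, we propose some heuristics for constructing a more general energy function that avoids error saturation, and we present simulation results to demonstrate the effect of the above methods.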
Improvement of Back Propagation Algorithm by Error Saturation Prevention Method
Abstract
The Back Propagation (BP) algorithm is currently the most widely used learning algorithm for artificial neural networks. With a properly selected feed-forward network architecture, it can approximate solutions to many problems with high accuracy and good generalization ability. However, slow convergence is a serious problem when applying the BP algorithm in many applications. As a result, many researchers have worked to improve its learning efficiency through various enhancements. In our study, we argue that the Error Saturation (ES) condition, which is caused by the use of the gradient descent method, greatly slows down the learning speed of the BP algorithm. In this paper, we therefore analyze the causes of the ES condition in both the output layer and the hidden layers. An Error Saturation Prevention (ESP) function is then proposed to keep the nodes in the output layer out of the ES condition, and we also apply this method to the nodes in the hidden layers to adjust their learning terms. In addition, an adaptive learning method for the temperature variable of the activation function is proposed to help the learning process. With the proposed methods, we not only improve learning efficiency by preventing the ES condition but also preserve the semantic meaning of the energy function. Finally, we propose heuristics for constructing general energy functions that prevent the ES condition during the learning phase. Simulation results are also given to illustrate how the proposed methods work.
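The ES condition described above can be illustrated with a minimal sketch. The abstract does not give formulas, so the code assumes a logistic activation with a temperature parameter and a squared-error energy function; the names `sigmoid` and `output_delta` are hypothetical, and the sketch shows only the saturation effect, not the ESP method itself.

```python
import math

def sigmoid(net, temperature=1.0):
    """Logistic activation with a temperature parameter (assumed form)."""
    return 1.0 / (1.0 + math.exp(-net / temperature))

def output_delta(target, net, temperature=1.0):
    """Standard BP error signal of an output node under squared error (assumed)."""
    out = sigmoid(net, temperature)
    # The derivative of sigmoid(net / T) with respect to net is out * (1 - out) / T.
    return (target - out) * out * (1.0 - out) / temperature

# Node far from its target (target 1, output near 0): the error is large,
# yet the error signal is tiny because out * (1 - out) is close to 0.
saturated = output_delta(target=1.0, net=-10.0)

# Same target, but the output sits in the unsaturated middle of the sigmoid.
unsaturated = output_delta(target=1.0, net=0.0)

# A higher temperature flattens the sigmoid and restores part of the signal.
warmed = output_delta(target=1.0, net=-10.0, temperature=5.0)

print(saturated, unsaturated, warmed)
```

Under these assumptions, the saturated node receives an error signal several orders of magnitude smaller than the unsaturated one even though its output error is larger, which is one way to see why learning can stall.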