
Thursday, 13 March 2014
11:15 - 12:45 Advanced operation & maintenance
Science & Research  

Room: Llevant
Session description

The session covers the entire area of wind farm and wind turbine operation and maintenance, e.g. how to access, repair and organise operation and maintenance logistics onshore and offshore. To keep track of the current health of turbines and farms, failure detection, identification and prognosis methods are also presented. Maintenance operations are addressed from the viewpoints of required activities and efficiency. To cover management aspects, operation and lifetime cost calculation methodologies are also introduced. Experts from various European countries share their results during the session.

Learning objectives

  • Advanced operation and maintenance
  • Fault detection methods
  • Reliability calculation techniques
  • Monitoring on the field
  • Statistical and artificial intelligence-based solutions for diagnostics
  • Data- and model-based solutions for fault detection
Lead Session Chair:
Zsolt Viharos, Hungarian Academy of Sciences, Hungary

Christopher J. Crabtree, University of Durham, United Kingdom
Jamie Godwin, University of Durham, United Kingdom
Jamie Godwin (1), Peter Matthews (1), Bindi Chen (1)
(1) University of Durham, Durham, United Kingdom


Presenter's biography

Biographies are supplied directly by presenters at EWEA 2014 and are published here unedited

Jamie Godwin is a post-graduate researcher at the University of Durham in the United Kingdom.
His research interests include (but are not limited to):

- new paradigms for prognostics
- effective utilisation of suspension histories for prognosis
- data-mining low frequency SCADA data
- robust multivariate statistical techniques for prognosis.

He has authored various conference, journal and transaction papers, along with book chapters in the PHM field.
He has also acted as a reviewer for various conferences and journals, such as RAMS, the IEEE PHM conference, the European IEEE PHM conference, COMADEM, IGI-Global, the journal of the PHM Society and many others.


Prediction of wind turbine gearbox condition based on hybrid prognostic techniques with robust multivariate statistics and artificial neural networks


As maintenance can account for up to 75% of the total lifecycle cost of an asset [1], turning data acquired from condition monitoring systems into actionable intelligence is essential to reduce these costs and mitigate the risks and consequences of failure.

Accurate prediction of the remaining useful life of components enables proactive maintenance strategies to be put in place [2]. This reduces the inventory required for spare parts, increases asset availability (and thus, production) and enables effective planning and scheduling of maintenance actions [3].


For the analysis, two datasets are utilised. Firstly, bearing failures from the NASA bearing dataset [4] with known failures are used to train a data-driven system. To this end, statistical features are taken from the high frequency data to reduce computational complexity whilst ensuring strong condition encapsulation.

In total, 15 features are analysed to determine their strength in encapsulating the bearing condition. Of these, 6 are shown to strongly reflect changes in condition. These are the kurtosis, skewness, RMS (root mean square), crest factor, energy operator RMS and narrowband peak to peak signals. These have been shown by Bechhoefer et al. [5] to be strong signals for determining condition utilising high frequency data.

These 6-tuples are used to determine normal operational behaviour by employing robust multivariate statistical techniques. For this analysis, a robust derivative of the Mahalanobis Distance [6] is employed.

By utilising the minimum covariance determinant to determine the attribute covariance, sensitivity to abnormal behaviour is increased, whilst reducing the noise in the covariance calculation.

Once the parameters for the covariance and central vector have been set, nonlinear prediction is performed by feed-forward neural networks trained with backpropagation. As the failure time is known for the NASA bearing dataset [4], the training process of the network can be supervised, providing a generalizable neural network which can predict bearing remaining useful life based upon the 6-tuple of the sampled data. In order to provide context to the network, regressive (lagged) inputs are utilised. Due to the subjective nature of this labelling process, a comparative evaluation is performed against prediction of the condition index over time.
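The use of regressive (lagged) inputs can be illustrated with a minimal sketch. The function name `make_lagged_inputs` and the oldest-first ordering of each window are assumptions made for illustration; the text does not specify how the lagged samples are arranged.

```python
import numpy as np

def make_lagged_inputs(features, targets, lag):
    """Stack each feature row with its `lag` predecessors.

    features: array of shape (n_samples, n_features), ordered in time.
    targets:  array of shape (n_samples,), aligned with `features`.
    Returns (X, y): row i of X holds samples i .. i+lag flattened
    oldest-first, and y[i] is the target at the newest sample of that row.
    """
    features = np.asarray(features, dtype=float)
    targets = np.asarray(targets, dtype=float)
    if lag < 0 or lag >= len(features):
        raise ValueError("lag must satisfy 0 <= lag < number of samples")
    n_rows = len(features) - lag
    windows = [features[i:i + lag + 1].ravel() for i in range(n_rows)]
    return np.array(windows), targets[lag:]
```

With a lag of 0 the inputs are simply the original 6-tuples; with a lag of k each input row has 6 × (k + 1) values.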

To validate the approach, a full sensitivity analysis of the sampling process and prediction is performed on an independent test set of bearing data from the MFPT [7]. Additionally, topological sensitivity analysis is performed on the neural network to minimize training error. The inputs, lag, hidden layers, and layer sizes are analysed to demonstrate the robustness of the approach.

Main body of abstract

Many techniques for remaining useful life prediction have inherent bias due to the subjective nature of the methodology or assumption of linear degradation of the component. However, degradation is a non-linear process: although training data-driven algorithms utilising a linear “remaining useful life” derived from the time until failure can provide prognostic capability, it inherently does not accurately reflect the underlying condition of the component.

As such, this work presents a novel methodology exploiting robust multivariate statistical techniques which incorporate an objective definition of asset condition to provide a remaining useful life prediction based on sound statistical judgement.

Initially, high frequency data from the NASA prognostic laboratories [4] is employed to determine a condition index which encapsulates an objective truth without bias. To do this, each one-second sample of the 20 kHz data is transformed into a 6-tuple representing the kurtosis, skewness, RMS, crest factor, energy operator RMS and narrowband peak to peak signal. This reduces the computational complexity of the problem, reduces storage costs and allows trends to be seen over a longer time period.
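A minimal sketch of the 6-tuple extraction follows. Several details are assumptions made for illustration and are not given in the text: the kurtosis is the non-excess form (3 for Gaussian data), the energy operator is taken to be the Teager-Kaiser operator x[n]² − x[n−1]·x[n+1], and the narrowband signal is obtained by zeroing FFT bins outside a caller-supplied frequency band.

```python
import numpy as np

def condition_features(x, fs, band):
    """Return (kurtosis, skewness, RMS, crest factor,
    energy operator RMS, narrowband peak to peak) for one window.

    x:    vibration samples of one window (e.g. one second of data).
    fs:   sampling rate in Hz.
    band: (low, high) frequency band in Hz; a hypothetical choice.
    """
    x = np.asarray(x, dtype=float)
    centred = x - x.mean()
    std = centred.std()
    skewness = np.mean(centred ** 3) / std ** 3
    kurtosis = np.mean(centred ** 4) / std ** 4  # non-excess kurtosis
    rms = np.sqrt(np.mean(x ** 2))
    crest = np.max(np.abs(x)) / rms
    # Assumed Teager-Kaiser energy operator on interior samples.
    energy = x[1:-1] ** 2 - x[:-2] * x[2:]
    energy_rms = np.sqrt(np.mean(energy ** 2))
    # Crude band-pass: keep only FFT bins inside the requested band.
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    low, high = band
    spectrum[(freqs < low) | (freqs > high)] = 0.0
    narrow = np.fft.irfft(spectrum, n=len(x))
    peak_to_peak = narrow.max() - narrow.min()
    return kurtosis, skewness, rms, crest, energy_rms, peak_to_peak
```

Each one-second window of 20,000 samples is thereby reduced to six numbers, which is the source of the reduction in storage and computation described above.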

The response from each of the 6 attributes in the tuple increases in proportion to the degradation. In order to employ this as an overall health indicator of the bearing, the attributes must be combined. Linear weighting of these variables would carry redundancy from any covariance between the attributes, so covariance must be taken into consideration. This is done by employing a robust derivative of the Mahalanobis distance (RMD) [6], in which the MCD denotes the robust measure of attribute covariance as determined through the minimum covariance determinant.

The robust measure is required because traditional covariance calculations are sensitive to noise, which is inherently present in real-world data.
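For reference, a robust Mahalanobis distance of the kind described is commonly written as follows. The notation is assumed here rather than taken from the source: x is the 6-tuple of one sample, and the hatted μ and Σ with subscript MCD are the central vector and covariance matrix estimated by the minimum covariance determinant.

```latex
\mathrm{RMD}(\mathbf{x}) =
  \sqrt{
    (\mathbf{x} - \hat{\boldsymbol{\mu}}_{\mathrm{MCD}})^{\top}
    \, \hat{\boldsymbol{\Sigma}}_{\mathrm{MCD}}^{-1} \,
    (\mathbf{x} - \hat{\boldsymbol{\mu}}_{\mathrm{MCD}})
  }
```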

This robust derivative of the Mahalanobis distance conforms to an F-distribution, with two parameters (c, m) which are determined through Monte-Carlo simulation [8].

This allows statistically significant levels of abnormality to be determined, with thresholds set to identify outlying behaviour. As the majority of the time the bearing is operating under normal operating conditions, the central vector and robust covariance matrix will accurately encapsulate these conditions.
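The distance calculation and thresholding can be sketched as below. The functions take the central vector, covariance matrix and threshold as inputs, so an MCD estimate and an F-distribution threshold can be supplied by the caller; how those are obtained is outside this sketch, and the function names are hypothetical.

```python
import numpy as np

def mahalanobis_distances(samples, centre, covariance):
    """Distance of each sample (row) from `centre` under `covariance`."""
    samples = np.atleast_2d(np.asarray(samples, dtype=float))
    diff = samples - np.asarray(centre, dtype=float)
    # Solve the linear system rather than inverting the covariance.
    solved = np.linalg.solve(np.asarray(covariance, dtype=float), diff.T).T
    return np.sqrt(np.einsum("ij,ij->i", diff, solved))

def flag_abnormal(distances, threshold):
    """Boolean mask of samples whose distance exceeds the threshold."""
    return np.asarray(distances) > threshold
```

As a small check, with centre (0, 0) and covariance diag(4, 1), the point (2, 0) lies at distance 1 and the point (0, 3) at distance 3, so a threshold of 2 flags only the second point.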

The RMD value is shown to accurately reflect bearing condition in all of the NASA bearings. That is, the condition index accurately reflects the fault condition on the 4 bearings which suffered degradation, and no significant degradation is shown in the 8 other bearings. As such, the NASA data [4] is then used to train two artificial neural networks (ANN).

The first ANN uses the time until failure as the objective criterion for training the prediction and optimisation. Various input lags (ranging from 0 to 10) are tested along with varying quantities of hidden layers, and the sizes of each of the layers are changed.
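A minimal sketch of a feed-forward network trained by backpropagation is given below. The single hidden layer, tanh activation, mean squared error loss, batch gradient descent and all default hyperparameters are assumptions chosen for brevity; the networks in this work vary the number and size of hidden layers.

```python
import numpy as np

def train_mlp(X, y, hidden=8, epochs=1000, lr=0.02, seed=0):
    """Fit a one-hidden-layer regression network by batch gradient descent.

    Returns the parameters (W1, b1, W2, b2) and the loss per epoch.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        # Forward pass: tanh hidden layer, linear output.
        h = np.tanh(X @ W1 + b1)
        err = (h @ W2 + b2) - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: gradients of the mean squared error.
        g_out = 2.0 * err / len(X)
        g_hidden = (g_out @ W2.T) * (1.0 - h ** 2)
        W2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        W1 -= lr * (X.T @ g_hidden)
        b1 -= lr * g_hidden.sum(axis=0)
    return (W1, b1, W2, b2), losses

def predict(params, X):
    """Evaluate the trained network on new inputs."""
    W1, b1, W2, b2 = params
    hidden = np.tanh(np.asarray(X, dtype=float) @ W1 + b1)
    return (hidden @ W2 + b2).ravel()
```

For the first ANN, `y` would be the time until failure of each sample; the same routine can be trained on a different target without modification.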

The second ANN uses the bearing condition as determined by the RMD as the objective function for prediction and optimisation. As in the first sensitivity analysis, input lags, hidden layers and layer size are varied to minimize training error. A comparative evaluation of these two training processes was then undertaken.
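The sensitivity analysis over lags, hidden layers and layer sizes amounts to a search over a grid of configurations. The sketch below enumerates such a grid; the lag range of 0 to 10 comes from the text, whereas the layer counts, layer sizes and the `training_error` callable are hypothetical placeholders.

```python
from itertools import product

def sensitivity_grid(lags, layer_counts, layer_sizes):
    """List every combination of input lag and hidden-layer layout."""
    return [
        {"lag": lag, "hidden_layers": (size,) * count}
        for lag, count, size in product(lags, layer_counts, layer_sizes)
    ]

def best_configuration(grid, training_error):
    """Return the configuration with the lowest training error.

    training_error: callable taking one configuration dict and
    returning a number (e.g. the error after training a network).
    """
    return min(grid, key=training_error)
```

For example, `sensitivity_grid(range(0, 11), [1, 2], [4, 8, 16])` yields 11 × 2 × 3 = 66 configurations, each of which would be trained once per target definition.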

Although other nonlinear techniques, such as relevance vector machines and particle filters, are well suited to the prediction of nonlinear functions (such as degradation), they are similarly “black box” in nature whilst not being as well understood.

After these neural networks have been trained, they are applied to an independent publicly available test set from MFPT [7]. Training the ANN based upon the objective measure of degradation is found to have superior performance over using traditional linear representations of the remaining useful life, with reductions found in MAE, RMSE, RME and a decrease in training time and absolute minimum error obtained.
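Two of the error measures named above, MAE and RMSE, can be computed as in the following sketch, using their standard definitions. The remaining measure is not defined in the text and is therefore not reproduced here.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average magnitude of the prediction errors."""
    errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)

def root_mean_squared_error(actual, predicted):
    """Square root of the average squared prediction error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

RMSE penalises large individual errors more heavily than MAE, so reporting both gives an indication of whether errors are uniformly small or dominated by a few poor predictions.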


This work presents a novel methodology based upon recently discovered statistical phenomena which have been exploited for determining the remaining useful life of a bearing. Strong predictive capabilities are found, with an accurate prognostic horizon of 12 steps (in this case, one hour) found. This represents prediction of the condition in 150,000 shaft revolutions.
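The figures quoted for the prognostic horizon can be related by simple arithmetic. The step length and the shaft speed below are implied values derived from the numbers above, assuming a constant shaft speed; neither is stated directly in the text.

```latex
\Delta t = \frac{60\ \text{min}}{12\ \text{steps}} = 5\ \text{min per step},
\qquad
\frac{150{,}000\ \text{revolutions}}{60\ \text{min}} = 2{,}500\ \text{revolutions per minute}
```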

Due to the methodology employed, minimal data is required for determining attribute parameters. Computational complexity is reduced as the data is compressed by expression as statistical properties; this also saves on the storage costs associated with condition monitoring strategies.

An objective definition of degradation is provided, without the need for labelling data or additional expenditure for destructive (or non-destructive) testing. This objective definition of degradation is shown to have stronger predictive capabilities (in terms of RMSE, RAE, MAE and training time) than the subjective definition.

The ANNs were both shown to be generalizable to independent datasets. This ensured no overtraining took place and that the approach is generalizable to other bearings in similar conditions. Predictive capability was shown to be accurate 9 steps ahead (in this case, 45 minutes of data) on the independent test set. Whilst this is lower than the training set, it provides enough insight to enable effective maintenance comparison.

Currently, the technique shows strong promise in the prediction of remaining useful life for the bearing. However, the ANN is a “black box” and as such, is difficult to utilise in critical situations due to the lack of accountability and trust which can be placed in it. As such, future work will look at utilising more transparent techniques, such as relevance vector machines.

In addition to this, future work will look at applying the technique to SCADA data, allowing for accurate predictions of wind turbine subsystems a week in advance.

Learning objectives
This work enables others to effectively apply multivariate statistical techniques for the diagnosis and prognosis of bearing faults. As statistical features are extracted, the methodology is applicable to any wind turbine subsystem, and to both SCADA and CMS data signals. Common mistakes are dealt with, enabling sound application of the methodology.

[1] WWEA, “Quarterly bulletin,” World Wind Energy Association Bulletin, vol. 3, pp. 1 – 40, October 2012.

[2] Muller, A., Marquez, A. C., Iung, B., (2008). On the concept of e-maintenance: Review and current research. Reliability Engineering & System Safety 93(8), 1165 - 1187.

[3] D. Djurdjanovic, J. Lee, and J. Ni, “Watchdog Agent - an infotronics-based prognostics approach for product performance degradation assessment and prediction,” Advanced Engineering Informatics, vol. 17, no. 3, pp. 109–125, 2003.

[4] Lee, J., Qiu, H., Yu, G., Lin, J., and Rexnord Technical Services (2007). 'Bearing Data Set', IMS, University of Cincinnati. NASA Ames Prognostics Data Repository, [], NASA Ames, Moffett Field, CA.

[5] Eric Bechhoefer, Yongzhi Qu, Junda Zhu and David He. "Signal Processing Techniques to Improve an Acoustic Emissions Sensor". In Proceedings of the Annual Conference of the PHM Society, October 14-17, 2013, New Orleans, LA, USA.

[6] J. Godwin, P. Matthews, “Rapid Labelling of SCADA Data to Extract Transparent Rules using RIPPER”. Reliability, Availability & Maintainability Symposium. In Press. January 27 – 30. Colorado Springs, CO, USA.

[7] Bechhoefer, E. (2013, Feb 28). Fault Data Sets. MFPT Society. Retrieved June 4th 2013 from

[8] J. Hardin and D. M. Rocke, “The distribution of robust distances,” Journal of Computational and Graphical Statistics, vol. 14, no. 4, 2005.