Data bank: nine numerical methods for determining the parameters of Weibull for wind energy generation tested by five statistical tools

Ahmed Samir Badawi, Siti Hajar Yusoff, Alhareth Mohammed Zyoud, Sheroz Khan, Aisha Hashim, Yılmaz Uyaroğlu, Mahmoud Ismail
Department of Electrical and Computer Engineering, International Islamic University Malaysia, Malaysia
Department of Electrical and Computer Engineering, Birzeit University, Birzeit, Ramallah, Palestine
Department of Electrical and Renewable energy Engineering, Onaizha Colleges of Engineering Al-Qassim, Saudi Arabia
Electrical and Electronics Engineering Department, Sakarya University, Sakarya, Turkey
Department of Electrical Engineering Palestine Technical University, Kadoorie Tulkarm, Palestine


INTRODUCTION
The electric energy crisis has emerged as a significant global problem in the last decade. Therefore, many governments have set the goal of supplying an essential portion of the electrical grid from sustainable energy resources such as wind energy and photovoltaic (PV) solar systems. In the Mediterranean coastal plain of Palestine, the critical situation, the imposed siege and the growing demand of a large population have made alternative sources of energy an urgent concern [1]. This study estimates wind

WEIBULL PARAMETERS CALCULATION
The Weibull probability distribution is used to describe the wind potential of a specific region. Two main parameters control the Weibull curve: the shape factor k and the scale factor c. This distribution is generally applied in statistical analyses [13], [67], [68], and its use requires time-series records of wind speed data. Based on the wind speed data collected, the Weibull probability distribution can be represented as a cumulative distribution function (CDF) or Weibull function, F(υ), and a Weibull PDF, f(υ) [27]. The CDF is obtained by computing the integral of the PDF [69]-[71], which is ultimately determined using (1) [7], [27], [56], [67], [69]-[72].
$F(\upsilon) = 1 - \exp\left[-\left(\frac{\upsilon}{c}\right)^{k}\right]$ (1)
The probability function can be derived as (2).
$f(\upsilon) = \frac{dF(\upsilon)}{d\upsilon} = \frac{k}{c}\left(\frac{\upsilon}{c}\right)^{k-1}\exp\left[-\left(\frac{\upsilon}{c}\right)^{k}\right]$ (2)
where υ, c and k are the wind speed (m/s), the scale factor (m/s) and the shape factor (dimensionless), respectively. Parameter k indicates the width of the wind speed probability distribution and therefore how sharply the distribution peaks for a specific region [71], [73], [74]. Parameter c indicates the abscissa scale of the wind probability distribution, which shows how windy a particular location is [71], [75]. Parameters k and c can be obtained using the MM, STDM, EM, MLM, MMLM, SMMLM, GM, LSM and EPF. These methods are frequently compared in the wind energy literature. However, the results, conclusions and recommendations of previous studies differ greatly because the wind speed data conditions differ. Hence, the appropriateness of a method may change with the sample data distribution, sample data size, goodness-of-fit tests and sample data format [9], [52]. Based on the Weibull PDF, WPD is determined using (3) to simulate the required electric power output for a wind turbine model [71], [76]-[80].
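As an illustration of how the Weibull CDF, PDF and WPD relate to k and c, the following minimal Python sketch evaluates the standard two-parameter Weibull expressions. It assumes the usual closed form WPD = 0.5 ρ c³ Γ(1 + 3/k); the function names and the default air density ρ = 1.225 kg/m³ are assumptions made for this example and are not taken from the paper.

```python
import math


def weibull_pdf(v, k, c):
    """Weibull probability density f(v) for shape k and scale c (m/s)."""
    if v <= 0.0:
        # Density for non-positive speeds is taken as 0 (exact when k > 1).
        return 0.0
    return (k / c) * (v / c) ** (k - 1.0) * math.exp(-((v / c) ** k))


def weibull_cdf(v, k, c):
    """Weibull cumulative distribution F(v) = 1 - exp(-(v/c)^k)."""
    if v <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((v / c) ** k))


def weibull_wpd(k, c, rho=1.225):
    """Mean wind power density in W/m^2: 0.5 * rho * c^3 * Gamma(1 + 3/k).

    rho (air density, kg/m^3) is an assumed default, not a value from the paper.
    """
    return 0.5 * rho * c ** 3 * math.gamma(1.0 + 3.0 / k)


if __name__ == "__main__":
    k, c = 2.1006, 4.3119  # MLM shape and scale reported in the text for 2013
    for v in (2.0, 4.0, 6.0):
        print(v, weibull_pdf(v, k, c), weibull_cdf(v, k, c))
    print("WPD (W/m^2):", weibull_wpd(k, c))
```

Because the WPD grows with c³, small differences in the estimated scale factor translate into noticeably different power estimates, which is why the choice of estimation method matters.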

NUMERICAL METHODS FOR DETERMINING WEIBULL PARAMETERS

4.1. Method of moments (MM)
The MM was recommended by Justus and Mikhail [81], [82]. The mean and standard deviation of the wind speed records are first computed. Using numerical iteration of (4) and (5), which relate the mean (ῡ) and standard deviation σ of wind speeds to k and c, the parameters are calculated [11], [67], [81], [83]-[88]. The MM is an effective approach to deriving Weibull parameters. The first moment, taken about the origin, and the second moment, taken about the mean, give (4) and (5), respectively. The calculation includes the MWS and standard deviation, which are obtained from the wind speed data [88], [89].
$\bar{\upsilon} = c\,\Gamma\left(1 + \frac{1}{k}\right)$ (4)
$\sigma = c\left[\Gamma\left(1 + \frac{2}{k}\right) - \Gamma^{2}\left(1 + \frac{1}{k}\right)\right]^{1/2}$ (5)
where
$\bar{\upsilon} = \frac{1}{n}\sum_{i=1}^{n}\upsilon_i, \qquad \sigma = \left[\frac{1}{n-1}\sum_{i=1}^{n}\left(\upsilon_i - \bar{\upsilon}\right)^{2}\right]^{1/2}$
Γ(x) is the gamma function, expressed as $\Gamma(x) = \int_{0}^{\infty} t^{x-1} e^{-t}\,dt$, υ_i is the wind speed in time step i (m/s) and n is the number of non-zero wind speed data points. The coefficient of variation (COV) is defined as the ratio of the standard deviation (σ) to the average wind speed (ῡ), expressed as a percentage. It represents the variability of wind speed and can be written as [90]
$\mathrm{COV} = \frac{\sigma}{\bar{\upsilon}} \times 100\%$
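The iteration behind the MM can be sketched as follows. This is a minimal example, assuming the usual moment relations ῡ = cΓ(1 + 1/k) and σ = c[Γ(1 + 2/k) − Γ²(1 + 1/k)]^(1/2), so that the ratio σ/ῡ depends on k alone. Bisection is used here as one possible iteration scheme; the function names and the search bracket for k are assumptions of this sketch, not the authors' implementation.

```python
import math
import statistics


def cov_from_k(k):
    """Theoretical sigma/mean of a Weibull distribution with shape k."""
    g1 = math.gamma(1.0 + 1.0 / k)
    g2 = math.gamma(1.0 + 2.0 / k)
    return math.sqrt(g2 / (g1 * g1) - 1.0)


def weibull_mm(speeds, k_lo=0.5, k_hi=20.0, tol=1e-10):
    """Method-of-moments estimate (k, c) from a list of wind speeds."""
    v = [s for s in speeds if s > 0]  # non-zero records only
    mean = statistics.mean(v)
    sigma = statistics.stdev(v)
    target = sigma / mean
    # cov_from_k decreases as k grows, so a bracketed root can be bisected.
    if not (cov_from_k(k_hi) <= target <= cov_from_k(k_lo)):
        raise ValueError("sigma/mean is outside the assumed search bracket")
    lo, hi = k_lo, k_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cov_from_k(mid) > target:
            lo = mid  # spread still too large: k must increase
        else:
            hi = mid
    k = 0.5 * (lo + hi)
    c = mean / math.gamma(1.0 + 1.0 / k)
    return k, c
```

Once k is known, c follows directly from the mean, so only a one-dimensional root search is needed.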

Empirical method (EM)
The EM is also commonly known as the power density method. The EM is simple to implement [88]. It offers a straightforward and practical solution that only requires knowledge of the MWS ῡ and standard deviation σ [81]. The EM uses the ratio of the average of the cube of wind speed to the cube of the MWS, ῡ³, as the energy pattern factor (E_pf). The scale factor is determined from the E_pf. The equations used to determine the scale parameter are identical in the MM and EM [91]. Thus, the EM can be categorised as a special case of the MM [11], [67]. On the basis of the EM introduced by Justus [71], [92], [93], parameters k and c are computed using (10) and (11), respectively [71], [83], [92], [93].
$k = \left(\frac{\sigma}{\bar{\upsilon}}\right)^{-1.086}, \qquad 1 \le k \le 10$ (10)
$c = \frac{\bar{\upsilon}}{\Gamma\left(1 + 1/k\right)}$ (11)
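A minimal Python sketch of the empirical estimate is given below. It assumes the widely used Justus approximation k = (σ/ῡ)^(−1.086) together with c = ῡ/Γ(1 + 1/k); the function name is hypothetical, and the sample standard deviation (n − 1 divisor) is an assumption of this example.

```python
import math
import statistics


def weibull_em(speeds):
    """Empirical (Justus) estimate of the Weibull shape k and scale c."""
    v = [s for s in speeds if s > 0]   # non-zero records only
    mean = statistics.mean(v)
    sigma = statistics.stdev(v)        # sample standard deviation (assumed)
    k = (sigma / mean) ** -1.086       # closed form, no iteration required
    c = mean / math.gamma(1.0 + 1.0 / k)
    return k, c


if __name__ == "__main__":
    print(weibull_em([2.0, 4.0, 6.0]))
```

Unlike the MM, no root search is needed, which is why the method is described as easy to implement.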

Maximum likelihood method (MLM)
The MLM was put forward by Fisher [81], [94] and then introduced by Stevens and Smulders as an approach to wind speed analysis [81], [95]. The MLM relies on numerical iteration to determine parameter k. It is therefore effective, although the procedure is laborious and complicated [81]. The MLM is formulated mathematically through the likelihood function of wind speed data in time-series format [71] and requires extensive numerical iterations [11] to estimate the parameters k and c of the Weibull function. Through the MLM, parameters k and c are calculated using (12) and (13), respectively [71], [96], [97].
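The numerical iteration can be sketched in Python as follows. The sketch assumes the standard maximum-likelihood conditions for the two-parameter Weibull distribution, k = [Σ υ_i^k ln υ_i / Σ υ_i^k − (1/n) Σ ln υ_i]^(−1) and c = [(1/n) Σ υ_i^k]^(1/k), solved by simple fixed-point iteration. The starting value k0 = 2, the tolerance and the function name are assumptions made for illustration.

```python
import math


def weibull_mlm(speeds, k0=2.0, tol=1e-10, max_iter=500):
    """Maximum-likelihood estimate (k, c) by fixed-point iteration on k."""
    v = [s for s in speeds if s > 0]   # ln(v) requires non-zero speeds
    n = len(v)
    logs = [math.log(s) for s in v]
    mean_log = sum(logs) / n
    k = k0
    for _ in range(max_iter):
        powers = [s ** k for s in v]
        weighted_log = sum(p * l for p, l in zip(powers, logs)) / sum(powers)
        k_new = 1.0 / (weighted_log - mean_log)
        converged = abs(k_new - k) < tol
        k = k_new
        if converged:
            break
    else:
        raise RuntimeError("MLM iteration did not converge")
    c = (sum(s ** k for s in v) / n) ** (1.0 / k)
    return k, c
```

Every pass recomputes sums over the full record, which illustrates why the method is more laborious than the closed-form approaches.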

Modified maximum likelihood method (MMLM)
The MMLM is only applicable when the wind speed data are in frequency distribution format. Similar to the MLM, the MMLM entails several iterations when used to determine Weibull parameters. Parameters k and c are obtained using (14) and (15) [31], [71], [98],
where υ_i is the wind speed central to bin i, n is the total number of bins, f(υ_i) is the frequency of wind speed falling within bin i, and f(υ ≥ 0) is the probability that the wind speed equals or exceeds zero.
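For binned data, the same iteration can be written with frequency weights. The sketch below assumes the commonly quoted MMLM form, in which each bin centre is weighted by its frequency and the normalising term f(υ ≥ 0) is taken as the sum of all supplied frequencies; that interpretation, the function name and the starting value are assumptions of this example.

```python
import math


def weibull_mmlm(bin_centres, freqs, k0=2.0, tol=1e-10, max_iter=500):
    """Frequency-weighted maximum-likelihood estimate (k, c) from binned data."""
    total = sum(freqs)  # assumed to play the role of f(v >= 0)
    # Bins at zero speed or with zero frequency cannot enter the log sums.
    pairs = [(v, f) for v, f in zip(bin_centres, freqs) if v > 0 and f > 0]
    mean_log = sum(f * math.log(v) for v, f in pairs) / total
    k = k0
    for _ in range(max_iter):
        num = sum(f * v ** k * math.log(v) for v, f in pairs)
        den = sum(f * v ** k for v, f in pairs)
        k_new = 1.0 / (num / den - mean_log)
        converged = abs(k_new - k) < tol
        k = k_new
        if converged:
            break
    else:
        raise RuntimeError("MMLM iteration did not converge")
    c = (sum(f * v ** k for v, f in pairs) / total) ** (1.0 / k)
    return k, c
```

With one observation per bin and equal frequencies, this reduces to the time-series MLM.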

Second modified maximum likelihood method (SMMLM)
The SMMLM was developed by Christofferson and Gillette (1987) to replace the iterative estimation of the shape parameter [99]. It requires neither iteration nor sorting of the data. For this reason, the SMMLM was selected by Hanitsch and Ahmed Shata in [100].

Graphical method (GM)
The GM, also called the LSM [101], is based on the CDF. In the GM, the wind speed records are first grouped into bins. After taking the logarithm of (17) twice, the GM relation is obtained; that is, the CDF F(υ) is subjected to a double logarithmic transformation [81]:
$\ln\{-\ln[1 - F(\upsilon)]\} = k\ln(\upsilon) - k\ln(c)$
Plotting ln(υ) on the x-axis versus ln{−ln(1 − F(υ))} on the y-axis gives a straight line in which k is the slope and the y-intercept is −k × ln(c) [11], [71], [102].
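The straight-line fit can be sketched in Python as below. The example assumes an ordinary least-squares fit of y = ln{−ln[1 − F(υ)]} against x = ln(υ), with k taken as the slope and c = exp(−b/k), where b is the intercept. The function name and input layout (bin speeds with their cumulative frequencies) are assumptions of this sketch.

```python
import math


def weibull_gm(speeds, cum_freqs):
    """Graphical (least-squares) estimate (k, c) from cumulative frequencies."""
    # Points with F = 0 or F = 1 are skipped because the double log is undefined.
    pts = [(math.log(v), math.log(-math.log(1.0 - F)))
           for v, F in zip(speeds, cum_freqs)
           if v > 0 and 0.0 < F < 1.0]
    n = len(pts)
    if n < 2:
        raise ValueError("at least two usable points are required")
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    k = sxy / sxx                # slope of the fitted line
    b = mean_y - k * mean_x      # y-intercept, equal to -k * ln(c)
    c = math.exp(-b / k)
    return k, c
```

When the cumulative frequencies follow a Weibull CDF exactly, the points fall on a line and the fit returns the generating k and c.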

Energy pattern factor method (EPF)
The EPF is related to the mean records of wind speed; it is described by (18) [9], [67].
$E_{pf} = \frac{\overline{\upsilon^{3}}}{\bar{\upsilon}^{3}}$ (18)
where ῡ is given as in (4), and E_pf is the energy pattern factor, represented by (18).
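A short Python sketch of the EPF estimate follows. It assumes the commonly used approximation k = 1 + 3.69/E_pf² together with c = ῡ/Γ(1 + 1/k); this pair of formulas and the function name are assumptions of the example rather than expressions recovered from the text.

```python
import math


def weibull_epf(speeds):
    """Energy pattern factor estimate; returns (E_pf, k, c)."""
    v = [s for s in speeds if s > 0]      # non-zero records only
    n = len(v)
    mean = sum(v) / n
    mean_cube = sum(s ** 3 for s in v) / n
    epf = mean_cube / mean ** 3           # mean of cubes over cube of mean
    k = 1.0 + 3.69 / epf ** 2             # assumed empirical relation
    c = mean / math.gamma(1.0 + 1.0 / k)
    return epf, k, c


if __name__ == "__main__":
    print(weibull_epf([2.0, 4.0, 6.0]))
```

For the three illustrative speeds 2, 4 and 6 m/s, the mean is 4 m/s and the mean cube is 96, so E_pf = 96/64 = 1.5.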

Standard deviation method (STDM)
In this method, c and k are calculated as in (10) and (11). Several studies have adopted the STDM to calculate Weibull parameters. In [46], this method was utilised to assess wind data recorded in Zarrineh, Iran, in 2012. Reference [67] analysed and compared seven numerical methods to assess their effectiveness in determining the parameters of the Weibull distribution using wind data collected from Camocim and Paracuru in the northeast region of Brazil. In [11], the authors conducted a statistical study to check the efficiency performance of six different numerical methods for determining the Weibull shape and scale factors for wind energy applications.

Least mean square method (LSM)
This method determines the shape factor and scale factor from the line fitted in the GM. This numerical method is therefore considered the same as the GM and depends on (17) [28].

GOODNESS OF FIT (GOF)
The performance of the nine parameter estimation techniques of the Weibull probability distribution for calculating WPD is evaluated using several statistical techniques, including five statistical indicators. To achieve a comparative assessment, we utilise the Root Mean Square Error (RMSE), Chi-square test (X²), Index of Agreement (IA), Mean Absolute Percentage Error (MAPE) and Relative Root Mean Square Error (RRMSE), along with some other statistical tools. In the following subsections, we present a summary of the statistical tools used in this work.

Root mean square error (RMSE)
RMSE shows the accuracy of a model by comparing the deviations between the values predicted by the Weibull function and those obtained from measurement data. The RMSE, which is always non-negative, is calculated as in (20)
$\mathrm{RMSE} = \left[\frac{1}{n}\sum_{i=1}^{n}\left(P_{i,W} - P_{i,M}\right)^{2}\right]^{1/2}$ (20)
where P_{i,W} and P_{i,M} are the i-th WPD calculated via the WDF and the i-th WPD calculated from measured data, respectively, and n is the total number of observations.
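A minimal Python sketch of the RMSE computation is shown below, assuming the usual definition as the square root of the mean squared difference between the Weibull-based and measured WPD values. The function and argument names are illustrative.

```python
import math


def rmse(p_weibull, p_measured):
    """Root mean square error between estimated and measured WPD values."""
    if len(p_weibull) != len(p_measured) or not p_measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(p_measured)
    squared = sum((w - m) ** 2 for w, m in zip(p_weibull, p_measured))
    return math.sqrt(squared / n)
```

For example, the hypothetical series [1, 2, 3] and [1, 2, 5] differ only in the last entry, giving RMSE = sqrt(4/3).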

Chi-square test (X²)
X² is applied to analyse proportions of independent variables, that is, possible inconsistency between the expected and observed frequencies of events. X² is a non-parametric test that is independent of factors such as the population mean and variance. Two series behave comparably if the difference between the frequencies for every category is negligible, that is, close to 0. Souza [103] indicated that for this model, the groups should be independent, the items should be randomly selected from each group, the observations should be counted as frequencies, and every observation should belong to only one group [81]. F(v) is the empirical probability distribution estimated from any wind speed record. Then, parameters k and c are determined such that X² is minimum [104].
$X^{2} = \sum_{i=1}^{n}\frac{\left(y_i - x_i\right)^{2}}{x_i}$
where y is the observed value and x is the expected value.
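The statistic can be sketched in Python as follows, assuming the usual Pearson form in which each squared difference between the observed value y and the expected value x is divided by the expected value. The function name is illustrative.

```python
def chi_square(observed, expected):
    """Pearson chi-square statistic: sum of (y - x)^2 / x over all categories."""
    if len(observed) != len(expected) or not expected:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(x <= 0 for x in expected):
        raise ValueError("expected values must be positive")
    return sum((y - x) ** 2 / x for y, x in zip(observed, expected))
```

With hypothetical observed values [10, 20] and expected values [8, 25], the two terms are 4/8 and 25/25, so the statistic is 1.5.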

Index of agreement (IA)
The IA presents the degree of precision of predicted values relative to observed values. The IA, which ranges from 0 to 1, is computed following [28], [71], [105],
where P_{W,avg} and P_{M,avg} are the averages of the P_{i,W} and P_{i,M} values, and n is the total number of observations.
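Because the exact IA expression is not recoverable from the text, the sketch below uses Willmott's classic index of agreement as a stand-in. In that form both series are centred on the measured average only, and a value of 1 indicates perfect agreement; the paper's own expression, which also involves P_{W,avg}, may differ. The function name is illustrative.

```python
def index_of_agreement(p_weibull, p_measured):
    """Willmott's index of agreement (assumed form, bounded between 0 and 1)."""
    if len(p_weibull) != len(p_measured) or not p_measured:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(p_measured)
    m_avg = sum(p_measured) / n
    num = sum((w - m) ** 2 for w, m in zip(p_weibull, p_measured))
    den = sum((abs(w - m_avg) + abs(m - m_avg)) ** 2
              for w, m in zip(p_weibull, p_measured))
    if den == 0.0:
        raise ValueError("index is undefined when both series are constant")
    return 1.0 - num / den
```

Identical series give an index of 1, whereas a constant prediction equal to the measured mean gives 0.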

Mean absolute percentage error (MAPE)
MAPE presents the average absolute percentage difference between the wind power estimated using the Weibull probability function and that calculated from the observed data (measured wind speed data). MAPE can be calculated following [71].

Relative root mean square error (RRMSE)
RRMSE [106], [107] is considered excellent if it is less than 10%, good if 10% < RRMSE < 20%, average if 20% < RRMSE < 30% and poor if it is more than 30%. RRMSE, MAPE, IA, X² and RMSE values close to zero are considered satisfactory [27].
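The two percentage indicators can be sketched in Python as below. The sketch assumes MAPE as the mean of |P_{i,W} − P_{i,M}| / P_{i,M} in percent, and RRMSE as the RMSE divided by the mean measured value in percent; both definitions, the function names and the handling of values that fall exactly on a band limit are assumptions of this example.

```python
import math


def mape(p_weibull, p_measured):
    """Mean absolute percentage error (percent); measured values must be non-zero."""
    n = len(p_measured)
    return 100.0 * sum(abs((w - m) / m)
                       for w, m in zip(p_weibull, p_measured)) / n


def rrmse(p_weibull, p_measured):
    """RMSE relative to the mean measured value (percent, assumed definition)."""
    n = len(p_measured)
    root = math.sqrt(sum((w - m) ** 2
                         for w, m in zip(p_weibull, p_measured)) / n)
    return 100.0 * root / (sum(p_measured) / n)


def rrmse_band(value):
    """Quality band from the text; exact limits fall into the next band here."""
    if value < 10.0:
        return "excellent"
    if value < 20.0:
        return "good"
    if value < 30.0:
        return "average"
    return "poor"
```

For hypothetical estimates [110, 90] against measurements [100, 100], both indicators are approximately 10%.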
The nine techniques were tested to check the percentage of error based on five statistical tools: RMSE, X², IA, MAPE and RRMSE. The ranking is performed using these statistical tools to ensure an accurate diagnosis.

Figure 1 shows the monthly MWS of Ashqelon in the coastal plain of Palestine between 2012 and 2015. The meteorological data for Ashqelon, which is adjacent to Gaza City, are recorded on a daily basis, and the MWS is usually calculated every month. The graph shows that MWS decreased dramatically from February to April 2012, reaching a low of 3.2m/s. In January, MWS rose to 5m/s or more. MWS then increased steadily and reached approximately 4m/s. In the last three months, the curve declined. In April 2013, MWS increased dramatically, reaching around 4.7m/s. The curve fluctuated during the last eight months of the year. From January to August 2014, MWS increased significantly, reaching 4.8m/s, before finally dropping in the last four months of the year. In January 2015, MWS jumped to 5.1m/s. It then fluctuated significantly and reached a peak in June. However, MWS gradually declined between July and December, reaching the lowest value of the period, 3m/s. Overall, MWS fluctuated between 3 and 5m/s during this period. Wind speed in this area is generally below 15m/s, and strong winds do not exceed 25m/s.

Figure 2 illustrates the frequency distribution of the actual MWS records of Ashqelon between January 2012 and December 2015. The bar graph is extremely close to the Weibull curve (PDF) of the wind speed data. More than 90% of the frequency lies between 1 and 7m/s of wind speed over the four years.

PDF and CDF for Ashqelon city
The measured and estimated cumulative distribution function (CDF) and probability distribution function (PDF) were investigated using nine numerical methods for the Ashqelon site from 2012 to 2015. Figure 3 shows the PDF for Ashqelon city for the four years. The PDF is represented by the Weibull curve obtained with each of the nine numerical methods, based on a shape factor of around 2. The red line represents the observed PDF, the x-axis is the wind speed and the y-axis is the PDF value. The curve peaks at 4m/s. This study conducted a statistical Weibull analysis of the wind speed data, with plots of the PDF and CDF for the entire data set and for seasons. Figure 4 represents the CDF for the four years. The red curve is the observed CDF. This curve is the integral of the Weibull PDF, as in (1).

The scale parameter is between 4.1000m/s and 4.7555m/s. The shape factor controls the PDF curve in three cases: a) for k≤1 the PDF is exponential, b) for 1<k≤2 the PDF is Weibull (the most common curve) and c) for k>2 the PDF is a bell-shaped or Gaussian distribution. Different wind parameters reflect dissimilar wind turbine systems or energy potential. Estimating these parameters accurately for a particular period is necessary for wind energy applications. Table 2 shows that the scale and shape factors (4.3119m/s and 2.1006, respectively) of the MLM are identical to the observed values in 2013. The standard deviation for 2013 ranges from 1.7264m/s to 1.9224m/s, whereas the observed value is 1.9179m/s. Table 3 indicates that k and c are nearly the same for the MM, STDM, EM, MLM and EPF. The results of the MM, STDM, EM, MLM and EPF are close and are better than those of the other methods. Table 4 shows the values of the Weibull parameters for 2015. The standard deviation and variation coefficient were calculated using each numerical technique. The shape factor is around 2 and the scale factor ranges from 4.4 to 5.1m/s.
It can be seen from Table 1 to Table 4 that the EM and STDM give the same result. Moreover, the GM and LSM give the same result, as mentioned in Section 4.9. Table 1 to Table 4 indicate that k and c are nearly the same for the MM, STDM, EM, MLM and EPF. The results of the MM, STDM, EM, MLM and EPF are close and better than those of the other methods.
The Weibull parameters can be calculated using the GM for every year, as shown in Figure 5 to Figure 8. The shape and scale factors are obtained by plotting the natural logarithm of the observed speed, ln(υ), on the x-axis versus ln(−ln(1−F(υ))) on the y-axis. The Weibull parameters are then found by linearly fitting the plotted points: k is the slope of the fitted line, and c is equal to exp(−b/k), where b is the y-intercept of the fitted line.

Table 5 to Table 8 show the percentage of error based on the analysis of the five statistical techniques for the nine numerical methods. Table 5 shows that the GM and LSM yield the greatest efficiency according to the RMSE, X², IA and RRMSE in 2012. They are followed by the EPF, MLM, MM, EM and STDM. The method with the worst efficiency according to the RMSE, X² and RRMSE is the SMMLM, followed by the MMLM. The ranking is based on the percentage of error of each method. Wind speed conditions affect the efficiency of each method; in other words, it is difficult to declare a specific method the best one because wind speed conditions change.

Table 6 shows that the GM and LSM achieve the best efficiency according to the RMSE in 2013. They are followed by the EPF, MM, MLM, STDM and EM. The MM, STDM, EM and MLM showed approximately the same efficiency performance according to the RMSE, X², IA and MAPE in 2013. The SMMLM shows the worst efficiency performance according to the RMSE, X², IA and RRMSE.
In terms of the RMSE in 2014, the STDM and GM exhibit the best efficiency performance, followed by the EPF, MM and MLM, as shown in Table 7. The SMMLM shows the lowest efficiency performance according to the RRMSE, X² and RMSE. Table 8 shows that in terms of the RMSE in 2015, the STDM and GM present the highest efficiency performance, followed by the EPF, MM and MLM. By contrast, the SMMLM shows the lowest efficiency performance.

CONCLUSION
Nine numerical techniques were used to calculate the Weibull parameters for the Ashqelon site. The PDF and CDF were obtained using the nine numerical methods for the Ashqelon site from 2012 to 2015. The percentage of error was calculated using five statistical tools to check the efficiency performance of the numerical techniques. The GM and the EPF show the greatest efficiency, whereas the SMMLM shows the lowest efficiency based on the statistical tools. Between 2014 and 2015, the EM and STDM show the best efficiency performance, followed by the MM and MLM. Based on the numerical analysis, the shape factor k is approximately 2. Therefore, the PDF for Palestine is Weibull, and the scale factor c ranges from 4 to 5m/s. The MWS at Ashqelon was 4.07m/s, 3.82m/s, 4.02m/s and 4.52m/s for the years 2012, 2013, 2014 and 2015, respectively. The EPF applies to the assessment of any wind speed data and shows the greatest accuracy through the years, followed by the MM and MLM. The SMMLM presents the worst prediction performance, followed by the MMLM, according to all statistical techniques. Among the five statistical tools, RMSE is the most accurate indicator, whereas RRMSE is the least accurate. This research determined the wind energy conversion characteristics. Based on the WPD calculations, this study confirms the potential of electrical energy generation in Palestine using small-scale turbines.