Wind speed modeling based on measurement data to predict future wind speed with modified Rayleigh model

Received Apr 1, 2021; Revised Jun 27, 2021; Accepted Jul 12, 2021

Wind speed modeling plays a very important role in obtaining realistic wind speed data for future power plant planning. The wind speed data in this paper were obtained from a PCE-FWS 20 measuring instrument at 30-minute intervals and accumulated into monthly data for one year (2019). Although many wind speed models have already been developed by researchers, the model proposed in this study is derived from a modified Rayleigh distribution. The Rayleigh scale factor (Cr) and the modified Rayleigh scale factor (Cm) were calculated, and the observed wind speed was compared with the predicted wind characteristics. The goodness of fit was tested using the correlation coefficient (R), root mean square error (RMSE), and mean absolute percentage error (MAPE). The proposed modified Rayleigh model gives very good results.

, and monthly variations in wind power [28]. Several approaches have been used to forecast wind power by developing an algorithmic model to anticipate the level of uncertainty and variability of wind generation [29].
Yuri et al. [30] modeled wind speed using the slashed-Rayleigh distribution, defined as the ratio of two independent random variables, with a Rayleigh variable in the numerator and a power of a uniform random variable in the denominator; the slashed-Rayleigh distribution gave a better fit than the slash-Weibull distribution. Rashad et al. [31] modeled wind speed using the Rayleigh unit distribution and estimated its single unknown parameter. Kachnia and Szewczyk [32] applied the Rayleigh distribution to the hysteresis loop of magnetic materials. Yolanda et al. [33] modeled wind speed with the Rayleigh-Lindley distribution, using the EM algorithm as an alternative solution. Gorla et al. [34] used a Rayleigh distribution model for wind farms, in which the expected monthly output takes the seasonal effect of wind speed into account.
In general, previous researchers have modeled wind speed characteristics mathematically, but further models are still needed to extend this body of knowledge. The modified Rayleigh model proposed here aims to minimize the characteristic defects of the earlier model. To obtain a wind speed model that is closer to the actual characteristics, a model suited to a particular area is needed, one that can be used when assessing future wind energy potential. This study therefore develops a wind speed model based on a modified Rayleigh distribution in order to eliminate these characteristic defects. In addition to the characteristics of the Rayleigh distribution function, the agreement between the measured (recorded) data and the modeled data is analyzed. The aim of this study is to obtain a new model from the modified Rayleigh distribution and to analyze how well it fits the wind speed characteristics.

RESEARCH METHOD
The wind speed data used in this study were obtained from the PCE-FWS 20 measuring instrument. The proposed wind speed model is fitted to the measurement data recorded by the device. The model is then used to generate simulated data, which are tested for agreement with the measured data to confirm the suitability of the proposed wind speed model. The suitability test uses the correlation coefficient (R²), root mean square error (RMSE), and mean absolute percentage error (MAPE).

Wind speed data recorder
The PCE-FWS 20 is a versatile wireless weather station that records wind direction, wind speed, temperature, relative humidity, and rainfall with high accuracy. Weather data are sent by radio signal over a distance of up to 100 meters to the main station, which is equipped with the latest weather analysis technology and powered by solar panels and batteries. Using the USB interface and the included USB cable, the weather data can be transferred directly from the wireless weather station to a PC or laptop. All data are stamped with the time and date so that they can still be ordered after a long period, and weather data can be stored indefinitely.
The supplied analysis software makes it possible to observe and compare the weather over longer periods using charts. The PCE-FWS 20 station is shown in Figure 1.

Wind speed data
Wind speed was recorded at 30-minute intervals and processed into monthly wind speed data from January to December 2019. These data serve as the benchmark against which the proposed wind speed model is analyzed and evaluated.
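The aggregation step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `monthly_means` and the synthetic samples are hypothetical, and the station's export format and any data cleaning used in the study are not specified in the text.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def monthly_means(records):
    """Average (timestamp, wind_speed) samples into one value per (year, month)."""
    buckets = defaultdict(list)
    for timestamp, speed in records:
        buckets[(timestamp.year, timestamp.month)].append(speed)
    return {key: mean(values) for key, values in sorted(buckets.items())}


# Hypothetical 30-minute samples (not measured data):
# a constant 4.0 m/s in January 2019 and 6.0 m/s in February 2019.
start = datetime(2019, 1, 1)
samples = []
for step in range(59 * 48):  # 59 days with 48 half-hour readings each
    t = start + timedelta(minutes=30 * step)
    samples.append((t, 4.0 if t.month == 1 else 6.0))

monthly = monthly_means(samples)
```

Grouping by `(year, month)` rather than by month alone keeps the sketch usable when a record spans more than one year.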

Modified Rayleigh distribution
The Rayleigh distribution is often used in physics to model processes such as sound and light radiation, wave height, and wind speed. Alongside the Weibull distribution, the Rayleigh distribution is also considered appropriate for describing the distribution of wind speed. It is used when the Weibull distribution is considered less accurate to apply.
The Weibull probability density function (Pdf) and cumulative distribution function (Cdf) are given by

$$\mathrm{Pdf}(v) = \frac{k}{c}\left(\frac{v}{c}\right)^{k-1}\exp\left[-\left(\frac{v}{c}\right)^{k}\right] \tag{1}$$

$$\mathrm{Cdf}(v) = 1 - \exp\left[-\left(\frac{v}{c}\right)^{k}\right] \tag{2}$$

Setting the shape parameter to k = 2 in the Weibull distribution gives the probability density function (Pdf_r) and cumulative distribution function (Cdf_r) of the Rayleigh distribution:

$$\mathrm{Pdf}_r(v) = \frac{2v}{c^{2}}\exp\left[-\left(\frac{v}{c}\right)^{2}\right] \tag{3}$$

$$\mathrm{Cdf}_r(v) = 1 - \exp\left[-\left(\frac{v}{c}\right)^{2}\right] \tag{4}$$

where v is the wind speed (m/s) and c is the scale parameter. The parameter c is a function of the wind speed at which the curve reaches its peak. Taking the derivative of Pdf_r with respect to v, setting it to zero, and solving gives the peak wind speed v_p:

$$v_p = \frac{c}{\sqrt{2}} \tag{5}$$

so that

$$C_m = \sqrt{2}\,v_p \tag{6}$$

where C_m is the scale parameter of the modified Rayleigh model. Once v_p has been estimated, the shape of the entire curve and the area under it are determined. The formulas above describe the standard distribution, for which the total area under the Pdf curve is 1. In practical applications, (3) and (4) are multiplied by a constant K, where K is the total number of defects or the total cumulative damage rate.
Substituting (6) into (3) and (4), with K and v_p as the parameters to be estimated from a set of data points, gives the Pdf_m and Cdf_m of the proposed modified Rayleigh model:

$$\mathrm{Pdf}_m(v) = K\,\frac{v}{v_p^{2}}\exp\left[-\frac{v^{2}}{2v_p^{2}}\right] \tag{7}$$

$$\mathrm{Cdf}_m(v) = K\left\{1 - \exp\left[-\frac{v^{2}}{2v_p^{2}}\right]\right\} \tag{8}$$

The proposed modified Rayleigh model, intended to eliminate the defects in the estimated wind speed characteristics, is given by (7) and (8); in this study, K is approximately 1.15.
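The functions discussed above can be sketched in a few lines of Python. The sketch assumes the Rayleigh density in its standard Weibull form with k = 2, f(v) = (2v/c²) exp(-(v/c)²), and assumes that the modified forms are this density and its cumulative function multiplied by K with c = C_m. The function names are hypothetical; the values C_r = 5.2492, C_m = 6.2424, and K = 1.15 are those reported in the text.

```python
import math


def rayleigh_pdf(v, c):
    """Rayleigh density: the Weibull density with shape k = 2 and scale c."""
    return (2.0 * v / c**2) * math.exp(-((v / c) ** 2))


def rayleigh_cdf(v, c):
    """Rayleigh cumulative distribution function for scale c."""
    return 1.0 - math.exp(-((v / c) ** 2))


def modified_pdf(v, c_m, k_const):
    """Assumed modified form: the Rayleigh density scaled by the constant K."""
    return k_const * rayleigh_pdf(v, c_m)


def modified_cdf(v, c_m, k_const):
    """Assumed modified form: the Rayleigh Cdf scaled by the constant K."""
    return k_const * rayleigh_cdf(v, c_m)


def peak_speed(c):
    """Wind speed at which the Rayleigh density peaks (derivative set to zero)."""
    return c / math.sqrt(2.0)


C_R, C_M, K = 5.2492, 6.2424, 1.15  # values reported in the text
v_peak = peak_speed(C_R)
pdf_at_5 = modified_pdf(5.0, C_M, K)
cdf_limit = modified_cdf(1.0e3, C_M, K)  # far in the tail the Cdf approaches K
```

Because the scaled Cdf tends to K rather than 1, the modified curve is no longer a normalized probability distribution whenever K differs from 1.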

Wind speed modeling
The scale parameter of the Rayleigh distribution is obtained with the maximum likelihood estimator:

$$C_r = \left(\frac{1}{N}\sum_{i=1}^{N} v_i^{2}\right)^{1/2} \tag{9}$$

where C_r is the Rayleigh scale parameter and v_i is the wind speed at the i-th time. The mean of the Rayleigh distribution function is determined by

$$\bar{v} = C_r\,\Gamma\left(1 + \frac{1}{2}\right) = \frac{\sqrt{\pi}}{2}\,C_r \tag{10}$$

where \bar{v} is the mean of the Rayleigh distribution function. The wind speed model developed in this study is based on the modified Rayleigh distribution and is stated as (11), where N is the number of data points, v_i is the measured (recorded) wind speed, and v_m is the modeled wind speed.
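As an illustration of the estimation step, the sketch below computes the maximum likelihood scale parameter and the distribution mean. It assumes the Rayleigh density is parameterized as f(v) = (2v/c²) exp(-(v/c)²), for which the estimator is the root mean square of the samples. The function names and the sample values are hypothetical, not the recorded data.

```python
import math


def rayleigh_scale_mle(speeds):
    """Maximum likelihood scale c = sqrt(mean(v_i^2)) for f(v) = (2v/c^2) exp(-(v/c)^2)."""
    n = len(speeds)
    if n == 0:
        raise ValueError("at least one wind speed sample is required")
    return math.sqrt(sum(v * v for v in speeds) / n)


def rayleigh_mean(c):
    """Mean of the Rayleigh distribution: c * Gamma(1 + 1/2) = c * sqrt(pi) / 2."""
    return c * math.sqrt(math.pi) / 2.0


# Hypothetical monthly wind speeds in m/s, for illustration only.
speeds = [3.0, 4.0, 5.0, 6.0]
c_hat = rayleigh_scale_mle(speeds)
mean_speed = rayleigh_mean(c_hat)
```

If the density were instead written with the parameter σ, as f(v) = (v/σ²) exp(-v²/(2σ²)), the estimator would carry a factor 1/(2N) under the square root, so the parameterization should be checked before comparing scale values.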

Statistical analysis of distributions
Model selection has become an important focus in recent years in statistical learning, machine learning, and big data analytics [35]-[37]. The model selection literature currently offers several criteria. Many researchers [38], [39] have studied the problem, primarily variable selection in regression, over the past three decades. The statistical significance of a model comparison can be determined from the suitability criteria in the literature [40]. Wind speed data have been modeled with the Rayleigh distribution function [41], and deviations of the wind speed distribution have been assessed using the root mean square error (RMSE) and annual energy production (AEP) [42]. The statistical tests used in this study are shown in Table 1, where y_i is the i-th data value; ŷ_i is the model value for the i-th data point; ȳ is the mean of the data; n is the number of model observations; k is the number of estimated parameters; and A_t are the actual values and F_t the corresponding forecasts or predictions.
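The three fit measures named in this study can be sketched as follows. The formulas are the common textbook definitions and are an assumption here, since the exact expressions of Table 1 are not reproduced in the text; in particular, MAPE is written with absolute values, which makes it nonnegative by construction. The function names and sample values are hypothetical.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rmse(actual, predicted):
    """Root mean square error between actual and predicted values."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical measured and modeled wind speeds in m/s.
measured = [2.0, 4.0, 6.0, 8.0]
modeled = [2.5, 3.5, 6.5, 7.5]

r2_value = r_squared(measured, modeled)
rmse_value = rmse(measured, modeled)
mape_value = mape(measured, modeled)
```

For these sample values the residuals are all 0.5 m/s in magnitude, so the RMSE is 0.5 m/s while the MAPE is dominated by the smallest measured speeds.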

Rayleigh parameters and probability distribution functions
The Rayleigh scale parameter (C_r) of the measured wind speed is calculated from (9), whereas the modified Rayleigh scale parameter (C_m) is based on (6). The Rayleigh probability distribution function (Pdf_r) and the modified Rayleigh distribution function (Pdf_m) are given in (3) and (7), respectively. For the recorded wind speed data, the Rayleigh scale parameter and the modified Rayleigh scale parameter are 5.2492 and 6.2424, respectively. The scale parameters of the Rayleigh and modified Rayleigh models are shown in Table 2. The minimum, maximum, and average differences between the Rayleigh and modified Rayleigh probability functions are -0.0095, 0.0277, and 0.0844, respectively, and the characteristics of the two probability functions are shown in Figure 2.
The mean error between the Rayleigh and modified Rayleigh scale parameters is about -18.94% (below 0.0%), which indicates that the proposed model has a smaller error than the Rayleigh scale factor model. Figure 2 compares the probability functions of the measured data and the prediction: at wind speeds greater than 3 m/s, the modified Rayleigh model gives a better Pdf value than the unmodified Rayleigh model.

Figure 2. The difference between the two models of Rayleigh

Wind speed data recording
The PCE-FWS 20 recordings, taken at 30-minute intervals, were processed into daily and monthly data; the results are shown in Figure 3. Figure 3 shows that the wind speed fluctuates between 2.4 m/s and 7.4 m/s. The minimum, maximum, and average wind speeds are 2.37 m/s, 7.39 m/s, and 5.06 m/s, respectively.

Wind speed data modeling
The wind speed modeling results obtained from (11) are shown in Figure 4. Figure 4 shows that the modeled wind speed fluctuates between 3.6 m/s and 6.4 m/s. The minimum, maximum, and average wind speeds are 3.62 m/s, 6.38 m/s, and 5.25 m/s, respectively.

Comparison of wind speed modeling and measurement
The recorded and modeled wind speeds are compared in Figure 5, where blue denotes the measurement data and green the modeled data. The comparison gives differences, relative to the measured values, of 0.525, 0.136, and 0.037 for the minimum, maximum, and average wind speeds, respectively. The measured wind speed and the modified wind speed model are shown in Figure 6; the two have similar shapes, but the proposed model looks better.

Figure 7 compares the measured wind speed data with the modified Rayleigh model; the minimum, maximum, and mean values of the difference are -1.0059, 1.2454, and 0.0236, respectively. In Figure 7, blue denotes the measured wind speed data, red the predicted wind speed data, and green the difference between the measured and predicted data.

Statistical test results
The results of the suitability test between the measured and modeled wind speed data, using the correlation coefficient (R²), root mean square error (RMSE), and mean absolute percentage error (MAPE), are shown in Table 3. Table 3 shows that the monthly correlation coefficient (R²) lies between 0.9956 and 0.9994, with an average of 0.9145; this is a good result because it is close to 1. The monthly RMSE lies between 0.0793 and 0.1303, with an average of 0.1015; this is a good result because it is close to zero. The monthly MAPE lies between -40.21 and 10.583, with an average of -18.7528; this is regarded as a very good result because it is below 10%.

CONCLUSION
The proposed wind speed model satisfies the statistical test requirements according to the correlation coefficient (R²), RMSE, and MAPE. The monthly and average test results give a correlation coefficient (R²) of 0.9145, an RMSE of 0.1015, and a MAPE of -18.7528. The results of the three tests indicate that the proposed model is acceptable.