1 Background
Renewable energy sources are increasingly important in current and future power supply systems [1, 2]. In particular, photovoltaic (PV) power technology has made tremendous progress in both industry and research. In the past years, the total cumulative solar PV power capacity has reached 178 GW [3, 4]. Moreover, PV power accounted for 8% of gross power consumption in Italy and 7.1% in Germany in 2015 [5, 6]. The large-scale deployment of PV systems brings surging demand for management and scheduling operations on the PV power system, which depend heavily on forecasts of PV system power outputs [7, 8, 9]. Generally, PV power outputs are determined by the solar irradiance in the area of interest, which is random, so the power outputs are variable. Therefore, a number of models and methods have been proposed to approximate PV power outputs under different conditions.
In [10], power regression is modeled by analyzing weather images taken at UC San Diego, which provides a good testbed for solar energy. Similarly, the analysis of cloud or weather images is employed in [11, 12, 13]. However, these methods require expensive equipment to obtain the cloud or weather images, which does not help to lower the price of PV power systems. In [14], PV power output is forecast from the real-time collection of solar irradiance through an irradiance sensor network. In [15], images of cloud motion are obtained from a geostationary satellite to predict medium- and short-term solar radiation. The above approaches can provide good performance in forecasting PV power outputs, but they need additional hardware or complex operations.
Besides approaches that rely on equipment and complex operations, various forecasting algorithms have also been proposed. Common approaches employ machine learning classification and regression methods. In [16], the aerosol index, which has an evident linear correlation with solar radiation attenuation, is used to train an artificial neural network (ANN) and forecast the power outputs for the next 24 hours. Similarly, the ANN method is employed to forecast PV power outputs in [17, 18]. The support vector machine (SVM) is employed in [19, 20, 21] to learn and model the relationship between input data, such as solar radiation, and the PV power output. In [22, 23], multiple linear regression (MLR) models the power outputs of the PV system based on features of solar radiation and weather data. In [24], K-nearest neighbour (KNN) is employed to build the forecast model based on non-common data. In [25], ANN, SVM, KNN and MLR are compared, and the effect of input data selection on the learning algorithms is analyzed.
Another approach uses a probabilistic model to forecast the probability density function of the PV power outputs from the features of the input data [26]. In [27], a versatile probability method based on pair copula construction is used to model the PV power system. Similarly, in [28] a chronological probability model of the PV power output is built on conditional probability and nonparametric kernel density estimation. Moreover, the conditional probability associated with the PV power outputs is also utilized to predict future outputs. In [29], Bayesian sparse learning incorporates the features of the input data to learn the likelihood function of the PV power outputs. In the above probabilistic models, the predicted PV power outputs can be negative, because the models do not enforce the positivity of the outputs. Therefore, this paper proposes a sparse Bayesian learning algorithm that guarantees the positivity of the outputs and approximates the relevance between the input features and the power outputs.
The rest of the paper is organized as follows. In Section 2, the forecasting problem is modeled as a Poisson regression problem, which is solved on the basis of sparse Bayesian learning. The simulation results on the forecasting performance of the proposed algorithm are presented in Section 3. The conclusion is given in Section 4, followed by the acknowledgement.
2 System Model
Generally, the basic principle of photovoltaics is the photovoltaic effect, which transforms solar energy into electrical energy in semiconductors. The output power of the photovoltaics is modeled as follows,
The above model reveals the fact the output power
Based on the previous discussion, the outputs of a PV system are nonnegative and can be regarded as integers (with low resolution in large-scale systems). Such outputs cannot be modeled by a simple support vector machine, Gaussian process or relevance vector machine, which can lead to negative output predictions. To alleviate this problem, we employ a generalized linear model, namely Poisson regression, built on hierarchical Bayesian learning.
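The motivation for a log-link count model can be illustrated with a minimal sketch. It fits the same quadratic features to a hypothetical daily power profile, once by ordinary least squares and once by Poisson regression using iteratively reweighted least squares. The data, the feature choice and all variable names are assumptions made for illustration and are not taken from the measurements used in this paper; no Bayesian prior is included here.

```python
import numpy as np

# Hypothetical hourly PV readings (integer power units) for one day.
hours = np.arange(5.0, 20.0)
power = np.array([0, 0, 1, 3, 7, 12, 16, 18, 17, 13, 8, 4, 1, 0, 0], dtype=float)

# Quadratic features of the scaled hour; the feature choice is illustrative only.
t = (hours - 12.0) / 7.0
X = np.column_stack([np.ones_like(t), t, t ** 2])

# Ordinary least squares: nothing constrains the fitted values to be nonnegative.
beta_ls, *_ = np.linalg.lstsq(X, power, rcond=None)
pred_ls = X @ beta_ls

# Poisson regression with a log link, fitted by iteratively reweighted least squares.
mu = power + 0.5                      # common IRLS starting point, keeps log(mu) finite
for _ in range(25):
    eta = np.log(mu)
    z = eta + (power - mu) / mu       # working response
    XtW = X.T * mu                    # IRLS weights equal the current mean
    beta_pois = np.linalg.solve(XtW @ X, XtW @ z)
    mu = np.exp(np.clip(X @ beta_pois, -30.0, 30.0))
pred_pois = mu

print("min least-squares forecast:", pred_ls.min())
print("min Poisson forecast:", pred_pois.min())
```

Because the Poisson mean is the exponential of a linear predictor, its forecasts stay positive, whereas the least-squares fit dips below zero near sunrise and sunset for this profile.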
2.1 Poisson Regression Model
In the regression of PV power outputs, a training set
The index i represents the daytime step, t denotes the time slot in a day, and P_{it} is the corresponding PV power output. In Poisson regression, the power outputs are assumed to follow a Poisson distribution as
The power outputs are assumed to be linear combinations of the input vector, which is given by
Based on (5), the likelihood function can be formulated as follows,
In Bayesian learning, the weight vector ω is assumed to consist of random variables subject to independent Gaussian prior distributions, which is given by
By combining equations (8) and (9), the posterior of ω^{ν} can be formulated as follows,
By substituting the details, the posterior distribution can be given by
Similarly, the posterior distribution f(λ, δ^{2} | P^{ν}, x^{ν}) can be formulated as
By assuming that λ and δ^{2} follow uniform prior distributions, f(λ, δ^{2} | P^{ν}, x^{ν}) can be reformulated as
Following the results in [33], ω^{ν} and δ^{2} can be updated as follows,
2.2 Poisson Regression of PV System Power based on SBL: Bayesian Estimation
In this subsection, the Sparse Bayesian Learning method is proposed for the Poisson regression of PV system power outputs.
Given equations (4) and (5), we redefine the natural parameter ρ as follows,
So the Poisson distribution can be reformulated as
Taking the logarithm of both sides of (21), the log-posterior of ω can be rewritten as
By simple manipulations, the above complicated posterior formulation can be rewritten as
Hence, the posterior distribution can be formulated as
So the Bayesian estimation of ω^{ν} can be obtained from the mean
2.3 Bayesian Learning: Bayesian Estimation
Based on Bayesian rule, the posterior distribution of λ can be given by
Without any extra information about λ, it is assumed that the prior distribution of λ is uniform. Hence,
Then, the above likelihood can be given by
Using the approximation results in (21) yields
Taking the logarithm of both sides of (30), differentiating with respect to λ_{i} and setting the derivative to zero leads to
2.4 Prediction of New Inputs Based on Poisson Regression Model
Given the estimates of ω and λ, the prediction for a new input x* can be formulated as follows
Since θ* = exp(ϑ*), it is obtained that
Following the results in [34], the likelihood function of prediction can be approximated as
Based on the above derivations, the detailed procedure is summarized in Algorithm 1.
Algorithm 1: Poisson Kernel Regression Based on Sparse Bayesian Learning
1: Input the training set

2: Set the convergence criterion for ω by using the difference between the current estimation and the next estimation; 
3: Set η = 1 and the maximum iteration number to be η_{max} = 50; 
4: Initialize the parameter ω; 
5: Initialize the threshold value ω_{th}; 
6: Initialize the RVs matrix by setting P_{RV} = P; 
7: while the maximum iteration number is not reached and the convergence criterion is not met do
8: Create the kernel matrix according to (6);
9: Calculate the inverse covariance matrix of ω according to (26);
10: Calculate the mean vector according to (25);
11: Update the hyperparameter as

12: Eliminate the ω_{i} and the samples P_{i} with ω_{i} > ω_{th};
13: Update the kernel matrix after eliminating these samples;
14: end while 
15: Output the estimation of ω and λ 
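The loop of Algorithm 1 can be sketched in a few lines, under several assumptions that are not fixed by the text above: a Gaussian RBF kernel stands in for (6), the mean and covariance of ω come from a Laplace approximation found by damped Newton steps, λ_{i} is treated as the prior variance of ω_{i} and updated by the EM-type rule λ_{i} = μ_{i}^{2} + Σ_{ii}, and a weight is pruned when its magnitude falls below the threshold. All function names, parameter values and the toy data are hypothetical, so this is an illustration of the general technique rather than the exact update equations of this paper.

```python
import numpy as np

def rbf_kernel(xa, xb, width=0.15):
    """Gaussian RBF kernel between two 1-D input arrays (assumed kernel choice)."""
    return np.exp(-((xa[:, None] - xb[None, :]) ** 2) / (2.0 * width ** 2))

def log_posterior(w, Phi, y, lam):
    """Poisson log-likelihood with log link plus zero-mean Gaussian log-prior."""
    eta = np.clip(Phi @ w, -50.0, 50.0)
    return y @ eta - np.exp(eta).sum() - 0.5 * np.sum(w ** 2 / lam)

def laplace_step(Phi, y, lam, w, n_newton=50):
    """Posterior mode by damped Newton, then the Laplace covariance at the mode."""
    for _ in range(n_newton):
        mu = np.exp(np.clip(Phi @ w, -50.0, 50.0))
        grad = Phi.T @ (y - mu) - w / lam
        hess = Phi.T @ (mu[:, None] * Phi) + np.diag(1.0 / lam)
        step = np.linalg.solve(hess, grad)
        t, f0 = 1.0, log_posterior(w, Phi, y, lam)
        while log_posterior(w + t * step, Phi, y, lam) < f0 and t > 1e-8:
            t *= 0.5                                  # backtrack until no decrease
        w = w + t * step
        if np.max(np.abs(t * step)) < 1e-8:
            break
    mu = np.exp(np.clip(Phi @ w, -50.0, 50.0))
    hess = Phi.T @ (mu[:, None] * Phi) + np.diag(1.0 / lam)
    return w, np.linalg.inv(hess)

def fit_prsbl(x, y, w_th=1e-3, max_iter=50, tol=1e-4):
    x_rv = x.copy()                                   # step 6: all samples start as RVs
    w = np.zeros(len(x_rv))                           # step 4
    lam = np.ones(len(x_rv))                          # assumed initial prior variances
    for _ in range(max_iter):                         # steps 3 and 7
        Phi = rbf_kernel(x, x_rv)                     # step 8
        w_new, Sigma = laplace_step(Phi, y, lam, w)   # steps 9 and 10
        lam = w_new ** 2 + np.diag(Sigma)             # step 11 (assumed EM-type rule)
        converged = np.max(np.abs(w_new - w)) < tol   # step 2
        keep = np.abs(w_new) >= w_th                  # step 12 (assumed pruning rule)
        if keep.any():
            x_rv, w, lam = x_rv[keep], w_new[keep], lam[keep]   # step 13
        else:
            w = w_new
        if converged:
            break
    return x_rv, w, lam

def predict(x_new, x_rv, w):
    """Forecast mean exp(k(x*, RVs) . w), which is positive by construction."""
    return np.exp(rbf_kernel(x_new, x_rv) @ w)

# Hypothetical toy data: a bell-shaped daily profile with Poisson noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 40)
y = rng.poisson(15.0 * np.exp(-((x - 0.5) ** 2) / 0.05) + 0.5).astype(float)
x_rv, w, lam = fit_prsbl(x, y)
pred = predict(x, x_rv, w)
print("retained RVs:", len(x_rv), "of", len(x))
```

The backtracking inside the Newton step is a design choice of this sketch: it keeps the log-posterior from decreasing, which avoids the overshoot that an undamped Newton update can show for the exponential link.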
2.5 Summary and Analysis
By combining Poisson regression and SBL, the power output of a PV system can be formulated as a regression problem. Based on the strength of solar radiation under different weather types, the regression problem can be classified into three subproblems.
In each regression problem, the weights of the input vectors are governed by independent zero-mean Gaussian distributions, which differs from a Bayesian prior with identical Gaussian distributions. Meanwhile, the sparsity of the weights is guaranteed by the zero mean and the variance parameter λ, which reduces the training complexity and time.
On one hand, the complexity of the proposed algorithm is dominated by step 9 in Algorithm 1, which requires a matrix inversion. In step 9, the complexity scales with the order of N(P_{RV}). Furthermore, according to Algorithm 1, the number of RVs in P_{RV} decreases over the iterations, which means that the complexity decreases in each iteration.
On the other hand, the complexity of prediction is proportional to the number of RV samples. Comparing this complexity analysis with that of other algorithms, it is concluded that the complexity of the proposed algorithm is lower than that of related kernel count-data regression models, including Kernel Probabilistic Regression and Probabilistic Regression [35].
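The prediction step for a new input can be sketched as follows, assuming that the posterior of ω over the M retained RVs is approximated by a Gaussian with a mean vector and a covariance matrix. Under that assumption the log-rate ϑ* is Gaussian and the rate θ* = exp(ϑ*) is log-normal, so its median is exp(m) and its mean is exp(m + s^2/2). The function name and the numerical values below are hypothetical.

```python
import numpy as np

def predictive_moments(k_star, w_mean, w_cov):
    """Moments of the log-rate, then the median and mean of the positive rate."""
    m = k_star @ w_mean              # mean of the log-rate, O(M) operations
    s2 = k_star @ w_cov @ k_star     # variance of the log-rate, O(M^2) operations
    return m, s2, np.exp(m), np.exp(m + 0.5 * s2)

# Hypothetical kernel values and posterior moments for M = 3 retained RVs.
k_star = np.array([0.9, 0.4, 0.1])
w_mean = np.array([1.5, 0.8, -0.2])
w_cov = np.diag([0.04, 0.09, 0.01])

m, s2, rate_median, rate_mean = predictive_moments(k_star, w_mean, w_cov)
print(m, s2, rate_median, rate_mean)
```

The point forecast needs only one inner product with the M retained weights, which is why the cost of prediction shrinks as RVs are pruned; the variance term is optional and costs more.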
3 Numerical Results
In this section, data collected from the real PV power platform at Anhui Polytechnic University are used. The installed capacity of the platform reaches 100 kWh, and the platform is deployed on the roof of the main administration building on campus. PV power data and the corresponding weather data were collected over one season. Fig. 1 shows the collected PV power data on seven different days.
To make the weather data clear, the data are shifted by one unit in the vertical direction, as shown in Fig. 2. The RMSE results are obtained through 1000 independent Monte Carlo experiments, and the RMSE is defined as
In Fig. 3, the data were collected on a sunny day. The forecast values based on Poisson regression are close to the true data and contain no negative outputs, while SVM regression produces negative outputs and has a larger error, as listed in Table I.
Table I. RMSE of the two regression methods in three weather situations
Situation       Sunny    Sunny/Cloudy   Rainy/Cloudy
RMSE of PRSBL   1.145    11.861         8.343
RMSE of SVM     22.290   22.281         18.715
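RMSE figures of this kind can be computed as in the following minimal sketch. It assumes the standard definition, the square root of the mean squared difference between forecast and measured power, and evaluates a single run rather than averaging over the 1000 Monte Carlo experiments. The function name and the sample values are hypothetical.

```python
import numpy as np

def rmse(forecast, measured):
    """Root mean square error between forecast and measured power."""
    forecast = np.asarray(forecast, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.sqrt(np.mean((forecast - measured) ** 2)))

# Hypothetical measured and forecast values, for illustration only.
measured = [0.0, 3.0, 9.0, 12.0, 7.0, 1.0]
forecast = [0.5, 2.0, 10.0, 11.0, 8.0, 1.5]
print(rmse(forecast, measured))
```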
Fig. 4 and Fig. 5 show the simulation results in hybrid weather. In both situations, Poisson regression based on the SBL algorithm achieves better forecasting performance and preserves nonnegativity.
In the simulation results, PV power regression is more complicated under hybrid weather conditions. In super-short-term regression, other factors, such as environmental temperature and wind speed, can be regarded as stable and unchanged within a single weather type. Thus only the time sequence correlation is considered. The proposed Poisson regression based on SBL can also incorporate the environmental temperature and wind speed into the input data; the input data then form a vector, and the SBL algorithm can still provide good performance, as shown in [35].
Taken together, the simulation results show that the proposed PRSBL algorithm provides accurate and nonnegative forecasts of PV power and outperforms the SVM algorithm in both aspects. This advantage results from the Poisson distribution assumption and the statistical learning mechanism. Specifically, SBL is a data-driven iterative algorithm that updates the hyperparameters in a hierarchical way, which gives it an edge over the SVM. Moreover, the assumption that the PV power outputs follow a Poisson distribution guarantees the nonnegativity of the predicted data. Furthermore, the assumption can be adopted on the basis of the maximum entropy principle, according to the physical situation.
4 Conclusion
Forecasting is of vital significance for the management and scheduling of renewable energy sources such as PV power systems. Traditional nonparametric regression methods cannot guarantee the nonnegativity of the output. In this paper, a regression model based on the Poisson distribution and a sparse Bayesian learning algorithm is proposed to solve the nonnegative PV power forecasting problem. The detailed principles of the PRSBL algorithm and the simulation results are presented. The simulation results demonstrate the advantages and accuracy of the proposed algorithm. Moreover, the proposed algorithm is applicable to exponential family distributions other than the Poisson distribution, which deserves further investigation in the future.
The work is supported by the National Natural Science Foundation of China 61572032, Natural Science Foundation of Anhui Province 1908085MF215 and Key Natural Science Research Project of Anhui Province KJ2017A107.
References
[1] A. Tascikaraoglu, B. M. Sanandaji, G. Chicco, V. Cocina, F. Spertino, O. Erdinc, N. G. Paterakis, and J. P. S. Catalão (2016), Compressive spatiotemporal forecasting of meteorological quantities and photovoltaic power, IEEE Transactions on Sustainable Energy, vol. 7, no. 3, pp. 1295–1305.
[2] H. Sheng, J. Xiao, Y. Cheng, Q. Ni, and S. Wang (2018), Short-term solar power forecasting based on weighted Gaussian process regression, IEEE Transactions on Industrial Electronics, vol. 65, no. 1, pp. 300–308.
[3] S. Chai, M. Niu, Z. Xu, L. L. Lai, and K. P. Wong (2016), Nonparametric conditional interval forecasts for PV power generation considering the temporal dependence, in 2016 IEEE Power and Energy Society General Meeting (PESGM), pp. 1–5.
[4] D. Pepe, G. Bianchini, and A. Vicino (2016), Model estimation for PV generation forecasting using cloud cover information, 2016 IEEE International Energy Conference (ENERGYCON), pp. 1–6.
[5] E. Nuño, M. Koivisto, N. Cutululis, and P. Sørensen (2017), Simulation of regional day-ahead PV power forecast scenarios, 2017 IEEE Manchester PowerTech, pp. 1–6.
[6] Y. Zhang, M. Beaudin, R. Taheri, H. Zareipour, and D. Wood (2015), Day-ahead power output forecasting for small-scale solar photovoltaic electricity generators, IEEE Transactions on Smart Grid, vol. 6, no. 5, pp. 2253–2262.
[7] J. Vasilj, P. Sarajcev, and D. Jakus (2015), PV power forecast error simulation model, in 2015 12th International Conference on the European Energy Market (EEM), pp. 1–5.
[8] L. Oneto, F. Laureri, M. Robba, F. Delfino, and D. Anguita (2018), Data-driven photovoltaic power production nowcasting and forecasting for polygeneration microgrids, IEEE Systems Journal, vol. 12, no. 3, pp. 2842–2853.
[9] X. G. Agoua, R. Girard, and G. Kariniotakis (2018), Short-term spatiotemporal forecasting of photovoltaic power production, IEEE Transactions on Sustainable Energy, vol. 9, no. 2, pp. 538–546.
[10] B. Urquhart (2011), Intra-hour forecasting with a total sky imager at the UC San Diego solar energy testbed, ASES.
[11] R. Marquez and C. F. Coimbra (2013), Intra-hour DNI forecasting based on cloud tracking image analysis, Solar Energy, vol. 91, pp. 327–336.
[12] H. Yang, B. Kurtz, D. Nguyen, B. Urquhart, C. W. Chow, M. Ghonima, and J. Kleissl (2014), Solar irradiance forecasting using a ground-based sky imager developed at UC San Diego, Solar Energy, vol. 103, pp. 502–524.
[13] S. Quesada-Ruiz, Y. Chu, J. Tovar-Pescador, H. Pedro, and C. Coimbra (2014), Cloud-tracking methodology for intra-hour DNI forecasting, Solar Energy, vol. 102, pp. 267–275.
[14] A. T. Lorenzo, W. F. Holmgren, M. Leuthold, C. K. Kim, A. D. Cronin, and E. A. Betterton (2014), Short-term PV power forecasts based on a real-time irradiance monitoring network, 2014 IEEE 40th Photovoltaic Specialist Conference (PVSC), pp. 0075–0079.
[15] R. Perez, S. Kivalov, J. Schlemmer, K. Hemker Jr, D. Renné, and T. E. Hoff (2010), Validation of short and medium term operational solar radiation forecasts in the US, Solar Energy, vol. 84, no. 12, pp. 2161–2172.
[16] J. Liu, W. Fang, X. Zhang, and C. Yang (2015), An improved photovoltaic power forecasting model with the assistance of aerosol index data, IEEE Transactions on Sustainable Energy, vol. 6, no. 2, pp. 434–442.
[17] W. Fei, M. Zengqiang, S. Shi, and Z. Chengcheng (2011), A practical model for single-step power prediction of grid-connected PV plant using artificial neural network, 2011 IEEE PES Innovative Smart Grid Technologies, pp. 1–4.
[18] T.-C. Yu and H.-T. Chang (2012), The forecast of the electrical energy generated by photovoltaic systems using neural network method, Electric Information and Control Engineering (ICEICE), 2011 International Conference on. IEEE, pp. 2758–2761.
[19] J. Shi, W. J. Lee, Y. Liu, Y. Yang, and P. Wang (2012), Forecasting power output of photovoltaic systems based on weather classification and support vector machines, IEEE Transactions on Industry Applications, vol. 48, no. 3, pp. 1064–1069.
[20] S. Qijun, L. Fen, Q. Jialin, Z. Jinbin, and C. Zhenghong (2016), Photovoltaic power prediction based on principal component analysis and support vector machine, 2016 IEEE Innovative Smart Grid Technologies - Asia (ISGT-Asia), pp. 815–820.
[21] Y. Liu, J. Zhao, M. Zhang, F. Liu, H. Ouyang, H. Fang, Q. Hao, and Y. Lu (2016), A novel photovoltaic power output forecasting method based on weather type clustering and wavelet support vector machines regression, 2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), pp. 29–34.
[22] S. Liu and M. Dong (2016), Quantitative research on impact of ambient temperature and module temperature on short-term photovoltaic power forecasting, 2016 International Conference on Smart Grid and Clean Energy Technologies (ICSGCE), pp. 262–266.
[23] H. K. Elminir, Y. A. Azzam, and F. I. Younes (2007), Prediction of hourly and daily diffuse fraction using neural network, as compared to linear regression models, Energy, vol. 32, no. 8, pp. 1513–1523.
[24] S. I. Sulaiman, T. K. A. Rahman, I. Musirin, and S. Shaari (2015), Artificial neural network versus linear regression for predicting grid-connected photovoltaic system output, Cyber Technology in Automation, Control, and Intelligent Systems (CYBER), 2012 IEEE International Conference on. IEEE, pp. 170–174.
[25] B. Saghafian, S. Anvari, and S. Morid (2013), Effect of southern oscillation index and spatially distributed climate data on improving the accuracy of artificial neural network, adaptive neuro-fuzzy inference system and k-nearest neighbour streamflow forecasting models, Expert Systems, vol. 30, no. 4, pp. 367–380.
[26] H. Long, Z. Zhang, and Y. Su (2014), Analysis of daily solar power prediction with data-driven approaches, Applied Energy, vol. 126, pp. 29–37.
[27] M. J. Sanjari and H. B. Gooi (2017), Probabilistic forecast of PV power generation based on higher order Markov chain, IEEE Transactions on Power Systems, vol. 32, no. 4, pp. 2942–2952.
[28] W. Wu, K. Wang, B. Han, G. Li, X. Jiang, and M. L. Crow (2015), A versatile probability model of photovoltaic generation using pair copula construction, IEEE Transactions on Sustainable Energy, vol. 6, no. 4, pp. 1337–1345.
[29] Z. Ren, W. Yan, X. Zhao, W. Li, and J. Yu (2014), Chronological probability model of photovoltaic generation, IEEE Transactions on Power Systems, vol. 29, no. 3, pp. 1077–1088.
[30] M. Yang, S. Fan, and W. J. Lee (2012), Probabilistic short-term wind power forecast using componential sparse Bayesian learning, in 48th IEEE Industrial Commercial Power Systems Conference, pp. 1–8.
[31] Y. Sun, Y. Yuan, and G. Wang (2014), Extreme learning machine for classification over uncertain data, Neurocomputing, vol. 128, pp. 500–506.
[32] J. Moon, J. Park, E. Hwang, and S. Jun (2017), Forecasting power consumption for higher educational institutions based on machine learning, The Journal of Supercomputing, pp. 1–23.
[33] M. E. Tipping (2001), Sparse Bayesian learning and the relevance vector machine, Journal of Machine Learning Research, vol. 1, no. Jun, pp. 211–244.
[34] A. B. Chan and N. Vasconcelos (2012), Counting people with low-level features and Bayesian regression, IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 2160–2177.
[35] Y. Jia, S. Kwong, W. Wu, R. Wang, and W. Gao (2017), Sparse Bayesian learning-based kernel Poisson regression, IEEE Transactions on Cybernetics, pp. 1–13.