The availability of solar irradiance is uncertain and time-dependent because it is influenced by several climatic factors. Therefore, accurate solar irradiance prediction is required for the planning, design, and site selection of new solar power plants. This study uses eight machine learning techniques (multivariate linear regression, ridge, lasso, elastic net, multilayer perceptron, k-nearest neighbors, decision tree, and random forest) to develop hourly global horizontal irradiance (GHI) prediction models. It also discusses a feature selection procedure for choosing the most influential input features from a set of meteorological variables. To examine the accuracy of the developed models, the study employs data from 21 locations in different climatic and geographic regions. The cities considered are grouped into three clusters using k-means clustering to identify favorable locations for solar power generation. The computational results suggest that random forest and k-nearest neighbors are the most efficient prediction models, outperforming the other machine learning models with average forecast skills of 37% and 35%, respectively, over the smart persistence model. Overall, this study may be used to select an efficient GHI prediction model and a suitable location for the planning, design, and installation of new solar power plants.
Keywords: Renewable energy, global horizontal irradiance, machine learning, forecasting, random forest, clustering