My ICEM 2013 Presentation on Solar Forecasting

At this year’s International Conference on Energy Meteorology, I presented some preliminary results from one of my current research projects in a talk entitled:

“Short-term Machine Learning Based Power Output Forecasts for Collectives of Rooftop Photovoltaics”

[download link]

The project, entitled Machine-learning-based forecasting of distributed solar energy production, is funded by a $0.8 million grant from the Australian Solar Institute and is led by the Australian National University in direct partnership with NICTA.  Together, we are undertaking a data-mining project on PV systems in Canberra, Australia, in order to create a solar forecasting system for rooftop-scale distributed PV systems (related to what you can see over at the ACT Solar Map).  It’s a pretty exciting project and I am very confident we are going to have some excellent results.

Why the confidence?  Well, for one, I am working with some of the top Machine Learning experts in the world (NICTA) to develop the forecasting engine for this project (namely Xinhua Zhang and Justin Domke for the results I’m sharing here).  And at this presentation, I was able to show off a little of what we can do when we put our heads together!

Here are the basics of what we’ve done so far – and you’re going to need to have a basic understanding of machine learning to follow it (no two ways about it!).  Using a support vector machine approach, we’ve built some experiments that gauge the benefits of creating solar forecasts that include multiple PV sites (~17 on average) as ‘features’ or ‘predictors’ for a single PV site, versus those that use only the historical power output from one site. 
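To make the single-site versus multi-site setup a bit more concrete, here is a minimal sketch of how the two feature sets could be assembled. It is an illustration only, not the project's actual pipeline: the function name `build_lagged_features`, the site names, the number of lags and the toy readings are all assumptions made up for this example.

```python
from typing import Dict, List, Tuple


def build_lagged_features(
    power: Dict[str, List[float]],
    target_site: str,
    n_lags: int = 3,
    horizon: int = 1,
    neighbours: bool = True,
) -> Tuple[List[List[float]], List[float]]:
    """Build (X, y) for forecasting target_site at time t + horizon.

    Each row of X holds the last n_lags readings (up to and including t)
    of every site used as a predictor; y holds the target site's power
    `horizon` steps after t. With neighbours=False only the target
    site's own history is used.
    """
    # Hypothetical choice: sort site names so the column order is stable.
    sites = sorted(power) if neighbours else [target_site]
    n_steps = len(power[target_site])
    X: List[List[float]] = []
    y: List[float] = []
    for t in range(n_lags - 1, n_steps - horizon):
        row: List[float] = []
        for site in sites:
            row.extend(power[site][t - n_lags + 1 : t + 1])
        X.append(row)
        y.append(power[target_site][t + horizon])
    return X, y


# Toy readings for two made-up sites (not real data).
power = {
    "site_a": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    "site_b": [0.5, 1.5, 2.5, 3.5, 4.5, 5.5],
}

# Single-site features: only site_a's own history.
X_single, y_single = build_lagged_features(
    power, "site_a", n_lags=2, horizon=1, neighbours=False
)
# Multi-site features: every site's history as predictors for site_a.
X_multi, y_multi = build_lagged_features(power, "site_a", n_lags=2, horizon=1)
```

Either `(X, y)` pair could then be handed to a support vector regressor (for instance scikit-learn's `SVR`), and the two fitted models compared on held-out data. Changing `horizon` to 2 or 3 gives the longer look-ahead targets.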

We tested the routine on 10 minute and 60 minute interval data, with t+1, t+2 and t+3 time horizons.  Errors were evaluated with Mean Absolute Error, Root Mean Square Error and R^2 measures.  In the images below, you can see the benefit of, rather simply, just including more sites in the forecasting algorithm:

[Images: icem_w_baseline]
Red = Persistence, Black = 1 site, Blue = 17 sites
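For readers who want to see how the error measures and the persistence baseline fit together, here is a small sketch in plain Python. It assumes R^2 means the usual coefficient of determination, and it treats persistence as "the forecast for t+h is the value observed at t". The function names and the toy series are made up for illustration and are not taken from the project code.

```python
import math
from typing import List, Sequence, Tuple


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Square Error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination (assumed definition of R^2)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def persistence_forecast(
    series: Sequence[float], horizon: int = 1
) -> Tuple[List[float], List[float]]:
    """Return aligned (actual, predicted) lists for a persistence forecast.

    The prediction for time t + horizon is simply the value at time t.
    """
    return list(series[horizon:]), list(series[:-horizon])


# Toy power readings (not real data).
series = [10.0, 12.0, 15.0, 11.0, 9.0]
actual, predicted = persistence_forecast(series, horizon=1)

baseline_mae = mae(actual, predicted)
baseline_rmse = rmse(actual, predicted)
baseline_r2 = r_squared(actual, predicted)
```

Scoring a model's forecasts with the same three functions, over the same time steps, gives numbers that can be compared directly against the persistence baseline.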

Interesting results!  On only our first stab at it, we are getting single-digit MAE and RMSE values – some of the best results in the field.  I am looking forward to seeing what we can do once we employ some more finesse!

Now, there will be more to come on the overall aim of this research project – it really deserves its own blog post (or perhaps several of them!).  But since I’m committing myself to writing regularly here, you can be sure you will hear all about it.  For now, I just wanted to share a bit about the presentation (which you can download above!).

Well, I’ve got a flight to catch (currently in London-Heathrow!).  That’s all for now.
