Workshop with the Japan Meteorological Agency: a Research Update

This week, I had the pleasure of presenting my current research to Mr. Toshi Kurino of the JMA.

And I thought, why not use this chance to provide an update on my current research efforts? I've already put together a PowerPoint on the topic - which did all the work of organising my thoughts for me. So here goes. What am I up to?

I've tried to make this easy. There's a slideshow below with accompanying audio hosted on SoundCloud. If you'd prefer, you can grab the presentation here:

[download presentation]

Solar 2014: Estimating Hourly Energy Generation of Distributed Photovoltaic Arrays


If you were tasked with estimating the energy generation from an entire city of PV systems - how would you do it?

A simulation probably jumps into your mind right away. Scale up a model of PV system performance and that must get you close, right? Well, that's a step in the right direction, and you could do that very accurately if you knew the amount of radiation arriving at the surface of all those PV arrays.

But that's a bit trickier than it sounds!  First, where are we going to get an estimate of the available solar radiation at a given location in the city?

The most common answer I get to this is: a pyranometer.  And that's a great start - you'd get a measurement of global horizontal irradiance (GHI) at a point location, which is very helpful.  But you're left with two major problems:

1. How representative is that pyranometer of the rest of the city's radiation resource?  Those clouds are tricky!

2. How do you estimate the amount of radiation arriving on all of those various tilted surfaces around the city?

So, OK, now we need multiple pyranometer sites around the city and at each site we need to tilt and orient them in various directions in order to get a representative sample.

Well, the bad news is that pyranometers cost a few thousand dollars each, need regular cleaning/calibration/maintenance and it's actually pretty difficult to find appropriate sites for them.  If you'd like to find out just how difficult it is to install scientific equipment on buildings - be my guest! (Hint: paperwork, approvals, PITAs galore!).

But I think I've got a better idea - what if we used the photovoltaic arrays that are already installed in a given region as our primary input to our city-scale modelling project?

They're pointed in many different directions, there are many of them already reporting data publicly in real-time, someone else has paid for the equipment AND they're representative of the systems we are trying to estimate in the first place. Sounds like a pretty sweet deal to me!

But, they are subject to shading, soiling and wiring inefficiencies, not to mention that they are not really the most scientific form of equipment.  Still - they are inherently a type of radiation sensor.  And we can probably deal with a lot of those things with some fancy machine-learning algorithms.

So, here is where I introduce our paper:

Estimating Hourly Energy Generation of Distributed Photovoltaic Arrays: a Comparison of Two Methods

J. Tan, N. A. Engerer and F. P. Mills

[download it]

In it, we compare two methods for estimating the energy generation of distributed PV arrays.

The first uses pyranometers, radiation models and PV system modelling for the estimation.

Method 1: Based on my Master's thesis (see the 'Publications' page)

The second uses a monitored PV system and my KPV methodology to make the estimation.  

Method 2: Based on my KPV methodology (read the Solar Energy journal publication on the 'Publications' page)
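
To give a rough feel for Method 2, here is a minimal Python sketch. It assumes that KPV is a clear-sky index for PV, i.e. the ratio of a system's measured output to its simulated clear-sky output, and that the index from a monitored system can be applied to a nearby system's clear-sky curve. The function names, the low-sun cutoff and the numbers are all made up for illustration; the real details are in the Solar Energy paper.

```python
def clear_sky_index(measured_kw, clear_sky_kw, floor_kw=0.05):
    """Assumed definition: measured output divided by simulated clear-sky output.

    floor_kw is a hypothetical cutoff to avoid dividing by near-zero values
    when the sun is very low.
    """
    if clear_sky_kw < floor_kw:
        return None  # no meaningful ratio at this time step
    return measured_kw / clear_sky_kw


def estimate_nearby_output(kpv, target_clear_sky_kw):
    """Scale the target system's clear-sky output by the monitored system's index."""
    if kpv is None:
        return 0.0  # treat as night-time / negligible output
    return kpv * target_clear_sky_kw


# Illustrative numbers only: a monitored system producing 1.5 kW when its
# clear-sky simulation says 2.0 kW, and a nearby system whose clear-sky
# simulation says 4.0 kW for the same hour.
kpv = clear_sky_index(measured_kw=1.5, clear_sky_kw=2.0)
estimate = estimate_nearby_output(kpv, target_clear_sky_kw=4.0)
print(kpv, estimate)
```

The appeal of working with a ratio is that the tilt, orientation and size of each array are absorbed by its own clear-sky simulation, so what gets passed between sites is (in this sketch) only the cloud effect.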


I'll let you read the paper to get the details, as that's not the point of blogging (all the boring stuff is for the papers - OK, I really do think that stuff is fun too, #supernerd). But I will let you know that we've found a few interesting things:

1. The pyranometer method does tend to do a bit better (RMSE of 15-20% versus 15-25% for the KPV method)

2. BUT when we start to leverage the prolific availability of the PV systems (there are many more of them out there!), we find the KPV method actually does best (for distances less than 5 km)!

3. We actually detected a calibration error in one of the pyranometers using the PV systems - so much for pyranometers being the pinnacle of scientific monitoring!

Overall, I find this result very encouraging.  If we can use PV systems as our primary input to our city-wide modelling idea, then we are one step closer to making the estimate we need.  And we can do it on the cheap - which is really good for solar! 

Now it's time to scale it up, test it on different time scales and handle all those pesky quality control issues.  But don't worry, you can count on me to bring you the results soon! 

Until then, enjoy my new webpage!





AMOS 2014 Talk: Categorising meteorological events as inputs to machine learning based solar forecasts


So I’m in Hobart, Tasmania.  A beautiful city, I might add, with some picturesque upper-level cirrus arranged in awesome gravity wave bands.  I’m not here as a cloud tourist, though, but to present some interesting research I am completing with an undergraduate student (Sonya Wellby) at the Australian Meteorological and Oceanographic Society's Annual Conference (AMOS 2014).

In the course of producing solar forecasts for our ARENA USASEC grant project (I really do need to actually write up a blog post on that at some point), we’ve discovered something interesting about the inclusion of weather data in our machine learning algorithms:  it doesn’t seem to help at all.

This is quite strange when you consider that the entire challenge of solar energy prediction is related to the clouds, and those are driven by recognizable weather phenomena.  So including information like the temperature, wind speed/direction, surface pressure, etc. should, in theory, help the forecast improve.

But this is not what we’ve found.  In our single-site Support Vector Machine (SVM, with a linear loss function) model estimates for 10-minute interval data/forecasts, including the weather data increases the Mean Bias and Root Mean Squared Errors and decreases the correlation between predictions and observations:

For hourly data/forecast intervals, we see no improvement in the forecasts with the inclusion of weather data.

10-minute interval forecasts/data (1). 60-minute interval forecasts/data (2). Red line is persistence, black line SVM without weather data, green line SVM with weather data. All forecasts were produced using data from one PV site.
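
For anyone who wants to run the same kind of comparison on their own data, here is a minimal Python sketch of the three scores mentioned above (mean bias, RMSE and correlation) together with a persistence baseline. The function names and the toy series are made up for illustration, and persistence is assumed here to mean "the next value equals the last observed value"; this is not the code behind the figures.

```python
import math


def mean_bias(pred, obs):
    """Average of (prediction - observation); negative means under-forecasting."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    """Root mean squared error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def correlation(pred, obs):
    """Pearson correlation coefficient between predictions and observations."""
    n = len(obs)
    mean_p, mean_o = sum(pred) / n, sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)


def persistence_forecast(series):
    """One-step-ahead persistence: the forecast for step t is the value at t-1."""
    return series[:-1]


# Toy PV power series (kW), purely illustrative.
power = [0.0, 1.0, 2.0, 4.0, 3.0]
forecast = persistence_forecast(power)  # forecasts for steps 1..4
observed = power[1:]                    # what actually happened at steps 1..4

print(mean_bias(forecast, observed))
print(rmse(forecast, observed))
print(correlation(forecast, observed))
```

Scoring a model with and without the weather features against the same observations, alongside persistence, gives the kind of three-way comparison shown in the figure.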

So what is going on here?


My hypothesis is that the weather data is full of too many small fluctuations and seemingly random signals for the machine-learning algorithm to see the “Big Picture” (thanks to Sonya for that phrase).  That is to say, it doesn’t recognize the overall synoptic or mesoscale event, which we meteorologists are trained to interpret.  It has no physical understanding of the data it is seeing – just a lack of direct relationships and therefore a diminishing weight given to that data.

So what can we do?

Well, our approach is to remove the individual feature vectors of weather data and replace them with a single feature vector that signals whether or not a "significant" weather event is going to occur.


“Significant” is defined here as the types of weather events that result in large scale ramp events for collectives of PV systems.  Currently, we are using data from 30 PV systems in Canberra to ID large ramp events.
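
As a rough illustration of the idea, here is a minimal Python sketch that sums the output of several PV systems and flags time steps where the collective output changes sharply. The threshold (30% of combined capacity between consecutive steps), the function names and the toy data are all assumptions made for the example; the post does not pin down how large a ramp has to be to count.

```python
def collective_output(systems):
    """Sum the output of all systems at each time step.

    systems is a list of equal-length lists, one per PV system.
    """
    return [sum(values) for values in zip(*systems)]


def ramp_flags(total, capacity, threshold=0.3):
    """Return 1 where the step-to-step change is at least threshold * capacity.

    The first step has no previous value, so it is never flagged.
    threshold is a hypothetical fraction of combined capacity.
    """
    flags = [0]
    for previous, current in zip(total, total[1:]):
        is_ramp = abs(current - previous) >= threshold * capacity
        flags.append(1 if is_ramp else 0)
    return flags


# Two toy systems (kW) with a sudden jump at the third time step,
# e.g. fog clearing. Combined capacity is assumed to be 6 kW.
systems = [
    [1, 1, 3, 3],
    [1, 1, 2, 2],
]
total = collective_output(systems)
flags = ramp_flags(total, capacity=6)
print(total, flags)
```

In this sketch the flag column is what would stand in for the raw temperature, wind and pressure features; linking each flagged ramp to the weather event behind it is the part that still needs a meteorologist.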


Major events so far are fog clearing, morning cloud dissipation after easterly surges, the departure of a low pressure system/cold front and thunderstorm events.  We’re identifying more, but at this stage, I’ll refer you to our talk – which I’m posting here with audio (how cool are we?).

Check out our #AMOS2014 talk here, complete with Audio Transcript

[Download Talk]

See the slides:

Listen to the audio: