source: Innovyze Support Portal
When running a real-time live model, we often need to combine rainfall from multiple sources to get the best available rainfall data.
A typical setup in the United States is to use the rain gauges installed in the study area when their data are available, fall back to radar rainfall (NEXRAD) when needed, and use HRRR from NOAA for forecasting.
Since we don’t have time to review the rainfall data manually every time a real-time model runs, ICM has built-in functions that identify data gaps and then apply rules to choose the best data source when data are missing.
Identify missing and bad data
For rain gauge data stored in a scalar TSDB (time series database):
- missing data are identified based on the time interval
- bad data are identified based on the threshold definition in the data stream
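The two checks above can be illustrated with a minimal Python sketch. This is not ICM's implementation; it only shows the idea of finding gaps from a fixed time interval and flagging values outside a threshold range. The function names, the 15-minute interval, the year, the sample values, and the 0 to 10 valid range are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def find_missing_timestamps(timestamps, interval):
    """Return the expected timestamps that are absent from a regularly spaced series."""
    present = set(timestamps)
    missing = []
    t = min(timestamps)
    end = max(timestamps)
    while t <= end:
        if t not in present:
            missing.append(t)
        t += interval
    return missing


def find_bad_values(readings, min_valid, max_valid):
    """Return the timestamps whose value falls outside [min_valid, max_valid]."""
    return [t for t, v in readings.items() if not (min_valid <= v <= max_valid)]


# Hypothetical 15-minute gauge record (the year and values are made up).
start = datetime(2021, 12, 17, 0, 0)
step = timedelta(minutes=15)
readings = {start + i * step: 0.1 for i in range(8)}

# Delete two rows to create a gap, and corrupt one value.
del readings[start + 3 * step]
del readings[start + 4 * step]
readings[start + 6 * step] = 999.0

missing = find_missing_timestamps(list(readings), step)
bad = find_bad_values(readings, min_valid=0.0, max_valid=10.0)

print("missing:", [t.strftime("%H:%M") for t in missing])
print("bad:", [t.strftime("%H:%M") for t in bad])
```

In this sketch a gap can only be detected between the first and last recorded timestamps; rows missing at the very end of the record would need a known expected end time.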
As shown below, when we deleted two rows from the time series, the data are shown as missing in the plot, marked with triangle data points. Also, with the extend option, each data point is shown as a step.
Setting logic rules on which rainfall sources to use
The logic rules are defined in the “Spatial rain source” tab of the “New Polygons Window” grid.
When ICM calculates the rainfall for each subcatchment, it starts with the highest-priority source (the one with the lowest priority number). It then processes the data points one by one; if a data point is missing, it looks for the data in the next-priority source.
We can also restrict the time range over which a rainfall source is used by setting its start/end seconds relative to the origin. With this setting, once the time is outside the range, ICM moves to the next source for rainfall data.
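The selection rules described above can be sketched in a few lines of Python. This is a simplified illustration of the logic, not ICM's actual code: the `RainSource` class, the `pick_rainfall` function, the inclusive window boundaries, and all rainfall values are assumptions. Only the source names (RG_NET, NEXRAD, HRRR) come from the example in this article.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RainSource:
    name: str
    priority: int                      # lower number = higher priority
    start_s: Optional[float]           # window start, seconds relative to origin (None = no limit)
    end_s: Optional[float]             # window end, seconds relative to origin (None = no limit)
    data: Dict[float, Optional[float]]  # offset in seconds -> rainfall (None = missing)


def in_window(src, t):
    """True if offset t (seconds relative to origin) is inside the source's range."""
    after_start = src.start_s is None or t >= src.start_s
    before_end = src.end_s is None or t <= src.end_s
    return after_start and before_end


def pick_rainfall(sources, t):
    """Walk the sources in priority order and return the first usable value at offset t."""
    for src in sorted(sources, key=lambda s: s.priority):
        if not in_window(src, t):
            continue  # out of range: move to the next source
        value = src.data.get(t)
        if value is not None:
            return src.name, value
        # missing data point: fall through to the next priority source
    return None, None


# Hypothetical data: offsets are seconds relative to the origin.
sources = [
    RainSource("RG_NET", 0, None, 0, {-3600: 1.0, -1800: None, 0: 0.5}),
    RainSource("NEXRAD", 1, None, 0, {-3600: 2.0, -1800: 2.5, 0: 3.0}),
    RainSource("HRRR", 2, 0, None, {0: 4.0, 3600: 4.5}),
]

print(pick_rainfall(sources, -3600))  # gauge value is present
print(pick_rainfall(sources, -1800))  # gauge value is missing, radar is used
print(pick_rainfall(sources, 3600))   # after the origin, only the forecast is in range
```

The sketch looks up values at exact offsets only. In practice the sources report at different intervals (15 min, ~10 min, hourly), so a real implementation would also need to resample or interpolate them onto a common time step.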
In our example, we have 3 sources of rainfall data,
- Rain Gauge (15 min)
- NEXRAD: radar rainfall (~10min)
- HRRR: rainfall forecast (hourly forecast)
The model network is shown below,
- a simple network with two subcatchments
- one spatial rain gauge polygon
- NEXRAD/HRRR were used as additional spatial rainfall sources
As shown below, the rainfall sources look quite different.
- from 12/17 12:00: NEXRAD recorded more rainfall
- from 12/18 18:00: the rain gauge and HRRR recorded more rainfall
We’ll do a few experiments and learn how ICM chooses rainfall data.
As shown below, if we only change the priority order, then the source with the lowest priority number will be used as the rainfall source.
The setting below will use the rain gauge.
The setting below will use HRRR.
Next, we change the start/end settings relative to the origin.
In the setting below,
- The origin is 12/18 12:00
The results are shown below,
- the green line is the rain gauge data
- the blue line is ICM calculated rainfall
- (1) is the run origin
- (2) priority 0 is RG_NET, and its range runs from the start to -24 hr (-86400 seconds relative to the origin). In this region, the rain gauge data were used
- (3) this region is outside the RG_NET range, so the NEXRAD data were used
- (4) in this region, the HRRR data were used, and the rain gauge data were ignored
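To check where a start/end setting falls on the plot, it helps to convert the seconds relative to the origin into a timestamp. The short sketch below does this for the RG_NET end value of -86400 seconds. The year is not given in this article, so `ASSUMED_YEAR` is a placeholder, and `offset_to_time` is a hypothetical helper, not an ICM function.

```python
from datetime import datetime, timedelta

ASSUMED_YEAR = 2021  # assumption: the article only gives month/day and time

# Run origin from the example: 12/18 12:00
origin = datetime(ASSUMED_YEAR, 12, 18, 12, 0)


def offset_to_time(origin, seconds):
    """Convert seconds relative to the origin into an absolute timestamp."""
    return origin + timedelta(seconds=seconds)


# RG_NET ends 24 hr (86400 seconds) before the origin.
rg_net_end = offset_to_time(origin, -86400)
print(rg_net_end.strftime("%m/%d %H:%M"))  # 12/17 12:00
```

So in this example the rain gauge range ends at 12/17 12:00, 24 hours before the origin.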
Next, we’ll see what happens when data are missing.
We manually deleted a few rows of data from the rain gauge database. ICM will replace the missing data with data from the next priority source.
With the following spatial rainfall source settings,
- Before the origin, the missing data will be replaced with NEXRAD, the next priority source
- After the origin, the missing data will be replaced with HRRR. NEXRAD stops at the origin, so ICM moves on to the next priority source.
The model can be found on GitHub.