Hydrometeorology Research Group



In [5]:
from IPython.display import HTML
    HTML('''<script>
    code_show=true; 
    function code_toggle() {
     if (code_show){
     $('div.input').hide();
     } else {
     $('div.input').show();
     }
     code_show = !code_show
    } 
    $( document ).ready(code_toggle);
    </script>
    <form action="javascript:code_toggle()"><input type="submit" value="Click here to toggle on/off the raw code."></form>''')
    
In [1]:
%matplotlib inline
from rain import *
In [2]:
%%capture
rg = read_netcdf('../../../data/PHOENIX/Version2/Phoenix.nc')

Phoenix Rainfall

As a desert environment, the Phoenix rain gage network brings certain challenges. Rain events are much less common, so zero values dominate the datasets and the loss of a reading that happens during a storm can skew the whole gage record.

Gage Locations

In [4]:
highlight = None
fig, ax = plt.subplots(figsize=(12,8))
#colors = list(df.index)
colors = df.start_date.apply(lambda x: x.year)
if highlight is not None:
    colors = [0 if c==highlight else c for c in colors]
scat = ax.scatter(x=df.lon, y=df.lat, c=colors, s=70)
plt.title('Gages colored by start year (hover for gage name and start date)')
import mpld3
_df = df['start_date'].astype(str)
d = _df.to_dict()
labels = ['{i}\n start: {s}'.format(i=i, s=d[i]) for i in df.index]
tooltip = mpld3.plugins.PointLabelTooltip(scat, labels=labels)
fig.colorbar(scat)
mpld3.plugins.connect(fig, tooltip)
mpld3.display()
In [4]:
df = df.sort_values('start_date')
by_date = df.set_index(df.start_date)
by_date[3] = 1
by_date[3] = by_date[3].cumsum()
tot = by_date[3]
tot = tot.resample('1D', how='max').dropna().reindex(wet.index, method='ffill')
tot[tot.isnull()]=1
counts = pd.DataFrame(wet.count(axis=1)).join(tot)
counts.columns = ['wet', 'tot']
counts['ratio']=(counts.wet/counts.tot)
counts.columns = ['Gages with rain', 'Total number of gages', 'ratio']

This dataset in particular is difficult to wrangle because there are up to 313 gages at any given time and the data stretch back to 1980. The gages have been added a couple at a time since then. The plots below treat rainfall as a binary operation by just showing nuber of gages with rain vs the total number of gages that had been deployed up to that point.

In [5]:
fig, axes = plt.subplots(nrows=2, figsize=(14,8), sharex=True)
axes[0].set_title('Gages and gages observing rainfall')
axes[0].set_axis_bgcolor('white')
axes[1].set_axis_bgcolor('white')
counts.plot(y=['Gages with rain', 'Total number of gages'], ax=axes[0])
axes[0].set_ylabel('Number of gages')
counts['ratio'].plot(ax=axes[1], color='g')
axes[1].set_ylabel('Percentage of gages recording rainfall')
plt.xlabel('')
plt.show()

Rain rate maps

One way to get an overview of the data is to map aggregated statistics at the gage locations.

Rain rate in mm/hr over period of record

In [4]:
s = rg.rate.mean(axis=0)
s.name = 'Mean Rain Rate for 1980-2014'
fig, axes = plt.subplots(ncols=2, nrows=1, figsize=(16,6))
Event(rg.ll.join(s)).map_rain(latlon=True, ax=axes[0], sharec=False, basemap=True, hide_title=True)
k = Event(rg.ll.join(s)).krige(res=False, latlon=True, basemap=True, ax=axes[1])

From that map there seems to be a strong north-eastern amplification in rainfall, but I was wondering if that might be dominated by a few big storms, or perhaps the gages were added in a patterned way. To try to get at this, I plotted the mean rain rate for each decade.

In [6]:
q = rg.rate['1980':'1989'].mean()
q.name = 'the 1980s'
w = rg.rate['1990':'1999'].mean()
w.name = 'the 1990s'
e = rg.rate['2000':'2009'].mean()
e.name = 'the 2000s'
r = rg.rate['2010':].mean()
r.name = 'the 2010s'
Event(rg.ll.join((q,w,e,r))).map_rain(latlon=True, figsize=(16, 14), basemap=True, hide_title=True)

The pattern holds up! Now I wanted to try plotting the mean rain rate given that it is raining. For this plot I used a threshold of 1 tip per 15min which is equivalent to 0.01 inches, .254mm, or 1mm/hr. Next to this plot is a map of the time of the day at which peak rainfall occurs.

In [12]:
rg.get_wet_rate()
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(16,6))
s = rg.wet_rate.mean()
s.name = 'Mean rain rate given that it is raining'
scat = Event(rg.ll.join(s)).map_rain(latlon=True, basemap=True, hide_title=True, sharec=(6,9), ax=axes[0])
fig.colorbar(scat, ax=axes[0])
lm = rg.get_max_lowess(example_plot=False)
lm.columns = ['Time of max rainfall (using lowess)']
scatter = Event(rg.ll.join(lm)).map_rain(latlon=True, basemap=True, hide_title=True, sharec=(0,24), cmap='hsv',ax=axes[1])
fig.colorbar(scatter, ax=axes[1])
plt.show()

Standard descriptive plots

By inspecting these plots we can get a sense of the temporal and spatial patterns of the rain

In [4]:
df = rg.rate.groupby(rg.rate.index.month).mean().T
df.columns = ['January', 'February', 'March','April', 'May', 'June','July', 'August','September','October','November','December']
Event(rg.ll.join(df)).movie(basemap=True)


Once Loop Reflect