WIR-005 | Levi Harris

`WIR-005`: [10-05-25.10-12-25]

Previous Week’s Action Items

Action Items
- 1. [RSR] create an annotated bibliography of 20+ articles; upload LaTeX document to blog
  - Breadth > depth; 1-2 sentence description per source
- 2. [RSR] in-depth breakdown of the most salient source (i.e., 1 paper) reviewed in D.1; upload to blog
  - Depth > breadth
- 3. [RSR] join/create a paper club; post proof to blog
- 4. [LOG] brainstorm: (x3) ideas regarding how I can spend my summer ’26; post to blog

A.1

For lack of better file-sharing options, I’m providing a Google Drive link.

A.2

Application of Machine Learning Techniques to Improve Multi-Radar Multi-Sensor (MRMS) Precipitation Estimates in the Western United States

Background

MRMS

The multi-radar multi-sensor (MRMS) dataset is a rapidly updating, fine-grained (i.e., 1 km/2 min resolution) suite of products provided by the NSSL in Norman, Oklahoma. MRMS offers quantitative precipitation estimates (QPEs) in two flavors: radar-only and gauge-ingesting, bias-corrected variants. The current operational radar-only scheme, Q3EVAP (Zhang et al. 2020), is a physically based model that uses dual-pol radar fields to estimate precipitation grids every two minutes. The bias-adjusted QPEs (Pass 1/Pass 2) ingest data from the MIDAS gauge network and are available with a latency of ~1 hour.
Therefore, when selecting between QPE products, there is a fundamental tradeoff between latency and accuracy. Gauge-adjusted products can account for factors that hinder Z-R relation algorithms, such as beam blockage, radar height, radar distance, and the inherent variability between different precipitation types (e.g., stratiform, convective, brightband, snow, and hail). Meteorologists issuing flash flood warnings are often stuck between a rock and a hard place: forced to use radar-only products in areas where gauge coverage is sparse, they must rely on past experience to intuit the likelihood and amount of under/overestimation present.

Atmospheric river events

This study examines rainfall events in the western CONUS, with particular attention paid to California and areas where orographic forcing is likely to occur. This region is particularly troublesome for the Q3EVAP algorithm for a myriad of reasons. First, mountainous terrain and elevation variation degrade radar performance. Radar beams are obstructed by terrain features, and so underestimation is prevalent in orographically enhanced precipitation regimes.
Second, during the cool-season months (roughly October to March), eastward advection of warm, moist air from the Pacific Ocean brings heavy precipitation to the western CONUS. This process is referred to as an “atmospheric river” event. The authors of this work are primarily concerned with these events, as they tend to yield the worst performance for the Q3EVAP algorithm.

Method

The authors propose a convolutional neural network (CNN) model to estimate precipitation at a point from 2D inputs. In total, the authors select 13 input features:

Reflectivity: RALA, CREF, SHSR, SHSRH
Temperature (at a number of elevations)
Vertically integrated liquid (VIL)
Brightband

These 2D feature maps are stacked channel-wise to form an input with shape $(5, 5, 13)$, which is mapped to a scalar output $y \in \mathbb{R}$ representing QPE at a point in inches. Each input feature map is a $5 \times 5$ patch of $1 \times 1$ km grid cells. All input features are taken from the MRMS dataset, and so are on the same spatial scale used by the NSSL. All experiments in this paper are conducted using 24-hour QPE. Some input fields only exist at a finer temporal resolution and are summed over time to derive their 24-hour equivalents. Input features are normalized to lie between zero and one using per-feature minimums and maximums.
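To make the input shape concrete, here is a minimal sketch of how a $(5, 5, 13)$ patch could be cut out around a gauge location. The function name `extract_patch`, the toy grids, and the choice to center the window on the gauge cell are my own assumptions for illustration, not details taken from the paper.

```python
from typing import List

Grid = List[List[float]]  # one 2D field, indexed as grid[row][col]


def extract_patch(fields: List[Grid], row: int, col: int, size: int = 5) -> List[List[List[float]]]:
    """Cut a (size, size, n_fields) window centered on (row, col).

    Assumes every field shares the same grid and that the window
    fits entirely inside it (no padding is attempted here).
    """
    half = size // 2
    n_rows, n_cols = len(fields[0]), len(fields[0][0])
    if not (half <= row < n_rows - half and half <= col < n_cols - half):
        raise ValueError("window falls outside the grid")
    # Outer two loops walk the spatial window; the inner loop stacks channels.
    return [
        [[field[r][c] for field in fields] for c in range(col - half, col + half + 1)]
        for r in range(row - half, row + half + 1)
    ]


# Toy data: 13 hypothetical fields on a 9 x 9 grid; field k holds the value k everywhere.
fields = [[[float(k)] * 9 for _ in range(9)] for k in range(13)]
patch = extract_patch(fields, row=4, col=4)
shape = (len(patch), len(patch[0]), len(patch[0][0]))
print(shape)
```

Each spatial cell of `patch` holds one value per input field, which is what "stacked channel-wise" means here.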

\[\hat{X} = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}}\]
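As a quick sanity check, min-max scaling that maps a feature onto $[0, 1]$ subtracts the per-feature minimum and divides by the per-feature range. The sketch below is mine, not the authors' code; the function name, the sample values, and the handling of constant features are all assumptions.

```python
from typing import List, Tuple


def min_max_normalize(values: List[float]) -> Tuple[List[float], float, float]:
    """Scale one feature to [0, 1] using its own minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: the range is zero, so map everything to 0.0 (an assumption).
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi


# Hypothetical reflectivity samples in dBZ.
reflectivity = [5.0, 20.0, 35.0, 50.0]
scaled, lo, hi = min_max_normalize(reflectivity)
print(scaled)
```

The smallest sample maps to 0 and the largest to 1. Presumably the minimum and maximum would be computed on the training set and reused at inference time, though I did not see this spelled out.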

The CNN used in this paper is bespoke and relatively shallow compared to popular implementations for natural images (e.g., the ResNet family). The model consists of a few convolutional layers that gradually downscale the (H, W) dimensions, followed by a flatten operation, a “Gaussian layer” (I am personally unfamiliar with this regularization approach), and finally a few feedforward layers with dropout. The model is trained using a mean absolute error (MAE) loss.

\[\mathcal{L}(f, \theta) = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|, \qquad \hat{y}_i = f(X_i; \theta)\]

Experiments

The authors spend the majority of this paper reviewing the performance of their method (CNN) on four case studies, and conclude by reviewing a summary of 112 precipitation days in the western CONUS. Three primary evaluation metrics are used: mean bias ratio (MBR), the ratio of QPE to gauge values; mean absolute error (MAE); and fractional MAE (fMAE), $\text{MAE} / \text{mean gauge value}$. In one case study a second variation on the CNN method (CNN-PRISM) is introduced, which explicitly models terrain features and potential orographic effects from the PRISM dataset.
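For my own reference, here is a rough sketch of the three metrics. Two things are my assumptions rather than the paper's stated definitions: I compute MBR as the ratio of mean QPE to mean gauge accumulation (rather than a mean of per-gauge ratios), and I take fMAE to be MAE divided by the mean gauge value, which is how I read "fractional". The sample numbers are made up.

```python
from typing import Sequence


def mean_bias_ratio(qpe: Sequence[float], gauge: Sequence[float]) -> float:
    """Ratio of mean QPE to mean gauge accumulation (1.0 means unbiased)."""
    return sum(qpe) / sum(gauge)


def mae(qpe: Sequence[float], gauge: Sequence[float]) -> float:
    """Mean absolute error between QPE and gauge values."""
    return sum(abs(q - g) for q, g in zip(qpe, gauge)) / len(gauge)


def fractional_mae(qpe: Sequence[float], gauge: Sequence[float]) -> float:
    """MAE expressed as a fraction of the mean gauge value (assumed definition)."""
    return mae(qpe, gauge) / (sum(gauge) / len(gauge))


# Hypothetical 24-hour totals in inches at four gauges.
gauge = [1.0, 2.0, 4.0, 1.0]
qpe = [0.5, 1.0, 3.0, 0.5]

print(mean_bias_ratio(qpe, gauge), mae(qpe, gauge), fractional_mae(qpe, gauge))
```

In this toy case the QPE underestimates at every gauge, so the MBR falls below 1 while MAE and fMAE report the size of the error in inches and as a fraction of typical rainfall, respectively.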

Broadly, the authors find that the CNN method does a good job of correcting the strong underestimation bias of Q3EVAP during cool-season atmospheric river events. The CNN-PRISM variant improves upon the CNN baseline, but only slightly, and is evaluated in just one of the four case studies. The authors also note that the CNN method tends to overestimate precipitation, particularly in warm-rain regimes. It is speculated that brightband degradation or other orographic features not explicitly modeled may account for these gaps in performance. More importantly, it is shown that while the deep learning model outperforms Q3EVAP in cool-month events, it vastly underperforms compared to the baseline during summer months. The authors suggest that this performance gap may be attributable to warm-season data being absent from the model's training set. I would agree.

As an aside, I believe that the MBR metric lacks clarity as presented. For example, an MBR of 10 indicates that QPE values are 10 times higher than the ground truth: a 10x overestimation! An MBR of 0.1 implies our QPE is a 10x underestimate. The issue is that the authors plot every y-axis on a linear scale, so all underestimates are squeezed into the interval between 0 and 1 while overestimates can extend arbitrarily far above 1. Multiplicatively worse underestimations are therefore particularly hard to discern visually.
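One way to see the asymmetry is to compare each MBR's linear distance from 1 with its base-10 logarithm. This is just a sketch of my own suggestion (a log-scaled axis), not something the authors do; the helper name `log_bias` and the sample ratios are hypothetical.

```python
import math


def log_bias(mbr: float) -> float:
    """log10 of the mean bias ratio: 0 means unbiased, the sign gives the direction."""
    return math.log10(mbr)


# Pairs of equally severe over- and underestimates (2x and 10x), plus the unbiased case.
for mbr in (0.1, 0.5, 1.0, 2.0, 10.0):
    linear_gap = abs(mbr - 1.0)
    print(f"MBR={mbr:>5}: linear distance from 1 = {linear_gap:.2f}, log10 = {log_bias(mbr):+.2f}")
```

On the linear axis a 10x underestimate sits 0.9 away from the unbiased line while a 10x overestimate sits 9 away; on the log axis they land symmetrically at roughly -1 and +1.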

Discussion

For future work, the authors propose adding additional input fields to model topography, brightband effects, and other variables that modulate the Z-R relationship. Moreover, they hint at adding some interpretability techniques to probe model behavior. Not a ton of thought is put into experimenting with different model configurations, variants, loss functions, data augmentation schemes, regularization techniques, input normalization setups, or problem formulations. I’m being annoying. From my brief observations of the meteorology and atmospheric science communities, I’ve noticed a preference among these groups to focus on case studies rather than large-scale quantitative evaluations. This bias makes sense. Operational forecasters use products like the MRMS suite only within the context of weather events. I hope to use my naivety to my advantage, by focusing more on the specifics of model design and less on the underlying meteorology of the processes I model in my work.

A.3

I’ve signed up to attend bi-weekly meetings of this BAIR-based group. I’m not sure if they’re still meeting regularly; will join another group if this one turns out to be inactive.

A.4

1. If all goes well, contact X or X’s postdoc about doing an internship at Y.
1. Apply for and obtain an industry research internship; this will require a separate plan of action to concurrently manage this PhD cycle with an overlapping internship cycle.
1. If all goes well, request to begin work at Z early.

WIR

Tags
[RSR] – research
[TA]  – TA responsibilities
[ACA] – academics/schoolwork
[LOG] – logistics/chores
[PhD] – PhD applications

[10-05-25]

Work Completed
- [ACA] COMP790.183: final project proposal
- [ACA] COMP790.170: set time for final project planning meeting
- [LOG] week-in-review
  - contact (x10) potential PIs – (x5) remaining
  - pick (x1) location to present project
  - select goals for next week

[10-06-25]

Classes
- 2.5 hrs
Work Completed
- [LOG] schedule personal apps
- [LOG] draft CSSA newsletter
- [PhD] ping some PIs
- [RSR] select paper

[10-07-25]

Classes
- 2.5 hrs
Meetings
- 1.0 hrs
Work Completed
- [LOG] complete + send CSSA newsletter
- [ACA] COMP790.150: revise + submit A.1
- [ACA] COMP790.173: final project
- [RSR] (x1) paper reading session

[10-08-25]

Classes
- 2.5 hrs
Work Completed
- [RSR] (x1) paper reading session
- [LOG] update calendar
- [LOG] many emails

[10-09-25]

Classes
- 2.5 hrs
Meetings
- 0.5 hrs
Work Completed
- [RSR] (x1) paper reading session
- [LOG] help brother with interview prep

[10-10-25]

Office hours
- 4.0 hrs
Work Completed
- [ACA] COMP790.173: homework and final project
- [RSR] paper reading + writeup

[10-11-25]

Work Completed
- [RSR] LaTeX doc w/ 10+ citations and brief descriptions
- [PhD] outreach

Concluding Remarks

I feel reasonably confident that my productivity and “consistency of task execution”, in particular, have improved since starting this week-in-review series. As always there is still room for improvement. Going forward, I think it would be wise to find more opportunities to devote my weekly efforts to altruistic (or at least not immediately self-serving) causes. Thank you as always for reading!

Action Items
- 1. Schedule a meeting with an eligible PI @XXX; post proof
  - Failure condition: $\times 2$ action item #2 requirements
  - To explain, this TODO item has a strong luck component. Should I fail there must be some productive repercussions.
- 2. Contact 5+ PIs/25+ lab members; post proof
  - We will define “contact” as the successful establishment of correspondence, not merely having gotten ghosted
  - Either 5 PIs or 25 lab members; obviously the idea is to incentivize getting in contact with lab heads versus lab members
- 3. Come up with 4-5+ fellowship alternatives to the standard NSF; post listings and deadlines as a table

Until next week 👋
