Math Frames

The novel CMI is built on vine copula theory, with settings tailored to super drought monitoring. Constructing the CMI and detecting super droughts involves three main stages. Details can be found in our publication in BAMS (2023).

Step 1: Data preparation and preprocessing

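The water balance at various time scales is formulated as: \[D_n^k=\sum^{k-1}_{i=0}(\mathrm{PET}_{n-i}-\mathrm{P}_{n-i}), n\ge k \] where \(\mathrm{PET}\) is potential evapotranspiration, \(\mathrm{P}\) is precipitation, \(k\) is the time scale in months and \(n\) is the month index.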
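The accumulation above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code behind the CMI: the function name `water_balance` and the sample values are hypothetical, and the series are assumed to be monthly and gap-free.

```python
def water_balance(pet, precip, k):
    """Return D_n^k for every month n >= k (n counted from 1), as a list.

    pet, precip: monthly series of equal length (same units, e.g. mm).
    k: accumulation time scale in months.
    """
    if len(pet) != len(precip):
        raise ValueError("pet and precip must have the same length")
    if k < 1 or k > len(pet):
        raise ValueError("k must be between 1 and the series length")
    deficit = [e - p for e, p in zip(pet, precip)]  # monthly PET - P
    # For month n, sum the k most recent monthly values (months n-k+1 .. n).
    return [sum(deficit[n - k:n]) for n in range(k, len(deficit) + 1)]


# Hypothetical four-month record, for illustration only.
pet = [50.0, 60.0, 80.0, 100.0]
precip = [70.0, 40.0, 30.0, 20.0]
d3 = water_balance(pet, precip, k=3)
print(d3)  # [50.0, 150.0]
```

The output has one value per month from \(n=k\) onward, matching the condition \(n\ge k\) in the formula.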

Then, the observations are converted to pseudo-observations by the empirical probability integral transform (PIT): \[ \boldsymbol{x}_j=(x_{1j},x_{2j},\ldots,x_{nj})^T \rightarrow \boldsymbol{u}_j=(u_{1j},u_{2j},\ldots,u_{nj})^T,\quad j\in\{1,\ldots,d\} \] \[ u_{ij} = r_{ij}/(n+1),\quad i\in\{1,\ldots,n\},\ \forall j \] where \(r_{ij}\) denotes the rank of \(x_{ij}\) among all \(x_{kj}\), \(k\in\{1,\ldots,n\}\). Here \(n\) is the sample size and \(d\) is the number of variables. After conversion, each margin is approximately uniform on \((0,1)\).
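A minimal sketch of the rank transform for one variable is given below. The function name `pseudo_observations` is hypothetical, and the handling of ties (average rank) is an assumption, since the text does not specify it.

```python
def pseudo_observations(x):
    """Map a sample x to u_i = r_i / (n + 1), with r_i the rank of x_i (from 1).

    Tied values receive the average of the ranks they span (an assumption).
    """
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the group while the next sorted value equals the current one.
        while end + 1 < n and x[order[end + 1]] == x[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # average rank of the tie group
        for pos in range(start, end + 1):
            ranks[order[pos]] = avg_rank
        start = end + 1
    return [r / (n + 1) for r in ranks]


u = pseudo_observations([3.2, 1.5, 4.8, 2.0])
print(u)  # [0.6, 0.2, 0.8, 0.4]
```

Dividing by \(n+1\) rather than \(n\) keeps every value strictly inside \((0,1)\), which avoids infinite values when copula densities or inverse normal transforms are evaluated later.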

Step 2: Vine copula construction and goodness-of-fit test

Select the R-vine structure, that is, choose and connect the most dependent pairs in the tree by maximizing the sum of the absolute Kendall's \(\tau\) over the connected pairs of variables: \[ \tau=\mathbb{P}[(X_1-X_2)(Y_1-Y_2)>0]-\mathbb{P}[(X_1-X_2)(Y_1-Y_2)\lt 0],\quad \tau\in[-1,1] \] where \((X_1,Y_1)\) and \((X_2,Y_2)\) are independent copies of the pair \((X,Y)\).
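For the first tree, this selection amounts to a maximum spanning tree with \(|\tau|\) as edge weight. The sketch below illustrates that idea only: the names `kendall_tau` and `first_tree` are hypothetical, the sample version of \(\tau\) ignores ties, and higher trees (which require conditional pseudo-observations and the proximity condition) are not covered.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Sample Kendall's tau: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i, j in combinations(range(n), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant, 0 tie
    return s / (n * (n - 1) / 2)


def first_tree(columns):
    """Maximum spanning tree on |tau| (Prim's algorithm); returns its edges."""
    d = len(columns)
    weight = {(a, b): abs(kendall_tau(columns[a], columns[b]))
              for a, b in combinations(range(d), 2)}
    in_tree, edges = {0}, []
    while len(in_tree) < d:
        # Among edges leaving the current tree, take the most dependent pair.
        best = max((e for e in weight if (e[0] in in_tree) != (e[1] in in_tree)),
                   key=lambda e: weight[e])
        edges.append(best)
        in_tree.update(best)
    return edges


# Three hypothetical variables with five observations each.
columns = [[1, 2, 3, 4, 5], [2, 1, 4, 3, 5], [4, 5, 3, 2, 1]]
print(first_tree(columns))  # [(0, 2), (1, 2)]
```

In this toy data the pairs (0, 2) and (1, 2) both have \(|\tau|=0.8\), while (0, 1) has \(|\tau|=0.6\), so the weaker pair is left out of the tree.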

Then, identify the bivariate copula family and corresponding parameters that best fit the observations based on the Akaike information criterion (AIC): \[ \hat{\boldsymbol{\theta}} = \mathop{\mathrm{arg\,max}}_{\boldsymbol{\theta}} \sum^{n}_{i=1} \ln [c(u_{i1},u_{i2}|\boldsymbol{\theta})]\] \[ \mathrm{AIC}=-2\sum^n_{i=1}\ln[c(u_{i1},u_{i2}|\hat{\boldsymbol{\theta}})]+2k \] where \(\{(u_{i1},u_{i2}),i=1,\ldots,n\} \) are pseudo-observations, \(c\) is the copula density, and \(k\) is the number of parameters in the model. The candidate bivariate copula families include independence, normal, Student's \(t\), Clayton, Gumbel, Frank, Joe, Clayton-Gumbel (BB1), Joe-Gumbel (BB6), Joe-Clayton (BB7) and Joe-Frank (BB8).
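To make the selection rule concrete, the sketch below compares only two of the candidate families, independence and Clayton, on simulated pairs. It is an illustration under stated assumptions: the function names are hypothetical, the maximum likelihood step is replaced by a crude grid search over \(\theta\), and the data are simulated from a Clayton copula rather than taken from real pseudo-observations.

```python
import math
import random


def clayton_loglik(theta, pairs):
    """Log-likelihood of the Clayton copula density for theta > 0."""
    total = 0.0
    for u, v in pairs:
        total += (math.log1p(theta)
                  - (1.0 + theta) * (math.log(u) + math.log(v))
                  - (2.0 + 1.0 / theta) * math.log(u ** -theta + v ** -theta - 1.0))
    return total


def aic(loglik, n_params):
    return -2.0 * loglik + 2.0 * n_params


def select_family(pairs, grid=None):
    """Return (best family, fitted Clayton theta, AIC per family)."""
    grid = grid or [0.05 * i for i in range(1, 201)]  # theta in (0, 10]
    theta_hat = max(grid, key=lambda t: clayton_loglik(t, pairs))
    scores = {
        "independence": aic(0.0, 0),  # density is 1, so the log-likelihood is 0
        "clayton": aic(clayton_loglik(theta_hat, pairs), 1),
    }
    return min(scores, key=scores.get), theta_hat, scores


# Simulate 300 pairs from a Clayton copula (theta = 2) by conditional inversion.
rng = random.Random(0)
theta_true = 2.0
pairs = []
for _ in range(300):
    u, w = rng.random(), rng.random()
    v = (u ** -theta_true * (w ** (-theta_true / (1.0 + theta_true)) - 1.0)
         + 1.0) ** (-1.0 / theta_true)
    pairs.append((u, v))

best, theta_hat, scores = select_family(pairs)
print(best, theta_hat, scores)
```

The family with the smallest AIC is retained. Extending the dictionary of scores to the other candidate families follows the same pattern, with \(k=2\) for two-parameter families such as Student's \(t\) and the BB families.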

Last, the goodness-of-fit test is performed by using Rosenblatt's transform: \[ \left( \begin{matrix} u_{11} & \cdots & u_{1d} \\ \vdots & & \vdots \\ u_{n1} & \cdots & u_{nd} \\ \end{matrix} \right) \xrightarrow[\mathrm{(PIT)}]{\mathrm{Rosenblatt}} \left( \begin{matrix} y_{11} & \cdots & y_{1d} \\ \vdots & & \vdots \\ y_{n1} & \cdots & y_{nd} \\ \end{matrix} \right) \xrightarrow[s_i=\sum_{j=1}^{d}[\Phi^{-1}(y_{ij})]^2]{\mathrm{Aggregation}} \left( \begin{matrix} s_{1} \\ \vdots \\ s_{n} \\ \end{matrix} \right) \xrightarrow[\chi^2(d)]{\mathrm{KS\ test}} \] If the fitted vine copula is correct, the aggregated values \(s_i\) follow a \(\chi^2(d)\) distribution, which is checked with the Kolmogorov-Smirnov (KS) test.
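The aggregation and KS steps can be sketched as follows, assuming the Rosenblatt-transformed values \(y_{ij}\) are already available from a fitted vine copula (that step is not shown). The function names are hypothetical, and the asymptotic 5% critical value \(1.36/\sqrt{n}\) is a standard approximation chosen here for illustration, not a setting taken from the publication.

```python
import math
import random
from statistics import NormalDist


def chi2_cdf(x, d):
    """Chi-square CDF via the series for the regularized incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = d / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return min(1.0, total * math.exp(-z + a * math.log(z) - math.lgamma(a)))


def gof_statistic(y):
    """y: n rows of d transformed values in (0, 1).

    Returns the KS distance between the s_i and chi2(d), together with the
    asymptotic 5% critical value 1.36 / sqrt(n).
    """
    d = len(y[0])
    inv = NormalDist().inv_cdf
    s = sorted(sum(inv(v) ** 2 for v in row) for row in y)
    n = len(s)
    ks = 0.0
    for i, value in enumerate(s, start=1):
        f = chi2_cdf(value, d)
        ks = max(ks, i / n - f, f - (i - 1) / n)
    return ks, 1.36 / math.sqrt(n)


rng = random.Random(1)
# Independent uniforms: what a correctly specified model should produce.
y_ok = [[rng.random() for _ in range(3)] for _ in range(500)]
# Deliberately non-uniform values: what a misspecified model might produce.
y_bad = [[rng.random() ** 3 for _ in range(3)] for _ in range(500)]
ks_ok, crit = gof_statistic(y_ok)
ks_bad, _ = gof_statistic(y_bad)
print(ks_ok, ks_bad, crit)
```

A KS distance above the critical value, as expected for the second sample, indicates that the fitted model should be rejected at the 5% level.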

Step 3: CMI derivation and super drought detection

Monte Carlo simulation is carried out to numerically construct the copula \(C\) from its density \(c\). Then, the copula measure \(C\) is converted to the Kendall measure \(K\) via the Kendall function: \[K(q)=\mathbb{P}[C_{U_1,\ldots,U_d}(U_1,\ldots,U_d)\le q],q\in(0,1)\] In the final step, the standardized score is obtained by applying the inverse standard normal distribution function \( \Phi^{-1} \): \[ \mathrm{CMI}=(-1)\cdot \Phi^{-1}(K) \] Based on the CMI, a threshold value can be set to identify super drought events.
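The Monte Carlo estimate of \(K\) and the final score can be sketched as below. The two-dimensional independence copula stands in for the fitted vine copula purely for illustration, because its Kendall function is known in closed form (\(K(q)=q-q\ln q\)). The function names are hypothetical, and evaluating \(K\) at \(q=C(u_1,\ldots,u_d)\) for the observed month is an assumption about how the formulas combine.

```python
import bisect
import random
from statistics import NormalDist


def empirical_kendall(copula_cdf, sampler, n_sim=20000, seed=0):
    """Estimate K(q) = P[C(U) <= q] from n_sim draws of the copula."""
    rng = random.Random(seed)
    values = sorted(copula_cdf(sampler(rng)) for _ in range(n_sim))

    def K(q):
        # The +0.5 and +1 keep the estimate strictly inside (0, 1).
        return (bisect.bisect_right(values, q) + 0.5) / (n_sim + 1)

    return K


def cmi(u, copula_cdf, K):
    """CMI = -Phi^{-1}(K(C(u))) for one vector of pseudo-observations."""
    return -NormalDist().inv_cdf(K(copula_cdf(u)))


# Stand-in copula: independence in two dimensions.
def indep_cdf(u):
    return u[0] * u[1]


def indep_sampler(rng):
    return (rng.random(), rng.random())


K = empirical_kendall(indep_cdf, indep_sampler)
dry = cmi((0.95, 0.95), indep_cdf, K)  # large deficits at both time scales
wet = cmi((0.05, 0.05), indep_cdf, K)  # small deficits at both time scales
print(K(0.25), dry, wet)
```

Because the water balance is defined as PET minus P, large pseudo-observations correspond to dry conditions, so drought months receive negative CMI values and a drought threshold is a negative cutoff.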

Flowchart

Tools

I am currently turning the CMI calculation code into a software package, which includes user interface design, program optimization, fault-tolerance mechanisms and supporting documentation. This is not an easy job. Thank you for your patience and understanding.