q-value

Posted on 2022-05-28 Edited on 2022-11-07 In statistics

Q value

Wikipedia describes the q-value as follows:

In statistical hypothesis testing, specifically multiple hypothesis testing, the q-value provides a means to control the positive false discovery rate (pFDR) (John D. Storey, 2003). Just as the p-value gives the expected false positive rate obtained by rejecting the null hypothesis for any result with an equal or smaller p-value, the q-value gives the expected pFDR obtained by rejecting the null hypothesis for any result with an equal or smaller q-value.

The book Statistics and Data Analysis: From Elementary to Intermediate gives an explicit definition of the p-value:

Simply rejecting or not rejecting \(H_0\) at a specified \(\alpha\) does not fully convey the information in the data. It is more useful to report the smallest \(\alpha\)-level at which the observed test result is significant. This smallest \(\alpha\)-level is called the observed level of significance or the P-value. The smaller the P-value, the more significant is the test result. Once the P-value is computed, a test at any specified \(\alpha\) can be conducted by rejecting \(H_0\) if P-value\(<\alpha\). An alternative definition of the P-value is that it is the probability under the null hypothesis of obtaining a test statistic at least as "extreme" as the observed value.
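The "at least as extreme" definition can be sketched in R for a two-sided z-test. This is only an illustration: the observed statistic `z_obs` and the standard normal null distribution are assumptions made for the example, not something taken from the book.

```r
# Hypothetical observed z statistic, chosen only for illustration
z_obs <- 2.1

# Two-sided P-value: probability under H0 (standard normal) of |Z| >= |z_obs|
p_value <- 2 * pnorm(-abs(z_obs))
p_value

# Reject H0 at level alpha exactly when the P-value is below alpha
alpha <- 0.05
p_value < alpha
```

Here the P-value is roughly 0.036, so the test is significant at \(\alpha=0.05\) but would not be at \(\alpha=0.01\).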

An example helps to distinguish the p-value from the q-value. Assume we run \(1000\) tests, each producing a p-value, and order these p-values from smallest to largest. Rejecting every test with a p-value below \(0.05\) means that \(5\%\) of the tests whose null hypothesis is actually true are expected to be false positives; if all \(1000\) null hypotheses were true, that would be about \(50\) tests. Rejecting every test with a q-value below \(0.05\) means that about \(5\%\) of the tests declared significant are expected to be false positives.
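The count of about 50 false positives can be checked with a small simulation. This is a rough sketch that assumes all \(1000\) null hypotheses are true and that p-values are uniform on \((0,1)\) under the null; the seed is arbitrary and only makes the run reproducible.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
m <- 1000

# Under a true null hypothesis (continuous test statistic),
# p-values are uniformly distributed on (0, 1)
p_null <- runif(m)

# Every rejection here is a false positive, because all nulls are true
n_false_pos <- sum(p_null < 0.05)
n_false_pos  # expected to be near 0.05 * m = 50, up to sampling noise
```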

One method for converting p-values to q-values is the Benjamini-Hochberg (BH) method. The BH-adjusted p-values are defined as \[ p_{(i)}^{BH}=\min\left\{\min_{j\geq i}\left\{\frac{mp_{(j)}}{j}\right\},1\right\} \] where \(p_{(1)}\leq\cdots\leq p_{(m)}\) are the ordered p-values and \(m\) is the total number of tests. The process is as follows.

Sort all p-values in ascending order. Then multiply each p-value by the total number of tests \(m\) and divide it by its rank \(j\).

For the first (smallest) p-value, find the minimum of \(\frac{mp_{(j)}}{j}\) for \(j=1,2,\cdots,m\) and compare it with \(1\); if it is larger than \(1\), set the q-value to \(1\). This is the smallest q-value among all tests. For the second p-value, find the minimum of \(\frac{mp_{(j)}}{j}\) for \(j=2,\cdots,m\); because the minimum is taken over fewer terms, this q-value cannot be smaller than the first one. We continue this procedure until all p-values are adjusted to q-values.
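As a small worked illustration of these steps (the three p-values are made up for this example), take \(m=3\) and p-values \(0.01, 0.04, 0.03\), so that \(p_{(1)}=0.01\), \(p_{(2)}=0.03\), \(p_{(3)}=0.04\): \[ \frac{mp_{(1)}}{1}=0.03,\quad \frac{mp_{(2)}}{2}=0.045,\quad \frac{mp_{(3)}}{3}=0.04 \] \[ p_{(1)}^{BH}=\min\{0.03,0.045,0.04\}=0.03,\quad p_{(2)}^{BH}=\min\{0.045,0.04\}=0.04,\quad p_{(3)}^{BH}=\min\{0.04\}=0.04 \] The second q-value comes from the third term rather than from its own, which is why the minimum over \(j\geq i\) is needed.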

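We can write the BH method in R:

> BH <- function(p.values)
+ {
+   p <- p.values
+   lp <- length(p)
+   # ranks of the sorted p-values, from the largest (rank lp) down to the smallest (rank 1)
+   i <- lp:1L
+   # key1: positions of the p-values when sorted from largest to smallest
+   o <- order(p, decreasing = TRUE)
+   # key2: position of each original p-value within key1, used to undo the sort
+   ro <- order(o)
+   # running minimum of m * p_(j) / j, starting from the largest p-value, capped at 1
+   pmin(1, cummin(lp/i * p[o]))[ro]
+ }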
> BH(c(0.05,0.02,0.2,0.4,0.5))
[1] 0.1250000 0.1000000 0.3333333 0.5000000 0.5000000
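As a cross-check, base R ships `p.adjust()` in the `stats` package, and its `method = "BH"` option should give the same adjusted values for this input. The snippet below is a minimal sketch using the same five p-values; the cutoff of 0.15 is an arbitrary threshold chosen for illustration.

```r
p <- c(0.05, 0.02, 0.2, 0.4, 0.5)

# Built-in Benjamini-Hochberg adjustment from the stats package
q <- p.adjust(p, method = "BH")
q

# Positions of the tests whose q-value falls below the illustrative cutoff
significant <- which(q < 0.15)
significant
```

With this cutoff only the first two tests (p-values 0.05 and 0.02) are selected.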

We can visualize this process

p.value	0.05	0.02	0.2	0.4	0.5
p.value position (key1)	1	2	3	4	5

\[ \downarrow \]

p.value position (key1)	5	4	3	1	2
q.value	0.5	0.5	1/3	0.125	0.1
position of key (key2)	1	2	3	4	5

\[ \downarrow \]

position of key (key2)	4	5	3	2	1
p.value position (key1)	1	2	3	4	5
p.value	0.05	0.02	0.2	0.4	0.5
q.value	0.125	0.1	1/3	0.5	0.5
