q-value

Wikipedia describes the q-value as follows:

In statistical hypothesis testing, specifically multiple hypothesis testing, the q-value provides a means to control the positive false discovery rate (pFDR) (John D. 2003). Just as the p-value gives the expected false positive rate obtained by rejecting the null hypothesis for any result with an equal or smaller p-value, the q-value gives the expected pFDR obtained by rejecting the null hypothesis for any result with an equal or smaller q-value.

The book Statistics and Data Analysis: From Elementary to Intermediate gives an explicit definition of the p-value:

Simply rejecting or not rejecting \(H_0\) at a specified \(\alpha\) does not fully convey the information in the data. It is more useful to report the smallest \(\alpha\)-level at which the observed test result is significant. This smallest \(\alpha\)-level is called the observed level of significance or the P-value. The smaller the P-value, the more significant is the test result. Once the P-value is computed, a test at any specified \(\alpha\) can be conducted by rejecting \(H_0\) if P-value \(<\alpha\). An alternative definition of the P-value is that it is the probability under the null hypothesis of obtaining a test statistic at least as "extreme" as the observed value.
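The second definition can be made concrete with a minimal sketch. It assumes a two-sided z-test whose statistic is standard normal under \(H_0\); the observed value `z_obs` and the level `alpha` are made up for illustration.

```r
# Hypothetical observed z statistic (illustrative value, not from the text)
z_obs <- 2.1

# Two-sided P-value: probability under H0 of a statistic at least as extreme
p_value <- 2 * pnorm(-abs(z_obs))

# Reject H0 at level alpha exactly when the P-value is below alpha
alpha <- 0.05
reject <- p_value < alpha

print(p_value)
print(reject)
```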

An example helps to contrast the p-value and the q-value. Assume we run \(1000\) tests, each producing a p-value, and order these p-values from smallest to largest. Rejecting every test with a p-value below \(0.05\) means that \(5\%\) of the tests whose null hypothesis is true are expected to be false positives; if all \(1000\) null hypotheses are true, that is \(50\) tests. Rejecting every test with a q-value below \(0.05\) means that \(5\%\) of the tests declared significant are expected to be false positives.
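The p-value half of this example can be checked with a small simulation. This is only a sketch, assuming that all \(1000\) null hypotheses are true so that the p-values are uniform on \([0,1]\); the seed and variable names are arbitrary.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

m <- 1000
# Assumption: every null hypothesis is true, so p-values are Uniform(0, 1)
p <- runif(m)

# Expected number of false positives at a p-value threshold of 0.05
expected_fp <- 0.05 * m

# Observed number of false positives in this simulated run
observed_fp <- sum(p < 0.05)

# BH-adjusted p-values (q-values) from base R
q <- p.adjust(p, method = "BH")
# Number of tests declared significant at a q-value threshold of 0.05
n_q_sig <- sum(q < 0.05)

cat("expected:", expected_fp, "observed:", observed_fp, "\n")
cat("significant by q-value:", n_q_sig, "\n")
```

Because an adjusted value is never smaller than the raw p-value, the q-value threshold can only flag a subset of the tests flagged by the p-value threshold.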

One method for adjusting p-values to q-values is the Benjamini-Hochberg (BH) method. Let \(p_{(1)}\leq\cdots\leq p_{(m)}\) denote the ordered p-values of the \(m\) tests. The BH-adjusted p-values are defined as \[ p_{(i)}^{BH}=\min\left\{\min_{j\geq i}\left\{\frac{mp_{(j)}}{j}\right\},1\right\} \] The process is as follows.

  1. Sort all p-values in ascending order. Then multiply each p-value by the total number of tests \(m\) and divide by its rank \(j\), giving \(\frac{mp_{(j)}}{j}\).
  2. For the first (smallest) p-value, find the minimum of \(\frac{mp_{(j)}}{j}\) for \(j=1,2,\cdots,m\) and compare it with \(1\); if it is larger than \(1\), set the q-value to \(1\). This is the smallest q-value among all tests. For the second p-value, find the minimum of \(\frac{mp_{(j)}}{j}\) for \(j=2,\cdots,m\); because the minimum is taken over fewer terms, this q-value cannot be smaller than the first. Repeat this procedure until all p-values are adjusted to q-values.
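The two steps can be traced by hand on a small vector. This is a sketch on five illustrative p-values; the variable names are not from any library, and `rev(cummin(rev(x)))` is one way to take the running minimum over \(j\geq i\).

```r
# Illustrative p-values (assumed for this sketch)
p <- c(0.05, 0.02, 0.2, 0.4, 0.5)
m <- length(p)

# Step 1: sort ascending, then scale each p-value by m / rank
p_sorted <- sort(p)
scaled <- m * p_sorted / seq_len(m)

# Step 2: for rank i, take the minimum of scaled[i..m], then cap at 1
q_sorted <- pmin(1, rev(cummin(rev(scaled))))

print(scaled)
print(q_sorted)
```

The resulting q-values are in non-decreasing order, so the smallest p-value receives the smallest q-value.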

We can implement the BH method in R:

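> BH <- function(p.values)
+ {
+   p <- p.values
+   lp <- length(p)
+   # ranks m, m-1, ..., 1, matching the p-values sorted in decreasing order
+   i <- lp:1L
+   # key1: positions of the p-values sorted from largest to smallest
+   o <- order(p, decreasing = TRUE)
+   # key2: positions that map the sorted values back to the original order
+   ro <- order(o)
+   # running minimum of m * p / rank from the largest p-value down, capped at 1
+   pmin(1, cummin(lp / i * p[o]))[ro]
+ }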
> BH(c(0.05,0.02,0.2,0.4,0.5))
[1] 0.1250000 0.1000000 0.3333333 0.5000000 0.5000000
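As a cross-check, base R's `p.adjust` with `method = "BH"` computes the same adjustment. The short sketch below applies it to the same input so the two results can be compared.

```r
# Same input as the call above
p <- c(0.05, 0.02, 0.2, 0.4, 0.5)

# Base R's built-in Benjamini-Hochberg adjustment
q_builtin <- p.adjust(p, method = "BH")
print(q_builtin)
```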

We can visualize this process

p.value                    0.05    0.02    0.2     0.4     0.5
p.value position (key1)    1       2       3       4       5

\[ \downarrow \]

p.value position (key1)    5       4       3       1       2
q.value                    0.5     0.5     1/3     0.125   0.1
position of key (key2)     1       2       3       4       5

\[ \downarrow \]

position of key (key2)     4       5       3       2       1
p.value position (key1)    1       2       3       4       5
p.value                    0.05    0.02    0.2     0.4     0.5
q.value                    0.125   0.1     1/3     1/2     1/2
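The two keys in this picture can be reproduced directly. The sketch below, on the same five p-values, shows that `order(o)` is the inverse permutation of `o`, which is why indexing with it restores the original order.

```r
p <- c(0.05, 0.02, 0.2, 0.4, 0.5)

# key1: positions of the p-values from largest to smallest
o <- order(p, decreasing = TRUE)
# key2: the inverse permutation of key1
ro <- order(o)

print(o)
print(ro)
# Sorting with key1 and then indexing with key2 restores the original order
print(identical(p[o][ro], p))
```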

Reference