q-value
Wikipedia describes the q-value as follows:
In statistical hypothesis testing, specifically multiple hypothesis testing, the q-value provides a means to control the positive false discovery rate (pFDR) (Storey, 2003). Just as the p-value gives the expected false positive rate obtained by rejecting the null hypothesis for any result with an equal or smaller p-value, the q-value gives the expected pFDR obtained by rejecting the null hypothesis for any result with an equal or smaller q-value.
An explicit definition of the p-value can be found in Statistics and Data Analysis: From Elementary to Intermediate:
Simply rejecting or not rejecting \(H_0\) at a specified \(\alpha\) does not fully convey the information in the data. It is more useful to report the smallest \(\alpha\)-level at which the observed test result is significant. This smallest \(\alpha\)-level is called the observed level of significance or the P-value. The smaller the P-value, the more significant is the test result. Once the P-value is computed, a test at any specified \(\alpha\) can be conducted by rejecting \(H_0\) if P-value\(<\alpha\).
An alternative definition of the P-value is that it is the probability, under the null hypothesis, of obtaining a test statistic at least as "extreme" as the observed value.
An example helps to distinguish the p-value from the q-value. Assume we run \(1000\) tests, each producing a p-value, and order these p-values from the smallest to the largest. Calling every test with a p-value below \(0.05\) significant means that, if all null hypotheses were true, about \(5\%\) of all tests, that is \(50\) tests, would be expected to give false positive results. Calling every test with a q-value below \(0.05\) significant means that about \(5\%\) of those significant tests are expected to be false positives.
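As a rough check of the p-value half of this example, the following R sketch simulates \(1000\) tests under the assumption that every null hypothesis is true, so that the p-values are uniform on \([0, 1]\). The seed and the variable names are illustrative choices, not part of the original example.

```r
# Assumption: all 1000 null hypotheses are true, so p-values are Uniform(0, 1).
set.seed(1)                          # arbitrary seed, only for repeatability
m <- 1000                            # number of tests, as in the example
p_null <- runif(m)                   # simulated p-values under the null

expected <- m * 0.05                 # expected number of false positives
n_false_pos <- sum(p_null < 0.05)    # observed number of p-values below 0.05

expected
n_false_pos
```

The observed count varies from run to run but should stay close to the expected value of 50.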
One method for adjusting p-values to q-values is the Benjamini-Hochberg (BH) method. The BH-adjusted p-values are defined as \[ p_{(i)}^{BH}=\min\left\{\min_{j\geq i}\left\{\frac{mp_{(j)}}{j}\right\},1\right\} \] where \(p_{(1)}\leq\cdots\leq p_{(m)}\) are the ordered p-values and \(m\) is the total number of tests. The process is as follows.
- Order all p-values in ascending order. Then multiply each p-value by the total number of tests \(m\) and divide by its rank \(j\).
- For the first (smallest) p-value, find the minimum of \(\frac{mp_{(j)}}{j}\) for \(j=1,2,\cdots,m\) and compare it with \(1\); if it is larger than \(1\), assign \(1\) as the q-value. This is the smallest q-value among all tests. For the second p-value, find the minimum of \(\frac{mp_{(j)}}{j}\) for \(j=2,\cdots,m\); because this minimum is taken over a smaller set, the second q-value cannot be less than the first. Repeat this procedure until all p-values are adjusted to q-values.
We can write the BH method in R:
BH <- function(p.values)
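A minimal sketch of how the steps above could be written in R, following the ascending-order formula directly. The helper name `bh_adjust` is hypothetical and is not the original `BH` function; the sketch assumes a numeric vector with no missing values. The result is compared with `p.adjust(..., method = "BH")` from the base `stats` package.

```r
# Hypothetical helper: BH-adjusted p-values (q-values) for a numeric vector.
# Assumes p.values contains no NA entries.
bh_adjust <- function(p.values) {
  m <- length(p.values)                       # total number of tests
  ord <- order(p.values)                      # indices sorting p-values ascending
  scaled <- m * p.values[ord] / seq_len(m)    # m * p_(j) / j for each rank j
  # For rank i, take the minimum over all j >= i:
  # a running minimum computed from the largest rank downwards.
  q_sorted <- pmin(rev(cummin(rev(scaled))), 1)
  q <- numeric(m)
  q[ord] <- q_sorted                          # restore the input order
  q
}

p <- c(0.05, 0.02, 0.2, 0.4, 0.5)
q_sketch <- bh_adjust(p)
q_builtin <- p.adjust(p, method = "BH")       # reference from the stats package
```

Reversing the vector before `cummin` is one way to express the inner minimum over \(j\geq i\); sorting in decreasing order first would work equally well.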
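We can visualize this process with five p-values. First, record the original position of each p-value (key1). Next, sort the p-values in decreasing order, compute the q-values as a running minimum of \(\frac{mp_{(j)}}{j}\), and record the position of each sorted entry (key2). Finally, sort by key1 to return the q-values to the original order.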
p.value | 0.05 | 0.02 | 0.2 | 0.4 | 0.5 |
---|---|---|---|---|---|
p.value position (key1) | 1 | 2 | 3 | 4 | 5 |
\[ \downarrow \]
p.value position (key1) | 5 | 4 | 3 | 1 | 2 |
---|---|---|---|---|---|
q.value | 0.5 | 0.5 | 1/3 | 0.125 | 0.1 |
position of key (key2) | 1 | 2 | 3 | 4 | 5 |
\[ \downarrow \]
position of key (key2) | 4 | 5 | 3 | 2 | 1 |
---|---|---|---|---|---|
p.value position (key1) | 1 | 2 | 3 | 4 | 5 |
p.value | 0.05 | 0.02 | 0.2 | 0.4 | 0.5 |
q.value | 0.125 | 0.1 | 1/3 | 1/2 | 1/2 |
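The tables above can be traced step by step in R. This is a sketch that assumes the same five p-values; the variable names `key1` and `key2` are chosen here to mirror the table labels.

```r
p <- c(0.05, 0.02, 0.2, 0.4, 0.5)   # p-values in their original order
m <- length(p)                      # total number of tests

# key1: original positions of the p-values after sorting in decreasing order
key1 <- order(p, decreasing = TRUE)

# Ranks run from m down to 1 along the decreasing order; the running minimum
# gives, for each entry, the smallest m * p / rank among equal or larger ranks.
ranks <- m:1
q_sorted <- pmin(1, cummin(m / ranks * p[key1]))

# key2: where each original position sits in the sorted sequence
key2 <- order(key1)

# q-values returned to the original order of the p-values
q <- q_sorted[key2]
```

Here `key1` reproduces the second table's header row, `q_sorted` its q.value row, and `q` the last row of the final table.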