First-principles thinking about frequentist vs. Bayesian partisans

The field of statistics essentially comprises two main branches: frequentist and Bayesian. The primary distinction between these two approaches, as articulated by Michael I. Jordan, lies in their respective viewpoints:

Frequentists typically proceed by moving forward from the state of nature to the data, while Bayesians operate in the opposite direction, moving from the data back to the state of nature. This fundamental contrast is exemplified in concepts such as the false discovery rate, which considers the probability that a hypothesis is null given that a discovery has been made, i.e., it conditions on the data.

Contemplating these divergent perspectives gives rise to intriguing questions. Below are some inquiries I have formulated and explored:

  1. What are the distinctions between a confidence interval and a credible interval? What are the distinctions between coverage probability and credible probability?

    [George Casella Statistical Inference p435] When describing the interaction between a confidence interval and the parameter, we should carefully say that the interval covers the parameter, not that the parameter is inside the interval. This wording is deliberate. It is imperative to stress that in the classical frequentist paradigm the random quantity is the interval, not the parameter; the parameter is fixed. When we report a 90% confidence interval for a parameter, say \(\lambda\), as \(a\leq \lambda \leq b\), we are tempted to say (and many experimenters do) that “The probability is 90% that \(\lambda\) is in the interval \([a,b]\).” Within classical statistics, however, such a statement is invalid since the parameter is assumed fixed. Formally, the interval \([a,b]\) is one of the possible realized values of the random interval \([f(\alpha/2),g(1-\alpha/2)]\), and since the parameter \(\lambda\) does not move, \(\lambda\) is in the realized interval \([a,b]\) with probability either \(0\) or \(1\). When we say that the realized interval \([a,b]\) has a 90% chance of coverage, we only mean that 90% of the sample points of the random interval cover the true parameter.

    In contrast, the Bayesian setup allows us to say that \(\lambda\) is inside \([a,b]\) with some probability other than \(0\) or \(1\). This is because, under the Bayesian model, \(\lambda\) is a random variable with a probability distribution. All Bayesian claims of coverage are made with respect to the posterior distribution of the parameter.

    It’s important not to confuse credible probability (the Bayes posterior probability) with coverage probability (the classical probability). These are very different entities, with different meanings and interpretations. Credible probability comes from the posterior distribution, which in turn gets its probability from the prior distribution. Thus, credible probabilities reflect the experimenter’s subjective beliefs, as expressed in the prior distribution and updated with the data to form the posterior distribution. A Bayesian assertion of 90% coverage means that the experimenter, upon combining prior knowledge with data, is 90% sure of coverage.
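    The long-run meaning of frequentist coverage can be illustrated with a small simulation (all numbers here are assumed for illustration): the parameter stays fixed while the interval varies from sample to sample, and only the long-run frequency of coverage is 90%.

    ```python
    import numpy as np

    # Sketch of repeated sampling, assuming a N(mu, sigma^2) model with
    # known sigma. mu_true is FIXED; the interval is the random quantity.
    rng = np.random.default_rng(0)
    mu_true = 5.0          # fixed "state of nature" (assumed value)
    n, sigma = 25, 1.0     # sample size and known std. dev. (assumed)
    z = 1.6449             # approx. 95th percentile of N(0,1), for a 90% CI

    trials = 10_000
    covered = 0
    for _ in range(trials):
        x = rng.normal(mu_true, sigma, n)
        half_width = z * sigma / np.sqrt(n)
        lo, hi = x.mean() - half_width, x.mean() + half_width
        # Each realized interval covers mu_true with probability 0 or 1;
        # only the frequency across repetitions is ~90%.
        covered += (lo <= mu_true <= hi)

    print(f"empirical coverage: {covered / trials:.3f}")
    ```

    The printed frequency lands close to 0.90, which is all the classical statement "90% confidence" promises.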

  2. How is a credible interval calculated?

    In Bayesian statistics, the posterior distribution of a parameter can be derived and used for inference about that parameter. Since the parameter is a random variable in this framework, its posterior distribution can be divided into many different intervals, each containing 95% of the posterior mass and thus having a credible probability of 95%. Importantly, while there are multiple ways to construct a 95% credible interval, the interval itself is not random once the data are observed; it is the parameter that is random. A common choice is the equal-tailed interval, which cuts off 2.5% of the posterior probability in each tail. Once an interval is established, we interpret it as containing the true parameter value with 95% posterior probability.
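    As a concrete sketch (all numbers assumed for illustration): with a conjugate Beta(1, 1) prior on a binomial success probability \(\theta\) and 12 successes in 50 trials, the posterior is Beta(13, 39), and an equal-tailed 95% credible interval can be read off from posterior draws.

    ```python
    import numpy as np

    # Assumed toy setup: Beta(a0, b0) prior, k successes in n trials.
    a0, b0 = 1, 1          # prior hyperparameters (assumption)
    k, n = 12, 50          # observed data (assumption)

    # Conjugate update: posterior is Beta(a0 + k, b0 + n - k).
    rng = np.random.default_rng(1)
    samples = rng.beta(a0 + k, b0 + n - k, 200_000)

    # Equal-tailed 95% credible interval: cut 2.5% off each tail of the
    # posterior. Here theta is the random quantity, so we may say
    # "theta lies in [lo, hi] with posterior probability 0.95".
    lo, hi = np.percentile(samples, [2.5, 97.5])
    print(f"95% credible interval for theta: [{lo:.3f}, {hi:.3f}]")
    ```

    Monte Carlo draws are used here only for simplicity; with a conjugate posterior the same endpoints could be obtained directly from the Beta quantile function.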

  3. In a frequentist hypothesis testing framework, is it possible to compare the probabilities of the null hypothesis (H0) and the alternative hypothesis (H1) to make a decision about accepting or rejecting the hypothesis?

    [George Casella Statistical Inference p379] The probabilities \(P(H_0 \text{ is true})\) and \(P(H_1 \text{ is true})\) are not meaningful to the classical statistician, who considers \(\theta\) to be a fixed number. Consequently, a hypothesis is either true or false. If \(\theta\in \Theta_0\), then \(P(H_0 \text{ is true})=1\) and \(P(H_1 \text{ is true})=0\) for all values of \(\mathbf{x}\). If \(\theta\in\Theta_0^c\), these values are reversed. Since these probabilities are unknown (because \(\theta\) is unknown) and do not depend on the sample \(\mathbf{x}\), the classical statistician has no use for methods of calculating \(P(H_0 \text{ is true})\) and \(P(H_1 \text{ is true})\). In a Bayesian formulation of a hypothesis testing problem, these probabilities depend on the sample \(\mathbf{x}\) and can give useful information about the veracity of \(H_0\) and \(H_1\).
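    A minimal toy Bayesian test makes the dependence on \(\mathbf{x}\) explicit (every number below is an assumption for illustration): two simple hypotheses about a binomial success probability, each given prior probability 0.5, yield posterior probabilities that shift with the observed data.

    ```python
    from math import comb

    # Assumed toy problem: H0: theta = 0.5 vs H1: theta = 0.7,
    # prior probability 0.5 on each hypothesis.
    n, k = 20, 14          # observed: 14 successes in 20 trials (assumption)
    prior_h0, prior_h1 = 0.5, 0.5

    def binom_lik(theta, n, k):
        """Binomial likelihood of k successes in n trials."""
        return comb(n, k) * theta**k * (1 - theta) ** (n - k)

    # Bayes' rule: unlike the classical setup, P(H0 | x) depends on x.
    joint_h0 = prior_h0 * binom_lik(0.5, n, k)
    joint_h1 = prior_h1 * binom_lik(0.7, n, k)
    post_h0 = joint_h0 / (joint_h0 + joint_h1)
    post_h1 = 1 - post_h0
    print(f"P(H0 | x) = {post_h0:.3f}, P(H1 | x) = {post_h1:.3f}")
    ```

    With these assumed numbers the data favor \(H_1\); changing \(k\) changes the posterior probabilities, which is exactly what the classical formulation cannot express.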

  4. Is the Likelihood Ratio Test (LRT) specific to the Bayesian or frequentist statistical approach, or is it applicable to both methodologies?

    The likelihood ratio test is a method used within the frequentist statistical framework. It is a hypothesis test that compares two statistical models based on the ratio of their likelihoods. The test evaluates how well two models, typically one being a special case or a subset of the other (referred to as the null and alternative models), explain a set of observed data.

    The likelihood ratio test is not inherently Bayesian; it is primarily a frequentist tool. However, the idea of comparing the likelihoods of different models or hypotheses is also fundamental in Bayesian statistics, where it appears in a different form and framework (for example, as the Bayes factor).
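    The frequentist mechanics can be sketched for a binomial model (all numbers assumed for illustration): the null restricts \(\theta\) to a single value, the alternative leaves it free, and \(-2\log\lambda\) is referred to a chi-square distribution with one degree of freedom.

    ```python
    from math import log

    # Assumed toy LRT: H0: theta = 0.5 vs the unrestricted alternative,
    # for k successes in n binomial trials.
    n, k = 100, 62         # observed data (assumption)
    theta0 = 0.5
    theta_hat = k / n      # MLE under the alternative

    def loglik(theta, n, k):
        """Binomial log-likelihood (constant term omitted; it cancels)."""
        return k * log(theta) + (n - k) * log(1 - theta)

    # -2 log(likelihood ratio); asymptotically chi-square(1) under H0.
    stat = -2 * (loglik(theta0, n, k) - loglik(theta_hat, n, k))
    reject = stat > 3.841  # chi-square(1) critical value at level 0.05
    print(f"LRT statistic = {stat:.3f}, reject H0 at 5%: {reject}")
    ```

    Note the decision rule is purely frequentist: the statistic is compared to a fixed critical value, and no prior or posterior probability of either hypothesis is involved.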

Unsolved questions

  1. What is the evaluation method for a Bayesian test?

  2. What is the Bayesian p-value?

    [What are Bayesian p-values?](https://stats.stackexchange.com/questions/171386/what-are-bayesian-p-values)

  3. Is power analysis necessary for a Bayesian test? If so, how is the power of a Bayesian test calculated?

    [Is power analysis necessary in Bayesian Statistics?](https://stats.stackexchange.com/questions/65754/is-power-analysis-necessary-in-bayesian-statistics)

    [Power analysis from Bayesian point of view [duplicate]](https://stats.stackexchange.com/questions/110346/power-analysis-from-bayesian-point-of-view)

  4. Does a Bayesian test have a Type I error rate when the parameter is assumed to come from a probability distribution?

    Analysis of type I and II error rates of Bayesian and frequentist parametric and nonparametric two-sample hypothesis tests under preliminary assessment of normality

  5. Why does the false discovery rate resemble Bayesian thinking, going back from the data to the state of nature? Why is the FDR unlike other metrics, such as recall, precision, and accuracy?

    False discovery rate

    Original paper: Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing

    Bradley Efron’s empirical Bayes book, Large-Scale Inference