BOL: P-Value, FDR, q-score: What Do They Mean? A Simple Guide with Example

In statistics and bioinformatics, you’ll often see results reported with p-values, FDR, and q-values (q-scores). But what do these terms mean, and how are they different? Let’s break them down with simple definitions and a step-by-step example.

1. What is a P-Value?
Definition: The p-value is the probability of observing a result at least as extreme as the one you got, assuming the null hypothesis is true.

Low p-value (e.g., p < 0.05) → evidence against the null hypothesis.

High p-value → no strong evidence against the null.

Key idea: It tells you how surprising your data is if there’s really no effect.
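
To make the definition concrete, here is a minimal sketch that computes an exact two-sided p-value for a coin-flip experiment. The scenario (60 heads in 100 flips of a supposedly fair coin) and the function name binomial_two_sided_p are illustrative assumptions, not something taken from a particular library.

```python
from math import comb

def binomial_two_sided_p(k, n, p0=0.5):
    """Exact two-sided binomial p-value: the total probability, under the
    null (success probability p0), of every outcome that is no more likely
    than the observed count k."""
    pmf = [comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so outcomes tied with the observed one count.
    total = sum(q for q in pmf if q <= observed * (1 + 1e-9))
    return min(1.0, total)

p = binomial_two_sided_p(60, 100)
print(f"p-value for 60 heads in 100 flips: {p:.4f}")
```

A result only slightly above 0.05, as in this case, is a reminder that the usual cutoff is a convention rather than a sharp boundary between "effect" and "no effect".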

2. The Multiple Testing Problem
In bioinformatics, genomics, or any large-scale study, you test thousands of hypotheses (e.g., thousands of genes). Even if there’s no real signal, some tests will have p < 0.05 just by chance.

Example:

Testing 10,000 genes

Even if every null hypothesis is true, expect about 500 genes (10,000 × 0.05) with p < 0.05 by chance

This is why we need multiple testing correction.
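
The count above can be checked with a small simulation. This is a sketch under one assumption: when the null hypothesis is true and the test statistic is continuous, the p-value is uniformly distributed between 0 and 1, so random draws can stand in for the 10,000 null genes.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

n_genes = 10_000
alpha = 0.05

# Under a true null, each p-value behaves like a uniform draw on [0, 1].
null_p_values = [random.random() for _ in range(n_genes)]
false_hits = sum(p < alpha for p in null_p_values)

print(f"{false_hits} of {n_genes} null tests have p < {alpha}")
```

The exact count varies with the seed, but it stays close to 500 even though no gene has a real effect.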

3. What is FDR (False Discovery Rate)?
Definition: FDR is the expected proportion of false positives among the results you declare significant.

Unlike the family-wise error rate (FWER), which is the probability of making even a single false positive across all tests, FDR lets you tolerate some false discoveries in exchange for more power.

The Benjamini–Hochberg (BH) procedure is the most widely used method for controlling FDR.
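
In a single experiment, the quantity being averaged is the false discovery proportion: false positives divided by total significant calls. FDR is the expected value of that proportion. The sketch below computes it for toy labels. The function name and the labels are hypothetical, and in real data the truth labels are unknown, so this calculation is only possible in simulations.

```python
def false_discovery_proportion(called_significant, is_truly_null):
    """Fraction of the significant calls that are actually true nulls.
    Taken to be 0 when nothing is called significant."""
    discoveries = sum(called_significant)
    if discoveries == 0:
        return 0.0
    false_discoveries = sum(
        c and n for c, n in zip(called_significant, is_truly_null)
    )
    return false_discoveries / discoveries

# Hypothetical toy labels: 5 significant calls, 1 of which is a true null.
called = [True, True, True, True, True, False, False, False]
truly_null = [False, False, False, True, False, True, True, True]
fdp = false_discovery_proportion(called, truly_null)
print(f"false discovery proportion: {fdp:.2f}")
```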

4. What is a q-value (or q-score)?
Definition: The q-value of a test is the minimum FDR at which that test would be called significant.

A p-value tells you how surprising your result is.

A q-value tells you what proportion of your significant results you should expect to be false positives if you call this result, and every result with a smaller p-value, significant.

You can think of the q-value as the FDR-adjusted p-value.

5. Example: Step-by-Step
Let’s work through an example with 10 tests.

Test Raw p-value
1 0.001
2 0.004
3 0.010
4 0.020
5 0.030
6 0.040
7 0.050
8 0.060
9 0.070
10 0.080

Goal: Control FDR at 5%.

Step 1: Rank p-values
Rank from lowest to highest:

Rank p-value
1 0.001
2 0.004
3 0.010
4 0.020
5 0.030
6 0.040
7 0.050
8 0.060
9 0.070
10 0.080

Step 2: Apply Benjamini–Hochberg threshold
For each rank i, compute:

BH critical value = (i / m) × Q
i = rank of the p-value
m = 10 tests
Q = 0.05 (target FDR)

Rank p-value BH critical value
1 0.001 0.005
2 0.004 0.010
3 0.010 0.015
4 0.020 0.020
5 0.030 0.025
6 0.040 0.030
7 0.050 0.035
8 0.060 0.040
9 0.070 0.045
10 0.080 0.050

Find the largest rank whose p-value is ≤ its critical value:

p(4) = 0.020 ≤ 0.020 (pass)

p(5) = 0.030 > 0.025 (fail), and every higher rank also exceeds its critical value

Result: We can declare the top 4 tests significant at FDR 5%.
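
Steps 1 and 2 can be sketched in a few lines of Python. The function name benjamini_hochberg_reject and the parameter fdr_level are illustrative choices, and the small tolerance is an assumption added to avoid floating-point trouble when a p-value sits exactly on its critical value, as rank 4 does here.

```python
def benjamini_hochberg_reject(p_values, fdr_level=0.05):
    """Return booleans (in input order) marking which tests the BH
    step-up procedure declares significant at the given FDR level."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        critical = rank / m * fdr_level
        # Tolerance guards exact ties such as 0.020 vs (4/10) * 0.05.
        if p_values[idx] <= critical + 1e-12:
            largest_k = rank
    rejected = [False] * m
    # Everything up to the largest passing rank is significant, even if
    # some earlier rank missed its own critical value.
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected

p_values = [0.001, 0.004, 0.010, 0.020, 0.030,
            0.040, 0.050, 0.060, 0.070, 0.080]
rejected = benjamini_hochberg_reject(p_values, fdr_level=0.05)
print("significant tests:", [i + 1 for i, r in enumerate(rejected) if r])
```

Note that the loop keeps the largest passing rank rather than stopping at the first failure. That is what makes BH a "step-up" procedure.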

Step 3: Computing q-values (conceptually)
The q-value for each p-value is roughly the minimum FDR at which it would be significant. Specialized software (e.g., R’s qvalue package) can estimate them.

In our example:

Tests 1–4 would have q-values ≤ 0.05

Tests 5–10 would have q-values > 0.05

The q-value gives you an adjusted p-value that accounts for multiple testing.
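
For a concrete version, the sketch below computes BH-adjusted p-values, which are often reported as q-values. This is an assumption-laden simplification: dedicated tools such as R's qvalue package also estimate the proportion of true nulls, so their q-values can come out smaller than these. The function name bh_adjusted_p_values is illustrative.

```python
def bh_adjusted_p_values(p_values):
    """BH-adjusted p-values, returned in the same order as the input.
    For rank i (ascending), the raw value is p * m / i; a running minimum
    taken from the largest rank downward keeps the result monotone."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

p_values = [0.001, 0.004, 0.010, 0.020, 0.030,
            0.040, 0.050, 0.060, 0.070, 0.080]
q_values = bh_adjusted_p_values(p_values)
for test, (p, q) in enumerate(zip(p_values, q_values), start=1):
    print(f"test {test}: p = {p:.3f}, BH-adjusted p = {q:.4f}")
```

Calling a test significant when its adjusted value is at most 0.05 gives the same four tests as the BH threshold in Step 2.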

6. In Bioinformatics Workflows
You see these all the time:

RNA-seq differential expression → Report p-values, FDR/q-values

ChIP-seq peak calling

Genome-wide association studies (GWAS)

Proteomics, metabolomics

Always check if results are corrected for multiple testing. Reporting raw p-values alone can be misleading.

Summary
Term | Meaning | Interpretation
p-value | Probability under null | Small p → evidence against null
FDR | False Discovery Rate | Expected proportion of false positives among calls
q-value | FDR-adjusted p-value | Minimum FDR threshold where result is significant

Final Tip
Always correct for multiple testing! Otherwise, your beautiful "significant" results might just be noise.


Abhi
