Probability Fundamentals
Your complete beginner-to-intermediate guide to probability in statistics, finance, auditing, and business decision-making
Probability is a number between 0 and 1 that measures how likely an event is to occur. A probability of 0 means the event is impossible; a probability of 1 means the event is certain. Everything in between represents varying degrees of likelihood — and mastering probability is the single most important foundation for all of statistics.
Learning Objectives
By the end of this module, you will be able to:
| # | Objective | Skill Level | Application |
|---|---|---|---|
| 1 | Define probability and explain its role in statistics | Beginner | All quantitative fields |
| 2 | Identify sample spaces, events, and outcomes | Beginner | Experimental design |
| 3 | Apply the Addition, Multiplication, and Complement rules | Beginner | Risk, audit, business |
| 4 | Calculate conditional probability using the formal formula | Intermediate | Finance, insurance, fraud detection |
| 5 | Apply Bayes' Theorem to update beliefs with new evidence | Intermediate | Medical diagnosis, credit risk, ML |
| 6 | Build and interpret probability trees | Intermediate | Decision analysis |
| 7 | Identify and apply Binomial, Poisson, Normal, and Uniform distributions | Intermediate | Data analysis, forecasting |
| 8 | Apply probability to real-world financial, audit, and business problems | Intermediate | Professional practice |
Table of Contents
- Section 4.1 — Introduction to Probability
- Section 4.2 — Basic Probability Concepts
- Section 4.3 — Probability Rules
- Section 4.4 — Conditional Probability
- Section 4.5 — Bayes' Theorem
- Section 4.6 — Probability Trees
- Section 4.7 — Probability Distributions
- Section 4.8 — Real-World Applications
- Section 4.9 — Case Study: Loan Default Risk
- Section 4.10 — Practice, Quiz & Assessment
- Frequently Asked Questions
- Final Summary & Next Module
Section 4.1 — Introduction to Probability
Learning Goal
Understand what probability is, where it comes from, and why it is the essential language of statistics, finance, and data-driven decision-making.
What Is Probability?
Simple definition: Probability is a measure of how likely something is to happen. It is expressed as a number from 0 to 1, or equivalently as a percentage from 0% to 100%.
Academic definition: Probability is a real-valued function defined on a sample space that assigns to each event a non-negative number satisfying the Kolmogorov axioms: non-negativity, normalization (total probability = 1), and additivity for mutually exclusive events.
Statistical definition: Probability quantifies uncertainty. It is the long-run relative frequency of an event across an infinite number of repeated experiments under identical conditions, or a subjective degree of belief calibrated to coherent standards.
| Probability Value | Meaning | Everyday Example |
|---|---|---|
| 0 | Impossible — will never happen | Rolling a 7 on a standard die |
| 0.1 — 0.2 | Unlikely but possible | Being selected in a lottery |
| 0.5 | Equal chance — coin flip | Getting heads on a fair coin |
| 0.7 — 0.9 | Likely but not certain | A good student passing an exam |
| 1.0 | Certain — will always happen | The sun rising tomorrow |
Why Probability Matters
Probability is not abstract mathematics — it is the engine behind every major decision framework in modern life:
| Field | Application | Probability Question |
|---|---|---|
| Finance | Portfolio risk | What is the probability this investment loses value? |
| Auditing | Sampling risk | What is the probability of missing a material error? |
| Medicine | Diagnosis | Given a positive test, what is the probability of the disease? |
| Insurance | Premium pricing | How likely is a policyholder to file a claim this year? |
| Business | Sales forecasting | What is the probability next month's revenue exceeds budget? |
| Machine Learning | Classification | What is the probability this email is spam? |
| Credit Risk | Loan decisions | What is the probability this borrower defaults within 12 months? |
| Quality Control | Defect detection | What is the probability a batch contains at least one defect? |
"Probability theory is nothing but common sense reduced to calculation." — Pierre-Simon Laplace
- Probability measures likelihood on a scale from 0 (impossible) to 1 (certain).
- It applies universally — finance, auditing, medicine, data science, and daily decisions all rely on probability.
- Probability is the mathematical foundation upon which sampling, hypothesis testing, and regression are built.
- There are three perspectives: classical (equally likely outcomes), empirical (observed data), and subjective (informed belief).
Section 4.2 — Basic Probability Concepts
Learning Goal
Master the vocabulary of probability: experiments, outcomes, sample spaces, and events — the building blocks for every probability calculation you will ever perform.
Core Vocabulary
| Term | Definition | Dice Example | Business Example |
|---|---|---|---|
| Experiment | Any process that produces a measurable result | Rolling a six-sided die | Launching a new product |
| Outcome | A single possible result of an experiment | Rolling a 4 | Product succeeds |
| Sample Space (S) | The set of ALL possible outcomes | S = {1, 2, 3, 4, 5, 6} | S = {success, failure} |
| Event (E) | A specific subset of the sample space | Rolling an even number | Product achieves >10% market share |
| Simple Event | An event containing exactly one outcome | Rolling a 3 | Exactly 5 units sold |
| Compound Event | An event containing two or more outcomes | Rolling a number > 4 | Sales between 100–200 units |
| Complementary Event (E') | All outcomes NOT in event E | Not rolling a 6 | Product does not achieve target |
| Mutually Exclusive | Two events that cannot both occur at once | Rolling a 2 AND rolling a 5 (same roll) | Profit AND loss in same quarter |
| Independent Events | Occurrence of one does not affect the other | First roll result vs. second roll result | Weather in Tokyo vs. sales in London |
The Probability Formula
P(E) = n(E) / n(S)
Variable breakdown:
- P(E): probability of event E occurring (a value between 0 and 1)
- n(E): number of outcomes in the event (favorable outcomes)
- n(S): total number of outcomes in the sample space
Worked Example — Standard Die Roll
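As a sketch of the formula P(E) = n(E) / n(S), the Python snippet below works through one roll of a fair six-sided die, using events from the vocabulary table (an even number, a number greater than 4, and the impossible event of rolling a 7). The function name `classical_probability` is illustrative, and the calculation assumes all six faces are equally likely.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """P(E) = n(E) / n(S), valid only when every outcome is equally likely."""
    favourable = event & sample_space        # outcomes of E that can actually occur
    return Fraction(len(favourable), len(sample_space))

sample_space = {1, 2, 3, 4, 5, 6}   # one roll of a fair six-sided die
even = {2, 4, 6}                    # event: rolling an even number
greater_than_4 = {5, 6}             # event: rolling a number > 4
seven = {7}                         # event: rolling a 7 (not in the sample space)

p_even = classical_probability(even, sample_space)
p_gt4 = classical_probability(greater_than_4, sample_space)
p_seven = classical_probability(seven, sample_space)

print(p_even, p_gt4, p_seven)       # prints: 1/2 1/3 0
```

Exact fractions are used rather than floats so that results such as 1/3 are not rounded.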
Types of Probability
| Type | Definition | Formula Basis | Example | Best Used When |
|---|---|---|---|---|
| Classical | Based on equally likely outcomes from theory alone | n(E)/n(S) | Tossing a coin: P(H) = 1/2 | Games, theoretical problems |
| Empirical | Based on observed frequency of actual past events | f/n (frequency/total) | 200 of 1,000 loans defaulted: P(default) = 0.20 | Historical data available |
| Subjective | Based on expert judgment and informed belief | Expert estimate | Analyst estimates 65% chance competitor enters market | No data, expert knowledge available |
In credit risk modelling, classical probability underpins fair-value pricing models; empirical probability drives PD (Probability of Default) estimates in Basel III models using historical loan performance data; and subjective probability is used by investment analysts when assessing emerging markets with limited historical data.
- Every probability problem has an experiment, sample space, and one or more events.
- P(E) = n(E)/n(S) only when all outcomes are equally likely.
- Empirical probability uses observed data: P = frequency / total observations.
- Subjective probability is valid when no data exists but expert judgment is available.
Section 4.3 — Probability Rules
Learning Goal
Apply the three fundamental probability rules — Addition, Multiplication, and Complement — to calculate probabilities of combined events in finance, auditing, and business contexts.
Rule 1 — The Complement Rule
P(E') = 1 − P(E), where E' is the complement of E (all outcomes not in E).
Worked Example: A company estimates a 30% chance of winning a contract. What is the probability of NOT winning?
P(win) = 0.30 → P(not win) = 1 − 0.30 = 0.70 = 70%
Audit Application: If an auditor estimates a 5% risk that internal controls are ineffective (control risk = 0.05), then the probability that controls ARE effective = 1 − 0.05 = 0.95 (95%).
Rule 2 — The Addition Rule
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
Variable breakdown:
- P(A ∪ B): probability that A OR B (or both) occur
- P(A ∩ B): probability that BOTH A and B occur simultaneously (joint probability)
Worked Example — Loan Portfolio
A bank portfolio contains two loans. Based on credit models:
- P(Loan A defaults) = 0.15
- P(Loan B defaults) = 0.10
- P(Both A and B default) = 0.04
What is the probability that at least one loan defaults?
P(A ∪ B) = 0.15 + 0.10 − 0.04 = 0.21 = 21%
Interpretation: There is a 21% chance that at least one loan in this portfolio will default.
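The loan calculation can be sketched in a few lines of Python. The function name `union_probability` is illustrative; the three input probabilities are the ones given in the example. The last step is an added check, not part of the original example: it compares the stated joint probability with the value that independence would imply.

```python
def union_probability(p_a, p_b, p_both):
    """General Addition Rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

p_a, p_b, p_both = 0.15, 0.10, 0.04      # figures from the loan example

p_at_least_one = union_probability(p_a, p_b, p_both)
p_neither = 1 - p_at_least_one            # Complement Rule

# If the two defaults were independent, the joint probability would be P(A) x P(B).
p_both_if_independent = p_a * p_b

print(f"P(at least one default) = {p_at_least_one:.2f}")
print(f"P(no default)           = {p_neither:.2f}")
print(f"P(both) if independent  = {p_both_if_independent:.3f} (stated: {p_both:.2f})")
```

The stated joint probability (0.04) is well above the 0.015 that independence would give, which suggests the two defaults are positively related, for example through shared economic conditions.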
Common mistake: forgetting to subtract P(A ∩ B). If you add P(A) + P(B) = 0.25 without subtracting the overlap, you count the joint event twice, which inflates the probability and leads to incorrect risk assessments.
Rule 3 — The Multiplication Rule
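P(A ∩ B) = P(A) × P(B|A)
For independent events, P(B|A) = P(B), so the rule simplifies to P(A ∩ B) = P(A) × P(B).
P(B|A) is the conditional probability of B given that A has already occurred (covered in Section 4.4).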
Worked Example — Quality Control
A factory produces items. 5% are defective. You randomly test two items independently.
What is the probability BOTH items are defective?
P(defective) = 0.05 for each item (independent events)
P(both defective) = 0.05 × 0.05 = 0.0025 = 0.25%
Independent vs. Dependent Events
| Dimension | Independent Events | Dependent Events |
|---|---|---|
| Definition | Occurrence of A has NO effect on P(B) | Occurrence of A CHANGES P(B) |
| Test | P(A ∩ B) = P(A) × P(B) | P(A ∩ B) ≠ P(A) × P(B) |
| Formula used | P(A ∩ B) = P(A) × P(B) | P(A ∩ B) = P(A) × P(B|A) |
| Statistical example | Two separate coin flips | Drawing cards without replacement |
| Business example | Sales in Tokyo vs. Lagos (unrelated markets) | Sales this month vs. last month (trend relationship) |
| Finance example | Stock returns of uncorrelated assets | Default of a subsidiary given parent default |
| Audit example | Testing two unrelated account balances | Testing accounts where errors in one indicate errors in another |
Probability Rules Summary
| Rule | Formula | When to Use | Key Caution |
|---|---|---|---|
| Complement | P(E') = 1 − P(E) | Finding "not E" probability | Always valid — no conditions |
| Addition (Mutually Exclusive) | P(A∪B) = P(A) + P(B) | A and B cannot happen together | Events truly cannot overlap |
| Addition (General) | P(A∪B) = P(A)+P(B)−P(A∩B) | A and B might overlap | Always use this unless exclusivity is proven |
| Multiplication (Independent) | P(A∩B) = P(A)×P(B) | A and B are independent | Verify independence before applying |
| Multiplication (Dependent) | P(A∩B) = P(A)×P(B|A) | A affects the probability of B | Requires conditional probability |
- The Complement Rule: P(not E) = 1 − P(E). Use it to find "at least one" probabilities efficiently.
- The Addition Rule: Always subtract the joint probability to avoid double-counting overlapping events.
- The Multiplication Rule: Use P(A)×P(B) only for independent events; use P(A)×P(B|A) for dependent events.
- Before applying any rule, classify your events: mutually exclusive? Independent? Dependent?
Section 4.4 — Conditional Probability
Learning Goal
Calculate and interpret conditional probability — the probability that an event occurs given that another event has already occurred — and apply it to financial screening, audit risk, and customer analytics.
Conditional probability is the probability of event B occurring given that event A has already occurred. It is written P(B|A) and calculated as P(A ∩ B) ÷ P(A). It narrows the sample space to only the scenarios where A is true.
Why Conditional Probability Matters
In the real world, information arrives sequentially. Knowing that a loan applicant has previously defaulted changes the probability that they will default again. Knowing a test result is positive changes the probability of disease. Conditional probability is how we formally update probabilities with new information.
P(B|A) = P(A ∩ B) / P(A)
Variable breakdown:
- P(B|A): conditional probability, the probability of B occurring given that A has occurred
- P(A ∩ B): joint probability, the probability that BOTH A and B occur
- P(A): marginal probability, the probability that A occurs (must be > 0)
Step-by-Step Worked Example — Credit Screening
A bank's historical data on 10,000 loan applications shows:
| | Defaults (D) | Does Not Default (D') | Total |
|---|---|---|---|
| Poor Credit Score (P) | 420 | 1,580 | 2,000 |
| Good Credit Score (G) | 80 | 7,920 | 8,000 |
| Total | 500 | 9,500 | 10,000 |
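Question: Given that a borrower has a poor credit score, what is the probability they default?
Solution: P(D|P) = P(P ∩ D) / P(P) = (420/10,000) / (2,000/10,000) = 0.042 / 0.20 = 0.21 = 21%
Interpretation: Borrowers with poor credit scores default at a rate of 21%, compared with 80/8,000 = 1% for borrowers with good scores and 500/10,000 = 5% for all applicants.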
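The credit screening table can also be worked through in code. The sketch below stores the four cell counts from the table and applies P(B|A) = P(A ∩ B) / P(A). The helper names `prob` and `conditional` are illustrative. It also computes the reverse conditional, P(Poor Credit | Default), to show that the two directions give different answers.

```python
# Cell counts from the 10,000-application table: (credit score, outcome) -> count
counts = {
    ("poor", "default"): 420,
    ("poor", "no_default"): 1_580,
    ("good", "default"): 80,
    ("good", "no_default"): 7_920,
}
total = sum(counts.values())

def prob(score=None, outcome=None):
    """Probability of all cells matching the given score and/or outcome."""
    matching = sum(
        n for (s, o), n in counts.items()
        if (score is None or s == score) and (outcome is None or o == outcome)
    )
    return matching / total

def conditional(p_joint, p_condition):
    """P(B|A) = P(A and B) / P(A); the conditioning event needs P(A) > 0."""
    if p_condition <= 0:
        raise ValueError("conditioning event must have positive probability")
    return p_joint / p_condition

p_default_given_poor = conditional(prob("poor", "default"), prob(score="poor"))
p_default_given_good = conditional(prob("good", "default"), prob(score="good"))
p_poor_given_default = conditional(prob("poor", "default"), prob(outcome="default"))

print(f"P(Default | Poor) = {p_default_given_poor:.2f}")
print(f"P(Default | Good) = {p_default_given_good:.2f}")
print(f"P(Poor | Default) = {p_poor_given_default:.2f}")
```

P(Default | Poor) comes out at 0.21, while P(Poor | Default) is 0.84: most defaulters had poor scores, yet most poor-score borrowers do not default.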
Audit Example — Transaction Testing
During an audit, 1,000 transactions are reviewed. 60 contain errors. Of those 60 errors, 45 involve amounts over £50,000. What is the probability a transaction contains an error, given that it exceeds £50,000?
Assume 150 transactions exceed £50,000 total.
P(Error | >£50K) = P(Error AND >£50K) / P(>£50K) = (45/1000) / (150/1000) = 0.045 / 0.15 = 0.30 = 30%
Conclusion: High-value transactions have a 30% error rate — the auditor should prioritise sampling from this stratum.
1. Confusing P(B|A) with P(A|B): P(Default | Poor Credit) ≠ P(Poor Credit | Default). These are entirely different questions with different answers — confusing them is called the "inverse fallacy."
2. Ignoring base rates: A diagnostic test with 95% accuracy doesn't mean 95% of positive results are true positives. The base rate of the condition matters enormously (this is what Bayes' Theorem addresses).
- P(B|A) = P(A∩B) / P(A) — always divide the joint probability by the conditioning event's probability.
- Conditional probability narrows the sample space to only the cases where A is true.
- P(B|A) ≠ P(A|B) — confusing these is one of the most common errors in probability and statistics.
- In finance: conditional probability models differentiated default risk by borrower segment.
- In auditing: it allows risk-stratified sampling by directing attention to high-probability error areas.
Section 4.5 — Bayes' Theorem
Learning Goal
Apply Bayes' Theorem to update probability estimates as new evidence arrives — a critical tool in medical diagnosis, fraud detection, credit risk modelling, and machine learning classification.
Bayes' Theorem is a mathematical formula that updates the probability of a hypothesis when new evidence is obtained. It combines your prior belief with the likelihood of observing the evidence to produce a revised "posterior" probability. It is the formal mathematical foundation for rational belief updating.
The Formula
P(A|B) = [P(B|A) × P(A)] / P(B)
Variable breakdown:
| Symbol | Name | Meaning |
|---|---|---|
| P(A|B) | Posterior probability | Updated probability of A after observing B |
| P(B|A) | Likelihood | Probability of observing B if A is true |
| P(A) | Prior probability | Initial probability of A before observing B |
| P(B) | Marginal probability / Evidence | Total probability of observing B across all scenarios |
The denominator P(B) is expanded using the Total Probability Rule:
P(B) = P(B|A) × P(A) + P(B|A') × P(A')
Case Study 1 — Medical Diagnosis
Case Study: Disease Screening
Problem: A disease affects 1% of the population. A diagnostic test is 99% sensitive (correctly identifies 99% of those who have the disease) and 95% specific (correctly identifies 95% of those who don't). A random person tests positive. What is the probability they actually have the disease?
Data:
- P(Disease) = 0.01 (prior — base rate)
- P(No Disease) = 0.99
- P(Positive | Disease) = 0.99 (sensitivity)
- P(Positive | No Disease) = 0.05 (1 − specificity = false positive rate)
Calculation:
P(Positive) = P(Pos|Disease)×P(Disease) + P(Pos|No Disease)×P(No Disease)
= (0.99×0.01) + (0.05×0.99) = 0.0099 + 0.0495 = 0.0594
P(Disease | Positive) = (0.99 × 0.01) / 0.0594 = 0.0099 / 0.0594 ≈ 0.167 = 16.7%
Interpretation: Despite the highly accurate test, a positive result means only a 16.7% chance of actually having the disease. Why? Because the disease is rare (1% base rate), false positives outnumber true positives five to one in the population (0.0495 vs. 0.0099).
Decision: A positive result warrants further confirmatory testing rather than immediate treatment — understanding this prevents overdiagnosis and unnecessary harm.
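A minimal sketch of the calculation in Python follows. The function name `posterior` and its parameter names are illustrative. The second call is an added illustration of why confirmatory testing helps: it reuses the first posterior as the new prior, which assumes the second test's result is independent of the first given the person's true disease status. That assumption is not stated in the case study and may not hold for a real test.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Bayes' Theorem for two hypotheses (A or not A).

    prior               = P(A)
    likelihood          = P(B | A)
    false_positive_rate = P(B | not A)
    """
    evidence = likelihood * prior + false_positive_rate * (1 - prior)  # P(B)
    return likelihood * prior / evidence

# Disease screening: 1% base rate, 99% sensitivity, 5% false positive rate
after_one_positive = posterior(prior=0.01, likelihood=0.99, false_positive_rate=0.05)

# Assumption: a second, conditionally independent test with the same accuracy
after_two_positives = posterior(prior=after_one_positive, likelihood=0.99,
                                false_positive_rate=0.05)

print(f"After one positive test:  {after_one_positive:.3f}")
print(f"After two positive tests: {after_two_positives:.3f}")
```

Under that assumption, a second positive result raises the probability from about 17% to roughly 80%, because the first posterior replaces the 1% base rate as the starting point.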
Case Study 2 — Fraud Detection
Case Study: Financial Transaction Fraud
Problem: At a bank, 2% of all transactions are fraudulent. The fraud detection algorithm correctly flags 90% of actual fraudulent transactions (sensitivity) and generates false alarms for 3% of legitimate transactions. A transaction is flagged. What is the probability it is genuinely fraudulent?
Data:
- P(Fraud) = 0.02; P(Legitimate) = 0.98
- P(Alert | Fraud) = 0.90
- P(Alert | Legitimate) = 0.03
Calculation:
P(Alert) = (0.90 × 0.02) + (0.03 × 0.98) = 0.018 + 0.0294 = 0.0474
P(Fraud | Alert) = (0.90 × 0.02) / 0.0474 = 0.018 / 0.0474 ≈ 0.38 = 38%
Interpretation: Only 38% of flagged transactions are truly fraudulent, despite the algorithm's 90% sensitivity. The remaining 62% are false positives (legitimate transactions incorrectly flagged). This is critical for operational efficiency: the fraud team should expect roughly 1.6 false alarms for every genuine fraud case it catches (0.0294 / 0.018), or about 2.6 alerts in total per genuine case.
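Counting transactions instead of multiplying probabilities often makes the result easier to see. The sketch below applies the case-study rates to a hypothetical volume of 100,000 transactions (the volume is an assumption chosen only to give round numbers) and derives the workload ratios for the fraud team.

```python
transactions = 100_000                    # hypothetical volume, for round numbers only
p_fraud, sensitivity, false_alarm_rate = 0.02, 0.90, 0.03   # case-study rates

fraudulent = round(transactions * p_fraud)             # 2,000 fraudulent
legitimate = transactions - fraudulent                 # 98,000 legitimate
true_alerts = round(fraudulent * sensitivity)          # 1,800 frauds flagged
false_alerts = round(legitimate * false_alarm_rate)    # 2,940 legitimate flagged

p_fraud_given_alert = true_alerts / (true_alerts + false_alerts)
false_alarms_per_fraud = false_alerts / true_alerts
alerts_per_fraud = (true_alerts + false_alerts) / true_alerts

print(f"P(Fraud | Alert)              = {p_fraud_given_alert:.2f}")
print(f"False alarms per caught fraud = {false_alarms_per_fraud:.1f}")
print(f"Total alerts per caught fraud = {alerts_per_fraud:.1f}")
```

Of 4,740 alerts, 1,800 are genuine: about 1.6 false alarms, or 2.6 alerts in total, per fraud caught.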
Case Study 3 — Credit Risk
Case Study: Loan Application Assessment
Problem: Based on historical data, 8% of loan applicants default within 3 years. A credit bureau score below 600 occurs in 25% of applicants overall, but in 70% of those who eventually default. A new applicant has a score below 600. What is their probability of defaulting?
Data:
- P(Default) = 0.08; P(No Default) = 0.92
- P(Score <600 | Default) = 0.70
- P(Score <600 | No Default) = (0.25 − 0.08×0.70) / 0.92 ≈ 0.21
Calculation:
P(Score<600) = (0.70×0.08) + (0.21×0.92) = 0.056 + 0.193 = 0.249 ≈ 0.25
P(Default | Score<600) = (0.70 × 0.08) / 0.25 = 0.056 / 0.25 = 0.224 = 22.4%
Decision: The applicant's default probability has risen from the base rate of 8% to 22.4% based on their credit score. This materially affects pricing: the bank should charge a higher interest rate or require additional collateral to compensate for the elevated risk.
- Bayes' Theorem updates prior probabilities with new evidence to produce posterior probabilities.
- Low base rates (rare events) mean positive test results carry less information than intuition suggests — this is the base rate fallacy.
- The denominator P(B) uses the Total Probability Rule: sum across all mutually exclusive scenarios.
- Bayes' Theorem is the mathematical foundation of: spam filters, medical diagnostics, credit scoring, and Bayesian machine learning.
Section 4.6 — Probability Trees
Learning Goal
Build and interpret probability trees to visualise sequential events, calculate path probabilities, and support structured decision-making.
A probability tree is a diagram that maps all possible sequences of events, with branches labelled by their probabilities. Multiplying probabilities along a path gives the joint probability of that sequence. Adding across paths gives marginal probabilities.
How to Build a Probability Tree
Example — Product Launch Decision
A company launches a product. Market reception is either Strong (60%) or Weak (40%). If reception is strong, the probability of exceeding revenue target is 80%. If reception is weak, the probability of exceeding revenue target is 20%.
| Path | Market Reception | Revenue Result | Calculation | Path Probability |
|---|---|---|---|---|
| Path 1 | Strong (0.60) | Exceeds Target (0.80) | 0.60 × 0.80 | 0.48 |
| Path 2 | Strong (0.60) | Misses Target (0.20) | 0.60 × 0.20 | 0.12 |
| Path 3 | Weak (0.40) | Exceeds Target (0.20) | 0.40 × 0.20 | 0.08 |
| Path 4 | Weak (0.40) | Misses Target (0.80) | 0.40 × 0.80 | 0.32 |
| Total (must = 1.00) | | | | 1.00 ✓ |
P(Exceeds Revenue Target) = Path 1 + Path 3 = 0.48 + 0.08 = 0.56 = 56%
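The product launch tree can be represented as a nested dictionary and evaluated by multiplying along each path. This is one possible representation, not a standard one. The sketch also checks that the branches at every node sum to 1, and as an added step computes P(Strong | Exceeds Target) from the same path probabilities.

```python
from math import isclose

# First-stage branch -> (probability, second-stage branches with conditional probabilities)
tree = {
    "Strong": (0.60, {"Exceeds": 0.80, "Misses": 0.20}),
    "Weak":   (0.40, {"Exceeds": 0.20, "Misses": 0.80}),
}

# Branch probabilities at each node must sum to 1
assert isclose(sum(p for p, _ in tree.values()), 1.0)
for _, results in tree.values():
    assert isclose(sum(results.values()), 1.0)

# Multiply along each path to get its joint probability
paths = {
    (reception, result): p_reception * p_result
    for reception, (p_reception, results) in tree.items()
    for result, p_result in results.items()
}

total = sum(paths.values())
# Add across paths that end in "Exceeds" to get the marginal probability
p_exceeds = sum(p for (_, result), p in paths.items() if result == "Exceeds")
# Reverse the tree: how likely was a strong reception, given the target was exceeded?
p_strong_given_exceeds = paths[("Strong", "Exceeds")] / p_exceeds

for path, p in paths.items():
    print(path, round(p, 2))
print("P(Exceeds) =", round(p_exceeds, 2))
print("P(Strong | Exceeds) =", round(p_strong_given_exceeds, 3))
```

The last line is Bayes' Theorem read off the tree: 0.48 / 0.56 ≈ 0.857.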
Audit Sampling Tree
An auditor tests transactions where 10% contain errors. She tests two transactions independently. The tree produces:
| Transaction 1 | Transaction 2 | Probability |
|---|---|---|
| Error (0.10) | Error (0.10) | 0.10 × 0.10 = 0.01 |
| Error (0.10) | No Error (0.90) | 0.10 × 0.90 = 0.09 |
| No Error (0.90) | Error (0.10) | 0.90 × 0.10 = 0.09 |
| No Error (0.90) | No Error (0.90) | 0.90 × 0.90 = 0.81 |
| Total | | 1.00 ✓ |
P(At least one error) = 1 − P(No errors) = 1 − 0.81 = 0.19 = 19%
- Probability trees map all possible sequential outcomes — use them when events occur in stages.
- Multiply probabilities along a path to get the joint probability of that sequence.
- Add relevant path probabilities to find the probability of composite events (e.g., "at least one").
- Branch probabilities at each node must always sum to exactly 1.
Section 4.7 — Probability Distributions
Learning Goal
Identify, select, and apply the four key probability distributions — Binomial, Poisson, Normal, and Uniform — to model real-world data in finance, operations, and research.
A probability distribution is a mathematical function that describes every possible value a random variable can take and the probability associated with each value. Distributions are the bridge between individual probability calculations and statistical inference, hypothesis testing, and predictive modelling.
Overview of Key Distributions
| Distribution | Type | Variable | Key Parameter | Classic Application |
|---|---|---|---|---|
| Binomial | Discrete | Count of successes in n trials | n (trials), p (probability) | Loan defaults, quality defects, audit errors |
| Poisson | Discrete | Count of events in fixed interval | λ (average rate) | System failures, fraud events per day, call arrivals |
| Normal | Continuous | Symmetric, bell-shaped | μ (mean), σ (std dev) | Stock returns, heights, exam scores |
| Uniform | Continuous | Equal probability across range | a (min), b (max) | Simulation inputs, random number generation |
Binomial Distribution
Conditions (all four must hold):
- Fixed number of trials: n
- Each trial has only two outcomes: success or failure
- Probability of success p is constant across all trials
- Trials are independent of each other
Worked Example — Credit Defaults
A bank has 10 small business loans, each with an independent 15% probability of defaulting in the next year. What is the probability that exactly 2 of the 10 loans default?
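The question can be answered with the Binomial probability mass function, P(X = k) = C(n, k) × p^k × (1 − p)^(n − k), where C(n, k) counts the ways to choose which k of the n loans default. The sketch below uses only the Python standard library. The function name `binomial_pmf` is illustrative, and the cumulative figure is an added extra.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.15                      # 10 loans, 15% default probability each

p_exactly_2 = binomial_pmf(2, n, p)
p_at_most_2 = sum(binomial_pmf(k, n, p) for k in range(3))   # k = 0, 1, 2

print(f"P(exactly 2 defaults) = {p_exactly_2:.4f}")
print(f"P(at most 2 defaults) = {p_at_most_2:.4f}")
```

By hand: C(10, 2) = 45, 0.15² = 0.0225, and 0.85⁸ ≈ 0.2725, so P(X = 2) ≈ 45 × 0.0225 × 0.2725 ≈ 0.276, or about 27.6%.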
Poisson Distribution
Use when: Counting rare events over a fixed interval of time, space, or volume, where events occur independently and at a constant average rate.
Worked Example — Fraud Events
A bank experiences an average of 3 fraudulent transactions per day. What is the probability of exactly 5 fraudulent transactions occurring on a given day?
P(X = k) = (e^(−λ) × λ^k) / k!
λ = 3, k = 5
P(X=5) = (e⁻³ × 3⁵) / 5! = (0.0498 × 243) / 120 = 12.1 / 120 ≈ 0.1008 = 10.1%
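The same Poisson calculation can be sketched with the standard library, using P(X = k) = e^(−λ) × λ^k / k! with an average rate of 3 per day. The function name `poisson_pmf` is illustrative. The tail probability of five or more events is an added extra, obtained with the Complement Rule.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam): e^(-lam) * lam^k / k!"""
    return exp(-lam) * lam**k / factorial(k)

lam = 3                                   # average fraudulent transactions per day

p_exactly_5 = poisson_pmf(5, lam)
# Complement Rule: P(X >= 5) = 1 - P(X <= 4)
p_5_or_more = 1 - sum(poisson_pmf(k, lam) for k in range(5))

print(f"P(exactly 5) = {p_exactly_5:.4f}")
print(f"P(5 or more) = {p_5_or_more:.4f}")
```

A day with five or more fraud events is therefore not especially rare (roughly 18% of days) even though the average is three.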
Normal Distribution
The Empirical Rule (68-95-99.7 Rule):
| Range | % of Data Included | Finance Example (Portfolio Return μ=8%, σ=5%) |
|---|---|---|
| μ ± 1σ | 68.27% | Returns between 3% and 13% |
| μ ± 2σ | 95.45% | Returns between −2% and 18% |
| μ ± 3σ | 99.73% | Returns between −7% and 23% |
Worked Example — Portfolio Returns (Value at Risk)
A portfolio has mean annual return μ = 10%, standard deviation σ = 15%. What return is at the 5th percentile (the threshold below which the worst 5% of years fall)?
Z at 5th percentile = −1.645
X = μ + Z×σ = 10% + (−1.645 × 15%) = 10% − 24.7% = −14.7%
Interpretation: In the worst 5% of years, this portfolio loses more than 14.7%. This is the 5% VaR (Value at Risk) — a key regulatory capital measure.
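Python's standard library includes `statistics.NormalDist` (Python 3.8+), which can reproduce this percentile without a Z-table. The sketch below assumes, as the example does, that annual returns are normally distributed. The probability of a negative year and the check of the 68% rule are added illustrations.

```python
from statistics import NormalDist

returns = NormalDist(mu=0.10, sigma=0.15)     # annual return: mean 10%, std dev 15%

var_5pct = returns.inv_cdf(0.05)              # 5th percentile of annual return
p_negative_year = returns.cdf(0.0)            # P(return < 0)

# Empirical Rule check: share of outcomes within one standard deviation of the mean
within_1_sd = returns.cdf(0.10 + 0.15) - returns.cdf(0.10 - 0.15)

print(f"5th percentile return = {var_5pct:.1%}")
print(f"P(negative year)      = {p_negative_year:.1%}")
print(f"Within 1 std dev      = {within_1_sd:.2%}")
```

Real return distributions often have fatter tails than the Normal, so a figure computed this way is better read as a lower bound on tail risk than as a precise estimate.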
Uniform Distribution
Example:
A payment processing time is uniformly distributed between 1 and 5 days. What is the probability processing takes more than 3 days?
P(X > 3) = (5 − 3) / (5 − 1) = 2/4 = 0.50 = 50%
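For a Uniform distribution, the probability of any interval is its length divided by the full range (b − a). The sketch below applies this to the payment example and adds the mean and variance from the standard formulas (a + b)/2 and (b − a)²/12. The function name `uniform_prob_between` is illustrative.

```python
def uniform_prob_between(lo, hi, a, b):
    """P(lo < X < hi) for X uniform on [a, b]: overlap length divided by (b - a)."""
    if b <= a:
        raise ValueError("require a < b")
    overlap = max(0.0, min(hi, b) - max(lo, a))   # part of (lo, hi) inside [a, b]
    return overlap / (b - a)

a, b = 1, 5                                        # processing time in days

p_more_than_3 = uniform_prob_between(3, b, a, b)   # (5 - 3) / (5 - 1)
mean_days = (a + b) / 2
variance_days = (b - a) ** 2 / 12

print(p_more_than_3, mean_days, round(variance_days, 3))
```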
Complete Distribution Comparison
| Feature | Binomial | Poisson | Normal | Uniform |
|---|---|---|---|---|
| Type | Discrete | Discrete | Continuous | Continuous |
| Parameters | n, p | λ | μ, σ | a, b |
| Mean | np | λ | μ | (a+b)/2 |
| Variance | np(1−p) | λ | σ² | (b−a)²/12 |
| Shape | Skewed to symmetric | Right-skewed for small λ | Symmetric bell | Flat rectangle |
| Finance use | Loan defaults, option pricing (CRR) | Operational loss events | Asset returns, portfolio risk | Simulation inputs |
| Audit use | Sampling for attribute errors | Error events per time period | Continuous control metrics | Random sampling selection |
| Key assumption | Independent trials, constant p | Rare, independent events at constant rate | Symmetric, many small influences | Equally likely outcomes |
| Limitation | Binary outcomes only | Only for count data | Not for fat-tailed financial data | Real data rarely uniform |
- Binomial: fixed trials, binary outcome, constant probability, independent — use for counts of successes.
- Poisson: count rare events per interval where λ = mean rate; mean equals variance.
- Normal: symmetric, bell-shaped, defined by mean and standard deviation — 68-95-99.7 rule is fundamental.
- Uniform: equal likelihood across a range — baseline model and simulation input.
- Match the distribution to the data type and generating process, not just the shape of the data.
Section 4.8 — Real-World Applications
Probability in Finance
| Application | Probability Concept Used | How It Works |
|---|---|---|
| Value at Risk (VaR) | Normal distribution | Find return threshold below which losses occur with P = 1%, 5% |
| Probability of Default (PD) | Empirical + Bayes | Estimate P(default) from historical data, updated by credit signals |
| Option Pricing (Black-Scholes) | Log-normal distribution | Asset prices assumed to follow log-normal process |
| Portfolio Diversification | Joint probability / correlation | P(both assets fall) depends on their dependence structure |
| Monte Carlo Simulation | All distributions | Generate thousands of random scenarios to estimate risk distributions |
Probability in Auditing
| Audit Risk Component | Probability Concept | Formula | Typical Value |
|---|---|---|---|
| Inherent Risk (IR) | Prior probability of error | Assessed judgmentally | 40–80% |
| Control Risk (CR) | P(controls fail to prevent error) | Assessed from control testing | 20–60% |
| Detection Risk (DR) | P(auditor misses existing error) | AR / (IR × CR) | Set to achieve AR ≤ 5% |
| Audit Risk (AR) | Joint probability | AR = IR × CR × DR | ≤ 5% (professional standard) |
| Sampling Risk | Confidence intervals | Based on sample size and tolerable error rate | 5% standard in most audits |
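The audit risk model in the table, AR = IR × CR × DR, can be rearranged to find the detection risk an auditor can tolerate. The sketch below uses a 5% audit risk target and three illustrative IR/CR pairs drawn from the typical ranges in the table. The pairs are assumptions for illustration, not figures from a specific engagement.

```python
def detection_risk(audit_risk, inherent_risk, control_risk):
    """Solve AR = IR x CR x DR for DR, capped at 1 because DR is a probability."""
    if not (0 < inherent_risk <= 1 and 0 < control_risk <= 1):
        raise ValueError("IR and CR must lie in (0, 1]")
    return min(1.0, audit_risk / (inherent_risk * control_risk))

target_ar = 0.05
scenarios = [(0.80, 0.60), (0.60, 0.40), (0.40, 0.20)]   # (IR, CR), illustrative

for ir, cr in scenarios:
    dr = detection_risk(target_ar, ir, cr)
    print(f"IR={ir:.0%}  CR={cr:.0%}  ->  allowable DR={dr:.1%}")
```

Higher inherent and control risk leave less room for detection risk, so the auditor has to perform more substantive testing to keep overall audit risk at the target.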
Probability in Business
| Business Decision | Probability Tool | Decision Rule |
|---|---|---|
| New product launch | Decision tree with probabilities | Launch if Expected Value > launch cost |
| Inventory management | Poisson distribution for demand | Set safety stock to cover P(stockout) ≤ 5% |
| Customer churn prediction | Logistic regression / Bayes | Flag customers with P(churn) > threshold for retention campaigns |
| Insurance pricing | Actuarial probability models | Premium = E[claim] + risk loading + operating margin |
| Supplier selection | Conditional probability | Select supplier with lowest P(late delivery | large order) |
Section 4.9 — Case Study: Loan Default Risk
Full Case Study: Community Bank Loan Portfolio Risk Assessment
1. Business Problem
A community bank holds a portfolio of 500 small business loans. The risk management team needs to estimate the probability of default events in the coming year to determine required loan loss provisions (regulatory capital) and to set loan approval criteria for new applications.
2. Available Data
| Data Point | Value | Source |
|---|---|---|
| Portfolio size | 500 loans | Loan management system |
| Historical default rate (overall) | 6% | 5-year internal records |
| Sector A (retail) — 200 loans | 9% historical default rate | Credit analysis |
| Sector B (manufacturing) — 300 loans | 4% historical default rate | Credit analysis |
| New applicant credit score | Below 550 | Credit bureau |
| P(Score < 550 | Default) | 0.65 | Historical model |
| P(Score < 550 | No Default) | 0.08 | Historical model |
3. Step 1 — Expected Number of Defaults (Binomial)
Treating defaults as approximately independent Binomial events:
Sector A (Retail): n = 200, p = 0.09 → E[defaults] = 200 × 0.09 = 18 loans
Sector B (Manufacturing): n = 300, p = 0.04 → E[defaults] = 300 × 0.04 = 12 loans
Portfolio total expected defaults: 18 + 12 = 30 loans out of 500 (6%)
4. Step 2 — Probability of 35 or More Defaults (Risk Tail)
Using Normal approximation to Binomial (n=500, p=0.06):
Mean = 500 × 0.06 = 30; SD = √(500 × 0.06 × 0.94) = √28.2 ≈ 5.31
Z = (35 − 30) / 5.31 ≈ 0.94 → P(X ≥ 35) ≈ 1 − Φ(0.94) = 1 − 0.826 = 17.4%
Interpretation: There is roughly a 17% chance the portfolio experiences 35 or more defaults, a materially elevated loss scenario the bank must provision for.
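The tail probability can be checked in code. For n = 500 and p = 0.06, the Binomial standard deviation is √(500 × 0.06 × 0.94) = √28.2 ≈ 5.31. The sketch below computes the Normal approximation with and without a continuity correction, plus the exact Binomial tail for comparison. It keeps the case study's simplifying assumption that defaults are independent with a common 6% probability.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, threshold = 500, 0.06, 35

mean = n * p                                   # 30 expected defaults
sd = sqrt(n * p * (1 - p))                     # sqrt(28.2), about 5.31

z = (threshold - mean) / sd
p_tail_normal = 1 - NormalDist().cdf(z)        # plain Normal approximation

# Continuity correction: treat "35 or more" as "above 34.5"
z_cc = (threshold - 0.5 - mean) / sd
p_tail_corrected = 1 - NormalDist().cdf(z_cc)

# Exact Binomial tail: 1 - P(X <= 34)
p_tail_exact = 1 - sum(
    comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(threshold)
)

print(f"SD = {sd:.2f}, Z = {z:.2f}")
print(f"Normal approximation:       {p_tail_normal:.3f}")
print(f"With continuity correction: {p_tail_corrected:.3f}")
print(f"Exact Binomial:             {p_tail_exact:.3f}")
```

The plain approximation gives about 17%. The continuity-corrected and exact figures are somewhat higher, which is typical when approximating a discrete count with a continuous curve.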
5. Step 3 — Bayes' Theorem for New Applicant
A new applicant has a credit score below 550. Using Bayes' Theorem:
Prior P(Default) = 0.06
P(Score <550) = (0.65×0.06) + (0.08×0.94) = 0.039 + 0.0752 = 0.1142
P(Default | Score<550) = (0.65 × 0.06) / 0.1142 = 0.039 / 0.1142 = 0.342 = 34.2%
6. Risk Assessment and Decision
| Scenario | Default Probability | Decision Recommendation |
|---|---|---|
| Average portfolio loan | 6.0% | Standard approval process |
| Retail sector loan | 9.0% | Enhanced monitoring, higher rate |
| New applicant (score <550) | 34.2% | Decline or require significant collateral |
| Portfolio tail (≥35 defaults) | 17.4% | Maintain elevated loan loss provisions |
7. Recommendations
- Set loan loss provision to cover the 99th percentile scenario (Z=2.33), using SD = √(500 × 0.06 × 0.94) ≈ 5.31: 30 + 2.33×5.31 ≈ 42.4 → provision for 43 defaults.
- Apply sector-specific approval criteria: retail applicants face a higher default rate and require adjusted pricing or additional covenants.
- Implement Bayesian score-based credit decision rule: applicants with score <550 face a 34.2% default rate — well above the bank's risk tolerance threshold of 10%.
- Review portfolio concentration: 40% retail exposure driving disproportionate risk suggests sector diversification is warranted.
Common Probability Mistakes — The Critical 15
| # | Mistake | Why It Happens | Correct Approach |
|---|---|---|---|
| 1 | Using accuracy over base rate in Bayes' problems | Ignoring base rates (how rare the event is) | Always identify P(A) before applying Bayes |
| 2 | Confusing P(B|A) with P(A|B) | Assuming conditional probability is symmetric | These are different — always check which direction you need |
| 3 | Multiplying probabilities of dependent events as if they were independent | Assuming events are independent when they're not | Test independence: check if P(A∩B) = P(A)×P(B) |
| 4 | Forgetting to subtract P(A∩B) in Addition Rule | Counting overlapping outcomes twice | Use P(A∪B) = P(A)+P(B)−P(A∩B) always |
| 5 | Probability greater than 1 or less than 0 | Algebraic errors in calculation | If result outside [0,1], recheck; it's always invalid |
| 6 | Gambler's Fallacy — "it's overdue" | Misunderstanding independence over time | Each fair coin flip is independent; past results don't change future P |
| 7 | Applying Binomial when events are not independent | Misidentifying independence | Verify all four Binomial conditions before applying |
| 8 | Using Normal distribution for small samples | Assuming normality without justification | Check n ≥ 30 and data symmetry; use t-distribution for small n |
| 9 | Confusing "or" (union) with "and" (intersection) | Language ambiguity in probability problems | "Or" = ∪ (at least one); "And" = ∩ (both) |
| 10 | Equating probability with certainty at extreme values | Treating P=0.99 as certain | Even P=0.99 events fail 1% of the time; plan for tail risk |
| 11 | Ignoring conditional probability in sequential events | Using unconditional probabilities throughout a tree | Update probabilities at each node based on prior outcomes |
| 12 | Misusing the Complement Rule for joint events | Taking complement of a multi-event expression incorrectly | P(at least one) = 1 − P(none) — use this carefully |
| 13 | Confusing Poisson and Binomial for count data | Both model counts so seem interchangeable | Binomial: fixed n, known p. Poisson: events per interval, no fixed number of trials |
| 14 | Treating subjective probability as precise | Expert estimates reported with false precision | Express as ranges or sensitivity analyses; acknowledge uncertainty |
| 15 | Not verifying tree branch probabilities sum to 1 | Arithmetic errors in building trees | Always check: each node's branches must sum exactly to 1.00 |
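Two of the checks in the table above (the independence test in row 3 and the branch-sum check in row 15) are easy to automate. The following is a minimal Python sketch; the helper names `is_independent` and `branches_sum_to_one` are illustrative and not part of any library, and the tolerance is an assumed value chosen to absorb floating-point rounding.

```python
import math

TOL = 1e-9  # assumed tolerance for floating-point comparison


def is_independent(p_a: float, p_b: float, p_a_and_b: float) -> bool:
    """Return True when P(A and B) equals P(A) * P(B) within tolerance."""
    return math.isclose(p_a_and_b, p_a * p_b, abs_tol=TOL)


def branches_sum_to_one(branch_probs) -> bool:
    """Return True when the branches leaving one tree node sum to 1."""
    return math.isclose(sum(branch_probs), 1.0, abs_tol=TOL)


print(is_independent(0.5, 0.4, 0.20))   # True: 0.5 * 0.4 = 0.20
print(is_independent(0.3, 0.5, 0.20))   # False: 0.3 * 0.5 = 0.15
print(branches_sum_to_one([0.6, 0.4]))  # True
print(branches_sum_to_one([0.6, 0.3]))  # False
```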
- Probability measures likelihood: P(E) ∈ [0,1]; 0 = impossible, 1 = certain
- Three rules are the core toolkit: Complement (1−P), Addition (∪ with overlap), Multiplication (∩ with conditioning)
- Conditional probability P(B|A): Always divide by P(A); never confuse with P(A|B)
- Bayes' Theorem: Updates prior beliefs with new evidence; base rates critically matter
- Trees: Multiply along paths; add across paths for composite events
- Distributions: Binomial (fixed trials), Poisson (rate-based counts), Normal (continuous symmetric), Uniform (equal likelihood)
- Applications: VaR uses Normal; PD models use Bayes/empirical; audit risk uses multiplication rule; Binomial drives sampling theory
Section 4.10 — Practice and Assessment
Part A — 15 Numerical Practice Problems (with Solutions)
Work each problem independently before reading the solution. Show all steps.
P1. Basic Probability — Card Draw
A standard deck has 52 cards. What is the probability of drawing a red King?
Solution: Red Kings = 2 (King of Hearts + King of Diamonds). P = 2/52 = 1/26 ≈ 0.0385
P2. Complement Rule — Project Delivery
A project has a 35% chance of being delivered late. What is the probability it is delivered on time?
Solution: P(on time) = 1 − 0.35 = 0.65 = 65%
P3. Addition Rule — Survey Results
In a customer survey: P(satisfied) = 0.70, P(would recommend) = 0.65, P(satisfied AND would recommend) = 0.55. Find P(satisfied OR would recommend).
Solution: P(S∪R) = 0.70 + 0.65 − 0.55 = 0.80 = 80%
P4. Multiplication Rule — Audit Test
An auditor tests two transactions. P(Transaction 1 has error) = 0.08. P(Transaction 2 has error) = 0.08. Both are independent. Find P(both have errors).
Solution: P = 0.08 × 0.08 = 0.0064 = 0.64%
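The three basic rules used in P2 to P4 can be sketched as small Python functions. The function names are illustrative only, and `joint_independent` assumes independence, as P4 states.

```python
import math


def complement(p_event: float) -> float:
    """Complement rule: P(not E) = 1 - P(E)."""
    return 1.0 - p_event


def union(p_a: float, p_b: float, p_both: float) -> float:
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both


def joint_independent(p_a: float, p_b: float) -> float:
    """Multiplication rule for independent events: P(A and B) = P(A) * P(B)."""
    return p_a * p_b


p_on_time = complement(0.35)                   # P2
p_sat_or_rec = union(0.70, 0.65, 0.55)         # P3
p_both_errors = joint_independent(0.08, 0.08)  # P4

print(round(p_on_time, 4), round(p_sat_or_rec, 4), round(p_both_errors, 4))
# 0.65 0.8 0.0064
```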
P5. Conditional Probability — Insurance
Of 1,000 policyholders: 200 are under 25 years old; 40 of those under 25 filed claims; 80 of the 800 over 25 filed claims. Find P(claim | under 25).
Solution: P(claim | under 25) = 40/200 = 0.20 = 20%. Compare: P(claim | over 25) = 80/800 = 0.10 = 10%. Policyholders under 25 claim at double the rate of those over 25.
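When the data arrive as counts, as in P5, the conditional probability is just the count in both groups divided by the count in the conditioning group. A minimal sketch, using exact fractions to avoid rounding; `conditional_from_counts` is a hypothetical helper name:

```python
from fractions import Fraction


def conditional_from_counts(n_both: int, n_given: int) -> Fraction:
    """P(B | A) estimated as count(A and B) / count(A)."""
    if n_given == 0:
        raise ValueError("conditioning group is empty")
    return Fraction(n_both, n_given)


p_claim_under25 = conditional_from_counts(40, 200)
p_claim_over25 = conditional_from_counts(80, 800)
ratio = p_claim_under25 / p_claim_over25

print(float(p_claim_under25), float(p_claim_over25), ratio)  # 0.2 0.1 2
```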
P6. Bayes' Theorem — Product Defects
Machine A produces 60% of output; Machine B produces 40%. Machine A has a 3% defect rate; Machine B has a 5% defect rate. A defective item is found. What is the probability it came from Machine B?
Solution: P(defect) = 0.60×0.03 + 0.40×0.05 = 0.018 + 0.020 = 0.038
P(B | defect) = (0.05×0.40)/0.038 = 0.020/0.038 = 0.526 = 52.6%
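The two steps in this solution (total probability of the evidence, then Bayes' Theorem) can be sketched in Python. The function name `posterior` and the dictionary layout are assumptions made for illustration.

```python
def posterior(priors: dict, likelihoods: dict, target: str) -> float:
    """Bayes' Theorem over a set of mutually exclusive, exhaustive sources.

    priors[h]      = P(h), e.g. each machine's share of output
    likelihoods[h] = P(evidence | h), e.g. each machine's defect rate
    """
    # Law of total probability: P(evidence) = sum of P(h) * P(evidence | h)
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    return priors[target] * likelihoods[target] / evidence


share = {"A": 0.60, "B": 0.40}
defect_rate = {"A": 0.03, "B": 0.05}

p_b_given_defect = posterior(share, defect_rate, "B")
print(round(p_b_given_defect, 3))  # 0.526
```

Machine B makes less of the output but accounts for more than half of the defects, because its higher defect rate outweighs its smaller share.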
P7. Binomial — Quality Sampling
A sample of 8 items is drawn from a batch with 20% defect rate. Find P(exactly 3 defective).
Solution: P(X=3) = C(8,3)×(0.20)³×(0.80)⁵ = 56×0.008×0.3277 = 0.1468 ≈ 14.7%
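The Binomial calculation above can be reproduced with Python's standard library. This is a sketch; `binomial_pmf` is an illustrative name, and `math.comb` requires Python 3.8 or later.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


p_three_defective = binomial_pmf(k=3, n=8, p=0.20)
print(round(p_three_defective, 4))  # 0.1468
```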
P8. Poisson — Server Failures
A server experiences 2 crashes per month on average. Find P(exactly 4 crashes next month).
Solution: λ=2, k=4. P = e⁻²×2⁴/4! = 0.1353×16/24 = 0.0902 ≈ 9.0%
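The Poisson probability for P8 can be checked the same way. In this sketch `lam` stands for the average rate (2 crashes per month) and `poisson_pmf` is an illustrative name.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam): e^(-lam) * lam^k / k!."""
    return math.exp(-lam) * lam**k / math.factorial(k)


p_four_crashes = poisson_pmf(k=4, lam=2.0)
print(round(p_four_crashes, 4))  # 0.0902
```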
P9. Normal Distribution — Salary Analysis
Salaries are normally distributed: μ = £45,000, σ = £8,000. What fraction of employees earns more than £53,000?
Solution: Z = (53,000 − 45,000)/8,000 = 1.00. P(X > 53,000) = 1 − Φ(1.00) = 1 − 0.8413 = 0.1587 = 15.87%
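Instead of reading the standard Normal table, the tail probability for P9 can be computed with `statistics.NormalDist` from the Python standard library (3.8+). A minimal sketch; the variable names are illustrative.

```python
from statistics import NormalDist

salaries = NormalDist(mu=45_000, sigma=8_000)

# Z-score of the threshold, then the upper-tail probability 1 - CDF.
z = (53_000 - salaries.mean) / salaries.stdev
p_above = 1 - salaries.cdf(53_000)

print(z, round(p_above, 4))  # 1.0 0.1587
```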
P10. At Least One — System Reliability
Three independent backup systems each have a 5% failure probability. Find P(at least one fails).
Solution: P(none fail) = 0.95³ = 0.857. P(at least one fails) = 1 − 0.857 = 0.143 = 14.3%
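The "at least one" pattern generalises to any number of independent components, including ones with different failure probabilities. A sketch under the independence assumption stated in P10; `p_at_least_one` is a hypothetical helper, and `math.prod` requires Python 3.8 or later.

```python
import math


def p_at_least_one(probs) -> float:
    """P(at least one event occurs) = 1 - P(none occur), assuming independence."""
    p_none = math.prod(1.0 - p for p in probs)
    return 1.0 - p_none


p_any_failure = p_at_least_one([0.05, 0.05, 0.05])
print(round(p_any_failure, 3))  # 0.143
```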
P11. Binomial — Loan Portfolio
A portfolio of 20 loans each has P(default) = 0.05. What is the expected number of defaults and standard deviation?
Solution: E[X] = 20×0.05 = 1 loan. SD = √(20×0.05×0.95) = √0.95 = 0.975 loans
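The shortcut formulas E[X] = np and SD = √(np(1−p)) can be cross-checked against the definition of expectation by summing over the full Binomial distribution. This sketch assumes, as the problem does implicitly, that the 20 loans default independently; the names are illustrative.

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 20, 0.05

# Shortcut formulas
mean_formula = n * p
sd_formula = math.sqrt(n * p * (1 - p))

# Direct computation from the definition: sum over all outcomes k = 0..n
mean_direct = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
var_direct = sum((k - mean_direct) ** 2 * binomial_pmf(k, n, p) for k in range(n + 1))

print(round(mean_formula, 3), round(sd_formula, 3))  # 1.0 0.975
```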
P12. Conditional + Bayes — Stress Test
Before a recession: P(firm fails) = 0.04. P(negative cash flow | failure) = 0.80. P(negative cash flow | survival) = 0.15. A firm shows negative cash flow. Find P(failure | negative cash flow).
Solution: P(NCF) = 0.80×0.04 + 0.15×0.96 = 0.032+0.144 = 0.176
P(fail|NCF) = 0.032/0.176 = 0.182 = 18.2%
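The same answer can be reached with natural frequencies, which some readers find easier to follow than the formula: imagine a cohort of firms and count how many land in each cell. The cohort size of 1,000 below is an arbitrary assumption for illustration; any size gives the same ratio.

```python
from fractions import Fraction

cohort = 1000  # assumed cohort size, chosen only to give whole numbers

p_fail = Fraction("0.04")
p_ncf_given_fail = Fraction("0.80")
p_ncf_given_survive = Fraction("0.15")

failing = cohort * p_fail                        # 40 firms fail
surviving = cohort - failing                     # 960 firms survive
ncf_and_fail = failing * p_ncf_given_fail        # 32 failing firms show NCF
ncf_and_survive = surviving * p_ncf_given_survive  # 144 surviving firms show NCF

# Of all firms showing negative cash flow, what share goes on to fail?
p_fail_given_ncf = ncf_and_fail / (ncf_and_fail + ncf_and_survive)
print(p_fail_given_ncf, round(float(p_fail_given_ncf), 3))  # 2/11 0.182
```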
P13. Normal — Inventory Management
Daily demand is normally distributed: μ=100 units, σ=15 units. What safety stock is needed so that stockouts occur no more than 2.5% of the time?
Solution: Need P(demand ≤ stock) = 0.975 → Z = 1.96. Stock = 100 + 1.96×15 = 100 + 29.4 = 129.4, rounded up to 130 units (safety stock of about 30 units)
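P13 runs the Normal calculation in reverse: start from a target probability and find the value. A minimal sketch using `NormalDist.inv_cdf` from the standard library; the variable names are illustrative, and rounding up to a whole unit is an assumption about how stock is held.

```python
import math
from statistics import NormalDist

demand = NormalDist(mu=100, sigma=15)
service_level = 0.975  # stockouts on at most 2.5% of days

# Inverse CDF: the stock level that demand stays below 97.5% of the time.
stock_level = demand.inv_cdf(service_level)
safety_stock = stock_level - demand.mean

print(round(stock_level, 1), round(safety_stock, 1))  # 129.4 29.4
print(math.ceil(stock_level))                         # 130
```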
P14. Probability Tree — Two-Stage Tender
Stage 1 success probability: 0.60. If Stage 1 succeeded, Stage 2 success probability: 0.70. Find P(both stages successful).
Solution: P = 0.60×0.70 = 0.42 = 42%
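The full tree behind P14 can be laid out as a dictionary of path probabilities. This sketch assumes that a failure at Stage 1 ends the tender (the problem gives no Stage 2 probability for that branch), so the tree has three terminal paths.

```python
import math

p_stage1 = 0.60               # P(Stage 1 succeeds)
p_stage2_given_stage1 = 0.70  # P(Stage 2 succeeds | Stage 1 succeeded)

# Multiply along each path from the root to a terminal outcome.
paths = {
    "stage 1 fails": 1 - p_stage1,
    "stage 1 succeeds, stage 2 fails": p_stage1 * (1 - p_stage2_given_stage1),
    "both stages succeed": p_stage1 * p_stage2_given_stage1,
}

for label, prob in paths.items():
    print(f"{label}: {prob:.2f}")

# Sanity check: terminal path probabilities must sum to 1.
total = sum(paths.values())
```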
P15. Empirical Probability — Audit Sample
An auditor reviews 250 invoices and finds 18 with errors. Based on this sample, estimate the probability that a randomly selected invoice contains an error.
Solution: P(error) = 18/250 = 0.072 = 7.2% (empirical probability). If the tolerable error rate is 5%, this sample rate exceeds it, indicating a possible control weakness that requires further investigation.
Part B — 25 Multiple Choice Questions
| # | Question | Answer | Explanation |
|---|---|---|---|
| 1 | P(A) = 0.40. What is P(A')? | B) 0.60 | Complement: 1 − 0.40 = 0.60 |
| 2 | P(A) = 0.3, P(B) = 0.5, P(A∩B) = 0.2. P(A∪B) = ? | C) 0.60 | 0.3+0.5−0.2 = 0.60 |
| 3 | Which distribution models the number of successes in 10 independent coin flips? | A) Binomial | Fixed n, binary outcome, constant p, independent |
| 4 | P(B|A) = 0.6, P(A) = 0.4. P(A∩B) = ? | B) 0.24 | P(A∩B) = P(B|A)×P(A) = 0.6×0.4 = 0.24 |
| 5 | A disease occurs in 2% of the population. A test is 95% accurate (95% sensitivity and 95% specificity). A positive test result's probability of being a true positive is: | C) Less than 30% | Base rate fallacy: with low prevalence, false positives outnumber true positives; here PPV = 0.019/(0.019+0.049) ≈ 27.9% |
| 6 | In the Normal distribution, what % of values fall within μ ± 2σ? | B) 95.45% | Empirical rule: 68-95.45-99.73 |
| 7 | Two events cannot happen simultaneously. They are: | A) Mutually exclusive | Mutually exclusive: P(A∩B) = 0 |
| 8 | The mean and variance of a Poisson distribution with λ=5 are: | C) Both equal 5 | Poisson: mean = variance = λ |
| 9 | Which probability type relies on historical data? | B) Empirical | Empirical = observed frequency / total trials |
| 10 | In Bayes' Theorem, P(A) before observing evidence is called the: | A) Prior probability | Posterior is the updated probability after evidence |
| 11 | P(A) = 0.5, P(B) = 0.4, events independent. P(A∩B) = ? | B) 0.20 | Independent: P(A∩B) = 0.5×0.4 = 0.20 |
| 12 | Which distribution applies to: "number of customer calls in one hour"? | C) Poisson | Count of independent events per fixed time interval at constant rate |
| 13 | Audit risk = IR × CR × DR. If IR=0.8, CR=0.5, desired AR=0.05, what must DR be? | B) 0.125 | DR = 0.05/(0.8×0.5) = 0.05/0.4 = 0.125 |
| 14 | A uniform distribution spans [2, 10]. Its mean is: | C) 6 | Mean = (2+10)/2 = 6 |
| 15 | What is the probability of rolling a sum of 7 with two dice? | B) 1/6 | 6 favorable outcomes out of 36: (1,6),(2,5),(3,4),(4,3),(5,2),(6,1) |
| 16 | P(not getting a defect) = 0.92. P(getting a defect) = ? | A) 0.08 | Complement: 1 − 0.92 = 0.08 |
| 17 | n=100, p=0.05 Binomial. Expected value and variance? | C) E=5, Var=4.75 | E=np=5; Var=np(1-p)=100×0.05×0.95=4.75 |
| 18 | If P(A|B) = P(A), then A and B are: | B) Independent | Independence definition: conditioning on B doesn't change P(A) |
| 19 | P(at least one success in 3 independent trials, p=0.3) = ? | C) 0.657 | 1 − P(none) = 1 − 0.7³ = 1 − 0.343 = 0.657 |
| 20 | Which metric measures the spread of a Normal distribution? | A) Standard deviation (σ) | σ determines the width of the bell curve |
| 21 | Bayes' Theorem is most useful when: | C) Updating probabilities as new evidence arrives | Posterior = (Likelihood × Prior) / Evidence |
| 22 | A probability tree path shows: 0.4 → 0.7. The path probability is: | B) 0.28 | Multiply along path: 0.4 × 0.7 = 0.28 |
| 23 | Which distribution is appropriate for continuous data with equal probability across all values in [a,b]? | D) Uniform | Uniform distribution: f(x) = 1/(b-a) for all x in [a,b] |
| 24 | The formula P(A∩B) = P(A) × P(B|A) is the: | A) General Multiplication Rule | Holds for both dependent and independent events |
| 25 | Z-score = 1.96 corresponds to approximately what percentile? | C) 97.5th percentile | Φ(1.96) ≈ 0.975; used in 95% confidence intervals |
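Several answers in the table can be verified by brute force rather than by formula. As one sketch, the snippet below enumerates the sample space for question 15 (sum of two dice) and rearranges the audit risk model for question 13; the variable names are illustrative.

```python
import math
from fractions import Fraction
from itertools import product

# Question 15: enumerate all 36 equally likely outcomes of two dice.
rolls = list(product(range(1, 7), repeat=2))
favourable = [roll for roll in rolls if sum(roll) == 7]
p_seven = Fraction(len(favourable), len(rolls))
print(len(favourable), len(rolls), p_seven)  # 6 36 1/6

# Question 13: AR = IR * CR * DR, solved for detection risk DR.
inherent_risk, control_risk, target_audit_risk = 0.8, 0.5, 0.05
detection_risk = target_audit_risk / (inherent_risk * control_risk)
print(round(detection_risk, 3))  # 0.125
```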
Frequently Asked Questions
Final Summary — Module 4 Complete Recap
Probability Fundamentals
- Probability measures likelihood: P(E) ∈ [0,1]; P(impossible)=0; P(certain)=1
- P(E) = n(E)/n(S) for equally likely outcomes
- Three types: Classical (theory), Empirical (data), Subjective (judgment)
Probability Rules
- Complement: P(E') = 1 − P(E)
- Addition: P(A∪B) = P(A) + P(B) − P(A∩B)
- Multiplication: P(A∩B) = P(A) × P(B|A); if independent: P(A)×P(B)
Conditional Probability
- P(B|A) = P(A∩B) / P(A) — probability of B given A occurred
- P(B|A) ≠ P(A|B) — these are different questions with different answers
Bayes' Theorem
- P(A|B) = [P(B|A) × P(A)] / P(B) — update prior beliefs with evidence
- Low base rates critically reduce the positive predictive value of tests
Probability Trees
- Multiply along paths → add across relevant paths
- Branch probabilities at each node must sum to 1
Probability Distributions
- Binomial B(n,p): fixed trials, binary, independent. E=np, Var=np(1−p)
- Poisson P(λ): events per interval. E=Var=λ
- Normal N(μ,σ): symmetric bell. Z=(X−μ)/σ. 68-95-99.7 rule
- Uniform U(a,b): equal probability. Mean=(a+b)/2
Applications
- Finance: VaR (Normal), PD modelling (Bayes/empirical), option pricing (log-normal)
- Audit: AR = IR × CR × DR; sampling risk via confidence intervals
- Business: Decision trees, inventory management, customer analytics
▶ Continue Learning
Module 5 — Sampling and Estimation
In Module 5, you will see exactly how the probability foundations you have built in this module come alive in practice. Sampling theory uses the Binomial distribution to determine how many transactions an auditor must test. Estimation uses the Normal distribution to construct confidence intervals around sample means. Hypothesis testing — the most powerful tool in applied statistics — is entirely built on conditional probability: "What is the probability of observing this result if the null hypothesis were true?" Every concept in Module 5 and beyond is a direct application of what you learned here.
Topics in Module 5: Random sampling methods · Central Limit Theorem · Point estimation · Confidence intervals (z and t) · Sample size determination · Applications in auditing, finance, and research
Module 4: Probability Fundamentals · Applied Statistics Course · © Educational Content