In statistics, a continuous random variable is a random variable that can take any value in an interval of real numbers, so its possible values are uncountably many. Unlike a discrete random variable, a continuous random variable takes any specific value with probability 0; probabilities are meaningful only for intervals of values.
For a continuous random variable X, the following important properties hold:
1. \( P(X = a) = 0 \) for any real number \( a \).
2. \( P(a \leq X \leq b) = P(a < X < b) \), because each endpoint has probability 0.
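The two properties can be checked numerically. The following is a minimal sketch that assumes X is uniform on [0, 1], an illustrative choice that the text does not specify; the helper names `uniform_cdf` and `interval_prob` are hypothetical.

```python
def uniform_cdf(x: float) -> float:
    """CDF of a Uniform(0, 1) variable: F(x) = P(X <= x)."""
    return min(max(x, 0.0), 1.0)


def interval_prob(a: float, b: float) -> float:
    """P(a <= X <= b) = F(b) - F(a) for a continuous variable.

    Including or excluding the endpoints does not change the result,
    because each single point carries probability 0.
    """
    return uniform_cdf(b) - uniform_cdf(a)


if __name__ == "__main__":
    a = 0.3
    # Shrinking the interval around a drives the probability toward 0,
    # which is the numerical counterpart of P(X = a) = 0.
    for eps in (0.1, 0.01, 0.001):
        print(eps, interval_prob(a - eps, a + eps))
```

For this uniform example the probability of an interval equals its length, so the printed values shrink in proportion to the interval width.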
The normal distribution is one of the most common continuous probability distributions; its probability density function (PDF) is a bell-shaped curve. It is completely determined by two parameters, the mean μ and the variance σ², and is written X ~ N(μ, σ²).
The probability density function of the normal distribution is:
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]
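As a sketch of how the density formula translates into code, the function below evaluates it directly with Python's standard `math` module. The name `normal_pdf` is hypothetical, and note that it takes the standard deviation σ as its argument, not the variance σ².

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma^2) at x; sigma is the standard deviation."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma  # distance from the mean in standard deviations
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


if __name__ == "__main__":
    # The density peaks at x = mu, where it equals 1 / (sigma * sqrt(2*pi)).
    print(normal_pdf(0.0, 0.0, 1.0))
    # Points at equal distance on either side of the mean have equal density.
    print(normal_pdf(9.8, 10.0, 0.2), normal_pdf(10.2, 10.0, 0.2))
```

The returned value is a density, not a probability: for small σ it can exceed 1, and probabilities come only from integrating it over an interval.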
The main characteristics of the normal distribution include:
An important property of the normal distribution is the empirical rule (also called the 68-95-99.7 rule), which describes how probability is concentrated around the mean:
• Approximately 68% of the data lies within ±1 standard deviation of the mean: \( P(\mu - \sigma \leq X \leq \mu + \sigma) \approx 0.68 \)
• Approximately 95% of the data lies within ±2 standard deviations of the mean: \( P(\mu - 2\sigma \leq X \leq \mu + 2\sigma) \approx 0.95 \)
• Approximately 99.7% of the data lies within ±3 standard deviations of the mean: \( P(\mu - 3\sigma \leq X \leq \mu + 3\sigma) \approx 0.997 \)
The empirical rule is an approximation that is useful in practice for quickly assessing how data are distributed.
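The three percentages can be reproduced from the identity \( P(|X - \mu| \leq k\sigma) = \operatorname{erf}(k/\sqrt{2}) \), which holds for any normal distribution. The sketch below uses `math.erf` from the Python standard library; the function name `within_k_sigma` is hypothetical.

```python
import math


def within_k_sigma(k: float) -> float:
    """P(mu - k*sigma <= X <= mu + k*sigma) for a normally distributed X."""
    return math.erf(k / math.sqrt(2))


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, round(within_k_sigma(k), 4))
```

To four decimal places the values are 0.6827, 0.9545, and 0.9973, which the empirical rule rounds to 68%, 95%, and 99.7%.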
The diameters X of metal pins produced by a factory follow a normal distribution N(10, 0.04), with mean 10 mm and variance 0.04 mm². Calculate the following probabilities:
1. P(9.6 ≤ X ≤ 10.4)
2. P(X > 10.3)
Solution:
Given mean μ = 10 and variance σ² = 0.04, the standard deviation is σ = √0.04 = 0.2.
1. 9.6 = 10 - 2(0.2) = μ - 2σ and 10.4 = 10 + 2(0.2) = μ + 2σ.
According to the 68-95-99.7 rule, approximately 95% of the data lies within μ ± 2σ.
Therefore, P(9.6 ≤ X ≤ 10.4) ≈ 0.95.
2. 10.3 = 10 + 1.5(0.2) = μ + 1.5σ.
By the symmetry of the normal distribution, P(X ≤ μ) = 0.5, so:
• P(μ - σ ≤ X ≤ μ + σ) ≈ 0.68 → P(X ≤ μ + σ) ≈ 0.5 + 0.68/2 = 0.84
• P(μ - 2σ ≤ X ≤ μ + 2σ) ≈ 0.95 → P(X ≤ μ + 2σ) ≈ 0.5 + 0.95/2 = 0.975
Since 1.5σ lies halfway between 1σ and 2σ, linear interpolation gives P(X ≤ μ + 1.5σ) ≈ (0.84 + 0.975)/2 ≈ 0.91, so P(X > 10.3) ≈ 0.09.
This is only a rough estimate, because the cumulative distribution function is not linear between μ + σ and μ + 2σ. With the standardized variable Z = (X - μ)/σ, the standard normal table gives P(Z ≤ 1.5) ≈ 0.9332, so the more precise value is:
P(X > 10.3) = P(X > μ + 1.5σ) ≈ 1 - 0.9332 ≈ 0.067
The example above is built on the approximate values of the empirical rule. In practice, a standard normal distribution table or a calculator is used to obtain more precise probabilities.
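One way to get such precise values without a printed table is Python's standard `statistics.NormalDist` class (available since Python 3.8). The sketch below recomputes the two pin-diameter probabilities; the variable names are hypothetical, and the only inputs are the μ = 10 and σ² = 0.04 from the example.

```python
from statistics import NormalDist

# N(10, 0.04): NormalDist expects the standard deviation, not the variance.
pin_diameter = NormalDist(mu=10.0, sigma=0.04 ** 0.5)

# P(9.6 <= X <= 10.4) as a difference of cumulative probabilities
p_between = pin_diameter.cdf(10.4) - pin_diameter.cdf(9.6)

# P(X > 10.3) as the complement of the cumulative probability
p_above = 1.0 - pin_diameter.cdf(10.3)

print(round(p_between, 4))
print(round(p_above, 4))
```

The results should come out close to 0.9545 and 0.0668, in line with the 95% figure from the empirical rule and the 0.067 tail probability for μ + 1.5σ.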
The normal distribution occupies a central position in statistics, mainly for the following reasons: