
Population: The set of all possible observations of interest to the problem at hand.

Sample: The part of the population from which we collect information.
Factor: The explanatory variable(s), often with multiple levels (e.g., if temperature is the factor, the levels could be 50° and 75°).
Treatment/Treatment Combination: A specific combination of the levels of each factor. Example: 2 factors, Pressure (4 atm, 5 atm, 6 atm) and Temperature (50° and 75°). How many treatment combinations? 3 × 2 = 6.
Categorical or Continuous: Categorical factors have a limited number of levels; continuous factors do not.
Experimental Unit (EU): The smallest unit to which we apply a treatment combination.
Observational Unit (OU): The unit upon which we take the measurement.

Sampling Techniques:
Simple Random Sample (SRS): Every possible sample of n observations from the population has the same chance of being selected.
Stratified Random Sample: When the population contains several groups, or strata, a simple random sample is taken within each stratum so that every group has at least one representative. Grouping can remove some of the variability. Also useful for comparing strata.
Systematic Random Sample: Randomly select an item within the first m items. Thereafter, sample every m-th item. Lends itself well to high-speed part manufacturing.
Pairing sampling units: Pairing sampling units allows us to remove the unit-to-unit variability and focus on the real issues. Useful when the sampling units differ widely.

Types of Designs:
Completely Randomized Design (CRD): A design which randomly allocates all treatment combinations to the EUs. Each EU has the same chance of receiving any treatment combination.
Randomized Complete Block Design (RCBD): A design whose blocks contain a single observation on each treatment.
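The treatment-combination count and the systematic sampling rule above can be sketched in a few lines of Python. This is only an illustration: the function names, the dictionary keys, and the population of 20 part IDs are assumptions made for the example, not part of the notes.

```python
import itertools
import random


def treatment_combinations(factors):
    """Return every combination of one level per factor."""
    names = list(factors)
    return [dict(zip(names, levels))
            for levels in itertools.product(*(factors[name] for name in names))]


def systematic_sample(items, m, rng):
    """Pick a random start among the first m items, then take every m-th item."""
    start = rng.randrange(m)
    return items[start::m]


# Factors and levels from the pressure/temperature example.
factors = {"pressure_atm": [4, 5, 6], "temperature": [50, 75]}
combos = treatment_combinations(factors)
print(len(combos))  # 3 levels x 2 levels = 6

# Hypothetical population of 20 part IDs, sampled with m = 5.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
parts = list(range(1, 21))
print(systematic_sample(parts, 5, rng))
```

Whatever the random start, the sampled part IDs are always exactly m apart, which is what makes the scheme convenient on a production line.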
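Stemplots:
Pros: can evaluate the shape of the data; don't lose the original data; can suggest natural groupings.
Cons: loses the time order of the data; cumbersome to construct by hand with very large data sets.

Subset: $A \subseteq B$ (every outcome in A is also in B).
Complement: $A^c$ (all outcomes not in A).
Union: $A \cup B$ (outcomes in A, in B, or in both).
Intersection: $A \cap B$ (outcomes in both A and B).
Mutually Exclusive: $A \cap B = \emptyset$, so $P(A \cap B) = 0$.
Conditional Probability: $P(A \mid B) = P(A \cap B) / P(B)$, for $P(B) > 0$.
Dependent Probability: $P(A \cap B) = P(A)\,P(B \mid A)$.
Independent Probability: $P(A \cap B) = P(A)\,P(B)$.
Calculate $P(A^c)$ if $P(A)$ is difficult: $P(A) = 1 - P(A^c)$.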

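Addition Rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
Mutually Exclusive Addition: $P(A \cup B) = P(A) + P(B)$.
Law of Total Probability: $P(A) = P(A \mid B)\,P(B) + P(A \mid B^c)\,P(B^c)$, where $B^c$ is the complement of B.
Bayes Rule: $P(B \mid A) = \dfrac{P(A \mid B)\,P(B)}{P(A \mid B)\,P(B) + P(A \mid B^c)\,P(B^c)}$.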
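As a rough numerical illustration of the Law of Total Probability and Bayes Rule, the sketch below treats B and its complement as the two cases. The event labels and the three input probabilities are hypothetical values picked for the example.

```python
def total_probability(p_a_given_b, p_a_given_not_b, p_b):
    """P(A) = P(A|B) P(B) + P(A|B^c) P(B^c)."""
    return p_a_given_b * p_b + p_a_given_not_b * (1.0 - p_b)


def bayes(p_a_given_b, p_a_given_not_b, p_b):
    """P(B|A) = P(A|B) P(B) / P(A), with P(A) from the law of total probability."""
    p_a = total_probability(p_a_given_b, p_a_given_not_b, p_b)
    if p_a == 0:
        raise ValueError("P(A) is zero, so P(B|A) is undefined")
    return p_a_given_b * p_b / p_a


# Hypothetical inputs: B = part is defective, A = inspection flags the part.
p_b = 0.01
p_a_given_b = 0.95
p_a_given_not_b = 0.05

p_a = total_probability(p_a_given_b, p_a_given_not_b, p_b)
p_b_given_a = bayes(p_a_given_b, p_a_given_not_b, p_b)
print(round(p_a, 4), round(p_b_given_a, 4))
```

With a rare event B, the posterior P(B|A) stays well below P(A|B), because most flagged parts come from the much larger non-defective group.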

Cumulative Distribution Function (CDF): $F(x) = P(X \le x)$.
Discrete random variables can assume, at most, a countable set of numbers. Continuous random variables can assume any possible real value.
The distinction between a cdf (big F) and a pmf (little f) is that the pmf gives you the probability of one particular point, $f(x) = P(X = x)$, whereas the cdf gives you the probability of that point plus everything before it, $F(x) = P(X \le x)$.
In addition to a CDF, every continuous r.v. has a probability density function (PDF): $f(x) = \dfrac{d}{dx} F(x)$.
If we know the pdf of X, then we can find the cdf by integrating: $F(x) = \int_{-\infty}^{x} f(t)\,dt$.
This implies that for some interval [a,b], we have that $P(a \le X \le b) = \int_a^b f(x)\,dx = F(b) - F(a)$.
We define the expected value or the population mean of X (denoted $\mu = E(X)$) as
$\mu = E(X) = \sum_x x\,f(x)$ if X is discrete,
$\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx$ if X is continuous.
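The relationships above between a pdf, its cdf, interval probabilities, and the expected value can be checked numerically. The sketch below uses a simple midpoint rule and an illustrative density $f(x) = 2e^{-2x}$ for $x > 0$. The density, the rate value, and the helper names are assumptions chosen for the example.

```python
import math


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


RATE = 2.0  # assumed rate for the illustrative density


def pdf(x):
    """Illustrative density f(x) = RATE * exp(-RATE * x) for x > 0."""
    return RATE * math.exp(-RATE * x) if x > 0 else 0.0


def cdf(x):
    """F(x): integral of the pdf from 0 (where the density starts) up to x."""
    return integrate(pdf, 0.0, x) if x > 0 else 0.0


# P(0.5 <= X <= 1.5), computed directly and as F(1.5) - F(0.5).
prob = integrate(pdf, 0.5, 1.5)
print(prob, cdf(1.5) - cdf(0.5))

# E(X) as the integral of x * f(x); the tail beyond 40 is negligible here.
mean = integrate(lambda x: x * pdf(x), 0.0, 40.0)
print(mean)
```

The two interval probabilities should agree to several decimal places, and the mean should land close to 0.5 for this density.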

A measurement of the deviation from $\mu = E(X)$ is called the variance of X, denoted by $\sigma^2 = \mathrm{Var}(X)$. The variance is defined by the formula $\mathrm{Var}(X) = E[(X - \mu)^2] = E(X^2) - \mu^2$.
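For a discrete random variable, the expected value and variance are weighted sums over the pmf. The sketch below works one small case with exact fractions. The fair six-sided die and the function name are assumptions made for illustration.

```python
from fractions import Fraction


def mean_and_variance(pmf):
    """Return (E(X), Var(X)) for a pmf given as {value: probability}."""
    if abs(sum(pmf.values()) - 1) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    mean = sum(x * p for x, p in pmf.items())
    variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, variance


# Hypothetical example: one roll of a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mu, var = mean_and_variance(die)

# Shortcut form: Var(X) = E(X^2) - mu^2 gives the same value.
second_moment = sum(x ** 2 * p for x, p in die.items())
print(mu, var, second_moment - mu ** 2)  # 7/2 35/12 35/12
```

The standard deviation is the square root of the variance, about 1.71 for this die.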

Standard Deviation: $\sigma = \sqrt{\mathrm{Var}(X)}$.

Discrete Distributions:

Bernoulli Distribution: For an event that has just two outcomes. 1 indicates success (or yes, or category a); 0 indicates failure (or no, or category b). It is a discrete distribution. Shorthand, using the ~ operator: X ~ Bernoulli(p).
PMF: $f(x) = p^x (1-p)^{1-x}$, where $x \in \{0, 1\}$.
Mean or expectation: $\mu = E(X) = p$. Variance: $\sigma^2 = \mathrm{Var}(X) = p(1-p)$.
If $X_1, \dots, X_n$ are independent and identically distributed random variables from a Bernoulli distribution with probability of success p, then $Y = \sum_{i=1}^{n} X_i \sim \text{Binomial}(n, p)$, where each $X_i \in \{0, 1\}$.

The Binomial Distribution: The single most important discrete distribution. Has parameters n and p. Models data that can be classified as a success (probability p) or a failure (probability 1-p). The r.v. X is the number of successes in n independent trials. Shorthand: X ~ Binomial(n, p).
The PMF: $f(x) = \binom{n}{x} p^x (1-p)^{n-x}$, $0 \le p \le 1$; $x = 0, 1, \dots, n$, where $\binom{n}{x} = \dfrac{n!}{x!\,(n-x)!}$ is the number of ways to get x successes in n trials.
It can be shown that $E(X) = np$ and $\mathrm{Var}(X) = np(1-p)$.
TI calculator: MATH → PRB → 3:nCr

The Poisson Distribution: Has parameter $\lambda$. $\lambda$ is the average rate for a time frame. Models independent count data. Shorthand: X ~ Poisson($\lambda$).
The PMF: $f(x) = \dfrac{e^{-\lambda} \lambda^x}{x!}$, $\lambda > 0$; $x = 0, 1, 2, \dots$
$E(X) = \lambda$, $\mathrm{Var}(X) = \lambda$.
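The binomial PMF, mean, and variance formulas above can be checked numerically. The sketch below uses `math.comb` for the nCr term. The parameter values n = 10 and p = 0.3 are hypothetical, chosen only for illustration.

```python
import math


def binomial_pmf(x, n, p):
    """f(x) = C(n, x) * p^x * (1 - p)^(n - x) for x = 0, 1, ..., n."""
    if not 0 <= x <= n:
        return 0.0
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 10, 0.3  # hypothetical parameters chosen for illustration
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)
mean = sum(x * f for x, f in enumerate(pmf))
variance = sum((x - mean) ** 2 * f for x, f in enumerate(pmf))

print(total)     # should be very close to 1
print(mean)      # should be very close to n * p = 3.0
print(variance)  # should be very close to n * p * (1 - p) = 2.1
```

Setting n = 1 reduces `binomial_pmf` to the Bernoulli PMF, which matches the statement that a binomial count is a sum of independent Bernoulli trials.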

The Geometric Distribution: Has parameter p. Models the number of independent trials needed to obtain a success. p is the probability of success on any given trial.
The PMF: $f(x) = p(1-p)^{x-1}$, $0 < p < 1$; $x = 1, 2, \dots$
$E(X) = 1/p$, $\mathrm{Var}(X) = (1-p)/p^2$.

The Negative Binomial Distribution: Has parameters r and p. Like the geometric, except it models the number of observations needed to get r successes. p is the probability of success on any given trial.
The PMF: $f(x) = \binom{x-1}{r-1} (1-p)^{x-r} p^r$, $0 < p < 1$; $x = r, r+1, \dots$
$E(X) = r/p$, $\mathrm{Var}(X) = r(1-p)/p^2$.

The Hypergeometric Distribution: Has parameters N, n, and r. Models the number of successes out of n trials when sampling without replacement from a finite population of N objects that contains exactly r successes.
The PMF: $f(x) = \dfrac{\binom{r}{x}\binom{N-r}{n-x}}{\binom{N}{n}}$; $x \le r$; $n - x \le N - r$; $x = 0, 1, 2, \dots, n$
$E(X) = nr/N$, $\mathrm{Var}(X) = \dfrac{nr(N-r)(N-n)}{N^2 (N-1)}$.

Continuous Distributions

The Exponential Distribution: Has rate $\lambda$. $\lambda$ is the rate parameter, the rate at which the event happens. $1/\lambda$ is the expected time between two consecutive events.
The PDF and CDF: $f(x) = \lambda e^{-\lambda x}$ and $F(x) = 1 - e^{-\lambda x}$, $x > 0$; $\lambda > 0$
$E(X) = 1/\lambda$, $\mathrm{Var}(X) = 1/\lambda^2$.

The Uniform Distribution: Has parameters a and b. Models time between events and interarrival times. Useful when an event has occurred within a time frame but you have no information on the exact time. The time frame, or interval, is given by [a,b] or (a,b).
The PDF and CDF: $f(x) = \dfrac{1}{b-a}$ and $F(x) = \dfrac{x-a}{b-a}$, $a \le x \le b$.

The Weibull Distribution (optional): Has parameters $\alpha$ and $\beta$. $\alpha$ is the scale parameter and $\beta$ is the shape parameter. If the data follow a Weibull distribution, these parameters can be adjusted to better fit the data. Also models the time between events and interarrival times.
The PDF and CDF: $f(x) = \dfrac{\beta}{\alpha}\left(\dfrac{x}{\alpha}\right)^{\beta-1} e^{-(x/\alpha)^{\beta}}$ and $F(x) = 1 - e^{-(x/\alpha)^{\beta}}$, $x, \alpha, \beta > 0$

The Gamma Distribution (optional): Has parameters $\lambda$ (scale; it enters the PDF below as a rate, as in the exponential distribution) and $\alpha$ (shape). Also models the time between events and interarrival times. Shorthand: X ~ Gamma($\lambda$, $\alpha$).
The PDF: $f(x) = \dfrac{\lambda^{\alpha} x^{\alpha-1}}{\Gamma(\alpha)} e^{-\lambda x}$, $x, \lambda, \alpha > 0$, where $\Gamma$ is the gamma function.

The Normal Distribution: The single most important continuous distribution. Models many phenomena; in certain cases, can model the behavior of averages. Has parameters $\mu$ (the mean) and $\sigma^2$ (the variance).
The PDF: $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$, $-\infty < x < \infty$.
When $\mu = 0$ and $\sigma^2 = \sigma = 1$, the normal distribution becomes the standard normal distribution, denoted by Z. The PDF becomes: $f(z) = \dfrac{1}{\sqrt{2\pi}} e^{-z^2/2}$.

Sample Mean: $\bar{X} = \dfrac{1}{n}\sum_{i=1}^{n} X_i$, $E(\bar{X}) = \mu$, $\mathrm{Var}(\bar{X}) = \sigma^2/n$, $\mathrm{SD}(\bar{X}) = \sigma/\sqrt{n}$.

Central Limit Theorem: Suppose that $X_1, X_2, \dots, X_n$ are a random sample from a population with mean $\mu$ and variance $\sigma^2$. Then, if n is large enough (typically $n \ge 30$), the distribution of the sample average $\bar{X}$ is approximately normal with mean $\mu$ and variance $\sigma^2/n$.
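A small simulation can make the Central Limit Theorem concrete. The sketch below repeatedly draws samples of size n = 30 from a skewed population and summarizes the sample averages. The choice of population (exponential with mean 1 and variance 1), the number of replicates, and the seed are all assumptions made for the illustration.

```python
import random
import statistics


def sample_means(draw, n, replicates, rng):
    """Return `replicates` sample averages, each from n independent draws."""
    return [statistics.fmean([draw(rng) for _ in range(n)])
            for _ in range(replicates)]


def draw_exponential(rng):
    """One draw from a skewed population with mean 1 and variance 1."""
    return rng.expovariate(1.0)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 30
means = sample_means(draw_exponential, n, 5000, rng)

center = statistics.fmean(means)
spread = statistics.variance(means)
# Share of sample averages within one standard deviation (sigma / sqrt(n)) of mu.
sd = (1.0 / n) ** 0.5
within_one_sd = sum(abs(m - 1.0) <= sd for m in means) / len(means)

print(center)         # should be near the population mean, 1
print(spread)         # should be near sigma^2 / n = 1/30
print(within_one_sd)  # should be near 0.68 if the averages are roughly normal
```

Even though single draws from this population are strongly skewed, the averages cluster symmetrically around the population mean, with a spread that shrinks as n grows.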
