Overview
Data types
Summarizing data using descriptive statistics
Standard error
Confidence intervals
P values
One- vs two-tailed tests
Alpha and beta errors
Sample size considerations and power analysis
Statistics for comparing 2 or more groups with continuous data
Non-parametric tests
Cox regression
Types of Data
Continuous data
Theoretically infinite possible values (within physiologic limits), including fractional values. Examples: height, age, weight.
Can be interval: the interval between measures has meaning, but the ratio of two data points has no meaning (temperature in Celsius, day of the year).
Can be ratio: the ratio of the measures also has meaning (weight, height).
Continuous Data: Frequency Distribution (Histogram)
Sample Mean
Sample Median
Used to indicate the average in a skewed population; often reported alongside the mean.
If the mean and the median are the same, the distribution is symmetric, as in a normally distributed sample.
With an odd number of values, it is the middle value; with an even number of values, it is the average of the two middle values.
Sample Mode
The most common value in the sample. Infrequently reported as a value in studies; more frequently used to describe the distribution of the data.
Interquartile range
The range of data from the 25th percentile to the 75th percentile. A common component of a box-and-whiskers plot:
it forms the box, and the line across the box is the median (middle value). Rarely, the mean will also be displayed.
Standard Error
A fundamental goal of statistical analysis is to estimate a parameter of a population based on a sample. The values of a specific variable from a sample are an estimate for the entire population of individuals who might have been eligible for the study. The standard error is a measure of the precision of a sample in estimating the population parameter.
Standard Error
Clarification
Standard Deviation measures the variability or spread of the data in an individual sample. Standard error measures the precision of the estimate of a population parameter provided by the sample mean or proportion.
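As a stdlib-Python sketch of the distinction (the sample values below are hypothetical):

```python
import math
import statistics

# Hypothetical sample of 8 measurements
sample = [5.1, 4.8, 5.6, 5.0, 4.7, 5.3, 5.5, 4.9]

sd = statistics.stdev(sample)      # spread of the individual values
se = sd / math.sqrt(len(sample))   # precision of the sample mean as an
                                   # estimate of the population mean
print(f"mean = {statistics.mean(sample):.3f}, SD = {sd:.3f}, SE = {se:.3f}")
```

Because the standard error divides by the square root of n, quadrupling the sample size halves it, which is why larger studies give more precise estimates.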
Standard Error
Significance:
The larger the study (sample size), the smaller the standard error, the narrower the confidence intervals, and the greater the precision of the estimate.
Confidence Intervals
May be used to assess a single point estimate such as mean or proportion. Most commonly used in assessing the estimate of the difference between two groups.
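A minimal sketch of a normal-approximation 95% confidence interval for a mean, using hypothetical data (for small samples, a t-based multiplier would give a slightly wider interval):

```python
import math
import statistics

sample = [5.1, 4.8, 5.6, 5.0, 4.7, 5.3, 5.5, 4.9]  # hypothetical data
mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))

z = 1.96  # standard normal critical value for 95% confidence
lower, upper = mean - z * se, mean + z * se
print(f"mean = {mean:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")
```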
P Values
The probability of observing a result at least as extreme as the one observed, assuming that the null hypothesis is true.
Typically, an estimate that has a p value of 0.05 or less is considered to be statistically significant or unlikely to occur due to chance alone.
The p value threshold used is an arbitrary convention:
A p value of 0.05 equals a 1 in 20 chance.
A p value of 0.01 equals a 1 in 100 chance.
A p value of 0.001 equals a 1 in 1000 chance.
A p value provides only the probability that an estimate is due to chance; a result can be statistically significant but of limited clinical significance.
A very large study might find that a difference of 0.1 on a VAS scale of 0 to 10 is statistically significant even though it has no clinical significance. A large study might also yield many "significant" findings during multivariable analysis simply because many comparisons are made.
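As an illustration, the two-sided p value for a standardized z statistic can be computed from the standard normal distribution using only the Python standard library (the statistic below is hypothetical):

```python
from statistics import NormalDist

z = 2.0  # hypothetical standardized test statistic
# Probability of a result at least this extreme in either direction,
# assuming the null hypothesis is true
p = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"two-sided p = {p:.4f}")  # about 0.0455, just under the 0.05 cutoff
```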
For most tests of a difference, if the confidence interval includes 0, the result is not significant. For ratios (relative risk, odds ratio), if the confidence interval includes 1, the result is not significant.
If the study were repeated many times, 95% of such intervals would contain the true population value.
If a confidence interval range is very wide, then plausible value might range from very low to very high.
Example: a relative risk of 4 might have a confidence interval of 1.05 to 9, suggesting that although the point estimate is a 4-fold increase in risk, an increased risk anywhere from 5% to 800% is plausible.
Errors
Type I error
Claiming there is a difference between two samples when in fact there is none; also called an α error. Remember there is variability among samples: they might seem to come from different populations when they do not.
Errors
Type II error
Claiming there is no difference between two samples when in fact there is one; also called a β error. The probability of not making a Type II error is 1 − β, which is called the power of the test. This is a hidden error because it cannot be detected without a proper power analysis.
Errors

Truth \ Test Result          Null Hypothesis H0     Alternative Hypothesis H1
Null Hypothesis H0 true      No error               Type I error
Alternative H1 true          Type II error          No error
Sample size calculation, also called power analysis. When designing a study, one needs to determine how large a study is needed. Power is the ability of a study to avoid a Type II error. The sample size calculation yields the number of study subjects needed, given a desired power to detect a given difference and the p value that will be considered significant.
Many studies are completed without a proper estimate of the appropriate study size. This can produce a falsely negative study outcome.
Depends on:
Level of Type I error (α): 0.05 typical.
Level of Type II error (β): 0.20 typical.
One-sided vs two-sided: nearly always two-sided.
Inherent variability of the population.
The difference that would be meaningful between the two assessment arms.
When comparing two samples, we usually cannot be sure in advance which is going to be better.
Stata input: mean 1 = .2, mean 2 = .3, α = .05, power (1 − β) = .8.
Stata input: mean 1 = 20, mean 2 = 30, α = .05, power (1 − β) = .8, std. dev. 10.
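The slides show Stata inputs; the same normal-approximation calculation for comparing two means can be sketched in stdlib Python (the helper name `n_per_group` is made up for illustration). For the second example (difference 10, SD 10, α = .05, power .8) it gives about 16 per group; Stata's t-based answer is slightly larger.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.8):
    """Normal-approximation sample size per group for comparing two means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for power = .8
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)

# Second example above: mean 1 = 20, mean 2 = 30 (difference 10), SD 10
print(n_per_group(delta=10, sd=10))
```

Note how a smaller meaningful difference drives the required sample size up quadratically.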
Statistical Tests
Parametric tests
Non-parametric tests
Choosing a test for comparing the averages of 2 or more samples of scores in experiments with one treatment factor:

2 samples:
Data       Between subjects (independent samples)   Within subjects (related samples)
Interval   Independent t-test                       Paired t-test
Ordinal    Wilcoxon-Mann-Whitney test               Wilcoxon signed-ranks test, sign test
Nominal    Chi-square test                          McNemar test

More than 2 samples: for nominal data, the chi-square test again applies.
Choosing a test for a single sample:

Data      Target           Test
Nominal   2 categories     Binomial test
Nominal   >2 categories    Chi-square test
Interval  Mean             t-test
Interval  Distribution     Kolmogorov-Smirnov test

Measures of association by data type:

Data      Statistic
Interval  Pearson correlation (r)
Ordinal   Spearman's rho; Kendall's tau-a, tau-b, tau-c
Nominal   Phi, Cramér's V
Student's t-test
Paired t-tests
Uses the change before and after intervention in a single individual. Reduces the degree of variability between the groups. Given the same number of patients, it has greater power to detect a difference between groups.
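A sketch of the paired t statistic computed from hypothetical before/after measurements (stdlib only; the p value would then come from a t table with n − 1 degrees of freedom):

```python
import math
import statistics

# Hypothetical blood-pressure readings for the same 6 patients
before = [140, 152, 138, 145, 160, 155]
after  = [135, 150, 130, 140, 150, 148]

diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
# t = mean difference / standard error of the differences
t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
print(f"paired t = {t:.2f} on {n - 1} degrees of freedom")
```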
Analysis of Variance
Used to determine whether two or more samples are from the same population (the null hypothesis).
With two samples, it is the same as the t-test; it is usually used for 3 or more samples.
If it appears the samples are not from the same population, the test cannot tell which sample is different.
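A minimal one-way ANOVA F statistic computed by hand on hypothetical data, showing the between-group vs within-group variance comparison the test is built on:

```python
import statistics

groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]  # hypothetical samples
k = len(groups)
n_total = sum(len(g) for g in groups)
grand_mean = statistics.mean([x for g in groups for x in g])

# Between-group and within-group sums of squares
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

f_stat = (ss_between / (k - 1)) / (ss_within / (n_total - k))
print(f"F = {f_stat:.2f} on ({k - 1}, {n_total - k}) degrees of freedom")
```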
Non-parametric Tests
Use for categorical, ordinal, or non-normally distributed continuous data. One may run both parametric and non-parametric tests to check for congruity. Most non-parametric tests are based on ranks or other non-value-related methods.
Testing Proportions
Used to compare observed proportions of an event with the expected proportions. Used with nominal data (better/worse; dead/alive). If there is a substantial difference between observed and expected, then the null hypothesis is likely to be rejected. Often presented graphically as a 2 × 2 table.
Chi-Squared (χ²) Test
Chi-squared (χ²) formula: χ² = Σ (O − E)² / E, where O is the observed count and E the expected count in each cell.
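A sketch of the calculation on a hypothetical 2 × 2 table: expected counts come from the row and column totals, and the statistic sums (O − E)²/E over the cells.

```python
observed = [[10, 20],
            [20, 10]]  # hypothetical 2 x 2 table of counts

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / total  # expected count for cell (i, j)
        chi2 += (o - e) ** 2 / e

print(f"chi-squared = {chi2:.2f}")  # 6.67, above the 3.84 cutoff for p < .05 at 1 df
```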
Correlation
r = 0-.2: low, probably meaningless
r = .2-.4: low, possible importance
r = .4-.6: moderate correlation
r = .6-.8: high correlation
r = .8-1: very high correlation
Can be positive or negative. Examples: Pearson's and Spearman's correlation coefficients. Tells nothing about causation.
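Pearson's r computed from its definition on hypothetical data (the covariance divided by the product of the spreads of x and y):

```python
import math
import statistics

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 9.8]  # hypothetical, roughly y = 2x

mx, my = statistics.mean(x), statistics.mean(y)
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
ss_x = sum((a - mx) ** 2 for a in x)
ss_y = sum((b - my) ** 2 for b in y)

r = cov / math.sqrt(ss_x * ss_y)
print(f"Pearson r = {r:.3f}")  # close to 1: a very high positive correlation
```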
Correlation
[Scatterplot examples showing perfect correlation and correlation coefficients of 0, .3, and .7. Source: Altman, Practical Statistics for Medical Research]
Regression
Y = ax + b
Used to predict a dependent variable's value based on the value of an independent variable.
Very helpful: in an analysis of height and weight, for a known height one can predict weight.
Allows prediction of values of Y rather than just whether there is a relationship between two variables.
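A least-squares sketch of the height/weight example with made-up numbers: fit y = ax + b, then predict weight for a known height.

```python
import statistics

heights = [150, 160, 170, 180, 190]  # hypothetical heights (cm)
weights = [55, 62, 70, 78, 85]       # hypothetical weights (kg)

mh, mw = statistics.mean(heights), statistics.mean(weights)
slope = (sum((h - mh) * (w - mw) for h, w in zip(heights, weights))
         / sum((h - mh) ** 2 for h in heights))
intercept = mw - slope * mh

predicted = slope * 175 + intercept  # predicted weight at height 175 cm
print(f"weight = {slope:.2f} * height + ({intercept:.1f}); at 175 cm: {predicted:.1f} kg")
```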
Regression
Types of regression
Linear: uses continuous data to predict a continuous outcome. Logistic: uses continuous data to predict the probability of a dichotomous outcome. Poisson: models counts of rare events over time. Cox proportional hazards: survival analysis.
Multivariable (Multiple) Regression
Determines the association between two variables while controlling for the values of others. Example: uterine fibroids.
Both age and race affect the incidence of fibroids. Multiple regression allows one to test the effect of age on the incidence while controlling for race (and all other measured factors).
In published papers, multivariable models are more powerful than univariable models and take precedence, because a univariable model does not control for confounding variables. For example, coronary disease is potentially affected by age, hypertension, smoking status, gender, and many other factors. If assessing whether height is a factor:
if it is significant on univariable analysis but not on multivariable analysis, these other factors confounded the univariable result.
Risk Ratios
Allows exploration of the probability that certain factors are associated with outcomes of interest
Usually require large and long-term studies to determine risks and risk ratios.
Example: Risk ratio of 3.1 (95% CI 0.97- 9.41) includes 1, thus would not be statistically significant.
Odds Ratios
Odds: the number of times an event happens divided by the number of times it does not happen.
Odds ratio: the odds of the event in one group divided by the odds in another group.
Odds Ratios
Calculated from case-control studies. Case-control: patients with a condition (often rare) are compared with a group of selected controls for exposure to one or more potential etiologic factors. Risk cannot be calculated from these studies, as that requires observing the natural occurrence of an event over time in exposed and unexposed patients (a prospective cohort study). Instead, the odds can be calculated for each group.
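A sketch with a hypothetical case-control 2 × 2 table: the odds ratio is (a·d)/(b·c), and an approximate 95% CI comes from the standard error of the log odds ratio.

```python
import math

# Hypothetical counts:     exposed  unexposed
a, b = 20, 80  # cases
c, d = 10, 90  # controls

odds_ratio = (a * d) / (b * c)
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)  # SE of ln(OR)
lo = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
hi = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)

print(f"OR = {odds_ratio:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

If the interval includes 1, the association would not be statistically significant.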
Risk reduction
Absolute risk reduction (ARR): the amount by which the risk is reduced. Relative risk reduction (RRR): the proportional (percentage) reduction. Example:
death rate without treatment, 10 per 1000; death rate with treatment, 5 per 1000. ARR = 5 per 1000; RRR = 50%.
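The example above as plain arithmetic:

```python
rate_control = 10 / 1000  # death rate without treatment
rate_treated = 5 / 1000   # death rate with treatment

arr = rate_control - rate_treated  # absolute risk reduction
rrr = arr / rate_control           # relative risk reduction

print(f"ARR = {arr:.3f} (5 per 1000), RRR = {rrr:.0%}")
```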
Survival Analysis
Evaluation of time to an event (death, recurrence, recovery). Provides a means of handling censored data:
patients who do not reach the event by the end of the study or who are lost to follow-up. Curves are presented as stepwise changes from baseline. There are no fixed intervals of follow-up; the survival proportion is recalculated after each event.
Survival Analysis
[Kaplan-Meier curve. Source: Wikipedia]
Kaplan-Meier Analysis
Provides a graphical means of comparing the outcomes of two groups that vary by intervention or other factor. Survival rates can be measured directly from curve. Difference between curves can be tested for statistical significance.
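A stdlib sketch of the Kaplan-Meier recalculation after each event, with hypothetical censored data (for simplicity it assumes at most one patient leaves the risk set at each time):

```python
# Each observation is (time, event): event=True is the endpoint (e.g. death),
# event=False means the patient was censored at that time.
observations = [(1, True), (2, False), (3, True), (5, False)]  # hypothetical

n_at_risk = len(observations)
survival = 1.0
for time, event in sorted(observations):
    if event:
        survival *= (n_at_risk - 1) / n_at_risk  # step down at each event
        print(f"t = {time}: S(t) = {survival:.3f}")
    n_at_risk -= 1  # events and censored patients both leave the risk set
```

Censoring at t = 2 means the event at t = 3 is divided over only 2 patients still at risk, so S(3) = 0.75 × 0.5 = 0.375.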
Cox Regression
Also known as the proportional hazards survival model. Used to investigate the relationship between an event (death, recurrence) occurring over time and possible explanatory factors. The reported result is the hazard ratio (HR):
the ratio of the hazard in one group divided by the hazard in another, interpreted in the same way as risk ratios and odds ratios.
Common use in long-term studies where various factors might predispose to an event.
Example: after uterine embolization, which factors (age, race, uterine size, etc) might make recurrence more likely.
Summary
Understanding basic statistical concepts is central to understanding the medical literature. It is not essential to understand the mathematical basis of the tests; one needs to know when each test should be used and how to interpret its results.