
ANOVA is a powerful set of techniques to test differences among the means of three or more samples. When only a single factor is being considered (think of it as a table with only one row), this is called one-way ANOVA. The null hypothesis is H0: μ1 = μ2 = … = μk for k samples. The alternative hypothesis is Ha: the population means are not all equal. For Ha, all the means would not have to differ from each other; it may be that just one of the k population means is significantly different from the rest. Sample sizes for the different groups do not need to be equal, although it is best if they are. The assumptions of ANOVA include 1) that each sample is normally distributed and 2) that the variances of each of the populations are equal. Unlike χ² and G² tests, which were restricted to counts only, ANOVA can be used on most types of data (e.g. continuous variables, proportions and so on).

If we had multiple samples and wanted to test differences among them, we could potentially perform multiple 2-sample t-tests. However, for each test we risk a Type I error at rate α. The effects accumulate: perform n tests and the probability of making at least one Type I error is approximately nα. Also, for n samples there are n(n − 1)/2 possible comparisons; in essence, we are bound to make false conclusions. ANOVA takes a different approach and essentially makes all of these comparisons at once.
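The inflation of the Type I error rate under repeated testing can be illustrated numerically. This is a minimal sketch: for n independent tests at level α, the exact familywise rate is 1 − (1 − α)ⁿ, which the nα figure approximates when nα is small.

```python
# Familywise Type I error rate when making many pairwise comparisons.
# With k samples there are k*(k - 1)/2 pairwise t-tests; if each test is
# run at level alpha and the tests were independent, the probability of
# at least one Type I error is 1 - (1 - alpha)**n_tests (roughly
# n_tests*alpha when that product is small).
alpha = 0.05

for k in (3, 5, 10):
    n_tests = k * (k - 1) // 2
    familywise = 1 - (1 - alpha) ** n_tests
    print(f"k={k:2d} samples: {n_tests:2d} tests, "
          f"P(>=1 Type I error) = {familywise:.3f}")
```

With 10 samples the 45 pairwise tests push the familywise error rate above 0.9, which is why comparing everything at once, as ANOVA does, is preferable.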

Sum of squares

ANOVA is based upon the concept of the sum of squares (in longhand, the sum of squared deviates). The estimated variance, Σ(x_i − x̄)² / (n − 1), is a measure of the variability within a sample divided by the degrees of freedom, which for present purposes we can think of as the sample size. That is, variance is essentially average variability per data point. If we do not divide by the degrees of freedom, then we have a measure of total, absolute variability. This is the sum of squares:

SS = Σ(x_i − x̄)² = Σx_i² − (Σx_i)² / n

Partitioning of variability

When we have multiple (3 or more for ANOVA) samples, and consider the variability of the whole set of data, there are two sources of variability. The first is the variability within each of the samples; the second is the variability between the samples, the result of the effect of a treatment upon a particular mean, which shifts that particular distribution. Imagine for the moment that we were dealing with just two distributions (but remember ANOVA is strictly for 3 or more):

[Figure: two overlapping distributions, Treatment A and Treatment B, spanning the total range of data]
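The two forms of the sum-of-squares formula can be checked against each other. This sketch uses the sample 16, 15, 17, 15, 20 from the worked example below:

```python
# The sum of squares computed two equivalent ways, using the sample
# 16, 15, 17, 15, 20 from the worked example:
#   definitional:  sum of squared deviations from the mean
#   computational: sum(x_i**2) - sum(x_i)**2 / n
data = [16, 15, 17, 15, 20]
n = len(data)
mean = sum(data) / n

ss_definitional = sum((x - mean) ** 2 for x in data)
ss_computational = sum(x * x for x in data) - sum(data) ** 2 / n

print(round(ss_definitional, 6), round(ss_computational, 6))   # 17.2 17.2
```

The computational form avoids a second pass over the data, which is why it appears in hand-calculation tables like the one below.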

Relative to the above, the following [in which only the means have been altered] would have greater variability as a result of the between-treatments component: the mean of B is higher, shifting the whole distribution to the right, but the within-treatment variability is no different than in the situation above. In other words, the between-treatment variability plays a relatively greater role in explaining total variability than above. The fraction of the total variability that comes from between samples [as opposed to within] is greater than for the above example.

[Figure: Treatment A and Treatment B with B shifted right; the total range of data is larger]

Conversely, if the means were the same as before but the variability within samples were higher:

[Figure: Treatment A and Treatment B with wider distributions, spanning the total range of data]

Then within-treatment variability accounts for relatively more of the variability of the whole dataset. The fraction of the total variability that comes from between samples [as opposed to within] is less than for the first example. The above is a crude explanation of the basic idea behind ANOVA: partition the variability of the whole data set into two components, that accounted for within samples [hereafter groups] and that between groups. The way that we measure the variability is by sums of squares:

SS_total = SS_within groups + SS_between groups

Within-group Sum of Squares

This is easy to calculate. For each group (sample) we can calculate its SS using the formula above. (For example, for the data 16, 15, 17, 15 & 20: n = 5, Σx² = 1395, Σx = 83 and hence SS = 17.2.) We then simply sum those SS over all groups to get a total SS within groups:

SS_within groups = Σ_{i=1..k} SS_i

Group    | Data               | n  | Σx  | Σx²  | x̄    | SS
A        | 16, 15, 17, 15, 20 | 5  | 83  | 1395 | 16.6 | 17.2
B        | …                  | 5  | 94  | 1782 | 18.8 | 14.8
C        | …                  | 5  | 96  | 1862 | 19.2 | 18.8
All data | all 15 values      | 15 | 273 | 5039 | 18.2 | 70.4

Therefore, SS_within groups = 17.2 + 14.8 + 18.8 = 50.8.

Between-groups Sum of Squares

For the above example we have three means: 16.6, 18.8 and 19.2. Essentially we have a sample, a distribution of means, and we want to measure the variability among those means. We use a slightly modified version of the SS formula. The mean of all of the data is 18.2 (= 273 / 15). We measure the variability of our set of means (16.6, 18.8 and 19.2) from the overall mean, 18.2. So initially we can think of the sum of squares for the between groups as (16.6 − 18.2)² + (18.8 − 18.2)² + (19.2 − 18.2)². However, those three mean values are each associated with a particular original sample size (5 for each group in this case). For our SS, we ought therefore to weight each difference from the 18.2 mean by the sample size, that is, n_group(Mean_group i − Mean_all data)². In this case the sample sizes are equal so it does not matter, but often it does. Therefore, in general,

SS_between groups = Σ_{i=1..k} n_i (x̄_i − x̄_all data)²

Thus, for the above example we have SS_between groups = 5(16.6 − 18.2)² + 5(18.8 − 18.2)² + 5(19.2 − 18.2)² = 12.8 + 1.8 + 5 = 19.6. From the table, notice that 70.4 is the SS of all of the data. This value was calculated in the normal way, treating the data as a single sample of 15 observations and using SS = Σ(x_i − x̄)². Now, 19.6 (SS_between) plus 50.8 (SS_within) equals 70.4. We have partitioned the variability into two components, between and within, in a meaningful manner that sums to what we would quantify if it were just a single, combined sample.
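The partition can be verified from the summary quantities alone. This is a sketch: the raw data for groups B and C are not listed in the text, but their sizes, means, and within-group SS are.

```python
# Verify SS_total = SS_within + SS_between using only the summary
# quantities from the worked example (group sizes, means, and
# within-group sums of squares).
ns = [5, 5, 5]
means = [16.6, 18.8, 19.2]
within_ss = [17.2, 14.8, 18.8]

grand_mean = sum(n * m for n, m in zip(ns, means)) / sum(ns)   # 18.2

ss_within = sum(within_ss)                                     # 50.8
ss_between = sum(n * (m - grand_mean) ** 2
                 for n, m in zip(ns, means))                   # 19.6

print(round(ss_within + ss_between, 1))                        # 70.4
```

Note that SS_between needs only the group means and sizes, never the raw observations: the weighting by n_i carries all the sample-size information.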

The final steps

Remember that the estimated variance of a focal population is

estimated variance = sum of squared deviates / degrees of freedom.

We can again think of dividing the sum of squares by the sample size (technically the degrees of freedom) to get a sort of average squared deviate. In ANOVA this has a special name, mean square, often denoted MS.

We have a measure, 19.6, which is SS_between. If we divide by the relevant degrees of freedom, then we have the mean square for between treatments. There were 3 groups, so degrees of freedom = number of groups − 1 = 3 − 1 = 2. MS_between = 19.6 / 2 = 9.8.

We have a measure, 50.8, which is SS_within. If we divide by the relevant degrees of freedom, then we have the mean square for within. For each of the 3 groups we have n_i − 1 degrees of freedom. The total degrees of freedom is thus (5 − 1) + (5 − 1) + (5 − 1) = 4 × 3 = 12. MS_within = 50.8 / 12 = 4.23.

To summarize, we now have a mean-square measure for the variance within the groups, 4.23, and one for the variance between the groups, 9.8. Through the degrees of freedom, all the sample sizes have been taken into account; those two measures have each been normalized, so we can compare them meaningfully. Finally, we look at the ratio between them to get an F test statistic. Note: we always take between / within. It is this ratio that we look up in a table of the F distribution. For our chosen level of significance α, the table gives us a critical F value. If our test statistic is greater than this critical value, we can reject the null hypothesis at level α. If it is less than the critical value, then we fail to reject the null hypothesis.

F = MS_between / MS_within = 9.8 / 4.23 = 2.32

The table for the F distribution is a little complicated because we have two degrees of freedom: that for the numerator and that for the denominator. Then we have different critical values for each level of significance. For our sample we have 2 degrees of freedom for the numerator, 12 degrees of freedom for the denominator, and we will take α = 0.05. The critical value from the table is 3.89. As our test statistic, 2.32, is less than this value, we fail to reject the null hypothesis. We do not have sufficient evidence to demonstrate a significant difference, at the 5% significance level, among the different population means.
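The final steps can be checked numerically. A sketch: dividing by the unrounded MS_within gives 2.31 (as in the Minitab output later in these notes), while rounding MS_within to 4.23 first gives the 2.32 quoted above.

```python
# Degrees of freedom, mean squares, and the F ratio for the example.
ss_between, ss_within = 19.6, 50.8
k, n_total = 3, 15                     # number of groups, total observations

df_between = k - 1                     # 2
df_within = n_total - k                # 12

ms_between = ss_between / df_between   # 9.8
ms_within = ss_within / df_within      # 4.2333...

f_stat = ms_between / ms_within
f_crit = 3.89                          # tabulated critical F(2, 12) at alpha = 0.05

print(round(f_stat, 2), f_stat > f_crit)   # 2.31 False -> fail to reject H0
```

Only the critical value 3.89 comes from a table; everything else follows from the sums of squares and the degrees of freedom.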

One final note: MS_within, because it deals with variability within a sample, can be thought of as error about the mean of the population from which the sample comes. That is, an observation is the population mean, μ, plus an error term:

x_ij = μ_j + e_ij, where e_ij ~ N(0, σ²)

Thus MS_within is usually called the mean squared error and denoted MSE. This notation is often used in the output of statistical packages.
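The error model can be illustrated by simulation. A sketch: the group means are those from the example, while σ = 2 is an arbitrary choice; with many observations per group, the MSE should approach the true error variance σ².

```python
import random

# Simulate the error model x_ij = mu_j + e_ij with e_ij ~ N(0, sigma^2)
# and check that MS_within (the MSE) estimates sigma^2. The group means
# are taken from the example; sigma = 2 is an arbitrary choice.
random.seed(42)
sigma = 2.0
mus = [16.6, 18.8, 19.2]
n_per_group = 10_000

groups = [[mu + random.gauss(0, sigma) for _ in range(n_per_group)]
          for mu in mus]

# Within-group sum of squares, pooled across groups.
ss_within = 0.0
for g in groups:
    mean = sum(g) / len(g)
    ss_within += sum((x - mean) ** 2 for x in g)

df_within = sum(len(g) for g in groups) - len(groups)
mse = ss_within / df_within
print(round(mse, 2))   # close to sigma**2 = 4.0
```

Because each group's deviations are measured from that group's own mean, the MSE is unaffected by how far apart the μ_j are; it estimates σ² whether or not the null hypothesis holds.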

By the way, one-way ANOVAs are the easiest; they go up in complexity from here!

Carl Anderson (carl@isye.gatech.edu) 10/25/02


SS = Σ(x_i − x̄)² = Σx_i² − (Σx_i)² / n

SS_total = SS_within groups + SS_between groups
SS_within groups = 17.2 + 14.8 + 18.8 = 50.8
SS_between groups = 5(16.6 − 18.2)² + 5(18.8 − 18.2)² + 5(19.2 − 18.2)² = 12.8 + 1.8 + 5 = 19.6

SS_total = 50.8 + 19.6 = 70.4

Minitab command: Stat / ANOVA / One-way (unstacked)

One-way ANOVA: A, B, C

Analysis of Variance
Source     DF       SS       MS       F        P
Factor      2    19.60     9.80    2.31    0.141
Error      12    50.80     4.23
Total      14    70.40

Level    N     Mean
A        5   16.600
B        5   18.800
C        5   19.200

Pooled StDev = 2.058

[Minitab also plots individual 95% CIs for each mean, based on the pooled StDev, over the range 16.0 to 20.0]

Source               | Degrees of freedom   | Sum of squares | Mean squares                            | F-statistic                        | p-value
Treatment (=between) | k − 1 = 3 − 1 = 2    | SSTr = 19.6    | MSTr = SSTr / (k − 1) = 19.6 / 2 = 9.8  | F = MSTr / MSE = 9.8 / 4.23 = 2.32 | P(F(k−1, Nt−k) ≥ F)
Error (=within)      | Nt − k = 15 − 3 = 12 | SSE = 50.8     | MSE = SSE / (Nt − k) = 50.8 / 12 = 4.23 |                                    |
Total                | Nt − 1 = 15 − 1 = 14 | SST = 70.4     |                                         |                                    |
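The whole source table can be assembled by one small routine working from raw samples. This is a sketch: the second and third groups below are hypothetical illustration data (not groups B and C from the text, whose raw values are not listed).

```python
# A general one-way ANOVA source table from raw samples.
def anova_one_way(groups):
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum(sum((x - m) ** 2 for x in g)
                    for g, m in zip(groups, group_means))

    df_between, df_within = k - 1, n_total - k
    ms_between, ms_within = ss_between / df_between, ss_within / df_within
    return {
        "df": (df_between, df_within, n_total - 1),
        "ss": (ss_between, ss_within, ss_between + ss_within),
        "ms": (ms_between, ms_within),          # ms_within is the MSE
        "F": ms_between / ms_within,            # compare to a tabulated critical F
    }

table = anova_one_way([[16, 15, 17, 15, 20],    # group A from the text
                       [15, 16, 17, 15, 18],    # hypothetical data
                       [18, 17, 19, 20, 18]])   # hypothetical data
print(table["df"], round(table["F"], 2))        # (2, 12, 14) 2.82
```

The p-value column is the one piece not computed here; it requires the F distribution itself, from a table or a statistical package.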
