
Data Description

The relation between different variables can be easily perceived by condensing the data. The
distribution of Drivers, Constructors and Tyre suppliers is shown as:
                          Tyre Supplier
Constructors    Bridgestone   Michelin   Pirelli   Driver Count
Ferrari                   8          3         3             14
Honda                     6          4         3             13
McLaren                   3          5         3             11
Mercedes                  7          7         2             16
Red Bull                  5          9         6             20
Renault                   7          2         4             13
Williams                  3          3         7             13
Driver Count             39         33        28            100

[Figure: Constructor and Tyre Distribution among Drivers. Driver count (y-axis) by constructor (x-axis), grouped by tyre supplier: Bridgestone, Michelin, Pirelli.]

As downforce, brake wear, tyre wear and fuel load vary across cars, it is a good idea to
know how they are distributed. The tables and charts below show how they vary.
                      Brake Wear Level
Downforce Level    High    Low    Medium    Car Count
High                  4      6        11           21
Low                   6     10         7           23
Medium                6     10         5           21
Very High             5      9         4           18
Very Low              7      5         5           17
Car Count            28     40        32          100

[Figure: Variation of Downforce Level with Brake Wear. Car count (y-axis) by downforce level (x-axis: High, Low, Medium, Very High, Very Low), grouped by brake wear level.]

                  Tyre Wear Level
Fuel Load      High    Low    Medium    Car Count
High              6      6         5           17
Low               7     11         6           24
Medium            9      9        11           29
Very High         4    ...       ...           14
Very Low        ...    ...       ...           16
Car Count        32     40        28          100

[Figure: Variation of Tyre Wear with Fuel Load. Car count (y-axis) by fuel load (x-axis: High, Low, Medium, Very High, Very Low), grouped by tyre wear level: High, Low, Medium.]

From the variation of wins across constructors in previous seasons, we can estimate the
probability that a constructor would win the world championship. A good benchmark is 4 wins
in a season of 20 races: a constructor that reaches it has a greater likelihood of winning the
championship. The probabilities are as follows:
Constructors    Wins    Winning Probability    Probability of at least 4 Wins
Ferrari          338                 0.1385                            0.2970
Honda            335                 0.1373                            0.2913
McLaren          292                 0.1197                            0.2114
Mercedes         380                 0.1557                            0.3800
Red Bull         465                 0.1906                            0.5466
Renault          297                 0.1217                            0.2201
Williams         333                 0.1365                            0.2875
Grand Total     2440
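The probabilities in the table can be reproduced with a short calculation. The sketch below assumes that each constructor's winning probability is its share of the 2440 recorded wins and that the 20 races of a season are independent Binomial trials; the function name `prob_at_least` is chosen here for illustration only.

```python
from math import comb

# Race wins per constructor, taken from the table above.
wins = {
    "Ferrari": 338, "Honda": 335, "McLaren": 292, "Mercedes": 380,
    "Red Bull": 465, "Renault": 297, "Williams": 333,
}
total_wins = sum(wins.values())
RACES = 20    # races in a season
TARGET = 4    # minimum number of wins of interest


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), as the complement of P(X <= k - 1)."""
    below = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below


# Probability of at least TARGET wins in a season, per constructor.
at_least_four = {
    team: prob_at_least(TARGET, RACES, w / total_wins)
    for team, w in wins.items()
}

for team, prob in at_least_four.items():
    share = wins[team] / total_wins
    print(f"{team:<9} p = {share:.4f}   P(X >= 4) = {prob:.4f}")
```

Working with the complement keeps the sum to four terms instead of seventeen.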

The chance for each constructor to win races in the championship can be expressed as a Binomial
distribution. The Binomial curves for the 2 leading contenders, Red Bull and Mercedes, are shown:

[Figure: Binomial Curve - Wins for Red Bull. Chance for Wins (y-axis) against Race Wins (x-axis).]

[Figure: Binomial Curve - Wins for Mercedes. Chance for Wins (y-axis) against Race Wins (x-axis).]

Since the likelihood of 4 or more wins varies only slightly among the other 5 constructors, their
binomial curves differ only slightly from that of Mercedes, showing a declining trend.
DNFs measure the number of instances where a car failed to finish the race. The factors that
lead to such accidents are a function of driver error. From the sample of 100 drivers, the
calculated averages are:
Mean Starts    279.66
Mean DNFs      36.2

With a mean of 36.2 DNFs over 279.66 starts, a driver fails to finish about 13% of races, so a
season of 20 races yields an average of about 2.6 accidents. To improve reliability, the FIA will
introduce new regulations if there are more than 8 accidents within the next 5 seasons.
The number of accidents follows a Poisson distribution, as shown:
Poisson Probabilities
Data
  Mean number of accidents per season    2.6
  X Value                                8
Results
  P(X = 8)                               0.0038

[Figure: Accidents over 5 years. Chance for Accident (y-axis) against Number of accidents (x-axis).]

The chance for 8 accidents over 5 seasons = 0.0038
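As a check on the figures above, the sketch below derives the seasonal mean from the sample averages and evaluates the Poisson probabilities for a mean of 2.6. The helper `poisson_pmf` is an illustrative name. The reported value 0.0038 coincides with the probability of exactly 8 accidents; the probability of 8 or more is computed alongside for comparison.

```python
from math import exp, factorial

MEAN_STARTS = 279.66   # mean starts per driver, from the sample
MEAN_DNFS = 36.2       # mean DNFs per driver, from the sample
RACES = 20             # races in a season

# Expected DNFs per season: DNF rate per start times races per season.
lam_raw = MEAN_DNFS / MEAN_STARTS * RACES
LAM = 2.6              # rounded mean used in the text


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


p_exactly_8 = poisson_pmf(8, LAM)
# P(X >= 8) as the complement of P(X <= 7).
p_at_least_8 = 1.0 - sum(poisson_pmf(k, LAM) for k in range(8))

print(f"seasonal mean  = {lam_raw:.3f}")
print(f"P(X = 8)       = {p_exactly_8:.4f}")
print(f"P(X >= 8)      = {p_at_least_8:.4f}")
```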


The mean pit stop time for cars from the data is 4 seconds. This follows an exponential
distribution. The chance that a car takes no more than 7 seconds is then 82.62%, which leaves a
17.38% chance that it takes longer.
Exponential Probabilities
Data
  Mean time for pit stops    4.0 s
  X Value                    7.0 s
Results
  P(x<=7)                    0.8262
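The result in the table follows from the exponential cumulative distribution function, P(X <= x) = 1 - exp(-x / mean). A minimal sketch, assuming the rate is the reciprocal of the 4-second mean (the function name `exponential_cdf` is illustrative):

```python
from math import exp

MEAN_PIT_STOP = 4.0   # mean pit stop time in seconds
THRESHOLD = 7.0       # time of interest in seconds


def exponential_cdf(x, mean):
    """P(X <= x) for an exponential distribution with the given mean."""
    return 1.0 - exp(-x / mean)


p_within = exponential_cdf(THRESHOLD, MEAN_PIT_STOP)   # at most 7 seconds
p_longer = 1.0 - p_within                              # more than 7 seconds

print(f"P(X <= 7) = {p_within:.4f}")
print(f"P(X > 7)  = {p_longer:.4f}")
```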

[Figure: Pit stop time Distribution. Chance for the pit stop time (y-axis) against pit stop time in seconds (x-axis).]

Inferential Data Analysis


Constructors, race engineers and the media surrounding the sport make several claims and
predictions based on previous results. In this section, inferential statistics is used to assess
with reasonable certainty whether these claims hold. The sample of 100 drivers and the
corresponding parameters taken in this study is treated as a good representation of the general
history of Formula 1. To test these claims, various tests of significance are carried out.
All tests are conducted at the 5% level of significance.
Test 1
According to the F1-standards, if the average speed of all the racers
throughout all the laps falls below 198 mph, then there can be a definite
quality degradation issue with the engine and the current engine needs to be
migrated to a newer version. A sample of a 100 F1 racers has been collected
to see if there is a need to replace the current V6 version engine with a
newer V8 version.
Solution:
The hypothesis can be formed as,
H0: μ = 198 mph
vs
H1: μ < 198 mph
Since the sample size is n = 1000, we have a large sample.

Z Test of Hypothesis for the Mean
Data
  Null Hypothesis μ =             198
  Level of Significance           0.05
  Speed Standard Deviation        10.16
  Sample Size                     1000
  Sample Mean                     196.3
Intermediate Calculations
  Standard Error of the Mean      0.32128741
  Z Test Statistic                -5.29121262
Lower-Tail Test
  Lower Critical Value            -1.644853627
  p-Value                         6.0754E-08
  Reject the null hypothesis

Since the p-value is less than α = 0.05, we reject the null hypothesis.


So, we can conclude that there is a definite quality degradation issue with the engine, and hence
the current V6 engine should be migrated to the newer V8 version.
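The Z test for Test 1 can be retraced from the summary figures. The sketch below assumes the values used in the calculation (standard deviation 10.16, sample size 1000, sample mean 196.3) and uses `statistics.NormalDist` from the Python standard library for the normal distribution.

```python
from math import sqrt
from statistics import NormalDist

MU_0 = 198.0         # hypothesised mean speed in mph
ALPHA = 0.05         # level of significance
SIGMA = 10.16        # speed standard deviation
N = 1000             # sample size used in the calculation
SAMPLE_MEAN = 196.3  # sample mean speed in mph

std_error = SIGMA / sqrt(N)
z_stat = (SAMPLE_MEAN - MU_0) / std_error

std_normal = NormalDist()                 # standard normal, mean 0 and sd 1
z_critical = std_normal.inv_cdf(ALPHA)    # lower-tail critical value
p_value = std_normal.cdf(z_stat)          # lower-tail p-value

reject_h0 = p_value < ALPHA
print(f"SE = {std_error:.8f}, Z = {z_stat:.6f}, critical = {z_critical:.6f}")
print(f"p-value = {p_value:.4e}, reject H0: {reject_h0}")
```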
Test 2
During 2001-2010 seasons, the cars that used Pirelli tyres recorded a pit stop
time of 5 seconds. Constructors assume that with the recent change in tyre
regulations, there is a decrease in pit stop time.
The hypothesis can be formed as,
H0: μ = 5 seconds
vs
H1: μ < 5 seconds
Since the sample size is n = 28, we have a small sample.

t Test for Hypothesis of the Mean
Data
  Null Hypothesis μ =             5
  Level of Significance           0.05
  Sample Size                     28
  Sample Mean                     4.007142857
  Sample Standard Deviation       0.435404088
Intermediate Calculations
  Standard Error of the Mean      0.082283638
  Degrees of Freedom              27
  t Test Statistic                -12.06627665
Lower-Tail Test
  Lower Critical Value            -1.703288446
  p-Value                         1.09381E-12
  Reject the null hypothesis

Since the p-value is less than α = 0.05, we reject the null hypothesis.


Hence, we can conclude that with the recent change in regulations, there is a
significant decrease in pit stop time.
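The t statistic for Test 2 can be recomputed from the summary statistics alone. The sketch below takes the lower critical value from the table rather than deriving it, since the Python standard library has no Student t quantile function; the decision is made by comparing the statistic with that critical value.

```python
from math import sqrt

MU_0 = 5.0                  # hypothesised mean pit stop time in seconds
N = 28                      # number of cars on Pirelli tyres
SAMPLE_MEAN = 4.007142857   # sample mean pit stop time
SAMPLE_SD = 0.435404088     # sample standard deviation
T_CRITICAL = -1.703288446   # lower critical value from the table (df = 27, 5%)

std_error = SAMPLE_SD / sqrt(N)
dof = N - 1
t_stat = (SAMPLE_MEAN - MU_0) / std_error

# Lower-tail test: reject H0 when the statistic falls below the critical value.
reject_h0 = t_stat < T_CRITICAL
print(f"SE = {std_error:.9f}, df = {dof}, t = {t_stat:.5f}, reject H0: {reject_h0}")
```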
Test 6
The new tyre design for the current season is supposed to increase tyre heat within 10
laps. The earlier the tyres warm up and reach the optimum temperature, the better the grip
through the corners, which improves performance. This grip is a function of the
cornering coefficient. Do the new tyres work as intended?
The hypothesis can be framed as,
H0: μF = μL
vs
H1: μF < μL

Paired t Test
Data
  Hypothesized Mean Difference    0
  Level of significance           0.05
Intermediate Calculations
  Sample Size                     28
  DBar                            -0.173809524
  Degrees of Freedom              27
  SD                              1.13574026
  Standard Error                  0.214634734
  t Test Statistic                -0.809792154
Upper-Tail Test
  Upper Critical Value            1.703288446
  p-Value                         0.787430336
  Do not reject the null hypothesis

TDIST Calculations
  T.DIST.RT                       0.21257
  1-T.DIST.RT                     0.78743

t-Test: Paired Two Sample for Means
                                Cornering 1-5 Laps    Cornering 6-10 Laps
Mean                                   5.322619048            5.496428571
Variance                                4.60675338            2.804225456
Observations                                    28                     28
Pearson Correlation                    0.851517565
Hypothesized Mean Difference                     0
df                                              27
t Stat                                -0.809792154
P(T<=t) one-tail                       0.212569664
t Critical one-tail                    1.703288446
P(T<=t) two-tail                       0.425139329
t Critical two-tail                    2.051830516

Since the p-value is greater than α = 0.05, we do not reject the null hypothesis. It can thus be
concluded that the tyres do not work as intended.
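The paired t statistic and both one-tail probabilities can be retraced from the reported means and the standard deviation of the differences. The sketch below is only an approximation under stated assumptions: because the Python standard library has no Student t distribution, the cumulative probability is obtained by integrating the t density numerically with Simpson's rule. The helper names `t_pdf` and `t_cdf` are illustrative.

```python
from math import gamma, pi, sqrt

N = 28                          # number of paired observations
MEAN_LAPS_1_5 = 5.322619048     # mean cornering coefficient, laps 1-5
MEAN_LAPS_6_10 = 5.496428571    # mean cornering coefficient, laps 6-10
SD_DIFF = 1.13574026            # standard deviation of the paired differences
ALPHA = 0.05


def t_pdf(x, dof):
    """Density of the Student t distribution with dof degrees of freedom."""
    coef = gamma((dof + 1) / 2) / (sqrt(dof * pi) * gamma(dof / 2))
    return coef * (1 + x * x / dof) ** (-(dof + 1) / 2)


def t_cdf(t, dof, steps=2000):
    """P(T <= t) via Simpson's rule on [0, |t|], using symmetry about 0."""
    h = abs(t) / steps          # steps must be even for Simpson's rule
    total = t_pdf(0.0, dof) + t_pdf(abs(t), dof)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, dof)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


dof = N - 1
d_bar = MEAN_LAPS_1_5 - MEAN_LAPS_6_10
std_error = SD_DIFF / sqrt(N)
t_stat = d_bar / std_error

p_lower = t_cdf(t_stat, dof)    # lower-tail probability
p_upper = 1.0 - p_lower         # upper-tail probability

print(f"DBar = {d_bar:.6f}, SE = {std_error:.6f}, t = {t_stat:.6f}")
print(f"lower tail = {p_lower:.5f}, upper tail = {p_upper:.5f}")
```

Both tail probabilities exceed 0.05, so the decision not to reject the null hypothesis does not depend on which tail is read.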
Test 7
Bridgestone and Michelin are the 2 largest tyre suppliers for F1 and have been
throughout its history. On the basis of brand image, the sports media
claim that there is no significant difference between the proportions of these
two tyres. Is their claim justified?
The hypothesis can be framed as,
H0: PB = PM
vs
H1: PB ≠ PM

Tyres          Count of Driver
Bridgestone                 39
Michelin                    33
Pirelli                     28
Grand Total                100

Z Test for Differences in Two Proportions
Data
  Hypothesized Difference           0
  Level of Significance             0.05
  Bridgestone
    Number of Items of Interest     39
    Sample Size                     100
  Michelin
    Number of Items of Interest     33
    Sample Size                     100
Intermediate Calculations
  Bridgestone Proportion            0.39
  Michelin Proportion               0.33
  Difference in Two Proportions     0.06
  Average Proportion                0.36
  Z Test Statistic                  0.883883476
Two-Tail Test
  Lower Critical Value              -1.959963985
  Upper Critical Value              1.959963985
  p-Value                           0.376759118
  Do not reject the null hypothesis

Since the p-value is greater than α = 0.05, we do not reject the null hypothesis. It can thus be
concluded that there is no significant difference between the proportions of the 2 tyres.
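The two-proportion Z test above uses a pooled (average) proportion for the standard error. A minimal sketch of the same calculation, assuming the counts and sample sizes listed in the table and using `statistics.NormalDist` for the normal distribution:

```python
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05
X_B, N_B = 39, 100   # Bridgestone: items of interest, sample size
X_M, N_M = 33, 100   # Michelin: items of interest, sample size

p_b = X_B / N_B
p_m = X_M / N_M
p_pooled = (X_B + X_M) / (N_B + N_M)   # average proportion under H0

std_error = sqrt(p_pooled * (1 - p_pooled) * (1 / N_B + 1 / N_M))
z_stat = (p_b - p_m) / std_error

std_normal = NormalDist()
z_critical = std_normal.inv_cdf(1 - ALPHA / 2)        # upper critical value
p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))       # two-tail p-value

reject_h0 = abs(z_stat) > z_critical
print(f"pooled = {p_pooled:.2f}, Z = {z_stat:.6f}, critical = +/-{z_critical:.6f}")
print(f"p-value = {p_value:.6f}, reject H0: {reject_h0}")
```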
Test 10
A random analysis of the number of fastest laps achieved by cars running
the 3 types of tyres (Bridgestone, Michelin and Pirelli) suggests that 40% of the
fastest laps are by Bridgestone and 30% each by the other 2. Does the actual
number of fastest laps differ from that suggested by the analysis?
The hypothesis can be framed as,
H0: The proportions of the fastest laps are equal to PB = 0.4, PM = 0.3, PP = 0.3
vs
H1: The proportions of the fastest laps are not equal to PB = 0.4, PM = 0.3, PP = 0.3
χ² test of Goodness of Fit
Tyres          Sum of Fastest Laps    Expected Fast Laps    Chi Square
Bridgestone                    895                 927.6    1.145709357
Michelin                       774                 695.7    8.812548512
Pirelli                        650                 695.7    3.001997988
Grand Total                   2319                  2319    12.96025586

P Value    0.001533614

Since the p-value is less than α = 0.05, we reject the null hypothesis. That is, the
actual number of fastest laps differs from what is claimed.
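The goodness-of-fit statistic can be rebuilt from the observed counts and the claimed shares. The sketch below relies on one simplification: with 3 categories there are 2 degrees of freedom, and for exactly 2 degrees of freedom the chi-square upper-tail probability reduces to exp(-x / 2), so no statistics library is needed.

```python
from math import exp

ALPHA = 0.05
observed = {"Bridgestone": 895, "Michelin": 774, "Pirelli": 650}
claimed = {"Bridgestone": 0.4, "Michelin": 0.3, "Pirelli": 0.3}

total = sum(observed.values())
expected = {tyre: total * share for tyre, share in claimed.items()}

# Per-category contribution (O - E)^2 / E and the overall statistic.
contributions = {
    tyre: (observed[tyre] - expected[tyre]) ** 2 / expected[tyre]
    for tyre in observed
}
chi_square = sum(contributions.values())
dof = len(observed) - 1

# Valid only because dof == 2: the survival function is exp(-x / 2).
p_value = exp(-chi_square / 2)
reject_h0 = p_value < ALPHA

for tyre in observed:
    print(f"{tyre:<12} O = {observed[tyre]:>4}  E = {expected[tyre]:.1f}  "
          f"chi2 = {contributions[tyre]:.6f}")
print(f"chi-square = {chi_square:.6f}, p-value = {p_value:.6f}, reject H0: {reject_h0}")
```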
Test 11
