
Regression:

A machine learning perspective


Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington

2015 Emily Fox & Carlos Guestrin

Machine Learning Specialization

Part of a specialization


This course is part of the Machine Learning Specialization:
1. Foundations
2. Regression
3. Classification
4. Clustering & Retrieval
5. Recommender Systems
6. Capstone


What is the course about?


What is regression?
From features to predictions

Data → ML Regression Method → Intelligence

Input x: features derived from data
Learn x → y relationship
Predict y: continuous output or response to input

Salary after ML specialization

hard work

How much will your salary be? (y = $$)


Depends on x = performance in courses, quality of capstone project, # of forum responses, ...


Stock prediction
Predict the price of a stock (y)
Depends on x =
-Recent history of stock price
-News events
-Related commodities


Tweet popularity
How many people will retweet your tweet? (y)
Depends on x = # followers,
# of followers of followers,
features of text tweeted,
popularity of hashtag,
# of past retweets, ...


Reading your mind


Output y: scale from very sad to very happy
Inputs x are brain region intensities


Case Study:
Predicting house prices
Data: house prices $ (y) + house attributes (x) → ML Regression Method → Intelligence: $ = ??

[Figure: price ($) vs. house size]

Impact of regression


Course outline


Module 1: Simple Regression


What makes it simple?
1 input and just fit a line to data
[Figure: price ($) vs. house size x]
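To make "1 input and just fit a line" concrete, here is a minimal sketch that fits a line by closed-form least squares. The function name `fit_line` and the tiny data set are hypothetical and chosen only for illustration; they do not come from the course materials.

```python
def fit_line(x, y):
    """Return the least-squares (intercept, slope) for a single input."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: house size and sale price ($).
sizes = [1000.0, 1500.0, 2000.0, 2500.0]
prices = [200000.0, 300000.0, 400000.0, 500000.0]

intercept, slope = fit_line(sizes, prices)
predicted = intercept + slope * 1800.0  # predicted price for a new house size
```

Because the made-up prices lie exactly on a line, the fitted slope recovers that line; real data would leave residual error.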

Module 1: Simple Regression

Define goodness-of-fit metric for each possible line

Gradient descent algorithm
[Figure: goodness-of-fit over intercept and slope, moving toward a better fit]

Get estimated parameters
- interpret
- use to form predictions

[Figure: price ($) vs. house size]
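The steps above can be sketched roughly as follows, assuming the goodness-of-fit metric is the residual sum of squares (RSS). The step size, tolerance, and toy data are hypothetical choices for illustration, not values from the course.

```python
def rss(x, y, intercept, slope):
    """Residual sum of squares: the assumed goodness-of-fit metric."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def gradient_descent(x, y, step=0.01, tol=1e-8, max_iter=100000):
    """Minimize RSS over (intercept, slope) by gradient descent."""
    intercept, slope = 0.0, 0.0
    for _ in range(max_iter):
        errors = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        # Partial derivatives of RSS with respect to each parameter.
        grad_intercept = -2.0 * sum(errors)
        grad_slope = -2.0 * sum(e * xi for e, xi in zip(errors, x))
        intercept -= step * grad_intercept
        slope -= step * grad_slope
        # Stop once the gradient is (nearly) zero.
        if (grad_intercept ** 2 + grad_slope ** 2) ** 0.5 < tol:
            break
    return intercept, slope


# Hypothetical data generated from y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = gradient_descent(x, y)
```

A step size that is too large makes the iterates diverge, so in practice it has to be tuned to the scale of the inputs.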

Module 2: Multiple Regression


Fit more complex relationships than just a line
[Figure: price ($) (y) vs. house size x]

Incorporate more inputs
- Square feet
- # bathrooms
- # bedrooms
- Lot size
- Year built
- ...
[Figure: price ($) vs. house size x[1] and x[2]]
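One way to read both ideas is that the prediction is a weighted sum of features, where the features may be powers of a single input or several different inputs. The sketch below assumes that form; the helper names and all weights are hypothetical, not fitted values.

```python
def polynomial_features(size, degree):
    """Return [1, size, size**2, ..., size**degree] for one input."""
    return [size ** p for p in range(degree + 1)]


def predict(features, weights):
    """Prediction is the weighted sum of the features."""
    return sum(w * h for w, h in zip(weights, features))


# More complex relationship than a line: a quadratic in one input.
quad_weights = [1.0, 2.0, 3.0]  # hypothetical weights
curve_pred = predict(polynomial_features(2.0, 2), quad_weights)

# More inputs: constant, square feet, # bathrooms, # bedrooms.
house = [1.0, 2000.0, 2.0, 3.0]
weights = [50000.0, 100.0, 10000.0, 5000.0]  # hypothetical weights
price_pred = predict(house, weights)
```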


Module 3: Assessing Performance

[Figures: price ($) vs. house size x; one panel labeled OVERFIT]

Measures of error:
- Training
- Test
- True (generalization)
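As a rough illustration of the first two measures, the sketch below fits a line on a training split and reports mean squared error on the training and test splits. True (generalization) error cannot be computed from a finite data set; test error is only an approximation of it. The data and the split are hypothetical.

```python
def fit_line(x, y):
    """Least-squares (intercept, slope) for a single input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_squared_error(x, y, intercept, slope):
    """Average squared prediction error on the given points."""
    return sum((yi - (intercept + slope * xi)) ** 2
               for xi, yi in zip(x, y)) / len(x)


# Hypothetical data, split into training and test sets.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 8.1]
train_x, train_y = x[:6], y[:6]
test_x, test_y = x[6:], y[6:]

intercept, slope = fit_line(train_x, train_y)  # fit on training data only
train_error = mean_squared_error(train_x, train_y, intercept, slope)
test_error = mean_squared_error(test_x, test_y, intercept, slope)
```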


Module 3: Assessing Performance

[Figures: price ($) vs. house size x; one panel labeled OVERFIT]

Bias-variance tradeoff


Module 4: Ridge Regression

[Figures: price ($) vs. house size x; one panel labeled OVERFIT]

Ridge total cost = measure of fit + measure of model complexity

bias-variance tradeoff
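A minimal sketch of such a total cost, assuming the measure of fit is the residual sum of squares and the measure of model complexity is the sum of squared weights scaled by a tuning parameter. The name `l2_penalty` is hypothetical, and for simplicity this sketch penalizes every weight, including any constant term.

```python
def ridge_cost(rows, y, weights, l2_penalty):
    """Assumed ridge cost: RSS + l2_penalty * (sum of squared weights)."""
    fit = sum((yi - sum(w * h for w, h in zip(weights, row))) ** 2
              for row, yi in zip(rows, y))
    complexity = sum(w ** 2 for w in weights)
    return fit + l2_penalty * complexity


# Hypothetical data: each row is [constant, input].
rows = [[1.0, 1.0], [1.0, 2.0]]
y = [3.0, 5.0]
weights = [1.0, 2.0]  # fits the two points exactly

cost_no_penalty = ridge_cost(rows, y, weights, 0.0)
cost_with_penalty = ridge_cost(rows, y, weights, 0.5)
```

With a zero penalty the cost reduces to the measure of fit alone; a larger penalty makes large weights more expensive.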


Module 4: Ridge Regression

How to choose balance? (i.e., model complexity)
measure of fit + measure of model complexity

Cross validation
[Figure: data split with a validation set; error measured on the validation set]
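Choosing the balance by cross validation can be sketched as below. To keep the example short it assumes a one-input ridge model with no intercept, whose closed-form weight is sum(x*y) / (sum(x*x) + penalty). The fold assignment, candidate penalties, and data are all hypothetical.

```python
def fit_ridge_1d(x, y, l2_penalty):
    """Closed-form ridge weight for one input and no intercept."""
    return sum(xi * yi for xi, yi in zip(x, y)) / (
        sum(xi * xi for xi in x) + l2_penalty)


def k_fold_error(x, y, l2_penalty, k):
    """Average squared validation error over k folds."""
    n = len(x)
    total = 0.0
    for fold in range(k):
        valid_idx = set(range(fold, n, k))  # every k-th point is held out
        train_x = [x[i] for i in range(n) if i not in valid_idx]
        train_y = [y[i] for i in range(n) if i not in valid_idx]
        w = fit_ridge_1d(train_x, train_y, l2_penalty)
        total += sum((y[i] - w * x[i]) ** 2 for i in valid_idx)
    return total / n


def choose_penalty(x, y, candidates, k=4):
    """Pick the candidate penalty with the lowest cross-validation error."""
    return min(candidates, key=lambda lam: k_fold_error(x, y, lam, k))


# Hypothetical noisy data, roughly y = 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
candidates = [0.0, 1.0, 10.0, 100.0]
best = choose_penalty(x, y, candidates)
```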


Module 5: Feature Selection & Lasso Regression

Useful for efficiency of predictions and interpretability

$?

Lot size
Single Family
Year built
Last sold price
Last sale price/sqft
Finished sqft
Unfinished sqft
Finished basement sqft
# floors
Flooring types
Parking type
Parking amount
Cooling
Heating
Exterior materials
Roof type
Structure style
Dishwasher
Garbage disposal
Microwave
Range / Oven
Refrigerator
Washer
Dryer
Laundry location
Heating type
Jetted Tub
Deck
Fenced Yard
Lawn
Garden
Sprinkler System

Module 5: Feature Selection & Lasso Regression

Lasso total cost = measure of fit + (different) measure of model complexity
knocks out certain features → sparsity

Coordinate descent algorithm
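A rough sketch of coordinate descent for a lasso-style cost, assuming the cost is RSS plus `l1_penalty` times the sum of absolute weights, with no separate intercept. Function names, the number of sweeps, and the data are hypothetical; the sketch also assumes no feature column is all zeros.

```python
def soft_threshold(rho, threshold):
    """Shrink rho toward zero; values within the threshold become exactly 0."""
    if rho > threshold:
        return rho - threshold
    if rho < -threshold:
        return rho + threshold
    return 0.0


def lasso_coordinate_descent(rows, y, l1_penalty, n_sweeps=100):
    """Minimize RSS + l1_penalty * sum(|w_j|), one weight at a time."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    for _ in range(n_sweeps):
        for j in range(n_features):
            rho = 0.0  # correlation of feature j with the residual excluding j
            z = 0.0    # squared norm of feature j
            for row, yi in zip(rows, y):
                pred_without_j = sum(w * h for k, (w, h)
                                     in enumerate(zip(weights, row)) if k != j)
                rho += row[j] * (yi - pred_without_j)
                z += row[j] ** 2
            weights[j] = soft_threshold(rho, l1_penalty / 2.0) / z
    return weights


# Hypothetical data: the second feature contributes very little to y.
rows = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [3.1, 2.9, -2.9, -3.1]

dense = lasso_coordinate_descent(rows, y, 0.0)   # no penalty: both weights kept
sparse = lasso_coordinate_descent(rows, y, 2.0)  # penalty knocks out feature 2
```

In this toy case the penalty sets the second weight to exactly zero and shrinks the first, which is the sparsity effect named on the slide.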


Module 6: Nearest Neighbor & Kernel Regression

[Figure: price ($) (y) vs. house size; for the query house ($ = ???), an annotation reads "Here, this is the closest datapoint"]
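The idea in the figure can be sketched as 1-nearest-neighbor regression: predict the output of the closest training point. The function name and the data are hypothetical, and distance is taken to be the absolute difference in house size.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Return the output of the training point closest to the query."""
    closest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return train_y[closest]


# Hypothetical data: house size and sale price ($).
sizes = [1000.0, 1500.0, 2200.0]
prices = [200000.0, 310000.0, 450000.0]

predicted = nearest_neighbor_predict(sizes, prices, 1600.0)
```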


Module 6: Nearest Neighbor & Kernel Regression

[Figure: Epanechnikov kernel (lambda = 0.2), showing the fit f(x0) at query point x0; $ = ???]
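A minimal sketch of kernel regression with an Epanechnikov kernel, assuming the prediction is a kernel-weighted average of the training outputs and that lambda plays the role of the bandwidth. The function names and data are hypothetical; only the default bandwidth of 0.2 is taken from the figure.

```python
def epanechnikov(distance, bandwidth):
    """Epanechnikov weight: 0.75 * (1 - t^2) for |t| <= 1, else 0."""
    t = distance / bandwidth
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def kernel_regression(train_x, train_y, query, bandwidth=0.2):
    """Predict with a kernel-weighted average of nearby training outputs."""
    weights = [epanechnikov(abs(xi - query), bandwidth) for xi in train_x]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no training points within the bandwidth of the query")
    return sum(w * yi for w, yi in zip(weights, train_y)) / total


# Hypothetical data on the unit interval.
train_x = [0.4, 0.5, 0.6, 0.9]
train_y = [1.0, 2.0, 3.0, 10.0]

f_x0 = kernel_regression(train_x, train_y, 0.5)  # the point at 0.9 gets weight 0
```

Unlike nearest neighbor, every point inside the bandwidth contributes, with closer points weighted more heavily.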


Summary of what's covered

Models
- Linear regression
- Regularization: Ridge (L2), Lasso (L1)
- Nearest neighbor and kernel regression

Algorithms
- Gradient descent
- Coordinate descent

Concepts
- Loss functions, bias-variance tradeoff, cross-validation, sparsity, overfitting, model selection, feature selection


Assumed background


Math background
Basic calculus
- Concept of derivatives

Basic linear algebra
- Vectors
- Matrices
- Matrix multiply


Programming experience
Basic Python used
- Can pick up along the way if you know another language


Reliance on GraphLab Create


SFrames will be used, though not required
- open source project of Dato
(creators of GraphLab Create)
- can use pandas and numpy instead

Assignments will:
1. Use GraphLab Create to
explore high-level concepts
2. Ask you to implement
all algorithms without GraphLab Create

Net result:
- learn how to code methods in Python

Computing needs
Basic 64-bit desktop or laptop
Access to internet
Ability to:
-Install and run Python (and GraphLab Create)
-Store a few GB of data


Let's get started!
