0 views

Uploaded by Dre

Michael Importance Weighting

- Chapter 16
- Current Version_ SaTScan v9.1.1 Released March 9 2011
- Steve Selvin a Biostatistics Toolbox for Data Analysis
- SSRN-id3018366
- 320 Exam 1 2010A Key
- Mathematics by n k Singh
- UT Dallas Syllabus for econ6311.001.09f taught by Daniel Griffith (dag054000)
- New Stat 501 Bulk
- Some remarks on the tails of the Gaussian distribution
- 9709_s03_qp_6
- Non-Gaussian Statistics
- Beta Normal
- CVEN2002 Week6
- Applied Probability Notes
- Normal Approx
- .Gianin, Sgarra -Mathematical Finance_ Theory Review and Exercises_pp.286
- How to Create Normal Probability Plot
- Benford's Law
- Untitled
- 4

You are on page 1of 30

April 15, 2015

Introduction

Want to utilize information about test distribution

Correct bias or discrepancy between training and testing

distributions

1

Importance Weighting

Unlabeled test data from target distribution P

Weight the cost of errors on training instances.

Common denition of weight for point x: w(x) = P(x)/Q(x)

2

Importance Weighting

Can we give generalization bounds for this method?

When does DA work? When does it not work?

How should we weight the costs?

3

Overview

Preliminaries

Learning guarantee when loss is bounded

Learning guarantee when loss is unbounded, but second moment

is bounded

Algorithm

4

Preliminaries: Rnyi divergence

( )1

1 P(x)

D (P||Q) = log2 P(x)

1 x

Q(x)

[ ] 1

1

P (x)

D (P||Q)

d (P||Q) = 2 =

x

Q1 (x)

D (P||Q) = 0 iff P = Q

5

Preliminaries: Importance weights

Lemma 1:

Proof:

( P(x) )2

2 2

EQ [w ] = w (x)Q(x) = Q(x) = d2 (P||Q)

Q(x)

xX xX

1

EQ [w2 (x)L2h (x)] d+1 (P||Q)R(h)1

6

Preliminaries: Importance weights

1 1

Hlders Inequality (Jin, Wilson, and Nobel, 2014): Let p + q = 1, then

( ) p1 ( ) q1

p q

|ax bx | |ax | |bx |

x x x

7

Preliminaries: Importance weights

[ ]2 [ ]

2 P(x) 1 P(x) 1

ExQ [w (x)L2h (x)] = Q(x) L2h (x) = P(x) P(x) L2h (x)

x

Q(x) x

Q(x)

[ [ ] ] 1 [ ] 1

P(x) 2

P(x) P(x)Lh (x)

1

x

Q(x) x

[ +1

] 1

= d+1 (P||Q) P(x)Lh (x)Lh 1

(x)

x

1 1 1

d+1 (P||Q)R(h)1 B1+ = d+1 (P||Q)R(h)1

8

Learning Guarantees: Bounded case

P(x)

supx w(x) = supx Q(x) = d (P||Q) = M. Let d (P||Q) < +. Fix h H.

Then, for any > 0, with probability at least 1 ,

2

w (h)| M log

|R(h) R

2m

9

Overview

Preliminaries

Learning guarantee when loss is bounded

Learning guarantee when loss is unbounded, but second moment

is bounded

Algorithm

10

Learning Guarantees: Bounded case

least 1 , the following bound holds:

1 1 1 R(h)2 ] log 1

R(h) R w (h) + 2M log + 2[d+1 (P||Q)R(h)

3m m

11

Learning Guarantees: Bounded case

( n ) ( )

1 n2

Pr xi exp

n 2 2 + 2M/3

i=1

when |xi | M.

12

Learning Guarantees: Bounded case

Then |Z| M. Thus, by lemma 2, the variance of Z can be bounded in

terms of d+1 (P||Q):

1

2 (Z) = EQ [w2 (x)Lh (x)2 )] R(h)2 d+1 (P||Q)R(h)1 R(h)2

( )

m2 /2

Pr[R(h) R

w (h) > ] exp .

2 (Z) + M/3

13

Learning Guarantees: Bounded case

1

1

2M log M2 log2 1 2 2 (Z) log 1

R(h) Rw (h) +

+ +

3m 9m2 m

1

w (h) + 2M log + 2 2 (Z) log 1

=R

3m m

14

Learning Guarantees: Bounded case

Theorem 2: Let H be a nite hypothesis set. Then for any > 0, with

probability at least 1 , the following holds for the importance

weighting method:

w (h) + 2M(log|H| + log ) + 2d2 (P||Q)(log |H| = log

1 1

R(h) R

3m m

15

Learning Guarantees: Bounded case

in the case of = 1:

1 1

R(h) R w (h) + 2M log + 2d2 (P||Q) log

3m m

16

Learning Guarantees: Bounded case

Assume there exists h0 H such that Lh0 (x) = 1 for all x. There exists

an absolute constant c, c = 2/412 , such that

[ ]

d2 (P||Q) 1

Pr sup |R(h) Rw (h)|

c>0

hH 4m

17

Overview

Preliminaries

Learning guarantee when loss is bounded

Learning guarantee when loss is unbounded, but second moment

is bounded

Algorithm

18

Learning Guarantees: Unbounded case

Guassian distribution with P and Q with means and

[ ]

P(x) P Q2 (x )2 P2 (x )2

= exp

Q(x) Q 2P2 Q2

P(x)

Thus, even if P = Q and = , d (P||Q) = supx Q(x) = , thus

Theorem 1 is not informative.

19

Learning Guarantees: Unbounded case

+ [ ]

Q 2 2 (x )2 P2 (x )2 2

dw (P||Q) = 2 exp Q Q dx

P 2 2P2

20

Learning Guarantees: Unbounded case

But sample from Q only has few points far from

A few extreme sample points would have large weights

21

Learning Guarantees: Unbounded case

Pdim({Lh (x) : H H}) = p < . Assume that d2 (P||Q) < + and

w(x) = 0 for all x. Then for > 0, with probability at least 1 , the

following holds:

5/4

3 p log

2me

p + log

4

R(h) Rw (h) + 2

8

d2 (P||Q)

m

22

Learning Guarantees: Unbounded case

[ ]

h]

Pr suphH E[Lh ] E[L

2]

> 2 + log

1

E[L

[ h ]

Pr suphH,t Pr[Lh>t]Pr[L h >t]

>

h >t]

Pr[L

[ ] ( )

2

Pr suphH

R(h)R(h)

> 2 + log 4H (2m) exp m4

1

R(h)

[ ]

h (x)]

Pr sup h H 2

E[Lh (x)]E[L

> 2 + log 1

E[Lh (x)]

( )

m2

p 4

4 exp p log 2em

[ ] ( )

h (x)] m8/3

Pr suphH E[Lh (x)]E[L

2

> 4 exp p log 2em

p 45/3

E[Lh (x)]

(x)]|

h 3 p log 2me +log 8

5/4 2 2

2 max{ E[Lh (x)], E[Lh (x)]} 8 p

m 23

Learning Guarantees: Unbounded case

] ( )

[ R(h) R w (h) 2em m8/3

Pr sup > 4 exp p log 5/3 .

hH d2 (P||Q) p 4

H = {w(x)Lh (x) : h H}. Note, any set shattered by H is shattered

by H , since there exists a subset B of a set A that is shattered by H ,

such that H shatters A with witnesses si = ri /w(xi ).

24

Overview

Learning guarantee when loss is bounded

Learning guarantee when loss is unbounded, but second moment

is bounded

Algorithm

25

Alternative algorithms

u (h) = 1 m u(xi )Lh (xi ) and let Q

u : X 7 R, u > 0. Let R be the

m i=1

empirical distribution: Theorem 4: Let H be a hypothesis set such

that Pdim({Lh (x) : h H}) = p < . Assume that

0 < EQ [u2 (x)] < + and u(x) = 0 for all x. Then for any > 0 with

probability at least 1 ,

[ ]

|R(h) R

u (h) |EQ [w(x) u(x)]Lh (x) |

( ) p log 2me 4

p + log

3

+25/4 max

8

EQ [u2 (x)L2h (x)], EQ [u2 (x)L2h (x)]

m

26

Alternative algorithms

Minimize upper bound

( )

( )

max EQ [u ], EQ [u ] EQ [u2 ] 1 + O(1/ m) ,

2 2

[ ]

minuU E |w(x u(x)| + EQ [u2 ]

Trade-off between bias and variance minimization.

27

Alternative algorithms

28

Alternative algorithms

29

- Chapter 16Uploaded byVee Lan
- Current Version_ SaTScan v9.1.1 Released March 9 2011Uploaded bycatoper
- Steve Selvin a Biostatistics Toolbox for Data AnalysisUploaded byAlexandru Antohi
- SSRN-id3018366Uploaded byquanxu88
- 320 Exam 1 2010A KeyUploaded byZac Clingaman
- Mathematics by n k SinghUploaded byarijit manna
- UT Dallas Syllabus for econ6311.001.09f taught by Daniel Griffith (dag054000)Uploaded byUT Dallas Provost's Technology Group
- New Stat 501 BulkUploaded byCassie Desilva
- Some remarks on the tails of the Gaussian distributionUploaded byJack D'Aurizio
- 9709_s03_qp_6Uploaded byIqra Jawed
- Non-Gaussian StatisticsUploaded byluca a.
- Beta NormalUploaded byManas Jyoti Barman
- CVEN2002 Week6Uploaded byKai Liu
- Applied Probability NotesUploaded byKalaivani
- Normal ApproxUploaded bymsh-666
- .Gianin, Sgarra -Mathematical Finance_ Theory Review and Exercises_pp.286Uploaded byMarco Martinozzo
- How to Create Normal Probability PlotUploaded byTanjid Hossain
- Benford's LawUploaded byScrooty
- UntitledUploaded byapi-25887578
- 4Uploaded byRifat Khan
- W14-3301Uploaded byTryer
- Extreme Response Analyses of Marlin TLP Tendon Tension During Hurricane IvanUploaded byTee Shi Feng
- Irregular WavesUploaded byLava Sat
- gcpjse014r.OO.2Uploaded byrdnelsongcp
- stacy huang - epid 602 hw3Uploaded byapi-458165379
- Super-resolution Enhancement of Text Image Sequences.Uploaded byantonio Scacchi
- BayesianUploaded byBalaji Angamuthu
- 10.1.1.28Uploaded byflorabelle127
- Notes St260 Lec6Uploaded bymclee7
- Predicting Statistical Distributions of Footbridge VibrationsUploaded byAhmad Waal

- BCA Construction Quality Assessment System CONQUAS OverviewUploaded byshirley
- Competitive intelligence deliverables in significant way | Blueocean MIUploaded byBuzwizz
- PowerOn_fusion.pdfUploaded byJagan Vanama
- MecaStack BrochureUploaded byGigena
- Library Management System projectUploaded byrachna_raghuwanshi
- Continous Auditing Through Leveraging TechnologyUploaded bykamuturi
- Analisis Fitur_editNuning (Hasil Translate)Uploaded byIman
- ATE in stata.pdfUploaded byFara Wulan
- diff2Uploaded byMarko Markovic
- Award Checkpoint and Beep Code List PUBUploaded bytfkill
- 1 Page Intro H-Ideas en v1 Jeudi, 14 Septembre 2017 15-35-04Uploaded byAndi Cahya
- Veritas UpgradeUploaded byanimeshdoc
- IT Director/ManagerUploaded byapi-78557852
- qr codeUploaded bychervalier
- AssemblyUploaded byusman_michelle_agura
- DNN DevelopmentUploaded byanishpv
- Duplicate Invoice Check CustomizUploaded byRajesh Soni
- Malayala Sangeetham Info-VancouverUploaded bymalayalasangeetham
- Chuong Trinh VBA Cad_ đại học xây dưungjUploaded byLoveu Nmt
- 04 Exercise Realtime Classification weweewewUploaded byWidhi Mahaputra
- LogUploaded byTyo Pumpkins
- Pst DisableUploaded byKevin Salavamani
- Comparison of Clustering Algorithms ReportUploaded bysan343
- Scratch HR/ENGUploaded byDalibor Omazic
- Webuser_-_9_September_2015Uploaded byJoel Cambridge
- Senior Cost Accountant Analyst in San Francisco Bay CA Resume Debra Krutul-HicksUploaded byDebraKrutulHicks
- Assignment 04 TCPIPUploaded byAarti Khare
- 53173454-GSM-MAPUploaded byMạnh Hùng Đỗ
- Data Center Review Audit ProgramUploaded byRicky Bongo
- Nameshop ApplicationUploaded byInternetStudio