
This article has been accepted for publication in a future issue of this journal, but has not been

fully edited. Content may change prior to final publication.


IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Optimal Scheduling and Placement of Internet Banner Advertisements

Subodha Kumar∗ Milind Dawande† Vijay S. Mookerjee†

Abstract
The increasing popularity of the world wide web has made it an attractive medium for advertisers.
As more advertisers place internet advertisements (hereafter also called “ads”), it
has become important for web site owners to maximize revenue through the optimal selection
and placement of these ads. Unlike most previous research, we consider a hybrid pricing model
where the price advertisers pay is a function of (i) the number of exposures of the ad and (ii) the
number of clicks on the ad. The problem is to find an ad schedule to maximize web site revenue
under a hybrid pricing model. We formulate two versions of the problem: static and dynamic,
and propose a variety of efficient solution techniques that provide near-optimal solutions. In
the dynamic version, the schedule of ads is changed based on individual user click behavior. We
show – using a theoretical proof under special circumstances and an experimental demonstration
under general conditions – that a schedule that adapts to user click behavior consistently
outperforms one that does not. We also demonstrate that to benefit from observing user click
behavior, the associated probability parameter need not be estimated accurately. For both these
versions, we examine the sensitivity of the revenue with respect to the model parameters.

Index Terms—Web advertisement, banner ads, scheduling, click behavior, knapsack.

1 Introduction

The number of internet users is increasing dramatically. This number was nearly 605.6 million as of

September 2002 [24] and 934 million by the end of 2004, and is estimated to be 1.35 billion by the

end of 2007 [6]. More importantly, Nielsen//NetRatings, the global standard for internet audience

measurement and analysis, reported that the number of internet users with a household income

of more than $150,000 jumped by 20 percent over the previous year to 10.3 million in January 2005

[9]. Given the large number of internet users, especially affluent users, the web is fast becoming an

attractive medium for advertising.

With the sharp increase in online users, the market for web advertising is also growing. The

Interactive Advertising Bureau [14] reported that web advertising revenue in the U.S. rose nearly 34

percent in 2006 over that in 2005, totaling about $16.8 billion. Moreover, the fourth quarter revenue

in 2006 totaled a record $4.8 billion, marking the highest quarterly revenue ever reported [14]. A

∗ Business School, University of Washington, Box 353200, Seattle, WA 98195; subodha@u.washington.edu

† School of Management, University of Texas at Dallas, Richardson, TX 75083; {milind, vijaym}@utdallas.edu

Digital Object Identifier 10.1109/TKDE.2007.190640 1041-4347/$25.00 © 2007 IEEE



Figure 1: Banner ads on www.vtliving.com (Date clicked: March 26, 2007)

recent article in Business Week reports that the yearly growth rate of internet ads is galloping

ahead at 28.8%, whereas the overall ad industry grows at 7.7% a year [2].

Several web sites have now begun to depend heavily on the revenue generated by the advertisements

displayed on the site. Microsoft Corp.’s MSN, which attracts more than 350 million visitors a

month to its portal and e-mail service, is ready to give away millions of dollars worth of extra space

and technical support, as long as the advertisers buy some ads up front [3]. In the first quarter of

2005, Yahoo’s profit doubled and Google’s profit increased six-fold: the stellar first-quarter results

of both these organizations were based largely on the strength of revenues from online advertising

[16]. Subscriptions accounted for just 13% of Yahoo’s first-quarter revenue; the rest was mainly

attributed to online ads [18]. For web sites that attract a large number of users, e.g., MSN, Yahoo,

and Google, the optimal selection and placement of advertisements has become a significant issue.

Banner ads - a popular type of web advertising [25] - are the ad form of interest in this study.

Typically, a banner ad is a small graphic image that is linked to a target web page [23]. Many

different types of banner ads with different sizes and shapes (typically rectangular) are used in web

advertisement. Banner ads usually appear on the side, top, or bottom of a screen as a distinct,

clickable image [20]. For example, http://www.vtliving.com displays several side banner ads of equal

width and different heights as shown in Figure 1.


Typically, a web site displays a specific sequence of ads to each user during her visit to the site.

For example, the site may display a specific set of ads to a user during the first minute of the visit

and a different set of ads during the next minute of the visit. Ads compete for exposure during

each time interval (e.g., one minute) and the goal is usually to optimize some objective calculated

for the visit. For the purposes of this study, the planning horizon is the duration of a user’s visit.

For example, a typical planning horizon may consist of 10 minutes. A scheduling problem that

arises is one of choosing a subset of ads for each time interval within the planning horizon.

The space for displaying ads in a time interval is referred to as a slot. Since the size of a slot

is limited, not all ads can be displayed in a slot. The goal of the ad scheduling problem is to

select a set of ads for each slot (i.e., a schedule) such that the total revenue over the planning

horizon is maximized. A feasible schedule has two properties. First, since advertisers do not gain

by displaying multiple copies of an ad in the same slot, each slot should have at most one copy of

an ad. Second, the set of ads selected for a slot should fit in the available space.

Most previous research on ad scheduling has been directed at maximizing the utilization of the

space available for ad display over a planning horizon. Adler et al. [1] prove that this problem is

NP-hard and provide algorithms for two different versions of the problem. Dawande et al. [8] provide

a set of heuristics for these problems, prove their worst-case bounds and analyze their average

performance. Kumar et al. [17] examine the use of a hybrid genetic algorithm, while Menon and

Amiri [21] use Lagrangean decomposition and column generation. The space utilization objective,

however, does not address some of the popular ad pricing models used in practice. We next discuss

the connection between ad scheduling and pricing.

The most commonly used ad pricing models are the CPM (cost-per-thousand impressions)

model, the click-through model and the hybrid model [23]. In the CPM model, an advertiser pays

an amount based on the number of impressions (exposures) of an ad. The number of impressions

during a visit is the number of times the ad is displayed to the user during the visit [23]. However,

the CPM pricing model does not consider the effectiveness of an ad. One way to measure the

effectiveness of an ad is to use its click rate, i.e., the number of times the ad is clicked upon divided

by the number of times it is exposed. The click-through model is a more accountable way of


pricing. In this model, the payment for an ad is based on the number of times the ad is clicked

upon. Finally, a hybrid model is one where pricing is based on a combination of the number of

impressions and the number of clicks. About 55% of the 2002 internet advertisement revenue had

some performance-based component and about 34% was based on hybrid pricing [13].

The CPM model and the click-through model represent two extreme pricing strategies. Pure

CPM pricing favors the web site owner because there is no risk: the web site owner is paid whether

or not the ad is clicked upon.¹ On the other hand, the click-through pricing scheme favors the

advertiser because the web site owner carries all the risk: if an ad is not clicked upon, no revenue

accrues to the web site owner. A hybrid pricing model shares risk between the web site owner and

the advertiser and is the pricing model used in this study. In general, with hybrid pricing, the

space utilization objective is not equivalent to that of maximizing revenue. We therefore propose a

revenue maximization model under a hybrid pricing regime. However, because the proposed model

is also valid for CPM and click-through pricing schemes (with appropriate parameter values), it

incorporates all three prevalent pricing strategies of web advertising.

Most previous research has treated internet ad scheduling as similar to (or sometimes identical

to) ad scheduling in traditional media, such as a magazine or a newspaper. There are several features

that make this study unique to internet advertising. First, we focus on optimizing advertising

objectives at the level of a single user. In recent times, web advertising targeted at the individual

user is gaining popularity [10][11][22]. To our knowledge, ours is the first study that proposes and

solves the single user version of the internet ad scheduling problem. Second, our model incorporates

results from recent research on clickstream analysis that predict the probability of an ad being

clicked upon during a user’s visit. In particular, our model is based on the observations made in

Chatterjee et al. [5], where it is shown that the repeated exposure of an ad during a user’s visit

has a negative linear effect and a positive quadratic effect on the click rate. Here, the magnitude of

the negative coefficient of the linear term is significantly larger than the positive coefficient of the

quadratic term. Thus, the positive quadratic effect begins to dominate only after a certain number
¹ We are referring here to situations where the advertiser is not merely interested in ad display (e.g., to promote
brand awareness), but gets significant benefit only if the ad is clicked upon. Many banner ads on the web are of this
category.


of exposures of an ad. These observations are central to our discussion in this paper. Third, a

novel aspect of our work is the ability to adapt the sequence of ads shown to a user based upon the

user’s actual click behavior.

The remainder of the paper is organized as follows. The web advertisement scheduling problem,

its assumptions, and main results are summarized in Section 2. For the static version, Section 3

discusses an integer programming formulation and an efficient heuristic. Section 4 provides

computational results and sensitivity analyses for the static version of the problem. The dynamic version

of the problem and two methods – a look-ahead algorithm and a myopic algorithm – are discussed

in Section 5. We also compare, both theoretically and experimentally, the dynamic version with

the static version. The sensitivity of both versions with respect to errors in the re-click probability

parameter is discussed in Section 6. Finally, Section 7 concludes the paper and provides directions

for future research.

2 Problem Statement, Assumptions, Approach and Results

2.1 Problem Statement and Assumptions

Given a set of ads that can be displayed to a user and a planning horizon consisting of a sequence

of slots, the goal is to select a subset of ads in each slot to maximize a revenue objective of the web

site owner. Next, we discuss problem constraints and revenue calculation.

There are three problem constraints: size constraints, exposure constraints, and pairwise ad

constraints. The size constraints specify that the set of ads selected for a slot fit in the slot. We

assume that each slot consists of a rectangular space with a given height and width. All slots are

of the same height and width. The width of all the ads is assumed to be the width of the slot.

While this is clearly a limitation of our model, it is often a reasonable assumption in practice. In

general, slots for different types of banner ads can have different widths; however, ads assigned

to a slot are typically of the same width. To illustrate, Biz-stay.com² allows a choice between

three different types of banner ads: top ads, buttons, and cubes. Within each type, the width of

all the ads is the same. The Interactive Advertising Bureau recommends several types of banner
² http://www.biz-stay.com; date clicked: February 16, 2007.


ads, including buttons, leaderboards, and skyscrapers, that have different widths. For scheduling

purposes, however, the ads chosen for a particular banner (e.g., the side banner or the top banner)

of a web page typically have the same width. The optimization problem in this paper considers

the problem of scheduling of ads for one type of banner; such a problem could then be solved for

each type of banner. Ads are allowed to vary in height. Thus, the size constraint requires that the

sum of the heights of the ads selected for a slot is at most the height of the slot.

The exposure constraints – motivated by internet ad scheduling practice – ensure that, for each

slot, at most one copy of any ad is selected for display. The pairwise ad constraints are of two kinds:

(a) inclusion constraints, and (b) exclusion constraints. An inclusion constraint is a constraint that

applies to an ordered pair of ads and requires that the first ad in the pair be exposed in a slot

only if the second ad in the pair is exposed in that slot. Such constraints represent situations that

correspond to complementary products (e.g., a real estate ad and a mortgage ad could be required to

be exposed together). An inclusion constraint may also be used to model a competitive situation

where two firms with competing ads require that their ads be exposed simultaneously in a slot.

An exclusion constraint, on the other hand, imposes that whenever one of the ads in the pair is

exposed, the other ad in the exclusion pair cannot be shown. In general, exclusion constraints

capture two types of situations: (i) where a very effective ad could adversely affect the value of

another less effective ad, and (ii) where two competitors do not want their ads to be displayed together

[15].

The objective is to maximize the total expected revenue in the planning horizon (i.e., the

duration of a user’s visit) which is the sum of the expected revenues in each slot. For a slot, the

expected revenue is the sum of the expected revenue of each ad displayed in the slot. The expected

revenue, per unit size, of an ad in a slot is derived using a hybrid pricing model. In this model, an

ad is charged a fixed amount for exposure in a slot and a variable amount that depends on whether

or not the ad is clicked upon by the user as a result of that exposure.

The probability of a click resulting from the kth exposure of an ad during a user’s visit depends

on two effects: exposure effect and re-click effect [5].

• exposure effect: In internet advertising, the differential impact of each successive ad exposure


during a user’s visit is initially negative and non-linear but becomes positive later at higher

levels of ad exposure. This U-shaped functional form has been observed in recent research on

individual clickstream analysis using data from a commercial web site and is different from

the inverted U-shaped response found in traditional broadcast media [5]. Due to the exposure

effect, the click probability in the kth exposure of ad Ai is

$e_{ik} = -a_{i1} k + b_{i1} k^2 + a_{i0}$    (1)

where $a_{i0}$, $a_{i1}$ and $b_{i1}$ are positive constants with $a_{i1} \gg b_{i1}$.

• re-click effect: A user clicking on an ad during a visit increases her probability of clicking on

that ad in future exposures during the same visit [5]. Thus, if an ad has been clicked upon,

the re-click effect increases the click probability by an amount p in all subsequent exposures

of the ad during the current visit.

The two effects, exposure and re-click, capture the typical behavior of a consumer on a website.

The decreasing portion of the U-shaped exposure effect has been attributed to wear-out, while

the increasing portion occurs due to wear-in or familiarity [7][27]. The re-click effect has been

attributed to a consumer’s interest in an ad [19].
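
As a short worked step (an illustration derived directly from Equation (1), not a result stated by the authors), the change in the exposure effect between consecutive exposures shows where the U shape turns:

```latex
e_{i,k+1} - e_{ik} = -a_{i1} + b_{i1}(2k + 1),
\qquad
e_{i,k+1} - e_{ik} < 0 \iff k < \frac{a_{i1}}{2 b_{i1}} - \frac{1}{2}.
```

Because $a_{i1} \gg b_{i1}$, this threshold is large, which is consistent with the wear-out portion lasting for many exposures before the wear-in portion begins.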

We consider two versions of the problem. In the static version, click events are not observed

and scheduling decisions are therefore made based upon the total expected click probability. In the

dynamic version, the user click behavior during a visit is observed and this information is exploited

in scheduling decisions.

In the static version, the total expected click probability, $C_{ik}$, for the kth exposure of ad $A_i$ can
be calculated as follows. For the first exposure, $C_{i1} = e_{i1}$, where $e_{ik}$ is the click probability in the
kth exposure of ad $A_i$ due to the exposure effect. For the second exposure, the total expected click
probability is $C_{i2} = C_{i1}(e_{i2} + p) + (1 - C_{i1}) e_{i2} = p C_{i1} + e_{i2}$, representing an expectation of the two
possible values of the click probability in the second slot: $e_{i2} + p$, if the ad is clicked upon in the
first slot (with probability $C_{i1}$) and $e_{i2}$ otherwise. In general, for k ≥ 2,

$C_{ik} = \left[ 1 - \prod_{l=1}^{k-1} (1 - C_{il}) \right] p + e_{ik}$    (2)
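
The recursion in Equations (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names and the parameter values in the usage line are hypothetical, and the sketch does not clamp results to the interval [0, 1].

```python
def exposure_effect(k, a0, a1, b1):
    """Equation (1): click probability of the k-th exposure from the exposure effect alone."""
    return -a1 * k + b1 * k ** 2 + a0


def static_click_probabilities(n_exposures, a0, a1, b1, p):
    """Equation (2): total expected click probabilities for exposures 1..n_exposures of one ad."""
    probs = []
    no_click_so_far = 1.0  # running product of (1 - C_il) over earlier exposures
    for k in range(1, n_exposures + 1):
        e_k = exposure_effect(k, a0, a1, b1)
        # For k = 1 the product is empty (equal to 1), so C_i1 = e_i1.
        c_k = (1.0 - no_click_so_far) * p + e_k
        probs.append(c_k)
        no_click_so_far *= 1.0 - c_k
    return probs


# Hypothetical parameter values chosen only for illustration.
probs = static_click_probabilities(3, a0=0.4, a1=0.05, b1=0.002, p=0.02)
```

Keeping the running product avoids recomputing it for every exposure, so the probabilities for K exposures are obtained in O(K) time.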


In the dynamic version, the probability of a click in the kth exposure of ad $A_i$ is given by:

$C_{ik} = e_{ik} + p'$    (3)

where $p' = 0$ if ad $A_i$ has not been clicked upon thus far during the visit (i.e., in exposures 1
through k − 1), and $p' = p$ otherwise.

In both the static and dynamic versions, the expected revenue, per unit size, from the kth

exposure of ad Ai is given by:

$R_{ik} = a_{i2} + b_{i2} C_{ik}$    (4)

where $a_{i2}$ and $b_{i2}$ are positive constants.
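
To make the dynamic click probability and the hybrid revenue expression concrete, the following Python sketch evaluates both. The function names are hypothetical, and the remark about limiting cases is an interpretation of the paper's statement that the model covers CPM and click-through pricing "with appropriate parameter values".

```python
def exposure_effect(k, a0, a1, b1):
    """Equation (1): click probability of the k-th exposure from the exposure effect alone."""
    return -a1 * k + b1 * k ** 2 + a0


def dynamic_click_probability(k, clicked_before, a0, a1, b1, p):
    """Dynamic version: add the re-click increment p only if the ad was
    already clicked upon in exposures 1..k-1 of the current visit."""
    return exposure_effect(k, a0, a1, b1) + (p if clicked_before else 0.0)


def revenue_per_unit_size(click_prob, a2, b2):
    """Equation (4): fixed exposure charge a2 plus b2 times the click probability."""
    return a2 + b2 * click_prob
```

Under this reading, setting `b2` to zero would leave only the per-exposure charge (CPM-style pricing), while setting `a2` to zero would leave only the click-dependent charge (click-through pricing).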

2.2 Approach and Results

We first provide a linear integer programming formulation for the static version of the revenue

maximization problem. Although realistic instances can be solved to optimality in a few minutes

using state-of-the-art integer programming solvers, we propose a heuristic solution for two reasons.

First, since the problem class defined by the static version is strongly NP-hard, it is conceivable

that obtaining an optimum solution may become difficult for extremely large instances. Second, an

effective algorithm (discussed in Section 5) to solve the dynamic version requires repeated solutions

of the static version. Since providing real-time solutions is a requirement for the dynamic version,

a fast and near-optimal heuristic for the static version becomes a necessity. Our heuristic solves a

sequence of knapsack-like problems (one for each slot) and is easy to implement, runs very quickly,

scales reasonably well, and provides good quality solutions. Solution quality is characterized by

measuring the percentage gap between the heuristic solution value and the optimum solution value.

On a set of 216 realistic problem instances, the percentage gap between the values of the heuristic

and optimum solutions was less than 2.5% on average; the heuristic provided an optimum solution

for many problem instances. We also conduct a sensitivity analysis to examine the importance of

arriving at precise estimates for the click probability parameters, and to demonstrate the impact

of the revenue parameters on the scheduling objective.

In addition to its use in the static version, another important benefit of the knapsack-based

heuristic is that it can be easily extended for the dynamic version of the ad scheduling problem.


Ad A_i :  A1  A2  A3  A4  A5  A6  A7  A8
s_i    :   5   5   4   4   2   1   1   1

(a) Problem data

[Panels (b) "A feasible schedule" and (c) "Another feasible schedule": placements of ads A1 to A8 in Slots 1 to 5, each slot of height 10.]

Figure 2: An example illustrating feasible placements of ads

Computational results for the dynamic version demonstrate the importance of observing and

exploiting click events in constructing a schedule; the dynamic version consistently outperforms the

static version. A sensitivity analysis for the dynamic version shows that, typically, overestimating

the re-click probability (i.e., p) should not significantly affect the revenue but underestimating this

probability may lead to a loss in revenue.

3 Formulating the Static Version

Consider a set of n ads A = {A1 , . . . , An } competing for space in a planning horizon that is divided

into N time intervals (slots). Each slot is a rectangle of height S and width W . Ad Ai has height si .

We assume that the width of all the ads is W . Ad Ai is said to be scheduled in slot j if one copy

of Ai appears in that slot. A feasible schedule is a placement of a subset Aj ⊆ A of ads in slot j

such that the following condition is satisfied: for j = 1, ..., N, the sum of ad sizes assigned to slot j

must not exceed S. That is, $\sum_{A_i \in A_j} s_i \le S$ for all j. Clearly, $s_i \le S$ for all i. To illustrate, two feasible

schedules for the problem instance in Figure 2(a) (S = 10, N = 5) are shown in Figures 2(b) and

2(c). The objective is to find a feasible schedule of Aj ⊆ A ads for slots j = 1, ..., N , such that the

total expected revenue over the N slots is maximized.


Notation:

N          number of slots.

n          number of ads.

S          height of a slot.

$D_{in}$   set of pairs of ads on which an inclusion constraint is imposed: $(A_u, A_v) \in D_{in}$ implies that ad $A_u$ is exposed in a slot only if ad $A_v$ is also exposed in that slot.

$D_{ex}$   set of pairs of ads on which an exclusion constraint is imposed: $(A_u, A_v) \in D_{ex}$ implies that at most one of $A_u$ and $A_v$ can be exposed in a slot.

$R_{ik}$   expected revenue, per unit size, from the kth exposure of ad $A_i$.

$r_{ik}$   expected revenue from the kth exposure of ad $A_i$; $r_{ik} = s_i R_{ik}$.

$T_{ik}$   expected revenue from k exposures of ad $A_i$; $T_{ik} = \sum_{l=1}^{k} r_{il}$.

$z_{ik}$   1 if ad $A_i$ is exposed a total of k times; 0 otherwise.

$x_{ij}$   1 if ad $A_i$ is scheduled in slot j; 0 otherwise.



Maximize  $\sum_{i=1}^{n} \sum_{k=1}^{N} T_{ik} z_{ik}$

subject to

$\sum_{j=1}^{N} x_{ij} = \sum_{k=0}^{N} k \, z_{ik}$;  i = 1, 2, ..., n    (5)

$\sum_{k=0}^{N} z_{ik} = 1$;  i = 1, 2, ..., n    (6)

$x_{uj} + x_{vj} \le 1$;  j = 1, 2, ..., N;  $(A_u, A_v) \in D_{ex}$    (7)

$x_{uj} - x_{vj} \le 0$;  j = 1, 2, ..., N;  $(A_u, A_v) \in D_{in}$    (8)

$\sum_{i=1}^{n} s_i x_{ij} \le S$;  j = 1, 2, ..., N    (9)

$x_{ij} \in \{0, 1\}$,  i = 1, 2, ..., n;  j = 1, 2, ..., N

$z_{ik} \in \{0, 1\}$,  i = 1, 2, ..., n;  k = 1, 2, ..., N


$z_{ik} = 1$ implies that ad $A_i$ has been displayed a total of k times and, hence, $\sum_{j=1}^{N} x_{ij} = k$.

Constraints (5) and (6) ensure this relationship. For each pair of ads in Dex , Constraint (7) imposes


that at most one ad can be exposed in each slot. For each ordered pair of ads (Au , Av ) ∈ Din ,

Constraint (8) imposes that the first ad in the pair be exposed in a slot only if the second ad in

the pair is also exposed in that slot. Thus, for each slot j, xuj = 1 implies xvj = 1. Constraint (9)

ensures that the sum of the heights of the ads selected for a slot is at most the height of the slot.
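
The size, exclusion, and inclusion constraints can be checked for a candidate schedule with a short routine. The sketch below is illustrative only: `is_feasible` is a hypothetical helper, a schedule is represented as one set of ad indices per slot (so the at-most-one-copy rule holds by construction), and the slot contents in the usage lines are made up; only the ad sizes and S = 10 come from Figure 2(a).

```python
def is_feasible(schedule, sizes, slot_height, d_in=(), d_ex=()):
    """Check Constraints (7)-(9) for a schedule given as a list of sets of ad indices."""
    for ads in schedule:
        # Constraint (9): total height of the ads in the slot is at most S.
        if sum(sizes[i] for i in ads) > slot_height:
            return False
        # Constraint (7): at most one ad of each exclusion pair per slot.
        if any(u in ads and v in ads for u, v in d_ex):
            return False
        # Constraint (8): ad u may appear only if ad v appears in the same slot.
        if any(u in ads and v not in ads for u, v in d_in):
            return False
    return True


# Ad sizes from Figure 2(a); the slot contents below are hypothetical.
sizes = {1: 5, 2: 5, 3: 4, 4: 4, 5: 2, 6: 1, 7: 1, 8: 1}
ok = is_feasible([{1, 3, 6}, {2, 4, 7}], sizes, slot_height=10)
too_tall = is_feasible([{1, 2, 6}], sizes, slot_height=10)
```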

It is easy to see that the static problem is strongly NP-hard: an arbitrary instance of the strongly

NP-hard Multiple Subset Sum Problem [4] can be polynomially reduced to an instance of the static

problem. While we could easily solve realistic instances of the static problem to optimality using the

CPLEX optimizer (version 8.1.0), the computation times required to obtain an optimal solution can

be substantial for very large instances. Also, most state-of-the-art integer program solvers employ

a variety of techniques – search strategies, different types of cutting planes, heuristics, etc. – to

make the search for integer solutions efficient. Therefore, there is typically a significant amount of

variability in the performance of different solvers on the same class of problems. Thus, we propose

an efficient heuristic which provides near-optimal solutions quickly, and scales well with problem

size.

3.1 A Successive Slot Knapsack (SSK) Heuristic

The basic idea behind the SSK heuristic is easy to explain: if the planning horizon consisted of a

single slot, then the revenue maximization problem would be similar to the classical 0-1 knapsack problem

[12] with the only difference being the presence of the inclusion and exclusion constraints. For a

problem involving more than one slot, such a problem can be solved successively for each slot. That

is, we can first determine the optimal allocation of ads in the first slot. Then, using the result of this

allocation (i.e., the set of ads which are displayed in the first slot), the expected revenue of the

ads displayed in the first slot can be updated and an optimal allocation for the second slot can be

obtained. The allocation for the second slot then sets up the problem for the third slot and so on.

Consider an ad Ai ∈ A. Let mij be the expected revenue from assigning ad Ai to slot j. The

value mij depends on the number of exposures of ad Ai in the slots preceding slot j (i.e., slots

1, 2, ..., j − 1). Using the same notation as in the integer programming formulation of the previous
subsection, it is easy to see that, for j ≥ 2, $m_{ij} = r_{ik}$, where $k = \sum_{q=1}^{j-1} x_{iq} + 1$; and $m_{i1} = r_{i1}$. Thus,


given the ad assignments for the slots preceding slot j, the values mij can be computed and the

following modified-knapsack problem can be solved to maximize the expected revenue for slot j:


MKP(j) = Maximize  $\sum_{i=1}^{n} m_{ij} x_{ij}$

subject to

$x_{uj} + x_{vj} \le 1$;  $(A_u, A_v) \in D_{ex}$

$x_{uj} - x_{vj} \le 0$;  $(A_u, A_v) \in D_{in}$

$\sum_{i=1}^{n} s_i x_{ij} \le S$

$x_{ij} \in \{0, 1\}$,  i = 1, 2, ..., n

The optimal allocation obtained by solving problem MKP(j) can then be used to set up the

problem for the next slot. A formal description of the SSK heuristic follows.

Algorithm SSK

Step 1: Define $m_{i1} = r_{i1}$; i = 1, 2, ..., n. Set j = 1.

Step 2: Solve problem MKP(j). Let $x_{ij}$, i = 1, 2, ..., n, be an optimal solution for MKP(j). Set j = j + 1.

Step 3: If j ≤ N, update the values of $m_{ij}$, i = 1, 2, ..., n, as follows: $m_{ij} = r_{ik}$, where $k = \sum_{q=1}^{j-1} x_{iq} + 1$, and go to Step 2; otherwise, terminate.

Algorithm SSK is essentially a greedy algorithm wherein the emphasis, while deciding the schedule

for a slot, is on maximizing the revenue of only that slot. The problem to be solved for each slot,

namely, the modified-knapsack problem, is itself an NP-hard problem [12], but can be solved quite

efficiently in practice using state-of-the-art integer program solvers.
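
The slot-by-slot structure of Algorithm SSK can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' C++/CPLEX implementation: `solve_mkp` is a hypothetical brute-force stand-in for an integer programming solver (exponential in the number of ads, so suitable only for tiny instances), ads are indexed from 0, and `r[i][k-1]` is assumed to hold $r_{ik}$ for every k up to the number of slots.

```python
from itertools import combinations


def solve_mkp(values, sizes, slot_height, d_in=(), d_ex=()):
    """Brute-force stand-in for MKP(j): best feasible subset of ads for one slot."""
    n = len(sizes)
    best_set, best_value = frozenset(), 0.0
    for r in range(1, n + 1):
        for subset in combinations(range(n), r):
            ads = frozenset(subset)
            if sum(sizes[i] for i in ads) > slot_height:
                continue  # size constraint violated
            if any(u in ads and v in ads for u, v in d_ex):
                continue  # exclusion constraint violated
            if any(u in ads and v not in ads for u, v in d_in):
                continue  # inclusion constraint violated
            value = sum(values[i] for i in ads)
            if value > best_value:
                best_set, best_value = ads, value
    return best_set, best_value


def ssk(r, sizes, slot_height, n_slots, d_in=(), d_ex=()):
    """Successive Slot Knapsack: solve one modified-knapsack problem per slot."""
    n = len(sizes)
    exposures = [0] * n  # number of earlier slots in which each ad appeared
    schedule, total = [], 0.0
    for _ in range(n_slots):
        # m_ij = r_ik with k = (earlier exposures of ad i) + 1
        m = [r[i][exposures[i]] for i in range(n)]
        ads, value = solve_mkp(m, sizes, slot_height, d_in, d_ex)
        schedule.append(ads)
        total += value
        for i in ads:
            exposures[i] += 1
    return schedule, total


# Hypothetical data: three ads, two slots of height 10.
r = [[10.0, 1.0], [8.0, 7.0], [6.0, 5.0]]
schedule, total = ssk(r, sizes=[5, 5, 4], slot_height=10, n_slots=2)
```

In this toy instance the first slot takes ads 0 and 1, after which the sharply lower second-exposure revenue of ad 0 causes the second slot to take ads 1 and 2 instead.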

4 Computational Experience: Static Version

The purpose of this computational study is twofold. First, we wish to demonstrate that the SSK

heuristic consistently provides a good quality solution in a short amount of time. To achieve this,

we generate a set of problem classes obtained by varying the primary problem parameters: the


number of ads, n, the number of slots, N , and the slot size, S. Within each class, we generate

a set of instances by varying the secondary problem parameters: the re-click probability (p) and

the revenue parameters (ai2 and bi2 ) and report, for each class, the maximum, minimum and

average percentage gap of the heuristic solution from an upper bound. Second, we wish to study

the sensitivity of the expected revenue and the schedule with respect to the click parameters and

revenue parameters.

4.1 The Test Bed

The test bed consists of a set of problem instances for which we report the performance of the SSK

heuristic in terms of the percentage gap between this heuristic and the optimal solution. The test

bed is chosen to balance the goals of realism and feasibility. Realism is the goal of choosing problem

instances that are representative of ad scheduling problems encountered in practice. Realism is

achieved by anchoring our parameter choices in the real-world data, but varying these choices

to allow testing across a sufficiently large range. Feasibility is the goal of being able to solve

these problems to optimality in a reasonable amount of time. We first describe how values for

the primary problem parameters are chosen followed by the process leading to the choice of the

secondary parameters.

The primary problem parameters consist of n, N , and S. We consider three values of n (10, 50,

and 100), representing values in typical web sites. If ads are refreshed every minute, then a planning

horizon (i.e., the duration of a visit) of about 30 minutes is a reasonable value and corresponds to

30 slots. Thus in the test bed, N is set to 10, 30, and 50. Finally, the slot size and the ad sizes are

chosen so that, on average, about four ads can be displayed in a slot. We choose two values for

S: 20 and 40. For S = 20 (resp., S = 40), the ad sizes are drawn from the uniform distribution

U (1, 10) (resp., U (2, 20)). Based on the above values of the primary parameters, there are 18 (3 x

3 x 2) problem classes.

There are six secondary parameters of which four are click-rate parameters (ai0 , ai1 , bi1 , and p)

and the other two are revenue parameters (ai2 and bi2 ). To be realistic, the values of ai0 , ai1 , and

bi1 are set based on the data collected from a high-traffic, sponsored, content web site whose details


are suppressed due to confidentiality. This data has earlier been used by Chatterjee et al. [5] for

modelling click behavior. In order to set different effectiveness levels for the ads, we choose three

different values of ai0 for each problem instance as follows: ai0 = 0.3 for one-third of the total

number of ads, ai0 = 0.4 for the next one-third ads, and ai0 = 0.5 for the remaining ads.

Next we generate a wide variety of instances within each problem class by considering three

values of p and two values each of ai2 and bi2 . Thus a total of 12 (3 x 2 x 2) instances are run

within each problem class by varying the secondary parameters. The total number of instances in

the test bed is, therefore, 18 x 12 or 216.

In order to generate a wide variety of problem instances, we choose three values of p: 0.001, 0.02,

and 0.40. To choose reasonable values for the revenue parameters, we again draw from information

on pricing models used by actual web sites. Typical hybrid pricing schemes reported on web sites

reveal that the base cost of every ad view is around $0.0025 (or $2.50 per thousand impressions) and

the cost per click is approximately $0.25 [26]. Based on this, we choose two values of ai2 (0.002 and

0.007) and two values of bi2 (0.2 and 1.0) in our computational study. For each problem instance,

the number of each of the two kinds of pairwise ad constraints (i.e., inclusion constraints and the

exclusion constraints) is chosen to be 10% of the total number of ads. In other words, |Din| = 0.1n

and |Dex | = 0.1n.
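
The test-bed design described above can be enumerated mechanically. The following Python sketch is illustrative only: the names `build_test_bed`, `SLOT_SIZES`, and the dictionary layout are hypothetical and are not taken from the authors' implementation. It pairs each slot size with its ad-size range, crosses the primary and secondary parameter values, and recovers the counts of 18 problem classes and 216 instances.

```python
from itertools import product

# Primary parameters (values taken from the text).
N_ADS = (10, 50, 100)                      # n: number of ads
N_SLOTS = (10, 30, 50)                     # N: number of slots
SLOT_SIZES = {20: (1, 10), 40: (2, 20)}    # S -> bounds of the uniform ad-size distribution

# Secondary parameters varied within each problem class.
P_VALUES = (0.001, 0.02, 0.40)             # re-click probability p
A2_VALUES = (0.002, 0.007)                 # revenue parameter a_i2
B2_VALUES = (0.2, 1.0)                     # revenue parameter b_i2


def build_test_bed():
    """Return one dict per problem instance (hypothetical layout)."""
    instances = []
    for n, num_slots, slot_size in product(N_ADS, N_SLOTS, SLOT_SIZES):
        for p, a2, b2 in product(P_VALUES, A2_VALUES, B2_VALUES):
            instances.append({
                "n": n,
                "N": num_slots,
                "S": slot_size,
                "ad_size_range": SLOT_SIZES[slot_size],
                "p": p,
                "a2": a2,
                "b2": b2,
                "num_inclusion": round(0.1 * n),   # |D_in| = 0.1 n
                "num_exclusion": round(0.1 * n),   # |D_ex| = 0.1 n
            })
    return instances


test_bed = build_test_bed()
classes = {(inst["n"], inst["N"], inst["S"]) for inst in test_bed}
print(len(classes), len(test_bed))
```

Ad sizes and the click-rate parameters (ai0, ai1, bi1) would still have to be drawn or assigned per ad; they are left out here to keep the sketch short.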

4.2 Computational Results

Algorithm SSK was coded in C++ and all computations were carried out on a Pentium 4 computer

(3.0 GHz, 2 GB RAM) with Windows XP as the operating system. CPLEX 8.1.0 was used to solve

the optimization problems. As indicated earlier, the percentage gap between the optimal solution

and the heuristic solution was used to assess the quality of the heuristic. Each row of Table 1

corresponds to a problem class. The second, third and fourth columns denote the values of n, N

and S, respectively, for the corresponding class. The next three columns indicate the maximum,

average, and minimum percentage gaps, respectively (over the 12 instances in the class) of the

solution provided by the SSK algorithm. For a solution, the percentage gap is computed as follows:

\[
\text{percentage gap} = \frac{(\text{Optimal Solution}) - (\text{SSK Solution})}{\text{Optimal Solution}} \times 100\%.
\]
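
As a small illustration of this measure, the Python sketch below computes the gap for a pair of objective values. The function name `percentage_gap` and the revenue figures in the example are hypothetical and do not come from the test bed.

```python
def percentage_gap(optimal_value: float, heuristic_value: float) -> float:
    """Percentage by which the heuristic revenue falls short of the optimal revenue."""
    if optimal_value <= 0:
        raise ValueError("optimal_value must be positive")
    return 100.0 * (optimal_value - heuristic_value) / optimal_value


# Hypothetical revenues: optimal 200.0, heuristic 195.0.
example_gap = percentage_gap(200.0, 195.0)
print(example_gap)
```

Because the problem is a maximization, the gap is zero exactly when the heuristic attains the optimal revenue.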


Problem   # of Ads   # of Slots   Slot Size   Percentage SSK Gap           Average CPU Time (in sec)
Class     (n)        (N)          (S)         Max       Avg       Min      Optimal      SSK
1         10         10           20          0.807%    0.244%    0%       0.44         0.85
2         10         30           20          5.790%    1.909%    0%       2402.80      3.84
3         10         50           20          5.911%    1.948%    0%       3601.76      6.18
4         50         10           20          0.009%    0.003%    0%       0.77         0.75
5         50         30           20          3.894%    1.278%    0%       16.97        3.04
6         50         50           20          5.851%    1.918%    0%       112.75       4.73
7         100        10           20          0.402%    0.122%    0%       0.59         0.76
8         100        30           20          3.450%    1.133%    0%       5.84         2.78
9         100        50           20          5.319%    1.724%    0%       20.17        5.03
10        10         10           40          0%        0%        0%       0.47         0.77
11        10         30           40          3.706%    1.213%    0%       2887.92      6.57
12        10         50           40          3.849%    1.123%    0%       3601.74      6.13
13        50         10           40          0.043%    0.004%    0%       1231.36      1.75
14        50         30           40          4.751%    1.502%    0%       826.29       4.63
15        50         50           40          8.472%    2.395%    0%       787.84       6.29
16        100        10           40          0.036%    0.016%    0%       1817.39      2.32
17        100        30           40          3.837%    1.167%    0%       1069.25      5.21
18        100        50           40          5.774%    1.900%    0%       1177.05      6.25

Table 1: Performance of the SSK heuristic

The last two columns report the average CPU time (in seconds), over all the instances of a class, needed to obtain the optimal solution and the SSK solution, respectively.

The computational results indicate that the SSK heuristic produces near-optimal solutions. The

average gap between the SSK heuristic and the optimal solution is below 2.5% in every problem

class, and the largest gap over all 216 instances is 8.472% (class 15). Moreover, the minimum gap

in each problem class is zero. Thus, the SSK heuristic achieves the optimal solution for at least one

problem instance in each problem class.

The SSK heuristic is very quick: the average CPU time for any problem class is less than 7

seconds. Computational integer programming has seen dramatic improvements over the past few

years. Binary knapsack problems, which form the most basic class of integer programming models,

involving thousands of variables are routinely solved within a few seconds using state-of-the-art

solvers. Consequently, the SSK heuristic, which is based on solving knapsack-like problems, scales

well with the number of ads and the number of slots.

4.3 Sensitivity Analysis

The primary goal of the sensitivity analysis is to examine the importance of arriving at precise

estimates for the click probability parameters for different families of ad scheduling problems. A


[Figure 3: Sensitivity of the linear click probability parameter on the revenue. Revenue plotted against a1 (0 to 0.027) with b1 = 0.00021, a2 = 0.004, and b2 = 0.75, for three problem families: (# of ads = 100, # of slots = 50, slot size = 40), (# of ads = 50, # of slots = 30, slot size = 20), and (# of ads = 10, # of slots = 10, slot size = 20).]

secondary goal is to demonstrate the impact of the revenue parameters on the scheduling objective,

namely, the expected revenue to the web site. We consider different families of problems; each

family corresponds to a combination of values for the primary parameters (number of ads, number

of slots and slot size).

For all the sensitivity analysis experiments, we set p = 0.10. Again, we consider three levels of

effectiveness of the ads for each problem instance as follows: ai0 = 0.3 for one-third of the total

number of ads, ai0 = 0.4 for the next one-third ads, and ai0 = 0.5 for the remaining ads. For

simplicity, we set ai1 = a1 , bi1 = b1 , ai2 = a2 , and bi2 = b2 for i = 1, 2, . . ., n. Similar to the

previous experiment, |Din | = 0.1n and |Dex| = 0.1n for each problem instance.

Figure 3 shows the impact of the linear click probability parameter (a1 ) on the revenue. To

understand the behavior of the curves, recall that the exposure related click probability has a U-

shaped functional form. We note that in the decreasing portion of the click probability curve, ads

with fewer exposures should be preferred. Thus the ads move together down the decreasing portion

of the click probability curve. When a1 is high (i.e., the decreasing portion of the U-shape is long),

increasing it further will affect the schedule only after many slots have been exposed. Thus when

a1 is relatively high and we increase it by a small value, the schedule should remain stable, but the

revenue should decrease slowly. However, when a1 is sufficiently low, the U-shaped click probability

curve has a relatively short decreasing portion and begins to increase at a relatively low number

of exposures. When the value of a1 is low, increasing it by a small value can lead to a situation


in which one or more ads never make it to the increasing part of the click probability curve. Thus

the revenue can drop significantly for small changes in a1 in the region where a1 is relatively low.

This situation can be described as one where the schedule and the revenue are both quite sensitive

to changes in the value of a1 .

The differences between the three curves in Figure 3 show the interaction between the primary

parameters (number of ads, number of slots, and slot size) and the secondary parameters (click

probability and revenue parameters). If the number of slots in the planning horizon is relatively low,

the effect on the revenue may not be significant even at low values of a1 . This effect is highlighted

in the lowermost curve, where the number of slots is only 10. Because the number of slots is low,

it is likely that the ads stay in the decreasing portion of the click probability curve for the entire

planning horizon. Thus a1 has almost no impact on the revenue in its entire range. Note that this

curve also has the smallest number of ads. A smaller number of ads should, to some extent, neutralize

the impact of the relatively short planning horizon. However, for this combination of parameters,

the effect of the planning horizon has dominated the effect of the number of ads. As expected, the

quadratic click probability parameter (b1) behaves in a manner opposite to that of the linear click

probability parameter (a1 ).
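
The role of the turning point of the U-shaped curve can be made concrete with a short calculation. The Python sketch below assumes the quadratic exposure effect used in this paper, eik = ai0 − ai1 k + bi1 k², and, as a simplifying assumption, treats k as continuous when locating the minimum at a1 / (2 b1). The function names are hypothetical. The check on the decreasing portion uses the exact discrete difference e(k+1) − e(k) = −a1 + b1 (2k + 1).

```python
def turning_point(a1: float, b1: float) -> float:
    """Continuous minimizer of e(k) = a0 - a1*k + b1*k^2 (independent of a0)."""
    return a1 / (2.0 * b1)


def stays_on_decreasing_portion(a1: float, b1: float, num_slots: int) -> bool:
    """True if e(k+1) <= e(k) for every exposure k = 1, ..., num_slots - 1.

    An ad shown in every slot of the planning horizon would then never
    reach the increasing part of the click probability curve.
    """
    return all(-a1 + b1 * (2 * k + 1) <= 0.0 for k in range(1, num_slots))


b1 = 0.00021  # value of b1 reported for Figure 3

# With a1 = 0.0042 the curve bottoms out near the 10th exposure.
print(turning_point(0.0042, b1))

# Short horizon (10 slots): a1 = 0.006 keeps every exposure on the decreasing part.
print(stays_on_decreasing_portion(0.006, b1, 10))

# Long horizon (50 slots): a1 = 0.015 lets heavily repeated ads reach the increasing part.
print(stays_on_decreasing_portion(0.015, b1, 50))
```

The values of a1 used here are chosen for illustration from the range plotted in Figure 3; the actual schedules also depend on the ad sizes and on the pairwise constraints, which this calculation ignores.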

We conclude from these experiments that the impact of a particular click probability parameter

on the revenue is usually significant when small changes in the value of the parameter permit or

restrict ads from being displayed in the increasing part of the click probability curve. However,

the effect of a click probability parameter is influenced by the primary parameter settings: when

the planning horizon is short (or there are many ads), changes in the click probability parameters

may have very little impact on the revenue. Finally, we note that slot size has little influence in

this experiment because we deliberately adjust ad size so that about four ads can fit in a slot. In

general, ad sizes and slot sizes should also influence the impact of the click probability parameters

on the revenue in an intuitive manner.

Figures 4(a) and 4(b) demonstrate that the revenue parameters (a2 and b2 ) have a direct and

linear impact on the revenue. Once again, the slopes of the three curves in Figures 4(a) and 4(b)

can be interpreted as an impact of the planning horizon. The lowermost curve is the flattest because


[Figure 4: Sensitivity of the revenue parameters on the revenue. Panel (a): revenue plotted against a2 (0 to 9 × 10^-3) with a1 = 0.015, b1 = 0.00021, and b2 = 0.75. Panel (b): revenue plotted against b2 (0 to 0.9) with a1 = 0.015, b1 = 0.00021, and a2 = 0.004. Each panel shows three problem families: (# of ads = 100, # of slots = 50, slot size = 40), (# of ads = 50, # of slots = 30, slot size = 20), and (# of ads = 10, # of slots = 10, slot size = 20).]

few ads are allowed to reach the increasing portion of the click probability curve. Note that when

a2 = 0 in Figure 4(a), the pricing model reduces to the click-through model, whereas for b2 = 0

in Figure 4(b), the pricing model reduces to the CPM model. The mediating effect of the other

primary parameters (number of ads and slot size) should be similar.

5 A Dynamic Version

A drawback of the revenue maximization model in Section 3 is that it does not take advantage of

a user’s click behavior during a visit. We propose a dynamic version which aims at exploiting this

behavior. Recall that the main idea in the dynamic version is to revise, at the end of a slot, the

user’s probability of clicking on an ad, and incorporate this revised estimate while deciding on the

schedule in forthcoming slots. We use a fixed positive probability p to capture the re-click effect.

Note that the probability revision, to be used for scheduling ads during a visit, requires the

knowledge of the actual click behavior during all previous slots in the same visit. We therefore have

a dynamic (or online) version of the revenue maximization problem. The dynamic algorithm begins

by solving the static problem for N slots. From the corresponding solution, we use the allocation

of ads in the first slot. Then, actual click events for these ads in the first slot are observed and an

allocation for the second slot is obtained by again solving the static problem for N − 1 slots and so

on.

Let uik be the click probability of an ad Ai in its kth exposure and η be the number of slots


that have been displayed when uik is being calculated. The value of uik depends on the actual click

events of ad Ai in the η displayed slots. Let us denote the number of exposures of ad Ai in these

η slots as Miη. Note that $M_{i\eta} = \sum_{q=1}^{\eta} x_{iq}$. We need to calculate uik for each ad Ai ∈ A and each

exposure k where Miη < k ≤ (N − Miη ). It is easy to see that ui1 = ei1 . Next, we calculate the

value of uik for k ≥ 2. If ad Ai has been clicked in any of the η displayed slots, then uik = eik + p;

otherwise uik is calculated in a manner similar to Equation (2) as follows: for $k = M_{i\eta} + 1$, $u_{ik} = e_{ik}$;

and for $k > M_{i\eta} + 1$, $u_{ik} = e_{ik} + \left[ 1 - \prod_{q = M_{i\eta} + 1}^{k-1} (1 - u_{iq}) \right] p$. Recall from Section 3 that Tik is the

total expected revenue from k exposures of Ad Ai . Using these values of uik and Equation (4), we

can easily calculate the values of Tik. The optimal allocation for the (η + 1)th slot can now be obtained

by solving the static problem for the last N − η slots using the values of Tik . A formal description

of the algorithm is given below.
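
Before the formal description, the update of uik explained in this paragraph can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' code: the function name `updated_click_probabilities` and its parameters are hypothetical, the exposure effect is passed in as a function `e` of k, and the largest exposure index to evaluate is left to the caller. For an ad that has not been clicked, the term multiplying p is the probability of at least one click in the exposures after the ones already displayed.

```python
from typing import Callable, Dict


def updated_click_probabilities(
    e: Callable[[int], float],   # exposure effect e_ik as a function of k
    p: float,                    # re-click probability
    exposures_so_far: int,       # M_i_eta: exposures of the ad in the displayed slots
    clicked: bool,               # True if the ad was clicked in any displayed slot
    max_exposure: int,           # largest exposure index k to evaluate
) -> Dict[int, float]:
    """Return u_ik for k = exposures_so_far + 1, ..., max_exposure."""
    u: Dict[int, float] = {}
    no_click_prob = 1.0  # product of (1 - u_iq) over the future exposures seen so far
    for k in range(exposures_so_far + 1, max_exposure + 1):
        if clicked:
            u[k] = e(k) + p
        else:
            # For k = exposures_so_far + 1 the product is empty, so u_ik = e_ik.
            u[k] = e(k) + (1.0 - no_click_prob) * p
            no_click_prob *= 1.0 - u[k]
    return u


# Hypothetical numbers: constant exposure effect 0.1 and p = 0.5.
not_clicked = updated_click_probabilities(lambda k: 0.1, 0.5, 0, False, 3)
was_clicked = updated_click_probabilities(lambda k: 0.1, 0.5, 1, True, 3)
print(not_clicked, was_clicked)
```

The large value of p in the example is chosen only to make the effect of the update visible.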

A Look-Ahead Dynamic Algorithm

Step 1: Solve the static problem (Section 3) for N slots. Based on the solution, schedule the ads in

the first slot. Set η = 1.

Step 2: Observe actual click events for the ads scheduled in slot η.

Step 3: As explained above, calculate the updated values of uik for each ad Ai ∈ A and each exposure

k where Miη < k ≤ (N − Miη ). Here, $M_{i\eta} = \sum_{q=1}^{\eta} x_{iq}$.

Step 4: Using Equation (4) and the values of uik , calculate the updated values of Tik .

Step 5: Solve the static problem again for the last N −η slots using the updated values of Tik . Schedule

the ads in the (η + 1)th slot based on the solution of this problem.

Step 6: If η = N − 1, then terminate; otherwise set η = η + 1 and go to Step 2.
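
The control flow of Steps 1 to 6 can be sketched as a rolling-horizon loop. The Python code below is an illustration under assumptions: `look_ahead_schedule`, `solve_static`, and `observe_clicks` are hypothetical names, and the static solver is supplied by the caller (an exact solver or a heuristic), with the recalculation of uik and Tik in Steps 3 and 4 assumed to happen inside it from the exposure counts and click history it receives.

```python
from typing import Callable, Dict, List, Set

Ad = str
# A static solver receives (remaining slots, exposure counts, clicked ads) and
# returns a plan: one list of ads per remaining slot.
StaticSolver = Callable[[int, Dict[Ad, int], Set[Ad]], List[List[Ad]]]
ClickObserver = Callable[[List[Ad]], Set[Ad]]


def look_ahead_schedule(
    num_slots: int,
    solve_static: StaticSolver,
    observe_clicks: ClickObserver,
) -> List[List[Ad]]:
    """Re-solve the static problem before each slot and keep only its first slot."""
    exposures: Dict[Ad, int] = {}
    clicked: Set[Ad] = set()
    schedule: List[List[Ad]] = []
    for eta in range(num_slots):
        plan = solve_static(num_slots - eta, exposures, clicked)  # Steps 1 and 5
        current = plan[0]
        schedule.append(current)
        for ad in current:
            exposures[ad] = exposures.get(ad, 0) + 1
        if eta < num_slots - 1:                                   # Step 6: stop after slot N
            clicked |= observe_clicks(current)                    # Step 2
    return schedule


def toy_solver(remaining: int, exposures: Dict[Ad, int], clicked: Set[Ad]) -> List[List[Ad]]:
    """Stand-in solver: prefer a clicked ad, otherwise the least-exposed one."""
    ads = ["A", "B", "C"]
    best = min(ads, key=lambda ad: (ad not in clicked, exposures.get(ad, 0), ad))
    return [[best] for _ in range(remaining)]


# In this toy run the user clicks ad "B" whenever it is shown.
schedule = look_ahead_schedule(4, toy_solver, lambda shown: {ad for ad in shown if ad == "B"})
print(schedule)
```

The toy solver exists only to exercise the loop; it does not model slot sizes, revenues, or the pairwise constraints of the actual static problem.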

5.1 Generation of Hypotheses

The dynamic version should outperform the static version because of its information advantage

(pertaining to click events) over the static version. Our goal in this section is to study the effect

of the different parameters on the amount of gain resulting from this information advantage. The


mathematical analysis of the comparison of the static and dynamic versions encounters two diffi-

culties. First, since the scheduling problem is NP-hard, there is no known non-enumerative way

to express its solution. Second, tracking the information advantage of the dynamic version across

many slots is difficult. However, tracking this advantage from one slot to the next is the crux of

the issue.

It is easy to identify two conditions (C1 and C2, described below) when the expected revenue

in the dynamic version equals the expected revenue in the static version.

C1. Suppose the probability of a click due to the exposure effect is always increasing with the

number of exposures (i.e., ei(k+1) > eik , ∀i, ∀k). Thus, after the display of an ad in a slot, it

is always advantageous to repeat an ad in the next slot (irrespective of whether the ad has

been clicked upon) in preference to displaying a new ad. Thus, both the static and dynamic

versions would repeat the same set of ads for all the slots. Hence, the expected revenue should

be the same for both versions.

C2. Suppose (i) the probability of a click due to the exposure effect is always decreasing (i.e.,

ei(k+1) < eik , ∀i, ∀k) and p is sufficiently small so that p + ei(k+1) < eik , ∀i, ∀k, and (ii)

there are sufficient ads available so that it is possible to fill all the slots without repeating

an ad. Since p + ei(k+1) < eik , ∀i, ∀k, it is never advantageous to repeat an ad (irrespective

of whether the ad has been clicked or not) in preference to displaying a new ad. Since there

are sufficient ads available to fill all the slots without repeating an ad, no ad would ever be

repeated in either the static or the dynamic version. Intuitively, if the exposure-based click

probability is decreasing with the number of exposures and p is very small, then it is not

advantageous to repeat an ad for display.

To provide a sound basis for the comparison of the expected revenues of the dynamic and static

versions, we first consider a special case of the problem wherein all the ads are of equal size with

the same effectiveness (i.e., ai0 = a0 , ai1 = a1 , and bi1 = b1 , ∀i) and the same revenue parameter

values (i.e., ai2 = a2 and bi2 = b2 , ∀i). In this case, an optimum solution can be obtained by simply

arranging the ads in the decreasing order of the click probability. Also, we restrict our analysis


to a two slot problem: the scheduling decisions made by the dynamic version in the second slot

illustrate the benefit of observing the click events in the first slot. The complete analysis of this

restricted problem is presented in the Online Supplement. In the following, we will generate several

hypotheses for the general problem that are valid for the restricted problem. The proofs of validity

of the hypotheses for the restricted problem are also provided in the Online Supplement. Then, in

Section 5.2, we check the validity of these hypotheses experimentally for the generalized problem.

Consider a general U-shaped function for the probability of a click due to the exposure effect,

as given in Equation (1). Condition C1 implies that the static and dynamic versions follow the

same schedule in the ascending portion of the U-shaped click probability. Moreover, if p is very

small, then Condition C2 implies that the static and dynamic versions follow the same schedule

in the descending portion of the U-shaped curve. Based on these observations, we state our first

hypothesis:

Hypothesis 1 If $p < \min_{i} \min_{\{k \,:\, e_{ik} > e_{i(k+1)}\}} \left( e_{ik} - e_{i(k+1)} \right)$, and there are sufficient ads available so that

it is possible to fill all the slots without repeating an ad, then the expected revenue is the same in

the static and dynamic versions.
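
The bound on p in Hypothesis 1 is straightforward to evaluate numerically. The Python sketch below is illustrative: it assumes the quadratic exposure effect eik = ai0 − ai1 k + bi1 k² used in this paper, the function names are hypothetical, the parameter values are chosen only for illustration, and limiting k to the number of slots is an assumption about which exposures matter.

```python
from typing import Iterable, Optional, Tuple


def exposure_click_probability(k: int, a0: float, a1: float, b1: float) -> float:
    """Quadratic exposure effect e_ik = a0 - a1*k + b1*k^2."""
    return a0 - a1 * k + b1 * k * k


def hypothesis1_threshold(
    ad_params: Iterable[Tuple[float, float, float]],  # (a0, a1, b1) per ad
    max_exposure: int,
) -> Optional[float]:
    """Smallest drop e_ik - e_i(k+1) over all ads i and all k where the curve decreases.

    Returns None if the curve never decreases for any ad in the range considered.
    """
    drops = []
    for a0, a1, b1 in ad_params:
        for k in range(1, max_exposure):
            drop = (exposure_click_probability(k, a0, a1, b1)
                    - exposure_click_probability(k + 1, a0, a1, b1))
            if drop > 0.0:
                drops.append(drop)
    return min(drops) if drops else None


# Illustrative values: three effectiveness levels, a1 = 0.0016, b1 = 0.00011, 10 slots.
ads = [(a0, 0.0016, 0.00011) for a0 in (0.07, 0.08, 0.09)]
threshold = hypothesis1_threshold(ads, max_exposure=10)
print(threshold)
```

A value of p below the returned threshold satisfies the first part of the premise; the second part (enough ads to fill all slots without repetition) has to be checked separately against the ad and slot sizes.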

In the static version, if the expected revenue from repeating an ad Ai is higher than that of

selecting any new ad (i.e., Ci(k+1) > Cj1 , ∀j ≠ i, ∀k), then Ai would be selected for display (in the

current slot) in preference to displaying a new ad. In this case, the gain from observing click events

in the dynamic version (as compared to the static version) should increase as the loss from making

sub-optimal decisions in the static version increases. In other words, if ai1 increases in Equation (1)

(i.e., the slope of the descending portion of the U-shaped click probability is increased), then the loss

from repeating an ad in the static version also increases. Based on this intuition, we state our next

hypothesis below.

Hypothesis 2 If the expected click probability from repeating an ad Ai is higher than that from

selecting a new ad (i.e., Ci(k+1) > Cj1 , ∀j ≠ i, ∀k), and there are sufficient ads available so that

it is possible to fill all the slots without repeating an ad, then the difference in the expected revenue

between the dynamic and static versions increases or remains the same with an increase in ai1 .


Intuitively, the advantage of observing actual click behavior should be more significant when

the value of p is higher. Hence, we propose the following hypothesis.

Hypothesis 3 The difference in the expected revenue between the dynamic and the static versions

increases or remains the same with an increase in p.

5.2 Exploratory Experiments

The mathematical justification (see the Online Supplement) of the various hypotheses in Section 5.1

is based on the analysis of the restricted version of the problem where equal-sized ads with the same

effectiveness level and the same revenue parameter values are scheduled over two slots. In this sec-

tion, we consider the general case (i.e., arbitrary ad sizes with different effectiveness levels, and

more than two slots) and compare the two versions experimentally for a wide variety of problem

instances. The main goal is to check the validity of all the hypotheses presented in Section 5.1. The

general principle for comparison is that the dynamic version uses click probabilities that are con-

ditioned on click events whereas unconditioned click probabilities are used in the static version. In

order to capture the re-click effect for the static version, the expected click probability is computed

as explained in Section 2.

Recall the following definitions from Sections 2 and 3 : n is the number of ads; N is the number

of slots; S is the height of a slot; the click probability in the kth exposure of ad Ai is given by

$e_{ik} = -a_{i1} k + b_{i1} k^2 + a_{i0}$, where ai0, ai1, and bi1 are constants with $a_{i1} \gg b_{i1}$; p is the re-click

probability; Cik is the total expected click probability for the kth exposure of ad Ai ; the expected

revenue, per unit size, from the kth exposure of ad Ai is Rik = ai2 + bi2 Cik , where ai2 and bi2 are

constants. For all the experiments in this section, we consider three levels of effectiveness of the

ads as follows: ai0 = 0.07 for one-third of the total number of ads, ai0 = 0.08 for the next one-third

ads, and ai0 = 0.09 for the remaining ads. Also, we choose ai1 = a1 , bi1 = b1 , ai2 = a2 , and

bi2 = b2 for i = 1, 2, . . ., n. Similar to the previous experiments, |Din | ≈ 0.1n and |Dex| ≈ 0.1n for

each problem instance. That is, there are about 0.1n inclusion constraints and the same number

of exclusion constraints.


5.2.1 Hypothesis 1

In order to experimentally explore this hypothesis, we first generate 9 problem instances corre-

sponding to three values of n (30, 35, and 45) and three values of N (5, 7 and 10). For all these

problem instances, the parameter values are as follows: a2 = 0.004, b2 = 0.75, and S = 40. The

values of a1 , b1 , and p are set such that p < eik − ei(k+1) whenever eik > ei(k+1) . Ad sizes are drawn

from the uniform distribution U (12, 28) to ensure that there are sufficient ads available to fill all

the slots without repeating an ad.

Note that the dynamic schedule depends on the set of click events. That is, the ads shown

in any slot depend on the click events on the ads shown in all previous slots. Therefore, for the

dynamic schedule, we simulate 100 sets of click events. We find that the dynamic schedule remains

unchanged for different sets of click events. Moreover, this schedule is the same as the static schedule.

This result holds for all the problem instances used in this experiment. Thus, Hypothesis 1 is valid

for all these problem instances.

The main intuition behind this hypothesis is that when p is sufficiently small, the actual click

event does not usually affect the dynamic schedule. As a result, the dynamic version results in

the same schedule as the static version. Condition C2 implies that it would never be advantageous

to repeat the ads in the descending portion of the U-shaped click probability curve, if p is sufficiently

small. However, it may be advantageous to repeat an ad if the increase in revenue of that ad in

the ascending portion of the click probability curve compensates for the decrease in revenue in the

descending portion of the curve. Note that the static schedule always considers the effect of p in

an expected sense and this effect increases with an increase in the number of exposures (as given

in Equation (2)). On the other hand, the effect of p is considered in the dynamic schedule only

when the ad is clicked (as given in Equation (3)). Therefore, for certain values of p and the click

parameters, it may be possible that the static version repeats the same ad whereas the dynamic

schedule does not. For such values, actual click events may affect the scheduling decision in the

dynamic schedule and Hypothesis 1 may not hold. We illustrate this phenomenon in the following

experiment.

The parameter values considered for this experiment are as follows: n = 30, N = 10, S =


[Figure 5: Effect of the linear click probability parameter and the re-click probability on the revenue. Panel (a): impact of a1 (0.0002 to 0.0020) on the improvement of the dynamic schedule over the static schedule. Panel (b): impact of the re-click probability p (0.001 to 0.01) on the improvement of the dynamic schedule over the static schedule.]

40, a1 = 0.0016, b1 = 0.00011, a2 = 0.004, and b2 = 0.75. We perform the experiment for two

values of p: 0.00016 and 0.000158; these parameter values represent the situation described above.

Although these values are not a good representation of reality, the main goal of this experiment is

to generate a condition for which Hypothesis 1 does not hold. The results show that the dynamic

and static versions indeed provide different solutions. We, therefore, conclude that Hypothesis 1

does not hold for all parameter values, but can be expected to hold for most realistic situations.

5.2.2 Hypothesis 2

To experimentally explore this hypothesis, we vary a1 from 0.0002 to 0.002 in steps of 0.0002, for a

total of 10 different values. The parameter values are as follows: n = 15, N = 5, S = 40, a2 = 0.004,

and b2 = 0.75. The values of b1 and p are set such that Ci(k+1) > Cj1 , ∀j ≠ i, ∀k. Ad sizes

are drawn from the uniform distribution U (12, 28) and, therefore, it is possible to fill all the slots

without repeating an ad.

In order to estimate the expected difference between the revenue of the dynamic and static

solutions, we simulate a large number of sets of click events for the dynamic schedule until this

difference converges (or the number of sets of click events reaches 10000). The results are shown in

Figure 5(a). This figure plots the improvement of the dynamic schedule over the static schedule for

different values of a1 ; the improvement increases with an increase in a1 . Thus, Hypothesis 2 is valid

for the parameters considered in this experiment. Since these parameters are chosen to represent


real-world conditions, the claim in Hypothesis 2 (the benefit of dynamic version increases with a1 )

can be expected to hold in practice.
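
The simulation procedure used in this experiment (sampling sets of click events until the estimated difference converges, up to 10000 sets) can be sketched as follows. The stopping rule is not specified in the text, so the rule below, which compares the running mean at successive checkpoints, is an assumption, as are the names `estimate_mean_difference`, `window`, and `tol`. The callable `sample_difference` stands for one simulated set of click events and returns the dynamic revenue minus the static revenue for that set.

```python
import random
from typing import Callable, Tuple


def estimate_mean_difference(
    sample_difference: Callable[[random.Random], float],
    max_sets: int = 10000,   # cap on the number of simulated click-event sets
    window: int = 100,       # assumed: check convergence every `window` sets
    tol: float = 1e-4,       # assumed: tolerance on the change of the running mean
    seed: int = 0,
) -> Tuple[float, int]:
    """Return (estimated mean difference, number of click-event sets used)."""
    rng = random.Random(seed)
    total = 0.0
    previous_mean = None
    for count in range(1, max_sets + 1):
        total += sample_difference(rng)
        if count % window == 0:
            mean = total / count
            if previous_mean is not None and abs(mean - previous_mean) < tol:
                return mean, count
            previous_mean = mean
    return total / max_sets, max_sets


# Degenerate check: a constant difference converges at the second checkpoint.
constant_mean, constant_count = estimate_mean_difference(lambda rng: 0.5)

# Noisy stand-in for a simulated revenue difference (purely synthetic).
noisy_mean, noisy_count = estimate_mean_difference(
    lambda rng: 1.0 if rng.random() < 0.3 else 0.0, tol=1e-3
)
print(constant_mean, constant_count, noisy_count <= 10000)
```

In an actual experiment, `sample_difference` would simulate the click events slot by slot, run the dynamic algorithm on them, and evaluate both schedules on the same events.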

5.2.3 Hypothesis 3

For this experiment, we consider 10 values of p varying from 0.001 to 0.01. Parameter values

are: n = 10, N = 5, S = 40, a2 = 0.004, and b2 = 0.75. In order to generate a wide variety

of problem instances, ad sizes are drawn from the uniform distribution U (2, 20). Similar to the

previous subsection, we again simulate up to 10000 sets of click events for the dynamic schedule.

The results of this experiment are shown in Figure 5(b) where the improvement in revenue from

the dynamic schedule is plotted against p. It is clear from Figure 5(b) that the hypothesis holds

for this experiment. Thus, we believe that the relative advantage of the dynamic approach over

the static approach should increase with the re-click probability parameter p.

5.3 Statistical Validation of Hypotheses

The experiments in the previous subsection provided an informal exploration of different hypotheses

concerning the performance of the dynamic and static algorithms. In this subsection, we provide a

rigorous statistical validation of these hypotheses whose likely validity has already been suggested

by the experiments.

Before we revisit the hypotheses for statistical validation, we briefly describe the testing data

that was generated. We consider three levels of effectiveness of the ads as follows: ai0 = 0.07 for

one-third of the total number of ads, ai0 = 0.08 for the next one-third ads, and ai0 = 0.09 for the

remaining ads. Also, we choose ai1 = a1 , bi1 = b1 , ai2 = a2 , and bi2 = b2 for i = 1, 2, . . . , n. Similar

to the previous experiments, |Din | ≈ 0.1n and |Dex | ≈ 0.1n for each problem instance. That is,

there are about 0.1n inclusion constraints and the same number of exclusion constraints.

To investigate Hypothesis 1, 81 problem instances were generated corresponding to three values

each of n (30, 35, and 45), N (4, 5, and 6), a2 (0.001, 0.004, and 0.007), and b2 (0.5, 0.75, and 1).

For all these problem instances, the slot size S was set to 40. The values of a1 , b1 , and p were set

such that p < eik − ei(k+1) whenever eik > ei(k+1) . To explore Hypotheses 2 and 3, 100 problem

instances were generated, corresponding to 10 values each of a1 (varying from 0.0003 to 0.003 in


steps of 0.0003) and p (varying from 0.002 to 0.02 in steps of 0.002). The other parameter values

were held constant: n = 10, N = 5, S = 40, a2 = 0.004, and b2 = 0.75. In order to generate a wide

variety of problem instances, ad sizes were drawn from the uniform distribution U (2, 20).

Hypothesis 1 states that under Condition A, stated below, we do not expect any significant

difference between the performance of the static and dynamic algorithms.

Condition A: $p < \min_{i} \min_{\{k \,:\, e_{ik} > e_{i(k+1)}\}} \left( e_{ik} - e_{i(k+1)} \right)$, and there are sufficient ads available so that it

is possible to fill all the slots without repeating an ad.

Hypothesis 1 is formally set up as follows:

H0 : Under Condition A, the performance of the static and dynamic algorithms is the same.

Ha: The performance is not the same.

Using the data generated to test Hypothesis 1, we find that the null hypothesis cannot be

rejected; the p-value of the two-tailed t-test is 0.99. Thus our data shows that under these special

conditions, the performance difference between the static and dynamic algorithms is statistically

insignificant.

To test Hypotheses 2 and 3 (see Section 5.1 for the statements of these hypotheses), we construct

the following regression model:

\[
\text{Dynamic Revenue} - \text{Static Revenue} = \alpha + \beta_1 a_1 + \beta_2 p + \epsilon
\]

The following are the results of the least-squares estimation:

$R^2 = 0.647$, $\hat{\beta}_1 = 214.46$, $\hat{\beta}_2 = 44.77$.

Both the coefficients have estimates that are positive and highly significant (p−values are equal

to 10−12 or lower), indicating that the performance difference between the dynamic and static

algorithms increases as a1 or p increase. Hence, Hypotheses 2 and 3 are strongly supported.
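For readers who want to reproduce this kind of fit, the following is a minimal pure-Python sketch of least-squares estimation with two regressors. The data are synthetic and noise-free, generated from made-up coefficients (0.1, 200, 40) purely to exercise the code; they are not the paper's data, and `ols_two_regressors` is a hypothetical helper name.

```python
def ols_two_regressors(y, x1, x2):
    """Least-squares fit of y = alpha + beta1*x1 + beta2*x2 via the normal equations."""
    rows = [(1.0, u, v) for u, v in zip(x1, x2)]
    # Augmented 3x4 matrix for (X^T X) b = X^T y.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(3):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][3] / A[i][i] for i in range(3)]

# Synthetic grid matching the ranges of a1 and p used in the experiment.
a1_grid = [0.0003 * i for i in range(1, 11)]
p_grid = [0.002 * j for j in range(1, 11)]
x1 = [a for a in a1_grid for _ in p_grid]
x2 = [p for _ in a1_grid for p in p_grid]
y = [0.1 + 200 * u + 40 * v for u, v in zip(x1, x2)]  # made-up coefficients
alpha, beta1, beta2 = ols_two_regressors(y, x1, x2)
```

With real experimental output, `y` would instead hold the observed revenue differences for each of the 100 instances.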

5.4 A Myopic SSK Heuristic for the Dynamic Version

Obtaining a dynamic schedule requires us to repeatedly solve the static problem. Since the static problem is strongly NP-hard, there is a need for an efficient heuristic that can provide an optimal or a near-optimal solution for large instances. We, therefore, propose a dynamic version of the SSK heuristic, and call it the myopic algorithm.

[Figure 6: A comparison of the look-ahead algorithm and the myopic algorithm for the dynamic version. x-axis: Re-Click Probability (p), 0.01 to 0.028; y-axis: Percentage Difference between the Revenue of Look-Ahead and Myopic Algorithms, 0.00% to 0.20%.]

In the dynamic version of the SSK heuristic, we exploit the knowledge gleaned from the user click behavior during a visit. The only difference between this heuristic and the static version of algorithm SSK (given in Section 3.1) is in the calculation of the value of $m_{ij}$ at Step 3. Recall that $m_{ij}$ is the expected revenue from assigning ad $A_i$ to slot $j$. In the dynamic version, $m_{ij} = a_{i2} + b_{i2}(e_{ik} + p')$, where $k = \sum_{q=1}^{j-1} x_{iq} + 1$. Here, $p' = p$ if the ad $A_i$ has been clicked in any of the slots preceding slot $j$, and $p' = 0$ otherwise.
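The modified Step 3 value can be sketched as a small function. This is an illustration under stated assumptions, not the authors' implementation: the name `myopic_value`, the argument layout, and the representation of the exposure-dependent click probabilities as a list are all hypothetical.

```python
def myopic_value(a2, b2, e_row, shown_before, clicked_before, p):
    """Expected revenue m_ij from showing one ad in the current slot.

    a2, b2         : revenue parameters a_i2 and b_i2 of the ad
    e_row          : e_row[k - 1] is e_ik, the click probability at the k-th exposure
    shown_before   : number of earlier slots in this visit that displayed the ad
    clicked_before : True if the user clicked the ad in any earlier slot
    p              : re-click probability
    """
    k = shown_before + 1                   # this display is the k-th exposure
    bonus = p if clicked_before else 0.0   # re-click bonus applies only after a click
    return a2 + b2 * (e_row[k - 1] + bonus)

# Illustrative numbers only: second exposure of an ad that was clicked earlier.
value_clicked = myopic_value(0.004, 0.75, [0.07, 0.06], 1, True, 0.02)
value_unclicked = myopic_value(0.004, 0.75, [0.07, 0.06], 1, False, 0.02)
```

The only state the heuristic must carry between slots is, per ad, the exposure count and whether a click has been observed.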

In Figure 6, we compare the look-ahead dynamic algorithm (explained earlier) with the myopic

algorithm. For this comparison, the parameter values are as follows: n = 10, N = 5, |Din| =

1, |Dex| = 1, and S = 40. The revenue parameters are: ai2 = 0.004 and bi2 = 0.75 for i = 1, 2, . . ., n.

For each problem instance, we consider three levels of effectiveness of the ads as follows: ai0 = 0.07

for the first three ads, ai0 = 0.08 for the next three ads, and ai0 = 0.09 for the last four ads.

For this comparison, ad sizes are drawn from uniform distribution U (2, 20). The results show

that the revenue of the myopic algorithm is within 0.2% of the revenue of the look-ahead dynamic

algorithm. Therefore, we conclude that the myopic algorithm is an effective and efficient alternative

to the look-ahead dynamic algorithm, especially for large instances. Since the dynamic schedule

operates in real-time, the myopic algorithm (which solves very quickly) should be a strong candidate


for use in practice.

We also observe in Figure 6 that the difference in revenue of look-ahead and myopic algorithms

is zero for larger values of p (p ≥ 0.022). For these values of p, the re-click effect overshadows

the exposure effect for the problem instances considered in this experiment, and therefore both

algorithms display the same set of ads in each slot.

6 Discussions and Recommendations

We now summarize and discuss the main results of this study, and offer guidelines for solving the

ad-scheduling problem in practice. Here, we address the overall question regarding the choice of a

scheduling method, namely static versus dynamic.

Our analysis and computational tests of the previous section illustrate that the dynamic ap-

proach, if implemented correctly, should increase revenue to the web site. However, the dynamic

approach depends on our knowledge of the re-click probability p. A natural question arises: how

critical is it to predict the value of p accurately? Clearly, if the success of the dynamic approach

critically depends on an accurate estimation of p, then its use may be limited by this requirement.

Figure 7 shows the effect of an error in estimating p on the revenue for both the static and

(look-ahead) dynamic versions. We assume a correct value of p and then vary the error in our

estimation of p to create a sequence of schedules using wrong values of p. The correct value of p is

used to evaluate all schedules.

The parameter values for this experiment are as follows: n = 10, N = 5, |Din| = 1, |Dex| = 1,

and S = 40. Also, ai2 = 0.004 and bi2 = 0.75 for i = 1, 2, . . ., n. For each problem instance,

we consider three levels of effectiveness of the ads as follows: ai0 = 0.07 for the first three ads,

ai0 = 0.08 for the next three ads, and ai0 = 0.09 for the last four ads.

The correct p value is fixed at 0.10. The error in p value is varied from -100% to +100% of the

correct p value. In Figure 7, the revenue of the static and dynamic versions for the correct value

of p corresponds to a percentage error of zero (on the x-axis). The results show that the revenue

changes only for those values of p that are significantly lower (-80% and -100%) than the correct

p value. Also, the revenue is not affected for any positive error values. We offer the following explanation: for values of p that are higher than the correct value, if an ad has been clicked upon in any slot, its click probability in all subsequent exposures becomes relatively higher. Consequently, an overestimation should not affect the scheduling decision. However, large negative error values (i.e., significantly underestimating the value of p) may reduce the chances of clicked ads being selected in subsequent slots and may impact the schedule.

[Figure 7: Impact of the error in re-click probability on the revenue. x-axis: Percentage Error in Re-Click Probability, -100 to 100; y-axis: Revenue, 14.1 to 15.3. Series: Static Revenue for correct re-click probability; Static Revenue for wrong re-click probability; Dynamic Revenue for correct re-click probability; Dynamic Revenue for wrong re-click probability.]

We also observe in Figure 7 that an error in p value affects the static and dynamic schedule in

a similar fashion. Moreover, the dynamic revenue is always greater than the static revenue at any

error level. Therefore, we conclude that it is always better to use the dynamic schedule even when

we do not have a good estimate of the value of p. Also, it is better to overestimate rather than

underestimate the value of p.

Although the look-ahead dynamic approach outperforms the static approach, one practical concern is the time needed to execute the dynamic approach. Note that while

the static approach computes the entire schedule in advance, the look-ahead dynamic approach

must recompute an updated schedule after every slot. When a real-time solution is desired, the

computational requirements of the look-ahead dynamic approach may be excessive. In Section 5.4,

we show that the myopic implementation of the dynamic approach solves very quickly and provides

solutions that almost match the performance of the look-ahead dynamic approach. The myopic

implementation may have another advantage over the look-ahead version. We discuss this next.

[Figure 8: A comparison of the myopic algorithm and the look-ahead algorithm when the number of slots is overestimated. x-axis: Actual Number of Slots, 1 to 5; y-axis: Percentage Difference between the Revenue of Look-Ahead and Myopic Algorithms, -4% to 1%.]

The look-ahead dynamic approach requires the knowledge of N , the planning horizon. In

practice, however, there is no guarantee that a customer will stay for the entire duration of the

horizon. Figure 8 shows the impact of a customer leaving the website earlier than the planned

horizon. The x-axis represents the departure point of a customer; the planning horizon is N = 5.

The y-axis shows the percentage difference of the revenue between the look-ahead and myopic

versions. The myopic version benefits when the customer departs earlier than planned. If the

customer departs after the first slot, the look-ahead and myopic versions exhibit almost the same

performance. The difference between the two schedules takes effect only after the first slot. When

the customer departs after two slots, the look-ahead version does not benefit from its futuristic

schedule; however, the scheduling choices made by the myopic schedule are well-suited for its

single-slot plan. The look-ahead gets progressively better as the actual departure of the customer

occurs beyond the second slot.

To summarize, on the issue of static versus dynamic scheduling, we recommend the use of the

dynamic approach. Within dynamic approaches, for practical reasons (real-time requirements, and

uncertain planning horizons), we favor the use of the myopic implementation.


7 Conclusions and Future Research Directions

Motivated by its increasing use in internet advertising practice, we consider a hybrid pricing model

for internet ad scheduling with the goal of maximizing revenue to the web site. A novel feature of

our work is the incorporation of user clicking behavior into ad scheduling. We consider two versions

of the problem: static and dynamic. The static version is formulated as an integer program. Most

realistic instances can be solved to optimality in reasonable time. However, keeping in mind that

the static version of the problem is strongly NP-hard and that solutions to the static version are

required in real-time by our algorithm for the dynamic version, we propose an efficient heuristic

based on repeatedly solving knapsack-like problems. A detailed computational study on a variety

of instances, selected by observing ad scheduling practice, reveals that our heuristic consistently

provides good quality solutions quickly. We also obtain valuable insights by examining the sensitivity of the solution to the model parameters. Next, we study a dynamic version wherein user click

behavior is observed and exploited in scheduling decisions. Inclusion of this feature distinguishes

internet ad scheduling from ad scheduling in a traditional medium, such as a magazine or a news-

paper. To solve the dynamic version, we propose a look-ahead algorithm and a myopic algorithm.

The performance of the myopic algorithm closely matches that of the look-ahead algorithm. More-

over, the solution time for the myopic algorithm is an order of magnitude lower than that for the

look-ahead algorithm and it is, therefore, more appropriate for online ad scheduling.

For the special case where equal-sized ads with the same effectiveness and the same revenue

parameter values are scheduled over two slots, we show that the expected revenue of the dynamic

version is at least as large as that of the static version. For the general case, we develop an

experiment to compare the two versions. The dynamic version is found to consistently outperform

the static version for a range of parameter values. In order to gain further insights into the dynamic

version, we conduct a sensitivity analysis that examines the effect of the re-click probability parameter

on the solution.

Ours is the first detailed computational study focused on optimizing web site revenue using a

hybrid pricing model. An important area for further exploration is towards a theoretical analysis


of hybrid pricing models. Although hybrid pricing models are gaining popularity and make good

business sense, there is no result or framework available to analyze their impact on the total

welfare of the advertising community (i.e., the web site owner and the advertisers). Providing

such a framework and proving that hybrid pricing improves the total welfare of the advertising

community will place it on a firm theoretical footing and will enhance its use in practice. Analyzing

the trade-off between the (opposing) goals of web site owners and advertisers will probably require

a game-theoretic framework.

References

[1] M. Adler, P.B. Gibbons and Y. Matias, “Scheduling Space-sharing for Internet Advertising,”

Journal of Scheduling, Vol. 5, Issue 2, 2002, 103-119.

[2] S. Baker, “The Online Ad Surge: Brand Advertising Online has Taken Off – and It is Shaking

Up Madison Ave,” Business Week, Nov. 22, 2004, 76.

[3] D. Brady and D. Kiley, “Online Ads: Ready For Their Close-Up,” Business Week, Nov. 22,

2004, 82.

[4] A. Caprara, H. Kellerer and U. Pferschy, “The Multiple Subset Sum Problem,” SIAM Journal

of Optimization, Vol. 11, Number 2, 2000, 308-319.

[5] P. Chatterjee, D.L. Hoffman and T.P. Novak, “Modeling the Clickstream: Implications for

Web-based Advertising Efforts,” Vanderbilt University Sloan Center for Internet Retailing,

May, 1998 (available at http://elab.vanderbilt.edu/research papers.htm).

[6] ClickZ Stats Staff, “Population Explosion,” ClickZ Network, March 16, 2005 (available at

www.clickz.com/stats/sectors/geographics/article.php/5911 151151 ).

[7] M. Dahlen, “Banner Advertisements Through a New Lens,” Journal of Advertising Research,

Vol. 41, No. 4, 2001, 23-30.


[8] M. Dawande, S. Kumar and C. Sriskandarajah, “Performance Bounds of Algorithms for

Scheduling Advertisements on a Web Page,” Journal of Scheduling, Vol. 6, Issue 4, 2003,

373-394.

[9] M. Dierkes and T. Yen, “Internet Users Earning $150K in Household Income Grow

20 Percent Year-to-year Leading all Income Groups, According to Nielsen//Netratings,”

Nielsen//Netratings Press Release, February 16, 2005.

[10] B. Dudley, “Microsoft Touts Ad-selling System as Step Ahead of its Competitors,” The Seattle

Times, March 17, 2005.

[11] K. Gallagher, K.D. Foster and J. Parsons, “The Medium is Not the Message: Advertising Ef-

fectiveness and Content Evaluation in Print and on the Web,” Journal of Advertising Research,

Vol. 41, Issue 4, Jul/Aug 2001, 57-70.

[12] M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of N P -

Completeness, Freeman, San Francisco, 1979.

[13] Interactive Advertising Bureau, “IAB Internet Advertising Report: 2002 Full-Year Results,”

June 2003

(available at http://www.iab.net/resources/adrevenue/pdf/IAB PwC 2002final.pdf ).

[14] Interactive Advertising Bureau, “IAB/PwC Release Fourth-Quarter and FY 2006 Internet Ad

Revenue Figures ,” March 7, 2007 (available at http://iab.net/news/pr 2007 03 07.asp).

[15] V. Kolluri, CEO, Chitika, http://www.chitika.com, Personal Communication, 2007.

[16] V. Kopytoff, “Yahoo, Google look to New Outlets: Ad Networks Could Expand to TV, Mobile

Phones,” San Francisco Chronicle, April 23, 2005.

[17] S. Kumar, V.S. Jacob and C. Sriskandarajah, “Scheduling Advertisements on a Web Page to

Maximize Revenue,” European Journal of Operational Research, Vol. 173, 2006, 1067-1089.

[18] M. Liedtke, “Yahoo’s 1Q Profit Doubles on Online Ads,” Washington Post, April 20, 2005.


[19] D.J. MacInnis, C. Moorman and B.J. Jaworski, “Enhancing and Measuring Consumers Motiva-

tion, Opportunity, and Ability to Process Brand Information from Ads,” Journal of Marketing,

Vol. 55, No. 4, 1991, 32-53.

[20] M. McCandless, “Web Advertising,” IEEE Intelligent Systems, May/June, 8-9, 1998.

[21] S. Menon and A. Amiri, “Scheduling Banner Advertisements on the Web,” Informs Journal

on Computing, Vol. 16, Issue 1, Winter 2004, 95-105.

[22] B.P.S. Murthi and S. Sarkar, “The Role of the Management Sciences in Research on Person-

alization,” Management Science, Vol. 49, No. 10, October 2003, 1344-1362.

[23] T.P. Novak and D.L. Hoffman, “New Metrics for New Media: Toward the Development of Web

Measurement Standards,” World Wide Web Journal (W3J), Vol. III, Issue I, Winter 1997.

[24] Nua Internet Surveys, “How Many Online,” http://www.nua.ie/surveys/how many online/,

clicked on January 20, 2006.

[25] J. Rewick, “Choices, Choices: A Look at the Pros and Cons of Various Types of Web Adver-

tising,” The Wall Street Journal, April 23, 2001, R12.

[26] SQL-Server-Performance.Com, “Hybrid CPC Banner Advertising Information,”

http://www.sql-server-performance.com/hybrid cpc.asp, clicked on August 01, 2005.

[27] G.J. Tellis, “Advertising Exposure, Loyalty, and Brand Purchase: A Two-Stage Model of

Choice,” Journal of Marketing Research, Vol. 25, May 1988, 134-44.

Subodha Kumar received the Ph.D. degree in Information Systems and Operations Management from the University of Texas at Dallas, Richardson, TX, in 2001. He is currently an assistant professor of information systems and operations management at the University of Washington Business School. His research interests include integer programming applications, combinatorial optimization, optimal software development methodologies, web advertising, and data quality. His papers have appeared in a number of research outlets, including Information Systems Research, Journal of Scheduling, IIE Transactions, European Journal of Operational Research, and Computers and Industrial Engineering. He also serves on the editorial board of Production and Operations Management Journal and Journal of Database Management.

Milind Dawande received the Ph.D. degree in algorithms, combinatorics and optimization from Carnegie Mellon University, Pittsburgh, PA, in 1997. Currently, he is an Associate Professor of Operations Management at the School of Management, University of Texas at Dallas, Richardson. Prior to joining academia, he was a Research Staff Member at IBM's T. J. Watson Research Center, Yorktown Heights, NY. His research interests are in combinatorial optimization. His papers have appeared in a number of research outlets, including Operations Research, Manufacturing and Service Operations Management, SIAM Review, INFORMS Journal on Computing, Journal of Scheduling, Journal of Algorithms, IIE Transactions, Interfaces, and the European Journal of Operational Research.

Vijay S. Mookerjee received the Ph.D. degree in Management with a concentration in MIS from Purdue University in 1991. He is currently a Charles and Nancy Davidson Distinguished Professor of information systems and operations management at the School of Management, University of Texas at Dallas. His research interests include optimal software development methodologies, content delivery systems, and the economic design of expert systems and machine learning systems. He has authored numerous articles in archival journals and refereed conference proceedings. He also serves on the editorial board of Management Science, Information Systems Research, INFORMS Journal on Computing, Operations Research, Decision Support Systems, and Journal of Database Management.


ONLINE SUPPLEMENT
to

Optimal Scheduling and Placement of Internet Banner Advertisements

Subodha Kumar∗ Milind Dawande† Vijay Mookerjee†

A.1 A Theoretical Comparison of Dynamic and Static Scheduling for a Restricted Problem

We present an analytical comparison of the dynamic and static versions for a restricted version of

the problem where equal-sized ads with the same effectiveness and the same revenue parameter

values are scheduled over two slots. Thus, ai0 = a0 , ai1 = a1 , bi1 = b1 , ai2 = a2 , and bi2 = b2 for

i = 1, 2, . . ., n. We assume that there are no exclusion or inclusion constraints. This analysis forms

the basis of the various hypotheses for the general problem we propose in Section 5.1. Since all

ads are assumed to be similar in this analysis, the identity of an ad is not important. Hence, for

simplicity of notation, we ignore the subscript denoting the index of an ad in all the parameters.

Given n equal-sized ads of which q fit in a slot, it is easy to see that both the static and the

dynamic versions yield the same expected revenue in the first slot. That is, any set of q ads selected

for display in the first slot will generate the same expected number of clicks and, hence, the same

expected revenue. It is therefore sufficient to compare the expected revenue of the two versions in

the second slot. Also, since we have equal-sized ads, this is equivalent to comparing the expected

number of clicks for the two versions in the second slot. In the following analysis, we will use the fact that the number of clicks, $\tau$, in the first slot is binomially distributed with mean $qe_1$ and $P(\tau = t) = P_1(t) = \binom{q}{t}(e_1)^t(1 - e_1)^{q-t}$. This follows since each of the $q$ ads displayed in the first slot has the same probability of being clicked (i.e., $e_1$).
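As a quick numerical illustration of this distribution, the sketch below tabulates $P_1(t)$ and confirms that the probabilities sum to one and that the mean equals $qe_1$. The function name `click_pmf` and the values q = 4 and e1 = 0.09 are illustrative assumptions.

```python
from math import comb

def click_pmf(q, e1):
    """P_1(t) for t = 0..q: probability that exactly t of the q displayed ads are clicked."""
    return [comb(q, t) * e1**t * (1 - e1)**(q - t) for t in range(q + 1)]

pmf = click_pmf(4, 0.09)
mean = sum(t * prob for t, prob in enumerate(pmf))  # should equal q * e1
```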



∗ Business School, University of Washington, Seattle; subodha@u.washington.edu

† School of Management, University of Texas at Dallas; {milind,vijaym}@utdallas.edu


We begin by analyzing the static schedule. After the display of the first slot, the click probability of ads that were clicked in the first slot is $e_2 + p$; that of the ads displayed but not clicked in the first slot is $e_2$; and that of the ads not displayed in the first slot is $e_1$. Since click events are not observed in the static version, the click probability (for scheduling decisions in the second slot) for ads that were displayed in the first slot is $e_1(e_2 + p) + (1 - e_1)e_2$. In order to calculate the expected number of clicks, $E_{stat}$, in the second slot, we consider the following two cases:

Case 1: $e_1(e_2 + p) + (1 - e_1)e_2 \ge e_1$.

In this case, all the $q$ ads displayed in the first slot will again be selected for display in the second slot. Assume that $t$ out of these $q$ ads were clicked in the first slot. For a given value of $\tau$ $(= t)$, the expected number of clicks in the second slot can be written as $d_1(t) = t(e_2 + p) + (q - t)e_2 = tp + qe_2$. Hence,

\[ E_{stat} = \sum_{t=0}^{q} P_1(t)\,d_1(t) = \sum_{t=0}^{q} (tp + qe_2)P_1(t) = qe_1p + qe_2 \tag{10} \]

Case 2: $e_1(e_2 + p) + (1 - e_1)e_2 \le e_1$.

    2a. $(n - q) \ge q$.

    A new set of $q$ ads (i.e., none of which were displayed in the first slot) will be selected for display in the second slot. Therefore,

    \[ E_{stat} = qe_1 \tag{11} \]

    2b. $(n - q) < q$.

    In the second slot, $(n - q)$ new ads will be selected for display and $q - (n - q) = 2q - n$ old ads will be selected from those displayed in the first slot. Assume that $l$ out of these $2q - n$ old ads were clicked in the first slot. Hence, $P(\tau = l) = P_2(l) = \binom{2q-n}{l}(e_1)^l(1 - e_1)^{2q-n-l}$. For a given value of $\tau$ $(= l)$, the expected number of clicks on the $2q - n$ ads is $d_2(l) = l(e_2 + p) + (2q - n - l)e_2$. Also, the expected number of clicks on the new ads is $(n - q)e_1$. Hence,

    \[ \begin{aligned}
    E_{stat} &= (n - q)e_1 + \sum_{l=0}^{2q-n} d_2(l)P_2(l) \\
             &= (n - q)e_1 + \sum_{l=0}^{2q-n} \{l(e_2 + p) + (2q - n - l)e_2\}P_2(l) \\
             &= (n - q)e_1 + (2q - n)(pe_1 + e_2)
    \end{aligned} \tag{12} \]
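The static cases above translate into a short function, sketched here for the restricted problem with $n \ge q$. The sums are evaluated explicitly rather than through the closed forms, so the closed forms in Equations (10) and (12) can be checked against them. The function names and the numeric inputs are illustrative assumptions.

```python
from math import comb

def binom_pmf(m, e1, t):
    """Probability of exactly t clicks among m displayed ads."""
    return comb(m, t) * e1**t * (1 - e1)**(m - t)

def expected_clicks_static(n, q, e1, e2, p):
    """Expected second-slot clicks under the static schedule (assumes n >= q)."""
    repeat_prob = e1 * (e2 + p) + (1 - e1) * e2  # click state is not observed
    if repeat_prob >= e1:
        # Case 1: all q first-slot ads are shown again; Equation (10).
        return sum((t * p + q * e2) * binom_pmf(q, e1, t) for t in range(q + 1))
    if n - q >= q:
        # Case 2a: q fresh ads; Equation (11).
        return q * e1
    # Case 2b: n - q fresh ads plus 2q - n repeated ads; Equation (12).
    old = 2 * q - n
    return (n - q) * e1 + sum(
        (l * (e2 + p) + (old - l) * e2) * binom_pmf(old, e1, l)
        for l in range(old + 1))

case1 = expected_clicks_static(10, 3, 0.07, 0.08, 0.01)
case2a = expected_clicks_static(10, 3, 0.09, 0.05, 0.01)
case2b = expected_clicks_static(4, 3, 0.09, 0.05, 0.01)
```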

Next, we analyze the dynamic schedule. As mentioned earlier, the expected revenue in the first slot is the same as that of the static version. As before, we assume that $t$ out of $q$ ads were clicked in the first slot and $P(\tau = t) = P(t) = \binom{q}{t}(e_1)^t(1 - e_1)^{q-t}$. Let $E_{dyn}$ be the expected number of clicks in the second slot. There are three cases:

Case 1: $e_1 \le e_2$.

In this case, all $q$ ads displayed in the first slot will again be selected for display in the second slot and therefore,

\[ E_{dyn} = qe_1p + qe_2 \tag{13} \]

Case 2: $e_2 < e_1 \le (e_2 + p)$.

All the ads that were clicked in the first slot (say $t$) will be selected for display in the second slot. For these ads, the expected number of clicks in the second slot is $t(e_2 + p)$. If possible, the remaining portion of the slot will be filled by new ads (i.e., those that were not displayed in the first slot). That is, if $(n - q) \ge (q - t)$ (or $t \ge 2q - n$), the contribution to $E_{dyn}$ is $\sum_{t=2q-n}^{q} \{t(e_2 + p) + (q - t)e_1\}P(t)$. Otherwise, $q - t - (n - q) = (2q - n - t)$ ads that were not clicked in the first slot will have to be chosen, and the contribution to $E_{dyn}$ is $\sum_{t=0}^{2q-n-1} \{t(e_2 + p) + (n - q)e_1 + (2q - n - t)e_2\}P(t)$. Therefore,

\[ \begin{aligned}
E_{dyn} &= \sum_{t=0}^{2q-n-1} \{t(e_2 + p) + (n - q)e_1 + (2q - n - t)e_2\}P(t) \\
        &\quad + \sum_{t=2q-n}^{q} \{t(e_2 + p) + (q - t)e_1\}P(t)
\end{aligned} \tag{14} \]

Case 3: $e_1 > (e_2 + p)$.

    3a. $(n - q) \ge q$.

    A new set of $q$ ads will be selected for display in the second slot.

    \[ E_{dyn} = qe_1 \tag{15} \]

    3b. $(n - q) < q$.

    Here, the slot is first filled with $(n - q)$ new ads. Next, the slot is filled by the ads that were clicked in the first slot (say $t$). If $t \le 2q - n$, then these $t$ ads are chosen along with $(2q - n - t)$ ads that were not clicked in the first slot. If $t \ge 2q - n$, then $(2q - n)$ ads that were clicked in the first slot will be chosen for display. Consequently,

    \[ \begin{aligned}
    E_{dyn} &= (n - q)e_1 + \sum_{t=0}^{2q-n} \{t(e_2 + p) + (2q - n - t)e_2\}P(t) \\
            &\quad + \sum_{t=2q-n+1}^{q} (2q - n)(e_2 + p)P(t)
    \end{aligned} \tag{16} \]
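The three dynamic cases can be sketched in the same style as the static ones, again for the restricted problem with $n \ge q$. The function name and the numeric inputs are illustrative assumptions; the per-$t$ branches follow Equations (13) to (16).

```python
from math import comb

def expected_clicks_dynamic(n, q, e1, e2, p):
    """Expected second-slot clicks under the dynamic schedule (assumes n >= q)."""
    pmf = [comb(q, t) * e1**t * (1 - e1)**(q - t) for t in range(q + 1)]
    if e1 <= e2:
        # Case 1: repeat all q ads; Equation (13).
        return q * e1 * p + q * e2
    old = 2 * q - n  # first-slot ads needed to fill the slot; may be <= 0
    if e1 <= e2 + p:
        # Case 2: clicked ads first, then new ads, then unclicked ads; Equation (14).
        total = 0.0
        for t, prob in enumerate(pmf):
            if t >= old:
                clicks = t * (e2 + p) + (q - t) * e1
            else:
                clicks = t * (e2 + p) + (n - q) * e1 + (old - t) * e2
            total += clicks * prob
        return total
    if n - q >= q:
        # Case 3a: q fresh ads; Equation (15).
        return q * e1
    # Case 3b: new ads first, then clicked ads, then unclicked ads; Equation (16).
    total = (n - q) * e1
    for t, prob in enumerate(pmf):
        if t <= old:
            total += (t * (e2 + p) + (old - t) * e2) * prob
        else:
            total += old * (e2 + p) * prob
    return total

dyn_case1 = expected_clicks_dynamic(10, 3, 0.07, 0.08, 0.01)
dyn_case2 = expected_clicks_dynamic(10, 3, 0.09, 0.05, 0.05)
dyn_case3a = expected_clicks_dynamic(10, 3, 0.09, 0.05, 0.01)
dyn_case3b = expected_clicks_dynamic(4, 3, 0.09, 0.05, 0.01)
```

For the Case 3b inputs, the static value from Equation (12) is 0.1918, so the dynamic value is slightly larger, as the analysis in Section A.2 predicts.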

A.2 Proofs of Validity of the Hypotheses in Section 5.1 for the Restricted
Problem

• Hypothesis 1: If e2 ≥ e1 , then the validity of the hypothesis for the restricted problem

follows from Condition C1. If e2 < e1 , then the hypothesis reduces to Condition C2 for the

restricted problem, because p < e1 − e2 .

• Hypothesis 2: The condition in the statement of the hypothesis refers to Case 1 of our

analysis of the static version and Cases 1 and 2 of the dynamic version in Section A.1. If

e1 < e2 , we have already shown in Condition C1 that the difference in expected revenue of

the dynamic and static versions is zero. Hence, we begin with the condition e2 < e1 ≤ e2 + p.

Given that there are sufficient ads available (i.e., $n \ge 2q$), the expected number of clicks in this case can be derived from Equation (14) as follows:

\[ E_{dyn} = qe_1(e_2 + p) + \sum_{t=0}^{q} \{(q - t)e_1\}P(t) \]

Using Equation (10), we get:

\[ \begin{aligned}
E_{dyn} - E_{stat} &= qe_1(e_2 + p) + \sum_{t=0}^{q} \{(q - t)e_1\}P(t) - (qe_1p + qe_2) \\
                   &= qe_2(e_1 - 1) + \sum_{t=0}^{q} \{(q - t)e_1\}P(t)
\end{aligned} \tag{17} \]

From Equation (1), it is easy to show that for a given value of e1 , the value of e2 decreases

with increase in a1 . Moreover, e1 < 1. Hence, Equation (17) shows that the difference in


the expected number of clicks between the dynamic and static versions increases with an

increase in a1 . Since all the ads are of equal size, the difference in the expected revenue of the

dynamic and static versions increases with an increase in a1 . Thus, the hypothesis is valid

for the restricted problem.
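As a supplementary worked step (not part of the original derivation), the remaining sum in the difference for Hypothesis 2 can be evaluated in closed form using $\sum_{t=0}^{q} P(t) = 1$ and $\sum_{t=0}^{q} tP(t) = qe_1$:

```latex
\begin{aligned}
\sum_{t=0}^{q} \{(q - t)e_1\}P(t) &= e_1\,(q - qe_1) = qe_1(1 - e_1), \\
E_{dyn} - E_{stat} &= qe_2(e_1 - 1) + qe_1(1 - e_1) = q(1 - e_1)(e_1 - e_2).
\end{aligned}
```

Under the condition $e_2 < e_1 < 1$ this difference is positive, and for fixed $e_1$ it grows as $e_2$ falls, which is consistent with the argument above.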

• Hypothesis 3: We will use our analysis in Section A.1. We begin with Case 1 of the dynamic

version. Condition C1 implies that Case 1 satisfies this hypothesis. Let us now consider Case 2

of the dynamic version. Using Equation (14), the expected number of clicks in this case can be written as:

\[ \begin{aligned}
E_{dyn} &= qe_1(e_2 + p) + \sum_{t=0}^{2q-n-1} \{(n - q)e_1 + (2q - n - t)e_2\}P(t) \\
        &\quad + \sum_{t=2q-n}^{q} \{(q - t)e_1\}P(t)
\end{aligned} \tag{18} \]

We now compare Edyn in (18) with different cases of the static version as follows:

(i) For Case 1 of the static version, using Equations (10) and (18), we get:

\[ \begin{aligned}
E_{dyn} - E_{stat} &= qe_2(e_1 - 1) + \sum_{t=0}^{2q-n-1} \{(n - q)e_1 + (2q - n - t)e_2\}P(t) \\
                   &\quad + \sum_{t=2q-n}^{q} \{(q - t)e_1\}P(t)
\end{aligned} \]

Clearly, this difference is not dependent on p and hence the hypothesis is valid.

(ii) We now consider Case 2a. of the static version. From Equations (11) and (18), it is easy to see that:

\[ \begin{aligned}
E_{dyn} - E_{stat} &= qe_1(e_2 + p - 1) + \sum_{t=0}^{2q-n-1} \{(n - q)e_1 + (2q - n - t)e_2\}P(t) \\
                   &\quad + \sum_{t=2q-n}^{q} \{(q - t)e_1\}P(t)
\end{aligned} \]

Here the difference increases with p. Hence, the hypothesis is again valid.

(iii) Finally, using Equations (12) and (18), for Case 2b. of the static version we have:

\[ \begin{aligned}
E_{dyn} - E_{stat} &= pe_1(n - q) + e_1(qe_2 - n + q) - e_2(2q - n) + \sum_{t=2q-n}^{q} \{(q - t)e_1\}P(t) \\
                   &\quad + \sum_{t=0}^{2q-n-1} \{(n - q)e_1 + (2q - n - t)e_2\}P(t)
\end{aligned} \]

Since $n \ge q$, Hypothesis 3 is valid.

Next, we consider Case 3 of the dynamic version (i.e., e1 > e2 + p). Here, e1 ≥ e1 (e2 + p) +

(1 − e1 )e2 , which is the condition in Case 2 of the static version. Condition C2 has already

shown that the expected revenue is same in Case 3a. of the dynamic version and Case 2a.

of the static version. From Equations (12) and (16), the difference in the expected number of clicks between Case 3b. of the dynamic version and Case 2b. of the static version can be written as:

\[ \begin{aligned}
E_{dyn} - E_{stat} &= \sum_{t=0}^{2q-n} \{t(e_2 + p) + (2q - n - t)e_2\}P(t) + \sum_{t=2q-n+1}^{q} (2q - n)(e_2 + p)P(t) \\
                   &\quad - (2q - n)(pe_1 + e_2) \\
                   &= (e_2 + p)e_1q + (2q - n)e_2 - e_2e_1q - (2q - n)(pe_1 + e_2) \\
                   &\quad + \sum_{t=2q-n+1}^{q} \{-t(e_2 + p) - (2q - n - t)e_2 + (2q - n)(e_2 + p)\}P(t) \\
                   &= p\Big[(n - q)e_1 + \sum_{t=2q-n+1}^{q} (2q - n - t)P(t)\Big]
\end{aligned} \]

Using the expressions $\sum_{t=0}^{q} tP(t) = qe_1$ and $2q - n \ge 0$, we have

\[ \begin{aligned}
E_{dyn} - E_{stat} &= \sum_{t=0}^{2q-n} p\Big(\frac{n - q}{q}\Big)tP(t) + \sum_{t=2q-n+1}^{q} \Big[\Big(\frac{n - q}{q}\Big)t + (2q - n - t)\Big]pP(t) \\
                   &= p\sum_{t=0}^{2q-n} \frac{n - q}{q}\,tP(t) + \sum_{t=2q-n+1}^{q} \frac{(2q - n)(q - t)}{q}\,pP(t)
\end{aligned} \]

Since all the coefficients of p are nonnegative, the hypothesis is valid in this case as well.

Based on the above, we can conclude that Hypothesis 3 is valid for the restricted version of

the problem.
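The last two expressions for the Case 3b difference can be cross-checked numerically. The sketch below evaluates both forms for one illustrative parameter setting; the function name `gap_case_3b` and the inputs are assumptions, and the check is a sanity test rather than a proof.

```python
from math import comb

def gap_case_3b(n, q, e1, p):
    """Return both forms of E_dyn - E_stat for Case 3b (dynamic) vs. Case 2b (static)."""
    pmf = [comb(q, t) * e1**t * (1 - e1)**(q - t) for t in range(q + 1)]
    old = 2 * q - n
    # Form 1: p[(n - q)e1 + sum_{t > 2q-n} (2q - n - t) P(t)]
    first = p * ((n - q) * e1
                 + sum((old - t) * pmf[t] for t in range(old + 1, q + 1)))
    # Form 2: every term carries a nonnegative coefficient of p.
    second = (p * sum((n - q) / q * t * pmf[t] for t in range(old + 1))
              + sum(old * (q - t) / q * p * pmf[t] for t in range(old + 1, q + 1)))
    return first, second

first_form, second_form = gap_case_3b(4, 3, 0.09, 0.01)
```

Note that $e_2$ does not appear in either form: in this case the gap depends only on $n$, $q$, $e_1$, and $p$.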
