
DHAKA CITY

Annesha Enam

Department of Civil Engineering,

Bangladesh University of Engineering and Technology

protiti56@gmail.com

Charisma F. Choudhury*

Department of Civil Engineering,

Bangladesh University of Engineering and Technology

cfc@alum.mit.edu

Text 6403

Total 7653

*Corresponding Author

Abstract

This paper presents the issues and challenges associated with development of a comprehensive

mode choice model for Dhaka, the capital of Bangladesh and the 11th largest city in the world.

Similar to most other developing countries, reliable level-of-service (LOS) data for the wide

variety of motorized and non-motorized modes used by the travelers are not available in

Bangladesh. In addition, the 12 million inhabitants of the city have wide differences in

affordability and accessibility to various modes. These result in substantial heterogeneity in their

choice-sets. These choice-sets are however also unobserved in the data and not easily inferable

from the limited information of the network.

In this paper, we identify the key limitations of the available data and propose methods to

overcome the limitations. A probabilistic choice-set of modes based on a small scale stated

preference (SP) survey has been used to account for the absence of actual choice-set data. The

systematic and stochastic errors in the network-derived time and cost data are explicitly

accounted for in the model structure. The improvements from the proposed approaches are

demonstrated by prediction tests using hold-out samples.

The proposed approaches have immense potential to improve travel mode choice models for

other cities of Bangladesh as well as cities in other developing countries which very often face

similar dearth of data.

1. Background

Bangladesh, a country of about 150 million people, has been going through rapid economic development. Dhaka, the capital city with a population of 12 million, has been influenced the most by the development processes and has been subjected to a high rate of urbanization. The

current urbanization level is around 30 percent and it is expected to rise to 50 percent by the

year 2050 (STP 2005). The city’s transport sector has been adversely affected by the rapid

urbanization and the economic development of the country. Nowadays, traffic congestion is an issue of great concern for the inhabitants of Dhaka, resulting in commuter frustration, longer travel times, lost productivity, increased accidents, higher fuel consumption, and deterioration in

air quality. Increasing the physical capacity is a very difficult option for the city since the ratio of

built-up areas is already approximated to be higher than 70% (Bari and Hasan 2001). Therefore

the solution of the problem requires increasing the operational capacity through demand and

supply management.

Though Dhaka is an old city (dating back to the 16th century), very few travel demand models have

been developed for the city so far. Among the previous models, a four step travel demand

modeling process was adopted in Dhaka Metropolitan Area Integrated Transport Study (DITS

1993) where the mode choice model was simplified into a binomial choice model between

private and public modes. Habib (2002) also developed a four step model for Dhaka city where

the results of the mode choice model were counterintuitive with positive sign of the coefficients

for time and cost parameters. Moreover, in his study, the coefficient for comfort was greater

than that of time and cost which is not normal for a developing country like Bangladesh. The

most extensive travel demand model for Dhaka in recent years is the Strategic Transport Plan

(STP 2005) where a wide-scale household interview survey has been conducted for the first time

with financial contribution from the World Bank. In the mode choice model of STP (Louis Berger

Inc. and BCL 2005), only two modes were considered i.e. Public Transport (PT) and Individualized

Motorized Vehicles (IMV). In the IMV group, cars and taxis were grouped together overlooking

their very different attributes (e.g. running cost, availability, accessibility, etc.). In addition, non-

motorized vehicles (rickshaw) were not considered for the mode choice model though 37% of

the person trips are made by rickshaw as reported in the same study (STP 2005). In the Binomial

Logit (BL) model three different models have been developed for three income groups and only

two explanatory variables (travel time and travel cost) were used. The model adopted pre-set rules for determining choice-sets and ignored the heterogeneity among respondents. Hasan (2007) developed a four step travel demand model in which he adopted a rule-based choice model for car (assuming that a traveler with access to car will always select car as the travel mode

regardless of the situational constraints) and a Multinomial Logit (MNL) model for the choice

among rickshaw, auto-rickshaw, taxi and bus. In the MNL model, it was assumed that all four

modes were available to all travelers. Only two explanatory variables were used for the model

specification (travel time and travel cost) and separate models were developed based on trip

purposes. Hasan’s model was based on STP data, but the LOS variables were updated using a supplemental survey (for cost) and outputs of the software EMME/2 (for travel time). The

potential measurement errors introduced in this process have however been ignored.

From the literature review, it was evident that the limitations of the available datasets played

key roles behind the deficiencies of the previously developed mode choice models and this has

prompted the current research. In this paper, the STP data (the most comprehensive travel data

source collected from Dhaka to date) have been explored in detail, the key modeling issues have

been identified and modeling approaches have been proposed to overcome these issues for

development of a more rigorous mode choice model. The improvements from the proposed

approaches have been demonstrated by prediction tests using hold-out samples.

The rest of the paper is organized as follows: A short description of the STP data is presented

first and the main limitations of the data are highlighted. The two key modeling issues, viz.

addressing the unobserved choice-set of respondents and correcting for measurement error in

LOS data, are detailed in the subsequent sections. In each case, the problem is described first, a review of the literature on state-of-the-art modeling approaches relevant to the problem is presented next, followed by the proposed methodology and the estimation results. Results of prediction tests using the improved model are then presented, followed by a summary of the research and future research directions.

2. Data

In this research, the Household Interview Survey (HIS) conducted as part of the Strategic

Transport Plan (STP) Study in 2005 has been used as the main data source. This is supplemented

by small-scale detailed surveys.

In the STP HIS, more than 6,000 households (STP 2005) were interviewed and a large amount of information was collected regarding the location and type of the households, socioeconomic characteristics of their members, vehicle ownership of the households, daily trips of the respondents and some attitudinal information about the respondents. The socioeconomic data are extensive and include age, gender, education level, employment status, occupation, driving license status, and the addresses of worksites and educational institutions. The daily trips of each member of the households were reported, yielding information on more than 30,000 trips. The trip diary consists of the origin-destination locations, start and end times between origin and destination (but not for each modal segment of the trip chain), and the purpose and transport modes of each trip segment. The attitudinal part comprises questions on, among other things, the reasons behind choosing the current mode of travel, existing problems of the current travel modes, and suggestions for the improvement of the traffic situation. More details of the data are available in STP (2005).

However, the STP data have two key limitations for the development of a mode choice model:

1. Unobserved choice-sets

2. Unobserved LOS, including absence of data on travel time and cost for unchosen modes

The problems and their consequences on model development are elaborated in the next two sections, along with proposed solutions.

In addition to the STP HIS, data from a small-scale supplemental survey have been used in this study. The supplemental survey was a stated preference (SP) survey in which 1016 samples were collected as part of this research. The survey questionnaire included a specific question on the respondent’s available modes. Besides the mode choice information in hypothetical scenarios, the survey provided information regarding the socio-economic characteristics of the respondent (e.g. age, gender, occupation, education, household income, car ownership, chauffeur availability, etc.) as well as attributes of the trip in question (e.g. purpose, duration, distance traveled, etc.).

3. Choice-set Generation

3.1. The problem

The classical mode choice model can be expressed as follows:

$$P(i \mid C_n) = P(U_{in} \ge U_{jn},\ \forall j \in C_n) \qquad (1)$$

Where,

$P(i \mid C_n)$ = Probability of choosing mode i among all modes in the choice-set $C_n$ of the respondent

$U_{jn}$ = Utility of mode j, where $j \in C_n$
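Equation 1 does not by itself fix a distribution for the utilities. As a minimal sketch, assuming i.i.d. Gumbel error terms (which gives the familiar logit form), the effect of the choice-set on the choice probabilities can be illustrated as follows. The function name and the utility values are hypothetical and are not taken from the STP data.

```python
import math

def mnl_probabilities(utilities, choice_set):
    """Logit choice probabilities restricted to the modes in choice_set."""
    exp_v = {mode: math.exp(utilities[mode]) for mode in choice_set}
    total = sum(exp_v.values())
    return {mode: value / total for mode, value in exp_v.items()}

# Hypothetical systematic utilities (illustrative values only).
utilities = {"bus": -1.0, "car": 0.5, "cng_taxi": -0.2, "rickshaw": -0.8}

# Same traveler, two different assumptions about the choice-set C_n.
full_set = mnl_probabilities(utilities, ["bus", "car", "cng_taxi", "rickshaw"])
no_car = mnl_probabilities(utilities, ["bus", "cng_taxi", "rickshaw"])

for mode in no_car:
    print(f"{mode:9s} with car: {full_set[mode]:.3f}   without car: {no_car[mode]:.3f}")
```

Wrongly including car in the choice-set of a traveler who has no access to it lowers the predicted probabilities of every other mode, which is the kind of misallocation the cited studies warn about.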

A basic premise for the theoretical development and practical utilization of discrete choice

models is that the analyst is correctly able to specify the set of modes from which an individual

decision maker chooses a given alternative. However, in practical model development, it is often

the case that only the chosen mode is known with certainty and it is unclear what modes were

available in the choice context and/or were actually considered by the respondent while making

the decision. The analyst is thus burdened with the task of specifying the choice-set. Previous

research has shown that incorrect representation of choice-sets and the imputation of choice-

sets by the use of logical rules can lead to biases in parameter estimation and errors in

forecasting due to misallocation of the alternatives (William and Ortuzar 1979). McFadden and Reid (1975) and Westin (1974) have shown similar results when the choice-sets are imputed by

some logical rules. Stopher (1980) has empirically shown the impact of captivity on the

estimation of parameters and forecasting with a binary mode choice model. Stopher showed

that the estimated coefficients were smaller and less significant and the alternative specific

constants were larger and more significant than in the “true” model where the true model was

estimated by excluding the captive users. Hensher (1983) has pointed out that for the accurate

estimation of travel time the choice model must include only the “choosers” i.e. the model must

exclude the captive users.

In developed countries, transport network data, car ownership and driving license possession provide reasonable bases for constructing the choice-sets for mode selection. But in developing countries like Bangladesh, the transport network data are not structured. Para-transits like human-haulers (and even in some cases traditional transits like buses) operate beyond their permitted routes, which makes it impossible to infer the accessibility to these modes from a given zone using transport network data. Moreover, in developing countries the

affordability of people plays a vital role in determining the choice-set, both for business and

non-business trips. Because of large household sizes and low car ownership rates, the number

of users per car is generally very high and car ownership alone is often not a suitable proxy for

car availability. For instance, if there are five members and a single car in the household, some

household members may get priority over others for the car usage (e.g. school going children,

elderly people, etc.). Moreover, shared use of the same chauffeur-driven car adds complexity to car availability, since possession of a driving license is then no longer correlated with car availability either¹. Therefore, a car is often not chosen, not because of the LOS tradeoffs, but rather because it is not in the choice-set in the first place.

On the other hand, affluent people who have access to a car are often unaware of what public transport options exist. This is exacerbated by the fact that the public transport modes have very poor traveler information systems. For example, there are no published timetables, no information regarding the routes or timetables at the bus stops, no options to access public transport related information via internet or phone, etc. Therefore, a bus or para-transit is often

not chosen, not because of the LOS considerations, but rather because of its exclusion from the

choice-set.

As described in section 1, in the previous mode choice models developed for Dhaka City, different deterministic rules have been used for specifying the choice-sets (e.g. for a household with a car, the car is always in the choice-set, and public transport is available either to all (e.g. STP 2005) or only to the non-car owners (e.g. Hasan 2007)). Such simplistic rule-based choice-sets can lead to wrong definitions of choice-sets and subsequently wrong parameter estimates. The importance of correctly specifying choice-sets and the deficiencies in the choice-set generation of the previous mode choice models prompt this research, in which we develop and test a choice-set generation model that predicts the choice-set probabilistically for the choice context of Dhaka using socio-economic characteristics, origin-destinations and trip purposes.

In discrete choice models developed before the 1970s, it was assumed either that all the alternatives were available to all the decision makers or that some logical rules could be used to determine the choice-set (e.g. in mode choice models, no car was assumed to be available to an individual without a driving license). Lerman (1975) recognized the inappropriateness of allocating all alternatives to all individuals, which led to the widespread use of imputed choice-sets in discrete choice modeling (e.g. Ben-Akiva and Lerman 1974).

In classical economic choice theories, individual choice behavior in cases with unobserved

choice-sets has been modeled as a two-stage sequential process (Manski 1977): i) The

determination of an individual’s choice-set Cn; and ii) With the Cn well defined the individual

chooses an alternative according to some pre-established decision rules (e.g. utility

maximization).

¹ In Bangladesh, the majority of households with a car can afford to employ a chauffeur, since the average monthly wage of a chauffeur is often as little as $50.

An analyst who has limited information about an individual’s choice-set can treat the choice-set generation model as either deterministic or probabilistic, depending upon the degree of confidence s/he places in the information at hand. There are many examples in the literature of the application of deterministic choice-sets (e.g. Ben-Akiva and Lerman 1974, Train 1980) as well

as probabilistic choice-set generation models (e.g. Wermuth 1978, Swait 1984). In the

probabilistic approach, a separate choice-set generation model is used to predict the choice-set

stochastically; the probability of observing alternative j being chosen by individual n can

therefore be expressed as follows:

$$P_n(j) = \sum_{C \in G_n} P_n(j \mid C)\, P_n(C) \qquad (2)$$

Where,

C = an element of Gn (C ⊆ Mn);

Gn = the set of all nonempty subsets of Mn;

Mn = the set of all deterministically feasible alternatives for individual n (Mn ⊆ M); and

M = the universal choice-set, made up of all possible alternatives available for the choice context and population in question.

The formulation in equation 2 involves three components:

1. A probabilistic choice model, Pn(j|C), conditioned on the choice-set being C ∈ Gn, which by

definition yields choice probabilities of zero for j ∉ C;

2. A deterministic choice-set generation model that determines the subset Mn from the set M;

and

3. A probabilistic choice-set generation model, Pn(C), expressing the probability that set C ⊆ Mn

is the individual’s actual choice-set.

A high degree of computational complexity is implied by equation 2. If |X| denotes the number of elements in any set X, then |Gn| is equal to $2^{|M_n|} - 1$, of which $2^{|M_n| - 1}$ choice-sets actually contain any given alternative j ∈ Mn.
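The counting argument and the summation in equation 2 can be checked with a short sketch. The conditional model is assumed here to be a logit, and the choice-set probability P_n(C) is assumed to follow an independent-availability form renormalised over nonempty sets; both are illustrative assumptions rather than the model estimated in this paper, and all utilities and availability probabilities are hypothetical.

```python
import math
from itertools import combinations

def nonempty_subsets(modes):
    """All nonempty subsets of the feasible set M_n, i.e. the set G_n."""
    return [frozenset(c) for r in range(1, len(modes) + 1)
            for c in combinations(modes, r)]

def mnl(utilities, choice_set):
    """Conditional logit probabilities P_n(j | C) (assumed form)."""
    total = sum(math.exp(utilities[m]) for m in choice_set)
    return {m: math.exp(utilities[m]) / total for m in choice_set}

def independent_availability(choice_set, avail):
    """P_n(C) when each mode is available independently (assumed form)."""
    p_empty = math.prod(1 - p for p in avail.values())
    p_set = math.prod(p if m in choice_set else 1 - p for m, p in avail.items())
    return p_set / (1 - p_empty)  # renormalise over the nonempty sets

def choice_probability(mode, utilities, avail):
    """Equation 2: sum over C in G_n of P_n(mode | C) * P_n(C)."""
    return sum(mnl(utilities, c)[mode] * independent_availability(c, avail)
               for c in nonempty_subsets(list(utilities)) if mode in c)

# Hypothetical inputs for a four-mode feasible set.
utilities = {"bus": -1.0, "car": 0.5, "cng_taxi": -0.2, "rickshaw": -0.8}
avail = {"bus": 0.9, "car": 0.2, "cng_taxi": 0.6, "rickshaw": 0.7}

subsets = nonempty_subsets(list(utilities))
n_total = len(subsets)                         # 2**4 - 1
n_with_car = sum("car" in c for c in subsets)  # 2**(4 - 1)
probs = {m: choice_probability(m, utilities, avail) for m in utilities}
print(n_total, n_with_car, probs)
```

With four feasible modes the enumeration is trivial, but the number of terms doubles with every added mode, which is the computational burden noted above.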

McFadden (1976a) was, to the best of our knowledge, the first to formulate a model following the

above choice processes; he considered a choice situation where an individual is either captive to

an alternative or free to choose from Mn. This logit captivity model was also independently

developed by Ben-Akiva (1977) and Gaudry and Dagenais (1979). The latter named this the

“Dogit” model. Other approaches of probabilistic choice-set generation model include

Independent Availability Logit Model (Swait 1984) and Parameterized Logit Captivity model (e.g.

Swait and Ben-Akiva 1985).

In the Logit Captivity model, an individual is assumed either to be captive to a single alternative or to be free to choose from among the full set of deterministically available alternatives. The Independent Availability Logit model assumes that the probability of availability of an alternative is independent of the availability, or lack thereof, of any other alternative. In the Parameterized Logit Captivity model, the choice set is modeled as a function of independent variables such that the probability of a mode being included in the choice set is a function of socio-economic characteristics of the travelers. This can be expressed as follows:

characteristics of the decision-maker and attributes of alternative i, B = (b1, ..., bl, ..., bL) is a vector of parameters of the MNL choice model, and Yi is a vector of socio-economic characteristics of the individual and attributes of alternative i. Note that the two vectors Xi and Yi do not generally contain the same variables. Xi should include those

variables thought to explain captivity to alternative i, whereas Yi should include variables

(perhaps partially or totally overlapping with Xi) which influence choice of i from among the

alternatives of C.
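The captivity idea can be sketched with the basic (non-parameterized) logit captivity, or Dogit, form: with some probability the traveler is captive to one mode, and otherwise chooses freely by a logit over all feasible modes. In the sketch below the captivity odds are fixed numbers; in a parameterized variant each of them would instead depend on traveler characteristics. The function name and all values are hypothetical.

```python
import math

def logit_captivity(utilities, captivity_odds):
    """Logit captivity (Dogit) choice probabilities.

    captivity_odds[i] >= 0 is the odds of being captive to mode i
    relative to choosing freely from the full feasible set.
    """
    exp_v = {m: math.exp(v) for m, v in utilities.items()}
    denom_free = sum(exp_v.values())
    norm = 1.0 + sum(captivity_odds.values())
    probs = {}
    for m in utilities:
        p_captive = captivity_odds[m] / norm   # captive to mode m
        p_free = 1.0 / norm                    # free to choose among all modes
        probs[m] = p_captive + p_free * exp_v[m] / denom_free
    return probs

# Hypothetical utilities and captivity odds (illustrative values only).
utilities = {"bus": -1.0, "car": 0.5, "cng_taxi": -0.2, "rickshaw": -0.8}
captivity_odds = {"bus": 0.6, "car": 0.1, "cng_taxi": 0.0, "rickshaw": 0.2}

dogit_probs = logit_captivity(utilities, captivity_odds)
print(dogit_probs)
```

Setting every captivity odds term to zero recovers the ordinary MNL model, which is why the captivity model departs from the standard linear-in-parameters logit only through these extra terms.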

Both the Logit Captivity model and Independent Availability Logit models have been applied to

investigate the mode choice process of the city of Maceio, Brazil (Swait and Ben-Akiva 1985a).

The model focused on the home based work mode choice for full-time workers. The outcome of

the model highlighted an important practical challenge. It indicated that the application of the

probabilistic choice-set generation process cannot be arbitrary. Rather, it must account for the

population in question as well as the source of constraints on it. In this particular choice context, the logit captivity model performed better than the independent availability logit model in the low income group; the opposite was true for the high income group; and in the middle income group neither model was clearly preferable.

A drawback of the probabilistic choice-set formation models is the greatly increased difficulty of

calibrating them. The departure from the standard logit linear-in-parameters formulation can be

costly because the convenient property of concavity of the log-likelihood function, which

guarantees the uniqueness of the parameters at the point of convergence, is lost. Hence a

greater degree of care and sophistication on the part of the analyst as well as specialized

estimation software are necessary.

In this paper, stated choice-set data (from the supplemental SP data) has been used for

developing the choice-set generation model. This developed model will be applied to revealed

preference data of the HIS to predict the missing choice-set information.

In the proposed approach the parameters of a choice-set probability model are estimated first

using stated preference (SP) survey data. In the supplementary survey, an explicit question was

included on the availability of different modes as perceived by the respondents. Each

respondent was presented with a list of typical travel modes in Dhaka and asked what modes

were available to them for this particular trip. The answer of the respondents explicitly revealed

the modes considered by the respondents for the trip in question. These stated choice-sets, along with the comprehensive socio-economic and trip-related data of the travelers, have been used to form the choice-sets in the RP data.

For the mode choice context and the population of Dhaka city, the universal choice-set consists of bus/tempo, car, CNG/taxi and rickshaw². In the extreme cases, the traveler is captive to a single mode or considers all modes in the choice-set. The total number of nonempty subsets of this universal choice-set is fifteen. The subsets of the universal choice-set are as follows:

3. bus, CNG/taxi, tempo

4. bus, CNG/taxi, rickshaw, tempo

5. bus, car, CNG/taxi, rickshaw, tempo

6. bus, car, rickshaw

7. bus, car, CNG/taxi

8. bus, car

9. rickshaw, tempo

10. CNG/taxi, tempo

11. CNG/taxi, rickshaw

12. car, rickshaw

13. car, CNG/taxi, rickshaw

14. car, CNG/taxi

15. car

These choice-sets can be broadly categorized into one of the following six groups.

Public Transport Group : Includes bus and tempo;

Public and Personalized Public Transport Group: Includes bus/tempo, CNG/taxi and

rickshaw;

Personalized Public Transport Group: Includes CNG/taxi and rickshaw ;

All mode Group: Includes all the four modes mentioned earlier;

Car and Personalized Public Transport Group: Includes all the four modes except

bus/tempo;

Car Group: Includes only car.

The candidate socioeconomic attributes and mode characteristics expected to affect the choice-set generation process, and the related a priori hypotheses, are presented in Table 1.

² Tempos are low-cost para-transits; CNGs are auto-rickshaws that run on compressed natural gas.

Table 1: Candidate variables and a priori hypotheses

Monthly Household Income (HHI): Due to considerable disparity in monthly household income, household income is supposed to play a vital role in the determination of the choice set of the individual, the propensity of having choice-sets with cars and individual modes increasing with income.

Gender: Due to social norms and culture, male and female passengers do not feel free to share transit vehicles, especially in congested situations, and female passengers also try to avoid public transit due to safety concerns. Female respondents are therefore less likely to have information about public transport routes and to include these modes in choice-sets.

Education & Occupation: These variables are supposed to be somewhat correlated with HHI, and highly educated white-collar employees are likely to have higher propensities of having choice-sets with cars and individual modes.

Age: Due to relatively low fares, students are typically more inclined towards public transit and more familiar with the routes and timetables. They are therefore more likely to include these modes in choice-sets. Aged people, on the other hand, are less likely to have choice-sets that include public transport modes.

Trip Purpose: Consideration of the modes may also be strongly affected by the purpose of the trip; e.g. people may be willing to consider transits in choice-sets for work trips while they may exclude the same modes from their choice sets for social trips with family members.

Travel Duration: Due to ease of accessibility and door-to-door service, there are higher probabilities of inclusion of rickshaws in the choice-sets for shorter trips, while for long trips mobility is more important and rickshaws are not considered as viable options.

To get an idea of how these variables affect the availability of different modes in the choice-sets of an individual, an exploratory analysis was carried out, in which the correlation with the choice-sets was found to be highest for income, age and gender of the traveler and duration and purpose of the trip.

Based on these findings, a discrete choice modeling technique has been used to estimate the utility parameters of different choice-sets. A linear utility function is associated with each choice-set. The utility of a choice-set i of individual n can be expressed as follows:

$$U_{in} = \beta_i X_{in} + \varepsilon_i$$

Where

Xin = socio-economic characteristics of the individuals and attributes of different modes,

βi = Coefficient of Xin,

εi= Random error term,

Cn = Universal choice-set or the choice-set determined deterministically for individual n.

Ideally, the parameters of the choice set generation model should be estimated jointly with the corresponding mode choice model. In the current research, however, the parameters of the two models have been estimated sequentially due to limitations of the estimation software. This implies that the correlation between the error terms of the choice set generation model and the mode choice model has been ignored.

3.4. Results

As discussed earlier, choice-set consideration is affected by the attributes of the alternative modes and the socioeconomic characteristics of individuals. However, not all of the candidate variables mentioned in Table 1 were found to be statistically significant and/or to have intuitive signs. The estimation was started with monthly household income, and other attributes were added step by step. A variable was retained only if it produced a significant improvement in the goodness-of-fit (adjusted rho-square), its parameter was significant and its sign was intuitive. For example, the socioeconomic attribute education was not found to be significant.

The utility functions are presented below and the estimation results using BIOGEME

(http://roso.epfl.ch/biogeme) are presented in Table 2.

Uall = αCNG_taxi * one + αrickshaw * one + αcar * one + αbus_tempo * one + βincome2 * income2

Ubus_car = αbus_tempo * one + αcar * one

Ubus_car_CNG_taxi = αbus_tempo * one + αCNG_taxi * one + αcar * one

Ubus_CNG_taxi = αCNG_taxi * one + αbus_tempo * one + βincome1 * income1 + βttvlong * vlong + βtp1 * tp1 + βage1 * age1

Ubus_CNG_taxi_rick = αbus_tempo * one + αCNG_taxi * one + αrickshaw * one + βincome1 * income1 + βttvlong * vlong + βtp1 * tp1 + βage1 * age1

Ubus_rick = αbus_tempo * one + αrickshaw * one + βincome1 * income1 + βttvlong * vlong + βtp1 * tp1 + βage1 * age1

Ubus_tempo = αbus_tempo * one + βttvlong * vlong + βtp1 * tp1 + βincome1 * income1 + βage1 * age1

Ucar_CNG_taxi = αCNG_taxi * one + βfemale * female + βincome2 * income2

Ucar_CNG_taxi_rick = αcar * one + αCNG_taxi * one + αrickshaw * one + βfemale * female + βincome2 * income2

Ucar_rickshaw = αcar * one + αrickshaw * one + βfemale * female + βincome2 * income2

UCNG_taxi_rick = αCNG_taxi * one + αrickshaw * one + βfemale * female + βincome2 * income2

Urickshaw = αrickshaw * one + βttvshort * vshort + βfemale * female + βincome2 * income2

Where,

αbus_tempo, αcar, αCNG_taxi, αrickshaw are the alternative specific constants associated with the corresponding modes;

βage1 = coefficient of age for individuals 18 to 25 years old;

βfemale = coefficient of the female dummy;

βincome1 = coefficient of income for monthly incomes of less than or equal to 20000 BDT;

βincome2 = coefficient of income for monthly incomes of more than or equal to 50000 BDT;

βtp1 = coefficient of the trip purpose dummy for educational and work trips;

βttvlong = coefficient of the travel time dummy for trip durations of greater than 45 minutes;

βttvshort = coefficient of the travel time dummy for trip durations of 15 to 30 minutes.

Table 2: Estimation results of the choice-set generation model

Model: Multinomial Logit

Number of estimated parameters: 10

Number of observations: 756

Number of individuals: 756

Null log-likelihood: -1878.589

Init log-likelihood: -1878.589

Final log-likelihood: -1372.242

Likelihood ratio test: 1012.694

Rho-square: 0.270

Adjusted rho-square: 0.264

Name          Value    t-test

αbus_tempo    0        -

αcar          0.344    2.55

αCNG_taxi     1.14     12.65

αrickshaw     -1.15    -13.6

βage1         0.555    3.26

βfemale       0.912    4.42

βincome1      2.28     7.13

βincome2      0.43     2.1

βtp1          1.09     6.87

βttvlong      1.4      8.73

βttvshort     2.03     6.07
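The summary statistics reported above are mutually consistent, which can be verified with a few lines of arithmetic. The sketch assumes the usual definitions (likelihood ratio statistic as twice the log-likelihood gain, rho-square as one minus the ratio of final to null log-likelihood, and the adjusted version penalising the final log-likelihood by the number of parameters); that the estimation software uses exactly these definitions is an assumption, although the reported figures agree with them.

```python
import math

# Values reported in the estimation output.
null_ll = -1878.589
final_ll = -1372.242
n_params = 10
n_obs = 756
n_alternatives = 12  # choice-sets used in the model

likelihood_ratio = -2.0 * (null_ll - final_ll)
rho_square = 1.0 - final_ll / null_ll
adjusted_rho_square = 1.0 - (final_ll - n_params) / null_ll

# Null model: every observation assigns equal probability to each alternative.
equal_share_ll = n_obs * math.log(1.0 / n_alternatives)

print(f"Likelihood ratio test: {likelihood_ratio:.3f}")
print(f"Rho-square:            {rho_square:.3f}")
print(f"Adjusted rho-square:   {adjusted_rho_square:.3f}")
print(f"Equal-share null LL:   {equal_share_ll:.3f}")
```

The equal-share log-likelihood reproduces the reported null log-likelihood, which suggests that all twelve choice-sets were treated as available to each of the 756 respondents.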

Note that twelve of the fifteen choice-sets mentioned earlier have been used for the choice-set generation model. Three choice-sets have not been considered separately because of their very small numbers of observations. They have been merged with the choice-sets whose characteristics most closely resemble theirs. For example, the car-only choice-set has been merged with the car, CNG/taxi choice-set.
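To illustrate how the estimated model assigns probabilities to the twelve choice-sets, the sketch below transcribes the utility functions and the Table 2 estimates and applies the MNL formula to one respondent. The respondent profile (a low-income traveler aged 18 to 25 on a work or educational trip) and the dictionary and function names are hypothetical; the coefficient values and the terms entering each utility are copied as reported, including the car, CNG/taxi utility, which is listed with only the CNG/taxi constant.

```python
import math

# Alternative specific constants and coefficients as reported in Table 2.
ASC = {"bus_tempo": 0.0, "car": 0.344, "cng_taxi": 1.14, "rickshaw": -1.15}
BETA = {"age1": 0.555, "female": 0.912, "income1": 2.28, "income2": 0.43,
        "tp1": 1.09, "vlong": 1.4, "vshort": 2.03}

LOW = ("income1", "vlong", "tp1", "age1")   # dummies in bus-based sets
HIGH = ("female", "income2")                # dummies in car/personalized sets

# choice-set name -> (constants entering, dummy variables entering)
SPEC = {
    "all": (("cng_taxi", "rickshaw", "car", "bus_tempo"), ("income2",)),
    "bus_car": (("bus_tempo", "car"), ()),
    "bus_car_cng_taxi": (("bus_tempo", "cng_taxi", "car"), ()),
    "bus_cng_taxi": (("cng_taxi", "bus_tempo"), LOW),
    "bus_cng_taxi_rick": (("bus_tempo", "cng_taxi", "rickshaw"), LOW),
    "bus_rick": (("bus_tempo", "rickshaw"), LOW),
    "bus_tempo": (("bus_tempo",), LOW),
    "car_cng_taxi": (("cng_taxi",), HIGH),  # as listed in the source
    "car_cng_taxi_rick": (("car", "cng_taxi", "rickshaw"), HIGH),
    "car_rickshaw": (("car", "rickshaw"), HIGH),
    "cng_taxi_rick": (("cng_taxi", "rickshaw"), HIGH),
    "rickshaw": (("rickshaw",), ("vshort",) + HIGH),
}

def choice_set_probabilities(person):
    """MNL probabilities over the twelve choice-sets for one respondent."""
    utilities = {
        name: sum(ASC[a] for a in ascs)
        + sum(BETA[d] * person.get(d, 0) for d in dummies)
        for name, (ascs, dummies) in SPEC.items()
    }
    denom = sum(math.exp(u) for u in utilities.values())
    return {name: math.exp(u) / denom for name, u in utilities.items()}

# Hypothetical respondent: low income, aged 18 to 25, work/educational trip.
student = {"income1": 1, "age1": 1, "tp1": 1}
probs = choice_set_probabilities(student)
for name, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{name:18s} {p:.3f}")
```

In the proposed approach, probabilities of this kind would be computed for each respondent in the RP data to stand in for the unobserved choice-sets.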

The results indicate that all else being equal, the rickshaws have the smallest probabilities of

being included in the choice-set. This is probably due to the fact that rickshaws are banned in

the major streets and certain OD pairs are served by rickshaws only through small streets and

alleys (which are often not known to everyone). Therefore, people are likely to exclude

rickshaws from their choice-sets because of lack of this network information. CNG/taxi on the

other hand has the highest value of alternative specific constant (ASC) indicating higher

likelihood of it being included in the choice-sets. However, it should be noted that the ASCs in SP

studies are not representative of market shares; rather, they merely indicate the part of the

utility unexplained by the explanatory variables (Bliemer et al. 2009).

Coefficients of income have different values and sensitivities for different income groups. The coefficient of income for monthly incomes of less than or equal to 20000 BDT³ is highly significant (at more than the 95% confidence level) for the choice-sets of the public and personalized public transport user groups, which is intuitive for the context of the city since members of this income group generally have no or shared access to a car and are well aware of public transport availability. On the other hand, the coefficient of income for monthly incomes of greater than or equal to 50000 BDT is statistically significant for the people whose choice-sets include car and do not include bus.

The dummy term introduced for female respondents indicates that female respondents have a significantly higher likelihood of considering the choice-sets that include car, CNG/taxi and rickshaw.

The trip purpose dummy for educational and work trips exhibits a significantly higher preference of the user for the choice-sets that include bus and other personalized public transport, for the same reason as for the income group of less than or equal to 20000 BDT.

Two travel duration dummies have been introduced in the model. Both of them are highly

significant. The long trip duration dummy (trip duration more than 45 minutes) indicates that

public transports are more likely to be included in the choice-set for long trips while short trip

dummy (trip duration 15 to 30 minutes inclusive) indicates that rickshaw is more likely to be

included in the choice-set for short trips.

The coefficient of age indicates that, as hypothesized, the young commuters have a significant

preference for the choice-sets which include bus.

4.1. The Problem

High quality LOS data are essential for the accuracy of the estimates of the mode choice model. Walker et al. (2007) use synthetic data to demonstrate that a model with measurement error may result in inconsistent estimates of parameters.

In the context of Dhaka, the available data from the STP Household Interview Survey (HIS) contain only the stated travel times of the respondents/travelers for the chosen mode. The travel times of the unchosen modes and the fares of all the modes are missing from the data set.

Footnote 3: 1 BDT = 0.014 USD

The data has been supplemented by Hasan (2007) who has calculated the distances between

different Traffic Analysis Zones (TAZ) from network assignment using the network coded in

EMME/2 for the STP study. These distances have been used along with the assumed speeds of

different vehicles for the determination of the travel time between OD pairs by different modes.

It has been acknowledged that the resulting travel time would have some measurement errors

resulting from two major sources stated below:

1. The distances are zone-to-zone distances, i.e. they are not the actual distances between the origin and destination of the traveler. Though zone-to-zone distances may suffice for aggregate analysis (e.g. trip distribution and trip assignment), they may not be sufficient for models estimated at the disaggregate level.

2. The traffic on the streets of Dhaka is heterogeneous in nature and, due to congestion as well as the chaos introduced by weak lane discipline, vehicles do not necessarily run at their free-flow speeds. In fact, vehicle speed is not only a function of the mode but also depends on the road characteristics (e.g. width, surface quality, traffic mix, presence of non-motorized traffic, etc.). Since a single speed per mode has been used to calculate the travel time, ignoring these sources of heterogeneity, some error is likely to have been introduced.
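The network-based travel time calculation described above reduces to dividing a zone-to-zone distance by a single assumed speed per mode. A minimal sketch follows; the speed values and the function name are hypothetical placeholders, not figures from the STP study or Hasan (2007).

```python
# Hypothetical assumed speeds (km/h) per mode; placeholders for illustration only.
ASSUMED_SPEED_KMH = {"bus": 15.0, "car": 25.0, "cng_taxi": 20.0, "rickshaw": 8.0}


def network_travel_time_min(distance_km: float, mode: str) -> float:
    """Travel time in minutes from a zone-to-zone distance and one speed per mode."""
    return 60.0 * distance_km / ASSUMED_SPEED_KMH[mode]


# Every trip between the same zone pair gets the same time for a given mode,
# regardless of the actual origin/destination or the road conditions en route.
times = {mode: network_travel_time_min(5.0, mode) for mode in ASSUMED_SPEED_KMH}
```

Both error sources listed above are visible here: the distance ignores where within the zones the trip starts and ends, and the speed ignores congestion and road characteristics.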

The STP Household Interview Survey (HIS) data, on the other hand, contain the stated travel times of the respondents, but only for the chosen modes, and cost data are missing entirely from the collected HIS data. Hasan (2007) conducted a small-scale HIS to calculate an average fare for the different modes.

Brownstone et al. (2001) developed and demonstrated a technique to overcome measurement errors in travel data. In this method, multiple imputed values are generated for each observation and separate choice models are estimated for each set of imputed data. However, this multiple imputation technique is valuable only when validation data are available (Ben-Akiva et al. 2002). Later, Steimetz and Brownstone (2005) presented a similar method for measurement error correction which also involves multiple imputations, but its drawback is that it can only be used when a subsample of accurate observations is available. Other works also focus on improving level-of-service measurements (e.g. Ortuzar and Ivelic 1987, Ortuzar and Willumsen 2001, Kim et al. 2006). The task of measurement error correction also extends to other types of transport models (e.g., Gunn and Whittaker 1981).

Recently, Walker et al. (2007) proposed a method to address the measurement errors in network-derived LOS data for the work trip context of Chengdu, China. This method is described here in brief due to the similarity of the context.

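Walker et al. treat the true travel time as a latent variable that is known only up to a distribution $f_T(\cdot)$ and a set of estimated parameters $\theta$, i.e. $f_T(t^{True};\theta)$. For example, the true travel time may be normally distributed as $t^{True} \sim N(\mu,\sigma_T^2)$. The distribution of the measured travel time is conditional on the true travel time and a set of estimated parameters $\lambda$ as follows:

$f_M(t^{Measured} \mid t^{True};\lambda)$ (5)

Now the probability of choosing mode $i$ conditional on a set of estimated parameters $\beta$ and the explanatory variable $t^{True}$ can be expressed as follows:

$P(i \mid t^{True};\beta)$ (6)

Since $t^{True}$ is unknown, it is necessary to integrate the choice probability given in equation (6) over the distribution of $t^{True}$ as follows:

$P(i;\beta,\theta) = \int P(i \mid t^{True};\beta)\, f_T(t^{True};\theta)\, dt^{True}$ (7)

Now, accounting for $t^{Measured}$, the likelihood function for the entire framework becomes:

$P(i, t^{Measured};\beta,\lambda,\theta) = \int P(i \mid t^{True};\beta)\, f_M(t^{Measured} \mid t^{True};\lambda)\, f_T(t^{True};\theta)\, dt^{True}$ (8)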

In the above equation, each latent variable adds a dimension of integration, and when the number of latent variables exceeds three, simulation must be used instead of numerical integration.
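When the integral over the latent travel time has no closed form, it can be approximated by simulation: draw the true travel time from its assumed distribution, evaluate the conditional logit probability for each draw, and average. The sketch below does this for the choice probability only (not the full likelihood with the measurement equation), with a single latent travel time. The function name, the normal distribution parameters and all numbers are illustrative assumptions, not values from Walker et al. (2007).

```python
import numpy as np


def simulated_choice_probs(times, asc, beta_time, latent_alt, mu, sigma,
                           n_draws=5000, seed=0):
    """Average logit probabilities over draws of one latent (true) travel time."""
    rng = np.random.default_rng(seed)
    # One row of travel times per draw; only the latent alternative varies.
    t = np.tile(np.asarray(times, dtype=float), (n_draws, 1))
    t[:, latent_alt] = rng.normal(mu, sigma, size=n_draws)  # assumed t_true ~ N(mu, sigma^2)
    v = np.asarray(asc, dtype=float) + beta_time * t        # systematic utilities
    v -= v.max(axis=1, keepdims=True)                       # numerical stability
    p = np.exp(v)
    p /= p.sum(axis=1, keepdims=True)                       # conditional logit probabilities
    return p.mean(axis=0)                                   # simulated unconditional probabilities


# Hypothetical three-mode example in which the first mode's travel time is latent.
probs = simulated_choice_probs(
    times=[30.0, 20.0, 45.0], asc=[0.0, -1.0, 0.5],
    beta_time=-0.1, latent_alt=0, mu=30.0, sigma=5.0,
)
```

Each additional latent variable would add another set of draws, which is why simulation scales better than numerical quadrature as the dimension grows.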

Walker et al. (2007) treated the travel time for one mode (out of five in the choice-set) as a latent variable. The fit of the model improved only slightly compared with the logit model, but the Value of Time (VOT) increased significantly from 7.72 yuan/hour to 12.94 yuan/hour. The VOT estimated using the hybrid model was closer to the average income of the area (15 yuan/hour).

In the context of Dhaka, the set of stated travel time values was too small for using the imputation method. Using complicated integrated methods like the one proposed by Walker et al. (2007) was also not possible due to limitations of the available software. Therefore, a simpler and more tractable approach is proposed in the following section.

In the present study, the stated travel time (which is available only for the chosen alternatives) is assumed to be the true value of the travel time. A relationship has been developed between the true travel time and the travel time obtained from network assignment and assumed speeds, in order to explore potential systematic variations between the two and to determine the necessary correction factors.

A similar approach could also be used to address the measurement errors in the travel fares of the respondents. Unfortunately, no travel fares stated by the respondents were available.

In order to address the measurement errors in the travel time, the data were cleaned first: observations with very large discrepancies between the zone-to-zone distance and the distance stated by the traveler between the origin and destination locations were excluded. A regression analysis was then performed on the cleaned data to develop a relationship between the true and measured travel times between the OD pairs. The travel times stated by the travelers have been treated as the true measure of travel time, while the calculated travel times have been treated as the measured travel times with errors. The proposed relationship between the true and measured travel times can be expressed as follows:

$t^{True} = \alpha + \beta\, t^{Measured} + \epsilon$ (9)

Where,

$t^{True}$ is the true value of the travel time, i.e. the travel time stated by the traveler;

$t^{Measured}$ is the measured value of the travel time;

$\alpha$, $\beta$ are the systematic components to be estimated by regression analysis;

$\epsilon$ is the random error component, whose mean has been taken to be zero for the analysis.

In this analysis, we test the hypothesis of the presence of systematic components of error

through statistical tests. It may be noted that α indicates the fixed component of the

measurement error (if any), β indicates the systematic scale difference between the true and

the measured values (if any) and є represents the random part of the error.
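The regression in equation (9) and the significance test on the constant can be sketched with ordinary least squares. The example below uses synthetic data, since the survey records are not reproduced here; the function name, the generated values and the 1.4 scale factor are illustrative assumptions, not estimates from this study.

```python
import numpy as np


def fit_travel_time_correction(t_measured, t_stated):
    """OLS fit of t_stated = alpha + beta * t_measured + error, with t-statistics."""
    x = np.asarray(t_measured, dtype=float)
    y = np.asarray(t_stated, dtype=float)
    X = np.column_stack([np.ones_like(x), x])          # constant and measured time
    coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    n, k = X.shape
    sigma2 = resid @ resid / (n - k)                   # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    return {
        "alpha": coef[0], "beta": coef[1],
        "t_alpha": coef[0] / se[0], "t_beta": coef[1] / se[1],
        "adj_r2": adj_r2,
    }


# Synthetic data: stated times are a scaled version of network times plus noise.
rng = np.random.default_rng(0)
t_measured = rng.uniform(5.0, 60.0, size=500)
t_stated = 1.4 * t_measured + rng.normal(0.0, 3.0, size=500)
result = fit_travel_time_correction(t_measured, t_stated)
```

A small absolute value of `t_alpha` would suggest dropping the constant and re-estimating with the scale term only, which mirrors the two-step procedure described in the results.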

4.4. Results

The estimation was started by pooling all the modes together. The pooled regression had an adjusted R-square value of 0.849. The constant was statistically insignificant, with a t-statistic of -0.880, and was consequently dropped in the second trial of the estimation. Separate models were then estimated for the different modes to check whether they differ significantly from the pooled model or from each other.

Based on the estimated coefficient values, the seven modes were combined into four groups, and another set of regression analyses was carried out for each group separately. The final estimation results are provided in the table below:

Modes                                 Regression Equation    Adjusted R square value

Tempo & Public Transport (bus)                               0.973

Taxicab & CNG                                                0.929

Private Car/Microbus & Motor Cycle                           0.890

Rickshaw                                                     0.873

The estimation results indicate that as hypothesized there are some systematic measurement

errors and the above equations can be used to correct the measurement errors of the calculated

travel times before estimating the mode choice models.

5. Prediction Tests

To test the improvements in the model from the proposed approaches, mode choice models with and without the corrections were estimated using 7431 observations of the STP HIS data. The estimated MNL model consisted of a choice among only four modes, i.e. rickshaw, car (private car and microbus), CNG & taxi and public transit (bus and tempo), and had generic time and cost coefficients. The same specification was used for both models, but the base model was estimated without the measurement error corrections and assuming the universal choice-set, while the second model was estimated incorporating both the measurement error correction and the probabilistic choice-set.

Parameters       Estimated parameter values (t-statistics)

                 Base Model            Proposed Model

αbus             0                     0

αcar             -4.59 (-46.91)        -6.9981 (-24.51)

αrick            1.75 (37.72)          3.8645 (-21.07)

αcng_taxi        -2.35 (-31.59)        -4.4326 (-22.45)

βtraveltime      -0.137 (-40.23)       -0.2379 (-23.283)

βtravelcost      -0.0307 (-20.25)      -0.0413 (-15.401)

Here, an increase in the scale factor of the MNL model is noted, since the utility function parameters are greater in magnitude in the proposed model than in the base model. This scale parameter, which is not separately identifiable in linear-in-parameters specifications, is inversely related to the variance of the underlying Gumbel distribution. Hence, an increase in the scale factor corresponds to a decrease in the variance of the stochastic component of the utility functions (Ben-Akiva and Lerman 1985).
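For reference, the relation between the scale and the error variance can be written out as follows. This is the standard Gumbel result rather than something estimated here; $\mu$ denotes the scale parameter and $\beta$ the underlying taste coefficients, symbols introduced only for this illustration.

```latex
\mathrm{Var}(\varepsilon) = \frac{\pi^{2}}{6\,\mu^{2}},
\qquad
\hat{\beta}_{\mathrm{estimated}} = \mu\,\beta
```

Because only the product $\mu\beta$ is estimated, uniformly larger coefficient magnitudes are consistent with a larger $\mu$ and hence a smaller error variance.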

The estimated parameters were then applied to a hold-out sample of 100 observations (not used for estimation), and the goodness-of-fit statistics were calculated in terms of log-likelihood (LL) using the two sets of estimated parameters. The LL value was -73.90 with the corrected model parameters, whereas it was -107.63 with the base model, a substantial increase in goodness-of-fit that demonstrates the superiority of the corrected model in terms of forecasting.
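The hold-out comparison can be sketched as follows: compute MNL probabilities from a set of estimated parameters and sum the log-probabilities of the chosen modes. The parameter values are the base-model estimates from the table above, but the three hold-out trips (times, costs, choices), their units and the function name are hypothetical, so the resulting number is not comparable to the reported LL values.

```python
import numpy as np


def mnl_log_likelihood(asc, beta_time, beta_cost, time, cost, chosen):
    """Sum of log MNL probabilities of the chosen alternatives."""
    time = np.asarray(time, dtype=float)
    cost = np.asarray(cost, dtype=float)
    chosen = np.asarray(chosen)
    v = np.asarray(asc, dtype=float) + beta_time * time + beta_cost * cost
    v -= v.max(axis=1, keepdims=True)                         # numerical stability
    log_p = v - np.log(np.exp(v).sum(axis=1, keepdims=True))  # log of logit probabilities
    return float(log_p[np.arange(len(chosen)), chosen].sum())


# Alternative order: bus, car, rickshaw, CNG/taxi (base-model ASCs from the table).
asc_base = [0.0, -4.59, 1.75, -2.35]
# Hypothetical hold-out trips: travel times and costs per alternative.
time = [[40.0, 25.0, 50.0, 28.0], [15.0, 10.0, 18.0, 11.0], [60.0, 35.0, 90.0, 40.0]]
cost = [[10.0, 120.0, 40.0, 90.0], [5.0, 60.0, 15.0, 50.0], [15.0, 200.0, 80.0, 150.0]]
chosen = [0, 2, 0]

ll_base = mnl_log_likelihood(asc_base, -0.137, -0.0307, time, cost, chosen)
# Benchmark with all parameters at zero: every mode gets probability 1/4.
ll_equal_shares = mnl_log_likelihood([0.0] * 4, 0.0, 0.0, time, cost, chosen)
```

As a point of reference, and assuming all four modes are available to every traveler, the equal-shares benchmark for a 100-observation hold-out sample would be 100 × ln(1/4) ≈ -138.63, against which the reported values of -107.63 and -73.90 can be read.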

6. Conclusions

In this study, an effort has been made to identify the key limitations of the RP mode choice data available for Dhaka and to correct them by developing a choice-set generation model and a measurement error correction model (to correct the LOS variables).

The estimation results have several policy implications. The estimated parameters of the choice-set generation model indicate that, in general, female travelers and travelers with comparatively high monthly incomes are reluctant to include public transport modes in their choice-sets. Therefore, in order to increase the ridership of these groups, special incentives (e.g. women-only buses/compartments, low-floor vehicles, etc.) need to be provided. The estimated parameters also imply that people have a strong inclination to include public transport in their choice-sets, compared to other modes, for educational and work trips. Policy makers can utilize this fact by providing additional benefits to this group of travelers, such as low fares for students and frequent bus departures during the morning and evening peak hours.

The study, however, has several limitations as well. For example, instead of estimating an integrated model with choice-set generation, measurement error correction and actual mode choice, the sub-models have been estimated separately. As mentioned in Section 4.3, this was due to the limitations of the available software and will be addressed in future research. Besides, in the measurement error correction, only the travel times of the modes were corrected; a similar approach will be explored in the future to correct the travel costs of the different modes. Further, a very simple model structure was used in the prediction tests. This was done only for demonstration purposes, and in future research we plan to enrich these prediction models by using more advanced model forms (e.g. nested logit, cross-nested logit, mixed logit, etc.), by including more socio-economic data, and by combining SP choice scenarios in which respondents were asked to compare improved public transport modes such as Bus Rapid Transit (BRT) and Metro Rail with their current modes and select the one they perceive to be the best. The proposed integrated model structure, using a structural and measurement equation framework, is presented in Figure 01.

[Figure 01. Diagram labels: Characteristics, Measured Values, Supplementary survey, Choice-set, Attributes of Modes, Errors. Legend: observable variable, unobservable variable, behavioral relationship, measurement relationship.]

Figure 01: Choice model with the consideration of choice-set generation model and correction

for measurement errors of the LOS values

The figure shows that the choice-set of the respondents, the attributes of the modes and the utilities of the different modes are not directly stated by the respondents and are therefore unobserved by the analyst. The observed variables include the socio-economic characteristics of the respondents, the measured values of the trip attributes from network analysis and the revealed and stated choices of the respondents. The analyst has to rely on the measurement relationships between the observed and unobserved variables to arrive at the unobserved variables. In this process, error terms are associated with the unobserved variables. However, the model may take a closed form (logit) if the errors are assumed to be independent and identically distributed (iid).

The developed models have considerable potential to better predict the ridership of the proposed improved urban transport initiatives in Dhaka city. Although the modifications proposed in this paper have been formulated with the context of Dhaka and the available data in mind, the methodologies adopted in this research can provide useful guidance for mode choice model development in other developing countries, which often face a similar dearth of data and similar modeling challenges.

Acknowledgements

The Stated Preference Survey data collection for this study has been supported by Bureau of

Testing Research and Consultancy, BUET and the Japanese International Cooperation Agency

(JICA). Any opinions, findings and conclusions or recommendations expressed in this publication

are those of the authors and do not necessarily reflect the views of BUET or JICA.

References

1. Bari M. F. and Hasan M. (2001), “Effect of Urbanization on Storm Runoff Characteristics of

Dhaka City”, Tsinghua University Press. XXIX IAHR Congress. Beijing.

2. Ben-Akiva M, McFadden D, Train K, Walker J, Bhat C, Bierlaire M, Bolduc D, Boersch-Supan A,

Brownstone D, Bunch D, Daly A, de Palma A, Gopinath D, Karlstrom A, Munizaga A, (2002),

"Hybrid Choice Models: Progress and Challenges", Marketing Letters 13(3), 163-175.

3. Ben-Akiva, M. (1977), "Choice Models with Simple Choice-set Generation Processes", Working Paper, Dept. of Civil Engineering, MIT, Cambridge, MA.

4. Ben-Akiva, M. and Lerman, S. (1974), “Some Estimation Results of a simultaneous model of

Auto Ownership and Mode Choice to Work”, Transportation, 4, 4, 357-376.

5. Ben-Akiva, M. and Lerman, S.R. (1985), “Discrete Choice Analysis: Theory and Application to

Travel Demand” The MIT Press, Cambridge, MA.

6. Bierlaire, M. (2003) BIOGEME: A free package for the estimation of discrete choice models,

Proceedings of the 3rd Swiss Transportation Research Conference, Ascona, Switzerland.

7. Bliemer M., Rose J. and van Blokland (2009) Experimental Design Influences on Stated Choice

Outputs: An Empirical Study in Air Travel Choice, 12th Conference of the International

Association of Travel Behavior Research, Jaipur, India.

8. Brownstone, D., Golob, T. F., and Kazimi, C. (2001), "Modeling Non-ignorable Attrition and Measurement Error in Panel Surveys: An Application to Travel Demand Modeling", Chapter 25 in Survey Nonresponse, Editors, R.M. Groves, D. Dillman, J. L. Eltinge and R.J.A. Little, New York: Wiley, forthcoming.

9. DITS (1993), "Greater Dhaka Metropolitan Area Integrated Transport Study", Prepared by PPk Consultants Declan International and Development Design Consultant (DDC), Dhaka.

10. Gaudry, M. and Dagenais M. (1979), "The Dogit Model", Trans. Res. B., 13B, 105-111.

11. Gunn HF, Whittaker JC, (1981), "Estimation errors for well-fitting gravity models", Working

Paper 149. Institute for Transport Studies, University of Leeds.

12. Habib, K. M. N. (2002), "Evaluation of Planning Options to Alleviate Traffic Congestion and Resulting Air Pollution in Dhaka City", M.Sc. Thesis, Department of Civil Engineering, BUET, Dhaka.

13. Hasan, S. (2007), "Development of a Travel Demand Model for Dhaka City", M.Sc. Thesis, Department of Civil Engineering, BUET, Dhaka.

14. Heijden, R E C M and Timmermans, H J P (1984), "Modeling Choice-set Generating Processes via Stepwise Logit Regression Procedures: Some Empirical Results", Environment and Planning A, 16, 1249-1255.

15. Hensher, D. (1978), “Valuation of Journey Attributes: Some Existing Empirical Evidence”, in Determinants of Travel Choice, D. Hensher and Q. Dalvi, eds., Saxon House.

16. Kim HK, Wu SK, M Hunger M, (2006), "A Case Study on Measuring Travel Time, Speed, and Delay Using GPS-Instrumented Test Vehicles", Applications of Advanced Technology in Transportation, 9th International Conference, Chicago IL.

17. Lerman, S. (1975), “A Disaggregate Behavioral Model of Urban Mobility Decisions”,

Unpublished Ph.D. Thesis, Department of Civil Engineering, Massachusetts Institute of

Technology, Cambridge, MA.

18. Manski, C. (1977), “The Structure of Random Utility Models”, Theory and Decision, 8, 229-254.

19. McFadden, D. (1976a), “The Multinomial Logit Model When the Population Contains

‘Captive’ Subpopulations”, Unpublished Memorandum, September 13, 1976.

20. McFadden, D. (1981), "Econometric Models of Probabilistic Choice", in Structural Analysis of Discrete Data with Econometric Applications (edited by C. Manski and D. McFadden), MIT Press, Cambridge, MA.

21. McFadden, D. and Reid, F. (1975), “Aggregate Travel Demand Forecasting from Disaggregate

Behavioral Models”, Presented at the Annual TRB Meeting, Washington, DC.

22. Ortúzar J de D, Ivelic AM, (1987), "Effects of using more accurately measured level-of-service

variables in the specification and stability of mode choice model.", Proceeding 15th PTRC

Summer Annual Meeting, P290,117-130. PTRC, London.

23. Ortúzar J de D, Willumsen LG, (2001), "Modeling Transport", Wiley.

24. Steimetz SSC, Brownstone D (2005), "Estimating Commuters ‘Value of Time’ with Noisy Data:

a Multiple Imputation Approach", Transportation Research B 36, 865-889.

25. Stopher, P. (1980), “Captivity and Choice in Travel-Behavior Models”, Trans. Eng. Journal of ASCE, 106, TE4, 427-435.

26. STP (2005), "Strategic Transport Plan for Dhaka", Prepared by Louis Berger Group and Bangladesh Consultant Ltd.

27. Swait, J. (1984), "Probabilistic Choice-set Generation in Transportation Demand Models", Ph.D. Dissertation, Massachusetts Institute of Technology, Cambridge.

28. Swait, J. and Ben-Akiva, M. (1984), "Incorporating Random Constraints in Discrete Models of

Choice-set Generation", Transp. Res. B., 21B, 91-102.

29. Swait, J. and Ben-Akiva, M. (1985a), "Constraints on Individual Travel Behavior in a Brazilian

City", Transp. Res. Record, 1085, 75-85.

30. Swait, J. and Ben-Akiva, M. (1987), "Empirical Test of a Constrained Choice Discrete Model:

Mode choice in Sao Paulo, Brazil", Transp. Res. B., 21B, 103-115.

31. Thill, Jean-Claude (1992), "Choice-set Formation for Destination choice Modelling", Progress

in Human Geography 16, 3, 361-382.

32. Train, K. (1980), “A Structured Logit Model of Auto Ownership and Mode Choice”, Review of Economic Studies, XLVII, 357-370.

33. Walker, J. et al. (2007), "Travel Demand Models in the Developing World: Correcting for Measurement Errors", TRB 2008 Annual Meeting CD-ROM.

34. Wermuth, M. (1978), “Structure and Calibration of Behavioral and Attitudinal Binary Choice Model Between Public Transport and Private Car”, Presented at PTRC Summer Annual Meeting, 10-13 July, University of Warwick, England.

35. Westin, R. (1974), “Predictions from Binary Choice Models”, J. of Econometrics, 2, 1-16.

36. Williams, H. and Ortuzar, J. (1979), “Behavioral Travel Theories, Model Specification and the

Response Error Problem”, Working Paper 116, Inst. for Transport Studies, The University of

Leeds.
