
An online railway traffic prediction model

Pavle Kecman, Rob M.P. Goverde


Delft University of Technology, Department of Transport and Planning
Stevinweg 1, 2628 CN Delft, The Netherlands
e-mail: {p.kecman, r.m.p.goverde}@tudelft.nl

Abstract

Prediction of train positions in time and space is required for traffic control and passenger information. However, in practice only the last measured train delays are known and dispatchers must predict the arrival times of trains without adequate computer support. This paper presents a real-time tool for continuous online prediction of train traffic using a timed event graph that captures all scheduled events and precedence relations between them, such as train runs and stops, connections, and minimum headways. Robust estimates for the minimum process times are derived online by computing small percentiles (conditional on current delay where relevant) for conflict-free running times that are obtained by pre-processing historical train describer data. The graph is updated regularly when new information becomes available on train positions or traffic control decisions. The realization times of all events in the graph are predicted considering the usage of running time supplements and buffer times, as well as time loss due to route conflicts based on a conflict detection scheme within the prediction algorithm. The model is demonstrated on the busy corridor The Hague-Rotterdam in the Netherlands.

Keywords: Railway traffic, Train describers, Monitoring, Timed event graph, Prediction

1 Introduction

Real-time prediction of train positions in time and space is a basic requirement for effective route setting, traffic control, rescheduling, and passenger information. However, in practice only the last measured train delays are known in the traffic control centres and dispatchers must predict the arrival times of trains using experience only, without adequate computer support. This often results in simple extrapolation of the current delays for the expected arrival delays. Some railways use a linear shift of the timetable to extrapolate the current delays to the future. This method neglects the fact that some trains may (partially) recover from a delay using running time supplements, while others may get (more) delayed due to route conflicts. Better predictions could be obtained by microscopic simulation models, but excessive computation times still prohibit online application of microscopic simulation to densely operated large-scale networks.

This paper presents a real-time model for continuous online prediction of train traffic using process mining, a method of analysing and extracting information about processes from event data records using a process model [23]. The method is applied to Dutch train describer log files.

Train describer systems keep track of train positions in discrete steps along their routes, based on train numbers and messages received from elements of the signalling and interlocking systems (sections, switches and signals). One of the tasks of train describers is logging the incoming infrastructure element messages and the generated train number messages, resulting in chronologically ordered lists of infrastructure and train number messages. We use track occupation data to determine the dependency of running and dwell times on departure and arrival delays, respectively. The data are classified separately for each train line, thus reflecting the different stopping patterns and rolling-stock types. These dependencies, together with the actual route plan, timetable and current positions of all trains, are used to predict the future running times (on block section level) and dwell times of all trains. Microscopic operational constraints are incorporated in the predictions, thereby capturing all train interactions due to capacity or connection constraints. When an update of the train positions becomes available, the predictions are recomputed.

The predictive traffic model supports route setting and traffic control decisions and could be used interactively by signallers and traffic controllers. First, the model predicts route conflicts for a given actual route plan and train positions. This could be used by the signaller to pro-actively resolve the conflict by e.g. changing routes or the order of trains. The impact of any control decision can be checked by an update of the predictive model leading to new conflict and arrival time predictions. If a control decision leads to satisfactory results it can be implemented in the actual process plan. If on the other hand a route conflict cannot be avoided, the signaller could give speed advice (or new target passage times) to the relevant train drivers so that the impact of the route conflict is minimal and energy can be saved by preventing unnecessary braking and re-acceleration. Second, the arrival time predictions could be used to check connections in the case of arrival delays. When a connection conflict is detected, the signaller may decide to secure or cancel a connection in advance. This way up-to-date passenger information can be provided, both at stations and in the delayed trains. Similarly, endangered logistic connections (crew or rolling-stock) can be predicted in advance.

The next section gives an overview of relevant approaches in current literature, Section 3 describes processing of train describer data, and Section 4 gives a detailed description of the model and its components. Section 5 presents the performance of the tool when applied in a simulated real-time environment. Finally, Section 6 summarises the presented model and gives guidelines for further research and improvements.

2 Literature review

This section gives an overview of recent experiences in the field of railway data mining and different approaches to railway traffic prediction in the literature. A description of train describers and an overview of different systems used in Europe is given by Exer [7]. Train describer data recently became an important source of information for analysing railway traffic. Medeossi et al. [16] used track occupation data along with train event data recorded on-board to calibrate train motion equation parameters in the process of computing stochastic blocking times for individual trains. This model has recently been extended with stochastic estimates of initial delays and dwell times in order to build a simulation tool for an a priori evaluation of the impacts that new timetables or infrastructure improvements may have on railway operations [15]. Daamen et al. [4] and Goverde et al. [9] described the algorithms for automatic route conflict identification based on data records of the Dutch train describer system TNV, which were implemented in the tool TNV-Conflict. The tool was further extended with an add-on, TNV-Statistics [10], for detailed statistical analysis of train realization data based on the output files of TNV-Conflict. The TNV system was replaced by the new train describer system TROTS, which contains an essential new approach to train number steps. Kecman & Goverde [13] presented a tool for recovering realised train paths and automatic conflict identification based on process mining of TROTS log files. Signal passage and section occupation data are stored in tabular output, which is used in this paper for developing robust estimates of running and dwell times in the prediction model.

The prediction model introduced in this paper is based on an acyclic timed event graph. Goverde [8] introduced a macroscopic model for train delay propagation based on timed event graphs and max-plus algebra that allows application of fast algorithms for computation of delay propagation in a short time even for large networks. Due to the fixed structure of train orders and routes in the model, it is not directly suitable for real-time predictions that need to consider current conditions on the network and changes in the actual process plans. Berger et al. [1] incorporated stochasticity in their graph-based macroscopic model for delay prediction. By using a set of waiting policies for passenger connections and assuming discrete distributions of process times, they are able to predict delay propagation over the network. In another approach, Büker & Seybold [2] modelled delays as random variables, described with suitable distribution functions, and applied analytical methods to compute delay propagation in a mesoscopic graph-based model. However, the large-scale character of the models does not allow precise modelling of train interactions and the resulting variability in running times.
A microscopic graph model, presented by D'Ariano et al. [5], considers the majority of operational constraints of railway traffic. Static arc weights require an iterative approach to recompute feasible speed profiles of trains based on train dynamics and detailed infrastructure data. Whereas the train interactions on open track are accurately modelled, detailed modelling of route setting principles in complex station areas is not considered. Corman et al. [3] performed a detailed analysis of necessary granularity in modelling train traffic in interlocking and station areas. In this work, we rely on traffic realization data to develop a fully data-driven model based on actually realized minimum headway times between conflicting routes.

The prediction model presented in this paper considers dynamic, online computation of arc weights. This concept of time-dependent graphs has been used in railway related applications mainly for supporting timetable queries [17]. While such applications are focused on developing fast algorithms for dynamic shortest path computation, in this paper we employ a depth-first search based prediction algorithm to obtain the predicted realization times of all events in the graph.

Microscopic simulation tools such as OpenTrack [18] or RailSys [20] are able to give accurate predictions of running times, possible route conflicts and the resulting delay propagation. Due to a high level of detail in modelling infrastructure and train dynamics, such models are not suitable for real-time applications on large and heavily utilised networks. Moreover, they do not incorporate peculiarities of train traffic such as variability of dwell times due to delays or peak hours, which may affect train traffic to a great extent.

Hansen et al. [11] presented a macroscopic model for prediction of train running times using historical track occupation data. The dependencies of dwell times as well as running times on the level of open track sections (line segment between two stations) were captured and used to compute robust estimates of arrival and departure times. We extend this approach to the microscopic level in order to accurately model train behaviour and interactions on open track sections.

An online prediction tool similar to the one presented in this paper has been implemented in the Swiss traffic control system RCS-DISPO [6]. The main part of this tool is a microscopic model based on a directed acyclic graph with arc weights that are computed using train motion equations, with respect to a detailed description of infrastructure and train dynamics. The tool is successfully applied to a large number of trains (between 900 and 2000). The authors concluded that the accuracy of predictions depends on the time interval between the moment of computation and the predicted event. Prediction errors smaller than 1 minute were obtained for events within a 20-minute prediction horizon. We attempt to improve this approach by computing the arc weights using historical data, thus incorporating the dependence of process times on current delays, with possible extensions that include the influence of peak hours, weather, and rolling-stock type. By applying a fully data-driven approach in contrast to deterministic computation of process times, we aim to capture variability in running, dwell and headway times.

The prediction window in our model is limited to 20-30 minutes, comprising the events of the trains currently running on the network with routes and operational plans determined by the traffic control centre. The graph structure of the model can conveniently be integrated with a network-wide delay propagation model [8], thus propagating the current conditions (or potential decisions by the traffic controllers) to subsequent periods in the timetable.

3 Processing of TROTS data

In the Dutch train describer system TROTS, the train steps are recorded on the level of track sections (a block section consists of one or more track sections), with a message when a new track section is occupied by a train and when a track section is released by a train. Moreover, train number step messages are coupled to track section messages. The Dutch railway network is divided into multiple TROTS areas. Each area comprises one or more major station areas with complex topologies and 30-40 km of surrounding railway infrastructure. In order to reconstruct the train traffic over multiple TROTS areas it is necessary to merge the corresponding log files. TROTS log files are archived per day and area in large ASCII files of approximately 75 MB.

Infrastructure messages contain the following information: time stamp, event code, element type (section, signal, point), element name, and new state (occupied/released, stop/go, left/right). The train number step messages contain, amongst others, a time stamp, event code, train number, and a sequence of all occupied track sections. Each successive train number step message relates either to a new occupied track section at the front or to a released track section at the rear. The event code of a train number step corresponds to a section message with the same event code. This coding is used to match a message about a section occupation or release with a message of a train number step.

3.1 Process mining approach

Blocking time theory [12] provides the logic for building the process model from the log file. Signal passages are events that initiate processes such as blocking a part of the infrastructure and running over a block. Each complete train run can thus be represented as a graph, built online by sweeping through the file. Moreover, route conflicts can be identified simultaneously by determining the time difference between relevant events and verifying whether the train separation principles are respected. Due to the large size of TROTS log files, it is necessary to build an algorithm that sweeps through the file and visits every line only once, thus avoiding long computation times. An object-oriented approach is used to store the relevant data from the log files in infrastructure and train number objects, which enables the algorithm to revisit the objects, and use and update the information therein [4]. Figure 1 shows the attributes of each block and train object. All objects are created and updated on the fly while sweeping through a TROTS log file. Static attributes in block objects (Start signal, End signal and Sections) are fixed when objects are created, using additional infrastructure files. They provide a unique description of a block or an interlocking route with the start and end signals, and the comprised track sections.
Block: Start signal; End signal; Sections; Trains: {tocc, trel, series}
Train: Number; Timetable; Sections: {name, tocc, trel}; Signals: {name, tpass}

Figure 1: Objects and their attributes

The dynamic attribute Trains in a block object contains a chronologically sorted list of trains that traversed the block. Information about the block occupation time, release time and train series is stored for each train. A Train object is defined by a train number and the lists of traversed sections and signals, which are updated with every message from the log file related to the train. A list of scheduled departure and arrival times at each station is given in the attribute Timetable.

It is an important feature of the prediction model to capture the interactions of trains and the resulting conflicts and knock-on delays. Consequently, partitioning the realized process time data into hindered and unhindered trains is of great importance. The process mining algorithm described by Kecman & Goverde [13] uses blocking time theory as the underlying process model and is thus able to identify all route conflicts. Realised hindered running times are filtered out and excluded from further analysis. Furthermore, the process mining algorithm estimates the actually realized departure and arrival times of all trains with the largest estimation error smaller than 10 seconds. Therefore, all running (dwell) times can be analysed with respect to the last measured departure (arrival) delay, which will be exploited for computing running time and dwell time predictions in Section 4.2.
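The block and train objects of Figure 1 could be represented along the following lines. This is only a minimal sketch: the class and method names (`Occupation`, `add_occupation`, `pass_signal`), the signal and section names, and the train numbers in the usage lines are hypothetical and are not taken from the tool described here.

```python
from dataclasses import dataclass, field


@dataclass
class Occupation:
    """One traversal of a block by a train (times in seconds)."""
    t_occ: float   # block occupation time
    t_rel: float   # block release time
    series: str    # train series (line) of the traversing train


@dataclass
class Block:
    # Static attributes, fixed when the object is created.
    start_signal: str
    end_signal: str
    sections: tuple
    # Dynamic attribute: chronologically sorted list of traversals.
    trains: list = field(default_factory=list)

    def add_occupation(self, occ: Occupation) -> None:
        self.trains.append(occ)
        self.trains.sort(key=lambda o: o.t_occ)


@dataclass
class Train:
    number: str
    timetable: dict = field(default_factory=dict)  # station -> (arrival, departure)
    sections: list = field(default_factory=list)   # (name, t_occ, t_rel)
    signals: list = field(default_factory=list)    # (name, t_pass)

    def pass_signal(self, name: str, t_pass: float) -> None:
        self.signals.append((name, t_pass))


# Hypothetical usage: messages may be processed in any order.
block = Block("S1", "S2", ("sec_a", "sec_b"))
block.add_occupation(Occupation(t_occ=120.0, t_rel=185.0, series="2100"))
block.add_occupation(Occupation(t_occ=30.0, t_rel=95.0, series="5000"))
train = Train(number="2123")
train.pass_signal("S1", 118.0)
```

Keeping the traversals of a block sorted by occupation time makes the time difference between successive trains on the same block directly available, which is what the conflict identification needs.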

4 Online traffic prediction tool

The main components of the tool and the flow of data between them are depicted in Figure 2. The online prediction tool is based on a timed event graph with dynamic arc weights. The graph topology is built and updated based on the actual timetable, route and connection plan, and current positions of trains on the network. We assume that the actual route and connection plans are continuously provided by traffic control for the next 30 min. The route plan for a train is given as a planned sequence of block sections in the train route. By using the Sections attribute of the block objects (Figure 1), a route plan can be translated to the level of track sections and used to determine the necessary headway arcs for routes with common track sections. Each change of the actual plans or new information from the real-time operations, i.e. changing the relative order of trains, adding or cancelling trains, modifying train routes, updating connections and removing passed events, results in an update of the graph topology.
[Figure 2 components: actual route & connection plan; graph topology; scheduled event times (timetable); realised event times (train positions); graph weights; predicted delays; processed historical data; delay propagation; predicted event times]

Figure 2: Components of the online prediction tool

Arc weights represent the minimum process times and they are computed based on the current train positions and delays, and processed historical data. The weight of an arc is time-dependent and assigned in a dynamic way depending on the (estimated) starting time of the modelled process. That way the dependence of running and dwell times on current (predicted) delays is incorporated in the model. After every graph update, a prediction of event times of all reachable events is performed. In the following subsections, the three main components of the tool (shaded boxes in Figure 2) and the input data will be explained in detail.

4.1 Microscopic graph model

The railway traffic is modelled microscopically with a timed event graph (TEG). A TEG is a representation of a discrete-event dynamic system in which events are modelled by nodes and processes by arcs. We distinguish between signal events (passing of a signal by a running train) and station events (arrival and departure to and from a platform track). Nodes are described by train number, infrastructure element (signal or platform track), type, predicted realisation time, and recorded realisation time (when available). Station event nodes (arrival and departure nodes) also include scheduled event times as attributes. By comparing the recorded (predicted) event times with the scheduled event times, the current (predicted) delay is obtained for a specific train and used to estimate the duration of its subsequent processes (dwell and running times). Scheduled departure times are also used to incorporate the timetable constraints (trains cannot depart before their scheduled departure times).

Apart from modelling the running and dwelling processes related to a specific train, directed arcs are also used to model interactions between trains, namely headway and connection constraints. Arcs are described by starting event, end event, type and weight. Types of the starting and end events determine the type of an arc. For events belonging to the same train, running arcs connect all signal passing events. Dwell arcs connect an arrival event with a subsequent departure event. An inbound running arc connects a signal event with a subsequent arrival event, whereas an outbound running arc connects a departure event with a subsequent signal event. Connection arcs are introduced for modelling commercial constraints (passenger transfers), or logistic constraints (rolling-stock and crew connections). They connect the arrival event of a feeder train and the departure event of a connecting train in the same station.
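The rule that the types of the start and end events determine the arc type can be sketched as a small lookup. The `Node` class, the `arc_type` function and the string labels are assumptions made for this illustration; event pairs that the text does not describe are rejected rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    train: str                     # train number
    infra: str                     # signal or platform track
    kind: str                      # "signal", "arrival" or "departure"
    t_sch: Optional[float] = None  # scheduled time, station events only


# Arc types for events of the same train.
SAME_TRAIN = {
    ("signal", "signal"): "run",
    ("signal", "arrival"): "inbound run",
    ("arrival", "departure"): "dwell",
    ("departure", "signal"): "outbound run",
}
# Arc types for events of different trains.
OTHER_TRAIN = {
    ("signal", "signal"): "headway",
    ("arrival", "departure"): "connection",
}


def arc_type(start: Node, end: Node) -> str:
    """Derive the arc type from the types of its start and end events."""
    table = SAME_TRAIN if start.train == end.train else OTHER_TRAIN
    try:
        return table[(start.kind, end.kind)]
    except KeyError:
        raise ValueError(f"no arc type for {start.kind} -> {end.kind}") from None


# Hypothetical events of two trains at one station.
t1_s3 = Node("T1", "S3", "signal")
t1_arr = Node("T1", "platform", "arrival", t_sch=120.0)
t1_dep = Node("T1", "platform", "departure", t_sch=360.0)
t2_s3 = Node("T2", "S3", "signal")
t2_dep = Node("T2", "platform", "departure", t_sch=420.0)
```

With these definitions, `arc_type(t1_arr, t1_dep)` is a dwell arc, while `arc_type(t1_arr, t2_dep)` is a connection arc because the two events belong to different trains.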
Headway arcs separate the successive occupations of an infrastructure element by different trains. Typically, a signal changes to a permissive aspect as soon as all sections in a block (or in the interlocking route of the approaching train) protected by the signal have been released. On open tracks and for interlocking routes with the same end signal, the critical section that constrains signal release is the section before the end signal of the block or route. This situation is typical for trains that run over the same block or station route or for trains with merging routes (routes that have different starting signals and the same end signal) in interlocking areas. An accurate train separation is ensured by adding a headway arc that constrains the realisation of a signal passing event of an approaching train until the protected block has been cleared by the previous train. The head event is the start signal passing event of the approaching train, the tail event is the end signal passing event of the preceding train, and the arc weight is equal to the clearing time of the preceding train increased by the setup and release time of the signalling system.

However, in interlocking areas, conflicting routes are often diverging (with the same starting signal and different end signals) or intersecting (different starting and different end signals). The sectional release route setting principle [22] is designed to increase the capacity in station areas by allowing simultaneous multiple train movements with safety measures ensured by the interlocking systems. A route can be set for an approaching train as soon as the last section, which the route has in common with the preceding route, has been released. Since all events in the model are signal passages or station events, the event of the critical section release is not included. We model the train separation by adding a headway arc between passing events of signals that initiate running processes over the protected interlocking area. The head event and the tail event are the start signal passing events of the approaching and the preceding train, respectively, and the arc weight is computed as a small percentile of the headways between the same successive train runs from the historical track occupation data (see Section 4.2). The concepts of train separation based on blocking time theory are implemented in the tool for the purpose of route conflict predictions (Section 4.3).

Figure 3 shows an illustrative example of a timed event graph for two trains. The planned route for each train can be described by the sequence of signals: S1, S2, S3, S4, S6 for train T1 and S1, S2, S3, S5, S6 for train T2. Every signal passage is modelled as a node. Both trains have a scheduled stop at the station, which is modelled with arrival and departure nodes (large nodes in the figure). Nodes belonging to one train run are connected by running and dwell arcs. Since the trains run over the same infrastructure, the necessary minimum headway times are ensured with headway arcs. The route between signals S1 and S3 is the same for both trains, thus requiring at least one block separation between trains, which is modelled with headway arcs (T1, S2) → (T2, S1) and (T1, S3) → (T2, S2). The sectional release principle between diverging inbound routes of two trains is enabled with the headway arc (T1, S3) → (T2, S3). Finally, train T1 can leave the station when the block between S5 and S6 has been released by train T2, which is modelled by the headway arc (T2, S6) → (T1, S4). A planned connection is secured with the arc between the arrival event of T1 and the departure event of T2. Note that the direction of the headway arcs indicates the order of trains. In Figure 3 train T2 overtakes train T1 in the station.

[Figure 3 elements: signals S1 to S6 and a platform; nodes (T1,S1), (T1,S2), (T1,S3), (T1,A), (T1,D), (T1,S4), (T1,S6) for train T1 and (T2,S1), (T2,S2), (T2,S3), (T2,A), (T2,D), (T2,S5), (T2,S6) for train T2]

Figure 3: An illustrative example of a microscopic TEG

The graph topology is continuously updated as the 30 min rolling horizon moves. Possible new trains, planned to operate within the actual horizon, are added to the graph with their planned route on the level of block sections. The necessary headway arcs are built per block between consecutive trains that use at least one common track section covered by the block. With each update about the train positions (signal passage, departure or arrival of a train), the nodes describing events from the past and their incoming and outgoing arcs are removed from the graph (and stored with the realised event times), thus keeping the size of the graph stable within a certain time interval.

4.2 Computation of arc weights

Arc weights in timed event graphs are equal to the minimum process times of the modelled processes. In order to accurately estimate arc weights, we assume that delayed trains typically run in the full performance regime and have minimal dwell times, aiming to use time supplements to (partially) recover from delays. Similarly, trains running on time or ahead of their schedule aim to run in a lower regime to avoid early arrivals and decrease energy consumption. In that context, a time-dependent, dynamic computation of arc weights [17] is added to the timed event graph presented in the previous section. The basic idea behind this approach is that running and dwell times depend on the previously noted delays [11].

Running and dwell arcs

In order to determine the dependence of running times on current delays, the train describer event data were processed with the process mining tool for recovering train paths and infrastructure utilization [13]. The tool provides the running times on the level of block sections classified by train line (trains of the same line operate with the same stopping pattern and usually with the same rolling-stock) and attributed by the current delay noted at the last departure or arrival event. Moreover, the actual arrival and departure times for each scheduled train stop are determined, thus enabling similar records for dwell times, and inbound (from passing the home signal to standstill at the platform) and outbound (from the platform to the exit signal) running times in stations. Running times of hindered trains were identified and filtered out from the data. Robust regression with the least trimmed squares method [19], resisting 25% of outliers, is used to fit the data and compute the linear dependence of process times on delays. Each block section and station route is attributed with linear coefficients for each train line.
We also include the 10th percentile of a process time and use it as the absolute minimum process time to avoid infeasible predictions in case of large delays. Figure 4 gives an example of the dependence of the running times of train line 2100 over the block between signals GV615 and DTA623 on the delay at the previous departure from The Hague HS. It is visible that in this case there is no clear linear dependence of running time on the departure delay. A possible interpretation is that due to intense capacity consumption, the timetable does not include sufficient running time supplements for trains to compensate for their delays on the corridor between The Hague and Rotterdam [11].

[Figure 4: scatter plot; x-axis: Departure delay from The Hague HS [s]; y-axis: Running time of 2200 series [s]]

Figure 4: Dependence of running time between GV615-DTA623 on departure delay from The Hague HS

Figure 5 shows the dependence of the dwell time of the 2100 trains in The Hague HS station on arrival delay. The robust linear regression fit results in a coefficient of determination R² = 0.94. The horizontal line represents the minimum dwell time for passenger activities, estimated as the 10th percentile of all realized dwell times, p10 = 195 seconds (the scheduled dwell time for 2100 trains in The Hague HS is 240 seconds). The 10th percentile is included in the estimate in order to avoid unrealistic estimates in case of large delays.

[Figure 5: scatter plot; x-axis: Arrival delay at The Hague HS [s]; y-axis: Dwell time of 2100 trains at The Hague HS [s]]

Figure 5: Dependence of dwell times in The Hague HS on arrival delay

The current data set consists of 6 days of traffic on the corridor, which after filtering out hindered runs is reduced to about 200 records per train line. The analysis of running and dwell times on the Rotterdam-The Hague corridor showed a strong dependence of dwell times on arrival delays. Running times show a weaker dependence on the departure delay from the last station with a scheduled stop. We expect to find stronger dependencies of running times when the model is applied on corridors with more running time supplements.

Headway arcs

The weights of headway arcs represent the minimum headway time between two trains on the same infrastructure element. The minimum headway time between successive block occupations (route settings) equals the sum of the running time of the first train, the clearing time, and the setup and release time of the signalling system [12]. In this paper a constant value of 2 seconds is used for the setup and release time on open track and 12 seconds for the route setting time in stations. Clearing time is estimated from the data as the 10th percentile of the clearing times of a block by a specific train line. In order to model the principle of sectional release using only signal passing events, the minimum headway time between two trains with diverging or intersecting routes is estimated from the data as the 10th percentile of the time headways between train runs of the corresponding train lines from the historical track occupation data. By choosing a small percentile of the realised time headways, the impact of buffer times on the minimum headway time estimates is excluded.

Connection arcs

The weight of a connection arc is equal to the minimum transfer time for passenger connections or, for logistic connections, the time needed to perform activities that enable planned rolling-stock and crew circulations. Connection times do not depend on the current delay of trains, and the possible effect of delays on headway times was not considered in this work. Therefore, these values are computed offline and the corresponding arc weights are fixed.

4.3 Online prediction of event times

The pseudocode of the online depth-first search based algorithm for prediction of event times over graphs with dynamic arc weights is given in Algorithm 1. A recursive depth-first search is chosen as the method of traversing the graph due to its low memory requirements, which is an important constraint for large graphs. After each event realisation, the reachable set of nodes is isolated, where the root node is the node that models the realised event. The prediction algorithm then updates the predicted event times of all events in the reachable set. Note that if a node is not reachable, the corresponding event time can in no way be affected by the new information. Therefore, it is not necessary to visit that node in the prediction process.

We model railway traffic with a graph $G = (V, E)$ where $V$ is a set of nodes and $E$ is a set of arcs. A node $v \in V$ is described by $(n(v), \mathrm{infra}(v), \mathrm{type}(v), \mathrm{in}(v), \mathrm{out}(v), t_{\mathrm{pred}}(v), t_{\mathrm{rec}}(v))$, representing the train number, infrastructure element (signal or platform track), type, the set of incoming arcs, the set of outgoing arcs, predicted realisation time and the recorded time (when available), respectively. The nodes that model scheduled events, i.e. arrivals and departures, are also attributed with the scheduled event time $t_{\mathrm{sch}}(v)$. For implementation purposes, a vector $t_e(v, v_j)$, containing the earliest possible realisation time of $v$ with respect to each direct predecessor $v_j$, $j = 1, \ldots, |\mathrm{in}(v)|$, is added to every node. When a new train is added to the graph, the values of $t_e$ are computed for each node of the train in topological order. The initial prediction of each event time can then be computed as the maximum of the earliest possible realization times over all incoming arcs: $t_{\mathrm{pred}}(v) \leftarrow \max(t_e(v, \cdot))$.

An arc $e \in E$ is described by $(\mathrm{start}(e), \mathrm{end}(e), w(e), \mathrm{type}(e))$, representing the start event, the end event, the arc weight and the arc type (dwell, run, headway, connection). Note that, as explained in Section 4.2, headway and connection arcs have fixed predefined weights, stored in the data structure of processed historical data $W$. We define a mapping that retrieves the predefined weight of an arc from $W$. The weights of running and dwell arcs are determined online with every algorithm execution using the functional dependence of process time on the current train delay. For running and dwell arcs, the start and end events belong to the same train and the infrastructure resource (block or station route) is known. The arc weight is computed by $w(e) = f_{\mathrm{block,line}}(z(n))$, where $z$ is a vector that contains the value of the last recorded delay for each train number $n$ (note that delays are recorded only at scheduled arrival and departure events). During the algorithm execution, the predicted event times of scheduled events give predicted delays of trains. Therefore, subsequent process time estimates are computed by $w(e) = f_{\mathrm{block,line}}(\hat{z}(n))$, where $\hat{z}$ is the vector of predicted delays for every train number $n$. Function $f$ is retrieved from $W$ for the appropriate block (station route) and the appropriate train line. For every run and dwell process, the 10th percentile of process times for every train line, denoted by $p^{10}_{\mathrm{block,line}}$, is computed based on historical data and stored in $W$, in order to avoid infeasible estimates of process times for large delays.
Finally, the first node in the planned route of every train, $v_1(n)$, which models the entrance time of train $n$ (the first departure or the first event within the observed network), is connected to a dummy node 0 by an arc whose weight is equal to the expected entrance time.

When an update about the realisation of event $v_k \in V$, $k = 1, \dots, |V|$, arrives, the subgraph $G_k = (V_k, E_k)$ is computed. Set $V_k$ comprises all nodes reachable from $v_k$, and $E_k = \{(v_i, v_j) \mid \{v_i, v_j\} \subseteq V_k\}$. We set $t_{pred}(v_k) \leftarrow t_{rec}(v_k)$. If $\mathrm{type}(v_k) \in \{\text{departure}, \text{arrival}\}$, the current delay value of the corresponding train is updated: $z(n(v_k)) \leftarrow t_{rec}(v_k) - t_{sch}(v_k)$. The information is further propagated through the graph and the predicted event times of all reachable events are computed according to Algorithm 1.

The main loop of the prediction algorithm is initiated in line 4. In lines 5-8, the actual weight of an outgoing arc is computed (or retrieved in the case of headway and connection arcs). The earliest realisation time of the corresponding successor is updated in line 10. If all constraints on the event realisation time are known (all direct predecessors were visited and all incoming arcs traversed), the predicted event time is computed in line 12. The timetable constraint for departure events is included in line 15. For all scheduled events, the predicted delay vector is updated in line 16. Finally, in line 17, a recursive call of the algorithm is performed. The prediction algorithm sweeps through the subgraph of reachable nodes $G_k$, passing each arc only once. The running time of Algorithm 1 is therefore $O(|E_k|)$.

After each event realisation, the graph is updated using the procedure UpdateGraph (Algorithm 2). Events of a train occur in a given sequence that reflects a defined route plan. As the first event in a sequence is realised, the corresponding node is removed from the graph along with its incoming and outgoing arcs (lines 3-4) and an arc between node 0 and the next node in the event sequence of the train is added (line 5). The weight of the added arc is equal to the predicted realisation time of the next event of the train.
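The delay-dependent weight of a running or dwell arc, bounded from below by the 10th percentile, can be illustrated with a minimal sketch. The linear form of the dependence, the class name `ProcessTimeModel` and all numerical values are assumptions made for illustration only; the functional form actually stored in W is not fixed by this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessTimeModel:
    """Hypothetical entry of W for one (block, train line) pair."""
    intercept: float  # fitted process time at zero delay [s]
    slope: float      # fitted change in process time per second of delay
    p10: float        # 10th percentile of historical process times [s]

    def weight(self, delay: float) -> float:
        """Arc weight w(e) = max(p10, f(delay)) with an assumed linear f."""
        return max(self.p10, self.intercept + self.slope * delay)


# Made-up numbers, not taken from the case study.
example_run = ProcessTimeModel(intercept=120.0, slope=-0.25, p10=100.0)
print(example_run.weight(0.0))    # on-time train: the fitted value is used
print(example_run.weight(40.0))   # delayed train: shorter estimated running time
print(example_run.weight(200.0))  # large delay: the p10 lower bound takes over
```

The lower bound matters only for large delays, where extrapolating the fitted dependence would otherwise give process times shorter than anything observed in the historical data.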


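Algorithm 1 PredictEventTimes
 1: Input: $G_k$, $W$, $z$, $v_k$
 2: Output: $G_k$, $\hat{z}$
 3: $\hat{z} \leftarrow z$
 4: for all $e \in \mathrm{out}(v_k)$ do
 5:     if $\mathrm{type}(e) \in \{\text{dwell}, \text{run}\}$ then
 6:         $w(e) \leftarrow \max[p^{10}_{block,line}, f_{block,line}(\hat{z}(n(v_k)))]$
 7:     else
 8:         $w(e) \leftarrow$ predefined weight of $e$ retrieved from $W$
 9:     $v_j \leftarrow \mathrm{end}(e)$
10:     $t_e(v_j, v_k) \leftarrow w(e) + t_{pred}(v_k)$    //update earliest time w.r.t. $v_k$
11:     if $|t_e(v_j, \cdot)| = |\mathrm{in}(v_j)|$ then    //if all direct predecessors of $v_j$ were visited
12:         $t_{pred}(v_j) \leftarrow \max(t_e(v_j, \cdot))$
13:         if $\mathrm{type}(v_j) \in \{\text{arrival}, \text{departure}\}$ then
14:             if $\mathrm{type}(v_j) = \text{departure}$ then
15:                 $t_{pred}(v_j) \leftarrow \max(t_{pred}(v_j), t_{sch}(v_j))$
16:             $\hat{z}(n(v_j)) \leftarrow t_{pred}(v_j) - t_{sch}(v_j)$
17:         PredictEventTimes($G_k$, $W$, $\hat{z}$, $v_j$)    //recursive call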
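A runnable sketch of the recursion in Algorithm 1 is given below. It is an interpretation under stated assumptions, not the authors' implementation: the graph is assumed acyclic, the test "all direct predecessors were visited" is implemented by counting the incoming arcs that originate inside the reachable set, and the names `Arc`, `Node`, `add_arc`, `reachable_from`, `predict` and `on_event`, as well as the toy network and its numbers, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Arc:
    start: str
    end: str
    kind: str                                     # "run", "dwell", "headway", "connection"
    fixed_weight: float = 0.0                     # headway/connection weight taken from W
    f: Optional[Callable[[float], float]] = None  # delay -> process time (run/dwell arcs)
    p10: float = 0.0                              # lower bound on the process time


@dataclass
class Node:
    train: str
    kind: str                                     # "arrival", "departure" or "signal"
    t_sch: Optional[float] = None
    t_pred: float = 0.0
    incoming: List[Arc] = field(default_factory=list)
    outgoing: List[Arc] = field(default_factory=list)
    t_e: Dict[str, float] = field(default_factory=dict)  # earliest time per predecessor


def add_arc(nodes: Dict[str, Node], arc: Arc) -> None:
    nodes[arc.start].outgoing.append(arc)
    nodes[arc.end].incoming.append(arc)


def reachable_from(nodes: Dict[str, Node], root: str) -> set:
    """Node set V_k of the subgraph reachable from the realised event."""
    seen, stack = {root}, [root]
    while stack:
        for arc in nodes[stack.pop()].outgoing:
            if arc.end not in seen:
                seen.add(arc.end)
                stack.append(arc.end)
    return seen


def predict(nodes, z_hat, pending, vk):
    """Recursive depth-first propagation following lines 4-17 of Algorithm 1."""
    node_k = nodes[vk]
    for arc in node_k.outgoing:
        if arc.kind in ("run", "dwell"):
            w = max(arc.p10, arc.f(z_hat.get(node_k.train, 0.0)))
        else:
            w = arc.fixed_weight
        succ = nodes[arc.end]
        succ.t_e[vk] = w + node_k.t_pred          # earliest time w.r.t. vk
        pending[arc.end] -= 1
        if pending[arc.end] == 0:                 # all reachable predecessors visited
            succ.t_pred = max(succ.t_e.values())
            if succ.kind in ("arrival", "departure"):
                if succ.kind == "departure":      # no departure before schedule
                    succ.t_pred = max(succ.t_pred, succ.t_sch)
                z_hat[succ.train] = succ.t_pred - succ.t_sch
            predict(nodes, z_hat, pending, arc.end)


def on_event(nodes, z, vk, t_rec):
    """Record the realisation of vk and re-predict all reachable events."""
    node_k = nodes[vk]
    node_k.t_pred = t_rec
    if node_k.kind in ("arrival", "departure"):
        z[node_k.train] = t_rec - node_k.t_sch
    v_k = reachable_from(nodes, vk)
    pending = {v: sum(a.start in v_k for a in nodes[v].incoming) for v in v_k}
    z_hat = dict(z)
    predict(nodes, z_hat, pending, vk)
    return z_hat


# Toy network: train A runs and dwells; train B waits for a connection from A.
nodes = {
    "A_dep1": Node("A", "departure", t_sch=0.0),
    "A_arr2": Node("A", "arrival", t_sch=100.0),
    "A_dep2": Node("A", "departure", t_sch=130.0),
    "B_dep2": Node("B", "departure", t_sch=120.0, t_e={"0": 120.0}),
}
add_arc(nodes, Arc("A_dep1", "A_arr2", "run", f=lambda d: 100.0 - 0.25 * d, p10=80.0))
add_arc(nodes, Arc("A_arr2", "A_dep2", "dwell", f=lambda d: 30.0 - 0.5 * d, p10=20.0))
add_arc(nodes, Arc("A_arr2", "B_dep2", "connection", fixed_weight=60.0))
z_hat = on_event(nodes, {}, "A_dep1", t_rec=40.0)  # A departs 40 s late
```

In this toy case train A recovers part of its delay on the run and dwell arcs, while the connection arc becomes the critical incoming arc of the departure of train B, which is how a connection conflict would show up in the prediction.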

Algorithm 2 UpdateGraph
 1: Input: $G = (V, E)$, $v_k$
 2: Output: $G$
 3: $V \leftarrow V \setminus v_k$    //remove realized node
 4: $E \leftarrow E \setminus \{\mathrm{in}(v_k), \mathrm{out}(v_k)\}$    //update arc list
 5: $E \leftarrow E \cup (0, v_1(n), t_{pred}(v_1(n)))$
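The graph update can be sketched in a few lines. The representation below (a node set, a list of weighted arc tuples and an ordered list of the remaining events of the train) and the name `update_graph` are assumptions chosen for brevity; the sketch also assumes that the realised event is the first remaining event of its train.

```python
from typing import Dict, List, Set, Tuple

Arc = Tuple[str, str, float]  # (start node, end node, weight)
DUMMY = "0"                   # dummy source node of the graph


def update_graph(V: Set[str], E: List[Arc], route: List[str],
                 t_pred: Dict[str, float], vk: str):
    """Remove the realised node vk and attach the next event of the train to node 0."""
    if not route or route[0] != vk:
        raise ValueError("vk must be the first remaining event of the train")
    V = V - {vk}                                   # remove realised node
    E = [a for a in E if vk not in (a[0], a[1])]   # drop its incoming and outgoing arcs
    route = route[1:]
    if route:                                      # the train has further events
        nxt = route[0]
        E.append((DUMMY, nxt, t_pred[nxt]))        # weight = predicted time of next event
    return V, E, route


V, E, route = update_graph(
    V={"0", "a1", "a2", "b1"},
    E=[("0", "a1", 10.0), ("a1", "a2", 90.0), ("a1", "b1", 60.0)],
    route=["a1", "a2"],
    t_pred={"a2": 130.0},
    vk="a1",
)
```

After the call the realised node and all of its arcs are gone, and the next event of the train hangs directly off the dummy node with its predicted realisation time as arc weight.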


Algorithm 1 enables straightforward identification of connection conflicts. If the critical incoming arc (the arc that actively constrains the earliest realisation time of the predicted event) is a connection arc, then a connection conflict is identified. Note that a critical headway arc indicates a predicted stop signal aspect ahead of the train. Prediction of route conflicts as defined by blocking time theory [12] is simple after execution of Algorithm 1, since the estimates of all train-speed-dependent times (approaching, running, clearing) are known. After including the signal watching, setup and release times, taken as constant values for all trains, the blocking times are determined and a route conflict is identified by overlapping blocking times.
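Once the blocking times are available, detecting route conflicts reduces to finding overlapping intervals on the same block. The following sketch shows one way to do this; the record layout `BlockingTime`, the function name `route_conflicts` and the sample values are hypothetical, and the blocking intervals are assumed to already include the constant signal watching, setup and release times.

```python
from collections import defaultdict
from typing import List, NamedTuple, Tuple


class BlockingTime(NamedTuple):
    train: str
    block: str
    start: float  # start of the blocking time of the block [s]
    end: float    # end of the blocking time of the block [s]


def route_conflicts(blocking_times: List[BlockingTime]) -> List[Tuple[str, str, str, float]]:
    """Return (block, first train, second train, overlap in s) for each overlap."""
    by_block = defaultdict(list)
    for bt in blocking_times:
        by_block[bt.block].append(bt)

    conflicts = []
    for block, items in by_block.items():
        items.sort(key=lambda bt: bt.start)
        for i, first in enumerate(items):
            for second in items[i + 1:]:
                if second.start >= first.end:
                    break  # sorted by start: later intervals cannot overlap `first`
                overlap = min(first.end, second.end) - second.start
                conflicts.append((block, first.train, second.train, overlap))
    return conflicts


sample = [
    BlockingTime("ST1", "b1", 0.0, 100.0),
    BlockingTime("IC2", "b1", 90.0, 200.0),   # enters b1 10 s before ST1 releases it
    BlockingTime("ST1", "b2", 100.0, 150.0),
    BlockingTime("IC2", "b2", 150.0, 220.0),  # touching intervals: no conflict
]
found = route_conflicts(sample)
```

Sorting each block's intervals by start time allows the inner loop to stop at the first non-overlapping successor, which keeps the check cheap when most blocks are conflict-free.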

Application of the prediction model

The performance of the model is illustrated on the busy corridor between The Hague and Rotterdam in the Netherlands. The training data set for regression analysis consists of train describer log files for six days of traffic in two traffic control areas. While sweeping the files with the process mining and conflict identification tool [13], the dependencies of process times on current delays are computed, as well as the percentiles needed to model the lower bounds of running, dwell and headway times, as explained in Section 4.2. The predictions are performed on a separate test set consisting of track occupation data for one day of traffic.

For model validation (and as an example of application) we simulate the real-time environment by scanning the train describer log file from the separate test set, which contains the chronologically sorted infrastructure and train messages from the two TROTS areas (Rotterdam and The Hague). Traffic control input is included in the form of a list of trains described by the train number, timetable, route plan (block sections) and expected entrance time to the observed part of the network (or the first departure time if the train starts within the observed area).

The selected corridor (Figure 6) and train routes enable testing the model with all possible train interactions. Between station The Hague HS and the junction close to Rijswijk, the trains running towards Rotterdam use two parallel tracks, which leads to merging routes at the junction where the two tracks merge into one. Diverging routes and the corresponding headways are also included, as the inbound routes of local and intercity trains in Rotterdam (RTD) lead to different platforms.

An example of predictions is shown in Figure 7. The time-distance diagram shows the predicted train paths (local trains in magenta and intercity trains in blue). The realised train paths are shown as black lines.
The prediction is performed at the departure of train ST5025 from The Hague HS (GV). Complete routes of the seven trains that enter the network within the 30 min prediction horizon are included in the predictions. The average prediction error for 161 predicted events (including signal passages) is 19.33 s, while the maximum prediction error is 68.71 s. The maximum prediction error was produced for the passing time of a signal between stations Delft South (DTZ) and Schiedam (SDM) by train IC9216. The data set for that train line is significantly smaller than for other lines because of its lower frequency (it runs only once an hour, while other lines run every 30 min). We expect that the accuracy of predictions for that line will improve when a larger training data set is considered.

Figure 6: Schematic layout of the observed network [21] (stations shown: The Hague HS, The Hague MW, Rijswijk, Delft, Delft South, Schiedam, Rotterdam Central)

Figure 7: Time-distance diagram of predicted (at 7:13) and realised train paths (stations GV, GVMW, RSW, DT, DTZ, SDM, RTD; time axis from 07:13 to 08:20)

The major advantage of the presented model for traffic controllers is the ability to predict all route conflicts within the prediction horizon. We use the principle of overlapping blocking times [12] to predict and visualise route conflicts. Figures 8 and 9 show the predicted and realised blocking time diagrams, respectively. Local trains are presented in magenta and intercity trains in blue. Overlaps in blocking times that indicate route conflicts are given in red.

Figure 8: Blocking time diagram predicted at 7:13

Three of the four major route conflicts that occurred, one in Schiedam (SDM) and two in Rotterdam (RTD), were predicted by the model. However, the very short running time of IC2127 between stations Delft South (DTZ) and Schiedam (Figure 7) caused a route conflict with the preceding ST5127 at SDM (Figure 9), which was not captured by the model (Figure 8). Moreover, of the chain of very short route conflicts at station Delft (DT), only the conflict between IC9216 and ST5127 was captured. This example covers the morning peak hour, which might cause longer dwell times in Delft and at other short stops. Conditioning the dwell times on the period of the day might improve these results. However, these inaccuracies of the current model did not affect the predictions of arrival times in Rotterdam.

The computation time of the online prediction model in this example is very short (less than one second). The model is therefore suitable for real-time applications with regular updates of current train positions. For larger examples that model dense traffic, it can be expected that more events occur almost simultaneously, i.e., more than one update can arrive within one second. The presented event-driven prediction algorithm can easily be modified to a time-driven version in which the prediction process is performed at regular time intervals based on the information that arrived within the interval.
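The time-driven variant mentioned above mainly requires grouping the incoming updates by interval before each prediction run. The sketch below shows only this batching step; the function name `batch_by_interval`, the event tuple layout and the sample messages are assumptions, and the prediction step that would consume each batch is left out.

```python
from itertools import groupby
from typing import Iterator, List, Tuple

Event = Tuple[float, str]  # (recorded time [s], identifier of the realised event)


def batch_by_interval(events: List[Event], interval: float,
                      t0: float = 0.0) -> Iterator[Tuple[float, List[Event]]]:
    """Group chronologically sorted events into windows of `interval` seconds.

    Yields (prediction time, events in the window), where the prediction time
    is the end of the window in which the events arrived. Empty windows are skipped.
    """
    def window_index(event: Event) -> int:
        return int((event[0] - t0) // interval)

    for idx, group in groupby(events, key=window_index):
        yield t0 + (idx + 1) * interval, list(group)


messages = [(0.2, "a"), (0.7, "b"), (1.1, "c"), (3.5, "d")]
batches = list(batch_by_interval(messages, interval=1.0))
```

Each batch would then be applied to the graph as a whole (recorded times and delays first), followed by a single prediction run, so that the number of prediction runs is bounded by the number of intervals rather than by the number of messages.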

Figure 9: Realised blocking time diagram

Summary and future work

This paper presents a microscopic model for accurate prediction of event times based on a timed event graph with dynamic arc weights. The process times in the model are obtained dynamically using processed historical train describer data, thus reflecting all phenomena of railway traffic captured by the train describer systems and preprocessing tools. The graph structure of the model allows fast algorithms to be applied to compute predictions of event times even for large and busy networks.

The main contribution of our approach is the dynamic estimation of process times for each train by using the predetermined functional dependence of process times on actual delays. Train interactions are modelled with high accuracy by including the main operational constraints and relying on actually realised minimum headway times (obtained from the historical data) rather than on theoretical values. The recursive depth-first search algorithm with dynamic arc weights gives predictions for all event times within the horizon.

The model has been applied in a case study on a busy corridor in the Netherlands in a simulated real-time environment, and produced accurate estimates for train traffic and route conflicts within a 30 min horizon. Application of the model to a wider area is possible either by enlarging the observed area or by coordinating multiple areas. Finally, the model structure enables straightforward application of the network-wide delay propagation algorithm [8] to estimate the further effect of current traffic conditions (or examined traffic control actions).

Further research will focus on investigating the impact of other factors, such as period of the day and weather, on train running and dwell times. Factors with a strong impact on process times will be included in the prediction procedure in order to improve the accuracy.

The predictive model provides effective decision support to signallers and traffic control and contributes to a better utilisation of railway infrastructure, improved reliability of train services, and more reliable and dynamic passenger information. The developed model will be embedded in a closed-loop model-predictive railway traffic control framework where online optimization algorithms will automatically resolve detected conflicts and propose control decisions to traffic controllers together with the predicted conflicts [14]. In this way an intelligent railway traffic management system will be obtained that proactively monitors the railway traffic and supports traffic controllers with decisions that optimize the traffic on a network level, beyond the traditional local control areas.

Acknowledgments
This paper is a result of the research project funded by the Dutch Technology Foundation STW: Model-Predictive Railway Traffic Management (project no. 11025).

References
[1] Berger, A., Gebhardt, A., Müller-Hannemann, M., and Ostrowski, M. Stochastic Delay Prediction in Large Train Networks. In Caprara, A. and Kontogiannis, S. (eds.), 11th Workshop on Algorithmic Approaches for Transportation Modelling, Optimization, and Systems, pp. 100-111, Dagstuhl, 2011.
[2] Büker, T. and Seybold, B. Stochastic modelling of delay propagation in large networks. Journal of Rail Transport Planning & Management, 2(1-2):34-50, 2012.
[3] Corman, F., Goverde, R.M.P., and D'Ariano, A. Rescheduling dense train traffic over complex station interlocking areas. In Ahuja, R.K., Möhring, R.H., and Zaroliagis, C.D. (eds.), Robust and Online Large-Scale Optimization, vol. 5868 of Lecture Notes in Computer Science, pp. 369-386. Springer, Berlin, 2009.
[4] Daamen, W., Goverde, R.M.P., and Hansen, I.A. Non-Discriminatory Automatic Registration of Knock-On Train Delays. Networks and Spatial Economics, 9(1):47-61, 2008.
[5] D'Ariano, A., Pranzo, M., and Hansen, I.A. Conflict Resolution and Train Speed Coordination for Solving Real-Time Timetable Perturbations. IEEE Transactions on Intelligent Transportation Systems, 8(2):208-222, 2007.
[6] Dolder, U., Krista, M., and Voelcker, M. RCS Rail Control System Realtime train run simulation and conflict detection on a net wide scale based on updated train positions. In Proceedings of the 3rd International Seminar on Railway Operations Modelling and Analysis (RailZurich2009), pp. 1-15, Zurich, 2009.
[7] Exer, A. European Railway Signalling, chapter Rail Traffic Management, pp. 311-343. Institution of Railway Signal Engineers, A&C Black, London, 1995.
[8] Goverde, R.M.P. A delay propagation algorithm for large-scale railway traffic networks. Transportation Research Part C: Emerging Technologies, 18(3):269-287, 2010.
[9] Goverde, R.M.P., Daamen, W., and Hansen, I.A. Automatic identification of route conflict occurrences and their consequences. In Allan, J., Arias, E., Brebbia, C.A., Goodman, C.J., Rumsey, A.F., Sciutto, G., and Tomii, N. (eds.), Computers in Railways XI, pp. 473-482, Southampton, 2008. WIT Press.
[10] Goverde, R.M.P. and Meng, L. Advanced monitoring and management information of railway operations. Journal of Rail Transport Planning & Management, 1(2):69-79, 2011.
[11] Hansen, I.A., Goverde, R.M.P., and Van der Meer, D.J. Online train delay recognition and running time prediction. In Intelligent Transportation Systems (ITSC), 2010 13th International IEEE Conference on, pp. 1783-1788, Madeira, 2010.
[12] Hansen, I.A. and Pachl, J. (eds.). Railway Timetable & Traffic - Analysis, Modelling, Simulation. Eurailpress, Hamburg, 2008.
[13] Kecman, P. and Goverde, R.M.P. Process mining of train describer event data and automatic conflict identification. In Brebbia, C.A., Tomii, N., and Mera, J.M. (eds.), Computers in Railways XIII, WIT Transactions on The Built Environment, vol. 127, pp. 227-238, Southampton, 2012. WIT Press.
[14] Kecman, P., Goverde, R.M.P., and van den Boom, T.J.J. A model-predictive control framework for railway traffic management. In Proceedings of the 4th International Seminar on Railway Operations Modelling and Analysis (RailRome2011), pp. 1-15, Rome, 2011.
[15] Longo, G. and Medeossi, G. An approach for calibrating and validating the simulation of complex rail networks. In TRB 92nd Annual Meeting, pp. 1-19, Washington, 2013.
[16] Medeossi, G., Longo, G., and de Fabris, S. A method for using stochastic blocking times to improve timetable planning. Journal of Rail Transport Planning & Management, 1(1):1-13, 2011.
[17] Nachtigall, K. Time depending shortest-path problems with applications to railway networks. European Journal of Operational Research, 83(1):154-166, 1995.
[18] Nash, A. and Huerlimann, D. Railroad simulation using OpenTrack. In Allan, J., Brebbia, C.A., Hill, R.J., Sciutto, G., and Sone, S. (eds.), Computers in Railways IX, pp. 45-54, Southampton, 2004. WIT Press.
[19] Rousseeuw, P.J. and Driessen, K. Computing LTS Regression for Large Data Sets. Data Mining and Knowledge Discovery, 12(1):29-45, 2006.
[20] Siefer, T. and Radtke, A. Evaluation of delay propagation. In Proceedings of 7th World Congress on Railway Research, Montreal, 2006.
[21] Sporenplan. www.sporenplan.nl, 2013.
[22] Theeg, G. and Vlasenko, S. (eds.). Railway Signalling & Interlocking: International Compendium. Eurailpress, Hamburg, 2009.
[23] Van der Aalst, W.M.P. Process mining: Discovery, Conformance and Enhancement of Business Processes. Springer, Berlin, 2011.
