Information Sciences 133 (2001) 165–173
www.elsevier.com/locate/ins
Flight graph based genetic algorithm for crew
scheduling in airlines
H. Timucin Ozdemir, Chilukuri K. Mohan
*
Department of Electrical Engineering and Computer Science, Syracuse University, 2-177 CST,
Syracuse, NY 13244-4100, USA
Abstract
Crew scheduling is an NP-hard constrained combinatorial optimization problem,
very important for the airline industry [G. Yu (Ed.), Operations Research in Airline
Industry, Kluwer Academic Publishers, Dordrecht, 1998]. We solve this problem using a
genetic algorithm applied to a flight graph representation that represents several
problem-specific constraints, unlike previous attempts [D. Levine, Application of a
hybrid genetic algorithm to airline crew scheduling, Ph.D. dissertation, Computer Science Department, IIT, Chicago, USA, 1995; J.E. Beasley, P.C. Chu, A genetic algorithm for the set covering problem, Eur. J. Oper. Res. 94 (1996) 392–404; P.C. Chu, J.E.
Beasley, A genetic algorithm for the set partitioning problem, Technical report, Imperial
College, UK, 1995]. In extensive experimental comparisons on flight data of several
airlines, the new approach performed better than other approaches in 17 out of 24 data
sets. © 2001 Elsevier Science Inc. All rights reserved.
1. Introduction
Airline crew scheduling is the assignment of the flight and training activities
schedule for some period of time to various crew members, important because
crew costs constitute the largest direct operating cost of airlines next to fuel
costs [1,3].
Crew scheduling is a difficult combinatorial optimization problem, generally
solved by transformation to the set cover problem (SCP) or the set partition
problem (SPP). These approaches use a binary matrix, whose rows represent
the flights and columns represent collections of flights that can be allocated to
the same crew. Columns (conforming to FAA regulations, company policies,
and labor union's requirements) are generated by a complete enumeration or
using heuristic methods. Then, the primary problem to be solved is to select a
feasible set of columns covering all the flights while minimizing cost. Such an
approach is computationally unsatisfactory, especially when the number of
flights is large.
* Corresponding author.
E-mail address: mohan@ecs.syr.edu (C.K. Mohan).
0020-0255/01/$ - see front matter © 2001 Elsevier Science Inc. All rights reserved.
PII: S0020-0255(01)00083-4
Section 2 describes the problem constraints, our representation, and the
function being optimized. Section 3 presents details of the evolutionary algorithm we used. Section 4 presents experimental results and concludes.
2. Problem and representation
For the optimization task, it is preferable to use the flight schedule as input
data, rather than pre-processed columns. Some algorithms that rely on column
generation techniques attempt to keep the best columns, but some suboptimal
columns may be needed to find better solutions [4]. Therefore, our proposed
technique starts from the flight schedule and builds the flight graph to take care
of some problem-specific constraints.
Instead of using nodes in a graph to represent cities and edges to represent
flights, we propose a much more suitable graph representation that embeds
problem constraints by using nodes to represent flights, and edges to represent
dependency constraints among flights: an edge exists from the node representing flight X to the node representing flight Y iff (i) Y leaves from the
destination city of X, and (ii) Y leaves after a prespecified delay following the
arrival of X.
Each path from a source node of the graph to a sink node represents a
feasible sequence of flights that may be assigned to a single crew, capturing
some of the essential static constraints of the crew scheduling problem. Other
dynamic constraints are enforced by algorithms using this representation. The
directed edge $(i, j)$ denotes that $\text{flight}_j$ can be flown after $\text{flight}_i$, enforcing time
and city constraints. Table 1 and Fig. 1 illustrate this representation, using the
constraint that there is a rest period of at least 20 min between successive flights
to which the same crew is assigned.
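As a minimal sketch of this construction, the Python snippet below builds the flight graph for the schedule in Table 1 with a 20 min minimum rest. The names `FLIGHTS`, `parse`, and `build_flight_graph` are illustrative and do not come from the paper, and the rest requirement is assumed to be inclusive (a departure exactly 20 min after an arrival is allowed).

```python
from datetime import datetime, timedelta

# Rows of Table 1: (flight, departure city, destination city, departure, arrival).
FLIGHTS = [
    (0, 0, 1, "08:00", "09:00"),
    (1, 0, 3, "09:30", "10:10"),
    (2, 3, 1, "10:35", "11:35"),
    (3, 1, 2, "09:20", "10:20"),
    (4, 2, 1, "10:40", "11:40"),
    (5, 1, 0, "12:00", "13:20"),
    (6, 0, 1, "13:40", "15:00"),
    (7, 1, 0, "15:20", "15:50"),
    (8, 1, 0, "15:30", "16:30"),
]


def parse(hhmm):
    """Parse an HH:MM string into a datetime (the date part is irrelevant here)."""
    return datetime.strptime(hhmm, "%H:%M")


def build_flight_graph(flights, min_rest_minutes=20):
    """Return {flight: [successors]}: Y follows X iff Y departs from X's
    destination city at least min_rest_minutes after X arrives."""
    rest = timedelta(minutes=min_rest_minutes)
    graph = {f[0]: [] for f in flights}
    for x, _, x_dest, _, x_arr in flights:
        for y, y_dep_city, _, y_dep, _ in flights:
            if x != y and y_dep_city == x_dest and parse(y_dep) >= parse(x_arr) + rest:
                graph[x].append(y)
    return graph


graph = build_flight_graph(FLIGHTS)
# Sink nodes have no outgoing edge: no flight can follow them for the same crew.
sinks = [f for f, successors in graph.items() if not successors]
```

Under these assumptions, flight 0 can be followed by flights 3, 5, 7, or 8, and flights 7 and 8 are the sink nodes.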
We assume that any crew members may be assigned to any flights, i.e., this is
a single fleet problem. This assumption does not excessively oversimplify the
actual problem since our approach can be applied separately to each fleet type,
if necessary.
The goal of the algorithm is to produce solutions that respect constraints,
minimize number of crews, maximize crew time utilization, and balance
workloads (between crews).
Table 1
A sample schedule
Flight number   Dep. city   Des. city   Dep. time   Arr. time
0               0           1           08:00       09:00
1               0           3           09:30       10:10
2               3           1           10:35       11:35
3               1           2           09:20       10:20
4               2           1           10:40       11:40
5               1           0           12:00       13:20
6               0           1           13:40       15:00
7               1           0           15:20       15:50
8               1           0           15:30       16:30
Fig. 1. Directed graph representation for the flight schedule in Table 1 with minimum 20 min rest
time between consecutive flights.
Each crew assignment contains a set of rotations (pairings) for each crew.
Each rotation consists of duty periods whose sequence represents a path in the
flight graph. Fig. 2 depicts four possible chromosomes for the schedule in Table
1. The algorithm attempts to minimize the total crew cost subject to the following constraints:
1. Each flight must be covered by a crew.
2. Each rotation must start and end at the same city.
3. Each rotation must start and end at the same base city.
4. There must be a minimum time delay between consecutive flight legs in a rotation.
5. If the length of the duty period is less than 8 h, then the next duty period
must start at least 8 h later.
6. If the length of the duty period is greater than 8 h and less than 13 h, then
the next duty period must start at least 12 h later.
Fig. 2. Four chromosomes are defined for the problem in Table 1; of these, (b) is not preferred
since the duty period of the second crew exceeds 8 h.
The cost includes overnight stays, deadheadings (crews flying as passengers),
and overtime pay. Each crew is paid a minimum amount to fly a certain
amount of time in a given period. If the schedule assigns unbalanced rotations, then the company is penalized twice: through high overtime and through
under-utilized personnel time. The goal is to build a schedule that has balanced rotations, utilizes the paid time of personnel, and minimizes overtime
and overnight payments.
A cost value for a rotation is calculated by adding terms corresponding to
the following:
· under-utilized time between rotations,
· under-utilized time between duties in each rotation,
· hotel and per-diem expenses for each rotation, and
· required and overtime (if needed) pay for each duty in each rotation.
The total cost of the schedule is obtained by summing the costs of all crew
schedules. A fitness value is calculated by using the cost augmented by penalty
terms for each violation of constraints 1, 2 and 3, listed earlier.
3. Details of genetic algorithm
GraGA is a steady-state genetic algorithm that operates on the flight graph
representations. Each individual (chromosome) contains a sequence of rotations (pairings) for each crew (see Fig. 3). Each rotation sequence represents a
schedule for a crew. Each rotation contains a sequence of duty periods, containing a sequence of edges. The length of the rotations, duty periods, and rest
periods are restricted by the regulations. A new offspring may replace one of its
parents, using binary tournament selection, except that the best solution in the
population is not replaced.
Fig. 3. GraGA algorithm.
The mutation operators resemble well-known GA crossover operators (1PX
or 2PX), applied within a single individual: select two edge lists, cut both of
them at some suitable point(s), and splice together the pieces.
We have experimented with three recombination operators, applied to rotations. The set based operator attempts to use genes inherited from both
parents as much as possible. The time based operator cuts the schedule into two
pieces and recombines them. The distance preserving operator attempts to
preserve genes common to both parents but fills up the rest using alleles not
used in any parent.
Local search follows the application of recombination and mutation operators. Local search reorganizes the rotations based on time and attempts to
build a schedule as tight as possible. Then, it applies the 1PX and 2PX operators to
the rotations of a schedule. An offspring may replace a parent when this improves the schedule cost.
Individuals in the initial population are obtained using the following algorithm:
· Let $(V', E')$ be the current graph, and let $E_s$ contain edges in $E'$ incident on
sink nodes (with outdegree 0).
· While $E_s \neq \emptyset$ and $V' \neq \emptyset$, do:
produce a feasible path by concatenating edges backwards, beginning
from edges in $E_s$;
update $E'$ by removing edges incident on the covered nodes, $V'$ by removing covered nodes, and $E_s$ to contain edges connected to sink nodes in the
new $(V', E')$.
· If $V' \neq \emptyset$, then repeat the previous step after reinserting some previously removed edges into the current graph, penalizing these to discourage their reuse.
· If $V' \neq \emptyset$, then cover these nodes (flights) while relaxing constraint 1, allowing deadheading (more than one crew per flight).
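The first two steps of this procedure can be sketched as follows. This is only an illustration under simplifying assumptions: paths are feasible in the static sense of the flight graph only (duty-length and rest rules are ignored), the edge reinsertion and deadheading steps are omitted, and the leftover flights are simply returned. The adjacency list `GRAPH` is our own derivation from Table 1 with a 20 min rest, and `initial_paths` is a hypothetical name.

```python
import random

# Flight graph derived from Table 1 with a 20 min minimum rest (an assumption
# of this sketch): GRAPH[x] lists the flights that may follow flight x.
GRAPH = {
    0: [3, 5, 7, 8], 1: [2], 2: [5, 7, 8], 3: [4], 4: [5, 7, 8],
    5: [6], 6: [7, 8], 7: [], 8: [],
}


def initial_paths(graph, rng):
    """Cover nodes with disjoint paths built backwards from current sink nodes.

    Returns (paths, uncovered), where uncovered holds the nodes left once no
    edge into a current sink remains.
    """
    remaining = set(graph)
    preds = {v: [] for v in graph}
    for u, successors in graph.items():
        for v in successors:
            preds[v].append(u)

    paths = []
    while True:
        # Edges of the current graph that end in a current sink node (E_s).
        sink_edges = [
            (u, v)
            for v in sorted(remaining)
            if not any(s in remaining for s in graph[v])
            for u in preds[v]
            if u in remaining
        ]
        if not sink_edges:
            break
        u, v = rng.choice(sink_edges)
        path = [u, v]
        # Concatenate edges backwards until no uncovered predecessor is left.
        while True:
            candidates = [p for p in preds[path[0]] if p in remaining]
            if not candidates:
                break
            path.insert(0, rng.choice(candidates))
        remaining -= set(path)  # remove covered nodes and, implicitly, their edges
        paths.append(path)
    return paths, remaining


paths, uncovered = initial_paths(GRAPH, random.Random(0))
```

Any flights left in `uncovered` correspond to the case $V' \neq \emptyset$ that the last two steps of the procedure are meant to handle.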
The restoration algorithm is required when the solution does not cover all the
flights. Where possible, previously discovered low cost rotations are used to
modify an individual candidate solution, with restricted backtracking required
in some cases.
A restart procedure is applied periodically, introducing new randomly
generated chromosomes into the population. This procedure sorts the chromosomes in decreasing fitness order, and starting from the second individual
marks a chromosome to be replaced if it is within a `distance' (defined below)
$< d$ from any of the preceding unmarked chromosomes. If $E(c_i)$ denotes the set
of edges used by chromosome $c_i$, then the distance between chromosomes $c_i$
and $c_j$ is defined to be $d(c_i, c_j) = 1 - |E(c_i) \cap E(c_j)| / |E(c_i)|$. Each of our
experiments applied the restart procedure 10 times.
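A small sketch of the distance measure and the marking step is given below. It assumes that each chromosome is summarized by a non-empty set of edges, that the candidate chromosome plays the role of $c_i$ (the measure is not symmetric), and that the threshold is passed in as a parameter. The function names are illustrative, not taken from the paper.

```python
def distance(edges_i, edges_j):
    """d(c_i, c_j) = 1 - |E(c_i) & E(c_j)| / |E(c_i)|; edges_i must be non-empty."""
    return 1.0 - len(edges_i & edges_j) / len(edges_i)


def mark_for_replacement(population, threshold):
    """Return indices of chromosomes to replace during a restart.

    population is a list of (fitness, edge_set) pairs. Chromosomes are visited
    in decreasing fitness order; one is marked if it lies closer than
    threshold to any earlier chromosome that was not itself marked.
    """
    ranked = sorted(range(len(population)),
                    key=lambda k: population[k][0], reverse=True)
    kept, marked = [], []
    for k in ranked:
        edges = population[k][1]
        if any(distance(edges, population[m][1]) < threshold for m in kept):
            marked.append(k)
        else:
            kept.append(k)  # the fittest chromosome always lands here
    return marked


population = [
    (0.9, {(0, 3), (3, 4)}),
    (0.8, {(0, 3), (3, 4)}),   # duplicate of the fittest chromosome
    (0.5, {(1, 2), (2, 5)}),   # shares no edge with the others
]
marked = mark_for_replacement(population, threshold=0.5)
```

In this toy population only the duplicate (index 1) is marked, so the restart would replace it with a newly generated chromosome while keeping the two distinct ones.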
4. Results
Test data sets were obtained by querying various airline internet sites. For
ease of comparison with the SCP solution techniques, we enumerated all columns whose total time duration is at most three days.
Computational experiments compared GraGA with greedy algorithms developed based on Chvátal's heuristic [5], Bäck's heuristic [6] and our new
heuristic, which multiplies Bäck's cost-ratio by the number of overlaps between the current cover and the cover of the candidate column. We also examined results obtained using a GA implemented for the SCP [2].
GraGA performed better than other approaches in many respects: better
workload distribution, smaller number of crews, and fewer deadheadings, as
illustrated in Table 2. In some problems, GraGA produced satisfactory results
whereas the number of columns was too large for column based approaches to
be applied.
To conclude, the proposed graph representation is very useful for highly
constrained transportation problems, reducing the search space and facilitating the application of recombination operators. Our approach does away
with the burden of column generation, hopefully making this set of problems
GA-solvable. Recent results [7] show success in applying this approach to the
vehicle routing problem with time windows (VRPTW).
Appendix A. Fitness function
A cost value for a rotation is calculated by adding the amount paid to crew
and hotels for overnight stays. Using the variables described in Fig. 4, the
following criteria are applied when evaluating each rotation:
· No duty can be longer than MaxDutyTime or contain more than
MaxFlyingTime minutes of flying.
Table 2
The cost of the best solution obtained using various algorithms; #F denotes the number of flights, #C denotes the number of columns, and #D denotes the
number of days spanned by the schedule
Problem name     #F    #C      #D   Chvátal    Bäck       SCP based  Greedy     GraGA
                                    SCP [5]    SCP [6]    CBGA [2]   SCP (new)
GBXORD_M_26      26    16      1    3600       3600       3600       3600       3600
Delta_B727_43    43    40      1    9698       9730       9356       9730       8912
MandarinAir      28    52      7    22609.7    17609      18546.3    17609      15992.1
GBXSEA_M_55      55    76      1    8340       7540       7120       7553       7112
OrlandoAir       54    81      1    9636.6     8914.1     8790.3     8914.1     8975.33
GBXDFW_M_63      63    107     1    8400       7054       7080       7107       7092
GBXSFO_M_46      46    122     2    6900       4500       4500       4527       4500
Delta_B737_85    85    123     2    16070      15903      15598      15903      15576
LynxAir          32    133     7    10441.5    9088.5     9756.5     9118.5     8349.5
GBXDFW_M_86      86    182     1    11400      9444       9120       9432       9452
GBXSEA_M_75      75    184     1    12940      10310      9680       10310      10100
GBXMIA_M_93      93    205     1    14752.8    12072.8    12100.8    12072.8    11707.8
GBXDEN_M_72      72    295     1    10050      8750       8290       8750       8180
Delta_MD88_173   173   399     2    39279.7    38020.5    35000.7    37696.5    39435.5
GBXSEA_M_110     110   453     1    19270      13680      14070      14010      14500
Delta_102        102   857     1    16840      13272      12592      13310      12210
GBXATL_M_118     118   904     1    14114      9755       10679      9740       9580
GBXORD_MT_73     73    1008    2    14984      11414      10784      11530      10678
GBXBOS_M_73      73    1105    1    11400      6360       6380       6360       5800
Delta_B727_86    86    1466    2    20248.9    18744.5    16946      18440.5    20732.3
Delta_133        133   1698    1    23806.9    20335.9    20043.1    20365.8    18961.3
Delta_MD88_333   333   3428    2    78343.8    71984.7    78421      71714.7    77440.8
Delta_MD88_380   380   4415    2    89482.1    82463.9    88907.5    82121.7    90517
OrcaAir          83    5318    7    22042.8    18207.3    19356.5    17593.5    16914.2
Delta_217        217   5618    2    40398.6    29178.6    33340.6    29224.6    30006.6
GBXSFO_MT_97     97    15250   2    16593.5    11298.5    11473.5    11319.5    11298.5
GBXDFW_MT_158    158   14254   2    27444.2    23184      23854.8    24327      30026.2
THY_74           74    21308   7    22636.8    15431.9    17085      19420.9    12321.5
Number of best results              1          7          7          3          17
Fig. 4. Variables used in the problem to calculate the cost of each schedule, with values used in our
simulations.
· If a duty contains less than MaxFlyingTime minutes of flying then a crew gets
paid at PayForMinPaidFlyingTime (PFMPFT).
· If the flying time ($s$) in a duty exceeds MinPaidFlyingTime ($M$) then a crew
earns
$$P + (s - M) \times E.$$
· If a crew stays overnight in a place different from his/her domicile,
then for each overnight stay
$$H + PDE \times \mathit{LengthOfStay}$$
will be charged to this rotation.
· If the length of a duty ($ld$) is longer than AveDutyTime (ADT), then the minimum required rest time (MRRT) is calculated as
$$\frac{ld - ADT}{MxDT - ADT} \times (MxR - MnR) + MnR,$$
where MxR, MnR, and ADT conform to the standard values used in the
industry.
· If the length of rest (RT) between two duties is longer than the minimum required
rest time (MRRT), then this under-utilized crew time is added as a UUC:
$$UUC = (RT - MRRT) \times CPMU.$$
In summary, the components of cost for a sequence of rotations assigned to a
crew are:
· under-utilized time between rotations,
· under-utilized time between duties in each rotation,
· hotel and per-diem expenses for each rotation, and
· required pay for each duty in each rotation.
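The rest-time and pay rules can be illustrated with a short sketch. It reads the rest-time rule as a linear interpolation between MnR (at a duty length of ADT) and MxR (at MxDT), the under-utilization charge as (RT − MRRT) × CPMU, and the duty pay as P + (s − M) × E. These readings, the function names, and the numeric values are assumptions of this sketch: the values below are borrowed from constraints 5 and 6 (8 h and 12 h rests, 8 h and 13 h duty lengths) and are not the values of Fig. 4.

```python
def min_required_rest(ld, adt, mxdt, mnr, mxr):
    """Minimum required rest (MRRT) after a duty of length ld.

    Assumption: duties no longer than adt require the minimum rest mnr;
    longer duties scale linearly up to mxr at the maximum duty time mxdt.
    """
    if ld <= adt:
        return mnr
    return (ld - adt) / (mxdt - adt) * (mxr - mnr) + mnr


def under_utilized_cost(rt, mrrt, cpmu):
    """Charge rest beyond the required minimum at rate cpmu (UUC)."""
    return max(0.0, rt - mrrt) * cpmu


def duty_pay(s, m, p, e):
    """Pay p for up to m minutes of flying, plus e per minute above m."""
    return p if s <= m else p + (s - m) * e


# Hypothetical parameter values, all in minutes.
ADT, MXDT, MNR, MXR = 8 * 60, 13 * 60, 8 * 60, 12 * 60

mrrt = min_required_rest(ld=630, adt=ADT, mxdt=MXDT, mnr=MNR, mxr=MXR)
uuc = under_utilized_cost(rt=660, mrrt=mrrt, cpmu=0.5)
pay = duty_pay(s=300, m=240, p=100.0, e=1.0)
```

With these illustrative numbers, a 10.5 h duty requires 10 h of rest, so an 11 h rest leaves one hour of under-utilized time to be charged.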
Each rotation starts from and ends at the same base city and there is a minimum time requirement between consecutive flights (30 min). After the cost of
the schedule is calculated, a fitness value is assigned to this solution. The fitness
value is defined as
$$(1 + scost)\left(1 - \frac{nef}{nf}\right)\left(1 - \frac{nsf}{nf}\right)\left(1 - \frac{nuf}{nf}\right),$$
where
$$scost = 1 - \frac{cost - minCostInPop}{maxCostInPop - minCostInPop};$$
minCostInPop is the minimum cost value in the current population; maxCostInPop is the maximum cost value in the current population; nf is the number
of flights in the problem; nef is the number of rotations that do not end where
they started; nsf is the number of rotations that do not start from a base city;
and nuf is the number of uncovered flights.
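A minimal sketch of this computation is shown below. It assumes the fitness is the product of $(1 + scost)$ and the three penalty factors $(1 - nef/nf)$, $(1 - nsf/nf)$, and $(1 - nuf/nf)$, so that larger values are better. The handling of a population in which all costs are equal is our own choice, and the function names are illustrative.

```python
def scaled_cost(cost, min_cost, max_cost):
    """scost = 1 - (cost - min) / (max - min): 1 for the cheapest schedule in
    the population, 0 for the most expensive one."""
    if max_cost == min_cost:
        return 1.0  # assumption: all schedules are equally cheap
    return 1.0 - (cost - min_cost) / (max_cost - min_cost)


def fitness(cost, min_cost, max_cost, nf, nef, nsf, nuf):
    """Scaled cost discounted by the fraction of each constraint violation."""
    s = scaled_cost(cost, min_cost, max_cost)
    return (1 + s) * (1 - nef / nf) * (1 - nsf / nf) * (1 - nuf / nf)


best = fitness(cost=100, min_cost=100, max_cost=200, nf=10, nef=0, nsf=0, nuf=0)
worst = fitness(cost=200, min_cost=100, max_cost=200, nf=10, nef=0, nsf=0, nuf=0)
half_uncovered = fitness(cost=100, min_cost=100, max_cost=200,
                         nf=10, nef=0, nsf=0, nuf=5)
```

Under this reading, a violation-free schedule scores between 1 and 2, and leaving half of the flights uncovered halves the score of even the cheapest schedule.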
References
[1] G. Yu (Ed.), Operations Research in Airline Industry, Kluwer Academic Publishers, Dordrecht,
1998.
[2] J.E. Beasley, P.C. Chu, A genetic algorithm for the set covering problem, Eur. J. Oper. Res. 94
(1996) 392–404.
[3] R. Anbil, E. Gelman, B. Patty, R. Tanga, Recent advances in crew-pairing optimization at
American Airlines, Interfaces 21 (1) (1991) 62–74.
[4] J.-M. Rousseau, J. Desrosiers, Results obtained with Crew-Opt: a column generation method
for transit crew scheduling, in: Proceedings of the Sixth International Workshop on Computer-Aided Scheduling of Public Transport, Springer, New York, 1995, pp. 349–358.
[5] V. Chvátal, A greedy heuristic for the set covering problem, Math. Oper. Res. 4 (3) (1979) 233–235.
[6] T. Bäck, M. Schütz, S. Khuri, A comparative study of a penalty function, a repair heuristic,
and stochastic operators with the set-covering problem, in: Proceedings of the European
Conference on Artificial Evolution, Springer, New York, 1995, pp. 320–332.
[7] H.T. Ozdemir, Graph based evolutionary algorithms for transportation problems, Ph.D.
dissertation, Department of EECS, Syracuse University, 2001.