NIPS2019 TGAN Supplementary PDF
Related Work
TimeGAN integrates ideas from autoregressive models for sequence prediction [1, 2, 3], GAN-based
methods for sequence generation [4, 5, 6], and time-series representation learning [7, 8, 9]; their
relationship to our work is discussed in detail in the main manuscript. In this section, we additionally
discuss peripherally related methods, including RNN-based sequence models that use variational
autoencoders, as well as GAN-based approaches for semi-supervised learning.
RNN-based models have been combined with variational autoencoders to generate sequences. In [10],
this was done by learning to map entire sequences to single latent vectors, with the goal of capturing
high-level properties of sequences and interpolating in latent space. This idea was extended to the
general time-series setting [11], with the additional proposal that the trained weights and network
states can be used to initialize standard RNN models. However, in both cases sampling from the
prior over these representations is followed by a purely deterministic decoder, so the only source of
variability lies in the conditional output probability model. On the other hand, [12] proposed
augmenting the representational power of the standard RNN model with stochastic latent variables at
each time step. Recognizing that model variability should induce dependencies across time steps,
[13] further extended this approach to accommodate temporal dependencies between latent random
variables, and [14] explicitly layers a state space model on top of the RNN structure. In parallel,
this technique has been applied to temporal convolutional sequence models, with
stochastic latent variables injected into the WaveNet architecture [15]. However, the focus of these
methods is specifically on encoding sufficient input variability to model highly structured data (such
as speech and handwriting). In particular, they do not ensure that new sequences unconditionally
sampled from the model match the underlying distribution of the training data. By contrast, TimeGAN
focuses on learning the entire (joint) distribution such that sampled data match the original, while
simultaneously ensuring that the model respects the (conditional) dynamics of the original data.
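Concretely, this corresponds to the two complementary objectives formalized in the main manuscript: a global objective that matches the joint distribution of the static features S and temporal features X_{1:T}, and a local objective that matches the stepwise conditionals. Restated here for convenience (D denotes an appropriate divergence measure):

```latex
% Global (joint) objective: match the full joint distribution.
\min_{\hat{p}} \, D\!\left( p(S, X_{1:T}) \,\middle\|\, \hat{p}(S, X_{1:T}) \right)
% Local (stepwise) objective: match the conditional dynamics at each time step.
\min_{\hat{p}} \, D\!\left( p(X_t \mid S, X_{1:t-1}) \,\middle\|\, \hat{p}(X_t \mid S, X_{1:t-1}) \right)
```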
It is also worth mentioning that our method bears superficial resemblance to GAN-based approaches
for semi-supervised learning. With methods such as [18, 19, 20], the task of interest is one of
supervised classification, with an auxiliary unsupervised loss for generating additional unlabeled
examples for training. Conversely, the focus of TimeGAN is on the unsupervised task of generative
modeling, with an auxiliary supervised loss to provide additional control over the network’s dynamics.
Table 1: Summary of Related Work. (Open-loop: previous outputs are used as conditioning information for generation at each step; Mixed-variables: accommodates static & temporal variables.)

|                   | C-RNN-GAN [4] | RCGAN [5] | T-Forcing [16, 17] | P-Forcing [2] | TimeGAN (Ours) |
|-------------------|:---:|:---:|:---:|:---:|:---:|
| Stochastic        | ✓ | ✓ |   |   | ✓ |
| Open-loop         | ✓ |   | ✓ | ✓ | ✓ |
| Adversarial loss  | ✓ | ✓ |   | ✓ | ✓ |
| Supervised loss   |   |   | ✓ | ✓ | ✓ |
| Discrete features |   |   | ✓ |   | ✓ |
| Embedding space   |   |   |   |   | ✓ |
| Mixed-variables   |   |   |   |   | ✓ |
Additional Illustrations
Figure 1(b) in the main manuscript details the training scheme for TimeGAN. For side-by-side
comparison, Figure 1(a) below additionally illustrates the training scheme for the existing methods
C-RNN-GAN and RCGAN, which employ the standard GAN setup during training. Furthermore,
Figure 1(b) below compares the flow of data at sampling time for TimeGAN and these existing
methods.
Figure 1: Block diagram of our proposed method (TimeGAN), shown here in comparison with
existing methods (C-RNN-GAN and RCGAN) during (a) training time, as well as (b) sampling
time. Solid lines indicate forward propagation of data, and dashed lines indicate backpropagation of
gradients.
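To make the sampling-time data flow concrete, the following is a minimal sketch of the TimeGAN sampling path (noise → generator → latent codes → recovery → feature space), written in PyTorch purely for illustration; the module and variable names mirror the notation of Figure 1 rather than any particular implementation, and the handling of static features is omitted for brevity.

```python
import torch
import torch.nn as nn

class TimeGANSampler(nn.Module):
    """Illustrative sampling path: z -> generator -> latent codes -> recovery -> features."""

    def __init__(self, generator: nn.Module, recovery: nn.Module):
        super().__init__()
        self.generator = generator  # maps noise z_{1:T} to synthetic latent codes h_hat_{1:T}
        self.recovery = recovery    # maps latent codes back to the feature space x_hat_{1:T}

    @torch.no_grad()
    def sample(self, batch_size: int, seq_len: int, z_dim: int) -> torch.Tensor:
        z = torch.randn(batch_size, seq_len, z_dim)  # per-step noise vectors z_{1:T}
        h_hat = self.generator(z)                    # synthetic latent trajectory
        return self.recovery(h_hat)                  # synthetic sequence in the original feature space

# By contrast, C-RNN-GAN and RCGAN sample directly in feature space
# (z -> generator -> x_hat), with the discriminator applied to x_hat;
# TimeGAN applies both the generator and the discriminator within the learned embedding space.
```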
Implementation Details
Open-source implementations of the benchmark methods are available at the following repositories:
• T-Forcing [16]: https://github.com/snowkylin/rnn-handwriting-generation
• P-Forcing [2]: https://github.com/anirudh9119/LM_GANS
• WaveNet [21]: https://github.com/ibab/tensorflow-wavenet
• WaveGAN [22]: https://github.com/chrisdonahue/wavegan
For fair comparison, we use the same underlying recurrent network architecture for C-RNN-GAN,
RCGAN, T-Forcing, and P-Forcing as is used in TimeGAN: 3-layer GRUs with hidden dimensions 4
times the size of the input features. In the case of deterministic models (such as T-Forcing and
P-Forcing), we first train a standard GAN to generate feature vectors for the initial time step, such
that these follow the original feature distribution at the first time step. Then, using a generated
feature vector as the initial input, each model generates the remainder of the sequence in open-loop
mode. Finally, the post-hoc time-series classification and sequence-prediction models are implemented
as 2-layer LSTMs with hidden dimensions 4 times the size of the input features. As before, we use
tanh as the hidden activation function and sigmoid as the output-layer activation function, so that
outputs lie within the [0, 1] range.
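As a minimal sketch of the shared recurrent backbone described above (3-layer GRUs, hidden dimension 4 times the number of input features, tanh activations, sigmoid outputs in [0, 1]), one possible PyTorch rendering is shown below; the module name and constructor arguments are illustrative, not the authors' implementation.

```python
import torch
import torch.nn as nn

class RecurrentBackbone(nn.Module):
    """Shared recurrent architecture for the benchmarks: 3-layer GRU with
    hidden size 4x the input feature dimension and sigmoid outputs in [0, 1]."""

    def __init__(self, input_dim: int, num_layers: int = 3):
        super().__init__()
        hidden_dim = 4 * input_dim                    # hidden dimension is 4x the input features
        self.rnn = nn.GRU(input_dim, hidden_dim,
                          num_layers=num_layers,
                          batch_first=True)           # GRU cells use tanh internally
        self.out = nn.Linear(hidden_dim, input_dim)   # project back to the feature space

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, _ = self.rnn(x)                            # (batch, seq_len, hidden_dim)
        return torch.sigmoid(self.out(h))             # keep outputs within the [0, 1] range

# The post-hoc classification and prediction models follow the same pattern,
# with nn.LSTM and num_layers=2 in place of the 3-layer GRU.
```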
The Google Stocks dataset is available online, and can be downloaded from: LINK. The UCI
Appliances Energy Prediction dataset is also available online, and can be downloaded from: LINK.
Algorithm Pseudocode
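As a high-level illustration of the training procedure summarized in the main manuscript (embedding, supervised, and joint adversarial stages), a schematic sketch is given below in PyTorch-style pseudocode. The module names, loss weights (eta, lam), and the simplified update schedule are assumptions made for illustration; they are not the authors' exact algorithm or hyperparameters.

```python
import torch
import torch.nn.functional as F

def train_timegan(embedder, recovery, generator, supervisor, discriminator,
                  opt_er, opt_gs, opt_d, loader, joint_steps=1000, eta=1.0, lam=0.1):
    """Schematic three-stage training loop. opt_er optimizes embedder + recovery,
    opt_gs optimizes generator + supervisor, opt_d optimizes the discriminator.
    All networks operate on tensors of shape (batch, seq_len, dim)."""
    bce = F.binary_cross_entropy_with_logits  # discriminator is assumed to output logits

    # Stage 1: embedding network training (reconstruction loss only).
    for x in loader:
        loss_r = F.mse_loss(recovery(embedder(x)), x)
        opt_er.zero_grad(); loss_r.backward(); opt_er.step()

    # Stage 2: supervised training -- predict the next latent code on real sequences.
    for x in loader:
        h = embedder(x).detach()
        loss_s = F.mse_loss(supervisor(h)[:, :-1], h[:, 1:])
        opt_gs.zero_grad(); loss_s.backward(); opt_gs.step()

    # Stage 3: joint adversarial + supervised + reconstruction training.
    for _, x in zip(range(joint_steps), loader):
        h = embedder(x).detach()
        z = torch.randn_like(h)  # noise dimension assumed equal to the latent dimension

        # (a) Generator/supervisor step: fool the discriminator while respecting
        #     the stepwise dynamics of real latent codes.
        d_fake = discriminator(supervisor(generator(z)))
        loss_u = bce(d_fake, torch.ones_like(d_fake))
        loss_s = F.mse_loss(supervisor(h)[:, :-1], h[:, 1:])
        opt_gs.zero_grad(); (loss_u + eta * loss_s).backward(); opt_gs.step()

        # (b) Embedder/recovery step: reconstruction, lightly regularized by the supervised loss.
        h = embedder(x)
        loss_r = F.mse_loss(recovery(h), x)
        loss_s = F.mse_loss(supervisor(h)[:, :-1], h[:, 1:])
        opt_er.zero_grad(); (loss_r + lam * loss_s).backward(); opt_er.step()

        # (c) Discriminator step on real vs. synthetic latent codes.
        with torch.no_grad():
            h_real, h_fake = embedder(x), supervisor(generator(z))
        d_real, d_fake = discriminator(h_real), discriminator(h_fake)
        loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()
```

In practice, the relative frequency of generator and discriminator updates in the joint stage is a tunable design choice; the fixed one-to-one schedule above is kept only for readability.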
Additional Visualizations with t-SNE and PCA
Figure 2: t-SNE (1st column) and PCA (2nd column) visualizations on Sines, and t-SNE (3rd column)
and PCA (4th column) visualizations on Stocks. Each row corresponds to one of the 7
benchmarks, ordered as follows: (1) TimeGAN, (2) RCGAN, (3) C-RNN-GAN, (4) T-Forcing, (5)
P-Forcing, (6) WaveNet, and (7) WaveGAN. Red denotes original data, and blue denotes synthetic.
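Such visualizations can be reproduced at a high level as follows; this is a minimal sketch using scikit-learn and matplotlib, assuming real and synthetic data arrays of shape (N, seq_len, dim). The flattening and plotting choices are illustrative rather than the authors' exact procedure.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

def visualize(real: np.ndarray, synthetic: np.ndarray) -> None:
    """Scatter 2-D PCA and t-SNE embeddings of real (red) vs. synthetic (blue) sequences."""
    # Flatten each sequence into a single vector and stack real + synthetic samples.
    X = np.concatenate([real.reshape(len(real), -1),
                        synthetic.reshape(len(synthetic), -1)], axis=0)
    is_synthetic = np.array([0] * len(real) + [1] * len(synthetic))

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    embeddings = [("PCA", PCA(n_components=2).fit_transform(X)),
                  ("t-SNE", TSNE(n_components=2).fit_transform(X))]
    for ax, (name, emb) in zip(axes, embeddings):
        ax.scatter(*emb[is_synthetic == 0].T, c="red", s=5, alpha=0.5, label="Original")
        ax.scatter(*emb[is_synthetic == 1].T, c="blue", s=5, alpha=0.5, label="Synthetic")
        ax.set_title(name)
        ax.legend()
    plt.show()
```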
References
[1] Samy Bengio, Oriol Vinyals, Navdeep Jaitly, and Noam Shazeer. Scheduled sampling for
sequence prediction with recurrent neural networks. In Advances in Neural Information
Processing Systems, pages 1171–1179, 2015.
[2] Alex M Lamb, Anirudh Goyal Alias Parth Goyal, Ying Zhang, Saizheng Zhang, Aaron C
Courville, and Yoshua Bengio. Professor forcing: A new algorithm for training recurrent
networks. In Advances in Neural Information Processing Systems, pages 4601–4609, 2016.
[3] Vijay R Konda and John N Tsitsiklis. Actor-critic algorithms. In Advances in Neural Information
Processing Systems, pages 1008–1014, 2000.
[4] Olof Mogren. C-RNN-GAN: Continuous recurrent neural networks with adversarial training. arXiv
preprint arXiv:1611.09904, 2016.
[5] Cristóbal Esteban, Stephanie L Hyland, and Gunnar Rätsch. Real-valued (medical) time series
generation with recurrent conditional GANs. arXiv preprint arXiv:1706.02633, 2017.
[6] Giorgia Ramponi, Pavlos Protopapas, Marco Brambilla, and Ryan Janssen. T-CGAN: Conditional
generative adversarial network for data augmentation in noisy time series with irregular sampling.
arXiv preprint arXiv:1811.08295, 2018.
[7] Anders Boesen Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, and Ole Winther.
Autoencoding beyond pixels using a learned similarity metric. arXiv preprint arXiv:1512.09300,
2015.
[8] Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Ar-
jovsky, and Aaron Courville. Adversarially learned inference. arXiv preprint arXiv:1606.00704,
2016.
[9] Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, and Brendan Frey. Adver-
sarial autoencoders. arXiv preprint arXiv:1511.05644, 2015.
[10] Samuel R Bowman, Luke Vilnis, Oriol Vinyals, Andrew M Dai, Rafal Jozefowicz, and Samy
Bengio. Generating sentences from a continuous space. arXiv preprint arXiv:1511.06349,
2015.
[11] Otto Fabius and Joost R van Amersfoort. Variational recurrent auto-encoders. arXiv preprint
arXiv:1412.6581, 2014.
[12] Justin Bayer and Christian Osendorfer. Learning stochastic recurrent networks. arXiv preprint
arXiv:1411.7610, 2014.
[13] Junyoung Chung, Kyle Kastner, Laurent Dinh, Kratarth Goel, Aaron C Courville, and Yoshua
Bengio. A recurrent latent variable model for sequential data. In Advances in Neural Information
Processing Systems, pages 2980–2988, 2015.
[14] Marco Fraccaro, Søren Kaae Sønderby, Ulrich Paquet, and Ole Winther. Sequential neural
models with stochastic layers. In Advances in Neural Information Processing Systems, pages
2199–2207, 2016.
[15] Guokun Lai, Bohan Li, Guoqing Zheng, and Yiming Yang. Stochastic WaveNet: A generative
latent variable model for sequential data. arXiv preprint arXiv:1806.06116, 2018.
[16] Alex Graves. Generating sequences with recurrent neural networks. arXiv preprint
arXiv:1308.0850, 2013.
[17] Ilya Sutskever, James Martens, and Geoffrey E Hinton. Generating text with recurrent neural
networks. In Proceedings of the 28th International Conference on Machine Learning (ICML-11),
pages 1017–1024, 2011.
[18] Durk P Kingma, Shakir Mohamed, Danilo Jimenez Rezende, and Max Welling. Semi-supervised
learning with deep generative models. In Advances in Neural Information Processing Systems,
pages 3581–3589, 2014.
[19] Abhishek Kumar, Prasanna Sattigeri, and Tom Fletcher. Semi-supervised learning with GANs:
Manifold invariance with improved inference. In Advances in Neural Information Processing
Systems, pages 5534–5544, 2017.
[20] Takumi Kobayashi. GAN-based semi-supervised learning on fewer labeled samples.
[21] Aäron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex
Graves, Nal Kalchbrenner, Andrew W Senior, and Koray Kavukcuoglu. WaveNet: A generative
model for raw audio. SSW, 125, 2016.
[22] Chris Donahue, Julian McAuley, and Miller Puckette. Adversarial audio synthesis. arXiv
preprint arXiv:1802.04208, 2018.