
DNN Assignment Part A: Literature Exploration and Comparison

Traffic Flow Prediction


Group No 168
Team Contribution:
Akhand Pratap, BITS ID 2023ac05333: 100%
Agam Kumar, BITS ID 2023ac05102: 100%
Ajeet Kumar, BITS ID 2023ac05561: 100%
Suram Swathi, BITS ID 2023ac05931: 100%

Application Domain: Traffic Flow Prediction


Papers:
• Paper 1: Spatial linear transformer and temporal convolution network for traffic flow prediction
• Paper 2: STADGCN: spatial–temporal adaptive dynamic graph convolutional network for traffic flow prediction
• Paper 3: MAPredRNN: Multi-attention predictive RNN for traffic flow prediction by dynamic spatial–temporal data fusion

Authors:
• Paper 1: Zhibo Xing, Mingxia Huang, Wentao Li & Dan Peng
• Paper 2: Ying Shi, Wentian Cui, Ruiqin Wang, Jungang Lou & Qing Shen
• Paper 3: Xiaohui Huang, Yuan Jiang & Jie Tang

Year of Publication:
• Paper 1: 19 February 2024
• Paper 2: 2 October 2024
• Paper 3: 6 March 2023

Proposed models:
• Paper 1: Spatial Linear Transformer with Temporal Convolution Network (SLTTCN)
• Paper 2: Spatial–Temporal Adaptive Dynamic Graph Convolutional Network (STADGCN)
• Paper 3: Multi-Attention Predictive Recurrent Neural Network (MAPredRNN)

Architecture of Deep Learning (including the number of layers, types of layers, activation functions, and any unique features)

Main components:

Paper 1 (SLTTCN):
• Spatial Linear Transformer Network (SLTN): captures the spatial dependency dynamically by incorporating real-time traffic states and the connectivity among nodes.
• Temporal Convolution Network (TCN): captures the temporal dependency using dilated causal convolution layers; the size of the receptive field grows exponentially with the number of stacked layers (a minimal sketch of this dilated-convolution idea follows this section).
• Bidirectional and Gate Fusion Mechanism: used in the TCN to model both past-to-future and future-to-past dependencies and to fuse the two types of temporal dependencies.

Paper 2 (STADGCN):
• Gated Temporal Convolution Network (Gated TCN): utilizes dilated causal convolution networks to efficiently capture long-term temporal dependencies; includes two parallel temporal convolution modules.
• Spatial Static–Dynamic Graph Learning Layer, comprising three sub-modules:
  ▪ Static Adaptive Graph Learning (SAGL): learns a static adjacency matrix to capture global spatial dependencies.
  ▪ Dynamic Graph Learning (DGL): uses a Graph Attention Network (GAT) to adaptively learn time-varying spatial dependencies.
  ▪ Spatial Gate Fusion: merges static and dynamic spatial features with adaptive weights.
• Residual and Skip Connections: used to address vanishing gradients and improve training efficiency.

Paper 3 (MAPredRNN):
• CNN (Convolutional Neural Network): captures spatial features from the input traffic data.
• PredRNN (Predictive Recurrent Neural Network): captures temporal dependencies in the traffic data.
• Multi-Attention Module: dynamically captures spatial dependencies for closeness, periodicity, and trend features; calculates attention weights for feature vectors using dot-product operations, normalizes the weights with softmax, and computes weighted sums of the feature vectors.
• Fusion Mechanism: integrates the features from the multi-attention module, dynamically fusing closeness, periodicity, and trend features using weighted sums, with learnable parameters for the feature weights.
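Both SLTTCN and STADGCN build on dilated causal convolutions. As a rough illustration of how stacking such layers grows the receptive field exponentially, here is a minimal PyTorch sketch; it is our own illustration, not code from either paper, and the channel count and depth are assumptions:

    import torch
    import torch.nn as nn

    class DilatedCausalTCN(nn.Module):
        """Stack of dilated causal 1-D convolutions. With kernel size k and
        dilations 1, 2, 4, ..., the receptive field grows exponentially with
        the number of stacked layers."""

        def __init__(self, channels: int = 32, kernel_size: int = 2, num_layers: int = 4):
            super().__init__()
            self.layers = nn.ModuleList()
            self.paddings = []
            for i in range(num_layers):
                dilation = 2 ** i
                # Left-only (causal) zero padding keeps the sequence length fixed
                # and prevents the convolution from seeing future time steps.
                self.paddings.append((kernel_size - 1) * dilation)
                self.layers.append(
                    nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
                )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, channels, time)
            for pad, conv in zip(self.paddings, self.layers):
                # ReLU activation, as reported for Paper 1's TCN.
                x = torch.relu(conv(nn.functional.pad(x, (pad, 0))))
            return x

    tcn = DilatedCausalTCN()
    out = tcn(torch.randn(8, 32, 12))  # e.g. 12 historical time steps
    print(out.shape)                   # torch.Size([8, 32, 12])

With kernel size 2 and four layers, each output step sees 16 past steps; adding a layer doubles that, which is why the papers use dilation for long-term temporal dependencies.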

Layer details:

Paper 1 (SLTTCN):
• Position embedding to capture spatial correlation.
• Linear self-attention layer to capture dynamic spatial dependencies, which reduces memory and computational complexity.
• Multi-head attention mechanism in the SLTN to ensure efficient learning; specifically, the attention layer has 4 attention heads with input and output dimensions set to 12 (see the attention sketch after this list).
• Sequential and reverse-sequential temporal convolution layers with a convolution kernel size of 3.

Paper 2 (STADGCN):
• The model consists of 8 spatial–temporal layers with residual connections; each layer includes a Gated TCN module, static and dynamic graph learning layers, and a spatial gate fusion module.
• Kernel size of 2 for the dilated causal convolutions.

Paper 3 (MAPredRNN):
• Spatio-temporal feature extraction module: includes a multi-layer CNN and a Predictive RNN (PredRNN), with 2 CNN layers, 2 PredRNN layers, and 2 Deconvolution Neural Network (DCNN) layers.
• Layer types: the CNN extracts spatial features, the PredRNN captures spatio-temporal information, and the deconvolution network generates the traffic flow predictions.
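As a rough illustration of the reported SLTN attention configuration (4 heads, input/output dimension 12), here is a sketch using PyTorch's built-in multi-head attention. The paper implements its own linear-attention variant, so this only approximates the shapes involved, and the node count below is an assumption for illustration:

    import torch
    import torch.nn as nn

    # 4 attention heads over 12-dimensional inputs/outputs, matching the
    # configuration reported for Paper 1's SLTN attention layer.
    attn = nn.MultiheadAttention(embed_dim=12, num_heads=4, batch_first=True)

    x = torch.randn(8, 207, 12)      # (batch, nodes, features); node count is illustrative
    out, weights = attn(x, x, x)     # self-attention over the spatial (node) axis
    print(out.shape, weights.shape)  # torch.Size([8, 207, 12]) torch.Size([8, 207, 207])

The quadratic node-by-node weight matrix above is exactly the cost that the paper's linear self-attention is designed to avoid.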

Activation functions:
• Paper 1 (SLTTCN): ReLU (Rectified Linear Unit) is used in the TCN; the sigmoid function is used in the gate fusion mechanism to balance the fusion of temporal dependencies from both directions (see the gate-fusion sketch after this list).
• Paper 2 (STADGCN): Tanh, Sigmoid, and ReLU.
• Paper 3 (MAPredRNN): not explicitly mentioned in the paper.
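To show how a sigmoid gate can balance two streams of temporal dependencies, here is a minimal sketch of the gate-fusion idea used in Paper 1's bidirectional TCN (and, analogously, Paper 2's spatial gate fusion). This is our own generic formulation, not the papers' exact equations:

    import torch
    import torch.nn as nn

    class GateFusion(nn.Module):
        """Fuse two feature streams with a learned sigmoid gate:
        out = g * a + (1 - g) * b, where g = sigmoid(W [a; b])."""

        def __init__(self, dim: int):
            super().__init__()
            self.gate = nn.Linear(2 * dim, dim)

        def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
            # Sigmoid keeps the gate in (0, 1), so the output is a convex
            # blend of the two directions at every position and channel.
            g = torch.sigmoid(self.gate(torch.cat([a, b], dim=-1)))
            return g * a + (1.0 - g) * b

    fusion = GateFusion(dim=32)
    forward_feats = torch.randn(8, 12, 32)   # past-to-future TCN features
    backward_feats = torch.randn(8, 12, 32)  # future-to-past TCN features
    fused = fusion(forward_feats, backward_feats)
    print(fused.shape)  # torch.Size([8, 12, 32])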

Unique features:

Paper 1 (SLTTCN):
• The SLTN captures dynamic global spatial dependencies, allowing the model to consider relationships between distant nodes rather than just local neighboring nodes.
• The bidirectional TCN helps in modeling complex temporal dependencies in both the past-to-future and future-to-past directions.

Paper 3 (MAPredRNN):
• A multi-attention mechanism designed for dynamic spatial dependencies (see the sketch after this list).
• A feature fusion mechanism for combining closeness, periodicity, and trend information.
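The multi-attention computation described for MAPredRNN (dot-product scores, softmax normalization, weighted sums of feature vectors) can be sketched generically as follows; the query construction and dimensions are our assumptions, not the paper's exact module:

    import torch

    def dot_product_attention(query: torch.Tensor, features: torch.Tensor) -> torch.Tensor:
        """Weighted sum of feature vectors, as in the multi-attention module:
        scores via dot products, weights via softmax, output via weighted sum.

        query:    (batch, dim)
        features: (batch, n, dim), e.g. closeness/periodicity/trend vectors
        """
        scores = torch.einsum("bd,bnd->bn", query, features)   # dot-product scores
        weights = torch.softmax(scores, dim=-1)                # normalize with softmax
        return torch.einsum("bn,bnd->bd", weights, features)   # weighted sum

    q = torch.randn(8, 64)
    feats = torch.randn(8, 3, 64)  # three temporal views: closeness, periodicity, trend
    out = dot_product_attention(q, feats)
    print(out.shape)  # torch.Size([8, 64])

The paper's fusion mechanism additionally learns per-view weights, so the final blend of closeness, periodicity, and trend is trainable rather than fixed.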

How is the network helping the overall task? (e.g., feature engineering, classification, regression, or all)

Paper 1 (SLTTCN):
• Feature engineering: uses the Spatial Linear Transformer to dynamically learn spatial features from traffic data by analyzing spatial correlations in real time.
• Temporal dependency modeling: the bidirectional TCN component efficiently captures temporal dependencies between time steps.
• Prediction: the overall task is traffic flow prediction, and SLTTCN combines spatial and temporal feature learning to make multi-step predictions.

Paper 2 (STADGCN):
• Feature engineering: dynamically learns spatial and temporal features from historical traffic data.
• Prediction: performs time-series forecasting to predict future traffic conditions.

Paper 3 (MAPredRNN):
• Feature engineering: extracts complex spatial–temporal correlations and dynamic spatial dependencies.
• Multifaceted purpose: combines regression (predicting traffic flow values) with spatial–temporal feature extraction for enhanced accuracy.
• Prediction: provides accurate traffic inflow and outflow predictions.

Training procedures (e.g., training strategy, including optimization algorithms, learning rates, batch sizes, and regularization techniques)

Paper 1 (SLTTCN):
• Optimization algorithm: Adam optimizer (a combined training-loop sketch follows this list).
• Learning rate: initial rate of 0.001.
• Batch size: 30.
• Regularization: dropout in the multi-head attention mechanisms and residual blocks to prevent overfitting; zero padding in the temporal convolutional layers keeps the input size consistent across layers.

Paper 2 (STADGCN):
• Optimization algorithm: Adam optimizer.
• Batch size: 32.
• Regularization: dropout with a ratio of 0.3 to prevent overfitting; Z-score normalization applied to the input data for scaling.

Paper 3 (MAPredRNN):
• Optimization algorithm: Adam optimizer.
• Learning rate: 0.0001.
• Batch size: 16.
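For reference, here is a minimal sketch of how the choices above (Adam, the reported learning rates and batch sizes, dropout of 0.3, Z-score normalization, MAE loss) fit together in a PyTorch training loop. The model and data are placeholders, not any paper's network:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F
    from torch.utils.data import DataLoader, TensorDataset

    # Placeholder historical traffic features and targets.
    x = torch.randn(1000, 12)
    y = torch.randn(1000, 1)
    # Z-score normalization of the inputs, as applied in STADGCN.
    x = (x - x.mean(dim=0)) / (x.std(dim=0) + 1e-8)

    model = nn.Sequential(
        nn.Linear(12, 64),
        nn.ReLU(),
        nn.Dropout(p=0.3),  # dropout ratio 0.3, as reported for STADGCN
        nn.Linear(64, 1),
    )

    # Adam optimizer; 0.001 is Paper 1's reported initial rate (Paper 3
    # uses 0.0001). Reported batch sizes are 30 / 32 / 16.
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    loader = DataLoader(TensorDataset(x, y), batch_size=32, shuffle=True)

    for epoch in range(5):
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = F.l1_loss(model(xb), yb)  # MAE loss, as in Papers 1 and 2
            loss.backward()
            optimizer.step()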

Loss function:
• Paper 1 (SLTTCN): Mean Absolute Error (MAE) is used as the loss function to train the model.
• Paper 2 (STADGCN): Mean Absolute Error (MAE) is used as the loss function to train the model.
• Paper 3 (MAPredRNN): Root Mean Square Error (RMSE) is used as the loss function to train the model.
Evaluation / Performance metrics used:
• Paper 1 (SLTTCN): Mean Absolute Error (MAE), which measures the average magnitude of the prediction errors, and Root Mean Squared Error (RMSE), the square root of the average of the squared differences between the predicted and actual values (both sketched after this list).
• Paper 2 (STADGCN): Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and Mean Absolute Percentage Error (MAPE).
• Paper 3 (MAPredRNN): Root Mean Square Error (RMSE).
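These metrics are standard and can be computed as follows (a short PyTorch sketch with toy values; the papers report them over their respective test sets):

    import torch

    def mae(pred: torch.Tensor, true: torch.Tensor) -> torch.Tensor:
        # Mean Absolute Error: average magnitude of the prediction errors.
        return (pred - true).abs().mean()

    def rmse(pred: torch.Tensor, true: torch.Tensor) -> torch.Tensor:
        # Root Mean Squared Error: square root of the mean squared difference.
        return ((pred - true) ** 2).mean().sqrt()

    def mape(pred: torch.Tensor, true: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
        # Mean Absolute Percentage Error: average relative error, in percent.
        return ((pred - true).abs() / (true.abs() + eps)).mean() * 100

    pred = torch.tensor([10.0, 20.0, 30.0])
    true = torch.tensor([12.0, 18.0, 33.0])
    print(mae(pred, true), rmse(pred, true), mape(pred, true))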

Name of Dataset:

Paper 1 (SLTTCN), evaluated on two publicly available datasets:
• PeMSD4: traffic flow data from the California Department of Transportation.
• PeMSD8: another traffic dataset from the same source.
• Dataset URL: https://pems.dot.ca.gov/

Paper 2 (STADGCN):
• METR-LA: traffic speed data from 207 sensors in Los Angeles over four months.
• PEMS-BAY: traffic speed data from 325 sensors in the San Francisco Bay Area over six months.
• Dataset URL: https://github.com/liyaguang/DCRNN

Paper 3 (MAPredRNN):
• TaxiNYC: New York City taxi trip data.
• BikeNYC: New York City bike trip data.

Conclusion:

MAPredRNN (Multi-Attention Predictive Recurrent Neural Network) focuses on dynamic spatial dependencies and effectively captures non-linear spatial–temporal correlations. STADGCN (Spatial–Temporal Adaptive Dynamic Graph Convolutional Network) captures both global (static) and local (dynamic) spatial dependencies, adapts dynamically to temporal and spatial variations, and outperforms baseline models on real-world datasets. SLTTCN (Spatial Linear Transformer and Temporal Convolution Network) optimizes the self-attention mechanism for efficiency and introduces gate fusion to combine forward and backward temporal information. The three models can be summarized by their intended use:
• For high-accuracy needs: MAPredRNN.
• For dynamic and adaptive networks: STADGCN is highly adaptable.
• For efficiency and large-scale data: SLTTCN is optimized for efficiency with large datasets.
