[Day 163] Reading about OD demand matrix prediction models

 Hello :)
Today is Day 163!


A quick summary of today:
  • started concentrating more on OD demand matrix papers
  • trying to recreate baseline models for OD demand matrix prediction


Firstly, the papers I read

Deep Multi-View Spatiotemporal Virtual Graph Neural Network for Significant Citywide Ride-hailing Demand Prediction [link]

Introduction

In spatial-temporal deep learning, two main spatial data representation methods are used: image-based and graph-based. The image-based approach grids urban areas by latitude and longitude, with statistical data as pixel values for CNN models. This approach struggles with data sparsity at high granularity and loss of detail at low granularity. The graph-based approach, used for defined networks like roads, captures dynamics via GCN models, but it depends on structured network data that is not always accessible and it has limited transferability.

The paper proposes a method using high-granularity grids of urban areas, discarding sparse regions, and retaining significant demand signals to create virtual nodes. A Deep Multi-View Spatial-temporal Virtual Graph Neural Network (DMVST-VGNN) is introduced, combining short-term temporal, spatial, and long-term temporal dynamics. This method uses gated 1D convolution for short-term dynamics, Graph Attention Networks for spatial dependencies, and Transformers for long-term dependencies. Contributions include:

  • Modelling virtual graphs of significant ride-hailing demand without external data, addressing spatial sparsity.
  • A novel multi-view deep learning model integrating different temporal and spatial dynamics.

Methodology

Graph generation

The paper takes a new approach to citywide spatial-temporal prediction by converting the image-based spatial representation into a graph-based one. This involves three steps: discarding sparse regions, aggregating similar regions, and constructing virtual graphs. Sparse regions with low demand are discarded, similar regions are aggregated based on Pearson similarity, and virtual graphs representing distance, correlation, and mobility between the aggregated regions are constructed.
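To make the first two steps concrete for myself, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names, the `min_mean` and `sim_threshold` values, and the greedy grouping rule are all my own assumptions, and the third step (building the distance, correlation, and mobility graphs) is left out.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def build_virtual_nodes(demand, min_mean=1.0, sim_threshold=0.9):
    """demand maps a grid cell id to its demand counts per time step.

    Returns a list of groups of cell ids; each group stands for one virtual node.
    The thresholds and the greedy rule are hypothetical, not taken from the paper.
    """
    # Step 1: discard sparse cells whose average demand is too low.
    kept = [c for c, s in demand.items() if sum(s) / len(s) >= min_mean]
    # Step 2: put each cell into the first group whose seed cell is similar enough.
    groups = []
    for cell in kept:
        for group in groups:
            if pearson(demand[cell], demand[group[0]]) >= sim_threshold:
                group.append(cell)
                break
        else:
            groups.append([cell])
    return groups

demand = {
    "a": [10, 20, 30, 40],
    "b": [11, 21, 29, 41],   # moves together with "a"
    "c": [40, 30, 20, 10],   # opposite trend
    "d": [0, 0, 1, 0],       # sparse, gets discarded
}
groups = build_virtual_nodes(demand)
print(groups)
```

In this toy input, "a" and "b" end up in one virtual node, "c" forms its own, and "d" is dropped for being too sparse.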

Short-term temporal dynamics view

The paper uses 1D CNNs to capture short-term temporal dynamics, since they offer performance comparable to RNN structures at a lower computational cost. The short-term temporal dynamics view consists of two 1D CNN units, each followed by a gated linear unit (GLU) as the non-linear activation. Each virtual node is processed by a 1D CNN that looks at the k temporal neighbors of the input time series, with padding to maintain the sequence length.
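A gated 1D convolution can be sketched in a few lines of plain Python. This is only my reading of the idea for a single node and a single channel: one convolution produces values, a second produces gates, and the GLU multiplies the values by the sigmoid of the gates. The kernels here are hand-picked for illustration (a real model would learn them), and the symmetric zero padding is an assumption.

```python
from math import exp

def conv1d_same(series, kernel, bias=0.0):
    """1D cross-correlation with zero padding so the output keeps the input length."""
    k = len(kernel)
    pad_left = (k - 1) // 2
    pad_right = k - 1 - pad_left
    padded = [0.0] * pad_left + list(series) + [0.0] * pad_right
    return [
        sum(w * padded[t + i] for i, w in enumerate(kernel)) + bias
        for t in range(len(series))
    ]

def sigmoid(x):
    return 1.0 / (1.0 + exp(-x))

def gated_conv1d(series, value_kernel, gate_kernel):
    """GLU-style gating: conv(values) * sigmoid(conv(gates)), element by element."""
    values = conv1d_same(series, value_kernel)
    gates = conv1d_same(series, gate_kernel)
    return [v * sigmoid(g) for v, g in zip(values, gates)]

series = [1.0, 2.0, 3.0, 4.0, 5.0]
# Identity value kernel and an all-zero gate kernel: every gate is sigmoid(0) = 0.5.
out = gated_conv1d(series, value_kernel=[0.0, 1.0, 0.0], gate_kernel=[0.0, 0.0, 0.0])
print(out)
```

With these toy kernels each output is half the input value, and the output has the same length as the input because of the padding.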


Spatial dynamics view

According to the paper, ChebNet and Diffusion-Net are the two GCN models most commonly used in previous work to capture spatial dynamics on traffic networks. To simplify computation and keep the physical meaning of space, the authors use a spatial GCN model instead. GAT is a spatial GCN model that captures spatial dependencies by adaptively weighting the contribution of neighboring regions through an attention mechanism. The authors generalize the traditional 2-D GAT to a 3-D GAT, which has an additional dimension for the time step length M.
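To remind myself how the attention weighting works, here is a sketch of a standard single-head GAT layer in plain Python. It covers only the ordinary 2-D case (one feature vector per node); the paper's 3-D extension over M time steps is not reproduced here. The names `W` (shared linear map) and `a` (attention vector) follow the usual GAT notation, and the toy values are my own.

```python
from math import exp

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def gat_layer(features, neighbors, W, a):
    """Single-head graph attention.

    features: node -> feature vector; neighbors: node -> list of neighbor nodes
    (include the node itself for a self-loop); W: out x in matrix;
    a: attention vector of length 2 * out.
    """
    h = {n: matvec(W, x) for n, x in features.items()}
    out = {}
    for i, nbrs in neighbors.items():
        # Unnormalised score for each neighbor j: LeakyReLU(a . [h_i || h_j]).
        scores = [
            leaky_relu(sum(w * v for w, v in zip(a, h[i] + h[j]))) for j in nbrs
        ]
        # Softmax over the neighborhood (shifted by the max for stability).
        m = max(scores)
        exps = [exp(s - m) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted sum of the transformed neighbor features.
        out[i] = [
            sum(al * h[j][d] for al, j in zip(alphas, nbrs))
            for d in range(len(h[i]))
        ]
    return out

features = {0: [1.0], 1: [3.0], 2: [5.0]}
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
# With a zero attention vector all neighbors get equal weight (a plain mean).
out = gat_layer(features, neighbors, W=[[1.0]], a=[0.0, 0.0])
print(out)
```

With a non-zero `a`, neighbors whose features score higher receive a larger share of the weight, which is the adaptive part of the mechanism.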


Long-term temporal dynamics view

This view is a Transformer layer applied to each node individually. It is composed of a self-attention layer, a positional encoding mechanism, and a feed-forward output layer.

Experiments

Datasets

NYC Uber data, NYC taxi data

Baselines

ARIMA, Random Forest, ConvLSTM, ST-ResNet, DMVST-Net, DCRNN, GraphWaveNet, Multi-GCN, ST-MGCN

Results

Hexagon-Based Convolutional Neural Network for Supply-Demand Forecasting of Ride-Sourcing Services [link]

Introduction

The paper discusses the effectiveness of hexagons over squares in capturing spatial correlations and proposes hexagon-based convolutional neural networks (H-CNN) for spatio-temporal forecasting. It introduces three versions of H-CNN compatible with standard deep learning packages, mapping hexagons to squares/tensors to preserve topology information. These models, combined with local map-to-map prediction and hexagon-based ensemble mechanisms, outperform benchmark algorithms in predicting ride-sourcing service demand-supply gaps, leveraging real-world datasets. Key contributions include the design of H-CNNs and mapping functions to enhance predictive performance in hexagon-based systems.

Advantages of Hexagons

1) Hexagons offer a clear definition of nearest neighbors, simplifying connectivity characterisation in hierarchical network topology. Movements between zones can be represented more accurately because all neighboring zones are equidistant.

2) Hexagons have a smaller edge-to-area ratio compared to squares, reducing bias from edge effects and better capturing inflow/outflow characteristics between adjacent zones, crucial in transportation/urban computing.

3) Hexagons exhibit greater isotropy, providing more consistent grid distance to straight-line distance ratios than squares. This enhances the characterisation of complex spatial structures, beneficial for various applications including simulation and neural network modelling.
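The first two points are easy to check numerically. The sketch below is my own illustration, not code from the paper: it compares the perimeter of a square and a regular hexagon of equal area (I am reading "edge-to-area ratio" as perimeter over area), and it compares centre-to-centre distances to neighbors, using axial coordinates for the hexagonal grid and the 8-cell neighborhood for the square grid.

```python
from math import sqrt, hypot

def square_perimeter(area):
    return 4 * sqrt(area)

def hexagon_perimeter(area):
    # Regular hexagon: area = (3 * sqrt(3) / 2) * side**2
    side = sqrt(2 * area / (3 * sqrt(3)))
    return 6 * side

# Offsets to the 8 surrounding cells of a square grid with unit spacing.
SQUARE_NEIGHBORS = [
    (dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)
]
# Offsets to the 6 neighbors of a hexagonal grid in axial (q, r) coordinates.
HEX_NEIGHBORS_AXIAL = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]

def axial_to_xy(q, r):
    """Centre of a pointy-top hexagon when neighboring centres are 1 unit apart."""
    return (q + r / 2, r * sqrt(3) / 2)

square_dists = {round(hypot(dx, dy), 6) for dx, dy in SQUARE_NEIGHBORS}
hex_dists = {round(hypot(*axial_to_xy(q, r)), 6) for q, r in HEX_NEIGHBORS_AXIAL}

print(square_perimeter(1.0), hexagon_perimeter(1.0))
print(sorted(square_dists), sorted(hex_dists))
```

For unit area the hexagon has a shorter perimeter than the square, and all six hexagon neighbors sit at one distance, while the square's eight neighbors sit at two different distances (edge neighbors and diagonal neighbors).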

H-CNN framework


Experiments
Dataset
Provided by the Bigdata Research Lab of Didi

Baselines
Lasso, Random Forest, MLP, XGBoost

Results


Secondly, about trying to recreate baseline models

Two of the first baselines I kept seeing during almost a month of literature review are Historical Average (HA) and autoregressive models (AR, or ARIMA). Today I think I recreated them, which was not hard to be fair. But even though I might be using the same dataset as some papers, getting exactly the same results is not really possible: I might be missing some specific preprocessing step, and without the code a paper used, I have no way to confirm. So going forward I need to figure out how such results are reproduced and sanity-checked.
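For reference, this is roughly what the two baselines look like in plain Python. It is a simplified sketch and not the exact setup of any paper: the `period` argument (averaging only over the same time slot) and the AR(1) with intercept fitted by least squares are my own choices, and a real ARIMA baseline would normally come from a library rather than from this hand-rolled fit.

```python
def historical_average(history, period=None):
    """Predict the next OD matrix as the element-wise mean of past matrices.

    history: list of OD matrices (lists of rows), oldest first.
    period: if given, average only the steps that fall in the same slot as the
    step being predicted (for example the same hour of the day).
    """
    t_next = len(history)
    selected = history
    if period:
        same_slot = [m for t, m in enumerate(history) if t % period == t_next % period]
        selected = same_slot or history
    n = len(selected)
    rows, cols = len(history[0]), len(history[0][0])
    return [
        [sum(m[i][j] for m in selected) / n for j in range(cols)]
        for i in range(rows)
    ]

def ar1_forecast(series):
    """One-step AR(1) forecast y_t = c + phi * y_{t-1}, fitted by least squares."""
    x, y = series[:-1], series[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    var_x = sum((a - mx) ** 2 for a in x)
    if var_x == 0:
        return my  # constant lagged values: fall back to the mean
    phi = sum((a - mx) * (b - my) for a, b in zip(x, y)) / var_x
    c = my - phi * mx
    return c + phi * series[-1]

history = [
    [[1, 2], [3, 4]],
    [[3, 4], [5, 6]],
    [[5, 6], [7, 8]],
    [[7, 8], [9, 10]],
]
ha_all = historical_average(history)
ha_slot = historical_average(history, period=2)
# AR(1) applied to the series of a single OD pair (origin 0, destination 0).
ar_next = ar1_forecast([m[0][0] for m in history])
print(ha_all, ha_slot, ar_next)
```

Even a small choice like which time steps go into the average changes the HA prediction, which is the kind of preprocessing detail that makes published numbers hard to match exactly.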



That is all for today!

See you tomorrow :)
