Dynamic Traffic Signal Optimization using Reinforcement Learning

🚦 Project Overview

This project implements an intelligent traffic signal control system that uses Deep Reinforcement Learning (DRL) to optimize traffic flow at urban intersections. The system learns signal-timing policies through interaction with a simulated traffic environment, targeting significant improvements in traffic efficiency over traditional fixed-time control.

🎯 Key Features

  • Real-time Traffic Adaptation: Dynamic signal timing based on current traffic conditions
  • Multi-Modal Data Integration: Combines camera feeds, loop detectors, and V2I communication
  • Deep Q-Network (DQN) Learning: Advanced RL algorithm for optimal policy learning
  • SUMO Integration: High-fidelity traffic simulation environment
  • Performance Analytics: Comprehensive metrics and visualization tools
  • Scalable Architecture: Supports single intersection and network-level optimization

🛠️ Quick Setup Guides

📋 Choose Your Platform:

  • 🪟 Windows Users: follow the complete Windows setup guide in WINDOWS_SETUP.md - step-by-step installation for Windows 10/11
  • 🐧 Linux Users: Follow the instructions below
  • 🍎 macOS Users: Follow the instructions below with Homebrew modifications

⚠️ Windows users should follow the Windows Setup Guide for detailed platform-specific instructions including SUMO installation, environment setup, and troubleshooting.


🏗️ System Architecture

┌─────────────────────────────────────────────────────────────────────────┐
│                          DATA COLLECTION LAYER                         │
├─────────────────┬─────────────────┬─────────────────┬─────────────────┤
│  Traffic Camera │ Inductive Loop  │ Connected Vehicle│  Weather/Event  │
│   Video Feeds   │   Detectors     │   Data (V2I)    │      Data       │
│                 │                 │                 │                 │
│  ┌─────────────┐│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────┐ │
│  │   YOLOv5    ││ │   Vehicle   │ │ │   Speed &   │ │ │   Weather   │ │
│  │ Detection   ││ │   Count &   │ │ │ Position    │ │ │   & Events  │ │
│  │   & Deep    ││ │ Occupancy   │ │ │   Data      │ │ │    APIs     │ │
│  │   SORT      ││ │             │ │ │             │ │ │             │ │
│  └─────────────┘│ └─────────────┘ │ └─────────────┘ │ └─────────────┘ │
└─────────────────┴─────────────────┴─────────────────┴─────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                      DATA PREPROCESSING LAYER                          │
├─────────────────┬─────────────────┬─────────────────┬─────────────────┤
│ Computer Vision │ Noise Filtering │    Feature      │   Temporal      │
│   Processing    │   & Data       │   Extraction    │  Alignment &    │
│                 │    Fusion      │                 │  Interpolation  │
│                 │                │                 │                 │
│  ┌─────────────┐│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────┐ │
│  │   Object    ││ │   Kalman    │ │ │ Queue Length│ │ │   Missing   │ │
│  │  Tracking   ││ │  Filtering  │ │ │ Waiting Time│ │ │    Data     │ │
│  │  Vehicle    ││ │    Data     │ │ │ Flow Rates  │ │ │ Imputation  │ │
│  │ Recognition ││ │   Fusion    │ │ │ Phase Info  │ │ │             │ │
│  └─────────────┘│ └─────────────┘ │ └─────────────┘ │ └─────────────┘ │
└─────────────────┴─────────────────┴─────────────────┴─────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                    STATE REPRESENTATION LAYER                          │
├─────────────────────────────────────────────────────────────────────────┤
│                    Multi-Modal State Vector Generation                  │
│                                                                         │
│  ┌──────────────────────────────────────────────────────────────────┐  │
│  │                         State Vector                             │  │
│  │  [q₁, q₂, q₃, q₄, w₁, w₂, w₃, w₄, p, t, n₁, n₂, n₃, n₄]       │  │
│  │                                                                  │  │
│  │  q₁-q₄: Queue lengths (4 lanes)                                │  │
│  │  w₁-w₄: Waiting times (4 lanes)                                │  │
│  │  p: Current phase                                               │  │
│  │  t: Phase elapsed time                                          │  │
│  │  n₁-n₄: Neighbor intersection states                           │  │
│  └──────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                    REINFORCEMENT LEARNING LAYER                        │
├─────────────────────────────────────────────────────────────────────────┤
│                           Dueling Double DQN                           │
│                                                                         │
│  ┌─────────────────────────────┐  ┌─────────────────────────────────┐  │
│  │        Local RL Agent       │  │    Coordination RL Agent        │  │
│  │     (Intersection Level)    │  │      (Network Level)            │  │
│  │                            │  │                                 │  │
│  │  ┌───────────────────────┐  │  │  ┌───────────────────────────┐  │  │
│  │  │     Input Layer       │  │  │  │    Global State Space     │  │  │
│  │  │    (State: 10-50)     │  │  │  │   (Multi-intersection)    │  │  │
│  │  └───────────────────────┘  │  │  └───────────────────────────┘  │  │
│  │  ┌───────────────────────┐  │  │  ┌───────────────────────────┐  │  │
│  │  │   Hidden Layer 1      │  │  │  │   Coordination Policy     │  │  │
│  │  │    (256 neurons)      │  │  │  │   (Green Wave, Flow)      │  │  │
│  │  └───────────────────────┘  │  │  └───────────────────────────┘  │  │
│  │  ┌───────────────────────┐  │  │                                 │  │
│  │  │   Hidden Layer 2      │  │  │                                 │  │
│  │  │    (128 neurons)      │  │  │                                 │  │
│  │  └───────────────────────┘  │  │                                 │  │
│  │  ┌───────────────────────┐  │  │                                 │  │
│  │  │   Hidden Layer 3      │  │  │                                 │  │
│  │  │    (64 neurons)       │  │  │                                 │  │
│  │  └───────────────────────┘  │  │                                 │  │
│  │  ┌───────────────────────┐  │  │                                 │  │
│  │  │   Output Layer        │  │  │                                 │  │
│  │  │  (4-8 actions)        │  │  │                                 │  │
│  │  └───────────────────────┘  │  │                                 │  │
│  └─────────────────────────────┘  └─────────────────────────────────┘  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │                    Experience Replay Buffer                     │  │
│  │             (State, Action, Reward, Next_State, Done)           │  │
│  │                        Capacity: 100,000                       │  │
│  └─────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                        DECISION & CONTROL LAYER                        │
├─────────────────────────────────────────────────────────────────────────┤
│                           Action Selection                              │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │                    ε-Greedy Policy                              │  │
│  │                                                                 │  │
│  │  ┌─────────────┐      ┌─────────────┐      ┌─────────────┐     │  │
│  │  │   Explore   │      │   Exploit   │      │   Safety    │     │  │
│  │  │  (Random    │      │  (Optimal   │      │ Constraints │     │  │
│  │  │   Action)   │      │   Action)   │      │  (Min/Max   │     │  │
│  │  │             │      │             │      │   Timings)  │     │  │
│  │  └─────────────┘      └─────────────┘      └─────────────┘     │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                    │                                   │
│                                    ▼                                   │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │                   Action Execution                              │  │
│  │                                                                 │  │
│  │  Phase Selection: {North-South, East-West, Left Turns, ...}    │  │
│  │  Duration Control: Adaptive timing (10-60 seconds)             │  │
│  │  Safety Compliance: Yellow clearance, pedestrian crossing      │  │
│  └─────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                      SUMO SIMULATION ENVIRONMENT                       │
├─────────────────────────────────────────────────────────────────────────┤
│                        Traffic Microsimulation                         │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │                     Traffic Network                             │  │
│  │                                                                 │  │
│  │       N                                                         │  │
│  │       │                                                         │  │
│  │       │                                                         │  │
│  │  W────┼────E    4-Way Intersection                              │  │
│  │       │         - 4 Approaches                                  │  │
│  │       │         - 8 Lanes Total                                 │  │
│  │       S         - Traffic Lights                                │  │
│  │                 - Vehicle Generation                            │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │                  Reward Function                                │  │
│  │                                                                 │  │
│  │  R(t) = -[α₁∑qᵢ(t) + α₂∑wᵢ(t) + α₃∑sᵢ(t) + α₄∑eᵢ(t)]          │  │
│  │                                                                 │  │
│  │  Where:                                                         │  │
│  │  • qᵢ(t): Queue length at lane i                               │  │
│  │  • wᵢ(t): Waiting time at lane i                               │  │
│  │  • sᵢ(t): Number of stops at lane i                            │  │
│  │  • eᵢ(t): Emissions at lane i                                  │  │
│  │  • α₁, α₂, α₃, α₄: Weighting coefficients                      │  │
│  └─────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                      PERFORMANCE MONITORING LAYER                      │
├─────────────────────────────────────────────────────────────────────────┤
│                        Real-time Analytics                             │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │                    Efficiency Metrics                          │  │
│  │  • Average Vehicle Delay (seconds)                             │  │
│  │  • Queue Lengths (vehicles)                                    │  │
│  │  • Throughput (vehicles/hour)                                  │  │
│  │  • Travel Time Index                                           │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │                  Environmental Metrics                         │  │
│  │  • Fuel Consumption (liters)                                   │  │
│  │  • CO₂ Emissions (kg)                                          │  │
│  │  • Number of Stops per Vehicle                                 │  │
│  │  • Acceleration/Deceleration Patterns                          │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │                    Learning Metrics                            │  │
│  │  • Training Loss                                               │  │
│  │  • Reward Progression                                          │  │
│  │  • Epsilon Decay                                               │  │
│  │  • Q-Value Estimates                                           │  │
│  └─────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────────┐
│                        VISUALIZATION & REPORTING                       │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │                    Dashboard Interface                         │  │
│  │  • Real-time Traffic Visualization                             │  │
│  │  • Performance Metrics Plots                                   │  │
│  │  • Learning Progress Charts                                    │  │
│  │  • Comparative Analysis                                        │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                                                         │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │                    Export & Storage                            │  │
│  │  • Model Checkpoints                                           │  │
│  │  • Training Logs                                               │  │
│  │  • Performance Reports                                         │  │
│  │  • Configuration Files                                         │  │
│  └─────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘
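As a concrete illustration of the reward block in the simulation-environment layer above, the weighted penalty R(t) = -[α₁∑qᵢ + α₂∑wᵢ + α₃∑sᵢ + α₄∑eᵢ] can be sketched as follows. The weights and per-lane readings here are illustrative placeholders, not values from the project code.

```python
# Illustrative sketch of the reward function shown above; the alpha
# weights and the per-lane measurements are hypothetical placeholders.
def compute_reward(queues, waits, stops, emissions,
                   a1=0.25, a2=0.25, a3=0.25, a4=0.25):
    """Negative weighted sum of per-lane queue length, waiting time,
    stop count, and emissions -- less congestion means higher reward."""
    return -(a1 * sum(queues) + a2 * sum(waits)
             + a3 * sum(stops) + a4 * sum(emissions))

# Example: four approach lanes, one of them empty.
r = compute_reward(queues=[3, 1, 2, 0], waits=[10, 4, 6, 0],
                   stops=[2, 1, 1, 0], emissions=[1, 1, 1, 0])
# → -8.25 (heavier congestion drives the reward more negative)
```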

🔧 Technical Architecture

Core Components

1. Reinforcement Learning Framework

  • Algorithm: Dueling Double Deep Q-Network (D3QN)
  • State Space: Multi-modal traffic state representation (10-50 dimensions)
  • Action Space: Discrete signal phase selection (4-8 actions)
  • Reward Function: Weighted combination of delay, queue length, emissions, and throughput

2. Data Integration Pipeline

  • Computer Vision: YOLOv5 + Deep SORT for vehicle detection and tracking
  • Sensor Fusion: Kalman filtering for multi-sensor data integration
  • Real-time Processing: Temporal alignment and feature extraction
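The Kalman-filtering step of the fusion pipeline can be illustrated with a scalar filter that smooths noisy detector readings. This is a simplified stand-alone sketch, not the project's implementation, and the noise variances are hypothetical.

```python
# Minimal scalar Kalman filter illustrating the sensor-fusion idea:
# blend each noisy detector measurement into a running state estimate.
# Noise variances q and r are hypothetical, not the project's tuned values.
class ScalarKalman:
    def __init__(self, x0=0.0, p0=1.0, q=0.01, r=0.5):
        self.x, self.p = x0, p0      # state estimate and its variance
        self.q, self.r = q, r        # process and measurement noise

    def update(self, z):
        self.p += self.q                  # predict: uncertainty grows
        k = self.p / (self.p + self.r)    # Kalman gain
        self.x += k * (z - self.x)        # correct toward measurement
        self.p *= (1.0 - k)               # uncertainty shrinks
        return self.x

# Noisy queue-length readings around a true value of ~5 vehicles.
kf = ScalarKalman()
estimates = [kf.update(z) for z in [5.2, 4.8, 5.1, 5.0, 4.9, 5.1, 5.0, 4.95]]
# the estimate converges toward the true queue length
```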

3. Traffic Simulation

  • Environment: SUMO (Simulation of Urban Mobility)
  • Network: Configurable intersection geometries
  • Traffic Demand: Realistic traffic patterns and volumes

4. Neural Network Architecture

```
Input Layer (State Vector) → Hidden Layer 1 (256) → Hidden Layer 2 (128) →
Hidden Layer 3 (64) → Output Layer (Actions)
```
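The dueling head that gives the D3QN its name splits the final layers into a state-value stream V(s) and an advantage stream A(s, a), recombined as Q(s, a) = V(s) + A(s, a) − meanₐ A(s, a). A minimal sketch of just that aggregation step (the stream outputs below are made-up numbers, not real network outputs):

```python
# Dueling aggregation: combine a scalar state value V(s) with
# per-action advantages A(s, a), subtracting the mean advantage so
# the V/A decomposition is identifiable.
def dueling_q_values(value, advantages):
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]

# Hypothetical stream outputs for a 4-action phase space.
q = dueling_q_values(2.0, [1.0, -1.0, 0.0, 0.0])
# mean advantage is 0.0, so Q = [3.0, 1.0, 2.0, 2.0]
```

Subtracting the mean advantage is the standard identifiability trick: without it, any constant could be shifted between V and A without changing Q.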

Data Flow

  1. Data Collection: Multi-modal sensors capture traffic state
  2. Preprocessing: Noise filtering, fusion, and feature extraction
  3. State Representation: Normalized state vector generation
  4. RL Processing: DQN agent selects optimal actions
  5. Action Execution: Signal control commands sent to intersection
  6. Environment Feedback: SUMO simulation provides next state and reward
  7. Learning Update: Experience replay and network parameter updates
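Steps 6-7 hinge on the experience replay buffer shown in the architecture diagram. A minimal stdlib sketch (the class interface is illustrative; only the 100,000 capacity comes from the diagram):

```python
import random
from collections import deque

# Minimal experience-replay buffer for the learning update in step 7.
# Stores (state, action, reward, next_state, done) tuples and samples
# uncorrelated minibatches for training.
class ReplayBuffer:
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)  # oldest transitions evicted

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

# Tiny capacity just to demonstrate eviction of the oldest transition.
buf = ReplayBuffer(capacity=3)
for i in range(4):
    buf.push(i, 0, -1.0, i + 1, False)
batch = buf.sample(2)
```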

📊 Performance Metrics

Efficiency Metrics

  • Average Vehicle Delay: Target 20-40% reduction vs fixed-time
  • Queue Lengths: Peak and average queue measurements
  • Throughput: Vehicles served per hour
  • Travel Time Index: Congested vs free-flow travel time ratio
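The Travel Time Index above is the ratio of observed (congested) travel time to free-flow travel time over the same route. A one-line sketch (function name and sample times are illustrative):

```python
# Travel Time Index: observed travel time divided by free-flow travel
# time; 1.0 means uncongested, 1.5 means trips take 50% longer.
def travel_time_index(observed_s, free_flow_s):
    if free_flow_s <= 0:
        raise ValueError("free-flow travel time must be positive")
    return observed_s / free_flow_s

tti = travel_time_index(observed_s=180, free_flow_s=120)  # → 1.5
```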

Environmental Metrics

  • Fuel Consumption: Estimated based on acceleration patterns
  • CO₂ Emissions: Environmental impact assessment
  • Stop Frequency: Number of stops per vehicle journey

Learning Metrics

  • Convergence Rate: Training episodes to optimal policy
  • Reward Progression: Cumulative reward improvement
  • Exploration vs Exploitation: ε-greedy policy evolution
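The ε-greedy evolution tracked above is typically driven by a decay schedule. A sketch with hypothetical rates (the project's actual schedule may differ):

```python
# Exponential epsilon-decay schedule for the eps-greedy policy:
# exploration probability shrinks from eps_start toward a floor eps_end.
# The decay rate 0.995 is a hypothetical example value.
def epsilon(episode, eps_start=1.0, eps_end=0.05, decay=0.995):
    return max(eps_end, eps_start * decay ** episode)

# Early episodes explore almost always; late ones mostly exploit.
print(round(epsilon(0), 3), round(epsilon(1000), 3))  # prints: 1.0 0.05
```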

🚀 Quick Start

Prerequisites

```shell
pip install torch torchvision numpy pandas matplotlib
pip install eclipse-sumo traci sumolib
pip install opencv-python gym
```

Note: SUMO is distributed on PyPI as `eclipse-sumo` (with `traci` and `sumolib` as companion packages); alternatively, install SUMO through your platform's package manager.

Basic Usage

```shell
# Train the model
python main.py --mode train --episodes 1000

# Test a trained model
python main.py --mode test --model_path models/final_model.pth

# Evaluate performance
python main.py --mode evaluate
```

📈 Expected Results

  • 20-40% reduction in average vehicle delay
  • Improved throughput during peak hours
  • Adaptive behavior to changing traffic patterns
  • Environmental benefits through reduced emissions

🔮 Future Enhancements

  • Multi-intersection coordination
  • Integration with real-world traffic data
  • Pedestrian and cyclist considerations
  • Transfer learning for new intersections
  • Safety-critical constraint handling

📚 References

The approach builds on recent research in reinforcement-learning-based traffic signal control, in particular deep Q-learning variants (such as the Dueling Double DQN used here) evaluated in SUMO-simulated urban intersections.