Abstract

Lane detection is an essential function for autonomous vehicles. While GPS and HD maps offer accurate localization under optimal conditions, vision-based detection provides an important backup, especially in areas with weak GNSS signals or outdated maps.

This system implements a deep-learning lane detection pipeline using a convolutional autoencoder, designed to identify and segment multiple lane boundaries in various road conditions using only front-facing camera images.

The approach integrates image preprocessing, deep-learning inference, optional temporal tracking, and curve fitting to generate stable and clean lane boundaries. It has been tested in real-world driving and in simulators such as Carla and Scaner.

Video 1 - Demo of lane detection running online.

Example detections:

Fig. 1 - Lane detection output

Fig. 2 - Lane detection + segmentation (demo 1)

Fig. 3 - Lane detection + segmentation (demo 2)

Deep Learning Architecture

Global Pipeline

The system is organized into several modules:

  • Autoencoder: Produces binary lane masks from camera input.
  • Tracker (optional): A ConvLSTM-based temporal model for improved frame-to-frame stability.
  • Lane Regression: Fits polynomial curves to lane masks.
  • Post-processing: Removes outliers and can generate drivable region segmentation.
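The module chain above can be sketched as a short orchestration function. This is a hedged sketch only: `run_pipeline`, `fit_boundary`, and the dummy autoencoder are illustrative names, and the stand-in regression is plain least squares rather than the RANSAC step used by the real post-processing.

```python
# Sketch of how the modules chain together; stage internals are
# placeholders, not the project's actual implementations.
import numpy as np

def run_pipeline(frame, history, autoencoder, tracker=None):
    """frame -> 4 binary masks -> (optional) temporal smoothing
    -> polynomial coefficients per lane boundary."""
    masks = autoencoder(frame)                 # (4, H, W) binary masks
    if tracker is not None:
        masks = tracker(history + [masks])     # smooth over past frames
    return [fit_boundary(m) for m in masks]    # one curve per boundary

def fit_boundary(mask, degree=2):
    # Stand-in for the RANSAC regression step: plain least squares x = f(y).
    ys, xs = np.nonzero(mask)
    return np.polyfit(ys, xs, degree) if len(ys) > degree else None

# Toy usage with a dummy "autoencoder" that returns fixed vertical lines.
def dummy_autoencoder(frame):
    masks = np.zeros((4, 64, 128), dtype=np.uint8)
    for i, col in enumerate((20, 50, 80, 110)):
        masks[i, :, col] = 1
    return masks

lanes = run_pipeline(np.zeros((64, 128, 3)), history=[], autoencoder=dummy_autoencoder)
```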

Fig. 4 - Global lane detection pipeline

Fig. 5 - Input/output shapes for the model

Autoencoder Design

The convolutional autoencoder performs lane segmentation by encoding spatial information and decoding it into four binary masks: left border, left middle, right middle, and right border.

Each block contains convolution layers, batch normalization, and ReLU activation. The decoder upsamples using transpose convolutions to restore the original resolution.
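A minimal PyTorch sketch of such an architecture is shown below. The layer counts, channel widths, and input resolution are assumptions for illustration; only the overall structure (Conv + BatchNorm + ReLU blocks, transpose-convolution upsampling, four output masks) follows the description above.

```python
# Hedged sketch of the lane-segmentation autoencoder; hyperparameters
# are illustrative, not the project's actual configuration.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    # Conv -> BatchNorm -> ReLU, downsampling by 2 at each stage.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

def up_block(c_in, c_out):
    # Transpose convolution to upsample back toward input resolution.
    return nn.Sequential(
        nn.ConvTranspose2d(c_in, c_out, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

class LaneAutoencoder(nn.Module):
    def __init__(self, n_masks=4):
        super().__init__()
        self.encoder = nn.Sequential(
            conv_block(3, 32), conv_block(32, 64), conv_block(64, 128))
        self.decoder = nn.Sequential(up_block(128, 64), up_block(64, 32))
        # One output channel per boundary: left border, left middle,
        # right middle, right border.
        self.head = nn.Sequential(
            nn.ConvTranspose2d(32, n_masks, kernel_size=4, stride=2, padding=1),
            nn.Sigmoid(),  # per-pixel probability for each binary mask
        )

    def forward(self, x):
        return self.head(self.decoder(self.encoder(x)))

model = LaneAutoencoder()
masks = model(torch.randn(1, 3, 128, 256))
print(masks.shape)  # torch.Size([1, 4, 128, 256])
```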

Fig. 6 - Autoencoder for lane detection: convolutional and upsampling layers producing four segmentation masks.

Tracking Module (Optional)

The ConvLSTM tracker improves predictions by considering the previous N frames, reducing noise and bridging gaps. Training includes noise-augmented data for robustness.
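The idea can be illustrated with a single ConvLSTM cell that consumes a short window of past masks. This is a sketch under assumptions: the hidden-channel count, kernel size, and the `MaskTracker` wrapper are hypothetical, not the project's actual tracker.

```python
# Illustrative ConvLSTM-based temporal smoother for the 4-channel masks.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """LSTM cell whose gates are computed with convolutions, so the
    hidden state keeps its spatial layout."""
    def __init__(self, c_in, c_hidden, k=3):
        super().__init__()
        self.c_hidden = c_hidden
        self.gates = nn.Conv2d(c_in + c_hidden, 4 * c_hidden, k, padding=k // 2)

    def forward(self, x, h, c):
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        c = f * c + i * torch.tanh(g)
        h = o * torch.tanh(c)
        return h, c

class MaskTracker(nn.Module):
    """Runs over the last N mask frames and predicts a refined mask
    for the current frame."""
    def __init__(self, n_masks=4, c_hidden=16):
        super().__init__()
        self.cell = ConvLSTMCell(n_masks, c_hidden)
        self.out = nn.Conv2d(c_hidden, n_masks, kernel_size=1)

    def forward(self, seq):  # seq: (batch, time, n_masks, H, W)
        b, t, _, h_px, w_px = seq.shape
        h = seq.new_zeros(b, self.cell.c_hidden, h_px, w_px)
        c = torch.zeros_like(h)
        for step in range(t):
            h, c = self.cell(seq[:, step], h, c)
        return torch.sigmoid(self.out(h))

tracker = MaskTracker()
refined = tracker(torch.rand(1, 5, 4, 64, 128))  # window of 5 past frames
```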

Fig. 7 - Tracking module explanation

Fig. 8 - ConvLSTM tracking architecture

Fig. 9 - Tracker mask prediction from previous frames

Fig. 10 - Tracker raw prediction animation

Fig. 11 - Tracker improving lane mask prediction

Post-processing and Lane Regression

After mask prediction, RANSAC polynomial regression fits curves to each lane boundary, rejecting outliers for consistent results.
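A minimal RANSAC polynomial fit looks like the following. This is a self-contained NumPy sketch, not the project's code; the iteration count, inlier threshold, and degree are illustrative defaults. Lanes are fitted as x = f(y) because boundaries are near-vertical in image coordinates.

```python
import numpy as np

def ransac_polyfit(xs, ys, degree=2, n_iter=100, threshold=5.0, seed=0):
    """RANSAC loop: repeatedly fit a polynomial x = f(y) to a random
    minimal sample, keep the model with the largest inlier set
    (residual below `threshold` pixels), then refit on the inliers."""
    rng = np.random.default_rng(seed)
    best_inliers = None
    for _ in range(n_iter):
        sample = rng.choice(len(ys), size=degree + 1, replace=False)
        coeffs = np.polyfit(ys[sample], xs[sample], degree)
        inliers = np.abs(np.polyval(coeffs, ys) - xs) < threshold
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final least-squares fit on the inlier set only.
    return np.polyfit(ys[best_inliers], xs[best_inliers], degree)

# Synthetic boundary x = 0.001*y^2 + 0.5*y + 100, plus outlier pixels.
y = np.arange(0, 200, 2.0)
x = 0.001 * y**2 + 0.5 * y + 100
x[::17] += 80  # simulated noise pixels in the mask
coeffs = ransac_polyfit(x, y)  # outliers rejected, true curve recovered
```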

Fig. 12 - Lane boundary regression pipeline
Fig. 13 - Post-processing steps: raw detection, curve fitting, and optional drivable region segmentation.

Visual Results

The model remains robust under occlusions, worn lane markings, curves, and lighting variations.

Fig. 14.1 - Case: Simple

Fig. 14.2 - Case: Curve

Fig. 14.3 - Case: Complex Markings

Fig. 14.4 - Case: Missing Markings

Real-world example:

Fig. 15 - Real-world lane detection

Simulator examples:

Carla

Fig. 16 - Simulator: Carla

Scaner

Fig. 17 - Simulator: Scaner

Integration with Visual Servoing

After lane detection, a local path reference is computed from the fitted lane curves.
A visual servoing controller then adjusts the linear velocity $v$ and angular velocity $w$ to minimize the lateral and heading deviation from the detected lane centerline.
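As a rough illustration, a proportional law of this kind can be written as follows. Everything here is an assumption: the gains, the speed-reduction rule, and the function name are hypothetical; the actual controller is described in the Visual Control Project.

```python
# Hypothetical proportional visual-servoing law (illustrative only).
def servo_command(lateral_error, heading_error, v_max=2.0, k_lat=0.8, k_head=1.5):
    """Return (v, w): steer toward the centerline and slow down
    when a sharp correction is commanded."""
    w = -(k_lat * lateral_error + k_head * heading_error)
    v = v_max / (1.0 + abs(w))  # reduce linear speed when turning hard
    return v, w

v, w = servo_command(lateral_error=0.3, heading_error=0.05)
```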

For more details, refer to the Visual Control Project.

Dataset

The model is trained on the CULane dataset, converted into binary masks using a custom preprocessing script.
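The conversion step can be sketched as rasterizing each annotated lane polyline (CULane annotations list (x, y) image points per lane) into a binary mask. This is a hedged pure-NumPy sketch, not the project's script; `lane_to_mask` and its `thickness` parameter are illustrative.

```python
import numpy as np

def lane_to_mask(points, height, width, thickness=2):
    """Rasterize one lane polyline (list of (x, y) image points) into a
    binary mask: interpolate densely between consecutive points and
    stamp a small square at each interpolated location."""
    mask = np.zeros((height, width), dtype=np.uint8)
    for (x0, y0), (x1, y1) in zip(points[:-1], points[1:]):
        n = int(max(abs(x1 - x0), abs(y1 - y0))) + 1
        for x, y in zip(np.linspace(x0, x1, n), np.linspace(y0, y1, n)):
            r, c = int(round(y)), int(round(x))
            mask[max(r - thickness, 0):r + thickness,
                 max(c - thickness, 0):c + thickness] = 1
    return mask

# Example: a short near-vertical lane segment.
mask = lane_to_mask([(100, 0), (120, 50), (150, 100)], height=120, width=200)
```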