Autonomous Driving

Autonomous Driving covers SAE levels of autonomy, surveys, datasets, camera-based and LiDAR-based methods, end-to-end learning, car simulators, road tests, and Self-Driving 2.0.


SAE Levels of Autonomy

The Society of Automotive Engineers (SAE) defines 6 levels of driving automation, ranging from Level 0 (fully manual) to Level 5 (fully autonomous).

https://aboutselfdriving.com/

  • Level 0: No Automation
    The cars most of us drive today. The driver performs all driving tasks with zero autonomy.

  • Level 1: Driver Assistance
    The driver is still driving, but the car can assist with a single task, such as automatic braking or auto-steering.

  • Level 2: Partial Automation
    The car has at least two automated functions and can drive on its own in specific situations, such as parking.

  • Level 3: Conditional Automation
    The car can handle the “dynamic driving task” but still needs immediate driver intervention when it requests it. It is not yet equipped with a fail-safe.

  • Level 4: High Automation
    The level many companies are aiming for today. The car is officially driverless under certain conditions.

  • Level 5: Full Automation
    The vehicle can perform all driving functions under all conditions. Some designs may even remove the steering wheel.


Automakers / Software Companies

There are three main segments of self-driving in the industry right now: consumer cars, robo-taxis, and driverless trucking.

  • Robo-Taxi
    A robo-taxi is essentially an Uber without a driver, which makes it an L4- or L5-level self-driving car and the ultimate smart-city goal. It would optimize the rideshare network: people would no longer need to own a car or worry about traffic jams on the way to work.

Companies working on this include, but are not limited to, Waymo, AutoX, Cruise, DiDi, Pony.ai, Zoox, Aurora, Motional, Optimus Ride (acquired by Magna), Tesla, Baidu, etc. Notably, Waymo was the first to open a public driverless taxi program, in Phoenix in 2020. AutoX opened its fully driverless robo-taxi service to the public in China around January 2021, and Cruise opened its driverless cars in San Francisco to the public in February 2022. On a side note, the Ford- and VW-backed Argo AI was shut down on October 27, 2022.

  • Consumer Cars
    The leading company in this realm is without doubt Tesla, which has sold millions of its self-manufactured vehicles all over the world. Tesla also offers its subscription-based Full Self-Driving (FSD) software to help drivers navigate urban areas; don't be misled by the name, though, as the system is still Level 2. Xpeng, a main competitor of Tesla, is also doing quite well in China. Other competitors include, but are not limited to, Mobileye, Nvidia, Apple, GM, Ford, Volvo, Qualcomm, etc.

  • Driverless Trucking
    The leading companies include but are not limited to Aurora, TuSimple, Embark Trucks, and Daimler Truck.


Autonomous Vehicles | Self-Driving Vehicles Enacted Legislation

NCSL maintains an autonomous vehicles legislative database, providing up-to-date information about state autonomous vehicle legislation introduced in the 50 states and the District of Columbia.


Surveys

A Survey of Autonomous Driving: Common Practices and Emerging Technologies


Explanations in Autonomous Driving: A Survey


Autonomous Driving with Deep Learning: A Survey of State-of-Art Technologies


Open-Source Autonomous Driving Datasets

  1. A2D2 Dataset for Autonomous Driving
    Released by Audi to support startups and academic researchers working on autonomous driving, the Audi Autonomous Driving Dataset (A2D2) includes over 41,000 frames labeled with 38 features. Around 2.3 TB in total, A2D2 is split by annotation type (i.e. semantic segmentation, 3D bounding box). In addition to labeled data, A2D2 provides unlabeled sensor data (~390,000 frames) for sequences with several loops.

  2. ApolloScape Open Dataset for Autonomous Driving
    Part of the Apollo project for autonomous driving, ApolloScape is an evolving research project that aims to foster innovation across all aspects of autonomous driving, from perception to navigation and control. Via their website, users can explore a variety of simulation tools along with over 100K street view frames, 80K LiDAR point clouds, and 1,000 km of trajectories for urban traffic.

  3. Argoverse Dataset
    Argoverse is made up of two datasets designed to support autonomous vehicle machine learning tasks such as 3D tracking and motion forecasting. Collected by a fleet of autonomous vehicles in Pittsburgh and Miami, the dataset includes 3D tracking annotations for 113 scenes and over 324,000 unique vehicle trajectories for motion forecasting. Argoverse is also the only modern AV dataset that provides forward-facing stereo imagery.

  4. Berkeley DeepDrive Dataset
    Also known as BDD 100K, the DeepDrive dataset gives users access to 100,000 annotated videos and 10 tasks to evaluate image recognition algorithms for autonomous driving. The dataset represents more than 1000 hours of driving experience with more than 100 million frames, as well as information on geographic, environmental, and weather diversity.

  5. CityScapes Dataset
    CityScapes is a large-scale dataset focused on the semantic understanding of urban street scenes in 50 German cities. It features semantic, instance-wise, and dense pixel annotations for 30 classes grouped into 8 categories. The entire dataset includes 5,000 annotated images with fine annotations, and an additional 20,000 annotated images with coarse annotations.

  6. Comma2k19 Dataset
    This dataset includes 33 hours of commute time recorded on Highway 280 in California. Each 1-minute scene was captured on a 20 km section of highway between San Jose and San Francisco. The data was collected using comma EONs, which feature a road-facing camera, phone GPS, thermometers, and a 9-axis IMU.

  7. Google-Landmarks Dataset
    Published by Google in 2018, the Landmarks dataset is divided into two sets of images to evaluate recognition and retrieval of human-made and natural landmarks. The original dataset contains over 2 million images depicting 30 thousand unique landmarks from across the world. In 2019, Google published Landmarks-v2, an even larger dataset with 5 million images and 200k landmarks.

  8. KITTI Vision Benchmark Suite
    First released in 2012 by Geiger et al, the KITTI dataset was released with the intent of advancing autonomous driving research with a novel set of real-world computer vision benchmarks. One of the first ever autonomous driving datasets, KITTI boasts over 4000 academic citations and counting.
    KITTI provides 2D, 3D, and bird’s eye view object detection datasets, 2D object and multi-object tracking and segmentation datasets, road/lane evaluation detection datasets, both pixel and instance-level semantic datasets, as well as raw datasets.

  9. LeddarTech PixSet Dataset
    Launched in 2021, Leddar PixSet is a new, publicly available dataset for autonomous driving research and development that contains data from a full AV sensor suite (cameras, LiDARs, radar, IMU), and includes full-waveform data from the Leddar Pixell, a 3D solid-state flash LiDAR sensor. The dataset contains 29k frames in 97 sequences, with more than 1.3M 3D boxes annotated.

  10. Level 5 Open Data
    Published by popular rideshare app Lyft, the Level 5 dataset is another great source for autonomous driving data. It includes over 55,000 human-labeled 3D annotated frames, a surface map, and an underlying HD spatial semantic map captured by 7 cameras and up to 3 LiDAR sensors that can be used to contextualize the data.

  11. nuScenes Dataset
    Developed by Motional, the nuScenes dataset is one of the largest open-source datasets for autonomous driving. Recorded in Boston and Singapore using a full sensor suite (a 32-beam LiDAR, 6 cameras with 360° coverage, and radars), the dataset contains over 1.44 million camera images capturing a diverse range of traffic situations, driving maneuvers, and unexpected behaviors (see the devkit sketch after this list).

  12. Oxford Radar RobotCar Dataset
    The Oxford RobotCar Dataset contains over 100 recordings of a consistent route through Oxford, UK, captured over a period of over a year. The dataset captures many different environmental conditions, including weather, traffic and pedestrians, along with longer term changes such as construction and roadworks.

  13. PandaSet
    PandaSet was the first open-source AV dataset available for both academic and commercial use. It contains 48,000 camera images, 16,000 LiDAR sweeps, 28 annotation classes, and 37 semantic segmentation labels taken from a full sensor suite.

  14. Udacity Self Driving Car Dataset
    Online education platform Udacity has open sourced access to a variety of projects for autonomous driving, including neural networks trained to predict steering angles of the car, camera mounts, and dozens of hours of real driving data.

  15. Waymo Open Dataset
    The Waymo Open dataset is an open-source multimodal sensor dataset for autonomous driving. Extracted from Waymo self-driving vehicles, the data covers a wide variety of driving scenarios and environments. It contains 1,000 driving segments, where each segment captures 20 seconds of continuous driving, corresponding to 200,000 frames at 10 Hz per sensor.
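
A minimal sketch of browsing one of these datasets programmatically, using the official nuScenes devkit (pip install nuscenes-devkit); the version string and dataroot path below are placeholders.

from nuscenes.nuscenes import NuScenes

# load the mini split; point dataroot at the extracted dataset
nusc = NuScenes(version='v1.0-mini', dataroot='/data/sets/nuscenes', verbose=True)
nusc.list_scenes()                                   # print the scenes in this split

scene = nusc.scene[0]                                # first scene record
sample = nusc.get('sample', scene['first_sample_token'])
cam_front = nusc.get('sample_data', sample['data']['CAM_FRONT'])
print(cam_front['filename'])                         # front-camera image for this sample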

KITTI


LiDAR-based 3D Object Detection Methods

YOLO3D

Paper: YOLO3D: End-to-end real-time 3D Oriented Object Bounding Box Detection from LiDAR Point Cloud
Code: maudzung/YOLO3D-YOLOv4-PyTorch

  • Inputs: Bird's-eye-view (BEV) maps encoded from the height, intensity, and density of the 3D LiDAR point cloud (a minimal encoding sketch is shown below this list).
  • Input size: 608 x 608 x 3
  • Outputs: 7 degrees of freedom (7-DOF) per object: (cx, cy, cz, l, w, h, θ)
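
A minimal sketch of encoding a LiDAR point cloud into a 3-channel BEV map (height, intensity, density) as described above; the grid ranges and the 608 x 608 resolution here are illustrative assumptions, not necessarily the paper's exact values.

import numpy as np

def pointcloud_to_bev(points, x_range=(0, 50), y_range=(-25, 25), size=608):
    """points: (N, 4) array of x, y, z, intensity."""
    x, y, z, intensity = points[:, 0], points[:, 1], points[:, 2], points[:, 3]
    # keep only points inside the BEV area
    mask = (x >= x_range[0]) & (x < x_range[1]) & (y >= y_range[0]) & (y < y_range[1])
    x, y, z, intensity = x[mask], y[mask], z[mask], intensity[mask]
    # discretize metric coordinates to grid cells
    xi = ((x - x_range[0]) / (x_range[1] - x_range[0]) * (size - 1)).astype(np.int32)
    yi = ((y - y_range[0]) / (y_range[1] - y_range[0]) * (size - 1)).astype(np.int32)

    bev = np.zeros((size, size, 3), dtype=np.float32)
    for u, v, h, i in zip(xi, yi, z, intensity):
        bev[u, v, 0] = max(bev[u, v, 0], h)      # height channel: max z per cell
        bev[u, v, 1] = max(bev[u, v, 1], i)      # intensity channel: max intensity per cell
        bev[u, v, 2] += 1.0                      # density channel: point count per cell
    bev[:, :, 2] = np.minimum(1.0, np.log(bev[:, :, 2] + 1) / np.log(64.0))  # normalize density
    return bev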

YOLOv4 architecture


YOLO4D

Paper: YOLO4D: A Spatio-temporal Approach for Real-time Multi-object Detection and Classification from LiDAR Point Clouds


Camera-based 3D Object Detection Methods

Pseudo-LiDAR

Paper: Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving
Code: mileyan/pseudo_lidar


Stereo R-CNN based 3D Object Detection

Paper: Stereo R-CNN based 3D Object Detection for Autonomous Driving
Code: HKUST-Aerial-Robotics/Stereo-RCNN


Monocular Camera-based Depth Estimation Methods

DF-Net

Paper: DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Task Consistency
Code: vt-vl-lab/DF-Net

  • The model in the paper uses 2 frames as input, while this code uses 5 frames (any odd number of frames can be used, though the hyper-parameters would need to be tuned)
  • The FlowNet in the paper is pre-trained on SYNTHIA, while this one is pre-trained on Cityscapes


Depth Fusion Methods with LiDAR and Camera

DFineNet

Paper: DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance
Code: vt-vl-lab/DF-Net


3D Object Detection Methods with LiDAR and Camera

PointFusion

Paper: PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation
Code: mialbro/PointFusion


MVX-Net

Paper: MVX-Net: Multimodal VoxelNet for 3D Object Detection
Code: mmdetection3d/mvxnet


DeepVO

Paper: DeepVO: Towards End-to-End Visual Odometry with Deep Recurrent Convolutional Neural Networks
Code: ChiWeiHsiao/DeepVO-pytorch


Pedestrian Behavior Prediction Methods

Predicting Future Person Activities and Locations in Videos

Paper: Peeking into the Future: Predicting Future Person Activities and Locations in Videos

Spectral Trajectory and Behavior Prediction

Papers: Forecasting Trajectory and Behavior of Road-Agents Using Spectral Clustering in Graph-LSTMs
Code: Xiejc97/Spectral-Trajectory-and-Behavior-Prediction

Vehicle Behavior Prediction Methods

Autonomous Vehicle Papers


End-to-End Deep Learning for Self-Driving Cars

Blog: End-to-End Deep Learning for Self-Driving Cars


End-to-End Learning of Driving Models

Paper: End-to-End Learning of Driving Models with Surround-View Cameras and Route Planners


End-to-end Learning in Simulated Urban Environments

Paper: Autonomous Vehicle Control: End-to-end Learning in Simulated Urban Environments


NeuroTrajectory

Paper: NeuroTrajectory: A Neuroevolutionary Approach to Local State Trajectory Learning for Autonomous Vehicles


ChauffeurNet (Waymo)

Paper: ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst
Code: aidriver/ChauffeurNet

With Stop Signs Rendered

No Stop Signs Rendered

With Perception Boxes Rendered

No Perception Boxes Rendered

Github: Iftimie/ChauffeurNet


Car Simulators

CARLA

Paper: CARLA: An Open Urban Driving Simulator

CARLA Simulator

  • Traffic manager. A built-in system that takes control of the vehicles besides the one used for learning. It acts as a conductor provided by CARLA to recreate urban-like environments with realistic behaviours.
  • Sensors. Vehicles rely on them to gather information about their surroundings. In CARLA they are a specific kind of actor attached to a vehicle, and the data they produce can be retrieved and stored to ease the process. The project currently supports many sensor types, from cameras to radars, LiDAR and more (see the Python API sketch after this list).
  • Recorder. This feature is used to reenact a simulation step by step for every actor in the world. It grants access to any moment in the timeline anywhere in the world, making for a great tracing tool.
  • ROS bridge and Autoware implementation. For the sake of universalization, the CARLA project works on integrating the simulator with other learning environments.
  • Open assets. CARLA provides different maps for urban settings with control over weather conditions, and a blueprint library with a wide set of actors. These elements can be customized, and new ones can be generated following simple guidelines.
  • Scenario runner. To ease the learning process for vehicles, CARLA provides a series of routes describing different traffic situations to iterate on.
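
A minimal sketch of the CARLA Python API touching the features above: connect to a running simulator, spawn a vehicle under autopilot (Traffic Manager) control, and attach an RGB camera sensor. It assumes a CARLA server is already listening on localhost:2000.

import carla

client = carla.Client('localhost', 2000)
client.set_timeout(10.0)
world = client.get_world()

# pick a vehicle blueprint and a free spawn point
blueprint_library = world.get_blueprint_library()
vehicle_bp = blueprint_library.filter('vehicle.*')[0]
spawn_point = world.get_map().get_spawn_points()[0]
vehicle = world.spawn_actor(vehicle_bp, spawn_point)
vehicle.set_autopilot(True)              # hand the vehicle over to the Traffic Manager

# attach a forward-facing RGB camera and save its frames to disk
camera_bp = blueprint_library.find('sensor.camera.rgb')
camera_transform = carla.Transform(carla.Location(x=1.5, z=2.4))
camera = world.spawn_actor(camera_bp, camera_transform, attach_to=vehicle)
camera.listen(lambda image: image.save_to_disk('_out/%06d.png' % image.frame))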

AirSim

Github
Document

Cars in AirSim

Drones in AirSim


Install AirSim

Releases (.zip)

For Windows, the following environments are available:

  1. AbandonedPark
  2. Africa (uneven terrain and animated animals)
  3. AirSimNH (small urban neighborhood block)
  4. Blocks
  5. Building_99
  6. CityEnviron
  7. Coastline
  8. LandscapeMountains
  9. MSBuild2018 (soccer field)
  10. TrapCamera
  11. ZhangJiajie

For Linux, the following environments are available:

  1. AbandonedPark
  2. Africa (uneven terrain and animated animals)
  3. AirSimNH (small urban neighborhood block)
  4. Blocks
  5. Building_99
  6. LandscapeMountains
  7. MSBuild2018 (soccer field)
  8. TrapCamera
  9. ZhangJiajie
  • Download & unzip an environment (e.g. AirSimNH.zip)
  • Find the exe file and click to run

    Windows : AirSimNH/WindowsNoEditor/AirSimNH/Binaries/Win64/AirSimNH.exe
    Linux : AirSimNH/LinuxNoEditor/AirSimNH/Binaries/Linux/AirSimNH

User Interface

  • Press F1 for hot-key help
  • Press the arrow keys or A-S-W-D to drive
  • Press Backspace to reset & restart
  • Press Alt-F4 to quit AirSim
  • Press 0 for sub-windows (shown below)
  • Press the REC button to record Car/Drone info

  The recorded file path will be shown on screen

For Car,

VehicleName TimeStamp   POS_X   POS_Y   POS_Z   Q_W Q_X Q_Y Q_Z Throttle    Steering    Brake   Gear    Handbrake   RPM Speed   ImageFile

For Drone,

VehicleName TimeStamp   POS_X   POS_Y   POS_Z   Q_W Q_X Q_Y Q_Z ImageFile
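
The recording is written as a tab-delimited text file inside a timestamped folder (commonly named airsim_rec.txt), so a minimal way to load it for analysis is with pandas; the file path below is a placeholder.

import pandas as pd

rec = pd.read_csv('airsim_rec.txt', sep='\t')          # path is printed on screen after recording
print(rec.columns.tolist())                            # e.g. VehicleName, TimeStamp, POS_X, ..., ImageFile
print(rec[['TimeStamp', 'Speed', 'Steering']].head())  # car-only columns; drop these for a drone log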

Manual Drive

remote_control
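
Besides manual driving, the car can also be controlled programmatically through the AirSim Python API; a minimal sketch, assuming the simulation is running and the airsim package is installed (pip install airsim):

import airsim
import time

client = airsim.CarClient()              # connect to the running simulation
client.confirmConnection()
client.enableApiControl(True)

car_controls = airsim.CarControls()
car_controls.throttle = 0.5
car_controls.steering = 0.0
client.setCarControls(car_controls)      # drive straight at half throttle
time.sleep(3)

state = client.getCarState()
print('speed:', state.speed)

client.enableApiControl(False)           # hand control back to the keyboard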

Reinforcement Learning in AirSim

The simulation needs to be up and running before you execute dqn_car.py!

  1. Keep AirSim Env running first
    cd AirSimNH/LinuxNoEditor
    ./AirSimNH.sh -ResX=640 -ResY=480 -windowed
    (For Windows, click AirSimNH.exe to run, Alt-Enter will switch to windowed mode)

  2. To train the DQN model for 500,000 episodes (RL with Car)
     Open a terminal on Linux / run Git Bash on Windows
    pip install gym
    pip install stable-baselines3
    cd ~
    git clone https://github.com/rkuo2000/AirSim
    cd ~/AirSim/PythonClient/reinforcement_learning
    python dqn_car.py

dqn_car.py

a2c_car.py
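
For reference, a rough sketch of the training pattern these scripts follow with stable-baselines3; the environment id and arguments below mirror the upstream Microsoft AirSim example and may differ in this fork, so check dqn_car.py for the exact names.

import gym
from stable_baselines3 import DQN
from stable_baselines3.common.monitor import Monitor
from stable_baselines3.common.vec_env import DummyVecEnv

# wrap the AirSim car gym environment (assumed id from the upstream example)
env = DummyVecEnv([
    lambda: Monitor(
        gym.make(
            "airgym:airsim-car-sample-v0",
            ip_address="127.0.0.1",
            image_shape=(84, 84, 1),
        )
    )
])

# DQN with an image-based policy; hyper-parameters here are illustrative
model = DQN("CnnPolicy", env, buffer_size=200_000, learning_starts=10_000, verbose=1)
model.learn(total_timesteps=500_000)     # roughly matches the 500,000-step run above
model.save("dqn_airsim_car")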


Autonomous Driving Cookbook

Currently, the following tutorials are available:

[AirSim settings]


Kaggle: AirSim End-to-End Learning

Build the model with an image input and a state input (driving angle)

from tensorflow.keras import layers, models

# shapes are taken from a sample batch: [0] holds images, [1] holds the state (driving angle)
image_input_shape = sample_batch_train_data[0].shape[1:]
state_input_shape = sample_batch_train_data[1].shape[1:]

# Create the convolutional stack
pic_input = layers.Input(shape=image_input_shape)

x = layers.Conv2D(16, (3, 3), activation='relu', padding='same')(pic_input)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.Conv2D(32, (3, 3), activation='relu', padding='same')(x)
x = layers.MaxPooling2D(pool_size=(2, 2))(x)
x = layers.Flatten()(x)
x = layers.Dropout(0.2)(x)

# Inject the state input
state_input = layers.Input(shape=state_input_shape)
m = layers.concatenate([x, state_input])

# Add a few dense layers to finish the model
m = layers.Dense(64, activation='relu')(m)
m = layers.Dropout(0.2)(m)
m = layers.Dense(10, activation='relu')(m)
m = layers.Dropout(0.2)(m)
m = layers.Dense(1)(m)

model = models.Model(inputs=[pic_input, state_input], outputs=m)

model.summary()
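
A minimal sketch of compiling and fitting this model; the optimizer, loss, and the placeholder arrays train_images / train_states / train_labels (built elsewhere in the notebook from the recorded driving data) are illustrative assumptions rather than the notebook's exact settings.

# steering-angle regression: mean-squared-error on a single output
model.compile(optimizer='adam', loss='mse')

model.fit(
    [train_images, train_states],        # image input and state (current angle) input
    train_labels,                        # target steering angles
    batch_size=32,
    epochs=10,
    validation_split=0.2,
)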

Test Result


Azure


Road Tests

Tesla Self-Driving


Cruise

Cruise expands its driverless taxi service to a new area


Waymo


DeepRoute.ai (元戎启行)


Xiaomi Self-Driving


Self-Driving 2.0

  • Drive with instance segmentation model
  • Depth Estimation
  • RNN


