Lecture

Autonomous Driving

03 Nov 2022 • Richard Kuo

Autonomous Driving includes SAE, Surveys, Datasets, Camera-based / LiDAR-based Methods, End-to-End Learning, Car Simulators, and Road Tests, Self-Driving 2.0

SAE Levels of Autonomy

The Society of Automotive Engineers (SAE) divided levels of driving automation into 6 categories, ranging from 0 (fully manual) to 5 (fully autonomous).

https://aboutselfdriving.com/

Level 0: No Automation
The cars we are normally driving right now. The driver performs all driving tasks with zero autonomy.
Level 1: Driver Assistance
Drivers are still driving, but they could be assisted by the car on one task. Examples include automatic braking or auto-steering.
Level 2: Partial Automation
The cars would have at least two automated functions. The car can drive completely on its own in specific situations such as parking.
Level 3: Conditional Automation
The car can handle “dynamic driving tasks” but still needs instantaneous intervention when the car requests. It is not equipped with a fail-safe yet.
Level 4: High Automation
The level that many companies aim for nowadays. The cars are officially driverless under certain conditions.
Level 5: Full Automation
The vehicle is able to perform all driving functions under all conditions. Some might even get rid of the steering wheels.

automakers/software companies

There are mainly three aspects of self-driving in the industry right now, consumer cars, robo-taxi, and driverless trucking.

Robo-Taxi
Basically, a robo-taxi is an Uber without a driver. That being said, a robo-taxi is clearly an L4 or L5-level self-driving car, which is the ultimate futuristic smart city goal of humanity. It would optimize the rideshare network. People would no longer have to keep a car themselves nor do they worry about the traffic jam getting to work.

Current companies working on it include but are not limited to Waymo, AutoX, Cruise, DiDi, Pony.ai, Zoox, Aurora, Motional, Optimus Ride (acquired by Magna), Tesla, Baidu, etc. Specifically, Waymo is the first to open up its public driverless taxi program in Pheonix in 2020. AutoX opened Its Fully Driverless RoboTaxi Service to the Public in China around Jan 2021, and Cruise recently (Feb 2022) opened up its driverless cars in San Francisco to the public. On a side note, Ford- and VW- backed Argo AI just got shut down on Oct. 27th, 2022.

Consumer Cars
The leading company in this realm is no doubt Tesla, which has been selling millions of its self-manufactured vehicles all over the world. Tesla has also released its Full Self-driving (FSD) software based on subscription to help drivers navigate through urban areas. Not to be tricked by the name though, as the system is still a level 2. Xpeng is also doing quite well in China, a main competitor of Tesla. Other competitors include but are not limited to Mobileye, Nvidia, Apple, GM, Ford, Volvo, Qualcomm, etc.
Driverless Trucking
The leading companies include but are not limited to Aurora, TuSimple, Embark Trucks, and Daimler Truck.

Autonomous Vehicles | Self-Driving Vehicles Enacted Legislation

NCSL has a NEW autonomous vehicles legislative database, providing up-to-date, real-time information about state autonomous vehicle legislation that has been introduced in the 50 states and the District of Columbia.

Open-Source Autonomous Driving Datasets

A2D2 Dataset for Autonomous Driving
Released by Audi, the Audi Autonomous Driving Dataset (A2D2) was released to support startups and academic researchers working on autonomous driving. The dataset includes over 41,000 labeled with 38 features. Around 2.3 TB in total, A2D2 is split by annotation type (i.e. semantic segmentation, 3D bounding box). In addition to labelled data, A2D2 provides unlabelled sensor data (~390,000 frames) for sequences with several loops.
ApolloScape Open Dataset for Autonomous Driving
Part of the Apollo project for autonomous driving, ApolloScape is an evolving research project that aims to foster innovation across all aspects of autonomous driving, from perception to navigation and control. Via their website, users can explore a variety of simulation tools and over 100K street view frames, 80k lidar point cloud and 1000km trajectories for urban traffic.
Argoverse Dataset
Argoverse is made up of two datasets designed to support autonomous vehicle machine learning tasks such as 3D tracking and motion forecasting. Collected by a fleet of autonomous vehicles in Pittsburgh and Miami, the dataset includes 3D tracking annotations for 113 scenes and over 324,000 unique vehicle trajectories for motion forecasting. Unlike most other open source autonomous driving datasets, Argoverse is the only modern AV dataset that provides forward-facing stereo imagery.
Berkeley DeepDrive Dataset
Also known as BDD 100K, the DeepDrive dataset gives users access to 100,000 annotated videos and 10 tasks to evaluate image recognition algorithms for autonomous driving. The dataset represents more than 1000 hours of driving experience with more than 100 million frames, as well as information on geographic, environmental, and weather diversity.
CityScapes Dataset
CityScapes is a large-scale dataset focused on the semantic understanding of urban street scenes in 50 German cities. It features semantic, instance-wise, and dense pixel annotations for 30 classes grouped into 8 categories. The entire dataset includes 5,000 annotated images with fine annotations, and an additional 20,000 annotated images with coarse annotations.
Comma2k19 Dataset
This dataset includes 33 hours of commute time recorded on highway 280 in California. Each 1-minute scene was captured on a 20km section of highway driving between San Jose and San Francisco. The data was collected using comma EONs, which features a road-facing camera, phone GPS, thermometers and a 9-axis IMU.
Google-Landmarks Dataset
Published by Google in 2018, the Landmarks dataset is divided into two sets of images to evaluate recognition and retrieval of human-made and natural landmarks. The original dataset contains over 2 million images depicting 30 thousand unique landmarks from across the world. In 2019, Google published Landmarks-v2, an even larger dataset with 5 million images and 200k landmarks.
KITTI Vision Benchmark Suite
First released in 2012 by Geiger et al, the KITTI dataset was released with the intent of advancing autonomous driving research with a novel set of real-world computer vision benchmarks. One of the first ever autonomous driving datasets, KITTI boasts over 4000 academic citations and counting.
KITTI provides 2D, 3D, and bird’s eye view object detection datasets, 2D object and multi-object tracking and segmentation datasets, road/lane evaluation detection datasets, both pixel and instance-level semantic datasets, as well as raw datasets.
LeddarTech PixSet Dataset
Launched in 2021, Leddar PixSet is a new, publicly available dataset for autonomous driving research and development that contains data from a full AV sensor suite (cameras, LiDARs, radar, IMU), and includes full-waveform data from the Leddar Pixell, a 3D solid-state flash LiDAR sensor. The dataset contains 29k frames in 97 sequences, with more than 1.3M 3D boxes annotated
Level 5 Open Data
Published by popular rideshare app Lyft, the Level5 dataset is another great source for autonomous driving data. It includes over 55,000 human-labeled 3D annotated frames, surface map, and an underlying HD spatial semantic map that is captured by 7 cameras and up to 3 LiDAR sensors that can be used to contextualize the data.
nuScenes Dataset
Developed by Motional, the nuScenes dataset is one of the largest open-source datasets for autonomous driving. Recorded in Boston and Singapore using a full sensor suite (32-beam LiDAR, 6 360° cameras and radars), the dataset contains over 1.44 million camera images capturing a diverse range of traffic situations, driving maneuvers, and unexpected behaviors.
Oxford Radar RobotCar Dataset
The Oxford RobotCar Dataset contains over 100 recordings of a consistent route through Oxford, UK, captured over a period of over a year. The dataset captures many different environmental conditions, including weather, traffic and pedestrians, along with longer term changes such as construction and roadworks.
PandaSet
PandaSet was the first open-source AV dataset available for both academic and commercial use. It contains 48,000 camera images, 16,000 LiDAR sweeps, 28 annotation classes, and 37 semantic segmentation labels taken from a full sensor suite.
Udacity Self Driving Car Dataset
Online education platform Udacity has open sourced access to a variety of projects for autonomous driving, including neural networks trained to predict steering angles of the car, camera mounts, and dozens of hours of real driving data.
Waymo Open Dataset
The Waymo Open dataset is an open-source multimodal sensor dataset for autonomous driving. Extracted from Waymo self-driving vehicles, the data covers a wide variety of driving scenarios and environments. It contains 1000 types of different segments where each segment captures 20 seconds of continuous driving, corresponding to 200,000 frames at 10 Hz per sensor.

KITTI

LiDAR-based 3D Object Detection Methods

YOLO3D

Paper: YOLO3D: End-to-end real-time 3D Oriented Object Bounding Box Detection from LiDAR Point Cloud
Code: maudzung/YOLO3D-YOLOv4-PyTorch

Inputs: Bird-eye-view (BEV) maps that are encoded by height, intensity and density of 3D LiDAR point clouds.
The input size: 608 x 608 x 3
Outputs: 7 degrees of freedom (7-DOF) of objects: (cx, cy, cz, l, w, h, θ)

YOLOv4 architecture

YOLO4D

Paper: YOLO4D: A Spatio-temporal Approach for Real-time Multi-object Detection and Classification from LiDAR Point Clouds

Camera-based 3D Object Detection Methods

Pseudo-LiDAR

Paper: Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving
Code: mileyan/pseudo_lidar

Stereo R-CNN based 3D Object Detection

Paper: Stereo R-CNN based 3D Object Detection for Autonomous Driving
Code: HKUST-Aerial-Robotics/Stereo-RCNN

Monocular Camera-based Depth Estimation Methods

DF-Net

Paper: DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Task Consistency
Code: vt-vl-lab/DF-Net

Model in the paper uses 2-frame as input, while this code uses 5-frame as input (you might use any odd numbers of frames as input, though you would need to tune the hyper-parameters)
FlowNet in the paper is pre-trained on SYNTHIA, while this one is pre-trained on Cityscapes

Depth Fusion Methods with LiDAR and Camera

DFineNet

Paper: DFineNet: Ego-Motion Estimation and Depth Refinement from Sparse, Noisy Depth Input with RGB Guidance
Code: vt-vl-lab/DF-Net

3D Object Detection Methods with LiDAR and Camera

Pedestrain Behavior Prediction Methods

Predicting Future Person Activities and Locations in Videos

Paper: Peeking into the Future: Predicting Future Person Activities and Locations in Videos

Spectral Trajectory and Behavior Prediction

Papers: Forecasting Trajectory and Behavior of Road-Agents Using Spectral Clustering in Graph-LSTMs
Code: Xiejc97/Spectral-Trajectory-and-Behavior-Prediction

Vehicle Behavior Prediction Methods

Autonomous Vehicle Papers

End-to-End Deep Learning for Self-Driving Cars

Blog: End-to-End Deep Learning for Self-Driving Cars

End-to-End Learning of Driving Models

Paper: End-to-End Learning of Driving Models with Surround-View Cameras and Route Planners

End-to-end Learning in Simulated Urban Environments

Paper: Autonomous Vehicle Control: End-to-end Learning in Simulated Urban Environments

NeuroTrajectory

Paper: NeuroTrajectory: A Neuroevolutionary Approach to Local State Trajectory Learning for Autonomous Vehicles

ChauffeurNet (Waymo)

Paper: ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst
Code: aidriver/ChauffeurNet

With Stop Signs Rendered	No Stop Signs Rendered
With Perception Boxes Rendered	No Perception Boxes Rendered

Github: Iftimie/ChauffeurNet

Car Simulators

CARLA

Paper: CARLA: An Open Urban Driving Simulator

CARLA Simulator

Traffic manager. A built-in system that takes control of the vehicles besides the one used for learning. It acts as a conductor provided by CARLA to recreate urban-like environments with realistic behaviours.
Sensors. Vehicles rely on them to dispense information of their surroundings. In CARLA they are a specific kind of actor attached the vehicle and the data they receive can be retrieved and stored to ease the process. Currently the project supports different types of these, from cameras to radars, lidar and many more.
Recorder. This feature is used to reenact a simulation step by step for every actor in the world. It grants access to any moment in the timeline anywhere in the world, making for a great tracing tool.
ROS bridge and Autoware implementation. As a matter of universalization, the CARLA project ties knots and works for the integration of the simulator within other learning environments.
Open assets. CARLA facilitates different maps for urban settings with control over weather conditions and a blueprint library with a wide set of actors to be used. However, these elements can be customized and new can be generated following simple guidelines. Scenario runner. In order to ease the learning process for vehicles,

AirSim

Github
Document

Cars in AirSim

Drones in AirSim

Install AirSim

Releases (.zip)

For Windows, the following environments are available:

AbandonedPark
Africa (uneven terrain and animated animals)
AirSimNH (small urban neighborhood block)
Blocks
Building_99
CityEnviron
Coastline
LandscapeMountains
MSBuild2018 (soccer field)
TrapCamera
ZhangJiajie

For Linux, the following environments are available:

AbandonedPark
Africa (uneven terrain and animated animals)
AirSimNH (small urban neighborhood block)
Blocks
Building_99
LandscapeMountains
MSBuild2018 (soccer field)
TrapCamera
ZhangJiajie

Download & Unzip a environment (AirSimNH.zip)
Find the exe file, click to run

Windows : AirSimNH/WindowsNoEditor/AirSimNH/Binaries/Win64/AirSimNH.exe
Linux : AirSimNH/LinuxNoEditor/AirSimNH/Binaries/Linux/AirSimNH

User Interface

press F1 for hot-keys help
press Arrow-keys or A-S-W-D to drive
press Backspace to rest & restart
press Alt-F4 to quit AirSim
press 0 for sub-windows (shown below)
Press REC button for recording Car/Drone info

recorded file path will be shown on screen

For Car,

VehicleName TimeStamp   POS_X   POS_Y   POS_Z   Q_W Q_X Q_Y Q_Z Throttle    Steering    Brake   Gear    Handbrake   RPM Speed   ImageFile

For Drone,

VehicleName TimeStamp   POS_X   POS_Y   POS_Z   Q_W Q_X Q_Y Q_Z ImageFile

Manual Drive

remote_control

Reinforcement Learning in AirSim

The simulation needs to be up and running before you execute dqn_car.py!!!

Keep AirSim Env running first
cd AirSimNH/LinuxNoEditor
./AirSimNH.sh -ResX=640 -ResY=480 -windowed
(For Windows, click AirSimNH.exe to run, Alt-Enter will switch to windowed mode)
To train DQN model for 500,000 eposides (RL with Car)
Open a Terminal on Linux / run Git-Bash on Windows
pip install gym
pip install stable-baselines3
cd ~
git clone https://github.com/rkuo2000/AirSim
cd ~/AirSim/PythonClient/reinforcement_learning
python dqn_car.py

dqn_car.py

a2c_car.py