Create a large-scale video driving dataset with detailed attributes using Amazon SageMaker Ground Truth

 Create a large-scale video driving dataset with detailed attributes using Amazon SageMaker Ground Truth

Do you ever wonder what goes behind bringing various levels of autonomy to vehicles? What the vehicle sees (perception) and how the vehicle predicts the actions of different agents in the scene (behavior prediction) are the first two steps in autonomous systems. In order for these steps to be successful, large-scale driving datasets are key. Driving datasets typically comprise of data captured using multiple sensors such as cameras, LIDARs, radars, and GPS, in a variety of traffic scenarios during different times of the day under varied weather conditions and locations. The Amazon Machine Learning Solutions Lab is collaborating with the Laboratory of Intelligent and Safe Automobiles (LISA Lab) at the University of California, San Diego (UCSD) to build a large, richly annotated, real-world driving dataset with fine-grained vehicle, pedestrian, and scene attributes.

This post describes the dataset label taxonomy and labeling architecture for 2D bounding boxes using Amazon SageMaker Ground Truth. Ground Truth is a fully managed data labeling service that makes it easy to build highly accurate training datasets for machine learning (ML) workflows. These workflows support a variety of use cases, including 3D point clouds, video, images, and text. As part of the workflows, labelers have access to assistive labeling features such as automatic 3D cuboid snapping, removal of distortion in 2D images, and auto-segment tools to reduce the time required to label datasets. In addition, Ground Truth offers automatic data labeling, which uses an ML model to label your data.

LISA Amazon-MLSL Vehicle Attributes (LAVA) dataset

LAVA is a diverse, large-scale dataset with a unique label set that we created to provide high-quality labeled video data for a variety of modern computer vision applications in the automotive domain. We captured the data using rigidly mounted cameras with a 1/2.3” sensor size and f/2.8 aperture at 1920×1080 resolution. The chosen aperture, sensor size, and focal length between 1 to 20 meters results in a depth-of-field extending up to infinity, which means most objects on the road are in-focus. We augmented the data with additional navigation sensors that provide centimeter-level localization accuracy and inertial motion information. We collected the data during real-world drives in Southern California under different illumination, weather, traffic, and road conditions to capture the complexity and diversity of real-world vehicle operation. This, put together with our unique set of annotations, allowed us to develop reliable ML models for existing automotive applications, and new ones that were previously unfeasible due to the lack of high-quality labeled data.

From the hundreds of hours of raw data captured during our many data collection d


Source - Continue Reading:


Related post