iToBoS 2024 - Skin Lesion Detection with 3D-TBP

Description

The iToBoS dataset: skin region images extracted from 3D total body photographs for lesion detection

The early detection of skin cancer is critical for improving patient outcomes. Traditionally, dermatologists rely on dermoscopy to examine pigmented skin lesions. While this non-invasive technique enhances diagnostic accuracy, its effectiveness depends heavily on the clinician’s expertise. Additionally, capturing dermoscopic images of every suspicious lesion is labor-intensive. Given these challenges, there is an increasing need for computer-aided diagnosis (CAD) systems that work with conventional cameras. Such systems can support general physicians and other non-specialist practitioners in identifying potentially malignant lesions, improving early detection and intervention. They also facilitate longitudinal tracking of lesions, aiding researchers in studying disease progression and treatment efficacy.

This dataset provides high-resolution skin patch images extracted from 3D total body photographs to support the development of machine learning models for lesion detection. It is intended as a resource for researchers working on automated skin lesion analysis, particularly in the context of total body photography (TBP). The dataset contains 59,997 lesion-identifying regions of interest (ROIs) across 16,954 images from 100 patients.

Dataset Description:

The iToBoS dataset consists of 16,954 high-resolution images of skin regions obtained from anonymized 3D avatars of patients. These avatars were generated using the Canfield VECTRA WB360 system, a cutting-edge imaging technology that captures comprehensive, full-body skin images using 92 fixed cameras arranged in 46 stereo pairs with xenon flash lighting. The images were collected from patients at two clinical sites: the Clinical Hospital of Barcelona (Spain) (n=7,729) and the University of Queensland (Australia) (n=9,225).

The dataset covers diverse anatomical locations, including the torso, arms, and legs. Images have an average resolution of 1012x827 pixels, with a 45-pixel overlap between adjacent images. The images are extracted from the 3D avatars, and patient facial features are removed automatically to comply with the GDPR. Each image is accompanied by metadata, including patient age range, body location, and sun damage score, allowing for in-depth analysis and stratification.

Significance of the Dataset:

  1. Facilitates Automated Skin Lesion Detection: The dataset supports the development of AI-based lesion detection models that can improve early diagnosis of skin cancer, particularly in regions with limited access to dermatological expertise.
  2. Supports Total Body Photography Research: Leveraging 3D TBP for lesion detection is an emerging field, and this dataset provides a benchmark for further exploration.
  3. Enhances Machine Learning Applications: The dataset serves as a benchmark for developing state-of-the-art computer vision and deep learning models for detection of skin lesions.
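Detection benchmarks of this kind are commonly scored by matching predicted boxes to annotated ROIs using intersection over union (IoU). This page does not specify an evaluation protocol, so the following is only a minimal sketch; the function name `iou` and the corner-based box layout (x_min, y_min, x_max, y_max) are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Assumption: each box is (x_min, y_min, x_max, y_max) in pixels.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0
```

A predicted box is typically counted as a match when its IoU with an annotated ROI exceeds a chosen threshold; the appropriate threshold for small lesions is a design decision left to the user.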

PNG Format

The image files were originally captured in PNG format but are published here as compressed JPEGs. Our internal testing indicates that over 97% of the JPEG images achieve a PSNR greater than 35 dB when compared to the original PNG versions, while the JPEG dataset is only ~6% of the original dataset size. The original PNG files are available on Figshare.
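For readers who want to reproduce this kind of comparison, PSNR is defined as 10·log10(MAX² / MSE), where MSE is the mean squared error between the two images and MAX is the largest possible pixel value. The sketch below is not the authors' test procedure; it assumes 8-bit intensities (MAX = 255) and, to stay dependency-free, takes flat sequences of pixel values rather than image arrays. The function name `psnr` is hypothetical.

```python
import math


def psnr(reference, compressed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Assumption: both inputs are flat sequences of pixel intensities
    (for example, all channels of an 8-bit image flattened into one list).
    """
    if len(reference) != len(compressed):
        raise ValueError("images must contain the same number of values")
    if len(reference) == 0:
        raise ValueError("images must not be empty")

    # Mean squared error over all pixel values.
    mse = sum((r - c) ** 2 for r, c in zip(reference, compressed)) / len(reference)

    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values indicate that the compressed image is closer to the reference; identical images yield infinity.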

Funding

EC | EU Framework Programme for Research and Innovation H2020 | H2020 Priority Societal Challenges | H2020 Health (H2020 Societal Challenges - Health, Demographic Change and Well-being) - SC1-BHC-06-2020-965221

Files

Description | Size | Type
The complete bundle of all images, metadata, and supplemental files related to this dataset. | 1000.5 MB | ZIP
The metadata for this dataset. | 3.5 MB | CSV
Image-level labels including body part, degree of sun damage, pixel spacing, and presence of hidden segmentations. | 1.3 MB | CSV
Image-level labels for the training/test split assignment. | 306.3 KB | CSV
Bounding box segmentation labels in JSON format. | 14.5 MB | JSON
Bounding box segmentation labels in TXT file format. | 5.1 MB | ZIP
Polygonal annotations for images that originally contained patient-identifying features such as tattoos or jewelry. | 7.5 MB | JSON
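The listing does not document the layout of the TXT bounding box labels. If they follow the widely used YOLO convention (one line per box: class index, then x_center, y_center, width, and height normalized to the image size), a line could be converted to pixel coordinates roughly as sketched below. That convention is an assumption, not something this page confirms, and the function name `yolo_line_to_pixel_box` is hypothetical.

```python
def yolo_line_to_pixel_box(line, image_width, image_height):
    """Convert one YOLO-style label line to a pixel-space box.

    Assumption: the line is "class x_center y_center width height" with the
    four coordinates normalized to [0, 1]. Check the actual files first.
    Returns (class_id, (x_min, y_min, x_max, y_max)).
    """
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}")

    class_id = int(fields[0])
    x_center, y_center, width, height = (float(v) for v in fields[1:5])

    # Scale normalized values to pixels.
    box_w = width * image_width
    box_h = height * image_height
    x_min = x_center * image_width - box_w / 2.0
    y_min = y_center * image_height - box_h / 2.0

    return class_id, (x_min, y_min, x_min + box_w, y_min + box_h)
```

Inspecting a few label files against their images before relying on this conversion is advisable, since other TXT layouts (for example, absolute corner coordinates) are equally plausible.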

Dataset Details

Published
DOI: 10.34970/561126
Images: 16,954
Attributions
  • Department of Dermatology, Hospital Clínic de Barcelona and Frazer Institute, The University of Queensland, Dermatology Research Centre

Licenses

CC-BY

This content is free to use, modify, and share as long as you provide credit to the original creator.

How to Cite