In the Journals

Artificial intelligence superior to dermatologists for melanoma detection

A form of artificial intelligence known as a deep learning convolutional neural network appeared more effective than experienced dermatologists for melanoma detection, according to results of a comparative cross-sectional study.

“The convolutional neural network missed fewer melanomas — meaning it had a higher sensitivity than the dermatologists — and it misdiagnosed fewer benign moles as malignant melanoma, which means it had a higher specificity. This would result in less unnecessary surgery,” Professor Holger Haenssle, senior managing physician in the department of dermatology at University of Heidelberg in Germany, said in a press release.

Deep learning convolutional neural networks are artificial neural networks inspired by the biological processes that occur when neurons in the brain respond to what the eye sees. These networks can perform image analysis and teach themselves to improve their performance through a process known as machine learning, according to study background.
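The core operation such networks stack many times is a learned convolutional filter sliding over an image. The toy sketch below (not the study's network; the image and filter values are made up for illustration) shows a single hand-set filter responding to a vertical edge; in a real deep learning network, the filter weights are learned from labeled images during training.

```python
# Toy illustration of a convolutional layer's basic operation:
# a small filter (kernel) slides over a grayscale "image" and produces
# a feature map that responds where the pattern it encodes appears.

def convolve2d(image, kernel):
    """Valid-mode 2-D convolution (cross-correlation, as CNNs use it)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = 0.0
            for di in range(kh):
                for dj in range(kw):
                    s += image[i + di][j + dj] * kernel[di][dj]
            row.append(s)
        out.append(row)
    return out

# An image with a dark-to-light vertical edge, and a vertical-edge filter.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [
    [-1, 1],
    [-1, 1],
]
feature_map = convolve2d(image, kernel)
# The feature map responds strongly only along the edge.
```

In a trained network, many such filters feed into further layers whose weights are adjusted during machine learning to minimize classification error on the training images.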

These networks have been considered a potential tool to facilitate melanoma detection; however, limited data exist comparing the diagnostic performance of these networks with larger groups of dermatologists.

Haenssle and colleagues used dermoscopic images and corresponding diagnoses to train and validate Google’s Inception v4 convolutional neural network architecture.

Investigators performed a cross-sectional reader study in which they used a 100-image test set. They divided the test into two levels. Level I included dermoscopy only. Level II included dermoscopy plus clinical information and images.

Sensitivity, specificity and area under the curve of receiver operating characteristics for diagnostic classification of lesions by the convolutional neural network compared with an international group of 58 dermatologists served as the primary outcome measures.

Secondary endpoints included the dermatologists’ diagnostic performance in their management decisions, as well as the differences in diagnostic performance of dermatologists between level I and level II of the study.

In level I of the study, dermatologists achieved a mean sensitivity of 86.6% (standard deviation, ± 9.3%) and mean specificity of 71.3% (standard deviation, ± 11.2%) for lesion classification. The additional clinical information provided in level II improved sensitivity to 88.9% (standard deviation, ± 9.6%) and specificity to 75.7% (standard deviation, ± 11.7%).

At the dermatologists’ mean level of sensitivity, the convolutional neural network’s receiver operating characteristic curve yielded higher specificity (82.5%) than dermatologists achieved in both the level I (P < .01) and level II (P < .01) portions of the study.

The convolutional neural network’s area under the receiver operating characteristic curve also exceeded the dermatologists’ mean area under the curve (0.86 vs. 0.79; P < .01).
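The metrics reported above can be computed from classifier output in a few lines. The sketch below uses made-up lesion scores, not data from the study: sensitivity and specificity come from the confusion counts at a chosen decision threshold, and the area under the ROC curve equals the probability that a randomly chosen melanoma receives a higher score than a randomly chosen benign lesion.

```python
# Hedged sketch: computing sensitivity, specificity and ROC AUC.
# The labels and scores below are invented for illustration only.

def sensitivity_specificity(labels, predictions):
    """labels: 1 = melanoma, 0 = benign; predictions: classifier calls."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(labels, scores):
    """AUC as the probability a random melanoma outscores a random
    benign lesion, ties counting half (the Mann-Whitney statistic)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0, 0]                # 3 melanomas, 4 benign moles
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]  # hypothetical network outputs
calls = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(labels, calls)
auc = roc_auc(labels, scores)
```

Raising the threshold trades sensitivity for specificity; sweeping it over all values traces the ROC curve, which is why the AUC summarizes performance independently of any single operating point.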

“When dermatologists received more clinical information and images at level II, their diagnostic performance improved,” Haenssle said in the release. “However, the convolutional neural network, which was still working solely from the dermoscopic images with no additional clinical information, continued to outperform the physicians’ diagnostic abilities.”

Researchers also compared the convolutional neural network’s performance with the top-five algorithms from the 2016 International Symposium on Biomedical Imaging challenge, designed to foster collaboration and understanding through the quantitative comparison of competing methods to accelerate the pace of research on clinical and academic problems.

Results for the convolutional neural network appeared close to the top three algorithms from that challenge.

The fact that a convolutional neural network outperformed most dermatologists suggests clinicians, regardless of experience, may benefit from assistance with the network’s image classification.

“This convolutional neural network may serve physicians involved in skin cancer screening as an aid in their decision whether to biopsy a lesion,” Haenssle said. “Most dermatologists already use digital dermoscopy systems to image and store lesions for documentation and follow-up. The convolutional neural network can then easily and rapidly evaluate the stored image for an ‘expert opinion’ on the probability of melanoma.”

Haenssle and colleagues plan prospective studies to evaluate the real-life impact of the network for physicians and patients.

Researchers acknowledged study limitations. Dermatologists knew they were participating in a study conducted in an artificial setting, and the test sets did not include a full range of skin lesions. In addition, the study incorporated fewer validated images from nonwhite skin types and genetic backgrounds.

Diagnostic accuracy of melanoma hinges on the training and experience of the treating clinician, Victoria Mar, MBBS, FACD, PhD, adjunct senior lecturer with Monash University in Australia and dermatologist with Victorian Melanoma Service, and Professor H. Peter Soyer from The University of Queensland in Australia, wrote in an accompanying editorial.

The findings from Haenssle and colleagues show “artificial intelligence promises a more standardized level of diagnostic accuracy, such that all people — regardless of where they live or which doctor they see — will be able to access reliable diagnostic assessment.”

However, Mar and Soyer identified several barriers that must be overcome before artificial intelligence will become standard in the clinic.

These include how to train artificial intelligence to recognize atypical melanomas or those of which patients are unaware, as well as the difficulty of imaging melanomas on sites such as the scalp, toes and fingers.

“Currently, there is no substitute for a thorough clinical examination,” Mar and Soyer wrote. “However, 2-D and 3-D total body photography is able to capture about 90% to 95% of the skin surface and, given exponential development of imaging technology, we envisage that — sooner than later — automated diagnosis will change the diagnostic paradigm in dermatology. Still, there is much more work to be done to implement this exciting technology safely into routine clinical care.” – by Mark Leiser

Disclosure: Haenssle reports honoraria or travel expenses from several companies involved in the development of devices for skin cancer screening. They include FotoFinder Systems GmbH, HEINE Optotechnik GmbH, Magnosco GmbH and Scibase AB. Another researcher reports travel expenses from Magnosco GmbH. Soyer reports shareholder interests and consultant roles with MoleMap and e-derm consult GmbH. Mar reports no relevant financial disclosures.
