
dc.contributor.author: Lin, Chengchuang
dc.contributor.author: Chen, Hanbiao
dc.contributor.author: Huang, Jiesheng
dc.contributor.author: Peng, Jing
dc.contributor.author: Guo, Li
dc.contributor.author: Yang, Zhirong
dc.contributor.author: Du, Jiahua
dc.contributor.author: Li, Shuangyin
dc.contributor.author: Yin, Aihua
dc.contributor.author: Zhao, Gansen
dc.date.accessioned: 2022-12-30T08:31:42Z
dc.date.available: 2022-12-30T08:31:42Z
dc.date.created: 2022-07-18T14:58:41Z
dc.date.issued: 2022
dc.identifier.issn: 1476-9271
dc.identifier.uri: https://hdl.handle.net/11250/3040011
dc.description.abstract: Chromosome karyotyping analysis is a vital cytogenetic technique for diagnosing genetic and congenital malformations and for analyzing gestational and implantation failures, among other uses. Chromosome classification, an essential stage of karyotype analysis, is a highly time-consuming, tedious, and error-prone task that requires a large amount of manual work by experienced cytogenetics experts. Many deep learning-based methods have been proposed to address it. However, two challenges remain in current chromosome classification methods. First, most existing methods were developed on different private datasets, which makes them difficult to compare on a common basis. Second, because most existing methods omit the details needed to reproduce them, they are difficult to apply widely in clinical chromosome classification. To address these challenges, this work builds and publishes a massive clinical dataset that enables benchmarking and the building of chromosome classification baselines suitable for different scenarios. The dataset consists of 126,453 privacy-preserving G-band chromosome instances from 2763 karyotypes of 408 individuals. To the best of our knowledge, this is the first work to collect, annotate, and release a publicly available clinical chromosome classification dataset with more than 120,000 instances. The experimental results show that the proposed dataset improves the performance of existing chromosome classification models to varying degrees, with the largest accuracy improvement being 5.39 percentage points. Moreover, the best baseline achieves state-of-the-art classification performance with 99.33% accuracy. The clinical dataset and state-of-the-art baselines can be found at https://github.com/CloudDataLab/BenchmarkForChromosomeClassification. (en_US)
dc.language.iso: eng (en_US)
dc.publisher: Elsevier (en_US)
dc.title: ChromosomeNet: A massive dataset enabling benchmarking and building basedlines of clinical chromosome classification (en_US)
dc.title.alternative: ChromosomeNet: A massive dataset enabling benchmarking and building basedlines of clinical chromosome classification (en_US)
dc.type: Journal article (en_US)
dc.type: Peer reviewed (en_US)
dc.description.version: acceptedVersion (en_US)
dc.source.journal: Computational biology and chemistry (en_US)
dc.identifier.doi: https://doi.org/10.1016/j.compbiolchem.2022.107731
dc.identifier.cristin: 2038673
cristin.ispublished: false
cristin.fulltext: postprint
cristin.qualitycode: 1

