Bioinformatics Pipeline for Human Papillomavirus Short Read Genomic Sequences Classification Using Support Vector Machine

Abstract

We recently developed a test based on the Agilent SureSelect target enrichment system that captures genomic fragments from 191 human papillomavirus (HPV) types for Illumina sequencing. This enriched whole genome sequencing (eWGS) assay provides an approach to identifying all HPV types in a sample. Here we present a machine learning algorithm that calls HPV types from the eWGS output. The algorithm, based on the support vector machine (SVM) technique, was trained on eWGS data from 122 control samples with known HPV types. It performed well in HPV type detection for designed samples containing 25 or more HPV plasmid copies per sample. For 261 residual epidemiologic samples, we compared the HPV typing results of the new algorithm with those of the standard HPV Linear Array (LA) assay. The agreement between the methods (97.4%) was substantial (kappa = 0.783). In addition, the new algorithm identified 428 instances of HPV types that the LA assay, by design, cannot detect. Overall, we have demonstrated that the bioinformatics pipeline is an accurate tool for calling HPV types from data generated by eWGS processing of DNA fragments extracted from control and epidemiological samples.
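The abstract reports both a percent agreement (97.4%) and a Cohen's kappa (0.783) between the two typing methods. The sketch below shows how the two statistics relate for a pair of binary callers. The function name `cohen_kappa` and the 2x2 counts are hypothetical and chosen only for illustration; they are not the study's data, and the abstract does not say how its contingency table was assembled.

```python
def cohen_kappa(both_pos, only_a, only_b, both_neg):
    """Return (observed agreement, Cohen's kappa) for two binary raters.

    Arguments are the four cells of a 2x2 agreement table:
    both_pos  - calls positive by method A and method B
    only_a    - calls positive by A only
    only_b    - calls positive by B only
    both_neg  - calls negative by both methods
    """
    n = both_pos + only_a + only_b + both_neg
    if n == 0:
        raise ValueError("agreement table is empty")

    # Observed agreement: fraction of calls where the methods concur.
    observed = (both_pos + both_neg) / n

    # Marginal positive rates of each method.
    a_pos = (both_pos + only_a) / n
    b_pos = (both_pos + only_b) / n

    # Agreement expected by chance if the methods were independent.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


if __name__ == "__main__":
    # Hypothetical counts (NOT from the paper): most type-by-sample
    # calls are negative, as is typical when many types are tested.
    observed, kappa = cohen_kappa(both_pos=40, only_a=5, only_b=5, both_neg=950)
    print(f"observed agreement = {observed:.3f}, kappa = {kappa:.3f}")
```

When negative calls dominate, chance agreement is already high, so kappa is noticeably lower than the raw percent agreement. This is consistent with a high percent agreement appearing alongside a more moderate kappa.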