Conference: Mediterranean Conference on Medical and Biological Engineering and Computing

Community Finding with Applications on Phylogenetic Networks

Abstract

With the advent of high-throughput sequencing methods, new ways of visualizing and analyzing increasingly large amounts of data are needed. Although some software already exists, it does not scale well or requires advanced skills to be useful in phylogenetics. The aim of this thesis was to implement three community finding algorithms (Louvain, Infomap and Layered Label Propagation (LLP)); to benchmark them using two synthetic networks (Girvan-Newman (GN) and Lancichinetti-Fortunato-Radicchi (LFR)); to test them in real networks, in particular one derived from a Staphylococcus aureus MLST dataset; to compare two visualization frameworks (Cytoscape.js and D3.js); and, finally, to make it all available online (mscthesis.herokuapp.com). Louvain, Infomap and LLP were implemented in JavaScript. Unless otherwise stated, the following conclusions are valid for both GN and LFR. In terms of speed, Louvain outperformed all others. In terms of accuracy, Louvain was the most accurate in networks with well-defined communities, while LLP was the best for higher mixing. In highly mixed GN networks, unlike weakly mixed ones, it is advantageous to increase the resolution parameter. In LFR, higher resolution decreases detection accuracy regardless of the mixing parameter. Increasing the average node degree improved partitioning accuracy and suggested that detection by chance was minimized. Generating GN networks with higher mixing or average degree is computationally more intensive, whether using the algorithm developed in the thesis or the LFR implementation. In the S. aureus network, Louvain was the fastest and the most accurate at detecting the clusters of seven groups of strains that evolved directly from the common ancestor.
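As an illustration only, the roles of the mixing and resolution parameters mentioned in the abstract can be sketched in Python (the thesis itself used JavaScript, and its code is not reproduced here). The sketch assumes a planted-partition generator in the spirit of the GN benchmark, with the commonly quoted defaults of 4 groups of 32 nodes and average degree 16, and it assumes that "resolution" means a multiplier on the null-model term of modularity. The function names `gn_benchmark` and `modularity` and all parameter values are hypothetical and are not taken from the thesis.

```python
import random
from collections import defaultdict


def gn_benchmark(n_groups=4, group_size=32, avg_degree=16, mixing=0.1, seed=0):
    """Random graph with planted groups, in the spirit of the GN benchmark.

    Assumption: each node pair is linked independently, with probabilities
    chosen so that the expected degree is `avg_degree` and the expected
    fraction of a node's edges leaving its group is `mixing`.
    """
    rng = random.Random(seed)
    n = n_groups * group_size
    group = [v // group_size for v in range(n)]
    p_in = (1 - mixing) * avg_degree / (group_size - 1)
    p_out = mixing * avg_degree / (n - group_size)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if group[u] == group[v] else p_out
            if rng.random() < p:
                edges.append((u, v))
    return edges, group


def modularity(edges, community, resolution=1.0):
    """Modularity of a partition of an undirected, unweighted graph.

    Q = sum over communities c of  L_c / m - resolution * (d_c / (2m))^2,
    where L_c is the number of edges inside c, d_c the total degree of c,
    and m the number of edges in the graph.
    """
    m = len(edges)
    intra = defaultdict(int)       # edges with both ends in the community
    degree_sum = defaultdict(int)  # total degree of the community
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            intra[community[u]] += 1
    return sum(
        intra[c] / m - resolution * (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


if __name__ == "__main__":
    # Score the planted partition at increasing mixing levels.
    for mu in (0.1, 0.3, 0.5):
        edges, planted = gn_benchmark(mixing=mu)
        print(mu, round(modularity(edges, planted), 3))
```

Under these assumptions, the modularity of the planted partition falls as mixing rises, which is one way to see why community detection becomes harder in highly mixed networks.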
