To choose our “preferred” genome assembly strategy based on data rather than on my gut feeling about the “best assembly”, I decided to run some tests against a known “true” reference: E. coli K-12 MG1655.
jokergoo.github.io - Upset plots are a type of visualization used to analyze the intersection of sets or categories. They are particularly useful for displaying data with multiple categories and analyzing their overlaps.
In an upset plot, each row represents a category...
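As a rough illustration of what an upset plot summarises, the sketch below counts how many elements belong to exactly each combination of sets. The function name `exclusive_intersections` and the toy sets are made up for this example; this is not the plotting library's API, only the counting step that such a plot visualises.

```python
from collections import Counter


def exclusive_intersections(sets):
    """Count elements per exact combination of set memberships.

    `sets` maps a set name to a collection of items. The result maps a
    frozenset of set names to the number of items that occur in exactly
    those sets and in no others.
    """
    membership = {}
    for name, items in sets.items():
        for item in items:
            membership.setdefault(item, set()).add(name)
    return Counter(frozenset(names) for names in membership.values())


# Hypothetical toy data: three overlapping categories.
toy_sets = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {3, 5}}
counts = exclusive_intersections(toy_sets)
for combo, n in sorted(counts.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(combo), n)
```

Each key in `counts` corresponds to one combination of categories in the plot, and the value is the height of its bar.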
biokit.readthedocs.io - BioKit is a set of tools dedicated to bioinformatics, data visualisation (biokit.viz), access to online biological data (e.g. UniProt, NCBI thanks to bioservices). It also contains more advanced tools related to data analysis...
github.com - Perform alignment-free k-tuple frequency comparisons between sequences. Input can be two files (e.g. a reference and a query) or a single file whose sequences are compared pairwise.
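To make the idea of an alignment-free k-tuple comparison concrete, here is a minimal sketch that turns each sequence into a vector of relative k-mer frequencies and compares the vectors with a Euclidean distance. The function names and the choice of distance are assumptions made for illustration; the linked tool may use different measures and options.

```python
import math
from collections import Counter


def kmer_frequencies(seq, k):
    """Return relative frequencies of all overlapping k-mers in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def euclidean_distance(freq_a, freq_b):
    """Euclidean distance between two sparse frequency vectors."""
    keys = set(freq_a) | set(freq_b)
    return math.sqrt(
        sum((freq_a.get(x, 0.0) - freq_b.get(x, 0.0)) ** 2 for x in keys)
    )


# Hypothetical toy sequences standing in for a reference and a query.
reference = "ACGTACGTACGT"
query = "ACGTTCGTACGA"
d = euclidean_distance(kmer_frequencies(reference, 3), kmer_frequencies(query, 3))
print(f"k=3 frequency distance: {d:.4f}")
```

A distance of 0 means the two k-mer profiles are identical; no alignment between the sequences is ever computed.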
github.com - MIKE (MinHash-based k-mer algorithm). This algorithm is designed for the swift calculation of the Jaccard coefficient directly from raw sequencing reads and enables the construction of phylogenetic trees based on the resultant Jaccard...
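The Jaccard coefficient mentioned here can be sketched on k-mer sets, together with a simple bottom-s MinHash estimate of it. This is a generic textbook-style illustration, not MIKE's implementation: the helper names, the SHA-1 based hash and the sketch size are all assumptions chosen for the example.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all overlapping k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Exact Jaccard coefficient |a & b| / |a | b| of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def _hash(kmer):
    # 64-bit integer taken from a SHA-1 digest (an arbitrary choice here).
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def minhash_sketch(kmer_set, size):
    """Bottom-s sketch: the `size` smallest hash values of the set."""
    return sorted(_hash(x) for x in kmer_set)[:size]


def minhash_jaccard(sketch_a, sketch_b, size):
    """Estimate the Jaccard coefficient from two bottom-s sketches."""
    shared = set(sketch_a) & set(sketch_b)
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 1.0
    return sum(1 for h in merged if h in shared) / len(merged)


# Hypothetical toy reads; real data would yield far larger k-mer sets.
a = kmers("ACGTACGGTACCGT", 4)
b = kmers("ACGTACGCTACCGT", 4)
size = 8
print("exact   :", jaccard(a, b))
print("estimate:", minhash_jaccard(minhash_sketch(a, size), minhash_sketch(b, size), size))
```

The sketch keeps only a fixed number of hash values per sample, so the comparison cost does not grow with the number of distinct k-mers; the price is that the result is an estimate rather than the exact coefficient.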
github.com - HiCdat: a fast and easy-to-use Hi-C data analysis tool
HiCdat is easy-to-use and provides solutions starting from aligned reads up to in-depth analyses. Importantly, HiCdat is focussed on the analysis of larger structural features of chromosomes,...
kissplice.prabi.fr - KisSplice is software for analysing RNA-seq data with or without a reference genome. It is an exact local transcriptome assembler that identifies SNPs, indels and alternative splicing events. It can deal with an arbitrary number...
www.r2d3.us - In machine learning, computers apply statistical learning techniques to automatically identify patterns in data. These techniques can be used to make highly accurate predictions.
Using a data set about homes, we will...