Once your research group is ready to make a larger investment and hire a bioinformatician to gain a competitive edge, there are several key traits to seek out in potential candidates. The best bioinformatician are:-
The genome of 130 mammals was sequenced by a large international consortium and the data was analyzed together with 110 existing genomes to allow scientists to identify the important positions in the DNA.
clark.cs.ucr.edu - CLARK, a method based on a supervised sequence classification using discriminative k-mers. Considering two distinct specific classification problems (see the article for details), namely (1) the taxonomic classification of metagenomic reads to...
github.com - Mash is normally distributed as a dependency-free binary for Linux or OSX (see https://github.com/marbl/Mash/releases). This source distribution is intended for other operating systems or for development. Mash requires c++11 to build, which is...
github.com - chromeister: An ultra fast, heuristic approach to detect conserved signals in extremely large pairwise genome comparisons.
USAGE:
-query: sequence A in fasta format
-db: sequence B in fasta format
-out: output matrix
-kmer Integer: k>1...
github.com - Rebaler is a program for conducting reference-based assemblies using long reads. It relies mainly on minimap2 for alignment and Racon for making consensus sequences.
I made Rebaler for bacterial genomes (specifically for the...
github.com - Tool for detecting and cleaning PacBio / Nanopore long reads after whole genome amplification. Check the poster from the Revolutionizing Next-Generation Sequencing (2nd edition) conference in the source...
github.com - NextDenovo is a string graph-based de novo assembler for TGS long reads. It uses a "correct-then-assemble" strategy similar to canu, but requires significantly less computing resources and storages. After assembly, the per-base error rate...