To find repeats in a genome from 2 to 9 length using a Perl script, you can use the RepeatMasker tool with the "--length" option[0]. Here's a step-by-step guide:
Install RepeatMasker: First, you need to install RepeatMasker on your system. You...
http://last.cbrc.jp/ - LAST can:
Handle big sequence data, e.g:
Compare two vertebrate genomes
Align billions of DNA reads to a genome
Indicate the reliability of each aligned column.
Use sequence quality data properly.
Compare DNA to proteins, with...
github.com - Next-gen sequence data such as Illumina HiSeq reads. Data must be sorted into folders by taxon (e.g. species or genus). Paired reads in fastq format must be specified by _R1 and _R2 in the (otherwise identical) filenames. Paired and unpaired reads...
github.com - Perform Alignment-free k-tuple frequency comparisons from sequences. This can be in the form of two input files (e.g. a reference and a query) or a single file for pairwise comparisons to be made.
www2.decipher.codes - DECIPHER is a software toolset that can be used for deciphering and managing biological sequences efficiently using the R programming language. The R package is distributed as platform independent source code under the GPL...
github.com - Gepard (German: "cheetah", Backronym for "GEnome PAir - Rapid Dotter") allows the calculation of dotplots even for large sequences like chromosomes or bacterial genomes. Reference: Krumsiek J, Arnold R, Rattei T. Gepard: A rapid and sensitive tool...
github.com - A JavaScript module for the visualization of genomic sequence graphs. It automatically generates a "tube map"-like visualization of sequence graphs which have been created with vg. (https://github.com/vgteam/vg)
Link to working demo:...
academic.oup.com - With a large number of metagenomic datasets becoming available, eukaryotic metagenomics emerged as a new challenge. The proper classification of eukaryotic nuclear and organellar genomes is an essential step toward a better understanding of...