<?xml version='1.0'?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:atom="http://www.w3.org/2005/Atom" >
<channel>
	<title><![CDATA[BOL: Owner]]></title>
	<link>https://bioinformaticsonline.com/snippets/owner/surabhi?offset=20</link>
	<atom:link href="https://bioinformaticsonline.com/snippets/owner/surabhi?offset=20" rel="self" type="application/rss+xml" />
	<description><![CDATA[]]></description>
	
	<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/41932/sequence-ids-conversion-files</guid>
	<pubDate>Fri, 03 Jul 2020 05:20:28 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/41932/sequence-ids-conversion-files</link>
	<title><![CDATA[Sequence Ids conversion files !]]></title>
	<description><![CDATA[<code>ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/

Name	Size	Date Modified
ARCHIVE/		02/01/2020, 05:30:00
ASN_BINARY/		03/07/2020, 07:49:00
GENE_INFO/		03/07/2020, 07:48:00
0 B	10/02/2012, 05:30:00
15.1 kB	30/06/2020, 23:01:00
expression/		06/03/2017, 05:30:00
2.0 GB	03/07/2020, 07:44:00
61.8 MB	03/07/2020, 07:44:00
21.4 MB	03/07/2020, 07:44:00
45.1 MB	03/07/2020, 07:44:00
864 MB	03/07/2020, 07:45:00
279 kB	03/07/2020, 07:45:00
83.4 MB	03/07/2020, 07:45:00
572 MB	03/07/2020, 07:46:00
715 MB	03/07/2020, 07:47:00
30.2 MB	03/07/2020, 07:47:00
232 MB	03/07/2020, 14:38:00
1.2 kB	06/09/2011, 05:30:00
11.6 kB	16/05/2020, 01:32:00
770 kB	03/07/2020, 14:38:00
special_requests/		18/04/2020, 00:15:00
737 B	09/06/2011, 05:30:00

ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2go.gz
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2accession.gz
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2ensembl.gz
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2pubmed.gz
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene2refseq.gz
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_group.gz
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_history.gz
ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/gene_neighbors.gz</code>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/41931/extract-the-sequence-by-ids</guid>
	<pubDate>Fri, 03 Jul 2020 04:58:39 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/41931/extract-the-sequence-by-ids</link>
	<title><![CDATA[Extract the sequence by IDs !]]></title>
	<description><![CDATA[<code>#Seqtk is a fast, lightweight tool for processing FASTA/FASTQ files. This method works directly on a FASTA or FASTQ file, whether gzip-compressed or uncompressed. If you have a list of identifiers that you would like to extract from a file, run the command as follows:

#Extract sequences with names in file name.list, one sequence name per line:
seqtk subseq input.fasta name.list &gt; output.fasta</code>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/41412/download-with-snakemake</guid>
	<pubDate>Wed, 11 Mar 2020 07:16:44 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/41412/download-with-snakemake</link>
	<title><![CDATA[Download with Snakemake !]]></title>
	<description><![CDATA[<code># list sample names &amp; download URLs.
sample_links = {&quot;ERR458493&quot;: &quot;https://osf.io/5daup/download&quot;,
                &quot;ERR458494&quot;: &quot;https://osf.io/8rvh5/download&quot;,
                &quot;ERR458495&quot;: &quot;https://osf.io/2wvn3/download&quot;,
                &quot;ERR458500&quot;: &quot;https://osf.io/xju4a/download&quot;,
                &quot;ERR458501&quot;: &quot;https://osf.io/nmqe6/download&quot;,
                &quot;ERR458502&quot;: &quot;https://osf.io/qfsze/download&quot;}

# the sample names are the dictionary keys in sample_links; extract them to a list we can use below
SAMPLES = list(sample_links.keys())

# download yeast RNA-seq data from the Schurch et al. (2016) study
rule download_all:
    input:
        expand(&quot;rnaseq/raw_data/{sample}.fq.gz&quot;, sample=SAMPLES)

# rule to download each individual file specified in sample_links
rule download_reads:
    output: &quot;rnaseq/raw_data/{sample}.fq.gz&quot; 
    params:
        # dynamically generate the download link directly from the dictionary
        download_link = lambda wildcards: sample_links[wildcards.sample]
    shell: &quot;&quot;&quot;
        curl -L {params.download_link} -o {output}
        &quot;&quot;&quot;</code>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/34083/loop-over-with-perl</guid>
	<pubDate>Fri, 04 Aug 2017 11:49:44 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/34083/loop-over-with-perl</link>
	<title><![CDATA[Loop over with perl]]></title>
	<description><![CDATA[<code>my @ids =qw (scaffold_4
scaffold_4
scaffold_15
scaffold_40
scaffold_44
scaffold_51
scaffold_54
scaffold_129
scaffold_138
scaffold_138
scaffold_180
scaffold_182
scaffold_184
scaffold_219
scaffold_219
scaffold_267
scaffold_273
scaffold_282
scaffold_282
scaffold_458
scaffold_470
scaffold_480
scaffold_521
scaffold_644);

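foreach my $i (@ids) {
  print &quot;Working on $i\n&quot;;
  # create the directory unless it already exists (the list contains duplicate ids)
  -d $i or mkdir $i or die &quot;Cannot create directory $i: $!\n&quot;;
  system (&quot;./actor.sh $i&quot;) == 0 or warn &quot;actor.sh failed for $i\n&quot;;
  # copy every file with an extension from the current directory into $i
  system (&quot;cp *.* $i&quot;) == 0 or warn &quot;copy failed for $i\n&quot;;
}</code>]]></description>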
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/27956/perl-script-to-extract-fasta-sequence-by-matching-nameids</guid>
	<pubDate>Tue, 21 Jun 2016 09:28:19 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/27956/perl-script-to-extract-fasta-sequence-by-matching-nameids</link>
	<title><![CDATA[Perl script to extract fasta sequence by matching name/ids !!]]></title>
	<description><![CDATA[<code>#!/usr/bin/perl

use strict;
use warnings;
use Text::Trim qw(trim);

# Usage: perl extractSeqbyID.pl ids.txt seq.fasta Result.fasta

$ARGV[2] or die &quot;Usage: perl extractSeqbyID.pl LIST FASTA OUT\n&quot;;

my $list = shift @ARGV;
my $fasta = shift @ARGV;
my $out = shift @ARGV;
my %select;

open LINE, &quot;&lt;&quot;, $list or die &quot;Cannot open $list: $!\n&quot;;
while (&lt;LINE&gt;) {
    chomp;
    next if /^\s*$/;
    s/&gt;//g; 
    my @ids=split (/\t/, $_);
    $select{$ids[0]} = 1;
}
my $size = keys %select;
print &quot;Total Ids $size\n&quot;;
close LINE;

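# read the FASTA file one record at a time: records are separated by a newline followed by &gt;
$/ = &quot;\n&gt;&quot;;
open OUT, &quot;&gt;&quot;, $out or die &quot;Cannot write $out: $!\n&quot;;
open FILE, &quot;&lt;&quot;, $fasta or die &quot;Cannot open $fasta: $!\n&quot;;
while (&lt;FILE&gt;) {
    chomp;      # remove the trailing record separator
    trim($_);   # strip leading/trailing whitespace in place
    s/^&gt;//;  # only the first record still starts with &gt;
    my ($id) = split (/\n/, $_);
    #my @i=split (/\s/, $id); # To avoid &gt;flattened_line_10751 circular cases
    print OUT &quot;&gt;$_\n&quot; if (defined $id and defined $select{$id});
}
close FILE;
close OUT;</code>]]></description>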
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>
<item>
	<guid isPermaLink="true">https://bioinformaticsonline.com/snippets/view/27955/perl-script-to-extract-lines-with-matching-ids</guid>
	<pubDate>Tue, 21 Jun 2016 09:24:46 -0500</pubDate>
	<link>https://bioinformaticsonline.com/snippets/view/27955/perl-script-to-extract-lines-with-matching-ids</link>
	<title><![CDATA[Perl script to extract lines with matching ids !!]]></title>
	<description><![CDATA[<code>#!/usr/bin/perl
use strict;
use warnings;
my %patterns;

#USAGE: perl extractByIds.pl Idsfile1 file2 &gt; Result

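# Open the ID file and load the patterns to search for
open(my $fh2,&quot;&lt;&quot;,&quot;$ARGV[0]&quot;)|| die &quot;ERROR: Could not open ID file: $!&quot;;
while (&lt;$fh2&gt;)
{
   # chomp removes only a trailing newline; chop would cut the last character of an unterminated final line
   chomp;
   $patterns{$_}=1;
}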

# Now read the data file
open(my $fh1,&quot;&lt;&quot;,&quot;$ARGV[1]&quot;)|| die &quot;ERROR: Could not open data file: $!&quot;;
while (&lt;$fh1&gt;)
{
   # You might need to adjust this place according to your file type
   #(undef,$srch,undef)=split;
   # split on tabs and line endings so a single-column line still yields a clean id
   my @ids=split (/[\t\r\n]/, $_);
   print $_ if defined $ids[0] and defined $patterns{$ids[0]};
}
}</code>]]></description>
	<dc:creator>Surabhi Chaudhary</dc:creator>
</item>

</channel>
</rss>