
Re: [Omaha.pm] perl



At a glance it looks like that program puts chr1 and chr2 into a file, chr3 and chr4 into another file, etc.

Is that how you want it done this time? Or did you want a separate file for EACH chromosome this time?
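
If it's one file per chromosome, something along these lines might do it. This is only a rough sketch under a few assumptions: the chromosome name is in the first column, the first line is a header, and fields are separated by whitespace. The sub name split_by_chromosome and the "<chromosome>.txt" output names are made up for illustration, so adjust them to suit.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a whitespace-delimited file into one output file per chromosome.
# Assumptions (not confirmed against the real data): the chromosome name
# is the first column, and the first line is a header that should be
# copied to the top of every output file.
sub split_by_chromosome {
    my ($infile, $outdir) = @_;

    open(my $in, '<', $infile) or die "Can't read $infile: $!";
    my $header = <$in>;
    return {} unless defined $header;    # empty input: nothing to do

    my %fhs;       # chromosome name => output filehandle
    my %counts;    # chromosome name => number of data rows written
    while (my $line = <$in>) {
        next unless $line =~ /\S/;       # skip blank lines
        my ($chr) = split ' ', $line;    # first whitespace-separated field
        unless ($fhs{$chr}) {
            # First time we see this chromosome: open its file.
            my $outfile = "$outdir/$chr.txt";
            open($fhs{$chr}, '>', $outfile)
                or die "Can't open $outfile for write: $!";
            print { $fhs{$chr} } $header;
        }
        print { $fhs{$chr} } $line;
        $counts{$chr}++;
    }
    close $_ for values %fhs;
    close $in;
    return \%counts;
}

# Only do anything when a filename is given: split_chr.pl input.txt [outdir]
if (@ARGV) {
    my ($infile, $outdir) = @ARGV;
    split_by_chromosome($infile, $outdir // '.');
}
```

Because the output files are opened as each chromosome is first seen, there is no table of patterns to maintain, and anything unexpected in the first column simply gets its own file.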

j




On Apr 23, 2012, at 4:33 PM, Klinkebiel, David L wrote:
> I need to separate out one large file into separate files based on the chromosome.
> The original file is in this format.
> 
> Chr   start  End
> Chr1  1273635354 127363570
> Etc. 
> 
> I found this Perl script you developed for me a while back. Would this work?
> 
> #!/usr/bin/perl
> 
> use strict;
> use warnings;
> 
> my $infile = shift;
> usage() unless (defined $infile && -r $infile);
> 
> my %parts = ( 
>   1 => qr/^chr[1-2]$/,
>   2 => qr/^chr[3-4]$/,
>   3 => qr/^chr[5-6]$/,
>   4 => qr/^chr[7-8]$/,
>   5 => qr/^chr(9|10)$/,
>   6 => qr/^chr1[1-2]$/,
>   7 => qr/^chr1[3-4]$/,
>   8 => qr/^chr1[5-6]$/,
>   9 => qr/^chr1[7-8]$/,
> );
> my $outfile = $infile;
> $outfile =~ s/\./_part10./;
> open(my $everything_else, '>', $outfile) or die "Can't open $outfile for write: $!";
> 
> my %fhs = ();
> 
> foreach my $part (keys %parts) {
>   my $outfile = $infile;
>   $outfile =~ s/\./_part$part./;
>   if ($outfile eq $infile) {
>      die "Can't figure out what my output filename should be";
>   }
>   open(my $out, '>', $outfile) or die "Can't open $outfile for write: $!";
>   $fhs{$part} = $out;
> }
> 
> open(my $in, '<', $infile) or die "Can't read $infile: $!";
> while (<$in>) {
>   my @l = split /\t/;
>   my $out;
>   foreach my $part (keys %parts) {
>      if ($l[1] =~ $parts{$part}) {
>         $out = $fhs{$part};
>         last;
>      }
>   }
>   unless ($out) { 
>      $out = $everything_else;
>   }
>   print $out $_;
> }
> 
> exit;
> 
> 
> sub usage {
>   print <<EOT;
> 
> split.pl OPV12345_Annot.txt
> 
>   Splits up the input file based on chromosome.
> 
> EOT
>   exit;
> }