Re: [Omaha.pm] perl
At a glance it looks like that program puts chr1 and chr2 into a file, chr3 and chr4 into another file, etc.
Is that how you want it done this time? Or did you want a separate file for EACH chromosome this time?
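
If it's one file per chromosome, a minimal sketch might look like the following. It assumes the input is tab-separated, has one header line, and carries the chromosome name in the FIRST column, as in the sample format quoted below. The subroutine name split_by_chromosome and the output naming (input filename plus ".<chromosome>.txt") are made up for illustration. Note that the quoted script appears to test the second field ($l[1]) against lowercase "chr", so it may not match that sample as-is.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sketch only: write each chromosome's rows to its own file.
# Assumptions: tab-separated input, one header line, chromosome name in
# the first column. Returns a hashref of chromosome => output filename.
sub split_by_chromosome {
    my ($infile) = @_;

    open(my $in, '<', $infile) or die "Can't read $infile: $!";
    my $header = <$in>;                # assumed header ("Chr start End")
    return {} unless defined $header;  # empty input: nothing to split

    my %fhs;      # chromosome => output filehandle
    my %names;    # chromosome => output filename
    while (my $line = <$in>) {
        my ($chr) = split /\t/, $line;
        next unless defined $chr;
        chomp $chr;                    # in case the row has only one field
        next unless length $chr;       # skip blank rows

        my $out = $fhs{$chr};
        unless ($out) {
            # Hypothetical naming scheme: in.txt -> in.txt.Chr1.txt
            (my $safe = $chr) =~ s/[^\w.-]/_/g;
            my $outfile = "$infile.$safe.txt";
            open($out, '>', $outfile)
                or die "Can't open $outfile for write: $!";
            print $out $header;        # repeat the header in every output
            $fhs{$chr}   = $out;
            $names{$chr} = $outfile;
        }
        print $out $line;
    }

    foreach my $fh (values %fhs) {
        close($fh) or die "Can't close output file: $!";
    }
    close($in);
    return \%names;
}

# Run only when a filename is given on the command line.
split_by_chromosome($ARGV[0]) if @ARGV;
```

Run it as "perl split_by_chr.pl yourfile.txt" (the script name is also just a placeholder). It opens output files lazily, so you only get files for chromosomes that actually occur in the input.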
j
On Apr 23, 2012, at 4:33 PM, Klinkebiel, David L wrote:
> I need to separate out one larger file into separate files based on the chromosome.
> The original file is in this format.
>
> Chr start End
> Chr1 1273635354 127363570
> Etc.
>
> I found this Perl script you developed for me a while back. Would this work?
>
> #!/usr/bin/perl
>
> use strict;
> use warnings;
>
> my $infile = shift;
> usage() unless (defined $infile && -r $infile);
>
> # Output part number => pattern for the chromosomes that go into it.
> my %parts = (
>     1 => qr/^chr[1-2]$/,
>     2 => qr/^chr[3-4]$/,
>     3 => qr/^chr[5-6]$/,
>     4 => qr/^chr[7-8]$/,
>     5 => qr/^chr(9|10)$/,
>     6 => qr/^chr1[1-2]$/,
>     7 => qr/^chr1[3-4]$/,
>     8 => qr/^chr1[5-6]$/,
>     9 => qr/^chr1[7-8]$/,
> );
>
> # Rows matching none of the patterns above go to the part10 file.
> my $outfile = $infile;
> $outfile =~ s/\./_part10./;
> # Check before opening, so a filename without a dot can't clobber the input.
> if ($outfile eq $infile) {
>     die "Can't figure out what my output filename should be";
> }
> open my $everything_else, '>', $outfile or die "Can't open $outfile for write: $!";
>
> my %fhs = ();
>
> # Open one output file per part, e.g. foo.txt -> foo_part1.txt.
> foreach my $part (keys %parts) {
>     my $outfile = $infile;
>     $outfile =~ s/\./_part$part./;
>     if ($outfile eq $infile) {
>         die "Can't figure out what my output filename should be";
>     }
>     open my $out, '>', $outfile or die "Can't open $outfile for write: $!";
>     $fhs{$part} = $out;
> }
>
> open my $in, '<', $infile or die "Can't read $infile: $!";
> while (<$in>) {
>     my @l = split /\t/;
>     my $out;
>     foreach my $part (keys %parts) {
>         # The chromosome is read from the second tab-separated field.
>         if (defined $l[1] && $l[1] =~ $parts{$part}) {
>             $out = $fhs{$part};
>             last;
>         }
>     }
>     unless ($out) {
>         $out = $everything_else;
>     }
>     print $out $_;
> }
>
> exit;
>
>
> sub usage {
>     print <<EOT;
>
> split.pl OPV12345_Annot.txt
>
> Splits up the input file based on chromosome.
>
> EOT
>     exit;
> }