[Omaha.pm] pos() - WHERE did my regex match?
pos() is neat. Rarely do I care WHERE a regex hit a string, but in the
example below I do care, very deeply, WHERE the hits were. Enter pos().
The part of my code that uses pos():
  while ($seqstr =~ /$primer_seq/g) {
    printf(" Found '%s'. Next attempt at character %s\n", $&,
      pos($seqstr)+1);
  }
Yoinked from the "Finding All Matches In a String" section of this website:
http://www.regular-expressions.info/perl.html
That page is actually more helpful than (perldoc -f pos).
I end up Googling for this about once a year. :)
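If you want to poke at pos() without BioPerl or a GenBank file handy, here is a minimal sketch. The find_hits() helper, the sequence, and the primer are all made up for illustration; none of them are in primer_finder.pl. After each successful /g match in scalar context, pos() holds the offset just past the end of the match, so subtracting the primer length and adding 1 gives the 1-based start.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of primer_finder.pl): return the 1-based
# [start, stop] coordinates of every non-overlapping hit of $primer in $seq.
sub find_hits {
    my ($seq, $primer) = @_;
    my @hits;
    # \Q...\E makes the primer match literally, even if it ever contains
    # regex metacharacters.
    while ($seq =~ /\Q$primer\E/g) {
        my $stop  = pos($seq);                    # offset just past the match
        my $start = $stop - length($primer) + 1;  # 1-based first character
        push @hits, [$start, $stop];
    }
    return @hits;
}

# Made-up sequence and primer, just for illustration.
for my $hit (find_hits("AACGTTTACGA", "ACG")) {
    printf("Found it at [%d..%d]\n", @$hit);
}
```

On that toy string this should report hits at [2..4] and [8..10]. Note that /g resumes searching after the end of the previous match, so overlapping hits are not reported.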
Cheers,
j
primer_finder.pl
#!/usr/bin/perl
use strict;
use warnings;
use Bio::SeqIO;
# A hash of all our known primers...
my %primers;
$primers{"18S_F"} = uc("attggagggcaagtctggtg");
$primers{"18S_R"} = uc("ctatgccgactagggatcgg");
$primers{"M1"} = "GGAAGTAAAAGTCGTAACAAGGTT";
$primers{"I1"} = "CCGTAGGTGAACCTGCG";
$primers{"I4"} = "GCATATCAATAAGCGGAGGA";
$primers{"H2R8"} = "CCTCGGATCAGGTAGGGATAC";
$primers{"I2"} = "GCATCGATGAAGAACGCAGC";
$primers{"I3"} = "CGAGTCTTTGAACGCACATTG";
my $io = Bio::SeqIO->new(
  #-file => '/home/dbastola/genbakDownload/161_88107/gbbct24.seq',
  -file   => 'fake_data.gbk',
  -format => 'genbank'
);
while (my $seq = $io->next_seq) {
  # $seq is now a Bio::Seq object
  my $acc = $seq->accession;
  my $seqstr = uc($seq->seq);
  print "Searching $acc...\n";
  foreach my $primer_name (keys %primers) {
    my $primer_seq = $primers{$primer_name};
    print " looking for $primer_name ($primer_seq)...\n";
    while ($seqstr =~ /$primer_seq/g) {
      printf(" Found '%s'. Next attempt at character %s\n", $&,
        pos($seqstr)+1);
      # pos() is the 0-based offset just past the match, which is also
      # the 1-based position of the last matched character.
      my $start = pos($seqstr) - length( $primer_seq ) + 1;
      my $stop = pos($seqstr);
      print " Hey, I found $primer_name at [$start..$stop]\n";
    }
  }
}
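
The start/stop arithmetic above leans on every hit being exactly length($primer_seq) characters long. If a pattern could ever match a variable number of characters, one alternative is Perl's built-in @- and @+ arrays, which record where the last match began and ended. A rough sketch, with a hypothetical hits_via_offsets() helper and a made-up sequence and pattern:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: same 1-based coordinates, read from @- and @+
# instead of being computed from pos() and the primer length.
sub hits_via_offsets {
    my ($seq, $pattern) = @_;
    my @hits;
    while ($seq =~ /$pattern/g) {
        # $-[0] is the 0-based offset where the whole match began;
        # $+[0] is the offset just past its end (the same number pos() returns).
        push @hits, [ $-[0] + 1, $+[0] ];
    }
    return @hits;
}

# Made-up sequence; the N? makes the match length vary (3 or 4 characters).
for my $hit (hits_via_offsets("AACGTTTACNGA", "ACN?G")) {
    printf("Hit at [%d..%d]\n", @$hit);
}
```

On that toy input this should report [2..4] and [8..11]. It also sidesteps $&, which perlvar warns can slow down regex matching on older perls.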