亚洲av成人无遮挡网站在线观看,少妇性bbb搡bbb爽爽爽,亚洲av日韩精品久久久久久,兔费看少妇性l交大片免费,无码少妇一区二区三区
Chinaunix
標(biāo)題:
perl 比對(duì),出不來(lái)結(jié)果,不知如何修改
[打印本頁(yè)]
作者:
1002279625
時(shí)間:
2015-01-26 10:42
標(biāo)題:
perl 比對(duì),出不來(lái)結(jié)果,不知如何修改
<IN1>:>AT1G50920.1 | [1-104] [403-416] | chr1:18870139-18870554 FORWARD LENGTH=118
GGTTGGGTCAGAAAACAGTCGACCGTCATTTGGTTGGGTCGGCTCGCTTCTCTTATCAGCTAACCCTAAGTACGCCGACTTTCAAATTCGC
CACTCCCGTTCATCTTTTCTCTGCATCGATAAGCGAAG
>AT1G15970.1 | [1-374] | chr1:5488495-5488868 REVERSE LENGTH=374
TCGTTTCGTCGTCGATCAGAAAAAGCTTCTTTCTTTCTTTTTTTTTTCTGGGTTTTTCTGAAGTCAAAAGCTTTTTTTTT
CTTCATTGTTCAAAGATTTTATTTTGTTGCTCGCTTGGGTCAACCTTGCTTTTCTCGTGAACACAGCAAGCAACTTGGTT
TAGTTTTACTAGAAGATTCCTAAAAATGAGATATTTAGCGGTGACAAAAAAGTTTCATTTTTGGTGCTAATGTAGCTGTA
AAGTCTAGGCCTTCCGATATATAATCTCTTGAAATAAGTTTCGAGACTGTTGGAAATAGCAATTTTGTTGTTGTTGTGTG
ATCCAGTGATAGATCTGTGAAATCAAAAAGCAGTTTTTTGTGAGAATCGGGTCA
>AT1G73440.1 | [1-55] | chr1:27611363-27611417 FORWARD LENGTH=55
ATTGAAAAGAAAACACATCCCAACTCTGGAGAAACAAATCCCTAATTAGCTAACA
>AT1G75120.1 | [1-146] | chr1:28198657-28198802 REVERSE LENGTH=146
CCGTGTTAAAAAACACGTTGGGTCAAAATCAATGGTGGGTGGGTCTTAACTTGCACTACTAGGATTTGATCACAGAAGAAA
GAAAGGGGTTTATTAATTTGGTTACTGTATCTTACTCTTTTTACTTTCGGATAACTTCAATTGGCA
<IN>:GGTTGGGTC
[ATC]GTTGGGTC
G[ATC]TTGGGTC
GG[ACG]TGGGTC
GGT[ACG]GGGTC
GGTT[ATC]GGTC
GGTTG[ATC]GTC
GGTTGG[ATC]TC
GGTTGGG[ACG]C
GGTTGGGT[AGT]
目的:將<IN>中的每行序列各自比對(duì)到<IN1>中所有基因序列上,將比對(duì)到的輸出,輸出位置信息及相應(yīng)的比對(duì)序列
我的程序:open IN1,"<seq_list.txt" ;
while (<IN1>
{
chomp;
my @aa1=split /\s+/,$_;
my $a=$aa1[0];
nn($a);
}
close IN1;
sub nn{
my @a=$_;
open IN,"<test.fa" ;
open OUT,">>test_out.txt";
my %hash;
my $seq;
$/=">";
while (<IN>
{
chomp;
my @aa=split /\n/;
my $gene=(split /\s+/,shift @aa)[0];
$hash{$gene}=$_;
$seq=join('',@aa[0..$#aa]);
while ($seq =~ m/$a[0]/g)
{
my $pos = pos $seq;
push (@pos,$pos-
;
pos($seq)=$pos-8;
if (pos($seq)){
print OUT "$gene\t", pos $seq, "\n";}
}
}
close IN;
}
出來(lái)的結(jié)果和我想要的不太一樣:AT1G50920.1 1 GGTTGGGTC
AT1G50920.1 32 GGTTGGGTC
實(shí)際上還有結(jié)果沒(méi)有輸出
作者:
1002279625
時(shí)間:
2015-01-26 12:13
額,沒(méi)有人會(huì)嗎
作者:
huang6894
時(shí)間:
2015-01-26 13:14
本帖最后由 huang6894 于 2015-01-26 13:20 編輯
回復(fù)
2#
1002279625
my($fa, $seq) = @ARGV;
my @all_seq;
open IN,"< $seq" or die "$!";
while(<IN>){
chomp;
if(/^\[([ATCG]+)\]([ATCG]+)$/){
map{push @all_seq,$_.$2}(split //,$1);
}elsif(/^([ATCG]+)\[([ATCG]+)\]([ATCG]+)$/){
map{push @all_seq,$1.$_.$3}(split //,$2);
}elsif(/^([ATCG]+)\[([ATCG]+)\]$/){
map{push @all_seq,$1.$_}(split //,$2);
}else{
push @all_seq,$_;
}
}
close IN;
$/ = '>';
open IN1,"< $fa" or die "$!";
<IN1>;
while(<IN1>){
chomp;
my ($key, @seq) = split /\n/;
my ($gene) = $key =~/^([^\|]+)(\s+)?|/;
my $s = join("",@seq);
foreach my $key2(@all_seq){
my ($pos, $now) = (0, -1);
until ( $pos eq -1 ) {
$pos = index( $s, $key2, $now + 1 );
$now = $pos;
print $gene."\t".$pos."\t".$key2."\n" unless $pos < 0;
}
}
}
close IN2;
$/ = "\n";
復(fù)制代碼
結(jié)果:
AT1G50920.1 0 GGTTGGGTC
AT1G50920.1 31 GGTTGGGTC
AT1G15970.1 112 GCTTGGGTC
AT1G75120.1 15 CGTTGGGTC
AT1G75120.1 36 GGGTGGGTC
復(fù)制代碼
如果數(shù)據(jù)量小也就罷了,太大,可以考慮用kmp算法,或者直接blat
作者:
1002279625
時(shí)間:
2015-01-26 13:29
回復(fù)
3#
huang6894
我是seq_list這個(gè)比較長(zhǎng),有116行,考慮到9bp中最多有兩個(gè)錯(cuò)配。用過(guò)blat,但因?yàn)?bp太短,出不來(lái)結(jié)果,所以才自己寫程序比對(duì)。
作者:
1002279625
時(shí)間:
2015-01-26 16:02
回復(fù)
3#
huang6894
我發(fā)現(xiàn),這樣寫,有漏掉的
作者:
chenyuluoyan沉魚落
時(shí)間:
2015-01-26 16:16
回復(fù)
3#
huang6894
留著以后慢慢研究大神程序~
歡迎光臨 Chinaunix (http://www.72891.cn/)
Powered by Discuz! X3.2