FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB0973, 210 aa
  1>>>pF1KB0973 210 - 210 aa - 210 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.0195+/-0.000823; mu= 14.7611+/- 0.049
 mean_var=59.4278+/-12.185, 0's: 0 Z-trim(106.3): 31 B-trim: 53 in 1/49
 Lambda= 0.166372
 statistics sampled from 8857 (8879) to 8857 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.672), E-opt: 0.2 (0.273), width: 16
 Scan time: 1.900

The best scores are:                                 opt  bits E(32554)
CCDS41679.1 GSTP1 gene_id:2950|Hs108|chr11  ( 210) 1403 344.9 2.1e-95
CCDS812.1 GSTM3 gene_id:2947|Hs108|chr1     ( 225)  383 100.1 1.1e-21
CCDS808.1 GSTM2 gene_id:2946|Hs108|chr1     ( 218)  357  93.8 8.4e-20
CCDS807.1 GSTM4 gene_id:2948|Hs108|chr1     ( 218)  345  91.0 6.2e-19
CCDS811.1 GSTM5 gene_id:2949|Hs108|chr1     ( 218)  329  87.1 8.8e-18
CCDS809.1 GSTM1 gene_id:2944|Hs108|chr1     ( 218)  327  86.6 1.2e-17
CCDS44192.1 GSTM2 gene_id:2946|Hs108|chr1   ( 191)  325  86.1 1.5e-17
CCDS806.1 GSTM4 gene_id:2948|Hs108|chr1     ( 195)  315  83.7 8.3e-17
CCDS4947.1 GSTA3 gene_id:2940|Hs108|chr6    ( 222)  296  79.2 2.2e-15
CCDS4945.1 GSTA1 gene_id:2938|Hs108|chr6    ( 222)  293  78.5 3.6e-15
CCDS4944.1 GSTA2 gene_id:2939|Hs108|chr6    ( 222)  289  77.5 7e-15
CCDS4946.1 GSTA5 gene_id:221357|Hs108|chr6  ( 222)  279  75.1 3.7e-14
CCDS4948.1 GSTA4 gene_id:2941|Hs108|chr6    ( 222)  277  74.7 5.1e-14

>>CCDS41679.1 GSTP1 gene_id:2950|Hs108|chr11 (210 aa)
 initn: 1403 init1: 1403 opt: 1403 Z-score: 1826.1 bits: 344.9 E(32554): 2.1e-95
Smith-Waterman score: 1403; 100.0% identity (100.0% similar) in 210 aa overlap (1-210:1-210)

               10        20        30        40        50        60
pF1KB0 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTVETWQEGSLKASCLYGQLPKFQDGD
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS41 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTVETWQEGSLKASCLYGQLPKFQDGD
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB0 LTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYTNYEAGKDDYV
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS41 LTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYTNYEAGKDDYV
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KB0 KALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDAFPLLSAY
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS41 KALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDAFPLLSAY
              130       140       150       160       170       180

              190       200       210
pF1KB0 VGRLSARPKLKAFLASPEYVNLPINGNGKQ
       ::::::::::::::::::::::::::::::
CCDS41 VGRLSARPKLKAFLASPEYVNLPINGNGKQ
              190       200       210

>>CCDS812.1 GSTM3 gene_id:2947|Hs108|chr1 (225 aa)
 initn: 312 init1: 136 opt: 383 Z-score: 502.5 bits: 100.1 E(32554): 1.1e-21
Smith-Waterman score: 383; 32.1% identity (62.3% similar) in 212 aa overlap (8-210:11-218)

10 20 30 40
pF1KB0 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTV--------ETWQEGSLKASCL
:. .:: :.:.:: :..:. : : . ..: .
CCDS81 MSCESSMVLGYWDIRGLAHAIRLLLEFTDTSYEEKRYTCGEAPDYDRSQWLDVKFKLDLD
10 20 30 40 50 60

50 60 70 80 90 100
pF1KB0 YGQLPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIY
. .:: . :: . :::.:::...: .. :. ..: ::.... : :.: . : : :
CCDS81 FPNLPYLLDGKNKITQSNAILRYIARKHNMCGETEEEKIRVDIIENQVMDFRTQLIRLCY
70 80 90 100 110 120

110 120 130 140 150 160
pF1KB0 T-NYEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAP
. ..: : .:.. :::::: : .: : ....:....:.:. :.: .... :
CCDS81 SSDHEKLKPQYLEELPGQLKQFSMFL----GKFSWFAGEKLTFVDFLTYDILDQNRIFDP
130 140 150 160 170

170 180 190 200 210
pF1KB0 GCLDAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
::: :: :.:.. :. : :. :.: : .. ..:::.. :
CCDS81 KCLDEFPNLKAFMCRFEALEKIAAYLQSDQFCKMPINNKMAQWGNKPVC
180 190 200 210 220

>>CCDS808.1 GSTM2 gene_id:2946|Hs108|chr1 (218 aa)
 initn: 313 init1: 143 opt: 357 Z-score: 469.0 bits: 93.8 E(32554): 8.4e-20
Smith-Waterman score: 357; 30.8% identity (63.0% similar) in 211 aa overlap (3-204:2-208)

10 20 30 40 50
pF1KB0 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTV--------ETWQEGSLKASCLYGQ
:.:. :. .:: ..:.:: .:..:. :. : . ..: . . .
CCDS80 MPMTLGYWNIRGLAHSIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN
10 20 30 40 50

60 70 80 90 100 110
pF1KB0 LPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYT-N
:: . :: . :::.:::...: .: :....: :.... : : . .: : .
CCDS80 LPYLIDGTHKITQSNAILRYIARKHNLCGESEKEQIREDILENQFMDSRMQLAKLCYDPD
60 70 80 90 100 110

120 130 140 150 160 170
pF1KB0 YEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCL
.: : .:..::: .:: : :: : . ...::.:.:.:. :.: ..:. :.::
CCDS80 FEKLKPEYLQALPEMLK----LYSQFLGKQPWFLGDKITFVDFIAYDVLERNQVFEPSCL
120 130 140 150 160 170

180 190 200 210
pF1KB0 DAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
:::: :. ...:. . :..:.. : ... :.
CCDS80 DAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFTKMAVWGNK
180 190 200 210

>>CCDS807.1 GSTM4 gene_id:2948|Hs108|chr1 (218 aa)
 initn: 317 init1: 158 opt: 345 Z-score: 453.5 bits: 91.0 E(32554): 6.2e-19
Smith-Waterman score: 345; 29.2% identity (62.7% similar) in 209 aa overlap (5-204:4-208)

10 20 30 40 50
pF1KB0 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTV--------ETWQEGSLKASCLYGQ
:. :. .:: :.:.:: .:..:. :. : . ..: . . .
CCDS80 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN
10 20 30 40 50

60 70 80 90 100 110
pF1KB0 LPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYT-N
:: . :: . :::.:: ...: .: :. ..: ::.... . :. . . :. .
CCDS80 LPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPD
60 70 80 90 100 110

120 130 140 150 160 170
pF1KB0 YEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCL
.: : .:.. :: ... : :: : . ..:::.:.:.:. :.: .:... :.::
CCDS80 FEKLKPEYLEELPTMMQHF----SQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCL
120 130 140 150 160 170

180 190 200 210
pF1KB0 DAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
:::: :. ...:. . :..:.. : ... :.
CCDS80 DAFPNLKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK
180 190 200 210

>>CCDS811.1 GSTM5 gene_id:2949|Hs108|chr1 (218 aa)
 initn: 258 init1: 116 opt: 329 Z-score: 432.7 bits: 87.1 E(32554): 8.8e-18
Smith-Waterman score: 329; 30.0% identity (61.8% similar) in 207 aa overlap (3-200:2-204)

10 20 30 40 50
pF1KB0 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTV--------ETWQEGSLKASCLYGQ
:.:. :. .:: :.:.:: .:. :. :. : . ..: . . .
CCDS81 MPMTLGYWDIRGLAHAIRLLLEYTDSSYVEKKYTLGDAPDYDRSQWLNEKFKLGLDFPN
10 20 30 40 50

60 70 80 90 100 110
pF1KB0 LPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYT-N
:: . :: . :::.:::...: .: :. ..: ::.... : : . . . : : .
CCDS81 LPYLIDGAHKITQSNAILRYIARKHNLCGETEEEKIRVDILENQVMDNHMELVRLCYDPD
60 70 80 90 100 110

120 130 140 150 160 170
pF1KB0 YEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCL
.: : :.. :: .:: : :. : . ...::.:.:.:. :.: ..... : ::
CCDS81 FEKLKPKYLEELPEKLK----LYSEFLGKRPWFAGDKITFVDFLAYDVLDMKRIFEPKCL
120 130 140 150 160 170

180 190 200 210
pF1KB0 DAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
::: :. ...:. . :..:.. : ...
CCDS81 DAFLNLKDFISRFEGLKKISAYMKSSQFLRGLLFGKSATWNSK
180 190 200 210

>>CCDS809.1 GSTM1 gene_id:2944|Hs108|chr1 (218 aa)
 initn: 234 init1: 92 opt: 327 Z-score: 430.1 bits: 86.6 E(32554): 1.2e-17
Smith-Waterman score: 327; 28.4% identity (62.1% similar) in 211 aa overlap (3-204:2-208)

10 20 30 40 50
pF1KB0 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTV--------ETWQEGSLKASCLYGQ
:. . :. .:: :.:.:: .:..:. :. : . ..: . . .
CCDS80 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN
10 20 30 40 50

60 70 80 90 100 110
pF1KB0 LPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYT-N
:: . :: . :::.:: ...: .: :. ..: ::.... . : . . . :. .
CCDS80 LPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICYNPE
60 70 80 90 100 110

120 130 140 150 160 170
pF1KB0 YEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCL
.: : :.. :: .:: : :. : . ...:..:.:.:. . :.: .:... : ::
CCDS80 FEKLKPKYLEELPEKLK----LYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCL
120 130 140 150 160 170

180 190 200 210
pF1KB0 DAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
:::: :. ...:. . :..:.. : ... :.
CCDS80 DAFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK
180 190 200 210

>>CCDS44192.1 GSTM2 gene_id:2946|Hs108|chr1 (191 aa)
 initn: 276 init1: 143 opt: 325 Z-score: 428.4 bits: 86.1 E(32554): 1.5e-17
Smith-Waterman score: 325; 31.9% identity (62.8% similar) in 191 aa overlap (3-184:2-188)

10 20 30 40 50
pF1KB0 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTV--------ETWQEGSLKASCLYGQ
:.:. :. .:: ..:.:: .:..:. :. : . ..: . . .
CCDS44 MPMTLGYWNIRGLAHSIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN
10 20 30 40 50

60 70 80 90 100 110
pF1KB0 LPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYT-N
:: . :: . :::.:::...: .: :....: :.... : : . .: : .
CCDS44 LPYLIDGTHKITQSNAILRYIARKHNLCGESEKEQIREDILENQFMDSRMQLAKLCYDPD
60 70 80 90 100 110

120 130 140 150 160 170
pF1KB0 YEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCL
.: : .:..::: .:: : :: : . ...::.:.:.:. :.: ..:. :.::
CCDS44 FEKLKPEYLQALPEMLK----LYSQFLGKQPWFLGDKITFVDFIAYDVLERNQVFEPSCL
120 130 140 150 160 170

180 190 200 210
pF1KB0 DAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
:::: :. ...:.
CCDS44 DAFPNLKDFISRFEHS
180 190

>>CCDS806.1 GSTM4 gene_id:2948|Hs108|chr1 (195 aa)
 initn: 278 init1: 158 opt: 315 Z-score: 415.3 bits: 83.7 E(32554): 8.3e-17
Smith-Waterman score: 315; 30.2% identity (62.4% similar) in 189 aa overlap (5-184:4-188)

10 20 30 40 50
pF1KB0 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTV--------ETWQEGSLKASCLYGQ
:. :. .:: :.:.:: .:..:. :. : . ..: . . .
CCDS80 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLDFPN
10 20 30 40 50

60 70 80 90 100 110
pF1KB0 LPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYT-N
:: . :: . :::.:: ...: .: :. ..: ::.... . :. . . :. .
CCDS80 LPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCYSPD
60 70 80 90 100 110

120 130 140 150 160 170
pF1KB0 YEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCL
.: : .:.. :: ... : :: : . ..:::.:.:.:. :.: .:... :.::
CCDS80 FEKLKPEYLEELPTMMQHF----SQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCL
120 130 140 150 160 170

180 190 200 210
pF1KB0 DAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
:::: :. ...:.
CCDS80 DAFPNLKDFISRFEVSCGIM
180 190

>>CCDS4947.1 GSTA3 gene_id:2940|Hs108|chr6 (222 aa)
 initn: 279 init1: 130 opt: 296 Z-score: 389.8 bits: 79.2 E(32554): 2.2e-15
Smith-Waterman score: 296; 32.8% identity (58.6% similar) in 198 aa overlap (8-197:9-203)

10 20 30 40 50
pF1KB0 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVV-TVETWQEGSLKASCLYGQLPKFQD
:: ::: .: ::: : ..:. . ..: . .: .. :.: .
CCDS49 MAGKPKLHYFNGRGRMEPIRWLLAAAGVEFEEKFIGSAEDLGKLRNDGSLMFQQVPMVEI
10 20 30 40 50 60

60 70 80 90 100 110
pF1KB0 GDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYTNYEAGKDD
. : :. .:: ... .::::: .: ::.:: ..:. :: ..: :. ::
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYTEGMADLN-EMILLLPLCRPEEKDA
70 80 90 100 110

120 130 140 150 160 170
pF1KB0 YVKALPGQLK----P-FETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDA
. . . : : :: .:... :. ..::...: :: .:..:: : : . ..
CCDS49 KIALIKEKTKSRYFPAFEKVLQSH--GQDYLVGNKLSRADISLVELLYYVEELDSSLISN
120 130 140 150 160 170

180 190 200 210
pF1KB0 FPLLSAYVGRLSARPKLKAFL--ASPEYVNLPINGNGKQ
::::.: :.: : .: :: .::
CCDS49 FPLLKALKTRISNLPTVKKFLQPGSPRKPPADAKALEEARKIFRF
180 190 200 210 220

>>CCDS4945.1 GSTA1 gene_id:2938|Hs108|chr6 (222 aa)
 initn: 284 init1: 132 opt: 293 Z-score: 385.9 bits: 78.5 E(32554): 3.6e-15
Smith-Waterman score: 293; 32.3% identity (60.1% similar) in 198 aa overlap (8-197:9-203)

10 20 30 40 50
pF1KB0 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVV-TVETWQEGSLKASCLYGQLPKFQD
:: .::: . : ::: : ..:. . ..: .. . .. :.: .
CCDS49 MAEKPKLHYFNARGRMESTRWLLAAAGVEFEEKFIKSAEDLDKLRNDGYLMFQQVPMVEI
10 20 30 40 50 60

60 70 80 90 100 110
pF1KB0 GDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIYTNYEAGKDD
. : :. .:: ... .::::: .: ::.:: .:. :: ..: :. . ::
CCDS49 DGMKLVQTRAILNYIASKYNLYGKDIKERALIDMYIEGIADLG-EMILLLPVCPPEEKDA
70 80 90 100 110

120 130 140 150 160 170
pF1KB0 YVKALPGQLK----P-FETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAPGCLDA
. . ..: : :: .:... :. ..::...: :: .:..:: : : . ...
CCDS49 KLALIKEKIKNRYFPAFEKVLKSH--GQDYLVGNKLSRADIHLVELLYYVEELDSSLISS
120 130 140 150 160 170

180 190 200 210
pF1KB0 FPLLSAYVGRLSARPKLKAFL--ASPEYVNLPINGNGKQ
::::.: :.: : .: :: .::
CCDS49 FPLLKALKTRISNLPTVKKFLQPGSPRKPPMDEKSLEEARKIFRF
180 190 200 210 220

210 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Sat Nov 5 17:35:52 2016 done: Sat Nov 5 17:35:52 2016
 Total Scan time: 1.900 Total Display time: 0.000

Function used was FASTA [36.3.4 Apr, 2011]