FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB5375, 225 aa
  1>>>pF1KB5375 225 - 225 aa - 225 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.4181+/-0.000882; mu= 13.3003+/- 0.053
 mean_var=65.6844+/-13.102, 0's: 0 Z-trim(106.0): 20 B-trim: 0 in 0/52
 Lambda= 0.158250
 statistics sampled from 8741 (8759) to 8741 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.651), E-opt: 0.2 (0.269), width: 16
 Scan time: 2.120

The best scores are:                                   opt  bits  E(32554)
CCDS812.1 GSTM3 gene_id:2947|Hs108|chr1       ( 225)  1546 361.6  2.3e-100
CCDS807.1 GSTM4 gene_id:2948|Hs108|chr1       ( 218)  1101 260.0  8.8e-70
CCDS809.1 GSTM1 gene_id:2944|Hs108|chr1       ( 218)  1093 258.2  3.1e-69
CCDS811.1 GSTM5 gene_id:2949|Hs108|chr1       ( 218)  1083 255.9  1.5e-68
CCDS808.1 GSTM2 gene_id:2946|Hs108|chr1       ( 218)  1062 251.1  4.2e-67
CCDS806.1 GSTM4 gene_id:2948|Hs108|chr1       ( 195)   999 236.7  8.1e-63
CCDS44192.1 GSTM2 gene_id:2946|Hs108|chr1     ( 191)   951 225.7  1.6e-59
CCDS810.1 GSTM1 gene_id:2944|Hs108|chr1       ( 181)   792 189.4  1.3e-48
CCDS41679.1 GSTP1 gene_id:2950|Hs108|chr11    ( 210)   372  93.5  1.1e-19

>>CCDS812.1 GSTM3 gene_id:2947|Hs108|chr1 (225 aa)
 initn: 1546 init1: 1546 opt: 1546 Z-score: 1915.2 bits: 361.6 E(32554): 2.3e-100
Smith-Waterman score: 1546; 99.6% identity (99.6% similar) in 225 aa overlap (1-225:1-225)

10 20 30 40 50 60
pF1KB5 MSCESSMVLGYWDIRGLAHAIRLLLEFTDTSYEEKRYTCGEAPDYDRSQWLDVKFKLDLD
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS81 MSCESSMVLGYWDIRGLAHAIRLLLEFTDTSYEEKRYTCGEAPDYDRSQWLDVKFKLDLD
10 20 30 40 50 60

70 80 90 100 110 120
pF1KB5 FPNLPYLLDGKNKITQSNAILRYIARKHNMCGETEEEKIRVDIIENQVMDFRTQLIRLCY
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS81 FPNLPYLLDGKNKITQSNAILRYIARKHNMCGETEEEKIRVDIIENQVMDFRTQLIRLCY
70 80 90 100 110 120

130 140 150 160 170 180
pF1KB5 SSDHEKLKPQYLEELPGQLKQFSMFLGKFSWFAGEKLTFVDFLTYDILDQNRIFDPKCLD
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS81 SSDHEKLKPQYLEELPGQLKQFSMFLGKFSWFAGEKLTFVDFLTYDILDQNRIFDPKCLD
130 140 150 160 170 180

190 200 210 220
pF1KB5 EFSNLKAFMCRFEALEKIAAYLQSDQFCKMPINNKMAQWGNKPVC
:: ::::::::::::::::::::::::::::::::::::::::::
CCDS81 EFPNLKAFMCRFEALEKIAAYLQSDQFCKMPINNKMAQWGNKPVC
190 200 210 220

>>CCDS807.1 GSTM4 gene_id:2948|Hs108|chr1 (218 aa)
 initn: 1101 init1: 1101 opt: 1101 Z-score: 1366.3 bits: 260.0 E(32554): 8.8e-70
Smith-Waterman score: 1101; 71.9% identity (89.4% similar) in 217 aa overlap (6-222:2-218)

10 20 30 40 50 60
pF1KB5 MSCESSMVLGYWDIRGLAHAIRLLLEFTDTSYEEKRYTCGEAPDYDRSQWLDVKFKLDLD
::.::::::::::::::::::.::.:::::.:: :.::::::::::. :::: ::
CCDS80 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLD
10 20 30 40 50

70 80 90 100 110 120
pF1KB5 FPNLPYLLDGKNKITQSNAILRYIARKHNMCGETEEEKIRVDIIENQVMDFRTQLIRLCY
:::::::.:: .::::::::: :::::::.:::::::::::::.:::.:: .:: :.::
CCDS80 FPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCY
60 70 80 90 100 110

130 140 150 160 170 180
pF1KB5 SSDHEKLKPQYLEELPGQLKQFSMFLGKFSWFAGEKLTFVDFLTYDILDQNRIFDPKCLD
: : :::::.:::::: ....::.:::: ::.:.:.::::::.::.:: .:::.:.:::
CCDS80 SPDFEKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLD
120 130 140 150 160 170

190 200 210 220
pF1KB5 EFSNLKAFMCRFEALEKIAAYLQSDQFCKMPINNKMAQWGNKPVC
: ::: :. :::.::::.::..:..: :. ...: ::::
CCDS80 AFPNLKDFISRFEGLEKISAYMKSSRFLPKPLYTRVAVWGNK
180 190 200 210

>>CCDS809.1 GSTM1 gene_id:2944|Hs108|chr1 (218 aa)
 initn: 1093 init1: 1093 opt: 1093 Z-score: 1356.4 bits: 258.2 E(32554): 3.1e-69
Smith-Waterman score: 1093; 72.7% identity (88.0% similar) in 216 aa overlap (7-222:3-218)

10 20 30 40 50 60
pF1KB5 MSCESSMVLGYWDIRGLAHAIRLLLEFTDTSYEEKRYTCGEAPDYDRSQWLDVKFKLDLD
:.::::::::::::::::::.::.:::::.:: :.::::::::::. :::: ::
CCDS80 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLD
10 20 30 40 50

70 80 90 100 110 120
pF1KB5 FPNLPYLLDGKNKITQSNAILRYIARKHNMCGETEEEKIRVDIIENQVMDFRTQLIRLCY
:::::::.:: .::::::::: :::::::.:::::::::::::.:::.:: . :: .::
CCDS80 FPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICY
60 70 80 90 100 110

130 140 150 160 170 180
pF1KB5 SSDHEKLKPQYLEELPGQLKQFSMFLGKFSWFAGEKLTFVDFLTYDILDQNRIFDPKCLD
. . :::::.:::::: .:: .: :::: ::::.:.::::::.::.:: .:::.:::::
CCDS80 NPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNKITFVDFLVYDVLDLHRIFEPKCLD
120 130 140 150 160 170

190 200 210 220
pF1KB5 EFSNLKAFMCRFEALEKIAAYLQSDQFCKMPINNKMAQWGNKPVC
: ::: :. :::.::::.::..:..: :. .::: ::::
CCDS80 AFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFSKMAVWGNK
180 190 200 210

>>CCDS811.1 GSTM5 gene_id:2949|Hs108|chr1 (218 aa)
 initn: 1083 init1: 1055 opt: 1083 Z-score: 1344.1 bits: 255.9 E(32554): 1.5e-68
Smith-Waterman score: 1083; 72.2% identity (88.9% similar) in 216 aa overlap (7-222:3-218)

10 20 30 40 50 60
pF1KB5 MSCESSMVLGYWDIRGLAHAIRLLLEFTDTSYEEKRYTCGEAPDYDRSQWLDVKFKLDLD
:.::::::::::::::::::.::.:: ::.:: :.::::::::::. :::: ::
CCDS81 MPMTLGYWDIRGLAHAIRLLLEYTDSSYVEKKYTLGDAPDYDRSQWLNEKFKLGLD
10 20 30 40 50

70 80 90 100 110 120
pF1KB5 FPNLPYLLDGKNKITQSNAILRYIARKHNMCGETEEEKIRVDIIENQVMDFRTQLIRLCY
:::::::.:: .:::::::::::::::::.:::::::::::::.:::::: . .:.::::
CCDS81 FPNLPYLIDGAHKITQSNAILRYIARKHNLCGETEEEKIRVDILENQVMDNHMELVRLCY
60 70 80 90 100 110

130 140 150 160 170 180
pF1KB5 SSDHEKLKPQYLEELPGQLKQFSMFLGKFSWFAGEKLTFVDFLTYDILDQNRIFDPKCLD
. : :::::.:::::: .:: .: :::: ::::.:.::::::.::.::..:::.:::::
CCDS81 DPDFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGDKITFVDFLAYDVLDMKRIFEPKCLD
120 130 140 150 160 170

190 200 210 220
pF1KB5 EFSNLKAFMCRFEALEKIAAYLQSDQFCKMPINNKMAQWGNKPVC
: ::: :. :::.:.::.::..:.:: . . .: : :..:
CCDS81 AFLNLKDFISRFEGLKKISAYMKSSQFLRGLLFGKSATWNSK
180 190 200 210

>>CCDS808.1 GSTM2 gene_id:2946|Hs108|chr1 (218 aa)
 initn: 1062 init1: 1062 opt: 1062 Z-score: 1318.2 bits: 251.1 E(32554): 4.2e-67
Smith-Waterman score: 1062; 68.5% identity (88.0% similar) in 216 aa overlap (7-222:3-218)

10 20 30 40 50 60
pF1KB5 MSCESSMVLGYWDIRGLAHAIRLLLEFTDTSYEEKRYTCGEAPDYDRSQWLDVKFKLDLD
:.::::.::::::.::::::.::.:::::.:: :.::::::::::. :::: ::
CCDS80 MPMTLGYWNIRGLAHSIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLD
10 20 30 40 50

70 80 90 100 110 120
pF1KB5 FPNLPYLLDGKNKITQSNAILRYIARKHNMCGETEEEKIRVDIIENQVMDFRTQLIRLCY
:::::::.:: .:::::::::::::::::.:::.:.:.:: ::.::: :: : :: .:::
CCDS80 FPNLPYLIDGTHKITQSNAILRYIARKHNLCGESEKEQIREDILENQFMDSRMQLAKLCY
60 70 80 90 100 110

130 140 150 160 170 180
pF1KB5 SSDHEKLKPQYLEELPGQLKQFSMFLGKFSWFAGEKLTFVDFLTYDILDQNRIFDPKCLD
. : :::::.::. :: .:: .:.:::: :: :.:.:::::..::.:..:..:.:.:::
CCDS80 DPDFEKLKPEYLQALPEMLKLYSQFLGKQPWFLGDKITFVDFIAYDVLERNQVFEPSCLD
120 130 140 150 160 170

190 200 210 220
pF1KB5 EFSNLKAFMCRFEALEKIAAYLQSDQFCKMPINNKMAQWGNKPVC
: ::: :. :::.::::.::..:..: :. .::: ::::
CCDS80 AFPNLKDFISRFEGLEKISAYMKSSRFLPRPVFTKMAVWGNK
180 190 200 210

>>CCDS806.1 GSTM4 gene_id:2948|Hs108|chr1 (195 aa)
 initn: 999 init1: 999 opt: 999 Z-score: 1241.2 bits: 236.7 E(32554): 8.1e-63
Smith-Waterman score: 999; 75.5% identity (90.4% similar) in 188 aa overlap (6-193:2-189)

10 20 30 40 50 60
pF1KB5 MSCESSMVLGYWDIRGLAHAIRLLLEFTDTSYEEKRYTCGEAPDYDRSQWLDVKFKLDLD
::.::::::::::::::::::.::.:::::.:: :.::::::::::. :::: ::
CCDS80 MSMTLGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLD
10 20 30 40 50

70 80 90 100 110 120
pF1KB5 FPNLPYLLDGKNKITQSNAILRYIARKHNMCGETEEEKIRVDIIENQVMDFRTQLIRLCY
:::::::.:: .::::::::: :::::::.:::::::::::::.:::.:: .:: :.::
CCDS80 FPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQAMDVSNQLARVCY
60 70 80 90 100 110

130 140 150 160 170 180
pF1KB5 SSDHEKLKPQYLEELPGQLKQFSMFLGKFSWFAGEKLTFVDFLTYDILDQNRIFDPKCLD
: : :::::.:::::: ....::.:::: ::.:.:.::::::.::.:: .:::.:.:::
CCDS80 SPDFEKLKPEYLEELPTMMQHFSQFLGKRPWFVGDKITFVDFLAYDVLDLHRIFEPNCLD
120 130 140 150 160 170

190 200 210 220
pF1KB5 EFSNLKAFMCRFEALEKIAAYLQSDQFCKMPINNKMAQWGNKPVC
: ::: :. :::
CCDS80 AFPNLKDFISRFEVSCGIM
180 190

>>CCDS44192.1 GSTM2 gene_id:2946|Hs108|chr1 (191 aa)
 initn: 951 init1: 951 opt: 951 Z-score: 1182.1 bits: 225.7 E(32554): 1.6e-59
Smith-Waterman score: 951; 70.6% identity (88.8% similar) in 187 aa overlap (7-193:3-189)

10 20 30 40 50 60
pF1KB5 MSCESSMVLGYWDIRGLAHAIRLLLEFTDTSYEEKRYTCGEAPDYDRSQWLDVKFKLDLD
:.::::.::::::.::::::.::.:::::.:: :.::::::::::. :::: ::
CCDS44 MPMTLGYWNIRGLAHSIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLD
10 20 30 40 50

70 80 90 100 110 120
pF1KB5 FPNLPYLLDGKNKITQSNAILRYIARKHNMCGETEEEKIRVDIIENQVMDFRTQLIRLCY
:::::::.:: .:::::::::::::::::.:::.:.:.:: ::.::: :: : :: .:::
CCDS44 FPNLPYLIDGTHKITQSNAILRYIARKHNLCGESEKEQIREDILENQFMDSRMQLAKLCY
60 70 80 90 100 110

130 140 150 160 170 180
pF1KB5 SSDHEKLKPQYLEELPGQLKQFSMFLGKFSWFAGEKLTFVDFLTYDILDQNRIFDPKCLD
. : :::::.::. :: .:: .:.:::: :: :.:.:::::..::.:..:..:.:.:::
CCDS44 DPDFEKLKPEYLQALPEMLKLYSQFLGKQPWFLGDKITFVDFIAYDVLERNQVFEPSCLD
120 130 140 150 160 170

190 200 210 220
pF1KB5 EFSNLKAFMCRFEALEKIAAYLQSDQFCKMPINNKMAQWGNKPVC
: ::: :. :::
CCDS44 AFPNLKDFISRFEHS
180 190

>>CCDS810.1 GSTM1 gene_id:2944|Hs108|chr1 (181 aa)
 initn: 809 init1: 781 opt: 792 Z-score: 986.3 bits: 189.4 E(32554): 1.3e-48
Smith-Waterman score: 820; 60.6% identity (73.1% similar) in 216 aa overlap (7-222:3-181)

10 20 30 40 50 60
pF1KB5 MSCESSMVLGYWDIRGLAHAIRLLLEFTDTSYEEKRYTCGEAPDYDRSQWLDVKFKLDLD
:.::::::::::::::::::.::.:::::.:: :.::::::::::. :::: ::
CCDS81 MPMILGYWDIRGLAHAIRLLLEYTDSSYEEKKYTMGDAPDYDRSQWLNEKFKLGLD
10 20 30 40 50

70 80 90 100 110 120
pF1KB5 FPNLPYLLDGKNKITQSNAILRYIARKHNMCGETEEEKIRVDIIENQVMDFRTQLIRLCY
:::::::.:: .::::::::: :::::::.:::::::::::::.:::.:: . :: .::
CCDS81 FPNLPYLIDGAHKITQSNAILCYIARKHNLCGETEEEKIRVDILENQTMDNHMQLGMICY
60 70 80 90 100 110

130 140 150 160 170 180
pF1KB5 SSDHEKLKPQYLEELPGQLKQFSMFLGKFSWFAGEKLTFVDFLTYDILDQNRIFDPKCLD
. . :::::.:::::: .:: .: :::: ::::.:
CCDS81 NPEFEKLKPKYLEELPEKLKLYSEFLGKRPWFAGNK------------------------
120 130 140 150

190 200 210 220
pF1KB5 EFSNLKAFMCRFEALEKIAAYLQSDQFCKMPINNKMAQWGNKPVC
.::::.::..:..: :. .::: ::::
CCDS81 -------------------------------------GLEKISAYMKSSRFLPRPVFSKMAVWGNK
160 170 180

>>CCDS41679.1 GSTP1 gene_id:2950|Hs108|chr11 (210 aa)
 initn: 301 init1: 136 opt: 372 Z-score: 467.1 bits: 93.5 E(32554): 1.1e-19
Smith-Waterman score: 372; 31.6% identity (61.8% similar) in 212 aa overlap (11-218:8-210)

10 20 30 40 50 60
pF1KB5 MSCESSMVLGYWDIRGLAHAIRLLLEFTDTSYEEKRYTCGEAPDYDRSQWLDVKFKLDLD
:. .:: :.:.:: :..:. : : . ..: .
CCDS41 MPPYTVVYFPVRGRCAALRMLLADQGQSWKEEVVTV--------ETWQEGSLKASCL
10 20 30 40

70 80 90 100 110 120
pF1KB5 FPNLPYLLDGKNKITQSNAILRYIARKHNMCGETEEEKIRVDIIENQVMDFRTQLIRLCY
. .:: . :: . :::.:::...: .. :. ..: ::.... : :.: . : : :
CCDS41 YGQLPKFQDGDLTLYQSNTILRHLGRTLGLYGKDQQEAALVDMVNDGVEDLRCKYISLIY
50 60 70 80 90 100

130 140 150 160 170
pF1KB5 SSDHEKLKPQYLEELPGQLKQFSMFL----GKFSWFAGEKLTFVDFLTYDILDQNRIFDP
. ..: : .:.. :::::: : .: : ....:....:.:. :.: .... :
CCDS41 T-NYEAGKDDYVKALPGQLKPFETLLSQNQGGKTFIVGDQISFADYNLLDLLLIHEVLAP
110 120 130 140 150 160

180 190 200 210 220
pF1KB5 KCLDEFSNLKAFMCRFEALEKIAAYLQSDQFCKMPINNKMAQWGNKPVC
::: : :.:.. :. : :. :.: : .. ..:::.. :
CCDS41 GCLDAFPLLSAYVGRLSARPKLKAFLASPEYVNLPINGNGKQ
170 180 190 200 210

225 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Sat Nov 5 12:32:54 2016 done: Sat Nov 5 12:32:55 2016
 Total Scan time: 2.120 Total Display time: -0.020

Function used was FASTA [36.3.4 Apr, 2011]
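
Note on the bits and E(32554) columns: the reported values appear consistent with E = D * m * n * 2^(-bits), where m is the query length (225), n is the length of the library sequence, and D is the number of library sequences (32554). The Python sketch below is only a sanity check of that relation against two rows of this report. The function name `evalue_from_bits` is hypothetical and not part of the FASTA package, and the formula is an assumption inferred from the numbers shown here, not taken from the program's source.

```python
def evalue_from_bits(bits, query_len, subject_len, n_library_seqs):
    """Approximate E-value from a bit score (assumed relation, see note above)."""
    # Search space for one pairwise comparison, scaled by the library size.
    search_space = n_library_seqs * query_len * subject_len
    return search_space * 2.0 ** (-bits)


# Values copied from the report: query is 225 aa, library has 32554 sequences.
top_hit = evalue_from_bits(361.6, 225, 225, 32554)     # CCDS812.1, reported 2.3e-100
second_hit = evalue_from_bits(260.0, 225, 218, 32554)  # CCDS807.1, reported 8.8e-70

print(f"CCDS812.1: {top_hit:.1e}")
print(f"CCDS807.1: {second_hit:.1e}")
```

Small differences from the listed E-values are expected, because the bit scores in the report are rounded to one decimal place.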