FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB6988, 248 aa
  1>>>pF1KB6988 248 - 248 aa - 248 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 7.7178+/-0.000848; mu= 7.4589+/- 0.052
 mean_var=220.4508+/-43.544, 0's: 0 Z-trim(115.4): 150 B-trim: 7 in 1/53
 Lambda= 0.086381
 statistics sampled from 15782 (15942) to 15782 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.799), E-opt: 0.2 (0.49), width: 16
 Scan time: 2.670

The best scores are:                                   opt bits E(32554)
CCDS7247.1 MBL2 gene_id:4153|Hs108|chr10       ( 248) 1697 223.0 1.5e-58
CCDS7362.1 SFTPD gene_id:6441|Hs108|chr10      ( 375)  562  81.8 7.3e-16
CCDS44445.1 SFTPA1 gene_id:653509|Hs108|chr10  ( 248)  518  76.1 2.5e-14
CCDS44444.2 SFTPA1 gene_id:653509|Hs108|chr10  ( 263)  518  76.1 2.6e-14
CCDS41540.1 SFTPA2 gene_id:729238|Hs108|chr10  ( 248)  513  75.5 3.8e-14

>>CCDS7247.1 MBL2 gene_id:4153|Hs108|chr10 (248 aa)
 initn: 1697 init1: 1697 opt: 1697 Z-score: 1164.8 bits: 223.0 E(32554): 1.5e-58
Smith-Waterman score: 1697; 100.0% identity (100.0% similar) in 248 aa overlap (1-248:1-248)

               10        20        30        40        50        60
pF1KB6 MSLFPSLPLLLLSMVAASYSETVTCEDAQKTCPAVIACSSPGINGFPGKDGRDGTKGEKG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS72 MSLFPSLPLLLLSMVAASYSETVTCEDAQKTCPAVIACSSPGINGFPGKDGRDGTKGEKG
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB6 EPGQGLRGLQGPPGKLGPPGNPGPSGSPGPKGQKGDPGKSPDGDSSLAASERKALQTEMA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS72 EPGQGLRGLQGPPGKLGPPGNPGPSGSPGPKGQKGDPGKSPDGDSSLAASERKALQTEMA
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KB6 RIKKWLTFSLGKQVGNKFFLTNGEIMTFEKVKALCVKFQASVATPRNAAENGAIQNLIKE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS72 RIKKWLTFSLGKQVGNKFFLTNGEIMTFEKVKALCVKFQASVATPRNAAENGAIQNLIKE
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KB6 EAFLGITDEKTEGQFVDLTGNRLTYTNWNEGEPNNAGSDEDCVLLLKNGQWNDVPCSTSH :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS72 EAFLGITDEKTEGQFVDLTGNRLTYTNWNEGEPNNAGSDEDCVLLLKNGQWNDVPCSTSH 190 200 210 220 230 240 pF1KB6 LAVCEFPI :::::::: CCDS72 LAVCEFPI >>CCDS7362.1 SFTPD gene_id:6441|Hs108|chr10 (375 aa) initn: 542 init1: 274 opt: 562 Z-score: 398.2 bits: 81.8 E(32554): 7.3e-16 Smith-Waterman score: 599; 43.6% identity (67.4% similar) in 227 aa overlap (40-246:149-375) 10 20 30 40 50 60 pF1KB6 LLLSMVAASYSETVTCEDAQKTCPAVIACSSPGINGFPGKDGRDGTKGEKGEPGQ----G .::..: : : : :::.: ::. : CCDS73 REGPLGKQGNIGPQGKPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPG 120 130 140 150 160 170 70 80 90 100 110 pF1KB6 LRGLQGPPGKLGPPGNPGPSGSPGPKGQKGDPG-KSPDGDSSL--AASERKA-------- : : : .:: :.:: : :: ::.:: :: :. :.:.: .:: :. CCDS73 NTGAAGSAGAMGPQGSPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQGQV 180 190 200 210 220 230 120 130 140 150 160 170 pF1KB6 --LQTEMARIKKWLTFSLGKQVGNKFFLTNGEIMTFEKVKALCVKFQASVATPRNAAENG ::. ... :: : :..::.:.: : : . : ... ::.. ...:.::.::::. CCDS73 QHLQAAFSQYKKVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENA 240 250 260 270 280 290 180 190 200 210 220 pF1KB6 AIQNLI---KEEAFLGITDEKTEGQFVDLTGNRLTYTNWNEGEPNNAGSDEDCVLLLKNG :.:.:. .: :::..:: ::::.:. ::. :.:.:: ::::. :..:::: .. :: CCDS73 ALQQLVVAKNEAAFLSMTDSKTEGKFTYPTGESLVYSNWAPGEPNDDGGSEDCVEIFTNG 300 310 320 330 340 350 230 240 pF1KB6 QWNDVPCSTSHLAVCEFPI .::: :. ..:.:::: CCDS73 KWNDRACGEKRLVVCEF 360 370 >>CCDS44445.1 SFTPA1 gene_id:653509|Hs108|chr10 (248 aa) initn: 479 init1: 179 opt: 518 Z-score: 370.8 bits: 76.1 E(32554): 2.5e-14 Smith-Waterman score: 535; 37.5% identity (62.2% similar) in 259 aa overlap (8-246:5-248) 10 20 30 40 50 pF1KB6 MSLFPSLPLLL-LSMVAASYSETVTCEDAQKTC---PAVIACSSPGINGFPGKDGRDGTK :: : : ..::: ..:: .. .: :.. . .:: .:.::.::::: : CCDS44 MWLCPLALNLILMAAS---GAVCE-VKDVCVGSPGIPG--TPGSHGLPGRDGRDGLK 10 20 30 40 50 60 70 80 90 100 110 pF1KB6 GEKGEPGQGLRGLQGPPGKLG-PPGN---PGPSGSPGPKGQKGDPG-KSPDGDSSLAASE :. 
: :: .::::.. :::: :: : :: :.::.:: ..: : : : CCDS44 GDPGPPGP-----MGPPGEMPCPPGNDGLPGAPGIPGECGEKGEPGERGPPG---LPAHL 60 70 80 90 100 120 130 140 150 160 pF1KB6 RKALQTEMARIKKWLTFSLGK--------QVGNKFFLTNGEIMTFEKVKALCVKFQASVA . ::. . ... . . : ::.: : .::. .::. .. :.. . .: CCDS44 DEELQATLHDFRHQILQTRGALSLQGSIMTVGEKVFSSNGQSITFDAIQEACARAGGRIA 110 120 130 140 150 160 170 180 190 200 210 220 pF1KB6 TPRNAAENGAIQNLIKEE---AFLGITDEKTEGQFVDLTGNRLTYTNWNEGEPNNAGSDE .::: :: :: ...:. :..:.:. . :.: :. ..:::: .::: . :. : CCDS44 VPRNPEENEAIASFVKKYNTYAYVGLTEGPSPGDFRYSDGTPVNYTNWYRGEPAGRGK-E 170 180 190 200 210 220 230 240 pF1KB6 DCVLLLKNGQWNDVPCSTSHLAVCEFPI .:: . .::::: : :.:..::: CCDS44 QCVEMYTDGQWNDRNCLYSRLTICEF 230 240 >>CCDS44444.2 SFTPA1 gene_id:653509|Hs108|chr10 (263 aa) initn: 545 init1: 179 opt: 518 Z-score: 370.5 bits: 76.1 E(32554): 2.6e-14 Smith-Waterman score: 535; 37.5% identity (62.2% similar) in 259 aa overlap (8-246:20-263) 10 20 30 40 pF1KB6 MSLFPSLPLLL-LSMVAASYSETVTCEDAQKTC---PAVIACSSPGIN :: : : ..::: ..:: .. .: :.. . .:: . CCDS44 MRPCQVPGAATGPRAMWLCPLALNLILMAAS---GAVCE-VKDVCVGSPGIPG--TPGSH 10 20 30 40 50 50 60 70 80 90 pF1KB6 GFPGKDGRDGTKGEKGEPGQGLRGLQGPPGKLG-PPGN---PGPSGSPGPKGQKGDPG-K :.::.::::: ::. : :: .::::.. :::: :: : :: :.::.:: . CCDS44 GLPGRDGRDGLKGDPGPPGP-----MGPPGEMPCPPGNDGLPGAPGIPGECGEKGEPGER 60 70 80 90 100 100 110 120 130 140 150 pF1KB6 SPDGDSSLAASERKALQTEMARIKKWLTFSLGK--------QVGNKFFLTNGEIMTFEKV .: : : : . ::. . ... . . : ::.: : .::. .::. . CCDS44 GPPG---LPAHLDEELQATLHDFRHQILQTRGALSLQGSIMTVGEKVFSSNGQSITFDAI 110 120 130 140 150 160 160 170 180 190 200 pF1KB6 KALCVKFQASVATPRNAAENGAIQNLIKEE---AFLGITDEKTEGQFVDLTGNRLTYTNW . :.. . .:.::: :: :: ...:. :..:.:. . :.: :. ..:::: CCDS44 QEACARAGGRIAVPRNPEENEAIASFVKKYNTYAYVGLTEGPSPGDFRYSDGTPVNYTNW 170 180 190 200 210 220 210 220 230 240 pF1KB6 NEGEPNNAGSDEDCVLLLKNGQWNDVPCSTSHLAVCEFPI .::: . :. :.:: . 
.::::: : :.:..::: CCDS44 YRGEPAGRGK-EQCVEMYTDGQWNDRNCLYSRLTICEF 230 240 250 260 >>CCDS41540.1 SFTPA2 gene_id:729238|Hs108|chr10 (248 aa) initn: 481 init1: 179 opt: 513 Z-score: 367.4 bits: 75.5 E(32554): 3.8e-14 Smith-Waterman score: 530; 35.7% identity (61.2% similar) in 255 aa overlap (8-246:5-248) 10 20 30 40 50 60 pF1KB6 MSLFPSLPLLLLSMVAASYSETVTCEDAQKTCPAVIACSSPGINGFPGKDGRDGTKGEKG :: : .. :. . . .:. :.. . .:: .:.::.:::::.::. : CCDS41 MWLCPLALTLILMAASGAACEVKDVCVGSPGIPG--TPGSHGLPGRDGRDGVKGDPG 10 20 30 40 50 70 80 90 100 110 pF1KB6 EPGQGLRGLQGPPGKLG-PPGN---PGPSGSPGPKGQKGDPG-KSPDGDSSLAASERKAL :: .::::. :::: :: : :: .:.::. : ..: : : : . : CCDS41 PPGP-----MGPPGETPCPPGNNGLPGAPGVPGERGEKGEAGERGPPG---LPAHLDEEL 60 70 80 90 100 120 130 140 150 160 pF1KB6 QTEMARIKKWLTFSLGK--------QVGNKFFLTNGEIMTFEKVKALCVKFQASVATPRN :. . ... . . : ::.: : .::. .::. .. :.. . .:.::: CCDS41 QATLHDFRHQILQTRGALSLQGSIMTVGEKVFSSNGQSITFDAIQEACARAGGRIAVPRN 110 120 130 140 150 160 170 180 190 200 210 220 pF1KB6 AAENGAIQNLIKEE---AFLGITDEKTEGQFVDLTGNRLTYTNWNEGEPNNAGSDEDCVL :: :: ...:. :..:.:. . :.: :. ..:::: .::: . :. :.:: CCDS41 PEENEAIASFVKKYNTYAYVGLTEGPSPGDFRYSDGTPVNYTNWYRGEPAGRGK-EQCVE 170 180 190 200 210 220 230 240 pF1KB6 LLKNGQWNDVPCSTSHLAVCEFPI . .::::: : :.:..::: CCDS41 MYTDGQWNDRNCLYSRLTICEF 230 240 248 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Sat Nov 5 17:46:12 2016 done: Sat Nov 5 17:46:12 2016 Total Scan time: 2.670 Total Display time: -0.020 Function used was FASTA [36.3.4 Apr, 2011]