FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB6569, 375 aa
  1>>>pF1KB6569 375 - 375 aa - 375 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 10.0427+/-0.00116; mu= 1.0036+/- 0.071
 mean_var=476.4702+/-96.721, 0's: 0 Z-trim(115.2): 141 B-trim: 0 in 0/54
 Lambda= 0.058757
 statistics sampled from 15572 (15712) to 15572 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.771), E-opt: 0.2 (0.483), width: 16
 Scan time: 3.340

The best scores are:                                   opt  bits E(32554)
CCDS7362.1 SFTPD gene_id:6441|Hs108|chr10     ( 375) 2603 234.5 1.2e-61
CCDS11561.1 COL1A1 gene_id:1277|Hs108|chr17   (1464)  660  70.6 9.9e-12
CCDS2297.1 COL3A1 gene_id:1281|Hs108|chr2     (1466)  638  68.7 3.6e-11
CCDS4436.1 COL23A1 gene_id:91522|Hs108|chr5   ( 540)  625  67.1 4.3e-11
CCDS12222.1 COL5A3 gene_id:50509|Hs108|chr19  (1745)  633  68.4 5.4e-11
CCDS75932.1 COL5A1 gene_id:1289|Hs108|chr9    (1838)  630  68.2 6.6e-11
CCDS6982.1 COL5A1 gene_id:1289|Hs108|chr9     (1838)  630  68.2 6.6e-11

>>CCDS7362.1 SFTPD gene_id:6441|Hs108|chr10 (375 aa)
 initn: 2603 init1: 2603 opt: 2603 Z-score: 1220.4 bits: 234.5 E(32554): 1.2e-61
Smith-Waterman score: 2603; 100.0% identity (100.0% similar) in 375 aa overlap (1-375:1-375)

               10        20        30        40        50        60
pF1KB6 MLLFLLSALVLLTQPLGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPR
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS73 MLLFLLSALVLLTQPLGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPR
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB6 GEKGDPGLPGAAGQAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGRE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS73 GEKGDPGLPGAAGQAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGRE
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KB6 GPLGKQGNIGPQGKPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNT
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS73 GPLGKQGNIGPQGKPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNT 130 140 150 160 170 180 190 200 210 220 230 240 pF1KB6 GAAGSAGAMGPQGSPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQGQVQH :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS73 GAAGSAGAMGPQGSPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQGQVQH 190 200 210 220 230 240 250 260 270 280 290 300 pF1KB6 LQAAFSQYKKVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAAL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS73 LQAAFSQYKKVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAAL 250 260 270 280 290 300 310 320 330 340 350 360 pF1KB6 QQLVVAKNEAAFLSMTDSKTEGKFTYPTGESLVYSNWAPGEPNDDGGSEDCVEIFTNGKW :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS73 QQLVVAKNEAAFLSMTDSKTEGKFTYPTGESLVYSNWAPGEPNDDGGSEDCVEIFTNGKW 310 320 330 340 350 360 370 pF1KB6 NDRACGEKRLVVCEF ::::::::::::::: CCDS73 NDRACGEKRLVVCEF 370 >>CCDS11561.1 COL1A1 gene_id:1277|Hs108|chr17 (1464 aa) initn: 2177 init1: 618 opt: 660 Z-score: 324.0 bits: 70.6 E(32554): 9.9e-12 Smith-Waterman score: 660; 51.3% identity (60.3% similar) in 199 aa overlap (44-236:531-728) 20 30 40 50 60 70 pF1KB6 QPLGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPRGEKGDPGLPGAAG :.:::: : : : : : : :: :: CCDS11 VAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPDGKTGPPGPAG 510 520 530 540 550 560 80 90 100 110 120 130 pF1KB6 QAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGREGPLGKQGNIGPQG : : :: :: : .:. : .: ::::: .: : : ::::: : :: ::.:. : :: CCDS11 QDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGVPGPPGAVGPAGKDGEAGAQG 570 580 590 600 610 620 140 150 160 170 180 190 pF1KB6 KPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNTGAAGSAGAMGPQG ::: : :: .:: : : : : : ::: :: : :::.::::. :: : .:: : .: CCDS11 PPGPAGPAGERGEQGPAGSPGFQGLPGPAGPPGEAGKPGEQGVPGDLGAPGPSGARGERG 630 640 650 660 670 680 200 210 220 230 240 pF1KB6 SPGARG---PPGLKGDKGI---PGDKGAKGESGLPDVASLRQQVEALQGQVQHLQAAFSQ :: :: ::: : .: ::. ::::..: : : : . 
.::: CCDS11 FPGERGVQGPPGPAGPRGANGAPGNDGAKGDAGAPG-APGSQGAPGLQGMPGERGAAGLP 690 700 710 720 730 250 260 270 280 290 300 pF1KB6 YKKVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAALQQLVVAK CCDS11 GPKGDRGDAGPKGADGSPGKDGVRGLTGPIGPPGPAGAPGDKGESGPSGPAGPTGARGAP 740 750 760 770 780 790 >>CCDS2297.1 COL3A1 gene_id:1281|Hs108|chr2 (1466 aa) initn: 3316 init1: 629 opt: 638 Z-score: 314.0 bits: 68.7 E(32554): 3.6e-11 Smith-Waterman score: 638; 49.7% identity (62.3% similar) in 183 aa overlap (46-222:531-713) 20 30 40 50 60 70 pF1KB6 LGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPRGEKGDPGLPGAAGQA : :: : : : : : :: ::. :.. CCDS22 GEKGPAGERGAPGPAGPRGAAGEPGRDGVPGGPGMRGMPGSPGGPGSDGKPGPPGSQGES 510 520 530 540 550 560 80 90 100 110 120 130 pF1KB6 GMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGREGPLGKQGNIGPQGKP : :: :: ::.:. : .: :::::. : : : : :: : .:: ::.:. :::: : CCDS22 GRPGPPGPSGPRGQPGVMGFPGPKGNDGAPGKNGERGGPGGPGPQGPPGKNGETGPQGPP 570 580 590 600 610 620 140 150 160 170 180 190 pF1KB6 GPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNTGAAGSAGAMGPQGSP :: : .: ::..: :: :: : : .:: :: : ::: : :..:: :. :. : :.: CCDS22 GPTGPGGDKGDTGPPGPQGLQGLPGTGGPPGENGKPGEPGPKGDAGAPGAPGGKGDAGAP 630 640 650 660 670 680 200 210 220 230 240 pF1KB6 GARGPPGLKGDKGI------PGDKGAKGESGLPDVASLRQQVEALQGQVQHLQAAFSQYK : :::::: : :. :: .:.:: .: : CCDS22 GERGPPGLAGAPGLRGGAGPPGPEGGKGAAGPPGPPGAAGTPGLQGMPGERGGLGSPGPK 690 700 710 720 730 740 250 260 270 280 290 300 pF1KB6 KVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAALQQLVVAKNE CCDS22 GDKGEPGGPGADGVPGKDGPRGPTGPIGPPGPAGQPGDKGEGGAPGLPGIAGPRGSPGER 750 760 770 780 790 800 >>CCDS4436.1 COL23A1 gene_id:91522|Hs108|chr5 (540 aa) initn: 1058 init1: 581 opt: 625 Z-score: 312.6 bits: 67.1 E(32554): 4.3e-11 Smith-Waterman score: 625; 46.6% identity (63.2% similar) in 193 aa overlap (44-236:134-321) 20 30 40 50 60 70 pF1KB6 QPLGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPRGEKGDPGLPGAAG . 
: ::..:::: :: : : ::::: : CCDS44 AKIRTAREAPSECVCPPGPPGRRGKPGRRGDPGPPGQSGRDGYPGPLGLDGKPGLPGPKG 110 120 130 140 150 160 80 90 100 110 120 130 pF1KB6 QAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGREGPLGKQGNIGPQG . : ::. :: : .:..:..: ::: : : :::: : :: : .:: : .:. : .: CCDS44 EKGAPGDFGPRGDQGQDGAAGPPGPPGPPGARGPPGDTGKDGPRGAQGPAGPKGEPGQDG 170 180 190 200 210 220 140 150 160 170 180 190 pF1KB6 KPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNTGAAGSAGAMGPQG . :::: ::::: :.:: .:. :. . :: : .: :: : : : : :: ::.: CCDS44 EMGPKGPPGPKGEPGVPGKKGDDGTPSQPGPPGPKGEPGSMG-P--RGENGVDGAPGPKG 230 240 250 260 270 280 200 210 220 230 240 250 pF1KB6 SPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQGQVQHLQAAFSQYKKVEL :: :: : : .: :: :: .:.. . : . . ..::.: CCDS44 EPGHRGTDGAAGPRGAPGLKGEQGDTVVIDYDG--RILDALKGPPGPQGPPGPPGIPGAK 290 300 310 320 330 260 270 280 290 300 310 pF1KB6 FPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAALQQLVVAKNEAAFL CCDS44 GELGLPGAPGIDGEKGPKGQKGDPGEPGPAGLKGEAGEMGLSGLPGADGLKGEKGESASD 340 350 360 370 380 390 >>CCDS12222.1 COL5A3 gene_id:50509|Hs108|chr19 (1745 aa) initn: 616 init1: 616 opt: 633 Z-score: 310.9 bits: 68.4 E(32554): 5.4e-11 Smith-Waterman score: 633; 48.0% identity (61.6% similar) in 198 aa overlap (46-237:530-726) 20 30 40 50 60 pF1KB6 LGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPRGEKGD------PGLP : :: :: : : : ::: :::: CCDS12 GLKGEEGAEGPQGPRGLQGPHGPPGRVGKMGRPGADGARGLPGDTGPKGDRGFDGLPGLP 500 510 520 530 540 550 70 80 90 100 110 120 pF1KB6 GAAGQAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGREGPLGKQGNI : :: : :..: :: :..: : :: : :: .: ::: :. :: : :: :. : CCDS12 GEKGQRGDFGHVGQPGPPGEDGERGAEGPPGPTGQAGEPGPRGLLGPRGSPGPTGRPGVT 560 570 580 590 600 610 130 140 150 160 170 180 pF1KB6 GPQGKPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNTGAAGSAGAM : .: :: ::..:: :: : ::.::. :..:: ::.: :.:::.: ::: : : :. 
CCDS12 GIDGAPGAKGNVGPPGEPGPPGQQGNHGSQGLPGPQGLIGTPGEKGPPGNPGIPGLPGSD 620 630 640 650 660 670 190 200 210 220 230 240 pF1KB6 GPQGSPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQGQVQHLQAAFSQYK :: : :: .:: : :: .: ::. : : : : .. . ..:::. CCDS12 GPLGHPGHEGPTGEKGAQGPPGSAGPPGYPG-PRGVKGTSGNRGLQGEKGEKGEDGFPGF 680 690 700 710 720 730 250 260 270 280 290 300 pF1KB6 KVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAALQQLVVAKNE CCDS12 KGDVGLKGDQGKPGAPGPRGEDGPEGPKGQAGQAGEEGPPGSAGEKGKLGVPGLPGYPGR 740 750 760 770 780 790 >>CCDS75932.1 COL5A1 gene_id:1289|Hs108|chr9 (1838 aa) initn: 622 init1: 622 opt: 630 Z-score: 309.3 bits: 68.2 E(32554): 6.6e-11 Smith-Waterman score: 634; 47.5% identity (61.3% similar) in 204 aa overlap (45-236:603-805) 20 30 40 50 60 pF1KB6 PLGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPRG------EKGD--- .: ::: :: : .: :: ::: CCDS75 VGPPGSGGLKGEPGDVGPQGPRGVQGPPGPAGKPGRRGRAGSDGARGMPGQTGPKGDRGF 580 590 600 610 620 630 70 80 90 100 110 120 pF1KB6 ---PGLPGAAGQAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGREGP :::: :. : :: .:: :: ::.: :. : : : : ::: :. :: : :: CCDS75 DGLAGLPGEKGHRGDPGPSGPPGPPGDDGERGDDGEVGPRGLPGEPGPRGLLGPKGPPGP 640 650 660 670 680 690 130 140 150 160 170 180 pF1KB6 LGKQGNIGPQGKPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNTGA : : : .:.:::::..::.:: : ::.::. ::.:: ::.: : :::.: :. : CCDS75 PGPPGVTGMDGQPGPKGNVGPQGEPGPPGQQGNPGAQGLPGPQGAIGPPGEKGPLGKPGL 700 710 720 730 740 750 190 200 210 220 230 240 pF1KB6 AGSAGAMGPQGSPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQGQVQHLQ : :: :: : :: .:::: :: .: :: .: : : : .. . 
...:.: CCDS75 PGMPGADGPPGHPGKEGPPGEKGGQGPPGPQGPIGYPG-PRGVKGADGIRGLKGTKGEKG 760 770 780 790 800 810 250 260 270 280 290 300 pF1KB6 AAFSQYKKVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAALQQ CCDS75 EDGFPGFKGDMGIKGDRGEIGPPGPRGEDGPEGPKGRGGPNGDPGPLGPPGEKGKLGVPG 820 830 840 850 860 870 >>CCDS6982.1 COL5A1 gene_id:1289|Hs108|chr9 (1838 aa) initn: 622 init1: 622 opt: 630 Z-score: 309.3 bits: 68.2 E(32554): 6.6e-11 Smith-Waterman score: 634; 47.5% identity (61.3% similar) in 204 aa overlap (45-236:603-805) 20 30 40 50 60 pF1KB6 PLGYLEAEMKTYSHRTMPSACTLVMCSSVESGLPGRDGRDGREGPRG------EKGD--- .: ::: :: : .: :: ::: CCDS69 VGPPGSGGLKGEPGDVGPQGPRGVQGPPGPAGKPGRRGRAGSDGARGMPGQTGPKGDRGF 580 590 600 610 620 630 70 80 90 100 110 120 pF1KB6 ---PGLPGAAGQAGMPGQAGPVGPKGDNGSVGEPGPKGDTGPSGPPGPPGVPGPAGREGP :::: :. : :: .:: :: ::.: :. : : : : ::: :. :: : :: CCDS69 DGLAGLPGEKGHRGDPGPSGPPGPPGDDGERGDDGEVGPRGLPGEPGPRGLLGPKGPPGP 640 650 660 670 680 690 130 140 150 160 170 180 pF1KB6 LGKQGNIGPQGKPGPKGEAGPKGEVGAPGMQGSAGARGLAGPKGERGVPGERGVPGNTGA : : : .:.:::::..::.:: : ::.::. ::.:: ::.: : :::.: :. : CCDS69 PGPPGVTGMDGQPGPKGNVGPQGEPGPPGQQGNPGAQGLPGPQGAIGPPGEKGPLGKPGL 700 710 720 730 740 750 190 200 210 220 230 240 pF1KB6 AGSAGAMGPQGSPGARGPPGLKGDKGIPGDKGAKGESGLPDVASLRQQVEALQGQVQHLQ : :: :: : :: .:::: :: .: :: .: : : : .. . ...:.: CCDS69 PGMPGADGPPGHPGKEGPPGEKGGQGPPGPQGPIGYPG-PRGVKGADGIRGLKGTKGEKG 760 770 780 790 800 810 250 260 270 280 290 300 pF1KB6 AAFSQYKKVELFPNGQSVGEKIFKTAGFVKPFTEAQLLCTQAGGQLASPRSAAENAALQQ CCDS69 EDGFPGFKGDMGIKGDRGEIGPPGPRGEDGPEGPKGRGGPNGDPGPLGPPGEKGKLGVPG 820 830 840 850 860 870 375 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Fri Nov 4 15:47:21 2016 done: Fri Nov 4 15:47:22 2016 Total Scan time: 3.340 Total Display time: 0.010 Function used was FASTA [36.3.4 Apr, 2011]