FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB4786, 1464 aa
  1>>>pF1KB4786 1464 - 1464 aa - 1464 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 11.4249+/-0.00156; mu= 6.1784+/- 0.095
 mean_var=790.9724+/-166.587, 0's: 0 Z-trim(114.3): 285 B-trim: 573 in 1/53
 Lambda= 0.045603
 statistics sampled from 14598 (14872) to 14598 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.686), E-opt: 0.2 (0.457), width: 16
 Scan time: 7.320

The best scores are: opt bits E(32554)
CCDS11561.1 COL1A1 gene_id:1277|Hs108|chr17 (1464) 10904 734.5 5.5e-211
CCDS41778.1 COL2A1 gene_id:1280|Hs108|chr12 (1487) 7630 519.1 3.8e-146
CCDS8759.1 COL2A1 gene_id:1280|Hs108|chr12 (1418) 7626 518.8 4.5e-146
CCDS2297.1 COL3A1 gene_id:1281|Hs108|chr2 (1466) 6612 452.1 5.5e-126
CCDS33350.1 COL5A2 gene_id:1290|Hs108|chr2 (1499) 6194 424.6 1.1e-117
CCDS34682.1 COL1A2 gene_id:1278|Hs108|chr7 (1366) 5735 394.4 1.2e-108
CCDS75932.1 COL5A1 gene_id:1289|Hs108|chr9 (1838) 4366 304.5 1.9e-81
CCDS6982.1 COL5A1 gene_id:1289|Hs108|chr9 (1838) 4365 304.4 2e-81
CCDS780.2 COL11A1 gene_id:1301|Hs108|chr1 (1690) 4335 302.4 7.4e-81
CCDS53348.1 COL11A1 gene_id:1301|Hs108|chr1 (1767) 4335 302.4 7.6e-81
CCDS778.1 COL11A1 gene_id:1301|Hs108|chr1 (1806) 4335 302.5 7.7e-81
CCDS43452.1 COL11A2 gene_id:1302|Hs108|chr6 (1650) 3873 272.0 1e-71
CCDS12222.1 COL5A3 gene_id:50509|Hs108|chr19 (1745) 3569 252.0 1.1e-65
CCDS9511.1 COL4A1 gene_id:1282|Hs108|chr13 (1669) 3276 232.7 6.9e-60
CCDS6376.1 COL22A1 gene_id:169044|Hs108|chr8 (1626) 3064 218.8 1.1e-55
CCDS14543.1 COL4A5 gene_id:1287|Hs108|chrX (1685) 3037 217.0 3.8e-55
CCDS42829.1 COL4A3 gene_id:1285|Hs108|chr2 (1670) 3016 215.6 9.8e-55
CCDS41297.1 COL16A1 gene_id:1307|Hs108|chr1 (1604) 2837 203.8 3.4e-51
CCDS41907.1 COL4A2 gene_id:1284|Hs108|chr13 (1712) 2756 198.5 1.4e-49
CCDS42828.1 COL4A4 gene_id:1286|Hs108|chr2 (1690) 2746 197.9 2.2e-49
CCDS76008.1 COL4A6 gene_id:1288|Hs108|chrX (1633) 2674 193.1 5.7e-48
CCDS76009.1 COL4A6 gene_id:1288|Hs108|chrX (1666) 2674 193.1 5.8e-48
CCDS14542.1 COL4A6 gene_id:1288|Hs108|chrX (1690) 2674 193.1 5.8e-48
CCDS14541.1 COL4A6 gene_id:1288|Hs108|chrX (1691) 2674 193.1 5.8e-48
CCDS41353.1 COL24A1 gene_id:255631|Hs108|chr1 (1714) 2531 183.7 4e-45
CCDS6802.1 COL27A1 gene_id:85301|Hs108|chr9 (1860) 2519 183.0 7.2e-45
CCDS35366.1 COL4A5 gene_id:1287|Hs108|chrX (1691) 2489 181.0 2.7e-44
CCDS2773.1 COL7A1 gene_id:1294|Hs108|chr3 (2944) 2227 164.1 5.6e-39
CCDS42971.1 COL18A1 gene_id:80781|Hs108|chr21 (1339) 1919 143.3 4.6e-33
CCDS42972.1 COL18A1 gene_id:80781|Hs108|chr21 (1519) 1919 143.4 5e-33
CCDS47447.1 COL9A1 gene_id:1297|Hs108|chr6 ( 678) 1854 138.6 6.3e-32
CCDS4971.1 COL9A1 gene_id:1297|Hs108|chr6 ( 921) 1854 138.8 7.4e-32
CCDS77643.1 COL18A1 gene_id:80781|Hs108|chr21 (1754) 1845 138.6 1.6e-31
CCDS450.1 COL9A2 gene_id:1298|Hs108|chr1 ( 689) 1824 136.6 2.5e-31
CCDS13505.1 COL9A3 gene_id:1299|Hs108|chr20 ( 684) 1774 133.3 2.4e-30
CCDS4970.1 COL19A1 gene_id:1310|Hs108|chr6 (1142) 1468 113.5 3.7e-24
CCDS58922.1 COL25A1 gene_id:84570|Hs108|chr4 ( 645) 1419 109.9 2.5e-23
CCDS5105.1 COL10A1 gene_id:1300|Hs108|chr6 ( 680) 1392 108.2 8.9e-23
CCDS43553.1 COL28A1 gene_id:340267|Hs108|chr7 (1125) 1348 105.6 8.6e-22
CCDS55025.1 COL21A1 gene_id:81578|Hs108|chr6 ( 957) 1339 104.9 1.2e-21
CCDS83099.1 COL21A1 gene_id:81578|Hs108|chr6 ( 954) 1331 104.4 1.7e-21
CCDS44424.2 COL13A1 gene_id:1305|Hs108|chr10 ( 695) 1315 103.1 3e-21
CCDS44419.1 COL13A1 gene_id:1305|Hs108|chr10 ( 717) 1295 101.8 7.7e-21
CCDS35081.1 COL15A1 gene_id:1306|Hs108|chr9 (1388) 1298 102.5 9.4e-21
CCDS7554.1 COL17A1 gene_id:1308|Hs108|chr10 (1497) 1283 101.5 1.9e-20
CCDS4436.1 COL23A1 gene_id:91522|Hs108|chr5 ( 540) 1240 98.0 8.1e-20
CCDS44423.2 COL13A1 gene_id:1305|Hs108|chr10 ( 668) 1238 98.0 9.9e-20
CCDS44425.2 COL13A1 
gene_id:1305|Hs108|chr10 ( 686) 1227 97.3 1.7e-19 CCDS43259.1 COL25A1 gene_id:84570|Hs108|chr4 ( 642) 1213 96.3 3e-19 CCDS43258.1 COL25A1 gene_id:84570|Hs108|chr4 ( 654) 1213 96.4 3.1e-19 >>CCDS11561.1 COL1A1 gene_id:1277|Hs108|chr17 (1464 aa) initn: 10904 init1: 10904 opt: 10904 Z-score: 3901.4 bits: 734.5 E(32554): 5.5e-211 Smith-Waterman score: 10904; 100.0% identity (100.0% similar) in 1464 aa overlap (1-1464:1-1464) 10 20 30 40 50 60 pF1KB4 MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPCRI :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPCRI 10 20 30 40 50 60 70 80 90 100 110 120 pF1KB4 CVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPR :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 CVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPR 70 80 90 100 110 120 130 140 150 160 170 180 pF1KB4 GPRGPAGPPGRDGIPGQPGLPGPPGPPGPPGPPGLGGNFAPQLSYGYDEKSTGGISVPGP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 GPRGPAGPPGRDGIPGQPGLPGPPGPPGPPGPPGLGGNFAPQLSYGYDEKSTGGISVPGP 130 140 150 160 170 180 190 200 210 220 230 240 pF1KB4 MGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEAGKPGR :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 MGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEAGKPGR 190 200 210 220 230 240 250 260 270 280 290 300 pF1KB4 PGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGENGAPGQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 PGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGENGAPGQ 250 260 270 280 290 300 310 320 330 340 350 360 pF1KB4 MGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVGAKGEAGPQGP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 MGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVGAKGEAGPQGP 310 320 330 340 350 360 370 380 390 400 410 420 pF1KB4 RGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGP 
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 RGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGP 370 380 390 400 410 420 430 440 450 460 470 480 pF1KB4 QGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEPGPTGL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 QGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEPGPTGL 430 440 450 460 470 480 490 500 510 520 530 540 pF1KB4 PGPPGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGL :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 PGPPGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGL 490 500 510 520 530 540 550 560 570 580 590 600 pF1KB4 TGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGV :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 TGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGV 550 560 570 580 590 600 610 620 630 640 650 660 pF1KB4 PGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQGLPGPAGPPGEAGKPGE :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 PGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQGLPGPAGPPGEAGKPGE 610 620 630 640 650 660 670 680 690 700 710 720 pF1KB4 QGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGAPGNDGAKGDAGAPGAPGS :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 QGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGAPGNDGAKGDAGAPGAPGS 670 680 690 700 710 720 730 740 750 760 770 780 pF1KB4 QGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLTGPIGPPGPAGAPGD :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 QGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLTGPIGPPGPAGAPGD 730 740 750 760 770 780 790 800 810 820 830 840 pF1KB4 KGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 KGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGP 790 800 810 820 830 840 850 860 870 880 890 900 pF1KB4 PGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGP 
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 PGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGP 850 860 870 880 890 900 910 920 930 940 950 960 pF1KB4 AGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPGPQGIAGQRGV :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 AGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPGPQGIAGQRGV 910 920 930 940 950 960 970 980 990 1000 1010 1020 pF1KB4 VGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPMGPPGLAGPPGESGREGAPGA :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 VGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPMGPPGLAGPPGESGREGAPGA 970 980 990 1000 1010 1020 1030 1040 1050 1060 1070 1080 pF1KB4 EGSPGRDGSPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAGKSGDRGETGPAGPAGPVGP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 EGSPGRDGSPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAGKSGDRGETGPAGPAGPVGP 1030 1040 1050 1060 1070 1080 1090 1100 1110 1120 1130 1140 pF1KB4 VGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGP :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 VGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGP 1090 1100 1110 1120 1130 1140 1150 1160 1170 1180 1190 1200 pF1KB4 RGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSF :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 RGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSF 1150 1160 1170 1180 1190 1200 1210 1220 1230 1240 1250 1260 pF1KB4 LPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCR :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 LPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCR 1210 1220 1230 1240 1250 1260 1270 1280 1290 1300 1310 1320 pF1KB4 DLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKD :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 DLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKD 1270 1280 1290 1300 1310 1320 1330 1340 1350 1360 1370 1380 pF1KB4 
KRHVWFGESMTDGFQFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 KRHVWFGESMTDGFQFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQ 1330 1340 1350 1360 1370 1380 1390 1400 1410 1420 1430 1440 pF1KB4 TGNLKKALLLQGSNEIEIRAEGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKTSRLPII :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS11 TGNLKKALLLQGSNEIEIRAEGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKTSRLPII 1390 1400 1410 1420 1430 1440 1450 1460 pF1KB4 DVAPLDVGAPDQEFGFDVGPVCFL :::::::::::::::::::::::: CCDS11 DVAPLDVGAPDQEFGFDVGPVCFL 1450 1460 >>CCDS41778.1 COL2A1 gene_id:1280|Hs108|chr12 (1487 aa) initn: 5961 init1: 5961 opt: 7630 Z-score: 2737.2 bits: 519.1 E(32554): 3.8e-146 Smith-Waterman score: 7941; 71.1% identity (82.8% similar) in 1489 aa overlap (7-1464:10-1487) 10 20 30 40 50 pF1KB4 MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEP : :: ::.:..: .::. : : .:::.: ::.:.::::::: CCDS41 MIRLGAPQTLVLLTLLVAAVLRCQGQDV-QEAG--------SCVQDGQRYNDKDVWKPEP 10 20 30 40 50 60 70 80 90 100 pF1KB4 CRICVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVCP------DGSESPTDQE----- :::::::.: :::::.::...:.: . :.: :::::.:: .:. .: :. CCDS41 CRICVCDTGTVLCDDIICEDVKDCLSPEIPFGECCPICPTDLATASGQPGPKGQKGEPGD 60 70 80 90 100 110 110 120 130 140 150 pF1KB4 ---------------TTGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGLPGPPGPPGPPG .: .::.:: : .: .: :: :::: :: :: ::::::::::: CCDS41 IKDIVGPKGPPGPQGPAGEQGPRGDRGDKGEKGAPGPRGRDGEPGTPGNPGPPGPPGPPG 120 130 140 150 160 170 160 170 180 190 200 pF1KB4 PPGLGGNFAPQLSYGYDEKSTG---GISVPGPMGPSGPRGLPGPPGAPGPQGFQGPPGEP ::::::::: :.. :.:::. : :. . ::::: :::: ::: :::::::::: :::: CCDS41 PPGLGGNFAAQMAGGFDEKAGGAQLGV-MQGPMGPMGPRGPPGPAGAPGPQGFQGNPGEP 180 190 200 210 220 230 210 220 230 240 250 260 pF1KB4 GEPGASGPMGPRGPPGPPGKNGDDGEAGKPGRPGERGPPGPQGARGLPGTAGLPGMKGHR ::::.::::::::::::::: ::::::::::. 
:::::::::::::.::: ::::.:::: CCDS41 GEPGVSGPMGPRGPPGPPGKPGDDGEAGKPGKAGERGPPGPQGARGFPGTPGLPGVKGHR 240 250 260 270 280 290 270 280 290 300 310 320 pF1KB4 GFSGLDGAKGDAGPAGPKGEPGSPGENGAPGQMGPRGLPGERGRPGAPGPAGARGNDGAT :. :::::::.:: : ::: :::::::.:: :::::::::::: : : :::::::: CCDS41 GYPGLDGAKGEAGAPGVKGESGSPGENGSPGPMGPRGLPGERGRTGPAGAAGARGNDGQP 300 310 320 330 340 350 330 340 350 360 370 380 pF1KB4 GAAGPPGPTGPAGPPGFPGAVGAKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNP : ::::::.:::: :::::: :::::::: : :: :: :: ::::: :: : :: .::: CCDS41 GPAGPPGPVGPAGGPGFPGAPGAKGEAGPTGARGPEGAQGPRGEPGTPGSPGPAGASGNP 360 370 380 390 400 410 390 400 410 420 430 440 pF1KB4 GADGQPGAKGANGAPGIAGAPGFPGARGPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAK :.:: :::::. ::::::::::::: ::: :::: :: ::::..:::: : ::. : : CCDS41 GTDGIPGAKGSAGAPGIAGAPGFPGPRGPPGPQGATGPLGPKGQTGEPGIAGFKGEQGPK 420 430 440 450 460 470 450 460 470 480 490 500 pF1KB4 GEPGPVGVQGPPGPAGEEGKRGARGEPGPTGLPGPPGERGGPGSRGFPGADGVAGPKGPA :::::.: :: ::::::::::::::::: .: :::::::.::.::::: ::.::::: CCDS41 GEPGPAGPQGAPGPAGEEGKRGARGEPGGVGPIGPPGERGAPGNRGFPGQDGLAGPKGAP 480 490 500 510 520 530 510 520 530 540 550 560 pF1KB4 GERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPDGKTGPPGPAGQDGRPGPP :::: : :::::. :. ::::: :::::.:::: ::. ::.::.:: : :.::::::: CCDS41 GERGPSGLAGPKGANGDPGRPGEPGLPGARGLTGRPGDAGPQGKVGPSGAPGEDGRPGPP 540 550 560 570 580 590 570 580 590 600 610 620 pF1KB4 GPPGARGQAGVMGFPGPKGAAGEPGKAGERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPA :: ::::: ::::::::::: ::::::::.:.:: :: : :::::.:: ::::::::: CCDS41 GPQGARGQPGVMGFPGPKGANGEPGKAGEKGLPGAPGLRGLPGKDGETGAAGPPGPAGPA 600 610 620 630 640 650 630 640 650 660 670 680 pF1KB4 GERGEQGPAGSPGFQGLPGPAGPPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQ ::::::: : :::::::: :::::.::::.:::::. :::: : ::::::::::: CCDS41 GERGEQGAPGPSGFQGLPGPPGPPGEGGKPGDQGVPGEAGAPGLVGPRGERGFPGERGSP 660 670 680 690 700 710 690 700 710 720 730 740 pF1KB4 GPPGPAGPRGANGAPGNDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDA : : :::: :.::.:: :: .: : ::.:: ::::::::::::::. ::::::::. 
CCDS41 GAQGLQGPRGLPGTPGTDGPKGASGPAGPPGAQGPPGLQGMPGERGAAGIAGPKGDRGDV 720 730 740 750 760 770 750 760 770 780 790 800 pF1KB4 GPKGADGSPGKDGVRGLTGPIGPPGPAGAPGDKGESGPSGPAGPTGARGAPGDRGEPGPP : :: .:.::::: ::::::::::::::: :.::: :: :::: .:::::::.::: ::: CCDS41 GEKGPEGAPGKDGGRGLTGPIGPPGPAGANGEKGEVGPPGPAGSAGARGAPGERGETGPP 780 790 800 810 820 830 810 820 830 840 850 860 pF1KB4 GPAGFAGPPGADGQPGAKGEPGDAGAKGDAGPPGPAGPAGPPGPIGNVGAPGAKGARGSA :::::::::::::::::::: :.:: ::::: ::: ::.: ::: : .:. : :::::. CCDS41 GPAGFAGPPGADGQPGAKGEQGEAGQKGDAGAPGPQGPSGAPGPQGPTGVTGPKGARGAQ 840 850 860 870 880 890 870 880 890 900 910 920 pF1KB4 GPPGATGFPGAAGRVGPPGPSGNAGPPGPPGPAGKEGGKGPRGETGPAGRPGEVGPPGPP ::::::::::::::::::: .:: ::::::::.::.: :: ::..:: :: :: : :: CCDS41 GPPGATGFPGAAGRVGPPGSNGNPGPPGPPGPSGKDGPKGARGDSGPPGRAGEPGLQGPA 900 910 920 930 940 950 930 940 950 960 970 980 pF1KB4 GPAGEKGSPGADGPAGAPGTPGPQGIAGQRGVVGLPGQRGERGFPGLPGPSGEPGKQGPS :: :::: :: :::.:: : :::::.:::::.:::::::::::::::::::::::::: CCDS41 GPPGEKGEPGDDGPSGAEGPPGPQGLAGQRGIVGLPGQRGERGFPGLPGPSGEPGKQGAP 960 970 980 990 1000 1010 990 1000 1010 1020 1030 1040 pF1KB4 GASGERGPPGPMGPPGLAGPPGESGREGAPGAEGSPGRDGSPGAKGDRGETGPAGPPGAP ::::.::::::.:::::.:: :: ::::.:::.: :::::. :.:::::::: .: :::: CCDS41 GASGDRGPPGPVGPPGLTGPAGEPGREGSPGADGPPGRDGAAGVKGDRGETGAVGAPGAP 1020 1030 1040 1050 1060 1070 1050 1060 1070 1080 1090 1100 pF1KB4 GAPGAPGPVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETGEQGDRGIK : ::.:::.::.::.:::::.: :: :: ::.:::: :::::::::::.:: :.::.: CCDS41 GPPGSPGPAGPTGKQGDRGEAGAQGPMGPSGPAGARGIQGPQGPRGDKGEAGEPGERGLK 1080 1090 1100 1110 1120 1130 1110 1120 1130 1140 1150 1160 pF1KB4 GHRGFSGLQGPPGPPGSPGEQGPSGASGPAGPRGPPGSAGAPGKDGLNGLPGPIGPPGPR :::::.:::: ::::: :.:: :: .::.::::::: .: :::: ::.:::::::::: CCDS41 GHRGFTGLQGLPGPPGPSGDQGASGPAGPSGPRGPPGPVGPSGKDGANGIPGPIGPPGPR 1140 1150 1160 1170 1180 1190 1170 1180 1190 1200 1210 1220 pF1KB4 GRTGDAGPVGPPGPPGPPGPPGPPSAGFDFS-FLPQPPQEKAHDGGRYYRADDA-NVVRD ::.:..::.:::: ::::::::::. :.:.: : :.::. : .:.:::.: . .:. 
CCDS41 GRSGETGPAGPPGNPGPPGPPGPPGPGIDMSAFAGLGPREKGPDPLQYMRADQAAGGLRQ 1200 1210 1220 1230 1240 1250 1230 1240 1250 1260 1270 1280 pF1KB4 RDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCRDLKMCHSDWKSGEYWIDPNQGCNLDA .: :::.:::::..:::.:::::::::::::::::::.:: .::::.:::::::::.::: CCDS41 HDAEVDATLKSLNNQIESIRSPEGSRKNPARTCRDLKLCHPEWKSGDYWIDPNQGCTLDA 1260 1270 1280 1290 1300 1310 1290 1300 1310 1320 1330 1340 pF1KB4 IKVFCNMETGETCVYPTQPSVAQKNWYISKNPKDKRHVWFGESMTDGFQFEYGGQGSDPA .:::::::::::::::. .: .:::. ::. :.:.:.::::... ::.: :: .. : CCDS41 MKVFCNMETGETCVYPNPANVPKKNWWSSKS-KEKKHIWFGETINGGFHFSYGDDNLAPN 1320 1330 1340 1350 1360 1350 1360 1370 1380 1390 1400 pF1KB4 DVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLQGSNEIEIRAEGNSRF . .:.:::::.:::.:::::::::::.::.:. .::::::::.::::..:::::::::: CCDS41 TANVQMTFLRLLSTEGSQNITYHCKNSIAYLDEAAGNLKKALLIQGSNDVEIRAEGNSRF 1370 1380 1390 1400 1410 1420 1410 1420 1430 1440 1450 1460 pF1KB4 TYSVTVDGCTSHTGAWGKTVIEYKTTKTSRLPIIDVAPLDVGAPDQEFGFDVGPVCFL ::.. ::::.::: ::::::::.. :::::::::.::.:.:.:.:::: :.:::::: CCDS41 TYTALKDGCTKHTGKWGKTVIEYRSQKTSRLPIIDIAPMDIGGPEQEFGVDIGPVCFL 1430 1440 1450 1460 1470 1480 >>CCDS8759.1 COL2A1 gene_id:1280|Hs108|chr12 (1418 aa) initn: 5961 init1: 5961 opt: 7626 Z-score: 2736.0 bits: 518.8 E(32554): 4.5e-146 Smith-Waterman score: 7626; 72.8% identity (84.0% similar) in 1392 aa overlap (82-1464:31-1418) 60 70 80 90 100 pF1KB4 VWKPEPCRICVCDNGKVLCDDVICDETKNCPGAEVPEGECCP----VCPDGSESPTDQET :: . .:: : : : .: : CCDS87 MIRLGAPQTLVLLTLLVAAVLRCQGQDVRQPGPKGQKGEPGDIKDIVGPKGPPGP--QGP 10 20 30 40 50 110 120 130 140 150 160 pF1KB4 TGVEGPKGDTGPRGPRGPAGPPGRDGIPGQPGLPGPPGPPGPPGPPGLGGNFAPQLSYGY .: .::.:: : .: .: :: :::: :: :: :::::::::::::::::::: :.. :. CCDS87 AGEQGPRGDRGDKGEKGAPGPRGRDGEPGTPGNPGPPGPPGPPGPPGLGGNFAAQMAGGF 60 70 80 90 100 110 170 180 190 200 210 220 pF1KB4 DEKSTG---GISVPGPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRGPPG :::. : :. . 
::::: :::: ::: :::::::::: ::::::::.::::::::::: CCDS87 DEKAGGAQLGV-MQGPMGPMGPRGPPGPAGAPGPQGFQGNPGEPGEPGVSGPMGPRGPPG 120 130 140 150 160 170 230 240 250 260 270 280 pF1KB4 PPGKNGDDGEAGKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAG :::: ::::::::::. :::::::::::::.::: ::::.:::::. :::::::.:: : CCDS87 PPGKPGDDGEAGKPGKAGERGPPGPQGARGFPGTPGLPGVKGHRGYPGLDGAKGEAGAPG 180 190 200 210 220 230 290 300 310 320 330 340 pF1KB4 PKGEPGSPGENGAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPG ::: :::::::.:: :::::::::::: : : :::::::: : ::::::.:::: :: CCDS87 VKGESGSPGENGSPGPMGPRGLPGERGRTGPAGAAGARGNDGQPGPAGPPGPVGPAGGPG 240 250 260 270 280 290 350 360 370 380 390 400 pF1KB4 FPGAVGAKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPG :::: :::::::: : :: :: :: ::::: :: : :: .::::.:: :::::. :::: CCDS87 FPGAPGAKGEAGPTGARGPEGAQGPRGEPGTPGSPGPAGASGNPGTDGIPGAKGSAGAPG 300 310 320 330 340 350 410 420 430 440 450 460 pF1KB4 IAGAPGFPGARGPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAG ::::::::: ::: :::: :: ::::..:::: : ::. : ::::::.: :: ::::: CCDS87 IAGAPGFPGPRGPPGPQGATGPLGPKGQTGEPGIAGFKGEQGPKGEPGPAGPQGAPGPAG 360 370 380 390 400 410 470 480 490 500 510 520 pF1KB4 EEGKRGARGEPGPTGLPGPPGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPG :::::::::::: .: :::::::.::.::::: ::.::::: :::: : :::::. : CCDS87 EEGKRGARGEPGGVGPIGPPGERGAPGNRGFPGQDGLAGPKGAPGERGPSGLAGPKGANG 420 430 440 450 460 470 530 540 550 560 570 580 pF1KB4 EAGRPGEAGLPGAKGLTGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMGFPG . ::::: :::::.:::: ::. ::.::.:: : :.::::::::: ::::: ::::::: CCDS87 DPGRPGEPGLPGARGLTGRPGDAGPQGKVGPSGAPGEDGRPGPPGPQGARGQPGVMGFPG 480 490 500 510 520 530 590 600 610 620 630 640 pF1KB4 PKGAAGEPGKAGERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQG :::: ::::::::.:.:: :: : :::::.:: :::::::::::::::: : :::: CCDS87 PKGANGEPGKAGEKGLPGAPGLRGLPGKDGETGAAGPPGPAGPAGERGEQGAPGPSGFQG 540 550 560 570 580 590 650 660 670 680 690 700 pF1KB4 LPGPAGPPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGAPG :::: :::::.::::.:::::. 
:::: : ::::::::::: : : :::: :.:: CCDS87 LPGPPGPPGEGGKPGDQGVPGEAGAPGLVGPRGERGFPGERGSPGAQGLQGPRGLPGTPG 600 610 620 630 640 650 710 720 730 740 750 760 pF1KB4 NDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRG .:: :: .: : ::.:: ::::::::::::::. ::::::::.: :: .:.::::: :: CCDS87 TDGPKGASGPAGPPGAQGPPGLQGMPGERGAAGIAGPKGDRGDVGEKGPEGAPGKDGGRG 660 670 680 690 700 710 770 780 790 800 810 820 pF1KB4 LTGPIGPPGPAGAPGDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPG ::::::::::::: :.::: :: :::: .:::::::.::: ::::::::::::::::::: CCDS87 LTGPIGPPGPAGANGEKGEVGPPGPAGSAGARGAPGERGETGPPGPAGFAGPPGADGQPG 720 730 740 750 760 770 830 840 850 860 870 880 pF1KB4 AKGEPGDAGAKGDAGPPGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVG :::: :.:: ::::: ::: ::.: ::: : .:. : :::::. :::::::::::::::: CCDS87 AKGEQGEAGQKGDAGAPGPQGPSGAPGPQGPTGVTGPKGARGAQGPPGATGFPGAAGRVG 780 790 800 810 820 830 890 900 910 920 930 940 pF1KB4 PPGPSGNAGPPGPPGPAGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAG ::: .:: ::::::::.::.: :: ::..:: :: :: : :: :: :::: :: :::.: CCDS87 PPGSNGNPGPPGPPGPSGKDGPKGARGDSGPPGRAGEPGLQGPAGPPGEKGEPGDDGPSG 840 850 860 870 880 890 950 960 970 980 990 1000 pF1KB4 APGTPGPQGIAGQRGVVGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPMGPPG : : :::::.:::::.:::::::::::::::::::::::::: ::::.::::::.:::: CCDS87 AEGPPGPQGLAGQRGIVGLPGQRGERGFPGLPGPSGEPGKQGAPGASGDRGPPGPVGPPG 900 910 920 930 940 950 1010 1020 1030 1040 1050 1060 pF1KB4 LAGPPGESGREGAPGAEGSPGRDGSPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAGKSG :.:: :: ::::.:::.: :::::. 
:.:::::::: .: ::::: ::.:::.::.::.: CCDS87 LTGPAGEPGREGSPGADGPPGRDGAAGVKGDRGETGAVGAPGAPGPPGSPGPAGPTGKQG 960 970 980 990 1000 1010 1070 1080 1090 1100 1110 1120 pF1KB4 DRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGPPG ::::.: :: :: ::.:::: :::::::::::.:: :.::.::::::.:::: ::::: CCDS87 DRGEAGAQGPMGPSGPAGARGIQGPQGPRGDKGEAGEPGERGLKGHRGFTGLQGLPGPPG 1020 1030 1040 1050 1060 1070 1130 1140 1150 1160 1170 1180 pF1KB4 SPGEQGPSGASGPAGPRGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPG :.:: :: .::.::::::: .: :::: ::.::::::::::::.:..::.:::: :: CCDS87 PSGDQGASGPAGPSGPRGPPGPVGPSGKDGANGIPGPIGPPGPRGRSGETGPAGPPGNPG 1080 1090 1100 1110 1120 1130 1190 1200 1210 1220 1230 1240 pF1KB4 PPGPPGPPSAGFDFS-FLPQPPQEKAHDGGRYYRADDA-NVVRDRDLEVDTTLKSLSQQI ::::::::. :.:.: : :.::. : .:.:::.: . .:..: :::.:::::..:: CCDS87 PPGPPGPPGPGIDMSAFAGLGPREKGPDPLQYMRADQAAGGLRQHDAEVDATLKSLNNQI 1140 1150 1160 1170 1180 1190 1250 1260 1270 1280 1290 1300 pF1KB4 ENIRSPEGSRKNPARTCRDLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGETCVYP :.:::::::::::::::::::.:: .::::.:::::::::.:::.::::::::::::::: CCDS87 ESIRSPEGSRKNPARTCRDLKLCHPEWKSGDYWIDPNQGCTLDAMKVFCNMETGETCVYP 1200 1210 1220 1230 1240 1250 1310 1320 1330 1340 1350 1360 pF1KB4 TQPSVAQKNWYISKNPKDKRHVWFGESMTDGFQFEYGGQGSDPADVAIQLTFLRLMSTEA . .: .:::. ::. :.:.:.::::... ::.: :: .. : . .:.:::::.:::. CCDS87 NPANVPKKNWWSSKS-KEKKHIWFGETINGGFHFSYGDDNLAPNTANVQMTFLRLLSTEG 1260 1270 1280 1290 1300 1310 1370 1380 1390 1400 1410 1420 pF1KB4 SQNITYHCKNSVAYMDQQTGNLKKALLLQGSNEIEIRAEGNSRFTYSVTVDGCTSHTGAW :::::::::::.::.:. .::::::::.::::..::::::::::::.. ::::.::: : CCDS87 SQNITYHCKNSIAYLDEAAGNLKKALLIQGSNDVEIRAEGNSRFTYTALKDGCTKHTGKW 1320 1330 1340 1350 1360 1370 1430 1440 1450 1460 pF1KB4 GKTVIEYKTTKTSRLPIIDVAPLDVGAPDQEFGFDVGPVCFL :::::::.. 
:::::::::.::.:.:.:.:::: :.:::::: CCDS87 GKTVIEYRSQKTSRLPIIDIAPMDIGGPEQEFGVDIGPVCFL 1380 1390 1400 1410 >>CCDS2297.1 COL3A1 gene_id:1281|Hs108|chr2 (1466 aa) initn: 11093 init1: 4979 opt: 6612 Z-score: 2375.3 bits: 452.1 E(32554): 5.5e-126 Smith-Waterman score: 6616; 61.9% identity (74.1% similar) in 1487 aa overlap (1-1464:1-1466) 10 20 30 40 50 pF1KB4 MFSFVD---LRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEP :.:::. :: :: : .:. :.:. ::: : . : : ::::::::: CCDS22 MMSFVQKGSWLLLALLHPTIILA--QQEA-VEGG--------CSHLGQSYADRDVWKPEP 10 20 30 40 60 70 80 90 100 110 pF1KB4 CRICVCDNGKVLCDDVICDETK-NCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGD :.:::::.:.:::::.:::. . .::. :.: :::: :::. .:: . .: .::.: CCDS22 CQICVCDSGSVLCDDIICDDQELDCPNPEIPFGECCAVCPQPPTAPT-RPPNG-QGPQG- 50 60 70 80 90 100 120 130 140 150 160 170 pF1KB4 TGPRGPRGPAGPPGRDGIPGQPGLPGPPGPPGPPG-----PPGLGGNFAPQLSYGYDEKS :.: :: : :::.: :: :: :: :: ::::: : : :..:: . .:: :: CCDS22 --PKGDPGPPGIPGRNGDPGIPGQPGSPGSPGPPGICESCPTG-PQNYSPQYD-SYDVKS 110 120 130 140 150 160 180 190 200 210 220 pF1KB4 ---TGGIS-VPGPMGPSGPRGLPGP---PGAPGPQGFQGPPGEPGEPGASGPMGPRGPPG .::.. ::: :: :: : :: ::.:: :.::::::::. : ::: :: : : CCDS22 GVAVGGLAGYPGPAGPPGPPGPPGTSGHPGSPGSPGYQGPPGEPGQAGPSGPPGPPGAIG 170 180 190 200 210 220 230 240 250 260 270 280 pF1KB4 PPGKNGDDGEAGKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAG : : : :::.:.:::::::: ::: : .: : :.:::::::::.: .: ::..: : CCDS22 PSGPAGKDGESGRPGRPGERGLPGPPGIKGPAGIPGFPGMKGHRGFDGRNGEKGETGAPG 230 240 250 260 270 280 290 300 310 320 330 340 pF1KB4 PKGEPGSPGENGAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPG ::: : :::::::: ::::: :::::::: :: ::::::::: :. : ::: :: : : CCDS22 LKGENGLPGENGAPGPMGPRGAPGERGRPGLPGAAGARGNDGARGSDGQPGPPGPPGTAG 290 300 310 320 330 340 350 360 370 380 390 400 pF1KB4 FPGAVGAKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPG :::. 
:::::.:: : ::.: : :::::: : ::: :: : :: .:.::.:: : : CCDS22 FPGSPGAKGEVGPAGSPGSNGAPGQRGEPGPQGHAGAQGPPGPPGINGSPGGKGEMGPAG 350 360 370 380 390 400 410 420 430 440 450 460 pF1KB4 IAGAPGFPGARGPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAG : ::::. ::::: :: : .: :: .:..:::: :.::. : .:: : .:. : :: : CCDS22 IPGAPGLMGARGPPGPAGANGAPGLRGGAGEPGKNGAKGEPGPRGERGEAGIPGVPGAKG 410 420 430 440 450 460 470 480 490 500 510 520 pF1KB4 EEGKRGARGEPGPTGLPGPPGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPG :.:: :. :::: .:::: ::::.:: :: : .:. : ::::::::.::::::.:. : CCDS22 EDGKDGSPGEPGANGLPGAAGERGAPGFRGPAGPNGIPGEKGPAGERGAPGPAGPRGAAG 470 480 490 500 510 520 530 540 550 560 570 580 pF1KB4 EAGRPGEAGLPGAKGLTGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMGFPG : :: : : :: .:. ::::.:: ::: :::: :..:::::::: : ::: ::::::: CCDS22 EPGRDGVPGGPGMRGMPGSPGGPGSDGKPGPPGSQGESGRPGPPGPSGPRGQPGVMGFPG 530 540 550 560 570 580 590 600 610 620 630 640 pF1KB4 PKGAAGEPGKAGERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQG ::: : ::: :::: :: :: :: ::.::.: ::::::.::.:..:. :: : :.:: CCDS22 PKGNDGAPGKNGERGGPGGPGPQGPPGKNGETGPQGPPGPTGPGGDKGDTGPPGPQGLQG 590 600 610 620 630 640 650 660 670 680 690 700 pF1KB4 LPGPAGPPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGAPG ::: .::::: ::::: : :: :::: :..:. : ::::: : : : ::. : :: CCDS22 LPGTGGPPGENGKPGEPGPKGDAGAPGAPGGKGDAGAPGERGPPGLAGAPGLRGGAGPPG 650 660 670 680 690 700 710 720 730 740 750 760 pF1KB4 NDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRG .:.:: :: :: ::. :.:::::::::::. : ::::::.:. 
: :::: ::::: :: CCDS22 PEGGKGAAGPPGPPGAAGTPGLQGMPGERGGLGSPGPKGDKGEPGGPGADGVPGKDGPRG 710 720 730 740 750 760 770 780 790 800 810 820 pF1KB4 LTGPIGPPGPAGAPGDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPG ::::::::::: ::::::.: : : .: ::.::.::: :::::::: : :: .:.:: CCDS22 PTGPIGPPGPAGQPGDKGEGGAPGLPGIAGPRGSPGERGETGPPGPAGFPGAPGQNGEPG 770 780 790 800 810 820 830 840 850 860 870 880 pF1KB4 AKGEPGDAGAKGDAGPPGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVG .::: : : ::..:::: ::: : :: : : :.:: ::: : :::.::::: : : CCDS22 GKGERGAPGEKGEGGPPGVAGPPGGSGPAGPPGPQGVKGERGSPGGPGAAGFPGARGLPG 830 840 850 860 870 880 890 900 910 920 930 940 pF1KB4 PPGPSGNAGPPGPPGPAGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAG ::: .:: ::::: : ::.: :: :.:: : :: :: : : ::::::::.:: : CCDS22 PPGSNGNPGPPGPSGSPGKDGPPGPAGNTGAPGSPGVSGPKGDAGQPGEKGSPGAQGPPG 890 900 910 920 930 940 950 960 970 980 990 1000 pF1KB4 APGTPGPQGIAGQRGVVGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERGPPGPMGPPG ::: : ::.: ::..: ::. : :: :: : .:: :: : .: :::::::::.: :: CCDS22 APGPLGIAGITGARGLAGPPGMPGPRGSPGPQGVKGESGKPGANGLSGERGPPGPQGLPG 950 960 970 980 990 1000 1010 1020 1030 1040 1050 1060 pF1KB4 LAGPPGESGREGAPGAEGSPGRDGSPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAGKSG ::: :: ::.: ::..: ::::::::.::::::.: : ::::: :: ::::::::::: CCDS22 LAGTAGEPGRDGNPGSDGLPGRDGSPGGKGDRGENGSPGAPGAPGHPGPPGPVGPAGKSG 1010 1020 1030 1040 1050 1060 1070 1080 1090 1100 1110 1120 pF1KB4 DRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGPPG ::::.::::::: ::.:.:: ::::::::::::::.: :::::::: : : :: :: CCDS22 DRGESGPAGPAGAPGPAGSRGAPGPQGPRGDKGETGERGAAGIKGHRGFPGNPGAPGSPG 1070 1080 1090 1100 1110 1120 1130 1140 1150 1160 1170 1180 pF1KB4 SPGEQGPSGASGPAGPRGPPGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPPG :.:: :. :::::::: : .: ::::: .: ::::::::::: :. : : :: :: CCDS22 PAGQQGAIGSPGPAGPRGPVGPSGPPGKDGTSGHPGPIGPPGPRGNRGERGSEGSPGHPG 1130 1140 1150 1160 1170 1180 1190 1200 1210 1220 1230 pF1KB4 PPGPPGPPSA------GFDFSFLPQPPQEKAHDGGRYYRADDANVVRDRDLEVDTTLKSL :::::::.: : . . ::: . :: . . . : :. :.:::. 
CCDS22 QPGPPGPPGAPGPCCGGVGAAAIAGIGGEKAGGFAPYYGDEPMDFKINTD-EIMTSLKSV 1190 1200 1210 1220 1230 1240 1240 1250 1260 1270 1280 1290 pF1KB4 SQQIENIRSPEGSRKNPARTCRDLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETGET . :::.. ::.::::::::.:::::.:: . ::::::.::::::.::::::::::::::: CCDS22 NGQIESLISPDGSRKNPARNCRDLKFCHPELKSGEYWVDPNQGCKLDAIKVFCNMETGET 1250 1260 1270 1280 1290 1300 1300 1310 1320 1330 1340 1350 pF1KB4 CVYPTQPSVAQKNWYISKNPKDKRHVWFGESMTDGFQFEYGGQGSDPADVA-IQLTFLRL :. . .: .:.:. . . .:.:::::::: :::: ::. : :: ..:.:::: CCDS22 CISANPLNVPRKHWW-TDSSAEKKHVWFGESMDGGFQFSYGNP-ELPEDVLDVHLAFLRL 1310 1320 1330 1340 1350 1360 1370 1380 1390 1400 1410 pF1KB4 MSTEASQNITYHCKNSVAYMDQQTGNLKKALLLQGSNEIEIRAEGNSRFTYSVTVDGCTS .:..::::::::::::.::::: .::.:::: :.:::: :..:::::.:::.: ::::. CCDS22 LSSRASQNITYHCKNSIAYMDQASGNVKKALKLMGSNEGEFKAEGNSKFTYTVLEDGCTK 1360 1370 1380 1390 1400 1410 1420 1430 1440 1450 1460 pF1KB4 HTGAWGKTVIEYKTTKTSRLPIIDVAPLDVGAPDQEFGFDVGPVCFL ::: :.:::.::.: :. ::::.:.:: :.:.:::::: :::::::: CCDS22 HTGEWSKTVFEYRTRKAVRLPIVDIAPYDIGGPDQEFGVDVGPVCFL 1420 1430 1440 1450 1460 >>CCDS33350.1 COL5A2 gene_id:1290|Hs108|chr2 (1499 aa) initn: 8325 init1: 4975 opt: 6194 Z-score: 2226.6 bits: 424.6 E(32554): 1.1e-117 Smith-Waterman score: 6479; 59.6% identity (72.8% similar) in 1502 aa overlap (7-1464:11-1499) 10 20 30 40 50 pF1KB4 MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPE : .:..: . . ..::: . :: :.: .:.::: : .::.::: CCDS33 MMANWAEARPLLILIVLLGQFVSIKAQEEDEDEGYGEEI---ACTQNGQMYLNRDIWKPA 10 20 30 40 50 60 70 80 90 100 pF1KB4 PCRICVCDNGKVLCDDVICDETKNCPGAEVPEGECCPVC---PDGSES------------ ::.::::::: .::: . :... .: .: ::::::: : :... CCDS33 PCQICVCDNGAILCDKIECQDVLDCADPVTPPGECCPVCSQTPGGGNTNFGRGRKGQKGE 60 70 80 90 100 110 110 120 130 140 pF1KB4 -------------PTDQETTGVEGPKGDTGPRG---PRGPAGPPGRDGIPGQPGLPGPPG : : .::.:. ::.: :::: : :. 
:.::::: ::::: CCDS33 PGLVPVVTGIRGRPGPAGPPGSQGPRGERGPKGRPGPRGPQGIDGEPGVPGQPGAPGPPG 120 130 140 150 160 170 150 160 170 180 190 200 pF1KB4 PPGPPGPPGLGGNFAPQLSYGYDEKSTGGISV---PGPMGPSGPRGLPGPPGAPGPQGFQ :. ::: ::. :. :.. : :::: : .: :: .:: :::: : : : : CCDS33 HPSHPGPDGLSRPFSAQMA-GLDEKSGLGSQVGLMPGSVGPVGPRGPQGLQGQQGGAGPT 180 190 200 210 220 230 210 220 230 240 250 260 pF1KB4 GPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEAGKPGRPGERGPPGPQGARGLPGTAGLP :::::::.:: ::.: ::: ::::: :.::: :. : ::: : : ::::.::. ::: CCDS33 GPPGEPGDPGPMGPIGSRGPEGPPGKPGEDGEPGRNGNPGEVGFAGSPGARGFPGAPGLP 240 250 260 270 280 290 270 280 290 300 310 320 pF1KB4 GMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGENGAPGQMGPRGLPGERGRPGAPGPAGAR :.::::: .::.: ::..: : ::: : : :: : .::::.:::::: : : : : CCDS33 GLKGHRGHKGLEGPKGEVGAPGSKGEAGPTGPMGAMGPLGPRGMPGERGRLGPQGAPGQR 300 310 320 330 340 350 330 340 350 360 370 380 pF1KB4 GNDGATGAAGPPGPTGPAGPPGFPGAVGAKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAA : : : :: :: : : :::: : :::::: : :: ::::: ::: :::::.:. CCDS33 GAHGMPGKPGPMGPLGIPGSSGFPGNPGMKGEAGPTGARGPEGPQGQRGETGPPGPVGSP 360 370 380 390 400 410 390 400 410 420 430 440 pF1KB4 GPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGPQGPGGPPGPKGNSGEPGAPGSK : : :.:: ::::: .:.:: .: :: : : :::: :: : .:. :.::.:: : CCDS33 GLPGAIGTDGTPGAKGPTGSPGTSGPPGSAGPPGSPGPQGSTGPQGIRGQPGDPGVPGFK 420 430 440 450 460 470 450 460 470 480 490 500 pF1KB4 GDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEPGPTGLPGPPGERGGPGSRGFPGADGVA :..: :::::: :.::: :: ::::::: ::.:: .: ::: ::::.::.:::::.::. CCDS33 GEAGPKGEPGPHGIQGPIGPPGEEGKRGPRGDPGTVGPPGPVGERGAPGNRGFPGSDGLP 480 490 500 510 520 530 510 520 530 540 550 560 pF1KB4 GPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPDGKTGPPGPAGQD :::: :::: : .::::: :. 
::::: :::::.::::.:: ::.:: :: : :.: CCDS33 GPKGAQGERGPVGSSGPKGSQGDPGRPGEPGLPGARGLTGNPGVQGPEGKLGPLGAPGED 540 550 560 570 580 590 570 580 590 600 610 620 pF1KB4 GRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGVPGPPGAVGPAGKDGEAGAQGPP ::::::: : ::: : ::.:::::..:.::: :: : : :: : :::::.: .:: CCDS33 GRPGPPGSIGIRGQPGSMGLPGPKGSSGDPGKPGEAGNAGVPGQRGAPGKDGEVGPSGPV 600 610 620 630 640 650 630 640 650 660 670 680 pF1KB4 GPAGPAGERGEQGPAGSPGFQGLPGPAGPPGEAGKPGEQGVPGDLGAPGPSGARGERGFP :: : ::::::::: : :::::::: :::::.::::.:::::: :: :: : ::::: : CCDS33 GPPGLAGERGEQGPPGPTGFQGLPGPPGPPGEGGKPGDQGVPGDPGAVGPLGPRGERGNP 660 670 680 690 700 710 690 700 710 720 730 740 pF1KB4 GERGVQGPPGPAGPRGANGAPGNDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPK :::: : : : .: :. : :: ::. : :.::. : ::::::::::: :: :::: CCDS33 GERGEPGITGLPGEKGMAGGHGPDGPKGSPGPSGTPGDTGPPGLQGMPGERGIAGTPGPK 720 730 740 750 760 770 750 760 770 780 790 800 pF1KB4 GDRGDAGPKGADGSPGKDGVRGLTGPIGPPGPAGAPGDKGESGPSGPAGPTGARGAPGDR :::: : :::.:. :.::.::: ::.::::::: :.::: :: : .:: :.:: ::.: CCDS33 GDRGGIGEKGAEGTAGNDGARGLPGPLGPPGPAGPTGEKGEPGPRGLVGPPGSRGNPGSR 780 790 800 810 820 830 810 820 830 840 850 860 pF1KB4 GEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGPPGPAGPAGPPGPIGNVGAPGAK :: :: : .::::: : :::::.:::::. : ::::: ::: : :: ::: : :.:: : CCDS33 GENGPTGAVGFAGPQGPDGQPGVKGEPGEPGQKGDAGSPGPQGLAGSPGPHGPNGVPGLK 840 850 860 870 880 890 870 880 890 900 910 920 pF1KB4 GARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGPAGKEGGKGPRGETGPAGRPGEV :.::. ::::::::::.:::::::::.: :: :: : :::: : ::. : :: :. CCDS33 GGRGTQGPPGATGFPGSAGRVGPPGPAGAPGPAGPLGEPGKEGPPGLRGDPGSHGRVGDR 900 910 920 930 940 950 930 940 950 960 970 980 pF1KB4 GPPGPPGPAGEKGSPGADGPAGAPGTPGPQGIAGQRGVVGLPGQRGERGFPGLPGPSGEP :: :::: :.::.:: :: : : ::: : .::::.::.::::::::.::::::.: : CCDS33 GPAGPPGGPGDKGDPGEDGQPGPDGPPGPAGTTGQRGIVGMPGQRGERGMPGLPGPAGTP 960 970 980 990 1000 1010 990 1000 1010 1020 1030 1040 pF1KB4 GKQGPSGASGERGPPGPMGPPGLAGPPGESGREGAPGAEGSPGRDGSPGAKGDRGETGPA :: ::.::.:..:::::.:::: :: :: : :: : .:.:::::. : .::::. 
::: CCDS33 GKVGPTGATGDKGPPGPVGPPGSNGPVGEPGPEGPAGNDGTPGRDGAVGERGDRGDPGPA 1020 1030 1040 1050 1060 1070 1050 1060 1070 1080 1090 1100 pF1KB4 GPPGAPGAPGAPGPVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETGEQ : ::. ::::.::::: : .:.::. : :: :: : .: :: ::::::::::. :.. CCDS33 GLPGSQGAPGTPGPVGAPGDAGQRGDPGSRGPIGPPGRAGKRGLPGPQGPRGDKGDHGDR 1080 1090 1100 1110 1120 1130 1110 1120 1130 1140 1150 1160 pF1KB4 GDRGIKGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGPRGPPGSAGAPGKDGLNGLPGPI :::: ::::::.:::: ::::: :::: .: :: ::::::: .: ::.: : ::: CCDS33 GDRGQKGHRGFTGLQGLPGPPGPNGEQGSAGIPGPFGPRGPPGPVGPSGKEGNPGPLGPI 1140 1150 1160 1170 1180 1190 1170 1180 1190 1200 1210 pF1KB4 GPPGPRGRTGDAGPVGPPGPPGPPGPPGPPS----------AGFDFSFLPQPPQEKAHDG :::: :: .:.::: :::: ::::::::::. . .: : .:.: : ..: CCDS33 GPPGVRGSVGEAGPEGPPGEPGPPGPPGPPGHLTAALGDIMGHYDES-MPDPLPEFTEDQ 1200 1210 1220 1230 1240 1250 1220 1230 1240 1250 1260 1270 pF1KB4 GRYYRADDANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCRDLKMCHSDWKSG . :: : : : .::::::.:::..:::.::.:.::::: :::.::: .:: CCDS33 AA---PDDKN---KTDPGVHATLKSLSSQIETMRSPDGSKKHPARTCDDLKLCHSAKQSG 1260 1270 1280 1290 1300 1280 1290 1300 1310 1320 1330 pF1KB4 EYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKDKRHVWFGESMTD ::::::::: :::::.:::::::::. . :: .:.:. ::.: :.. ::.: .:. CCDS33 EYWIDPNQGSVEDAIKVYCNMETGETCISANPSSVPRKTWWASKSP-DNKPVWYGLDMNR 1310 1320 1330 1340 1350 1360 1340 1350 1360 1370 1380 1390 pF1KB4 GFQFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLQG : :: :: . : : . :.:::::.: :::::::: :::::.:::.:. :::::..:.: CCDS33 GSQFAYGDHQS-PNTAITQMTFLRLLSKEASQNITYICKNSVGYMDDQAKNLKKAVVLKG 1370 1380 1390 1400 1410 1420 1400 1410 1420 1430 1440 1450 pF1KB4 SNEIEIRAEGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKTSRLPIIDVAPLDVGAPDQ .:...:.:::: :: : : : :....: ::::.::.: ...::::::.::.:::. :: CCDS33 ANDLDIKAEGNIRFRYIVLQDTCSKRNGNVGKTVFEYRTQNVARLPIIDLAPVDVGGTDQ 1430 1440 1450 1460 1470 1480 1460 pF1KB4 EFGFDVGPVCFL ::: ..:::::. 
CCDS33 EFGVEIGPVCFV 1490 >>CCDS34682.1 COL1A2 gene_id:1278|Hs108|chr7 (1366 aa) initn: 11439 init1: 5201 opt: 5735 Z-score: 2063.7 bits: 394.4 E(32554): 1.2e-108 Smith-Waterman score: 6472; 64.1% identity (77.1% similar) in 1363 aa overlap (105-1463:26-1365) 80 90 100 110 120 130 pF1KB4 CDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPRGPRGPAGPPGRDGI :: : .:: :: :::: ::: :::::: CCDS34 MLSFVDTRTLLLLAVTLCLATCQSLQEETVRKGPAGDRGPRGERGPPGPPGRD-- 10 20 30 40 50 140 150 160 170 180 190 pF1KB4 PGQPGLPGPPGPPGPPGPPGLGGNFAPQLSYGYDEKSTG-GISVPGPMGPSGPRGLPGPP :. : ::::::::::::::::::: : :: :..: : ::::: :::: :: CCDS34 -GEDGPTGPPGPPGPPGPPGLGGNFAAQ----YDGKGVGLG---PGPMGLMGPRGPPGAA 60 70 80 90 100 200 210 220 230 240 250 pF1KB4 GAPGPQGFQGPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEAGKPGRPGERGPPGPQGAR ::::::::::: ::::::: .:: : ::: ::::: :.::. :::::::::: :::::: CCDS34 GAPGPQGFQGPAGEPGEPGQTGPAGARGPAGPPGKAGEDGHPGKPGRPGERGVVGPQGAR 110 120 130 140 150 160 260 270 280 290 300 310 pF1KB4 GLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGENGAPGQMGPRGLPGERGRP :.::: ::::.:: :: .:::: ::. 
: : :::::.:::::.::: : ::::::::: CCDS34 GFPGTPGLPGFKGIRGHNGLDGLKGQPGAPGVKGEPGAPGENGTPGQTGARGLPGERGRV 170 180 190 200 210 220 320 330 340 350 360 370 pF1KB4 GAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVGAKGEAGPQGPRGSEGPQGVRGEP ::::::::::.::..: .:: :: : ::::::::: : ::: : : : :: : ::: CCDS34 GAPGPAGARGSDGSVGPVGPAGPIGSAGPPGFPGAPGPKGEIGAVGNAGPAGPAGPRGEV 230 240 250 260 270 280 380 390 400 410 420 430 pF1KB4 GPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGPQGPGGPPGPKGNS : :: .: .:: :::::.: ::::: : ::.:::::.:: :: :: : .: : .: CCDS34 GLPGLSGPVGPPGNPGANGLTGAKGAAGLPGVAGAPGLPGPRGIPGPVGAAGATGARGLV 290 300 310 320 330 340 440 450 460 470 480 490 pF1KB4 GEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEPGPTGLPGPPGERGGPGSR :::: ::::..: ::::: .: ::::::.::::::: :: : .: ::::: ::.:::: CCDS34 GEPGPAGSKGESGNKGEPGSAGPQGPPGPSGEEGKRGPNGEAGSAGPPGPPGLRGSPGSR 350 360 370 380 390 400 500 510 520 530 540 550 pF1KB4 GFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPDGKT :.::::: :: :: : ::. :::: .: :.:::::: :: : .:: ::::. :: :: CCDS34 GLPGADGRAGVMGPPGSRGASGPAGVRGPNGDAGRPGEPGLMGPRGLPGSPGNIGPAGKE 410 420 430 440 450 460 560 570 580 590 600 610 pF1KB4 GPPGPAGQDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGKAGERGVPGPPGAVGPAGKD :: : : :::::: :: ::::. : .::::::: .:.::: :..: : :: : : : CCDS34 GPVGLPGIDGRPGPIGPAGARGEPGNIGFPGPKGPTGDPGKNGDKGHAGLAGARGAPGPD 470 480 490 500 510 520 620 630 640 650 660 670 pF1KB4 GEAGAQGPPGPAGPAGERGEQGPAGSPGFQGLPGPAGPPGEAGKPGEQGVPGDLGAPGPS :. :::::::: : : .::::: : :::::::::.:: ::.:::::.:. :..: :::. CCDS34 GNNGAQGPPGPQGVQGGKGEQGPPGPPGFQGLPGPSGPAGEVGKPGERGLHGEFGLPGPA 530 540 550 560 570 580 680 690 700 710 720 730 pF1KB4 GARGERGFPGERGVQGPPGPAGPRGANGAPGNDGAKGDAGAPGAPGSQGAPGLQGMPGER : ::::: ::: :. :: :: : :: .: :: :: ::. :. :: :. : : .:.:::: CCDS34 GPRGERGPPGESGAAGPTGPIGSRGPSGPPGPDGNKGEPGVVGAVGTAGPSGPSGLPGER 590 600 610 620 630 640 740 750 760 770 780 790 pF1KB4 GAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLTGPIGPPGPAGAPGDKGESGPSGPAGPT ::::.:: ::..:. : .: :.::.::.:: : .: :::::: ::.::.: .:::::. 
CCDS34 GAAGIPGGKGEKGEPGLRGEIGNPGRDGARGAPGAVGAPGPAGATGDRGEAGAAGPAGPA 650 660 670 680 690 700 800 810 820 830 840 850 pF1KB4 GARGAPGDRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAKGDAGPPGPAGPAGPPGPI : ::.::.::: :: :: ::::: :: :::::::: : : ::. : ::.::.: :: CCDS34 GPRGSPGERGEVGPAGPNGFAGPAGAAGQPGAKGERGAKGPKGENGVVGPTGPVGAAGPA 710 720 730 740 750 760 860 870 880 890 900 910 pF1KB4 GNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNAGPPGPPGPAGKEGGKGPRGET : : :: :.::..:::: :::::::::.::::::: .::::::::::::: .::::. CCDS34 GPNGPPGPAGSRGDGGPPGMTGFPGAAGRTGPPGPSGISGPPGPPGPAGKEGLRGPRGDQ 770 780 790 800 810 820 920 930 940 950 960 970 pF1KB4 GPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPGPQGIAGQRGVVGLPGQRGERGFP ::.:: :::: :::: ::::: : : :: ::::::::. : :..::::.:::::.: CCDS34 GPVGRTGEVGAVGPPGFAGEKGPSGEAGTAGPPGTPGPQGLLGAPGILGLPGSRGERGLP 830 840 850 860 870 880 980 990 1000 1010 1020 1030 pF1KB4 GLPGPSGEPGKQGPSGASGERGPPGPMGPPGLAGPPGESGREGAPGAEGSPGRDGSPGAK :. : :::: : .: : ::::: .: ::. : :::.::.: :: .: :::::.:: : CCDS34 GVAGAVGEPGPLGIAGPPGARGPPGAVGSPGVNGAPGEAGRDGNPGNDGPPGRDGQPGHK 890 900 910 920 930 940 1040 1050 1060 1070 1080 1090 pF1KB4 GDRGETGPAGPPGAPGAPGAPGPVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPR :.:: : :: :: :::: :::::::: :.::::::.::.::.: :: :::.:::: : CCDS34 GERGYPGNIGPVGAAGAPGPHGPVGPAGKHGNRGETGPSGPVGPAGAVGPRGPSGPQGIR 950 960 970 980 990 1000 1100 1110 1120 1130 1140 1150 pF1KB4 GDKGETGEQGDRGIKGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGPRGPPGSAGAPGKD ::::: ::.: ::. : .: .:::: :: : :.:: :. :::::::: : .: ::: CCDS34 GDKGEPGEKGPRGLPGLKGHNGLQGLPGIAGHHGDQGAPGSVGPAGPRGPAGPSGPAGKD 1010 1020 1030 1040 1050 1060 1160 1170 1180 1190 1200 1210 pF1KB4 GLNGLPGPIGPPGPRGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSFLPQPPQEKAHDGG : .: :: .:: : :: : ::.::::::::::::: ..:.::.. : CCDS34 GRTGHPGTVGPAGIRGPQGHQGPAGPPGPPGPPGPPGVSGGGYDFGY-----------DG 1070 1080 1090 1100 1110 1220 1230 1240 1250 1260 1270 pF1KB4 RYYRADD---ANVVRDRDLEVDTTLKSLSQQIENIRSPEGSRKNPARTCRDLKMCHSDWK .::::. : .: .: :::.:::::..:::.. .:::::::::::::::.. : .:. 
CCDS34 DFYRADQPRSAPSLRPKDYEVDATLKSLNNQIETLLTPEGSRKNPARTCRDLRLSHPEWS 1120 1130 1140 1150 1160 1170 1280 1290 1300 1310 1320 1330 pF1KB4 SGEYWIDPNQGCNLDAIKVFCNMETGETCVYPTQPSVAQKNWYISKNPKDKRHVWFGESM :: :::::::::..:::::.:.. :::::. .. :::: .. :::.:::.::.. CCDS34 SGYYWIDPNQGCTMDAIKVYCDFSTGETCIRAQPENIPAKNWY--RSSKDKKHVWLGETI 1180 1190 1200 1210 1220 1230 1340 1350 1360 1370 1380 1390 pF1KB4 TDGFQFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLL . : ::::. .: ..: ::.:.::... ::::::::::::.::::..:::::::..: CCDS34 NAGSQFEYNVEGVTSKEMATQLAFMRLLANYASQNITYHCKNSIAYMDEETGNLKKAVIL 1240 1250 1260 1270 1280 1290 1400 1410 1420 1430 1440 1450 pF1KB4 QGSNEIEIRAEGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKTSRLPIIDVAPLDVGAP ::::..:. :::::::::.: ::::...:. ::::.:::::.: ::::..:.::::.:. CCDS34 QGSNDVELVAEGNSRFTYTVLVDGCSKKTNEWGKTIIEYKTNKPSRLPFLDIAPLDIGGA 1300 1310 1320 1330 1340 1350 1460 pF1KB4 DQEFGFDVGPVCFL :::: :.::::: CCDS34 DQEFFVDIGPVCFK 1360 >>CCDS75932.1 COL5A1 gene_id:1289|Hs108|chr9 (1838 aa) initn: 12041 init1: 3408 opt: 4366 Z-score: 1575.8 bits: 304.5 E(32554): 1.9e-81 Smith-Waterman score: 4369; 47.2% identity (61.8% similar) in 1413 aa overlap (96-1464:470-1837) 70 80 90 100 110 120 pF1KB4 GKVLCDDVICDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPRGPRG- : : :.:. :. :: : :: : : CCDS75 TIYEGIGGPRGEKGQKGEPAIIEPGMLIEGPPGPEGPA-----GLPGPPGTMGPTGQVGD 440 450 460 470 480 490 130 140 150 160 170 pF1KB4 PA--GPPGRDGIPGQPGLPGPPGP----PGPPGPPGLGGNFAPQLSYGYDEKSTGGISVP :. ::::: :.:: ::::::: : : : .:. .:..: .:... .: CCDS75 PGERGPPGRPGLPGADGLPGPPGTMLMLPFRFGGGGDAGSKGPMVS--AQESQAQAILQQ 500 510 520 530 540 550 180 190 200 210 220 230 pF1KB4 GPM---GPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEA . . ::.:: :: : :: :: : : ::::. : .:: : .::::: :: : :.: CCDS75 ARLALRGPAGPMGLTGRPGPVGPPGSGGLKGEPGDVGPQGPRGVQGPPGPAGKPGRRGRA 560 570 580 590 600 610 240 250 260 270 280 290 pF1KB4 GKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGEN :. : : : ::.: ::. : ::::: ::::: : .: : : : .:. 
: : CCDS75 GSDGARGMPGQTGPKGDRGFDGLAGLPGEKGHRGDPGPSGPPGPPGDDGERGDDGEVGPR 620 630 640 650 660 670 300 310 320 330 340 350 pF1KB4 GAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVGAKGEA : ::. ::::: : .: :: ::: :. : :: : : :: : :::: : ::.: CCDS75 GLPGEPGPRGLLGPKGPPGPPGPPGVTGMDGQPGPKGNVGPQGEPGPPGQQGNPGAQGLP 680 690 700 710 720 730 360 370 380 390 400 410 pF1KB4 GPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGAR :::: : : .: :.:: :: :: :: :.:: .: :: ::..: :: : :.:: : CCDS75 GPQGAIGPPGEKGPLGKPGLPGMPGADGPPGHPGKEGPPGEKGGQGPPGPQGPIGYPGPR 740 750 760 770 780 790 420 430 440 450 460 470 pF1KB4 GPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEP : .: .: : : ::..:: : :: ::: : ::. : .: :::: ::.: .: .:. CCDS75 GVKGADGIRGLKGTKGEKGEDGFPGFKGDMGIKGDRGEIG---PPGPRGEDGPEGPKGRG 800 810 820 830 840 480 490 500 510 520 530 pF1KB4 GPTGLPGP---PGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEA ::.: ::: :::.: : :.:: : :::: : : :: : ::. : :.:: CCDS75 GPNGDPGPLGPPGEKGKLGVPGLPGYPGRQGPKGSIGFPGFPGANGEKGGRGTPGKPGPR 850 860 870 880 890 900 540 550 560 570 580 pF1KB4 GLPGAKGLTGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMG---FPGPKGAA : .: :: : :: : :: ::: :..: :: :::: :: : .: :::::: CCDS75 G---QRGPTGPRGERGPRGITGKPGPKGNSGGDGPAGPPGERGPNGPQGPTGFPGPKGPP 910 920 930 940 950 960 590 600 610 620 630 640 pF1KB4 GEPGKAGERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQGLPGPA : ::: :.:: :: : .: .:..: :::: .:: : :: :: : : : ::: CCDS75 GPPGK---DGLPGHPGQRGETGFQGKTGPPGPPGVVGPQGPTGETGPMGERGHPGPPGPP 970 980 990 1000 1010 1020 650 660 670 680 690 700 pF1KB4 GP---PGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGAPGND : :: ::: : .: :: : :: .: : :::::.::. :: : : .: .: :: CCDS75 GEQGLPGLAGKEGTKGDPGPAGLPGKDGPPGLRGFPGDRGLPGPVGALGLKGNEGPPG-- 1030 1040 1050 1060 1070 1080 710 720 730 740 750 760 pF1KB4 GAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLT : ::.:: : :: : :.::. 
: : ::: :..: : :: .: :.:: : CCDS75 -PPGPAGSPGERGPAGAAGPIGIPGRPGPQGPPGPAGEKGAPGEKGPQGPAGRDG---LQ 1090 1100 1110 1120 1130 770 780 790 800 810 820 pF1KB4 GPIGPPGPAGAPGDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAK ::.: ::::: : ::.: .: : : .:. ::.:: :::::.: :: : : :: CCDS75 GPVGLPGPAGPVGPPGEDGDKGEIGEPGQKGSKGDKGEQGPPGPTGPQGPIGQPGPSGAD 1140 1150 1160 1170 1180 1190 830 840 850 860 870 880 pF1KB4 GEPGDAGAKG---DAGPPGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRV :::: : .: . : :: : :::::.: : :: : .: .: : : :: : CCDS75 GEPGPRGQQGLFGQKGDEGPRGFPGPPGPVGLQGLPGPPGEKGETGDVGQMGPPGPPGPR 1200 1210 1220 1230 1240 1250 890 900 910 920 930 940 pF1KB4 GPPGPSGNAGPPGPPGPAGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPA :: : : :: :::: :. :. : .:: : ::.:: : ::::: ::.: : .::. CCDS75 GPSGAPGADGPQGPPGGIGNPGAVGEKGEPGEAGEPGLPGEGGPPGPKGERGEKGESGPS 1260 1270 1280 1290 1300 1310 950 960 970 980 990 1000 pF1KB4 GAPGTPGPQGIAGQRGVVGLPGQRGERGFPGLPGPSGEPG---KQGPSGASGERGPPGPM :: : :::.: :. : : :: : ::: ::: :::: ..:: : .:. : :: CCDS75 GAAGPPGPKGPPGDDGPKGSPGPVG---FPGDPGPPGEPGPAGQDGPPGDKGDDGEPGQT 1320 1330 1340 1350 1360 1370 1010 1020 1030 1040 1050 pF1KB4 GPPGLAGPPGESG---REGAPGAEGSPGRDGSPGAKGDRG------ETGPAGPPGAPGAP : :: .: :: :: ..: :: : ::.: ::::. : .::: :: :::: : CCDS75 GSPGPTGEPGPSGPPGKRGPPGPAGPEGRQGEKGAKGEAGLEGPPGKTGPIGPQGAPGKP 1380 1390 1400 1410 1420 1430 1060 1070 1080 1090 1100 pF1KB4 GAPG----PVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETGEQGDRGI : : : ::.:..: : :: :: ::.:: : : : .::.:.::. : : : CCDS75 GPDGLRGIP-GPVGEQGLPGSPGPDGPPGPMGPPGLPGLKGDSGPKGEKGHPGLIGLIGP 1440 1450 1460 1470 1480 1490 1110 1120 1130 1140 1150 1160 pF1KB4 KGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGPRGPPGSAGAPGKDGLNGLPGPIGPPGP :..: .: .: ::: :: : .: .: .::.:: :::: :: :::: :: : CCDS75 PGEQGEKGDRGLPGPQGSSGPKGEQGITGPSGPIGPPGP---PG------LPGPPGPKGA 1500 1510 1520 1530 1540 1170 1180 1190 1200 1210 1220 pF1KB4 RGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSFLPQPPQEKAHDGGRYYRADDANVVRDR .: .: .:: : : :::::::::: : .. :: .. .. 
::.: CCDS75 KGSSGPTGPKGEAGHPGPPGPPGPP--GEVIQPLPIQASRTRRNIDASQLLDDGNGENYV 1550 1560 1570 1580 1590 1600 1230 1240 1250 1260 1270 1280 pF1KB4 DL-----EVDTTLKSLSQQIENIRSPEGSRKNPARTCRDLKMCHSDWKSGEYWIDPNQGC : :. .:.::. .::... : :...::::::.::..:: :. .::::.:::::: CCDS75 DYADGMEEIFGSLNSLKLEIEQMKRPLGTQQNPARTCKDLQLCHPDFPDGEYWVDPNQGC 1610 1620 1630 1640 1650 1660 1290 1300 1310 1320 1330 1340 pF1KB4 NLDAIKVFCNMETG-ETCVYPTQPSVAQKNWYISKNPKDKRHVWFGESMTDGFQFEYGGQ . :..::.::. .: :::.: . : ..: ... ::.. .:... . : . : CCDS75 SRDSFKVYCNFTAGGSTCVFPDKKSEGSK---MARWPKEQPSTWYSQ-YKRGSLLSYVDA 1670 1680 1690 1700 1710 1350 1360 1370 1380 1390 1400 pF1KB4 GSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLQGSNEIEIRAE ..:. : .:.:::::.:. : ::.:::: .:::..: ::. ::: . :::. :. . CCDS75 EGNPVGV-VQMTFLRLLSASAHQNVTYHCYQSVAWQDAATGSYDKALRFLGSNDEEMSYD 1720 1730 1740 1750 1760 1770 1410 1420 1430 1440 1450 1460 pF1KB4 GNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKTSRLPIIDVAPLDVGAPDQEFGFDVGPV .: . . ::::... : . :::.: : :. ..::.:. : : .:.:::.:::. CCDS75 NNPYIR--ALVDGCATKKG-YQKTVLEIDTPKVEQVPIVDIMFNDFGEASQKFGFEVGPA 1780 1790 1800 1810 1820 1830 pF1KB4 CFL ::. CCDS75 CFMG >>CCDS6982.1 COL5A1 gene_id:1289|Hs108|chr9 (1838 aa) initn: 12041 init1: 3408 opt: 4365 Z-score: 1575.4 bits: 304.4 E(32554): 2e-81 Smith-Waterman score: 4368; 47.3% identity (61.6% similar) in 1413 aa overlap (96-1464:470-1837) 70 80 90 100 110 120 pF1KB4 GKVLCDDVICDETKNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGPRGPRG- : : :.:. :. :: : :: : : CCDS69 TIYEGIGGPRGEKGQKGEPAIIEPGMLIEGPPGPEGPA-----GLPGPPGTMGPTGQVGD 440 450 460 470 480 490 130 140 150 160 170 pF1KB4 PA--GPPGRDGIPGQPGLPGPPGP----PGPPGPPGLGGNFAPQLSYGYDEKSTGGISVP :. ::::: :.:: ::::::: : : : .:. .:..: .:... .: CCDS69 PGERGPPGRPGLPGADGLPGPPGTMLMLPFRFGGGGDAGSKGPMVS--AQESQAQAILQQ 500 510 520 530 540 550 180 190 200 210 220 230 pF1KB4 GPM---GPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEA . . ::.:: :: : :: :: : : ::::. 
: .:: : .::::: :: : :.: CCDS69 ARLALRGPAGPMGLTGRPGPVGPPGSGGLKGEPGDVGPQGPRGVQGPPGPAGKPGRRGRA 560 570 580 590 600 610 240 250 260 270 280 290 pF1KB4 GKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGEN :. : : : ::.: ::. : ::::: ::::: : .: : : : .:. : : CCDS69 GSDGARGMPGQTGPKGDRGFDGLAGLPGEKGHRGDPGPSGPPGPPGDDGERGDDGEVGPR 620 630 640 650 660 670 300 310 320 330 340 350 pF1KB4 GAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVGAKGEA : ::. ::::: : .: :: ::: :. : :: : : :: : :::: : ::.: CCDS69 GLPGEPGPRGLLGPKGPPGPPGPPGVTGMDGQPGPKGNVGPQGEPGPPGQQGNPGAQGLP 680 690 700 710 720 730 360 370 380 390 400 410 pF1KB4 GPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGAR :::: : : .: :.:: :: :: :: :.:: .: :: ::..: :: : :.:: : CCDS69 GPQGAIGPPGEKGPLGKPGLPGMPGADGPPGHPGKEGPPGEKGGQGPPGPQGPIGYPGPR 740 750 760 770 780 790 420 430 440 450 460 470 pF1KB4 GPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEP : .: .: : : ::..:: : :: ::: : ::. : .: :::: ::.: .: .:. CCDS69 GVKGADGIRGLKGTKGEKGEDGFPGFKGDMGIKGDRGEIG---PPGPRGEDGPEGPKGRG 800 810 820 830 840 480 490 500 510 520 530 pF1KB4 GPTGLPGP---PGERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEA ::.: ::: :::.: : :.:: : :::: : : :: : ::. : :.:: CCDS69 GPNGDPGPLGPPGEKGKLGVPGLPGYPGRQGPKGSIGFPGFPGANGEKGGRGTPGKPGPR 850 860 870 880 890 900 540 550 560 570 580 pF1KB4 GLPGAKGLTGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMG---FPGPKGAA : .: :: : :: : :: ::: :..: :: :::: :: : .: :::::: CCDS69 G---QRGPTGPRGERGPRGITGKPGPKGNSGGDGPAGPPGERGPNGPQGPTGFPGPKGPP 910 920 930 940 950 960 590 600 610 620 630 640 pF1KB4 GEPGKAGERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQGLPGPA : ::: :.:: :: : .: .:..: :::: .:: : :: :: : : : ::: CCDS69 GPPGK---DGLPGHPGQRGETGFQGKTGPPGPPGVVGPQGPTGETGPMGERGHPGPPGPP 970 980 990 1000 1010 1020 650 660 670 680 690 700 pF1KB4 GP---PGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGAPGND : :: ::: : .: :: : :: .: : :::::.::. 
:: : : .: .: :: CCDS69 GEQGLPGLAGKEGTKGDPGPAGLPGKDGPPGLRGFPGDRGLPGPVGALGLKGNEGPPG-- 1030 1040 1050 1060 1070 1080 710 720 730 740 750 760 pF1KB4 GAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLT : ::.:: : :: : :.::. : : ::: :..: : :: .: :.:: : CCDS69 -PPGPAGSPGERGPAGAAGPIGIPGRPGPQGPPGPAGEKGAPGEKGPQGPAGRDG---LQ 1090 1100 1110 1120 1130 770 780 790 800 810 820 pF1KB4 GPIGPPGPAGAPGDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAK ::.: ::::: : ::.: .: : : .:. ::.:: :::::.: :: : : :: CCDS69 GPVGLPGPAGPVGPPGEDGDKGEIGEPGQKGSKGDKGEQGPPGPTGPQGPIGQPGPSGAD 1140 1150 1160 1170 1180 1190 830 840 850 860 870 880 pF1KB4 GEPGDAGAKG---DAGPPGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRV :::: : .: . : :: : :::::.: : :: : .: .: : : :: : CCDS69 GEPGPRGQQGLFGQKGDEGPRGFPGPPGPVGLQGLPGPPGEKGETGDVGQMGPPGPPGPR 1200 1210 1220 1230 1240 1250 890 900 910 920 930 940 pF1KB4 GPPGPSGNAGPPGPPGPAGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPA :: : : :: :::: :. :. : .:: : ::.:: : ::::: ::.: : .::. CCDS69 GPSGAPGADGPQGPPGGIGNPGAVGEKGEPGEAGEPGLPGEGGPPGPKGERGEKGESGPS 1260 1270 1280 1290 1300 1310 950 960 970 980 990 1000 pF1KB4 GAPGTPGPQGIAGQRGVVGLPGQRGERGFPGLPGPSGEPG---KQGPSGASGERGPPGPM :: : :::.: :. : : :: : ::: ::: :::: ..:: : .:. : :: CCDS69 GAAGPPGPKGPPGDDGPKGSPGPVG---FPGDPGPPGEPGPAGQDGPPGDKGDDGEPGQT 1320 1330 1340 1350 1360 1370 1010 1020 1030 1040 1050 pF1KB4 GPPGLAGPPGESG---REGAPGAEGSPGRDGSPGAKGDRG------ETGPAGPPGAPGAP : :: .: :: :: ..: :: : ::.: ::::. : .::: :: :::: : CCDS69 GSPGPTGEPGPSGPPGKRGPPGPAGPEGRQGEKGAKGEAGLEGPPGKTGPIGPQGAPGKP 1380 1390 1400 1410 1420 1430 1060 1070 1080 1090 1100 pF1KB4 GAPG----PVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETGEQGDRGI : : : ::.:..: : :: :: ::.:: : : : .::.:.::. 
: : : CCDS69 GPDGLRGIP-GPVGEQGLPGSPGPDGPPGPMGPPGLPGLKGDSGPKGEKGHPGLIGLIGP 1440 1450 1460 1470 1480 1490 1110 1120 1130 1140 1150 1160 pF1KB4 KGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGPRGPPGSAGAPGKDGLNGLPGPIGPPGP :..: .: .: ::: :: : .: .: .::.:: :::: :: :::: :: : CCDS69 PGEQGEKGDRGLPGPQGSSGPKGEQGITGPSGPIGPPGP---PG------LPGPPGPKGA 1500 1510 1520 1530 1540 1170 1180 1190 1200 1210 1220 pF1KB4 RGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSFLPQPPQEKAHDGGRYYRADDANVVRDR .: .: .:: : : :::::::::: : .. :: .. .. ::.: CCDS69 KGSSGPTGPKGEAGHPGPPGPPGPP--GEVIQPLPIQASRTRRNIDASQLLDDGNGENYV 1550 1560 1570 1580 1590 1600 1230 1240 1250 1260 1270 1280 pF1KB4 DL-----EVDTTLKSLSQQIENIRSPEGSRKNPARTCRDLKMCHSDWKSGEYWIDPNQGC : :. .:.::. .::... : :...::::::.::..:: :. .::::.:::::: CCDS69 DYADGMEEIFGSLNSLKLEIEQMKRPLGTQQNPARTCKDLQLCHPDFPDGEYWVDPNQGC 1610 1620 1630 1640 1650 1660 1290 1300 1310 1320 1330 1340 pF1KB4 NLDAIKVFCNMETG-ETCVYPTQPSVAQKNWYISKNPKDKRHVWFGESMTDGFQFEYGGQ . :..::.::. .: :::.: . : . . :.. ::.. ::.: . : . : CCDS69 SRDSFKVYCNFTAGGSTCVFPDKKSEGAR---ITSWPKENPGSWFSE-FKRGKLLSYVDA 1670 1680 1690 1700 1710 1350 1360 1370 1380 1390 1400 pF1KB4 GSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLQGSNEIEIRAE ..:. : .:.:::::.:. : ::.:::: .:::..: ::. ::: . :::. :. . CCDS69 EGNPVGV-VQMTFLRLLSASAHQNVTYHCYQSVAWQDAATGSYDKALRFLGSNDEEMSYD 1720 1730 1740 1750 1760 1770 1410 1420 1430 1440 1450 1460 pF1KB4 GNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKTSRLPIIDVAPLDVGAPDQEFGFDVGPV .: . . ::::... : . :::.: : :. ..::.:. : : .:.:::.:::. CCDS69 NNPYIR--ALVDGCATKKG-YQKTVLEIDTPKVEQVPIVDIMFNDFGEASQKFGFEVGPA 1780 1790 1800 1810 1820 1830 pF1KB4 CFL ::. CCDS69 CFMG >>CCDS780.2 COL11A1 gene_id:1301|Hs108|chr1 (1690 aa) initn: 5900 init1: 3327 opt: 4335 Z-score: 1565.1 bits: 302.4 E(32554): 7.4e-81 Smith-Waterman score: 4338; 46.4% identity (60.1% similar) in 1480 aa overlap (29-1464:255-1689) 10 20 30 40 50 pF1KB4 MFSFVDLRLLLLLAATALLTHGQEEGQVEGQDEDIPPITCVQNGLRYHDRDVWKPEPC : : .. : .. .: . . . 
: CCDS78 ITGDPKAAYDYCEHYSPDCDSSAPKAAQAQEPQIDEYAPEDIIEYDYEYGEAEYKEAESV 230 240 250 260 270 280 60 70 80 90 100 110 pF1KB4 RICVCDNGKVLCDDVICDETKNCPGA---EVPEGECCPVCPDG-SESPTDQETTGVEGPK .: .. ...: . : :: . .:: : : :.: : :: CCDS78 T-----EGPTVTEETIAQTEINGHGAYGEKGQKGEPAVVEPGMLVEGPP-----GPAGPA 290 300 310 320 330 120 130 140 150 160 pF1KB4 GDTGP---RGPRGPAGPPGRDGIPGQPGLPGPPGPPGPPG-----PPGLGGNFA--PQLS : :: .:: :: : :: : ::.::::: : ::::: : ::. . : .: CCDS78 GIMGPPGLQGPTGPPGDPGDRGPPGRPGLPGADGLPGPPGTMLMLPFRYGGDGSKGPTIS 340 350 360 370 380 390 170 180 190 200 210 220 pF1KB4 YGYDEKST----GGISVPGPMGPSGPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPR . .. . :.. :: :: : : ::: :.:: .: .: : :.:: .:: : . CCDS78 AQEAQAQAILQQARIALRGPPGPMGLTGRPGPVGGPGSSGAKG---ESGDPGPQGPRGVQ 400 410 420 430 440 450 230 240 250 260 270 280 pF1KB4 GPPGPPGKNGDDGEAGKPGRPGERGPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDA ::::: :: : :. : : : : :: .: ::. : :::: ::::: : .: : CCDS78 GPPGPTGKPGKRGRPGADGGRGMPGEPGAKGDRGFDGLPGLPGDKGHRGERGPQGPPGPP 460 470 480 490 500 510 290 300 310 320 330 340 pF1KB4 GPAGPKGEPGSPGENGAPGQMGPRGLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPA : : .:: : : : ::. ::::: : :: ::::: : : :: : : :: : CCDS78 GDDGMRGEDGEIGPRGLPGEAGPRGLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEP 520 530 540 550 560 570 350 360 370 380 390 400 pF1KB4 GPPGFPGAVGAKGEAGPQGPRGSEGPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGAN :::: : : .: ::::: : : .: .:.:: : :: :: :.:: .:: : ::: CCDS78 GPPGQQGNPGPQGLPGPQGPIGPPGEKGPQGKPGLAGLPGADGPPGHPGKEGQSGEKGAL 580 590 600 610 620 630 410 420 430 440 450 460 pF1KB4 GAPGIAGAPGFPGARGPSGPQGPGGPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPP : :: : :.:: :: .: .: : : ::..:: : :: ::: : ::. : :: CCDS78 GPPGPQGPIGYPGPRGVKGADGVRGLKGSKGEKGEDGFPGFKGDMGLKGDRGEVG---QI 640 650 660 670 680 470 480 490 500 510 pF1KB4 GPAGEEGKRGARGEPGPTGLPGPPG---ERGGPGSRGFPGADGVAGPKGPAGERGSPGPA :: ::.: .: .:. 
:::: ::: : :.: : :.:: : :::: .: : :: CCDS78 GPRGEDGPEGPKGRAGPTGDPGPSGQAGEKGKLGVPGLPGYPGRQGPKGSTGFPGFPGAN 690 700 710 720 730 740 520 530 540 550 560 570 pF1KB4 GPKGSPGEAGRPGEAGLPGAKGLTGSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQA : ::. : ::.:: : : : :: :. :: :: :: : .: :: ::::: : .: CCDS78 GEKGARGVAGKPGPRGQRGPTGPRGSRGARGPTGKPGPKGTSGGDGPPGPPGERGPQGPQ 750 760 770 780 790 800 580 590 600 610 620 630 pF1KB4 GVMGFPGPKGAAGEPGK------AGERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGER : .::::::: : ::: :.:: : : .:: : : .: ::: : .:: ::: CCDS78 GPVGFPGPKGPPGPPGKDGLPGHPGQRGETGFQGKTGPPGPGGVVGPQGPTGETGPIGER 810 820 830 840 850 860 640 650 660 670 680 690 pF1KB4 GEQGPAGSPGFQGLPGPAGPPGEAGKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPP :. :: : :: ::::: :: : : :: ::. : ::.: ::::::::. : CCDS78 GHPGPPGPPGEQGLPGAAGKEGAKGDPGPQGIS---GKDGPAGL---RGFPGERGL---P 870 880 890 900 910 700 710 720 730 740 750 pF1KB4 GPAGPRGANGAPGNDGAKGDAGAPGAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPK : : : .:. : .: : .:.:: :: :. : :.::. : : ::: :..: : : CCDS78 GAQGAPGLKGGEGPQGPPGPVGSPGERGSAGTAGPIGLPGRPGPQGPPGPAGEKGAPGEK 920 930 940 950 960 970 760 770 780 790 800 810 pF1KB4 GADGSPGKDGVRGLTGPIGPPGPAGAPGDKGESGPSGPAGPTGARGAPGDRGEPGPPGPA : .: :.:::.: :.: ::::: :. ::.: .: : : .:. ::.:: ::::: CCDS78 GPQGPAGRDGVQG---PVGLPGPAGPAGSPGEDGDKGEIGEPGQKGSKGDKGENGPPGPP 980 990 1000 1010 1020 1030 820 830 840 850 860 pF1KB4 GFAGPPGADGQPGAKGEPGDAGAKG---DAGPPGPAGPAGPPGPIGNVGAPGAKGARGSA :. :: :: : :. :::: : .: . : : : ::::::: : :: : .: CCDS78 GLQGPVGAPGIAGGDGEPGPRGQQGMFGQKGDEGARGFPGPPGPIGLQGLPGPPGEKGEN 1040 1050 1060 1070 1080 1090 870 880 890 900 910 920 pF1KB4 GPPGATGFPGAAGRVGPPGPSGNAGPPGPPGPAGKEGGKGPRGETGPAGRPGEVGPPGPP : : : :: : :: ::.: :: :::: .:. :: : .:: : :: :: : : CCDS78 GDVGPMGPPGPPGPRGPQGPNGADGPQGPPGSVGSVGGVGEKGEPGEAGNPGPPGEAGVG 1100 1110 1120 1130 1140 1150 930 940 950 960 970 980 pF1KB4 GPAGEKGSPGADGPAGAPGTPGPQGIAGQRGVVGLPGQRGERGFPGLPGPSGEPGKQGPS :: ::.: : :: :: : :: .: :. : : :: :::: ::: :::: : . 
CCDS78 GPKGERGEKGEAGPPGAAGPPGAKGPPGDDGPKGNPGPV---GFPGDPGPPGEPGPAGQD 1160 1170 1180 1190 1200 1210 990 1000 1010 1020 1030 1040 pF1KB4 GASGERG----P--PGPMGPPGLAGPPGESGREGAPGAEGSPGRDGSPGAKGDRGETGPA :..:..: : ::: :: : ::::: :..: ::: :. ::.: :::: :.: CCDS78 GVGGDKGEDGDPGQPGPPGPSGEAGPPGPPGKRGPPGAAGAEGRQGEKGAKG---EAGAE 1220 1230 1240 1250 1260 1270 1050 1060 1070 1080 1090 1100 pF1KB4 GPPGAPGAPGAPGPVGPAGKSGDRGETGPAGPAGPVGPVGARGPAGPQGPRGDKGETGEQ :::: : : ::.: : : :: ::.: : : .: :: ::.:: : : :. CCDS78 GPPGKTGPVGPQGPAGKPGPEGLRGIPGPVGEQGLPGAAGQDGPPGPMGPPGLPGLKGDP 1280 1290 1300 1310 1320 1330 1110 1120 1130 1140 1150 pF1KB4 GDRGIKGHRGFSGLQGPPGPPGSPGEQGPSGASGPAGPRGP---PGSAGAPGKDGLNGLP :..: ::: :. :: :::: : :..: :..: : .: :: :: : : ::: CCDS78 GSKGEKGHPGLIGLIGPPGEQGEKGDRGLPGTQGSPGAKGDGGIPGPAGPLGPPGPPGLP 1340 1350 1360 1370 1380 1390 1160 1170 1180 1190 1200 1210 pF1KB4 GPIGPPGPRGRTGDAGPVGPPGPPGPPGPPGPPSAGFDFSFLP--QPPQEKAHDGGRYYR :: :: : .: :: :: : : ::::: :::: : .. :: . . . : : CCDS78 GPQGPKGNKGSTGPAGQKGDSGLPGPPGSPGPP--GEVIQPLPILSSKKTRRHTEGMQAD 1400 1410 1420 1430 1440 1220 1230 1240 1250 1260 1270 pF1KB4 ADDANVVRDRD--LEVDTTLKSLSQQIENIRSPEGSRKNPARTCRDLKMCHSDWKSGEYW ::: :.. : :. .:.::.:.::... : :.. ::::::.::.. : :. .:::: CCDS78 ADD-NILDYSDGMEEIFGSLNSLKQDIEHMKFPMGTQTNPARTCKDLQLSHPDFPDGEYW 1450 1460 1470 1480 1490 1500 1280 1290 1300 1310 1320 1330 pF1KB4 IDPNQGCNLDAIKVFCNMETG-ETCVYPTQPSVAQKNWYISKNPKDKRHVWFGESMTDGF :::::::. :..::.::. .: :::.:: . : .. ::. ::.: ::.: . : CCDS78 IDPNQGCSGDSFKVYCNFTSGGETCIYPDKKS---EGVRISSWPKEKPGSWFSE-FKRGK 1510 1520 1530 1540 1550 1560 1340 1350 1360 1370 1380 1390 pF1KB4 QFEYGGQGSDPADVAIQLTFLRLMSTEASQNITYHCKNSVAYMDQQTGNLKKALLLQGSN . : .. .. .:.:::.:... : ::.::::..:.:..: ..:. ::: . ::: CCDS78 LLSYLDVEGNSINM-VQMTFLKLLTASARQNFTYHCHQSAAWYDVSSGSYDKALRFLGSN 1570 1580 1590 1600 1610 1620 1400 1410 1420 1430 1440 1450 pF1KB4 EIEIRAEGNSRFTYSVTVDGCTSHTGAWGKTVIEYKTTKTSRLPIIDVAPLDVGAPDQEF . :. ..: : .. :::.:. : . 
::::: .: : ...::.:: : : .:.: CCDS78 DEEMSYDNNP-FIKTL-YDGCASRKG-YEKTVIEINTPKIDQVPIVDVMINDFGDQNQKF 1630 1640 1650 1660 1670 1460 pF1KB4 GFDVGPVCFL ::.::::::: CCDS78 GFEVGPVCFLG 1680 1690 >>CCDS53348.1 COL11A1 gene_id:1301|Hs108|chr1 (1767 aa) initn: 5900 init1: 3327 opt: 4335 Z-score: 1564.9 bits: 302.4 E(32554): 7.6e-81 Smith-Waterman score: 4336; 48.0% identity (61.3% similar) in 1396 aa overlap (109-1464:406-1766) 80 90 100 110 120 130 pF1KB4 KNCPGAEVPEGECCPVCPDGSESPTDQETTGVEGPKGDTGP---RGPRGPAGPPGRDGIP : :: : :: .:: :: : :: : : CCDS53 SINGHGAYGEKGQKGEPAVVEPGMLVEGPPGPAGPAGIMGPPGLQGPTGPPGDPGDRGPP 380 390 400 410 420 430 140 150 160 170 180 pF1KB4 GQPGLPGPPGPPGPPG-----PPGLGGNFA--PQLSYGYDEKST----GGISVPGPMGPS :.::::: : ::::: : ::. . : .: . .. . :.. :: :: CCDS53 GRPGLPGADGLPGPPGTMLMLPFRYGGDGSKGPTISAQEAQAQAILQQARIALRGPPGPM 440 450 460 470 480 490 190 200 210 220 230 240 pF1KB4 GPRGLPGPPGAPGPQGFQGPPGEPGEPGASGPMGPRGPPGPPGKNGDDGEAGKPGRPGER : : ::: :.:: .: .: : :.:: .:: : .::::: :: : :. : : : CCDS53 GLTGRPGPVGGPGSSGAKG---ESGDPGPQGPRGVQGPPGPTGKPGKRGRPGADGGRGMP 500 510 520 530 540 550 250 260 270 280 290 300 pF1KB4 GPPGPQGARGLPGTAGLPGMKGHRGFSGLDGAKGDAGPAGPKGEPGSPGENGAPGQMGPR : :: .: ::. : :::: ::::: : .: : : : .:: : : : ::. ::: CCDS53 GEPGAKGDRGFDGLPGLPGDKGHRGERGPQGPPGPPGDDGMRGEDGEIGPRGLPGEAGPR 560 570 580 590 600 610 310 320 330 340 350 360 pF1KB4 GLPGERGRPGAPGPAGARGNDGATGAAGPPGPTGPAGPPGFPGAVGAKGEAGPQGPRGSE :: : :: ::::: : : :: : : :: : :::: : : .: ::::: : CCDS53 GLLGPRGTPGAPGQPGMAGVDGPPGPKGNMGPQGEPGPPGQQGNPGPQGLPGPQGPIGPP 620 630 640 650 660 670 370 380 390 400 410 420 pF1KB4 GPQGVRGEPGPPGPAGAAGPAGNPGADGQPGAKGANGAPGIAGAPGFPGARGPSGPQGPG : .: .:.:: : :: :: :.:: .:: : ::: : :: : :.:: :: .: .: CCDS53 GEKGPQGKPGLAGLPGADGPPGHPGKEGQSGEKGALGPPGPQGPIGYPGPRGVKGADGVR 680 690 700 710 720 730 430 440 450 460 470 480 pF1KB4 GPPGPKGNSGEPGAPGSKGDTGAKGEPGPVGVQGPPGPAGEEGKRGARGEPGPTGLPGPP : : ::..:: : :: ::: : ::. : : : :: ::.: .: .:. 
:::: ::: CCDS53 GLKGSKGEKGEDGFPGFKGDMGLKGDRGEV---GQIGPRGEDGPEGPKGRAGPTGDPGPS 740 750 760 770 780 490 500 510 520 530 540 pF1KB4 G---ERGGPGSRGFPGADGVAGPKGPAGERGSPGPAGPKGSPGEAGRPGEAGLPGAKGLT : :.: : :.:: : :::: .: : :: : ::. : ::.:: : : : CCDS53 GQAGEKGKLGVPGLPGYPGRQGPKGSTGFPGFPGANGEKGARGVAGKPGPRGQRGPTGPR 790 800 810 820 830 840 550 560 570 580 590 pF1KB4 GSPGSPGPDGKTGPPGPAGQDGRPGPPGPPGARGQAGVMGFPGPKGAAGEPGK------A :: :. :: :: :: : .: :: ::::: : .: : .::::::: : ::: CCDS53 GSRGARGPTGKPGPKGTSGGDGPPGPPGERGPQGPQGPVGFPGPKGPPGPPGKDGLPGHP 850 860 870 880 890 900 600 610 620 630 640 650 pF1KB4 GERGVPGPPGAVGPAGKDGEAGAQGPPGPAGPAGERGEQGPAGSPGFQGLPGPAGPPGEA :.:: : : .:: : : .: ::: : .:: ::::. :: : :: ::::: :: : CCDS53 GQRGETGFQGKTGPPGPGGVVGPQGPTGETGPIGERGHPGPPGPPGEQGLPGAAGKEGAK 910 920 930 940 950 960 660 670 680 690 700 710 pF1KB4 GKPGEQGVPGDLGAPGPSGARGERGFPGERGVQGPPGPAGPRGANGAPGNDGAKGDAGAP : :: ::. : ::.: : :::::::. :: : : .:. : .: : .:.: CCDS53 GDPGPQGISG---KDGPAGLR---GFPGERGL---PGAQGAPGLKGGEGPQGPPGPVGSP 970 980 990 1000 1010 1020 720 730 740 750 760 770 pF1KB4 GAPGSQGAPGLQGMPGERGAAGLPGPKGDRGDAGPKGADGSPGKDGVRGLTGPIGPPGPA : :: :. : :.::. : : ::: :..: : :: .: :.:::.: :.: :::: CCDS53 GERGSAGTAGPIGLPGRPGPQGPPGPAGEKGAPGEKGPQGPAGRDGVQG---PVGLPGPA 1030 1040 1050 1060 1070 780 790 800 810 820 830 pF1KB4 GAPGDKGESGPSGPAGPTGARGAPGDRGEPGPPGPAGFAGPPGADGQPGAKGEPGDAGAK : :. ::.: .: : : .:. ::.:: ::::: :. :: :: : :. :::: : . CCDS53 GPAGSPGEDGDKGEIGEPGQKGSKGDKGENGPPGPPGLQGPVGAPGIAGGDGEPGPRGQQ 1080 1090 1100 1110 1120 1130 840 850 860 870 880 890 pF1KB4 G---DAGPPGPAGPAGPPGPIGNVGAPGAKGARGSAGPPGATGFPGAAGRVGPPGPSGNA : . : : : ::::::: : :: : .: : : : :: : :: ::.: CCDS53 GMFGQKGDEGARGFPGPPGPIGLQGLPGPPGEKGENGDVGPMGPPGPPGPRGPQGPNGAD 1140 1150 1160 1170 1180 1190 900 910 920 930 940 950 pF1KB4 GPPGPPGPAGKEGGKGPRGETGPAGRPGEVGPPGPPGPAGEKGSPGADGPAGAPGTPGPQ :: :::: .:. :: : .:: : :: :: : : :: ::.: : :: :: : :: . 
CCDS53 GPQGPPGSVGSVGGVGEKGEPGEAGNPGPPGEAGVGGPKGERGEKGEAGPPGAAGPPGAK 1200 1210 1220 1230 1240 1250 960 970 980 990 1000 pF1KB4 GIAGQRGVVGLPGQRGERGFPGLPGPSGEPGKQGPSGASGERG----P--PGPMGPPGLA : :. : : :: :::: ::: :::: : .:..:..: : ::: :: : : CCDS53 GPPGDDGPKGNPGPV---GFPGDPGPPGEPGPAGQDGVGGDKGEDGDPGQPGPPGPSGEA 1260 1270 1280 1290 1300 1310 1010 1020 1030 1040 1050 1060 pF1KB4 GPPGESGREGAPGAEGSPGRDGSPGAKGDRGETGPAGPPGAPGAPGAPGPVGPAGKSGDR :::: :..: ::: :. ::.: :::: :.: :::: : : ::.: : : : CCDS53 GPPGPPGKRGPPGAAGAEGRQGEKGAKG---EAGAEGPPGKTGPVGPQGPAGKPGPEGLR 1320 1330 1340 1350 1360 1370 1070 1080 1090 1100 1110 1120 pF1KB4 GETGPAGPAGPVGPVGARGPAGPQGPRGDKGETGEQGDRGIKGHRGFSGLQGPPGPPGSP : ::.: : : .: :: ::.:: : : :. :..: ::: :. :: :::: : CCDS53 GIPGPVGEQGLPGAAGQDGPPGPMGPPGLPGLKGDPGSKGEKGHPGLIGLIGPPGEQGEK 1380 1390 1400 1410 1420 1430 1130 1140 1150 1160 1170 1180 pF1KB4 GEQGPSGASGPAGPRGP---PGSAGAPGKDGLNGLPGPIGPPGPRGRTGDAGPVGPPGPP :..: :..: : .: :: :: : : ::::: :: : .: :: :: : : : CCDS53 GDRGLPGTQGSPGAKGDGGIPGPAGPLGPPGPPGLPGPQGPKGNKGSTGPAGQKGDSGLP 1440 1450 1460 1470 1480 1490 1190 1200 1210 1220 1230 pF1KB4 GPPGPPGPPSAGFDFSFLP--QPPQEKAHDGGRYYRADDANVVRDRD--LEVDTTLKSLS :::: :::: : .. :: . . . : : ::: :.. : :. .:.::. CCDS53 GPPGSPGPP--GEVIQPLPILSSKKTRRHTEGMQADADD-NILDYSDGMEEIFGSLNSLK 1500 1510 1520 1530 1540 1240 1250 1260 1270 1280 1290 pF1KB4 QQIENIRSPEGSRKNPARTCRDLKMCHSDWKSGEYWIDPNQGCNLDAIKVFCNMETG-ET :.::... : :.. ::::::.::.. : :. .:::::::::::. :..::.::. .: :: CCDS53 QDIEHMKFPMGTQTNPARTCKDLQLSHPDFPDGEYWIDPNQGCSGDSFKVYCNFTSGGET 1550 1560 1570 1580 1590 1600 1300 1310 1320 1330 1340 1350 pF1KB4 CVYPTQPSVAQKNWYISKNPKDKRHVWFGESMTDGFQFEYGGQGSDPADVAIQLTFLRLM :.:: . : .. ::. ::.: ::.: . : . : .. .. .:.:::.:. CCDS53 CIYPDKKS---EGVRISSWPKEKPGSWFSE-FKRGKLLSYLDVEGNSINM-VQMTFLKLL 1610 1620 1630 1640 1650 1660 1360 1370 1380 1390 1400 1410 pF1KB4 STEASQNITYHCKNSVAYMDQQTGNLKKALLLQGSNEIEIRAEGNSRFTYSVTVDGCTSH .. : ::.::::..:.:..: ..:. ::: . :::. :. ..: : .. :::.:. 
CCDS53 TASARQNFTYHCHQSAAWYDVSSGSYDKALRFLGSNDEEMSYDNNP-FIKTL-YDGCASR 1670 1680 1690 1700 1710 1720 1420 1430 1440 1450 1460 pF1KB4 TGAWGKTVIEYKTTKTSRLPIIDVAPLDVGAPDQEFGFDVGPVCFL : . ::::: .: : ...::.:: : : .:.:::.::::::: CCDS53 KG-YEKTVIEINTPKIDQVPIVDVMINDFGDQNQKFGFEVGPVCFLG 1730 1740 1750 1760 1464 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Thu Nov 3 21:43:00 2016 done: Thu Nov 3 21:43:01 2016 Total Scan time: 7.320 Total Display time: 1.060 Function used was FASTA [36.3.4 Apr, 2011]