FASTA searches a protein or DNA sequence data bank, version 36.3.4 Apr, 2011
Please cite:
W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448
Query: pF1KE2406, 1497 aa
1>>>pF1KE2406 1497 - 1497 aa - 1497 aa
Library: human.CCDS.faa
18511270 residues in 32554 sequences
Statistics: Expectation_n fit: rho(ln(x))= 14.7199+/-0.00148; mu= -15.0269+/- 0.090
mean_var=760.2333+/-156.310, 0's: 0 Z-trim(115.2): 256 B-trim: 3 in 1/54
Lambda= 0.046516
statistics sampled from 15482 (15729) to 15482 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
ktup: 2, E-join: 1 (0.731), E-opt: 0.2 (0.483), width: 16
Scan time: 7.790
The best scores are: opt bits E(32554)
CCDS7554.1 COL17A1 gene_id:1308|Hs108|chr10 (1497) 10430 716.7 1.3e-205
CCDS8759.1 COL2A1 gene_id:1280|Hs108|chr12 (1418) 1708 131.4 2e-29
CCDS41778.1 COL2A1 gene_id:1280|Hs108|chr12 (1487) 1708 131.4 2e-29
CCDS6376.1 COL22A1 gene_id:169044|Hs108|chr8 (1626) 1603 124.4 2.9e-27
CCDS2297.1 COL3A1 gene_id:1281|Hs108|chr2 (1466) 1533 119.6 6.9e-26
CCDS42829.1 COL4A3 gene_id:1285|Hs108|chr2 (1670) 1489 116.7 5.9e-25
CCDS43452.1 COL11A2 gene_id:1302|Hs108|chr6 (1650) 1461 114.9 2.1e-24
CCDS33350.1 COL5A2 gene_id:1290|Hs108|chr2 (1499) 1410 111.4 2.2e-23
CCDS14542.1 COL4A6 gene_id:1288|Hs108|chrX (1690) 1402 110.9 3.4e-23
CCDS14541.1 COL4A6 gene_id:1288|Hs108|chrX (1691) 1402 110.9 3.4e-23
CCDS76010.1 COL4A6 gene_id:1288|Hs108|chrX (1707) 1398 110.6 4.1e-23
CCDS34682.1 COL1A2 gene_id:1278|Hs108|chr7 (1366) 1378 109.2 8.9e-23
CCDS42828.1 COL4A4 gene_id:1286|Hs108|chr2 (1690) 1362 108.2 2.2e-22
CCDS14543.1 COL4A5 gene_id:1287|Hs108|chrX (1685) 1359 108.0 2.5e-22
CCDS35366.1 COL4A5 gene_id:1287|Hs108|chrX (1691) 1352 107.5 3.5e-22
CCDS47447.1 COL9A1 gene_id:1297|Hs108|chr6 ( 678) 1324 105.3 6.8e-22
CCDS4971.1 COL9A1 gene_id:1297|Hs108|chr6 ( 921) 1324 105.4 8.4e-22
CCDS41907.1 COL4A2 gene_id:1284|Hs108|chr13 (1712) 1314 105.0 2e-21
CCDS780.2 COL11A1 gene_id:1301|Hs108|chr1 (1690) 1299 104.0 4.1e-21
CCDS53348.1 COL11A1 gene_id:1301|Hs108|chr1 (1767) 1299 104.0 4.2e-21
CCDS778.1 COL11A1 gene_id:1301|Hs108|chr1 (1806) 1299 104.0 4.3e-21
CCDS6982.1 COL5A1 gene_id:1289|Hs108|chr9 (1838) 1299 104.0 4.3e-21
CCDS75932.1 COL5A1 gene_id:1289|Hs108|chr9 (1838) 1299 104.0 4.3e-21
CCDS6802.1 COL27A1 gene_id:85301|Hs108|chr9 (1860) 1299 104.0 4.4e-21
CCDS76008.1 COL4A6 gene_id:1288|Hs108|chrX (1633) 1290 103.4 6.1e-21
CCDS11561.1 COL1A1 gene_id:1277|Hs108|chr17 (1464) 1287 103.1 6.5e-21
CCDS13505.1 COL9A3 gene_id:1299|Hs108|chr20 ( 684) 1274 101.9 7e-21
CCDS12222.1 COL5A3 gene_id:50509|Hs108|chr19 (1745) 1281 102.8 9.6e-21
CCDS9511.1 COL4A1 gene_id:1282|Hs108|chr13 (1669) 1256 101.1 3e-20
CCDS42971.1 COL18A1 gene_id:80781|Hs108|chr21 (1339) 1223 98.8 1.2e-19
CCDS42972.1 COL18A1 gene_id:80781|Hs108|chr21 (1519) 1223 98.8 1.3e-19
CCDS2773.1 COL7A1 gene_id:1294|Hs108|chr3 (2944) 1232 99.7 1.3e-19
CCDS77643.1 COL18A1 gene_id:80781|Hs108|chr21 (1754) 1223 98.9 1.4e-19
CCDS41353.1 COL24A1 gene_id:255631|Hs108|chr1 (1714) 1217 98.5 1.9e-19
CCDS5105.1 COL10A1 gene_id:1300|Hs108|chr6 ( 680) 1183 95.8 4.8e-19
CCDS58922.1 COL25A1 gene_id:84570|Hs108|chr4 ( 645) 1167 94.7 9.8e-19
CCDS43553.1 COL28A1 gene_id:340267|Hs108|chr7 (1125) 1147 93.6 3.6e-18
CCDS450.1 COL9A2 gene_id:1298|Hs108|chr1 ( 689) 1133 92.4 5e-18
CCDS43259.1 COL25A1 gene_id:84570|Hs108|chr4 ( 642) 1088 89.4 3.9e-17
CCDS43258.1 COL25A1 gene_id:84570|Hs108|chr4 ( 654) 1088 89.4 3.9e-17
CCDS2934.1 COL8A1 gene_id:1295|Hs108|chr3 ( 744) 1001 83.6 2.4e-15
CCDS4436.1 COL23A1 gene_id:91522|Hs108|chr5 ( 540) 992 82.9 3e-15
CCDS76009.1 COL4A6 gene_id:1288|Hs108|chrX (1666) 991 83.3 6.7e-15
CCDS76649.1 COL4A1 gene_id:1282|Hs108|chr13 ( 519) 963 80.9 1.1e-14
CCDS4970.1 COL19A1 gene_id:1310|Hs108|chr6 (1142) 967 81.5 1.6e-14
CCDS83099.1 COL21A1 gene_id:81578|Hs108|chr6 ( 954) 944 79.9 4.1e-14
CCDS55025.1 COL21A1 gene_id:81578|Hs108|chr6 ( 957) 944 79.9 4.1e-14
CCDS41297.1 COL16A1 gene_id:1307|Hs108|chr1 (1604) 951 80.6 4.2e-14
CCDS72756.1 COL8A2 gene_id:1296|Hs108|chr1 ( 638) 919 78.0 9.9e-14
CCDS403.1 COL8A2 gene_id:1296|Hs108|chr1 ( 703) 919 78.1 1.1e-13
>>CCDS7554.1 COL17A1 gene_id:1308|Hs108|chr10 (1497 aa)
initn: 10430 init1: 10430 opt: 10430 Z-score: 3804.9 bits: 716.7 E(32554): 1.3e-205
Smith-Waterman score: 10430; 99.8% identity (99.9% similar) in 1497 aa overlap (1-1497:1-1497)
10 20 30 40 50 60
pF1KE2 MDVTKKNKRDGTEVTERIVTETVTTRLTSLPPKGGTSNGYAKTASLGGGSRLEKQSLTHG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 MDVTKKNKRDGTEVTERIVTETVTTRLTSLPPKGGTSNGYAKTASLGGGSRLEKQSLTHG
10 20 30 40 50 60
70 80 90 100 110 120
pF1KE2 SSGYINSTGSTRGHASTSSYRRAHSPASTLPNSPGSTFERKTHVTRHAYEGSSSGNSSPE
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 SSGYINSTGSTRGHASTSSYRRAHSPASTLPNSPGSTFERKTHVTRHAYEGSSSGNSSPE
70 80 90 100 110 120
130 140 150 160 170 180
pF1KE2 YPRKEFASSSTRGRSQTRESEIRVRLQSASPSTRWTELDDVKRLLKGSRSASVSPTRNSS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 YPRKEFASSSTRGRSQTRESEIRVRLQSASPSTRWTELDDVKRLLKGSRSASVSPTRNSS
130 140 150 160 170 180
190 200 210 220 230 240
pF1KE2 NTLPIPKKGTVETKIVTASSQSVSGTYDAMILDANLPSHVWSSTLPAGSSMGTYHNNMTT
::::::::::::::::::::::::::::: ::::::::::::::::::::::::::::::
CCDS75 NTLPIPKKGTVETKIVTASSQSVSGTYDATILDANLPSHVWSSTLPAGSSMGTYHNNMTT
190 200 210 220 230 240
250 260 270 280 290 300
pF1KE2 QSSSLLNTNAYSAGSVFGVPNNMASCSPTLHPGLSTSSSVFGMQNNLAPSLTTLSHGTTT
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 QSSSLLNTNAYSAGSVFGVPNNMASCSPTLHPGLSTSSSVFGMQNNLAPSLTTLSHGTTT
250 260 270 280 290 300
310 320 330 340 350 360
pF1KE2 TSTAYGVKKNMPQSPAAVNTGVSTSAACTTSVQSDDLLHKDCKFLILEKDNTPAKKEMEL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 TSTAYGVKKNMPQSPAAVNTGVSTSAACTTSVQSDDLLHKDCKFLILEKDNTPAKKEMEL
310 320 330 340 350 360
370 380 390 400 410 420
pF1KE2 LIMTKDSGKVFTASPASIAATSFSEDTLKKEKQAAYNADSGLKAEANGDLKTVSTKGKTT
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 LIMTKDSGKVFTASPASIAATSFSEDTLKKEKQAAYNADSGLKAEANGDLKTVSTKGKTT
370 380 390 400 410 420
430 440 450 460 470 480
pF1KE2 TADIHSYSSSGGGGSGGGGGVGGAGGGPWGPAPAWCPCGSCCSWWKWLLGLLLTWLLLLG
:::::::.::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 TADIHSYGSSGGGGSGGGGGVGGAGGGPWGPAPAWCPCGSCCSWWKWLLGLLLTWLLLLG
430 440 450 460 470 480
490 500 510 520 530 540
pF1KE2 LLFGLIALAEEVRKLKARVDELERIRRSILPYGDSMDRIEKDRLQGMAPAAGADLDKIGL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 LLFGLIALAEEVRKLKARVDELERIRRSILPYGDSMDRIEKDRLQGMAPAAGADLDKIGL
490 500 510 520 530 540
550 560 570 580 590 600
pF1KE2 HSDSQEELWMFVRKKLMMEQENGNLRGSPGPKGDMGSPGPKGDRGFPGTPGIPGPLGHPG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 HSDSQEELWMFVRKKLMMEQENGNLRGSPGPKGDMGSPGPKGDRGFPGTPGIPGPLGHPG
550 560 570 580 590 600
610 620 630 640 650 660
pF1KE2 PQGPKGQKGSVGDPGMEGPMGQRGREGPMGPRGEAGPPGSGEKGERGAAGEPGPHGPPGV
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 PQGPKGQKGSVGDPGMEGPMGQRGREGPMGPRGEAGPPGSGEKGERGAAGEPGPHGPPGV
610 620 630 640 650 660
670 680 690 700 710 720
pF1KE2 PGSVGPKGSSGSPGPQGPPGPVGLQGLRGEVGLPGVKGDKGPVGPPGPKGDQGEKGPRGL
::::::::::::::::::::::::::::::::::::::::::.:::::::::::::::::
CCDS75 PGSVGPKGSSGSPGPQGPPGPVGLQGLRGEVGLPGVKGDKGPMGPPGPKGDQGEKGPRGL
670 680 690 700 710 720
730 740 750 760 770 780
pF1KE2 TGEPGMRGLPGAVGEPGAKGAMGPAGPDGHQGPRGEQGLTGMPGIRGPPGPSGDPGKPGL
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 TGEPGMRGLPGAVGEPGAKGAMGPAGPDGHQGPRGEQGLTGMPGIRGPPGPSGDPGKPGL
730 740 750 760 770 780
790 800 810 820 830 840
pF1KE2 TGPQGPQGLPGTPGRPGIKGEPGAPGKIVTSEGSSMLTVPGPPGPPGAMGPPGPPGAPGP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 TGPQGPQGLPGTPGRPGIKGEPGAPGKIVTSEGSSMLTVPGPPGPPGAMGPPGPPGAPGP
790 800 810 820 830 840
850 860 870 880 890 900
pF1KE2 AGPAGLPGHQEVLNLQGPPGPPGPRGPPGPSIPGPPGPRGPPGEGLPGPPGPPGSFLSNS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 AGPAGLPGHQEVLNLQGPPGPPGPRGPPGPSIPGPPGPRGPPGEGLPGPPGPPGSFLSNS
850 860 870 880 890 900
910 920 930 940 950 960
pF1KE2 ETFLSGPPGPPGPPGPKGDQGPPGPRGHQGEQGLPGFSTSGSSSFGLNLQGPPGPPGPQG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 ETFLSGPPGPPGPPGPKGDQGPPGPRGHQGEQGLPGFSTSGSSSFGLNLQGPPGPPGPQG
910 920 930 940 950 960
970 980 990 1000 1010 1020
pF1KE2 PKGDKGDPGVPGALGIPSGPSEGGSSSTMYVSGPPGPPGPPGPPGSISSSGQEIQQYISE
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 PKGDKGDPGVPGALGIPSGPSEGGSSSTMYVSGPPGPPGPPGPPGSISSSGQEIQQYISE
970 980 990 1000 1010 1020
1030 1040 1050 1060 1070 1080
pF1KE2 YMQSDSIRSYLSGVQGPPGPPGPPGPVTTITGETFDYSELASHVVSYLRTSGYGVSLFSS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 YMQSDSIRSYLSGVQGPPGPPGPPGPVTTITGETFDYSELASHVVSYLRTSGYGVSLFSS
1030 1040 1050 1060 1070 1080
1090 1100 1110 1120 1130 1140
pF1KE2 SISSEDILAVLQRDDVRQYLRQYLMGPRGPPGPPGASGDGSLLSLDYAELSSRILSYMSS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 SISSEDILAVLQRDDVRQYLRQYLMGPRGPPGPPGASGDGSLLSLDYAELSSRILSYMSS
1090 1100 1110 1120 1130 1140
1150 1160 1170 1180 1190 1200
pF1KE2 SGISIGLPGPPGPPGLPGTSYEELLSLLRGSEFRGIVGPPGPPGPPGIPGNVWSSISVED
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 SGISIGLPGPPGPPGLPGTSYEELLSLLRGSEFRGIVGPPGPPGPPGIPGNVWSSISVED
1150 1160 1170 1180 1190 1200
1210 1220 1230 1240 1250 1260
pF1KE2 LSSYLHTAGLSFIPGPPGPPGPPGPRGPPGVSGALATYAAENSDSFRSELISYLTSPDVR
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 LSSYLHTAGLSFIPGPPGPPGPPGPRGPPGVSGALATYAAENSDSFRSELISYLTSPDVR
1210 1220 1230 1240 1250 1260
1270 1280 1290 1300 1310 1320
pF1KE2 SFIVGPPGPPGPQGPPGDSRLLSTDASHSRGSSSSSHSSSVRRGSSYSSSMSTGGGGAGS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 SFIVGPPGPPGPQGPPGDSRLLSTDASHSRGSSSSSHSSSVRRGSSYSSSMSTGGGGAGS
1270 1280 1290 1300 1310 1320
1330 1340 1350 1360 1370 1380
pF1KE2 LGAGGAFGEAAGDRGPYGTDIGPGGGYGAAAEGGMYAGNGGLLGADFAGDLDYNELAVRV
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 LGAGGAFGEAAGDRGPYGTDIGPGGGYGAAAEGGMYAGNGGLLGADFAGDLDYNELAVRV
1330 1340 1350 1360 1370 1380
1390 1400 1410 1420 1430 1440
pF1KE2 SESMQRQGLLQGMAYTVQGPPGQPGPQGPPGISKVFSAYSNVTADLMDFFQTYGAIQGPP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 SESMQRQGLLQGMAYTVQGPPGQPGPQGPPGISKVFSAYSNVTADLMDFFQTYGAIQGPP
1390 1400 1410 1420 1430 1440
1450 1460 1470 1480 1490
pF1KE2 GQKGEMGTPGPKGDRGPAGPPGHPGPPGPRGHKGEKGDKGDQVYAGRRRRRSIAVKP
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS75 GQKGEMGTPGPKGDRGPAGPPGHPGPPGPRGHKGEKGDKGDQVYAGRRRRRSIAVKP
1450 1460 1470 1480 1490
>>CCDS8759.1 COL2A1 gene_id:1280|Hs108|chr12 (1418 aa)
initn: 2077 init1: 715 opt: 1708 Z-score: 641.8 bits: 131.4 E(32554): 2e-29
Smith-Waterman score: 1842; 37.6% identity (50.8% similar) in 991 aa overlap (567-1488:189-1064)
540 550 560 570 580 590
pF1KE2 KIGLHSDSQEELWMFVRKKLMMEQENGNLRGSPGPKGDMGSPGPKGDRGFPGTPGIPGPL
:.:: :. : :::.: ::::::::.::
CCDS87 GEPGEPGVSGPMGPRGPPGPPGKPGDDGEAGKPGKAGERGPPGPQGARGFPGTPGLPGVK
160 170 180 190 200 210
600 610 620 630 640
pF1KE2 GHPGPQGPKGQKGSVGDPGMEGPMG---QRGREGPMGPRG------EAGPPGS----GEK
:: : : : :: .: ::..: : . : ::::::: ..:: :. :.
CCDS87 GHRGYPGLDGAKGEAGAPGVKGESGSPGENGSPGPMGPRGLPGERGRTGPAGAAGARGND
220 230 240 250 260 270
650 660 670 680 690 700
pF1KE2 GERGAAGEPGPHGPPGVPGSVGPKGSSGSPGPQGPPGPVGLQGLRGEVGLPGVKGDKGPV
:. : :: ::: :: : :: : :..: :: : :: : :: ::: : :: : :
CCDS87 GQPGPAGPPGPVGPAGGPGFPGAPGAKGEAGPTGARGPEGAQGPRGEPGTPGSPGPAGAS
280 290 300 310 320 330
710 720 730 740 750
pF1KE2 GPPGPKGDQGEKGPRGLTGEPGMRGLPGAVGEPGAKGAMGPAGPDGHQG-P-----RGEQ
: :: : : :: : : : :.:: : :: .:: :: :: :. : : .:::
CCDS87 GNPGTDGIPGAKGSAGAPGIAGAPGFPGPRGPPGPQGATGPLGPKGQTGEPGIAGFKGEQ
340 350 360 370 380 390
760 770 780 790 800
pF1KE2 GLTGMPGIRGP---PGPSGDPGK------PGLTGPQGP---QGLPGT---PGRPGIKGEP
: : :: :: :::.:. :: :: .:: :: .: ::. ::. :. :
CCDS87 GPKGEPGPAGPQGAPGPAGEEGKRGARGEPGGVGPIGPPGERGAPGNRGFPGQDGLAGPK
400 410 420 430 440 450
810 820 830 840 850
pF1KE2 GAPGKIVTSEGSSMLTVP-GPPGPPGAMGPPGPPGA------PGPAGPAGLPGHQEVLNL
::::. .: : :. : : : :: : :: ::: :: ::: : : . . .
CCDS87 GAPGE----RGPSGLAGPKGANGDPGRPGEPGLPGARGLTGRPGDAGPQGKVGPSGAPGE
460 470 480 490 500 510
860 870 880 890 900
pF1KE2 QGPPGPPGPRGPPG-PSIPGPPGPRGPPGE-------GLPGPPGPPGSFLSNSETFLSGP
.: ::::::.: : :.. : :::.: :: :::: :: : ...:: .::
CCDS87 DGRPGPPGPQGARGQPGVMGFPGPKGANGEPGKAGEKGLPGAPGLRGLPGKDGETGAAGP
520 530 540 550 560 570
910 920 930 940 950 960
pF1KE2 PGPPGPPGPKGDQGPPGPRGHQGEQGLPGFSTSGSSSFGLNLQGPPGPPGPQGPKGDKGD
::: :: : .:.:: ::: : :: : :: :.. .. : : :: ::.:..:
CCDS87 PGPAGPAGERGEQGAPGPSGFQGLPGPPGPPGEGGKPGDQGVPGEAGAPGLVGPRGERGF
580 590 600 610 620 630
970 980 990 1000 1010 1020
pF1KE2 PG---VPGALGI--PSG-PSEGGSSSTMYVSGPPGPPGPPGPPGSISSSGQEIQQYISEY
:: ::: :. : : :. :... .::: :::: :::: . :.. :.
CCDS87 PGERGSPGAQGLQGPRGLPGTPGTDGPKGASGPAGPPGAQGPPGLQGMPGERGAAGIAGP
640 650 660 670 680 690
1030 1040 1050 1060 1070
pF1KE2 MQSDSIRSYLSGVQGPPG---------PPGPPGPVTTITGETFDYSELASHVVSYLRTSG
..: .: .: :: : :::::. : . . .:.. :
CCDS87 -KGDRGDVGEKGPEGAPGKDGGRGLTGPIGPPGPA----GANGEKGEVG--------PPG
700 710 720 730 740
1080 1090 1100 1110 1120 1130
pF1KE2 YGVSLFSSSISSEDILAVLQRDDVRQYLRQYLMGPRGPPGPPGASGDGSLLSLDYAELSS
. : . . .: : .. :: : :::::.:. . . . .: ..
CCDS87 PAGSAGARGAPGE-------RGETGPP------GPAGFAGPPGADGQPGAKG-EQGEAGQ
750 760 770 780
1140 1150 1160 1170 1180 1190
pF1KE2 RILSYMSSSGISIGLPGPPGPPGLPGTSYEELLSLLRGSEFRGIVGPPGPPGPPGIPGNV
. : . : ::: :: : :: . .. .:. :: :::: : :: : :
CCDS87 K--------G-DAGAPGPQGPSGAPGPQGPTGVTGPKGA--RGAQGPPGATGFPGAAGRV
790 800 810 820 830
1200 1210 1220 1230 1240 1250
pF1KE2 WSSISVEDLSSYLHTAGLSFIPGPPGPPGPPGPRGPPGVSGALATYAAENSDSFRSELIS
: . ::::::::: : :: :. : ... :.
CCDS87 -------------GPPGSNGNPGPPGPPGPSGKDGPKGARG-------DSGPPGRA----
840 850 860 870
1260 1270 1280 1290 1300
pF1KE2 YLTSPDVRSFIVGPPGPPGPQGPPGDSRLLSTDASHSRGSSSSSHSSSV-----RRGSSY
: .. :: :::: .: :::. . : : .. . . .. .::
CCDS87 --GEPGLQ----GPAGPPGEKGEPGDDG--PSGAEGPPGPQGLAGQRGIVGLPGQRGERG
880 890 900 910 920
1310 1320 1330 1340 1350 1360
pF1KE2 SSSMSTGGGGAGSLGAGGAFGEAAGDRGPYGTDIGPGGGYGAAAEGGMYAGNGGLLGADF
.. .: :. :: :: .::::: : .:: : : :.: : : :::
CCDS87 FPGLPGPSGEPGKQGAPGA----SGDRGPPGP-VGPPGLTGPAGE----PGREGSPGADG
930 940 950 960 970
1370 1380 1390 1400 1410 1420
pF1KE2 AGDLDYNELAVRVSESMQRQGLLQGMAYTVQGPPGQPGPQGPPGISKVFSAYSNVTADLM
: :. :. . . : . : . ::::.::: :: :
CCDS87 PPGRDG---AAGVKGDRGETGAVG--APGAPGPPGSPGPAGPTG----------------
980 990 1000 1010
1430 1440 1450 1460 1470 1480
pF1KE2 DFFQTYGAIQGPPGQKGEMGTPGPKGDRGPAGPPGHPGPPGPRGHKGEKGDKGDQVYAGR
:..:: :. :: : :::: : :: :::: ::: :. :.. :.
CCDS87 -----------KQGDRGEAGAQGPMGPSGPAGARGIQGPQGPRGDKGEAGEPGERGLKGH
1020 1030 1040 1050 1060
1490
pF1KE2 RRRRSIAVKP
:
CCDS87 RGFTGLQGLPGPPGPSGDQGASGPAGPSGPRGPPGPVGPSGKDGANGIPGPIGPPGPRGR
1070 1080 1090 1100 1110 1120
>>CCDS41778.1 COL2A1 gene_id:1280|Hs108|chr12 (1487 aa)
initn: 2077 init1: 715 opt: 1708 Z-score: 641.6 bits: 131.4 E(32554): 2e-29
Smith-Waterman score: 1842; 37.6% identity (50.8% similar) in 991 aa overlap (567-1488:258-1133)
540 550 560 570 580 590
pF1KE2 KIGLHSDSQEELWMFVRKKLMMEQENGNLRGSPGPKGDMGSPGPKGDRGFPGTPGIPGPL
:.:: :. : :::.: ::::::::.::
CCDS41 GEPGEPGVSGPMGPRGPPGPPGKPGDDGEAGKPGKAGERGPPGPQGARGFPGTPGLPGVK
230 240 250 260 270 280
600 610 620 630 640
pF1KE2 GHPGPQGPKGQKGSVGDPGMEGPMG---QRGREGPMGPRG------EAGPPGS----GEK
:: : : : :: .: ::..: : . : ::::::: ..:: :. :.
CCDS41 GHRGYPGLDGAKGEAGAPGVKGESGSPGENGSPGPMGPRGLPGERGRTGPAGAAGARGND
290 300 310 320 330 340
650 660 670 680 690 700
pF1KE2 GERGAAGEPGPHGPPGVPGSVGPKGSSGSPGPQGPPGPVGLQGLRGEVGLPGVKGDKGPV
:. : :: ::: :: : :: : :..: :: : :: : :: ::: : :: : :
CCDS41 GQPGPAGPPGPVGPAGGPGFPGAPGAKGEAGPTGARGPEGAQGPRGEPGTPGSPGPAGAS
350 360 370 380 390 400
710 720 730 740 750
pF1KE2 GPPGPKGDQGEKGPRGLTGEPGMRGLPGAVGEPGAKGAMGPAGPDGHQG-P-----RGEQ
: :: : : :: : : : :.:: : :: .:: :: :: :. : : .:::
CCDS41 GNPGTDGIPGAKGSAGAPGIAGAPGFPGPRGPPGPQGATGPLGPKGQTGEPGIAGFKGEQ
410 420 430 440 450 460
760 770 780 790 800
pF1KE2 GLTGMPGIRGP---PGPSGDPGK------PGLTGPQGP---QGLPGT---PGRPGIKGEP
: : :: :: :::.:. :: :: .:: :: .: ::. ::. :. :
CCDS41 GPKGEPGPAGPQGAPGPAGEEGKRGARGEPGGVGPIGPPGERGAPGNRGFPGQDGLAGPK
470 480 490 500 510 520
810 820 830 840 850
pF1KE2 GAPGKIVTSEGSSMLTVP-GPPGPPGAMGPPGPPGA------PGPAGPAGLPGHQEVLNL
::::. .: : :. : : : :: : :: ::: :: ::: : : . . .
CCDS41 GAPGE----RGPSGLAGPKGANGDPGRPGEPGLPGARGLTGRPGDAGPQGKVGPSGAPGE
530 540 550 560 570 580
860 870 880 890 900
pF1KE2 QGPPGPPGPRGPPG-PSIPGPPGPRGPPGE-------GLPGPPGPPGSFLSNSETFLSGP
.: ::::::.: : :.. : :::.: :: :::: :: : ...:: .::
CCDS41 DGRPGPPGPQGARGQPGVMGFPGPKGANGEPGKAGEKGLPGAPGLRGLPGKDGETGAAGP
590 600 610 620 630 640
910 920 930 940 950 960
pF1KE2 PGPPGPPGPKGDQGPPGPRGHQGEQGLPGFSTSGSSSFGLNLQGPPGPPGPQGPKGDKGD
::: :: : .:.:: ::: : :: : :: :.. .. : : :: ::.:..:
CCDS41 PGPAGPAGERGEQGAPGPSGFQGLPGPPGPPGEGGKPGDQGVPGEAGAPGLVGPRGERGF
650 660 670 680 690 700
970 980 990 1000 1010 1020
pF1KE2 PG---VPGALGI--PSG-PSEGGSSSTMYVSGPPGPPGPPGPPGSISSSGQEIQQYISEY
:: ::: :. : : :. :... .::: :::: :::: . :.. :.
CCDS41 PGERGSPGAQGLQGPRGLPGTPGTDGPKGASGPAGPPGAQGPPGLQGMPGERGAAGIAGP
710 720 730 740 750 760
1030 1040 1050 1060 1070
pF1KE2 MQSDSIRSYLSGVQGPPG---------PPGPPGPVTTITGETFDYSELASHVVSYLRTSG
..: .: .: :: : :::::. : . . .:.. :
CCDS41 -KGDRGDVGEKGPEGAPGKDGGRGLTGPIGPPGPA----GANGEKGEVG--------PPG
770 780 790 800 810
1080 1090 1100 1110 1120 1130
pF1KE2 YGVSLFSSSISSEDILAVLQRDDVRQYLRQYLMGPRGPPGPPGASGDGSLLSLDYAELSS
. : . . .: : .. :: : :::::.:. . . . .: ..
CCDS41 PAGSAGARGAPGE-------RGETGPP------GPAGFAGPPGADGQPGAKG-EQGEAGQ
820 830 840 850
1140 1150 1160 1170 1180 1190
pF1KE2 RILSYMSSSGISIGLPGPPGPPGLPGTSYEELLSLLRGSEFRGIVGPPGPPGPPGIPGNV
. : . : ::: :: : :: . .. .:. :: :::: : :: : :
CCDS41 K--------G-DAGAPGPQGPSGAPGPQGPTGVTGPKGA--RGAQGPPGATGFPGAAGRV
860 870 880 890 900
1200 1210 1220 1230 1240 1250
pF1KE2 WSSISVEDLSSYLHTAGLSFIPGPPGPPGPPGPRGPPGVSGALATYAAENSDSFRSELIS
: . ::::::::: : :: :. : ... :.
CCDS41 -------------GPPGSNGNPGPPGPPGPSGKDGPKGARG-------DSGPPGRA----
910 920 930 940
1260 1270 1280 1290 1300
pF1KE2 YLTSPDVRSFIVGPPGPPGPQGPPGDSRLLSTDASHSRGSSSSSHSSSV-----RRGSSY
: .. :: :::: .: :::. . : : .. . . .. .::
CCDS41 --GEPGLQ----GPAGPPGEKGEPGDDG--PSGAEGPPGPQGLAGQRGIVGLPGQRGERG
950 960 970 980 990
1310 1320 1330 1340 1350 1360
pF1KE2 SSSMSTGGGGAGSLGAGGAFGEAAGDRGPYGTDIGPGGGYGAAAEGGMYAGNGGLLGADF
.. .: :. :: :: .::::: : .:: : : :.: : : :::
CCDS41 FPGLPGPSGEPGKQGAPGA----SGDRGPPGP-VGPPGLTGPAGE----PGREGSPGADG
1000 1010 1020 1030 1040
1370 1380 1390 1400 1410 1420
pF1KE2 AGDLDYNELAVRVSESMQRQGLLQGMAYTVQGPPGQPGPQGPPGISKVFSAYSNVTADLM
: :. :. . . : . : . ::::.::: :: :
CCDS41 PPGRDG---AAGVKGDRGETGAVG--APGAPGPPGSPGPAGPTG----------------
1050 1060 1070 1080
1430 1440 1450 1460 1470 1480
pF1KE2 DFFQTYGAIQGPPGQKGEMGTPGPKGDRGPAGPPGHPGPPGPRGHKGEKGDKGDQVYAGR
:..:: :. :: : :::: : :: :::: ::: :. :.. :.
CCDS41 -----------KQGDRGEAGAQGPMGPSGPAGARGIQGPQGPRGDKGEAGEPGERGLKGH
1090 1100 1110 1120 1130
1490
pF1KE2 RRRRSIAVKP
:
CCDS41 RGFTGLQGLPGPPGPSGDQGASGPAGPSGPRGPPGPVGPSGKDGANGIPGPIGPPGPRGR
1140 1150 1160 1170 1180 1190
>>CCDS6376.1 COL22A1 gene_id:169044|Hs108|chr8 (1626 aa)
initn: 824 init1: 824 opt: 1603 Z-score: 603.0 bits: 124.4 E(32554): 2.9e-27
Smith-Waterman score: 1742; 36.8% identity (50.6% similar) in 995 aa overlap (561-1486:531-1435)
540 550 560 570 580 590
pF1KE2 AGADLDKIGLHSDSQEELWMFVRKKLMMEQENGNLRGSPGPKGDMGSPGPKGDRGFPGTP
:.:.: : ::: : :: : .:. : : :
CCDS63 IGAIGPVGAPGPKGEKGDVGIGPFGQGEKGEKGSL-GLPGPPGRDGSKGMRGEPGELGEP
510 520 530 540 550
600 610 620 630
pF1KE2 GIPGPLGHPGPQGPKG---QKGSVGDPGMEGPMGQRGREGPMGPRGEAGPPG--------
:.:: .: ::::: : : :: ::..: :..: .: : :: : ::
CCDS63 GLPGEVGMRGPQGPPGLPGPPGRVGAPGLQGERGEKGTRGEKGERGLDGFPGKPGDTGQQ
560 570 580 590 600 610
640 650 660 670 680
pF1KE2 -----SGEKGERGAAGEPGPHGPPGVPGSV-------------GPKGSSGSPGPQGPPGP
:: : .: :. :: ::::::::: ::.: .:.::: : ::
CCDS63 GRPGPSGVAGPQGEKGDVGPAGPPGVPGSVVQQEGLKGEQGAPGPRGHQGAPGPPGARGP
620 630 640 650 660 670
690 700 710 720 730
pF1KE2 VGLQGLRGEVGLPGVKGDKGPVGPPG-PK--GDQGEKGPRGLTGEPG---MRGLPGAVGE
.: .: : :: :..: :: .:::: : : :: :: :. : :: :::: .:
CCDS63 IGPEGRDGPPGLQGLRGKKGDMGPPGIPGLLGLQGPPGPPGVPGPPGPGGSPGLPGEIGF
680 690 700 710 720 730
740 750 760 770 780 790
pF1KE2 PGAKGAMGPAGPDGHQGPRGEQGLTGMPGIRGPPGPSGD---PGKPGLTGPQGPQGLPGT
:: : ::.:: :..:: : : :: .: :: :. :::::: : : ::: :
CCDS63 PGKPGPPGPTGPPGKDGPNGPPG---PPGTKGEPGERGEDGLPGKPGLRGEIGEQGLAGR
740 750 760 770 780 790
800 810 820 830 840 850
pF1KE2 PGRPGIKGEPGAPG-KIVTSEGSSMLTVPGPPGPPGAMGPPGPPGAPGPAGPAGLPGHQE
::. : : ::::: : .: ... : : :: : : : ::::: ::::
CCDS63 PGEKGEAGLPGAPGFPGVRGEKGDQGE-KGELGLPGLKGDRGEKGEAGPAGPPGLPGTTS
800 810 820 830 840 850
860 870 880 890 900
pF1KE2 VLNLQGP--PGPPGPRGPPG-PSIPGPPGPRGPPGE-GLPGPPGPPGSFLSNSETFLSGP
... . : :: ::.: : :..:: :: .: ::: : :: ::::. ... :
CCDS63 LFTPH-PRMPGEQGPKGEKGDPGLPGEPGLQGRPGELGPQGPTGPPGA---KGQEGAHGA
860 870 880 890 900 910
910 920 930 940 950 960
pF1KE2 PGPPGPPGPKGDQGPPGPRGHQGEQGLPGF-STSGSSSF-GLN-LQGPPGPPGPQGPKGD
:: : :: : : ::: : : : ::. .: :... : . : : ::: ::.::
CCDS63 PGAAGNPGAPGHVGAPGPSGPPGSVGAPGLRGTPGKDGERGEKGAAGEEGSPGPVGPRGD
920 930 940 950 960 970
970 980 990 1000 1010 1020
pF1KE2 KGDPGVPGALGIPSGPSEGGSSSTMYVSGPPGPPGPPGPPGSISS-SGQEIQQYISEYMQ
: ::.:: : : .. : . . : :: ::: : .. .. :.: .. ..
CCDS63 PGAPGLPG----PPGKGKDGEPG---LRGSPGLPGPLGTKAACGKVRGSENCALGGQCVK
980 990 1000 1010 1020
1030 1040 1050 1060 1070
pF1KE2 SDSIRSYLSGVQGPPGPPG-----PPGPVTTITGETFDYSELASHVVSYLRTSGYGVSLF
.: . : : : :: :::: .: : . .:. :. :
CCDS63 GDRGAPGIPGSPGSRGDPGIGVAGPPGP----SGPPGDKGSPGSR----------GLPGF
1030 1040 1050 1060 1070
1080 1090 1100 1110 1120 1130
pF1KE2 SSSISSEDILAVLQRDDVRQYLRQYLMGPRGPPGPPGASGDGSLLSL-DYAELSSRILSY
. . :: . : ::::: :: : :::: : :.. . .
CCDS63 PGPQGPAG------RDGAPGN-----PGERGPPGKPGLS---SLLSPGDINLLAKDVCND
1080 1090 1100 1110
1140 1150 1160 1170 1180 1190
pF1KE2 MSSSGISIGLPGPPGPPGLPGTSYEELLSLLRGSEF----RGIVGPPGPPGPPGIPGNVW
::::: ::::: . .. . : : .: .:::: :::::: :
CCDS63 CP--------PGPPGLPGLPGFKGDKGVPGKPGREGTEGKKGEAGPPGLPGPPGIAGPQG
1120 1130 1140 1150 1160
1200 1210 1220 1230 1240 1250
pF1KE2 SSI--SVEDLSSYLHTAGLSFIPGPPGPPGPPGPRGPPGVSGALATYAAENSDSFRSELI
:. ... . : .:: :::: ::: : :..:: . . ..: . ..
CCDS63 SQGERGADGEVGQKGDQGHPGVPGFMGPPGNPGPPGADGIAGAAGPPGIQGSPGKEGPPG
1170 1180 1190 1200 1210 1220
1260 1270 1280 1290 1300
pF1KE2 SYLTS--PDV-----RSFIVGPPGPPGPQGPPGDSRLLSTDASHSRGSSSSSHSSSVRRG
: : . . : ::::: : :. : . ... :: . .:
CCDS63 PQGPSGLPGIPGEEGKEGRDGKPGPPGEPGKAGEPGLPGPEGA--RGPPGF-------KG
1230 1240 1250 1260 1270
1310 1320 1330 1340 1350 1360
pF1KE2 SSYSSSMSTGGGGAGSLGAGGAFGEAA--GDRGPYGTDIGPGGGYGAAAEGGMYAGNGGL
. .:. : .:..: : : . :: :: : . :: : : ...: :. :
CCDS63 HTGDSGAPGPRGESGAMGLPGQEGLPGKDGDTGPTGPQ-GPQGPRGPPGKNGS-PGSPGE
1280 1290 1300 1310 1320 1330
1370 1380 1390 1400 1410 1420
pF1KE2 LG-ADFAGDLDYNELAVRVSESMQRQGLLQGMAYTVQGPPGQPGPQGPPGISKVFSAYSN
: . :. . :.. . . : :. .::::.:: .: :: : .
CCDS63 PGPSGTPGQ--------KGSKGENGSPGLPGF-LGPRGPPGEPGEKGVPGKEGVPGK---
1340 1350 1360 1370 1380
1430 1440 1450 1460 1470 1480
pF1KE2 VTADLMDFFQTYGAIQGPPGQKGEMGTPGPKGDRGPAGPPGHPGPPGPRGHKGEKGDKGD
: :: ::: : :: :::.:: : :.:: :: ::::. : :
CCDS63 ---------------PGEPGFKGERGDPGIKGDKGPPGGKGQPGDPGIPGHKGHTGLMGP
1390 1400 1410 1420 1430
1490
pF1KE2 QVYAGRRRRRSIAVKP
: :
CCDS63 QGLPGENGPVGPPGPPGQPGFPGLRGESPSMETLRRLIQEELGKQLETRLAYLLAQMPPA
1440 1450 1460 1470 1480 1490
>>CCDS2297.1 COL3A1 gene_id:1281|Hs108|chr2 (1466 aa)
initn: 787 init1: 787 opt: 1533 Z-score: 578.2 bits: 119.6 E(32554): 6.9e-26
Smith-Waterman score: 1787; 37.0% identity (50.4% similar) in 962 aa overlap (567-1480:213-1041)
540 550 560 570 580 590
pF1KE2 KIGLHSDSQEELWMFVRKKLMMEQENGNLRGSPGPKGDMGSPGPKGDRGFPGTPGIPGPL
: ::: : .: :: : : : :: ::
CCDS22 GPPGTSGHPGSPGSPGYQGPPGEPGQAGPSGPPGPPGAIGPSGPAGKDGESGRPGRPGER
190 200 210 220 230 240
600 610 620 630 640 650
pF1KE2 GHPGPQGPKGQKGSVGDPGMEGPMGQRGREGPMGPRGEAGPPG-SGEKGERGAAGEPGPH
: ::: : :: : : :::.: .:: .: : .::.: :: .::.: : : :::
CCDS22 GLPGPPGIKGPAGIPGFPGMKG---HRGFDGRNGEKGETGAPGLKGENGLPGENGAPGPM
250 260 270 280 290
660 670 680 690 700
pF1KE2 GP---------PGVPGSVGPKGSSGSPGPQGPPGPVGLQGLRGEVGLPGVKGDKGPVGPP
:: ::.::..: .:..:. : .: ::: : : : : ::.::. ::.: :
CCDS22 GPRGAPGERGRPGLPGAAGARGNDGARGSDGQPGPPGPPGTAGFPGSPGAKGEVGPAGSP
300 310 320 330 340 350
710 720 730 740 750
pF1KE2 GPKG---DQGEKGPRGLTGEPGMRGLPGAVGEPGAKGAMGPAG-PD-----GHQGPRGEQ
: .: ..:: ::.: .: : : :: : ::.:: ::::: : : .:: :
CCDS22 GSNGAPGQRGEPGPQGHAGAQGPPGPPGINGSPGGKGEMGPAGIPGAPGLMGARGPPGPA
360 370 380 390 400 410
760 770 780 790 800 810
pF1KE2 GLTGMPGIRGPPGPSGDPGKPGLTGPQGPQGLPGTPGRPGIKGEPGAPGKIVTSEGSSML
: .: ::.:: : : : : ::.: .: : :: :: ::: : : : :
CCDS22 GANGAPGLRGGAGEPGKNGAKGEPGPRGERGEAGIPGVPGAKGEDGKDG----SPGE---
420 430 440 450 460 470
820 830 840 850 860 870
pF1KE2 TVPGPPGPPGAMGPPGPPGAPGPAGPAGLPGHQEVLNLQGPPGPPGPRGPPG-PS---IP
:: : ::: : : :: ::::: :.::.. . .: ::: :::: : :. .:
CCDS22 --PGANGLPGAAGERGAPGFRGPAGPNGIPGEKGPAGERGAPGPAGPRGAAGEPGRDGVP
480 490 500 510 520 530
880 890 900 910 920
pF1KE2 GPPGPRGPPGE-GLPGPPGPPGSFLSNSETFLSGPPGPPGP---------PGPKGDQGPP
: :: :: :: : :: : :: :..:. ::::: :: :::::..: :
CCDS22 GGPGMRGMPGSPGGPGSDGKPGPPGSQGESGRPGPPGPSGPRGQPGVMGFPGPKGNDGAP
540 550 560 570 580 590
930 940 950 960 970 980
pF1KE2 GPRGHQGEQGLPGFSTSGSSSFGLNLQ-GPPGPPGPQGPKGDKGDPGVPGALGIPSGPSE
: :..: : :: .: : : . :: ::::: :: ::::: : :: :. . :.
CCDS22 GKNGERGGPGGPG--PQGPP--GKNGETGPQGPPGPTGPGGDKGDTGPPGPQGLQGLPGT
600 610 620 630 640
990 1000 1010 1020 1030
pF1KE2 GGSSSTMYVSGPPGPPGP---PGPPGSISSSGQEIQQYISEYMQSDSIRSYLS-----GV
:: . : ::: : :: ::. ...: .. . ..:. . :
CCDS22 GGPPGENGKPGEPGPKGDAGAPGAPGGKGDAGAPGERGPPGLAGAPGLRGGAGPPGPEGG
650 660 670 680 690 700
1040 1050 1060 1070 1080 1090
pF1KE2 QGPPGPPGPPGPVTT--ITGETFDYSELASHVVSYLRTSGYGVSLFSSSISSEDILAVLQ
.: ::::::: . : . : . . :.: : . .. .. .:
CCDS22 KGAAGPPGPPGAAGTPGLQGMPGERGGLGSP----------GPKGDKGEPGGPGADGVPG
710 720 730 740 750
1100 1110 1120 1130 1140 1150
pF1KE2 RDDVRQYLRQYLMGPRGPPGPPGASGDGSLLSLDYAELSSRILSYMSSSGISIGLPGPPG
.: : :: :: :::: .:. . : .: : . :::: :
CCDS22 KDGPR--------GPTGPIGPPGPAGQPG----DKGE------------GGAPGLPGIAG
760 770 780 790
1160 1170 1180 1190 1200 1210
pF1KE2 PPGLPGTSYEELLSLLRGSEFRGIVGPPGPPGPPGIPGNVW--SSISVEDLSSYLHTAGL
: : :: :: .::::: : :: ::. .. . . . .:
CCDS22 PRGSPGE--------------RGETGPPGPAGFPGAPGQNGEPGGKGERGAPGEKGEGGP
800 810 820 830
1220 1230 1240 1250 1260 1270
pF1KE2 SFIPGPPGPPGPPGPRGPPGVSGALATYAAENSDSFRSELISYLTSPDVRSFIVGPPGPP
. :::: :: :: :: ::.: .. .. .. .: : .: : ::::
CCDS22 PGVAGPPGGSGPAGPPGPQGVKGERGSPGGPGAAGF----------PGAR----GLPGPP
840 850 860 870 880
1280 1290 1300 1310 1320 1330
pF1KE2 GPQGPPGDSRLLSTDASHSRGSSSSSHSSSVRRGSSYSSSMSTGGGGAGSLGAGGAFGEA
: .: :: : : : . . : ::. :: :. : .
CCDS22 GSNGNPGPP------------------------GPSGSPGKDGPPGPAGNTGAPGSPGVS
890 900 910 920
1340 1350 1360 1370 1380
pF1KE2 A--GDRGPYGTDIGPGGGYGAAAEGGMYAGNGGLLGADFAGDLDYNELAVRVSESMQRQG
. :: : : .::. .: : . : .:. ::
CCDS22 GPKGDAGQPGEKGSPGAQGPPGAPGPL--GIAGITGA-----------------------
930 940 950
1390 1400 1410 1420 1430 1440
pF1KE2 LLQGMAYTVQGPPGQPGPQGPPGISKVFSAYSNVTADLMDFFQTYGAIQGPPGQKGEMGT
.:.: ::::.:::.: :: . : . .. :. .. . . :: :: : :
CCDS22 --RGLA----GPPGMPGPRGSPGPQGVKGESGKPGANGLSGERGPPGPQGLPGLAGTAGE
960 970 980 990 1000
1450 1460 1470 1480 1490
pF1KE2 PGPKGDRGPAGPPGHPGPPGPRGHKGEKGDKGDQVYAGRRRRRSIAVKP
:: :. : : ::. : :: .: .::.:. :
CCDS22 PGRDGNPGSDGLPGRDGSPGGKGDRGENGSPGAPGAPGHPGPPGPVGPAGKSGDRGESGP
1010 1020 1030 1040 1050 1060
CCDS22 AGPAGAPGPAGSRGAPGPQGPRGDKGETGERGAAGIKGHRGFPGNPGAPGSPGPAGQQGA
1070 1080 1090 1100 1110 1120
>>CCDS42829.1 COL4A3 gene_id:1285|Hs108|chr2 (1670 aa)
initn: 1029 init1: 631 opt: 1489 Z-score: 561.5 bits: 116.7 E(32554): 5.9e-25
Smith-Waterman score: 1629; 35.8% identity (49.9% similar) in 1008 aa overlap (572-1485:170-1059)
550 560 570 580 590 600
pF1KE2 SDSQEELWMFVRKKLMMEQENGNLRGSPGPKGDMGSPGPKGDRGFPGTPGIPGPLGHPGP
::: : :: : .:.:: ::.:::.: :::
CCDS42 TLGYPGIPGAAGLKGQKGAPAKEEDIELDAKGDPGLPGAPGPQGLPGPPGFPGPVGPPGP
140 150 160 170 180 190
610 620 630 640
pF1KE2 QGPKGQKGSVGDPGMEGPMGQR--GREGPMGPRGEAGPPG------------------SG
: : :..: : .: ::.: :..: : .: .:::: .:
CCDS42 PGFFGFPGAMGPRGPKGHMGERVIGHKGERGVKGLTGPPGPPGTVIVTLTGPDNRTDLKG
200 210 220 230 240 250
650 660 670 680 690
pF1KE2 EKGERGAAGEPGPHGPPGVPG-SVGP-KGSSGSPGPQGPPGPVGLQGLRGEVGLPGVKGD
:::..:: ::::: :: :.:: : : ::. :.:: :: :: :. :. : :. : .:
CCDS42 EKGDKGAMGEPGPPGPSGLPGESYGSEKGAPGDPGLQGKPGKDGVPGFPGSEGVKGNRGF
260 270 280 290 300 310
700 710 720 730 740 750
pF1KE2 KGPVGPPGPKGDQGEKGPRGLTG---------EPGMRGLPGAVGEPGAKGAMGPAGPDGH
: .: : ::..:. :: :. : : : .: :: : ::.: .::.:: :
CCDS42 PGLMGEDGIKGQKGDIGPPGFRGPTEYYDTYQEKGDEGTPGPPGPRGARGPQGPSGPPGV
320 330 340 350 360 370
760 770 780 790 800
pF1KE2 QGPRGE-----QGLTGMPGIRGPPGPSGDPGKPGLTGPQGPQGLPGTPGRPGIKGEPGAP
: : .: : ::..: : : ::: .. : .: : :.:: :: : :: :
CCDS42 PGSPGSSRPGLRGAPGWPGLKGSKGERGRPGKDAMGTPGSP-GCAGSPGLPGSPGPPGPP
380 390 400 410 420 430
810 820 830 840
pF1KE2 GKIVTSEGS-------SMLTVPG------PPGPPGAM--------GPPGPPGAPGPAGPA
: :: .: ..: :: : : :: . :::: :: :: :
CCDS42 GDIVFRKGPPGDHGLPGYLGSPGIPGVDGPKGEPGLLCTQCPYIPGPPGLPGLPGLHGVK
440 450 460 470 480 490
850 860 870 880 890
pF1KE2 GLPGHQEVLNLQGPPGPPGPRGPPGPSIPGPPGPRGPPG-----------EGLPGPPGPP
:.::.: . .:.: :: :: : :: .:: :: .: :: :: : :: :
CCDS42 GIPGRQGAAGLKGSPGSPGNTGLPG--FPGFPGAQGDPGLKGEKGETLQPEGQVGVPGDP
500 510 520 530 540 550
900 910 920 930 940 950
pF1KE2 GSFLSNSETFLSGPPGPPGP---PGPKGDQGPPGPRGHQGEQGLPGFSTSGSSSFGLNLQ
: . .. :.: :: :: :::::. . : .: :: : :: . :: . .
CCDS42 GLRGQPGRKGLDGIPGTPGVKGLPGPKGELALSGEKGDQGPPGDPG--SPGSPGPA----
560 570 580 590 600 610
960 970 980 990 1000
pF1KE2 GPPGPPG--PQGPKGDKGDPGVPGALGIPSGPSEGGSSSTMYVSGP-PGPPGPPGPPGSI
:: :::: ::: : .: ::::: :. :.:.: . . :: : :::::::::::
CCDS42 GPAGPPGYGPQGEPGLQGTQGVPGA---PGPPGEAGPRGELSVSTPVPGPPGPPGPPGHP
620 630 640 650 660
1010 1020 1030 1040 1050 1060
pF1KE2 SSSGQE-IQQYISEYMQSDSIRSYLSGVQGPPGPPGP--PGPVTTITGETFDYSELASHV
. .: : ... . : : .: :: :: ::: . : .. .
CCDS42 GPQGPPGIPGSLGKCGDPG-----LPGPDGEPGIPGIGFPGPPGPKGDQGFPGTKGSLGC
670 680 690 700 710 720
1070 1080 1090 1100 1110 1120
pF1KE2 VSYLRTSGYGVSLFSSSISSEDILAVLQRDDVRQYLRQYLMGPRGPPGPPGASGDGSLLS
. . : . . ..: .:. :: : :: :: :. :
CCDS42 PGKMGEPGLPGKPGLPGAKGEPAVAMPG-------------GP-GTPGFPGERGN----S
730 740 750 760
1130 1140 1150 1160
pF1KE2 LDYAELSSRILSYMSSSGISIGLPGP------PGPPGL---PGTSYE--------ELLSL
...:.. : . .. . :: :: ::::: :: : :.
CCDS42 GEHGEIGLPGLPGLPGTPGNEGLDGPRGDPGQPGPPGEQGPPGRCIEGPRGAQGLPGLNG
770 780 790 800 810 820
1170 1180 1190 1200 1210 1220
pF1KE2 LRGSEFRGIVGPPGPPGPPGIPGNVWSSISVEDLSSYLHTAGLSFIPGPPGPPGPPGPRG
:.:.. : : :: : ::::: : :.. .: ::: : :: : ::
CCDS42 LKGQQ--GRRGKTGPKGDPGIPG--------LDRSGFPGETGSPGIPGHQGEMGPLGQRG
830 840 850 860 870
1230 1240 1250 1260 1270 1280
pF1KE2 PPGVSGALATYAAENSDSFRSELISYLTSPDVRSFIVGPPGPPGPQGPPGDSRLLSTDAS
:: : :. . .. .:... : . .::::::: : ::.
CCDS42 YPGNPGILGPPGEDG-------VIGMMGFPGA----IGPPGPPGNPGTPGQ---------
880 890 900 910
1290 1300 1310 1320 1330 1340
pF1KE2 HSRGSSSSSHSSSVRRGSSYSSSMSTGGGGAGSLGAGGAFGEAAGDRGPYGTDIGPGGGY
::: . : :. :. :: :: ::.: .:: .
CCDS42 --RGSPGIP-------------------GVKGQRGTPGAKGEQ-GDKG------NPGPSE
920 930 940
1350 1360 1370 1380 1390 1400
pF1KE2 GAAAEGGMYAGNGGLLGADFAGDLDYNELAVRVSESMQRQGLLQGMAYTVQGPPGQPGPQ
. . : :. :: : :::. .:. .: :. :: ..: : :::
CCDS42 ISHVIGD--KGEPGLKG--FAGN---------PGEKGNR-GV-PGMP-GLKGLKGLPGPA
950 960 970 980 990
1410 1420 1430 1440 1450 1460
pF1KE2 GPPGISKVFSAYSNVTADLMDFFQTYGAIQGPPGQKGEMGTPGPKGDRGPAGPPGHPGPP
:::: ... .: ...: ::. :.:: :: :: :: : ::. : :
CCDS42 GPPGPRGDLGSTGNPGEP---------GLRGIPGSMGNMGMPGSKGKRGTLGFPGRAGRP
1000 1010 1020 1030 1040
1470 1480 1490
pF1KE2 GPRGHKGEKGDKGDQVYAGRRRRRSIAVKP
: : .: .::::. :.
CCDS42 GLPGIHGLQGDKGEPGYSEGTRPGPPGPTGDPGLPGDMGKKGEMGQPGPPGHLGPAGPEG
1050 1060 1070 1080 1090 1100
>--
initn: 976 init1: 608 opt: 852 Z-score: 330.5 bits: 74.0 E(32554): 4.3e-12
Smith-Waterman score: 1146; 43.3% identity (56.2% similar) in 436 aa overlap (569-988:1064-1437)
540 550 560 570 580 590
pF1KE2 GLHSDSQEELWMFVRKKLMMEQENGNLRGSPGPKGDMGSPGPKGDRGFPGTPGIPGPLGH
::: : :.:: :: : : : ::: ::
CCDS42 FPGRAGRPGLPGIHGLQGDKGEPGYSEGTRPGPPGPTGDPGLPGDMGKKGEMGQPGPPGH
1040 1050 1060 1070 1080 1090
600 610 620 630 640 650
pF1KE2 PGPQGPKGQKGSVGDPGMEGPMGQRGREGPMGPRGEAGPPGSGEKGERGAAGEPGPHGPP
:: ::.: :: :.::. : : ::.:. : :: .: : :: .:::
CCDS42 LGPAGPEGAPGSPGSPGLPGK--------P-GPHGDL-----GFKGIKGLLGPPGIRGPP
1100 1110 1120 1130
660 670 680 690 700 710
pF1KE2 GVPGSVGPKGSSGSPGPQGPPGPVGLQGLRGEVGLPGVKGDKGPVG----PPGPKGDQGE
:.:: : : :::.:..: .:. :.:: :.:: .: ::::.:. :
CCDS42 GLPGF---------P---GSPGPMGIRGDQGRDGIPGPAGEKGETGLLRAPPGPRGNPGA
1140 1150 1160 1170 1180
720 730 740 750 760 770
pF1KE2 KGPRGLTGEPGMRGLPGAVGEPGAKGAMGPAGPDGHQGPRGEQG--LTGMPGIRGPPGPS
.: .: : ::. :::: : : : ::.: .: :: : : . :. : :::::
CCDS42 QGAKGDRGAPGFPGLPGRKGAMGDAGPRGPTGIEGFPGPPGLPGAIIPGQTGNRGPPGSR
1190 1200 1210 1220 1230 1240
780 790 800 810 820
pF1KE2 GDPGKPGLTGPQGP-----QGLPGTPGRPGIKGEPGAPGKIVTSEGSSMLTVPGPPGPPG
:.:: :: :: : .: :. :.:: :: ::. : . ::::
CCDS42 GSPGAPGPPGPPGSHVIGIKGDKGSMGHPGPKGPPGTAGDM---------------GPPG
1250 1260 1270 1280 1290
830 840 850 860 870 880
pF1KE2 AMGPPGPPGAPGPAGPAGLPGHQEVLNLQGPPGPPGPRGPPGPSIP-GPPGPRGPPGE--
.: :: :: ::: : :. : : . .: :: : ::::: : :::: :: ::
CCDS42 RLGAPGTPGLPGPRGDPGFQGFPGVKGEKGNPGFLGSIGPPGPIGPKGPPGVRGDPGTLK
1300 1310 1320 1330 1340 1350
890 900 910 920 930 940
pF1KE2 --GLPGPPGPPGSFLSNSETFLSGPPGPPGPPGPKGDQGPPGPRGHQGEQGLPGFSTSGS
.::: :::::. .: ..: :::::::: . :: ::::. :..: ::
CCDS42 IISLPGSPGPPGT---PGEPGMQGEPGPPGPPG---NLGPCGPRGKPGKDGKPG------
1360 1370 1380 1390 1400
950 960 970 980 990 1000
pF1KE2 SSFGLNLQGPPGPPGPQGPKGDKGDPGVPGALGIPSGPSEGGSSSTMYVSGPPGPPGPPG
::: : .: ::.::.:: :. :.:. .. :.:..
CCDS42 ---------TPGPAGEKGNKGSKGEPGPAGSDGLPGLKGKRGDSGSPATWTTRGFVFTRH
1410 1420 1430 1440 1450
1010 1020 1030 1040 1050 1060
pF1KE2 PPGSISSSGQEIQQYISEYMQSDSIRSYLSGVQGPPGPPGPPGPVTTITGETFDYSELAS
CCDS42 SQTTAIPSCPEGTVPLYSGFSFLFVQGNQRAHGQDLGTLGSCLQRFTTMPFLFCNVNDVC
1460 1470 1480 1490 1500 1510
>>CCDS43452.1 COL11A2 gene_id:1302|Hs108|chr6 (1650 aa)
initn: 718 init1: 718 opt: 1461 Z-score: 551.4 bits: 114.9 E(32554): 2.1e-24
Smith-Waterman score: 1909; 37.7% identity (51.7% similar) in 1022 aa overlap (567-1486:455-1400)
540 550 560 570 580 590
pF1KE2 KIGLHSDSQEELWMFVRKKLMMEQENGNLRGSPGPKGDMGSPGPKGDRGFPGTPGIPGPL
:. : .: :.:: :::::: : ::.::
CCDS43 GESGDLGPQGPRGPQGLTGPPGKAGRRGRAGADGARGMPGDPGVKGDRGFDGLPGLPGEK
430 440 450 460 470 480
600 610 620 630 640
pF1KE2 GH---------PGPQGPKGQKGS---VGDPGMEGPMGQRGREGPMGPRGEAGPPG-SGEK
:: ::: : :..:. .: :. : : :: :: :: : :::: :
CCDS43 GHRGDTGAQGLPGPPGEDGERGDDGEIGPRGLPGESGPRGLLGPKGPPGIPGPPGVRGMD
490 500 510 520 530 540
650 660 670 680 690 700
pF1KE2 GERGAAGEPGPHGPPGVPGSVGPKGSSGSPGPQGPPGPVGLQGLRGEVGLPGVKGDKGPV
: .: : ::.: :: ::. : :..: ::::: :: : .: .:. ::::. :. ::
CCDS43 GPQGPKGSLGPQGEPGPPGQQGTPGTQGLPGPQGAIGPHGEKGPQGKPGLPGMPGSDGPP
550 560 570 580 590 600
710 720 730 740 750
pF1KE2 G------PPGPKGDQGEKGPRGLTGEPGMRGLPGAVGEPGAKGAMGPAGPDGHQG-----
: ::: ::.:: .::.: : :: ::. :. : : :: : : :: :
CCDS43 GHPGKEGPPGTKGNQGPSGPQGPLGYPGPRGVKGVDGIRGLKGHKGEKGEDGFPGFKGDI
610 620 630 640 650 660
760 770 780 790 800
pF1KE2 -PRGEQGLTGMPGIRG---P--P----GPSGDPGKPGLTGPQGPQGLPGTPGRPGIKGEP
.:..: .:.:: :: : : ::.:::: ::: : .: :.:: :: :: .:
CCDS43 GVKGDRGEVGVPGSRGEDGPEGPKGRTGPTGDPGPPGLMGEKGKLGVPGLPGYPGRQGPK
670 680 690 700 710 720
810 820 830 840 850
pF1KE2 GA---PGKIVTSEGSSMLTVPGPPGPPGAMGPPGPPGAPGPAGPAGLPGHQEVLNLQGPP
:. :: .: .. . : :: : :: :: : :: : .: : . . . .::
CCDS43 GSLGFPGFPGASGEKGARGLSGKSGPRGERGPTGPRGQRGPRGATGKSGAKGTSGGDGPH
730 740 750 760 770 780
860 870 880 890 900 910
pF1KE2 GPPGPRGPPGPS----IPGPPGPRGPPG-EGLPGPPGPPGSFLSNSETFLSGPPGPPG--
:::: :: :::. .::: :: :::: .:::: :: : ...: :::::::
CCDS43 GPPGERGLPGPQGPNGFPGPKGPPGPPGKDGLPGHPGQRGEVGFQGKT---GPPGPPGVV
790 800 810 820 830 840
920 930 940 950 960
pF1KE2 -PPGPKGDQGP------PGPRGHQGEQGLPGFSTSGSSSFGLNLQGPPGPPGPQGPKGDK
: : :. :: ::: : ::::::: :.:. . . :::: :: .:: : .
CCDS43 GPQGAAGETGPMGERGHPGPPGPPGEQGLPG--TAGKEGTKGD-PGPPGAPGKDGPAGLR
850 860 870 880 890
970 980 990 1000
pF1KE2 GDPG---VPGALGIPS-----GPS----------EGGSSSTMYVSGPPGPPGPPGPPGSI
: :: .::. : :. ::: : :.... :::: ::: ::::.
CCDS43 GFPGERGLPGTAGGPGLKGNEGPSGPPGPAGSPGERGAAGSGGPIGPPGRPGPQGPPGAA
900 910 920 930 940 950
1010 1020 1030 1040 1050 1060
pF1KE2 SSSGQEIQQYISEYMQSDSIRSYLSGVQGPPGPPGPPGPVTTITGETFDYSELASHVVSY
. .: .. : ::::: : ::: :: ..:: : .:... .
CCDS43 GEKGVPGEKGPIGPTGRD-------GVQGPVGLPGPAGP-PGVAGEDGDKGEVGDP--GQ
960 970 980 990 1000
1070 1080 1090 1100 1110 1120
pF1KE2 LRTSGYGVSLFSSSISSEDILAVLQRDDVRQYLRQYLMGPRGPPGPPGASGDGSLLSLDY
:.: . :: ::::: : :. . . :
CCDS43 KGTKG----------------------------NKGEHGPPGPPGPIGPVGQPGAAGADG
1010 1020 1030 1040
1130 1140 1150 1160 1170 1180
pF1KE2 AELSSRILSYMSSSGI--SIGLPGPPGP---PGLPGTSYEELLSLLRGSEFRGIVGPPGP
. ......: . :. ::::: :::: : : .: : ::: ::
CCDS43 EPGARGPQGHFGAKGDEGTRGFNGPPGPIGLQGLPGPSGE------KGET--GDVGPMGP
1050 1060 1070 1080 1090
1190 1200 1210 1220 1230
pF1KE2 PGPPGIPGNVWSSISVEDLSSYLHTAGLSFIPGPPGPPG------PPGPRGPPGVSGALA
::::: : :: . :: :::: ::: .: :: ::. .
CCDS43 PGPPGPRG----------------PAGPNGADGPQGPPGGVGNLGPPGEKGEPGESGSPG
1100 1110 1120 1130
1240 1250 1260 1270 1280 1290
pF1KE2 TYAAENSDSFRSELISYLTSPDVRSFIVGPPGPPGPQGPPGDSRLLSTDASHSRGSSSSS
. . . :.: . .: : ::::::.:: ::. .. . . .. .
CCDS43 IQGEPGVKGPRGE-----RGEKGESGQPGEPGPPGPKGPTGDDGPKGNPGPVGFPGDPGP
1140 1150 1160 1170 1180 1190
1300 1310 1320 1330 1340
pF1KE2 HSSSVRRGSSYSSSMSTGGGGAGSLGAGGAFGEAA-----GDRGPYGTDIGPG--GGYGA
. . ::.. ... : :. :. : :: . : ::: :. . : :: ::
CCDS43 PGEGGPRGQDGAKGDRGEDGEPGQPGSPGPTGENGPPGPLGKRGPAGSPGSEGRQGGKGA
1200 1210 1220 1230 1240 1250
1350 1360 1370 1380 1390 1400
pF1KE2 AAEGGMYAGNG--GLLG-ADFAGDLDYNELAVRVSESMQRQGLLQGMAYTVQGPPGQPGP
.. : .. : : .: : :: . : . :. .:: . : :::: ::
CCDS43 KGDPGAIGAPGKTGPVGPAGPAGKPGPDGLR-GLPGSVGQQG--RPGATGQAGPPGPVGP
1260 1270 1280 1290 1300
1410 1420 1430 1440 1450
pF1KE2 QGPPGISKVFSAYSNVT-ADLMDFFQTYG-----------AIQGPPGQKGEMGTPGPKGD
: ::. .: .. :. .. : . :: :::::::: :: .:
CCDS43 PGLPGLRGDAGAKGEKGHPGLIGLIGPPGEQGEKGDRGLPGPQGSPGQKGEMGIPGASGP
1310 1320 1330 1340 1350 1360
1460 1470 1480 1490
pF1KE2 RGPAGPPGHPGPPGPRGHKGEKGDKGDQVYAGRRRRRSIAVKP
::.:::: ::: ::.: :: : : . :
CCDS43 IGPGGPPGLPGPAGPKGAKGATGPGGPKGEKGVQGPPGHPGPPGEVIQPLPIQMPKKTRR
1370 1380 1390 1400 1410 1420
CCDS43 SVDGSRLMQEDEAIPTGGAPGSPGGLEEIFGSLDSLREEIEQMRRPTGTQDSPARTCQDL
1430 1440 1450 1460 1470 1480
>>CCDS33350.1 COL5A2 gene_id:1290|Hs108|chr2 (1499 aa)
initn: 1379 init1: 718 opt: 1410 Z-score: 533.5 bits: 111.4 E(32554): 2.2e-23
Smith-Waterman score: 1890; 38.5% identity (51.9% similar) in 970 aa overlap (566-1488:338-1145)
540 550 560 570 580 590
pF1KE2 DKIGLHSDSQEELWMFVRKKLMMEQENGNLRGSPGPKGDMGSPGPKGDRGFPGTPGIPGP
:: :: .: .: : :.:: : :: :::
CCDS33 EGPKGEVGAPGSKGEAGPTGPMGAMGPLGPRGMPGERGRLGPQGAPGQRGAHGMPGKPGP
310 320 330 340 350 360
600 610 620 630 640
pF1KE2 LGHPGPQGPKGQKGSVGDPGMEG---PMGQRGREGPMGPRGEAGPPGS----GEKGERGA
.: : : :..: :.:::.: : : :: :::.: :::.:::: : : :.
CCDS33 MG---PLGIPGSSGFPGNPGMKGEAGPTGARGPEGPQGQRGETGPPGPVGSPGLPGAIGT
370 380 390 400 410 420
650 660 670 680 690 700
pF1KE2 AGEPGPHGPPGVPGSVGPKGSSG---SPGPQGPPGPVGLQGLRGEVGLPGVKGDKGPVGP
: :: .:: : ::. :: ::.: :::::: :: :..: :. :.:: ::. :: :
CCDS33 DGTPGAKGPTGSPGTSGPPGSAGPPGSPGPQGSTGPQGIRGQPGDPGVPGFKGEAGPKGE
430 440 450 460 470 480
710 720 730 740 750
pF1KE2 PGPKGDQGEKGP------RGLTGEPGMRGLPGAVGEPGAKGAMGPAGPDGHQGPRGEQGL
:::.: :: :: :: :.:: : :: ::: :: : : : :: ::.: ::
CCDS33 PGPHGIQGPIGPPGEEGKRGPRGDPGTVGPPGPVGERGAPGNRGFPGSDGLPGPKGAQGE
490 500 510 520 530 540
760 770 780 790 800 810
pF1KE2 TGMPGIRGPPGPSGDPGKPGLTGPQGPQGLPGTPGRPGIKGE--P-GAPGKIVTSEGSSM
: : :: : .::::.:: : : .:: :.:: : .:. : ::::. .:
CCDS33 RGPVGSSGPKGSQGDPGRPGEPGLPGARGLTGNPGVQGPEGKLGPLGAPGE----DGR--
550 560 570 580 590
820 830 840 850 860
pF1KE2 LTVPGPPGP------PGAMGPPGP------PGAPGPAGPAGLPGHQEVLNLQGPPGPPGP
::::: ::.:: ::: :: :: :: ::.::.. . . .: :: ::
CCDS33 ---PGPPGSIGIRGQPGSMGLPGPKGSSGDPGKPGEAGNAGVPGQRGAPGKDGEVGPSGP
600 610 620 630 640 650
870 880 890 900 910 920
pF1KE2 RGPPGPSIPGPPGPRGPPG----EGLPGPPGPPGSFLSNSETFLSGPPGPPGPPGPKGDQ
:::: . : : .:::: .:::::::::: . .. . : :: :: ::.:..
CCDS33 VGPPG--LAGERGEQGPPGPTGFQGLPGPPGPPGEGGKPGDQGVPGDPGAVGPLGPRGER
660 670 680 690 700 710
930 940 950 960 970 980
pF1KE2 GPPGPRGHQGEQGLPGFSTSGSSSFGLNLQGPPGPPGPQGPKGDKGDPGVPGALGIPSGP
: :: ::. : :::: .: .. : .:: :: : ::.: :: : :: :.: :
CCDS33 GNPGERGEPGITGLPG--EKGMAG-G---HGPDGPKGSPGPSGTPGDTGPPGLQGMP-GE
720 730 740 750 760
990 1000 1010 1020 1030 1040
pF1KE2 SEGGSSSTMYVSGPPGPPGPPGPPGSISSSGQEIQQYISEYMQSDSIRSYLSGVQGPPGP
: : ::: : :.:. .: : .:. : :. :: ::
CCDS33 R-----------GIAGTPGPKGDRGGIGEKGAEGTA------GNDGAR----GLPGPLGP
770 780 790 800
1050 1060 1070 1080 1090 1100
pF1KE2 PGPPGPVTTITGETFDYSELASHVVSYLRTSGYGVSLFSSSISSEDILAVLQRDDVRQYL
::: :: ::: ..:
CCDS33 PGPAGP----TGE-----------------------------KGEP--------------
810
1110 1120 1130 1140 1150 1160
pF1KE2 RQYLMGPRGPPGPPGASGDGSLLSLDYAELSSRILSYMSSSGISIGLPGPPGPPGLPGTS
:::: ::::. :. . :: . . .: ..:. :: :: : ::..
CCDS33 -----GPRGLVGPPGSRGNPG----------SR--GENGPTG-AVGFAGPQGPDGQPGVK
820 830 840 850 860
1170 1180 1190 1200 1210
pF1KE2 YEELLSLLRGSEF----RGIVGPPGPPGPPGIPGNVWSSISVEDLSSYLHTAGLSFIPGP
: .:. .:..: ::: :: :.:: .. . : . .::
CCDS33 GEPGEPGQKGDAGSPGPQGLAGSPGPHGPNGVPG-------LKGGRGTQGPPGATGFPGS
870 880 890 900 910
1220 1230 1240 1250 1260 1270
pF1KE2 PGPPGPPGPRGPPGVSGALATYAAENSDSFRSELISYLTSPDVRSFIVGPPGPPGPQGPP
: ::::: : :: .: :. . :. ..:.. :. : :. .:::: :: .: :
CCDS33 AGRVGPPGPAGAPGPAGPLGEPGKEGPPGLRGDPGSHGRVGD-RG-PAGPPGGPGDKGDP
920 930 940 950 960 970
1280 1290 1300 1310 1320 1330
pF1KE2 GDSRLLSTDASHSRGSSSSSHSSSVRRGSSYSSSMSTGGGGAGSLGAGGAFGEAAGDRGP
:.. . :. . ........ :. .: : ::. : : : :.::.::
CCDS33 GEDGQPGPDGPPGPAGTTGQRGIVGMPGQRGERGMPGLPGPAGTPGKVGPTG-ATGDKGP
980 990 1000 1010 1020 1030
1340 1350 1360 1370 1380 1390
pF1KE2 YGTDIGPGGGYGAAAEGGMY--AGNGGLLGADFAGDLDYNELAVRVSESMQRQGLLQGMA
: .:: :. : ..: : ::: : : : : :.: .:
CCDS33 PGP-VGPPGSNGPVGEPGPEGPAGNDGTPGRDGA-----------VGERGDRGD------
1040 1050 1060 1070
1400 1410 1420 1430 1440 1450
pF1KE2 YTVQGPPGQPGPQGPPGISKVFSAYSNVTADLMDFFQTYGAIQGPPGQKGEMGTPGPKGD
:: : :: :: :: : : . : ::. :. : : :.
CCDS33 ---PGPAGLPGSQGAPG--------------------TPGPV-GAPGDAGQRGDP---GS
1080 1090 1100
1460 1470 1480 1490
pF1KE2 RGPAGPPGH------PGPPGPRGHKGEKGDKGDQVYAGRRRRRSIAVKP
::: ::::. ::: :::: ::..::.::. :.:
CCDS33 RGPIGPPGRAGKRGLPGPQGPRGDKGDHGDRGDRGQKGHRGFTGLQGLPGPPGPNGEQGS
1110 1120 1130 1140 1150 1160
CCDS33 AGIPGPFGPRGPPGPVGPSGKEGNPGPLGPIGPPGVRGSVGEAGPEGPPGEPGPPGPPGP
1170 1180 1190 1200 1210 1220
>>CCDS14542.1 COL4A6 gene_id:1288|Hs108|chrX (1690 aa)
initn: 701 init1: 701 opt: 1402 Z-score: 529.9 bits: 110.9 E(32554): 3.4e-23
Smith-Waterman score: 1605; 35.9% identity (52.7% similar) in 990 aa overlap (566-1480:402-1294)
540 550 560 570 580 590
pF1KE2 DKIGLHSDSQEELWMFVRKKLMMEQENGNLRGSPGPKGDMGSPGPK--GDRGFPGTPGIP
.: :: :::.:.:: : :.:: :.:
CCDS14 KGDEGIQGLRGPSGVPGLPALSGVPGALGPQGFPGLKGDQGNPGRTTIGAAGLPGRDGLP
380 390 400 410 420 430
600 610 620 630 640
pF1KE2 GPLGHPGPQGPKGQKGSV-----GDPGMEGPMGQRGREGPMGPRGEAGPPG-SGEKGERG
:: : ::: .:. . .. : ::..: .: .: : : .:..: . .: . :
CCDS14 GPPGPPGPPSPEFETETLHNKESGFPGLRGEQGPKGNLGLKGIKGDSGFCACDGGVPNTG
440 450 460 470 480 490
650 660 670 680 690 700
pF1KE2 AAGEPGPHGPPGVPGSVGPKGSSGSPGPQGPPGPVGLQGLRGEVGLPGVKGDKG-PV---
::::: :: :. : : ::. :. : : ::.: :: : .: : :: :: :.
CCDS14 PPGEPGPPGPWGLIGLPGLKGARGDRGSGGAQGPAGAPGLVGPLGPSGPKGKKGEPILST
500 510 520 530 540 550
710 720 730 740 750 760
pF1KE2 --GPPGPKGDQGEKGPRGLTGEPGMRGLPGAVGEPGAKGAMGPAGPDGHQGPRGEQGLTG
: :: .::.: .: ::. :::: :.:: : :: : :: :: ::.:: :
CCDS14 IQGMPGDRGDSGSQGFRGVIGEPGKDGVPGLPGLPGLPG-------DGGQGFPGEKGLPG
560 570 580 590 600
770 780 790 800 810 820
pF1KE2 MPGIRGPPGPSGDPGKPGLTGPQGPQGLPGTPGRPGIKGEPGAPGKIVTSEGSSM-LTVP
.:: .: ::: : ::. :: : ::.:::: :. :. :. : :: :.: .. .:
CCDS14 LPGEKGHPGPPGLPGN-GLPGLPGPRGLPGDKGKDGLPGQQGLPG----SKGITLPCIIP
610 620 630 640 650
830 840 850 860 870
pF1KE2 GPPGPPGAMGPPGPPGAPGPAGPAGLPGHQEVLNLQGPPGPPGPRGPPG-------PSIP
: :: : :: :: ::: : :::: : :: : .: :: : .:
CCDS14 GSYGPSGF---PGTPGFPGPKGSRGLPG------TPGQPGSSGSKGEPGSPGLVHLPELP
660 670 680 690 700 710
880 890 900 910 920
pF1KE2 GPPGPRGPPGEGLPGPPGPPGSFLSNSETFLSGPPGPPGPPGPKGD-----QGPPGPRGH
: ::::: .:::: :: ::. .. . : :: :: : :: .: :: .:
CCDS14 GFPGPRGE--KGLPGFPGLPGK---DGLPGMIGSPGLPGSKGATGDIFGAENGAPGEQGL
720 730 740 750 760
930 940 950 960 970 980
pF1KE2 QGEQGLPGFSTSGSSSFGL-NLQGPPGPPGPQGPKGDKGDPGVPGALGIPSGPSEGGSSS
:: : :: :.: :: .:.: : :: ::::..:.::.:: .: :. : :::.
CCDS14 QGLTGHKGFL--GDS--GLPGLKGVHGKPGLLGPKGERGSPGTPGQVGQPGTP---GSSG
770 780 790 800 810
990 1000 1010 1020 1030 1040
pF1KE2 TMYVSGPPGPPGPPGPPGSISSSGQEIQQYISEYMQSDSIRSYLSGVQGPPGPPG-----
. ..: : :: :: :: . :.. . .. .. ... : :..: :: ::
CCDS14 PYGIKGKSGLPGAPGFPGISGHPGKKGTRG-KKGPPGSIVKKGLPGLKGLPGNPGLVGLK
820 830 840 850 860 870
1050 1060 1070 1080 1090
pF1KE2 --PPGP-VTTITGETFDYSELASHVVSYLRTSGY-GVSLFSSSISSEDILAVLQRDDVRQ
: .: :. . . . .: .: :... : :. . .. . . : . .
CCDS14 GSPGSPGVAGLPALSGPKGEKGS--VGFVGFPGIPGLPGIPGTRGLKGIPGSTGK-----
880 890 900 910 920 930
1100 1110 1120 1130 1140
pF1KE2 YLRQYLMGPRGPPGPPGASGD----GSL-LSLDYAELSSRILS----YMSSSGISIGLPG
::: : : :: .:: : . . .:. :. ..:.: : :.::
CCDS14 ------MGPSGRAGTPGEKGDRGNPGPVGIPSPRRPMSNLWLKGDKGSQGSAG-SNGFPG
940 950 960 970 980
1150 1160 1170 1180 1190 1200
pF1KE2 P---------PGPPGLPGTSYEELLSLLRGSEFRGIVGPPGPPGPPGIPGNVWSSISVED
: ::::::::. : : ..:. : ::::: :: :
CCDS14 PRGDKGEAGRPGPPGLPGAPG------LPGI-IKGVSGKPGPPGFMGIRG----------
990 1000 1010 1020
1210 1220 1230 1240 1250 1260
pF1KE2 LSSYLHTAGLSFIPGPPGPPGPPGPRGPPGVSGALATYAAENSDSFRSELISYLTSPDVR
: . ..:.. .:: :: : : :: ::. :: . . ..:. .. ::
CCDS14 LPGLKGSSGITGFPGMPGESGSQGIRGSPGLPGA-SGLPGLKGDNGQTVEIS--------
1030 1040 1050 1060 1070
1270 1280 1290 1300 1310 1320
pF1KE2 SFIVGPPGPPGPQGPPGDSRLLSTDASHSRGSSSSSHSSSVRRGSSYSSSMSTGGGGAGS
: :::.: ::.: . .: ..: . .. . .: . . ..: : .:
CCDS14 -------GSPGPKGQPGESGFKGT---KGRDGLIGNIGFPGNKGEDGKVGVS---GDVGL
1080 1090 1100 1110 1120
1330 1340 1350 1360 1370
pF1KE2 LGAGGAFGEAAGDRGPYGTDIGPGGGYGAAAEGGMYAGNGGLLGAD-FAGDLDYNELAVR
:: : : .:: :: : : .: :: : . :. ::.: : : . :
CCDS14 PGAPG-FPGVAGMRGEPGLP-GSSGHQGAI--GPL--GSPGLIGPKGFPGFPGLHGLN-G
1130 1140 1150 1160 1170
1380 1390 1400 1410 1420 1430
pF1KE2 VSESMQRQGLLQGMAYT-VQGPPGQPGPQGP---PGISKVFSAYSNVTADLMDFFQTYGA
. . .: : . : : :: : :::.: :::. . .. .. : . . .
CCDS14 LPGTKGTHGT-PGPSITGVPGPAGLPGPKGEKGYPGIGIGAPGKPGLRGQKGD--RGFPG
1180 1190 1200 1210 1220 1230
1440 1450 1460 1470 1480
pF1KE2 IQGP---PG------------QKGEMGTPGPKGDRGPAGPPGHPGPPGPRGHKGEKGDKG
.::: :: : :. : :: :.:: :: : :::::: ...:. :: :
CCDS14 LQGPAGLPGAPGISLPSLIAGQPGDPGRPGLDGERGRPGPAGPPGPPGPSSNQGDTGDPG
1240 1250 1260 1270 1280 1290
1490
pF1KE2 DQVYAGRRRRRSIAVKP
CCDS14 FPGIPGPKGPKGDQGIPGFSGLPGELGLKGMRGEPGFMGTPGKVGPPGDPGFPGMKGKAG
1300 1310 1320 1330 1340 1350
>>CCDS14541.1 COL4A6 gene_id:1288|Hs108|chrX (1691 aa)
initn: 701 init1: 701 opt: 1402 Z-score: 529.9 bits: 110.9 E(32554): 3.4e-23
Smith-Waterman score: 1605; 35.9% identity (52.7% similar) in 990 aa overlap (566-1480:403-1295)
540 550 560 570 580 590
pF1KE2 DKIGLHSDSQEELWMFVRKKLMMEQENGNLRGSPGPKGDMGSPGPK--GDRGFPGTPGIP
.: :: :::.:.:: : :.:: :.:
CCDS14 KGDEGIQGLRGPSGVPGLPALSGVPGALGPQGFPGLKGDQGNPGRTTIGAAGLPGRDGLP
380 390 400 410 420 430
600 610 620 630 640
pF1KE2 GPLGHPGPQGPKGQKGSV-----GDPGMEGPMGQRGREGPMGPRGEAGPPG-SGEKGERG
:: : ::: .:. . .. : ::..: .: .: : : .:..: . .: . :
CCDS14 GPPGPPGPPSPEFETETLHNKESGFPGLRGEQGPKGNLGLKGIKGDSGFCACDGGVPNTG
440 450 460 470 480 490
650 660 670 680 690 700
pF1KE2 AAGEPGPHGPPGVPGSVGPKGSSGSPGPQGPPGPVGLQGLRGEVGLPGVKGDKG-PV---
::::: :: :. : : ::. :. : : ::.: :: : .: : :: :: :.
CCDS14 PPGEPGPPGPWGLIGLPGLKGARGDRGSGGAQGPAGAPGLVGPLGPSGPKGKKGEPILST
500 510 520 530 540 550
710 720 730 740 750 760
pF1KE2 --GPPGPKGDQGEKGPRGLTGEPGMRGLPGAVGEPGAKGAMGPAGPDGHQGPRGEQGLTG
: :: .::.: .: ::. :::: :.:: : :: : :: :: ::.:: :
CCDS14 IQGMPGDRGDSGSQGFRGVIGEPGKDGVPGLPGLPGLPG-------DGGQGFPGEKGLPG
560 570 580 590 600
770 780 790 800 810 820
pF1KE2 MPGIRGPPGPSGDPGKPGLTGPQGPQGLPGTPGRPGIKGEPGAPGKIVTSEGSSM-LTVP
.:: .: ::: : ::. :: : ::.:::: :. :. :. : :: :.: .. .:
CCDS14 LPGEKGHPGPPGLPGN-GLPGLPGPRGLPGDKGKDGLPGQQGLPG----SKGITLPCIIP
610 620 630 640 650 660
830 840 850 860 870
pF1KE2 GPPGPPGAMGPPGPPGAPGPAGPAGLPGHQEVLNLQGPPGPPGPRGPPG-------PSIP
: :: : :: :: ::: : :::: : :: : .: :: : .:
CCDS14 GSYGPSGF---PGTPGFPGPKGSRGLPG------TPGQPGSSGSKGEPGSPGLVHLPELP
670 680 690 700 710
880 890 900 910 920
pF1KE2 GPPGPRGPPGEGLPGPPGPPGSFLSNSETFLSGPPGPPGPPGPKGD-----QGPPGPRGH
: ::::: .:::: :: ::. .. . : :: :: : :: .: :: .:
CCDS14 GFPGPRGE--KGLPGFPGLPGK---DGLPGMIGSPGLPGSKGATGDIFGAENGAPGEQGL
720 730 740 750 760
930 940 950 960 970 980
pF1KE2 QGEQGLPGFSTSGSSSFGL-NLQGPPGPPGPQGPKGDKGDPGVPGALGIPSGPSEGGSSS
:: : :: :.: :: .:.: : :: ::::..:.::.:: .: :. : :::.
CCDS14 QGLTGHKGFL--GDS--GLPGLKGVHGKPGLLGPKGERGSPGTPGQVGQPGTP---GSSG
770 780 790 800 810
990 1000 1010 1020 1030 1040
pF1KE2 TMYVSGPPGPPGPPGPPGSISSSGQEIQQYISEYMQSDSIRSYLSGVQGPPGPPG-----
. ..: : :: :: :: . :.. . .. .. ... : :..: :: ::
CCDS14 PYGIKGKSGLPGAPGFPGISGHPGKKGTRG-KKGPPGSIVKKGLPGLKGLPGNPGLVGLK
820 830 840 850 860 870
1050 1060 1070 1080 1090
pF1KE2 --PPGP-VTTITGETFDYSELASHVVSYLRTSGY-GVSLFSSSISSEDILAVLQRDDVRQ
: .: :. . . . .: .: :... : :. . .. . . : . .
CCDS14 GSPGSPGVAGLPALSGPKGEKGS--VGFVGFPGIPGLPGIPGTRGLKGIPGSTGK-----
880 890 900 910 920 930
1100 1110 1120 1130 1140
pF1KE2 YLRQYLMGPRGPPGPPGASGD----GSL-LSLDYAELSSRILS----YMSSSGISIGLPG
::: : : :: .:: : . . .:. :. ..:.: : :.::
CCDS14 ------MGPSGRAGTPGEKGDRGNPGPVGIPSPRRPMSNLWLKGDKGSQGSAG-SNGFPG
940 950 960 970 980
1150 1160 1170 1180 1190 1200
pF1KE2 P---------PGPPGLPGTSYEELLSLLRGSEFRGIVGPPGPPGPPGIPGNVWSSISVED
: ::::::::. : : ..:. : ::::: :: :
CCDS14 PRGDKGEAGRPGPPGLPGAPG------LPGI-IKGVSGKPGPPGFMGIRG----------
990 1000 1010 1020
1210 1220 1230 1240 1250 1260
pF1KE2 LSSYLHTAGLSFIPGPPGPPGPPGPRGPPGVSGALATYAAENSDSFRSELISYLTSPDVR
: . ..:.. .:: :: : : :: ::. :: . . ..:. .. ::
CCDS14 LPGLKGSSGITGFPGMPGESGSQGIRGSPGLPGA-SGLPGLKGDNGQTVEIS--------
1030 1040 1050 1060 1070
1270 1280 1290 1300 1310 1320
pF1KE2 SFIVGPPGPPGPQGPPGDSRLLSTDASHSRGSSSSSHSSSVRRGSSYSSSMSTGGGGAGS
: :::.: ::.: . .: ..: . .. . .: . . ..: : .:
CCDS14 -------GSPGPKGQPGESGFKGT---KGRDGLIGNIGFPGNKGEDGKVGVS---GDVGL
1080 1090 1100 1110 1120
1330 1340 1350 1360 1370
pF1KE2 LGAGGAFGEAAGDRGPYGTDIGPGGGYGAAAEGGMYAGNGGLLGAD-FAGDLDYNELAVR
:: : : .:: :: : : .: :: : . :. ::.: : : . :
CCDS14 PGAPG-FPGVAGMRGEPGLP-GSSGHQGAI--GPL--GSPGLIGPKGFPGFPGLHGLN-G
1130 1140 1150 1160 1170
1380 1390 1400 1410 1420 1430
pF1KE2 VSESMQRQGLLQGMAYT-VQGPPGQPGPQGP---PGISKVFSAYSNVTADLMDFFQTYGA
. . .: : . : : :: : :::.: :::. . .. .. : . . .
CCDS14 LPGTKGTHGT-PGPSITGVPGPAGLPGPKGEKGYPGIGIGAPGKPGLRGQKGD--RGFPG
1180 1190 1200 1210 1220 1230
1440 1450 1460 1470 1480
pF1KE2 IQGP---PG------------QKGEMGTPGPKGDRGPAGPPGHPGPPGPRGHKGEKGDKG
.::: :: : :. : :: :.:: :: : :::::: ...:. :: :
CCDS14 LQGPAGLPGAPGISLPSLIAGQPGDPGRPGLDGERGRPGPAGPPGPPGPSSNQGDTGDPG
1240 1250 1260 1270 1280 1290
1490
pF1KE2 DQVYAGRRRRRSIAVKP
CCDS14 FPGIPGPKGPKGDQGIPGFSGLPGELGLKGMRGEPGFMGTPGKVGPPGDPGFPGMKGKAG
1300 1310 1320 1330 1340 1350
1497 residues in 1 query sequences
18511270 residues in 32554 library sequences
Tcomplib [36.3.4 Apr, 2011] (8 proc)
start: Sat Nov 5 22:21:24 2016 done: Sat Nov 5 22:21:26 2016
Total Scan time: 7.790 Total Display time: 0.710
Function used was FASTA [36.3.4 Apr, 2011]