FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011
Please cite:
W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448
Query: pF1KB7991, 442 aa
1>>>pF1KB7991 442 - 442 aa - 442 aa
Library: human.CCDS.faa
18511270 residues in 32554 sequences
Statistics: Expectation_n fit: rho(ln(x))= 10.0889+/-0.000948; mu= -1.6557+/- 0.057
mean_var=317.2493+/-66.521, 0's: 0 Z-trim(116.1): 24 B-trim: 0 in 0/52
Lambda= 0.072007
statistics sampled from 16679 (16699) to 16679 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
ktup: 2, E-join: 1 (0.796), E-opt: 0.2 (0.513), width: 16
Scan time: 3.770
The best scores are:                                  opt bits E(32554)
CCDS5983.1 GATA4 gene_id:2626|Hs108|chr8      ( 442) 3005 325.3 7.4e-89
CCDS78303.1 GATA4 gene_id:2626|Hs108|chr8     ( 443) 2993 324.1 1.8e-88
CCDS78304.1 GATA4 gene_id:2626|Hs108|chr8     ( 236) 1608 180.0 2.3e-45
CCDS11872.1 GATA6 gene_id:2627|Hs108|chr18    ( 595) 1032 120.5 4.6e-27
CCDS13499.1 GATA5 gene_id:140628|Hs108|chr20  ( 397) 1016 118.7 1.1e-26
CCDS3049.1 GATA2 gene_id:2624|Hs108|chr3      ( 480)  846 101.1 2.6e-21
CCDS31143.1 GATA3 gene_id:2625|Hs108|chr10    ( 444)  743  90.3 4.1e-18
CCDS7083.1 GATA3 gene_id:2625|Hs108|chr10     ( 443)  732  89.2 8.9e-18
CCDS14305.1 GATA1 gene_id:2623|Hs108|chrX     ( 413)  691  84.9 1.6e-16
>>CCDS5983.1 GATA4 gene_id:2626|Hs108|chr8 (442 aa)
initn: 3005 init1: 3005 opt: 3005 Z-score: 1708.7 bits: 325.3 E(32554): 7.4e-89
Smith-Waterman score: 3005; 100.0% identity (100.0% similar) in 442 aa overlap (1-442:1-442)
10 20 30 40 50 60
pF1KB7 MYQSLAMAANHGPPPGAYEAGGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGAG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 MYQSLAMAANHGPPPGAYEAGGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGAG
10 20 30 40 50 60
70 80 90 100 110 120
pF1KB7 SASGGASGGSSGGAASGAGPGTQQGSPGWSQAGADGAAYTPPPVSPRFSFPGTTGSLAAA
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 SASGGASGGSSGGAASGAGPGTQQGSPGWSQAGADGAAYTPPPVSPRFSFPGTTGSLAAA
70 80 90 100 110 120
130 140 150 160 170 180
pF1KB7 AAAAAAREAAAYSSGGGAAGAGLAGREQYGRAGFAGSYSSPYPAYMADVGASWAAAAAAS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 AAAAAAREAAAYSSGGGAAGAGLAGREQYGRAGFAGSYSSPYPAYMADVGASWAAAAAAS
130 140 150 160 170 180
190 200 210 220 230 240
pF1KB7 AGPFDSPVLHSLPGRANPAARHPNLDMFDDFSEGRECVNCGAMSTPLWRRDGTGHYLCNA
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 AGPFDSPVLHSLPGRANPAARHPNLDMFDDFSEGRECVNCGAMSTPLWRRDGTGHYLCNA
190 200 210 220 230 240
250 260 270 280 290 300
pF1KB7 CGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACGLYMK
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 CGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACGLYMK
250 260 270 280 290 300
310 320 330 340 350 360
pF1KB7 LHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTSSSEE
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 LHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTSSSEE
310 320 330 340 350 360
370 380 390 400 410 420
pF1KB7 MRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQSPQT
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 MRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQSPQT
370 380 390 400 410 420
430 440
pF1KB7 SSKQDSWNSLVLADSHGDIITA
::::::::::::::::::::::
CCDS59 SSKQDSWNSLVLADSHGDIITA
430 440
>>CCDS78303.1 GATA4 gene_id:2626|Hs108|chr8 (443 aa)
initn: 1623 init1: 1623 opt: 2993 Z-score: 1702.0 bits: 324.1 E(32554): 1.8e-88
Smith-Waterman score: 2993; 99.8% identity (99.8% similar) in 443 aa overlap (1-442:1-443)
10 20 30 40 50 60
pF1KB7 MYQSLAMAANHGPPPGAYEAGGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGAG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS78 MYQSLAMAANHGPPPGAYEAGGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGAG
10 20 30 40 50 60
70 80 90 100 110 120
pF1KB7 SASGGASGGSSGGAASGAGPGTQQGSPGWSQAGADGAAYTPPPVSPRFSFPGTTGSLAAA
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS78 SASGGASGGSSGGAASGAGPGTQQGSPGWSQAGADGAAYTPPPVSPRFSFPGTTGSLAAA
70 80 90 100 110 120
130 140 150 160 170 180
pF1KB7 AAAAAAREAAAYSSGGGAAGAGLAGREQYGRAGFAGSYSSPYPAYMADVGASWAAAAAAS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS78 AAAAAAREAAAYSSGGGAAGAGLAGREQYGRAGFAGSYSSPYPAYMADVGASWAAAAAAS
130 140 150 160 170 180
190 200 210 220 230
pF1KB7 AGPFDSPVLHSLPGRANPAARHPNL-DMFDDFSEGRECVNCGAMSTPLWRRDGTGHYLCN
::::::::::::::::::::::::: ::::::::::::::::::::::::::::::::::
CCDS78 AGPFDSPVLHSLPGRANPAARHPNLVDMFDDFSEGRECVNCGAMSTPLWRRDGTGHYLCN
190 200 210 220 230 240
240 250 260 270 280 290
pF1KB7 ACGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACGLYM
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS78 ACGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACGLYM
250 260 270 280 290 300
300 310 320 330 340 350
pF1KB7 KLHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTSSSE
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS78 KLHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTSSSE
310 320 330 340 350 360
360 370 380 390 400 410
pF1KB7 EMRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQSPQ
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS78 EMRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQSPQ
370 380 390 400 410 420
420 430 440
pF1KB7 TSSKQDSWNSLVLADSHGDIITA
:::::::::::::::::::::::
CCDS78 TSSKQDSWNSLVLADSHGDIITA
430 440
>>CCDS78304.1 GATA4 gene_id:2626|Hs108|chr8 (236 aa)
initn: 1608 init1: 1608 opt: 1608 Z-score: 928.0 bits: 180.0 E(32554): 2.3e-45
Smith-Waterman score: 1608; 100.0% identity (100.0% similar) in 236 aa overlap (207-442:1-236)
180 190 200 210 220 230
pF1KB7 AAASAGPFDSPVLHSLPGRANPAARHPNLDMFDDFSEGRECVNCGAMSTPLWRRDGTGHY
::::::::::::::::::::::::::::::
CCDS78 MFDDFSEGRECVNCGAMSTPLWRRDGTGHY
10 20 30
240 250 260 270 280 290
pF1KB7 LCNACGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS78 LCNACGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACG
40 50 60 70 80 90
300 310 320 330 340 350
pF1KB7 LYMKLHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTS
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS78 LYMKLHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTS
100 110 120 130 140 150
360 370 380 390 400 410
pF1KB7 SSEEMRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQ
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS78 SSEEMRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQ
160 170 180 190 200 210
420 430 440
pF1KB7 SPQTSSKQDSWNSLVLADSHGDIITA
::::::::::::::::::::::::::
CCDS78 SPQTSSKQDSWNSLVLADSHGDIITA
220 230
>>CCDS11872.1 GATA6 gene_id:2627|Hs108|chr18 (595 aa)
initn: 1160 init1: 814 opt: 1032 Z-score: 599.3 bits: 120.5 E(32554): 4.6e-27
Smith-Waterman score: 1256; 50.2% identity (68.3% similar) in 476 aa overlap (1-433:147-595)
10 20 30
pF1KB7 MYQSLAMAANHGPPPGAYEAGGPGAFMHGA
:::.:: ...:: .::. :.::.:.:.:
CCDS11 DLDQAATASKLLWSSRGAKLSPFAPEQPEEMYQTLAALSSQGP--AAYD-GAPGGFVHSA
120 130 140 150 160 170
40 50 60 70 80
pF1KB7 GAA-------SSPVYVPTPRVPSSVLGLSY-LQGGGAGSAS--GGASGGSSGGAASGAGP
.:: :::::::: :: : . :: : :::.:.: :. :::.. . ::. .:
CCDS11 AAAAAAAAAASSPVYVPTTRVGSMLPGLPYHLQGSGSGPANHAGGAGAHPGWPQASADSP
180 190 200 210 220 230
90 100 110 120 130
pF1KB7 --GTQQGSPGWSQAGADGAAYTPPPVSPRFSFPGTTGSLAAAAA------AAAAREAAAY
:. :. : . :: ::. . :: :: . . . .: .:: :::. .:.
CCDS11 PYGSGGGAAGGGAAGPGGAGSAAAHVSARFPY-SPSPPMANGAAREPGGYAAAGSGGAGG
240 250 260 270 280 290
140 150 160 170
pF1KB7 SSGGGAAGAGLAGRE-QYGRAGFA----GSYS----------SPYPAYMADVGASWAAAA
::::.. :...::: ::. . : :.: ::: : ::: . :
CCDS11 VSGGGSSLAAMGGREPQYSSLSAARPLNGTYHHHHHHHHHHPSPYSPY---VGAPLTPAW
300 310 320 330 340
180 190 200 210 220 230
pF1KB7 AASAGPFDSPVLHSLPGRAN---PAARHPNLDMFDDFSEGRECVNCGAMSTPLWRRDGTG
::::..:::::: .::. :. : :. :...:.::.:::::::...::::::::::
CCDS11 P--AGPFETPVLHSLQSRAGAPLPVPRGPSADLLEDLSESRECVNCGSIQTPLWRRDGTG
350 360 370 380 390 400
240 250 260 270 280 290
pF1KB7 HYLCNACGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNA
:::::::::: ::::..:::::::.:. .:::.:::::::.:::::::::::::::::::
CCDS11 HYLCNACGLYSKMNGLSRPLIKPQKRVPSSRRLGLSCANCHTTTTTLWRRNAEGEPVCNA
410 420 430 440 450 460
300 310 320 330 340 350
pF1KB7 CGLYMKLHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLP--PASGASSNSSN
::::::::::::::::.:::::::::::::.::::: .. :.. :.: :.: .::::..
CCDS11 CGLYMKLHGVPRPLAMKKEGIQTRKRKPKNINKSKTCSGNSNN-SIPMTPTS-TSSNSDD
470 480 490 500 510 520
360 370 380 390 400 410
pF1KB7 ATTSSSEEMRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQ-GYA
. ..: .: . : . .:.: : : .: : :: : : :
CCDS11 CSKNTSPTTQPTASGAG----------------APVMTGAGESTNPENSELKYSGQDGLY
530 540 550 560
420 430 440
pF1KB7 SPVS-QSPQ--TSS-KQDSWNSLVLADSHGDIITA
:: :: ::: . ::: .:.::
CCDS11 IGVSLASPAEVTSSVRPDSWCALALA
570 580 590
>>CCDS13499.1 GATA5 gene_id:140628|Hs108|chr20 (397 aa)
initn: 1152 init1: 796 opt: 1016 Z-score: 592.7 bits: 118.7 E(32554): 1.1e-26
Smith-Waterman score: 1144; 46.0% identity (67.4% similar) in 448 aa overlap (1-433:1-397)
10 20 30 40 50 60
pF1KB7 MYQSLAMAANHGPPPGAYEAGGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGAG
::::::.::. : .:: . :.:.:. ::.: :..:: :::: ::::.:
CCDS13 MYQSLALAAS--PRQAAY--ADSGSFLHAPGAGS-PMFVPPARVPSM---LSYLSG----
10 20 30 40
70 80 90 100 110
pF1KB7 SASGGASGGSSGGAASGAGPGTQQGSPGWSQ-AGADGAAYTP-PPVSPRFSFPGTTGSLA
. : . :::.: : ::..:. : : : ::.:.
CCDS13 -------------CEPSPQPPELAARPGWAQTATADSSAFGPGSPHPPAAHPPGATAFPF
50 60 70 80 90
120 130 140 150 160 170
pF1KB7 AAAAAAAAREAAAYSSGGGAAGAGLAGREQY----GRAGFAGSYSSPYPAYMA-DVGASW
: . .. . ..: . :.: ..: :::. :: . :::. ::::.. ::. ::
CCDS13 AHSPSGPGSGGSAGGRDGSAYQGALLPREQFAAPLGRP-VGTSYSATYPAYVSPDVAQSW
100 110 120 130 140 150
180 190 200 210 220 230
pF1KB7 AAAAAASAGPFDSPVLHSLPGRANPAARHPNL--DMFDDF-SEGRECVNCGAMSTPLWRR
.:::::. :::.:::: .:.. :....: .::::::::::.:::::::
CCDS13 ------TAGPFDGSVLHGLPGR------RPTFVSDFLEEFPGEGRECVNCGALSTPLWRR
160 170 180 190 200
240 250 260 270 280 290
pF1KB7 DGTGHYLCNACGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEP
:::::::::::::::::::.::::..::.:::.:::.:: :.::.::.:::::::.::::
CCDS13 DGTGHYLCNACGLYHKMNGVNRPLVRPQKRLSSSRRAGLCCTNCHTTNTTLWRRNSEGEP
210 220 230 240 250 260
300 310 320 330 340 350
pF1KB7 VCNACGLYMKLHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNS
::::::::::::::::::::.::.:::::::::.. :.. .. . . : :.. ::..:
CCDS13 VCNACGLYMKLHGVPRPLAMKKESIQTRKRKPKTIAKARGSSGSTRNASASPSAVASTDS
270 280 290 300 310 320
360 370 380 390 400 410
pF1KB7 SNATTSSSEEMRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGY
: ::. :..:.:.: . :.. .: . . :. : .:. :. .
CCDS13 SAATS---------KAKPSLASPVCPGPSMAP----QASGQEDDSLAPGHLEFKFEPEDF
330 340 350 360
420 430 440
pF1KB7 ASP-VSQSPQT----SSKQDSWNSLVLADSHGDIITA
: : .. :::. . .:..: .:.::
CCDS13 AFPSTAPSPQAGLRGALRQEAWCALALA
370 380 390
>>CCDS3049.1 GATA2 gene_id:2624|Hs108|chr3 (480 aa)
initn: 794 init1: 704 opt: 846 Z-score: 496.1 bits: 101.1 E(32554): 2.6e-21
Smith-Waterman score: 846; 43.2% identity (64.7% similar) in 405 aa overlap (15-398:93-472)
10 20 30 40
pF1KB7 MYQSLAMAANHGPPPG-AYEAGGPGAFMHGAGAASSPVYV-P--
:: . :: .:. .:. .: : :
CCDS30 PAHARARVSYSPAHARLTGGQMCRPHLLHSPGLPWLDGGKAALSAAAAHHHNPWTVSPFS
70 80 90 100 110 120
50 60 70 80 90
pF1KB7 -TPRVPSSVLGLSYLQGGGAGSASGGASGGSSGGAASGAGPGTQQGSPGWSQAGADGAAY
:: ::.. : :: :. ::.:::.::..:... : : ...:. ..
CCDS30 KTPLHPSAAGG-----PGGPLSVYPGAGGGSGGGSGSSVASLT----PTAAHSGSHLFGF
130 140 150 160 170
100 110 120 130 140 150
pF1KB7 TP-PP--VSPRFSFPGTTGSLAAAAAAAAAREAAAYSSGGGAAGAGL--AGREQYG---R
: :: ::: :.:::. . :...:.. : . .. : ..: . . . : :
CCDS30 PPTPPKEVSPD---PSTTGAASPASSSAGGSAARGEDKDGVKYQVSLTESMKMESGSPLR
180 190 200 210 220 230
160 170 180 190 200
pF1KB7 AGFAGSYSSPYPAYMADVGASWAAAAAASAGPFDSPVLHSLPGR--ANPAARH-PNL-DM
:.: ..: . . :.. ::: . ..: ..: :: ..::. :. .
CCDS30 PGLATMGTQPATHHPIPTYPSYVPAAAHD---YSSGLFH--PGGFLGGPASSFTPKQRSK
240 250 260 270 280
210 220 230 240 250 260
pF1KB7 FDDFSEGRECVNCGAMSTPLWRRDGTGHYLCNACGLYHKMNGINRPLIKPQRRLSASRRV
. ::::::::::: .::::::::::::::::::::::::: :::::::.:::::.::.
CCDS30 ARSCSEGRECVNCGATATPLWRRDGTGHYLCNACGLYHKMNGQNRPLIKPKRRLSAARRA
290 300 310 320 330 340
270 280 290 300 310 320
pF1KB7 GLSCANCQTTTTTLWRRNAEGEPVCNACGLYMKLHGVPRPLAMRKEGIQTRKRKPKNLNK
: ::::::::::::::::.:.:::::::::.:::.: :::.:.:::::::.:: .: :
CCDS30 GTCCANCQTTTTTLWRRNANGDPVCNACGLYYKLHNVNRPLTMKKEGIQTRNRKMSN--K
350 360 370 380 390 400
330 340 350 360 370 380
pF1KB7 SKTPAAPSGSESLPPASGASSNSSNATTSSSE--EMRPIKTEPGLSSHYGHS-SSVSQTF
:: . .:.: . : ...:. .... .: :. : .: : :: . .
CCDS30 SKK--SKKGAECFEELSKCMQEKSSPFSAAALAGHMAPVGHLPPFS-HSGHILPTPTPIH
410 420 430 440 450 460
390 400 410 420 430 440
pF1KB7 SVSAMS-GHGPSIHPVLSALKLSPQGYASPVSQSPQTSSKQDSWNSLVLADSHGDIITA
:..: :: : ::
CCDS30 PSSSLSFGH-P--HPSSMVTAMG
470 480
>>CCDS31143.1 GATA3 gene_id:2625|Hs108|chr10 (444 aa)
initn: 699 init1: 699 opt: 743 Z-score: 438.8 bits: 90.3 E(32554): 4.1e-18
Smith-Waterman score: 758; 40.9% identity (62.7% similar) in 362 aa overlap (51-395:94-437)
30 40 50 60 70
pF1KB7 GGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGA-GS---ASGGASGGSSGGAAS
: .:.:: : :: :: . : .
CCDS31 YGNSVRATVQRYPPTHHGSQVCRPPLLHGSLPWLDGGKALGSHHTASPWNLSPFSKTSIH
70 80 90 100 110 120
80 90 100 110 120
pF1KB7 GAGPGTQQGSPGWSQAGADGAAYTP-----PPVSPRFSFPGTTGSLAAAAAAAAA--REA
..:: . : :... .:. .: ::. :. : . : ..:..: .:
CCDS31 HGSPGPLSVYPPASSSSLSGGHASPHLFTFPPTPPKDVSPDPSLSTPGSAGSARQDEKEC
130 140 150 160 170 180
130 140 150 160 170 180
pF1KB7 AAYSSGGGAAGAGLAGREQYGRAGFAGSYSS---P---YPAYMADVGASWAAAAAASAGP
:. . ... . . ....:. :: : :: :. . ... .. .:
CCDS31 LKYQVPLPDSMKLESSHSRGSMTALGGASSSTHHPITTYPPYVPEYSSGLFPPSSLLGG-
190 200 210 220 230 240
190 200 210 220 230 240
pF1KB7 FDSPVLHSLPGRANPAARHPNLDMFDDFSEGRECVNCGAMSTPLWRRDGTGHYLCNACGL
::. . .: : :: .:::::::::: ::::::::::::::::::::
CCDS31 --SPTGFGCKSR--PKARSS--------TEGRECVNCGATSTPLWRRDGTGHYLCNACGL
250 260 270 280 290
250 260 270 280 290 300
pF1KB7 YHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACGLYMKLHG
:::::: :::::::.:::::.::.: :::::::::::::::::.:.:::::::::.:::.
CCDS31 YHKMNGQNRPLIKPKRRLSAARRAGTSCANCQTTTTTLWRRNANGDPVCNACGLYYKLHN
300 310 320 330 340 350
310 320 330 340 350 360
pF1KB7 VPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTSSSEEMRP
. :::.:.:::::::.:: .. .: : . .. :..: :. . . . :: .. :
CCDS31 INRPLTMKKEGIQTRNRKMSSKSK-KCKKVHDSLEDFPKNSSFNPAALSRHMSSLSHISP
360 370 380 390 400
370 380 390 400 410 420
pF1KB7 IKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQSPQTSSK
.. :::. . . . : ... : ::
CCDS31 FSH----SSHMLTTPTPMHPPSSLSFGPHHPSSMVTAMG
410 420 430 440
430 440
pF1KB7 QDSWNSLVLADSHGDIITA
>>CCDS7083.1 GATA3 gene_id:2625|Hs108|chr10 (443 aa)
initn: 695 init1: 695 opt: 732 Z-score: 432.6 bits: 89.2 E(32554): 8.9e-18
Smith-Waterman score: 753; 40.9% identity (62.4% similar) in 362 aa overlap (51-395:94-436)
30 40 50 60 70
pF1KB7 GGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGA-GS---ASGGASGGSSGGAAS
: .:.:: : :: :: . : .
CCDS70 YGNSVRATVQRYPPTHHGSQVCRPPLLHGSLPWLDGGKALGSHHTASPWNLSPFSKTSIH
70 80 90 100 110 120
80 90 100 110 120
pF1KB7 GAGPGTQQGSPGWSQAGADGAAYTP-----PPVSPRFSFPGTTGSLAAAAAAAAA--REA
..:: . : :... .:. .: ::. :. : . : ..:..: .:
CCDS70 HGSPGPLSVYPPASSSSLSGGHASPHLFTFPPTPPKDVSPDPSLSTPGSAGSARQDEKEC
130 140 150 160 170 180
130 140 150 160 170 180
pF1KB7 AAYSSGGGAAGAGLAGREQYGRAGFAGSYSS---P---YPAYMADVGASWAAAAAASAGP
:. . ... . . ....:. :: : :: :. . ... .. .:
CCDS70 LKYQVPLPDSMKLESSHSRGSMTALGGASSSTHHPITTYPPYVPEYSSGLFPPSSLLGG-
190 200 210 220 230 240
190 200 210 220 230 240
pF1KB7 FDSPVLHSLPGRANPAARHPNLDMFDDFSEGRECVNCGAMSTPLWRRDGTGHYLCNACGL
::. . .: : :: : ::::::::: ::::::::::::::::::::
CCDS70 --SPTGFGCKSR--PKARS---------STGRECVNCGATSTPLWRRDGTGHYLCNACGL
250 260 270 280
250 260 270 280 290 300
pF1KB7 YHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACGLYMKLHG
:::::: :::::::.:::::.::.: :::::::::::::::::.:.:::::::::.:::.
CCDS70 YHKMNGQNRPLIKPKRRLSAARRAGTSCANCQTTTTTLWRRNANGDPVCNACGLYYKLHN
290 300 310 320 330 340
310 320 330 340 350 360
pF1KB7 VPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTSSSEEMRP
. :::.:.:::::::.:: .. .: : . .. :..: :. . . . :: .. :
CCDS70 INRPLTMKKEGIQTRNRKMSSKSK-KCKKVHDSLEDFPKNSSFNPAALSRHMSSLSHISP
350 360 370 380 390 400
370 380 390 400 410 420
pF1KB7 IKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQSPQTSSK
.. :::. . . . : ... : ::
CCDS70 FSH----SSHMLTTPTPMHPPSSLSFGPHHPSSMVTAMG
410 420 430 440
430 440
pF1KB7 QDSWNSLVLADSHGDIITA
>>CCDS14305.1 GATA1 gene_id:2623|Hs108|chrX (413 aa)
initn: 820 init1: 644 opt: 691 Z-score: 410.0 bits: 84.9 E(32554): 1.6e-16
Smith-Waterman score: 702; 36.5% identity (56.9% similar) in 425 aa overlap (3-422:39-404)
10 20 30
pF1KB7 MYQSLAMAANHGPPPGAYEAGGPGAFMHGAGA
..: ::. : : :.. :... : :
CCDS14 LGTSEPLPQFVDPALVSSTPESGVFFPSGPEGLDAAASSTAPSTATAAAAALAYYRDAEA
10 20 30 40 50 60
40 50 60 70 80 90
pF1KB7 AS-SPVYVPTPRVPSSVLGLSYLQGGGAGSASGGASGGSSG-GAASGAGPGTQQGSPGWS
:::. : :. ..: .:: .: . :..: :: . : :.. :: .
CCDS14 YRHSPVFQVYPL-------LNCMEGIPGGSPYAGWAYGKTGLYPASTVCP-TREDSPPQA
70 80 90 100 110 120
100 110 120 130 140
pF1KB7 QAGADGAAYTPPPVSPRFSFPGTTGSLAAAAAAAAAREAAAYSSGGGAAGAGL-AGREQY
:: :: . . . : . . : : ..: . :
CCDS14 VEDLDGK-----------------GSTSFLETLKTERLSPDLLTLGPALPSSLPVPNSAY
130 140 150 160
150 160 170 180 190 200
pF1KB7 GRAGFAGSYSSPYPAYMADVGASWAAAAAASAGPFDSPVLH-SLPGRANPAARHPNLDMF
: :.... :: .:. .:: ..:: :. .:: :
CCDS14 GGPDFSSTFFSP-------TGSPLNSAA------YSSPKLRGTLP--LPPC---------
170 180 190
210 220 230 240 250 260
pF1KB7 DDFSEGRECVNCGAMSTPLWRRDGTGHYLCNACGLYHKMNGINRPLIKPQRRLSASRRVG
:.:::::::: .::::::: ::::::::::::::::: :::::.:..:: .:.:.:
CCDS14 ----EARECVNCGATATPLWRRDRTGHYLCNACGLYHKMNGQNRPLIRPKKRLIVSKRAG
200 210 220 230 240 250
270 280 290 300 310 320
pF1KB7 LSCANCQTTTTTLWRRNAEGEPVCNACGLYMKLHGVPRPLAMRKEGIQTRKRKPKNLNKS
.:.:::::::::::::: :.:::::::::.::: : :::.:::.:::::.:: .. .:.
CCDS14 TQCTNCQTTTTTLWRRNASGDPVCNACGLYYKLHQVNRPLTMRKDGIQTRNRKASGKGKK
260 270 280 290 300 310
330 340 350 360 370 380
pF1KB7 KTPAAPSGSESLP-PASGASSNSSNATTSSSEEMRPIKTEPGLSSHYGHSSSVSQTFSVS
: .. .:. . ::.: .... ... :. ::. .. . : ..
CCDS14 KRGSSLGGTGAAEGPAGGFMVVAGGSGSGNCGEV-----ASGLTLGPPGTAHLYQGLGPV
320 330 340 350 360 370
390 400 410 420 430 440
pF1KB7 AMSGHGPSIHPVLSALKLSPQGYASPVSQSPQTSSKQDSWNSLVLADSHGDIITA
..:: . : . : :: : . :.. : :.:
CCDS14 VLSGPVSHLMPFPGPLLGSPTG-SFPTGPMPPTTSTTVVAPLSS
380 390 400 410
442 residues in 1 query sequences
18511270 residues in 32554 library sequences
Tcomplib [36.3.4 Apr, 2011] (8 proc)
start: Sun Nov 6 11:20:47 2016 done: Sun Nov 6 11:20:48 2016
Total Scan time: 3.770 Total Display time: 0.060
Function used was FASTA [36.3.4 Apr, 2011]