FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB7991, 442 aa
  1>>>pF1KB7991 442 - 442 aa - 442 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 10.0889+/-0.000948; mu= -1.6557+/- 0.057
 mean_var=317.2493+/-66.521, 0's: 0 Z-trim(116.1): 24 B-trim: 0 in 0/52
 Lambda= 0.072007
 statistics sampled from 16679 (16699) to 16679 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.796), E-opt: 0.2 (0.513), width: 16
 Scan time: 3.770

The best scores are:                                  opt  bits E(32554)
CCDS5983.1 GATA4 gene_id:2626|Hs108|chr8      ( 442) 3005 325.3 7.4e-89
CCDS78303.1 GATA4 gene_id:2626|Hs108|chr8     ( 443) 2993 324.1 1.8e-88
CCDS78304.1 GATA4 gene_id:2626|Hs108|chr8     ( 236) 1608 180.0 2.3e-45
CCDS11872.1 GATA6 gene_id:2627|Hs108|chr18    ( 595) 1032 120.5 4.6e-27
CCDS13499.1 GATA5 gene_id:140628|Hs108|chr20  ( 397) 1016 118.7 1.1e-26
CCDS3049.1 GATA2 gene_id:2624|Hs108|chr3      ( 480)  846 101.1 2.6e-21
CCDS31143.1 GATA3 gene_id:2625|Hs108|chr10    ( 444)  743  90.3 4.1e-18
CCDS7083.1 GATA3 gene_id:2625|Hs108|chr10     ( 443)  732  89.2 8.9e-18
CCDS14305.1 GATA1 gene_id:2623|Hs108|chrX     ( 413)  691  84.9 1.6e-16

>>CCDS5983.1 GATA4 gene_id:2626|Hs108|chr8 (442 aa)
 initn: 3005 init1: 3005 opt: 3005 Z-score: 1708.7 bits: 325.3 E(32554): 7.4e-89
Smith-Waterman score: 3005; 100.0% identity (100.0% similar) in 442 aa overlap (1-442:1-442)

               10        20        30        40        50        60
pF1KB7 MYQSLAMAANHGPPPGAYEAGGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGAG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 MYQSLAMAANHGPPPGAYEAGGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGAG
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB7 SASGGASGGSSGGAASGAGPGTQQGSPGWSQAGADGAAYTPPPVSPRFSFPGTTGSLAAA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 SASGGASGGSSGGAASGAGPGTQQGSPGWSQAGADGAAYTPPPVSPRFSFPGTTGSLAAA
130 140 150 160 170 180 pF1KB7 AAAAAAREAAAYSSGGGAAGAGLAGREQYGRAGFAGSYSSPYPAYMADVGASWAAAAAAS :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS59 AAAAAAREAAAYSSGGGAAGAGLAGREQYGRAGFAGSYSSPYPAYMADVGASWAAAAAAS 130 140 150 160 170 180 190 200 210 220 230 240 pF1KB7 AGPFDSPVLHSLPGRANPAARHPNLDMFDDFSEGRECVNCGAMSTPLWRRDGTGHYLCNA :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS59 AGPFDSPVLHSLPGRANPAARHPNLDMFDDFSEGRECVNCGAMSTPLWRRDGTGHYLCNA 190 200 210 220 230 240 250 260 270 280 290 300 pF1KB7 CGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACGLYMK :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS59 CGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACGLYMK 250 260 270 280 290 300 310 320 330 340 350 360 pF1KB7 LHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTSSSEE :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS59 LHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTSSSEE 310 320 330 340 350 360 370 380 390 400 410 420 pF1KB7 MRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQSPQT :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS59 MRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQSPQT 370 380 390 400 410 420 430 440 pF1KB7 SSKQDSWNSLVLADSHGDIITA :::::::::::::::::::::: CCDS59 SSKQDSWNSLVLADSHGDIITA 430 440 >>CCDS78303.1 GATA4 gene_id:2626|Hs108|chr8 (443 aa) initn: 1623 init1: 1623 opt: 2993 Z-score: 1702.0 bits: 324.1 E(32554): 1.8e-88 Smith-Waterman score: 2993; 99.8% identity (99.8% similar) in 443 aa overlap (1-442:1-443) 10 20 30 40 50 60 pF1KB7 MYQSLAMAANHGPPPGAYEAGGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGAG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS78 MYQSLAMAANHGPPPGAYEAGGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGAG 10 20 30 40 50 60 70 80 90 100 110 120 pF1KB7 SASGGASGGSSGGAASGAGPGTQQGSPGWSQAGADGAAYTPPPVSPRFSFPGTTGSLAAA :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS78 SASGGASGGSSGGAASGAGPGTQQGSPGWSQAGADGAAYTPPPVSPRFSFPGTTGSLAAA 
70 80 90 100 110 120 130 140 150 160 170 180 pF1KB7 AAAAAAREAAAYSSGGGAAGAGLAGREQYGRAGFAGSYSSPYPAYMADVGASWAAAAAAS :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS78 AAAAAAREAAAYSSGGGAAGAGLAGREQYGRAGFAGSYSSPYPAYMADVGASWAAAAAAS 130 140 150 160 170 180 190 200 210 220 230 pF1KB7 AGPFDSPVLHSLPGRANPAARHPNL-DMFDDFSEGRECVNCGAMSTPLWRRDGTGHYLCN ::::::::::::::::::::::::: :::::::::::::::::::::::::::::::::: CCDS78 AGPFDSPVLHSLPGRANPAARHPNLVDMFDDFSEGRECVNCGAMSTPLWRRDGTGHYLCN 190 200 210 220 230 240 240 250 260 270 280 290 pF1KB7 ACGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACGLYM :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS78 ACGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACGLYM 250 260 270 280 290 300 300 310 320 330 340 350 pF1KB7 KLHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTSSSE :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS78 KLHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTSSSE 310 320 330 340 350 360 360 370 380 390 400 410 pF1KB7 EMRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQSPQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS78 EMRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQSPQ 370 380 390 400 410 420 420 430 440 pF1KB7 TSSKQDSWNSLVLADSHGDIITA ::::::::::::::::::::::: CCDS78 TSSKQDSWNSLVLADSHGDIITA 430 440 >>CCDS78304.1 GATA4 gene_id:2626|Hs108|chr8 (236 aa) initn: 1608 init1: 1608 opt: 1608 Z-score: 928.0 bits: 180.0 E(32554): 2.3e-45 Smith-Waterman score: 1608; 100.0% identity (100.0% similar) in 236 aa overlap (207-442:1-236) 180 190 200 210 220 230 pF1KB7 AAASAGPFDSPVLHSLPGRANPAARHPNLDMFDDFSEGRECVNCGAMSTPLWRRDGTGHY :::::::::::::::::::::::::::::: CCDS78 MFDDFSEGRECVNCGAMSTPLWRRDGTGHY 10 20 30 240 250 260 270 280 290 pF1KB7 LCNACGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS78 LCNACGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACG 40 50 60 70 80 90 300 310 320 330 
340 350 pF1KB7 LYMKLHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTS :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS78 LYMKLHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTS 100 110 120 130 140 150 360 370 380 390 400 410 pF1KB7 SSEEMRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS78 SSEEMRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQ 160 170 180 190 200 210 420 430 440 pF1KB7 SPQTSSKQDSWNSLVLADSHGDIITA :::::::::::::::::::::::::: CCDS78 SPQTSSKQDSWNSLVLADSHGDIITA 220 230 >>CCDS11872.1 GATA6 gene_id:2627|Hs108|chr18 (595 aa) initn: 1160 init1: 814 opt: 1032 Z-score: 599.3 bits: 120.5 E(32554): 4.6e-27 Smith-Waterman score: 1256; 50.2% identity (68.3% similar) in 476 aa overlap (1-433:147-595) 10 20 30 pF1KB7 MYQSLAMAANHGPPPGAYEAGGPGAFMHGA :::.:: ...:: .::. :.::.:.:.: CCDS11 DLDQAATASKLLWSSRGAKLSPFAPEQPEEMYQTLAALSSQGP--AAYD-GAPGGFVHSA 120 130 140 150 160 170 40 50 60 70 80 pF1KB7 GAA-------SSPVYVPTPRVPSSVLGLSY-LQGGGAGSAS--GGASGGSSGGAASGAGP .:: :::::::: :: : . :: : :::.:.: :. :::.. . ::. .: CCDS11 AAAAAAAAAASSPVYVPTTRVGSMLPGLPYHLQGSGSGPANHAGGAGAHPGWPQASADSP 180 190 200 210 220 230 90 100 110 120 130 pF1KB7 --GTQQGSPGWSQAGADGAAYTPPPVSPRFSFPGTTGSLAAAAA------AAAAREAAAY :. :. : . :: ::. . :: :: . . . .: .:: :::. .:. CCDS11 PYGSGGGAAGGGAAGPGGAGSAAAHVSARFPY-SPSPPMANGAAREPGGYAAAGSGGAGG 240 250 260 270 280 290 140 150 160 170 pF1KB7 SSGGGAAGAGLAGRE-QYGRAGFA----GSYS----------SPYPAYMADVGASWAAAA ::::.. :...::: ::. . : :.: ::: : ::: . : CCDS11 VSGGGSSLAAMGGREPQYSSLSAARPLNGTYHHHHHHHHHHPSPYSPY---VGAPLTPAW 300 310 320 330 340 180 190 200 210 220 230 pF1KB7 AASAGPFDSPVLHSLPGRAN---PAARHPNLDMFDDFSEGRECVNCGAMSTPLWRRDGTG ::::..:::::: .::. :. : :. 
:...:.::.:::::::...:::::::::: CCDS11 P--AGPFETPVLHSLQSRAGAPLPVPRGPSADLLEDLSESRECVNCGSIQTPLWRRDGTG 350 360 370 380 390 400 240 250 260 270 280 290 pF1KB7 HYLCNACGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNA :::::::::: ::::..:::::::.:. .:::.:::::::.::::::::::::::::::: CCDS11 HYLCNACGLYSKMNGLSRPLIKPQKRVPSSRRLGLSCANCHTTTTTLWRRNAEGEPVCNA 410 420 430 440 450 460 300 310 320 330 340 350 pF1KB7 CGLYMKLHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLP--PASGASSNSSN ::::::::::::::::.:::::::::::::.::::: .. :.. :.: :.: .::::.. CCDS11 CGLYMKLHGVPRPLAMKKEGIQTRKRKPKNINKSKTCSGNSNN-SIPMTPTS-TSSNSDD 470 480 490 500 510 520 360 370 380 390 400 410 pF1KB7 ATTSSSEEMRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQ-GYA . ..: .: . : . .:.: : : .: : :: : : : CCDS11 CSKNTSPTTQPTASGAG----------------APVMTGAGESTNPENSELKYSGQDGLY 530 540 550 560 420 430 440 pF1KB7 SPVS-QSPQ--TSS-KQDSWNSLVLADSHGDIITA :: :: ::: . ::: .:.:: CCDS11 IGVSLASPAEVTSSVRPDSWCALALA 570 580 590 >>CCDS13499.1 GATA5 gene_id:140628|Hs108|chr20 (397 aa) initn: 1152 init1: 796 opt: 1016 Z-score: 592.7 bits: 118.7 E(32554): 1.1e-26 Smith-Waterman score: 1144; 46.0% identity (67.4% similar) in 448 aa overlap (1-433:1-397) 10 20 30 40 50 60 pF1KB7 MYQSLAMAANHGPPPGAYEAGGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGAG ::::::.::. : .:: . :.:.:. ::.: :..:: :::: ::::.: CCDS13 MYQSLALAAS--PRQAAY--ADSGSFLHAPGAGS-PMFVPPARVPSM---LSYLSG---- 10 20 30 40 70 80 90 100 110 pF1KB7 SASGGASGGSSGGAASGAGPGTQQGSPGWSQ-AGADGAAYTP-PPVSPRFSFPGTTGSLA . : . :::.: : ::..:. : : : ::.:. CCDS13 -------------CEPSPQPPELAARPGWAQTATADSSAFGPGSPHPPAAHPPGATAFPF 50 60 70 80 90 120 130 140 150 160 170 pF1KB7 AAAAAAAAREAAAYSSGGGAAGAGLAGREQY----GRAGFAGSYSSPYPAYMA-DVGASW : . .. . ..: . :.: ..: :::. :: . :::. ::::.. ::. :: CCDS13 AHSPSGPGSGGSAGGRDGSAYQGALLPREQFAAPLGRP-VGTSYSATYPAYVSPDVAQSW 100 110 120 130 140 150 180 190 200 210 220 230 pF1KB7 AAAAAASAGPFDSPVLHSLPGRANPAARHPNL--DMFDDF-SEGRECVNCGAMSTPLWRR .:::::. :::.:::: .:.. 
:....: .::::::::::.::::::: CCDS13 ------TAGPFDGSVLHGLPGR------RPTFVSDFLEEFPGEGRECVNCGALSTPLWRR 160 170 180 190 200 240 250 260 270 280 290 pF1KB7 DGTGHYLCNACGLYHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEP :::::::::::::::::::.::::..::.:::.:::.:: :.::.::.:::::::.:::: CCDS13 DGTGHYLCNACGLYHKMNGVNRPLVRPQKRLSSSRRAGLCCTNCHTTNTTLWRRNSEGEP 210 220 230 240 250 260 300 310 320 330 340 350 pF1KB7 VCNACGLYMKLHGVPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNS ::::::::::::::::::::.::.:::::::::.. :.. .. . . : :.. ::..: CCDS13 VCNACGLYMKLHGVPRPLAMKKESIQTRKRKPKTIAKARGSSGSTRNASASPSAVASTDS 270 280 290 300 310 320 360 370 380 390 400 410 pF1KB7 SNATTSSSEEMRPIKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGY : ::. :..:.:.: . :.. .: . . :. : .:. :. . CCDS13 SAATS---------KAKPSLASPVCPGPSMAP----QASGQEDDSLAPGHLEFKFEPEDF 330 340 350 360 420 430 440 pF1KB7 ASP-VSQSPQT----SSKQDSWNSLVLADSHGDIITA : : .. :::. . .:..: .:.:: CCDS13 AFPSTAPSPQAGLRGALRQEAWCALALA 370 380 390 >>CCDS3049.1 GATA2 gene_id:2624|Hs108|chr3 (480 aa) initn: 794 init1: 704 opt: 846 Z-score: 496.1 bits: 101.1 E(32554): 2.6e-21 Smith-Waterman score: 846; 43.2% identity (64.7% similar) in 405 aa overlap (15-398:93-472) 10 20 30 40 pF1KB7 MYQSLAMAANHGPPPG-AYEAGGPGAFMHGAGAASSPVYV-P-- :: . :: .:. .:. .: : : CCDS30 PAHARARVSYSPAHARLTGGQMCRPHLLHSPGLPWLDGGKAALSAAAAHHHNPWTVSPFS 70 80 90 100 110 120 50 60 70 80 90 pF1KB7 -TPRVPSSVLGLSYLQGGGAGSASGGASGGSSGGAASGAGPGTQQGSPGWSQAGADGAAY :: ::.. : :: :. ::.:::.::..:... : : ...:. .. CCDS30 KTPLHPSAAGG-----PGGPLSVYPGAGGGSGGGSGSSVASLT----PTAAHSGSHLFGF 130 140 150 160 170 100 110 120 130 140 150 pF1KB7 TP-PP--VSPRFSFPGTTGSLAAAAAAAAAREAAAYSSGGGAAGAGL--AGREQYG---R : :: ::: :.:::. . :...:.. : . .. : ..: . . . : : CCDS30 PPTPPKEVSPD---PSTTGAASPASSSAGGSAARGEDKDGVKYQVSLTESMKMESGSPLR 180 190 200 210 220 230 160 170 180 190 200 pF1KB7 AGFAGSYSSPYPAYMADVGASWAAAAAASAGPFDSPVLHSLPGR--ANPAARH-PNL-DM :.: ..: . . :.. ::: . ..: ..: :: ..::. :. . 
CCDS30 PGLATMGTQPATHHPIPTYPSYVPAAAHD---YSSGLFH--PGGFLGGPASSFTPKQRSK 240 250 260 270 280 210 220 230 240 250 260 pF1KB7 FDDFSEGRECVNCGAMSTPLWRRDGTGHYLCNACGLYHKMNGINRPLIKPQRRLSASRRV . ::::::::::: .::::::::::::::::::::::::: :::::::.:::::.::. CCDS30 ARSCSEGRECVNCGATATPLWRRDGTGHYLCNACGLYHKMNGQNRPLIKPKRRLSAARRA 290 300 310 320 330 340 270 280 290 300 310 320 pF1KB7 GLSCANCQTTTTTLWRRNAEGEPVCNACGLYMKLHGVPRPLAMRKEGIQTRKRKPKNLNK : ::::::::::::::::.:.:::::::::.:::.: :::.:.:::::::.:: .: : CCDS30 GTCCANCQTTTTTLWRRNANGDPVCNACGLYYKLHNVNRPLTMKKEGIQTRNRKMSN--K 350 360 370 380 390 400 330 340 350 360 370 380 pF1KB7 SKTPAAPSGSESLPPASGASSNSSNATTSSSE--EMRPIKTEPGLSSHYGHS-SSVSQTF :: . .:.: . : ...:. .... .: :. : .: : :: . . CCDS30 SKK--SKKGAECFEELSKCMQEKSSPFSAAALAGHMAPVGHLPPFS-HSGHILPTPTPIH 410 420 430 440 450 460 390 400 410 420 430 440 pF1KB7 SVSAMS-GHGPSIHPVLSALKLSPQGYASPVSQSPQTSSKQDSWNSLVLADSHGDIITA :..: :: : :: CCDS30 PSSSLSFGH-P--HPSSMVTAMG 470 480 >>CCDS31143.1 GATA3 gene_id:2625|Hs108|chr10 (444 aa) initn: 699 init1: 699 opt: 743 Z-score: 438.8 bits: 90.3 E(32554): 4.1e-18 Smith-Waterman score: 758; 40.9% identity (62.7% similar) in 362 aa overlap (51-395:94-437) 30 40 50 60 70 pF1KB7 GGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGA-GS---ASGGASGGSSGGAAS : .:.:: : :: :: . : . CCDS31 YGNSVRATVQRYPPTHHGSQVCRPPLLHGSLPWLDGGKALGSHHTASPWNLSPFSKTSIH 70 80 90 100 110 120 80 90 100 110 120 pF1KB7 GAGPGTQQGSPGWSQAGADGAAYTP-----PPVSPRFSFPGTTGSLAAAAAAAAA--REA ..:: . : :... .:. .: ::. :. : . : ..:..: .: CCDS31 HGSPGPLSVYPPASSSSLSGGHASPHLFTFPPTPPKDVSPDPSLSTPGSAGSARQDEKEC 130 140 150 160 170 180 130 140 150 160 170 180 pF1KB7 AAYSSGGGAAGAGLAGREQYGRAGFAGSYSS---P---YPAYMADVGASWAAAAAASAGP :. . ... . . ....:. :: : :: :. . ... .. .: CCDS31 LKYQVPLPDSMKLESSHSRGSMTALGGASSSTHHPITTYPPYVPEYSSGLFPPSSLLGG- 190 200 210 220 230 240 190 200 210 220 230 240 pF1KB7 FDSPVLHSLPGRANPAARHPNLDMFDDFSEGRECVNCGAMSTPLWRRDGTGHYLCNACGL ::. . 
.: : :: .:::::::::: :::::::::::::::::::: CCDS31 --SPTGFGCKSR--PKARSS--------TEGRECVNCGATSTPLWRRDGTGHYLCNACGL 250 260 270 280 290 250 260 270 280 290 300 pF1KB7 YHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACGLYMKLHG :::::: :::::::.:::::.::.: :::::::::::::::::.:.:::::::::.:::. CCDS31 YHKMNGQNRPLIKPKRRLSAARRAGTSCANCQTTTTTLWRRNANGDPVCNACGLYYKLHN 300 310 320 330 340 350 310 320 330 340 350 360 pF1KB7 VPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTSSSEEMRP . :::.:.:::::::.:: .. .: : . .. :..: :. . . . :: .. : CCDS31 INRPLTMKKEGIQTRNRKMSSKSK-KCKKVHDSLEDFPKNSSFNPAALSRHMSSLSHISP 360 370 380 390 400 370 380 390 400 410 420 pF1KB7 IKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQSPQTSSK .. :::. . . . : ... : :: CCDS31 FSH----SSHMLTTPTPMHPPSSLSFGPHHPSSMVTAMG 410 420 430 440 430 440 pF1KB7 QDSWNSLVLADSHGDIITA >>CCDS7083.1 GATA3 gene_id:2625|Hs108|chr10 (443 aa) initn: 695 init1: 695 opt: 732 Z-score: 432.6 bits: 89.2 E(32554): 8.9e-18 Smith-Waterman score: 753; 40.9% identity (62.4% similar) in 362 aa overlap (51-395:94-436) 30 40 50 60 70 pF1KB7 GGPGAFMHGAGAASSPVYVPTPRVPSSVLGLSYLQGGGA-GS---ASGGASGGSSGGAAS : .:.:: : :: :: . : . CCDS70 YGNSVRATVQRYPPTHHGSQVCRPPLLHGSLPWLDGGKALGSHHTASPWNLSPFSKTSIH 70 80 90 100 110 120 80 90 100 110 120 pF1KB7 GAGPGTQQGSPGWSQAGADGAAYTP-----PPVSPRFSFPGTTGSLAAAAAAAAA--REA ..:: . : :... .:. .: ::. :. : . : ..:..: .: CCDS70 HGSPGPLSVYPPASSSSLSGGHASPHLFTFPPTPPKDVSPDPSLSTPGSAGSARQDEKEC 130 140 150 160 170 180 130 140 150 160 170 180 pF1KB7 AAYSSGGGAAGAGLAGREQYGRAGFAGSYSS---P---YPAYMADVGASWAAAAAASAGP :. . ... . . ....:. :: : :: :. . ... .. .: CCDS70 LKYQVPLPDSMKLESSHSRGSMTALGGASSSTHHPITTYPPYVPEYSSGLFPPSSLLGG- 190 200 210 220 230 240 190 200 210 220 230 240 pF1KB7 FDSPVLHSLPGRANPAARHPNLDMFDDFSEGRECVNCGAMSTPLWRRDGTGHYLCNACGL ::. . 
.: : :: : ::::::::: :::::::::::::::::::: CCDS70 --SPTGFGCKSR--PKARS---------STGRECVNCGATSTPLWRRDGTGHYLCNACGL 250 260 270 280 250 260 270 280 290 300 pF1KB7 YHKMNGINRPLIKPQRRLSASRRVGLSCANCQTTTTTLWRRNAEGEPVCNACGLYMKLHG :::::: :::::::.:::::.::.: :::::::::::::::::.:.:::::::::.:::. CCDS70 YHKMNGQNRPLIKPKRRLSAARRAGTSCANCQTTTTTLWRRNANGDPVCNACGLYYKLHN 290 300 310 320 330 340 310 320 330 340 350 360 pF1KB7 VPRPLAMRKEGIQTRKRKPKNLNKSKTPAAPSGSESLPPASGASSNSSNATTSSSEEMRP . :::.:.:::::::.:: .. .: : . .. :..: :. . . . :: .. : CCDS70 INRPLTMKKEGIQTRNRKMSSKSK-KCKKVHDSLEDFPKNSSFNPAALSRHMSSLSHISP 350 360 370 380 390 400 370 380 390 400 410 420 pF1KB7 IKTEPGLSSHYGHSSSVSQTFSVSAMSGHGPSIHPVLSALKLSPQGYASPVSQSPQTSSK .. :::. . . . : ... : :: CCDS70 FSH----SSHMLTTPTPMHPPSSLSFGPHHPSSMVTAMG 410 420 430 440 430 440 pF1KB7 QDSWNSLVLADSHGDIITA >>CCDS14305.1 GATA1 gene_id:2623|Hs108|chrX (413 aa) initn: 820 init1: 644 opt: 691 Z-score: 410.0 bits: 84.9 E(32554): 1.6e-16 Smith-Waterman score: 702; 36.5% identity (56.9% similar) in 425 aa overlap (3-422:39-404) 10 20 30 pF1KB7 MYQSLAMAANHGPPPGAYEAGGPGAFMHGAGA ..: ::. : : :.. :... : : CCDS14 LGTSEPLPQFVDPALVSSTPESGVFFPSGPEGLDAAASSTAPSTATAAAAALAYYRDAEA 10 20 30 40 50 60 40 50 60 70 80 90 pF1KB7 AS-SPVYVPTPRVPSSVLGLSYLQGGGAGSASGGASGGSSG-GAASGAGPGTQQGSPGWS :::. : :. ..: .:: .: . :..: :: . : :.. :: . CCDS14 YRHSPVFQVYPL-------LNCMEGIPGGSPYAGWAYGKTGLYPASTVCP-TREDSPPQA 70 80 90 100 110 120 100 110 120 130 140 pF1KB7 QAGADGAAYTPPPVSPRFSFPGTTGSLAAAAAAAAAREAAAYSSGGGAAGAGL-AGREQY :: :: . . . : . . : : ..: . : CCDS14 VEDLDGK-----------------GSTSFLETLKTERLSPDLLTLGPALPSSLPVPNSAY 130 140 150 160 150 160 170 180 190 200 pF1KB7 GRAGFAGSYSSPYPAYMADVGASWAAAAAASAGPFDSPVLH-SLPGRANPAARHPNLDMF : :.... :: .:. .:: ..:: :. 
.:: : CCDS14 GGPDFSSTFFSP-------TGSPLNSAA------YSSPKLRGTLP--LPPC--------- 170 180 190 210 220 230 240 250 260 pF1KB7 DDFSEGRECVNCGAMSTPLWRRDGTGHYLCNACGLYHKMNGINRPLIKPQRRLSASRRVG :.:::::::: .::::::: ::::::::::::::::: :::::.:..:: .:.:.: CCDS14 ----EARECVNCGATATPLWRRDRTGHYLCNACGLYHKMNGQNRPLIRPKKRLIVSKRAG 200 210 220 230 240 250 270 280 290 300 310 320 pF1KB7 LSCANCQTTTTTLWRRNAEGEPVCNACGLYMKLHGVPRPLAMRKEGIQTRKRKPKNLNKS .:.:::::::::::::: :.:::::::::.::: : :::.:::.:::::.:: .. .:. CCDS14 TQCTNCQTTTTTLWRRNASGDPVCNACGLYYKLHQVNRPLTMRKDGIQTRNRKASGKGKK 260 270 280 290 300 310 330 340 350 360 370 380 pF1KB7 KTPAAPSGSESLP-PASGASSNSSNATTSSSEEMRPIKTEPGLSSHYGHSSSVSQTFSVS : .. .:. . ::.: .... ... :. ::. .. . : .. CCDS14 KRGSSLGGTGAAEGPAGGFMVVAGGSGSGNCGEV-----ASGLTLGPPGTAHLYQGLGPV 320 330 340 350 360 370 390 400 410 420 430 440 pF1KB7 AMSGHGPSIHPVLSALKLSPQGYASPVSQSPQTSSKQDSWNSLVLADSHGDIITA ..:: . : . : :: : . :.. : :.: CCDS14 VLSGPVSHLMPFPGPLLGSPTG-SFPTGPMPPTTSTTVVAPLSS 380 390 400 410 442 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Sun Nov 6 11:20:47 2016 done: Sun Nov 6 11:20:48 2016 Total Scan time: 3.770 Total Display time: 0.060 Function used was FASTA [36.3.4 Apr, 2011]