FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011
Please cite:
W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448
Query: pF1KB9723, 414 aa
1>>>pF1KB9723 414 - 414 aa - 414 aa
Library: human.CCDS.faa
18511270 residues in 32554 sequences
Statistics: Expectation_n fit: rho(ln(x))= 10.0886+/-0.000916; mu= -1.1031+/- 0.055
mean_var=383.4931+/-79.841, 0's: 0 Z-trim(117.4): 57 B-trim: 169 in 1/51
Lambda= 0.065493
statistics sampled from 18044 (18100) to 18044 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
ktup: 2, E-join: 1 (0.833), E-opt: 0.2 (0.556), width: 16
Scan time: 2.800
The best scores are:                                 opt  bits E(32554)
CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8    ( 414) 2980 295.0 8.8e-80
CCDS5977.1 SOX7 gene_id:83595|Hs108|chr8     ( 388)  847  93.4 3.9e-19
CCDS13552.1 SOX18 gene_id:54345|Hs108|chr20  ( 384)  683  77.9 1.8e-14
>>CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8 (414 aa)
initn: 2980 init1: 2980 opt: 2980 Z-score: 1545.8 bits: 295.0 E(32554): 8.8e-80
Smith-Waterman score: 2980; 100.0% identity (100.0% similar) in 414 aa overlap (1-414:1-414)
10 20 30 40 50 60
pF1KB9 MSSPDAGYASDDQSQTQSALPAVMAGLGPCPWAESLSPIGDMKVKGEAPANSGAPAGAAG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 MSSPDAGYASDDQSQTQSALPAVMAGLGPCPWAESLSPIGDMKVKGEAPANSGAPAGAAG
10 20 30 40 50 60
70 80 90 100 110 120
pF1KB9 RAKGESRIRRPMNAFMVWAKDERKRLAQQNPDLHNAELSKMLGKSWKALTLAEKRPFVEE
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 RAKGESRIRRPMNAFMVWAKDERKRLAQQNPDLHNAELSKMLGKSWKALTLAEKRPFVEE
70 80 90 100 110 120
130 140 150 160 170 180
pF1KB9 AERLRVQHMQDHPNYKYRPRRRKQVKRLKRVEGGFLHGLAEPQAAALGPEGGRVAMDGLG
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 AERLRVQHMQDHPNYKYRPRRRKQVKRLKRVEGGFLHGLAEPQAAALGPEGGRVAMDGLG
130 140 150 160 170 180
190 200 210 220 230 240
pF1KB9 LQFPEQGFPAGPPLLPPHMGGHYRDCQSLGAPPLDGYPLPTPDTSPLDGVDPDPAFFAAP
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 LQFPEQGFPAGPPLLPPHMGGHYRDCQSLGAPPLDGYPLPTPDTSPLDGVDPDPAFFAAP
190 200 210 220 230 240
250 260 270 280 290 300
pF1KB9 MPGDCPAAGTYSYAQVSDYAGPPEPPAGPMHPRLGPEPAGPSIPGLLAPPSALHVYYGAM
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 MPGDCPAAGTYSYAQVSDYAGPPEPPAGPMHPRLGPEPAGPSIPGLLAPPSALHVYYGAM
250 260 270 280 290 300
310 320 330 340 350 360
pF1KB9 GSPGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPSPPPEALPCRDGTDPSQPAELLGEVDR
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 GSPGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPSPPPEALPCRDGTDPSQPAELLGEVDR
310 320 330 340 350 360
370 380 390 400 410
pF1KB9 TEFEQYLHFVCKPEMGLPYQGHDSGVNLPDSHGAISSVVSDASSAVYYCNYPDV
::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS61 TEFEQYLHFVCKPEMGLPYQGHDSGVNLPDSHGAISSVVSDASSAVYYCNYPDV
370 380 390 400 410
>>CCDS5977.1 SOX7 gene_id:83595|Hs108|chr8 (388 aa)
initn: 775 init1: 525 opt: 847 Z-score: 456.9 bits: 93.4 E(32554): 3.9e-19
Smith-Waterman score: 847; 43.3% identity (61.9% similar) in 404 aa overlap (27-411:5-385)
10 20 30 40 50
pF1KB9 MSSPDAGYASDDQSQTQSALPAVMAGLGPCPWAESLS-PIGDMKVK-GEAPANSGAPAGA
:: :: :.: : : ... :..: : :
CCDS59 MASLLGAYPWPEGLECPALDAELSDGQSPPAVPRPPGD
10 20 30
60 70 80 90 100 110
pF1KB9 AGRAKGESRIRRPMNAFMVWAKDERKRLAQQNPDLHNAELSKMLGKSWKALTLAEKRPFV
: .::::::::::::::::::::::: :::::::::::::::::::::::..:::.:
CCDS59 KG---SESRIRRPMNAFMVWAKDERKRLAVQNPDLHNAELSKMLGKSWKALTLSQKRPYV
40 50 60 70 80 90
120 130 140 150 160 170
pF1KB9 EEAERLRVQHMQDHPNYKYRPRRRKQVKRL-KRVEGGFL-HGLAEPQAAALGPEGGRVAM
.::::::.:::::.:::::::::.::.::: :::. ::: .:.. : : :: .
CCDS59 DEAERLRLQHMQDYPNYKYRPRRKKQAKRLCKRVDPGFLLSSLSRDQNAL--PEKRSGSR
100 110 120 130 140 150
180 190 200 210 220
pF1KB9 DGLGLQFPEQGFPAGPPLLPPHMGGHYRDCQSLGA---PP--LDGYP--LPTP-DTSPLD
.:: . . . : : : . : :.. . :. : .: :: :::: . ::::
CCDS59 GALGEKEDRGEYSPGTAL--PSLRGCYHEGPAGGGGGGTPSSVDTYPYGLPTPPEMSPLD
160 170 180 190 200 210
230 240 250 260 270 280
pF1KB9 GVDPDPAFFAAPMPGDCPAAGTYSYAQVSDYAGPPEPPAGPMHPRLGPEPAGPSIP-GLL
..:. .::..: : . :. :. :. :. .: : : : : :
CCDS59 VLEPEQTFFSSP----CQEEHGHPRRI-------PHLPGHPYSPEYAPSPLHCSHPLGSL
220 230 240 250 260
290 300 310 320 330 340
pF1KB9 APPSALHVYYGAMGSPGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPSPPPEALPCRDGTD
: .. : .: :: : . . .. :.. :: ::::: : :. :
CCDS59 ALGQSPGV---SMMSPVPGCPPSPAYYSPATYHPLHSNLQAHLGQLSPPPEH-PGFDALD
270 280 290 300 310
350 360 370 380 390 400
pF1KB9 PSQPAELLGEVDRTEFEQYLHFVCKPEMG---LPYQGHD--SGVN-LPDSHGAISSVVSD
. .::::..::.::.:::. .:. . . .:: : :. .. .. ::..:
CCDS59 QLSQVELLGDMDRNEFDQYLNTPGHPDSATGAMALSGHVPVSQVTPTGPTETSLISVLAD
320 330 340 350 360 370
410
pF1KB9 ASSAVYYCNYPDV
:. :.:: .:
CCDS59 AT-ATYYNSYSVS
380
>>CCDS13552.1 SOX18 gene_id:54345|Hs108|chr20 (384 aa)
initn: 849 init1: 541 opt: 683 Z-score: 373.3 bits: 77.9 E(32554): 1.8e-14
Smith-Waterman score: 859; 44.0% identity (59.6% similar) in 423 aa overlap (3-408:23-378)
10 20 30 40
pF1KB9 MSSPDAGYASDDQSQTQSALPAVMAGLGPCPWAESLSPIG
.: : :.: .. .: ::..: .: : ::
CCDS13 MQRSPPGYGAQDDPPARRDCAWAPGHGAAAD--TRGLAAGPAALA--APAAPASPPSP-Q
10 20 30 40 50
50 60 70 80 90
pF1KB9 DMKVKGEAPANSG-APAGAAGR-AKGESRIRRPMNAFMVWAKDERKRLAQQNPDLHNAEL
.. :. : .::: . : : :::::::::::::::::::::::::::::::: :
CCDS13 RSPPRSPEPGRYGLSPAGRGERQAADESRIRRPMNAFMVWAKDERKRLAQQNPDLHNAVL
60 70 80 90 100 110
100 110 120 130 140 150
pF1KB9 SKMLGKSWKALTLAEKRPFVEEAERLRVQHMQDHPNYKYRPRRRKQVKRLKRVEGGFL-H
::::::.:: :. :::::::::::::::::..:::::::::::.::... .:.: :.:
CCDS13 SKMLGKAWKELNAAEKRPFVEEAERLRVQHLRDHPNYKYRPRRKKQARKARRLEPGLLLP
120 130 140 150 160 170
160 170 180 190 200 210
pF1KB9 GLAEPQAAALGPEGGRVAMDGLGLQFPEQGFPAGPPLLPPHMGGHYRDCQSLGAPPLDGY
::: :: : : . :::. . .:. ::: .::
CCDS13 GLAPPQ-----P--------------PPEPFPAASG-----SARAFRELPPLGAE-FDGL
180 190 200 210
220 230 240 250 260 270
pF1KB9 PLPTPDTSPLDGVDP-DPAFFAAPM-PGDC---PAAGTYSYAQVS-DYAGPPEPPAGPMH
::::. :::::..: . ::: : : :: : . :. ...: : .: : .
CCDS13 GLPTPERSPLDGLEPGEAAFFPPPAAPEDCALRPFRAPYAPTELSRDPGGCYGAPLAEAL
220 230 240 250 260 270
280 290 300 310 320 330
pF1KB9 PRLGPEPAGPSIPGLLAPPSALHVYYGAMGSPGAGGGRGFQMQPQHQHQHQHQHHPPGPG
: .: ::.: . :: :::..:.:: : ::
CCDS13 -RTAP-PAAP-LAGL---------YYGTLGTPG-----------------------PYPG
280 290
340 350 360 370 380
pF1KB9 QPSPPPEALPCRDGTDPSQPA-ELLGEVDRTEFEQYLHFV-CKPEM-GLPY-----QGHD
:::::: : ....: :: .: ..:: :::.:::. .:. :::: .
CCDS13 PLSPPPEAPPL-ESAEPLGPAADLWADVDLTEFDQYLNCSRTRPDAPGLPYHVALAKLGP
300 310 320 330 340 350
390 400 410
pF1KB9 SGVNLPDSHGAISSVVSDASSAVYYCNYPDV
... :. . ::.. :::::::::
CCDS13 RAMSCPEESSLISAL-SDASSAVYYSACISG
360 370 380
414 residues in 1 query sequences
18511270 residues in 32554 library sequences
Tcomplib [36.3.4 Apr, 2011] (8 proc)
start: Fri Nov 4 18:32:10 2016 done: Fri Nov 4 18:32:10 2016
Total Scan time: 2.800 Total Display time: -0.020
Function used was FASTA [36.3.4 Apr, 2011]