FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011
Please cite:
W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448
Query: pF1KB9649, 441 aa
1>>>pF1KB9649 441 - 441 aa - 441 aa
Library: human.CCDS.faa
18511270 residues in 32554 sequences
Statistics: Expectation_n fit: rho(ln(x))= 8.6813+/-0.000815; mu= 5.6134+/- 0.050
mean_var=247.3238+/-50.350, 0's: 0 Z-trim(116.3): 52 B-trim: 0 in 0/54
Lambda= 0.081553
statistics sampled from 16872 (16924) to 16872 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
ktup: 2, E-join: 1 (0.802), E-opt: 0.2 (0.52), width: 16
Scan time: 3.850
The best scores are:                                opt  bits  E(32554)
CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2    ( 441) 2950 359.6  3.7e-99
CCDS4547.1 SOX4 gene_id:6659|Hs108|chr6     ( 474)  718  97.0  4.4e-20
CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20  ( 315)  586  81.3  1.6e-15
>>CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2 (441 aa)
initn: 2950 init1: 2950 opt: 2950 Z-score: 1893.7 bits: 359.6 E(32554): 3.7e-99
Smith-Waterman score: 2950; 100.0% identity (100.0% similar) in 441 aa overlap (1-441:1-441)
               10        20        30        40        50        60
pF1KB9 MVQQAESLEAESNLPREALDTEEGEFMACSPVALDESDPDWCKTASGHIKRPMNAFMVWS
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 MVQQAESLEAESNLPREALDTEEGEFMACSPVALDESDPDWCKTASGHIKRPMNAFMVWS
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB9 KIERRKIMEQSPDMHNAEISKRLGKRWKMLKDSEKIPFIREAERLRLKHMADYPDYKYRP
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 KIERRKIMEQSPDMHNAEISKRLGKRWKMLKDSEKIPFIREAERLRLKHMADYPDYKYRP
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KB9 RKKPKMDPSAKPSASQSPEKSAAGGGGGSAGGGAGGAKTSKGSSKKCGKLKAPAAAGAKA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 RKKPKMDPSAKPSASQSPEKSAAGGGGGSAGGGAGGAKTSKGSSKKCGKLKAPAAAGAKA
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KB9 GAGKAAQSGDYGGAGDDYVLGSLRVSGSGGGGAGKTVKCVFLDEDDDDDDDDDELQLQIK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 GAGKAAQSGDYGGAGDDYVLGSLRVSGSGGGGAGKTVKCVFLDEDDDDDDDDDELQLQIK
              190       200       210       220       230       240

              250       260       270       280       290       300
pF1KB9 QEPDEEDEEPPHQQLLQPPGQQPSQLLRRYNVAKVPASPTLSSSAESPEGASLYDEVRAG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 QEPDEEDEEPPHQQLLQPPGQQPSQLLRRYNVAKVPASPTLSSSAESPEGASLYDEVRAG
              250       260       270       280       290       300

              310       320       330       340       350       360
pF1KB9 ATSGAGGGSRLYYSFKNITKQHPPPLAQPALSPASSRSVSTSSSSSSGSSSGSSGEDADD
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 ATSGAGGGSRLYYSFKNITKQHPPPLAQPALSPASSRSVSTSSSSSSGSSSGSSGEDADD
              310       320       330       340       350       360

              370       380       390       400       410       420
pF1KB9 LMFDLSLNFSQSAHSASEQQLGGGAAAGNLSLSLVDKDLDSFSEGSLGSHFEFPDYCTPE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 LMFDLSLNFSQSAHSASEQQLGGGAAAGNLSLSLVDKDLDSFSEGSLGSHFEFPDYCTPE
              370       380       390       400       410       420

              430       440
pF1KB9 LSEMIAGDWLEANFSDLVFTY
       :::::::::::::::::::::
CCDS16 LSEMIAGDWLEANFSDLVFTY
              430       440

>>CCDS4547.1 SOX4 gene_id:6659|Hs108|chr6 (474 aa)
initn: 1098 init1: 628 opt: 718 Z-score: 474.1 bits: 97.0 E(32554): 4.4e-20
Smith-Waterman score: 1010; 43.8% identity (64.6% similar) in 491 aa overlap (1-441:1-474)
10 20 30 40 50
pF1KB9 MVQQAESLE-AESNLPREALDTEEG-EF-MACSPVALDES-------DPDWCKTASGHIK
::::... : .:. : :. :. : :. .: ::. . . ::.:::: :::::
CCDS45 MVQQTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIK
10 20 30 40 50 60
60 70 80 90 100 110
pF1KB9 RPMNAFMVWSKIERRKIMEQSPDMHNAEISKRLGKRWKMLKDSEKIPFIREAERLRLKHM
::::::::::.:::::::::::::::::::::::::::.::::.::::::::::::::::
CCDS45 RPMNAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHM
70 80 90 100 110 120
120 130 140 150 160
pF1KB9 ADYPDYKYRPRKKPK---MDPSAKPSASQSP-EKS--AAGGGGGSAGGGAGGAKTSKGSS
::::::::::::: : . :.. .::..: ::. ..:.:::. :::.::.... :..
CCDS45 ADYPDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGG
130 140 150 160 170 180
170 180 190 200 210 220
pF1KB9 KKCGKLKAPAAAGAKAGAGKAAQSGDYGGAGDDYVLGSLRVSGSGGGGAGKTVKCV---F
: . ..:..: . :. : :::: .. .::::.::.. . :
CCDS45 ---GGGASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHAKLILAGGGGGGKAAAAAAASF
190 200 210 220 230
230 240 250 260 270
pF1KB9 LDEDDDD------DDDDDELQLQIKQEPDEE---DEEPPHQQLLQPPGQQ--PSQLLRRY
:. :. .: . :. . . : ::.. ... : :
CCDS45 AAEQAGAAALLPLGAAADHHSLYKARTPSASASASSAASASAALAAPGKHLAEKKVKRVY
240 250 260 270 280 290
280 290 300 310
pF1KB9 NVAKV--PASPT--LSSSAESPEGASLYDEVRAG--------------ATSGAGGGSRL-
. . .::. ....:. . .::.: :: :.: :.: :
CCDS45 LFGGLGTSSSPVGGVGAGADPSDPLGLYEEEGAGCSPDAPSLSGRSSAASSPAAGRSPAD
300 310 320 330 340 350
320 330 340 350 360 370
pF1KB9 YYSFKNITKQHPPPLAQPALSPASSRSVSTSSSSSSGSSSGSSGEDADDLMFDLSLNFSQ
. .. .. : : . : : ::: . : ::::::..::.:. : ::: :.:: :.
CCDS45 HRGYASLRAASPAPSSAP--SHASSSASSHSSSSSSSGSSSSDDEFEDDL---LDLNPSS
360 370 380 390 400 410
380 390 400 410 420 430
pF1KB9 SAHSASEQQLGGGAAAGNLSLSLVDKDLD-SFSEGSLGSHFEFPDYCTPELSEMIAGDWL
. .: : ::. ... : .:.::: .: :: :::::::::::::.::::.::::
CCDS45 NFESMS---LGSFSSS-----SALDRDLDFNFEPGS-GSHFEFPDYCTPEVSEMISGDWL
420 430 440 450 460
440
pF1KB9 EANFSDLVFTY
:...:.:::::
CCDS45 ESSISNLVFTY
470
>>CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20 (315 aa)
initn: 820 init1: 562 opt: 586 Z-score: 392.4 bits: 81.3 E(32554): 1.6e-15
Smith-Waterman score: 720; 39.6% identity (55.1% similar) in 412 aa overlap (31-441:22-315)
10 20 30 40 50 60
pF1KB9 MVQQAESLEAESNLPREALDTEEGEFMACSPVALDESDPDWCKTASGHIKRPMNAFMVWS
:. .: :::: :::::::::::::::
CCDS12 MVQQRGARAKRDGGPPPPGPGPAEEGAREPGWCKTPSGHIKRPMNAFMVWS
10 20 30 40 50
70 80 90 100 110 120
pF1KB9 KIERRKIMEQSPDMHNAEISKRLGKRWKMLKDSEKIPFIREAERLRLKHMADYPDYKYRP
. ::::::.: :::::::::::::.::..:.:::::::.:::::::::::::::::::::
CCDS12 QHERRKIMDQWPDMHNAEISKRLGRRWQLLQDSEKIPFVREAERLRLKHMADYPDYKYRP
60 70 80 90 100 110
130 140 150 160 170
pF1KB9 RKKPKMDPS-AKPSASQSPEKSAAGGGGGSAGGGAGGAKTSKGSSKKCGKLKAPAAAGAK
::: : :. :.: . : ::..::.. . : . :. .: .
CCDS12 RKKSKGAPAKARP---RPP------------GGSGGGSRLKPGP-------QLPGRGGRR
120 130 140
180 190 200 210 220 230
pF1KB9 AGAGKAAQSGDYGGAGDDYVLGSLRVSGSGGGGAGKTVKCVFLDEDDDDDDDDDELQLQI
: :: : ::::. .:::.:::.:: :..
CCDS12 A-AG-----------------------GPLGGGAAAP--------EDDDEDDDEEL-LEV
150 160 170
240 250 260 270 280 290
pF1KB9 KQEPDEEDEEPPHQQLLQPPGQQPSQLLRRYNVAKVPASPTLSSSAESPEGASLYDEVRA
. :.. ::.. : :. :::. . ..:: ::
CCDS12 R--------------LVETPGRE----LWRM----VPAGRAARGQAE-----------RA
180 190 200
300 310 320 330 340 350
pF1KB9 GATSGAGGGSRLYYSFKNITKQHPPPLAQPALSPASSRSVSTSSSSSSGSSSGSSGEDAD
. :: :.. : : ::. :.. .... ::.
CCDS12 QGPSGEGAA------------------AAAAASPTPSED-EEPEEEEEEAAAAEEGEEET
210 220 230 240
360 370 380 390 400 410
pF1KB9 DLMFDLSLNFSQSAHSASEQQLGGGAAAGNLSLSLVDKDLDSFSEGSLGSHFEFPDYCTP
. ::.: . .: : :. :. : .:.: : .. : :::::::::::
CCDS12 VASGEESLGFLS--------RLPPGPAG--LDCSALDRDPD-LQPPSGTSHFEFPDYCTP
250 260 270 280 290
420 430 440
pF1KB9 ELSEMIAGDWLEANFSDLVFTY
:..::::::: ....::::::
CCDS12 EVTEMIAGDWRPSSIADLVFTY
300 310
441 residues in 1 query sequences
18511270 residues in 32554 library sequences
Tcomplib [36.3.4 Apr, 2011] (8 proc)
start: Fri Nov 4 17:57:26 2016 done: Fri Nov 4 17:57:26 2016
Total Scan time: 3.850 Total Display time: 0.000
Function used was FASTA [36.3.4 Apr, 2011]