FASTA searches a protein or DNA sequence data bank 36.3.4 Apr, 2011
Please cite:
W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448
Query: pF1KB9646, 204 aa
1>>>pF1KB9646 204 - 204 aa - 204 aa
Library: human.CCDS.faa
18511270 residues in 32554 sequences
Statistics: Expectation_n fit: rho(ln(x))= 5.7286+/-0.000695; mu= 11.6390+/- 0.042
mean_var=76.9492+/-15.386, 0's: 0 Z-trim(110.3): 61 B-trim: 0 in 0/53
Lambda= 0.146208
statistics sampled from 11438 (11501) to 11438 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
ktup: 2, E-join: 1 (0.74), E-opt: 0.2 (0.353), width: 16
Scan time: 2.090
The best scores are:                                opt  bits E(32554)
CCDS14772.1 SRY gene_id:6736|Hs108|chrY     ( 204) 1412 306.6 6.7e-84
CCDS14669.1 SOX3 gene_id:6658|Hs108|chrX    ( 446)  439 101.6 7.8e-22
CCDS3239.1 SOX2 gene_id:6657|Hs108|chr3     ( 317)  424  98.3 5.3e-21
CCDS9523.1 SOX1 gene_id:6656|Hs108|chr13    ( 391)  425  98.6 5.4e-21
CCDS9473.1 SOX21 gene_id:11166|Hs108|chr13  ( 276)  409  95.1 4.2e-20
CCDS3094.1 SOX14 gene_id:8403|Hs108|chr3    ( 240)  402  93.6 1e-19
CCDS32549.1 SOX15 gene_id:6665|Hs108|chr17  ( 233)  381  89.2 2.2e-18
CCDS5977.1 SOX7 gene_id:83595|Hs108|chr8    ( 388)  352  83.2 2.3e-16
CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8   ( 414)  346  81.9 5.9e-16
CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2    ( 441)  342  81.1 1.1e-15
CCDS13552.1 SOX18 gene_id:54345|Hs108|chr20 ( 384)  329  78.3 6.7e-15
CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20  ( 315)  322  76.8 1.6e-14
CCDS4547.1 SOX4 gene_id:6659|Hs108|chr6     ( 474)  321  76.7 2.6e-14
CCDS41761.1 SOX5 gene_id:6660|Hs108|chr12   ( 377)  317  75.8 3.8e-14
CCDS58216.1 SOX5 gene_id:6660|Hs108|chr12   ( 642)  317  75.9 5.9e-14
CCDS81672.1 SOX5 gene_id:6660|Hs108|chr12   ( 728)  317  76.0 6.6e-14
CCDS44844.1 SOX5 gene_id:6660|Hs108|chr12   ( 750)  317  76.0 6.7e-14
CCDS58217.1 SOX5 gene_id:6660|Hs108|chr12   ( 753)  317  76.0 6.8e-14
CCDS8699.1 SOX5 gene_id:6660|Hs108|chr12    ( 763)  317  76.0 6.8e-14
CCDS11689.1 SOX9 gene_id:6662|Hs108|chr17   ( 509)  314  75.2 7.6e-14
CCDS10428.1 SOX8 gene_id:30812|Hs108|chr16  ( 446)  312  74.8 9.1e-14
CCDS53604.1 SOX6 gene_id:55553|Hs108|chr11  ( 801)  314  75.4 1.1e-13
CCDS53605.1 SOX6 gene_id:55553|Hs108|chr11  ( 804)  314  75.4 1.1e-13
CCDS7821.1 SOX6 gene_id:55553|Hs108|chr11   ( 808)  314  75.4 1.1e-13
CCDS13964.1 SOX10 gene_id:6663|Hs108|chr22  ( 466)  309  74.2 1.5e-13
CCDS44299.1 SOX13 gene_id:9580|Hs108|chr1   ( 622)  305  73.4 3.3e-13
CCDS78080.1 SOX30 gene_id:11063|Hs108|chr5  ( 448)  294  71.0 1.3e-12
CCDS4340.1 SOX30 gene_id:11063|Hs108|chr5   ( 501)  291  70.4 2.2e-12
CCDS4339.1 SOX30 gene_id:11063|Hs108|chr5   ( 753)  292  70.7 2.6e-12
>>CCDS14772.1 SRY gene_id:6736|Hs108|chrY (204 aa)
initn: 1412 init1: 1412 opt: 1412 Z-score: 1619.8 bits: 306.6 E(32554): 6.7e-84
Smith-Waterman score: 1412; 100.0% identity (100.0% similar) in 204 aa overlap (1-204:1-204)
               10        20        30        40        50        60
pF1KB9 MQSYASAMLSVFNSDDYSPAVQENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRV
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS14 MQSYASAMLSVFNSDDYSPAVQENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRV
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB9 KRPMNAFIVWSRDQRRKMALENPRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMH
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS14 KRPMNAFIVWSRDQRRKMALENPRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMH
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KB9 REKYPNYKYRPRRKAKMLPKNCSLLPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS14 REKYPNYKYRPRRKAKMLPKNCSLLPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQL
              130       140       150       160       170       180

              190       200
pF1KB9 GHLPPINAASSPQQRDRYSHWTKL
       ::::::::::::::::::::::::
CCDS14 GHLPPINAASSPQQRDRYSHWTKL
              190       200
>>CCDS14669.1 SOX3 gene_id:6658|Hs108|chrX (446 aa)
initn: 449 init1: 430 opt: 439 Z-score: 505.5 bits: 101.6 E(32554): 7.8e-22
Smith-Waterman score: 452; 49.1% identity (72.3% similar) in 159 aa overlap (54-192:133-291)
30 40 50 60 70 80
pF1KB9 NIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKMALENP
:. :::::::::::.:::: ::::::::::
CCDS14 PGGAGKSSANAAGGANSGGGSSGGASGGGGGTDQDRVKRPMNAFMVWSRGQRRKMALENP
110 120 130 140 150 160
90 100 110 120 130 140
pF1KB9 RMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAKMLPKN--
.:.::::::.:: .::.::.::: ::..::..:.:.: ..::.::::::::.: : :.
CCDS14 KMHNSEISKRLGADWKLLTDAEKRPFIDEAKRLRAVHMKEYPDYKYRPRRKTKTLLKKDK
170 180 190 200 210 220
150 160 170 180
pF1KB9 ----CSLLP----------ADPASVLCSEV----QLDNRLYRDDCTKATHSRMEHQLGHL
.::: : :.. : : .::. . . .....: ...:::.
CCDS14 YSLPSGLLPPGAAAAAAAAAAAAAAASSPVGVGQRLDTYTHVNGWANGAYSLVQEQLGYA
230 240 250 260 270 280
190 200
pF1KB9 PPINAASSPQQRDRYSHWTKL
: . .: :
CCDS14 QPPSMSSPPPPPALPPMHRYDMAGLQYSPMMPPGAQSYMNVAAAAAAASGYGGMAPSATA
290 300 310 320 330 340
>>CCDS3239.1 SOX2 gene_id:6657|Hs108|chr3 (317 aa)
initn: 434 init1: 416 opt: 424 Z-score: 490.6 bits: 98.3 E(32554): 5.3e-21
Smith-Waterman score: 439; 45.6% identity (69.6% similar) in 171 aa overlap (49-198:31-200)
20 30 40 50 60 70
pF1KB9 PAVQENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKM
: :.: : ::::::::::.:::: :::::
CCDS32 MYNMMETELKPPGPQQTSGGGGGNSTAAAAGGNQK-NSPDRVKRPMNAFMVWSRGQRRKM
10 20 30 40 50
80 90 100 110 120 130
pF1KB9 ALENPRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAK-M
: :::.:.::::::.:: .::.:.:.:: ::..::..:.:.: ...:.::::::::.: .
CCDS32 AQENPKMHNSEISKRLGAEWKLLSETEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTKTL
60 70 80 90 100 110
140 150 160 170 180
pF1KB9 LPKNCSLLP----ADPASVLCSEV------------QLDNRLYRDDCTKATHSRMEHQLG
. :. :: : .. . : : ..:. . . .....: :. :::
CCDS32 MKKDKYTLPGGLLAPGGNSMASGVGVGAGLGAGVNQRMDSYAHMNGWSNGSYSMMQDQLG
120 130 140 150 160 170
190 200
pF1KB9 H--LPPINA--ASSPQQRDRYSHWTKL
. : .:: :.. : ::
CCDS32 YPQHPGLNAHGAAQMQPMHRYDVSALQYNSMTSSQTYMNGSPTYSMSYSQQGTPGMALGS
180 190 200 210 220 230
>>CCDS9523.1 SOX1 gene_id:6656|Hs108|chr13 (391 aa)
initn: 418 init1: 418 opt: 425 Z-score: 490.4 bits: 98.6 E(32554): 5.4e-21
Smith-Waterman score: 425; 66.3% identity (90.2% similar) in 92 aa overlap (49-140:41-131)
20 30 40 50 60 70
pF1KB9 PAVQENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKM
: ..:.: :::::::::::.:::: :::::
CCDS95 HSPGGAQAPTNLSGPAGAGGGGGGGGGGGGGGGAKAN-QDRVKRPMNAFMVWSRGQRRKM
20 30 40 50 60
80 90 100 110 120 130
pF1KB9 ALENPRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAKML
: :::.:.::::::.:: .::...:::: ::..::..:.:.: ...:.::::::::.: :
CCDS95 AQENPKMHNSEISKRLGAEWKVMSEAEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTKTL
70 80 90 100 110 120
140 150 160 170 180 190
pF1KB9 PKNCSLLPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINAASSPQQRDRY
:
CCDS95 LKKDKYSLAGGLLAAGAGGGGAAVAMGVGVGVGAAAVGQRLESPGGAAGGGYAHVNGWAN
130 140 150 160 170 180
>>CCDS9473.1 SOX21 gene_id:11166|Hs108|chr13 (276 aa)
initn: 419 init1: 401 opt: 409 Z-score: 474.4 bits: 95.1 E(32554): 4.2e-20
Smith-Waterman score: 409; 69.9% identity (90.4% similar) in 83 aa overlap (58-140:6-88)
30 40 50 60 70 80
pF1KB9 LRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKMALENPRMRN
:.::::::::.:::: :::::: :::.:.:
CCDS94 MSKPVDHVKRPMNAFMVWSRAQRRKMAQENPKMHN
10 20 30
90 100 110 120 130 140
pF1KB9 SEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAKMLPKNCSLLPA
:::::.:: .::.:::.:: ::..::..:.::: ...:.:::::::: : : :
CCDS94 SEISKRLGAEWKLLTESEKRPFIDEAKRLRAMHMKEHPDYKYRPRRKPKTLLKKDKFAFP
40 50 60 70 80 90
150 160 170 180 190 200
pF1KB9 DPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINAASSPQQRDRYSHWTKL
CCDS94 VPYGLGGVADAEHPALKAGAGLHAGAGGGLVPESLLANPEKAAAAAAAAAARVFFPQSAA
100 110 120 130 140 150
>>CCDS3094.1 SOX14 gene_id:8403|Hs108|chr3 (240 aa)
initn: 404 init1: 386 opt: 402 Z-score: 467.3 bits: 93.6 E(32554): 1e-19
Smith-Waterman score: 402; 62.2% identity (88.9% similar) in 90 aa overlap (58-146:6-95)
30 40 50 60 70 80
pF1KB9 LRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKMALENPRMRN
:..:::::::.:::: :::::: :::.:.:
CCDS30 MSKPSDHIKRPMNAFMVWSRGQRRKMAQENPKMHN
10 20 30
90 100 110 120 130 140
pF1KB9 SEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAK-MLPKNCSLLP
:::::.:: .::.:.:::: :...::..:.:.: ...:.:::::::: : .: :. ..:
CCDS30 SEISKRLGAEWKLLSEAEKRPYIDEAKRLRAQHMKEHPDYKYRPRRKPKNLLKKDRYVFP
40 50 60 70 80 90
150 160 170 180 190 200
pF1KB9 ADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINAASSPQQRDRYSHWTKL
CCDS30 LPYLGDTDPLKAAGLPVGASDGLLSAPEKARAFLPPASAPYSLLDPAQFSSSAIQKMGEV
100 110 120 130 140 150
>>CCDS32549.1 SOX15 gene_id:6665|Hs108|chr17 (233 aa)
initn: 411 init1: 380 opt: 381 Z-score: 443.6 bits: 89.2 E(32554): 2.2e-18
Smith-Waterman score: 381; 62.5% identity (80.7% similar) in 88 aa overlap (58-142:47-134)
30 40 50 60 70 80
pF1KB9 LRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKMALENPRMRN
..::::::::.::: :::.:: .::.:.:
CCDS32 PAATAAASSSSGPQEREGAGSPAAPGTLPLEKVKRPMNAFMVWSSAQRRQMAQQNPKMHN
20 30 40 50 60 70
90 100 110 120 130 140
pF1KB9 SEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAKML---PKNCSL
:::::.:: :::.: : :: :: .::..:.: : . ::.:::::::::: :. :
CCDS32 SEISKRLGAQWKLLDEDEKRPFVEEAKRLRARHLRDYPDYKYRPRRKAKSSGAGPSRCGQ
80 90 100 110 120 130
150 160 170 180 190 200
pF1KB9 LPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINAASSPQQRDRYSHWTKL
CCDS32 GRGNLASGGPLWGPGYATTQPSRGFGYRPPSYSTAYLPGSYGSSHCKLEAPSPCSLPQSD
140 150 160 170 180 190
>>CCDS5977.1 SOX7 gene_id:83595|Hs108|chr8 (388 aa)
initn: 352 init1: 320 opt: 352 Z-score: 407.2 bits: 83.2 E(32554): 2.3e-16
Smith-Waterman score: 352; 45.9% identity (79.3% similar) in 111 aa overlap (53-163:39-145)
30 40 50 60 70 80
pF1KB9 ENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKMALEN
::. ..:..::::::.::..:.:...:..:
CCDS59 PWPEGLECPALDAELSDGQSPPAVPRPPGDKGS-ESRIRRPMNAFMVWAKDERKRLAVQN
10 20 30 40 50 60
90 100 110 120 130 140
pF1KB9 PRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAKMLPKNC
: ..:.:.::.:: .:: :: ..: :. .::..:. .: . ::::::::::: :. . :
CCDS59 PDLHNAELSKMLGKSWKALTLSQKRPYVDEAERLRLQHMQDYPNYKYRPRRK-KQAKRLC
70 80 90 100 110 120
150 160 170 180 190 200
pF1KB9 SLLPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINAASSPQQRDRYSHWT
. . ::. .: : . .: :
CCDS59 KRV--DPGFLLSSLSRDQNALPEKRSGSRGALGEKEDRGEYSPGTALPSLRGCYHEGPAG
130 140 150 160 170 180
>>CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8 (414 aa)
initn: 380 init1: 346 opt: 346 Z-score: 399.9 bits: 81.9 E(32554): 5.9e-16
Smith-Waterman score: 346; 48.3% identity (85.4% similar) in 89 aa overlap (49-137:57-145)
20 30 40 50 60 70
pF1KB9 PAVQENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKM
: .... ..:..::::::.::..:.:...
CCDS61 LGPCPWAESLSPIGDMKVKGEAPANSGAPAGAAGRAKGESRIRRPMNAFMVWAKDERKRL
30 40 50 60 70 80
80 90 100 110 120 130
pF1KB9 ALENPRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAKML
: .:: ..:.:.::.:: .:: :: ::: :: .::..:...: . .:::::::::. ..
CCDS61 AQQNPDLHNAELSKMLGKSWKALTLAEKRPFVEEAERLRVQHMQDHPNYKYRPRRRKQVK
90 100 110 120 130 140
140 150 160 170 180 190
pF1KB9 PKNCSLLPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINAASSPQQRDRY
CCDS61 RLKRVEGGFLHGLAEPQAAALGPEGGRVAMDGLGLQFPEQGFPAGPPLLPPHMGGHYRDC
150 160 170 180 190 200
>>CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2 (441 aa)
initn: 366 init1: 337 opt: 342 Z-score: 395.0 bits: 81.1 E(32554): 1.1e-15
Smith-Waterman score: 342; 47.5% identity (75.2% similar) in 101 aa overlap (39-139:28-128)
10 20 30 40 50 60
pF1KB9 LSVFNSDDYSPAVQENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFI
.:. :. . ... ..:::::::.
CCDS16 MVQQAESLEAESNLPREALDTEEGEFMACSPVALDESDPDWCKTASGHIKRPMNAFM
10 20 30 40 50
70 80 90 100 110 120
pF1KB9 VWSRDQRRKMALENPRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYK
:::. .:::. ..: :.:.::::.:: .:::: ..:: ::..::..:. : ::.::
CCDS16 VWSKIERRKIMEQSPDMHNAEISKRLGKRWKMLKDSEKIPFIREAERLRLKHMADYPDYK
60 70 80 90 100 110
130 140 150 160 170 180
pF1KB9 YRPRRKAKMLPKNCSLLPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINA
::::.: :: :
CCDS16 YRPRKKPKMDPSAKPSASQSPEKSAAGGGGGSAGGGAGGAKTSKGSSKKCGKLKAPAAAG
120 130 140 150 160 170
204 residues in 1 query sequences
18511270 residues in 32554 library sequences
Tcomplib [36.3.4 Apr, 2011] (8 proc)
start: Fri Nov 4 17:56:46 2016 done: Fri Nov 4 17:56:47 2016
Total Scan time: 2.090 Total Display time: -0.020
Function used was FASTA [36.3.4 Apr, 2011]
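
The "best scores" table in this report can be read back into structured records. The following is a minimal Python sketch, not part of the FASTA distribution. The names Hit, HIT_RE and parse_hits are hypothetical, and the regular expression assumes every library header has the "CCDSnnnn.n GENE gene_id:N|build|chrom" shape seen in this report. Other libraries will need a different pattern.

```python
import re
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One row of the 'best scores' table (hypothetical record type)."""
    accession: str
    gene: str
    gene_id: int
    chrom: str
    length: int    # library sequence length, the number in parentheses
    opt: int       # optimized score
    bits: float    # bit score
    evalue: float  # expectation value for this library size


# Assumption: headers look like 'CCDS14772.1 SRY gene_id:6736|Hs108|chrY'.
# Whitespace is matched loosely so collapsed or aligned columns both parse.
HIT_RE = re.compile(
    r"^(?P<acc>CCDS\d+\.\d+)\s+(?P<gene>\S+)\s+"
    r"gene_id:(?P<gid>\d+)\|[^|]*\|(?P<chrom>\S+)\s+"
    r"\(\s*(?P<len>\d+)\)\s+(?P<opt>\d+)\s+(?P<bits>[\d.]+)\s+(?P<e>\S+)\s*$"
)


def parse_hits(text: str) -> List[Hit]:
    """Return every score-table row found in a FASTA report, in file order."""
    hits = []
    for line in text.splitlines():
        m = HIT_RE.match(line.strip())
        if m is None:
            continue  # headers, alignment blocks and '>>' lines are skipped
        hits.append(
            Hit(
                accession=m.group("acc"),
                gene=m.group("gene"),
                gene_id=int(m.group("gid")),
                chrom=m.group("chrom"),
                length=int(m.group("len")),
                opt=int(m.group("opt")),
                bits=float(m.group("bits")),
                evalue=float(m.group("e")),
            )
        )
    return hits
```

Calling parse_hits on the full text of this report should yield one record per table row. Lines beginning with ">>" and the alignment blocks do not match the pattern and are ignored.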