FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB9646, 204 aa
  1>>>pF1KB9646 204 - 204 aa - 204 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.7286+/-0.000695; mu= 11.6390+/- 0.042
 mean_var=76.9492+/-15.386, 0's: 0 Z-trim(110.3): 61 B-trim: 0 in 0/53
 Lambda= 0.146208
 statistics sampled from 11438 (11501) to 11438 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.74), E-opt: 0.2 (0.353), width: 16
 Scan time: 2.090

The best scores are:                                  opt  bits E(32554)
CCDS14772.1 SRY gene_id:6736|Hs108|chrY       ( 204) 1412 306.6 6.7e-84
CCDS14669.1 SOX3 gene_id:6658|Hs108|chrX      ( 446)  439 101.6 7.8e-22
CCDS3239.1 SOX2 gene_id:6657|Hs108|chr3       ( 317)  424  98.3 5.3e-21
CCDS9523.1 SOX1 gene_id:6656|Hs108|chr13      ( 391)  425  98.6 5.4e-21
CCDS9473.1 SOX21 gene_id:11166|Hs108|chr13    ( 276)  409  95.1 4.2e-20
CCDS3094.1 SOX14 gene_id:8403|Hs108|chr3      ( 240)  402  93.6 1e-19
CCDS32549.1 SOX15 gene_id:6665|Hs108|chr17    ( 233)  381  89.2 2.2e-18
CCDS5977.1 SOX7 gene_id:83595|Hs108|chr8      ( 388)  352  83.2 2.3e-16
CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8     ( 414)  346  81.9 5.9e-16
CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2      ( 441)  342  81.1 1.1e-15
CCDS13552.1 SOX18 gene_id:54345|Hs108|chr20   ( 384)  329  78.3 6.7e-15
CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20    ( 315)  322  76.8 1.6e-14
CCDS4547.1 SOX4 gene_id:6659|Hs108|chr6       ( 474)  321  76.7 2.6e-14
CCDS41761.1 SOX5 gene_id:6660|Hs108|chr12     ( 377)  317  75.8 3.8e-14
CCDS58216.1 SOX5 gene_id:6660|Hs108|chr12     ( 642)  317  75.9 5.9e-14
CCDS81672.1 SOX5 gene_id:6660|Hs108|chr12     ( 728)  317  76.0 6.6e-14
CCDS44844.1 SOX5 gene_id:6660|Hs108|chr12     ( 750)  317  76.0 6.7e-14
CCDS58217.1 SOX5 gene_id:6660|Hs108|chr12     ( 753)  317  76.0 6.8e-14
CCDS8699.1 SOX5 gene_id:6660|Hs108|chr12      ( 763)  317  76.0 6.8e-14
CCDS11689.1 SOX9 gene_id:6662|Hs108|chr17     ( 509)  314  75.2 7.6e-14
CCDS10428.1 SOX8 gene_id:30812|Hs108|chr16    ( 446)  312  74.8 9.1e-14
CCDS53604.1 SOX6 gene_id:55553|Hs108|chr11    ( 801)  314  75.4 1.1e-13
CCDS53605.1 SOX6 gene_id:55553|Hs108|chr11    ( 804)  314  75.4 1.1e-13
CCDS7821.1 SOX6 gene_id:55553|Hs108|chr11     ( 808)  314  75.4 1.1e-13
CCDS13964.1 SOX10 gene_id:6663|Hs108|chr22    ( 466)  309  74.2 1.5e-13
CCDS44299.1 SOX13 gene_id:9580|Hs108|chr1     ( 622)  305  73.4 3.3e-13
CCDS78080.1 SOX30 gene_id:11063|Hs108|chr5    ( 448)  294  71.0 1.3e-12
CCDS4340.1 SOX30 gene_id:11063|Hs108|chr5     ( 501)  291  70.4 2.2e-12
CCDS4339.1 SOX30 gene_id:11063|Hs108|chr5     ( 753)  292  70.7 2.6e-12

>>CCDS14772.1 SRY gene_id:6736|Hs108|chrY (204 aa)
 initn: 1412 init1: 1412 opt: 1412 Z-score: 1619.8 bits: 306.6 E(32554): 6.7e-84
Smith-Waterman score: 1412; 100.0% identity (100.0% similar) in 204 aa overlap (1-204:1-204)

               10        20        30        40        50        60
pF1KB9 MQSYASAMLSVFNSDDYSPAVQENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRV
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS14 MQSYASAMLSVFNSDDYSPAVQENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRV
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB9 KRPMNAFIVWSRDQRRKMALENPRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMH
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS14 KRPMNAFIVWSRDQRRKMALENPRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMH
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KB9 REKYPNYKYRPRRKAKMLPKNCSLLPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS14 REKYPNYKYRPRRKAKMLPKNCSLLPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQL
              130       140       150       160       170       180

              190       200
pF1KB9 GHLPPINAASSPQQRDRYSHWTKL
       ::::::::::::::::::::::::
CCDS14 GHLPPINAASSPQQRDRYSHWTKL
              190       200

>>CCDS14669.1 SOX3 gene_id:6658|Hs108|chrX (446 aa)
 initn: 449 init1: 430 opt: 439 Z-score: 505.5 bits: 101.6 E(32554): 7.8e-22
Smith-Waterman score: 452; 49.1% identity (72.3% similar) in 159 aa overlap (54-192:133-291)

            30        40        50        60        70        80
pF1KB9 NIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKMALENP
                                     :. :::::::::::.:::: ::::::::::
CCDS14 PGGAGKSSANAAGGANSGGGSSGGASGGGGGTDQDRVKRPMNAFMVWSRGQRRKMALENP
            110       120       130       140       150       160

            90       100       110       120       130       140
pF1KB9 RMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAKMLPKN--
       .:.::::::.:: .::.::.::: ::..::..:.:.: ..::.::::::::.: : :.
CCDS14 KMHNSEISKRLGADWKLLTDAEKRPFIDEAKRLRAVHMKEYPDYKYRPRRKTKTLLKKDK
            170       180       190       200       210       220

                           150           160       170       180
pF1KB9 ----CSLLP----------ADPASVLCSEV----QLDNRLYRDDCTKATHSRMEHQLGHL
            .:::          :  :..  : :    .::.  . .  .....: ...:::.
CCDS14 YSLPSGLLPPGAAAAAAAAAAAAAAASSPVGVGQRLDTYTHVNGWANGAYSLVQEQLGYA
            230       240       250       260       270       280

           190       200
pF1KB9 PPINAASSPQQRDRYSHWTKL
        : . .: :
CCDS14 QPPSMSSPPPPPALPPMHRYDMAGLQYSPMMPPGAQSYMNVAAAAAAASGYGGMAPSATA
            290       300       310       320       330       340

>>CCDS3239.1 SOX2 gene_id:6657|Hs108|chr3 (317 aa)
 initn: 434 init1: 416 opt: 424 Z-score: 490.6 bits: 98.3 E(32554): 5.3e-21
Smith-Waterman score: 439; 45.6% identity (69.6% similar) in 171 aa overlap (49-198:31-200)

       20        30        40        50        60        70
pF1KB9 PAVQENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKM
                                     : :.: :  ::::::::::.:::: :::::
CCDS32 MYNMMETELKPPGPQQTSGGGGGNSTAAAAGGNQK-NSPDRVKRPMNAFMVWSRGQRRKM
               10        20        30         40        50

       80        90       100       110       120       130
pF1KB9 ALENPRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAK-M
       : :::.:.::::::.:: .::.:.:.:: ::..::..:.:.: ...:.::::::::.: .
CCDS32 AQENPKMHNSEISKRLGAEWKLLSETEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTKTL
      60        70        80        90       100       110

       140           150                   160       170       180
pF1KB9 LPKNCSLLP----ADPASVLCSEV------------QLDNRLYRDDCTKATHSRMEHQLG
       . :.   ::    :  .. . : :            ..:.  . .  .....: :. :::
CCDS32 MKKDKYTLPGGLLAPGGNSMASGVGVGAGLGAGVNQRMDSYAHMNGWSNGSYSMMQDQLG
     120       130       140       150       160       170

                 190       200
pF1KB9 H--LPPINA--ASSPQQRDRYSHWTKL
       .   : .::  :.. :   ::
CCDS32 YPQHPGLNAHGAAQMQPMHRYDVSALQYNSMTSSQTYMNGSPTYSMSYSQQGTPGMALGS
     180       190       200       210       220       230

>>CCDS9523.1 SOX1 gene_id:6656|Hs108|chr13 (391 aa)
 initn: 418 init1: 418 opt: 425 Z-score: 490.4 bits: 98.6 E(32554): 5.4e-21
Smith-Waterman score: 425; 66.3% identity (90.2% similar) in 92 aa overlap (49-140:41-131)

       20        30        40        50        60        70
pF1KB9 PAVQENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKM
                                     : ..:.: :::::::::::.:::: :::::
CCDS95 HSPGGAQAPTNLSGPAGAGGGGGGGGGGGGGGGAKAN-QDRVKRPMNAFMVWSRGQRRKM
               20        30        40         50        60

       80        90       100       110       120       130
pF1KB9 ALENPRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAKML
       : :::.:.::::::.:: .::...:::: ::..::..:.:.: ...:.::::::::.: :
CCDS95 AQENPKMHNSEISKRLGAEWKVMSEAEKRPFIDEAKRLRALHMKEHPDYKYRPRRKTKTL
      70        80        90       100       110       120

      140       150       160       170       180       190
pF1KB9 PKNCSLLPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINAASSPQQRDRY
        :
CCDS95 LKKDKYSLAGGLLAAGAGGGGAAVAMGVGVGVGAAAVGQRLESPGGAAGGGYAHVNGWAN
     130       140       150       160       170       180

>>CCDS9473.1 SOX21 gene_id:11166|Hs108|chr13 (276 aa)
 initn: 419 init1: 401 opt: 409 Z-score: 474.4 bits: 95.1 E(32554): 4.2e-20
Smith-Waterman score: 409; 69.9% identity (90.4% similar) in 83 aa overlap (58-140:6-88)

        30        40        50        60        70        80
pF1KB9 LRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKMALENPRMRN
                                     :.::::::::.:::: :::::: :::.:.:
CCDS94                          MSKPVDHVKRPMNAFMVWSRAQRRKMAQENPKMHN
                                        10        20        30

        90       100       110       120       130       140
pF1KB9 SEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAKMLPKNCSLLPA
       :::::.:: .::.:::.:: ::..::..:.::: ...:.:::::::: : : :
CCDS94 SEISKRLGAEWKLLTESEKRPFIDEAKRLRAMHMKEHPDYKYRPRRKPKTLLKKDKFAFP
          40        50        60        70        80        90

       150       160       170       180       190       200
pF1KB9 DPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINAASSPQQRDRYSHWTKL

CCDS94 VPYGLGGVADAEHPALKAGAGLHAGAGGGLVPESLLANPEKAAAAAAAAAARVFFPQSAA
         100       110       120       130       140       150

>>CCDS3094.1 SOX14 gene_id:8403|Hs108|chr3 (240 aa)
 initn: 404 init1: 386 opt: 402 Z-score: 467.3 bits: 93.6 E(32554): 1e-19
Smith-Waterman score: 402; 62.2% identity (88.9% similar) in 90 aa overlap (58-146:6-95)

        30        40        50        60        70        80
pF1KB9 LRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKMALENPRMRN
                                     :..:::::::.:::: :::::: :::.:.:
CCDS30                          MSKPSDHIKRPMNAFMVWSRGQRRKMAQENPKMHN
                                        10        20        30

        90       100       110       120       130        140
pF1KB9 SEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAK-MLPKNCSLLP
       :::::.:: .::.:.:::: :...::..:.:.: ...:.:::::::: : .: :.  ..:
CCDS30 SEISKRLGAEWKLLSEAEKRPYIDEAKRLRAQHMKEHPDYKYRPRRKPKNLLKKDRYVFP
          40        50        60        70        80        90

        150       160       170       180       190       200
pF1KB9 ADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINAASSPQQRDRYSHWTKL

CCDS30 LPYLGDTDPLKAAGLPVGASDGLLSAPEKARAFLPPASAPYSLLDPAQFSSSAIQKMGEV
         100       110       120       130       140       150

>>CCDS32549.1 SOX15 gene_id:6665|Hs108|chr17 (233 aa)
 initn: 411 init1: 380 opt: 381 Z-score: 443.6 bits: 89.2 E(32554): 2.2e-18
Smith-Waterman score: 381; 62.5% identity (80.7% similar) in 88 aa overlap (58-142:47-134)

        30        40        50        60        70        80
pF1KB9 LRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKMALENPRMRN
                                     ..::::::::.:::  :::.:: .::.:.:
CCDS32 PAATAAASSSSGPQEREGAGSPAAPGTLPLEKVKRPMNAFMVWSSAQRRQMAQQNPKMHN
         20        30        40        50        60        70

        90       100       110       120       130          140
pF1KB9 SEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAKML---PKNCSL
       :::::.:: :::.: : :: :: .::..:.: : . ::.::::::::::     :. :
CCDS32 SEISKRLGAQWKLLDEDEKRPFVEEAKRLRARHLRDYPDYKYRPRRKAKSSGAGPSRCGQ
         80        90       100       110       120       130

          150       160       170       180       190       200
pF1KB9 LPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINAASSPQQRDRYSHWTKL

CCDS32 GRGNLASGGPLWGPGYATTQPSRGFGYRPPSYSTAYLPGSYGSSHCKLEAPSPCSLPQSD
        140       150       160       170       180       190

>>CCDS5977.1 SOX7 gene_id:83595|Hs108|chr8 (388 aa)
 initn: 352 init1: 320 opt: 352 Z-score: 407.2 bits: 83.2 E(32554): 2.3e-16
Smith-Waterman score: 352; 45.9% identity (79.3% similar) in 111 aa overlap (53-163:39-145)

             30        40        50        60        70        80
pF1KB9 ENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKMALEN
                                     ::. ..:..::::::.::..:.:...:..:
CCDS59 PWPEGLECPALDAELSDGQSPPAVPRPPGDKGS-ESRIRRPMNAFMVWAKDERKRLAVQN
       10        20        30        40         50        60

             90       100       110       120       130       140
pF1KB9 PRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAKMLPKNC
       : ..:.:.::.:: .:: :: ..: :. .::..:. .: . ::::::::::: :.  . :
CCDS59 PDLHNAELSKMLGKSWKALTLSQKRPYVDEAERLRLQHMQDYPNYKYRPRRK-KQAKRLC
        70        80        90       100       110        120

            150       160       170       180       190       200
pF1KB9 SLLPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINAASSPQQRDRYSHWT
       . .  ::. .: :  . .: :
CCDS59 KRV--DPGFLLSSLSRDQNALPEKRSGSRGALGEKEDRGEYSPGTALPSLRGCYHEGPAG
          130       140       150       160       170       180

>>CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8 (414 aa)
 initn: 380 init1: 346 opt: 346 Z-score: 399.9 bits: 81.9 E(32554): 5.9e-16
Smith-Waterman score: 346; 48.3% identity (85.4% similar) in 89 aa overlap (49-137:57-145)

       20        30        40        50        60        70
pF1KB9 PAVQENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFIVWSRDQRRKM
                                     :  .... ..:..::::::.::..:.:...
CCDS61 LGPCPWAESLSPIGDMKVKGEAPANSGAPAGAAGRAKGESRIRRPMNAFMVWAKDERKRL
         30        40        50        60        70        80

       80        90       100       110       120       130
pF1KB9 ALENPRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYKYRPRRKAKML
       : .:: ..:.:.::.:: .:: :: ::: :: .::..:...: . .:::::::::. ..
CCDS61 AQQNPDLHNAELSKMLGKSWKALTLAEKRPFVEEAERLRVQHMQDHPNYKYRPRRRKQVK
         90       100       110       120       130       140

      140       150       160       170       180       190
pF1KB9 PKNCSLLPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINAASSPQQRDRY

CCDS61 RLKRVEGGFLHGLAEPQAAALGPEGGRVAMDGLGLQFPEQGFPAGPPLLPPHMGGHYRDC
        150       160       170       180       190       200

>>CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2 (441 aa)
 initn: 366 init1: 337 opt: 342 Z-score: 395.0 bits: 81.1 E(32554): 1.1e-15
Smith-Waterman score: 342; 47.5% identity (75.2% similar) in 101 aa overlap (39-139:28-128)

       10        20        30        40        50        60
pF1KB9 LSVFNSDDYSPAVQENIPALRRSSSFLCTESCNSKYQCETGENSKGNVQDRVKRPMNAFI
                                     .:.     :.  .   ... ..:::::::.
CCDS16    MVQQAESLEAESNLPREALDTEEGEFMACSPVALDESDPDWCKTASGHIKRPMNAFM
                  10        20        30        40        50

       70        80        90       100       110       120
pF1KB9 VWSRDQRRKMALENPRMRNSEISKQLGYQWKMLTEAEKWPFFQEAQKLQAMHREKYPNYK
       :::. .:::.  ..: :.:.::::.:: .:::: ..:: ::..::..:.  :   ::.::
CCDS16 VWSKIERRKIMEQSPDMHNAEISKRLGKRWKMLKDSEKIPFIREAERLRLKHMADYPDYK
        60        70        80        90       100       110

      130       140       150       160       170       180
pF1KB9 YRPRRKAKMLPKNCSLLPADPASVLCSEVQLDNRLYRDDCTKATHSRMEHQLGHLPPINA
       ::::.: :: :
CCDS16 YRPRKKPKMDPSAKPSASQSPEKSAAGGGGGSAGGGAGGAKTSKGSSKKCGKLKAPAAAG
       120       130       140       150       160       170

204 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Fri Nov 4 17:56:46 2016 done: Fri Nov 4 17:56:47 2016
 Total Scan time: 2.090 Total Display time: -0.020

Function used was FASTA [36.3.4 Apr, 2011]