FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB9649, 441 aa
  1>>>pF1KB9649 441 - 441 aa - 441 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 8.6813+/-0.000815; mu= 5.6134+/- 0.050
 mean_var=247.3238+/-50.350, 0's: 0 Z-trim(116.3): 52 B-trim: 0 in 0/54
 Lambda= 0.081553
 statistics sampled from 16872 (16924) to 16872 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.802), E-opt: 0.2 (0.52), width: 16
 Scan time: 3.850

The best scores are:                                opt  bits E(32554)
CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2    ( 441) 2950 359.6 3.7e-99
CCDS4547.1 SOX4 gene_id:6659|Hs108|chr6     ( 474)  718  97.0 4.4e-20
CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20  ( 315)  586  81.3 1.6e-15

>>CCDS1654.1 SOX11 gene_id:6664|Hs108|chr2 (441 aa)
 initn: 2950 init1: 2950 opt: 2950 Z-score: 1893.7 bits: 359.6 E(32554): 3.7e-99
Smith-Waterman score: 2950; 100.0% identity (100.0% similar) in 441 aa overlap (1-441:1-441)

               10        20        30        40        50        60
pF1KB9 MVQQAESLEAESNLPREALDTEEGEFMACSPVALDESDPDWCKTASGHIKRPMNAFMVWS
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 MVQQAESLEAESNLPREALDTEEGEFMACSPVALDESDPDWCKTASGHIKRPMNAFMVWS
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB9 KIERRKIMEQSPDMHNAEISKRLGKRWKMLKDSEKIPFIREAERLRLKHMADYPDYKYRP
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 KIERRKIMEQSPDMHNAEISKRLGKRWKMLKDSEKIPFIREAERLRLKHMADYPDYKYRP
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KB9 RKKPKMDPSAKPSASQSPEKSAAGGGGGSAGGGAGGAKTSKGSSKKCGKLKAPAAAGAKA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 RKKPKMDPSAKPSASQSPEKSAAGGGGGSAGGGAGGAKTSKGSSKKCGKLKAPAAAGAKA
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KB9 GAGKAAQSGDYGGAGDDYVLGSLRVSGSGGGGAGKTVKCVFLDEDDDDDDDDDELQLQIK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 GAGKAAQSGDYGGAGDDYVLGSLRVSGSGGGGAGKTVKCVFLDEDDDDDDDDDELQLQIK
              190       200       210       220       230       240

              250       260       270       280       290       300
pF1KB9 QEPDEEDEEPPHQQLLQPPGQQPSQLLRRYNVAKVPASPTLSSSAESPEGASLYDEVRAG
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 QEPDEEDEEPPHQQLLQPPGQQPSQLLRRYNVAKVPASPTLSSSAESPEGASLYDEVRAG
              250       260       270       280       290       300

              310       320       330       340       350       360
pF1KB9 ATSGAGGGSRLYYSFKNITKQHPPPLAQPALSPASSRSVSTSSSSSSGSSSGSSGEDADD
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 ATSGAGGGSRLYYSFKNITKQHPPPLAQPALSPASSRSVSTSSSSSSGSSSGSSGEDADD
              310       320       330       340       350       360

              370       380       390       400       410       420
pF1KB9 LMFDLSLNFSQSAHSASEQQLGGGAAAGNLSLSLVDKDLDSFSEGSLGSHFEFPDYCTPE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS16 LMFDLSLNFSQSAHSASEQQLGGGAAAGNLSLSLVDKDLDSFSEGSLGSHFEFPDYCTPE
              370       380       390       400       410       420

              430       440
pF1KB9 LSEMIAGDWLEANFSDLVFTY
       :::::::::::::::::::::
CCDS16 LSEMIAGDWLEANFSDLVFTY
              430       440

>>CCDS4547.1 SOX4 gene_id:6659|Hs108|chr6 (474 aa)
 initn: 1098 init1: 628 opt: 718 Z-score: 474.1 bits: 97.0 E(32554): 4.4e-20
Smith-Waterman score: 1010; 43.8% identity (64.6% similar) in 491 aa overlap (1-441:1-474)

       10 20 30 40 50
pF1KB9 MVQQAESLE-AESNLPREALDTEEG-EF-MACSPVALDES-------DPDWCKTASGHIK
       ::::... : .:. : :. :. : :. .: ::. . . ::.:::: :::::
CCDS45 MVQQTNNAENTEALLAGESSDSGAGLELGIASSPTPGSTASTGGKADDPSWCKTPSGHIK
       10 20 30 40 50 60

       60 70 80 90 100 110
pF1KB9 RPMNAFMVWSKIERRKIMEQSPDMHNAEISKRLGKRWKMLKDSEKIPFIREAERLRLKHM
       ::::::::::.:::::::::::::::::::::::::::.::::.::::::::::::::::
CCDS45 RPMNAFMVWSQIERRKIMEQSPDMHNAEISKRLGKRWKLLKDSDKIPFIREAERLRLKHM
       70 80 90 100 110 120

       120 130 140 150 160
pF1KB9 ADYPDYKYRPRKKPK---MDPSAKPSASQSP-EKS--AAGGGGGSAGGGAGGAKTSKGSS
       ::::::::::::: : . :.. .::..: ::. ..:.:::. :::.::.... :..
CCDS45 ADYPDYKYRPRKKVKSGNANSSSSAAASSKPGEKGDKVGGSGGGGHGGGGGGGSSNAGGG
       130 140 150 160 170 180

       170 180 190 200 210 220
pF1KB9 KKCGKLKAPAAAGAKAGAGKAAQSGDYGGAGDDYVLGSLRVSGSGGGGAGKTVKCV---F
       : . ..:..: . :. : :::: .. .::::.::.. . :
CCDS45 ---GGGASGGGANSKPAQKKSCGSKVAGGAGGGVSKPHAKLILAGGGGGGKAAAAAAASF
       190 200 210 220 230

       230 240 250 260 270
pF1KB9 LDEDDDD------DDDDDELQLQIKQEPDEE---DEEPPHQQLLQPPGQQ--PSQLLRRY
       :. :. .: . :. . . : ::.. ... : :
CCDS45 AAEQAGAAALLPLGAAADHHSLYKARTPSASASASSAASASAALAAPGKHLAEKKVKRVY
       240 250 260 270 280 290

       280 290 300 310
pF1KB9 NVAKV--PASPT--LSSSAESPEGASLYDEVRAG--------------ATSGAGGGSRL-
       . . .::. ....:. . .::.: :: :.: :.: :
CCDS45 LFGGLGTSSSPVGGVGAGADPSDPLGLYEEEGAGCSPDAPSLSGRSSAASSPAAGRSPAD
       300 310 320 330 340 350

       320 330 340 350 360 370
pF1KB9 YYSFKNITKQHPPPLAQPALSPASSRSVSTSSSSSSGSSSGSSGEDADDLMFDLSLNFSQ
       . .. .. : : . : : ::: . : ::::::..::.:. : ::: :.:: :.
CCDS45 HRGYASLRAASPAPSSAP--SHASSSASSHSSSSSSSGSSSSDDEFEDDL---LDLNPSS
       360 370 380 390 400 410

       380 390 400 410 420 430
pF1KB9 SAHSASEQQLGGGAAAGNLSLSLVDKDLD-SFSEGSLGSHFEFPDYCTPELSEMIAGDWL
       . .: : ::. ... : .:.::: .: :: :::::::::::::.::::.::::
CCDS45 NFESMS---LGSFSSS-----SALDRDLDFNFEPGS-GSHFEFPDYCTPEVSEMISGDWL
       420 430 440 450 460

       440
pF1KB9 EANFSDLVFTY
       :...:.:::::
CCDS45 ESSISNLVFTY
       470

>>CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20 (315 aa)
 initn: 820 init1: 562 opt: 586 Z-score: 392.4 bits: 81.3 E(32554): 1.6e-15
Smith-Waterman score: 720; 39.6% identity (55.1% similar) in 412 aa overlap (31-441:22-315)

       10 20 30 40 50 60
pF1KB9 MVQQAESLEAESNLPREALDTEEGEFMACSPVALDESDPDWCKTASGHIKRPMNAFMVWS
       :. .: :::: :::::::::::::::
CCDS12          MVQQRGARAKRDGGPPPPGPGPAEEGAREPGWCKTPSGHIKRPMNAFMVWS
       10 20 30 40 50

       70 80 90 100 110 120
pF1KB9 KIERRKIMEQSPDMHNAEISKRLGKRWKMLKDSEKIPFIREAERLRLKHMADYPDYKYRP
       . ::::::.: :::::::::::::.::..:.:::::::.:::::::::::::::::::::
CCDS12 QHERRKIMDQWPDMHNAEISKRLGRRWQLLQDSEKIPFVREAERLRLKHMADYPDYKYRP
       60 70 80 90 100 110

       130 140 150 160 170
pF1KB9 RKKPKMDPS-AKPSASQSPEKSAAGGGGGSAGGGAGGAKTSKGSSKKCGKLKAPAAAGAK
       ::: : :. :.: . : ::..::.. . : . :. .: .
CCDS12 RKKSKGAPAKARP---RPP------------GGSGGGSRLKPGP-------QLPGRGGRR
       120 130 140

       180 190 200 210 220 230
pF1KB9 AGAGKAAQSGDYGGAGDDYVLGSLRVSGSGGGGAGKTVKCVFLDEDDDDDDDDDELQLQI
       : :: : ::::. .:::.:::.:: :..
CCDS12 A-AG-----------------------GPLGGGAAAP--------EDDDEDDDEEL-LEV
       150 160 170

       240 250 260 270 280 290
pF1KB9 KQEPDEEDEEPPHQQLLQPPGQQPSQLLRRYNVAKVPASPTLSSSAESPEGASLYDEVRA
       . :.. ::.. : :. :::. . ..:: ::
CCDS12 R--------------LVETPGRE----LWRM----VPAGRAARGQAE-----------RA
       180 190 200

       300 310 320 330 340 350
pF1KB9 GATSGAGGGSRLYYSFKNITKQHPPPLAQPALSPASSRSVSTSSSSSSGSSSGSSGEDAD
       . :: :.. : : ::. :.. .... ::.
CCDS12 QGPSGEGAA------------------AAAAASPTPSED-EEPEEEEEEAAAAEEGEEET
       210 220 230 240

       360 370 380 390 400 410
pF1KB9 DLMFDLSLNFSQSAHSASEQQLGGGAAAGNLSLSLVDKDLDSFSEGSLGSHFEFPDYCTP
       . ::.: . .: : :. :. : .:.: : .. : :::::::::::
CCDS12 VASGEESLGFLS--------RLPPGPAG--LDCSALDRDPD-LQPPSGTSHFEFPDYCTP
       250 260 270 280 290

       420 430 440
pF1KB9 ELSEMIAGDWLEANFSDLVFTY
       :..::::::: ....::::::
CCDS12 EVTEMIAGDWRPSSIADLVFTY
       300 310

441 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Fri Nov 4 17:57:26 2016 done: Fri Nov 4 17:57:26 2016
 Total Scan time: 3.850 Total Display time: 0.000

Function used was FASTA [36.3.4 Apr, 2011]