FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB9552, 466 aa
  1>>>pF1KB9552 466 - 466 aa - 466 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 8.7342+/-0.000787; mu= 4.2724+/- 0.048
 mean_var=207.8376+/-41.944, 0's: 0 Z-trim(116.0): 58 B-trim: 241 in 2/52
 Lambda= 0.088964
 statistics sampled from 16520 (16578) to 16520 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.802), E-opt: 0.2 (0.509), width: 16
 Scan time: 3.190

The best scores are:                                   opt  bits E(32554)
CCDS13964.1 SOX10 gene_id:6663|Hs108|chr22     ( 466) 3261 430.6 1.6e-120
CCDS11689.1 SOX9 gene_id:6662|Hs108|chr17      ( 509) 1350 185.4 1.2e-46
CCDS10428.1 SOX8 gene_id:30812|Hs108|chr16     ( 446)  869 123.6 4.1e-28
CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8      ( 414)  436  68.0 2.1e-11
CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20     ( 315)  422  66.1 5.9e-11

>>CCDS13964.1 SOX10 gene_id:6663|Hs108|chr22 (466 aa)
 initn: 3261 init1: 3261 opt: 3261 Z-score: 2277.1 bits: 430.6 E(32554): 1.6e-120
Smith-Waterman score: 3261; 100.0% identity (100.0% similar) in 466 aa overlap (1-466:1-466)

               10        20        30        40        50        60
pF1KB9 MAEEQDLSEVELSPVGSEEPRCLSPGSAPSLGPDGGGGGSGLRASPGPGELGKVKKEQQD
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 MAEEQDLSEVELSPVGSEEPRCLSPGSAPSLGPDGGGGGSGLRASPGPGELGKVKKEQQD
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB9 GEADDDKFPVCIREAVSQVLSGYDWTLVPMPVRVNGASKSKPHVKRPMNAFMVWAQAARR
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 GEADDDKFPVCIREAVSQVLSGYDWTLVPMPVRVNGASKSKPHVKRPMNAFMVWAQAARR
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KB9 KLADQYPHLHNAELSKTLGKLWRLLNESDKRPFIEEAERLRMQHKKDHPDYKYQPRRRKN
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13 KLADQYPHLHNAELSKTLGKLWRLLNESDKRPFIEEAERLRMQHKKDHPDYKYQPRRRKN
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KB9 GKAAQGEAECPGGEAEQGGTAAIQAHYKSAHLDHRHPGEGSPMSDGNPEHPSGQSHGPPT :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 GKAAQGEAECPGGEAEQGGTAAIQAHYKSAHLDHRHPGEGSPMSDGNPEHPSGQSHGPPT 190 200 210 220 230 240 250 260 270 280 290 300 pF1KB9 PPTTPKTELQSGKADPKRDGRSMGEGGKPHIDFGNVDIGEISHEVMSNMETFDVAELDQY :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 PPTTPKTELQSGKADPKRDGRSMGEGGKPHIDFGNVDIGEISHEVMSNMETFDVAELDQY 250 260 270 280 290 300 310 320 330 340 350 360 pF1KB9 LPPNGHPGHVSSYSAAGYGLGSALAVASGHSAWISKPPGVALPTVSPPGVDAKAQVKTET :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 LPPNGHPGHVSSYSAAGYGLGSALAVASGHSAWISKPPGVALPTVSPPGVDAKAQVKTET 310 320 330 340 350 360 370 380 390 400 410 420 pF1KB9 AGPQGPPHYTDQPSTSQIAYTSLSLPHYGSAFPSISRPQFDYSDHQPSGPYYGHSGQASG :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 AGPQGPPHYTDQPSTSQIAYTSLSLPHYGSAFPSISRPQFDYSDHQPSGPYYGHSGQASG 370 380 390 400 410 420 430 440 450 460 pF1KB9 LYSAFSYMGPSQRPLYTAISDPSPSGPQSHSPTHWEQPVYTTLSRP :::::::::::::::::::::::::::::::::::::::::::::: CCDS13 LYSAFSYMGPSQRPLYTAISDPSPSGPQSHSPTHWEQPVYTTLSRP 430 440 450 460 >>CCDS11689.1 SOX9 gene_id:6662|Hs108|chr17 (509 aa) initn: 1439 init1: 805 opt: 1350 Z-score: 951.0 bits: 185.4 E(32554): 1.2e-46 Smith-Waterman score: 1624; 54.0% identity (72.2% similar) in 493 aa overlap (18-456:13-499) 10 20 30 40 50 pF1KB9 MAEEQDLSEVELSPVGSEEPRCLSPGSAPSLGPDGGGG----GSGLRASPGPGELGKVKK :. . :: . .:... :..:. ::: . . . : CCDS11 MNLLDPFMKMTDEQEKGLSGAPSPTMSEDSAGSPCPSGSGSDTENTRPQENTFPK 10 20 30 40 50 60 70 80 90 100 110 pF1KB9 EQQD--GEADDDKFPVCIREAVSQVLSGYDWTLVPMPVRVNGASKSKPHVKRPMNAFMVW . 
: :...:::::::::::::::.:::::::::::::::.::.:::::::::::::: CCDS11 GEPDLKKESEEDKFPVCIREAVSQVLKGYDWTLVPMPVRVNGSSKNKPHVKRPMNAFMVW 60 70 80 90 100 110 120 130 140 150 160 170 pF1KB9 AQAARRKLADQYPHLHNAELSKTLGKLWRLLNESDKRPFIEEAERLRMQHKKDHPDYKYQ ::::::::::::::::::::::::::::::::::.::::.:::::::.:::::::::::: CCDS11 AQAARRKLADQYPHLHNAELSKTLGKLWRLLNESEKRPFVEEAERLRVQHKKDHPDYKYQ 120 130 140 150 160 170 180 190 200 210 220 230 pF1KB9 PRRRKNGKAAQGEAECPGGEAEQGGTAAIQAHYKSAHLDHRHPGEGSPMSD-GNPEHPSG :::::. : .:.::: :: . . .: .:. . : : . : ::. .: . :: CCDS11 PRRRKSVKNGQAEAE----EATEQTHISPNAIFKALQADSPHSSSG--MSEVHSPGEHSG 180 190 200 210 220 240 250 260 270 280 290 pF1KB9 QSHGPPTPPTTPKTELQSGKADPKRDGRSMGEGGK-PHIDFGNVDIGEISHEVMSNMETF ::.:::::::::::..: :::: ::.:: . :::. : ::: .:::::.: .:.::.::: CCDS11 QSQGPPTPPTTPKTDVQPGKADLKREGRPLPEGGRQPPIDFRDVDIGELSSDVISNIETF 230 240 250 260 270 280 300 310 320 330 340 pF1KB9 DVAELDQYLPPNGHPG----HVSSYSAAGYGLGSALAV-ASGHSAWISK----PPGVALP :: :.::::::::::: : . ...::..:. :. ::. .:.:: :: : CCDS11 DVNEFDQYLPPNGHPGVPATHGQVTYTGSYGISSTAATPASAGHVWMSKQQAPPPPPQQP 290 300 310 320 330 340 350 360 370 pF1KB9 TVSPPGVDAKAQVKTE-----TAGPQ---------------------------GPPHYTD .::. .: : .. .: :: .: ::.. CCDS11 PQAPPAPQAPPQPQAAPPQQPAAPPQQPQAHTLTTLSSEPGQSQRTHIKTEQLSPSHYSE 350 360 370 380 390 400 380 390 400 410 420 pF1KB9 QP--STSQIAYTSLSLPHYGSAFPSISRPQFDYSDHQPSGPYYGHS-GQASGLYSAFSYM : : .::::. ..::::. ..: :.: :.::.::: :. ::.:. 
::..::::.:.:: CCDS11 QQQHSPQQIAYSPFNLPHYSPSYPPITRSQYDYTDHQNSSSYYSHAAGQGTGLYSTFTYM 410 420 430 440 450 460 430 440 450 460 pF1KB9 GPSQRPLYTAISDPS--PSGPQSHSPTHWEQPVYTTLSRP .:.:::.:: :.: : :: ::.::: ::: CCDS11 NPAQRPMYTPIADTSGVPSIPQTHSPQHWEQPVYTQLTRP 470 480 490 500 >>CCDS10428.1 SOX8 gene_id:30812|Hs108|chr16 (446 aa) initn: 1010 init1: 579 opt: 869 Z-score: 618.1 bits: 123.6 E(32554): 4.1e-28 Smith-Waterman score: 1281; 50.2% identity (68.1% similar) in 474 aa overlap (10-466:2-446) 10 20 30 40 50 pF1KB9 MAEEQDLSEVELSPVGSEEPRCLSPGSAPSLG--PDGGGGGSGLRA-SPGPGELG-KVKK ...: . :. : : :.: :.. :. . . : : : :. : : CCDS10 MLDMSEARSQPP-CSPSGTASSMSHVEDSDSDAPPSPAGSEGLGRAGVAVGG 10 20 30 40 50 60 70 80 90 100 110 pF1KB9 EQQD-GEADDDKFPVCIREAVSQVLSGYDWTLVPMPVRVNG--ASKSKPHVKRPMNAFMV . : .:: :..::.:::.::::::.::::.::::::: .: : :.::::::::::::: CCDS10 ARGDPAEAADERFPACIRDAVSQVLKGYDWSLVPMPVRGGGGGALKAKPHVKRPMNAFMV 60 70 80 90 100 110 120 130 140 150 160 170 pF1KB9 WAQAARRKLADQYPHLHNAELSKTLGKLWRLLNESDKRPFIEEAERLRMQHKKDHPDYKY ::::::::::::::::::::::::::::::::.::.::::.:::::::.::::::::::: CCDS10 WAQAARRKLADQYPHLHNAELSKTLGKLWRLLSESEKRPFVEEAERLRVQHKKDHPDYKY 120 130 140 150 160 170 180 190 200 210 220 230 pF1KB9 QPRRRKNGKAAQGEAECPGGEAEQGGTAAIQAHYKSAHLDHRHPGEGSPMSDGNPEHPSG ::::::..:: :... .: :: : . : ::. . : :. :. : .: CCDS10 QPRRRKSAKA--GHSDSDSG-AELGPHPGGGAVYKA------EAGLGDGHHHGD--H-TG 180 190 200 210 240 250 260 270 280 290 pF1KB9 QSHGPPTPPTTPKTELQSGKADP--KRDGRSMGEGGKPHIDFGNVDIGEISHEVMSNMET :.:::::::::::::::.. : : : .:: ..:. .:::.::::.:.: :::..:.. CCDS10 QTHGPPTPPTTPKTELQQAGAKPELKLEGRRPVDSGRQNIDFSNVDISELSSEVMGTMDA 220 230 240 250 260 270 300 310 320 330 340 pF1KB9 FDVAELDQYLPPNGHPGHVSSYSAAGYGLGSALAVASGHSAWISK--PPGVALPTVSPPG ::: :.::::: .: :. : . :.: :.. .: : : . : :: . : CCDS10 FDVHEFDQYLPLGG-PAP----PEPGQAYGGAYFHAGASPVWAHKSAPSASASPTETGP- 280 290 300 310 320 330 350 360 370 380 390 400 pF1KB9 VDAKAQVKTETAGPQGPPHYTDQPSTSQIAYTSLSLPHYGSAFPSISRPQF-----DYSD . ..::: :. : :: ::: : : : : .:: :. 
: ::.: CCDS10 --PRPHIKTEQ--PS-PGHYGDQPRGSP-DYGSCS--GQSSATPAAPAGPFAGSQGDYGD 340 350 360 370 380 410 420 430 440 450 460 pF1KB9 HQPSGPYYGHSGQASGLYSAFSYMGPSQRPLYTAISDPSPSGPQSHSPT-HWEQPVYTTL : :. : .. : : :::. . .: .:: . . . . . : .:::: ::.::::::: CCDS10 LQASSYYGAYPGYAPGLYQYPCFHSP-RRPYASPLLN-GLALPPAHSPTSHWDQPVYTTL 390 400 410 420 430 440 pF1KB9 SRP .:: CCDS10 TRP >>CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8 (414 aa) initn: 451 init1: 410 opt: 436 Z-score: 318.2 bits: 68.0 E(32554): 2.1e-11 Smith-Waterman score: 466; 35.4% identity (55.4% similar) in 325 aa overlap (91-410:55-330) 70 80 90 100 110 120 pF1KB9 GEADDDKFPVCIREAVSQVLSGYDWTLVPMPVRVNGASKSKPHVKRPMNAFMVWAQAARR :. . : .:.. ...::::::::::. :. CCDS61 AGLGPCPWAESLSPIGDMKVKGEAPANSGAPAGAAGRAKGESRIRRPMNAFMVWAKDERK 30 40 50 60 70 80 130 140 150 160 170 180 pF1KB9 KLADQYPHLHNAELSKTLGKLWRLLNESDKRPFIEEAERLRMQHKKDHPDYKYQPRRRKN .::.: : :::::::: ::: :. :. ..::::.:::::::.:: .:::.:::.:::::. CCDS61 RLAQQNPDLHNAELSKMLGKSWKALTLAEKRPFVEEAERLRVQHMQDHPNYKYRPRRRKQ 90 100 110 120 130 140 190 200 210 220 230 pF1KB9 GKAAQGEAECPGGEAEQGGTAAIQAHYKSAHLDHRHPGEGSPMSDG-NPEHP-SGQSHGP : . . :: . : : :: : : : : :: . . : .: :: CCDS61 VKRLK---RVEGGFLH--GLAEPQA----AALG---PEGGRVAMDGLGLQFPEQGFPAGP 150 160 170 180 190 240 250 260 270 280 290 pF1KB9 PTPPTTPKTELQSGKADPKRDGRSMGEGGKPHIDFGNVDIGEISHEVMSNMETFDVAELD : : :. ..:. :: .:.: : .: : . : :.. :: CCDS61 PLLP--PH---MGGHY---RDCQSLGA---PPLD------GY-------PLPTPDTSPLD 200 210 220 300 310 320 330 340 350 pF1KB9 QYLPPNGHPGHVSSYSAAGYGLGSALAVASGHSAWISKPPGVALPTVSPPGVDAKAQVKT : : .. :: . :. :... : .: : : ::. . .. CCDS61 GVDPD---P----AFFAAPMP-GDCPAAGTYSYAQVSDYAG---PP-EPPAGPMHPRLGP 230 240 250 260 270 360 370 380 390 400 410 pF1KB9 ETAGPQGPPHYTDQPSTSQIAYTSLSLPHYGSAFPSISRPQFDYS---DHQPSGPYYGHS : :::. : ::. .. : ... : :.. .:: ... 
.:.: :: CCDS61 EPAGPS-IPGLLAPPSALHVYYGAMGSPGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPSP 280 290 300 310 320 330 420 430 440 450 460 pF1KB9 GQASGLYSAFSYMGPSQRPLYTAISDPSPSGPQSHSPTHWEQPVYTTLSRP CCDS61 PPEALPCRDGTDPSQPAELLGEVDRTEFEQYLHFVCKPEMGLPYQGHDSGVNLPDSHGAI 340 350 360 370 380 390 >>CCDS12995.1 SOX12 gene_id:6666|Hs108|chr20 (315 aa) initn: 463 init1: 414 opt: 422 Z-score: 310.2 bits: 66.1 E(32554): 5.9e-11 Smith-Waterman score: 446; 33.2% identity (57.2% similar) in 271 aa overlap (103-369:39-285) 80 90 100 110 120 130 pF1KB9 REAVSQVLSGYDWTLVPMPVRVNGASKSKPHVKRPMNAFMVWAQAARRKLADQYPHLHNA :.::::::::::.: :::. ::.: .::: CCDS12 AKRDGGPPPPGPGPAEEGAREPGWCKTPSGHIKRPMNAFMVWSQHERRKIMDQWPDMHNA 10 20 30 40 50 60 140 150 160 170 180 190 pF1KB9 ELSKTLGKLWRLLNESDKRPFIEEAERLRMQHKKDHPDYKYQPRRRKNGKAAQGEAECPG :.:: ::. :.::..:.: ::..::::::..: :.:::::.::....: :... . :: CCDS12 EISKRLGRRWQLLQDSEKIPFVREAERLRLKHMADYPDYKYRPRKKSKGAPAKARPRPPG 70 80 90 100 110 120 200 210 220 230 240 250 pF1KB9 GEAEQGGTAAIQAHYKSAHLDHRHPGEGSPMSDGNPEHPSGQSHGPPTPPTTPKTELQSG : .:: . .. . .: ::.:. . :.: .: . .: :: CCDS12 G---SGGGSRLKP---GPQL----PGRGGRRAAGGPL--GGGAAAPEDDDEDDDEELLEV 130 140 150 160 170 260 270 280 290 300 310 pF1KB9 KADPKRDGRSMGEGGKPHIDFGNVDIGEISHEVMSNMETFDVAELDQYLPPNGHPGHVSS . . :: . . . : . :. . . : .: . : . . . CCDS12 RL-VETPGRELWR----MVPAGRAARGQAERAQGPSGEGAAAAAAASPTPSEDEEPEEEE 180 190 200 210 220 230 320 330 340 350 360 pF1KB9 YSAAGYGLGSALAVASGHSA--WISK-PPGVALPTVSPPGVDAKAQVKT-ETAGPQGPPH ::. : .::::. . ..:. ::: : :.: .: . . :.: : CCDS12 EEAAAAEEGEEETVASGEESLGFLSRLPPG-------PAGLDCSALDRDPDLQPPSGTSH 240 250 260 270 280 370 380 390 400 410 420 pF1KB9 YTDQPSTSQIAYTSLSLPHYGSAFPSISRPQFDYSDHQPSGPYYGHSGQASGLYSAFSYM . 
CCDS12 FEFPDYCTPEVTEMIAGDWRPSSIADLVFTY
          290       300       310

466 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Sat Nov 5 02:06:27 2016 done: Sat Nov 5 02:06:28 2016
 Total Scan time: 3.190 Total Display time: 0.030

Function used was FASTA [36.3.4 Apr, 2011]