FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB8988, 388 aa
  1>>>pF1KB8988 388 - 388 aa - 388 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 9.1071+/-0.000803; mu= 1.5643+/- 0.048
 mean_var=247.9370+/-50.117, 0's: 0 Z-trim(117.0): 53 B-trim: 4 in 1/51
 Lambda= 0.081452
 statistics sampled from 17619 (17672) to 17619 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.838), E-opt: 0.2 (0.543), width: 16
 Scan time: 3.400

The best scores are:                                opt  bits E(32554)
CCDS5977.1 SOX7 gene_id:83595|Hs108|chr8    ( 388) 2730 333.2 2.5e-91
CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8   ( 414)  847 111.9 1.1e-24
CCDS13552.1 SOX18 gene_id:54345|Hs108|chr20 ( 384)  559  78.0 1.6e-14
CCDS11689.1 SOX9 gene_id:6662|Hs108|chr17   ( 509)  461  66.6 5.7e-11

>>CCDS5977.1 SOX7 gene_id:83595|Hs108|chr8 (388 aa)
 initn: 2730 init1: 2730 opt: 2730 Z-score: 1753.1 bits: 333.2 E(32554): 2.5e-91
Smith-Waterman score: 2730; 100.0% identity (100.0% similar) in 388 aa overlap (1-388:1-388)

               10        20        30        40        50        60
pF1KB8 MASLLGAYPWPEGLECPALDAELSDGQSPPAVPRPPGDKGSESRIRRPMNAFMVWAKDER
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 MASLLGAYPWPEGLECPALDAELSDGQSPPAVPRPPGDKGSESRIRRPMNAFMVWAKDER
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB8 KRLAVQNPDLHNAELSKMLGKSWKALTLSQKRPYVDEAERLRLQHMQDYPNYKYRPRRKK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 KRLAVQNPDLHNAELSKMLGKSWKALTLSQKRPYVDEAERLRLQHMQDYPNYKYRPRRKK
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KB8 QAKRLCKRVDPGFLLSSLSRDQNALPEKRSGSRGALGEKEDRGEYSPGTALPSLRGCYHE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 QAKRLCKRVDPGFLLSSLSRDQNALPEKRSGSRGALGEKEDRGEYSPGTALPSLRGCYHE
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KB8 GPAGGGGGGTPSSVDTYPYGLPTPPEMSPLDVLEPEQTFFSSPCQEEHGHPRRIPHLPGH
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 GPAGGGGGGTPSSVDTYPYGLPTPPEMSPLDVLEPEQTFFSSPCQEEHGHPRRIPHLPGH
              190       200       210       220       230       240

              250       260       270       280       290       300
pF1KB8 PYSPEYAPSPLHCSHPLGSLALGQSPGVSMMSPVPGCPPSPAYYSPATYHPLHSNLQAHL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 PYSPEYAPSPLHCSHPLGSLALGQSPGVSMMSPVPGCPPSPAYYSPATYHPLHSNLQAHL
              250       260       270       280       290       300

              310       320       330       340       350       360
pF1KB8 GQLSPPPEHPGFDALDQLSQVELLGDMDRNEFDQYLNTPGHPDSATGAMALSGHVPVSQV
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS59 GQLSPPPEHPGFDALDQLSQVELLGDMDRNEFDQYLNTPGHPDSATGAMALSGHVPVSQV
              310       320       330       340       350       360

              370       380
pF1KB8 TPTGPTETSLISVLADATATYYNSYSVS
       ::::::::::::::::::::::::::::
CCDS59 TPTGPTETSLISVLADATATYYNSYSVS
              370       380

>>CCDS6159.1 SOX17 gene_id:64321|Hs108|chr8 (414 aa)
 initn: 775 init1: 525 opt: 847 Z-score: 556.8 bits: 111.9 E(32554): 1.1e-24
Smith-Waterman score: 847; 43.3% identity (61.9% similar) in 404 aa overlap (5-385:27-411)

10 20 30
pF1KB8 MASLLGAYPWPEGLECPALDAELSDGQSPPAVPRPPGD
:: :: :.: : : ... :..: : :
CCDS61 MSSPDAGYASDDQSQTQSALPAVMAGLGPCPWAESLS-PIGDMKVK-GEAPANSGAPAGA
10 20 30 40 50

40 50 60 70 80 90
pF1KB8 KG---SESRIRRPMNAFMVWAKDERKRLAVQNPDLHNAELSKMLGKSWKALTLSQKRPYV
: .::::::::::::::::::::::: :::::::::::::::::::::::..:::.:
CCDS61 AGRAKGESRIRRPMNAFMVWAKDERKRLAQQNPDLHNAELSKMLGKSWKALTLAEKRPFV
60 70 80 90 100 110

100 110 120 130 140 150
pF1KB8 DEAERLRLQHMQDYPNYKYRPRRKKQAKRLCKRVDPGFLLSSLSRDQNAL--PEKRSGSR
.::::::.:::::.:::::::::.::.::: :::. ::: .:.. : : :: .
CCDS61 EEAERLRVQHMQDHPNYKYRPRRRKQVKRL-KRVEGGFL-HGLAEPQAAALGPEGGRVAM
120 130 140 150 160 170

160 170 180 190 200 210
pF1KB8 GALGEKEDRGEYSPGTAL--PSLRGCYHEGPAGGGGGGTPSSVDTYPYGLPTPPEMSPLD
.:: . . . : : : . : :.. . :.: .: :: :::: . ::::
CCDS61 DGLGLQFPEQGFPAGPPLLPPHMGGHYRDCQSL----GAPP-LDGYP--LPTP-DTSPLD
180 190 200 210 220

220 230 240 250 260
pF1KB8 VLEPEQTFFSSP----CQEEHGHPRRI-------PHLPGHPYSPEYAPSPLHCSHPLGSL
..:. .::..: : . :. :. :. :. .: : : : : :
CCDS61 GVDPDPAFFAAPMPGDCPAAGTYSYAQVSDYAGPPEPPAGPMHPRLGPEPAGPSIP-GLL
230 240 250 260 270 280

270 280 290 300 310
pF1KB8 ALGQSPGV---SMMSPVPGCPPSPAYYSPATYHPLHSNLQAHLGQLSPPPEH-PGFDALD
: .. : .: :: : . . .. :.. :: ::::: : :. :
CCDS61 APPSALHVYYGAMGSPGAGGGRGFQMQPQHQHQHQHQHHPPGPGQPSPPPEALPCRDGTD
290 300 310 320 330 340

320 330 340 350 360 370
pF1KB8 QLSQVELLGDMDRNEFDQYLNTPGHPDSATGAMALSGHVPVSQVTPTGPTETSLISVLAD
. .::::..::.::.:::. .:. . . .:: : :. .. .. ::..:
CCDS61 PSQPAELLGEVDRTEFEQYLHFVCKPEMG---LPYQGHD--SGVN-LPDSHGAISSVVSD
350 360 370 380 390 400

380
pF1KB8 AT-ATYYNSYSVS
:. :.:: .:
CCDS61 ASSAVYYCNYPDV
410

>>CCDS13552.1 SOX18 gene_id:54345|Hs108|chr20 (384 aa)
 initn: 745 init1: 480 opt: 559 Z-score: 374.4 bits: 78.0 E(32554): 1.6e-14
Smith-Waterman score: 782; 41.6% identity (58.2% similar) in 409 aa overlap (2-388:30-383)

10 20 30
pF1KB8 MASLLGAYPWPEGLECPALDAELSDGQ-SPPA
:. : : .: :: : . : :::
CCDS13 MQRSPPGYGAQDDPPARRDCAWAPGHGAAADTRGLAAGPAALAAPAAPASPPSPQRSPPR
10 20 30 40 50 60

40 50 60 70 80
pF1KB8 VPRP------PGDKG-----SESRIRRPMNAFMVWAKDERKRLAVQNPDLHNAELSKMLG
:.: :. .: .::::::::::::::::::::::: :::::::: ::::::
CCDS13 SPEPGRYGLSPAGRGERQAADESRIRRPMNAFMVWAKDERKRLAQQNPDLHNAVLSKMLG
70 80 90 100 110 120

90 100 110 120 130 140
pF1KB8 KSWKALTLSQKRPYVDEAERLRLQHMQDYPNYKYRPRRKKQAKRLCKRVDPGFLLSSLSR
:.:: :. ..:::.:.::::::.::..:.:::::::::::::.. .:..::.:: .:.
CCDS13 KAWKELNAAEKRPFVEEAERLRVQHLRDHPNYKYRPRRKKQARK-ARRLEPGLLLPGLAP
130 140 150 160 170

150 160 170 180 190 200
pF1KB8 DQNALPEKRSGSRGALGEKEDRGEYSPGTALPSLRGCYHEGPAGGGGGGTPSSVDTYPYG
: :: .. : : :. ..: : : ... :
CCDS13 PQPP-PEPFPAASG------------------SARA-FRELP--------PLGAEFDGLG
180 190 200 210

210 220 230 240 250
pF1KB8 LPTPPEMSPLDVLEP-EQTFFSSPCQEEHGHPRRIPHLPGHPYSPEYAPSPLHCSHPLGS
:::: : :::: ::: : .:: : : : :. :::. : : :
CCDS13 LPTP-ERSPLDGLEPGEAAFFPPPAAPEDCALR--------PFRAPYAPTELS-RDPGGC
220 230 240 250 260

260 270 280 290 300 310
pF1KB8 LALGQSPGVSMMSPVPGCPPSPAYY----SPATYHPLHSNLQAHLGQLSPPPEHPGFDAL
: . .. . :. : . :: .:. : : : :::::: : ...
CCDS13 Y--GAPLAEALRTAPPAAPLAGLYYGTLGTPGPY-P---------GPLSPPPEAPPLESA
270 280 290 300

320 330 340 350 360 370
pF1KB8 DQLSQV-ELLGDMDRNEFDQYLN-TPGHPDSATGAMALSGHVPVSQVTPTG---PTETSL
. :. . .: .:.: .::::::: . .:: : : : :: .... : . : :.::
CCDS13 EPLGPAADLWADVDLTEFDQYLNCSRTRPD-APG---LPYHVALAKLGPRAMSCPEESSL
310 320 330 340 350 360

380
pF1KB8 ISVLADATATYYNSYSVS
::.:.::... : : .:
CCDS13 ISALSDASSAVYYSACISG
370 380

>>CCDS11689.1 SOX9 gene_id:6662|Hs108|chr17 (509 aa)
 initn: 479 init1: 383 opt: 461 Z-score: 310.5 bits: 66.6 E(32554): 5.7e-11
Smith-Waterman score: 461; 32.3% identity (54.7% similar) in 322 aa overlap (30-341:90-406)

10 20 30 40 50
pF1KB8 MASLLGAYPWPEGLECPALDAELSDGQSPPAVPRPPGDKGSESRIRRPMNAFMVWAKDE
: : :.. .. ...::::::::::.
CCDS11 LKKESEEDKFPVCIREAVSQVLKGYDWTLVPMPVRVNGSSKNKPHVKRPMNAFMVWAQAA
60 70 80 90 100 110

60 70 80 90 100 110
pF1KB8 RKRLAVQNPDLHNAELSKMLGKSWKALTLSQKRPYVDEAERLRLQHMQDYPNYKYRPRRK
:..:: : : :::::::: ::: :. :. :.:::.:.::::::.:: .:.:.:::.:::.
CCDS11 RRKLADQYPHLHNAELSKTLGKLWRLLNESEKRPFVEEAERLRVQHKKDHPDYKYQPRRR
120 130 140 150 160 170

120 130 140 150 160 170
pF1KB8 KQAKRLCKRVDPGFLLSSLSRDQ--NALPEKRSGSRGALGEKEDRGEYSPGTALPSLRGC
:..: ... . . .: . .:: : ....: .. ::.: . :
CCDS11 KSVKNGQAEAEEATEQTHISPNAIFKALQADSPHSSSGMSEVHSPGEHSGQSQGPPTPPT
180 190 200 210 220 230

180 190 200 210 220 230
pF1KB8 YHEGPAGGGGGGTPSSVDTYPYGLPTPP-EMSPLDVLEPEQTFFSSPCQEEHGHPRRIPH
. . : . : : :: .. .:. : . .:. : . .
CCDS11 TPKTDVQPGKADLKREGRPLPEGGRQPPIDFRDVDIGELSSDVISN--IETFDVNEFDQY
240 250 260 270 280 290

240 250 260 270 280
pF1KB8 LP--GHPYSPE-YAPSPLHCSHPLGSLA-LGQSPGVSMMSP--VPGCPPS-PAYYSPATY
:: ::: : .. :. ..: : : : :: .: ::. : ::
CCDS11 LPPNGHPGVPATHGQVTYTGSYGISSTAATPASAGHVWMSKQQAPPPPPQQPPQAPPAPQ
300 310 320 330 340 350

290 300 310 320 330 340
pF1KB8 HPLHSNLQAHLGQLSPPPEHPGFDALDQLSQVELLGDMDRNEFDQYLNTPGHPDSATGAM
: . . : : . ::..: .: ::. :. .:... .:.:
CCDS11 APPQPQ-AAPPQQPAAPPQQPQAHTLTTLSSEP--GQSQRTHIKTEQLSPSHYSEQQQHS
360 370 380 390 400 410

350 360 370 380
pF1KB8 ALSGHVPVSQVTPTGPTETSLISVLADATATYYNSYSVS

CCDS11 PQQIAYSPFNLPHYSPSYPPITRSQYDYTDHQNSSSYYSHAAGQGTGLYSTFTYMNPAQR
420 430 440 450 460 470

388 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Fri Nov 4 16:52:00 2016 done: Fri Nov 4 16:52:00 2016
 Total Scan time: 3.400 Total Display time: 0.000

Function used was FASTA [36.3.4 Apr, 2011]