FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KA0790, 1247 aa
  1>>>pF1KA0790 1247 - 1247 aa - 1247 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 11.6694+/-0.00103; mu= -5.4885+/- 0.063
 mean_var=341.3180+/-70.604, 0's: 0 Z-trim(115.0): 21 B-trim: 217 in 2/54
 Lambda= 0.069422
 statistics sampled from 15512 (15523) to 15512 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.769), E-opt: 0.2 (0.477), width: 16
 Scan time: 5.460

The best scores are:                                      opt  bits E(32554)
CCDS5212.1 SASH1 gene_id:23328|Hs108|chr6          (1247) 8410 857.0       0
CCDS42906.1 SAMSN1 gene_id:64092|Hs108|chr21       ( 373)  855 100.0 1.2e-20
CCDS58786.1 SAMSN1 gene_id:64092|Hs108|chr21       ( 441)  855 100.0 1.4e-20
CCDS14614.1 SASH3 gene_id:54440|Hs108|chrX         ( 380)  827  97.2 8.5e-20
CCDS74774.1 SAMSN1 gene_id:64092|Hs108|chr21       ( 304)  793  93.7 7.5e-19

>>CCDS5212.1 SASH1 gene_id:23328|Hs108|chr6               (1247 aa)
 initn: 8410 init1: 8410 opt: 8410 Z-score: 4565.9 bits: 857.0 E(32554): 0
Smith-Waterman score: 8410; 99.9% identity (100.0% similar) in 1247 aa overlap (1-1247:1-1247)

               10        20        30        40        50        60
pF1KA0 MEDAGAAGPGPEPEPEPEPEPEPAPEPEPEPKPGAGTSEAFSRLWTDVMGILDGSLGNID
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 MEDAGAAGPGPEPEPEPEPEPEPAPEPEPEPKPGAGTSEAFSRLWTDVMGILDGSLGNID
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KA0 DLAQQYADYYNTCFSDVCERMEELRKRRVSQDLEVEKPDASPTSLQLRSQIEESLGFCSA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 DLAQQYADYYNTCFSDVCERMEELRKRRVSQDLEVEKPDASPTSLQLRSQIEESLGFCSA
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KA0 VSTPEVERKNPLHKSNSEDSSVGKGDWKKKNKYFWQNFRKNQKGIMRQTSKGEDVGYVAS
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 VSTPEVERKNPLHKSNSEDSSVGKGDWKKKNKYFWQNFRKNQKGIMRQTSKGEDVGYVAS
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KA0 EITMSDEERIQLMMMVKEKMITIEEALARLKEYEAQHRQSAALDPADWPDGSYPTFDGSS
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 EITMSDEERIQLMMMVKEKMITIEEALARLKEYEAQHRQSAALDPADWPDGSYPTFDGSS
              190       200       210       220       230       240

              250       260       270       280       290       300
pF1KA0 NCNSREQSDDETEESVKFKRLHKLVNSTRRVRKKLIRVEEMKKPSTEGGEEHVFENSPVL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 NCNSREQSDDETEESVKFKRLHKLVNSTRRVRKKLIRVEEMKKPSTEGGEEHVFENSPVL
              250       260       270       280       290       300

              310       320       330       340       350       360
pF1KA0 DERSALYSGVHKKPLFFDGSPEKPPEDDSDSLTTSPSSSSLDTWGAGRKLVKTFSKGESR
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 DERSALYSGVHKKPLFFDGSPEKPPEDDSDSLTTSPSSSSLDTWGAGRKLVKTFSKGESR
              310       320       330       340       350       360

              370       380       390       400       410       420
pF1KA0 GLIKPPKKMGTFFSYPEEEKAQKVSRSLTEGEMKKGLGSLSHGRTCSFGGFDLTNRSLHV
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 GLIKPPKKMGTFFSYPEEEKAQKVSRSLTEGEMKKGLGSLSHGRTCSFGGFDLTNRSLHV
              370       380       390       400       410       420

              430       440       450       460       470       480
pF1KA0 GSNNSDPMGKEGDFVYKEVIKSPTASRISLGKKVKSVKETMRKRMSKKYSSSVSEQDSGL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 GSNNSDPMGKEGDFVYKEVIKSPTASRISLGKKVKSVKETMRKRMSKKYSSSVSEQDSGL
              430       440       450       460       470       480

              490       500       510       520       530       540
pF1KA0 DGMPGSPPPSQPDPEHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVSTTDSSTSNRESVK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 DGMPGSPPPSQPDPEHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVSTTDSSTSNRESVK
              490       500       510       520       530       540

              550       560       570       580       590       600
pF1KA0 SEDGDDEEPPYRGPFCGRARVHTDFTPSPYDTDSLKLKKGDIIDIISKPPMGTWMGLLNN
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 SEDGDDEEPPYRGPFCGRARVHTDFTPSPYDTDSLKLKKGDIIDIISKPPMGTWMGLLNN
              550       560       570       580       590       600

              610       620       630       640       650       660
pF1KA0 KVGTFKFIYVDVLSEDEEKPKRPTRRRRKGRPPQPKSVEDLLDRINLKEHMPTFLFNGYE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 KVGTFKFIYVDVLSEDEEKPKRPTRRRRKGRPPQPKSVEDLLDRINLKEHMPTFLFNGYE
              610       620       630       640       650       660

              670       680       690       700       710       720
pF1KA0 DLDTFKLLEEEDLDELKIRDPEHRAVLLTAVELLQEYDSNSDQSGSQEKLLVDSQGLSGC
       ::::::::::::::::.:::::::::::::::::::::::::::::::::::::::::::
CCDS52 DLDTFKLLEEEDLDELNIRDPEHRAVLLTAVELLQEYDSNSDQSGSQEKLLVDSQGLSGC
              670       680       690       700       710       720

              730       740       750       760       770       780
pF1KA0 SPRDSGCYESSENLENGKTRKASLLSAKSSTEPSLKSFSRNQLGNYPTLPLMKSGDALKQ
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 SPRDSGCYESSENLENGKTRKASLLSAKSSTEPSLKSFSRNQLGNYPTLPLMKSGDALKQ
              730       740       750       760       770       780

              790       800       810       820       830       840
pF1KA0 GQEEGRLGGGLAPDTSKSCDPPGVTGLNKNRRSLPVSICRSCETLEGPQTVDTWPRSHSL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 GQEEGRLGGGLAPDTSKSCDPPGVTGLNKNRRSLPVSICRSCETLEGPQTVDTWPRSHSL
              790       800       810       820       830       840

              850       860       870       880       890       900
pF1KA0 DDLQVEPGAEQDVPTEVTEPPPQIVPEVPQKTTASSTKAQPLEQDSAVDNALLLTQSKRF
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 DDLQVEPGAEQDVPTEVTEPPPQIVPEVPQKTTASSTKAQPLEQDSAVDNALLLTQSKRF
              850       860       870       880       890       900

              910       920       930       940       950       960
pF1KA0 SEPQKLTTKKLEGSIAASGRGLSPPQCLPRNYDAQPPGAKHGLARTPLEGHRKGHEFEGT
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 SEPQKLTTKKLEGSIAASGRGLSPPQCLPRNYDAQPPGAKHGLARTPLEGHRKGHEFEGT
              910       920       930       940       950       960

              970       980       990      1000      1010      1020
pF1KA0 HHPLGTKEGVDAEQRMQPKIPSQPPPVPAKKSRERLANGLHPVPMGPSGALPSPDAPCLP
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 HHPLGTKEGVDAEQRMQPKIPSQPPPVPAKKSRERLANGLHPVPMGPSGALPSPDAPCLP
              970       980       990      1000      1010      1020

             1030      1040      1050      1060      1070      1080
pF1KA0 VKRGSPASPTSPSDCPPALAPRPLSGQAPGSPPSTRPPPWLSELPENTSLQEHGVKLGPA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 VKRGSPASPTSPSDCPPALAPRPLSGQAPGSPPSTRPPPWLSELPENTSLQEHGVKLGPA
             1030      1040      1050      1060      1070      1080

             1090      1100      1110      1120      1130      1140
pF1KA0 LTRKVSCARGVDLETLTENKLHAEGIDLTEEPYSDKHGRCGIPEALVQRYAEDLDQPERD
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS52 LTRKVSCARGVDLETLTENKLHAEGIDLTEEPYSDKHGRCGIPEALVQRYAEDLDQPERD
             1090      1100      1110      1120      1130      1140

             1150      1160      1170      1180      1190      1200
pF1KA0 
VAANMDQIRVKQLRKQHRMAIPSGGLTEICRKPVSPGCISSVSDWLISIGLPMYAGTLST :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 VAANMDQIRVKQLRKQHRMAIPSGGLTEICRKPVSPGCISSVSDWLISIGLPMYAGTLST 1150 1160 1170 1180 1190 1200 1210 1220 1230 1240 pF1KA0 AGFSTLSQVPSLSHTCLQEAGITEERHIRKLLSAARLFKLPPGPEAM ::::::::::::::::::::::::::::::::::::::::::::::: CCDS52 AGFSTLSQVPSLSHTCLQEAGITEERHIRKLLSAARLFKLPPGPEAM 1210 1220 1230 1240 >>CCDS42906.1 SAMSN1 gene_id:64092|Hs108|chr21 (373 aa) initn: 778 init1: 492 opt: 855 Z-score: 484.1 bits: 100.0 E(32554): 1.2e-20 Smith-Waterman score: 883; 45.7% identity (71.1% similar) in 346 aa overlap (404-738:20-351) 380 390 400 410 420 430 pF1KA0 SYPEEEKAQKVSRSLTEGEMKKGLGSLSHGRTCSFGGFD-LTNRSLHVGSNNSDPMGKEG :. :::.:: . : :: ..... ..:: CCDS42 MLKRKPSNVSEKEKHQKPKRSSSFGNFDRFRNNSLSKPDDSTE--AHEG 10 20 30 40 440 450 460 470 480 490 pF1KA0 DFVYKEVIKSPTASRIS-LGKKVKSVKETMRKRMSKKYSSSVSEQDSGLDGMPGSPPPSQ : . .: :.. . ::::..... ::.:...::: ...::. . :: .. : . CCDS42 DPTNGSGEQSKTSNNGGGLGKKMRAISWTMKKKVGKKYIKALSEEKDEEDG-ENAHPYRN 50 60 70 80 90 100 500 510 520 530 540 pF1KA0 PDP---EHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVSTTDSSTSNRESVKSEDGDDEE :: : .: .:::. :..:: : :::: :: ... ...::::.: . .: . CCDS42 SDPVIGTHTEKVSLKASDSMDSLYS---GQSSSSG--ITSCSDGTSNRDSFRLDD----D 110 120 130 140 150 550 560 570 580 590 600 pF1KA0 PPYRGPFCGRARVHTDFTPSPYDTDSLKLKKGDIIDIISKPPMGTWMGLLNNKVGTFKFI :: ::::::::::::::::::::::::.::::::::: : ::: : :.::::::.:::: CCDS42 GPYSGPFCGRARVHTDFTPSPYDTDSLKIKKGDIIDIICKTPMGMWTGMLNNKVGNFKFI 160 170 180 190 200 210 610 620 630 640 650 660 pF1KA0 YVDVLSEDEEKPKRPTRRRRKGRPPQPKSVEDLLDRINLKEHMPTFLFNGYEDLDTFKLL ::::.::.: ::. . :.. . :.....:.::.:.:. :.:.:::: :. .: . CCDS42 YVDVISEEEAAPKK-IKANRRSNSKKSKTLQEFLERIHLQEYTSTLLLNGYETLEDLKDI 220 230 240 250 260 270 670 680 690 700 710 720 pF1KA0 EEEDLDELKIRDPEHRAVLLTAVELLQEYDSNSDQSGSQEKLLVDSQ------GLSGCSP .: : ::.:..:. : ::.:.: . : . ..: . : : ..:. :. 
: : CCDS42 KESHLIELNIENPDDRRRLLSAAENFLEEEIIQEQENEPEPLSLSSDISLNKSQLDDC-P 280 290 300 310 320 330 730 740 750 760 770 780 pF1KA0 RDSGCYESSENLENGKTRKASLLSAKSSTEPSLKSFSRNQLGNYPTLPLMKSGDALKQGQ :::::: :: : .::: CCDS42 RDSGCYISSGNSDNGKEDLESENLSDMVHKIIITEPSD 340 350 360 370 >>CCDS58786.1 SAMSN1 gene_id:64092|Hs108|chr21 (441 aa) initn: 778 init1: 492 opt: 855 Z-score: 483.1 bits: 100.0 E(32554): 1.4e-20 Smith-Waterman score: 883; 45.7% identity (71.1% similar) in 346 aa overlap (404-738:88-419) 380 390 400 410 420 430 pF1KA0 SYPEEEKAQKVSRSLTEGEMKKGLGSLSHGRTCSFGGFD-LTNRSLHVGSNNSDPMGKEG :. :::.:: . : :: ..... ..:: CCDS58 QVGPWDHCSSCIRHTRLKSSCSDMDLLHSWRSSSFGNFDRFRNNSLSKPDDSTE--AHEG 60 70 80 90 100 110 440 450 460 470 480 490 pF1KA0 DFVYKEVIKSPTASRIS-LGKKVKSVKETMRKRMSKKYSSSVSEQDSGLDGMPGSPPPSQ : . .: :.. . ::::..... ::.:...::: ...::. . :: .. : . CCDS58 DPTNGSGEQSKTSNNGGGLGKKMRAISWTMKKKVGKKYIKALSEEKDEEDG-ENAHPYRN 120 130 140 150 160 170 500 510 520 530 540 pF1KA0 PDP---EHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVSTTDSSTSNRESVKSEDGDDEE :: : .: .:::. :..:: : :::: :: ... ...::::.: . .: . CCDS58 SDPVIGTHTEKVSLKASDSMDSLYS---GQSSSSG--ITSCSDGTSNRDSFRLDD----D 180 190 200 210 220 550 560 570 580 590 600 pF1KA0 PPYRGPFCGRARVHTDFTPSPYDTDSLKLKKGDIIDIISKPPMGTWMGLLNNKVGTFKFI :: ::::::::::::::::::::::::.::::::::: : ::: : :.::::::.:::: CCDS58 GPYSGPFCGRARVHTDFTPSPYDTDSLKIKKGDIIDIICKTPMGMWTGMLNNKVGNFKFI 230 240 250 260 270 280 610 620 630 640 650 660 pF1KA0 YVDVLSEDEEKPKRPTRRRRKGRPPQPKSVEDLLDRINLKEHMPTFLFNGYEDLDTFKLL ::::.::.: ::. . :.. . :.....:.::.:.:. :.:.:::: :. .: . CCDS58 YVDVISEEEAAPKK-IKANRRSNSKKSKTLQEFLERIHLQEYTSTLLLNGYETLEDLKDI 290 300 310 320 330 340 670 680 690 700 710 720 pF1KA0 EEEDLDELKIRDPEHRAVLLTAVELLQEYDSNSDQSGSQEKLLVDSQ------GLSGCSP .: : ::.:..:. : ::.:.: . : . ..: . : : ..:. :. 
: : CCDS58 KESHLIELNIENPDDRRRLLSAAENFLEEEIIQEQENEPEPLSLSSDISLNKSQLDDC-P 350 360 370 380 390 400 730 740 750 760 770 780 pF1KA0 RDSGCYESSENLENGKTRKASLLSAKSSTEPSLKSFSRNQLGNYPTLPLMKSGDALKQGQ :::::: :: : .::: CCDS58 RDSGCYISSGNSDNGKEDLESENLSDMVHKIIITEPSD 410 420 430 440 >>CCDS14614.1 SASH3 gene_id:54440|Hs108|chrX (380 aa) initn: 823 init1: 687 opt: 827 Z-score: 468.9 bits: 97.2 E(32554): 8.5e-20 Smith-Waterman score: 827; 45.0% identity (67.0% similar) in 351 aa overlap (400-733:20-357) 370 380 390 400 410 420 pF1KA0 GTFFSYPEEEKAQKVSRSLTEGEMKKGLGSLSHGRTCSFGGFDLTNRSLHVGSN---NSD :: :. :: : .. : : :. : : CCDS14 MLRRKPSNASEKEPTQKKKLSLQRSSSFKDFAKSKPSSPVVSEKEFNLD 10 20 30 40 430 440 450 460 470 480 pF1KA0 PMGKEGDFVYKEVIKSPTASRIS---LGKKVKSV-KETMRKRMSKKYSSSVSEQDSGLDG : : . .: . : :::: ..: ..:: ..:.: . ...::. . : CCDS14 DNIPEDD----SGVPTPEDAGKSGKKLGKKWRAVISRTMNRKMGKMMVKALSEEMA--DT 50 60 70 80 90 100 490 500 510 520 530 540 pF1KA0 MP-GSPPPSQPDPEHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVSTTDSSTSNRESVKS . :: :..:: ::.: . :.. ..: : ..: :. :. : . CCDS14 LEEGSASPTSPD-YSLDSP------GPEKMALAFSEQEEHELPVLSRQASTGSELCSPSP 110 120 130 140 150 550 560 570 580 590 pF1KA0 EDGD-DEEPP---YRGPFCGRARVHTDFTPSPYDTDSLKLKKGDIIDIISKPPMGTWMGL .:. :::: : ::::::::::::::::::: :::::.:::.:.:: :::.:::.:: CCDS14 GSGSFGEEPPAPQYTGPFCGRARVHTDFTPSPYDHDSLKLQKGDVIQIIEKPPVGTWLGL 160 170 180 190 200 210 600 610 620 630 640 650 pF1KA0 LNNKVGTFKFIYVDVLSEDEEKPKRPTRRRRKGRPPQPKSVEDLLDRINLKEHMPTFLFN ::.:::.::::::::: :. ::.::. ::. :.::....::.::.:.:: :.:.: CCDS14 LNGKVGSFKFIYVDVLPEEAVGHARPSRRQSKGKRPKPKTLHELLERIGLEEHTSTLLLN 220 230 240 250 260 270 660 670 680 690 700 710 pF1KA0 GYEDLDTFKLLEEEDLDELKIRDPEHRAVLLTAVELLQEYDSNSDQS-----GSQEKLLV ::. :. :: :.: :.::.: ::.::: ::::.::: .::..:... .::: . CCDS14 GYQTLEDFKELRETHLNELNIMDPQHRAKLLTAAELLLDYDTGSEEAEEGAESSQEPVAH 280 290 300 310 320 330 720 730 740 750 760 770 pF1KA0 DSQGLSGCSPRDSGCYESSENLENGKTRKASLLSAKSSTEPSLKSFSRNQLGNYPTLPLM . . ::::::.:.::. 
CCDS14 TVSEPKVDIPRDSGCFEGSESGRDDAELAGTEEQLQGLSLAGAP 340 350 360 370 380 >>CCDS74774.1 SAMSN1 gene_id:64092|Hs108|chr21 (304 aa) initn: 753 init1: 492 opt: 793 Z-score: 451.9 bits: 93.7 E(32554): 7.5e-19 Smith-Waterman score: 821; 47.6% identity (72.4% similar) in 294 aa overlap (454-738:1-282) 430 440 450 460 470 480 pF1KA0 NSDPMGKEGDFVYKEVIKSPTASRISLGKKVKSVKETMRKRMSKKYSSSVSEQDSGLDGM ..... ::.:...::: ...::. . :: CCDS74 MRAISWTMKKKVGKKYIKALSEEKDEEDG- 10 20 490 500 510 520 530 540 pF1KA0 PGSPPPSQPDP---EHLDKPKLKAGGSVESLRSSLSGQSSMSGQTVSTTDSSTSNRESVK .. : . :: : .: .:::. :..:: ::::: :: ... ...::::.: . CCDS74 ENAHPYRNSDPVIGTHTEKVSLKASDSMDSL---YSGQSSSSG--ITSCSDGTSNRDSFR 30 40 50 60 70 80 550 560 570 580 590 600 pF1KA0 SEDGDDEEPPYRGPFCGRARVHTDFTPSPYDTDSLKLKKGDIIDIISKPPMGTWMGLLNN .: . :: ::::::::::::::::::::::::.::::::::: : ::: : :.::: CCDS74 LDD----DGPYSGPFCGRARVHTDFTPSPYDTDSLKIKKGDIIDIICKTPMGMWTGMLNN 90 100 110 120 130 140 610 620 630 640 650 660 pF1KA0 KVGTFKFIYVDVLSEDEEKPKRPTRRRRKGRPPQPKSVEDLLDRINLKEHMPTFLFNGYE :::.::::::::.::.: ::. . :.. . :.....:.::.:.:. :.:.:::: CCDS74 KVGNFKFIYVDVISEEEAAPKK-IKANRRSNSKKSKTLQEFLERIHLQEYTSTLLLNGYE 150 160 170 180 190 670 680 690 700 710 pF1KA0 DLDTFKLLEEEDLDELKIRDPEHRAVLLTAVELLQEYDSNSDQSGSQEKLLVDSQ----- :. .: ..: : ::.:..:. : ::.:.: . : . ..: . : : ..:. CCDS74 TLEDLKDIKESHLIELNIENPDDRRRLLSAAENFLEEEIIQEQENEPEPLSLSSDISLNK 200 210 220 230 240 250 720 730 740 750 760 770 pF1KA0 -GLSGCSPRDSGCYESSENLENGKTRKASLLSAKSSTEPSLKSFSRNQLGNYPTLPLMKS :. : ::::::: :: : .::: CCDS74 SQLDDC-PRDSGCYISSGNSDNGKEDLESENLSDMVHKIIITEPSD 260 270 280 290 300 1247 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Fri Nov 4 01:01:18 2016 done: Fri Nov 4 01:01:19 2016 Total Scan time: 5.460 Total Display time: 0.050 Function used was FASTA [36.3.4 Apr, 2011]