FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE0425, 373 aa
  1>>>pF1KE0425 373 - 373 aa - 373 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics:  Expectation_n fit: rho(ln(x))= 5.3764+/-0.000953; mu= 15.7455+/- 0.057
 mean_var=63.9724+/-12.879, 0's: 0 Z-trim(104.6): 30  B-trim: 0 in 0/50
 Lambda= 0.160353
 statistics sampled from 7977 (7990) to 7977 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.628), E-opt: 0.2 (0.245), width:  16
 Scan time:  2.410

The best scores are:                                       opt bits E(32554)
CCDS14717.1 NSDHL gene_id:50814|Hs108|chrX         ( 373) 2485 583.8 8.1e-167
CCDS42205.1 SDR42E1 gene_id:93517|Hs108|chr16      ( 393)  514 127.9 1.5e-29
CCDS903.1 HSD3B1 gene_id:3283|Hs108|chr1           ( 373)  324  83.9 2.5e-16
CCDS902.1 HSD3B2 gene_id:3284|Hs108|chr1           ( 372)  303  79.0 7.3e-15
CCDS10698.1 HSD3B7 gene_id:80270|Hs108|chr16       ( 369)  286  75.1 1.1e-13

>>CCDS14717.1 NSDHL gene_id:50814|Hs108|chrX            (373 aa)
 initn: 2485 init1: 2485 opt: 2485  Z-score: 3108.4  bits: 583.8 E(32554): 8.1e-167
Smith-Waterman score: 2485; 100.0% identity (100.0% similar) in 373 aa overlap (1-373:1-373)

               10        20        30        40        50        60
pF1KE0 MEPAVSEPMRDQVARTHLTEDTPKVNADIEKVNQNQAKRCTVIGGSGFLGQHMVEQLLAR
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS14 MEPAVSEPMRDQVARTHLTEDTPKVNADIEKVNQNQAKRCTVIGGSGFLGQHMVEQLLAR
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE0 GYAVNVFDIQQGFDNPQVRFFLGDLCSRQDLYPALKGVNTVFHCASPPPSSNNKELFYRV
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS14 GYAVNVFDIQQGFDNPQVRFFLGDLCSRQDLYPALKGVNTVFHCASPPPSSNNKELFYRV
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE0 NYIGTKNVIETCKEAGVQKLILTSSASVIFEGVDIKNGTEDLPYAMKPIDYYTETKILQE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS14 NYIGTKNVIETCKEAGVQKLILTSSASVIFEGVDIKNGTEDLPYAMKPIDYYTETKILQE
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KE0 RAVLGANDPEKNFLTTAIRPHGIFGPRDPQLVPILIEAARNGKMKFVIGNGKNLVDFTFV :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS14 RAVLGANDPEKNFLTTAIRPHGIFGPRDPQLVPILIEAARNGKMKFVIGNGKNLVDFTFV 190 200 210 220 230 240 250 260 270 280 290 300 pF1KE0 ENVVHGHILAAEQLSRDSTLGGKAFHITNDEPIPFWTFLSRILTGLNYEAPKYHIPYWVA :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS14 ENVVHGHILAAEQLSRDSTLGGKAFHITNDEPIPFWTFLSRILTGLNYEAPKYHIPYWVA 250 260 270 280 290 300 310 320 330 340 350 360 pF1KE0 YYLALLLSLLVMVISPVIQLQPTFTPMRVALAGTFHYYSCERAKKAMGYQPLVTMDDAME :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS14 YYLALLLSLLVMVISPVIQLQPTFTPMRVALAGTFHYYSCERAKKAMGYQPLVTMDDAME 310 320 330 340 350 360 370 pF1KE0 RTVQSFRHLRRVK ::::::::::::: CCDS14 RTVQSFRHLRRVK 370 >>CCDS42205.1 SDR42E1 gene_id:93517|Hs108|chr16 (393 aa) initn: 503 init1: 171 opt: 514 Z-score: 643.7 bits: 127.9 E(32554): 1.5e-29 Smith-Waterman score: 604; 33.8% identity (65.3% similar) in 346 aa overlap (34-360:5-348) 10 20 30 40 50 60 pF1KE0 AVSEPMRDQVARTHLTEDTPKVNADIEKVNQNQAKRCTVIGGSGFLGQHMVEQLLARGYA ..: . . ::::..: .. : : CCDS42 MDPKRSQKESVLITGGSGYFGFRLGCALNQNGVH 10 20 30 70 80 90 100 110 pF1KE0 VNVFDIQQGFDN-PQ-VRFFLGDLCSRQDLYPALKG--VNTVFHCASPPPSSN---NKEL : .:::.. .. :. ..:. ::. .:. :.. :. ::: :: :. :..: CCDS42 VILFDISSPAQTIPEGIKFIQGDIRHLSDVEKAFQDADVTCVFHIASYGMSGREQLNRNL 40 50 60 70 80 90 120 130 140 150 160 170 pF1KE0 FYRVNYIGTKNVIETCKEAGVQKLILTSSASVIFEGVDIKNGTEDLPYA---MKPIDYYT . .:: :: :....:.. : .:. ::. .::: : :.:: :.::: ..: :.:. CCDS42 IKEVNVRGTDNILQVCQRRRVPRLVYTSTFNVIFGGQVIRNGDESLPYLPLHLHP-DHYS 100 110 120 130 140 150 180 190 200 210 220 pF1KE0 ETKILQERAVLGAN----DPEKNFLTT-AIRPHGIFGPRDPQLVPILIEAARNGKMKFVI .:: . :. :: :: : . : : :.:: ::.:: . . .: .. ..: .::: CCDS42 RTKSIAEQKVLEANATPLDRGDGVLRTCALRPAGIYGPGEQRHLPRIVSYIEKGLFKFVY 160 170 180 190 200 210 230 240 250 260 270 280 pF1KE0 GNGKNLVDFTFVENVVHGHILAAEQLSRDS--TLGGKAFHITNDEPIPFWTFLSRILTGL :. ..::.:. :.:.:..::::.: : :. .:. . :.. .:. . :. .. 
:: CCDS42 GDPRSLVEFVHVDNLVQAHILASEALRADKGHIASGQPYFISDGRPVNNFEFFRPLVEGL 220 230 240 250 260 270 290 300 310 320 330 340 pF1KE0 NYEAPKYHIPYWVAYYLALLLSLLVMVISPVIQLQPTFTPMRVALAGTFHYYSCERAKKA .: :. ..: ..: .:.: .. .... . ..:: .: .: .:. ::.: :.::: CCDS42 GYTFPSTRLPLTLVYCFAFLTEMVHFILGRLYNFQPFLTRTEVYKTGVTHYFSLEKAKKE 280 290 300 310 320 330 350 360 370 pF1KE0 MGY--QPLVTMDDAMERTVQSFRHLRRVK .:: ::. ...:.: CCDS42 LGYKAQPF-DLQEAVEWFKAHGHGRSSGSRDSECFVWDGLLVFLLIIAVLMWLPSSVILS 340 350 360 370 380 390 >>CCDS903.1 HSD3B1 gene_id:3283|Hs108|chr1 (373 aa) initn: 206 init1: 70 opt: 324 Z-score: 406.5 bits: 83.9 E(32554): 2.5e-16 Smith-Waterman score: 378; 27.2% identity (58.9% similar) in 353 aa overlap (40-364:6-355) 10 20 30 40 50 60 pF1KE0 RDQVARTHLTEDTPKVNADIEKVNQNQAKRCTVIGGSGFLGQHMVEQLLARGY--AVNVF : : :..:::::.... :. . . :. CCDS90 MTGWSCLVTGAGGFLGQRIIRLLVKEKELKEIRVL 10 20 30 70 80 90 100 110 pF1KE0 D------IQQGFDNPQVRFFL----GDLCSRQDLYPALKGVNTVFH--CASPPPSSNNKE : ... :.. : . : ::. .. : : . :....: : . ...: CCDS90 DKAFGPELREEFSKLQNKTKLTVLEGDILDEPFLKRACQDVSVIIHTACIIDVFGVTHRE 40 50 60 70 80 90 120 130 140 150 160 170 pF1KE0 LFYRVNYIGTKNVIETCKEAGVQKLILTSSASVI----FEGVDIKNGTEDLPYAMKPIDY .. :: ::. ..:.: .:.: .: ::: : .. . :.:: :. : CCDS90 SIMNVNVKGTQLLLEACVQASVPVFIYTSSIEVAGPNSYKEI-IQNGHEEEPLENTWPAP 100 110 120 130 140 150 180 190 200 210 220 pF1KE0 YTETKILQERAVLGANDPE-KN---FLTTAIRPHGIFGPRDPQLVPILIEAARNGKMKFV : ..: : :.:::.:: . :: . : :.:: :.: . : . :: :. . CCDS90 YPHSKKLAEKAVLAANGWNLKNGGTLYTCALRPMYIYGEGSRFLSASINEALNNNGILSS 160 170 180 190 200 210 230 240 250 260 270 280 pF1KE0 IGNGKNLVDFTFVENVVHGHILAAEQLS---RDSTLGGKAFHITNDEPIPFWTFLSRILT .:. .. :. ..: ::. .:::: . :. . .. :. ..:..: : . :. :. CCDS90 VGKFST-VNPVYVGNVAWAHILALRALQDPKKAPSIRGQFYYISDDTPHQSYDNLNYTLS 220 230 240 250 260 270 290 300 310 320 330 340 pF1KE0 ---GLNYEAPKYHIPYWVAYYLALLLSLLVMVISPVIQLQPTFTPMRVALAGTFHYYSCE :: .. .. .: . :....:: .. ... :. .: :. :.:... .: . 
CCDS90 KEFGLRLDS-RWSFPLSLMYWIGFLLEIVSFLLRPIYTYRPPFNRHIVTLSNSVFTFSYK 280 290 300 310 320 330 350 360 370 pF1KE0 RAKKAMGYQPLVTMDDAMERTVQSFRHLRRVK .:.. ..:.:: . ..: ..::. CCDS90 KAQRDLAYKPLYSWEEAKQKTVEWVGSLVDRHKETLKSKTQ 340 350 360 370 >>CCDS902.1 HSD3B2 gene_id:3284|Hs108|chr1 (372 aa) initn: 285 init1: 127 opt: 303 Z-score: 380.3 bits: 79.0 E(32554): 7.3e-15 Smith-Waterman score: 407; 29.0% identity (58.9% similar) in 355 aa overlap (40-364:5-354) 10 20 30 40 50 60 pF1KE0 RDQVARTHLTEDTPKVNADIEKVNQNQAKRCTVIGGSGFLGQHMVEQLLARGYAVNVFDI : : :..:.:::..:. :. . .. . CCDS90 MGWSCLVTGAGGLLGQRIVRLLVEEKELKEIRAL 10 20 30 70 80 90 100 110 pF1KE0 QQGFDNPQVRFFLGDLCSRQDLY---------PALK----GVNTVFH--CASPPPSSNNK ...: :..: .. : .: : : :: :..:.: : . ... CCDS90 DKAF-RPELREEFSKLQNRTKLTVLEGDILDEPFLKRACQDVSVVIHTACIIDVFGVTHR 40 50 60 70 80 90 120 130 140 150 160 170 pF1KE0 ELFYRVNYIGTKNVIETCKEAGVQKLILTSSASVI----FEGVDIKNGTEDLPYAMKPID : .. :: ::. ..:.: .:.: .: ::: : .. . :.:: :. : CCDS90 ESIMNVNVKGTQLLLEACVQASVPVFIYTSSIEVAGPNSYKEI-IQNGHEEEPLENTWPT 100 110 120 130 140 150 180 190 200 210 220 pF1KE0 YYTETKILQERAVLGANDPE-KN---FLTTAIRPHGIFGPRDPQLVPILIEAARNGKMKF : .: : :.:::.:: . :: . : :.:: :.: : : . :: :. . CCDS90 PYPYSKKLAEKAVLAANGWNLKNGDTLYTCALRPTYIYGEGGPFLSASINEALNNNGILS 160 170 180 190 200 210 230 240 250 260 270 280 pF1KE0 VIGNGKNLVDFTFVENVVHGHILAAEQLSRDS----TLGGKAFHITNDEPIPFWTFLSRI .:. .. :. ..: ::. .:::: . : :: .. :. ..:..: : . :. : CCDS90 SVGKFST-VNPVYVGNVAWAHILALRAL-RDPKKAPSVRGQFYYISDDTPHQSYDNLNYI 220 230 240 250 260 270 290 300 310 320 330 pF1KE0 LT---GLNYEAPKYHIPYWVAYYLALLLSLLVMVISPVIQLQPTFTPMRVALAGTFHYYS :. :: .. .. .: . :....:: .. ...::. . :: :. :.:... .: CCDS90 LSKEFGLRLDS-RWSLPLTLMYWIGFLLEVVSFLLSPIYSYQPPFNRHTVTLSNSVFTFS 280 290 300 310 320 340 350 360 370 pF1KE0 CERAKKAMGYQPLVTMDDAMERTVQSFRHLRRVK ..:.. ..:.:: . ..: ..::. 
CCDS90 YKKAQRDLAYKPLYSWEEAKQKTVEWVGSLVDRHKETLKSKTQ 330 340 350 360 370 >>CCDS10698.1 HSD3B7 gene_id:80270|Hs108|chr16 (369 aa) initn: 427 init1: 174 opt: 286 Z-score: 359.1 bits: 75.1 E(32554): 1.1e-13 Smith-Waterman score: 446; 30.0% identity (55.8% similar) in 353 aa overlap (34-363:6-358) 10 20 30 40 50 60 pF1KE0 AVSEPMRDQVARTHLTEDTPKVNADIEKVNQNQAKRCTVIGGSGFLGQHMVEQLLARGYA : : : :: ::::.:.:..:: : CCDS10 MADSAQAQKLVYLVTGGCGFLGEHVVRMLLQREPR 10 20 30 70 80 90 100 110 pF1KE0 VN---VFDIQQG-----FDNPQVRF--FLGDLCSRQDLYPALKGVNTVFHCASPPP--SS .. ::: . : . . :: . ::. . ... :. :...:.: :. . CCDS10 LGELRVFDQHLGPWLEELKTGPVRVTAIQGDVTQAHEVAAAVAGAHVVIHTAGLVDVFGR 40 50 60 70 80 90 120 130 140 150 160 pF1KE0 NNKELFYRVNYIGTKNVIETCKEAGVQKLILTSSASVI---FEGVDIKNGTEDLPYAMKP . . ...:: ::.::::.: ..:.. :. ::: :. .: . :.:: :: CCDS10 ASPKTIHEVNVQGTRNVIEACVQTGTRFLVYTSSMEVVGPNTKGHPFYRGNEDTPYEAVH 100 110 120 130 140 150 170 180 190 200 210 220 pF1KE0 IDYYTETKILQERAVLGANDPEKN----FLTTAIRPHGIFGPRDPQLVPILIEAARNGKM : .: : : :: :: . ..: :.:: ::.: . . .. : : CCDS10 RHPYPCSKALAEWLVLEANGRKVRGGLPLVTCALRPTGIYGEGHQIMRDFYRQGLRLGGW 160 170 180 190 200 210 230 240 250 260 270 280 pF1KE0 KFVIGNGKNLVDFTFVENVVHGHILAAEQLSRDSTL-GGKAFHITNDEPI-PFWTFLSRI : .. ..: ::. :.:::..: . .:: ::... . : . : .. CCDS10 LFRAIPASVEHGRVYVGNVAWMHVLAARELEQRATLMGGQVYFCYDGSPYRSYEDFNMEF 220 230 240 250 260 270 290 300 310 320 330 340 pF1KE0 L--TGLNYEAPKYHIPYWVAYYLALLLSLLVMVISPVIQLQPTFTPMRVALAGTFHYYSC : :: . . .:::. .:: : .:: .. :.. : ..:. .:.:.: : CCDS10 LGPCGLRLVGARPLLPYWLLVFLAALNALLQWLLRPLVLYAPLLNPYTLAVANTTFTVST 280 290 300 310 320 330 350 360 370 pF1KE0 ERAKKAMGYQPLVTMDDAMERTVQSFRHLRRVK ..:.. .::.:: . .:. ::. CCDS10 DKAQRHFGYEPLFSWEDSRTRTILWVQAATGSAQ 340 350 360 373 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Thu Nov 3 10:36:40 2016 done: Thu Nov 3 10:36:40 2016 Total Scan time: 2.410 Total Display time: 0.010 Function used was FASTA [36.3.4 Apr, 2011]