FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE0755, 329 aa
  1>>>pF1KE0755 329 - 329 aa - 329 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.2831+/-0.000692; mu= 16.9533+/- 0.042
 mean_var=74.6678+/-14.768, 0's: 0 Z-trim(111.0): 11 B-trim: 38 in 1/51
 Lambda= 0.148425
 statistics sampled from 12042 (12051) to 12042 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.733), E-opt: 0.2 (0.37), width: 16
 Scan time: 2.890

The best scores are:                                      opt  bits E(32554)
CCDS47350.1 DOK3 gene_id:79930|Hs108|chr5         ( 330) 2241 488.7 2.8e-138
CCDS78098.1 DOK3 gene_id:79930|Hs108|chr5         ( 440) 1487 327.3 1.4e-89
CCDS4426.1 DOK3 gene_id:79930|Hs108|chr5          ( 496) 1487 327.3 1.5e-89
CCDS47349.1 DOK3 gene_id:79930|Hs108|chr5         ( 228) 1364 300.8 7.1e-82
CCDS1954.1 DOK1 gene_id:1796|Hs108|chr2           ( 481)  504 116.8 3.5e-26
CCDS6016.1 DOK2 gene_id:9046|Hs108|chr8           ( 412)  485 112.7 5.1e-25
CCDS82474.1 DOK1 gene_id:1796|Hs108|chr2          ( 177)  358  85.3 4.1e-17

>>CCDS47350.1 DOK3 gene_id:79930|Hs108|chr5 (330 aa)
 initn: 1957 init1: 1957 opt: 2241 Z-score: 2596.1 bits: 488.7 E(32554): 2.8e-138
Smith-Waterman score: 2241; 99.4% identity (99.4% similar) in 330 aa overlap (1-329:1-330)

               10        20        30        40        50        60
pF1KE0 MDPLETPIKDGILYQQHVKFGKKCWRKVWALLYAGGPSGVARLESWEVRDGGLGAAGDRS
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS47 MDPLETPIKDGILYQQHVKFGKKCWRKVWALLYAGGPSGVARLESWEVRDGGLGAAGDRS
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE0 AGPGRRGERRVIRLADCVSVLPADGESCPRDTGAFLLTTTERSHLLAAQHRQAWMGPICQ
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS47 AGPGRRGERRVIRLADCVSVLPADGESCPRDTGAFLLTTTERSHLLAAQHRQAWMGPICQ
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE0 LAFPGTGEASSGSTDAQSPKRGLVPMEENSIYSSWQEVGEFPVVVQRTEAATRCQLKGPA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS47 LAFPGTGEASSGSTDAQSPKRGLVPMEENSIYSSWQEVGEFPVVVQRTEAATRCQLKGPA
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KE0 LLVLGPDAIQLREAKGTQALYSWPYHFLRKFGSDKILLGTPGVSLLICKGERTDDVSGII
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS47 LLVLGPDAIQLREAKGTQALYSWPYHFLRKFGSDKILLGTPGVSLLICKGERTDDVSGII
              190       200       210       220       230       240

              250       260       270       280        290
pF1KE0 LDESLLRAYSVPGAGGHSRVQDSLGPVLREPTFQGERSFLKTSMLRSL-CSCSWRHPRSQ
       :::::::::::::::::::::::::::::::::::::::::::::::: :::::::::::
CCDS47 LDESLLRAYSVPGAGGHSRVQDSLGPVLREPTFQGERSFLKTSMLRSLLCSCSWRHPRSQ
              250       260       270       280       290       300

     300       310       320
pF1KE0 PCTQASCLQGSDCPAPHRNSTSAAHTLGTS
       : ::::::::::::::::::::::::::::
CCDS47 PRTQASCLQGSDCPAPHRNSTSAAHTLGTS
              310       320       330

>>CCDS78098.1 DOK3 gene_id:79930|Hs108|chr5 (440 aa)
 initn: 1534 init1: 1487 opt: 1487 Z-score: 1721.8 bits: 327.3 E(32554): 1.4e-89
Smith-Waterman score: 1487; 100.0% identity (100.0% similar) in 215 aa overlap (1-215:1-215)

               10        20        30        40        50        60
pF1KE0 MDPLETPIKDGILYQQHVKFGKKCWRKVWALLYAGGPSGVARLESWEVRDGGLGAAGDRS
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS78 MDPLETPIKDGILYQQHVKFGKKCWRKVWALLYAGGPSGVARLESWEVRDGGLGAAGDRS
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KE0 AGPGRRGERRVIRLADCVSVLPADGESCPRDTGAFLLTTTERSHLLAAQHRQAWMGPICQ
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS78 AGPGRRGERRVIRLADCVSVLPADGESCPRDTGAFLLTTTERSHLLAAQHRQAWMGPICQ
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KE0 LAFPGTGEASSGSTDAQSPKRGLVPMEENSIYSSWQEVGEFPVVVQRTEAATRCQLKGPA
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS78 LAFPGTGEASSGSTDAQSPKRGLVPMEENSIYSSWQEVGEFPVVVQRTEAATRCQLKGPA
              130       140       150       160       170       180

              190       200       210       220       230       240
pF1KE0 LLVLGPDAIQLREAKGTQALYSWPYHFLRKFGSDKILLGTPGVSLLICKGERTDDVSGII
       :::::::::::::::::::::::::::::::::::
CCDS78 LLVLGPDAIQLREAKGTQALYSWPYHFLRKFGSDKGVFSFEAGRRCHSGEGLFAFSTPCA
              190       200       210       220       230       240

              250       260       270       280       290       300
pF1KE0 LDESLLRAYSVPGAGGHSRVQDSLGPVLREPTFQGERSFLKTSMLRSLCSCSWRHPRSQP
       
CCDS78 PDLCRAVAGAIARQRERLPELTRPQPCPLPRATSLPSLDTPGELREMPPGPEPPTSRKMH
              250       260       270       280       290       300

>>CCDS4426.1 DOK3 gene_id:79930|Hs108|chr5 (496 aa)
 initn: 1534 init1: 1487 opt: 1487 Z-score: 1721.0 bits: 327.3 E(32554): 1.5e-89
Smith-Waterman score: 1487; 100.0% identity (100.0% similar) in 215 aa overlap (1-215:57-271)

                                             10        20        30
pF1KE0                               MDPLETPIKDGILYQQHVKFGKKCWRKVWA
                                     ::::::::::::::::::::::::::::::
CCDS44 GQKGKCEEFPSSLSSVSPGLEAAALLLAVTMDPLETPIKDGILYQQHVKFGKKCWRKVWA
         30        40        50        60        70        80

               40        50        60        70        80        90
pF1KE0 LLYAGGPSGVARLESWEVRDGGLGAAGDRSAGPGRRGERRVIRLADCVSVLPADGESCPR
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 LLYAGGPSGVARLESWEVRDGGLGAAGDRSAGPGRRGERRVIRLADCVSVLPADGESCPR
         90       100       110       120       130       140

              100       110       120       130       140       150
pF1KE0 DTGAFLLTTTERSHLLAAQHRQAWMGPICQLAFPGTGEASSGSTDAQSPKRGLVPMEENS
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 DTGAFLLTTTERSHLLAAQHRQAWMGPICQLAFPGTGEASSGSTDAQSPKRGLVPMEENS
        150       160       170       180       190       200

              160       170       180       190       200       210
pF1KE0 IYSSWQEVGEFPVVVQRTEAATRCQLKGPALLVLGPDAIQLREAKGTQALYSWPYHFLRK
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS44 IYSSWQEVGEFPVVVQRTEAATRCQLKGPALLVLGPDAIQLREAKGTQALYSWPYHFLRK
        210       220       230       240       250       260

              220       230       240       250       260       270
pF1KE0 FGSDKILLGTPGVSLLICKGERTDDVSGIILDESLLRAYSVPGAGGHSRVQDSLGPVLRE
       :::::
CCDS44 FGSDKGVFSFEAGRRCHSGEGLFAFSTPCAPDLCRAVAGAIARQRERLPELTRPQPCPLP
        270       280       290       300       310       320

>>CCDS47349.1 DOK3 gene_id:79930|Hs108|chr5 (228 aa)
 initn: 1219 init1: 1080 opt: 1364 Z-score: 1583.4 bits: 300.8 E(32554): 7.1e-82
Smith-Waterman score: 1364; 99.0% identity (99.0% similar) in 206 aa overlap (125-329:23-228)

          100       110       120       130       140       150
pF1KE0 FLLTTTERSHLLAAQHRQAWMGPICQLAFPGTGEASSGSTDAQSPKRGLVPMEENSIYSS
                                     ::::::::::::::::::::::::::::::
CCDS47         MDPLETPIKDGILYQQHVKFGKGTGEASSGSTDAQSPKRGLVPMEENSIYSS
                       10        20        30        40        50

          160       170       180       190       200       210
pF1KE0 WQEVGEFPVVVQRTEAATRCQLKGPALLVLGPDAIQLREAKGTQALYSWPYHFLRKFGSD
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS47 WQEVGEFPVVVQRTEAATRCQLKGPALLVLGPDAIQLREAKGTQALYSWPYHFLRKFGSD 60 70 80 90 100 110 220 230 240 250 260 270 pF1KE0 KILLGTPGVSLLICKGERTDDVSGIILDESLLRAYSVPGAGGHSRVQDSLGPVLREPTFQ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS47 KILLGTPGVSLLICKGERTDDVSGIILDESLLRAYSVPGAGGHSRVQDSLGPVLREPTFQ 120 130 140 150 160 170 280 290 300 310 320 pF1KE0 GERSFLKTSMLRSL-CSCSWRHPRSQPCTQASCLQGSDCPAPHRNSTSAAHTLGTS :::::::::::::: :::::::::::: :::::::::::::::::::::::::::: CCDS47 GERSFLKTSMLRSLLCSCSWRHPRSQPRTQASCLQGSDCPAPHRNSTSAAHTLGTS 180 190 200 210 220 >>CCDS1954.1 DOK1 gene_id:1796|Hs108|chr2 (481 aa) initn: 410 init1: 147 opt: 504 Z-score: 583.6 bits: 116.8 E(32554): 3.5e-26 Smith-Waterman score: 504; 35.9% identity (61.3% similar) in 287 aa overlap (4-276:1-273) 10 20 30 40 50 60 pF1KE0 MDPLETPIKDGILYQQHVKFGKKCWRKVWALLYAGGPSGVARLESWEVRDGGLGAAGDRS .. . .: :. : .:: : :::.::.:: ..: :::::: .. . : ...: : CCDS19 MDGAVMEGPLFLQSQRFGTKRWRKTWAVLYPASPHGVARLEFFDHK--GSSSGGGR- 10 20 30 40 50 70 80 90 100 110 pF1KE0 AGPGRRGERRVIRLADCVSVLPADGESCPRDTG-AFLLTTTERSHLLAAQ--HRQAWMGP : .:: . .:::::.:::: :. :. :. . :: : :..:::::::. ::. CCDS19 -GSSRRLDCKVIRLAECVSVAPVTVETPPEPGATAFRLDTAQRSHLLAADAPSSAAWVQT 60 70 80 90 100 110 120 130 140 150 160 170 pF1KE0 ICQLAFPGTGEASSGSTDAQSPKRGLVPMEENSIYSSWQEVGEFPVVVQRTEAATRCQLK .:. ::: : . . :: . :: . . : :::.:: : ..: :.::::::: :: :. CCDS19 LCRNAFP-KGSWTLAPTD-NPPKLSALEMLENSLYSPTWEGSQFWVTVQRTEAAERCGLH 120 130 140 150 160 170 180 190 200 210 220 pF1KE0 GPALLVLGPDAIQL----REAKGTQALYSWPYHFLRKFGSDKILLG------TP-GVSLL : .: . . . : ... . : :::: .::..: ::.... : : . . CCDS19 GSYVLRVEAERLTLLTVGAQSQILEPLLSWPYTLLRRYGRDKVMFSFEAGRRCPSGPGTF 180 190 200 210 220 230 230 240 250 260 270 280 pF1KE0 ICKGERTDDVSGIILDESLLRAYSVPGAGGHSRVQDSLGPVLREPTFQGERSFLKTSMLR . . .:. . :. .. .. : .:... ::: . 
.:: CCDS19 TFQTAQGNDIFQAV--ETAIHRQKAQGKAGQGH------DVLRADSHEGEVAEGKLPSPP 240 250 260 270 280 290 300 310 320 pF1KE0 SLCSCSWRHPRSQPCTQASCLQGSDCPAPHRNSTSAAHTLGTS CCDS19 GPQELLDSPPALYAEPLDSLRIAPCPSQDSLYSDPLDSTSAQAGEGVQRKKPLYWDLYEH 290 300 310 320 330 340 >>CCDS6016.1 DOK2 gene_id:9046|Hs108|chr8 (412 aa) initn: 422 init1: 102 opt: 485 Z-score: 562.6 bits: 112.7 E(32554): 5.1e-25 Smith-Waterman score: 485; 46.3% identity (61.9% similar) in 218 aa overlap (8-216:6-207) 10 20 30 40 50 pF1KE0 MDPLETPIKDGILY-QQHVKFGKKCWRKVWALLYAGGPSGVARLESWEVRDGGLGAAGDR .:.:.:: ::. :::: ::. : ::.:. ..:::: : : : CCDS60 MGDGAVKQGFLYLQQQQTFGKK-WRRFGASLYGGSDCALARLELQE------GPEKPR 10 20 30 40 50 60 70 80 90 100 110 pF1KE0 SAGPGRRGERRVIRLADCVSVLPADGE-SCPRDTGAFLLTTTERSHLLAAQ--HRQAWMG .. :.::::.::. : : :: : ::::.::.: : :: .:::: .: :. CCDS60 RC----EAARKVIRLSDCLRVAEAGGEASSPRDTSAFFLETKERLYLLAAPAAERGDWVQ 60 70 80 90 100 120 130 140 150 160 170 pF1KE0 PICQLAFPGTGEASSGSTDAQSPKRGLVPMEENSIYSSWQEVG---EFPVVVQRTEAATR :: ::::: . :: :: : :::: .::: :: :: :... :::. : CCDS60 AICLLAFPGQRKELSGPEGKQS--RPC--MEENELYSSAVTVGPHKEFAVTMRPTEASER 110 120 130 140 150 160 180 190 200 210 220 230 pF1KE0 CQLKGPALLVLGPDAIQLR--EAKGTQALYSWPYHFLRKFGSDKILLGTPGVSLLICKGE :.:.: : : .:..: ::: ::.:::.:::.:: ::. CCDS60 CHLRGSYTLRAGESALELWGGPEPGTQ-LYDWPYRFLRRFGRDKVTFSFEAGRRCVSGEG 170 180 190 200 210 220 240 250 260 270 280 290 pF1KE0 RTDDVSGIILDESLLRAYSVPGAGGHSRVQDSLGPVLREPTFQGERSFLKTSMLRSLCSC CCDS60 NFEFETRQGNEIFLALEEAISAQKNAAPATPQPQPATIPASLPRPDSPYSRPHDSLPPPS 230 240 250 260 270 280 >>CCDS82474.1 DOK1 gene_id:1796|Hs108|chr2 (177 aa) initn: 258 init1: 147 opt: 358 Z-score: 420.8 bits: 85.3 E(32554): 4.1e-17 Smith-Waterman score: 358; 42.7% identity (66.9% similar) in 157 aa overlap (4-156:1-151) 10 20 30 40 50 60 pF1KE0 MDPLETPIKDGILYQQHVKFGKKCWRKVWALLYAGGPSGVARLESWEVRDGGLGAAGDRS .. . .: :. : .:: : :::.::.:: ..: :::::: .. . 
: ...: : CCDS82 MDGAVMEGPLFLQSQRFGTKRWRKTWAVLYPASPHGVARLEFFDHK--GSSSGGGR- 10 20 30 40 50 70 80 90 100 110 pF1KE0 AGPGRRGERRVIRLADCVSVLPADGESCPRDTG-AFLLTTTERSHLLAAQ--HRQAWMGP : .:: . .:::::.:::: :. :. :. . :: : :..:::::::. ::. CCDS82 -GSSRRLDCKVIRLAECVSVAPVTVETPPEPGATAFRLDTAQRSHLLAADAPSSAAWVQT 60 70 80 90 100 110 120 130 140 150 160 170 pF1KE0 ICQLAFPGTGEASSGSTDAQSPKRGLVPMEENSIYS-SWQEVGEFPVVVQRTEAATRCQL .:. ::: : . . :: . :: . . : :::.:: .:. CCDS82 LCRNAFP-KGSWTLAPTD-NPPKLSALEMLENSLYSPTWEGHVLFRGRPPLPLRPWNLHL 120 130 140 150 160 170 180 190 200 210 220 230 pF1KE0 KGPALLVLGPDAIQLREAKGTQALYSWPYHFLRKFGSDKILLGTPGVSLLICKGERTDDV CCDS82 PDGTGK 329 residues in 1 query sequences 18511270 residues in 32554 library sequences Tcomplib [36.3.4 Apr, 2011] (8 proc) start: Sat Nov 5 18:35:02 2016 done: Sat Nov 5 18:35:03 2016 Total Scan time: 2.890 Total Display time: 0.000 Function used was FASTA [36.3.4 Apr, 2011]