FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KE0502, 211 aa
  1>>>pF1KE0502 211 - 211 aa - 211 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.2150+/-0.000662; mu= 13.5673+/- 0.040
 mean_var=57.3171+/-11.571, 0's: 0 Z-trim(109.1): 17 B-trim: 0 in 0/51
 Lambda= 0.169407
 statistics sampled from 10663 (10680) to 10663 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.718), E-opt: 0.2 (0.328), width: 16
 Scan time: 1.900

The best scores are:                                opt  bits E(32554)
CCDS13830.1 CRYBB3 gene_id:1417|Hs108|chr22 ( 211) 1479 369.2 1e-102
CCDS13840.1 CRYBB1 gene_id:1414|Hs108|chr22 ( 252)  855 216.7 9.8e-57
CCDS13831.1 CRYBB2 gene_id:1415|Hs108|chr22 ( 205)  840 213.0 1e-55
CCDS11249.1 CRYBA1 gene_id:1411|Hs108|chr17 ( 215)  580 149.5 1.5e-36
CCDS13841.1 CRYBA4 gene_id:1413|Hs108|chr22 ( 196)  508 131.9 2.7e-31
CCDS2429.1 CRYBA2 gene_id:1412|Hs108|chr2   ( 197)  494 128.5 2.9e-30
CCDS3275.1 CRYGS gene_id:1427|Hs108|chr3    ( 178)  416 109.4 1.4e-24
CCDS2379.1 CRYGC gene_id:1420|Hs108|chr2    ( 174)  400 105.5 2.1e-23
CCDS2380.1 CRYGB gene_id:1419|Hs108|chr2    ( 175)  386 102.1 2.3e-22
CCDS33367.1 CRYGA gene_id:1418|Hs108|chr2   ( 174)  382 101.1 4.5e-22
CCDS2378.1 CRYGD gene_id:1421|Hs108|chr2    ( 174)  361  95.9 1.6e-20
CCDS5926.1 CRYGN gene_id:155051|Hs108|chr7  ( 182)  305  82.3 2.2e-16
CCDS78289.1 CRYGN gene_id:155051|Hs108|chr7 ( 125)  251  69.0 1.5e-12

>>CCDS13830.1 CRYBB3 gene_id:1417|Hs108|chr22 (211 aa)
 initn: 1479 init1: 1479 opt: 1479 Z-score: 1957.5 bits: 369.2 E(32554): 1e-102
Smith-Waterman score: 1479; 99.5% identity (99.5% similar) in 211 aa overlap (1-211:1-211)

               10        20        30        40        50        60
pF1KE0 MAEQHGAPEQAAAGKSHGDLGGSYKVILYELENFQGKRCELSAECPSLTDSLLEKVGSIQ
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS13
MAEQHGAPEQAAAGKSHGDLGGSYKVILYELENFQGKRCELSAECPSLTDSLLEKVGSIQ 10 20 30 40 50 60 70 80 90 100 110 120 pF1KE0 VESGPWLAFESRAFRGEQFVLEKGDYPRWDAWSNSRDSDSLLSLRPLNIDSPDHKLHLFE :::::::::::::::::::::::::::::::::::::::::::::::::::: ::::::: CCDS13 VESGPWLAFESRAFRGEQFVLEKGDYPRWDAWSNSRDSDSLLSLRPLNIDSPHHKLHLFE 70 80 90 100 110 120 130 140 150 160 170 180 pF1KE0 NPAFSGRKMEIVDDDVPSLWAHGFQDRVASVRAINGTWVGYEFPGYRGRQYVFERGEYRH :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS13 NPAFSGRKMEIVDDDVPSLWAHGFQDRVASVRAINGTWVGYEFPGYRGRQYVFERGEYRH 130 140 150 160 170 180 190 200 210 pF1KE0 WNEWDASQPQLQSVRRIRDQKWHKRGRFPSS ::::::::::::::::::::::::::::::: CCDS13 WNEWDASQPQLQSVRRIRDQKWHKRGRFPSS 190 200 210 >>CCDS13840.1 CRYBB1 gene_id:1414|Hs108|chr22 (252 aa) initn: 855 init1: 855 opt: 855 Z-score: 1132.0 bits: 216.7 E(32554): 9.8e-57 Smith-Waterman score: 855; 56.9% identity (85.6% similar) in 188 aa overlap (22-209:57-244) 10 20 30 40 50 pF1KE0 MAEQHGAPEQAAAGKSHGDLGGSYKVILYELENFQGKRCELSAECPSLTDS :.:.....:::::::.: :.:.:: .:.: CCDS13 PPAGTSPSPGTTLAPTTVPITSAKAAELPPGNYRLVVFELENFQGRRAEFSGECSNLADR 30 40 50 60 70 80 60 70 80 90 100 110 pF1KE0 LLEKVGSIQVESGPWLAFESRAFRGEQFVLEKGDYPRWDAWSNSRDSDSLLSLRPLNIDS ...: :: : .:::.:::. ::::.:.::::.::::..::.: :: :.:.::...:. CCDS13 GFDRVRSIIVSAGPWVAFEQSNFRGEMFILEKGEYPRWNTWSSSYRSDRLMSFRPIKMDA 90 100 110 120 130 140 120 130 140 150 160 170 pF1KE0 PDHKLHLFENPAFSGRKMEIVDDDVPSLWAHGFQDRVASVRAINGTWVGYEFPGYRGRQY .::. :::. :.: .:: ::.::::..::.:::.::.. 
.::::::..::::: :: CCDS13 QEHKISLFEGANFKGNTIEIQGDDAPSLWVYGFSDRVGSVKVSSGTWVGYQYPGYRGYQY 150 160 170 180 190 200 180 190 200 210 pF1KE0 VFERGEYRHWNEWDASQPQLQSVRRIRDQKWHKRGRFPSS ..: :..:::::: : :::.::.::.::..:: .: :: CCDS13 LLEPGDFRHWNEWGAFQPQMQSLRRLRDKQWHLEGSFPVLATEPPK 210 220 230 240 250 >>CCDS13831.1 CRYBB2 gene_id:1415|Hs108|chr22 (205 aa) initn: 937 init1: 807 opt: 840 Z-score: 1113.6 bits: 213.0 E(32554): 1e-55 Smith-Waterman score: 840; 53.8% identity (83.0% similar) in 212 aa overlap (1-211:1-205) 10 20 30 40 50 60 pF1KE0 MAEQHGAPEQAAAGKSHGDLGGSYKVILYELENFQGKRCELSAECPSLTDSLLEKVGSIQ :: .: :. ::: .. . :.:..: :::::. ::.. ::.: .. .::.::. CCDS13 MASDH----QTQAGKPQSL---NPKIIIFEQENFQGHSHELNGPCPNLKETGVEKAGSVL 10 20 30 40 50 70 80 90 100 110 120 pF1KE0 VESGPWLAFESRAFRGEQFVLEKGDYPRWDAWSNSRDSDSLLSLRPLNIDSPDHKLHLFE :..:::...:. .:::::.:::.:::::.:..:: .::: ::::...:: .::. :.: CCDS13 VQAGPWVGYEQANCKGEQFVFEKGEYPRWDSWTSSRRTDSLSSLRPIKVDSQEHKIILYE 60 70 80 90 100 110 130 140 150 160 170 180 pF1KE0 NPAFSGRKMEIVDDDVPSLWAHGFQDRVASVRAINGTWVGYEFPGYRGRQYVFERGEYRH :: :.:.::::.::::::. :::.:..:.:::. .::::::..::::: ::..:.:.:. CCDS13 NPNFTGKKMEIIDDDVPSFHAHGYQEKVSSVRVQSGTWVGYQYPGYRGLQYLLEKGDYKD 120 130 140 150 160 170 190 200 210 pF1KE0 WNEWDASQPQLQSVRRIRDQKWHKRGRF-PSS ... : .::.::::::::..::.:: : ::. CCDS13 SSDFGAPHPQVQSVRRIRDMQWHQRGAFHPSN 180 190 200 >>CCDS11249.1 CRYBA1 gene_id:1411|Hs108|chr17 (215 aa) initn: 572 init1: 307 opt: 580 Z-score: 769.9 bits: 149.5 E(32554): 1.5e-36 Smith-Waterman score: 598; 43.5% identity (74.0% similar) in 200 aa overlap (9-198:17-214) 10 20 30 40 50 pF1KE0 MAEQHGAPEQAAAGKSHGDLGGSYKVILYELENFQGKRCELSAECPSLTDSL ..: .. . :.:: .:. .:. ::::::: :... ::.... CCDS11 METQAEQQELETLPTTKMAQTNPTPGSLG-PWKITIYDQENFQGKRMEFTSSCPNVSERS 10 20 30 40 50 60 70 80 90 100 pF1KE0 LEKVGSIQVESGPWLAFESRAFRGEQFVLEKGDYPRWDAWS--NSRDSDSLLSLRPL-NI ...: :..:::: :...: .: :.::.::.:.:::::::: :. . :.:.::. . 
CCDS11 FDNVRSLKVESGAWIGYEHTSFCGQQFILERGEYPRWDAWSGSNAYHIERLMSFRPICSA 60 70 80 90 100 110 110 120 130 140 150 160 pF1KE0 DSPDHKLHLFENPAFSGRKMEIVDDDVPSLWAHG-FQDRVASVRAINGTWVGYEFPGYRG . . :. .::. : ::. :: .:: ::: : : :...:.:.. .:.:: :..::::: CCDS11 NHKESKMTIFEKENFIGRQWEI-SDDYPSLQAMGWFNNEVGSMKIQSGAWVCYQYPGYRG 120 130 140 150 160 170 170 180 190 200 210 pF1KE0 RQYVFE----RGEYRHWNEWD--ASQPQLQSVRRIRDQKWHKRGRFPSS ::..: :.:.:: :: :. :.::.:::. CCDS11 YQYILECDHHGGDYKHWREWGSHAQTSQIQSIRRIQQ 180 190 200 210 >>CCDS13841.1 CRYBA4 gene_id:1413|Hs108|chr22 (196 aa) initn: 447 init1: 250 opt: 508 Z-score: 675.4 bits: 131.9 E(32554): 2.7e-31 Smith-Waterman score: 526; 42.2% identity (72.2% similar) in 187 aa overlap (22-198:10-195) 10 20 30 40 50 60 pF1KE0 MAEQHGAPEQAAAGKSHGDLGGSYKVILYELENFQGKRCELSAECPSLTDSLLEKVGSIQ : .:..... ..:::.: :..:::::. . .: : :.. CCDS13 MTLQCTKSAGPWKMVVWDEDGFQGRRHEFTAECPSVLELGFETVRSLK 10 20 30 40 70 80 90 100 110 pF1KE0 VESGPWLAFESRAFRGEQFVLEKGDYPRWDAW--SNSRDSDSLLSLRPLN-IDSPDHKLH : :: :..:: .:.:.:..::.:.:: :::: ... .. : :.:: . : .: CCDS13 VLSGAWVGFEHAGFQGQQYILERGEYPSWDAWGGNTAYPAERLTSFRPAACANHRDSRLT 50 60 70 80 90 100 120 130 140 150 160 170 pF1KE0 LFENPAFSGRKMEIVDDDVPSLWAHGFQ-DRVASVRAINGTWVGYEFPGYRGRQYVFE-- .::. : :.: :. .:: ::: : :.. ..:.: .. .:.:: .:::::: :::.: CCDS13 IFEQENFLGKKGEL-SDDYPSLQAMGWEGNEVGSFHVHSGAWVCSQFPGYRGFQYVLECD 110 120 130 140 150 160 180 190 200 210 pF1KE0 --RGEYRHWNEWDASQP--QLQSVRRIRDQKWHKRGRFPSS :.:.:. :: . : :.::.:::. CCDS13 HHSGDYKHFREWGSHAPTFQVQSIRRIQQ 170 180 190 >>CCDS2429.1 CRYBA2 gene_id:1412|Hs108|chr2 (197 aa) initn: 469 init1: 211 opt: 494 Z-score: 656.9 bits: 128.5 E(32554): 2.9e-30 Smith-Waterman score: 521; 45.6% identity (75.8% similar) in 182 aa overlap (28-198:16-196) 10 20 30 40 50 pF1KE0 MAEQHGAPEQAAAGKSHGDLGGSYKVILYELENFQGKRCELSAECPSLTD-SLLEKVGSI :.. :.:::.::.: ..: .. . . : .: :. 
CCDS24 MSSAPAPGPAPASLTLWDEEDFQGRRCRLLSDCANVCERGGLPRVRSV 10 20 30 40 60 70 80 90 100 110 pF1KE0 QVESGPWLAFESRAFRGEQFVLEKGDYPRWDAWS--NSRDSDSLLSLRP-LNIDSPDHKL .::.: :.::: :.:.::.:::::::::.::: .:..:..:::.:: : . : .. CCDS24 KVENGVWVAFEYPDFQGQQFILEKGDYPRWSAWSGSSSHNSNQLLSFRPVLCANHNDSRV 50 60 70 80 90 100 120 130 140 150 160 170 pF1KE0 HLFENPAFSGRKMEIVDDDVPSLWAHGFQDR-VASVRAINGTWVGYEFPGYRGRQYVFER :::. :.: :...::: ::: . :. .. :.:... .:.::.:..::::: :::.:: CCDS24 TLFEGDNFQGCKFDLVDD-YPSLPSMGWASKDVGSLKVSSGAWVAYQYPGYRGYQYVLER 110 120 130 140 150 160 180 190 200 210 pF1KE0 ----GEYRHWNEW--DASQPQLQSVRRIRDQKWHKRGRFPSS ::. ..: .: ::::.::.. CCDS24 DRHSGEFCTYGELGTQAHTGQLQSIRRVQH 170 180 190 >>CCDS3275.1 CRYGS gene_id:1427|Hs108|chr3 (178 aa) initn: 354 init1: 203 opt: 416 Z-score: 554.6 bits: 109.4 E(32554): 1.4e-24 Smith-Waterman score: 416; 35.6% identity (70.7% similar) in 174 aa overlap (25-197:7-176) 10 20 30 40 50 60 pF1KE0 MAEQHGAPEQAAAGKSHGDLGGSYKVILYELENFQGKRCELSAECPSLTDSLLEKVGSIQ :. .:: .::::.: . . .: .. . : . .::. CCDS32 MSKTGTKITFYEDKNFQGRRYDCDCDCADF-HTYLSRCNSIK 10 20 30 40 70 80 90 100 110 pF1KE0 VESGPWLAFESRAFRGEQFVLEKGDYPRWDAWSNSRDSDSLLSLRPLNIDSP-DHKLHLF ::.: : ..: : : ...: .:.::... : . .: : : : ... : ..:...: CCDS32 VEGGTWAVYERPNFAGYMYILPQGEYPEYQRWMGL--NDRLSSCRAVHLPSGGQYKIQIF 50 60 70 80 90 120 130 140 150 160 170 pF1KE0 ENPAFSGRKMEIVDDDVPSLWAHGFQDRVASVRAINGTWVGYEFPGYRGRQYVFERGEYR :. :::. .: . .: ::. . . .. : ....:.:. ::.:.::::::.... ::: CCDS32 EKGDFSGQMYE-TTEDCPSIMEQFHMREIHSCKVLEGVWIFYELPNYRGRQYLLDKKEYR 100 110 120 130 140 150 180 190 200 210 pF1KE0 HWNEWDASQPQLQSVRRIRDQKWHKRGRFPSS . .: :..: .:: ::: CCDS32 KPIDWGAASPAVQSFRRIVE 160 170 >>CCDS2379.1 CRYGC gene_id:1420|Hs108|chr2 (174 aa) initn: 347 init1: 168 opt: 400 Z-score: 533.6 bits: 105.5 E(32554): 2.1e-23 Smith-Waterman score: 400; 34.9% identity (68.0% similar) in 175 aa overlap (25-199:3-172) 10 20 30 40 50 60 pF1KE0 MAEQHGAPEQAAAGKSHGDLGGSYKVILYELENFQGKRCELSAECPSLTDSLLEKVGSIQ :. .:: . :::. : ...::.: . . . .::. 
CCDS23 MGKITFYEDRAFQGRSYETTTDCPNL-QPYFSRCNSIR 10 20 30 70 80 90 100 110 120 pF1KE0 VESGPWLAFESRAFRGEQFVLEKGDYPRWDAWSNSRDSDSLLSLRPLNIDSPDHKLHLFE :::: :. .: ..:.:..:..:.:: .. : . :: : : ... :.:.:.: CCDS23 VESGCWMLYERPNYQGQQYLLRRGEYPDYQQWMGLSDSIRSCCLIPQTVS---HRLRLYE 40 50 60 70 80 90 130 140 150 160 170 180 pF1KE0 NPAFSGRKMEIVDDDVPSLWAHGFQDRVASVRAINGTWVGYEFPGYRGRQYVFERGEYRH .: ::. ..: ::. . ... :.....: :: ::.:.::::::... :::. CCDS23 REDHKGLMMEL-SEDCPSIQDRFHLSEIRSLHVLEGCWVLYELPNYRGRQYLLRPQEYRR 100 110 120 130 140 150 190 200 210 pF1KE0 WNEWDASQPQLQSVRRIRDQKWHKRGRFPSS ..: : . . :.::. : CCDS23 CQDWGAMDAKAGSLRRVVDLY 160 170 >>CCDS2380.1 CRYGB gene_id:1419|Hs108|chr2 (175 aa) initn: 360 init1: 185 opt: 386 Z-score: 515.0 bits: 102.1 E(32554): 2.3e-22 Smith-Waterman score: 386; 32.6% identity (67.4% similar) in 175 aa overlap (25-199:3-173) 10 20 30 40 50 60 pF1KE0 MAEQHGAPEQAAAGKSHGDLGGSYKVILYELENFQGKRCELSAECPSLTDSLLEKVGSIQ :. .:: . :::. : ...::.: . . . .::. CCDS23 MGKITFYEDRAFQGRSYECTTDCPNL-QPYFSRCNSIR 10 20 30 70 80 90 100 110 120 pF1KE0 VESGPWLAFESRAFRGEQFVLEKGDYPRWDAWSNSRDSDSLLSLRPLNIDSPDHKLHLFE :::: :. .: ..:.:. :..:.:: .. : . :: : : . : ....... CCDS23 VESGCWMIYERPNYQGHQYFLRRGEYPDYQQWMGLSDSIRSCCLIPPH--SGAYRMKIYD 40 50 60 70 80 90 130 140 150 160 170 180 pF1KE0 NPAFSGRKMEIVDDDVPSLWAHGFQDRVASVRAINGTWVGYEFPGYRGRQYVFERGEYRH . :. :..:: . :. . .. :. ...:.:. ::.:.::::::... ::::. CCDS23 RDELRGQMSELTDDCI-SVQDRFHLTEIHSLNVLEGSWILYEMPNYRGRQYLLRPGEYRR 100 110 120 130 140 150 190 200 210 pF1KE0 WNEWDASQPQLQSVRRIRDQKWHKRGRFPSS . .: : . .. :.::. : CCDS23 FLDWGAPNAKVGSLRRVMDLY 160 170 >>CCDS33367.1 CRYGA gene_id:1418|Hs108|chr2 (174 aa) initn: 380 init1: 177 opt: 382 Z-score: 509.8 bits: 101.1 E(32554): 4.5e-22 Smith-Waterman score: 382; 36.0% identity (68.0% similar) in 178 aa overlap (25-199:3-172) 10 20 30 40 50 pF1KE0 MAEQHGAPEQAAAGKSHGDLGGSYKVILYELENFQGKRC-ELSAECPSLTDSLLEKVGSI :. .:: ..::: :: . ..::.: . . 
.:: CCDS33 MGKITFYEDRDFQG-RCYNCISDCPNLR-VYFSRCNSI 10 20 30 60 70 80 90 100 110 pF1KE0 QVESGPWLAFESRAFRGEQFVLEKGDYPRWDAWSNSRDSDSLLSLRPLNIDSPDHKLHLF .:.:: :. .: ..:.:. :..: :: .. : . :::. : : . : .:::.:. CCDS33 RVDSGCWMLYERPNYQGHQYFLRRGKYPDYQHWMGL--SDSVQSCRIIPHTS-SHKLRLY 40 50 60 70 80 90 120 130 140 150 160 170 pF1KE0 ENPAFSGRKMEIVDDD--VPSLWAHGFQDRVASVRAINGTWVGYEFPGYRGRQYVFERGE : . : :..:: :: :. .. :.....: :: ::.:.::::::... :. CCDS33 YERDDYRGLMSELTDDCACVPELFR---LPEIYSLHVLEGCWVLYEMPNYRGRQYLLRPGD 100 110 120 130 140 150 180 190 200 210 pF1KE0 YRHWNEWDASQPQLQSVRRIRDQKWHKRGRFPSS ::....: ... .. :.::. : CCDS33 YRRYHDWGGADAKVGSLRRVTDLY 160 170

211 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Thu Nov 3 03:55:33 2016 done: Thu Nov 3 03:55:33 2016
 Total Scan time: 1.900 Total Display time: -0.010

Function used was FASTA [36.3.4 Apr, 2011]