FASTA searches a protein or DNA sequence data bank
 36.3.4 Apr, 2011
Please cite:
 W.R. Pearson & D.J. Lipman PNAS (1988) 85:2444-2448

Query: pF1KB7168, 412 aa
  1>>>pF1KB7168 412 - 412 aa - 412 aa
Library: human.CCDS.faa
  18511270 residues in 32554 sequences

Statistics: Expectation_n fit: rho(ln(x))= 5.0567+/-0.000949; mu= 18.2490+/- 0.057
 mean_var=63.2727+/-13.122, 0's: 0 Z-trim(104.2): 33 B-trim: 35 in 1/49
 Lambda= 0.161238
 statistics sampled from 7764 (7773) to 7764 sequences
Algorithm: FASTA (3.7 Nov 2010) [optimized]
Parameters: BL50 matrix (15:-5), open/ext: -10/-2
 ktup: 2, E-join: 1 (0.605), E-opt: 0.2 (0.239), width: 16
 Scan time: 2.790

The best scores are:                                       opt  bits E(32554)
CCDS3091.1 SLC35G2 gene_id:80723|Hs108|chr3       ( 412) 2736 645.4 2.9e-185
CCDS45603.1 SLC35G6 gene_id:643664|Hs108|chr17    ( 338)  566 140.6 2.2e-33
CCDS82241.1 SLC35G4 gene_id:646000|Hs108|chr18    ( 338)  552 137.3 2.1e-32
CCDS11293.1 SLC35G3 gene_id:146861|Hs108|chr17    ( 338)  540 134.5 1.5e-31
CCDS5980.1 SLC35G5 gene_id:83650|Hs108|chr8       ( 338)  533 132.9 4.5e-31
CCDS7432.1 SLC35G1 gene_id:159371|Hs108|chr10     ( 364)  260  69.4 6.3e-12
CCDS44459.1 SLC35G1 gene_id:159371|Hs108|chr10    ( 365)  260  69.4 6.3e-12

>>CCDS3091.1 SLC35G2 gene_id:80723|Hs108|chr3 (412 aa)
 initn: 2736 init1: 2736 opt: 2736 Z-score: 3439.5 bits: 645.4 E(32554): 2.9e-185
Smith-Waterman score: 2736; 99.8% identity (100.0% similar) in 412 aa overlap (1-412:1-412)

               10        20        30        40        50        60
pF1KB7 MDTSPSRKYPVKKRVKIHPNTVMVKYTSHYPQPGDDGYEEINEGYGNFMEENPKKGLLSE
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS30 MDTSPSRKYPVKKRVKIHPNTVMVKYTSHYPQPGDDGYEEINEGYGNFMEENPKKGLLSE
               10        20        30        40        50        60

               70        80        90       100       110       120
pF1KB7 MKKKGRAFFGTMDTLPPPTEDPMINEIGQFQSFAEKNIFQSRKMWIVLFGSALAHGCVAL
       ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
CCDS30 MKKKGRAFFGTMDTLPPPTEDPMINEIGQFQSFAEKNIFQSRKMWIVLFGSALAHGCVAL
               70        80        90       100       110       120

              130       140       150       160       170       180
pF1KB7 ITRLVSDRSKVPSLELIFIRSVFQVLSVLVVCYYQEAPFGPSGYRLRLFFYGVCNVISIT
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS30 ITRLVSDRSKVPSLELIFIRSVFQVLSVLVVCYYQEAPFGPSGYRLRLFFYGVCNVISIT 130 140 150 160 170 180 190 200 210 220 230 240 pF1KB7 CAYTSFSIVPPSNGTTMWRATTTVFSAILAFLLVDEKMAYVDMATVVCSILGVCLVMIPN :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS30 CAYTSFSIVPPSNGTTMWRATTTVFSAILAFLLVDEKMAYVDMATVVCSILGVCLVMIPN 190 200 210 220 230 240 250 260 270 280 290 300 pF1KB7 IVDEDNSLLNAWKEAFGYTMTVMAGLTTALSMIVYRSIKEKISMWTALFTFGWTGTIWGI :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS30 IVDEDNSLLNAWKEAFGYTMTVMAGLTTALSMIVYRSIKEKISMWTALFTFGWTGTIWGI 250 260 270 280 290 300 310 320 330 340 350 360 pF1KB7 STMFILQEPIIPLDGETWSYLIAICVCSTAAFLGVYYALDKFHPALVSTVQHLEIVVAMV :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: CCDS30 STMFILQEPIIPLDGETWSYLIAICVCSTAAFLGVYYALDKFHPALVSTVQHLEIVVAMV 310 320 330 340 350 360 370 380 390 400 410 pF1KB7 LQLLVLHIFPSIYDVFGGVIIMISVFVLAGYKLYWRNLRRQDYQEILDSPIK :::::::::::::::::::::::::::::::::::::::.:::::::::::: CCDS30 LQLLVLHIFPSIYDVFGGVIIMISVFVLAGYKLYWRNLRKQDYQEILDSPIK 370 380 390 400 410 >>CCDS45603.1 SLC35G6 gene_id:643664|Hs108|chr17 (338 aa) initn: 540 init1: 283 opt: 566 Z-score: 712.8 bits: 140.6 E(32554): 2.2e-33 Smith-Waterman score: 566; 31.5% identity (69.5% similar) in 292 aa overlap (104-393:40-328) 80 90 100 110 120 130 pF1KB7 TLPPPTEDPMINEIGQFQSFAEKNIFQSRKMWIVLFGSALAHGCVALITRLVSDRSKVPS . ..:.:..: : :. ..... . :..:: CCDS45 PPDSTHPSPPSAPPSLRWHQCCQPSDATNGLLVALLGGGLPAGFVGPLSHMAYQASNLPS 10 20 30 40 50 60 140 150 160 170 180 190 pF1KB7 LELIFIRSVFQVLSVLVVCYYQEAPFGPSGYRLRLFFYGVCNVISITCAYTSFSIVPPSN :::.. : .:.. .:.. . .:: : : .::.. ::.:: :::.. ..:: .: CCDS45 LELLICRCLFHLPIALLLKLRGDPLLGPPDIRGRAYFYALLNVLSIGCAYSAVQVVPAGN 70 80 90 100 110 120 200 210 220 230 240 250 pF1KB7 GTTMWRATTTVFSAILAFLLVDEKMAYVDMATVVCSILGVCLVMIPNIVDEDNSLLNAWK ..:. ....:: ::.:.. : .. .. : .. ::::. ... :.. .... ... 
CCDS45 AATVRKGSSTVCSAVLTLCLESQGLSGYDWCGLLGSILGLIIIVGPGLWTLQEGITGVYT 130 140 150 160 170 180 260 270 280 290 300 310 pF1KB7 EAFGYTMTVMAGLTTALSMIVYRSIKEKISMWTALFTFGWTGTIWGISTMFILQEPIIPL :.:: .. ..::. .:...::::.. . :. : : .: . .. .:.:: :..: CCDS45 -ALGYGQAFVGGLALSLGLLVYRSLHFPSCLPTVAFLSGLVGLLGSVPGLFVLQPPVLPS 190 200 210 220 230 240 320 330 340 350 360 370 pF1KB7 DGETWSYLIAICVCSTAAFLGVYYALDKFHPALVSTVQHLEIVVAMVLQLLVLH--IFPS : .:: . :. . . ..: : ::. : ::::: .: : :.:::..:: .:: . :: CCDS45 DLPSWSCVGAVGILALVSFTCVSYAVTKAHPALVCAVLHSEVVVALILQYYMLHETVAPS 250 260 270 280 290 300 380 390 400 410 pF1KB7 IYDVFGGVIIMISVFVLAGYKLYWRNLRRQDYQEILDSPIK :. :. ... :. ......: CCDS45 --DIVGAGVVLGSIAIITAWNLSCEREGKVEE 310 320 330 >>CCDS82241.1 SLC35G4 gene_id:646000|Hs108|chr18 (338 aa) initn: 522 init1: 278 opt: 552 Z-score: 695.2 bits: 137.3 E(32554): 2.1e-32 Smith-Waterman score: 552; 30.0% identity (65.5% similar) in 330 aa overlap (68-393:7-328) 40 50 60 70 80 90 pF1KB7 YEEINEGYGNFMEENPKKGLLSEMKKKGRAFFGTMD-TLP-PPTEDPMINEIGQFQSFAE .:. : : : ::. : .. . : CCDS82 MAGSHPYFNLPDSTHPSPPSTPPSLHWHQRCQPSDA 10 20 30 100 110 120 130 140 150 pF1KB7 KNIFQSRKMWIVLFGSALAHGCVALITRLVSDRSKVPSLELIFIRSVFQVLSVLVVCYYQ : . ..:.:..: : :. ..:.. . :..:::::.. : .:.. .:.. CCDS82 TN-----GLLVALLGGGLPAGFVGPLSRMAYQASNLPSLELVICRCLFHLPIALLLKLRG 40 50 60 70 80 90 160 170 180 190 200 210 pF1KB7 EAPFGPSGYRLRLFFYGVCNVISITCAYTSFSIVPPSNGTTMWRATTTVFSAILAFLLVD . .:: : : : .. ::..: :::.. ..:: .:..:. . ..:: ::::.. : . CCDS82 DPLLGPPDIRGRTCFCALLNVLNIGCAYSAVQVVPTGNAATVRKHSSTVCSAILTLCLES 100 110 120 130 140 150 220 230 240 250 260 270 pF1KB7 EKMAYVDMATVVCSILGVCLVMIPNIVDEDNSLLNAWKEAFGYTMTVMAGLTTALSMIVY . .. : .. ::::. ... :.. ... ... ..::... ..::. .:...:: CCDS82 QVLSGYDWCGLLGSILGLIIIVGPGLWTLQEGTTGVYT-GLGYVQAFLGGLALSLGLLVY 160 170 180 190 200 210 280 290 300 310 320 330 pF1KB7 RSIKEKISMWTALFTFGWTGTIWGISTMFILQEPIIPLDGETWSYLIAICVCSTAAFLGV ::.. . :. : : .: . .. .:.:: :..: : .:: . :. . . 
..: : CCDS82 RSLHFPSCLPTVAFLSGLVGLLGSVPGLFVLQSPVLPSDLLSWSCVGAVGILTLVSFTCV 220 230 240 250 260 270 340 350 360 370 380 390 pF1KB7 YYALDKFHPALVSTVQHLEIVVAMVLQLLVLH--IFPSIYDVFGGVIIMISVFVLAGYKL ::. : ::::: .: : :.:.:..:: ..:: . :: :..:. ... :. .... .: CCDS82 GYAVTKAHPALVCAVLHSEVVMALILQYFMLHETVAPS--DIMGAGVVLGSIAIITARNL 280 290 300 310 320 400 410 pF1KB7 YWRNLRRQDYQEILDSPIK CCDS82 ICERTGKVEE 330 >>CCDS11293.1 SLC35G3 gene_id:146861|Hs108|chr17 (338 aa) initn: 520 init1: 281 opt: 540 Z-score: 680.1 bits: 134.5 E(32554): 1.5e-31 Smith-Waterman score: 540; 29.4% identity (64.5% similar) in 330 aa overlap (68-393:7-328) 40 50 60 70 80 90 pF1KB7 YEEINEGYGNFMEENPKKGLLSEMKKKGRAFFGTMD-TLP-PPTEDPMINEIGQFQSFAE .:. : : : ::. : . .: CCDS11 MAGSHPYFNQPDSTHPSPPSAPPSLR---WYQRCQP 10 20 30 100 110 120 130 140 150 pF1KB7 KNIFQSRKMWIVLFGSALAHGCVALITRLVSDRSKVPSLELIFIRSVFQVLSVLVVCYYQ .. . . ..:.:..: : :. ..:.. . :..:::::.. : .:.. .:.. CCDS11 SD--ATSGLLVALLGGGLPAGFVGPLSRMAYQASNLPSLELLIWRCLFHLPIALLLKLRG 40 50 60 70 80 90 160 170 180 190 200 210 pF1KB7 EAPFGPSGYRLRLFFYGVCNVISITCAYTSFSIVPPSNGTTMWRATTTVFSAILAFLLVD . .: : : :: .. :..:: :::.. ..:: .:..:. ....:: ::.:.. : . CCDS11 DPLLGTPDIRSRAFFCALLNILSIGCAYSAVQVVPAGNAATVRKGSSTVCSAVLTLCLES 100 110 120 130 140 150 220 230 240 250 260 270 pF1KB7 EKMAYVDMATVVCSILGVCLVMIPNIVDEDNSLLNAWKEAFGYTMTVMAGLTTALSMIVY . .. : .. :::. ... :.. ... ... :.::. . ..::. .: ..:: CCDS11 QGLSGYDWCGLLGCILGLIIIVGPGLWTLQEGTTGVYT-ALGYVEAFLGGLALSLRLLVY 160 170 180 190 200 210 280 290 300 310 320 330 pF1KB7 RSIKEKISMWTALFTFGWTGTIWGISTMFILQEPIIPLDGETWSYLIAICVCSTAAFLGV ::.. . :. : : .: . .. .:.:: :..: : .:: . :. . . ..: : CCDS11 RSLHFPPCLPTVAFLSGLVGLLGSVPGLFVLQAPVLPSDLLSWSCVGAVGILALVSFTCV 220 230 240 250 260 270 340 350 360 370 380 390 pF1KB7 YYALDKFHPALVSTVQHLEIVVAMVLQLLVLH--IFPSIYDVFGGVIIMISVFVLAGYKL ::. : ::::: .: : :.:::..:: .:: . :: :. .. ... :. .... 
.: CCDS11 GYAVTKAHPALVCAVLHSEVVVALILQYYMLHETVAPS--DIVAAGVVLGSIAIITAQNL 280 290 300 310 320 400 410 pF1KB7 YWRNLRRQDYQEILDSPIK CCDS11 SCERTGRVEE 330 >>CCDS5980.1 SLC35G5 gene_id:83650|Hs108|chr8 (338 aa) initn: 462 init1: 268 opt: 533 Z-score: 671.3 bits: 132.9 E(32554): 4.5e-31 Smith-Waterman score: 533; 29.3% identity (64.6% similar) in 328 aa overlap (68-393:7-328) 40 50 60 70 80 90 pF1KB7 YEEINEGYGNFMEENPKKGLLSEMKKKGRAFFGTMD-TLP-PPTEDPMINEIGQFQSFAE .:. : : : ::. : . . : . CCDS59 MAGSHPYFNLPDSTHPSPPSAPPSLRWHQRCQPSGA 10 20 30 100 110 120 130 140 150 pF1KB7 KNIFQSRKMWIVLFGSALAHGCVALITRLVSDRSKVPSLELIFIRSVFQVLSVLVVCYYQ : . ..:.:..: : :. ..:.. . :..:::::.. : .:.. .:.. CCDS59 TN-----GLLVALLGGGLPAGFVGPLSRMAYQGSNLPSLELLICRCLFHLPIALLLKLRG 40 50 60 70 80 90 160 170 180 190 200 210 pF1KB7 EAPFGPSGYRLRLFFYGVCNVISITCAYTSFSIVPPSNGTTMWRATTTVFSAILAFLLVD . .:: : : .. ::.:: :::.. ..:: .:..:. ....:: ::.:.. : . CCDS59 DPLLGPPDIRGWACFCALLNVLSIGCAYSAVQVVPAGNAATVRKGSSTVCSAVLTLCLES 100 110 120 130 140 150 220 230 240 250 260 270 pF1KB7 EKMAYVDMATVVCSILGVCLVMIPNIVDEDNSLLNAWKEAFGYTMTVMAGLTTALSMIVY . .. . .. ::::. ... :.. ... ... ..::... ..::. .:...:: CCDS59 QGLGGYEWCGLLGSILGLIIILGPGLWTLQEGTTGVYT-TLGYVQAFLGGLALSLGLLVY 160 170 180 190 200 210 280 290 300 310 320 330 pF1KB7 RSIKEKISMWTALFTFGWTGTIWGISTMFILQEPIIPLDGETWSYLIAICVCSTAAFLGV ::.. . :. : : .: . . .:.:: :..: : .:: . : . . ..: : CCDS59 RSLHFPSCLPTVAFLSGLVGLLGCVPGLFVLQTPVLPSDLLSWSCVGAEGILALVSFTCV 220 230 240 250 260 270 340 350 360 370 380 390 pF1KB7 YYALDKFHPALVSTVQHLEIVVAMVLQLLVLHIFPSIYDVFGGVIIMISVFVLAGYKLYW ::. : ::::: .: : :.:::..:: .:: .. :..:. ... :. .... 
.: CCDS59 GYAVTKAHPALVCAVLHSEVVVALILQYYMLHETVALSDIMGAGVVLGSIAIITARNLSC 280 290 300 310 320 330 400 410 pF1KB7 RNLRRQDYQEILDSPIK CCDS59 ERTGKVEE >>CCDS7432.1 SLC35G1 gene_id:159371|Hs108|chr10 (364 aa) initn: 109 init1: 74 opt: 260 Z-score: 327.6 bits: 69.4 E(32554): 6.3e-12 Smith-Waterman score: 260; 21.2% identity (62.5% similar) in 293 aa overlap (111-397:77-362) 90 100 110 120 130 140 pF1KB7 DPMINEIGQFQSFAEKNIFQSRKMWIVLFGSALAHGCVALITRLVSDRSKVPSLELIFIR ::. . .:... :.: : ..:. .: CCDS74 LCLSSPCCSRTEPAKKKAPCPGLGLFYTLLSAFLFSVGSLFVKKVQD---VHAVEISAFR 50 60 70 80 90 100 150 160 170 180 190 pF1KB7 SVFQVLSVLVVCYYQEAPF-GPSGYRLRLFFYGVCNVISITCAYTSFSIVPPSNGTTMWR :::.: :. :... : ::.: :. :.. :: . .. : ... . ...:.. CCDS74 CVFQMLVVIPCLIYRKTGFIGPKGQRIFLILRGVLGSTAMMLIYYAYQTMSLADATVI-T 110 120 130 140 150 160 200 210 220 230 240 250 pF1KB7 ATTTVFSAILAFLLVDEKMAYVDMATVVCSILGVCLVMIPNIVDEDNSLLNAWKEAF-GY .. ::..:.:.. . ::.. : .: .: :: :.. : .. ... .. .:.. :. CCDS74 FSSPVFTSIFAWICLKEKYSPWDALFTVFTITGVILIVRPPFLFGSDT--SGMEESYSGH 170 180 190 200 210 220 260 270 280 290 300 310 pF1KB7 TMTVMAGLTTAL----SMIVYRSIKEKISMWTALFTFGWTGTIWGISTMFILQEPIIPLD ..:.. .:. .... :.. ...... ... . : . .. . .: : .: CCDS74 LKGTFAAIGSAVFAASTLVILRKMGKSVDYFLSIWYYVVLGLVESVIILSVLGEWSLPYC 230 240 250 260 270 280 320 330 340 350 360 370 pF1KB7 GETWSYLIAICVCSTAAFLGVYYALDKFHPALVSTVQHLEIVVAMVLQLLVLHIFPSIYD : .:: : . . .. . . ::. . . :. .. ...: :...:.. .. :. . CCDS74 GLDRLFLIFIGLFGLGGQIFITKALQIEKAGPVAIMKTMDVVFAFIFQIIFFNNVPTWWT 290 300 310 320 330 340 380 390 400 410 pF1KB7 VFGGVIIMISVFVLAGYKLYWRNLRRQDYQEILDSPIK : ::.. ... : :. . .... CCDS74 V-GGALCVVASNVGAAIRKWYQSSK 350 360 >>CCDS44459.1 SLC35G1 gene_id:159371|Hs108|chr10 (365 aa) initn: 109 init1: 74 opt: 260 Z-score: 327.6 bits: 69.4 E(32554): 6.3e-12 Smith-Waterman score: 260; 21.2% identity (62.5% similar) in 293 aa overlap (111-397:78-363) 90 100 110 120 130 140 pF1KB7 DPMINEIGQFQSFAEKNIFQSRKMWIVLFGSALAHGCVALITRLVSDRSKVPSLELIFIR ::. . .:... :.: : ..:. 
.: CCDS44 CLSSPCCSRTEPEAKKKAPCPGLGLFYTLLSAFLFSVGSLFVKKVQD---VHAVEISAFR 50 60 70 80 90 100 150 160 170 180 190 pF1KB7 SVFQVLSVLVVCYYQEAPF-GPSGYRLRLFFYGVCNVISITCAYTSFSIVPPSNGTTMWR :::.: :. :... : ::.: :. :.. :: . .. : ... . ...:.. CCDS44 CVFQMLVVIPCLIYRKTGFIGPKGQRIFLILRGVLGSTAMMLIYYAYQTMSLADATVI-T 110 120 130 140 150 160 200 210 220 230 240 250 pF1KB7 ATTTVFSAILAFLLVDEKMAYVDMATVVCSILGVCLVMIPNIVDEDNSLLNAWKEAF-GY .. ::..:.:.. . ::.. : .: .: :: :.. : .. ... .. .:.. :. CCDS44 FSSPVFTSIFAWICLKEKYSPWDALFTVFTITGVILIVRPPFLFGSDT--SGMEESYSGH 170 180 190 200 210 220 260 270 280 290 300 310 pF1KB7 TMTVMAGLTTAL----SMIVYRSIKEKISMWTALFTFGWTGTIWGISTMFILQEPIIPLD ..:.. .:. .... :.. ...... ... . : . .. . .: : .: CCDS44 LKGTFAAIGSAVFAASTLVILRKMGKSVDYFLSIWYYVVLGLVESVIILSVLGEWSLPYC 230 240 250 260 270 280 320 330 340 350 360 370 pF1KB7 GETWSYLIAICVCSTAAFLGVYYALDKFHPALVSTVQHLEIVVAMVLQLLVLHIFPSIYD : .:: : . . .. . . ::. . . :. .. ...: :...:.. .. :. . CCDS44 GLDRLFLIFIGLFGLGGQIFITKALQIEKAGPVAIMKTMDVVFAFIFQIIFFNNVPTWWT 290 300 310 320 330 340 380 390 400 410 pF1KB7 VFGGVIIMISVFVLAGYKLYWRNLRRQDYQEILDSPIK : ::.. ... : :. . .... CCDS44 V-GGALCVVASNVGAAIRKWYQSSK 350 360

412 residues in 1 query sequences
18511270 residues in 32554 library sequences
 Tcomplib [36.3.4 Apr, 2011] (8 proc)
 start: Fri Nov 4 05:25:59 2016 done: Fri Nov 4 05:26:00 2016
 Total Scan time: 2.790 Total Display time: 0.020

Function used was FASTA [36.3.4 Apr, 2011]