KOP
Description of the Gene/Protein Characteristic Table

Features of the cloned ORF sequence

This section describes features of the ORF sequence cloned in Flexi Vector.

(1) Restriction map

Commercially available restriction enzymes (REBASE; Roberts, R.J. and Macelis, D. "REBASE - restriction enzymes and methylases." Nucleic Acids Res. 1998; 26: 338-350) are sorted according to the number of restriction sites present in the cDNA insert.
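As a rough illustration of the sorting described above, the following sketch counts recognition sites in an insert and orders the enzymes by that count. The function names, the three-enzyme table, and the toy insert sequence are assumptions made for this example; the actual table is built from REBASE and may use a different procedure.

```python
def count_sites(sequence, site):
    """Count occurrences of a recognition site, allowing overlapping matches."""
    sequence, site = sequence.upper(), site.upper()
    count = 0
    start = sequence.find(site)
    while start != -1:
        count += 1
        start = sequence.find(site, start + 1)
    return count


def sort_enzymes_by_site_count(sequence, enzymes):
    """Return (enzyme, count) pairs ordered by count, then by enzyme name."""
    counts = {name: count_sites(sequence, site) for name, site in enzymes.items()}
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


# Illustrative subset only. These sites are palindromic, so scanning one
# strand is enough; non-palindromic sites would need the reverse complement too.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}

# Hypothetical insert assembled from known pieces so the counts are easy to check.
INSERT = ("ATG" + "GAATTC" + "CC" + "GGATCC" + "TT" + "GAATTC"
          + "AAGCTT" + "GGATCC" + "GAATTC" + "TAA")

if __name__ == "__main__":
    for name, n in sort_enzymes_by_site_count(INSERT, ENZYMES):
        print(f"{name}\t{n}")
```

Enzymes that do not cut the insert receive a count of zero and therefore sort first, which is usually the group of interest when choosing enzymes for subcloning.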

(2) Prediction of the genomic structure of the cDNA

The ORF sequence was subjected to a BLAST search (Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W., and Lipman, D.J. "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Res. 1997; 25: 3389-3402) against the human genome sequences at NCBI. When a genomic fragment was found to be highly similar to the cDNA sequence (E-value of 0.0 and sequence identity of 90% or greater), the genomic structure of the cDNA was assigned by aligning the cDNA to that genomic fragment with SIM4 (Florea, L., Hartzell, G., Zhang, Z., Rubin, G.M., and Miller, W. "A computer program for aligning a cDNA sequence with a genomic DNA sequence." Genome Res. 1998; 8: 967-974).
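The selection criterion above (E-value of 0.0 and identity of 90% or greater) can be sketched as a simple filter over parsed BLAST hits. The GenomicHit record and its field names are hypothetical; they stand in for whatever the BLAST output parser provides.

```python
from dataclasses import dataclass


@dataclass
class GenomicHit:
    """Hypothetical record for one BLAST hit against a genomic fragment."""
    subject_id: str
    evalue: float
    percent_identity: float


def select_genomic_hits(hits, min_identity=90.0):
    """Keep hits with an E-value of exactly 0.0 and identity >= min_identity."""
    return [
        hit for hit in hits
        if hit.evalue == 0.0 and hit.percent_identity >= min_identity
    ]


if __name__ == "__main__":
    # Made-up hits, for illustration only.
    hits = [
        GenomicHit("fragment_A", 0.0, 99.2),
        GenomicHit("fragment_B", 0.0, 85.0),   # identity too low
        GenomicHit("fragment_C", 1e-50, 98.0), # E-value not 0.0
    ]
    for hit in select_genomic_hits(hits):
        print(hit.subject_id)
```

Only the fragments that pass this filter would then be handed to SIM4 for the cDNA-to-genome alignment.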

GENSCAN (Burge, C. and Karlin, S. "Prediction of complete gene structures in human genomic DNA." J. Mol. Biol. 1997; 268: 78-94) was also applied to predict a plausible gene structure for the genomic fragment. A comparison of the gene structure deduced from the cDNA with that predicted by GENSCAN is displayed graphically.


Features of the predicted protein sequence

This section describes the features of the predicted protein sequence.

(1) FASTA homology searches against the CCDS protein database

Shown are the top 5 entries in the CCDS protein database with an expectation value smaller than 1e-10.

The numbers on the left and right sides of the black line in the graphical overview indicate the lengths (in amino acid residues) of the non-homologous N-terminal and C-terminal portions flanking the homologous region (indicated by the black line), respectively. The FASTA output and the multiple alignment of these entries can be obtained by clicking.
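The two numbers flanking the black line follow directly from the alignment coordinates. The sketch below assumes 1-based, inclusive coordinates measured on a single protein sequence; the function and parameter names are illustrative and are not taken from the database.

```python
def flanking_lengths(protein_length, align_start, align_end):
    """Return (N-terminal, C-terminal) lengths outside the homologous region.

    Coordinates are assumed to be 1-based and inclusive.
    """
    if not (1 <= align_start <= align_end <= protein_length):
        raise ValueError("alignment coordinates fall outside the protein")
    n_terminal = align_start - 1
    c_terminal = protein_length - align_end
    return n_terminal, c_terminal


if __name__ == "__main__":
    # A 500-residue protein whose homologous region spans residues 41-420.
    print(flanking_lengths(500, 41, 420))
```

For a 500-residue protein aligned over residues 41 to 420, the left-hand number is 40 and the right-hand number is 80.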

(2) Analysis of Motifs, Domains, and Membrane-spanning regions

The predicted protein sequences were examined for motifs present in the InterPro database. Because weakly defined sequence motifs appear too many times in the HUGE database and are, thus, unlikely to be informative, the following motifs were excluded from the analysis: amidation site; N-glycosylation site; cAMP- and cGMP-dependent protein kinase phosphorylation site; casein kinase II phosphorylation site; N-myristoylation site; protein kinase C phosphorylation site; and tyrosine kinase phosphorylation site.
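A minimal sketch of this exclusion step is shown below. The set of names comes from the list above; the match records, the "name" key, and the example matches are assumptions made for illustration, since the real motif output format is not described here.

```python
# Weakly defined motifs excluded from the analysis (names lower-cased for matching).
EXCLUDED_MOTIFS = {
    "amidation site",
    "n-glycosylation site",
    "camp- and cgmp-dependent protein kinase phosphorylation site",
    "casein kinase ii phosphorylation site",
    "n-myristoylation site",
    "protein kinase c phosphorylation site",
    "tyrosine kinase phosphorylation site",
}


def drop_weak_motifs(matches):
    """Remove matches whose motif name is on the exclusion list.

    Each match is assumed to be a dict with a "name" key (hypothetical layout).
    """
    return [
        match for match in matches
        if match["name"].strip().lower() not in EXCLUDED_MOTIFS
    ]


if __name__ == "__main__":
    # Made-up matches, for illustration only.
    matches = [
        {"name": "N-glycosylation site", "start": 12, "end": 15},
        {"name": "Protein kinase domain", "start": 40, "end": 300},
        {"name": "Casein kinase II phosphorylation site", "start": 77, "end": 80},
    ]
    for match in drop_weak_motifs(matches):
        print(match["name"], match["start"], match["end"])
```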

Motifs and domains in the InterPro database were searched for with InterProScan (Zdobnov, E.M. and Apweiler, R. "InterProScan--an integration platform for the signature-recognition methods in InterPro." Bioinformatics 2001; 17: 847-848).

Membrane-spanning regions were predicted by SOSUI (Hirokawa, T., Boon-Chieng, S., and Mitaku, S. "SOSUI: classification and secondary structure prediction system for membrane proteins." Bioinformatics 1998; 14: 378-379).

(3) FASTA homology searches against the OMIM database

Shown are the top 5 entries in the OMIM database that meet all three of the following conditions: 1) an expectation value smaller than 0.01, 2) sequence identity greater than 30%, and 3) sequence coverage greater than 80%.
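The three conditions can be expressed as a filter followed by a cut at five entries. In the sketch below, the FastaHit record, its field names, and the ranking by ascending expectation value are assumptions; the page does not state how the top entries are ordered.

```python
from dataclasses import dataclass


@dataclass
class FastaHit:
    """Hypothetical record for one FASTA hit."""
    entry_id: str
    evalue: float
    identity: float  # percent identity
    coverage: float  # percent of the sequence covered by the alignment


def top_hits(hits, max_evalue=0.01, min_identity=30.0, min_coverage=80.0, limit=5):
    """Keep hits passing all three strict thresholds, best E-value first."""
    passing = [
        hit for hit in hits
        if hit.evalue < max_evalue
        and hit.identity > min_identity
        and hit.coverage > min_coverage
    ]
    # Ranking by ascending E-value is an assumption of this sketch.
    passing.sort(key=lambda hit: hit.evalue)
    return passing[:limit]


if __name__ == "__main__":
    # Made-up hits, for illustration only.
    hits = [
        FastaHit("entry_1", 1e-40, 62.0, 95.0),
        FastaHit("entry_2", 1e-5, 28.0, 90.0),  # identity too low
        FastaHit("entry_3", 1e-12, 45.0, 60.0), # coverage too low
        FastaHit("entry_4", 1e-20, 51.0, 88.0),
    ]
    for hit in top_hits(hits):
        print(hit.entry_id, hit.evalue)
```

The same function could be reused for other searches by changing the thresholds, for example by passing a stricter max_evalue and setting the identity and coverage minimums to 0.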

The numbers on the left and right sides of the black line in the graphical overview indicate the lengths (in amino acid residues) of the non-homologous N-terminal and C-terminal portions flanking the homologous region (indicated by the black line), respectively. The FASTA output and the multiple alignment of these entries can be obtained by clicking.


How to obtain KOP clone(s)
Send a message to flexiclone AT kazusagt.com