KOP
Description of the Gene/Protein Characteristic Table
Quality control data of the Flexi HaloTag Clones
This section describes quality control data for the Flexi HaloTag Clones.
- (1) Characterization of the Flexi HaloTag Clones
-
The terminal sequences of the ORFs and the integrity of the Sgf I
and Pme I sites adjacent to the ORFs were confirmed by single-pass
sequencing. The sequencing primers were as follows. For the 5' end:
FN21F1 (5'-CTGCTGCAAGAAGACAACCC-3', nucleotides 1883 - 1902 in the
sequence of pFN21A) or FN21F3 (5'-TCGGCCCGGGTCTGAATCTG-3', nucleotides
1866 - 1885 in the sequence of pFN21A). For the 3' end: F1KR
(5'-TTCCTTTCGGGCTTTGTTAG-3', nucleotides 2439 - 2458 in the
sequence of pFN21A) or T7term (5'-TTATGCTAGTTATTGCTCAGCGG-3',
nucleotides 2478 - 2500 in the sequence of pFN21A). In addition,
cleavage at the Sgf I and Pme I sites was confirmed by digesting
the clones with the Sgf I and Pme I restriction enzymes, followed by
agarose gel electrophoresis. The sizes of the cloned ORFs were also
estimated.
- (2) HaloTag-fusion proteins expressed in HEK293 cells visualized by HaloTag TMR ligand
-
Production and subcellular localization of proteins from the Flexi
HaloTag Clones were observed in HEK293 cells by HaloTag TMR ligand
labeling and fluorescence microscopy. The proteins were transiently
expressed as N-terminal HaloTag fusions. HaloTag-fusion proteins were
labeled with a HaloTag TMR ligand (red), and nuclei were stained with
Hoechst33342 (blue). The Hoechst33342 and TMR ligand images are shown
merged. The cell lysates were then subjected to SDS-PAGE, and the
molecular masses of the fusion proteins were estimated by fluorescence
imaging analysis. Please note that, in general, an N-terminal tag may
cause incorrect subcellular localization of certain proteins.
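The primer coordinates listed in item (1) can be sanity-checked against the primer lengths. The following is a minimal sketch, assuming the pFN21A positions are 1-based and inclusive (an assumption; the text does not state the convention). The function names are illustrative only and are not part of any KOP tool.

```python
# Primer sequences and pFN21A coordinates copied from item (1) of the
# quality control section. Coordinates are ASSUMED to be 1-based, inclusive.
PRIMERS = {
    "FN21F1": ("CTGCTGCAAGAAGACAACCC", 1883, 1902),
    "FN21F3": ("TCGGCCCGGGTCTGAATCTG", 1866, 1885),
    "F1KR": ("TTCCTTTCGGGCTTTGTTAG", 2439, 2458),
    "T7term": ("TTATGCTAGTTATTGCTCAGCGG", 2478, 2500),
}


def span_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def check_primers(primers):
    """Map each primer name to True if its length matches its coordinate span."""
    return {
        name: len(seq) == span_length(start, end)
        for name, (seq, start, end) in primers.items()
    }


if __name__ == "__main__":
    for name, ok in check_primers(PRIMERS).items():
        print(name, "consistent" if ok else "MISMATCH")
```

Under the 1-based inclusive assumption, a 20-nucleotide primer spans positions such as 1883 - 1902, which is what the check compares.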
Features of the cloned ORF sequence
This section describes features of the ORF sequence cloned in the Flexi HaloTag vector.
- (1) Restriction map
-
Commercially available restriction enzymes (REBASE; Roberts, R.J. and
Macelis, D. "REBASE - restriction enzymes and methylases." Nucleic
Acids Res. 1998; 26: 338-350) are sorted according to the number of
restriction sites present in the cDNA insert.
- (2) Prediction of the genomic structure of the cDNA
- The ORF sequence was subjected to a BLAST search
(Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z.,
Miller, W., and Lipman, D.J. "Gapped BLAST and PSI-BLAST: a new
generation of protein database search programs." Nucleic Acids Res.
1997; 25: 3389-3402) against the human genome sequences at NCBI.
When a genomic fragment was found to be highly similar to the cDNA
sequence (E-value of 0.0 and sequence identity of 90% or greater), the
genomic structure of the cDNA was assigned by aligning the cDNA to the
genomic fragment with SIM4 (Florea, L., Hartzell, G., Zhang, Z.,
Rubin, G.M., and Miller, W. "A computer program for aligning a cDNA
sequence with a genomic DNA sequence." Genome Res. 1998; 8: 967-974).
-
GENSCAN (Burge, C. and Karlin, S. "Prediction of complete gene
structures in human genomic DNA." J. Mol. Biol. 1997; 268: 78-94)
was also applied to predict a plausible gene structure for the genomic
fragment. The gene structure deduced from the cDNA and the one
predicted by GENSCAN are compared graphically.
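The restriction map in item (1) amounts to counting recognition sites per enzyme and sorting by that count. The sketch below illustrates the idea under stated assumptions: the enzyme table is a small hand-picked subset rather than the REBASE list, the sort order is assumed to be ascending (the text does not say), and the function names are hypothetical.

```python
import re

# Illustrative subset of recognition sequences. The actual table is built
# from REBASE; this short list is only an ASSUMPTION for demonstration.
# All four sites are palindromic, so scanning one strand is sufficient;
# non-palindromic sites would also need a reverse-complement scan.
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "SgfI": "GCGATCGC",
    "PmeI": "GTTTAAAC",
}


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of site in seq, including overlapping ones."""
    # A zero-width lookahead matches once at every start position of the site.
    return len(re.findall(f"(?={re.escape(site)})", seq.upper()))


def sort_by_site_count(seq: str, enzymes=ENZYMES):
    """Return (enzyme, count) pairs sorted by count, then by enzyme name."""
    counts = {name: count_sites(seq, site) for name, site in enzymes.items()}
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    insert = "AAGAATTCTTGGATCCGAATTCAA"  # made-up sequence, not a real cDNA
    for name, n in sort_by_site_count(insert):
        print(f"{name}\t{n}")
```

Enzymes with zero sites sort first in this sketch; they are typically the ones of interest when choosing an enzyme that leaves the insert intact.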
Features of the predicted protein sequence
This section describes the features of the predicted protein sequence.
- (1) FASTA homology searches against the CCDS protein database
- Shown are the top 5 entries in the CCDS protein database with an
expectation value smaller than 1e-10.
The numbers on the left and right
sides of the black line in the graphical overview indicate the lengths
(in amino acid residues) of the non-homologous N-terminal and
C-terminal portions flanking the homologous region (indicated by the
black line), respectively. The FASTA output and the multiple alignment
of these entries can be obtained by clicking.
- (2) Analysis of Motifs, Domains, and Membrane-spanning regions
-
The predicted protein sequences were examined for motifs present
in the InterPro database.
Because weakly defined sequence motifs occur too frequently in
the HUGE database to be informative,
the following motifs were excluded from the
analysis: amidation site; N-glycosylation site; cAMP- and
cGMP-dependent protein kinase phosphorylation site; casein kinase II
phosphorylation site; N-myristoylation site; protein kinase C
phosphorylation site; and tyrosine kinase phosphorylation site.
Motifs and domains in the InterPro database were searched for with
InterProScan (Zdobnov, E.M. and Apweiler, R. "InterProScan--an
integration platform for the signature-recognition methods in
InterPro." Bioinformatics 2001; 17: 847-848).
Membrane-spanning regions were predicted by SOSUI
(Hirokawa, T., Boon-Chieng, S., and Mitaku, S. "SOSUI: classification
and secondary structure prediction system for membrane proteins."
Bioinformatics 1998; 14: 378-379).
- (3) FASTA homology searches against the OMIM database
- Shown are the top 5 entries matching the following three conditions:
1) an expectation value smaller than 0.01 in the OMIM database,
2) sequence identity greater than 30%, and
3) sequence coverage greater than 80%.
The numbers on the left and right
sides of the black line in the graphical overview indicate the lengths
(in amino acid residues) of the non-homologous N-terminal and
C-terminal portions flanking the homologous region (indicated by the
black line), respectively. The FASTA output and the multiple alignment
of these entries can be obtained by clicking.
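The three OMIM selection conditions in item (3) can be expressed as a simple filter over parsed FASTA hits. This is a rough sketch, not the pipeline actually used: the `Hit` record and its field names are hypothetical, identity and coverage are assumed to be percentages, and ranking the surviving hits by ascending expectation value is an assumption, since the text does not say how the "top 5" are ordered.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one FASTA hit; field names are illustrative."""
    entry: str       # database entry identifier
    evalue: float    # expectation value
    identity: float  # sequence identity, in percent
    coverage: float  # sequence coverage, in percent


def select_omim_hits(hits, max_hits=5):
    """Keep hits meeting all three conditions, then return the best max_hits."""
    kept = [
        h for h in hits
        if h.evalue < 0.01 and h.identity > 30.0 and h.coverage > 80.0
    ]
    # ASSUMPTION: "top" means smallest expectation value first.
    kept.sort(key=lambda h: h.evalue)
    return kept[:max_hits]


if __name__ == "__main__":
    # Made-up example hits, not real database results.
    example = [
        Hit("a", 1e-50, 95.0, 99.0),
        Hit("b", 0.5, 90.0, 90.0),    # fails the expectation value condition
        Hit("c", 1e-5, 25.0, 90.0),   # fails the identity condition
        Hit("d", 1e-20, 40.0, 85.0),
        Hit("e", 1e-3, 31.0, 79.0),   # fails the coverage condition
    ]
    for hit in select_omim_hits(example):
        print(hit.entry, hit.evalue)
```

The same shape of filter would apply to the CCDS search in item (1), with a single expectation-value condition in place of the three used here.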