Page last updated: 2024-10-24

mRNA splice site recognition

Definition

Target type: biological process

Selection of a splice site by components of the assembling spliceosome. [GOC:krc, ISBN:0879695897]

mRNA splice site recognition is a critical step in gene expression, ensuring that the correct protein is produced from a gene. It involves the precise identification and removal of non-coding sequences (introns) from pre-mRNA transcripts, leaving behind the coding sequences (exons) that will be translated into protein. This process is facilitated by a complex molecular machinery called the spliceosome, which is composed of small nuclear ribonucleoproteins (snRNPs) and a variety of protein factors.

The spliceosome recognizes two key splice sites within pre-mRNA: the 5' splice site (5'SS) and the 3' splice site (3'SS), which mark the upstream and downstream boundaries of an intron. These sites are defined by short nucleotide sequences that are highly conserved across species. In the large majority of introns, the intron begins with a GU dinucleotide at the 5'SS and ends with an AG dinucleotide at the 3'SS.
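As a rough illustration of why the GU and AG dinucleotides alone are not enough to define a splice site, the sketch below scans a toy sequence for every GU and AG and then scores each GU against a 5'SS consensus. The function names, the toy sequence, and the `GURAGU` consensus (a commonly cited mammalian pattern for the first six intron bases, with R meaning A or G) are assumptions made for this example; they are not taken from the text above.

```python
# Assumed consensus for the first six intron bases at the 5'SS (R = A or G).
# Illustrative only; real recognition relies on snRNA base pairing and proteins.
DONOR_CONSENSUS = "GURAGU"
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U", "R": "AG"}


def to_rna(sequence):
    """Upper-case the sequence and convert DNA-style T to RNA-style U."""
    return sequence.upper().replace("T", "U")


def find_candidate_splice_sites(pre_mrna):
    """Return (donors, acceptors) found by dinucleotide matching alone.

    donors:    index of the G in each GU (first base of a putative intron)
    acceptors: index just past each AG (first base of a putative downstream exon)
    """
    seq = to_rna(pre_mrna)
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GU"]
    acceptors = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    return donors, acceptors


def donor_consensus_score(pre_mrna, donor):
    """Fraction of the six consensus positions matched, starting at `donor`."""
    seq = to_rna(pre_mrna)
    window = seq[donor:donor + len(DONOR_CONSENSUS)]
    matches = sum(base in IUPAC[code] for base, code in zip(window, DONOR_CONSENSUS))
    return matches / len(DONOR_CONSENSUS)


toy = "ACGGUAAGUCCUAACUUUCAGGA"  # hypothetical pre-mRNA fragment
donors, acceptors = find_candidate_splice_sites(toy)
for d in donors:
    print(d, toy[d:d + 6], round(donor_consensus_score(toy, d), 2))
```

In this toy fragment GU occurs twice, but only the first occurrence matches the assumed consensus at all six positions, which hints at how surrounding sequence helps distinguish genuine sites from chance dinucleotides.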

The process of splice site recognition begins with the binding of the U1 snRNP to the 5'SS. The U2 snRNP then recognizes the branch point sequence, which contains a conserved adenine nucleotide located approximately 20-50 nucleotides upstream of the 3'SS. The U4, U5, and U6 snRNPs subsequently join to complete the complex structure called the spliceosome.
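The distance constraint on the branch point can be sketched as a simple window search. The function below is a hypothetical helper that lists every adenine lying 20-50 nucleotides upstream of a given 3'SS; it applies only the distance window quoted above and ignores the base pairing with U2 snRNA and the protein factors that determine the real choice.

```python
def candidate_branch_points(pre_mrna, acceptor, min_dist=20, max_dist=50):
    """List indices of adenines within [min_dist, max_dist] nt upstream of the 3'SS.

    `acceptor` is the index of the first exon base after the intron's terminal AG.
    The default window is the approximate range given in the text.
    """
    seq = pre_mrna.upper().replace("T", "U")
    if not 0 <= acceptor <= len(seq):
        raise ValueError("acceptor index is outside the sequence")
    hits = []
    for i, base in enumerate(seq[:acceptor]):
        distance = acceptor - i  # nucleotides from this base to the 3'SS boundary
        if base == "A" and min_dist <= distance <= max_dist:
            hits.append(i)
    return hits


# Hypothetical intron tail: one adenine 32 nt upstream of the 3'SS, then the AG.
toy = "C" * 10 + "A" + "C" * 29 + "AG" + "GG"
print(candidate_branch_points(toy, acceptor=42))
```

The adenine of the terminal AG itself is excluded because it sits only 2 nucleotides from the boundary, well inside the minimum distance.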

Once the spliceosome is assembled, it catalyzes two sequential transesterification reactions:

1. **First transesterification:** The 2'-hydroxyl group of the branch point adenine attacks the 5'SS. This cleaves the 5'SS and joins the 5' end of the intron to the branch point adenine, forming a lariat structure.

2. **Second transesterification:** The free 3' end of the upstream exon attacks the 3'SS. This cleaves the 3'SS and joins the 3' end of the upstream exon to the 5' end of the downstream exon, ligating the two exons together and releasing the intron.

The lariat intron is then released and degraded, leaving behind the mature mRNA transcript containing only the exons. This mature mRNA transcript is then transported out of the nucleus and translated into protein.
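The two reactions can be mimicked on a plain string to make the bookkeeping concrete. This is a minimal sketch under simplifying assumptions: the `splice` function, the `SpliceProducts` container, and the toy coordinates are all hypothetical, the splice site and branch point positions are supplied by the caller rather than discovered, and no chemistry is modelled.

```python
from dataclasses import dataclass


@dataclass
class SpliceProducts:
    mrna: str         # ligated exons
    lariat_loop: str  # intron from its 5' end through the branch point adenine
    lariat_tail: str  # intron downstream of the branch point, ending in AG


def splice(pre_mrna, donor, branch, acceptor):
    """Remove one intron given its 5'SS, branch point, and 3'SS positions.

    donor:    index of the G of the intron's initial GU
    branch:   index of the branch point adenine
    acceptor: index of the first exon base after the intron's terminal AG
    """
    seq = pre_mrna.upper().replace("T", "U")
    if not 0 <= donor < branch < acceptor - 2 <= len(seq) - 2:
        raise ValueError("positions must satisfy donor < branch < terminal AG")
    if seq[donor:donor + 2] != "GU" or seq[acceptor - 2:acceptor] != "AG":
        raise ValueError("intron does not have GU...AG boundaries")
    if seq[branch] != "A":
        raise ValueError("branch point must be an adenine")

    # Step 1: cleavage at the 5'SS frees the upstream exon; the intron's 5' end
    # is joined to the branch point adenine, closing the lariat loop.
    upstream_exon = seq[:donor]
    lariat_loop = seq[donor:branch + 1]

    # Step 2: cleavage at the 3'SS; the upstream exon's 3' end is joined to the
    # 5' end of the downstream exon, and the lariat intron is released.
    downstream_exon = seq[acceptor:]
    lariat_tail = seq[branch + 1:acceptor]

    return SpliceProducts(upstream_exon + downstream_exon, lariat_loop, lariat_tail)


products = splice("ACGGUAAGUCCUAACUUUCAGGA", donor=3, branch=13, acceptor=21)
print(products.mrna)
```

Concatenating `lariat_loop` and `lariat_tail` gives back the full intron, which is a quick check that no nucleotide is lost or duplicated between the two steps.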

The accuracy of splice site recognition is crucial for proper gene expression. Errors in splice site recognition can lead to the production of truncated or aberrant proteins, which can have detrimental effects on cellular function. Several factors influence splice site recognition, and its regulation has important biological and clinical consequences:

- **Sequence context:** The nucleotide sequences surrounding the splice sites play a significant role in recognition.

- **Splicing factors:** Proteins known as splicing factors can enhance or suppress splice site recognition.

- **Chromatin structure:** The organization of DNA into chromatin can influence splice site accessibility.

- **Alternative splicing:** Many genes can be spliced in multiple ways, producing different protein isoforms from the same gene.

- **Disease:** Mutations in splice site sequences or splicing factors can cause various genetic diseases.

Proteins (1)

| Protein | Definition | Taxonomy |
| --- | --- | --- |
| Serine/arginine-rich splicing factor 6 | A serine/arginine-rich splicing factor 6 that is encoded in the genome of human. [PRO:DNx, UniProtKB:Q13247] | Homo sapiens (human) |

Compounds (1)

| Compound | Definition | Classes | Roles |
| --- | --- | --- | --- |
| indacaterol | A monohydroxyquinoline that consists of 5-[(1R)-2-amino-1-hydroxyethyl]-8-hydroxyquinolin-2-one having a 5,6-diethylindan-2-yl group attached to the amino function. Used as the maleate salt for treatment of chronic obstructive pulmonary disease. Also: a beta2 adrenoceptor agonist; indacaterol is the (R)-isomer; structure in first source. | indanes; monohydroxyquinoline; quinolone; secondary alcohol; secondary amino compound | beta-adrenergic agonist; bronchodilator agent |
chemdatabank.com