Optimizing the 5'-end of Coding Sequences in Recombinant mRNA to achieve high-level Expression in the Bacterium Escherichia coli
Abstract
Recombinant protein production in Escherichia coli provides a cheap and efficient way of producing medically and industrially relevant proteins. Sequence features of individual genes, and especially their 5'-terminal coding sequences, affect the efficiency of gene expression through complex regulatory mechanisms that are still not fully understood. This study aimed to investigate the features of the 5' coding region of recombinant mRNAs and to optimize them for increased expression in E. coli. A previous study had found that a synonymous change of the second codon of the bla reporter gene leads to increased expression. Building on this finding, a synonymous library in the 5' bla coding sequence was created by a directed evolution approach. Variants conferring up to three-fold increases in active enzyme amounts were identified, and the increased expression was shown to stem from increased transcriptional efficiency. The effect of synonymous second-codon changes was further investigated by substituting the second codons of bla and two other reporter genes, phoA and celB. These experiments showed that the effect of second-codon changes on gene expression depends on the sequence context, as the changes in expression levels appeared to be gene specific. All the coding sequences of the study were also analysed in silico, and an application for calculating the tRNA adaptation index was programmed in Python and made freely available online.
As the synonymous codon changes did not lead to a great improvement in protein amounts, and the sequence features affecting expression were hard to pinpoint, an alternative strategy involving 5'-terminal gene fusions was investigated. Combinatorial mutagenesis coupled to an effective screening technique was applied to further optimize a 5'-terminal fusion partner previously shown to improve expression of several eukaryotic genes. The application of the best identified fusion partner candidate yielded a 3.8-fold improvement in IFN-α2b protein amounts over the original fusion, and twice the protein amount obtained with a pelB-IFN-α2b fusion previously shown to give industrial-level expression. The developed peptide fusion is thus a suitable candidate for further development for use in heterologous protein production.
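For readers unfamiliar with the tRNA adaptation index (tAI) mentioned above, the following is a minimal sketch of how such a score is commonly computed: as the geometric mean of per-codon relative adaptiveness weights over a coding sequence. It is not the application developed in this study. The function names, the choice to skip stop codons, and the weight values are all assumptions made for illustration; real weights would have to be derived from the tRNA gene copy numbers of the host organism.

```python
import math

# Standard stop codons; this sketch leaves them out of the score (an assumption).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(cds):
    """Split a coding sequence into codons, accepting RNA or DNA letters."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def tai(cds, weights):
    """Geometric mean of the relative adaptiveness weights of the codons.

    `weights` maps each codon to a value in (0, 1]. A codon missing from
    the table raises KeyError rather than being silently ignored.
    """
    codons = [c for c in split_codons(cds) if c not in STOP_CODONS]
    if not codons:
        raise ValueError("no scorable codons in sequence")
    # Sum logarithms instead of multiplying, to avoid underflow on long genes.
    log_sum = sum(math.log(weights[c]) for c in codons)
    return math.exp(log_sum / len(codons))


# Hypothetical weights, for illustration only (not measured values).
example_weights = {"ATG": 1.0, "AGT": 0.25, "TCT": 0.5}

# Two synonymous serine codons in second position give different scores.
score_agt = tai("ATGAGTTAA", example_weights)
score_tct = tai("ATGTCTTAA", example_weights)
```

With these placeholder weights, the sequence carrying TCT in second position scores higher than the one carrying AGT, which illustrates how a synonymous second-codon change alters the index without altering the encoded protein.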