Polypeptides for preparing terpenoid compounds
Patent Information
- Authority / Receiving Office
- JP · JP
- Patent Type
- Applications
- Current Assignee / Owner
- FIRMENICH SA
- Filing Date
- 2024-06-12
- Publication Date
- 2026-06-26
AI Technical Summary
The fragrance industry seeks sustainable methods for producing ambroxide and other terpenoid compounds, as chemical synthesis poses environmental and waste management challenges.
A multi-step enzymatic process using polypeptides with terpene cyclase enzyme activity, such as meroterpenoid cyclases, to convert precursor compounds into ambroxide and other terpenoids through in vivo or bioconversion methods.
This approach achieves high yields of ambroxide and other terpenoids, with over 97% purity, providing a sustainable alternative to chemical synthesis.
Abstract
Description
[Technical Field]
[0001] The present invention relates to polypeptides and methods for the enzymatic preparation of ambroxide and other terpenoid compounds.
[Background Art]
[0002] The fragrance industry has a long-standing need for methods of preparing compounds for industrial use in fragrances. Among the most important of these are the amber-type olfactory compounds, which occur naturally in ambergris and can act as fixatives that make a scent last much longer.
[0003] The main compound in ambergris is ambroxide, a terpenoid compound. Terpenoids are found in most organisms (microorganisms, animals, and plants). These compounds are built from five-carbon units called isoprene units and are classified by the number of these units present in their structure. Thus, monoterpenes, sesquiterpenes, and diterpenes are terpenes containing 10, 15, and 20 carbon atoms, respectively. Sesquiterpenes are widely found, for example, in the plant kingdom. Many sesquiterpene molecules are known for their flavor and fragrance properties, as well as their cosmetic, medicinal, and antimicrobial effects. Numerous sesquiterpene hydrocarbons and sesquiterpenoids have been identified. Commercially relevant compounds include Cetalox® ((3aRS,9aRS,9bRS)-3a,6,6,9a-tetramethyl-1,2,3a,4,6,7,8,9,9a,9b-decahydronaphtho[2,1-b]furan; manufactured by Firmenich SA, Geneva, Switzerland) or Ambrox® ((3aR,5aS,9aS,9bR)-3a,6,6,9a-tetramethyldodecahydronaphtho[2,1-b]furan; manufactured by Firmenich SA, Geneva, Switzerland), which mimic ambroxide.
[0004] The chemical routes for preparing these compounds are known in the art. However, considering the environmental and waste issues associated with the chemical manufacture of such compounds, there is a need to develop a more sustainable method for the production of ambroxide and other terpenoid compounds.
[0005] This problem is addressed by the present invention which provides polypeptides and methods for producing such compounds by in vivo and / or bioconversion methods. The method may use a multi-step enzymatic process.
[Summary]
[0006] A first aspect of the present invention is a method for preparing a compound of formula (I) in the form of any one of the stereoisomers or a mixture thereof
Chemical formula
Chemical formula
[0007] One embodiment of the present invention is that more than 97% of the compound of formula (I) is in the form of formula (Ia)
Chemical formula
Chemical formula
[0008] [[ID=48]] [[ID=49]] [[ID=50]] [[ID=51]] It is in the form of...
[0009] A further aspect of the present invention is that the polypeptide having terpene cyclase enzymatic activity is a polypeptide that is not a squalene cyclase (SHC) enzyme, and / or a polypeptide that is a squalene cyclase enzyme. In the context of the present invention, the polypeptide that is not an SHC enzyme is a meroterpenoid cyclase enzyme.
[0010] In one embodiment, the meroterpenoid cyclase enzyme is at least one polypeptide selected from the following:
(a) a bacterial membrane-integrated meroterpenoid cyclase containing at least one amino acid motif selected from [W]xxx[D]xx[ILVMN] (SEQ ID NO: 254), PxxAxxxNxxWE (SEQ ID NO: 255), MxxxFxxMLxxR (SEQ ID NO: 256), and RxxxxGQS (SEQ ID NO: 257);
(b) a fungal membrane-integrated meroterpenoid cyclase containing at least one amino acid motif selected from [WY]Exx[YFW] (SEQ ID NO: 258) and [DNE]xSYxxP (SEQ ID NO: 259); and
(c) a bacterial soluble meroterpenoid cyclase containing at least one amino acid motif selected from GxWxxxW[WG]xxxxY (SEQ ID NO: 260), WxxxHxxV[TSA] (SEQ ID NO: 261), and GxWxD[FY] (SEQ ID NO: 262),
where each residue x independently represents any native amino acid residue.
[0011] In a preferred embodiment, the meroterpenoid cyclase enzyme is a membrane-integrated meroterpenoid cyclase enzyme. The enzyme preferably produces a compound of formula (I) in the form of formula (Ia).
[0012] In another preferred embodiment, the meroterpenoid cyclase enzyme is a soluble meroterpenoid cyclase enzyme. The enzyme preferably produces a compound of formula (I) in the form of formula (Ib).
[0013] In another embodiment, the squalene cyclase enzyme comprises at least one motif selected from [SP][TP][VIL]WDTx[LWI] (SEQ ID NO: 247), PGG[WF][GYA]F (SEQ ID NO: 248), PDxDD[TAS][TIAS] (SEQ ID NO: 249), [MIL]QxxxG[GA][WF]x[AS][FY] (SEQ ID NO: 250), Qxxx[GH]xWxG[RK]WGxx[YF]xYG (SEQ ID NO: 251), Qxx[DN]G[GS][WF][GS]ExxxS (SEQ ID NO: 252), and [STA]xx[SFN][QC]T[AGT]W[AS][LIV]xx[LQ] (SEQ ID NO: 253), where residue x independently represents any native amino acid residue.
[0014] In a further aspect of the present invention, the method further comprises one or more steps prior to step (i), the steps being selected from:
(a) contacting a compound of formula (V) [Chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having esterase enzyme activity to produce a compound of formula (VI);
(b) contacting a compound of formula (IV) [Chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having Baeyer-Villiger monooxygenase (BVMO) enzyme activity to produce a compound of formula (V);
(c) contacting a compound of formula (III) [Chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having enal cleavage enzyme activity to produce a compound of formula (IV);
(d) contacting a compound of formula (II) [Chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having alcohol dehydrogenase (ADH) enzyme activity to produce a compound of formula (III);
(e) producing a compound of formula (II) from geranylgeranyl diphosphate (GGPP) using one or more polypeptides having phosphatase enzyme activity; and/or
(f) producing GGPP from isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) using one or more polypeptides having prenyltransferase enzyme activity.
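The optional upstream steps (a) to (f) are listed in reverse biosynthetic order. As an illustration only, the following Python sketch writes them out in forward order and checks that each step's product is the next step's substrate. The `Step` class and the function names are hypothetical helpers introduced here; the compound labels and enzyme activities are taken from the paragraph above, and no chemical structures are implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str      # step label as used in paragraph [0014]
    activity: str   # enzyme activity named in the source text
    substrate: str
    product: str


# Steps in biosynthetic order, i.e. the reverse of the (a)-(f) listing.
PATHWAY = [
    Step("f", "prenyltransferase", "IPP + DMAPP", "GGPP"),
    Step("e", "phosphatase", "GGPP", "formula (II)"),
    Step("d", "alcohol dehydrogenase (ADH)", "formula (II)", "formula (III)"),
    Step("c", "enal cleavage enzyme", "formula (III)", "formula (IV)"),
    Step("b", "Baeyer-Villiger monooxygenase (BVMO)", "formula (IV)", "formula (V)"),
    Step("a", "esterase", "formula (V)", "formula (VI)"),
]


def is_contiguous(steps):
    """Return True if each step's product is the next step's substrate."""
    return all(a.product == b.substrate for a, b in zip(steps, steps[1:]))


def route(steps):
    """Return the chain of compounds from the first substrate to the last product."""
    return [steps[0].substrate] + [s.product for s in steps]


if __name__ == "__main__":
    print(is_contiguous(PATHWAY))
    print(" -> ".join(route(PATHWAY)))
```

Because the claim says "one or more steps", any contiguous tail of this list (for example, steps (b) and (a) only) is also a valid reading; `is_contiguous` can be applied to such a slice.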
[0015] In a further aspect of the present invention, the method is an in vivo or biotransformation method.
[0016] A further aspect of the present invention provides recombinant cells that contain, can produce, or produce a compound of formula (I), wherein more than 97% of the compound of formula (I) is in the form of formula (Ia) and / or (Ib).
[0017] A further aspect of the present invention provides a cell culture fermentation medium containing recombinant cells of the present invention.
[0018] A further aspect of the present invention provides a reaction mixture comprising a compound of formula (I), wherein more than 97% of the compound of formula (I) is in the form of formula (Ia) and / or (Ib).
[0019] Further aspects of the present invention provide compounds obtained or obtainable by the methods of the present invention, or from recombinant cells of the present invention as described herein, or from cell culture fermentation media of the present invention, or from reaction mixtures of the present invention.
[0020] A further aspect of the present invention provides the use of the compound as a fragrance component.
[0021] A further aspect of the present invention provides the use of a meroterpenoid cyclase enzyme for the production of compounds of formula (I) and / or derivatives thereof.
[0022] A further aspect of the present invention provides a variant meroterpenoid cyclase enzyme having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to any of the sequences described in any of SEQ ID NOs: 56 to 70.
[0023] A further aspect of the present invention provides a variant meroterpenoid cyclase enzyme having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 56-61, 69, and 70, wherein the variant meroterpenoid cyclase enzyme has an amino acid substitution at amino acid position 9 with respect to the sequence described in SEQ ID NO: 51.
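The variant aspects in this part of the summary combine a percent sequence identity threshold with a requirement on the residue at a given position of a reference sequence. The sketch below shows one way such a check could be expressed, assuming the two sequences have already been aligned (gaps written as `-`) and assuming identity is counted over the aligned columns in which neither sequence has a gap. Other conventions for the denominator exist, and this passage does not say which one applies. The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: this denominator is one common convention, not necessarily
    the one used in the source document.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def residue_at(aligned_ref: str, aligned_query: str, ref_position: int) -> str:
    """Return the query residue aligned to a 1-based position of the ungapped reference."""
    count = 0
    for r, q in zip(aligned_ref, aligned_query):
        if r != "-":
            count += 1
            if count == ref_position:
                return q
    raise IndexError("reference position is beyond the end of the reference")


if __name__ == "__main__":
    # Toy sequences for illustration only (hypothetical).
    ref = "MKTAYIAKSRQL"
    query = "MKTAYIAKCRQL"
    print(round(percent_identity(ref, query), 2))
    print(residue_at(ref, query, 9))
```

Counting positions along the ungapped reference mirrors the way the text refers to positions "with respect to" a reference sequence such as SEQ ID NO: 51 or SEQ ID NO: 82.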
[0024] A further aspect of the present invention provides a variant squalene cyclase enzyme having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to any of the sequences described in SEQ ID NOs: 29, 31, 33, 34, 36-38, 40, 41, 43-46, 48, 49, and 265-279, wherein the polypeptide has the amino acid alanine at position 437 and the amino acid methionine at position 600 with respect to the sequence described in SEQ ID NO: 82. Preferably, the variant squalene cyclase enzyme has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 29, 31, 33, 34, 36-38, 40, 41, 43-46, 48, 49, 265-274, and 276-279, and the polypeptide has the amino acid alanine at position 437 and the amino acid methionine at position 600 with respect to the sequence described in SEQ ID NO: 82.
[Brief Description of the Drawings]
[0025]
[Figure 1] Biosynthetic pathways to (2E)-geranyl diphosphate (GPP), (2E,6E)-farnesyl diphosphate (FPP), and (2E,6E,10E)-geranylgeranyl diphosphate (GGPP) from isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP).
[Figure 2] Biosynthetic pathway to (2E,6E,10E)-geranylgeraniol from (2E,6E,10E)-geranylgeranyl diphosphate (GGPP). Pi is inorganic phosphate, and PPi is inorganic pyrophosphate.
[Figure 3] A novel biochemical pathway to (3E,7E)-homofarnesol.
[Figure 4] GC-MS analysis of terpenoids and derivatives produced using Escherichia coli (E. coli) cells engineered to produce (3E,7E)-homofarnesol by expressing the proteins encoded on plasmid pHFOL-5: PsAerADH (SEQ ID NO: 11), SCH24-BVMO1 (SEQ ID NO: 23), SCH24-EST1 (SEQ ID NO: 27), CcrGGPPS2-del57 (SEQ ID NO: 1), and PgpB (SEQ ID NO: 3).
[Figure 5] GC-MS analysis (A) of terpenoids and derivatives produced using Escherichia coli (E. coli) cells expressing the proteins encoded on plasmid pF-Facetone-7, namely PsAerADH (SEQ ID NO: 11), SCH24-EST1 (SEQ ID NO: 27), CcrGGPPS2-del57 (SEQ ID NO: 1), and PgpB (SEQ ID NO: 3), and GC-MS analysis (B) of terpenoids and derivatives produced using the same cells further expressing the BVMO enzyme (AflaBVMO1, SEQ ID NO: 26).
[Figure 6] GC-MS chromatogram of the YST403_HFOL strain engineered to produce (3E,7E)-homofarnesol. The final product (3E,7E)-homofarnesol, as well as the pathway intermediates (5E,9E)-farnesylacetone and (2E,6E,10E)-geranylgeraniol, are shown.
[Figure 7] A novel biochemical pathway to compounds of formula (Ia) that utilizes squalene cyclase.
[Figure 8] GC-MS analysis of terpenoids and derivatives produced using Escherichia coli (E. coli) DP1205 expressing the proteins PsAerADH (SEQ ID NO: 11), AflavBVMO1 (SEQ ID NO: 26), SCH24-EST1 (SEQ ID NO: 27), CcrGGPPS2-del57 (SEQ ID NO: 1), PgpB (SEQ ID NO: 3), and either wild-type A0A5P9HJ69 (SEQ ID NO: 42) (A) or the A0A5P9HJ69_V1 variant (SEQ ID NO: 43) (B). The compound of formula (Ia) and the pathway intermediate (3E,7E)-homofarnesol are shown.
[Figure 9] Titer of the compound of formula (Ia) produced by Escherichia coli (E. coli) DP1205 expressing the proteins PsAerADH (SEQ ID NO: 11), AflavBVMO1 (SEQ ID NO: 26), SCH24-EST1 (SEQ ID NO: 27), CcrGGPPS2-del57 (SEQ ID NO: 1), PgpB (SEQ ID NO: 3), and various wild-type or mutant squalene cyclases.
[Figure 10] Titer of the compound of formula (Ia) produced in S. cerevisiae cells expressing geranylgeranyl diphosphate synthase CarG (SEQ ID NO: 2), phosphatase PgpB (SEQ ID NO: 3), alcohol dehydrogenase SCH23-ADH1 (SEQ ID NO: 21), Baeyer-Villiger monooxygenase AflavBVMO1 (SEQ ID NO: 26), enal cleavage enzyme SCH94-03944 (SEQ ID NO: 22), and various wild-type or mutant squalene cyclases.
[Figure 11] GC-MS chromatogram of culture extracts of S. cerevisiae engineered to express the (3E,7E)-homofarnesol biosynthesis pathway genes and either the squalene cyclase mutant variant OYT72085_V1 (SEQ ID NO: 48) (A) or the wild-type squalene cyclase OYT72085.1 (SEQ ID NO: 47) (B). Peaks corresponding to (3E,7E)-homofarnesol and the compound of formula (Ia) are shown.
[Figure 12] Titer of the compound of formula (Ia) produced by Escherichia coli (E. coli) DP1205 expressing the proteins PsAerADH (SEQ ID NO: 11), AflavBVMO1 (SEQ ID NO: 26), SCH24-EST1 (SEQ ID NO: 27), CcrGGPPS2-del57 (SEQ ID NO: 1), PgpB (SEQ ID NO: 3), and bacterial membrane-integrating meroterpenoid cyclase.
[Figure 13] A novel biochemical pathway to compounds of formula (Ia) that utilizes meroterpenoid cyclase.
[Figure 14] Titer of the compound of formula (Ia) produced by the yeast strain YST403, which has been engineered to produce (3E,7E)-homofarnesol and expresses various bacterial membrane-integrating meroterpenoid cyclases.
[Figure 15] GC-MS chromatogram (A) of yeast strain YST403 engineered to produce (3E,7E)-homofarnesol and express WP_234754442.1 (SEQ ID NO: 51), and GC-MS chromatogram (B) of control strain YST403 expressing only the (3E,7E)-homofarnesol pathway enzymes. The compound of formula (Ia) and the pathway intermediate (3E,7E)-homofarnesol are shown.
[Figure 16] GC-MS chromatogram (A), in single-ion monitoring mode (221 Da), of yeast strain YST403 engineered to produce (3E,7E)-homofarnesol and express A0A2P1DP74.1 (MacJ) (SEQ ID NO: 71), and GC-MS chromatogram (B) of control strain YST403 expressing only the (3E,7E)-homofarnesol pathway enzymes. The compound of formula (Ia) and the pathway intermediate (3E,7E)-homofarnesol are shown.
[Figure 17] Chiral GC-MS chromatogram (C), in single-ion monitoring mode (221 Da), of Escherichia coli (E. coli) DP1205 engineered to produce (3E,7E)-homofarnesol and expressing OKH29475.1 (SEQ ID NO: 74), compared with authentic standards of the compound of formula (Ib) (A) and the compound of formula (Ia) (B).
[Figure 18] GC-MS chromatograms, in single-ion monitoring mode (221 Da), of the YST403 strain engineered to produce (3E,7E)-homofarnesol and expressing the soluble meroterpenoid cyclases OKH29475.1 (SEQ ID NO: 74) and NEQ07043.1 (SEQ ID NO: 75), and of the control strain YST403HFOL, which also produces (3E,7E)-homofarnesol. The compound of formula (Ia) is shown.
[Figure 19] Structure of WP_234754442.1 (SEQ ID NO: 51) predicted using ESMFold, showing a pore structure consisting of seven helices. The N and C termini are shown, together with the entry point to the putative active site on the same side as the N terminus.
[Figure 20] Titer of the compound of formula (Ia) produced by Escherichia coli (E. coli) DP1205 expressing the proteins PsAerADH (SEQ ID NO: 11), AflavBVMO1 (SEQ ID NO: 26), SCH24-EST1 (SEQ ID NO: 27), CcrGGPPS2-del57 (SEQ ID NO: 1), PgpB (SEQ ID NO: 3), and various mutant variants of the bacterial membrane-integrating meroterpenoid cyclase WP_234754442.1 (SEQ ID NO: 51). The titer is shown in comparison to the wild-type enzyme WP_234754442.1 (SEQ ID NO: 51).
[Figure 21] Titer of the compound of formula (Ia) produced by the yeast strain YST403, which was engineered to produce (3E,7E)-homofarnesol and express the bacterial membrane-integrating meroterpenoid cyclase WP_234754442.1 (SEQ ID NO: 51) or the mutant variants WP_234754442.1_S9C (SEQ ID NO: 56) and WP_234754442.1_S9M (SEQ ID NO: 57). The titer is shown as a percentage compared to the wild-type enzyme WP_234754442.1 (SEQ ID NO: 51).
[Figure 22] Titers of compound (Ia) (left), compound (Ic) (center), and compound (Id) (right) produced by biotransformation of chemically synthesized homofarnesol with the bacterial membrane-integrating meroterpenoid cyclases WP_051467941.1 (SEQ ID NO: 50), WP_234754442.1 (SEQ ID NO: 51), and WP_190963420.1 (SEQ ID NO: 52), and with the squalene-hopene cyclase AacSHC_M132R_I432T_A224V (SEQ ID NO: 78), in the presence or absence of 0.06 (w/v) of the surfactant sodium dodecyl sulfate (SDS).
[Figure 23] GC-MS analysis of terpenoids and derivatives produced by biotransformation of chemically synthesized (3E,7E)-homofarnesol containing (3Z,7E)-homofarnesol impurities using Escherichia coli (E. coli) BL21(DE3)Star cells expressing the bacterial membrane-integrated meroterpenoid cyclase WP_234754442.1 (SEQ ID NO: 51) (A) or the mutant squalene cyclase AAcSHC_M132R_A224V_I432T (SEQ ID NO: 78) (B). The cyclization products of (3E,7E)-homofarnesol (compounds of formula (Ia) and (Id)) and the cyclization product of (3Z,7E)-homofarnesol (compound of formula (Ic)) are shown.
[Figure 24] GC-MS analysis of the bioconversion of chemically synthesized (3E,7E)-homofarnesol by Escherichia coli (E. coli) BL21(DE3)Star cells expressing genes encoding the squalene cyclases ZmSHC_F437A_G600M (SEQ ID NO: 88), AacSHC_F437A_G600M (SEQ ID NO: 81), A0A0T6LPP7-V1 (SEQ ID NO: 265), A0A7V0I7Y5-V1 (SEQ ID NO: 266), UPI00248B5E40-V1 (SEQ ID NO: 267), and UPI002800B5BA-V1 (SEQ ID NO: 268). The control does not contain a squalene cyclase.
[Figure 25] Titers of the compound of formula (Ia) produced by Escherichia coli (E. coli) DP1205 expressing PsAerADH (SEQ ID NO: 11), AflavBVMO1 (SEQ ID NO: 26), SCH24-EST1 (SEQ ID NO: 27), CcrGGPPS2-del57 (SEQ ID NO: 1), PgpB (SEQ ID NO: 3), and various bacterial membrane-integrating meroterpenoid cyclases. The titers are shown in comparison to the enzyme WP_234754442.1 (SEQ ID NO: 51).
[0026] Abbreviations used
ADH: alcohol dehydrogenase
BVMO: Baeyer-Villiger monooxygenase
bp: base pair
kb: kilobase
DNA: deoxyribonucleic acid
cDNA: complementary DNA
DMAPP: dimethylallyl diphosphate
FMO: flavin monooxygenase
FPP: farnesyl diphosphate
GPP: geranyl diphosphate
GGPP: geranylgeranyl diphosphate
GGPS: geranylgeranyl diphosphate synthase
GC: gas chromatograph
IPP: isopentenyl diphosphate
MS: mass spectrometer / mass spectrometry
MVA: mevalonate
PP: diphosphate, pyrophosphate
PCR: polymerase chain reaction
RNA: ribonucleic acid
SHC: squalene cyclase
mRNA: messenger ribonucleic acid
miRNA: microRNA
siRNA: small interfering RNA
rRNA: ribosomal RNA
tRNA: transfer RNA
TPP: terpenyl diphosphate
[Definitions]
[0027] General terms: In this description and the appended claims, the use of “or” means “and/or” unless otherwise specified. Similarly, “comprise,” “comprises,” “comprising,” “include,” “includes,” and “including” are interchangeable and not intended to be limiting.
[0028] Furthermore, where the term “including” is used in the description of various embodiments, it should be understood that in some specific cases the embodiments can be described alternatively using the words “consisting essentially of” or “consisting of.”
[0029] As used herein, the terms “purified,” “substantially purified,” and “isolated” refer to a state in which the compound of the present invention is free from other, different compounds with which it is normally associated in its natural state, such that the “purified,” “substantially purified,” or “isolated” subject constitutes at least 0.5% by weight, 1% by weight, 5% by weight, 10% by weight, or 20% by weight, or at least 50% by weight or 75% by weight of a given sample. In one embodiment, these terms refer to the compound of the present invention constituting at least 95% by weight, 96% by weight, 97% by weight, 98% by weight, 99% by weight, or 100% by weight of a given sample. As used herein, the terms “purified,” “substantially purified,” and “isolated,” when relating to nucleic acids or proteins, refer to a state of purification or concentration different from that which occurs naturally, for example, in a prokaryotic or eukaryotic environment such as a bacterial or fungal cell or a mammalian organism, particularly the human body. Any degree of purification or concentration greater than that which occurs naturally, including (1) purification from other associated structures or compounds, or (2) association with structures or compounds with which the subject is not normally associated in the prokaryotic or eukaryotic environment, falls within the meaning of “isolated.” The nucleic acids or proteins or classes of nucleic acids or proteins described herein may be isolated, or associated with structures or compounds with which they are not normally associated in nature, by various methods and processes known to those skilled in the art.
[0030] The term "approximately" indicates that the stated value may vary by ±25%, particularly ±15%, ±10%, and more specifically ±5%, ±2%, or ±1%.
[0031] The term "effectively" refers to a range of values of approximately 80-100%, for example, 85-99.9%, especially 90-99.9%, more specifically 95-99.9%, or 98-99.9%, especially 99-99.9%.
[0032] "Majority" refers to a percentage in the range of over 50%, such as 95-99%, 51-100%, especially 75-99.9%, and more specifically, 85-98.5%.
[0033] In the context of the present invention, “principal product” means a single compound or at least two compounds, e.g., two, three, four, five or more, particularly two or three compounds, the single compound or group of compounds being prepared in “majority” by a reaction as described herein, and included in the reaction in a majority ratio based on the total amount of components of the product formed by the reaction. The ratio may be a molar ratio, a weight ratio, or preferably an area ratio calculated from the corresponding chromatogram of the reaction product based on chromatographic analysis.
[0034] In the context of the present invention, “by-product” means a single compound or at least two compounds, e.g., two, three, four, five or more, particularly two or three compounds, and the single compound or group of compounds is not prepared in “majority” by a reaction as described herein.
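The definitions of “principal product” and “by-product” rest on a majority share of the product mixture, preferably measured as a chromatographic area ratio. The following sketch shows that bookkeeping for the simplest case, a single compound tested against the greater-than-50% lower bound of “majority”. It does not cover the case where a group of compounds is jointly the principal product. The function names and the peak areas are hypothetical illustration values, not measurements from the document.

```python
def area_fractions(peak_areas):
    """Convert integrated peak areas into fractions of the total product area."""
    total = sum(peak_areas.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return {name: area / total for name, area in peak_areas.items()}


def classify(peak_areas, majority=0.5):
    """Label each compound by whether its area fraction exceeds the majority bound.

    Assumption: only single compounds are evaluated; the source also allows
    a group of compounds to form the principal product together.
    """
    fractions = area_fractions(peak_areas)
    return {
        name: "principal product" if fraction > majority else "by-product"
        for name, fraction in fractions.items()
    }


if __name__ == "__main__":
    # Hypothetical peak areas in arbitrary units.
    areas = {"compound A": 970.0, "compound B": 20.0, "compound C": 10.0}
    for name, label in classify(areas).items():
        print(name, label)
```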
[0035] Due to the reversibility of enzymatic reactions, the present invention relates to enzymatic or biocatalytic reactions described herein in both directions of the reaction, unless otherwise specified.
[0036] The term "stereoisomer" includes conformational isomers and, in particular, configurational isomers.
[0037] In general, the present invention includes all "isomeric forms" of the compounds described herein, such as "structural isomers" and "stereoisomers."
[0038] "Stereoisomeric forms" particularly include "stereoisomers" and mixtures thereof, such as configurational isomers (optical isomers) like enantiomers, or geometric isomers (diastereomers) like E and Z isomers, as well as combinations thereof. When several asymmetric centers are present in a single molecule, the present invention includes all combinations of the various configurations of these asymmetric centers, such as enantiomer pairs.
[0039] "Stereoselectivity" refers to the ability to produce a specific stereoisomer of a compound in a stereoisomerically pure form, or the ability to specifically convert one particular stereoisomer out of several using an enzyme-catalyzed method as described herein. More specifically, this means that the products of the present invention may be enriched in a particular stereoisomer, or the reactants may be depleted of a particular stereoisomer. This can be quantified by the purity parameter %ee, which is calculated according to the formula
$$\%ee = \frac{X_A - X_B}{X_A + X_B} \times 100$$
where $X_A$ and $X_B$ represent the molar ratios (mole fractions) of stereoisomers A and B.
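As a worked illustration of the %ee formula in the paragraph above, the following minimal Python sketch computes the value from the amounts of two stereoisomers A and B. The function name and the input values are hypothetical. The inputs may be mole fractions or any other quantities on a common molar scale, since the formula only depends on their ratio.

```python
def enantiomeric_excess(x_a: float, x_b: float) -> float:
    """Return %ee = (X_A - X_B) / (X_A + X_B) * 100 for stereoisomers A and B."""
    if x_a < 0 or x_b < 0:
        raise ValueError("amounts must be non-negative")
    total = x_a + x_b
    if total == 0:
        raise ValueError("at least one stereoisomer must be present")
    return (x_a - x_b) / total * 100.0


if __name__ == "__main__":
    # Hypothetical mixture: 98.5 parts of A and 1.5 parts of B.
    print(enantiomeric_excess(98.5, 1.5))
```

Note that %ee is not the same number as the percentage of the major stereoisomer: a mixture containing 97 parts of A and 3 parts of B has an ee of 94%, and a racemic mixture has an ee of 0%.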
[0040] Generally, the terms “selectively converting” or “increasing selectivity” mean that a particular stereoisomeric form of an unsaturated hydrocarbon, e.g., the E form, is converted (on a molar basis) in a larger proportion or amount than the corresponding other stereoisomeric form, e.g., the Z form, whether over the entire course of the reaction (i.e., between the start and end of the reaction), at a particular point in time of the reaction, or during an “interval” of the reaction. In particular, the selectivity can be observed during an “interval” corresponding to a conversion of 1–99%, 2–95%, 3–90%, 5–85%, 10–80%, 15–75%, 20–70%, 25–65%, 30–60%, or 40–50% of the initial amount of the substrate. A larger proportion or amount can be expressed, for example, in the following terms:
- a greater maximum yield of the isomer observed over the entire course of the reaction or the aforementioned interval of the reaction;
- a greater relative amount of the isomer at a defined percentage of substrate conversion; and/or
- the same relative amount of the isomer at a higher percentage of conversion.
Each of these is preferably observed in comparison to a reference method, the reference method being carried out under otherwise identical conditions using known chemical or biochemical means.
[0041] The "yield" and / or "conversion rate" of the reaction according to the present invention is determined, for example, over a defined period of 4, 6, 8, 10, 12, 16, 20, 24, 36, or 48 hours during which the reaction takes place. In particular, the reaction is carried out under precisely defined conditions, for example, the "standard conditions" as defined herein.
[0042] Where this disclosure refers to features, parameters, and ranges with varying degrees of preference (including general, not expressly preferred features, parameters, and ranges), any combination of two or more of those features, parameters, and ranges is included in this disclosure, regardless of their respective degrees of preference, unless otherwise stated.
[0043] Biochemical and biological terms: The term "domain" refers to a set of amino acids or a subsequence of amino acid residues that are conserved at specific positions along an alignment of evolutionarily related protein sequences. While amino acids at other positions may differ among protein homologs, amino acids that are highly conserved at a specific position within such a domain are likely to be essential for the structure, stability, or function of the protein. Because domains are identified by their high degree of conservation in the aligned sequences of a family of protein homologs, they can be used as identifiers to determine whether a polypeptide in question belongs to a previously identified polypeptide family.
[0044] The terms "motif," "consensus sequence," or "signature" refer to short, conserved regions in evolutionarily related protein sequences. Motifs are often highly conserved portions of a domain, but may also contain only a portion of a domain. Signatures are predictive models that describe a family, domain, or site of a protein.
[0045] The motif sequence can be described using the standard IUPAC single-letter codes for amino acids. Ambiguity is indicated by listing the acceptable amino acids in square brackets at the given position. For example, [LWI] represents L (leucine), W (tryptophan), or I (isoleucine). Each x represents a position at which any native amino acid residue may occur, independently of the others.
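The motif notation described above maps directly onto regular expressions. The sketch below translates a motif string into a Python `re` pattern and reports every match in a protein sequence with its 1-based start position. It assumes that "any native amino acid residue" means one of the 20 standard amino acids and that sequences are given in upper-case single-letter code. The function names and the example sequence are hypothetical, while the motif `PGG[WF][GYA]F` is the one given as SEQ ID NO: 248 in the summary.

```python
import re

# Assumption: "any native amino acid" is read as the 20 standard residues.
ANY_RESIDUE = "[ACDEFGHIKLMNPQRSTVWY]"


def motif_to_regex(motif: str) -> str:
    """Translate a motif such as 'PGG[WF][GYA]F' into a regular expression.

    Bracketed groups are kept as character classes, 'x' (or 'X') becomes a
    wildcard over the standard residues, and other letters match literally.
    """
    parts = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "[":
            end = motif.index("]", i)  # raises ValueError on an unclosed bracket
            parts.append(motif[i:end + 1])
            i = end + 1
        elif ch in "xX":
            parts.append(ANY_RESIDUE)
            i += 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return "".join(parts)


def find_motif(motif: str, sequence: str):
    """Return (1-based start, matched text) for every occurrence, overlaps included."""
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from the sequence listing.
    print(find_motif("PGG[WF][GYA]F", "MAAPGGWGFKK"))
```

The lookahead wrapper `(?=(...))` is used so that overlapping occurrences of a motif are all reported; a plain `finditer` would skip matches that overlap an earlier one.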
[0046] A "protein family" is defined as a group of proteins that share a common evolutionary origin, reflected in their related functions, sequence similarities, or similar primary, secondary, or tertiary structures. Proteins within a protein family are typically homologous, possessing similar structures of conserved functional domains and motifs.
[0047] Expert databases for identifying protein domains exist, such as SMART (http://smart.embl-heidelberg.de/smart/set_mode.cgi?GENOMIC=1) (Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95, 5857-5864; Letunic et al. (2020) Nucleic Acids Res 49, D458-D460), InterPro (Paysan-Lafosse et al., Nucleic Acids Research, Nov 2022; Mulder et al. (2003) Nucl. Acids. Res. 31, 315-318), or Pfam (Bateman et al., Nucleic Acids Research 30(1): 276-280 (2002)).
[0048] Useful tools for searching for or predicting protein domains or protein family signatures within protein sequences include, for example, the NCBI Conserved Domain Search tool (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) or the InterProScan tool (http://www.ebi.ac.uk/interpro/search/sequence/). Domains or motifs can also be identified using routine techniques such as sequence alignment.
[0049] The term "Pfam" refers to a large collection of protein domains and protein families maintained by the Pfam Consortium and available on several sponsored worldwide websites, including the InterPro Consortium website https://www.ebi.ac.uk/interpro/ (European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI)). The latest release of Pfam is Pfam 35.0 (November 2021), based on UniProt Reference Proteomes (El-Gebali S. et al., 2019, Nucleic Acids Res. 47, Database issue D427-D432). Pfam domains and families are identified using multiple sequence alignments and hidden Markov models (HMMs). A Pfam-A family or domain assignment is a high-quality assignment generated from a curated seed alignment of representative members of the protein family and a profile hidden Markov model built from that seed alignment (unless otherwise specified, a match of the queried protein to a Pfam domain or family is a Pfam-A match). A complete alignment of the family is then generated automatically using all identified sequences belonging to the family (Sonnhammer (1998) Nucleic Acids Research 26, 320-322; Bateman (2000) Nucleic Acids Research 26, 263-266; Bateman (2004) Nucleic Acids Research 32, Database Issue, D138-D141; Finn (2006) Nucleic Acids Research Database Issue 34, D247-251; Finn (2010) Nucleic Acids Research Database Issue 38, D211-222). For example, by accessing the Pfam database through one of the aforementioned reference websites, a protein sequence can be queried against the HMMs using HMMER homology search software (e.g., HMMER2, HMMER3, or later versions, hmmer.janelia.org/). A significant match, identifying that a queried protein belongs to a Pfam family (or has a specific Pfam domain), is one whose bit score is above the gathering threshold for the Pfam domain. The expected value (e-value) can also be used as a criterion for including a queried protein in a Pfam family or determining whether a queried protein has a specific Pfam domain, where the e-value is low, much less than 1.0, for example, 0.1 or less.
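The two acceptance criteria described above, a bit score relative to the domain's threshold and a low e-value, can be sketched as simple predicates. The example below is illustrative only: the `DomainHit` record, the function names, the threshold of 25.0, and the hit values are hypothetical and do not come from any Pfam entry or HMMER output. The comparison is written as "at or above" the threshold, which is an assumption, since the text says only "above".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    domain: str       # name of the matched domain model (hypothetical here)
    bit_score: float  # bit score reported by the search tool
    e_value: float    # expected number of chance hits scoring this well


def passes_bit_score(hit: DomainHit, threshold: float) -> bool:
    """True if the hit scores at or above the per-domain threshold (assumed inclusive)."""
    return hit.bit_score >= threshold


def passes_e_value(hit: DomainHit, max_e_value: float = 0.1) -> bool:
    """True if the e-value is at or below the cutoff (0.1 is the example in the text)."""
    return hit.e_value <= max_e_value


if __name__ == "__main__":
    # Hypothetical hits and threshold, for illustration only.
    hits = [
        DomainHit("hypothetical_domain", 35.2, 1e-8),
        DomainHit("hypothetical_domain", 10.0, 0.5),
    ]
    for hit in hits:
        print(hit.bit_score, passes_bit_score(hit, 25.0), passes_e_value(hit))
```

The two criteria are kept as separate functions because the text presents the bit-score threshold as the primary test and the e-value as an additional, optional one.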
[0050] InterPro is another database of protein families that provides classification of protein sequences into families, identifying functionally important domains and conserved sites (Blum et al, Nucleic Acids Res. 2021 49(D1):D344-D354). Protein signatures are provided by multiple databases such as Pfam or SMART (Simple Modular Architecture Research Tool). InterProScan is software that allows searching of protein and nucleic acid sequences against InterPro signatures.
[0051] The "E-value" (expected value) of a hit is the number of hits that would be expected, by chance alone, to have a score equal to or better than that of the hit. This means that a good E-value, one that provides a reliable prediction, is much smaller than 1. An E-value around 1 is what would be expected by chance. Therefore, the lower the E-value, the more significant the match to the domain. Only positive numbers are possible.
[0052] "Precursor" compounds or molecules of target compounds or molecules as described herein are preferably converted to the target compound by the enzymatic action of a suitable polypeptide that carries out at least one structural or functional modification to the precursor molecule. For example, a "diphosphate precursor" (e.g., a "terpenyl diphosphate precursor") is converted to the target compound (e.g., a terpene alcohol) by enzymatic removal of the diphosphate moiety, for example, by removal of the monophosphate or diphosphate group by a phosphatase enzyme. For example, an "acyclic precursor" (e.g., an "acyclic terpenyl precursor") can be converted to a cyclic target molecule (e.g., a cyclic terpene compound) by the action of a cyclase or synthase enzyme in one or more steps, regardless of the specific enzymatic mechanism of such enzymes.
[0053] Enzyme nomenclature, or enzyme classification (EC), established by the International Union of Biochemistry and Molecular Biology (IUBMB), is a system for naming and categorizing enzymes based on their catalytic activity and biochemical properties. It is widely used in biochemistry to classify enzymes according to their function. In the EC classification, each enzyme is assigned a number that reflects the reaction or type of reaction catalyzed by that enzyme.
[0054] Enzyme classifications can be looked up in the "ExplorEnz" database (https://www.enzyme-database.org/) or on the International Union of Biochemistry and Molecular Biology (IUBMB) website (https://iubmb.qmul.ac.uk). These databases can be searched for information on the classification, nomenclature, functions, and properties of specific enzymes or enzyme families.
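As a small illustration of the four-level structure of EC numbers, the following Python sketch parses identifiers of the kind used in this specification (for example 1.1.1.1, 3.1.1.- or 1.14.13.X) and checks whether a fully specified number falls within a partially specified class. The function names `parse_ec` and `ec_matches` are hypothetical, and reading both "-" and "X" as "unspecified level" is an assumption of this sketch.

```python
from typing import Optional, Tuple

EcNumber = Tuple[int, Optional[int], Optional[int], Optional[int]]


def parse_ec(ec: str) -> EcNumber:
    """Split an EC number such as '1.1.1.1' or 'EC 3.1.1.-' into four levels.

    Unspecified levels, written '-' or 'X', are returned as None.
    """
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {ec!r}")
    levels = [None if p in ("-", "X", "x") else int(p) for p in parts]
    if levels[0] is None:
        raise ValueError("the first level (enzyme class) must be specified")
    return tuple(levels)


def ec_matches(specific: str, pattern: str) -> bool:
    """True if a fully specified EC number falls within a partial pattern."""
    return all(p is None or s == p
               for s, p in zip(parse_ec(specific), parse_ec(pattern)))


print(parse_ec("EC 1.14.13.X"))               # (1, 14, 13, None)
print(ec_matches("3.1.1.1", "EC 3.1.1.-"))    # True
```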
[0055] The terms "biological function," "function," "biological activity," or "activity" of a terpenyl diphosphate synthase refer to the ability of a terpenyl diphosphate synthase, as described herein, to catalyze the formation of at least one terpenyl diphosphate from a corresponding precursor terpene.
[0056] The terms "biological function," "function," "biological activity," or "activity" of a terpenyl diphosphate phosphatase refer to the ability of a terpenyl diphosphate phosphatase, as described herein, to catalyze the removal of a diphosphate group from a terpenyl compound to form the corresponding terpene alcohol.
[0057] As used herein, the terms "host cell," "recombinant cell," or "transformed cell" refer to a cell (or organism) that has been modified to harbor at least one nucleic acid molecule, for example, a recombinant gene encoding a desired protein or a nucleic acid sequence that, upon transcription, yields at least one functional polypeptide of the present invention. The host cell is, in particular, a bacterial cell, a fungal cell, or a plant cell or plant. The host cell may contain the recombinant gene, or several genes organized, for example, as an operon, integrated into the nuclear or organelle genome of the host cell. Alternatively, the host cell may contain the recombinant gene extrachromosomally. Methods for introducing recombinant nucleic acid sequences into such host cells are well known in the art and are routine laboratory methods that need no further description herein.
[0058] The term "living organism" refers to non-human multicellular or single-celled organisms, such as plants or microorganisms. In particular, microorganisms include bacteria, yeasts, algae, or fungi.
[0059] The term "plant" is used interchangeably to include plant cells, including plant protoplasts, plant tissues, plant cell tissue cultures that give rise to regenerated plants, parts of plants, and plant organs such as roots, stems, leaves, flowers, pollen, ovules, embryos, and fruits. Any plant can be used to carry out the method of an embodiment of this specification.
[Detailed Description]
[0060] As described above, many sesquiterpene molecules are known for their flavor and fragrance properties, as well as their cosmetic, pharmaceutical, and antimicrobial effects. Numerous sesquiterpene hydrocarbons and sesquiterpenoids have been identified. Commercially relevant compounds include Cetalox® ((3aRS,9aRS,9bRS)-3a,6,6,9a-tetramethyl-1,2,3a,4,6,7,8,9,9a,9b-decahydronaphtho[2,1-b]furan; manufactured by Firmenich SA, Geneva, Switzerland) and Ambrox® ((3aR,5aS,9aS,9bR)-3a,6,6,9a-tetramethyldodecahydronaphtho[2,1-b]furan; manufactured by Firmenich SA, Geneva, Switzerland), which mimic ambroxide.
[0061] The inventors sought to identify an improved method for preparing the compound of formula (I), also known as 3a,6,6,9a-tetramethyldodecahydronaphtho[2,1-b]furan.
[0062] To arrive at an improved method for preparing the compound of formula (I), the inventors developed a detailed understanding of a biochemical pathway that produces this compound from a precursor compound by a multi-enzyme reaction. This is the first instance in which the compound has been prepared by such a stepwise multi-enzyme reaction, and it represents a significant scientific and commercial advance in the preparation of the sesquiterpene compound of formula (I). In particular, neither the combination of enzymes nor their order in the method has been described in the prior art.
[0063] This invention includes an in vivo method for preparing the compound of formula (I) in recombinant cells. This is the first time that a complete in vivo method for producing the compound of formula (I) has been demonstrated by creating a biosynthetic pathway to the compound of formula (I) in recombinant cells.
[0064] Therefore, this invention provides a means to solve the problem of preparing such compounds.
[0065] A first aspect of the present invention relates to a method for preparing a compound of formula (I), in the form of any one of its stereoisomers or a mixture thereof, [Chem.] the method comprising the steps of:
(i) contacting a compound of formula (II), in the form of any one of its stereoisomers or a mixture thereof, [Chem.] with a polypeptide having ADH enzyme activity to produce a compound of formula (III);
(ii) contacting the compound of formula (III), in the form of any one of its stereoisomers or a mixture thereof, [Chem.] with a polypeptide having enal cleavage enzyme activity to produce a compound of formula (IV);
(iii) contacting the compound of formula (IV), in the form of any one of its stereoisomers or a mixture thereof, [Chem.] with a polypeptide having BVMO enzyme activity to produce a compound of formula (V);
(iv) contacting the compound of formula (V), in the form of any one of its stereoisomers or a mixture thereof, [Chem.] with a polypeptide having esterase enzyme activity to produce a compound of formula (VI); and
(v) contacting the compound of formula (VI), in the form of any one of its stereoisomers or a mixture thereof, [Chem.] with a polypeptide having terpene cyclase enzyme activity to produce the compound of formula (I).
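The five-step sequence of the first aspect can be summarized as a simple data structure. The following Python sketch is illustrative only: the names `Step`, `PATHWAY`, `is_linear`, and `route` are hypothetical, and only the formula numbers and enzyme activities are taken from the text.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    label: str
    substrate: str        # formula number of the compound contacted with the enzyme
    enzyme_activity: str  # activity of the polypeptide used in this step
    product: str          # formula number of the compound produced


PATHWAY: List[Step] = [
    Step("i", "II", "ADH", "III"),
    Step("ii", "III", "enal cleavage enzyme", "IV"),
    Step("iii", "IV", "BVMO", "V"),
    Step("iv", "V", "esterase", "VI"),
    Step("v", "VI", "terpene cyclase", "I"),
]


def is_linear(pathway: List[Step]) -> bool:
    """Check that each step's product is the substrate of the next step."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


def route(pathway: List[Step]) -> str:
    """Render the pathway as a chain of formula numbers."""
    return " -> ".join([pathway[0].substrate] + [s.product for s in pathway])


print(is_linear(PATHWAY))  # True
print(route(PATHWAY))      # II -> III -> IV -> V -> VI -> I
```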
[0066] For clarity, the expression "any one of (the) stereoisomers" or a similar expression has its ordinary meaning as understood by those skilled in the art, that is, the compound of the present invention may be a pure stereoisomer, such as an enantiomer or diastereomer (for example, with respect to the E or Z configuration of a double bond, or the R or S configuration of a chiral carbon center).
[0067] According to any aspect or embodiment of the present invention, the compound may be in the form of one or more of its stereoisomers or mixtures thereof. For example, the present invention relates to a composition of matter comprising one or more forms of the compound of formula (I) having the same chemical structure but different configurations of the chiral centers.
[0068] In particular, compound (I) may be in the form of a mixture containing stereoisomer Ia (formula Ia), wherein stereoisomer Ia accounts for at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, or 99% or more of the entire mixture. [Chem.]
[0069] Alternatively, compound (I) may be in the form of a mixture containing stereoisomer Ib (formula Ib), wherein stereoisomer Ib accounts for at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, or 99% or more of the entire mixture. [Chem.]
[0070] In one embodiment, more than 97% of the compound of formula (I) is in the form of formula (Ia) and/or (Ib).
[0071] According to any aspect or embodiment of the present invention, the compounds of formulas (II) to (VI) may be in the form of their E or Z isomers or mixtures thereof. In particular, any one of the compounds of formulas (II) to (VI) may be in the form of a mixture comprising the E and Z stereoisomers, wherein stereoisomer IIa, IIIa, IVa, Va, or VIa constitutes at least 50%, or even more than 75%, of the entire mixture (i.e., the E/Z ratio of the mixture is from 75/25 to 100/0).
[Step (i) of the method of the present invention]
[0072] Step (i) of the present invention relates to contacting a compound of formula (II) with a polypeptide having ADH enzyme activity. [Chem.]
[0073] The compound of formula (II) is also known as geranylgeraniol, i.e., 3,7,11,15-tetramethylhexadeca-2,6,10,14-tetraen-1-ol; CAS number 7614-21-3.
[0074] The compound of formula (II) may exist as any one of its stereoisomers or as a mixture thereof. Specifically, the compound may have the following structures and isomeric forms:
[Chem.] (2E,6E,10E)-geranylgeraniol; (2E,6E,10E)-3,7,11,15-tetramethylhexadeca-2,6,10,14-tetraen-1-ol; CAS number 24034-73-9.
[Chem.] (2Z,6E,10E)-geranylgeraniol; (2Z,6E,10E)-3,7,11,15-tetramethylhexadeca-2,6,10,14-tetraen-1-ol; CAS number 57784-25-5.
[Chem.] (2E,6Z,10E)-geranylgeraniol; (2E,6Z,10E)-3,7,11,15-tetramethylhexadeca-2,6,10,14-tetraen-1-ol; CAS number 83689-05-8.
[Chem.] (2E,6E,10Z)-geranylgeraniol; (2E,6E,10Z)-3,7,11,15-tetramethylhexadeca-2,6,10,14-tetraen-1-ol; CAS number 68690-77-7.
[Chem.] (2Z,6Z,10E)-geranylgeraniol; (2Z,6Z,10E)-3,7,11,15-tetramethylhexadeca-2,6,10,14-tetraen-1-ol; CAS number 83689-06-9.
[Chem.] (2Z,6E,10Z)-geranylgeraniol; (2Z,6E,10Z)-3,7,11,15-tetramethylhexadeca-2,6,10,14-tetraen-1-ol; CAS number 83689-07-0.
[Chem.] (2E,6Z,10Z)-geranylgeraniol; (2E,6Z,10Z)-3,7,11,15-tetramethylhexadeca-2,6,10,14-tetraen-1-ol; CAS number 83689-08-1.
[Chem.] (2Z,6Z,10Z)-geranylgeraniol; (2Z,6Z,10Z)-3,7,11,15-tetramethylhexadeca-2,6,10,14-tetraen-1-ol; CAS number 1945-42-2.
[0075] Step (i) relates to the use of a polypeptide having ADH enzyme activity.
[0076] "Alcohol dehydrogenase" (ADH) in the context of the present invention refers to a polypeptide having the ability to oxidize an alcohol to the corresponding aldehyde in the presence of NAD+ or NADP+ as a cofactor. Such enzymes are members of EC class 1.1.1.1 (NAD+-dependent) or 1.1.1.2 (NADP+-dependent). More specifically, the ADH of the present invention has the ability to oxidize linear terpenoid alcohols to their respective carbonyl compounds, particularly the corresponding aldehydes, for example, oxidizing geranylgeraniol to geranylgeranial. As used herein, the ADH may be either endogenously present or exogenous in the respective biocatalytic process.
[0077] "Alcohol dehydrogenase enzyme activity" is determined under "standard conditions" as described below herein. It can be determined using host cells expressing a recombinant alcohol dehydrogenase (ADH) polypeptide, disrupted cells expressing the ADH polypeptide, fractions thereof, or an enriched or purified ADH polypeptide, in a culture medium or reaction medium, preferably buffered, having a pH in the range of 6 to 11, preferably 7 to 9, in the presence of a reference substrate, here in particular geranylgeraniol, either added at an initial concentration in the range of 1 to 100 μM, preferably 5 to 50 μM, in particular 30 to 40 μM, or endogenously produced by the host cell. For in vitro assays, a cofactor selected from NAD+ and NADP+ should be added at an appropriate concentration, which can be readily determined. The conversion reaction to form the respective aldehyde compound, such as geranylgeranial, is carried out for 10 minutes to 5 hours, preferably about 1 to 2 hours. The oxidation products can then be determined conventionally after extraction with an organic solvent such as ethyl acetate.
[0078] Further methods for evaluating the oxidation of geranylgeraniol to geranylgeranial by ADH are described in Example 3.
[0079] In a preferred embodiment of the present invention, the polypeptide having ADH enzyme activity comprises at least one sequence motif selected from:
CHTD (SEQ ID NO: 228), as found, for example, in SEQ ID NOs: 11, 12, 13, 14, 17, 18, 19, or 20;
GHEGxG (SEQ ID NO: 229), as found, for example, in SEQ ID NOs: 11, 12, 13, 14, 17, 18, 19, or 20;
LxCGxxTGxGA (SEQ ID NO: 230), as found, for example, in SEQ ID NOs: 11, 12, 13, 14, 17, 18, 19, or 20;
Gx[VI]GL (SEQ ID NO: 231), as found, for example, in SEQ ID NOs: 11, 12, 13, 14, 15, 17, 18, 19, or 20;
LxxxG[LVI][PA] (SEQ ID NO: 232), as found, for example, in SEQ ID NOs: 11, 12, 15, 17, 18, 19, or 20;
GxVxAI (SEQ ID NO: 233), as found, for example, in SEQ ID NO: 16 or 21; and
YxATKxA (SEQ ID NO: 234), as found, for example, in SEQ ID NO: 16 or 21.
In the motifs above, each residue x independently represents any natural amino acid residue in a polypeptide having ADH activity. Ambiguity is indicated by listing the acceptable amino acids for a given position within square brackets. For example, [VI] represents V (valine) or I (isoleucine).
[0080] Preferably, the polypeptide having ADH enzyme activity includes the CHTD (SEQ ID NO: 228), GHEGxG (SEQ ID NO: 229), LxCGxxTGxGA (SEQ ID NO: 230), and Gx[VI]GL (SEQ ID NO: 231) motifs, such as those found in SEQ ID NOs: 11, 12, 13, 14, 17, 18, 19, or 20.
[0081] Preferably, the polypeptide having ADH activity includes the CHTD (SEQ ID NO: 228), GHEGxG (SEQ ID NO: 229), LxCGxxTGxGA (SEQ ID NO: 230), Gx[VI]GL (SEQ ID NO: 231), and LxxxG[LVI][PA] (SEQ ID NO: 232) motifs, such as those found in SEQ ID NOs: 11, 12, 17, 18, 19, or 20.
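The motif notation used above maps directly onto regular expressions, which gives a simple way to check a candidate sequence for the motif combinations described. The following Python sketch is an illustration under stated assumptions: `motif_to_regex`, `ADH_MOTIFS`, and `has_all_motifs` are hypothetical names, "x" is read as any of the 20 standard amino acids, and the toy peptide is invented for the example and is not one of the sequences of the sequence listing.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed meaning of residue 'x'


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate the motif notation of the text into a compiled regex.

    'x' becomes any standard residue; bracketed alternatives such as
    [VI] are already valid regex character classes and are kept as-is.
    """
    return re.compile(motif.replace("x", f"[{AMINO_ACIDS}]"))


# The four motifs listed together in paragraph [0080].
ADH_MOTIFS = {
    "SEQ ID NO: 228": "CHTD",
    "SEQ ID NO: 229": "GHEGxG",
    "SEQ ID NO: 230": "LxCGxxTGxGA",
    "SEQ ID NO: 231": "Gx[VI]GL",
}


def has_all_motifs(sequence: str, motifs=ADH_MOTIFS) -> bool:
    """True if every motif occurs somewhere in the sequence."""
    return all(motif_to_regex(m).search(sequence) is not None
               for m in motifs.values())


# Invented toy peptide assembled from one instance of each motif.
toy = "M" + "CHTD" + "AA" + "GHEGAG" + "AA" + "LACGAATGAGA" + "AA" + "GAVGL" + "K"
print(has_all_motifs(toy))              # True
print(has_all_motifs("MCHTDAAGHEGAG"))  # False
```

A motif search of this kind only tests for the presence of the patterns; it says nothing about enzyme activity, which the text defines separately.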
[0082] A preferred embodiment of the present invention is a polypeptide having ADH enzyme activity that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 11 to 21. Preferably, the polypeptide has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with the amino acid sequence described in SEQ ID NOs: 11 or 21. Preferably, the polypeptide has the amino acid sequence described in SEQ ID NOs: 11 or 21.
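For orientation, percent sequence identity over an existing pairwise alignment can be computed as in the following Python sketch. It is a simplified illustration: the function names are hypothetical, the two sequences are assumed to be already aligned to equal length with "-" marking gaps, and counting gap columns in the denominator is only one of several conventions. The example peptides are invented.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns containing a gap ('-') count toward the alignment length
    but never as identical positions (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("empty alignment")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


# Identity levels listed in the text.
THRESHOLDS = (50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 96, 97, 98, 99)


def highest_threshold_met(identity: float):
    """Return the highest listed threshold that the identity reaches, or None."""
    met = [t for t in THRESHOLDS if identity >= t]
    return max(met) if met else None


identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")  # 9 of 10 columns identical
print(identity)                         # 90.0
print(highest_threshold_met(identity))  # 90
```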
[Step (ii) of the method of the present invention]
[0083] Step (ii) of the present invention relates to contacting a compound of formula (III) with a polypeptide having enal cleavage enzyme activity. [Chem.]
[0084] The compound of formula (III) is also known as geranylgeranial, i.e., 3,7,11,15-tetramethylhexadeca-2,6,10,14-tetraenal; CAS number 32480-11-8.
[0085] The compound of formula (III) may exist as any one of its stereoisomers or as a mixture thereof. Specifically, the compound may have the following structures and isomeric forms:
[Chem.] (2E,6E,10E)-geranylgeranial; (2E,6E,10E)-3,7,11,15-tetramethylhexadeca-2,6,10,14-tetraenal; CAS number 13920-12-2.
[Chem.] (2Z,6E,10E)-geranylgeranial; (2Z,6E,10E)-3,7,11,15-tetramethylhexadeca-2,6,10,14-tetraenal; CAS number 57784-38-0.
[Chem.] (2Z,6Z,10E)-geranylgeranial; (2Z,6Z,10E)-3,7,11,15-tetramethyl-2,6,10,14-hexadecatetraenal.
[Chem.] (2Z,6E,10Z)-geranylgeranial; (2Z,6E,10Z)-3,7,11,15-tetramethyl-2,6,10,14-hexadecatetraenal.
[Chem.] (2Z,6Z,10E)-geranylgeranial; (2Z,6Z,10E)-3,7,11,15-tetramethyl-2,6,10,14-hexadecatetraenal.
[Chem.] (2Z,6E,10Z)-geranylgeranial; (2Z,6E,10Z)-3,7,11,15-tetramethyl-2,6,10,14-hexadecatetraenal.
[Chem.] (2E,6Z,10Z)-geranylgeranial; (2E,6Z,10Z)-3,7,11,15-tetramethyl-2,6,10,14-hexadecatetraenal.
[Chem.] (2Z,6Z,10Z)-geranylgeranial; (2Z,6Z,10Z)-3,7,11,15-tetramethyl-2,6,10,14-hexadecatetraenal.
[0086] Step (ii) relates to the use of polypeptides having enal cleavage enzyme activity.
[0087] In the context of the present invention, "enal cleavage enzyme," "enal cleavage protein," or "enal cleavage polypeptide" refers to an "α,β-unsaturated aldehyde carbon-carbon double bond cleavage enzyme," which may also be called an "α,β-unsaturated aldehyde C=C bond cleavage enzyme," an "α,β-unsaturated aldehyde C=C cleavage enzyme," or an "enal C=C cleavage enzyme." The enal cleavage proteins of the present invention may also be described as members of the "DUF4334 protein family" and/or the "GXWXG protein family" (SEQ ID NO: 263) based on protein domain organization. Examples of such enzymes can be found in the literature, e.g., International Publication No. WO 2021/005097.
[0088] More specifically, the enal cleavage enzyme of the present invention has the ability to cleave terpenoid compounds containing α,β-unsaturated aldehyde groups, particularly geranylgeranial, to farnesylacetone.
[0089] "Enal cleavage enzyme activity" is determined under "standard conditions" as described below herein. It can be determined using host cells expressing a recombinant enal cleavage polypeptide, disrupted cells expressing the enal cleavage polypeptide, fractions thereof, or an enriched or purified enal cleavage polypeptide, in a culture medium or reaction medium, preferably buffered, having a pH in the range of 6 to 11, preferably 7 to 9, in the presence of a reference substrate, here in particular geranylgeranial, either added at an initial concentration in the range of 1 to 100 μM, preferably 5 to 50 μM, in particular 30 to 40 μM, or endogenously produced by the host cell. The conversion reaction to form the respective cleavage product, such as farnesylacetone, is carried out for 10 minutes to 5 hours, preferably about 1 to 2 hours. The cleavage products can then be determined conventionally after extraction with an organic solvent such as ethyl acetate.
[0090] The polypeptide having enal cleavage enzyme activity may be selected from the group of polypeptides comprising:
a) at least one DUF4334 protein family domain having Pfam ID number PF14232 (in particular, within the C-terminal region of the amino acid sequence);
b) at least one GXWXG (SEQ ID NO: 263) protein family domain having Pfam ID number PF14231 (in particular, within the N-terminal region of the amino acid sequence); and/or
c) a domain retaining at least 90% sequence identity with PF14232 or PF14231.
[0091] In particular, a polypeptide of the present invention having enal cleavage enzyme activity is identified as a member of the DUF4334 protein family comprising the domain PF14232 if it matches said domain with an e-value of less than 1 × 10⁻⁵, less than 1 × 10⁻¹⁰, less than 1 × 10⁻¹⁵, less than 1 × 10⁻²⁰, less than 1 × 10⁻²⁵, less than 1 × 10⁻³⁰, or 1 × 10⁻³⁵ or less, in particular an e-value in the range of 1 × 10⁻²⁰ to 1 × 10⁻³², more particularly 1 × 10⁻²⁵ to 1 × 10⁻³¹.
[0092] In particular, a polypeptide having enal cleavage enzyme activity is identified as a member of the GXWXG (SEQ ID NO: 263) protein family comprising the domain PF14231 if it matches said domain with an e-value of less than 1 × 10⁻⁵, less than 1 × 10⁻¹⁰, less than 1 × 10⁻¹⁵, less than 1 × 10⁻²⁰, less than 1 × 10⁻²⁵, less than 1 × 10⁻³⁰, or 1 × 10⁻³⁵ or less, in particular an e-value in the range of 1 × 10⁻²⁰ to 1 × 10⁻³⁰.
[0093] The query sequence is the sequence of a polypeptide possessing Enal cleavage enzyme activity.
[0094] For example, the following websites can be used to search for and calculate such e-values: http://www.ebi.ac.uk/Tools/hmmer/search/hmmscan or http://www.ebi.ac.uk/Tools/pfa/pfamscan/.
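Once an e-value has been obtained for a query sequence against PF14232 or PF14231, the graded limits and preferred ranges given above can be applied as in the following Python sketch. The names `TIERS`, `PREFERRED_RANGES`, `strictest_tier`, and `in_preferred_range` are hypothetical, and the example e-value is invented; only the numerical limits are taken from the text.

```python
from typing import Optional

# (limit, inclusive) pairs: "less than" limits are exclusive,
# the final "or less" limit is inclusive.
TIERS = [(1e-5, False), (1e-10, False), (1e-15, False), (1e-20, False),
         (1e-25, False), (1e-30, False), (1e-35, True)]

# Broad preferred e-value ranges stated for each domain, as (low, high).
PREFERRED_RANGES = {
    "PF14232": (1e-32, 1e-20),
    "PF14231": (1e-30, 1e-20),
}


def strictest_tier(e_value: float) -> Optional[float]:
    """Return the smallest listed limit that the e-value satisfies, or None."""
    if e_value < 0:
        raise ValueError("e-values cannot be negative")
    satisfied = [limit for limit, inclusive in TIERS
                 if e_value < limit or (inclusive and e_value == limit)]
    return min(satisfied) if satisfied else None


def in_preferred_range(pfam_id: str, e_value: float) -> bool:
    """True if the e-value lies within the preferred range for the domain."""
    low, high = PREFERRED_RANGES[pfam_id]
    return low <= e_value <= high


example = 3e-27  # invented value for illustration
print(strictest_tier(example))                 # 1e-25
print(in_preferred_range("PF14232", example))  # True
```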
[0095] Furthermore, the polypeptide having enal cleavage enzyme activity may be selected from the group of polypeptides comprising at least one sequence motif/domain selected from:
G-[Y or "-"]-xWxGxx-[F, L, or I]-x-[T, S, or R]-G-[H or D] (also represented as GxxWxGxxxxxGx) as set forth in SEQ ID NO: 235, or any partial motif thereof comprising up to 10 or up to 5 consecutive amino acid residues corresponding to the residues at positions 1-8 or 9-13 of SEQ ID NO: 235, wherein X2 may be Y or may be absent, X3, X5, X7, and X8 may each be any naturally occurring amino acid, X9 may be F, L, or I, X10 may be any naturally occurring amino acid, X11 may be R, S, or T, and X13 may be H or D;
W-[Y, A, or V]-GKx-[F or Y]-x-[S or D] (also represented as WxGKxxxx) as set forth in SEQ ID NO: 236, or any partial motif thereof comprising up to four consecutive amino acid residues corresponding to the residues at positions 1-4 or 5-8 of SEQ ID NO: 236, wherein X2 may be A, V, or Y, X5 may be any naturally occurring amino acid, X6 may be F or Y, X7 may be any naturally occurring amino acid, and X8 may be D or S;
[G or S]-x-[A or G]-x-[L or V]-xxxx-[F, Y, or L]-RGxV (also represented as xxxxxxxxxxRGxV) as set forth in SEQ ID NO: 237, or any partial motif thereof comprising up to 10 or up to 5 consecutive amino acid residues corresponding to the residues at positions 1-8 or 9-14 of SEQ ID NO: 237, wherein X1 may be G or S, X2 may be any naturally occurring amino acid, X3 may be A or G, X4 may be any naturally occurring amino acid, X5 may be L or V, X6, X7, X8, and X9 may each be any naturally occurring amino acid, X10 may be F, L, or Y, and X13 may be any naturally occurring amino acid; and
[M or L]-[V or I]-YDxxP-[I or V]-xD-[H or S]-[F or L] (also represented as xxYDxxPxxDxx) as set forth in SEQ ID NO: 238, or any partial motif thereof comprising up to 10 or up to 5 consecutive amino acid residues corresponding, for example, to the residues at positions 1-6 or 7-12 of SEQ ID NO: 238, wherein X1 may be L or M, X2 may be I or V, X5 and X6 may each be any naturally occurring amino acid, X8 may be I or V, X9 may be any naturally occurring amino acid, X11 may be H or S, and X12 may be F or L.
The numbering of X (e.g., X2) corresponds to its position in the associated sequence; for example, X2 corresponds to X at position 2 of the associated sequence. In the aforementioned motifs, each residue x independently represents any natural amino acid residue, and optionally, in each of the aforementioned motifs, one, two, three, four, or five amino acid residues different from residue x may be modified, for example, by amino acid substitution, particularly by conservative substitution, provided that the enzyme retains at least analytically detectable enal cleavage activity. The meaning of the square brackets is as described above.
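The first two motifs above, including the optional residue at position X2 of SEQ ID NO: 235, can be written as regular expressions. The following Python sketch is illustrative only: the pattern names are hypothetical, "x" is read as any of the 20 standard amino acids, and the test peptides are invented rather than taken from the sequence listing.

```python
import re

ANY = "[ACDEFGHIKLMNPQRSTVWY]"  # assumed reading of residue 'x'

# SEQ ID NO: 235: G-[Y or "-"]-x-W-x-G-x-x-[FLI]-x-[TSR]-G-[HD].
# "Y?" encodes that X2 is either Y or absent.
MOTIF_235 = re.compile(f"GY?{ANY}W{ANY}G{ANY}{{2}}[FLI]{ANY}[TSR]G[HD]")

# SEQ ID NO: 236: W-[YAV]-G-K-x-[FY]-x-[SD].
MOTIF_236 = re.compile(f"W[YAV]GK{ANY}[FY]{ANY}[SD]")


def find_motif(pattern: re.Pattern, sequence: str):
    """Return the (start, end) span of the first motif occurrence, or None."""
    match = pattern.search(sequence)
    return match.span() if match else None


print(find_motif(MOTIF_235, "GYAWAGAAFATGH"))  # (0, 13): X2 present as Y
print(find_motif(MOTIF_235, "GAWAGAAFATGH"))   # (0, 12): X2 absent
print(find_motif(MOTIF_235, "GYAWAGAAPATGH"))  # None: P is not allowed at X9
print(find_motif(MOTIF_236, "AAWYGKAFASAA"))   # (2, 10)
```

The partial motifs and the permitted substitutions described in the text are not covered by these patterns.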
[0096] A preferred embodiment of the present invention is a polypeptide having enal cleavage activity that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to the sequence described in SEQ ID NO: 22.
[Step (iii) of the method of the present invention]
[0097] Step (iii) of the present invention relates to contacting a compound of formula (IV) with a polypeptide having BVMO enzyme activity. [Chem.]
[0098] The compound of formula (IV) is farnesylacetone, also known as 6,10,14-trimethylpentadeca-5,9,13-trien-2-one; CAS number 762-29-8.
[0099] The compound of formula (IV) may exist as any one of its stereoisomers or as a mixture thereof. Specifically, the compound may have the following structures and isomeric forms:
[Chem.] (5E,9E)-farnesylacetone; (5E,9E)-6,10,14-trimethylpentadeca-5,9,13-trien-2-one; CAS number 1117-52-8.
[Chem.] (5Z,9E)-farnesylacetone; (5Z,9E)-6,10,14-trimethylpentadeca-5,9,13-trien-2-one; CAS number 1117-51-7.
[Chem.] (5E,9Z)-farnesylacetone; (5E,9Z)-6,10,14-trimethylpentadeca-5,9,13-trien-2-one; CAS number 3053-35-3.
[Chem.] (5Z,9Z)-farnesylacetone; (5Z,9Z)-6,10,14-trimethylpentadeca-5,9,13-trien-2-one; CAS number 3796-69-8.
[0100] Step (iii) relates to the use of polypeptides having BVMO enzyme activity.
[0101] Baeyer-Villiger monooxygenases (BVMOs) are flavoenzymes belonging to a class of polypeptides having oxidoreductase activity (EC 1.14.13.X). They catalyze the oxidation of linear or cyclic (aromatic or non-aromatic) aldehydes or ketones to the corresponding esters or lactones, in close analogy to the chemical Baeyer-Villiger oxidation. During the enzymatic oxidation, one atom of molecular oxygen is inserted into a carbon-carbon bond of the non-activated carbonyl compound. BVMOs require or accept NADPH or NADH as cofactor. They also require molecular oxygen as a co-substrate. More specifically, the BVMOs of the present invention have the ability to oxidize, for example, linear terpenoid carbonyl compounds, particularly terpene-derived aldehydes or ketones such as farnesylacetone, to the respective esters.
[0102] "BVMO enzyme activity" is determined under "standard conditions" as described below herein. It can be determined using host cells expressing a recombinant BVMO, disrupted cells expressing the BVMO, fractions thereof, or an enriched or purified BVMO enzyme, in a culture medium or reaction medium, preferably buffered, having a pH in the range of 6 to 11, preferably 7 to 9, at a temperature in the range of about 20 to 45°C, for example about 25 to 40°C, preferably 25 to 32°C, in the presence of molecular oxygen and of a reference substrate, here in particular farnesylacetone, either added at an initial concentration in the range of 1 to 100 μM, preferably 5 to 50 μM, in particular 30 to 40 μM, or endogenously produced by the host cell. For in vitro assays, a cofactor selected from NADH and NADPH must be added at an appropriate concentration. The conversion reaction to form the respective enzyme product, for example homofarnesylacetate in the case of farnesylacetone, is carried out for 10 minutes to 5 hours, preferably about 1 to 2 hours. The BVMO product can then be determined conventionally after extraction with an organic solvent such as ethyl acetate.
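The broadest ranges given for the BVMO "standard conditions" can be collected in a small configuration check, for instance when recording assay parameters. The following Python sketch is illustrative only: `AssayConditions`, `STANDARD_RANGES`, and `out_of_range` are hypothetical names, only the outermost limits stated in the text are encoded (the temperature limits are stated as approximate there), and the example values are invented.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AssayConditions:
    ph: float
    temperature_c: float   # reaction temperature in degrees Celsius
    substrate_um: float    # initial reference-substrate concentration in µM
    duration_min: float    # reaction time in minutes


# (min, max) taken from the broadest limits stated in the text.
STANDARD_RANGES = {
    "ph": (6.0, 11.0),
    "temperature_c": (20.0, 45.0),
    "substrate_um": (1.0, 100.0),
    "duration_min": (10.0, 300.0),  # 10 minutes to 5 hours
}


def out_of_range(cond: AssayConditions) -> List[str]:
    """Return the names of the parameters lying outside the stated ranges."""
    return [name for name, (low, high) in STANDARD_RANGES.items()
            if not low <= getattr(cond, name) <= high]


print(out_of_range(AssayConditions(8.0, 30.0, 35.0, 90.0)))   # []
print(out_of_range(AssayConditions(5.0, 30.0, 35.0, 600.0)))  # ['ph', 'duration_min']
```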
[0103] Further methods for screening for BVMO and evaluating the conversion of farnesylacetone to homofarnesylacetate are described in Example 2.
[0104] The polypeptide having BVMO enzyme activity may be selected from:
(1) the group of polypeptides comprising, in their amino acid sequence, a flavin-containing monooxygenase (FMO) protein family domain having Pfam ID number PF00743, or a domain retaining at least 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with PF00743. In particular, a polypeptide having BVMO activity is identified as a member of the FMO protein family comprising the domain PF00743 if it matches said domain with an e-value of less than 1 × 10⁻⁵, less than 1 × 10⁻¹⁰, 1 × 10⁻¹⁵ or less, or 1 × 10⁻¹⁸ or less, in particular an e-value in the range of 1 × 10⁻¹⁰ to 1 × 10⁻¹⁸, more particularly 1 × 10⁻¹⁴ to 1 × 10⁻¹⁷. The sequence of the polypeptide having BVMO activity is applied as the query sequence. For example, the following websites can be used to search for and calculate such e-values: http://www.ebi.ac.uk/Tools/hmmer/search/hmmscan or http://www.ebi.ac.uk/Tools/pfa/pfamscan/; and/or
(2) the group of polypeptides comprising at least one sequence motif/domain selected from:
GxGxxG (SEQ ID NO: 239), as found, for example, in any of SEQ ID NOs: 23-26, wherein X4 may be any naturally occurring amino acid, particularly A or I (the numbering of X corresponds to its position in the sequence);
[GS]GxWxxxxYPGxxxD (SEQ ID NO: 240), as found, for example, in any of SEQ ID NOs: 23-26;
Gxxx[FY]xGxxx[HS]xxxW (SEQ ID NO: 241), as found, for example, in any of SEQ ID NOs: 23-26; and
[KQ]x[VI]xx[IV]GxG (SEQ ID NO: 242), as found, for example, in any of SEQ ID NOs: 23-26.
In the aforementioned motifs, each residue x independently represents any natural amino acid residue, and optionally, in each of the aforementioned motifs, 1, 2, 3, 4, or 5 conserved amino acid residues (i.e., residues different from residue x) may be modified, for example, by amino acid substitution, particularly by conservative substitution, provided that the enzyme retains BVMO enzyme activity to at least an analytically detectable degree. The meaning of the square brackets is as described above.
[0105] A preferred embodiment of the present invention is a polypeptide having BVMO enzyme activity that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 23 to 26. A preferred embodiment of the present invention is a polypeptide having BVMO enzyme activity that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NOs: 25 or 26. Preferably, the polypeptide has the amino acid sequence described in SEQ ID NOs: 25 or 26.
[0106] Alternatively, a polypeptide having BVMO enzyme activity is one that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences listed in SEQ ID NOs: 216-227.
[Step (iv) of the method of the present invention]
[0107] Step (iv) of the present invention relates to contacting a compound of formula (V) with a polypeptide having esterase enzyme activity. [Chem.]
[0108] The compound of formula (V) is also known as homofarnesylacetate, i.e., 4,8,12-trimethyltrideca-3,7,11-trien-1-yl acetate; CAS number 109813-25-4.
[0109] The compound of formula (V) may exist as any one of its stereoisomers or as a mixture thereof.
[0110] Specifically, the compound may have the following structures and isomeric forms:
[Chem.] (3E,7E)-homofarnesylacetate; (3E,7E)-4,8,12-trimethyltrideca-3,7,11-trien-1-yl acetate; CAS number 944346-19-4.
[Chem.] (3Z,7E)-homofarnesylacetate; (3Z,7E)-4,8,12-trimethyltrideca-3,7,11-trien-1-yl acetate; CAS number 1467099-77-9.
[Chem.] (3E,7Z)-homofarnesylacetate; (3E,7Z)-4,8,12-trimethyltrideca-3,7,11-trien-1-yl acetate.
[Chem.] (3Z,7Z)-homofarnesylacetate; (3Z,7Z)-4,8,12-trimethyltrideca-3,7,11-trien-1-yl acetate.
[0111] Step (iv) relates to the use of polypeptides having esterase enzyme activity.
[0112] "Esterase" refers to a polypeptide having hydrolytic activity that cleaves esters into acids and alcohols in a chemical reaction with water (hydrolysis). In the context of the present invention, esterases are selected from the class of carboxylic ester hydrolases (EC 3.1.1.-), which cleave acyl groups, such as acetyl or formyl groups, from their respective ester substrates. More specifically, the esterases of the present invention have the ability to cleave terpenyl ester compounds, such as homofarnesyl acetate, to form the corresponding alcohols, particularly homofarnesol.
[0113] "Esterase enzyme activity" is determined under "standard conditions" as described below herein. It can be determined using host cells expressing a recombinant esterase polypeptide, disrupted cells expressing the esterase polypeptide, fractions thereof, or an enriched or purified esterase polypeptide, in a culture medium or reaction medium, preferably buffered, having a pH in the range of 6 to 11, preferably 7 to 9, in the presence of a reference substrate, here in particular homofarnesyl acetate, either added at an initial concentration in the range of 1 to 100 μM, preferably 5 to 50 μM, in particular 30 to 40 μM, or endogenously produced by the host cell. The conversion reaction to form the respective alcohol, particularly homofarnesol, is carried out for 10 minutes to 5 hours, preferably about 1 to 2 hours. The esterase products can then be detected and quantified conventionally after extraction with an organic solvent such as ethyl acetate.
[0114] Further methods for evaluating the conversion of homofarnesyl acetate to homofarnesol by esterase are described in Example 4.
[0115] In a preferred embodiment of the present invention, the polypeptide having esterase activity comprises at least one sequence motif selected from: AxVVxVxxRLAPE (SEQ ID NO: 243), for example as located in SEQ ID NO: 27 or 28; GASAGGGLxA (SEQ ID NO: 244), for example as located in SEQ ID NO: 27 or 28; VxQLLxYPMLDDR (SEQ ID NO: 245), for example as shown in SEQ ID NO: 27 or 28; and ARxxDLSGLPxT (SEQ ID NO: 246), for example as located in SEQ ID NO: 27 or 28. In the aforementioned motifs, residue x independently represents any natural amino acid residue, and optionally, in each of the aforementioned motifs, one, two, three, or four amino acid residues other than residue x may be modified, for example, by amino acid substitution, particularly by conservative substitution, provided that the enzyme retains esterase enzyme activity to at least an analytically detectable degree.
[0116] Preferably, the polypeptide having esterase activity includes, for example, AxVVxVxxRLAPE (SEQ ID NO: 243), GASAGGGLxA (SEQ ID NO: 244), VxQLLxYPMLDDR (SEQ ID NO: 245), and ARxxDLSGLPxT (SEQ ID NO: 246), as shown in SEQ ID NO: 27 or 28.
[0117] A preferred embodiment of the present invention is a polypeptide having esterase enzyme activity that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NO: 27 or 28. Preferably, the polypeptide has the amino acid sequence described in SEQ ID NO: 28.
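The sequence identity thresholds recited above can be illustrated with a minimal sketch. It assumes that the two sequences have already been aligned by an external tool and that identity is counted over the aligned columns in which neither sequence has a gap; other conventions for the denominator exist, and the definition given elsewhere in this description governs. The function names and the toy sequences are hypothetical and are not SEQ ID NO: 27 or 28.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment and use
    '-' as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 50.0) -> bool:
    """True if the pair reaches the given identity threshold (in percent)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned pair (hypothetical): 4 matches over 5 gap-free columns.
print(percent_identity("ACDE-G", "ACDQFG"))
print(meets_threshold("ACDE-G", "ACDQFG", threshold=75.0))
```

The threshold argument can be set to any of the recited values (50%, 55%, ..., 99%) to test a candidate against a reference sequence.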
[Step (v) of the method of the present invention]
[0118] Step (v) relates to contacting a compound of formula (VI) with a polypeptide having terpene cyclase enzyme activity. [ka]
[0119] The compound of formula (VI) is also known as homofarnesol or 4,8,12-trimethyltrideca-3,7,11-trien-1-ol; CAS number 35826-67-6.
[0120] The compound of formula (VI) may exist as one of its stereoisomers or a mixture thereof. Specifically, the compound may have the following structures and isoforms: [ka] (3E,7E)-homofarnesol; (3E,7E)-4,8,12-trimethyltrideca-3,7,11-trien-1-ol; CAS number 459-89-2. [ka] (3Z,7E)-homofarnesol; (3Z,7E)-4,8,12-trimethyltrideca-3,7,11-trien-1-ol; CAS number 138152-06-4. [ka] (3E,7Z)-homofarnesol; (3E,7Z)-4,8,12-trimethyltrideca-3,7,11-trien-1-ol; CAS number 2032064-12-1. [ka] (3Z,7Z)-homofarnesol; (3Z,7Z)-4,8,12-trimethyltrideca-3,7,11-trien-1-ol; CAS number 138152-08-6.
[0121] Terpene cyclases can be divided into two classes according to how the initial carbocation is produced. In Class I (or Type I) terpene cyclases, the diphosphate group of a linear terpenoid precursor is removed, and an allylic carbocation is formed on the terpene portion. In Class II cyclases, the initial carbocation is formed by protonation of a double bond or epoxide group in the terpene carbon chain. Therefore, Class I cyclases necessarily use substrates bearing diphosphate groups, whereas Class II cyclases can also use terpenoids without a diphosphate group as substrates (since they do not require a diphosphate group to produce the initial carbocation).
[0122] In all terpene cyclases, the resulting reactive carbocation species triggers a subsequent cascade of reactions involving reactions of the carbocation with double bonds, alkyl shifts, hydride shifts, or carbon-carbon bond formation. These reactions can be terminated by deprotonation of a carbon atom adjacent to the carbocation, or by quenching the carbocation with a hydroxyl group or a water molecule.
[0123] Class II terpene cyclase activity is associated with a conserved aspartate-rich motif.
[0124] A typical example of a class II terpene cyclase is a class II diterpene cyclase that catalyzes the cyclization of geranylgeranyl diphosphates initiated by protonation, to form, for example, labdadienyl-diphosphate intermediates or other cyclic diphosphate intermediates (Peters, RJ (2010). Nat. Prod. Rep. 27, 1521-1530; Zerbe, P. et al (2015). Plant J. 83, 783-793).
[0125] Squalene cyclase (SHC) is a conventional example of a class II terpene cyclase whose substrate does not contain a diphosphate functional group. The squalene cyclase enzyme family includes squalene cyclases, 2,3-oxidosqualene cyclases, and enzymes that catalyze mechanistically related cyclization reactions. Squalene cyclases catalyze a protonation-initiated cyclization cascade from linear terpenes to cyclic compounds and are therefore class II terpene cyclases. The squalene cyclase family includes, for example, squalene-hopene cyclase (EC 5.4.99.17), which catalyzes the cyclization of squalene to hopene, and squalene-hopanol cyclase (EC 4.2.1.129), which catalyzes the cyclization of squalene to hopanol. Tetraprenyl-β-curcumene-sporulenol cyclase (EC 4.2.1.137) catalyzes a similar class II cyclization of linear terpene substrates. Since tetraprenyl-β-curcumene-sporulenol cyclase has also been shown to catalyze squalene cyclization (Sato, T., et al. (2011). Journal of the American Chemical Society 133(44): 17540-17543), it is also a member of the squalene cyclase family.
[0126] Squalene cyclase polypeptides are typically membrane-bound proteins with a length of 600–800 amino acids. They bind to the surface of the cell membrane but do not contain a transmembrane region. Squalene cyclases are classified in the IPR018333 family of the InterPro protein sequence classification database (https://www.ebi.ac.uk/interpro/entry/InterPro/IPR018333/) (InterPro release 93.0, March 2, 2023). The structure of squalene cyclase is organized into two domains containing several alpha helices, referred to as a β-domain and a γ-domain, or a βγ-domain architecture (Christianson DW, Chem. Rev, 2017, 117, 11570–11648). The two domains possess distinctive sequence signatures, described in the Pfam database as the squalene-hopene cyclase N-terminal domain (PF13249) and the squalene-hopene cyclase C-terminal domain (PF13243) (Pfam 35.0 release, November 19, 2021). The presence of the IPR018333, PF13249, or PF13243 protein sequence signatures can be predicted using the NCBI Conserved Domain Search tool (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) or the InterProScan tool (http://www.ebi.ac.uk/interpro/search/sequence/).
[0127] Squalene cyclase polypeptides contain characteristic conserved amino acid motifs that are positioned along the sequence and are associated with protein architecture and enzymatic reactions. In particular, squalene cyclase contains at least one amino acid motif selected from: [SP][TP][VIL]WDTx[LWI] (SEQ ID NO: 247); PGG[WF][GYA]F (SEQ ID NO: 248); PDxDD[TAS][TIAS] (SEQ ID NO: 249); [MIL]QxxxG[GA][WF]x[AS][FY] (SEQ ID NO: 250); Qxxx[GH]xWxG[RK]WGxx[YF]xYG (SEQ ID NO: 251); Qxx[DN]G[GS][WF][GS]ExxxS (SEQ ID NO: 252); and [STA]xx[SFN][QC]T[AGT]W[AS][LIV]xx[LQ] (SEQ ID NO: 253).
[0128] Motif sequences are described using standard IUPAC single-letter codes for amino acids. Ambiguity is indicated by listing the acceptable amino acids at a given position in square brackets. For example, [SP] or [S or P] represents S (serine) or P (proline). Each "x" independently represents a position at which any natural amino acid residue may be present.
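The motif notation described above maps directly onto a regular expression, which can be sketched as follows. This is an illustrative sketch only: "x" is assumed to mean any of the 20 standard amino acids, the helper names motif_to_regex and find_motif are hypothetical, and the test sequence is a toy string rather than a sequence from the sequence listing. The motif used in the example is the one recited as SEQ ID NO: 249.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: the 20 standard residues


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate motif notation (uppercase residues, [..] choices, x = any) to a regex."""
    parts = []
    i = 0
    while i < len(motif):
        ch = motif[i]
        if ch == "[":
            end = motif.index("]", i)  # raises ValueError if the bracket is unclosed
            choices = motif[i + 1:end]
            if not choices or any(c not in AMINO_ACIDS for c in choices):
                raise ValueError(f"invalid residue set in motif: {motif!r}")
            parts.append("[" + choices + "]")
            i = end + 1
        elif ch == "x":
            parts.append("[" + AMINO_ACIDS + "]")  # any standard residue
            i += 1
        elif ch in AMINO_ACIDS:
            parts.append(ch)  # fixed residue
            i += 1
        else:
            raise ValueError(f"unexpected character {ch!r} in motif {motif!r}")
    return re.compile("".join(parts))


def find_motif(sequence: str, motif: str) -> list:
    """Return 0-based start positions of non-overlapping motif matches."""
    pattern = motif_to_regex(motif)
    return [m.start() for m in pattern.finditer(sequence.upper())]


# Toy sequence (hypothetical) containing one occurrence of PDxDD[TAS][TIAS].
print(find_motif("MKPDADDTAGG", "PDxDD[TAS][TIAS]"))
```

A polypeptide sequence would be screened by calling find_motif once per recited motif and checking whether at least one call returns a non-empty list.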
[0129] Meroterpenoids are hybrid secondary metabolites derived from mixed biosynthetic pathways, and are partially derived from terpenoid co-substrates (Cornforth, JW Terpenoid biosynthesis. Chem. Br. 1968, 4, 102-106). The non-terpenoid portion may originate from, for example, polyketides, alkaloids, phenols, or amino acid biosynthetic pathways. Great chemical diversity is observed in meroterpenoids, particularly in bacteria and fungi.
[0130] The meroterpenoid biosynthesis pathway follows several modular biosynthetic steps. In the first step, building blocks are generated from the corresponding biosynthetic pathways (e.g., terpenoids, polyketides). The terpenoid and non-terpenoid portions are assembled by prenyltransferase. The precursors of the terpenoid portion are generally linear terpenoid diphosphates such as geranyl diphosphate, farnesyl diphosphate, or geranylgeranyl diphosphate.
[0131] In the following steps, the linear polyene terpenoid moiety of the hybrid precursor is cyclized to form a monocyclic or polycyclic structure. This cyclization is catalyzed by a specific class of non-canonical class II terpene cyclases called meroterpenoid cyclases, first discovered in fungi (Itoh, T., et al. (2010). Nature Chemistry 2(10): 858-864). The first representative meroterpenoid cyclase discovered was Pyr4 from Aspergillus fumigatus Af293 (Itoh et al., 2010).
[0132] In many meroterpenoids, the linear terpenoid precursor is first activated by stereoselective epoxidation of one of the double bonds by a monooxygenase. The meroterpenoid cyclase then catalyzes the protonation of the epoxide moiety, generating a reactive carbocation species and triggering a subsequent cascade reaction similar to that of other terpene cyclases. Some meroterpenoid cyclases can convert isoprenoid precursors into cyclic products without the prior epoxidation step. These meroterpenoid cyclases can directly protonate the terminal double bond, generating a reactive carbocation and catalyzing cyclization. For example, MacJ from the fungus Penicillium terrestris was the first fungal meroterpenoid cyclase identified that initiates the reaction by class II double-bond protonation (Tang, M.-C., et al. (2017). Organic Letters 19(19): 5376-5379). Another example of a meroterpenoid cyclase that initiates polyene cyclization by direct double-bond protonation is DmtA1 from the bacterium Streptomyces youssoufiensis OUC68199 (Yao et al, Nat. Commun., 2018, 9, 4091).
[0133] Similar to other class II terpene cyclases, the carbocations produced by meroterpenoid cyclases generally trigger cascade reactions that begin with double bond attack, generating monocyclic or polycyclic structures with tertiary carbocations. The reaction is terminated either by deprotonation to form a double bond or by reaction with a water molecule to produce a tertiary alcohol. Typical cyclic structures found in meroterpenoid compounds include drimane or labdane skeletons.
[0134] The largest group of meroterpenoid cyclases are compact membrane-integrated proteins containing several (generally seven) transmembrane helices. This transmembrane-based protein architecture can be readily predicted using the TMHMM 2.0 server, available, for example, at https://dtu.biolib.com/DeepTMHMM (Krogh, A., et al. (2001) J Mol Biol 305(3): 567-580). In addition to this protein architecture, meroterpenoid cyclases differ from other class II cyclases, such as squalene cyclases, in that their polypeptides are smaller. Bacterial and fungal meroterpenoid cyclase polypeptides have lengths ranging from 150 to 550 residues. The transmembrane helices span a portion of the polypeptide of 180 to 300 amino acids, which contains the catalytic domain.
[0135] Recently, meroterpenoid cyclases with protein architectures different from those of membrane-integrated meroterpenoid cyclases have been described. For example, MstE from the bacterium Scytonema sp. PCC1002 is a soluble cyclase whose structure resembles that of canonical cyclases such as diterpene synthases and squalene cyclases, but differs in that it is a monomeric protein with only an α-domain (Moosmann, P., et al. (2020). Nat Chem 12(10): 968-972). Soluble bacterial meroterpenoid cyclase polypeptides have lengths in the range of 150 to 550 amino acid residues.
[0136] Some meroterpenoid cyclases catalyze the cyclization reaction that converts the terpenoid moiety of the meroterpenoid hybrid precursor into a labdane cyclic structure. However, cyclization of linear terpenoids into labdane compounds by meroterpenoid cyclases has not been shown so far.
[0137] Meroterpenoid cyclase polypeptides contain characteristic conserved amino acid motifs that are positioned along the sequence and are related to protein architecture or enzymatic reactions.
[0138] Membrane-integrated meroterpenoid cyclases derived from bacteria contain at least one amino acid motif selected from: WxxxDxxILVMN (SEQ ID NO: 254); PxxAxxxNxxWE (SEQ ID NO: 255); MxxxFxxMLxxR (SEQ ID NO: 256); and RxxxxGQS (SEQ ID NO: 257).
[0139] Membrane-integrated meroterpenoid cyclases derived from fungi contain at least one amino acid motif selected from: WYExxYFW (SEQ ID NO: 258); and DNExSYxxP (SEQ ID NO: 259).
[0140] Soluble meroterpenoid cyclases derived from bacteria contain at least one amino acid motif selected from: GxWxxxW[WG]xxxxY (SEQ ID NO: 260); WxxxHxxV[TSA] (SEQ ID NO: 261); and GxWxD[FY] (SEQ ID NO: 262).
[0141] Motif sequences are described using the standard IUPAC one-letter code for amino acids. Residue x represents any natural amino acid residue independently of each other, and optionally, in each of the previous motifs, one, two, three, or four amino acid residues different from the x residue may be modified, for example, by amino acid substitution, particularly conservative substitution, provided that the enzyme retains its enzymatic activity at least to an analytically detectable level. The function of square brackets was described above.
[0142] Meroterpenoid cyclase polypeptides can be searched for in a sequence database, for example, using the BLAST search tool (Tatiana et al., FEMS Microbiol. Lett., 1999, 174: 247-250) with MacJ (SEQ ID NO: 71), DmtA1 (SEQ ID NO: 77), or MstE (SEQ ID NO: 76) as query sequences. The selection can be further narrowed by selecting sequences having an appropriate length or sequences containing the characteristic amino acid motifs described above. The selection can also be narrowed based on the predicted protein architecture, particularly by predicting the presence of transmembrane helices as described above.
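The narrowing of database hits by length and by characteristic motifs can be sketched as a simple filter. The sketch assumes the hits have already been retrieved (for example from a BLAST search) and are available as name and sequence pairs; the Candidate class, the function names, and the toy sequences are hypothetical. The 150 to 550 residue window and the four motifs are those recited above for bacterial membrane-integrated meroterpenoid cyclases. Prediction of transmembrane helices is not covered here and would be a separate step.

```python
import re
from dataclasses import dataclass

# Motifs recited above for bacterial membrane-integrated cyclases
# (SEQ ID NOs: 254-257); "x" stands for any residue.
BACTERIAL_MEMBRANE_MOTIFS = [
    "WxxxDxxILVMN",
    "PxxAxxxNxxWE",
    "MxxxFxxMLxxR",
    "RxxxxGQS",
]


@dataclass
class Candidate:
    name: str       # hypothetical hit identifier
    sequence: str   # amino acid sequence, one-letter code, uppercase


def has_any_motif(sequence: str, motifs: list) -> bool:
    """True if at least one motif occurs; 'x' becomes the regex wildcard '.'."""
    return any(re.search(m.replace("x", "."), sequence) for m in motifs)


def narrow_candidates(candidates, motifs, min_len=150, max_len=550):
    """Keep hits whose length is within the window and that contain a motif."""
    return [
        c for c in candidates
        if min_len <= len(c.sequence) <= max_len and has_any_motif(c.sequence, motifs)
    ]


# Toy hits (hypothetical sequences, for illustration only).
hits = [
    Candidate("hit_a", "M" * 200 + "RAAAAGQS" + "L" * 50),  # right length, has motif
    Candidate("hit_b", "M" * 50 + "RAAAAGQS" + "L" * 42),   # too short
    Candidate("hit_c", "A" * 300),                          # no motif
]
print([c.name for c in narrow_candidates(hits, BACTERIAL_MEMBRANE_MOTIFS)])
```

The same filter can be reused for the fungal or soluble bacterial classes by passing the corresponding motif list.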
[0143] Specific examples of appropriate standard conditions for each of the enzymatic activities described above can be obtained from the Examples section below.
[0144] As mentioned above, the terpene cyclase may be squalene cyclase (SHC) or meroterpenoid cyclase (MeroTPS).
[0145] To avoid confusion, SHCs and meroterpenoid cyclases are different classes of enzymes that can be distinguished by their physical properties. Furthermore, meroterpenoid cyclases can be classified into (i) bacterial membrane-integrated meroterpenoid cyclases, (ii) fungal membrane-integrated meroterpenoid cyclases, and (iii) bacterial soluble meroterpenoid cyclases.
[0146] Table A below outlines the differences between SHC and various types of meroterpenoid cyclases.
[0147] [Table 1]
[0148] Therefore, those skilled in the art can easily identify from the information provided herein whether the enzyme is an SHC enzyme or a class of meroterpenoid cyclase enzymes.
[0149] In a preferred embodiment of the present invention, the terpene cyclase is an SHC.
[0150] As described in the attached examples, the inventors have demonstrated that an SHC enzyme can be used in step (v) of the method of the present invention. A preferred embodiment of the present invention is a polypeptide having SHC enzyme activity that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 29-49 and 265-279. Preferably, the polypeptide having SHC enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 29-49, 265-274, and 276-279. More preferably, the polypeptide having SHC enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 48, 265, 266, 267, 268, 274, 276, and 279. More preferably, the polypeptide having SHC enzyme activity has the amino acid sequence of any of SEQ ID NOs: 48, 265, 266, 267, 268, 274, 276, and 279.
[0151] Alternatively, polypeptides possessing SHC enzyme activity have at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences listed in SEQ ID NOs: 79-89.
[0152] In a preferred embodiment of the present invention, the terpene cyclase is a meroterpenoid cyclase.
[0153] In preparing the method of the present invention, the inventors attempted to compare the isomer profiles of the compound of formula (I) synthesized by SHC and meroterpenoid cyclase.
[0154] Surprisingly, the inventors found that when a meroterpenoid cyclase is used in the method of the present invention, the compound of formula (I) is produced with an isomer distribution biased toward the isomers having a more favorable olfactory profile, compared with the compound of formula (I) produced by an SHC enzyme.
[0155] In particular, the inventors demonstrated that less than 1% of the compound of formula (I) produced by the method of the present invention was in the isomer form of formula (Ic) and/or (Id). Therefore, the use of meroterpenoid cyclases offers remarkable technical advantages compared to the use of SHC enzymes.
[0156] Preferably, the meroterpenoid cyclase is a membrane-integrated meroterpenoid cyclase. From the attached examples, it can be seen that this class of meroterpenoid cyclases preferably produces compounds of formula (Ia) rather than formula (Ic) and / or (Id). Examples of membrane-integrated meroterpenoid cyclases include those described in any of SEQ ID NOs: 50-73 and 280-289. SEQ ID NOs: 50-70 and 280-289 are membrane-integrated meroterpenoid cyclases derived from bacteria, and SEQ ID NOs: 71-73 are membrane-integrated meroterpenoid cyclases derived from fungi.
[0157] Preferably, the meroterpenoid cyclase is a soluble meroterpenoid cyclase. From the attached examples, it can be seen that meroterpenoid cyclases of this class preferably produce compounds of formula (Ib) rather than formulas (Ic) and / or (Id). Examples of soluble meroterpenoid cyclases are those described in SEQ ID NO: 74 or 75.
[0158] This is the first time that a meroterpenoid cyclase has been used to prepare the compound of formula (I), and the bias toward such isomeric forms is remarkable and of industrial commercial importance.
[0159] Furthermore, surprisingly, the inventors have also found that the meroterpenoid cyclase can be used in the bioconversion method without adding any surfactant to the reaction. The absence of surfactant in the bioconversion reaction results in a simplified and thus more cost-effective method for preparing the compound of formula (I).
[0160] As described in the attached examples, the inventors have demonstrated that a meroterpenoid cyclase enzyme can be used in step (v) of the method of the present invention. Preferred embodiments of the present invention are polypeptides having meroterpenoid cyclase enzyme activity that have at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity to any of the sequences described in SEQ ID NOs: 50-75 and 280-289. Preferably, the polypeptide having meroterpenoid cyclase enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity to any of the sequences described in SEQ ID NOs: 57, 71, 74, 280, 281, 282, 283, 286, 287, and 288. More preferably, the polypeptide having meroterpenoid cyclase enzyme activity has the sequence described in any of SEQ ID NOs: 57, 71, 74, 280, 281, 282, 283, 286, 287, or 288.
[Form of compound (I) produced by the method of the present invention]
[0161] The method of the present invention is for preparing the compound of formula (I).
[0162] Compound of formula (I)
Chemical formula
[0163] The compound of formula (I) is also known as 3a,6,6,9a-tetramethyldodecahydronaphtho[2,1-b]furan; CAS number 3738-00-9.
[0164] The compound of formula (I) may exist as one of its stereoisomers or a mixture thereof. Specifically, the compound may have the following structures and isoforms: [ka] (3aR,5aS,9aS,9bR)-3a,6,6,9a-tetramethyldodecahydronaphtho[2,1-b]furan; CAS number 6790-58-5. [ka] (3aS,5aR,9aR,9bS)-3a,6,6,9a-tetramethyldodecahydronaphtho[2,1-b]furan; CAS number 234431-64-2. [ka] (3aR,5aS,9aS,9bS)-3a,6,6,9a-tetramethyldodecahydronaphtho[2,1-b]furan. [ka] (3aS,5aS,9aS,9bS)-3a,6,6,9a-tetramethyldodecahydronaphtho[2,1-b]furan.
[0165] The compound of formula (I) may exist in different stereoisomeric forms. Some isomers have a more favorable olfactory profile than other isomeric forms of the compound. In particular, the olfactorily preferred forms of compound (I) are compound (Ia) and/or (Ib).
[0166] To improve the yield or ratio of compound (Ia) and / or (Ib) to other isomer forms of compound (I), achieving high selectivity in the formation of compound (VI) plays a crucial role. Specifically, the E / Z isomer ratio at the 3,4-double bond of compound (VI) is very important. The preferred form of compound (VI) is that of formula (VIa).
[0167] Despite considerable effort, obtaining compounds of formula (VI) with an E / Z ratio at the 3,4-double bond higher than 90:10 using chemical methods remains difficult and, to date, not available on a large scale (Eichhorn, E. and F. Schroeder (2023). J Agric Food Chem).
[0168] The present invention achieves this goal by using a highly selective enzyme. In the present invention, the compound of formula (VI) is produced in the form of formula (VIa) with high selectivity, as demonstrated in the appended examples. Such high selectivity cannot be achieved by chemical methods (such as those described in Eichhorn, E. and F. Schroeder (2023). J Agric Food Chem). Pathway compounds with high E/Z ratios can be obtained via enzymatic routes. In particular, geranylgeranyl diphosphate and geranylgeraniol with high E/Z ratios at their double bonds can be obtained using enzymes, especially geranylgeranyl diphosphate synthase. The configuration of the double bonds is maintained in all intermediates of the pathway.
[0169] This in turn provides a method for preparing the olfactorily preferred forms of the compound of formula (I). Therefore, the method of the present invention, which prepares the compound of formula (I) without considerable amounts of undesirable byproducts, is of great commercial importance. As shown in the attached examples, more than 97% of the compound of formula (I) is in the form of formula (Ia) and/or (Ib).
[0170] Therefore, in one embodiment of the present invention, more than 97% of the compounds of formula (I) are in the form of formula (Ia) and / or (Ib).
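The 97% criterion can be checked from an isomer distribution with a short calculation, sketched below. The sketch assumes that each isomer of the compound of formula (I) has been quantified separately (for example as chromatographic peak areas) and that the measured values are proportional to the amounts present; the function name is hypothetical, and the numbers are invented for illustration and are not results from the Examples.

```python
from typing import Mapping, Sequence


def preferred_isomer_percent(
    amounts: Mapping[str, float],
    preferred: Sequence[str] = ("Ia", "Ib"),
) -> float:
    """Percentage of the total that is present as the preferred isomers.

    Assumption: `amounts` maps an isomer label to a quantity (e.g. a peak
    area) that is proportional to the amount of that isomer.
    """
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total amount must be positive")
    preferred_total = sum(amounts.get(label, 0.0) for label in preferred)
    return 100.0 * preferred_total / total


# Hypothetical distribution, for illustration only.
example = {"Ia": 90.0, "Ib": 8.0, "Ic": 1.5, "Id": 0.5}
share = preferred_isomer_percent(example)
print(share, share > 97.0)
```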
[0171] The scope of the present invention also includes compounds of formula (I) that are obtained or can be obtained by in vivo methods.
[0172] It is evident that this is the first time that a compound of formula (I) has been prepared by an in vivo method. For the reasons outlined herein, this represents a significant advance in the preparation of this compound, both technically and commercially. Compounds of formula (I) can be prepared by any in vivo method, preferably using recombinant cells expressing enzymes that can be used in the pathway to synthesize this molecule. One embodiment of the present invention is that more than 97% of compounds of formula (I) are in the form of formula (Ia) and / or (Ib).
[Embodiments of the method of the present invention]
[0173] The method of the present invention comprises using (i) a polypeptide having ADH enzyme activity, (ii) a polypeptide having enal cleavage enzyme activity, (iii) a polypeptide having BVMO enzyme activity, (iv) a polypeptide having esterase enzyme activity, and (v) a polypeptide having terpene cyclase enzyme activity.
[0174] In a further preferred embodiment of the method of the present invention, (i) Polypeptides possessing ADH enzyme activity have at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 11-21; (ii) A polypeptide having enal cleavage enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to SEQ ID NO: 22; (iii) The polypeptide having BVMO enzyme activity has sequence identity of at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more with respect to any of SEQ ID NOs: 23-26 and 216-227, preferably with respect to any of SEQ ID NOs: 23-26; (iv) A polypeptide having esterase enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with either SEQ ID NO: 27 or 28; and / or (v) Polypeptides having terpene cyclase enzyme activity have sequence identity of at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more with any of SEQ ID NOs: 29-75, 79-89, and 265-289, preferably with any of SEQ ID NOs: 29-75 and 265-289, and more preferably with any of SEQ ID NOs: 29-75, 265-274, and 276-289.
[0175] Further preferred embodiments of the present invention are those in which: (i) the polypeptide having ADH enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to SEQ ID NO: 11 or 21, preferably the ADH enzyme having the sequence of SEQ ID NO: 11 or 21; (ii) the polypeptide having enal cleavage activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to SEQ ID NO: 22, and preferably the enal cleavage enzyme has the sequence of SEQ ID NO: 22; (iii) the polypeptide having BVMO enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to SEQ ID NO: 25 or 26, preferably the BVMO enzyme having the sequence of SEQ ID NO: 25 or 26; (iv) the polypeptide having esterase enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to SEQ ID NO: 28, preferably the esterase enzyme having the sequence of SEQ ID NO: 28; and/or (v) the polypeptide having terpene cyclase enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity with SEQ ID NOs: 48, 57, 71, 74, 265, 266, 267, 268, 274, 276, 279, 280, 281, 282, 283, 286, 287, or 288, and preferably the terpene cyclase enzyme has the sequence of SEQ ID NOs: 48, 57, 71, 74, 265, 266, 267, 268, 274, 276, 279, 280, 281, 282, 283, 286, 287, or 288.
[0176] Alternatively, in this embodiment of the present invention, (v) The polypeptide having terpene cyclase enzymatic activity is an SHC enzyme and has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NOs: 48, 265, 266, 267, 268, 274, 276, or 279, and preferably the SHC enzyme has the sequence of SEQ ID NOs: 48, 265, 266, 267, 268, 274, 276, or 279.
[0177] Alternatively, in this embodiment of the present invention, (v) The polypeptide having terpene cyclase enzyme activity is a meroterpenoid cyclase enzyme and has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NOs: 57, 71, 74, 280, 281, 282, 283, 286, 287, or 288, preferably the meroterpenoid cyclase enzyme has the sequence of SEQ ID NOs: 57, 71, 74, 280, 281, 282, 283, 286, 287, or 288.
[0178] Further embodiments of the present invention include those in which: (i) the compound of formula (I) is in the form of [ka]; (ii) the compound of formula (II) is in the form of [ka]; (iii) the compound of formula (III) is in the form of [ka]; (iv) the compound of formula (IV) is in the form of [ka]; (v) the compound of formula (V) is in the form of [ka]; and (vi) the compound of formula (VI) is in the form of [ka].
[Preparation of the compound of formula (II)]
[0179] This aspect of the present invention provides a method for preparing a compound of formula (I) through a series of biocatalytic transformations of a compound of formula (II).
[0180] The compound of formula (II) can be used, for example, in the form of a purified compound preparation, as a starting substrate in the method of the present invention.
[0181] However, a preferred embodiment of the present invention further includes a method of the present invention that provides a compound of formula (II) by a series of biocatalytic transformations that convert a precursor compound into a compound of formula (II).
[0182] The compounds of formulas (I) to (VI) are terpenoids.
[0183] Terpenoids are a large family of structurally diverse natural compounds. All terpenoids are derived by biosynthesis from two 5-carbon units: isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP). IPP and DMAPP can be produced from various biosynthetic pathways, including the 2-C-methyl-D-erythritol-4-phosphate (MEP) pathway, the mevalonate (MVA) pathway, or alternative MVA pathways (Dellas, N., et al. (2013) eLife 2: e00672). Alternatively, IPP and DMAPP can also be formed by the sequential enzymatic phosphorylation or enzymatic pyrophosphorylation of their corresponding alcohols, isoprenol and prenol (Ma, X, et al. (2022). J Agric Food Chem 70(11): 3512-3520).
[0184] These terpene building blocks are sequentially condensed to form linear terpenoid precursors of varying lengths, each with a multiple of 5 carbon atoms, such as geranyl diphosphate (GPP), farnesyl diphosphate (FPP), or geranylgeranyl diphosphate (GGPP), containing 10, 15, and 20 carbon atoms, respectively. The condensation of IPP and DMAPP is carried out by a class of enzymes called prenyltransferases. Prenyltransferase enzymes catalyze the initial condensation reaction between IPP and DMAPP to produce GPP, and then catalyze the addition of further IPP molecules to produce FPP, and subsequently GGPP (Ogura, K., and Koyama, T. (1998). Chem. Rev. 98, 1263-1276). The sequential condensation of DMAPP and IPP to GGPP can be carried out by: i. the sequential action of three prenyltransferases: GPP synthase, FPP synthase, which catalyzes the addition of one IPP to GPP, and GGPP synthase, which catalyzes the addition of one IPP to FPP; ii. a combination of two prenyltransferases, for example, an FPP synthase that catalyzes the condensation of one DMAPP and two IPPs, and a GGPP synthase that catalyzes the addition of one IPP to FPP; and/or iii. the action of a single prenyltransferase, for example, a GGPP synthase, which can catalyze the sequential condensation of three IPPs and one DMAPP.
[0185] Some terpenoids have a simple linear structure; for example, geranylgeraniol is a linear diterpene (having 20 carbon atoms) containing a terminal hydroxyl group. Geranylgeraniol can be produced from GGPP using: i. an enzyme having pyrophosphatase activity, e.g., a phosphatase; and/or ii. two enzymes that have phosphatase activity and sequentially cleave the two phosphate groups of GGPP.
[0186] Alternatively, geranylgeraniol can be produced from GGPP using enzymes from the class I terpene cyclase family (such as those described below) that can cleave the pyrophosphate group of GGPP but lack the ability to catalyze the subsequent cyclization.
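The carbon counts of the linear precursors follow from the number of five-carbon units condensed, which can be sketched as a short calculation. The function and dictionary names are hypothetical; the only inputs are the facts stated above, namely that each chain starts from one DMAPP unit and grows by one IPP unit per condensation.

```python
CARBONS_PER_UNIT = 5  # IPP and DMAPP are both five-carbon units


def prenyl_chain_carbons(ipp_additions: int) -> int:
    """Carbon atoms in a chain built from one DMAPP plus `ipp_additions` IPP units."""
    if ipp_additions < 0:
        raise ValueError("number of IPP additions cannot be negative")
    return CARBONS_PER_UNIT * (1 + ipp_additions)


# Number of IPP units condensed onto DMAPP for each precursor.
IPP_ADDITIONS = {"GPP": 1, "FPP": 2, "GGPP": 3}

for name, n in IPP_ADDITIONS.items():
    print(name, prenyl_chain_carbons(n))
```

GGPP thus corresponds to three IPP units and one DMAPP unit, whichever of the enzyme combinations listed above performs the condensations.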
[0187] Therefore, these enzymatic pathways utilize enzymes with phosphatase activity to enable the cleavage of the diphosphate group of GGPP and the release of geranylgeraniol.
[0188] Alternatively, geranylgeraniol can be prepared from another linear diterpene, for example from geranyllinalool, or from the corresponding polyene, such as β-springene. Enzymes such as dehydratase-isomerases can be used in such reactions. Dehydratase-isomerases can catalyze reversible isomerization reactions between linear terpene compounds having terminal alcohol groups or terminal double bonds (Nestl, BM, et al. (2017). Nature Chemical Biology 13(3): 275-281) (see Figure 2).
[0189] Alternatively, geranylgeraniol can also be synthesized using chemical methods. For example, geranylgeraniol can be obtained from farnesene or farnesol by chain extension (Organic Syntheses, Vol. 84, p. 43-57 (2007)).
[0190] The pathways leading to IPP and DMAPP, as well as to geranyl diphosphate (GPP), farnesyl diphosphate (FPP), or geranylgeranyl diphosphate (GGPP), are involved in the synthesis of terpenoids, a diverse class of molecules that play crucial roles in primary metabolism and various cellular processes. Terpenoids are involved in numerous biological functions, including the synthesis of sterols such as cholesterol in animals and phytosterols in plants, as well as the production of hormones, vitamins (such as vitamins E and K), and signaling molecules (such as ubiquinone and dolichol). Furthermore, terpenoids are critical for membrane lipid formation and post-translational modification of proteins.
[0191] Therefore, the pathways leading to GPP, FPP, and GGPP described above are important components of primary metabolism in all organisms, as they provide precursors necessary for the synthesis of essential terpenoid compounds involved in various physiological processes essential for cell growth.
[0192] Most terpenoid compounds contain a cyclic carbon skeleton. The diversity of monocyclic and polycyclic carbon skeletons is due to the enzymatic conversion of linear terpenoid precursors by terpene cyclases (TCs), also known as terpene synthases. Cyclization reactions start with a carbocation, which reacts with an electron-rich double bond to form a new carbon bond. The outcome of the reaction depends on the substrate folding and the properties of the amino acid side chains at the enzyme-active site.
[0193] Therefore, in a preferred embodiment of the present invention, the method of the present invention further includes one or more steps prior to step (i), wherein the one or more steps include: (a) preparing (producing) geranylgeranyl diphosphate (GGPP) from IPP and DMAPP using one or more prenyltransferase enzymes; and/or (b) preparing (producing) the compound of formula (II) from GGPP using one or more enzymes having phosphatase enzyme activity.
[0194] Step (a) of this embodiment of the present invention requires the use of one or more prenyltransferase enzymes.
[0195] The term "prenyltransferase" refers to a group of enzymes capable of sequentially condensing 5-carbon units, such as isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), to form linear terpenyl diphosphate compounds containing 10, 15, or 20 carbon atoms, such as geranyl diphosphate (GPP), farnesyl diphosphate (FPP), or geranylgeranyl diphosphate (GGPP), respectively. Some prenyltransferases can extend the length of the carbon chain by adding 5-carbon units to the linear terpenyl diphosphate compound. An example of a prenyltransferase is geranylgeranyl diphosphate synthase (GGPP synthase), which is capable of producing GGPP from IPP and DMAPP or by adding 5 carbon atoms to FPP. The term "prenyltransferase" also refers to a group of enzymes that, during meroterpenoid biosynthesis, have the ability to transfer isoprenoid subunits from terpenyl diphosphate compounds, generally linear terpenyl diphosphate compounds, to non-terpenoid skeletons.
[0196] Step (a) of this embodiment of the present invention can be carried out by: i. the sequential action of three prenyltransferases: GPP synthase, FPP synthase, which catalyzes the addition of one IPP to GPP, and GGPP synthase, which catalyzes the addition of one IPP to FPP; ii. a combination of two prenyltransferases, for example, an FPP synthase that catalyzes the condensation of one DMAPP and two IPPs, and a GGPP synthase that catalyzes the addition of one IPP to FPP; and/or iii. the action of a single prenyltransferase, for example, a GGPP synthase, which can catalyze the sequential condensation of three IPPs and one DMAPP.
[0197] Examples of prenyltransferase enzymes that can be used in this step of the method of the present invention are well known in the art. For example, GGPP synthase from Blakeslea trispora can be used in the biosynthesis of GGPP (Sun et al., Biotechnol. Lett. 34 (11), 2077-2082 (2012)).
[0198] Preferably, the prenyltransferase is a GGPP synthase. Preferably, the GGPP synthase has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NO: 1 or 2. Preferably, the GGPP synthase has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NO: 2. Preferably, the GGPP synthase has the amino acid sequence of SEQ ID NO: 2.
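Thresholds such as "at least 50% ... sequence identity" presuppose a way of computing percent identity. The following Python sketch shows one common convention and is offered only as an illustration: it assumes the two sequences have already been globally aligned to equal length with "-" as the gap character, and it divides identical positions by the alignment length. The definition of sequence identity given in this specification governs; the function names and the toy sequences are hypothetical and are not any of the SEQ ID NOs.

```python
# Illustrative sketch only; function names and sequences are hypothetical.
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences (gap character '-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # Count columns where both sequences carry the same residue (gaps never match).
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 50.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


query = "MKT-LLVA"      # toy aligned query, invented for this example
reference = "MKTALLIA"  # toy aligned reference, invented for this example
print(percent_identity(query, reference))
print(meets_threshold(query, reference, threshold=70.0))
```

Other conventions (for example, excluding gap columns from the denominator) give different values, so the convention should be fixed before comparing a candidate against a threshold.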
[0199] Step (b) of this embodiment of the present invention requires the use of one or more enzymes having phosphatase activity.
[0200] The term "phosphatase" refers to a group of enzymes known to remove phosphate or diphosphate groups from precursors containing phosphate or diphosphate groups. Certain subgroups of phosphatases have the ability to remove phosphate or diphosphate groups from terpenyl precursors to release inorganic phosphates and corresponding terpenyl alcohols. For example, some phosphatases are known to remove the diphosphate group from GGPP to form geranylgeraniol. Phosphatases acting on terpenyl diphosphates are found in several enzyme classes.
[0201] The term "protein tyrosine phosphatase" refers to a group of enzymes commonly known to remove phosphate groups from phosphorylated tyrosine residues on proteins. Certain subgroups of this family, as described in International Publication No. 2020011883, are enzymes useful for removing diphosphate groups from phosphorylated terpene molecules (terpenyl diphosphates). In particular, phosphatases from the protein tyrosine phosphatase family with Pfam ID number PF13350 can dephosphorylate GGPP to geranylgeraniol. Polypeptides can be scanned for matches against the Pfam protein family signature database.
[0202] Phosphatases, particularly GGPP phosphatases, can also be obtained from other protein families, such as the type 2 phosphatidic acid phosphatase (PAP2) protein family (IPR00326), the Nudix-hydrolase protein family (IPR015797), and the halo acid dehalogenase-like (HAD-like) hydrolase protein family (IPR041492). A method for screening phosphatases and evaluating the conversion of geranylgeranyl diphosphate to geranylgeraniol is described in Example 5.
[0203] Examples of phosphatase enzymes that can be used in this step of the method of the present invention are well known in the art.
[0204] Preferably, the phosphatase is a GGPP phosphatase. Preferably, the GGPP phosphatase has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any one of SEQ ID NOs: 3 to 10. Preferably, the GGPP phosphatase has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NO: 3. Preferably, the GGPP phosphatase has the amino acid sequence of SEQ ID NO: 3.
[0205] As described above, this embodiment of the present invention provides a compound of formula (II) by a series of biocatalytic transformations that convert a precursor compound into a compound of formula (II).
[0206] In this embodiment of the present invention, the precursor compound that becomes the compound of formula (II) is provided from IPP and DMAPP.
[0207] A further embodiment of the present invention is one in which the method further includes the preparation of IPP and DMAPP.
[0208] One method for preparing IPP and DMAPP is via the "mevalonate pathway." Also known as the "isoprenoid pathway" or "HMG-CoA reductase pathway," the "mevalonate pathway" is an essential metabolic pathway present in eukaryotes, archaea, and some bacteria. The mevalonate pathway begins with acetyl-CoA and produces two five-carbon building blocks called isopentenyl pyrophosphate (IPP) and dimethylallyl pyrophosphate (DMAPP). Combining the mevalonate pathway with enzymatic activity enables recombinant cell production of terpenes by generating terpene precursors GPP, FPP, or GGPP. This pathway is well known in the art. A list of enzymes required for the conversion of acetyl-CoA to IPP and DMAPP is provided below:
- Acetyl-CoA acetyltransferase (ACAT)
- 3-Hydroxy-3-methylglutaryl-CoA synthase (HMG-CoA synthase)
- 3-Hydroxy-3-methylglutaryl-CoA reductase (HMG-CoA reductase)
- Mevalonate kinase
- Phosphomevalonate kinase
- Mevalonate diphosphate decarboxylase
- Isopentenyl diphosphate isomerase
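The enzyme list above can be read as an ordered chain in which the product of each step is the substrate of the next. The Python sketch below encodes that chain and checks which enzymes a host would still need. The intermediate names are general textbook knowledge and are not stated in this specification; `MEVALONATE_STEPS`, `is_connected`, and `missing_enzymes` are hypothetical names used only for this illustration.

```python
# Illustrative sketch only; names are hypothetical, intermediates are textbook knowledge.
# Each entry: (enzyme, substrate, product), in pathway order.
MEVALONATE_STEPS = [
    ("Acetyl-CoA acetyltransferase (ACAT)", "acetyl-CoA", "acetoacetyl-CoA"),
    ("HMG-CoA synthase", "acetoacetyl-CoA", "HMG-CoA"),
    ("HMG-CoA reductase", "HMG-CoA", "mevalonate"),
    ("Mevalonate kinase", "mevalonate", "mevalonate 5-phosphate"),
    ("Phosphomevalonate kinase", "mevalonate 5-phosphate", "mevalonate 5-diphosphate"),
    ("Mevalonate diphosphate decarboxylase", "mevalonate 5-diphosphate", "IPP"),
    ("Isopentenyl diphosphate isomerase", "IPP", "DMAPP"),
]


def is_connected(steps) -> bool:
    """True if every step's product is the substrate of the following step."""
    return all(steps[i][2] == steps[i + 1][1] for i in range(len(steps) - 1))


def missing_enzymes(expressed) -> list:
    """Enzymes of the pathway that are absent from the set `expressed`."""
    return [enzyme for enzyme, _, _ in MEVALONATE_STEPS if enzyme not in expressed]


print(is_connected(MEVALONATE_STEPS))
print(missing_enzymes({"Mevalonate kinase", "HMG-CoA reductase"}))
```

A table of this kind can help when deciding which genes to introduce into a host that naturally carries only part of the pathway.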
[0209] An alternative method for preparing IPP and DMAPP is via the methylerythritol phosphate (MEP) pathway. This pathway is well known in the art. A list of enzymes required for the conversion of glyceraldehyde 3-phosphate (GAP) and pyruvate to IPP and DMAPP is provided below:
- 1-Deoxy-D-xylulose 5-phosphate synthase (DXS)
- 1-Deoxy-D-xylulose 5-phosphate reductoisomerase (DXR)
- 2-C-Methyl-D-erythritol 4-phosphate cytidylyltransferase (MCT, IspD)
- 4-Diphosphocytidyl-2-C-methyl-D-erythritol kinase (CMK, IspE)
- 2-C-Methyl-D-erythritol 2,4-cyclodiphosphate synthase (MDS, IspF)
- 4-Hydroxy-3-methylbut-2-en-1-yl diphosphate synthase (HDS, IDS)
- 4-Hydroxy-3-methylbut-2-en-1-yl diphosphate reductase (HDR)
[0210] Further alternative routes for preparing IPP and DMAPP are known; see, for example, Rinaldi, MA, et al. (2022). Natural Product Reports 39(1): 90-118. https://doi.org/10.1039/D1NP00025J (see Part 3 of this paper).
[0211] Accordingly, the inventors have provided a complete biocatalytic route for preparing compounds of formula (I) from acetyl-CoA or from glyceraldehyde 3-phosphate (GAP) and pyruvate. This multi-step biocatalytic method is described herein for the first time and represents a significant advance in the preparation of such compounds.
[Reaction conditions of the method of the present invention]
[0212] The method of the present invention may be an in vivo method or a biotransformation method.
[0213] The term "in vivo" (or whole-cell production, in vivo production, or in vivo biosynthesis) refers to a method of converting a carbon source into a new compound, for example, a carbon source into a terpene or terpene-derived compound, using metabolically active cells (preferably microbial cells) whose primary metabolism is active and produces the precursors for the method of the present invention.
[0214] Preferred carbon sources are sugars such as monosaccharides, disaccharides, or polysaccharides. Very good carbon sources include, for example, glucose, fructose, mannose, galactose, ribose, sorbose, ribulose, lactose, maltose, sucrose, raffinose, starch, or cellulose. Sugars can also be added to the culture medium via complex compounds such as molasses or other by-products from sugar refining. Adding mixtures of various carbon sources may also be advantageous. Other possible carbon sources include oils and fats, such as soybean oil, sunflower oil, peanut oil, and coconut oil; fatty acids, such as palmitic acid, stearic acid, or linoleic acid; alcohols, such as glycerol, methanol, or ethanol; and organic acids, such as acetic acid or lactic acid.
[0215] Therefore, the cells contain all the enzymes of one or more biosynthetic pathways. At least some of the enzymes involved in this process are part of the cell's primary metabolism. For example, a cell may contain enzymes in a pathway that converts a carbon source (e.g., glucose, glycerol, isoprenol, prenol, CO2) to a terpenoid precursor (e.g., IPP, DMAPP, FPP, GGPP), and enzymes in a pathway that converts a terpene precursor to a terpene or terpene-derived molecule such as the compound of formula (I). The enzymes may be naturally present in the cell, or the cell may be transformed to produce the enzymes.
[0216] It is important to note that, until the present invention, it was impossible to prepare the compound of formula (I) using an in vivo method. The prior art at the time the present invention was conceived can be seen, for example, in International Publication No. 2016170099 and International Publication No. 2010139719. Here, it is found that the existing methods for preparing the compound of formula (I) were not in vivo, and therefore the present invention provides a significant advance in the methods disclosed therein.
[0217] Alternatively, the method of the present invention can be carried out under bioconversion conditions, also known as biotransformation conditions. Bioconversion refers to a method of converting a compound into various products using biological methods or agents, such as enzymes or whole cells (preferably microbial cells). Bioconversion does not involve the use of primary cellular metabolism (as defined above) to produce precursors for the method of the present invention. Bioconversion may involve multi-step reactions, each carried out by a different enzyme. The compounds used in bioconversion can be extracted from natural sources or produced using other chemical or biochemical methods.
[0218] The at least one polypeptide/enzyme present during the biotransformation method of the present invention as defined herein, or during an individual step of a multi-step method, may be present in living cells that naturally or recombinantly produce the enzyme, in harvested cells, in dead cells, in permeabilized cells, in crude cell extracts, in purified extracts, or in substantially pure or completely pure form, i.e., under biotransformation conditions. Such extracts may comprise membrane fractions or liquid fractions prepared from recombinant host cells expressing the at least one polypeptide/enzyme. The cells can be immobilized on suitable supports as known in the art. The at least one polypeptide/enzyme may be present in solution or as an enzyme immobilized on a carrier. One or more enzymes may be present simultaneously in soluble and/or immobilized forms.
[0219] Those skilled in the art can understand that there are advantages to using the in vivo method.
[0220] In particular, biotransformation methods typically involve multiple steps:
- Preparation or isolation of the starting compound to be converted. This compound can be prepared using chemical or biochemical methods, or by extraction from natural sources.
- Production of the enzymes or (living) cells used in the biotransformation.
- The biotransformation reaction itself, involving contact between the compound and the enzyme or (living) cells.
- Recovery and purification of the product.
[0221] In comparison, the in vivo method generally requires fewer steps, which are limited to the following:
- Culturing microorganisms under conditions suitable for the production of the desired compound.
- Harvesting of the cells or growth medium and purification of the desired compound.
[0222] In biotransformation methods, such as the biotransformation of compounds of formula (VI), the addition of surfactants is often required to promote the solubilization of the compound or to maximize contact with the biocatalyst. In in vivo methods, the reactants and enzymes are produced intracellularly, and the addition of surfactants is not necessary.
[0223] Therefore, in vivo methods are generally more efficient and cost-effective than biotransformation methods.
[0224] Laboratory methods that can be used in the in vivo and biotransformation methods of the present invention are well known in the art. Several such methods will be discussed below.
[0225] The bioconversion method according to the present invention can be carried out in a general reactor known to those skilled in the art, and on a range of scales, for example, from laboratory scale (reaction volume of a few milliliters to tens of liters) to industrial scale (reaction volume of a few liters to several thousand cubic meters). When the polypeptides are used in a form encapsulated by non-living, optionally permeabilized cells, in the form of a more or less purified cell extract, or in purified form, a chemical reactor can be used. A chemical reactor typically allows control of the amount of at least one enzyme, the amount of at least one substrate, the pH of the reaction medium, the temperature, and the circulation.
[0226] When the method of the present invention is performed in vivo, it is preferable to carry out the reaction in a fermenter in which parameters necessary for appropriate living conditions for living cells (e.g., culture medium containing nutrients, temperature, aeration, presence or absence of oxygen or other gases, antibiotics, etc.) can be controlled.
[0227] The terms "fermentation production" or "fermentation" refer to the ability of microorganisms to produce chemical compounds in cell culture using at least one carbon source added to the incubation (assisted by the enzyme activity contained in or produced by such microorganisms).
[0228] The terms “fermentation broth” or “fermentation medium” are understood to mean a liquid, particularly aqueous or aqueous / organic solution, that is based on a fermentation process and is either untreated or treated as described herein, for example.
[0229] Those skilled in the art are familiar with chemical reactors or bioreactors, for example, procedures for scaling chemical or biotechnological methods from laboratory to industrial scale, or procedures for optimizing process parameters, which are widely described in the literature (for biotechnological methods, see, for example, Crueger und Crueger, Biotechnologie - Lehrbuch der angewandten Mikrobiologie, 2. Ed., R. Oldenbourg Verlag, Muenchen, Wien, 1984).
[0230] The culture medium to be used must appropriately meet the requirements of the specific strain. Descriptions of culture media for various microorganisms are found in the handbook "Manual of Methods for General Bacteriology" of the American Society for Microbiology (Washington DC, USA, 1981).
[0231] These culture media that can be used according to the present invention may contain one or more carbon sources, nitrogen sources, inorganic salts, vitamins, and / or trace elements.
[0232] Nitrogen sources are typically organic or inorganic nitrogen compounds, or materials containing these compounds. Examples of nitrogen sources include ammonia gas, or ammonium salts such as ammonium sulfate, ammonium chloride, ammonium phosphate, ammonium carbonate, or ammonium nitrate, nitrates, urea, amino acids, or complex nitrogen sources such as corn steep liquor, soy flour, soy protein, yeast extract, or meat extract. Nitrogen sources can be used separately or as mixtures.
[0233] Inorganic salt compounds that may be present in the culture medium include chlorides, phosphates, or sulfates of calcium, magnesium, sodium, cobalt, molybdenum, potassium, manganese, zinc, copper, and iron.
[0234] Inorganic sulfur-containing compounds, such as sulfates, sulfites, dithionites, tetrathionates, thiosulfates, and sulfides, as well as organic sulfur compounds such as mercaptans and thiols, can also be used as sulfur sources.
[0235] Phosphate, potassium dihydrogen phosphate, or dipotassium hydrogen phosphate, or the corresponding sodium-containing salts can be used as a phosphorus source.
[0236] To retain metal ions in solution, chelating agents can be added to the culture medium. Particularly suitable chelating agents include dihydroxyphenols such as catechol or protocatechuate, or organic acids such as citric acid.
[0237] The fermentation medium used in this invention may also contain other growth factors, such as vitamins or growth promoters, including, for example, biotin, riboflavin, thiamine, folic acid, nicotinic acid, pantothenate, and pyridoxine. The growth factors and salts are often derived from the complex components of the medium, such as yeast extract, molasses, and corn steep liquor. Furthermore, appropriate precursors can be added to the culture medium. The exact composition of compounds in the medium is highly dependent on the specific experiment and must be determined individually for each specific case. Information on optimizing the medium can be found in the textbook "Applied Microbiol. Physiology, A Practical Approach" (1997). Growth media, such as Standard 1 (Merck) or BHI (Brain Heart Infusion, DIFCO), can also be obtained from commercial suppliers.
[0238] All components of the culture medium are sterilized by either heat (1.5 bar and 121°C for 20 minutes) or sterile filtration. These components may be sterilized together or separately as needed. All components of the culture medium may be present at the start of growth or may be added selectively in a continuous or batch feed.
[0239] The culture temperature is typically 15°C to 45°C, preferably 25°C to 40°C, and can be kept constant or varied during the experiment. The pH of the culture medium should be in the range of 5 to 8.5, preferably about 7.0. The pH value for growth can be controlled during growth by adding basic compounds such as sodium hydroxide, potassium hydroxide, ammonia, or aqueous ammonia, or acidic compounds such as phosphoric acid or sulfuric acid. To control foaming, an antifoaming agent, such as a fatty acid polyglycol ester, can be used. To maintain plasmid stability, a suitable substance with selective activity, such as an antibiotic, can be added to the culture medium. To maintain aerobic conditions, oxygen or an oxygen-containing gas mixture, such as ambient air, is supplied to the culture. The culture temperature is typically 20°C to 45°C. Culturing continues until the maximum amount of the desired product is formed. This is usually achieved within 1 to 160 hours.
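The temperature and pH ranges stated in this paragraph can be expressed as a simple range check. The Python sketch below does so for illustration only; it uses the 15°C to 45°C typical range, the 25°C to 40°C preferred range, and the pH 5 to 8.5 range given above, and it does not model the preference for a pH of "about 7.0", which the paragraph does not quantify. The function name and the returned labels are hypothetical.

```python
# Illustrative sketch only; names and labels are hypothetical.
TEMP_TYPICAL_C = (15.0, 45.0)    # typical culture temperature range
TEMP_PREFERRED_C = (25.0, 40.0)  # preferred culture temperature range
PH_RANGE = (5.0, 8.5)            # pH range of the culture medium


def _within(value: float, bounds: tuple) -> bool:
    """True if value lies inside the closed interval `bounds`."""
    low, high = bounds
    return low <= value <= high


def classify_culture_conditions(temp_c: float, ph: float) -> str:
    """Return 'preferred', 'typical', or 'outside' for a temperature/pH pair."""
    if not _within(ph, PH_RANGE) or not _within(temp_c, TEMP_TYPICAL_C):
        return "outside"
    if _within(temp_c, TEMP_PREFERRED_C):
        return "preferred"
    return "typical"


print(classify_culture_conditions(30.0, 7.0))
print(classify_culture_conditions(18.0, 6.0))
print(classify_culture_conditions(50.0, 7.0))
```

The appropriate set points still depend on the specific strain, as stated above for the culture medium.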
[0240] When the method of the present invention is a biotransformation, cells containing at least one enzyme can be made permeable by physical or mechanical means such as ultrasonic pulses, radiofrequency pulses, or a French press, or by chemical means such as hypotonic media, lytic enzymes, and surfactants present in the medium, or by a combination of such methods. Examples of surfactants include SDS, digitonin, n-dodecyl maltoside, octyl glycoside, Triton® X-100, Tween® 20, deoxycholate, CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate), Nonidet® P40 (ethylphenol poly(ethylene glycol ether)), etc. As previously stated, when the method of the present invention is an in vivo method, surfactants are not required for the reasons described herein.
[0241] The conversion reaction can be carried out in batch, semi-batch, or continuous manner. Reactants (and optionally nutrients) can be supplied at the start of the reaction or later semi-continuously or continuously.
[0242] The biotransformation reactions of the present invention can be carried out in aqueous, aqueous organic, or non-aqueous reaction media, depending on the specific reaction type.
[0243] Aqueous or aqueous organic culture media may contain a suitable buffer for adjusting the pH to a value in the range of 5 to 11, for example, 6 to 10.
[0244] In aqueous organic media, organic solvents that are miscible, partially miscible, or immiscible with water can be used. A non-limiting list of suitable organic solvents is provided below. Further examples include monohydric or polyhydric aromatic or aliphatic alcohols, particularly polyhydric aliphatic alcohols, such as glycerol.
[0245] The non-aqueous medium may be substantially water-free, i.e., it may contain less than about 1% by weight, or less than 0.5% by weight, of water.
[0246] The biotransformation method can also be carried out in an organic non-aqueous medium. Suitable organic solvents include aliphatic hydrocarbons having 5 to 8 carbon atoms, such as pentane, cyclopentane, hexane, cyclohexane, heptane, octane, or cyclooctane; aromatic hydrocarbons such as benzene, toluene, xylene, chlorobenzene, or dichlorobenzene; aliphatic acyclic substances; and ethers such as diethyl ether, methyl-tert-butyl ether, ethyl-tert-butyl ether, dipropyl ether, diisopropyl ether, dibutyl ether, or mixtures thereof.
[0247] The reactant / substrate concentrations can be adjusted to suit the optimal biotransformation reaction conditions, which may depend on the specific enzyme being applied. For example, the initial substrate concentration may be 0.1–0.5 M, e.g., 10–100 mM.
[0248] The reaction temperature for bioconversion can be adjusted to suit the optimal reaction conditions, which may depend on the specific enzyme being applied. For example, the reaction can be carried out at temperatures ranging from 0 to 70°C, e.g., 20 to 50°C or 25 to 40°C. Examples of reaction temperatures include approximately 30°C, 35°C, 37°C, 40°C, 45°C, 50°C, 55°C, and 60°C.
[0249] Bioconversion can proceed until equilibrium is achieved between the substrate and the subsequent product, but it may be stopped earlier. Typical process times range from 1 minute to 25 hours, particularly from 10 minutes to 6 hours, and for example, from 1 hour to 4 hours, especially from 1.5 hours to 3.5 hours. These parameters are non-limiting examples of suitable process conditions.
[0250] Advantageously, microorganisms such as bacteria, fungi, or yeasts are used as host organisms. Advantageously, Gram-positive or Gram-negative bacteria are used, preferably from the families Enterobacteriaceae, Pseudomonadaceae, Rhizobiaceae, Streptomycetaceae, Streptococcaceae, or Nocardiaceae, particularly preferably from the genera Escherichia, Pseudomonas, Streptomyces, Lactococcus, Nocardia, Burkholderia, Salmonella, Agrobacterium, Clostridium, or Rhodococcus. The genus and species Escherichia coli is particularly preferred. Furthermore, other advantageous bacteria are found in the groups of alpha-proteobacteria, beta-proteobacteria, or gamma-proteobacteria. Yeasts of genera such as Saccharomyces or Pichia are also suitable hosts.
[0251] Preferably, the cells are bacterial or fungal cells, particularly yeast cells. Preferably, the cells are unicellular organisms, cultured cells derived from multicellular organisms, cells present in cultured tissues derived from multicellular organisms, or cells present in living multicellular organisms. Preferably, the cells are bacterial cells of the genus Escherichia, preferably E. coli, or yeast cells of the genus Saccharomyces, preferably S. cerevisiae, Yarrowia, preferably Y. lipolytica, or Pichia, preferably P. pastoris.
[0252] Alternatively, whole plants or plant cells can function as natural or recombinant hosts. Non-limiting examples include the following plants or cells derived therefrom: the genus Nicotiana, in particular Nicotiana benthamiana and Nicotiana tabacum (tobacco), and the genus Arabidopsis, in particular Arabidopsis thaliana.
[Isolation of the product]
[0253] The method of the present invention may further include a step of recovering a final product or intermediate product, optionally in a stereoisomerically or enantiomerically substantially pure form. The term "recover" includes extracting, harvesting, isolating, or purifying a compound from a culture medium or reaction medium. Recovery of a compound can be carried out by any conventional isolation or purification methodology known in the art, including but not limited to treatment with conventional resins (e.g., anion or cation exchange resins, nonionic adsorbent resins, etc.), treatment with conventional adsorbents (e.g., activated carbon, silicic acid, silica gel, cellulose, alumina, etc.), pH changes, solvent extraction (e.g., using conventional solvents such as alcohol, ethyl acetate, hexane, etc.), distillation, dialysis, filtration, concentration, crystallization, recrystallization, pH adjustment, freeze-drying, etc.
[0254] The identity and purity of isolated products can be determined by known techniques such as high-performance liquid chromatography (HPLC), gas chromatography (GC), spectroscopy (IR, UV, NMR, etc.), colorimetric methods, TLC, NIRS, enzyme assays, or microbial assays (see, e.g., Patek et al. (1994) Appl. Environ. Microbiol. 60:133-140; Malakhova et al. (1996) Biotekhnologiya 11:27-32; Schmidt et al. (1998) Bioprocess Engineer. 19:67-70; Ullmann's Encyclopedia of Industrial Chemistry (1996) Vol. A27, VCH: Weinheim, pp. 89-90, 521-540, 540-547, 559-566, 575-581, and 581-587; Michal, G. (1999) Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology, John Wiley and Sons; Fallon, A. et al. (1987) Applications of HPLC in Biochemistry, in: Laboratory Techniques in Biochemistry and Molecular Biology, Vol. 17).
[0255] Compounds produced by any of the methods described herein can be converted into derivatives such as hydrocarbons, esters, amides, glycosides, ethers, epoxides, aldehydes, ketones, alcohols, diols, acetals, or ketals. Terpene compound derivatives can be obtained by chemical methods such as, but not limited to, oxidation, reduction, alkylation, acylation, and/or rearrangement. Alternatively, terpene compound derivatives can be obtained by biochemical methods, which involve contacting the terpene compound with an enzyme such as, but not limited to, an oxidoreductase, monooxygenase, dioxygenase, transferase, or terpene cyclase. The biochemical conversion may be carried out in vitro using isolated enzymes or enzymes from lysed cells, or by bioconversion using whole cells.
[Recombinant cells of the present invention]
[0256] As discussed herein, the inventors have, for the first time, developed an in vivo method for preparing the compound of formula (I). To achieve this, the inventors created a biosynthetic pathway to the compound of formula (I) in recombinant cells. Until the present invention, it was not known to prepare this compound using an in vivo method involving recombinant cells. Therefore, until the present invention, recombinant cells producing the compound of formula (I) were not known in the art. The compound may be present in the recombinant cells or can be transported to a reaction medium.
[0257] Furthermore, as previously stated, the present invention has achieved the production of the compound of formula (VI) in the form of formula (VIa) with high selectivity, as demonstrated in the attached examples.
[0258] This in turn provides a method for preparing olfactorily favorable forms of the compound of formula (I). Therefore, the method of the present invention for preparing the compound of formula (I) without a considerable amount of undesirable byproducts is of great commercial importance. As shown in the attached examples, more than 97% of the compound of formula (I) is in the form of formula (Ia) and/or (Ib).
[0259] Accordingly, one embodiment of the present invention is a recombinant cell in which more than 97% of the compound of formula (I) is in the form of formula (Ia) and/or (Ib).
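The "more than 97%" criterion is a ratio of the desired forms to all forms of the compound of formula (I). The Python sketch below illustrates that ratio under the assumption that each form has been quantified, for example as a chromatographic peak area proportional to its amount. The function name, the dictionary keys, and the numerical values are hypothetical and are not results from the examples.

```python
# Illustrative sketch only; names and numbers are hypothetical, not measured data.
def desired_form_percentage(amounts: dict, desired=("Ia", "Ib")) -> float:
    """Percentage of the total amount that is present in the desired forms."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total amount must be positive")
    desired_total = sum(amounts.get(form, 0.0) for form in desired)
    return 100.0 * desired_total / total


# Invented amounts (arbitrary units) for forms of the compound of formula (I).
amounts = {"Ia": 960.0, "Ib": 25.0, "other": 15.0}

share = desired_form_percentage(amounts)
print(share)
print(share > 97.0)
```

How the individual forms are quantified is determined by the analytical method used, which this sketch does not address.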
[0260] Accordingly, a further aspect of the present invention provides recombinant cells that can produce, or produce, a compound of formula (I), and one or more compounds of formulas (II), (III), (IV), (V), and / or (VI).
[0261] This specification provides for preparing recombinant cells that contain, or can produce, or produce a compound of formula (I) and one or more compounds of formulas (II), (III), (IV), (V) and / or (VI). Preferably, the cells contain (i) a polypeptide having ADH enzyme activity, (ii) a polypeptide having enal cleavage enzyme activity, (iii) a polypeptide having BVMO enzyme activity, (iv) a polypeptide having esterase enzyme activity, and (v) a polypeptide having terpene cyclase enzyme activity.
[0262] A preferred embodiment of the present invention is: (i) Polypeptides having ADH enzyme activity that have at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 11-21; (ii) A polypeptide having enal cleavage enzyme activity that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to SEQ ID NO: 22; (iii) A polypeptide having BVMO enzyme activity that has sequence identity of at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more with respect to any of SEQ ID NOs: 23-26 and 216-227; (iv) A polypeptide having esterase enzyme activity that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with either SEQ ID NO: 27 or 28; and / or (v) A polypeptide having terpene cyclase enzyme activity having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 29-75, 79-89, and 265-289, preferably with SEQ ID NOs: 29-75 and 265-289, and more preferably with any of SEQ ID NOs: 29-75, 265-274, and 276-289. That is the case.
[0263] In further preferred embodiments of the present invention: (i) the polypeptide having ADH enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity with SEQ ID NO: 11 or 21, and preferably has the sequence of SEQ ID NO: 11 or 21; (ii) the polypeptide having enal cleavage enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity with SEQ ID NO: 22, and preferably has the sequence of SEQ ID NO: 22; (iii) the polypeptide having BVMO enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity with SEQ ID NO: 25 or 26, and preferably has the sequence of SEQ ID NO: 25 or 26; (iv) the polypeptide having esterase enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity with SEQ ID NO: 28, and preferably has the sequence of SEQ ID NO: 28; and / or (v) the polypeptide having terpene cyclase enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity with any of SEQ ID NOs: 48, 57, 71, 74, 265, 266, 267, 268, 274, 276, 279, 280, 281, 282, 283, 286, 287, or 288, and preferably has the sequence of any of SEQ ID NOs: 48, 57, 71, 74, 265, 266, 267, 268, 274, 276, 279, 280, 281, 282, 283, 286, 287, or 288.
[0264] Alternatively, in this embodiment of the present invention, (v) The polypeptide having terpene cyclase enzymatic activity is an SHC enzyme and has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NOs: 48, 265, 266, 267, 268, 274, 276, or 279, and preferably the SHC enzyme has the sequence of SEQ ID NOs: 48, 265, 266, 267, 268, 274, 276, or 279.
[0265] Alternatively, in this embodiment of the present invention, (v) The polypeptide having terpene cyclase enzyme activity is a meroterpenoid cyclase enzyme and has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NOs: 57, 71, 74, 280, 281, 282, 283, 286, 287, or 288, preferably the meroterpenoid cyclase enzyme has the sequence of SEQ ID NOs: 57, 71, 74, 280, 281, 282, 283, 286, 287, or 288.
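The sequence identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes that the two sequences have already been aligned (gaps written as "-") and that identity is counted over all alignment columns; the denominator used for percent identity differs between alignment tools, so this convention is an assumption and not a definition taken from this specification. The function names and the toy sequences are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment (gap character '-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Columns in which both sequences carry a gap are uninformative; skip them.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no informative columns")
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float = 50.0) -> bool:
    """True if the identity is at least `threshold` percent."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy aligned fragments (hypothetical; NOT taken from any SEQ ID NO).
reference = "MKTAYIAKQR"
candidate = "MKTAYLAK-R"
identity = percent_identity(reference, candidate)
print(f"identity = {identity:.1f}%")
print("meets 50% threshold:", meets_identity(reference, candidate, 50.0))
print("meets 85% threshold:", meets_identity(reference, candidate, 85.0))
```

In practice the alignment itself would be produced by an established alignment program; the sketch only shows how a candidate polypeptide could be checked against one of the recited thresholds once an alignment is available.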
[0266] In further embodiments of the present invention: (i) the compound of formula (I) is in the form of [chemical formula]; (ii) the compound of formula (II) is in the form of [chemical formula]; (iii) the compound of formula (III) is in the form of [chemical formula]; (iv) the compound of formula (IV) is in the form of [chemical formula]; (v) the compound of formula (V) is in the form of [chemical formula]; and (vi) the compound of formula (VI) is in the form of [chemical formula].
[0267] Recombinant cells may be any such cells suitable for the production of the compound of formula (I).
[0268] Suitable cells for producing the compound of formula (I) have been listed above in connection with the method of the present invention, and those cells are equally suitable for this aspect of the present invention.
[0269] Preferably, the cells are bacterial or fungal cells, particularly yeast. Preferably, the cells are unicellular organisms, cultured cells derived from multicellular organisms, cells present in cultured tissues derived from multicellular organisms, or cells present in living multicellular organisms. Preferably, the cells are bacterial cells of the genus Escherichia, preferably E. coli, or yeast cells of the genus Saccharomyces, preferably S. cerevisiae, Yarrowia, preferably Y. lipolytica, or Pichia, preferably P. pastoris.
[0270] Methods for introducing recombinant nucleic acid sequences into such host cells are well-known in the art and are routine laboratory methodologies that do not need to be further described herein.
[0271] Embodiments of this aspect of the present invention further comprise a cell comprising one or more prenyltransferase enzymes. Preferably, the cell further comprises one or more enzymes having phosphatase activity.
[0272] As described above in relation to the method of the present invention, the cells of the present invention may also include further enzymes for providing the compound of formula (II). Such a method requires the presence of one or more prenyltransferase enzymes and one or more enzymes having phosphatase activity.
[0273] Preferably, the prenyltransferase is a GGPP synthase. Preferably, the GGPP synthase has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to SEQ ID NO: 1 or 2.
[0274] Preferably, the phosphatase is a GGPP phosphatase. Preferably, the GGPP phosphatase has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 3 to 10.
[0275] In a further embodiment of the present invention, the cells of the present invention contain enzymes for producing IPP and DMAPP.
[0276] As described above in relation to the method of the present invention, the cells of the present invention may also include further other enzymes for providing compounds of formula (II) via the “mevalonate pathway,” the methylerythritol phosphate (MEP) pathway, or alternative pathways for preparing IPP and DMAPP.
[0277] In one embodiment of the present invention, the cell comprises the following enzymes of the mevalonate pathway: acetyl-CoA acetyltransferase (ACAT); 3-hydroxy-3-methylglutaryl-CoA synthase (HMG-CoA synthase); 3-hydroxy-3-methylglutaryl-CoA reductase (HMG-CoA reductase); mevalonate kinase; phosphomevalonate kinase; mevalonate diphosphate decarboxylase; isopentenyl diphosphate isomerase; and dimethylallyl diphosphate synthase.
[0278] In a further embodiment of the present invention, the cell comprises the following enzymes of the MEP pathway: 1-deoxy-D-xylulose 5-phosphate synthase (DXS); 1-deoxy-D-xylulose 5-phosphate reductoisomerase (DXR); 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase (MCT, IspD); 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase (CMK, IspE); 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase (MDS, IspF); 4-hydroxy-3-methylbut-2-en-1-yl diphosphate synthase (HDS, IDS); and 4-hydroxy-3-methylbut-2-en-1-yl diphosphate reductase (HDR).
[0279] As previously stated, when used in the method of the present invention, meroterpenoid cyclase produces isomers of the compound of formula (I) that have a preferred olfactory profile, i.e., the compound of formula (I) is obtained with an isomer bias toward formula (Ia) and / or (Ib) rather than formula (Ic) and / or (Id). In particular, the inventors have demonstrated that less than 1% of the compound of formula (I) produced by the method of the present invention was in the isomeric form of formula (Ic) and / or (Id). Therefore, the use of meroterpenoid cyclase offers remarkable technical advantages compared to the use of SHC enzymes.
[0280] Therefore, according to this, a preferred embodiment of the present invention is one in which the recombinant cells contain meroterpenoid cyclase as the terpene cyclase enzyme.
[0281] Preferably, the meroterpenoid cyclase is a membrane-integrated meroterpenoid cyclase. From the attached examples, it can be seen that this class of meroterpenoid cyclases preferably produces compounds of formula (Ia) rather than formulas (Ic) and / or (Id). Examples of membrane-integrated meroterpenoid cyclases include those described in any of SEQ ID NOs: 50-73 and 280-289.
[0282] Preferably, the meroterpenoid cyclase is a soluble meroterpenoid cyclase. From the attached examples, it can be seen that meroterpenoid cyclases of this class preferably produce compounds of formula (Ib) rather than formulas (Ic) and / or (Id). Examples of soluble meroterpenoid cyclases are those described in SEQ ID NO: 74 or 75.
[0283] Preferably, in the recombinant cells of the present invention, more than 97% of the compound of formula (I) is in the form of formula (Ia) and / or (Ib), rather than formula (Ic) and / or (Id).
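The 97% criterion can be expressed as a short calculation. The following Python sketch assumes that relative amounts of the four isomers (Ia) to (Id) are available, for example as integrated chromatographic peak areas; that measurement method, the function name and the numerical values are hypothetical illustrations and are not data from the attached examples.

```python
from typing import Mapping


def favoured_isomer_percentage(amounts: Mapping[str, float]) -> float:
    """Percentage of the compound of formula (I) present as (Ia) and/or (Ib)."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total amount must be positive")
    favoured = amounts.get("Ia", 0.0) + amounts.get("Ib", 0.0)
    return 100.0 * favoured / total


# Hypothetical relative amounts of each isomer (NOT data from the examples).
sample = {"Ia": 96.5, "Ib": 2.0, "Ic": 1.0, "Id": 0.5}
percentage = favoured_isomer_percentage(sample)
passes = percentage > 97.0  # "more than 97%" is a strict inequality
print(f"(Ia)+(Ib) = {percentage:.1f}% of formula (I); criterion met: {passes}")
```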
[Cell culture fermentation medium of the present invention]
[0284] As discussed herein, the inventors have for the first time developed an in vivo method for preparing the compound of formula (I). To achieve this, the inventors created a biosynthetic pathway to the compound of formula (I) in recombinant cells that are subsequently cultured in a suitable cell culture fermentation medium. Until the present invention, it had not been possible to prepare this compound using an in vivo method involving recombinant cells in a cell culture fermentation medium. Therefore, until the present invention, a cell culture fermentation medium containing the recombinant cells and / or the compound of formula (I) of the present invention was unknown in the art.
[0285] Furthermore, as previously stated, the present invention has achieved the production of the compound of formula (VI) in the form of formula (VIa) with high selectivity, as demonstrated in the attached examples.
[0286] Consequently, this provides a method for preparing olfactorily favorable forms of the compound of formula (I). A method of the present invention that prepares the compound of formula (I) without considerable amounts of undesirable byproducts is therefore of great commercial importance. As shown in the attached examples, more than 97% of the compound of formula (I) is in the form of formula (Ia) and / or (Ib).
[0287] Therefore, according to this, one embodiment of the present invention is a cell culture fermentation medium comprising a compound of formula (I), wherein more than 97% of the compound of formula (I) is in the form of formula (Ia) and / or (Ib).
[0288] Accordingly, further embodiments of the present invention provide a cell culture fermentation medium containing recombinant cells as described herein. The cell culture fermentation medium may further contain a compound of formula (I) and / or one or more compounds of formulas (II), (III), (IV), (V) and / or (VI).
[0289] As discussed herein, the inventors have for the first time established a biosynthetic pathway to the compound of formula (I) in recombinant cells. Such cells are then grown under conditions suitable for the production of the compound, using cell culture fermentation media appropriate for the particular cell type.
[0290] The cell culture fermentation medium may be a nutrient-rich broth for both cell growth and maintenance during the production phase. Yeast culture conditions for maintaining and propagating various strains may require specific formulations of complex media for use in cloning and protein expression, as can be understood by those skilled in the art. Commercially available culture media can be used, for example, from ThermoFisher. The medium may be a YPD broth or may have a yeast nitrogen base. Yeast can be grown in YPD or synthetic media at 30°C.
[0291] Typically, lysogeny broth (LB) is used for bacterial cells. Bacterial cells may carry antibiotic resistance so that an antibiotic can be added to prevent the growth of contaminating cells in the culture medium. Cells may have antibiotic resistance gene cassettes conferring resistance to antibiotics such as chloramphenicol, penicillin, kanamycin, and ampicillin.
[Reaction mixture containing the compound of the present invention]
[0292] The method of the present invention may be a biotransformation method.
[0293] As discussed herein, the inventors have for the first time developed a biotransformation method for preparing the compound of formula (I). To achieve this, the inventors created a biosynthetic pathway to the compound of formula (I). Until the present invention, it had not been possible to prepare this compound using biotransformation methods.
[0294] Furthermore, as previously stated, the present invention has achieved the production of the compound of formula (VI) in the form of formula (VIa) with high selectivity, as demonstrated in the attached examples.
[0295] Consequently, this results in a biotransformation that prepares an olfactorily favorable form of the compound of formula (I). A biotransformation of the present invention that prepares the compound of formula (I) without considerable amounts of undesirable byproducts is therefore of great commercial importance. As shown in the attached examples, more than 97% of the compound of formula (I) is in the form of formula (Ia) and / or (Ib).
[0296] Accordingly, one embodiment of the present invention is a reaction mixture comprising a compound of formula (I), wherein more than 97% of the compound of formula (I) is in the form of formula (Ia) and / or (Ib). The reaction mixture may further comprise one or more compounds of formulas (II), (III), (IV), (V), and / or (VI).
[0297] Further components of the reaction mixture include surfactants, cofactors, cells, cell debris, cell culture media, and other such components well known to those skilled in the art.
[Preparation of the compound of formula (VI)]
[0298] As previously presented, in order to provide an improved method for preparing the compound of formula (I), the inventors have developed a deep understanding of the biochemical pathway that produces this compound from a precursor compound by a multi-enzyme reaction. This multi-enzyme reaction represents the first instance in which this compound has been prepared by such a stepwise reaction, and represents a significant scientific and commercial advance in the preparation of the sesquiterpene compound of formula (I).
[0299] In addition to preparing the compound of formula (I), the inventors have also devised a method for preparing the compound of formula (VI). This compound is a commercially important precursor for preparing the compound of formula (I) by either subsequent biotransformation or chemical methods.
[0300] Accordingly, a further aspect of the present invention provides a method for preparing a compound of formula (VI) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, the method comprising the steps of: (i) contacting a compound of formula (IV) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having BVMO enzyme activity to produce a compound of formula (V); and (ii) contacting the compound of formula (V) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having esterase enzyme activity to produce the compound of formula (VI).
[0301] In one embodiment of the present invention, the method further comprises a preliminary step of: (a) contacting a compound of formula (III) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having enal cleavage enzyme activity to produce the compound of formula (IV).
[0302] In a further embodiment of the present invention, the method further comprises a preliminary step of: (a) contacting a compound of formula (II) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having ADH enzyme activity to produce the compound of formula (III).
[0303] As discussed herein, the inventors have for the first time established a biosynthetic pathway to the compound of formula (I) in recombinant cells. In preparing such a pathway, the inventors also prepared cells capable of producing the compound of formula (VI). Therefore, this aspect of the present invention is not disclosed in the prior art.
[0304] It will be understood that the method for preparing the compound of formula (VI) includes steps (i) to (iv) of the method for preparing the compound of formula (I). Accordingly, all embodiments presented herein in relation to the method for preparing the compound of formula (I) can be used in this embodiment of the present invention, except for those concerning the polypeptide of step (v) of the method for preparing the compound of formula (I).
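The relationship between the full method and its truncated variants can be sketched as follows. The Python example below is an illustration only: it encodes the order of conversions described in this specification (formula (II) to (III) by ADH, (III) to (IV) by the enal cleavage enzyme, (IV) to (V) by BVMO, (V) to (VI) by esterase, and (VI) to (I) by terpene cyclase) and returns the enzyme activities needed between a chosen starting compound and a chosen product. The names CASCADE and enzymes_required are hypothetical helpers introduced for this sketch.

```python
# Ordered steps of the cascade: (substrate formula, enzyme activity, product formula).
CASCADE = [
    ("II", "ADH", "III"),
    ("III", "enal cleavage enzyme", "IV"),
    ("IV", "BVMO", "V"),
    ("V", "esterase", "VI"),
    ("VI", "terpene cyclase", "I"),
]


def enzymes_required(start: str, target: str) -> list:
    """Enzyme activities needed to convert formula `start` into formula `target`."""
    substrates = [substrate for substrate, _, _ in CASCADE]
    products = [product for _, _, product in CASCADE]
    if start not in substrates:
        raise ValueError(f"formula ({start}) is not a substrate in the cascade")
    if target not in products:
        raise ValueError(f"formula ({target}) is not a product in the cascade")
    first = substrates.index(start)
    last = products.index(target)
    if last < first:
        raise ValueError("target lies upstream of the starting compound")
    return [enzyme for _, enzyme, _ in CASCADE[first:last + 1]]


# Steps (i) to (iv): formula (II) to formula (VI), no terpene cyclase needed.
print(enzymes_required("II", "VI"))
# Core method for formula (VI) starting from formula (IV).
print(enzymes_required("IV", "VI"))
```

Calling enzymes_required("II", "I") would return all five activities, which corresponds to the complete method for preparing the compound of formula (I).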
[0305] To avoid any doubt, the compound of formula (VI) is also known as homofarnesol, 4,8,12-trimethyltrideca-3,7,11-trien-1-ol; CAS number 35826-67-6.
[0306] The compound of formula (VI) may exist as any one of its stereoisomers or as a mixture thereof. Specifically, the compound may have the following structures and isomeric forms:
[chemical formula] (3E,7E)-homofarnesol; (3E,7E)-4,8,12-trimethyltrideca-3,7,11-trien-1-ol; CAS number 459-89-2.
[chemical formula] (3Z,7E)-homofarnesol; (3Z,7E)-4,8,12-trimethyltrideca-3,7,11-trien-1-ol; CAS number 138152-06-4.
[chemical formula] (3E,7Z)-homofarnesol; (3E,7Z)-4,8,12-trimethyltrideca-3,7,11-trien-1-ol; CAS number 2032064-12-1.
[chemical formula] (3Z,7Z)-homofarnesol; (3Z,7Z)-4,8,12-trimethyltrideca-3,7,11-trien-1-ol; CAS number 138152-08-6.
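The four geometric isomers listed above follow from the two double bonds (positions 3 and 7) that can each be E or Z; the double bond at position 11 carries two identical methyl groups at one end and therefore has no E/Z isomerism. A minimal Python sketch of this enumeration is given below; the helper name ez_isomers is hypothetical.

```python
from itertools import product


def ez_isomers(positions, base_name):
    """Enumerate every E/Z combination for the given double-bond positions."""
    labels = []
    for combo in product("EZ", repeat=len(positions)):
        descriptor = ",".join(f"{pos}{geom}" for pos, geom in zip(positions, combo))
        labels.append(f"({descriptor})-{base_name}")
    return labels


# Two stereogenic double bonds give 2**2 = 4 geometric isomers.
homofarnesol_isomers = ez_isomers((3, 7), "homofarnesol")
for name in homofarnesol_isomers:
    print(name)
```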
[0307] Furthermore, as previously stated, the present invention has achieved the production of compounds of formula (VI) in the form of formula (VIa) with high selectivity, as demonstrated in the appended examples. Therefore, a preferred embodiment of the present invention is a method for preparing compounds of formula (VI), wherein more than 99% of the compounds of formula (VI) are in the form of formula (VIa).
[0308] In further preferred embodiments of the present invention: (i) the polypeptide having BVMO enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity with SEQ ID NO: 25 or 26, and preferably has the sequence of SEQ ID NO: 25 or 26; and / or (ii) the polypeptide having esterase enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity with SEQ ID NO: 28, and preferably has the sequence of SEQ ID NO: 28.
[0309] In further preferred embodiments of the present invention: (i) the polypeptide having ADH enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity with SEQ ID NO: 11 or 21, and preferably has the sequence of SEQ ID NO: 11 or 21; and / or (ii) the polypeptide having enal cleavage enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity with SEQ ID NO: 22, and preferably has the sequence of SEQ ID NO: 22.
[0310] Preferably, this method is an in vivo method or a biotransformation method.
[0311] Furthermore, this aspect of the present invention includes recombinant cells that can produce, or produce, a compound of formula (VI), and one or more compounds of formulas (II), (III), (IV), and / or (V). The recombinant cells include (i) a polypeptide having ADH enzyme activity, (ii) a polypeptide having enal cleavage enzyme activity, (iii) a polypeptide having BVMO enzyme activity, and (iv) a polypeptide having esterase enzyme activity.
[0312] Furthermore, this aspect of the present invention includes a method for producing the compound of formula (VI), comprising growing recombinant cells of this aspect of the present invention, as described above, under growth conditions suitable for the production of the compound of formula (VI).
[0313] Furthermore, this aspect of the present invention includes a cell culture fermentation medium containing recombinant cells of this aspect of the present invention. The cell culture fermentation medium may further contain the compound of formula (VI) and / or one or more compounds of formulas (II), (III), (IV), and / or (V).
[0314] Furthermore, this aspect of the present invention includes a reaction mixture comprising a compound of formula (VI) preferably in the form of formula (VIa). More preferably, more than 99% of the compound of formula (VI) is in the form of formula (VIa). The reaction mixture may further comprise one or more compounds of formulas (II), (III), (IV), and / or (V).
[0315] Furthermore, this aspect of the present invention includes compounds of formula (VI) obtained or obtainable by the method of this aspect of the present invention or from recombinant cells, cell culture fermentation or reaction mixtures of this aspect of the present invention.
[Preparation of the compound of formula (V)]
[0316] In addition to preparing the compound of formula (I), the inventors have also devised a method for preparing the compound of formula (V). This compound is a commercially important precursor for preparing the compound of formula (I) by either subsequent biotransformation or chemical methods.
[0317] Accordingly, a further aspect of the present invention provides a method for preparing a compound of formula (V) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, the method comprising the steps of: (i) contacting a compound of formula (III) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having enal cleavage enzyme activity to produce a compound of formula (IV); and (ii) contacting the compound of formula (IV) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having BVMO enzyme activity to produce the compound of formula (V).
[0318] In one embodiment of the present invention, the method further comprises a preliminary step of: (a) contacting a compound of formula (II) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having ADH enzyme activity to produce the compound of formula (III).
[0319] As discussed herein, the inventors have for the first time established a biosynthetic pathway to the compound of formula (I) in recombinant cells. In preparing such a pathway, the inventors also prepared cells capable of producing the compound of formula (V). Therefore, this aspect of the invention is not disclosed in the prior art.
[0320] It will be understood that the method for preparing the compound of formula (V) includes steps (i) to (iii) of the method for preparing the compound of formula (I). Accordingly, all embodiments presented herein in relation to the method for preparing the compound of formula (I) can be used in this embodiment of the present invention, except for those concerning the polypeptides of steps (iv) and (v) of the method for preparing the compound of formula (I).
[0321] To avoid any doubt, the compound of formula (V) is also known as homofarnesylacetate, i.e., 4,8,12-trimethyltrideca-3,7,11-triene-1-ylacetate; CAS number 109813-25-4.
[0322] The compound of formula (V) may exist as any one of its stereoisomers or as a mixture thereof. Specifically, the compound may have the following structures and isomeric forms:
[chemical formula] (3E,7E)-homofarnesylacetate; (3E,7E)-4,8,12-trimethyltrideca-3,7,11-triene-1-ylacetate; CAS number 944346-19-4.
[chemical formula] (3Z,7E)-homofarnesylacetate; (3Z,7E)-4,8,12-trimethyltrideca-3,7,11-triene-1-ylacetate; CAS number 1467099-77-9.
[chemical formula] (3E,7Z)-homofarnesylacetate; (3E,7Z)-4,8,12-trimethyltrideca-3,7,11-triene-1-ylacetate.
[chemical formula] (3Z,7Z)-homofarnesylacetate; (3Z,7Z)-4,8,12-trimethyltrideca-3,7,11-triene-1-ylacetate.
[0323] Preferably, this method is an in vivo method or a biotransformation method.
[0324] Furthermore, this aspect of the present invention includes recombinant cells that can produce, or produce, a compound of formula (V), and one or more compounds of formulas (II), (III), and / or (IV). The recombinant cells include (i) a polypeptide having ADH enzyme activity, (ii) a polypeptide having enal cleavage enzyme activity, and (iii) a polypeptide having BVMO enzyme activity.
[0325] Furthermore, this aspect of the present invention includes a method for producing the compound of formula (V), comprising growing recombinant cells of this aspect of the present invention, as described above, under growth conditions suitable for the production of the compound of formula (V).
[0326] Furthermore, this aspect of the present invention includes a cell culture fermentation medium containing recombinant cells of this aspect of the present invention. The cell culture fermentation medium may further contain a compound of formula (V) and / or one or more compounds of formulas (II), (III), and / or (IV).
[0327] Furthermore, this aspect of the present invention includes a reaction mixture comprising a compound of formula (V). The reaction mixture may further comprise one or more compounds of formulas (II), (III), and / or (IV).
[0328] Furthermore, this aspect of the present invention includes compounds of formula (V) obtained or obtainable by the method of this aspect of the present invention or from recombinant cells, cell culture fermentation or reaction mixtures of this aspect of the present invention.
[Preparation of the compound of formula (IV)]
[0329] In addition to preparing the compound of formula (I), the inventors have also devised a method for preparing the compound of formula (IV). This compound is a commercially important precursor for preparing the compound of formula (I) by either subsequent biotransformation or chemical methods.
[0330] Accordingly, a further aspect of the present invention provides a method for preparing a compound of formula (IV) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, the method comprising the steps of: (i) contacting a compound of formula (II) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having ADH enzyme activity to produce a compound of formula (III); and (ii) contacting the compound of formula (III) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having enal cleavage enzyme activity to produce the compound of formula (IV).
[0331] As discussed herein, the inventors have for the first time established a biosynthetic pathway to the compound of formula (I) in recombinant cells. In preparing such a pathway, the inventors also prepared cells capable of producing the compound of formula (IV). Therefore, this aspect of the invention is not disclosed in the prior art.
[0332] It will be understood that the method for preparing the compound of formula (IV) includes steps (i) and (ii) of the method for preparing the compound of formula (I). Accordingly, all embodiments presented herein in relation to the method for preparing the compound of formula (I) can be used in this embodiment of the present invention, except for those concerning the polypeptides of steps (iii), (iv), and (v) of the method for preparing the compound of formula (I).
[0333] To avoid any doubt, the compound of formula (IV) is also known as farnesylacetone, i.e., 6,10,14-trimethylpentadeca-5,9,13-trien-2-one; CAS number 762-29-8.
[0334] The compound of formula (IV) may exist as any one of its stereoisomers or as a mixture thereof. Specifically, the compound may have the following structures and isomeric forms:
[chemical formula] (5E,9E)-farnesylacetone; (5E,9E)-6,10,14-trimethylpentadeca-5,9,13-trien-2-one; CAS number 1117-52-8.
[chemical formula] (5Z,9E)-farnesylacetone; (5Z,9E)-6,10,14-trimethylpentadeca-5,9,13-trien-2-one; CAS number 1117-51-7.
[chemical formula] (5E,9Z)-farnesylacetone; (5E,9Z)-6,10,14-trimethylpentadeca-5,9,13-trien-2-one; CAS number 3053-35-3.
[chemical formula] (5Z,9Z)-farnesylacetone; (5Z,9Z)-6,10,14-trimethylpentadeca-5,9,13-trien-2-one; CAS number 3796-69-8.
[0335] Preferably, this method is an in vivo method or a biotransformation method.
[0336] Furthermore, this aspect of the present invention includes recombinant cells that can produce, or produce, a compound of formula (IV), and one or more compounds of formula (II) and / or formula (III). The recombinant cells include (i) a polypeptide having ADH enzyme activity and (ii) a polypeptide having enal cleavage enzyme activity.
[0337] Furthermore, this aspect of the present invention includes a method for producing the compound of formula (IV), comprising growing recombinant cells of this aspect of the present invention, as described above, under growth conditions suitable for the production of the compound of formula (IV).
[0338] Furthermore, this aspect of the present invention includes a cell culture fermentation medium containing recombinant cells of this aspect of the present invention. The cell culture fermentation medium may further contain a compound of formula (IV) and / or one or more compounds of formula (II) and / or formula (III).
[0339] Furthermore, this aspect of the present invention includes a reaction mixture comprising a compound of formula (IV). The reaction mixture may further comprise one or more compounds of formula (II) and / or formula (III).
[0340] Furthermore, this aspect of the present invention includes compounds of formula (IV) obtained or obtainable by the method of this aspect of the present invention or from recombinant cells, cell culture fermentation or reaction mixtures of this aspect of the present invention.
[Further embodiments for preparing the compound of formula (I)]
[0341] A further aspect of the present invention provides a method for preparing a compound of formula (I) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, the method comprising the step of: (i) contacting a compound of formula (VI) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having terpene cyclase enzyme activity to produce the compound of formula (I).
[0342] In one embodiment of this aspect of the present invention, more than 97% of the compounds of formula (I) are in the form of formula (Ia) and / or formula (Ib).
[0343] Another embodiment of this aspect of the present invention is one in which the compound of formula (VI) is in the form of formula (VIa).
[0344] A further aspect of the present invention is that the polypeptide having terpene cyclase enzymatic activity is a polypeptide that is not a squalene cyclase (SHC) enzyme, and / or a polypeptide that is a squalene cyclase enzyme. In the context of the present invention, the polypeptide that is not an SHC enzyme is a meroterpenoid cyclase enzyme.
[0345] Therefore, in a further aspect of the present invention, the polypeptide having terpene cyclase enzyme activity is a meroterpenoid cyclase enzyme and / or a squalene cyclase enzyme.
[0346] As can be understood in this embodiment of the present invention, a method for preparing a compound of formula (I) from a compound of formula (VI) includes step (v) of the method for preparing a compound of formula (I) in the first embodiment of the present invention. Accordingly, all embodiments presented herein in relation to step (v) of the method for preparing a compound of formula (I) in the first embodiment of the present invention can be used in this embodiment of the present invention.
[0347] In one embodiment of this aspect of the present invention, the squalene cyclase enzyme has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 29-49 and 265-279, preferably with any of SEQ ID NOs: 29-49, 265-274, and 276-279. More preferably, the squalene cyclase enzyme contains any of the sequences described in SEQ ID NOs: 29-49, 265-274, and 276-279.
[0348] In further embodiments, the squalene cyclase enzyme has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 29, 31, 33, 34, 36-38, 40, 41, 43-46, 48, 49, 265-274, and 276-279. Preferably, the squalene cyclase enzyme contains any of the sequences described in SEQ ID NOs: 29, 31, 33, 34, 36-38, 40, 41, 43-46, 48, 49, 265-274, and 276-279.
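The nested SEQ ID NO ranges recited for the squalene cyclase enzyme can be compared mechanically. The following Python sketch parses the range strings as written in the two preceding paragraphs and shows which identifier is dropped in the preferred set; parse_seq_ids is a hypothetical helper, and the sketch handles only the numbering, not the sequences themselves.

```python
def parse_seq_ids(spec: str) -> set:
    """Expand a string such as '29-49, 265-279' into a set of SEQ ID numbers."""
    ids = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = (int(value) for value in part.split("-"))
            ids.update(range(low, high + 1))  # ranges are inclusive
        else:
            ids.add(int(part))
    return ids


broad = parse_seq_ids("29-49, 265-279")
preferred = parse_seq_ids("29-49, 265-274, 276-279")
narrower = parse_seq_ids(
    "29, 31, 33, 34, 36-38, 40, 41, 43-46, 48, 49, 265-274, 276-279"
)

excluded = sorted(broad - preferred)
print("in the broad set but not the preferred set:", excluded)
print("narrower set is a subset of the preferred set:", narrower <= preferred)
```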
[0349] In preparing the method of the present invention, the inventors attempted to compare the isomer profiles of the compound of formula (I) synthesized by SHC and meroterpenoid cyclase.
[0350] To the surprise of the inventors, they found that when meroterpenoid cyclase is used in the method of the present invention, it produces a compound of formula (I) that has an isomer bias toward the isomers of the compound of formula (I) having a more favorable olfactory profile than that produced by the SHC enzyme.
[0351] In particular, the inventors demonstrated that less than 1% of the compound of formula (I) produced by the method of the present invention was in the isomeric form of formula (Ic) and / or (Id). Therefore, the use of meroterpenoid cyclases offers remarkable technical advantages compared to the use of SHC enzymes.
[0352] Preferably, the meroterpenoid cyclase is a membrane-integrated meroterpenoid cyclase. From the attached examples, it can be seen that this class of meroterpenoid cyclases preferentially produces compounds of formula (Ia) rather than formulas (Ic) and / or (Id). Examples of membrane-integrated meroterpenoid cyclases include those described in any of SEQ ID NOs: 50-73 and 280-289.
[0353] Alternatively, the meroterpenoid cyclase is preferably a soluble meroterpenoid cyclase. From the attached examples, it can be seen that meroterpenoid cyclases of this class preferentially produce compounds of formula (Ib) rather than formulas (Ic) and / or (Id). Examples of soluble meroterpenoid cyclases are those described in SEQ ID NO: 74 or 75.
[0354] This is the first time that a meroterpenoid cyclase has been used to prepare the compound of formula (I), and the bias toward such isomeric forms is remarkable and of industrial and commercial importance.
[0355] Therefore, a further aspect of the present invention provides a method for preparing a compound of formula (I) [chemical formula] in the form of any one of its stereoisomers or a mixture thereof, the method comprising (i) contacting a compound of formula (VI) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having terpene cyclase enzyme activity to produce the compound of formula (I), wherein the polypeptide having terpene cyclase enzyme activity is not an SHC enzyme.
[0356] A preferred embodiment of this aspect of the present invention is that the polypeptide, which is not an SHC enzyme, is a meroterpenoid cyclase enzyme.
[0357] As will be understood, meroterpenoid cyclase enzymes (or polypeptides having meroterpenoid cyclase enzyme activity) have been previously described in detail in connection with the first aspect of the present invention, and that description is incorporated by reference into this aspect of the present invention.
[0358] Preferably, the meroterpenoid cyclase enzyme has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 50-75 and 280-289. Preferably, the meroterpenoid cyclase enzyme contains any of the sequences described in SEQ ID NOs: 50-75 and 280-289.
[0359] Preferably, the meroterpenoid cyclase enzyme has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NOs: 57, 71, 74, 280, 281, 282, 283, 286, 287, or 288. Preferably, the meroterpenoid cyclase enzyme contains the sequence described in any of SEQ ID NOs: 57, 71, 74, 280, 281, 282, 283, 286, 287, or 288.
[0360] Preferably, the meroterpenoid cyclase enzyme is a membrane-integrated meroterpenoid cyclase. From the attached examples, it can be seen that meroterpenoid cyclases of this class preferentially produce compounds of formula (Ia) rather than formula (Ic) and / or (Id). Therefore, in a preferred embodiment, the meroterpenoid cyclase enzyme is a membrane-integrated meroterpenoid cyclase enzyme having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 50-73 and 280-289.
[0361] Alternatively, the meroterpenoid cyclase is preferably a soluble meroterpenoid cyclase. From the attached examples, it can be seen that meroterpenoid cyclases of this class preferentially produce compounds of formula (Ib) rather than formula (Ic) and / or (Id). Thus, in another preferred embodiment, the meroterpenoid cyclase enzyme is a soluble meroterpenoid cyclase enzyme having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with either of SEQ ID NOs: 74 and 75.
[0362] As can be understood in this embodiment of the present invention for preparing a compound of formula (I), the method may further include one or more steps prior to step (i), which have been previously described in detail in connection with a first aspect of the present invention and are incorporated herein by reference into this embodiment of the present invention.
[0363] Preferably, this method is an in vivo method or a biotransformation method.
[0364] Furthermore, this aspect of the present invention includes recombinant cells that contain, can produce, or produce a compound of formula (I), wherein more than 97% of the compounds of formula (I) are in the form of formula (Ia) and / or (Ib). In one embodiment, the recombinant cells contain, can functionally express, or functionally express polypeptides having SHC enzyme activity and / or polypeptides having meroterpenoid cyclase enzyme activity, as described herein above.
[0365] Furthermore, this aspect of the present invention includes a method for producing the compound of formula (I), comprising growing recombinant cells of this aspect of the present invention under growth conditions suitable for the production of the compound of formula (I).
[0366] Furthermore, this aspect of the present invention includes a cell culture fermentation medium containing recombinant cells of this aspect of the present invention. The cell culture fermentation medium may further contain a compound of formula (I), and optionally, more than 97% of the compound of formula (I) is in the form of formula (Ia) and / or (Ib). The cell culture fermentation medium may further contain a compound of formula (VI).
[0367] Furthermore, this aspect of the present invention includes a reaction mixture containing a compound of formula (I), wherein optionally, more than 97% of the compound of formula (I) is in the form of formula (Ia) and / or (Ib). The reaction mixture may further contain a compound of formula (VI).
[0368] Furthermore, this aspect of the present invention includes compounds of formula (I) obtained or obtainable by the method of this aspect of the present invention or from recombinant cells, cell culture fermentation media, or reaction mixtures of this aspect of the present invention.
[0369] Furthermore, this aspect of the present invention includes a compound of formula (I), more than 97% of which is in the form of formula (Ia) and / or (Ib).
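As a purely illustrative aid, the isomer criteria recited above (more than 97% of the compound of formula (I) in the form of formula (Ia) and / or (Ib), and less than 1% in the form of formula (Ic) and / or (Id)) could be checked from measured isomer amounts as sketched below in Python. The function name and the input values are hypothetical; the numbers are invented placeholders and are not results from the examples of this specification.

```python
# Illustrative sketch only: share of selected isomers in a mixture of
# formula (I) isomers. Input amounts may be any additive quantity
# (for example, integrated peak areas); this is an assumption.


def isomer_percent(amounts: dict, names) -> float:
    """Return the combined percentage of the named isomers in the mixture."""
    total = sum(amounts.values())
    if total <= 0:
        raise ValueError("total amount must be positive")
    return 100.0 * sum(amounts.get(name, 0) for name in names) / total


# Hypothetical placeholder values, NOT measured data.
example_amounts = {"Ia": 9800, "Ib": 150, "Ic": 30, "Id": 20}

favoured = isomer_percent(example_amounts, ("Ia", "Ib"))
unwanted = isomer_percent(example_amounts, ("Ic", "Id"))

print("Ia + Ib:", favoured, "-> more than 97%:", favoured > 97.0)
print("Ic + Id:", unwanted, "-> less than 1%:", unwanted < 1.0)
```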
[0370] A further aspect of the present invention is the use of a meroterpenoid cyclase enzyme for the production of compounds of formula (I) and / or derivatives thereof.
[Further embodiments for preparing the compound of formula (I)]
[0371] As previously presented, in order to devise an improved method for preparing the compounds of formula (I), the inventors have developed a deep understanding of the biochemical pathways that produce these compounds by multi-enzyme reactions from precursor compounds. This multi-enzyme reaction represents the first instance in which the preparation of these compounds has been carried out by such stepwise reactions, and represents a significant scientific and commercial advance in the preparation of sesquiterpene compounds of formula (I).
[0372] In addition to preparing the compound of formula (I) from the compound of formula (II), the inventors have also devised a method for preparing the compound of formula (I) from the compound of formula (V). This is also a commercially important method for preparing the compound of formula (I).
[0373] Accordingly, a further aspect of the present invention provides a method for preparing a compound of formula (I) [chemical formula] in the form of any one of its stereoisomers or a mixture thereof, the method comprising (i) contacting a compound of formula (V) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having esterase enzyme activity to produce a compound of formula (VI), and (ii) contacting the compound of formula (VI) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having terpene cyclase enzyme activity to produce the compound of formula (I).
[0374] Esterase enzymes (or polypeptides having esterase enzyme activity) and terpene cyclase enzymes (or polypeptides having terpene cyclase enzyme activity) have been previously described in detail in connection with a first aspect of the present invention, which are incorporated herein by reference into this aspect of the present invention.
[0375] In this embodiment of the present invention, a method for preparing the compound of formula (I) uses the compound of formula (V) as a starting material. The method in this embodiment of the present invention may be an in vivo method or a biotransformation method. Preferably, the method is a biotransformation method.
[0376] In a preferred embodiment of this aspect of the present invention, (i) the polypeptide having esterase enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NO: 27 or 28, preferably SEQ ID NO: 28; and / or (ii) the polypeptide having terpene cyclase enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 29-75, 79-89, and 265-289, more preferably with any of SEQ ID NOs: 29-75, 265-274, and 276-289, and even more preferably with any of SEQ ID NOs: 48, 57, 71, 74, 265, 266, 267, 268, 274, 276, 279, 280, 281, 282, 283, 286, 287, or 288.
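The nested SEQ ID NO lists recited for the terpene cyclase above can be hard to compare by eye. The following Python sketch, offered only as an illustration, expands such range strings into sets so that the narrowing from the broad list to the more preferred and most preferred lists can be checked. The helper name `expand_seq_ids` is hypothetical and the parsing rule (numbers, optionally joined by "-" or "to") is an assumption about how the lists are written.

```python
# Illustrative sketch only: expand SEQ ID NO range strings into sets of
# integers and compare the recited lists for the terpene cyclase.
import re


def expand_seq_ids(text: str) -> set:
    """Expand e.g. '29-75, 79-89, and 265-289' into a set of integers."""
    ids = set()
    for start, end in re.findall(r"(\d+)(?:\s*(?:-|to)\s*(\d+))?", text):
        first = int(start)
        last = int(end) if end else first
        if last < first:
            raise ValueError(f"descending range: {first}-{last}")
        ids.update(range(first, last + 1))
    return ids


broad = expand_seq_ids("29-75, 79-89, and 265-289")
more_preferred = expand_seq_ids("29-75, 265-274, and 276-289")
most_preferred = expand_seq_ids(
    "48, 57, 71, 74, 265, 266, 267, 268, 274, 276, 279, "
    "280, 281, 282, 283, 286, 287, or 288"
)

# Each narrower list should be contained in the wider one.
print(most_preferred <= more_preferred <= broad)
# Sequences recited only in the broad list.
print(sorted(broad - more_preferred))
```

Run on the lists as recited, the second print shows which sequence numbers drop out when moving from the broad list to the more preferred list (SEQ ID NOs: 79-89 and 275).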
[0377] Furthermore, this aspect of the present invention includes recombinant cells comprising one or more compounds of formula (I), formula (V), and / or formula (VI). The recombinant cells may further comprise (i) polypeptides having esterase enzyme activity and (ii) polypeptides having terpene cyclase enzyme activity.
[0378] Furthermore, this aspect of the present invention includes a method for producing the compound of formula (I), comprising growing recombinant cells of this aspect of the present invention under growth conditions suitable for the production of the compound of formula (I).
[0379] Furthermore, this aspect of the present invention includes a cell culture fermentation medium containing recombinant cells of this aspect of the present invention. The cell culture fermentation medium may further contain one or more compounds of formula (I), formula (V), and / or formula (VI).
[0380] Furthermore, this aspect of the present invention includes a reaction mixture comprising a compound of formula (I). The reaction mixture may further comprise one or more compounds of formula (V) and / or formula (VI).
[0381] Furthermore, this aspect of the present invention includes compounds of formula (I) obtained or obtainable by the method of this aspect of the present invention or from recombinant cells, cell culture fermentation medium, or reaction mixture of this aspect of the present invention.
[0382] Furthermore, as previously stated, the present invention has achieved the production of the compound of formula (VI) in the form of compound (VIa) with high selectivity, as demonstrated in the appended examples. This then leads to a biotransformation that prepares an olfactory-favorable form of the compound of formula (I). Therefore, the biotransformation of the present invention for preparing the compound of formula (I) lacking a considerable amount of undesirable byproducts is of great commercial importance.
[0383] Accordingly, as shown in the attached examples, in one embodiment of this aspect of the present invention, more than 97% of the compound of formula (I) is in the form of formula (Ia) and / or (Ib).
[0384] A further embodiment of this aspect of the present invention is a method that further includes a preliminary step of (a) contacting a compound of formula (IV) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having BVMO enzyme activity to produce the compound of formula (V).
[0385] A further embodiment of this aspect of the present invention is a method that further includes a further preliminary step of (a) contacting a compound of formula (III) [chemical formula], in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having enal cleavage enzyme activity to produce the compound of formula (IV).
[0386] Examples of BVMO enzymes and enal cleavage enzymes have been described above in connection with the first embodiment of the present invention and can be used in this embodiment of the present invention.
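For orientation only, the stepwise enzymatic route described in this specification (formula (II) to (III) by an ADH enzyme, (III) to (IV) by an enal cleavage enzyme, (IV) to (V) by a BVMO enzyme, (V) to (VI) by an esterase, and (VI) to (I) by a terpene cyclase) can be written as an ordered table. The Python sketch below lists which enzyme classes are needed for a chosen starting compound; the data structure and function name are illustrative assumptions, not part of the claimed method.

```python
# Illustrative sketch only: the enzymatic cascade as an ordered list of
# (substrate formula, product formula, enzyme class) steps.
STEPS = [
    ("II", "III", "ADH"),                   # step (i)
    ("III", "IV", "enal cleavage enzyme"),  # step (ii)
    ("IV", "V", "BVMO"),                    # step (iii)
    ("V", "VI", "esterase"),                # step (iv)
    ("VI", "I", "terpene cyclase"),         # step (v)
]


def enzymes_from(start: str) -> list:
    """Return the enzyme classes needed to reach formula (I) from `start`."""
    substrates = [substrate for substrate, _, _ in STEPS]
    if start not in substrates:
        raise ValueError(f"unknown starting compound: {start}")
    index = substrates.index(start)
    return [enzyme for _, _, enzyme in STEPS[index:]]


# Starting from formula (V), as in the present aspect.
print(enzymes_from("V"))
# Starting from formula (III), i.e. with both preliminary steps included.
print(enzymes_from("III"))
```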
[Phosphatase enzyme for use in the method of the present invention]
[0387] As described herein above, one embodiment of the method of the first aspect of the present invention further comprises one or more biocatalytic steps prior to step (i) for preparing the compound of formula (II), wherein the biocatalytic steps include (a) preparing geranylgeranyl diphosphate (GGPP) from IPP and DMAPP using one or more prenyltransferase enzymes, and (b) preparing the compound of formula (II) from GGPP using one or more enzymes having phosphatase activity.
[0388] In preparing the method of the present invention, the inventors identified phosphatase enzymes that can be used in step (b) of the method. This is the first time that these enzymes have been shown to catalyze the step of preparing the compound of formula (II) from GGPP.
[0389] Therefore, this aspect of the present invention includes the use of a phosphatase to prepare the compound of formula (II).
[0390] A further aspect of the present invention provides the use of a phosphatase enzyme having 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity to any of the sequences described in any of SEQ ID NOs: 3 to 10 for preparing a compound of formula (II) from GGPP.
[0391] Preferably, the phosphatase enzyme has 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 3, 4, 5, 7, and 8. More preferably, the phosphatase enzyme contains any of the sequences described in SEQ ID NOs: 3, 4, 5, 7, and 8.
[ADH enzyme for use in the method of the present invention]
[0392] As described herein, a method of a first aspect of the present invention includes, as step (i), preparing a compound of formula (III) by contacting a compound of formula (II) with an ADH enzyme.
[0393] In preparing the method of the present invention, the inventors identified ADH enzymes that can be used in step (i) of the method. This is the first time that it has been shown that these enzymes can catalyze the step of preparing the compound of formula (III) from the compound of formula (II).
[0394] Accordingly, a further aspect of the present invention provides the use of an ADH enzyme having 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 11 to 21 for preparing a compound of formula (III) from a compound of formula (II). Preferably, the ADH enzyme contains any of the sequences described in SEQ ID NOs: 11 to 21.
[0395] A still further aspect of the present invention provides the use of an ADH enzyme having 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NO: 11 or 21 for preparing a compound of formula (III) from a compound of formula (II). Preferably, the ADH enzyme contains the sequence described in SEQ ID NO: 11 or 21.
[Enal cleavage enzyme for use in the method of the present invention]
[0396] As described herein, a method of a first aspect of the present invention includes, as step (ii), preparing a compound of formula (IV) by contacting a compound of formula (III) with an enal cleavage enzyme.
[0397] In preparing the method of the present invention, the inventors identified an enal cleavage enzyme that can be used in step (ii) of the method. This is the first time it has been shown that this enzyme can catalyze the step of preparing the compound of formula (IV) from the compound of formula (III).
[0398] Therefore, a further aspect of the present invention provides the use of a polypeptide having enal-cleaving activity for producing the compound of formula (IV). A further aspect of the present invention provides the use of a polypeptide having enal-cleaving activity for preparing the compound of formula (IV) from the compound of formula (III). A further aspect of the present invention is the use of a polypeptide having enal-cleaving activity for producing the compounds of formula (IV), (V), (VI), (I) and / or derivatives thereof.
[0399] Therefore, a further aspect of the present invention provides the use of a polypeptide having enal cleavage activity and having 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NO: 22 for producing the compound of formula (IV). A further aspect of the present invention provides the use of a polypeptide having enal cleavage activity and having 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NO: 22 for preparing the compound of formula (IV) from the compound of formula (III). A further aspect of the present invention provides the use of a polypeptide having enal cleavage enzyme activity and having 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NO: 22 for producing compounds of formula (IV), (V), (VI), (I) and / or derivatives thereof.
[0400] Preferably, the polypeptide having enal cleavage enzyme activity comprises the sequence described in SEQ ID NO: 22.
[BVMO enzyme for use in the method of the present invention]
[0401] As described herein, a method of a first aspect of the present invention includes, as step (iii), preparing a compound of formula (V) by contacting a compound of formula (IV) with a BVMO enzyme.
[0402] In preparing the method of the present invention, the inventors identified BVMO enzymes that can be used in step (iii) of the method. This is the first time that these enzymes have been shown to be able to catalyze the step of preparing the compound of formula (V) from the compound of formula (IV).
[0403] Accordingly, a further aspect of the present invention provides the use of a BVMO enzyme having 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to any of SEQ ID NOs: 23-26 and 216-227 for the preparation of a compound of formula (V) from a compound of formula (IV).
[0404] Preferably, the BVMO enzyme has 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 23 to 26. More preferably, the BVMO enzyme contains the sequence described in any of SEQ ID NOs: 23 to 26.
[0405] Preferably, the BVMO enzyme has 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NO: 25 or 26. More preferably, the BVMO enzyme contains the sequence described in SEQ ID NO: 25 or 26.
[Esterase enzyme for use in the method of the present invention]
[0406] As described herein, a method of a first aspect of the present invention includes, as step (iv), preparing a compound of formula (VI) by contacting a compound of formula (V) with an esterase enzyme.
[0407] In preparing the method of the present invention, the inventors identified esterase enzymes that can be used in step (iv) of the method. This is the first time that it has been shown that these enzymes can catalyze the step of preparing the compound of formula (VI) from the compound of formula (V).
[0408] Accordingly, a further aspect of the present invention provides the use of an esterase enzyme having 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to SEQ ID NO: 27 or 28 for preparing a compound of formula (VI) from a compound of formula (V).
[0409] Preferably, the esterase enzyme contains the sequence described in SEQ ID NO: 27 or 28.
[SHC enzyme according to the present invention]
[0410] In preparing the method of the present invention, the inventors identified novel polypeptide sequences that encode SHC enzymes that can be used in the method. Therefore, these polypeptides are also part of the present invention.
[0411] In investigating SHC enzymes, the inventors identified as particularly useful those enzymes having the amino acid alanine at position 437 and the amino acid methionine at position 600 relative to the sequence described in SEQ ID NO: 82.
[0412] Accordingly, a further aspect of the present invention is a mutant SHC enzyme having the amino acid alanine at position 437 and the amino acid methionine at position 600 relative to the sequence described in SEQ ID NO: 82. This is the first time that this combination of mutations has been shown to function in the reaction described in step (v) of the method of the first aspect of the present invention.
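A position stated "relative to" a reference sequence is normally read through an alignment of the candidate enzyme with that reference. The Python sketch below illustrates one way this could be checked, assuming a pairwise alignment is already available with gaps written as "-". The function names and the short toy sequences are hypothetical; the toy sequences are not SEQ ID NO: 82 or any sequence of this specification. For the mutant SHC enzyme described above, the required residues would be written as {437: "A", 600: "M"}.

```python
# Illustrative sketch only: read the residue of a candidate sequence at a
# position numbered relative to a reference, using a PRE-COMPUTED alignment.


def residue_at_reference_position(aligned_ref: str, aligned_query: str,
                                  ref_pos: int):
    """Return the query residue aligned to 1-based reference position
    `ref_pos`, or None if the query has a gap at that column."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    count = 0
    for ref_char, query_char in zip(aligned_ref, aligned_query):
        if ref_char != "-":
            count += 1  # advance only on real reference residues
            if count == ref_pos:
                return None if query_char == "-" else query_char
    raise IndexError("reference position lies beyond the reference sequence")


def has_required_residues(aligned_ref: str, aligned_query: str,
                          required: dict) -> bool:
    """Check that every required {reference position: residue} pair is met."""
    return all(
        residue_at_reference_position(aligned_ref, aligned_query, pos) == aa
        for pos, aa in required.items()
    )


# Hypothetical toy alignment (reference positions: M1 K2 T3 A4 Y5 M6).
toy_ref = "MK-TAYM"
toy_query = "MKLSA-M"
print(has_required_residues(toy_ref, toy_query, {4: "A", 6: "M"}))
```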
[0413] The SHC enzyme was previously described, for example, in relation to step (v) of the method of the first aspect of the present invention. Those skilled in the art can use the information contained in that section of this specification to identify any SHC enzyme that can be modified using standard experimental techniques to arrive at the mutant SHC enzyme of this aspect of the present invention.
[0414] A preferred embodiment of this aspect of the present invention is a polypeptide having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 29, 31, 33, 34, 36-38, 40, 41, 43-46, 48, 49, 265-274, and 276-279, wherein the polypeptide has the amino acid alanine at position 437 and the amino acid methionine at position 600 with respect to the sequence described in SEQ ID NO: 82. A preferred embodiment is a polypeptide having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 29, 31, 33, 34, 36-38, 40, 41, 43-46, 48, 49, 265, 266, 267, 268, 274, 276, and 279, wherein the polypeptide has the amino acid alanine at position 437 and the amino acid methionine at position 600 with respect to the sequence described in SEQ ID NO: 82. Preferably, the mutant SHC enzyme is a polypeptide having the amino acid sequence described in any of SEQ ID NOs: 29, 31, 33, 34, 36-38, 40, 41, 43-46, 48, 49, 265-274, and 276-279. Preferably, the mutant SHC enzyme is a polypeptide having the amino acid sequence described in any of SEQ ID NOs: 29, 31, 33, 34, 36-38, 40, 41, 43-46, 48, 49, 265, 266, 267, 268, 274, 276, and 279. This aspect of the present invention also includes polypeptide fragments, variants, and their functional equivalents.
[0415] The polypeptide of this aspect of the present invention is an SHC enzyme that can be used in step (v) of a method of the first aspect of the present invention or any further aspect of the present invention, which includes preparing a compound of formula (I) from a compound of formula (VI).
[0416] A further aspect of the present invention provides a nucleic acid sequence that encodes a mutant SHC enzyme having the amino acid alanine at position 437 and the amino acid methionine at position 600 relative to the sequence described in SEQ ID NO: 82. Methods for preparing such nucleic acid sequences are known in the art.
[0417] A preferred embodiment of this aspect of the present invention is a nucleic acid sequence encoding a mutant SHC enzyme that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 123, 125, 126, 128-131, 134-137, 139-142, 144-148, 150-153, 290-299, and 301-304, wherein the nucleic acid sequence encodes a polypeptide having the amino acid alanine at position 437 and the amino acid methionine at position 600 with respect to the sequence described in SEQ ID NO: 82. Preferably, the nucleic acid sequence is one of the nucleic acid sequences described in SEQ ID NOs: 123, 125, 126, 128-131, 134-137, 139-142, 144-148, 150-153, 290-299, and 301-304. This aspect of the present invention also includes expression vectors, cassettes, and other such related technologies containing the nucleic acid sequence of the present invention.
[0418] Furthermore, in the method of the present invention, as described herein, which includes preparing a compound of formula (I) by contacting a compound of formula (VI) with a terpene cyclase enzyme, the terpene cyclase enzyme may be an SHC enzyme. In preparing the method of the present invention, the inventors identified SHC enzymes that can be used in the method. This is the first time that it has been shown that these enzymes can catalyze the step of preparing a compound of formula (I) from a compound of formula (VI).
[0419] Accordingly, a further aspect of the present invention provides the use of an SHC enzyme having 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity to any of SEQ ID NOs: 29-49, 79-89, and 265-279 for the preparation of a compound of formula (I) from a compound of formula (VI).
[0420] Preferably, the SHC enzyme has 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 29-49 and 265-279. More preferably, the SHC enzyme contains the sequence described in any of SEQ ID NOs: 29-49 and 265-279.
[0421] Preferably, the SHC enzyme has 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 29-49, 265-274, and 276-279. More preferably, the SHC enzyme contains the sequence described in any of SEQ ID NOs: 29-49, 265-274, and 276-279.
[Meroterpenoid cyclase enzyme according to the present invention]
[0422] In preparing the method of the present invention, the inventors identified novel polypeptide sequences that encode meroterpenoid cyclase enzymes that can be used in the method. Therefore, these polypeptides are also part of the present invention.
[0423] One embodiment of this aspect of the present invention is a mutant meroterpenoid cyclase enzyme having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in any of SEQ ID NOs: 56 to 70. Preferably, the mutant meroterpenoid cyclase enzyme has the amino acid sequence described in any of SEQ ID NOs: 56 to 70. This aspect of the present invention also includes polypeptide fragments, variants, and functional equivalents thereof.
[0424] Accordingly, this aspect of the present invention provides a nucleic acid sequence encoding a mutant meroterpenoid cyclase enzyme having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to any of the sequences described in SEQ ID NOs: 165 to 185. Preferably, the nucleic acid sequence is a nucleic acid sequence described in any of SEQ ID NOs: 165 to 185. This aspect of the present invention also includes expression vectors, cassettes, and other such related technologies containing the nucleic acid sequence of the present invention.
[0425] In investigating meroterpenoid cyclase enzymes, the inventors identified an enzyme having an amino acid substitution at amino acid position 9 relative to the sequence described in SEQ ID NO: 51.
[0426] Accordingly, a further aspect of the present invention is a mutant meroterpenoid cyclase enzyme having an amino acid substitution at amino acid position 9 relative to the sequence described in SEQ ID NO: 51. This is the first time it has been shown that a meroterpenoid cyclase enzyme having this mutation can function in the step of preparing the compound of formula (I) from the compound of formula (VI).
[0427] The meroterpenoid cyclase enzyme was previously described in connection with step (v) of the method of the first aspect of the present invention. Those skilled in the art can use the information contained in that section of this specification to identify any meroterpenoid cyclase enzyme that can be modified using standard experimental techniques to arrive at a variant meroterpenoid cyclase enzyme of this aspect of the present invention.
[0428] Preferably, the mutant meroterpenoid cyclase enzyme has a substitution at amino acid position 9 relative to the sequence described in SEQ ID NO: 51, introducing cysteine, methionine, or threonine at this position.
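Purely as an illustration of the substitution described above, the following Python sketch generates the three variants of a reference sequence carrying cysteine, methionine, or threonine at position 9. The function names are hypothetical, and the twelve-residue toy sequence is invented for the example; it is not SEQ ID NO: 51. The sketch applies the substitution to the reference itself; for a homologous enzyme, position 9 would first have to be located by alignment with the reference.

```python
# Illustrative sketch only: build single-site variants at a 1-based position.

# Residues recited for position 9: cysteine (C), methionine (M), threonine (T).
ALLOWED_RESIDUES = ("C", "M", "T")


def substitute(sequence: str, position: int, new_residue: str) -> str:
    """Return `sequence` with the residue at 1-based `position` replaced."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position lies outside the sequence")
    if sequence[position - 1] == new_residue:
        raise ValueError("residue is already present; this is not a substitution")
    return sequence[:position - 1] + new_residue + sequence[position:]


def position_9_variants(reference: str, position: int = 9) -> dict:
    """Return {new residue: variant sequence} for each allowed residue."""
    return {
        residue: substitute(reference, position, residue)
        for residue in ALLOWED_RESIDUES
        if reference[position - 1] != residue
    }


# Hypothetical toy reference with leucine (L) at position 9.
toy_reference = "MSTAKLGVLAFQ"
for residue, variant in position_9_variants(toy_reference).items():
    print(residue, variant)
```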
[0429] Accordingly, a preferred embodiment of this aspect of the present invention is a polypeptide having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in any of SEQ ID NOs: 56-61, 69, and 70, wherein the mutant meroterpenoid cyclase enzyme has an amino acid substitution at amino acid position 9 with respect to the sequence described in SEQ ID NO: 51. Preferably, the mutant meroterpenoid cyclase enzyme has the amino acid sequence described in any of SEQ ID NOs: 56-61, 69, and 70. This aspect of the present invention also includes polypeptide fragments, variants, and their functional equivalents.
[0430] The polypeptide of this aspect of the present invention is a meroterpenoid cyclase enzyme that can be used in step (v) of a method of the first aspect of the present invention or any further aspect of the present invention, which includes preparing a compound of formula (I) from a compound of formula (VI).
[0431] A further aspect of the present invention provides a nucleic acid sequence that encodes a mutant meroterpenoid cyclase enzyme having an amino acid substitution at amino acid position 9 relative to the sequence described in SEQ ID NO: 51. Methods for preparing such nucleic acid sequences are known in the art.
[0432] A preferred embodiment of this aspect of the present invention is a nucleic acid sequence encoding a mutant meroterpenoid cyclase enzyme that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 165-176, 184, and 185, wherein the nucleic acid sequence encodes a polypeptide having an amino acid substitution at amino acid position 9 with respect to the sequence described in SEQ ID NO: 51. Preferably, the nucleic acid sequence is a nucleic acid sequence described in any of SEQ ID NOs: 165-176, 184, and 185. This aspect of the present invention also includes expression vectors, cassettes, and other such related technologies containing the nucleic acid sequence of the present invention.
[0433] Furthermore, in the method of the present invention, as described herein, which includes preparing a compound of formula (I) by contacting a compound of formula (VI) with a terpene cyclase enzyme, the terpene cyclase enzyme may be a meroterpenoid cyclase enzyme. In preparing the method of the present invention, the inventors identified meroterpenoid cyclase enzymes that can be used in the method. This is the first time that it has been shown that these meroterpenoid cyclases can catalyze the step of preparing a compound of formula (I) from a compound of formula (VI).
[0434] Accordingly, a further aspect of the present invention provides the use of a meroterpenoid cyclase enzyme having 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 50-75 and 280-289.
[0435] Preferably, the meroterpenoid cyclase enzyme contains a sequence described in any of SEQ ID NOs: 50-75 and 280-289.
[Polypeptides and nucleic acids used in the present invention or in the methods of the present invention]
[0436] The interchangeable terms "polypeptide" and "peptide" refer to a natural or synthetic linear chain or sequence of consecutive amino acid residues linked by peptide bonds, typically containing approximately 10 to over 1,000 residues. Short-chain polypeptides with up to 30 residues are also called "oligopeptides."
[0437] The term "protein" refers to a macromolecular structure consisting of one or more polypeptides. The amino acid sequence of the polypeptide represents the "primary structure" of the protein. The amino acid sequence also predetermines the "secondary structure" of the protein by forming special structural elements such as alpha helices and beta sheet structures formed within the polypeptide chain. The arrangement of multiple such secondary structural elements defines the "tertiary structure" or spatial arrangement of the protein. If a protein contains more than one polypeptide chain, the chains are spatially arranged to form the "quaternary structure" of the protein. The correct spatial arrangement or "folding" of a protein is a prerequisite for protein function. Denaturation or unfolding disrupts protein function. If such disruption is reversible, protein function may be restored by refolding.
[0438] A typical protein function referred to herein is "enzymatic function," that is, a protein acts as a biocatalyst on a substrate, such as a chemical compound, catalyzing the conversion of the substrate into a product. Enzymes can exhibit high or low levels of substrate and/or product specificity.
[0439] Therefore, a "polypeptide" referred to herein as having a specific "activity" implicitly refers to a correctly folded protein that exhibits the indicated activity, such as a specific enzymatic activity.
[0440] Therefore, unless otherwise specified, the term "polypeptide" also encompasses the terms "protein" and "enzyme."
[0441] Similarly, the term "polypeptide fragment" encompasses the terms "protein fragment" and "enzyme fragment."
[0442] The term "isolated polypeptide" refers to an amino acid sequence removed from its natural environment by any method or combination of methods known in the art, including recombinant, biochemical, and synthetic methods.
[0443] A "target peptide" refers to an amino acid sequence that targets a protein or polypeptide to an intracellular organelle, such as mitochondria or plastids, or to the extracellular space (secretory signal peptides). The nucleic acid sequence encoding the target peptide can be fused to the nucleic acid sequence encoding the amino terminus (i.e., the N-terminus) of a protein or polypeptide, or it can be used to replace a natural target peptide.
[0444] The present invention also relates to “functional equivalents” (also referred to as “analogs” or “functional mutations”) of polypeptides specifically described herein.
[0445] For example, “functional equivalent” refers to a polypeptide that, in tests used to determine enzyme activity, exhibits activity at least 1–10%, at least 20%, at least 50%, at least 75%, or at least 90% higher or lower than the activity of the polypeptide specifically described herein.
[0446] The “functional equivalents” according to the present invention include specific variants having different amino acids at at least one sequence position of the amino acid sequence described herein, but nevertheless possessing one of the aforementioned biological activities, such as enzymatic activity. Therefore, “functional equivalents” include variants that can be obtained by one or more, for example, 1 to 20, particularly 1 to 15 or 5 to 10 amino acid additions, substitutions, especially conservative substitutions, deletions, and/or inversions, where the described changes may occur at any sequence position, provided these changes result in variants having the characteristic profile according to the present invention. Furthermore, functional equivalents are obtained particularly when the activity patterns qualitatively match between the variant and the unchanged polypeptide, i.e., when, for example, the same agonist or antagonist or substrate interaction is observed, but at different rates (i.e., represented by EC50 or IC50 values or other parameters appropriate in the art). Examples of appropriate (conservative) amino acid substitutions are shown in the following table: [Table 2]
[0447] In the sense described above, "functional equivalents" also refer to the "precursors" of polypeptides, as well as the "functional derivatives" and "salts" of polypeptides as described herein.
[0448] In this case, the “precursor” is a natural or synthetic precursor of a polypeptide that has or does not have the desired biological activity.
[0449] The term "salt" refers to salts of carboxyl groups of protein molecules and salts of acid additions of amino groups. Salts of carboxyl groups can be produced by known methods and include inorganic salts, such as salts of sodium, calcium, ammonium, iron, and zinc, as well as salts with organic bases, such as amines such as triethanolamine, arginine, lysine, and piperidine. Salts of acid additions, such as salts with inorganic acids such as hydrochloric acid or sulfuric acid, as well as salts with organic acids such as acetic acid and oxalic acid, are also included in this invention.
[0450] The "functional derivatives" of polypeptides according to the present invention can also be produced on functional amino acid side groups, or at the N-terminus or C-terminus, using known techniques. Such derivatives include, for example, aliphatic esters of carboxylic acid groups; amides of carboxylic acid groups that can be obtained by reaction with ammonia or primary or secondary amines; N-acyl derivatives of free amino groups produced by reaction with acyl groups; or O-acyl derivatives of free hydroxyl groups produced by reaction with acyl groups.
[0451] "Functional equivalents" naturally include polypeptides obtainable from other organisms, as well as naturally occurring variants. For example, the range of homologous sequence regions can be established by sequence comparison, and equivalent enzymes can be determined based on the specific parameters of the present invention.
[0452] A “functional equivalent” may or may not exhibit the desired biological function and may include “fragments” of the polypeptide according to the present invention, such as individual domains or sequence motifs, or N-terminal and/or C-terminal truncations. Preferably, such “fragments” retain at least the desired biological function qualitatively.
[0453] Furthermore, a “functional equivalent” is a fusion protein having a functional equivalent of a polypeptide sequence described herein or derived thereof and at least one further functionally distinct heterologous sequence via a functional N-terminal or C-terminal linkage (i.e., without substantial mutual functional damage to the fusion protein moieties). Non-limiting examples of these heterologous sequences include, for example, signal peptides, histidine anchors, or enzymes.
[0454] Similarly, “functional equivalents” as included in the present invention are homologs of the specifically disclosed polypeptides. These have homology (or identity) to one of the specifically disclosed amino acid sequences, calculated by the algorithm of Pearson and Lipman, Proc. Natl. Acad. Sci. (USA) 85(8), 1988, 2444-2448, of at least 60%, preferably at least 75%, particularly at least 80 or 85%, for example, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99%. The homology or identity expressed as a percentage of homologous polypeptides according to the present invention means identity expressed as a percentage of amino acid residues based on the full length of one of the amino acid sequences specifically described herein.
[0455] Identity data, expressed as a percentage, can also be determined using BLAST alignment, the blastp (protein-protein BLAST) algorithm, or by applying the Clustal settings specified below herein.
[0456] In the case of possible protein glycosylation, the “functional equivalents” according to the present invention include polypeptides such as those described herein, which are in deglycosylated or glycosylated forms and modified forms that can be obtained by altering the glycosylation pattern.
[0457] Functional equivalents or homologs of polypeptides according to the present invention can be produced by mutagenesis, for example, by point mutation, by protein elongation or shortening, or as described in more detail below.
[0458] Functional equivalents or homologs of polypeptides according to the present invention can be identified by screening combinatorial libraries of variants, such as truncated variants. For example, a diverse library of protein variants can be generated by combinatorial mutagenesis at the nucleic acid level, for example, by enzymatic ligation of a mixture of synthetic oligonucleotides. There are many methods that can be used to generate libraries of potential homologs from a degenerate oligonucleotide sequence. The chemical synthesis of a degenerate gene sequence can be carried out in an automated DNA synthesizer, and the synthetic gene can then be ligated into a suitable expression vector. The use of a degenerate set of genes makes it possible to supply, in a single mixture, all sequences encoding a desired set of potential protein sequences. Methods for synthesizing degenerate oligonucleotides are known to those skilled in the art.
[0459] Several techniques are known in the art for screening gene products of combinatorial libraries generated by point mutations or truncations, and for screening cDNA libraries for gene products with selected characteristics. These techniques can be adapted for rapid screening of gene libraries generated by combinatorial mutagenesis of homologs according to the present invention. The most frequently used techniques for screening large gene libraries, which are amenable to high-throughput analysis, include cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting vector library, and expressing the combinatorial genes under conditions in which detection of the desired activity facilitates isolation of the vector encoding the gene whose product was detected. Recursive ensemble mutagenesis (REM), a technique that increases the frequency of functional variants in the libraries, can be used in combination with screening tests to identify homologs.
[0460] The embodiments provided herein provide orthologs and paralogs of the polypeptides disclosed herein, as well as methods for identifying and isolating such orthologs and paralogs. The definitions of the terms “ortholog” and “paralog” are given below and apply to amino acid and nucleic acid sequences.
[0461] The polypeptide of the present invention includes all active forms, including an active partial sequence of the enzyme of the present invention, such as a catalytic domain or active site. In one embodiment, the present invention provides a catalytic domain or active site as described below. In one embodiment, the present invention provides peptides or polypeptides containing or derived from an active site domain as predicted through the use of databases such as Pfam (http://pfam.wustl.edu/hmmsearch.shtml) (The Pfam protein families database, A. Bateman, E. Birney, L. Cerruti, R. Durbin, L. Etwiller, SR Eddy, S. Griffiths-Jones, KL Howe, M. Marshall, and ELL Sonnhammer, Nucleic Acids Research, 30(1):276-280, 2002) or equivalents, such as the InterPro and SMART databases (http://www.ebi.ac.uk/interpro/scan.html, http://smart.embl-heidelberg.de/).
[0462] The present invention also includes "polypeptide variants" having desired activity, the variant polypeptide being selected from amino acid sequences having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to a specific, particularly natural, amino acid sequence referenced by a specific SEQ ID NO:, and comprising at least one substitution modification with respect to the SEQ ID NO:.
[Coding nucleic acid sequences applicable by the present invention]
[0463] In this context, the following definitions apply: The terms “nucleic acid sequence,” “nucleic acid,” “nucleic acid molecule,” and “polynucleotide” are used interchangeably and refer to a sequence of nucleotides. A nucleic acid sequence may be a single-stranded or double-stranded deoxyribonucleotide, or a ribonucleotide of any length, and may include genes, exons, introns, sense and antisense complementary sequences, genomic DNA, cDNA, miRNA, siRNA, mRNA, rRNA, tRNA, recombinant nucleic acid sequences, isolated and purified naturally occurring DNA and/or RNA sequences, synthetic DNA and RNA sequences, fragments, primers, and coding and non-coding sequences of nucleic acid probes. Those skilled in the art recognize that the nucleic acid sequence of an RNA is identical to the corresponding DNA sequence, except that thymine (T) is replaced by uracil (U). The term “nucleotide sequence” should also be understood as encompassing polynucleotide molecules or oligonucleotide molecules in the form of distinct fragments, or as components of a larger nucleic acid.
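By way of illustration only, the correspondence between a DNA sequence and its RNA form noted in the preceding definition can be sketched as follows. The function name and the example sequence are hypothetical and are not taken from the sequence listing.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA form of a DNA sequence by replacing thymine (T) with uracil (U)."""
    dna = dna.upper()
    invalid = set(dna) - set("ACGT")
    if invalid:
        # Only the four unambiguous DNA bases are handled in this sketch.
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(invalid)}")
    return dna.replace("T", "U")


# Hypothetical example sequence (not a sequence of the invention).
print(dna_to_rna("ATGGCTTAA"))  # AUGGCUUAA
```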
[0464] "Isolated nucleic acids" or "isolated nucleic acid sequences" may include nucleic acids or nucleic acid sequences that are substantially free of endogenous contaminants in an environment different from that in which they naturally occur.
[0465] As used herein, the term “naturally occurring,” as applied to nucleic acids, means that nucleic acids are found in the cells of organisms in nature and have not been intentionally modified by humans in a laboratory.
[0466] A “fragment” of a polynucleotide or nucleic acid sequence refers to a sequence of nucleotides in the embodiments herein, in particular having a length of at least 15 bp, at least 30 bp, at least 40 bp, at least 50 bp, and/or at least 60 bp. In particular, a fragment of a polynucleotide includes at least 25, more specifically at least 50, more specifically at least 75, more specifically at least 100, more specifically at least 150, more specifically at least 200, more specifically at least 300, more specifically at least 400, more specifically at least 500, more specifically at least 600, more specifically at least 700, more specifically at least 800, more specifically at least 900, and more specifically at least 1000 of the sequence of nucleotides of the polynucleotides in the embodiments herein.
[0467] A "recombinant nucleic acid sequence" is a nucleic acid sequence obtained by using laboratory methods (e.g., molecular cloning) to collect genetic material from one or more sources and create or modify a nucleic acid sequence that does not exist in nature and would not normally be found in living organisms.
[0468] "Recombinant DNA technology" refers to molecular biological procedures for preparing recombinant nucleic acid sequences, as described, for example, in Laboratory Manuals, 2002, Cold Spring Harbor Lab Press, edited by Weigel and Glazebrook, and in Sambrook et al., 1989, Cold Spring Harbor, NY, Cold Spring Harbor Laboratory Press.
[0469] The term "gene" refers to a DNA sequence that includes a region that is transcribed into an RNA molecule, such as mRNA, in a cell, operably linked to an appropriate regulatory region, such as a promoter. Therefore, a gene may include several operably linked sequences, such as a promoter, a 5' leader sequence containing, for example, sequences involved in translation initiation, a coding region of cDNA or genomic DNA, introns, exons, and/or a 3' untranslated sequence containing, for example, transcription termination sites.
[0470] "Polycistronic" refers to nucleic acid molecules, particularly mRNA, that can separately encode more than one polypeptide within the same nucleic acid molecule.
[0471] A “chimeric gene” typically refers to a gene that is not normally found in a species in nature, particularly a gene containing one or more parts of nucleic acid sequences that are not associated with each other in nature. For example, a promoter may not be associated in nature with some or all of the transcribed region or with another regulatory region. The term “chimeric gene” is understood to include expression constructs in which a promoter or transcriptional regulatory sequence is operably linked to one or more coding sequences or antisense sequences, i.e., the reverse complement of the sense strand, or inverted repeat sequences (which are both sense and antisense, thereby causing the RNA transcript to form double-stranded RNA upon transcription). The term “chimeric gene” also includes genes obtained by combining parts of one or more coding sequences to produce a new gene.
[0472] The "3'UTR" or "3' untranslated sequence" (also referred to as the "3' untranslated region" or "3' end") refers to a nucleic acid sequence found downstream of the coding sequence of a gene, which includes, for example, the transcription termination site and (in most, though not all, eukaryotic mRNAs) a polyadenylation signal, such as AAUAAA or its variants. After transcription termination, the mRNA transcript may be cleaved downstream of the polyadenylation signal, and a poly(A) tail, which is involved in the transport of the mRNA to the site of translation, such as the cytoplasm, may be added.
[0473] The term "primer" refers to a short nucleic acid sequence that is hybridized to a template nucleic acid sequence and used for polymerization of nucleic acid sequences complementary to the template.
[0474] The term "selectable marker" refers to a gene that, upon expression, can be used to select a cell or cells containing the selectable marker. Examples of selectable markers are listed below. Those skilled in the art will know that different antibiotic, fungicide, auxotrophic, or herbicide selectable markers are applicable to different target species.
[0475] The present invention also relates to nucleic acid sequences encoding polypeptides as defined herein.
[0476] In particular, the present invention also relates to nucleic acid sequences (single-stranded and double-stranded DNA and RNA sequences, e.g., cDNA, genomic DNA, and mRNA) that encode one of the aforementioned polypeptides and their functional equivalents, which can be obtained, for example, using artificial nucleotide analogs.
[0477] The present invention relates to both isolated nucleic acid molecules encoding a polypeptide or a biologically active segment thereof according to the present invention, and nucleic acid fragments that can be used as hybridization probes or primers, etc., for identifying or amplifying coding nucleic acids according to the present invention.
[0478] The present invention also relates to nucleic acids having a certain degree of “identity” with respect to the sequences specifically disclosed herein. In either case, “identity” between two nucleic acids means the identity of nucleotides throughout the entire length of the nucleic acid.
[0479] The "identity" between two nucleotide sequences (and similarly for peptide or amino acid sequences) is a function of the number of identical nucleotide residues (or amino acid residues) in the two sequences when an alignment of the two sequences is generated. Identical residues are defined as residues that are the same in the two sequences at a given position in the alignment. As used herein, the percentage of sequence identity is calculated from the optimal alignment by dividing the number of identical residues between the two sequences by the total number of residues in the shorter sequence and multiplying by 100. The optimal alignment is the alignment that yields the highest percentage of identity. Gaps can be introduced into one or both sequences at one or more positions in the alignment to obtain the optimal alignment. These gaps are then counted as non-identical residues when calculating the percentage of sequence identity. Alignments for the purpose of determining the percentage identity of amino acid or nucleic acid sequences can be achieved in various ways using computer programs, e.g., publicly available computer programs available on the World Wide Web.
[0480] In particular, the BLAST program (Tatiana et al., FEMS Microbiol Lett., 1999, 174:247-250) with default parameters, available from the National Center for Biotechnology Information (NCBI) website at ncbi.nlm.nih.gov/BLAST/bl2seq/wblast2.cgi, can be used to obtain the optimal alignment of protein or nucleic acid sequences and to calculate the degree of sequence identity.
[0481] Alternatively, identity can be determined according to Chenna et al. (2003), the web page http://www.ebi.ac.uk/Tools/clustalw/index.html#, and the following settings:
DNA gap open penalty: 15.0
DNA gap extension penalty: 6.66
DNA matrix: identity
Protein gap open penalty: 10.0
Protein gap extension penalty: 0.2
Protein matrix: Gonnet
Protein/DNA ENDGAP: -1
Protein/DNA GAPDIST: 4
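As a non-limiting sketch, the identity calculation defined in this paragraph (identical residues divided by the number of residues in the shortest sequence, multiplied by 100, with gaps counted as non-identical) may be expressed as follows. The sketch assumes that the two sequences have already been optimally aligned by an external program and that gaps are written as "-"; the function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences of equal aligned length.

    Identical residues are counted position by position; a gap never counts
    as identical. The denominator is the residue count (gaps excluded) of the
    shorter of the two sequences.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    shortest = min(
        len(aligned_a.replace(gap, "")), len(aligned_b.replace(gap, ""))
    )
    if shortest == 0:
        raise ValueError("sequences must contain at least one residue")
    return 100.0 * identical / shortest


# Hypothetical aligned peptides: 7 identical positions, shorter sequence has 8 residues.
print(percent_identity("MKT-AYIAK", "MKTLAYLAK"))  # 87.5
```

The alignment step itself (for example with BLAST or Clustal) is outside the scope of this sketch.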
[0482] All nucleic acid sequences (single-stranded and double-stranded DNA and RNA sequences, e.g., cDNA and mRNA) listed herein can be produced by known chemical synthesis methods from nucleotide building blocks, for example, by the condensation of individual overlapping complementary nucleic acid building block fragments of a double helix. The chemical synthesis of oligonucleotides can be carried out by known methods, for example, the phosphoramidite method (Voet, Voet, 2nd edition, Wiley Press, New York, pages 896-897). The addition of synthetic oligonucleotides and gap filling using the Klenow fragment of DNA polymerase, ligation reactions, and general cloning techniques are described in Sambrook et al. (1989) (see below).
[0483] The nucleic acid molecule according to the present invention may further include untranslated sequences from the 3' and/or 5' ends of the coding gene region.
[0484] The present invention further relates to nucleic acid molecules complementary to the nucleotide sequences or segments thereof that are specifically described.
[0485] The nucleotide sequences according to the present invention enable the production of probes and primers that can be used for the identification and/or cloning of homologous sequences in other cell types and organisms. Such probes or primers generally include a nucleotide sequence region that hybridizes under “stringent” conditions (as defined elsewhere herein) to at least about 12, preferably at least about 25, e.g., about 40, 50, or 75 consecutive nucleotides of the sense strand or corresponding antisense strand of the nucleic acid sequence according to the present invention.
[0486] "Homologous" sequences include ortholog or paralog sequences. Methods for identifying orthologs or paralogs, including phylogenetic methods, sequence similarity, and hybridization methods, are known in the art and are described herein.
[0487] Paralogs result from gene duplication, which produces two or more genes with similar sequences and similar functions. Paralogs typically cluster together and are formed by the duplication of genes within related plant species. Paralogs are found in groups of similar genes during phylogenetic analysis of gene families using pairwise BLAST analysis or programs such as CLUSTAL. Paralogs allow for the identification of consensus sequences that are characteristic of sequences within related genes and have similar functions.
[0488] Orthologs, or ortholog sequences, are sequences that are similar to each other because they are found in species derived from a common ancestor. For example, plant species with a common ancestor are known to contain many enzymes with similar sequences and functions. Those skilled in the art can identify ortholog sequences and predict their functions by, for example, constructing a phylogenetic tree for a gene family of a single species using the CLUSTAL or BLAST program. One method for identifying or confirming similar functions between homologous sequences is to compare transcript profiles in host cells or organisms, such as plants or microorganisms, that overexpress or lack (in knockout/knockdown) the relevant polypeptide. Those skilled in the art will understand that genes with similar transcript profiles, having more than 50% of regulated transcripts in common, or more than 70% of regulated transcripts in common, or more than 90% of regulated transcripts in common, have similar functions. Homologous sequences, paralogs, orthologs, and other variants described herein are expected to function similarly when used to create host cells, organisms such as plants, or microorganisms that produce terpene cyclase proteins.
[0489] Nucleic acid molecules according to the present invention can be isolated using standard molecular biology techniques and the sequence information provided by the present invention. For example, cDNA can be isolated from a suitable cDNA library using standard hybridization techniques, with one of the specifically disclosed complete sequences or a segment thereof used as a hybridization probe (as described, e.g., in Sambrook et al. (1989)).
[0490] Furthermore, nucleic acid molecules containing the disclosed sequences or one of their segments can be isolated by polymerase chain reaction using oligonucleotide primers constructed based on these sequences. The thus amplified nucleic acids can be cloned into a suitable vector and characterized by DNA sequencing. Oligonucleotides according to the present invention can also be produced by standard synthetic methods, for example, using an automated DNA synthesizer.
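Purely as an illustrative sketch, oligonucleotide primers for amplifying a known sequence by polymerase chain reaction may be derived from the two ends of that sequence: the forward primer copies the 5' end of the sense strand, and the reverse primer is the reverse complement of its 3' end. The primer length of 18 nucleotides, the function names, and the template are assumptions made for this example; practical primer design also considers melting temperature and secondary structure, which are not modeled here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def design_primers(template: str, length: int = 18) -> tuple[str, str]:
    """Return (forward, reverse) primers flanking the full template.

    The forward primer is the first `length` bases of the sense strand;
    the reverse primer is the reverse complement of the last `length` bases.
    """
    template = template.upper()
    if len(template) < 2 * length:
        raise ValueError("template is too short for two non-overlapping primers")
    return template[:length], reverse_complement(template[-length:])


# Hypothetical 18-nt template with 6-nt primers, for readability only.
forward, reverse = design_primers("ATGAAACCCGGGTTTTAA", length=6)
print(forward, reverse)  # ATGAAA TTAAAA
```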
[0491] To test the function of a mutant DNA sequence according to the embodiments herein, the sequence of interest is operationally linked to a selectable or screenable marker gene, and the expression of the reporter gene is tested in a transient expression assay, for example, in microorganisms or protoplasts, or in stably transformed plants.
[0492] The present invention also relates to derivatives of the nucleic acid sequences specifically disclosed or derivable therefrom.
[0493] Accordingly, further nucleic acid sequences according to the present invention can be derived from the sequences specifically disclosed herein and may differ from them by one or more, for example, 1 to 20, particularly 1 to 15 or 5 to 10, single or several (e.g., 1 to 10) nucleotide additions, substitutions, insertions, or deletions, while still encoding polypeptides having the desired characteristic profile.
[0494] The present invention also includes nucleic acid sequences containing so-called silent mutations, or sequences that have been modified according to the codon usage frequency of a particular original organism or host organism, compared to the sequences specifically described.
[0495] According to certain embodiments of the present invention, variant nucleic acids can be prepared to adapt their nucleotide sequences to specific expression systems. For example, bacterial expression systems are known to express polypeptides more efficiently when amino acids are encoded by specific codons. Due to the degeneracy of the genetic code, more than one codon can encode the same amino acid, and multiple nucleic acid sequences can therefore encode the same protein or polypeptide; all of these DNA sequences are included in the embodiments herein. Where appropriate, nucleic acid sequences encoding polypeptides described herein can be optimized for increased expression in host cells. For example, the nucleic acids of the embodiments herein can be synthesized using host-specific codons to improve expression.
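The degeneracy of the genetic code referred to above can be illustrated, without limitation, by the following sketch, which builds the standard genetic code and shows that two different DNA sequences encode the same peptide. The function names and the example sequences are hypothetical; no host-specific codon usage data are included, so the sketch does not itself perform codon optimization.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon (trailing partial codon ignored)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(amino_acid: str) -> list[str]:
    """Return all codons of the standard code that encode the given amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the given peptide."""
    total = 1
    for aa in peptide:
        total *= len(synonymous_codons(aa))
    return total


# Two hypothetical DNA sequences differing only by silent changes.
print(translate("ATGCTGAAA"), translate("ATGTTAAAG"))  # MLK MLK
print(count_encodings("MLK"))  # 12
```

A host-adapted sequence would be obtained by choosing, for each amino acid, one of its synonymous codons according to a codon usage table for the intended host.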
[0496] The present invention also includes naturally occurring variants, such as splice variants or allelic variants of the sequences described herein.
[0497] Allelic variants may have at least 60% homology, preferably at least 80%, and very preferably at least 90% homology, at the level of the derived amino acid sequence across the entire sequence range (for homology at the amino acid level, see the details previously given for polypeptides). Advantageously, homology may be higher across partial regions of the sequence.
[0498] The present invention also relates to sequences that can be obtained by conservative nucleotide substitutions (i.e., as a result, the amino acid in question is replaced by an amino acid of the same charge, size, polarity, and/or solubility).
[0499] The present invention also relates to molecules derived from the specifically disclosed nucleic acids by sequence polymorphisms. Such genetic polymorphisms may exist in cells from different populations or within a single population due to natural allelic variation. These natural variations typically result in a 1-5% variation in the nucleotide sequence of a gene. The polymorphisms can result in changes in the amino acid sequence of polypeptides disclosed herein. Allelic variants may also include functional equivalents.
[0500] Furthermore, derivatives should also be understood as homologs of nucleic acid sequences according to the present invention, such as homologs of animals, plants, fungi, or bacteria, truncated sequences, coding and non-coding DNA sequences, and single-stranded DNA or RNA. For example, the homologs have at least 40%, preferably at least 60%, particularly preferably at least 70%, and very particularly preferably at least 80% homology at the DNA level across the entire DNA region given in the sequences specifically disclosed herein.
[0501] Furthermore, derivatives should be understood as, for example, fusions with promoters. Promoters attached to the stated nucleotide sequences can be modified by at least one nucleotide exchange, at least one insertion, inversion, and/or deletion without impairing the functionality or efficiency of the promoter. Moreover, the efficiency of promoters can be increased by altering their sequences, or the promoters can be completely replaced with more effective promoters, including promoters from organisms of different genera.
[Generation of functional polypeptide variants]
[0502] Furthermore, those skilled in the art are familiar with methods for producing functional variants, i.e., nucleotide sequences encoding polypeptides having at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the amino acid sequences of the SEQ ID NOs disclosed herein, and/or encoded by nucleic acid molecules comprising nucleotide sequences having at least 50% sequence identity with any of the nucleotide sequences of the SEQ ID NOs disclosed herein.
[0503] Depending on the technique used, those skilled in the art can introduce completely random or more directional mutations into genes or non-coding nucleic acid regions (which are important, for example, to regulate expression), and then generate a gene library. The molecular biological methods required for this purpose are known to skilled workers and are described, for example, in Sambrook and Russell, Molecular Cloning. 3rd Edition, Cold Spring Harbor Laboratory Press 2001.
[0504] Methods for modifying genes, and therefore the polypeptides encoded by them, have long been known to skilled workers, and include, for example: - Site-specific mutagenesis in which individual or several nucleotides of a gene are directly replaced (Trower MK (Ed.) 1996; In vitro mutagenesis protocols. Humana Press, New Jersey), - Saturated mutagenesis that allows the exchange or addition of codons for any amino acid at any point in a gene (Kegler-Ebo DM, Docktor CM, DiMaio D (1994) Nucleic Acids Res 22:1593; Barettino D, Feigenbutz M, Valcarel R, Stunnenberg HG (1994) Nucleic Acids Res 22:541; Barik S (1995) Mol Biotechnol 3:1), - Mutagenic polymerase chain reaction in which the nucleotide sequence is mutated by mutagenic DNA polymerase (Eckert KA, Kunkel TA (1990) Nucleic Acids Res 18:3739), - The SeSaM method (sequence saturation method), in which preferential exchange is prevented by polymerase. Schenk et al., Biospektrum, Vol. 3, 2006, 277-279. - For example, gene passage in mutagenic strains where the rate of nucleotide sequence mutations increases due to defective DNA repair mechanisms (Greener A, Callahan M, Jerpseth B (1996) An efficient random mutagenesis technique using an E. coli mutator strain. Trower MK (Ed.) In vitro mutagenesis protocols. Humana Press, New Jersey), or DNA shuffling (Stemmer WPC (1994) Nature 370:389; Stemmer WPC (1994) Proc Natl Acad Sci USA 91:10747) is a process in which a pool of closely related genes is formed and digested, and the fragments are used as templates for polymerase chain reactions, in which repeated separation and rejoining of strands ultimately generates a full-length mosaic gene.
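As one hedged illustration of the saturation mutagenesis listed above, the following sketch enumerates, at the protein level, the 19 single-substitution variants obtainable at one chosen position. In the laboratory the exchange is made at the codon level; the function name, the 1-based numbering convention, and the example sequence are assumptions of this sketch.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def saturate_position(protein: str, position: int) -> dict[str, str]:
    """Return all single-substitution variants at a 1-based position.

    Keys use the conventional notation <wild type><position><substitute>,
    e.g. "T3A"; values are the full variant sequences.
    """
    protein = protein.upper()
    if not 1 <= position <= len(protein):
        raise ValueError("position is outside the sequence")
    wild_type = protein[position - 1]
    variants = {}
    for aa in AMINO_ACIDS:
        if aa == wild_type:
            continue  # skip the unchanged sequence
        variants[f"{wild_type}{position}{aa}"] = (
            protein[: position - 1] + aa + protein[position:]
        )
    return variants


# Hypothetical 8-residue peptide, saturated at position 3.
library = saturate_position("MKTAYIAK", 3)
print(len(library), library["T3A"])  # 19 MKAAYIAK
```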
[0505] Using so-called directed evolution (described in particular in Reetz MT and Jaeger KE (1999), Topics Curr Chem 200:31; Zhao H, Moore JC, Volkov AA, Arnold FH (1999), Methods for optimizing industrial enzymes by directed evolution, in: Demain AL, Davies JE (Ed.) Manual of industrial microbiology and biotechnology. American Society for Microbiology), a skilled worker can produce functional mutants in a directed manner and on a large scale. For this purpose, in a first step, gene libraries of the respective polypeptides are generated, for example, using the methods given above. The gene libraries are then expressed in a suitable manner, for example, by bacteria or by phage display systems.
[0506] The relevant genes of host organisms expressing functional mutants whose properties largely correspond to the desired properties can be subjected to another mutation cycle. The steps of mutation and selection or screening can be repeated until the functional mutants possess the desired properties to a sufficient degree. Using this iterative procedure, a limited number of mutations, e.g., 1, 2, 3, 4, or 5 mutations, can be introduced stepwise, and their effect on the activity in question can be evaluated and selected for. The selected mutants can then be subjected to a further mutation step in the same way. In this way, the number of individual mutants to be investigated can be significantly reduced.
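The iterative cycle of mutation and screening described above can be sketched, purely for illustration, as a small simulation. Everything in this sketch is an assumption: the scoring function `toy_score` stands in for a real activity assay, and the library size, number of rounds, and sequences are arbitrary placeholders.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutate(seq: str, n_mutations: int, rng: random.Random) -> str:
    """Return a copy of seq carrying n_mutations random substitutions."""
    residues = list(seq)
    for pos in rng.sample(range(len(residues)), n_mutations):
        choices = [aa for aa in AMINO_ACIDS if aa != residues[pos]]
        residues[pos] = rng.choice(choices)
    return "".join(residues)


def evolve(parent, score, rounds=10, library_size=50, max_mutations=3, seed=0):
    """Repeat mutation and screening, keeping a variant only if it scores higher."""
    rng = random.Random(seed)
    best, best_score = parent, score(parent)
    for _ in range(rounds):
        # Each round introduces a limited number of mutations (1..max_mutations).
        library = [
            mutate(best, rng.randint(1, max_mutations), rng)
            for _ in range(library_size)
        ]
        candidate = max(library, key=score)  # screening step
        if score(candidate) > best_score:    # selection step
            best, best_score = candidate, score(candidate)
    return best, best_score


def toy_score(seq: str, target: str = "MKTAYIAKQR") -> int:
    """Hypothetical assay: number of positions matching an arbitrary target."""
    return sum(a == b for a, b in zip(seq, target))


best, best_score = evolve("MAAAAAAAAA", toy_score)
```

Because only a small library is screened per round and the best variant seeds the next round, far fewer variants are examined than in an exhaustive search, which mirrors the reduction in screening effort noted above.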
[0507] The results according to the present invention also provide important information regarding the structure and sequence of the relevant polypeptides, which is necessary for the targeted generation of further polypeptides with desired modified properties. In particular, it is possible to define so-called "hot spots," i.e., sequence segments that are potentially suitable for modifying properties by introducing targeted mutations.
[0508] Information can also be inferred regarding amino acid sequence positions in whose region mutations can be carried out that are likely to have little effect on activity; these can be called potential "silent mutations."
[Constructs for expressing the polypeptide of the present invention and/or for use in the methods of the present invention]
[0509] In this context, the following definitions apply: "Gene expression" encompasses "heterologous expression" and "overexpression," and includes gene transcription and translation of mRNA into proteins. Overexpression refers to the production of gene products in transgenic cells or organisms, as measured by levels of mRNA, polypeptides, and/or enzyme activity, that exceeds the level of production in non-transformed cells or organisms with a similar genetic background.
[0510] As used herein, “expression vector” means a nucleic acid molecule manipulated using molecular biological methods and recombinant DNA techniques for delivering foreign or exogenous DNA to host cells. An expression vector typically contains the sequence necessary for the proper transcription of a nucleotide sequence. The coding region usually encodes the protein of interest, but can also encode RNA, such as antisense RNA or siRNA.
[0511] As used herein, “expression vector” includes, but is not limited to, any linear or circular recombinant vector, including viral vectors, bacteriophages, and plasmids. Those skilled in the art can select an appropriate vector according to the expression system. In one embodiment, the expression vector comprises a nucleic acid of the embodiment herein operationally linked to at least one “regulatory sequence” that controls transcription, translation, initiation, and termination, such as a transcription promoter, operator, or enhancer, or an mRNA ribosome binding site, and optionally comprises at least one selectable marker. A nucleotide sequence is “operationally linked” if the regulatory sequence is functionally related to the nucleic acid of the embodiment herein.
[0512] As used herein, “expression system” encompasses any combination of nucleic acid molecules necessary for the expression of one polypeptide, or for the co-expression of two or more polypeptides, in vivo or in vitro in a given expression host. The coding sequences may each be located on a single nucleic acid molecule or vector, for example, on a vector containing multiple cloning sites or on a polycistronic nucleic acid, or they may be distributed across two or more physically different vectors. A specific example is an operon comprising a promoter sequence, one or more operator sequences, and one or more structural genes, each encoding an enzyme as described herein.
[0513] As used herein, the terms “amplifying” and “amplification” refer to the use of any suitable amplification methodology for generating or detecting recombinant or naturally expressed nucleic acids, as described in detail below. For example, the present invention provides methods and reagents (e.g., specific degenerate oligonucleotide primer pairs or oligo-dT primers) for amplifying (e.g., by polymerase chain reaction, PCR) naturally expressed (e.g., genomic DNA or mRNA) or recombinant (e.g., cDNA) nucleic acids of the present invention in vivo, ex vivo, or in vitro.
[0514] A “regulatory sequence” refers to a nucleic acid sequence that can determine the expression level of the nucleic acid sequence in the embodiments herein and regulate the rate of transcription of the nucleic acid sequence operationally linked to the regulatory sequence. Regulatory sequences include promoters, enhancers, transcription factors, promoter elements, and the like.
[0515] According to the present invention, “promoter,” “promoter-active nucleic acid,” or “promoter sequence” is understood to mean a nucleic acid that regulates the transcription of a nucleic acid when functionally linked to the nucleic acid to be transcribed. In particular, “promoter” refers to a nucleic acid sequence that controls the expression of a coding sequence by providing a binding site for RNA polymerase and other factors necessary for proper transcription, including but not limited to transcription factor binding sites, repressor and activator protein binding sites. The meaning of the term promoter also includes the term “promoter-regulating sequence.” Promoter-regulating sequences may include upstream and downstream elements that can affect the transcription, RNA processing, or stability of the coding nucleic acid sequence in question. Promoters include naturally occurring and synthetic sequences. The coding nucleic acid sequence is typically located downstream of the promoter with respect to the direction of transcription, which begins at the transcription start site.
[0516] In this context, “functional” or “operational” linkage is understood to mean, for example, the sequential arrangement of a nucleic acid with a regulatory sequence. For example, a sequence with promoter activity, a nucleic acid sequence to be transcribed, and optionally further regulatory elements, such as nucleic acid sequences that ensure transcription of the nucleic acid and a terminator, are linked so that each regulatory element can perform its function during transcription of the nucleic acid sequence. This does not necessarily require direct linkage in a chemical sense. Gene regulatory sequences, such as enhancer sequences, can also exert their function on the target sequence from more distant positions or even from other DNA molecules. A preferred arrangement is one in which the nucleic acid sequence to be transcribed is located behind the promoter sequence (i.e., at its 3' end) so that the two sequences are covalently linked together. The distance between the promoter sequence and the nucleic acid sequence to be recombinantly expressed may be less than 200 base pairs, less than 100 base pairs, or less than 50 base pairs.
[0517] In addition to promoters and terminators, other examples of regulatory elements include targeting sequences, enhancers, polyadenylation signals, selectable markers, amplification signals, and origins of replication. Suitable regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, CA (1990).
[0518] The term "constitutive promoter" refers to an unregulated promoter that enables the continuous transcription of the nucleic acid sequence to which it is operationally linked.
[0519] As used herein, the term “operationally linked” refers to the linking of polynucleotide elements that have a functional relationship. A nucleic acid is “operationally linked” if it has a functional relationship with another nucleic acid sequence. For example, a promoter, more precisely a transcriptional regulatory sequence, is operationally linked to a coding sequence if it affects the transcription of that coding sequence. Operationally linked means that the linked DNA sequences are typically contiguous. The nucleotide sequence associated with the promoter sequence may be of homologous or heterologous origin with respect to the plant to be transformed. The sequences may be entirely or partially synthetic. Regardless of origin, the nucleic acid sequence associated with the promoter sequence will be expressed or silenced according to the properties of the promoter to which it is linked after being bound to the polypeptide of the embodiments herein. The associated nucleic acid may encode a protein whose expression or repression is desired throughout the organism, either at all times or, alternatively, at specific times or in specific tissues, cells, or cell compartments. Such nucleotide sequences, in particular, encode proteins that confer desired phenotypic traits on the host cells or organism thereby modified or transformed. More specifically, the associated nucleotide sequences result in the production of desired products, as defined herein, in cells or organisms. In particular, the nucleotide sequences encode polypeptides having enzymatic activity, as defined herein.
[0520] The nucleotide sequences described herein may be part of an “expression cassette.” The terms “expression cassette” and “expression construct” are used synonymously. An expression construct (preferably recombinant) comprises a nucleotide sequence that encodes a polypeptide according to the present invention and is under the genetic control of a regulatory nucleic acid sequence.
[0521] In the method applied by the present invention, the expression cassette may be part of an "expression vector," particularly a recombinant expression vector.
[0522] According to the present invention, “expression unit” is understood to mean an expression-active nucleic acid that includes a promoter as defined herein and, after functional linkage with the nucleic acid or gene to be expressed, regulates the expression, i.e., transcription and translation, of said nucleic acid or gene. Therefore, in this context, it is also referred to as a “regulatory nucleic acid sequence.” In addition to the promoter, other regulatory elements, such as enhancers, may also be present.
[0523] According to the present invention, an "expression cassette" or "expression construct" is understood to mean an expression unit that is operationally linked to a nucleic acid or gene to be expressed. Therefore, in contrast to an expression unit, an expression cassette includes not only nucleic acid sequences that regulate transcription and translation, but also nucleic acid sequences to be expressed as proteins as a result of transcription and translation.
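The distinction drawn above between an expression unit and an expression cassette can be sketched as a simple data model. This is a sketch only: the class and field names are hypothetical, the element sequences are arbitrary placeholders rather than real promoters or terminators, and placing the additional regulatory elements upstream of the promoter is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionUnit:
    """Regulatory part only: a promoter plus optional further elements."""
    promoter: str
    regulatory_elements: tuple = ()  # e.g. enhancers (assumed upstream here)


@dataclass(frozen=True)
class ExpressionCassette:
    """An expression unit linked to the sequence to be expressed."""
    unit: ExpressionUnit
    coding_sequence: str
    terminator: str = ""

    def sequence(self) -> str:
        """Concatenate the elements in 5' to 3' order."""
        return "".join(
            (*self.unit.regulatory_elements,
             self.unit.promoter,
             self.coding_sequence,
             self.terminator)
        )


# Placeholder sequences, chosen only to make the example runnable.
unit = ExpressionUnit(promoter="TTGACATATAAT")
cassette = ExpressionCassette(unit, coding_sequence="ATGAAATAA", terminator="GCGCTTTT")
full_sequence = cassette.sequence()
```

In this model the unit carries no coding sequence of its own, whereas the cassette always does, which is the difference stated in the definitions.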
[0524] In the context of the present invention, the terms “expression” or “overexpression” refer to generating or increasing the intracellular activity of one or more polypeptides in a microorganism encoded by the corresponding DNA. For this purpose, for example, it is possible to introduce a gene into an organism, replace an existing gene with another gene, increase the copy number of a gene, use a strong promoter, or use a gene encoding a corresponding polypeptide with high activity, and these means can be optionally combined.
[0525] Preferably, such a construct according to the present invention includes a promoter 5'-upstream and a terminator sequence 3'-downstream of the respective coding sequence, and optionally other conventional regulatory elements, in each case operationally linked to the coding sequence.
[0526] The nucleic acid constructs according to the present invention include, in particular, sequences encoding polypeptides derived from the amino acid sequences (SEQ ID NOs) described herein, or their reverse complements, or their derivatives and homologs, which are advantageously operationally or functionally linked to one or more regulatory signals for controlling, for example, increasing, gene expression.
[0527] In addition to these regulatory sequences, the natural regulation of these sequences may still be present upstream of the actual structural genes, and may optionally have been genetically modified so that the natural regulation is switched off and gene expression is enhanced. However, the nucleic acid construct may also be of simpler construction, i.e., no additional regulatory signals are inserted upstream of the coding sequence and the natural promoter, together with its regulation, is not removed. Instead, the natural regulatory sequence is mutated so that regulation no longer occurs and gene expression is increased.
[0528] A preferred nucleic acid construct also advantageously includes one or more of the previously mentioned “enhancer” sequences functionally linked to the promoter, thereby enabling enhanced expression of the nucleic acid sequence. Additional advantageous sequences, such as further regulatory elements or terminators, may also be inserted at the 3' end of the DNA sequence. One or more copies of the nucleic acid according to the present invention may be present in the construct. Other markers, such as genes complementing auxotrophies or conferring antibiotic resistance, may also be optionally present in the construct for selection of the construct.
[0529] Suitable regulatory sequences are present, for example, in promoters such as the cos, tac, trp, tet, trp-tet, lpp, lac, lpp-lac, lacIq, T7, T5, T3, gal, trc, ara, rhaP (rhaPBAD), SP6, lambda-PR, or lambda-PL promoters, which are advantageously used in Gram-negative bacteria. Further advantageous regulatory sequences are present, for example, in the Gram-positive promoters amy and SPO2, and in the yeast or fungal promoters ADC1, MFalpha, AC, P-60, CYC1, GAPDH, TEF, rp28, and ADH. Artificial promoters can also be used for regulation.
[0530] For expression in a host organism, nucleic acid constructs are advantageously inserted into vectors, such as plasmids or phages, which allow for optimal gene expression in the host. The term "vector" is also understood to mean, in addition to plasmids and phages, all other vectors known to skilled workers, for example, viruses such as SV40, CMV, baculoviruses and adenoviruses, transposons, IS elements, phasmids, cosmids, and linear or circular DNA or artificial chromosomes. These vectors can be replicated autonomously in the host organism or replicated chromosomally. These vectors represent a further development of the present invention. Binary or cpo-integration vectors are also applicable.
[0531] Suitable plasmids are, for example, in Escherichia coli (E. coli), pLG338, pACYC184, pBR322, pUC18, pUC19, pKC30, pRep4, pHS1, pKK223-3, pDHE19.2, pHS2, pPLc236, pMBL24, pLG200, pUR290, pIN-III113-B1, λgt11 or pBdCI; in Streptomyces, pIJ101, pIJ364, pIJ702 or pIJ361; in Bacillus, pUB110, pC194 or pBD214; in Corynebacterium, pSA77 or pAJ667; in fungi, pALS1, pIL2 or pBB116; in yeast, 2alphaM, pAG-1, YEp6, YEp13 or pEMBLYe23; or in plants, pLGV23, pGHlac+, pBIN19, pAK2004, or pDH51. The plasmids mentioned above are a small selection of possible plasmids. Further plasmids are well known to skilled workers and can be found, for example, in the book Cloning Vectors (Eds. Pouwels PH et al. Elsevier, Amsterdam-New York-Oxford, 1985, ISBN 0 444 904018).
[0532] In a further development of the vector, the vector containing the nucleic acid construct or the nucleic acid according to the present invention may also advantageously be introduced into microorganisms in the form of linear DNA and integrated into the genome of the host organism via heterologous or homologous recombination. This linear DNA may consist of a linearized vector, such as a plasmid, or of the nucleic acid construct or the nucleic acid according to the present invention alone.
[0533] For optimal expression of heterologous genes in an organism, it is advantageous to modify nucleic acid sequences to match specific "codon usage frequencies" used in that organism. These "codon usage frequencies" can be easily determined by computer evaluation of other known genes in the organism in question.
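A minimal sketch of such a computer evaluation is given below, assuming that the known genes are supplied as in-frame coding sequences written as DNA strings. The function names `codon_usage` and `preferred_codon` and the example sequence are hypothetical and serve only to illustrate the counting step.

```python
from collections import Counter


def codon_usage(coding_sequences):
    """Return the relative frequency of each codon over all given sequences.

    Sequences are assumed to be in frame; trailing bases that do not
    form a complete codon are ignored.
    """
    counts = Counter()
    for cds in coding_sequences:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3
        for i in range(0, usable, 3):
            counts[cds[i:i + 3]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}


def preferred_codon(usage, synonymous_codons):
    """Pick the most frequently used codon among the synonymous codons given."""
    return max(synonymous_codons, key=lambda codon: usage.get(codon, 0.0))


# Toy input: one short, made-up coding sequence (ATG CTG CTG TAA).
usage = codon_usage(["ATGCTGCTGTAA"])
leucine_choice = preferred_codon(usage, ["CTG", "CTT", "TTA"])
```

In practice the table would be built from many genes of the host in question, and the preferred codon for each amino acid would then be substituted into the heterologous sequence.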
[0534] The expression cassette according to the present invention is generated by fusing a suitable promoter to a suitable coding nucleotide sequence and a terminator signal or polyadenylation signal. For this purpose, common recombination and cloning techniques such as those described in, for example, T. Maniatis, EF Fritsch and J. Sambrook, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989), TJ Silhavy, ML Berman and LW Enquist, Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1984), and Ausubel, FM et al., Current Protocols in Molecular Biology, Greene Publishing Assoc. and Wiley Interscience (1987) are used.
[0535] For expression in a suitable host organism, recombinant nucleic acid constructs or gene constructs are advantageously inserted into host-specific vectors that enable optimal gene expression in the host. Vectors are well known to skilled workers and can be found, for example, in Cloning Vectors (Pouwels PH et al., Ed., Elsevier, Amsterdam-New York-Oxford, 1985).
[0536] An alternative embodiment herein provides methods for “modifying gene expression” in host cells. For example, the expression of the polynucleotides of the embodiments herein can be enhanced, overexpressed, or induced in host cells or host organisms under certain circumstances (e.g., exposure to certain temperatures or culture conditions).
[0537] Modification of the expression of the polynucleotides provided herein may also result in ectopic expression, i.e., an expression pattern in the modified organism that differs from that in the control or wild-type organism. Modified expression may result from the interaction between the polypeptides of the embodiments herein and exogenous or endogenous modulators, or from chemical modification of the polypeptides. The term also covers expression patterns of the polynucleotides of the embodiments herein in which expression is reduced to below detection levels or activity is completely suppressed.
[0538] In one embodiment, also provided herein are isolated, recombinant, or synthetic polynucleotides encoding the polypeptides or variant polypeptides provided herein.
[0539] In one embodiment, several nucleic acid sequences encoding polypeptides are co-expressed in a single host, particularly under the control of different promoters. In another embodiment, several nucleic acid sequences encoding polypeptides may be present on a single transformation vector, or they may be co-transformed simultaneously using separate vectors, and transformants containing both chimeric genes may be selected. Similarly, one or more polypeptide-encoding genes may be expressed together with other chimeric genes in a single plant, cell, microorganism, or organism.
[Recombinant production of polypeptides according to the present invention]
[0540] The present invention further relates to a method for recombinantly producing polypeptides according to the present invention, or functional, biologically active fragments thereof, comprising culturing a polypeptide-producing microorganism, optionally inducing polypeptide expression by applying at least one gene expression-inducing agent, and isolating the expressed polypeptide from the culture. Polypeptides can, if necessary, be produced in this manner on an industrial scale.
[0541] The microorganisms produced by this invention can be cultured continuously or discontinuously by batch, fed-batch, or repeated fed-batch methods. Outlines of known culture methods can be found in Chmiel's textbook (Bioprozesstechnik 1. Einfuehrung in die Bioverfahrenstechnik [Bioprocess technology 1. Introduction to bioprocess technology] (Gustav Fischer Verlag, Stuttgart, 1991)) or Storhas's textbook (Bioreaktoren und periphere Einrichtungen [Bioreactors and peripheral equipment] (Vieweg Verlag, Braunschweig / Wiesbaden, 1994)). Standard laboratory methods are available for this purpose, are known in the art, and are further described herein.
[0542] If polypeptides are not secreted into the culture medium, the cells may be lysed, and the product can be obtained from the lysate by known methods for isolating proteins. The cells can optionally be disrupted by high-frequency ultrasound, by high pressure (e.g., in a French press), by osmolysis, by the action of surfactants, lytic enzymes or organic solvents, by a homogenizer, or by a combination of several of the aforementioned methods.
[0543] Polypeptides can be purified by known chromatographic techniques such as Q-Sepharose chromatography, ion exchange chromatography, and molecular sieve chromatography (gel filtration), as well as by other common techniques such as ultrafiltration, crystallization, salting out, dialysis, and native gel electrophoresis. Suitable methods are described, for example, in Cooper, TG, Biochemische Arbeitsmethoden [Biochemical processes], Verlag Walter de Gruyter, Berlin, New York or Scopes, R., Protein Purification, Springer Verlag, New York, Heidelberg, Berlin.
[0544] To isolate recombinant proteins, it may be advantageous to use vector systems or oligonucleotides that lengthen the cDNA by a defined nucleotide sequence, thereby encoding a modified polypeptide or fusion protein, which is, for example, easier to purify. Suitable modifications of this type include, for example, so-called "tags" that function as anchors, such as the modification known as the hexahistidine anchor, or epitopes that can be recognized as antigens by antibodies (see, for example, Harlow, E. and Lane, D., 1988, Antibodies: A Laboratory Manual. Cold Spring Harbor (NY) Press). These anchors can serve to attach the protein to a solid support, such as a polymer matrix, which can be used, for example, as packing in a chromatography column, or on a microtiter plate or some other support.
[0545] At the same time, these anchors can also be used for protein recognition. Furthermore, for protein recognition, conventional markers such as fluorescent dyes, enzyme markers that form detectable reaction products after reaction with a substrate, or radioactive markers can be used alone or in combination with anchors for protein derivatization.
[Use of the compound of formula (I) and other compounds of the present invention]
[0546] Further aspects of the present invention include the use of a compound of formula (I), or of another (intermediate) compound obtained or obtainable from embodiments of the present invention, as a fragrance, flavor, or aroma component, or as a precursor for producing said components.
[0547] As described above, the present invention includes the use of a compound of formula (I) as a fragrance component. In other words, the present invention relates to a method or process for imparting, enhancing, improving, or modifying the olfactory properties of a fragrance composition, a fragranced article, or a surface, the method comprising, for example, adding an effective amount of at least one compound of formula (I) to the composition or article to impart its typical note. The final hedonic effect may depend on the precise dosage and on the sensory properties of the compound of the present invention, but in any case the addition of the compound of the present invention imparts its characteristic touch to the final product, in the form of a note, touch, or aspect, depending on the dosage.
[0548] Here, "use of compound (I)" should also be understood as the use of any composition that contains compound (I) and can be advantageously used in the fragrance industry.
[0549] The aforementioned compositions, which can be advantageously used as fragrance components, are also a subject of the present invention.
[0550] Therefore, another subject of the present invention is a fragrance composition containing:
i) at least one compound of the present invention as defined above, as a fragrance component;
ii) at least one component selected from the group consisting of fragrance carriers and fragrance bases; and
iii) at least one fragrance adjuvant.
[0551] Here, "fragrance carrier" means a material that is substantially neutral from the standpoint of fragrance, that is, a material that does not significantly alter the sensory properties of the fragrance components. The carrier may be a liquid or a solid.
[0552] Non-limiting examples of liquid carriers include an emulsifying system, i.e., a solvent and surfactant system, or a solvent commonly used in fragrances. A detailed description of the properties and types of solvents commonly used in fragrances cannot be exhaustive. However, examples of the most commonly used solvents include butylene or propylene glycol, glycerol, dipropylene glycol and its monoethers, 1,2,3-propanetriyl triacetate, dimethyl glutarate, dimethyl adipate, 1,3-diacetyloxypropan-2-yl acetate, diethyl phthalate, isopropyl myristate, benzyl benzoate, benzyl alcohol, 2-(2-ethoxyethoxy)-1-ethanol, triethyl citrate, or mixtures thereof. In the case of a composition containing both a fragrance carrier and a fragrance base, suitable fragrance carriers other than those previously specified may include ethanol, water/ethanol mixtures, limonene or other terpenes, isoparaffins, for example, those known by the trademark Isopar® (manufactured by Exxon Chemical), glycol ethers and glycol ether esters, for example, those known by the trademark Dowanol® (manufactured by Dow Chemical Company), or hydrogenated castor oils, for example, those known by the trademark Cremophor® RH40 (manufactured by BASF).
[0553] The term "solid carrier" refers to a material to which a fragrance composition or several elements of a fragrance composition can be chemically or physically bound. Generally, such solid carriers are used either to stabilize a composition or to control the rate of evaporation of the composition or of several of its components. The use of solid carriers is common in the art, and those skilled in the art know how to obtain the desired effect. However, non-limiting examples of solid carriers include absorbent gums or polymers or inorganic materials, such as porous polymers, cyclodextrins, wood-based materials, organic or inorganic gels, clays, gypsum, talc, or zeolites.
[0554] Other non-limiting examples of solid carriers include encapsulating materials. Examples of such materials may include wall-forming and plasticizing materials, such as monosaccharides, disaccharides or trisaccharides, natural or modified starches, hydrophilic colloids, cellulose derivatives, polyvinyl acetate, polyvinyl alcohol, proteins, or pectin, or materials cited in references such as H. Scherz, Hydrokolloide: Stabilisatoren, Dickungs- und Geliermittel in Lebensmitteln, Band 2 der Schriftenreihe Lebensmittelchemie, Lebensmittelqualitaet, Behr's Verlag GmbH & Co., Hamburg, 1996. Encapsulation is carried out by methods well known to those skilled in the art, for example, by using techniques such as spray drying, agglomeration or extrusion, or by coating encapsulation, including coacervation and complex coacervation techniques.
[0555] Further non-limiting examples of solid carriers include core-shell capsules having aminoplast, polyamide, polyester, polyurea, or polyurethane type resins or mixtures thereof (all of which resins are well known to those skilled in the art), obtained using a phase separation process induced by polymerization, interfacial polymerization, coacervation, or a combination of these (all of which techniques are described in the prior art), optionally in the presence of a polymer stabilizer or cationic copolymer.
[0556] The resin can be produced by polycondensation of aldehydes (e.g., formaldehyde, 2,2-dimethoxyethanal, glyoxal, glyoxylic acid, or glycolaldehyde and mixtures thereof) with amines such as urea, benzoguanamine, glycoluril, melamine, methylolmelamine, methylated methylolmelamine, guanazole, and mixtures thereof. Alternatively, pre-formed resins such as alkylolated polyamines, commercially available under trademarks such as Urac® (manufactured by Cytec Technology Corp.), Cymel® (manufactured by Cytec Technology Corp.), Urecoll®, or Luracoll® (manufactured by BASF), can be used.
[0557] Other resins are produced by polycondensation of polyols such as glycerol with polyisocyanates such as trimers of hexamethylene diisocyanate, trimers of isophorone diisocyanate or xylylene diisocyanate, or biuret of hexamethylene diisocyanate, or trimers of xylylene diisocyanate and trimethylolpropane (known by trade name Takenate®, manufactured by Mitsui Chemicals). Among these, trimers of xylylene diisocyanate and trimethylolpropane, and biuret of hexamethylene diisocyanate are preferred.
[0558] Some of the important literature related to the encapsulation of fragrances by polycondensation of amino resins, i.e., melamine-based resins, with aldehydes, includes papers such as those published by K. Dietrich et al. in Acta Polymerica, 1989, vol. 40, pages 243, 325 and 683, and 1990, vol. 41, page 91. Such papers already describe various parameters affecting the preparation of such core-shell microcapsules according to prior art methods, which are also further detailed and illustrated in the patent literature. U.S. Patent No. 4,396,670 of Wiggins Teape Group Limited is an early example of the latter. Since then, many other authors have enriched the literature in this field, and it would be impossible to cover all published developments here, but the general knowledge of encapsulation techniques is very extensive. More recent relevant publications disclosing suitable uses of such microcapsules include, for example, the paper by K. Bruyninckx and M. Dusselier, ACS Sustainable Chemistry & Engineering, 2019, vol. 7, pages 8041-8054.
[0559] Here, "fragrance base" means a composition containing at least one fragrance co-component.
[0560] Said fragrance co-components are not compounds of formula (I). Furthermore, here, “fragrance co-component” means a compound used in a fragrance preparation or composition to impart a hedonic effect. In other words, to be considered a fragrance component, such a co-component should be recognized by those skilled in the art not merely as having an odor, but as being capable of imparting or modifying the odor of a composition in a favorable or pleasant manner. Fragrance components may impart further benefits other than modifying or imparting an odor, such as long-lasting effect, blooming, odor neutralization, antimicrobial effect, antiviral effect, microbial stability, or pest control.
[0561] The properties and types of fragrance co-components present in the base do not warrant a more detailed description here, which in any case would not be exhaustive; those skilled in the art can select them based on their general knowledge, according to the intended use or application and the desired sensory effect. Generally, these fragrance co-components belong to diverse chemical classes such as alcohols, lactones, aldehydes, ketones, esters, ethers, acetates, nitriles, terpenoids, nitrogen-containing or sulfur-containing heterocyclic compounds, and essential oils, and said fragrance co-components may be of natural or synthetic origin.
[0562] In particular, the following are examples of fragrance co-components commonly used in fragrance formulations:
- Aldehyde components: decanal, dodecanal, 2-methyl-undecanal, 10-undecenal, octanal, nonanal, and/or nonenal;
- Aromatic herbal components: eucalyptus oil, camphor, eucalyptol, 5-methyltricyclo[6.2.1.0~2,7~]undecan-4-one, 1-methoxy-3-hexanethiol, 2-ethyl-4,4-dimethyl-1,3-oxathiane, 2,2,7/8,9/10-tetramethylspiro[5.5]undec-8-en-1-one, menthol, and/or alpha-pinene;
- Balsam components: coumarin, ethyl vanillin, and/or vanillin;
- Citrus components: dihydromyrcenol, citral, orange oil, linalyl acetate, citronellyl nitrile, orange terpene, limonene, 1-p-menthen-8-yl acetate, and/or 1,4(8)-p-menthadiene;
- Floral components: methyl dihydrojasmonate, linalool, citronellol, phenylethanol, 3-(4-tert-butylphenyl)-2-methylpropanal, hexyl cinnamaldehyde, benzyl acetate, benzyl salicylate, tetrahydro-2-isobutyl-4-methyl-4(2H)-pyranol, beta-ionone, methyl 2-(methylamino)benzoate, (E)-3-methyl-4-(2,6,6-trimethyl-2-cyclohexen-1-yl)-3-buten-2-one, (1E)-1-(2,6,6-trimethyl-2-cyclohexen-1-yl)-1-penten-3-one, 1-(2,6,6-trimethyl-1,3-cyclohexadien-1-yl)-2-buten-1-one, (2E)-1-(2,6,6-trimethyl-2-cyclohexen-1-yl)-2-buten-1-one, (2E)-1-[2,6,6-trimethyl-3-cyclohexen-1-yl]-2-buten-1-one, (2E)-1-(2,6,6-trimethyl-1-cyclohexen-1-yl)-2-buten-1-one, 2,5-dimethyl-2-indanmethanol, 2,6,6-trimethyl-3-cyclohexene-1-carboxylate, 3-(4,4-dimethyl-1-cyclohexen-1-yl)propanal, 3-(3,3/1,1-dimethyl-5-indanyl)propanal, hexyl salicylate, 3,7-dimethyl-1,6-nonadien-3-ol, 3-(4-isopropylphenyl)-2-methylpropanal, verdyl acetate, geraniol, p-menth-1-en-8-ol, 4-(1,1-dimethylethyl)-1-cyclohexyl acetate, 1,1-dimethyl-2-phenylethyl acetate, 4-cyclohexyl-2-methyl-2-butanol, amyl salicylate, high cis-methyl dihydrojasmonate, 3-methyl-5-phenyl-1-pentanol, verdyl propionate, geranyl acetate, tetrahydrolinalool, cis-7-p-menthanol, propyl (S)-2-(1,1-dimethylpropoxy)propanoate, 2-methoxynaphthalene, 2,2,2-trichloro-1-phenylethyl acetate, 4/3-(4-hydroxy-4-methylpentyl)-3-cyclohexene-1-carbaldehyde, amyl cinnamic aldehyde, 8-decen-5-olide, 4-phenyl-2-butanone, isononyl acetate, 4-(1,1-dimethylethyl)-1-cyclohexyl acetate, verdyl isobutyrate, and/or mixtures of methyl ionone isomers;
- Fruity components: gamma-undecalactone, 2,2,5-trimethyl-5-pentylcyclopentanone, 2-methyl-4-propyl-1,3-oxathiane, 4-decanolide, ethyl 2-methyl-pentanoate, hexyl acetate, ethyl 2-methylbutanoate, gamma-nonalactone, allyl heptanoate, 2-phenoxyethyl isobutyrate, ethyl 2-methyl-1,3-dioxolane-2-acetate, diethyl 1,4-cyclohexanedicarboxylate, 3-methyl-2-hexen-1-yl acetate, 1-[3,3-dimethylcyclohexyl]ethyl [3-ethyl-2-oxiranyl]acetate, and/or diethyl 1,4-cyclohexanedicarboxylate;
- Green components: 2-methyl-3-hexanone (E)-oxime, 2,4-dimethyl-3-cyclohexene-1-carbaldehyde, 2-tert-butyl-1-cyclohexyl acetate, styrallyl acetate, allyl (2-methylbutoxy)acetate, 4-methyl-3-decen-5-ol, diphenyl ether, (Z)-3-hexen-1-ol, and/or 1-(5,5-dimethyl-1-cyclohexen-1-yl)-4-penten-1-one;
- Musk components: 1,4-dioxa-5,17-cycloheptadecanedione, (Z)-4-cyclopentadecen-1-one, 3-methylcyclopentadecanone, 1-oxa-12-cyclohexadecen-2-one, 1-oxa-13-cyclohexadecen-2-one, (9Z)-9-cycloheptadecen-1-one, 2-{(1S)-1-[(1R)-3,3-dimethylcyclohexyl]ethoxy}-2-oxoethyl propionate, 3-methyl-5-cyclopentadecen-1-one, 4,6,6,7,8,8-hexamethyl-1,3,4,6,7,8-hexahydrocyclopenta[g]isochromene, (1S,1'R)-2-[1-(3',3'-dimethyl-1'-cyclohexyl)ethoxy]-2-methylpropyl propionate, oxacyclohexadecan-2-one, and/or (1S,1'R)-[1-(3',3'-dimethyl-1'-cyclohexyl)ethoxycarbonyl]methyl propanoate;
- Woody components: 1-[(1RS,6SR)-2,2,6-trimethylcyclohexyl]-3-hexanol, 3,3-dimethyl-5-[(1R)-2,2,3-trimethyl-3-cyclopenten-1-yl]-4-penten-2-ol, 3,4'-dimethylspiro[oxirane-2,9'-tricyclo[6.2.1.0~2,7~]undec[4]ene], (1-ethoxyethoxy)cyclododecane, 2,2,9,11-tetramethylspiro[5.5]undec-8-en-1-yl acetate, 1-(octahydro-2,3,8,8-tetramethyl-2-naphthalenyl)-1-ethanone, patchouli oil, terpene fractions of patchouli oil, Clearwood®, (1'R,E)-2-ethyl-4-(2',2',3'-trimethyl-3'-cyclopenten-1'-yl)-2-buten-1-ol, 2-ethyl-4-(2,2,3-trimethyl-3-cyclopenten-1-yl)-2-buten-1-ol, methyl cedryl ketone, 5-(2,2,3-trimethyl-3-cyclopentenyl)-3-methylpentan-2-ol, 1-(2,3,8,8-tetramethyl-1,2,3,4,6,7,8,8a-octahydronaphthalen-2-yl)ethan-1-one, and/or isobornyl acetate;
- Other ingredients (e.g., amber, powdery, spicy, or watery): dodecahydro-3a,6,6,9a-tetramethyl-naphtho[2,1-b]furan and any of its stereoisomers, heliotropin, anisaldehyde, eugenol, cinnamic aldehyde, clove oil, 3-(1,3-benzodioxol-5-yl)-2-methylpropanal, 7-methyl-2H-1,5-benzodioxepin-3(4H)-one, 2,5,5-trimethyl-1,2,3,4,4a,5,6,7-octahydro-2-naphthalenol, 1-phenylvinyl acetate, 6-methyl-7-oxa-1-thia-4-azaspiro[4.4]nonane, and/or 3-(3-isopropyl-1-phenyl)butanal.
[0563] The fragrance base according to the present invention is not limited to the fragrance co-components described above; many other such co-components are listed in reference texts such as S. Arctander, *Perfume and Flavor Chemicals*, 1969, Montclair, New Jersey, USA, or its more recent editions, in other works of a similar kind, and in the extensive patent literature in the field of fragrances. The co-components may also be compounds known to release various types of fragrance compounds in a controlled manner, also known as pro-fragrances or pro-scents. Non-limiting examples of suitable pro-fragrances include 4-(dodecylthio)-4-(2,6,6-trimethyl-2-cyclohexen-1-yl)-2-butanone, 4-(dodecylthio)-4-(2,6,6-trimethyl-1-cyclohexen-1-yl)-2-butanone, trans-3-(dodecylthio)-1-(2,6,6-trimethyl-3-cyclohexen-1-yl)-1-butanone, 2-(dodecylthio)octan-4-one, 2-phenylethyl oxo(phenyl)acetate, 3,7-dimethylocta-2,6-dien-1-yl oxo(phenyl)acetate, (Z)-hex-3-en-1-yl oxo(phenyl)acetate, 3,7-dimethyl-2,6-octadien-1-yl hexadecanoate, bis(3,7-dimethylocta-2,6-dien-1-yl) succinate, (2-((2-methylundec-1-en-1-yl)oxy)ethyl)benzene, 1-methoxy-4-(3-methyl-4-phenethoxybut-3-en-1-yl)benzene, (3-methyl-4-phenethoxybut-3-en-1-yl)benzene, 1-(((Z)-hex-3-en-1-yl)oxy)-2-methylundec-1-ene, (2-((2-methylundec-1-en-1-yl)oxy)ethoxy)benzene, 2-methyl-1-(octan-3-yloxy)undec-1-ene, 1-methoxy-4-(1-phenethoxyprop-1-en-2-yl)benzene, 1-methyl-4-(1-phenethoxyprop-1-en-2-yl)benzene, 2-(1-phenethoxyprop-1-en-2-yl)naphthalene, (2-phenethoxyvinyl)benzene, 2-(1-((3,7-dimethyloct-6-en-1-yl)oxy)prop-1-en-2-yl)naphthalene, (2-((2-pentylcyclopentylidene)methoxy)ethyl)benzene, 4-allyl-2-methoxy-1-((2-methoxy-2-phenylvinyl)oxy)benzene, (2-((2-heptylcyclopentylidene)methoxy)ethyl)benzene, 1-isopropyl-4-methyl-2-((2-pentylcyclopentylidene)methoxy)benzene, 2-methoxy-1-((2-pentylcyclopentylidene)methoxy)-4-propylbenzene, 3-methoxy-4-((2-methoxy-2-phenylvinyl)oxy)benzaldehyde, 4-((2-(hexyloxy)-2-phenylvinyl)oxy)-3-methoxybenzaldehyde, or mixtures thereof.
[0564] Here, “fragrance adjuvants” means ingredients that can impart additional benefits such as color, specific lightfastness, or chemical stability. A detailed description of the properties and types of adjuvants commonly used in fragrance compositions cannot be exhaustive, but such ingredients are well known to those skilled in the art. Specific non-limiting examples include: viscosity agents (e.g., surfactants, thickeners, gelling agents, and/or rheology modifiers), stabilizers (e.g., preservatives, antioxidants, heat/light stabilizers, buffering agents, or chelating agents, such as BHT), colorants (e.g., dyes and/or pigments), preservatives (e.g., antibacterial, antimicrobial, antifungal, or anti-irritant agents), abrasives, skin coolants, fixatives, insecticides, ointments, vitamins, and mixtures thereof.
[0565] Those skilled in the art will understand that by mixing the aforementioned components of a fragrance composition, by simply applying standard knowledge in the art, and by trial-and-error methodologies, it is possible to perfectly design the optimal formulation for the desired effect.
[0566] A composition of the present invention comprising at least one compound of formula (I) and at least one fragrance carrier constitutes a specific embodiment of the present invention, as does a fragrance composition comprising at least one compound of formula (I), at least one fragrance carrier, at least one fragrance base, and optionally at least one fragrance adjuvant.
[0567] According to certain embodiments, the compositions described above contain more than one of the compounds of formula (I), enabling perfumers to prepare accords or fragrances having the olfactory characteristics of various compounds of the present invention, and thus creating new building blocks for creative purposes.
[0568] Furthermore, for clarity, it is understood that any mixture obtained directly from a reaction medium, for example, from a chemical synthesis without sufficient purification, in which the compounds of the present invention are involved as a starting, intermediate, or final product, cannot be considered a fragrance composition according to the present invention unless such mixture provides the compounds of the present invention in a form suitable for fragrance use. Therefore, unless otherwise specified, unpurified reaction mixtures are generally excluded from the present invention.
[0569] The compounds of the present invention can also be advantageously used in all areas of modern fragrances, i.e., fine or functional fragrances, to favorably impart or modify the scent of consumer products to which compound (I) is added. Consequently, another subject of the present invention is a scented consumer product containing at least one compound of formula (I) as a fragrance component, as defined above.
[0570] The compounds of the present invention can be added as is, or they can be added as part of the fragrance composition of the present invention.
[0571] To clarify, “fragrance consumer product” means a consumer product that provides at least a pleasant fragrance effect to the surface or space to which it is applied (e.g., skin, hair, textiles, or house surfaces). In other words, a fragrance consumer product according to the present invention is a fragrance consumer product comprising a functional formulation, as well as optionally, further beneficial agents corresponding to the desired consumer product, and at least one compound of the present invention in an olfactory effective amount. To clarify, the fragrance consumer product is a non-edible product.
[0572] The properties and types of components of the scented consumer product do not warrant a more detailed description here, which in any case would not be exhaustive. A person skilled in the art can select them based on general knowledge, according to the properties of the product and the desired effect.
[0573] Suitable scented consumer products include, but are not limited to: fragrances, such as fine fragrances, splashes or eaux de parfum, colognes, or shave or aftershave lotions; fabric care products, such as liquid or solid detergents, fabric softeners, liquid or solid fragrances, fabric refreshers, ironing water, paper, bleach, carpet cleaners, and curtain care products; body care products, such as hair care products (e.g., shampoos, coloring preparations or hairsprays, color care products, and hair styling products), dental care products, disinfectants, and intimate care products; cosmetic preparations (e.g., skin creams or lotions, vanishing creams, deodorants or antiperspirants (e.g., sprays or roll-ons), depilatories, tanning, sun, or after-sun products, nail products, skin cleansers, and cosmetics); skincare products (e.g., soaps, shower or bath mousses, oils or gels, hygiene products, or foot/hand care products); air care products, such as air fresheners or “ready-to-use” powder air fresheners for use in home spaces (rooms, refrigerators, cupboards, shoes, or cars) and/or public spaces (halls, hotels, malls, etc.); home care products, such as mold removers, furniture care products, wipes, dish soaps, or hard surface cleaners (e.g., for floors, bathrooms, sanitary ware, or windows); leather care products; and car care products, such as polishes, waxes, or plastic cleaners.
[0574] Some of the scented consumer products described above may constitute an aggressive medium for the compounds of the present invention. Therefore, it may be necessary to protect the compounds from premature degradation, for example, by encapsulation or by chemically binding them to another chemical substance suitable for releasing the compounds of the present invention in response to an appropriate external stimulus, such as an enzyme, light, heat, or a change in pH.
[0575] The proportions in which the compounds according to the present invention can be incorporated into the various products or compositions described above vary over a wide range of values. These values depend on the nature of the article to be scented and the desired sensory effect and, if the compounds according to the present invention are mixed with fragrance co-components, solvents, or additives commonly used in the art, on the nature of the co-components in a given base.
[0576] For example, in the case of fragrance compositions, the typical concentration is on the order of 0.001% to 10% by weight or higher of the compound of the present invention, based on the weight of the composition in which they are incorporated. In the case of fragranced consumer products, the typical concentration is on the order of 0.01% to 1% by weight or higher of the compound of the present invention, based on the weight of the consumer product in which they are incorporated.
[0577] Furthermore, the intermediate compounds produced in any of the embodiments described herein can be converted into derivatives such as, but not limited to, hydrocarbons, alcohols, diols, triols, acetals, ketals, aldehydes, acids, ethers, amides, ketones, lactones, epoxides, acetates, glycosides, and/or esters. These derivatives can be obtained by chemical methods such as, but not limited to, oxidation, reduction, alkylation, acylation, and/or rearrangement. Alternatively, derivatives can be obtained by biochemical methods, for example by contacting the terpene compound with an enzyme such as, but not limited to, an oxidoreductase, monooxygenase, dioxygenase, or transferase. The biochemical conversion can be carried out in vitro using isolated enzymes or enzymes from lysed cells, or in vivo using whole cells. The conversion may be a cyclization reaction achieved by chemical or biochemical methods. The derivatives can be used as fragrance, flavor, or aroma components.
[0578] Many possible modifications, which will become immediately apparent to those skilled in the art after considering the disclosures provided herein, also fall within the scope of the present invention.
[0579] The present invention will now be described in more detail by the following examples. These examples are for illustrative purposes only and are not intended to limit the scope of embodiments as described herein.
[0580] Examples. Materials and methods. Unless otherwise stated, all chemical and biochemical materials, as well as microorganisms or cells, used herein are commercially available products.
[0581] Unless otherwise specified, recombinant proteins were cloned and expressed by standard methods, such as those described in Sambrook, J., Fritsch, E.F. and Maniatis, T., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989.
[0582] Engineering of recombinant Escherichia coli (E. coli) strains for the production of terpenoid precursors by chromosomal integration of genes encoding mevalonate pathway enzymes. Escherichia coli (E. coli) strains were engineered to produce farnesyl pyrophosphate (FPP) by chromosomal integration of recombinant genes encoding mevalonate pathway enzymes.
[0583] An upper pathway operon (operon 1, from acetyl-CoA to mevalonate) was designed, consisting of the atoB gene from Escherichia coli, which encodes acetoacetyl-CoA thiolase, and the mvaS and mvaA genes from Staphylococcus aureus, which encode HMG-CoA synthase and HMG-CoA reductase, respectively.
[0584] As the lower mevalonate pathway operon (operon 2, from mevalonate to IPP/DMAPP), a native operon from the Gram-positive bacterium Streptococcus pneumoniae was selected; it encodes mevalonate kinase (mvaK1), phosphomevalonate kinase (mvaK2), phosphomevalonate decarboxylase (mvaD), and isopentenyl diphosphate isomerase (fni).
[0585] The gene encoding codon-optimized Saccharomyces cerevisiae FPP synthase (ERG20) was introduced into the 3' end of the upper pathway operon to convert isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) to FPP.
[0586] The operons described above were synthesized by DNA2.0 and integrated into the araA gene of Escherichia coli strain BL21(DE3). The heterologous pathway was introduced in two separate recombination steps using the CRISPR/Cas9 genome engineering system. The first operon to be integrated (lower pathway; operon 2) carried a spectinomycin (Spec) marker, which was used to screen for Spec-resistant candidate integrants. The second operon was designed to replace the Spec marker of the previously integrated operon, and Spec-sensitive candidate integrants were screened for after the second recombination event. Guide RNA expression vectors targeting the araA gene were designed and synthesized by DNA2.0. Operon integration was confirmed by PCR using primers designed to amplify across the recombination junctions between the araA integration target and the integrated operons. One clone that yielded the correct PCR results was then fully sequenced and recorded as strain DP1205.
[0587] Culture medium composition for Escherichia coli (E. coli). The mineral AM medium used in shake-flask and laboratory-scale fermentation experiments consisted of the following: 4.2 g/L KH2PO4; 15.7 g/L K2HPO4·3H2O; 2.0 g/L (NH4)2SO4; 1.7 g/L citric acid; 8.4 mg/L EDTA; 30 g/L glycerol; 5 g/L yeast extract, dissolved in diH2O; and dodecane at 10% (v/v). The medium was sterilized at 121°C for 30 minutes. The following were then added aseptically: 5 mL/L of a 1 M MgSO4·7H2O stock solution; 1 mL/L of vitamin solution (thiamine·HCl, 4.5 g/L); and 10 mL/L of batch trace metal solution. The pH was adjusted to 7 with 5 M NaOH. Batch trace metal solution (in 1 L of 1 M HCl): 0.25 g/L CoCl2·6H2O; 1.5 g/L MnCl2·4H2O; 0.15 g/L CuCl2·2H2O; 0.3 g/L H3BO3; 0.25 g/L Na2MoO4·2H2O; 1.3 g/L Zn(CH3COO)2·2H2O; 10 g/L Fe(III) citrate. For fed-batch fermentation in AM medium, a 20 L glycerol feed solution was prepared containing 700 g/L glycerol, 12 g/L MgSO4·7H2O, 13 mg/L EDTA, and 10 mL/L feed trace metal solution. The feed trace metal solution was prepared by dissolving 0.4 g of CoCl2·6H2O, 2.35 g of MnCl2·4H2O, 0.25 g of CuCl2·2H2O, 0.5 g of H3BO3, 0.4 g of Na2MoO4·2H2O, 1.6 g of Zn(CH3COO)2·2H2O, and 10 g of Fe(III) citrate·H2O in 1 L of 1 M HCl.
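As an illustration only, the per-litre amounts of the base AM medium listed above can be scaled to an arbitrary batch volume. The following Python sketch is a convenience calculation, not part of the described method; the dictionary name, its keys, and the function name `scale_am_medium` are hypothetical, and only the g/L values are taken from the text.

```python
# Base AM medium components in grams per litre, as listed in the text.
# The dictionary and function names are illustrative assumptions.
AM_BASE_G_PER_L = {
    "KH2PO4": 4.2,
    "K2HPO4.3H2O": 15.7,
    "(NH4)2SO4": 2.0,
    "citric acid": 1.7,
    "EDTA": 0.0084,  # 8.4 mg/L expressed in g/L
    "glycerol": 30.0,
    "yeast extract": 5.0,
}


def scale_am_medium(volume_l):
    """Return the grams of each base component needed for volume_l litres."""
    if volume_l <= 0:
        raise ValueError("volume must be positive")
    return {name: g_per_l * volume_l for name, g_per_l in AM_BASE_G_PER_L.items()}


if __name__ == "__main__":
    # Example: amounts for a 0.5 L batch.
    for name, grams in scale_am_medium(0.5).items():
        print(f"{name}: {grams:.4g} g")
```

The aseptically added supplements (MgSO4 stock, vitamin solution, trace metals) are given in mL/L in the text and would scale in the same linear way.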
[0588] Culture of engineered bacterial cells under conditions that enable the production of terpene compounds. DP1205 Escherichia coli (E. coli) cells engineered to produce elevated levels of the terpenoid precursor farnesyl diphosphate (FPP) (as described in International Publication No. 2018/114839) were transformed with one or two expression plasmids containing genes encoding enzymes of the homofarnesol biosynthesis pathway and/or terpene cyclases. Transformed cells were cultured on LB agar plates with the appropriate antibiotics (kanamycin (50 μg/mL) and/or carbenicillin (50 μg/mL) and/or chloramphenicol (34 μg/mL) and/or streptomycin (50 μg/mL)). Single colonies were inoculated into 5 mL of liquid LB medium supplemented with the same antibiotics, 4 g/L glucose, and 10% (v/v) n-dodecane. The following day, 0.2 mL of the overnight culture was inoculated into 2 mL of AM medium supplemented with the same antibiotics and 10% (v/v) n-dodecane. The culture was incubated at 37°C until it reached an optical density of 3. Recombinant protein expression was then induced by adding 0.1 mM IPTG, and the culture was incubated at 25°C for 72 hours.
[0589] Next, the culture was extracted with 1:1 volume of methyl tert-butyl ether (MTBE), and the composition of the organic phase was analyzed by GC-MS as described below. For quantification, an internal standard (α-longipinene (Sigma-Aldrich, Missouri, USA)) was added to the extract before GC-MS analysis, and the concentration of the component was estimated based on the comparison of peak areas.
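The peak-area comparison described above can be sketched as a simple internal-standard calculation. This is a minimal sketch under stated assumptions: the text does not give a response factor, so the default `relative_response=1.0` is an assumption, and the function name and the numbers in the example are hypothetical.

```python
def estimate_concentration(area_analyte, area_standard, conc_standard_mg_l,
                           relative_response=1.0):
    """Estimate an analyte concentration (mg/L) from GC-MS peak areas.

    relative_response is the analyte/standard response factor. The default
    of 1.0 is an assumption; the text does not state a value.
    """
    if area_standard <= 0 or relative_response <= 0:
        raise ValueError("standard area and response factor must be positive")
    ratio = area_analyte / area_standard
    return ratio * conc_standard_mg_l / relative_response


if __name__ == "__main__":
    # Hypothetical peak areas and internal standard concentration.
    print(estimate_concentration(3000.0, 1500.0, 50.0))
```

With equal detector response assumed, an analyte peak twice the area of the internal standard peak corresponds to twice the standard's concentration.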
[0590] GC-MS analysis method. Samples were analyzed using an Agilent 6890N GC system coupled with a 5975B series mass-selective detector (MSD) and equipped with a split / splitless injector (Agilent Technologies, CA) and a CombiPAL autosampler (PAL LSI 85 autosampler, Agilent Technologies, CA) injection system. The GC inlet temperature was set to 240°C, and 1.0 μL of sample was injected in split mode at a ratio of 25:1 (23.304 PSI) and analyzed on a DB-5ms capillary column (30 m × 0.25 mm inner diameter × 0.25 μm film thickness; Agilent J&W) using helium as the carrier gas at a constant flow rate of 1.2 mL / min. The oven was programmed to start at 80°C (hold for 1 minute), then increase to 300°C (10°C / min), and then to 300°C (30°C / min; hold for 1 minute).
[0591] General methods for genetic modification, culture, and compound analysis of Saccharomyces cerevisiae. A Saccharomyces cerevisiae strain engineered to produce elevated levels of the terpenoid precursor farnesyl diphosphate (FPP) (such as the one described in International Publication No. 2018/114839) was used as the base strain for expression of homofarnesol biosynthesis pathway genes and terpene cyclases. In short, this strain contains all the endogenous mevalonate pathway genes integrated into its genome under the control of the native GAL1 or GAL10 promoter. The farnesyl diphosphate precursor pool in this strain was further increased by downregulating the squalene synthase gene (ERG9) through replacement of its native promoter.
[0592] All genes (synthesized by ATUM, California, USA or Twist Bioscience, California, USA), along with relevant regulatory elements (e.g., promoters and terminators), were introduced into the base strain either by genome integration or by plasmids constructed in vivo using the yeast homologous recombination mechanism (Kuijpers et al., Microb Cell Fact., 2013, 12:47). All yeast transformations were performed using the lithium acetate method (Gietz and Woods, Methods Enzymol., 2002, 350:87-96).
[0593] Successfully transformed yeast colonies were grown for 3 days at 30°C in a culture medium containing 6.7 g / L of yeast nitrogen base (BD Difco, New Jersey, USA) without amino acids, appropriate antibiotics or nutrients depending on the marker gene used, 20 g / L of glucose, and 20 g / L of agar.
[0594] For metabolite production and analysis, a single colony of the modified yeast strain was inoculated into 2 mL of culture medium (Westfall et al., Proc Natl Acad Sci USA, 2012, 109:E111-118) supplemented with 2% galactose and 10% (v/v) n-dodecane (Sigma-Aldrich, Missouri, USA). The culture was incubated at 30°C for 3 days with shaking at 200 rpm. After incubation, the culture was extracted with two volumes of MTBE (supplemented with α-longipinene standard for quantification as described above), and the composition of the organic phase was analyzed by GC-MS using an Agilent 7890A GC system equipped with a split/splitless injector and a GC Injector 80 injection system (Agilent Technologies, CA) coupled with a 5975C series mass selective detector (MSD). The GC inlet temperature was set to 260°C, and 1.0 μL of sample was injected in splitless mode. Analysis was performed on an HP-5 GC column (30 m × 0.25 mm × 0.25 μm; Agilent J&W) using helium as the carrier gas at a constant flow rate of 1.2 mL/min. The oven's initial temperature was set to 100°C and programmed to rise to 300°C at 10°C/min.
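For orientation, the duration of an oven temperature program such as the one above follows directly from the temperature span and the ramp rate. The sketch below is an illustrative calculation only; the function name and the segment format are hypothetical, and hold times are assumed to be zero wherever the text does not state them.

```python
def oven_program_minutes(start_c, segments, initial_hold_min=0.0):
    """Total duration of a GC oven program in minutes.

    segments is a list of (target_c, rate_c_per_min, hold_min) tuples.
    Hold times not stated in the text are assumed to be zero.
    """
    total = initial_hold_min
    current = start_c
    for target_c, rate_c_per_min, hold_min in segments:
        if rate_c_per_min <= 0:
            raise ValueError("ramp rate must be positive")
        total += abs(target_c - current) / rate_c_per_min + hold_min
        current = target_c
    return total


if __name__ == "__main__":
    # Yeast method: 100 °C to 300 °C at 10 °C/min, no holds stated.
    print(oven_program_minutes(100, [(300, 10, 0)]))
```

For the ramp from 100°C to 300°C at 10°C/min, the ramp alone takes 20 minutes.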
[0595] Example 1. In vivo production of (3E,7E)-homofarnesol and biosynthetic intermediates in engineered bacterial cells expressing GGPP synthase, phosphatase, alcohol dehydrogenase, BVMO, enal cleavage enzyme, and esterase. The reaction scheme in Figure 3 illustrates a biochemical pathway that can be used to produce (3E,7E)-homofarnesol in vivo. The common isoprenoid precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), are condensed to form (2E,6E,10E)-geranylgeranyl diphosphate (GGPP). This reaction can be catalyzed by GGPP synthase or by a combination of farnesyl diphosphate synthase (FPP synthase) and GGPP synthase. GGPP can then be converted to (2E,6E,10E)-geranylgeraniol by a terpene synthase or a phosphatase, as described, for example, in International Publication No. 2020011883. (2E,6E,10E)-geranylgeraniol then undergoes several enzymatic degradation steps. In the proposed pathway, (2E,6E,10E)-geranylgeraniol is first oxidized by an alcohol dehydrogenase (ADH) to (2E,6E,10E)-geranylgeranial, which is then cleaved to produce (5E,9E)-farnesylacetone. This cleavage reaction can be catalyzed by an enal cleavage enzyme (ENase), such as the GXWXG (SEQ ID NO: 263) and DUF4334 domain protein described in International Publication No. 2021005097. (5E,9E)-farnesylacetone is further converted to (3E,7E)-homofarnesylacetate by a Baeyer-Villiger monooxygenase (BVMO). In the final step, the ester is hydrolyzed by an esterase to form (3E,7E)-homofarnesol.
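The sequence of enzymatic steps in this example can be summarized as an ordered list of (substrate, enzyme class, product) records, following the enzyme functions given for the operons in this example. The Python sketch below is only a bookkeeping aid; the names `PATHWAY`, `is_linear`, and `enzymes_up_to` are hypothetical, stereodescriptors are omitted for brevity, and by-products (acetaldehyde, acetic acid) are not tracked.

```python
# Ordered pathway steps as (substrate, enzyme class, product).
# Compound names are abbreviated; stereodescriptors are omitted.
PATHWAY = [
    ("IPP + DMAPP", "GGPP synthase", "GGPP"),
    ("GGPP", "phosphatase or terpene synthase", "geranylgeraniol"),
    ("geranylgeraniol", "alcohol dehydrogenase (ADH)", "geranylgeranial"),
    ("geranylgeranial", "enal cleavage enzyme (ENase)", "farnesylacetone"),
    ("farnesylacetone", "Baeyer-Villiger monooxygenase (BVMO)", "homofarnesylacetate"),
    ("homofarnesylacetate", "esterase", "homofarnesol"),
]


def is_linear(pathway):
    """Check that each step's product is the next step's substrate."""
    return all(pathway[i][2] == pathway[i + 1][0] for i in range(len(pathway) - 1))


def enzymes_up_to(pathway, compound):
    """List the enzyme classes needed to reach a given compound."""
    enzymes = []
    for _substrate, enzyme, product in pathway:
        enzymes.append(enzyme)
        if product == compound:
            return enzymes
    raise KeyError(compound)


if __name__ == "__main__":
    print(is_linear(PATHWAY))
    print(enzymes_up_to(PATHWAY, "farnesylacetone"))
```

Such a representation makes it easy to see which enzymes must be present in a screening strain that is meant to stop at a given intermediate.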
[0596] To validate the (3E,7E)-homofarnesol pathway, E. coli cells were engineered to express the necessary enzymes. The plasmid was assembled to contain two operons.
[0597] The first operon was designed to contain three cDNAs encoding:
- PsAerADH (SEQ ID NO: 11), an alcohol dehydrogenase from Pseudomonas aeruginosa (GenBank accession number: WP_079868259.1) that oxidizes (2E,6E,10E)-geranylgeraniol to (2E,6E,10E)-geranylgeranial;
- SCH24-BVMO1 (SEQ ID NO: 23), a Baeyer-Villiger monooxygenase (BVMO) from Filobasidium magnum, described in International Publication No. 2021005097, that oxidizes (5E,9E)-farnesylacetone to (3E,7E)-homofarnesylacetate; and
- SCH24-EST1 (SEQ ID NO: 27), an esterase from Filobasidium magnum, described in International Publication No. 2021005097, that hydrolyzes (3E,7E)-homofarnesylacetate to (3E,7E)-homofarnesol and acetic acid.
[0598] The second operon was constructed to contain four cDNAs encoding:
- SCH94-03944 (SEQ ID NO: 22), an enal cleavage enzyme from Rhodococcus erythropolis, described in International Publication No. 2021005097, that cleaves (2E,6E,10E)-geranylgeranial to (5E,9E)-farnesylacetone and acetaldehyde;
- CcrGGPPS2-del57 (SEQ ID NO: 1), a truncated form of (2E,6E,10E)-geranylgeranyl diphosphate synthase from Cistus creticus (GenBank: AAM21639.1); and
- PgpB (SEQ ID NO: 3), in two copies, a phosphatase from Escherichia coli (GenBank: WP_089622241.1) that converts (2E,6E,10E)-geranylgeranyl diphosphate to (2E,6E,10E)-geranylgeraniol by cleavage of the diphosphate group.
[0599] The cDNAs encoding PsAerADH, SCH24-BVMO1, SCH24-EST1, SCH94-03944, CcrGGPPS2-del57, and PgpB were codon-optimized for Escherichia coli (SEQ ID NOs: 102, 115, 120, 113, 90, and 93), and the RBS sequence (aaggaggtaaaaaa) (SEQ ID NO: 264) was placed upstream of each of these cDNAs. The first operon containing the cDNAs for PsAerADH, SCH24-BVMO1, and SCH24-EST1 was under the control of a T5 promoter and an rrnB T1 terminator. A second operon containing cDNA for SCH94-03944, CcrGGPPS2-del57, and two cDNAs for PgpB was under the control of the T5 promoter and rrnB terminator. Both operons were synthesized and cloned into a vector backbone containing the pUC origin of replication, the kanamycin resistance gene, and the LacI gene to obtain the vector pHFOL-5.
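The operon layout described above (promoter, then an RBS placed upstream of each cDNA, then a terminator) can be sketched as a simple string assembly. This is a schematic sketch only: the RBS sequence is the one quoted in the text, whereas the promoter, coding, and terminator strings in the example are hypothetical placeholders, and real constructs would include spacers and cloning sites not modeled here.

```python
RBS = "aaggaggtaaaaaa"  # RBS sequence quoted in the text (SEQ ID NO: 264)


def assemble_operon(promoter, cds_list, terminator, rbs=RBS):
    """Concatenate promoter, one RBS + coding sequence per gene, and terminator."""
    if not cds_list:
        raise ValueError("at least one coding sequence is required")
    body = "".join(rbs + cds for cds in cds_list)
    return promoter + body + terminator


if __name__ == "__main__":
    # Placeholder sequences, NOT the actual T5 promoter, cDNAs, or terminator.
    promoter = "<T5-promoter>"
    terminator = "<rrnB-T1-terminator>"
    cdss = ["ATGAAATAA", "ATGCCCTAA", "ATGGGGTAA"]
    operon = assemble_operon(promoter, cdss, terminator)
    print(operon)
    print(operon.count(RBS))
```

In this layout the number of RBS copies equals the number of cDNAs, which matches the statement that the RBS was placed upstream of each cDNA.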
[0600] The farnesyl diphosphate (FPP)-producing Escherichia coli (E. coli) strain DP1205, described in International Publication No. 2021005097, was transformed with the aforementioned vector pHFOL-5. When cultured under conditions that enable the production of terpene compounds, the resulting cells were able to produce (3E,7E)-homofarnesol (Figure 4). Under the conditions described in the "Materials and Methods" section, 69 mg / L of (3E,7E)-homofarnesol was produced in the culture medium in a tube assay.
[0601] The cellular product profile (Figure 4) also shows the accumulation of several metabolic intermediates, including (5E,9E)-farnesylacetone and (2E,6E,10E)-geranylgeraniol. By optimizing various enzymatic steps in the pathway, the accumulation of intermediates can be limited and the concentration of the final product can be increased.
[0602] The following example illustrates how to identify the appropriate enzymes for each enzymatic step in a pathway to increase (3E,7E)-homofarnesol production and limit the accumulation of metabolic intermediates.
[0603] Example 2. Enzymatic conversion of (5E,9E)-farnesylacetone to (3E,7E)-homofarnesylacetate and screening of Baeyer-Villiger monooxygenases to improve in vivo production of (3E,7E)-homofarnesol. Example 1 shows that, during in vivo production of (3E,7E)-homofarnesol, (5E,9E)-farnesylacetone is also detected, possibly due to insufficient activity of the Baeyer-Villiger monooxygenase (SCH24-BVMO1 (SEQ ID NO: 23)) in this strain.
[0604] In this example, in vivo screening of various BVMOs was performed to identify enzyme candidates with higher efficiency than SCH24-BVMO1 (SEQ ID NO: 23). For screening, a modified version of vector pHFOL-5 (described in Example 1) was created by removing SCH24-BVMO1. The new vector was named pF-Facetone-7. Transformation of Escherichia coli (E. coli) strain DP1205 with vector pF-Facetone-7 resulted in a strain capable of producing (5E,9E)-farnesylacetone when cultured under conditions that enable terpene compound production. Up to 410 mg/L of (5E,9E)-farnesylacetone was produced in culture medium in a tube assay (Figure 5A). When the cells are further transformed with a vector expressing an active BVMO, (5E,9E)-farnesylacetone is converted to (3E,7E)-homofarnesylacetate, which is in turn converted to (3E,7E)-homofarnesol by the SCH24-EST1 esterase (Figure 5B).
[0605] In the next step, we designed codon-optimized cDNAs encoding BVMOs and cloned them into a pJ423 expression plasmid (ATUM, Newark, California). DP1205 E. coli cells were co-transformed with one of these plasmids and plasmid pF-Facetone-7. BVMO activity was determined by quantifying the production of (3E,7E)-homofarnesol for each BVMO tested and compared to SCH24-BVMO1 (SEQ ID NO: 23).
[0606] The following table (Table 1) shows the relative activity of several BVMOs identified in this screening for (5E,9E)-farnesylacetone conversion.
[0607] [Table 3]
[0608] AraBVMO1 and AflavBVMO1 were found to produce significantly greater amounts of (3E,7E)-homofarnesol than SCH24-BVMO1 (SEQ ID NO: 23). AflavBVMO1 (SEQ ID NO: 26) increased (3E,7E)-homofarnesol production by 59% compared to reference BVMO.
[0609] Example 3. Enzymatic conversion of (2E,6E,10E)-geranylgeraniol to (2E,6E,10E)-geranylgeranial and in vivo screening of alcohol dehydrogenases to improve the efficiency of in vivo production of (3E,7E)-homofarnesol. In this example, we tested the efficiency of alcohol dehydrogenase (ADH) in vivo in oxidizing (2E,6E,10E)-geranylgeraniol to (2E,6E,10E)-geranylgeranial. Alcohol dehydrogenase catalyzes the reversible oxidation of alcohols.
[0610] To avoid the reverse alcohol dehydrogenase reaction in this in vivo screening assay, the enal-cleaving enzyme SCH94-03944, described in Example 1, was co-expressed in the E. coli cells to enzymatically convert the (2E,6E,10E)-geranylgeranial formed to (5E,9E)-farnesylacetone. In this setup, the catalytic efficiency of each ADH tested correlates with the amount of (2E,6E,10E)-geranylgeraniol converted to (5E,9E)-farnesylacetone. Candidate ADHs were codon-optimized and cloned into the pJ423 expression plasmid (ATUM, Newark, California). DP1205 Escherichia coli (E. coli) cells were co-transformed with one of these plasmids and with plasmid pJ401-SCH94-3944-PgpB-CcrGGPPS2-del57, which contains the genes necessary for producing (5E,9E)-farnesylacetone except the gene encoding ADH, as described below.
[0611] Plasmid pJ401-SCH94-3944-PgpB-CcrGGPPS2-del57 therefore contained an operon carrying three cDNAs encoding:
- the enal cleavage enzyme SCH94-03944 (SEQ ID NO: 22), a protein containing the GXWXG (SEQ ID NO: 263) and DUF4334 domains, described in International Publication No. 2021005097, that cleaves (2E,6E,10E)-geranylgeranial to (5E,9E)-farnesylacetone and acetaldehyde;
- PgpB (SEQ ID NO: 3), a phosphatase from Escherichia coli (GenBank: WP_089622241.1) that converts (2E,6E,10E)-geranylgeranyl diphosphate to (2E,6E,10E)-geranylgeraniol by cleaving the diphosphate; and
- CcrGGPPS2-del57 (SEQ ID NO: 1), a truncated form of (2E,6E,10E)-geranylgeranyl diphosphate synthase from Cistus creticus (GenBank: AAM21639.1).
[0612] cDNAs encoding SCH94-03944, PgpB, and CcrGGPPS2-del57 were codon-optimized for expression in Escherichia coli (SEQ ID NOs: 113, 93, and 90). An operon was designed containing the three cDNAs and an RBS sequence (aaggaggtaaaaaa) (SEQ ID NO: 264) placed upstream of each cDNA. The operon was synthesized and cloned into a pJ401 expression plasmid (ATUM, Newark, California).
[0613] The resulting strain (DP1205, containing plasmid pJ401-SCH94-3944-PgpB-CcrGGPPS2-del57 and plasmid pJ423, which holds a candidate alcohol dehydrogenase) was cultured under conditions that enable terpene production. The amounts of (2E,6E,10E)-geranylgeraniol and (5E,9E)-farnesylacetone were measured, and the conversion rates were calculated for each alcohol dehydrogenase tested.
[0614] [Table 4]
[0615] These results are shown in Table 2. In tube assays, the highest conversion rates of (2E,6E,10E)-geranylgeraniol detected in the culture medium were obtained with the alcohol dehydrogenases ThTerpADH1 (SEQ ID NO: 12), Ppseudo-alkJ (SEQ ID NO: 15), and CymB (SEQ ID NO: 17). The maximum amount of (5E,9E)-farnesylacetone, 172 mg/L, was produced by the ADH Ppseudo-alkJ (SEQ ID NO: 15).
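The text does not state how the conversion rate was calculated. One plausible way, shown here only as a sketch, is a molar ratio of product to product plus remaining substrate. This assumes that no intermediate (such as geranylgeranial) accumulates and that nothing is lost to side reactions. The molar masses are computed from the molecular formulas. The 172 mg/L product titer is the value reported above; the residual substrate value is hypothetical.

```python
# Molar masses in g/mol, computed from the molecular formulas.
M_GGOH = 290.48  # (2E,6E,10E)-geranylgeraniol, C20H34O
M_FA = 262.43    # (5E,9E)-farnesylacetone, C18H30O


def molar_conversion(substrate_mg_l, product_mg_l):
    """Fraction of geranylgeraniol converted to farnesylacetone, on a molar basis.

    Assumption: substrate and product are the only species that matter,
    so conversion = product / (product + remaining substrate).
    """
    substrate_mmol = substrate_mg_l / M_GGOH
    product_mmol = product_mg_l / M_FA
    total = substrate_mmol + product_mmol
    if total == 0:
        raise ValueError("no substrate or product detected")
    return product_mmol / total


# 172 mg/L is the reported maximum; 50 mg/L residual substrate is hypothetical.
print(f"{molar_conversion(50.0, 172.0):.1%}")
```

A mass-based ratio would understate the conversion, because each mole of substrate loses a two-carbon fragment (acetaldehyde) on the way to the product.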
[0616] Example 4. In vivo testing of esterases that catalyze the hydrolysis of (3E,7E)-homofarnesyl acetate to (3E,7E)-homofarnesol and acetic acid.
In this example, various esterases were tested in vivo for their ability to catalyze the hydrolysis of (3E,7E)-homofarnesyl acetate to (3E,7E)-homofarnesol and acetic acid.
[0617] Esterases were tested in a two-plasmid system in E. coli DP1205. The first plasmid consisted of an operon containing three cDNAs encoding:
- the enal cleavage enzyme SCH94-03944 (SEQ ID NO: 22), that is, a protein containing the GXWXG (SEQ ID NO: 263) and DUF4334 domains described in International Publication No. 2021/005097, which cleaves (2E,6E,10E)-geranylgeranial to (5E,9E)-farnesylacetone and acetaldehyde;
- PgpB (SEQ ID NO: 3), a phosphatase from Escherichia coli (GenBank: WP_089622241.1) that converts (2E,6E,10E)-geranylgeranyl diphosphate to (2E,6E,10E)-geranylgeraniol by cleaving the diphosphate; and
- CcrGGPPS2-del57 (SEQ ID NO: 1), i.e., a truncated form of the (2E,6E,10E)-geranylgeranyl diphosphate synthase from Cistus creticus (GenBank: AAM21639.1).
[0618] The cDNAs encoding SCH94-03944, PgpB, and CcrGGPPS2-del57 were codon-optimized for expression in Escherichia coli (SEQ ID NOs: 113, 93, and 90) and contained an upstream RBS sequence (aaggaggtaaaaaa) (SEQ ID NO: 264). The operon was synthesized and cloned into the pJ401 expression plasmid (ATUM, Newark, California).
[0619] The second operon consisted of three cDNAs encoding:
- PsAerADH (SEQ ID NO: 11), i.e., an alcohol dehydrogenase from Pseudomonas aeruginosa (GenBank: WP_079868259.1) that is able to oxidize (2E,6E,10E)-geranylgeraniol to (2E,6E,10E)-geranylgeranial;
- SCH24-BVMO1 (SEQ ID NO: 23), i.e., a Baeyer-Villiger monooxygenase (BVMO) described in International Publication No. 2021/005097, which oxidizes (5E,9E)-farnesylacetone to (3E,7E)-homofarnesyl acetate; and
- a candidate esterase.
[0620] The cDNAs encoding PsAerADH, SCH24-BVMO1, and the candidate esterases were codon-optimized for expression in Escherichia coli (SEQ ID NOs: 102 and 115) and contained an upstream RBS sequence (aaggaggtaaaaaa) (SEQ ID NO: 264). The operon was synthesized and cloned into the pJ424 expression plasmid (ATUM, Newark, California).
[0621] DP1205 Escherichia coli (E. coli) cells were co-transformed with plasmids pJ401-SCH94-03944-PgpB-CcrGGPPS2-del57 and pJ424-PsAerADH-SCH24-BVMO1-esterase. In the resulting strains, it was found that esterases SCH24-EST1 (SEQ ID NO: 27) from Filobasidium magnum and SCH23-EST1 (SEQ ID NO: 28) from Hyphozyma roseonigra could convert (3E,7E)-homofarnesyl acetate to (3E,7E)-homofarnesol.
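The two-plasmid system of this example can be summarized as an ordered list of enzymatic steps. The following is a minimal sketch for orientation only: enzyme names, substrates, products, and plasmid assignments are taken from the description above, while the `Step` class and the helper functions are hypothetical names introduced for illustration. The IPP and DMAPP starting materials for the synthase step are those named in claim 15(f).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    enzyme: str
    plasmid: str
    substrate: str
    product: str


# Steps listed in pathway order (not in operon order).
PATHWAY = [
    Step("CcrGGPPS2-del57", "pJ401", "IPP + DMAPP", "geranylgeranyl diphosphate"),
    Step("PgpB", "pJ401", "geranylgeranyl diphosphate", "geranylgeraniol"),
    Step("PsAerADH", "pJ424", "geranylgeraniol", "geranylgeranial"),
    Step("SCH94-03944", "pJ401", "geranylgeranial", "farnesylacetone"),
    Step("SCH24-BVMO1", "pJ424", "farnesylacetone", "homofarnesyl acetate"),
    Step("candidate esterase", "pJ424", "homofarnesyl acetate", "homofarnesol"),
]


def is_contiguous(steps):
    """True if every step's product is the substrate of the next step."""
    return all(a.product == b.substrate for a, b in zip(steps, steps[1:]))


def enzymes_on(plasmid, steps=PATHWAY):
    """Enzymes encoded on the given plasmid backbone, in pathway order."""
    return [s.enzyme for s in steps if s.plasmid == plasmid]


print(is_contiguous(PATHWAY))
print(enzymes_on("pJ424"))
```

Laid out this way, it is easy to see that the pathway alternates between the two plasmids, so the final product can only form in cells that carry both.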
[0622] Example 5. In vivo screening of various phosphatases that catalyze the hydrolysis of (2E,6E,10E)-geranylgeranyl diphosphate to (2E,6E,10E)-geranylgeraniol.
In this example, phosphatases from different protein families were tested in vivo for their ability to catalyze the hydrolysis of (2E,6E,10E)-geranylgeranyl diphosphate to (2E,6E,10E)-geranylgeraniol and diphosphate. The phosphatases were tested in DP1205 Escherichia coli (E. coli) cells carrying a geranylgeranyl diphosphate synthase gene on an expression vector. Codon-optimized versions of the DNA fragments encoding the phosphatases, each cloned into a second expression vector, were then introduced into the strain by transformation. The inventors identified eight phosphatases that were active on (2E,6E,10E)-geranylgeranyl diphosphate and produced (2E,6E,10E)-geranylgeraniol under conditions that enabled terpene production. These results are shown in Table 3. The phosphatases PgpB (SEQ ID NO: 3), PeSubTPP1 (SEQ ID NO: 7), and TalVeTPP (SEQ ID NO: 8) ...
Claims
1. A method for preparing a compound of formula (I) 【Chemistry 1】, in the form of any one of its stereoisomers or a mixture thereof, the method comprising: (i) contacting a compound of formula (VI) 【Chemistry 2】, in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having terpene cyclase enzyme activity to produce the compound of formula (I).
2. The method according to claim 1, wherein more than 97% of the compound of formula (I) is in the form of formula (Ia) 【Chemistry 3】 and/or formula (Ib) 【Chemistry 4】.
3. The method according to claim 1 or 2, wherein the compound of formula (VI) is in the form of formula (VIa) 【Chemistry 5】.
4. The method according to any one of claims 1 to 3, wherein the polypeptide having terpene cyclase enzyme activity is a meroterpenoid cyclase enzyme and/or a squalene cyclase enzyme.
5. The method according to claim 4, wherein the meroterpenoid cyclase enzyme is at least one polypeptide selected from:
(a) a bacterial membrane-integrated meroterpenoid cyclase comprising at least one amino acid motif selected from [W]xxx[D]xx[ILVMN] (SEQ ID NO: 254); PxxAxxxNxxWE (SEQ ID NO: 255); MxxxFxxMLxxR (SEQ ID NO: 256); and RxxxxGQS (SEQ ID NO: 257);
(b) a fungal membrane-integrated meroterpenoid cyclase comprising at least one amino acid motif selected from [WY]Exx[YFW] (SEQ ID NO: 258); and [DNE]xSYxxP (SEQ ID NO: 259); and
(c) a bacterial soluble meroterpenoid cyclase comprising at least one amino acid motif selected from GxWxxxW[WG]xxxxY (SEQ ID NO: 260); WxxxHxxV[TSA] (SEQ ID NO: 261); and GxWxD[FY] (SEQ ID NO: 262),
wherein each residue x independently represents any natural amino acid residue.
6. The method according to claim 4 or 5, wherein the meroterpenoid cyclase enzyme is a membrane-integrated meroterpenoid cyclase enzyme.
7. The method according to any one of claims 4 to 6, wherein the meroterpenoid cyclase enzyme is a membrane-integrated meroterpenoid cyclase enzyme having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 50 to 73 and 280 to 289.
8. The method according to claim 6 or 7, wherein the enzyme preferably produces a compound of formula (I) in the form of formula (Ia).
9. The method according to claim 4 or 5, wherein the meroterpenoid cyclase enzyme is a soluble meroterpenoid cyclase enzyme.
10. The method according to any one of claims 4, 5, and 9, wherein the meroterpenoid cyclase enzyme is a soluble meroterpenoid cyclase enzyme having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with respect to either SEQ ID NO: 74 or 75.
11. The method according to claim 9 or 10, wherein the enzyme preferably produces a compound of formula (I) in the form of formula (Ib).
12. The method according to any one of claims 1 to 5, wherein the polypeptide having terpene cyclase enzyme activity is a meroterpenoid cyclase enzyme having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 50 to 75 and 280 to 289, preferably with any of SEQ ID NOs: 57, 71, 74, 280, 281, 282, 283, 286, 287, and 288.
13. The method according to claim 4, wherein the squalene cyclase enzyme comprises at least one motif selected from [SP][TP][VIL]WDTx[LWI] (SEQ ID NO: 247), PGG[WF][GYA]F (SEQ ID NO: 248), PDxDD[TAS][TIAS] (SEQ ID NO: 249), [MIL]QxxxG[GA][WF]x[AS][FY] (SEQ ID NO: 250), Qxxx[GH]xWxG[RK]WGxx[YF]xYG (SEQ ID NO: 251), Qxx[DN]G[GS][WF][GS]ExxxxS (SEQ ID NO: 252), and [STA]xx[SFN][QC]T[AGT]W[AS][LIV]xx[LQ] (SEQ ID NO: 253), wherein each residue x independently represents any natural amino acid residue.
14. The method according to claim 4 or 13, wherein the squalene cyclase enzyme has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 29-49 and 265-279, preferably with any of SEQ ID NOs: 29-49, 265-274, and 276-279.
15. The method according to any one of claims 1 to 14, further comprising one or more steps prior to step (i), the steps being:
(a) contacting a compound of formula (V) 【Chemistry 6】, in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having esterase enzyme activity to produce the compound of formula (VI);
(b) contacting a compound of formula (IV) 【Chemistry 7】, in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having Baeyer-Villiger monooxygenase (BVMO) enzyme activity to produce the compound of formula (V);
(c) contacting a compound of formula (III) 【Chemistry 8】, in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having enal cleavage enzyme activity to produce the compound of formula (IV);
(d) contacting a compound of formula (II) 【Chemistry 9】, in the form of any one of its stereoisomers or a mixture thereof, with a polypeptide having alcohol dehydrogenase (ADH) enzyme activity to produce the compound of formula (III);
(e) producing the compound of formula (II) from geranylgeranyl diphosphate (GGPP) using one or more polypeptides having phosphatase enzyme activity; and/or
(f) producing GGPP from isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP) using one or more polypeptides having prenyltransferase enzyme activity.
16. The method according to claim 15, wherein:
(a) the polypeptide having esterase enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with either SEQ ID NO: 27 or 28;
(b) the polypeptide having BVMO enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 23-26 and 216-227;
(c) the polypeptide having enal cleavage enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NO: 22;
(d) the polypeptide having ADH enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 11 to 21;
(e) the polypeptide having phosphatase enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any one of SEQ ID NOs: 3 to 10; and/or
(f) the polypeptide having prenyltransferase enzyme activity has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with SEQ ID NO: 1 or 2.
17. The method according to any one of claims 1 to 16, wherein the method is an in vivo method or a biotransformation method.
18. The method according to any one of claims 1 to 17, wherein the method is carried out in recombinant cells capable of functionally expressing a polypeptide having terpene cyclase enzyme activity as defined in any one of claims 4 to 14, and optionally one or more polypeptides as defined in claim 15 or 16.
19. The method according to claim 18, wherein the recombinant cells are bacterial cells, plant cells, or fungal cells such as yeast cells, and preferably the recombinant cells belong to the genus Escherichia, Saccharomyces, Yarrowia, or Pichia.
20. Recombinant cells containing, capable of producing, or producing a compound of formula (I), wherein more than 97% of the compound of formula (I) is in the form of formula (Ia) and/or (Ib).
21. The recombinant cell according to claim 20, wherein the cell comprises a polypeptide having terpene cyclase enzyme activity as defined in any one of claims 4 to 14.
22. A cell culture fermentation medium comprising recombinant cells according to claim 20 or 21.
23. A reaction mixture comprising a compound of formula (I), wherein more than 97% of the compound of formula (I) is in the form of formula (Ia) and/or (Ib).
24. A compound of formula (I) obtained or obtainable from recombinant cells according to claim 20 or 21, from a cell culture fermentation medium according to claim 22, or from a reaction mixture according to claim 23, by the method according to any one of claims 1 to 19.
25. A compound of formula (I), wherein more than 97% of the compound is in the form of formula (Ia) and/or (Ib).
26. Use of the compound of formula (I) according to claim 24 or 25 as a fragrance component.
27. Use of a meroterpenoid cyclase enzyme to produce the compound of formula (I) and/or its derivatives.
28. A variant meroterpenoid cyclase enzyme having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 56 to 70.
29. A variant meroterpenoid cyclase enzyme having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of SEQ ID NOs: 56-61, 69, and 70, wherein the variant meroterpenoid cyclase enzyme has an amino acid substitution at amino acid position 9 with respect to the sequence of SEQ ID NO: 51.
30. A variant squalene cyclase enzyme having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or more sequence identity with any of the sequences described in SEQ ID NOs: 29, 31, 33, 34, 36-38, 40, 41, 43-46, 48, 49, 265-274, and 276-279, wherein the polypeptide has the amino acid alanine at position 437 and the amino acid methionine at position 600 with respect to the sequence described in SEQ ID NO: 82.
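The amino acid motifs recited in claims 5 and 13 use a bracket notation in which [ABC] means any one of the listed residues and x means any natural amino acid. As an illustration only, such motifs can be translated into regular expressions and searched in a protein sequence. The sketch below uses the bacterial membrane-integrated motifs of claim 5(a); the function names and the test sequence are hypothetical, and restricting x to the 20 standard amino acids is an assumption about what "natural amino acid residue" covers.

```python
import re

# Assumption: "any natural amino acid" is read as the 20 standard residues.
ANY_RESIDUE = "[ACDEFGHIKLMNPQRSTVWY]"

# Motifs of claim 5(a), keyed by their sequence identifiers.
BACTERIAL_MEMBRANE_MOTIFS = {
    "SEQ ID NO: 254": "[W]xxx[D]xx[ILVMN]",
    "SEQ ID NO: 255": "PxxAxxxNxxWE",
    "SEQ ID NO: 256": "MxxxFxxMLxxR",
    "SEQ ID NO: 257": "RxxxxGQS",
}


def motif_to_regex(motif):
    """Compile a claim-style motif; bracket groups are already valid regex classes."""
    return re.compile(motif.replace("x", ANY_RESIDUE))


def matching_motifs(sequence, motifs):
    """Return the identifiers of all motifs found anywhere in the sequence."""
    seq = sequence.upper()
    return [name for name, motif in motifs.items()
            if motif_to_regex(motif).search(seq)]


# Hypothetical peptide, not taken from the sequence listing.
print(matching_motifs("MKTRAVLAGQSWE", BACTERIAL_MEMBRANE_MOTIFS))
```

A motif hit of this kind only shows that the pattern is present; it says nothing about the percent sequence identity thresholds that the claims recite separately.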