This notebook contains an analysis of the ecModel of E. coli for biotin production.
Benjamín J. Sánchez, 2019-10-29
We will use:
eciML1515, i.e. without proteomics and letting the model choose the required enzyme amounts (from a shared pool).
The models are available at:
(temporary; eventually the ecModel will be available in the master branch)
Model modifications for biotin production:
For the ecModel we will use normal FBA followed by pFBA minimizing glucose:
20% of the carbon will be shifted towards biotin.
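As a rough sketch of what a 20% carbon shift means in flux terms, we can convert a carbon share into a biotin flux by counting carbon atoms (6 in glucose, 10 in biotin). The function name and the uptake value below are illustrative assumptions, not the values used in this notebook:

```python
# Carbon atoms per molecule (glucose C6H12O6, biotin C10H16N2O3S).
CARBONS_GLUCOSE = 6
CARBONS_BIOTIN = 10


def biotin_flux_for_carbon_share(glucose_uptake, carbon_share):
    """Biotin flux that carries `carbon_share` of the carbon taken up as glucose.

    Both fluxes are in the same units (e.g. mmol/gDW/h).
    """
    if not 0.0 <= carbon_share <= 1.0:
        raise ValueError("carbon_share must be between 0 and 1")
    carbon_in = glucose_uptake * CARBONS_GLUCOSE
    return carbon_share * carbon_in / CARBONS_BIOTIN


# Hypothetical glucose uptake, for illustration only.
example_flux = biotin_flux_for_carbon_share(glucose_uptake=10.0, carbon_share=0.20)
print(f"{example_flux:.2f} mmol/gDW/h of biotin")
```

The resulting flux could then be used as a lower bound on the biotin exchange reaction, assuming glucose is the only carbon source.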
First, let's build a dataframe with all enzyme usages under both conditions. We read these from the fluxes of the
draw_prot_XXXXXX rxns, i.e. they are in units of
Let's make sure all values are positive:
One value was slightly negative, probably due to solver approximations. Let's change that:
There are a lot of rows with zero usage under both conditions, so let's filter them out:
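The two clean-up steps above (zeroing slightly negative values and dropping rows that are zero under both conditions) can be sketched with plain dictionaries. The enzyme ids and numbers are made up, and `clean_usages` is a hypothetical helper, not the notebook's own code:

```python
def clean_usages(usages, tol=0.0):
    """Clip negative usages to zero and drop enzymes unused in both conditions.

    `usages` maps an enzyme id to (usage_wild_type, usage_biotin).
    `tol` is the threshold at or below which a usage counts as zero.
    """
    cleaned = {}
    for enzyme, (wt, btn) in usages.items():
        # Small negative values are treated as solver noise.
        wt, btn = max(wt, 0.0), max(btn, 0.0)
        if wt > tol or btn > tol:
            cleaned[enzyme] = (wt, btn)
    return cleaned


# Hypothetical usages, for illustration only.
toy = {
    "draw_prot_A": (1.0e-5, 2.0e-5),
    "draw_prot_B": (-1.0e-12, 3.0e-6),  # slightly negative value
    "draw_prot_C": (0.0, 0.0),          # unused in both conditions
}
cleaned = clean_usages(toy)
print(cleaned)
```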
Now let's compute usage changes. We will look at both absolute changes (the difference between both conditions) and relative changes (the fold change or ratio between them).
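A minimal sketch of both metrics, assuming usages are stored as (wild type, biotin) pairs. Enzymes that are unused in the wild type get an infinite fold change here, which is one possible convention rather than necessarily the one used in this notebook; the ids and values are hypothetical:

```python
import math


def usage_changes(wt, btn):
    """Return (absolute change, fold change) between two usages."""
    absolute = btn - wt
    if wt > 0:
        fold = btn / wt
    elif btn > 0:
        fold = math.inf  # enzyme only used in the biotin condition
    else:
        fold = math.nan  # unused in both; such rows should already be filtered out
    return absolute, fold


# Hypothetical usages, for illustration only.
toy_usages = {"prot_A": (1.0, 4.0), "prot_B": (2.0, 1.0), "prot_C": (0.0, 0.5)}
changes = {e: usage_changes(wt, btn) for e, (wt, btn) in toy_usages.items()}

# Rank enzymes by each metric and keep at most the top 20.
top_by_fold = sorted(changes, key=lambda e: changes[e][1], reverse=True)[:20]
top_by_abs = sorted(changes, key=lambda e: changes[e][0], reverse=True)[:20]
print(top_by_fold, top_by_abs)
```

The two rankings can disagree: a lowly used enzyme may show a large fold change with a negligible absolute change, which is why both are worth inspecting.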
We can now sort and take a look at the top 20 enzymes that:
We will use the FSEOF approach from cameo on the ecModel to validate the results from the previous section:
We will replace reaction ids with gene names. First, we have to replace them with the associated gene ids:
And now using the metabolic model we can replace the labels with gene names:
Finally, we can sort based on the slope of increase in biotin production as each protein usage increases:
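To illustrate the ranking idea, here is a small sketch that fits a least-squares slope of each protein usage against the enforced biotin flux and sorts genes by it. This is our own simplified stand-in, not cameo's implementation, and the gene names and values are hypothetical:

```python
def least_squares_slope(x, y):
    """Slope of the least-squares line y = a + b*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical enforced biotin fluxes and protein usages at each level.
biotin_levels = [0.0, 1.0, 2.0, 3.0]
usage_by_gene = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 2.0, 2.0, 2.0],
    "geneC": [0.0, 3.0, 6.0, 9.0],
}

slopes = {g: least_squares_slope(biotin_levels, u) for g, u in usage_by_gene.items()}
ranked = sorted(slopes, key=slopes.get, reverse=True)
print(ranked)
```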
Overall we see consensus with the genes from section 3: in particular, all genes that cameo predicted to increase in expression as biotin production increases (the top 13 genes in the list) are also among the top 20 genes with the largest increase in relative usage. This validates 1) cameo as a tool for exploring ecModels, and 2) the genes that we had proposed originally.
For creating biotin, 1 sulfur is required, which is taken from a 2Fe-2S cluster.
Note that the net reaction of adding up all 3 is:
NET RXN: 2fe1s_c + cys__L_c --> 2fe2s_c + ala__L_c
The net reaction of adding up these 3 instead is:
NET RXN: 2fe1s_c + atp_c + h2o_c + cys__L_c --> adp_c + h_c + pi_c + 2fe2s_c + ala__L_c
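Net reactions like the two above can be checked by summing stoichiometric coefficients so that intermediates cancel out. In the sketch below the three steps use placeholder carrier and scaffold species (`carrier_c`, `carrier_S_c`, `scaffold_c`, `scaffold_2fe2s_c`) that are assumptions for illustration, not the model's actual reactions; only the net results are taken from this notebook:

```python
def net_reaction(steps):
    """Sum stoichiometries (negative = consumed) and drop species that cancel."""
    net = {}
    for step in steps:
        for met, coef in step.items():
            net[met] = net.get(met, 0) + coef
    return {met: coef for met, coef in net.items() if coef != 0}


# Placeholder three-step pathway (hypothetical intermediates).
sulfur_transfer = {"cys__L_c": -1, "carrier_c": -1, "ala__L_c": 1, "carrier_S_c": 1}
assembly = {"2fe1s_c": -1, "carrier_S_c": -1, "scaffold_c": -1,
            "scaffold_2fe2s_c": 1, "carrier_c": 1}
release = {"scaffold_2fe2s_c": -1, "2fe2s_c": 1, "scaffold_c": 1}

# Same release step, but coupled to ATP hydrolysis.
release_with_atp = dict(release, atp_c=-1, h2o_c=-1, adp_c=1, h_c=1, pi_c=1)

net_without_atp = net_reaction([sulfur_transfer, assembly, release])
net_with_atp = net_reaction([sulfur_transfer, assembly, release_with_atp])
print(net_without_atp)
print(net_with_atp)
```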
In both cases the sulfur is being provided by cysteine. However, ISC does not take up any energy whereas SUF uses ATP. It is interesting to notice that the ecModel approach still chose SUF as the best option to regenerate the cluster (as
suf genes pop up both in sections 3 & 4). This is due to faster kinetics:
We can see that the speed of
S2FE2SR is much higher than
I2FE2SR, hence the choice of the ecModel to switch strategy.
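To make the kinetic argument concrete: in GECKO-style ecModels each enzyme enters its reaction as a pseudo-substrate with coefficient -1/kcat (kcat in 1/h), so a faster enzyme costs less protein per unit of flux. The sketch below recovers kcat from such a coefficient; the two coefficients are hypothetical and are not the values of S2FE2SR or I2FE2SR:

```python
SECONDS_PER_HOUR = 3600.0


def kcat_per_second(enzyme_coefficient):
    """Turnover number (1/s) from an enzyme coefficient of -1/kcat (kcat in 1/h)."""
    if enzyme_coefficient >= 0:
        raise ValueError("enzyme coefficients are expected to be negative")
    return -1.0 / enzyme_coefficient / SECONDS_PER_HOUR


def enzyme_needed(flux, enzyme_coefficient):
    """Enzyme amount required to carry `flux` through the reaction."""
    return flux * -enzyme_coefficient


# Hypothetical coefficients: a slow enzyme (1 1/s) and a fast one (50 1/s).
coef_slow = -1.0 / (1.0 * SECONDS_PER_HOUR)
coef_fast = -1.0 / (50.0 * SECONDS_PER_HOUR)

cost_ratio = enzyme_needed(1.0, coef_slow) / enzyme_needed(1.0, coef_fast)
print(kcat_per_second(coef_slow), kcat_per_second(coef_fast), cost_ratio)
```

With a limited shared protein pool, a large enough difference in enzyme cost can outweigh the extra ATP spent by the faster route.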
It is not fully known whether this 2Fe-2S regeneration actually occurs, or whether a whole 2Fe-2S molecule is instead used up for each biotin molecule [citation needed]. We can test how predictions are affected when there is no regeneration by modifying the
BTS5 reaction to consume the 2Fe-2S molecule:
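One way to picture this change is on a plain stoichiometry dictionary: removing the recyclable `2fe1s_c` product means the cluster can no longer be regenerated. The `BTS5` stoichiometry below is our assumption of how the reaction looks and should be checked against the model, and `without_recycled_product` is a hypothetical helper:

```python
# Assumed BTS5 stoichiometry (negative = consumed); verify against the model.
bts5 = {
    "2fe2s_c": -1, "amet_c": -1, "dtbt_c": -1,
    "2fe1s_c": 1, "btn_c": 1, "dad_5_c": 1, "h_c": 1, "met__L_c": 1,
}


def without_recycled_product(stoich, recycled="2fe1s_c"):
    """Return a copy of the reaction that no longer releases `recycled`."""
    if stoich.get(recycled, 0) <= 0:
        raise ValueError(f"{recycled} is not a product of this reaction")
    return {met: coef for met, coef in stoich.items() if met != recycled}


bts5_no_regeneration = without_recycled_product(bts5)
print(bts5_no_regeneration)
```

On a COBRApy reaction object, a comparable edit could be made with `Reaction.subtract_metabolites`.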
Let's check if the model can grow:
Growth seems unaffected. We will then repeat the cameo analysis for the modified ecModel:
We see the same over-expression candidates from before, except for
sufD. These are the genes in charge of:
which is not needed anymore as the regeneration cycle is not active.
sufE are still included though, as they also catalyze:
which is still needed for producing 2Fe-2S. We can look at all reactions that are used to produce 2Fe-2S by optimizing only for that molecule's production:
We see that the sulfur is still coming from cysteine, and the iron comes directly from the extracellular medium.