Protein translocons in photosynthetic organelles of Paulinella chromatophora

The rhizarian amoeba Paulinella chromatophora harbors two photosynthetic cyanobacterial endosymbionts (chromatophores), acquired independently of the primary plastids of glaucophytes, red algae and green plants. These endosymbionts have lost many essential genes and transferred a substantial number of genes to the host nuclear genome via endosymbiotic gene transfer (EGT), including those involved in photosynthesis. This indicates that, similarly to primary plastids, Paulinella endosymbionts must have evolved a transport system to import their EGT-derived proteins. This system involves vesicular trafficking to the outer chromatophore membrane and presumably a simplified Tic-like complex at the inner chromatophore membrane. Since both sequenced Paulinella strains have been shown to undergo differential plastid gene losses, they need not possess the same set of Toc and Tic homologs. We searched the genome of the Paulinella FK01 strain for potential Toc and Tic homologs, and compared the results with the data obtained for the Paulinella CCAC 0185 strain, 72 cyanobacteria, eight Archaeplastida and some other bacteria. Our studies revealed that chromatophore genomes from both Paulinella strains encode the same set of translocons that could potentially create a simplified but fully functional Tic-like complex at the inner chromatophore membranes. The common maintenance of the same set of translocon proteins in two Paulinella strains suggests a similar import mechanism and/or supports the proposed model of protein import. Moreover, we have discovered a new putative Tic component, Tic62, a redox sensor protein not identified in previous comparative studies of Paulinella translocons.


Introduction
The primary plastid endosymbiosis is one of the most important transitions in the evolution of life on our planet. It took place more than 1.5 billion years ago, when a phagotrophic eukaryote, the ancestor of glaucophytes, red algae and green plants, enslaved a β-cyanobacterium, which was transformed into a two-membrane photosynthetic primary plastid [1,2]. In order to become a true cell organelle, the prokaryote underwent a tremendous transformation that involved two key processes: (i) endosymbiotic gene transfer (EGT), i.e. gene transfer from the endosymbiont to the host nuclear genome, and (ii) the origin of an import machinery in its envelope membranes for host genome-encoded proteins, including the products of EGT-derived genes [3][4][5].
Interestingly, primary plastid endosymbiosis was not a singular event in the history of our biosphere but has happened at least twice, thereby challenging the paradigm that endosymbiont-to-organelle transformation is an exceptionally rare evolutionary phenomenon [6]. It took place a second time in Paulinella chromatophora, a testate filose amoeba belonging to the supergroup Rhizaria, a lineage that is very distantly related to the primary plastid-bearing lineages, i.e. glaucophytes, red algae and green plants, united in the supergroup Archaeplastida [7,8].
Paulinella chromatophora harbors two photosynthetic cyanobacterial endosymbionts (chromatophores), acquired independently of Archaeplastida primary plastids ~60 million years ago [8][9][10][11]. Similarly to primary plastids, chromatophores are surrounded by a two-membrane envelope and are deeply integrated with their host cell. They divide synchronously with the Paulinella cell (after being distributed to daughter cells), exchange metabolites with it and are incapable of independent life [9,12,13]. They are especially similar in structure to glaucophyte primary plastids (cyanelles) because they still retain the peptidoglycan wall in the intermembrane space, a clear sign of their bacterial inheritance [12]. The intimacy between the Paulinella host and its endosymbiotic bodies is well illustrated by the substantial reduction of the chromatophore genome, which was sequenced for two Paulinella strains, CCAC 0185 [11] and FK01 [14]. These genomes were reduced in size and coding capacity by a factor of three (to 1 Mb and ~900 genes) compared with their closest free-living relative, the cyanobacterium Cyanobium gracile PCC 6307 (~3 Mb and ~3300 genes) [11,14] (Fig. 1). The genome-size reduction involved the loss of many genes, including those engaged in essential biosynthetic pathways, as well as endosymbiotic gene transfer [11,14]. Transcriptome analyses have identified more than 30 EGT-derived genes in the Paulinella nuclear genome [14][15][16]; however, many more are expected: Nowack et al. [16] estimated their number at between 40 and 125. Because the lost genes are of the utmost importance for cell functioning, and at least some of the transferred ones are transcriptionally regulated by the host and involved in photosynthesis or photosynthesis-related processes, Paulinella chromatophores are expected to have evolved a protein import route, or routes.
To check whether the chromatophore protein import system is similar to the primary plastid translocons at the outer and inner envelope membranes, Toc and Tic, respectively, Bodył et al. [17] searched for homologs of Toc and Tic genes in the completely sequenced chromatophore genome of the Paulinella CCAC 0185 strain. They found that the Paulinella chromatophore genome encodes homologs of Toc12, Toc64, Tic21 and Tic32, but has lost those of Toc75, Tic20, Tic55 and Tic62. They suggested that the missing genes, which are still present in the genomes of closely related cyanobacteria, were most probably relocated via EGT to the host nuclear genome, and that their products are now imported into the photosynthetic bodies' membranes to create a Toc-Tic-like import machinery [17]. Mackiewicz and Bodył [18] also found that an EGT-derived gene encoding subunit IV of the PS I reaction center (PsaE) from the Paulinella FK01 strain has a clear signal peptide predicted from an alternative initiation site. Since signal peptides are involved in protein import into the endoplasmic reticulum, such a presequence in the case of PsaE indicates the possibility of endomembrane-dependent targeting to the outer chromatophore membrane. Subsequent studies confirmed the presence of a signal peptide-like presequence in another four of nine investigated EGT-derived proteins, which evolved by modification of their N-terminal mature parts, but did not find the missing Toc and Tic components in the nuclear genome [16,19].
On the basis of their bioinformatics analyses, Mackiewicz et al. [20] formulated a model for protein import into Paulinella chromatophores. According to the model, Paulinella EGT-derived proteins use an endomembrane-dependent mechanism, vesicular trafficking, to cross the outer chromatophore membrane, while the inner membrane is crossed via a simplified Tic-like complex. All investigated EGT-derived proteins presumed to follow this pathway are characterized by low molecular weight and nearly neutral charge, which constitute good adaptations to their passage through the peptidoglycan wall still present in the chromatophores. Mackiewicz et al. [20] also identified additional putative elements of the system, for example chaperones in the intermembrane space and components of the molecular motor responsible for pulling the imported proteins into the chromatophore stroma. It should be noted that the part of the model concerning vesicular trafficking to the outer chromatophore membrane has recently been confirmed by Nowack et al. [21]. They used immunogold labeling electron microscopy to show that PsaE, PsaK1 and PsaK2 from the Paulinella CCAC 0185 strain are localized both in chromatophores and in the endomembrane system, including the Golgi apparatus. They also indicated that these proteins are indeed PS I subunits, as they associate with PS I components. Interestingly, Nowack et al. [21] revealed that the three proteins do not have typical signal peptides at their N-termini. In the case of PsaE, some internal targeting information was suggested by Mackiewicz et al. [20], whereas for the PsaK proteins algorithms recognized uncleavable signal peptides in their N-terminal hydrophobic domains, suggesting that their N-termini might fulfill a signal peptide role in protein import after all [19].
Both Paulinella strains have undergone differential plastid gene losses [14,22] and have even been suggested to represent two different species on the basis of some morphological and phylogenetic analyses [8]. It is therefore interesting to check whether they possess the same set of potential Toc and Tic homologs. The common maintenance of the same set of proteins would indicate a similar import mechanism and/or support the model proposed by Mackiewicz et al. [20]. To this end, we searched the Paulinella FK01 strain genome for Toc and Tic homologs, and compared the results with the data obtained for the Paulinella CCAC 0185 strain, 72 cyanobacteria, eight Archaeplastida, as well as some other bacteria. We carried out more sensitive homology and conserved domain searches than in previous approaches, which is justified by the short length and relatively high divergence of these proteins [17].

Material and methods
We downloaded 867 and 841 sequences of chromatophore-encoded proteins for the Paulinella CCAC 0185 strain from GenBank [23] and for the Paulinella FK01 strain from Debashish Bhattacharya's Laboratory website (http://dblab.rutgers.edu/home/index.php), respectively. We also acquired proteomes of 72 completely sequenced cyanobacteria, eight Archaeplastida and two reference bacteria for comparative studies from GenBank [23]. On the basis of the obtained sequences, a local protein database was created. Sequences of 16S and 23S rRNA for phylogenetic studies were extracted from gbk files downloaded from GenBank (genome database) for the appropriate organisms. This set included sequences derived from two Paulinella chromatophora genomes, 72 completely sequenced cyanobacterial genomes, eight primary plastid genomes and five genomes representing different bacterial lineages as an outgroup.
To find potential Toc-Tic components, we searched the database using Pisum sativum translocons as query sequences with PsiBlast (E-value set to 0.01, word size to two, filtering for low complexity regions on, five iterations) to achieve better sensitivity than with standard BLAST or FASTA [24]. In the case of Tic21, a homolog from Arabidopsis thaliana was used, as Tic21 is absent from P. sativum [25]. The obtained candidates were verified in terms of domain content by searching the CDD database with E-value < 0.01 (Tab. 1) [26]. Alignments of Toc12, Tic21, Tic32 and Tic62 from P. sativum or A. thaliana and two strains of P. chromatophora, CCAC 0185 and FK01, were performed in PSI-Coffee [27], a slow and accurate algorithm dedicated to distantly related proteins that uses profile information. The alignments were visualized and edited in Jalview 2.4.0.b2 [28]. The N-terminal transmembrane β-barrel regions were predicted with BOCTOPUS [29], whereas α-helical transmembrane regions were predicted with TopPred [30] and TMpred [31].
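The search settings described above can be sketched in code. The following is only an illustrative sketch, not the invocation actually used in this study: it assumes the BLAST+ `psiblast` executable, it maps "E-value set to 0.01" onto the `-evalue` option (the inclusion threshold `-inclusion_ethresh` is an equally plausible reading), and the file names, database name and hit list are hypothetical.

```python
from typing import List, Tuple


def build_psiblast_command(query_fasta: str, database: str, out_path: str,
                           evalue: float = 0.01, word_size: int = 2,
                           iterations: int = 5) -> List[str]:
    """Assemble a BLAST+ psiblast argument list mirroring the stated settings.

    Assumption: the paper does not give the command line, so the mapping of
    settings to options is a guess made for illustration only.
    """
    return [
        "psiblast",
        "-query", query_fasta,          # e.g. a Pisum sativum translocon sequence
        "-db", database,                # local protein database
        "-evalue", str(evalue),         # E-value threshold of 0.01
        "-word_size", str(word_size),   # word size of two
        "-seg", "yes",                  # low-complexity filtering on
        "-num_iterations", str(iterations),
        "-outfmt", "6",                 # tabular output
        "-out", out_path,
    ]


def filter_hits(hits: List[Tuple[str, float]],
                max_evalue: float = 0.01) -> List[Tuple[str, float]]:
    """Keep (subject_id, evalue) pairs with E-value below the threshold."""
    return [(subject, e) for subject, e in hits if e < max_evalue]


# Hypothetical file names and hits, used only to show the calls.
command = build_psiblast_command("tic21_query.fasta", "local_proteins", "tic21_hits.tsv")
kept = filter_hits([("candidate_A", 1e-5), ("candidate_B", 0.5)])
```

The command list could be passed to `subprocess.run` on a system where BLAST+ is installed; the same threshold filter applies to the conserved-domain verification step, which used E-value < 0.01.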
Phylogenetic trees for the concatenated alignment of 16S and 23S rRNA were inferred by the maximum likelihood (ML) method in TreeFinder [32], the Bayesian approach in MrBayes 3.2.1 [33] and the minimum evolution (ME) method based on logDet/paralinear distance in PAUP [34]. In the TreeFinder and MrBayes approaches, we used separate models of nucleotide substitution for each type of rRNA. Two different GTR+Γ(5) models, one for each data partition, were applied in the ML method, as suggested by the Propose Model module in TreeFinder considering all criteria (−lnL, AIC, AICc, BIC, HQ). In the MrBayes analyses, we assumed two separate mixed+I+Γ(5) models, one for each rRNA partition, to sample appropriate models across the substitution model space in the Bayesian MCMC analysis itself, avoiding the need for a priori model testing [35]. In TreeFinder, a search depth of 2 was applied, whereas in PAUP the final tree was searched from 10 starting trees obtained by stepwise and random sequence addition followed by the tree-bisection-reconnection (TBR) branch-swapping algorithm. To assess the significance of particular branches, non-parametric bootstrap analyses were performed on 1000 replicates in these two methods. Additionally, we applied the local rearrangements-expected likelihood weights (LR-ELW) method in TreeFinder. In the MrBayes analyses, two independent runs starting from random trees were applied, each using eight Markov chains. Trees were sampled every 100 generations for 10 000 000 generations. In the final analysis, we selected trees from the last 2 984 000 generations, which had reached the stationary phase and convergence (i.e. the standard deviation of split frequencies stabilized and was lower than the proposed threshold of 0.01). The temperature parameter for heating the chains was suitably adjusted to keep the proportions of successful state exchanges between chains close to the suggested range from 0.10 to 0.70.
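The sampling scheme of the Bayesian analysis implies a specific number of retained trees, which a short calculation makes explicit. This is a sketch using only the figures given above; the variable and function names are illustrative, and the counts ignore the initial tree at generation 0, so actual MrBayes output may differ by one tree per run.

```python
def sampled_trees(generations: int, sample_every: int) -> int:
    """Number of trees written when sampling once every `sample_every` generations."""
    return generations // sample_every


TOTAL_GENERATIONS = 10_000_000    # length of each MCMC run
SAMPLE_EVERY = 100                # sampling frequency
RETAINED_GENERATIONS = 2_984_000  # final, stationary part of each run
RUNS = 2                          # independent runs

total_per_run = sampled_trees(TOTAL_GENERATIONS, SAMPLE_EVERY)
retained_per_run = sampled_trees(RETAINED_GENERATIONS, SAMPLE_EVERY)
retained_total = retained_per_run * RUNS

# Fraction of each run discarded before the retained window (burn-in).
burnin_fraction = 1 - RETAINED_GENERATIONS / TOTAL_GENERATIONS

print(f"trees sampled per run:  {total_per_run}")
print(f"trees retained per run: {retained_per_run}")
print(f"trees retained overall: {retained_total}")
print(f"burn-in fraction:       {burnin_fraction:.4f}")
```

Under these assumptions, each run yields 100 000 sampled trees, of which 29 840 are retained, so roughly 70% of every run is treated as burn-in.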

Results and discussion
We found that both Paulinella chromatophore genomes encode the same set of sequence homologs of Toc12, Toc34, Toc64, Toc159, Tic21, Tic32 and Tic62 but have lost those of Toc75, Tic20, Tic22 and Tic55 (Fig. 2). Interestingly, the missing genes are still present in the α-cyanobacterial genomes (Cyanobium gracile, Prochlorococcus marinus and Synechococcus CC9311, CC9605, CC9902 and WH 7803) closely related to Paulinella (Fig. 1). This indicates that they might have been lost during chromatophore genome reduction or transferred to the nucleus. An interesting case is the Tic22 homolog, which is absent not only from Paulinella but also from all α-cyanobacteria grouped with Paulinella except for Synechococcus RCC307 (Fig. 1). The latter α-cyanobacterium is placed at the basal position of the clade, indicating that the gene encoding Tic22 must have been lost after the divergence of this basal lineage from the other α-cyanobacteria. It should be mentioned that the Tic22 homolog is present in all other studied cyanobacterial taxa, more than 50 in number (Fig. 2). Independent losses also concern the gene for the Tic62 homolog in the cyanobacterium Atelocyanobacterium thalassa, Tic21 in the land plant Pisum sativum, as well as Tic55 in the cyanobacterium Synechococcus sp. JA-3-3Ab and the red alga Cyanidioschyzon merolae. The gene for Tic21 is also absent from two studied Gloeobacter species, which often take a basal position to the rest of the cyanobacteria in phylogenetic trees [36][37][38], suggesting that the lack of this gene might have characterized the ancestor of all cyanobacteria.
Tab. 1 Domains characteristic of Toc and Tic proteins from Pisum sativum and Arabidopsis thaliana, which were used to predict their homologs in Paulinella, cyanobacteria and bacteria proteomes.
Fig. 1 MrBayes tree based on 16S and 23S rRNA alignment. An independent origin of Paulinella endosymbionts from primary plastids of red algae, glaucophytes and green plants is clearly visible. The former are descendants of α-cyanobacteria whereas the latter are descendants of β-cyanobacteria. Numbers at nodes, in the presented order, correspond to posterior probabilities estimated in MrBayes (MB), support values calculated by the local rearrangements-expected likelihood weights method (LR), as well as bootstrap percentages calculated in TreeFinder (BP) and in PAUP using the minimum evolution method (ME). Values of the posterior probabilities and bootstrap percentages lower than 0.50 and 50%, respectively, were omitted or indicated by a dash "-". The length of branches leading to bacterial taxa was shortened by 50% in comparison to their original length because of a very high substitution rate.
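The comparison of the two strains reported above amounts to simple set operations on presence/absence data. The sketch below encodes only the components named in this paragraph; the dictionary layout and variable names are illustrative assumptions, not the authors' data format.

```python
# Toc/Tic components discussed in the text (not an exhaustive list of translocon subunits).
ALL_COMPONENTS = {
    "Toc12", "Toc34", "Toc64", "Toc75", "Toc159",
    "Tic20", "Tic21", "Tic22", "Tic32", "Tic55", "Tic62",
}

# Sequence homologs reported for each chromatophore genome.
detected = {
    "CCAC 0185": {"Toc12", "Toc34", "Toc64", "Toc159", "Tic21", "Tic32", "Tic62"},
    "FK01": {"Toc12", "Toc34", "Toc64", "Toc159", "Tic21", "Tic32", "Tic62"},
}

# Components present in both strains.
shared = set.intersection(*detected.values())

# Components found in only one of the strains.
strain_specific = set.union(*detected.values()) - shared

# Components missing from both chromatophore genomes.
missing = ALL_COMPONENTS - set.union(*detected.values())

print("shared:         ", sorted(shared))
print("strain-specific:", sorted(strain_specific))
print("missing:        ", sorted(missing))
```

An empty strain-specific set corresponds to the statement that both chromatophore genomes encode the same set of homologs, and the missing set lists the four components absent from both.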
Because no homolog of the crucial outer membrane translocon Toc75 was found in the two chromatophore genomes or in the Paulinella nuclear genome [16], the discovery of homologs of its two receptors, Toc34 and Toc159, was quite unexpected (Fig. 2). However, we do not consider these proteins to be components of a protein import machinery without Toc75, especially taking into account their relatively weak E-values in both PsiBlast and CDD analyses (Fig. 2). They may be distantly related to the plastid Toc components but they do not seem to be their functional homologs. Interestingly, the quite significant E-values obtained for some cyanobacterial Toc34/Toc159 homologs suggest that these proteins might be of cyanobacterial origin. This contrasts with the prevailing idea that they are derived from an ancient eukaryotic GTPase, with Toc159 arising from a duplication of Toc34 [39][40][41]. A thorough phylogenetic investigation is, however, necessary to clarify the issue, which at present should be considered only a hypothesis.
We also found sequences significantly similar to Pisum sativum Toc64 in both Paulinella chromatophore genomes and in bacterial genomes (Fig. 2). However, they should not be considered its true functional homologs because none of them has the important tetratricopeptide repeat (TPR) domain, which is responsible for Hsp90 docking, and therefore they cannot fulfil the Toc64 function. The gene for Toc64 encoding the TPR domain is also absent from glaucophytes, rhodophytes and the green alga Chlamydomonas reinhardtii but present in the other green alga Chlorella variabilis, some prasinophytes and land plants (Fig. 2 and data not shown). This indicates that the gene is a late acquisition in the green lineage.
In contrast to Toc64, homologs of Toc12, Tic21 and Tic32 in both Paulinella strains contain all the essential domains present in their plant counterparts (Fig. 3). Paulinella Tic21, with its four well-predicted α-helical transmembrane regions, is suggested to create a protein-conducting channel at the inner chromatophore membrane and can be involved in protein import. Such a function of plant Tic21 was recently reported by Kikuchi et al. [42] and Hirabayashi et al. [43], and Lv et al. [44] demonstrated the functional compatibility of plant and cyanobacterial Tic21 by showing that the knockout mutant of Arabidopsis tic21 was rescued by the Synechocystis ortholog. Tic32 homologs, which possess dehydrogenase and calmodulin-binding domains, probably regulate the import via redox sensing [20]. Interestingly, Toc12 homologs, instead of the N-terminal transmembrane β-barrel domains typical of plant proteins, comprise C-terminal transmembrane α-helical regions. Toc12 is one of the four chromatophore-encoded Hsp40 (DnaJ) proteins and a putative component of the molecular motor responsible for pulling imported proteins into the chromatophore stroma [20].
A new finding of our analyses was the identification of potential homologs of Tic62 in Paulinella. In Pisum sativum, Tic62 consists of two characteristic regions: the N-terminal part with a highly conserved NAD(P)-binding domain and the C-terminal part with binding sites for ferredoxin-NAD(P)-oxidoreductase (FNR; Fig. 3) [45]. A similar N-terminal organization was shown for Paulinella Tic62 homologs, but the C-terminal region with FNR-interacting repeats is missing from both Paulinella proteins (Fig. 3). However, the C-terminal region, which apparently evolved only in vascular plants, is not necessary for binding to the Tic translocon/inner membrane. The region responsible for mediating binding to the Tic complex is localized in the central part of Pisum sativum Tic62 [46], whereas the C-terminal region allows for specific and strong binding of ferredoxin-NAD(P)-oxidoreductase molecules, especially when Tic62 is present at the stroma lamellae of the thylakoid membrane [47]. The reversible interaction with the Tic complex/inner membrane is supposed to be mediated by some hydrophobic contacts [48]. One or two transmembrane domains were proposed in Pisum sativum Tic62 by Küchler et al. [45]. In agreement with that, we predicted two such domains using TopPred [30] and TMpred [31]. Interestingly, we also identified one potential transmembrane domain in each of the Paulinella Tic62 proteins using these two programs (Fig. 3).
We suggest that the Paulinella Tic62 homolog is a new component of the simplified but fully functional Tic-like translocon at the inner chromatophore membrane, composed, besides Tic62, of Tic21, Tic32 and Toc12 (Hsp40; Fig. 4). The fact that these proteins possess all the domains essential to fulfil the task further supports the existence of a Tic complex at the inner chromatophore membrane. Because Tic55, the third redox-sensing protein [49][50][51], has not been discovered in the chromatophore genome in the previous [17] or the present study, or in the Paulinella nuclear genome [16], we suggest that Tic32 and Tic62 perform the task of redox sensing on their own or with some unknown partner.
An interesting argument for the functioning of a simplified Tic system was provided by Kikuchi et al. [52]. They suggested that the primordial Archaeplastida Tic complex was probably based on Tic20. Similarly to Tic21 and the mitochondrial translocon at the inner membrane Tim23/Tim17, Tic20 contains four well-predicted α-helical transmembrane domains capable of forming a protein-conducting channel, and therefore, together with Tic21 and Tim23/Tim17, represents a good example of parallel evolution of import functions [42]. Since a Tic20-based system might have functioned in the ancestor of Archaeplastida, it is easy to imagine the possibility of an analogous system based on Tic21 in the case of Paulinella. In support of this, protein import via Tic21 was experimentally demonstrated in plants [42,43]. Although additional core translocon subunits of eukaryotic origin, such as Tic214, Tic100 and Tic56, were added to Tic20 in the 'green' lineage, they are missing from basal Archaeplastida, glaucophytes and red algae [52]. This suggests that a simple Tic apparatus is responsible for protein import into primary plastids of glaucophytes and red algae and probably Paulinella chromatophora as well. Kikuchi et al. [52] also showed that eukaryote-derived Tic110, previously considered the main translocation pore, is not a component of the complex consisting of Tic20, Tic214, Tic100 and Tic56, but plays the role of a scaffold for stromal molecular chaperones at a later stage during protein import [49,50].
Fig. 2 Distribution of Toc and Tic homologs in Paulinella chromatophora, cyanobacteria, Archaeplastida and two bacteria. The homologs were found by PsiBlast searches and verified for domain content by searching the conserved domain database (CDD). Only best hits with E-values obtained from CDD equal to or lower than 0.01 are indicated. The color scheme corresponds to the E-value range. Please note that the genomes of Paulinella chromatophores encode homologs of Toc12, Toc34, Toc159, Tic21, Tic32 and Tic62 but those of Toc75, Tic20, Tic22, Tic55 and Toc64 were lost. In the case of Toc64, significant homologous sequences were found; however, they did not contain a tetratricopeptide repeat domain, responsible for Hsp90 docking, and therefore we left this column blank.
Fig. 3 Alignment of Toc12, Tic21, Tic32 and Tic62 from Pisum sativum and two strains of Paulinella chromatophora, CCAC 0185 and FK01. The range of domains characteristic of a given protein is indicated according to the CDD database prediction. A characteristic motif of the calmodulin-binding domain in Tic32 is also marked [20], as well as the NAD(P)-binding domain (CDD 257784) and FNR-interacting repeats in Tic62 [46]. The N-terminal transmembrane β-barrel regions were predicted with BOCTOPUS [29], and α-helical transmembrane regions with TopPred [30] and TMpred [31]. The average range of the regions was calculated based on the two latter predictions.

Conclusion
Our studies reveal that chromatophore genomes from both Paulinella strains encode the same set of translocons that could potentially create a simplified but fully functional Tic-like translocon at the inner chromatophore membranes. Moreover, we have discovered a new putative Tic component, namely Tic62, a redox sensor protein not identified in previous comparative studies of Paulinella translocons [17]. The common maintenance of the same set of Toc/Tic proteins indicates a similar import mechanism in the two investigated Paulinella strains and supports the proposed model. We also suggest the possibility that Toc34 and Toc159, GTPases of presumed eukaryotic origin [39][40][41], might in fact be derived from cyanobacterial proteins.

Fig. 4 Model for a protein translocon in the inner membrane of Paulinella photosynthetic chromatophores. The translocon is a simplified Tic-like apparatus and is responsible for the final import step of nuclear-encoded proteins across the inner chromatophore membrane. It is composed of Tic21 (protein-conducting channel), Tic32 and Tic62 (calcium and redox-sensing regulatory proteins) as well as a molecular motor responsible for pulling imported proteins into the organelle stroma. The latter could consist of Hsp93, Hsp70, Hsp40 (Toc12) and GrpE.