Genome analysis of four novel Psychrobacter spp. and characterisation of their six putative laccase-like multicopper oxidases using bioinformatics tools
Abstract
In May 2009, several Psychrobacter spp. were found at the bottom of the Norwegian Sea off Svalbard. Four of these, P11F6, P2G3, P11G3 and P11G5, were selected and sequenced for further work, and form the basis for this thesis. The work began with automatic annotation using RAST. The four genomes were found to comprise 3.2 - 3.4 million base pairs, with a GC content of 41.9 - 42.9 %, and to contain between 2674 and 2914 genes. RAST assigns a gene to a subsystem when the gene fits one of its 27 subsystems. With few exceptions, the subsystem distributions of the four genomes were similar. Mauve was used to analyse how the genomes had diverged, in particular how whole blocks of genes had changed position relative to the other genomes. Visual inspection of specific regions at high zoom verified that large parts of the sequence were fully conserved in all four genomes, and showed that other large stretches were close to fully conserved, differing only by a shift in position relative to the other genomes.
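As an illustration of the genome-level figures quoted above, the following minimal sketch computes sequence length and GC content for a DNA string. The function name `genome_stats` and the toy sequence are assumptions made for this example; this is not the RAST pipeline used in the thesis.

```python
def genome_stats(sequence: str) -> tuple[int, float]:
    """Return (length in bp, GC content in percent) for a DNA sequence."""
    seq = sequence.upper()
    # Count only unambiguous bases so that N's do not skew the percentage.
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    gc_percent = 100.0 * (counts["G"] + counts["C"]) / total
    return len(seq), gc_percent


# Hypothetical toy sequence, not taken from any of the four genomes.
toy = "ATGCATTTAACGGATTAAGC"
length, gc = genome_stats(toy)
print(f"{length} bp, GC content = {gc:.1f} %")
```

For a real genome the sequence would be read from a FASTA file and the contigs concatenated or summed before calling the function.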
Further investigations were performed to determine whether the Psychrobacter spp. contained laccases. Six laccase-like multicopper oxidases (LMCOs) were found: two in each of P11G3 and P11G5, one in P2G3 and one on a plasmid of P11F6. The protein sequences consisted of 565 - 568 amino acids. The atomic and amino acid compositions were determined using ExPASy's ProtParam and were highly similar across the six proteins. ProtParam also gave the molecular weight (63.7 - 64.1 kDa) and theoretical isoelectric point (6.75 - 8.59). The half-life was estimated to be ``above 10 hours'' and all proteins were predicted to be stable.
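The kind of calculation behind the composition and molecular weight figures can be sketched as follows. This is a simplified illustration, not ProtParam itself: the residue masses are commonly tabulated average values (treated here as approximations), and the function names are assumptions made for this example.

```python
from collections import Counter

# Approximate average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free termini


def amino_acid_composition(protein: str) -> dict[str, float]:
    """Return the percentage of each residue type present in the sequence."""
    counts = Counter(protein.upper())
    total = sum(counts.values())
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def molecular_weight(protein: str) -> float:
    """Return the approximate average molecular weight in Da."""
    # Raises KeyError for non-standard residue letters.
    return sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER_MASS


# Hypothetical toy peptide, not one of the LMCO sequences.
peptide = "AAGH"
print(amino_acid_composition(peptide))
print(f"{molecular_weight(peptide):.2f} Da")
```

Dividing the result by 1000 gives the weight in kDa, the unit used for the LMCOs above.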
One of the most important features of laccases is their copper-binding residues, and the LMCOs were searched for them. Using Phyre2, the type 1 site was found in full, while the type 2 and type 3 sites were only partially found. Manual searches were performed to find the remaining residues and thereby complete the Cu-binding sites. These sites were found in the so-called signature sequences, conserved sequences that are expected in members of the multicopper oxidase family.
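A manual search for conserved residues of this kind can be approximated with a simple pattern scan. The sketch below is only an illustration: the His-X-His pattern, the toy sequence and the function name `find_motif` are assumptions made for this example and are not the signature sequences used in the thesis.

```python
import re


def find_motif(sequence: str, pattern: str) -> list[tuple[int, str]]:
    """Return (1-based position, matched text) for every occurrence of pattern.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]


# Hypothetical toy sequence, not one of the LMCO sequences.
toy_protein = "MKTHWHGLAAHCHSDHP"
hits = find_motif(toy_protein, "H.H")  # His, any residue, His
for position, match in hits:
    print(position, match)
```

Positions are reported 1-based, which matches the usual residue numbering in protein sequences.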
The LMCOs were then visualised in PyMOL, both separately and superimposed, to reveal differences and similarities. The 3D models showed that LMCOs expected to be similar on the basis of the other analyses turned out to differ more in structure than expected. PyMOL was also used to visualise the substrate pockets and compare them with regard to shape and size. Clustal was used to align and compare both the signal sequences and the full protein sequences. The phylogenetic trees produced by Clustal showed the relationships between the LMCOs. The signal sequences were investigated with PSORT-B to predict subcellular localisation, which indicated that all LMCOs were destined for the periplasm.
Finally, Psychrobacter sp. P11F6 was grown on media containing 2-methoxy-phenols in an attempt to induce transcription of the LMCO genes, since 2-methoxy-phenols are among the many substrates of laccases and LMCOs. As it turned out, the LMCO genes were not among the upregulated genes. The promoter sequences of the ten most upregulated transcripts were nevertheless identified. To see whether any of these ten upregulated genes were translated, the proteome was investigated. Only eight of the ten were detected, and these were upregulated compared with P11F6 grown on media that did not contain the substrate.