Comparative Genomics

A selection of CCG comparative genomics projects is briefly outlined below.


International Barley Genome Sequencing Consortium

The CCG, in close collaboration with the Department of Agriculture of Western Australia and the Beijing Genomics Institute, is undertaking the sequencing and assembly of barley (Hordeum vulgare L.) chromosomes 5H and 7H as part of the Australian contribution to the International Barley Genome Sequencing Consortium (IBGSC). Representative fragments derived from both the short and long arms of each chromosome were selected using a Minimal Tiling Path approach. Illumina HiSeq technology was then used to sequence each individual fragment (BAC clone). This approach represents a major advance over whole genome shotgun approaches, particularly for complex and highly repetitive genomes. The significantly improved genome assembly for barley is anticipated to become public in late 2015.
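The idea behind a minimal tiling path can be illustrated as a greedy interval cover: from a set of overlapping clones placed on a physical map, keep the fewest clones that still span each contig. The sketch below is illustrative only and is not the consortium's pipeline. It assumes each clone is already described by hypothetical map coordinates `(name, start, end)`, and it ignores practical constraints such as minimum overlap lengths and fingerprint quality.

```python
def minimal_tiling_path(clones):
    """Greedily pick a small set of overlapping clones that spans each contig.

    clones: list of (name, start, end) tuples in shared map coordinates
            (hypothetical input format, assumed for this sketch).
    Returns the names of the selected clones in map order.
    """
    ordered = sorted(clones, key=lambda c: c[1])
    path = []
    i = 0
    while i < len(ordered):
        # Start a new contig at the next clone that has not been examined.
        covered_to = ordered[i][1]
        while True:
            best = None
            # Among clones starting inside the covered region,
            # remember the one that reaches furthest to the right.
            while i < len(ordered) and ordered[i][1] <= covered_to:
                if best is None or ordered[i][2] > best[2]:
                    best = ordered[i]
                i += 1
            if best is None or best[2] <= covered_to:
                break  # nothing extends the path: a gap or the end of the map
            path.append(best[0])
            covered_to = best[2]
    return path


example = [
    ("A", 0, 100), ("B", 50, 150), ("C", 90, 200),
    ("D", 180, 300), ("E", 190, 250), ("F", 400, 500),
]
print(minimal_tiling_path(example))
```

In this toy example clones B and E are redundant, and clone F starts a second contig because nothing bridges the gap between positions 300 and 400.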


International Cattle Tick Genome Sequencing Consortium

The cattle tick Rhipicephalus (Boophilus) microplus is the most economically important tick parasite in the world. The livestock industry in the US saves approximately 3 billion dollars per annum in cattle tick-free zones. In Australia, cattle ticks cause losses of ~140 million dollars to the beef and dairy industries annually. The CCG and the US Department of Agriculture (USDA-ARS) are undertaking the assembly of the 7.1 Gb R. microplus genome, which is estimated to be 70% repetitive. Three phases have been defined to tackle the assembly of this complex genome: sequencing and assembly of the gene-rich fraction using Roche 454 (Phase I) and Illumina HiSeq (Phase II) technologies, and a whole genome shotgun approach using PacBio long-read technology to assemble the repetitive fraction of the cattle tick genome (Phase III). The outcomes of this collaborative effort are anticipated to be published in late 2015.


Harnessing the genome of the Australian paralysis tick

The CCG, in collaboration with the Queensland Alliance for Agriculture and Food Innovation (QAAFI) and Queensland Government researchers, has teamed up with Eli Lilly Australia to find ways of reducing the devastating impact of tick paralysis. Tick paralysis is caused by a neurotoxin in the saliva of the paralysis tick (Ixodes holocyclus), which is transmitted to a host animal while the tick is feeding. In companion animals, illness often develops over two to seven days, presenting initially as muscle weakness in the host animal's legs and leading progressively to ascending paralysis of other major muscle groups, and sometimes to respiratory failure or death. The CCG has undertaken the assembly of paralysis tick transcripts isolated from salivary glands and other samples. Computational workflows have been customised for the functional annotation of I. holocyclus transcripts and the identification of candidate neurotoxins. These resources will enable functional studies aimed at developing safe treatments and/or preventive paralysis tick vaccines.


Bioinformatics toolkit for the surveillance and diagnosis of plant viral pathogens

Post Entry Quarantine (PEQ) facilities in Australia and New Zealand play pivotal roles in preventing the entry of viral pathogens in imported plants that could impact national and international markets. Current testing methods for plants held in PEQ rely mainly on biological indexing and specific PCR assays to screen for an ever-increasing number of plant viruses and viroids. To enable PEQ plant pathologists to cope with the large volume of imported plants within competitive time frames and operational costs, we have implemented a web-based bioinformatics toolkit. This toolkit uses next generation sequencing data from small RNAs, which are produced by the plant immune system upon infection with viral pathogens. The underlying computational framework enables the surveillance and diagnosis of all known viruses and viroids in a single experiment. To date we have identified a diverse range of viruses in a broad spectrum of quarantined plant material. We envisage that this resource, generated with support from the Plant Biosecurity CRC, will benefit not only PEQ agencies but also academia and industry stakeholders worldwide.
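The principle of screening small RNA reads against known viral sequences can be sketched in a few lines. This is a simplified illustration and not the toolkit's actual method, which is not described on this page. It assumes reads arrive as DNA-alphabet strings, uses exact matching only, and reports the fraction of each reference covered by reads of 21 to 24 nt. The function names and the length window are assumptions made for the example; real pipelines typically rely on read assembly and alignment tools that tolerate mismatches.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def screen_small_rnas(reads, references, min_len=21, max_len=24):
    """Fraction of each reference covered by exactly matching small RNA reads.

    reads:      iterable of read sequences (DNA alphabet, assumed).
    references: dict mapping a reference name to its sequence.
    """
    results = {}
    for name, ref in references.items():
        covered = [False] * len(ref)
        for read in reads:
            if not (min_len <= len(read) <= max_len):
                continue  # outside the assumed small RNA size window
            # Small RNAs can derive from either strand, so test both.
            for query in (read, reverse_complement(read)):
                pos = ref.find(query)
                while pos != -1:
                    for j in range(pos, pos + len(query)):
                        covered[j] = True
                    pos = ref.find(query, pos + 1)
        results[name] = sum(covered) / len(ref) if ref else 0.0
    return results


# Toy data: two made-up reference sequences and four reads.
virus_x = "ATGGCGTACGTTAGCCTAGGATCCGATTACGCTAGCTAAG"
references = {"virusX": virus_x, "virusY": "C" * 40}
reads = [
    virus_x[0:21],                       # sense-strand read
    reverse_complement(virus_x[15:37]),  # antisense-strand read
    virus_x[0:10],                       # too short, ignored
    "T" * 21,                            # unrelated read
]
coverage = screen_small_rnas(reads, references)
```

A reference with high coverage would be flagged for follow-up, while one with no matching reads would be reported as not detected.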


Assembling the Arundo donax (L.) shoot transcriptome: weed control, phytoremediation and biofuel implications 

Arundo donax (Poaceae), the giant reed, is considered one of the worst invasive species globally. In Australia, A. donax has been determined to have the potential to become a significant weed in riparian habitats in Queensland; in California and Arizona it causes losses of $18 million per year. The giant reed has been shown to have moderate tolerance to both heavy metal contaminated soils and saline environments, making it a candidate for remediating contaminants in soils, water, and sediments. It is also one of the fastest growing plants on the planet and can grow up to 10 cm per day. This rapid biomass accumulation has made the giant reed a candidate for possible biofuel applications. The CCG, in collaboration with USDA-ARS and Landcare Research New Zealand, has sequenced, assembled and annotated 27,491 high-confidence A. donax genes that are expressed in the giant reed shoot. This public resource will promote weed control, phytoremediation and/or biofuel programs using the giant reed model system.


Optimization of splice switching therapies to treat Duchenne muscular dystrophy

Splicing (the removal of non-coding intronic sequences and precise joining of exons during pre-mRNA processing) is a fundamental process occurring during the expression of more than 90% of human genes. Molecular diagnostics have indicated that mutations resulting in RNA mis-splicing account for 15% of all inherited diseases, and some genes show a marked susceptibility to splice aberrations, with 50% of their mutations resulting in mis-splicing. This NHMRC study aims to optimise the design of antisense oligonucleotides (AOs) for efficient manipulation of one or more exons. AOs that have been empirically assessed and ranked for splice switching efficiency are being evaluated to identify sequence patterns, RNA secondary structure features, and cis- and trans-interactions that may impact the binding of AOs to target sites. Analysis outcomes for all the target exons will be combined to train a neural network, a support vector machine, or another computational approach that enables oligomer discrimination. This project will develop therapeutic compounds to address a range of genetic disorders, with a focus on specific types of gene defects, namely mutations affecting splicing that can be suppressed to generate a normal gene product.
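As a rough illustration of how sequence patterns might be turned into inputs for such a classifier, the sketch below derives a few simple features from an AO sequence. The feature set (length, GC fraction, dinucleotide counts) and the function name are assumptions chosen for the example. The study's actual features, which also include RNA secondary structure and interaction data, are not specified here.

```python
from collections import Counter


def ao_features(sequence):
    """Compute simple sequence features for one antisense oligonucleotide.

    The input may use T or U; it is normalised to the RNA alphabet.
    Returns a dict that could be flattened into a feature vector.
    """
    seq = sequence.upper().replace("T", "U")
    n = len(seq)
    gc_fraction = (seq.count("G") + seq.count("C")) / n if n else 0.0
    # Overlapping dinucleotide counts capture short sequence motifs.
    dinucleotides = Counter(seq[i:i + 2] for i in range(n - 1))
    return {
        "length": n,
        "gc_fraction": gc_fraction,
        "dinucleotides": dict(dinucleotides),
    }


features = ao_features("ggccaatt")
```

Feature dictionaries like this one, paired with the empirically measured splice switching efficiencies, would form the training table for whichever learning method is eventually chosen.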


RD-Connect - YABI Omics Clinical Analysis

The RD-Connect consortium was established to provide an integrated platform for hosting and analyzing genomic and clinical data from research projects, with clinical bioinformatics tools for the analysis and integration of molecular and clinical data to discover new disease genes, pathways, and therapeutic targets.

RD-Connect works with Neuromics, a project with a focus on neuromuscular and neurodegenerative diseases such as muscular dystrophies, Huntington’s disease and spinal atrophy. For all these rare genetic diseases, the search for better treatments and even cures depends on accurate genetic diagnosis based on gene expression analysis.

As part of this effort, the CCG, in collaboration with Leiden University Medical Centre, is establishing an RNA-Seq analysis platform for large-scale clinical research in rare disease, based on our YABI open-source workflow environment. It is envisaged that this platform may serve as a key tool for future large-scale clinical research into a range of significant rare diseases in both Europe and Australia.


Comparative genomics of the swine pathogenic bacterium Brachyspira hyodysenteriae

The CCG, in collaboration with the School of Veterinary and Life Sciences, Murdoch University, and the Institute for Hygiene and Infectious Diseases of Animals, Justus-Liebig University Giessen, Germany, has completed a comparative genomic analysis of 20 Brachyspira hyodysenteriae genomes from Asia, Australia, Europe and North America. The analysis revealed potential differential phenotypic markers for numerous strains and produced a useful public sequence resource, as part of the global effort to identify viable vaccine candidates for the prevention of swine dysentery.


Comparative genomics of clinical H. haemolyticus isolates

H. haemolyticus is generally considered a commensal bacterium; nevertheless, recent reports have associated it with human respiratory tract invasive disease. This NHMRC-funded study aims to sequence the genomes of commensal and pathogenic H. haemolyticus strains. Furthermore, extensive comparative genomic analyses will be conducted against a large panel of public genomes of pathogenic nontypeable Haemophilus influenzae (NTHi) to identify the genetic requirements for invasive disease. This is essential for understanding commensalism versus pathogenesis in H. haemolyticus and for further development of H. haemolyticus as a vaccine.


CCG contributes to the Wheat Chromosome 7A assembly

Bread wheat (Triticum aestivum L.) is one of the world’s most important cereal grain crops, serving as the staple food source for 30% of the human population. In 5 of the last 10 years, wheat production was not sufficient to meet demand. With the global population projected to exceed 9 billion by 2050, researchers, breeders and growers are facing the challenge of increasing wheat production by about 70% to meet future demands. New strategies for increasing production are necessary, and tapping into wheat genome information is likely to facilitate breeding programs. The CCG, as part of the Australian contribution to the International Wheat Genome Sequencing Consortium, contributed the design of a Minimal Tiling Path (MTP) BAC-pool approach to tackle the challenge of sequencing and assembling the ~800 Mbp wheat chromosome 7A. With funding from the Grains Research & Development Corporation and Bioplatforms Australia, more than 10,000 BACs covering the short and long arms of chromosome 7A were selected and pooled into more than 800 MTP BAC pools for Illumina HiSeq sequencing. The CCG provided solutions for data storage, metadata capture, quality curation and individual MTP BAC pool assembly. The outcomes of these contributions have been delivered to the Australia-China Centre for Wheat Improvement headed by Prof. Rudi Appels (Murdoch University).