Rapid advances in genomic sequencing technology have resulted in a data deluge in biology and bioinformatics. Mendel is a scalable, similarity-aware distributed storage framework that enables efficient similarity searches over large volumes of data. Mendel provides the following capabilities:

  • Support for both DNA and protein sequences
  • A scale-out architecture that allows nodes to be added to the system incrementally
  • Performance that depends only weakly on the size of the data, including when data volumes grow larger than memory
  • Support for many file formats including FASTA, GenBank, EMBL, and others
  • A highly configurable query system