Abstract: Compression algorithms reduce the redundancy in
data representation to decrease the storage required for that data.
Lossless compression researchers have developed highly
sophisticated approaches, such as Huffman encoding, arithmetic
encoding, the Lempel-Ziv (LZ) family, Dynamic Markov
Compression (DMC), Prediction by Partial Matching (PPM), and
Burrows-Wheeler Transform (BWT) based algorithms.
Decompression is then required to recover the original data exactly. This paper
presents a compression scheme for text files coupled with the principle of
dynamic decompression, which decompresses only the section of the compressed
text file that the user requires instead of the entire file. Files compressed
under this scheme offer better disk space utilization, achieving higher
compression ratios than most currently available text file formats.
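The dynamic-decompression idea described above can be illustrated with a minimal sketch: if the text is compressed in independent fixed-size blocks, only the blocks covering a requested byte range need to be decompressed. This is an assumption-laden illustration (block-wise `zlib`, a hypothetical `BLOCK_SIZE`), not the scheme proposed in the paper.

```python
import zlib

BLOCK_SIZE = 4096  # illustrative block size, not taken from the paper


def compress_blocks(text: bytes) -> list[bytes]:
    """Compress the text in independent fixed-size blocks so that any
    single block can later be decompressed on its own."""
    return [zlib.compress(text[i:i + BLOCK_SIZE])
            for i in range(0, len(text), BLOCK_SIZE)]


def read_section(blocks: list[bytes], start: int, end: int) -> bytes:
    """Decompress only the blocks covering bytes [start, end) of the
    original text, rather than the whole file."""
    first, last = start // BLOCK_SIZE, (end - 1) // BLOCK_SIZE
    data = b"".join(zlib.decompress(blocks[i]) for i in range(first, last + 1))
    offset = first * BLOCK_SIZE
    return data[start - offset:end - offset]
```

Compressing blocks independently trades a little compression ratio for random access: a reader seeking one section touches only the blocks that overlap it.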
Abstract: Approximate tandem repeats in a genomic sequence are
two or more contiguous, similar copies of a pattern of nucleotides.
They are used in DNA mapping, in studying the mechanisms of molecular
evolution, in forensic analysis, and in research on the diagnosis of inherited
diseases. Their functions are still under investigation and not well defined,
but growing biological databases, together with tools for identifying these
repeats, may lead to the discovery of their specific roles or their correlation
with particular features. This paper presents a new
approach for finding approximate tandem repeats in a given sequence,
where the similarity between consecutive repeats is measured using
the Hamming distance. It is an enhancement of a method for finding
exact tandem repeats in DNA sequences based on the Burrows-
Wheeler transform.
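The Hamming-distance criterion for approximate tandem repeats can be sketched directly: two consecutive copies of a pattern of length `period` count as an approximate repeat if they mismatch in at most `k` positions. This naive scan is only an illustration of the similarity measure; it is not the paper's BWT-based algorithm, and the parameter names are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def approx_tandem_repeats(seq: str, period: int, k: int) -> list[int]:
    """Report every start position where two consecutive copies of
    length `period` differ in at most k positions (Hamming distance <= k)."""
    hits = []
    for i in range(len(seq) - 2 * period + 1):
        if hamming(seq[i:i + period], seq[i + period:i + 2 * period]) <= k:
            hits.append(i)
    return hits
```

For example, in `ACGTACGA` the copies `ACGT` and `ACGA` differ in one position, so with `k = 1` the scan reports a repeat at position 0. A BWT-based method reaches the same kind of candidates far more efficiently than this quadratic scan.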