Department of Electrical Engineering, Institute of Engineering of Polytechnic of Porto, Rua Dr. António Bernardino de Almeida, 431, 4200-072 Porto, Portugal
Copyright © 2012 J. A. Tenreiro Machado. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
This paper studies the information content of the chromosomes of twenty-three species. Several statistics considering different number of bases for alphabet character encoding are derived. Based on the resulting histograms, word delimiters and character relative frequencies are identified. The knowledge of this data allows moving along each chromosome while evaluating the flow of characters and words. The resulting flux of information is captured by means of Shannon entropy. The results are explored in the perspective of power law relationships allowing a quantitative evaluation of the DNA of the species.