What is Shannon’s Information Theory? (Report)

In information science, information is any kind of event that affects the state of a dynamic system. The first attempts to mathematically model information were made by Harry Nyquist (1924) and Ralph Hartley (1928). In Hartley's model, the information contained in an event is defined in terms of a measure of the uncertainty of that event: less certain events must contain more information than more certain events. In addition, the information of independent events taken as a single event should equal the sum of the information of the individual events. Once we agree to define the information of an event in terms of its probability, the remaining properties are satisfied if the information of an event is defined as a logarithmic function of its probability.
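These two properties can be checked numerically. The short sketch below assumes a base-2 logarithm (so information is measured in bits); the function name `self_information` is illustrative, not from the source.

```python
import math

def self_information(p: float) -> float:
    """Information content of an event with probability p, in bits."""
    return -math.log2(p)

# A less certain (rarer) event carries more information than a common one.
assert self_information(0.125) > self_information(0.5)

# For independent events the probabilities multiply, and the logarithm
# turns that product into a sum: I(p * q) == I(p) + I(q).
p, q = 0.5, 0.25
assert math.isclose(self_information(p * q),
                    self_information(p) + self_information(q))
```

Choosing the logarithm base only changes the unit (base 2 gives bits, base e gives nats); the additivity property holds for any base.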

Named after __Boltzmann's H-theorem__, Shannon (1948) defined the entropy H of a __discrete random variable__ X with possible values {x1, ..., xn} as,

H(X) = E[I(X)] = −∑ᵢ P(xᵢ) log P(xᵢ) [Shannon, 1948] (1)
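Equation (1) can be evaluated directly. This is a minimal sketch, assuming base-2 logarithms (entropy in bits) and the convention 0 · log 0 = 0; the function name `entropy` is illustrative.

```python
import math

def entropy(probs):
    """Shannon entropy H(X) = -sum_i p_i * log2(p_i), in bits.

    Zero-probability outcomes are skipped (0 * log 0 is taken as 0).
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin is maximally uncertain for two outcomes: exactly 1 bit.
print(entropy([0.5, 0.5]))   # 1.0
# A biased coin is more predictable, hence has lower entropy.
print(entropy([0.9, 0.1]))   # ≈ 0.47
```

As the examples show, entropy depends only on the probability distribution, which is why it serves as a measure of the randomness of the variable.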

Here E is the __expected value__ operator and I is the __self-information__ content of X. Under this definition, the entropy of a random variable is determined by its probability distribution and is a good measure of its randomness or uncertainty. In the information theory developed by Shannon, mutual information measures the amount of information that can be obtained about one random variable by observing another: the information that Y tells us about X is the reduction in uncertainty about X due to knowledge of Y. Intuitively, mutual information measures the information that X and Y share, by measuring how much knowing one of these variables reduces the uncertainty about the other. The measure is symmetric: Y tells us as much about X as X tells us about Y. Then, the mutual information of X relative to Y is given by: