CHAPTER 1
REVIEW ON CHANNEL EQUALIZATION
1.1 INTRODUCTION
Communication systems comprise three fundamental elements: transmitter, channel and
receiver. Signals transmitted through a communications system are degraded by distortions,
mainly intersymbol interference (ISI) and noise. ISI is caused by the multipath effect in
band-limited (frequency-selective), time-dispersive channels and leads to bit errors at the
receiver. It is considered the main obstacle to fast data transmission over wireless channels.
Equalizers are employed in these systems to eliminate or minimize these distortions.
Equalization is the method of compensating for, eliminating or reducing the amplitude and
phase distortion introduced by the transmission medium. In a general sense, the term
equalization refers to any signal processing operation that minimizes ISI. An equalizing
filter counteracts the ISI between the received symbols of a transmitted data stream, as well
as crosstalk, which occurs for example when a transmitted pulse on an outgoing pair couples
capacitively into an incoming pair and interferes with the received pulse. The task of an
equalizer is to provide efficient and error-free communications by ensuring that the signals
transmitted through the channel are recovered in their original form at the receiver.
Distortions may be linear or nonlinear depending on the characteristics of the channel.
When transmitting information through a physical channel, various mechanisms distort the
transmitted signal significantly, causing degradation or even failure in the communications.
These mechanisms can be classified as additive thermal noise, man-made noise and
atmospheric noise. In practice, many of the physical channels are characterized by various
channel models. The most frequently encountered channel of communications is that with
additive noise, in which an additive random noise process corrupts the transmitted signal. The
additive noise originates in the amplifiers and electronic components on the receiver side of
the communications system and in interference picked up during transmission, as in radio
signal transmission, for example. The noise caused by electronic components and amplifiers is
called thermal noise. Statistically, this type of noise is characterized as a Gaussian random
process, and the corresponding mathematical model of the channel is called the additive
Gaussian noise channel. The model becomes an additive white Gaussian noise (AWGN)
channel when the random process is a white-noise process. The random
process is a white-noise process when the power spectral density (PSD) is flat (constant) over
all frequencies [1,2].
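The AWGN model described above can be illustrated with a minimal sketch. It assumes NumPy and a real-valued baseband signal; the function name `add_awgn`, the binary test signal and the 10 dB setting are illustrative choices, not taken from the thesis.

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Return the signal plus white Gaussian noise at the requested SNR (dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    # Independent zero-mean Gaussian samples have a flat power spectral density.
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

rng = np.random.default_rng(0)
tx = rng.choice([-1.0, 1.0], size=100_000)    # hypothetical binary symbols
rx = add_awgn(tx, snr_db=10.0, rng=rng)
measured_snr_db = 10.0 * np.log10(np.mean(tx ** 2) / np.mean((rx - tx) ** 2))
```

With a long enough signal, the measured SNR should lie close to the requested value.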
Compared with AWGN channels, the impairments of a mobile radio channel cause the received
signal to be severely distorted or to fade significantly. This fading is classified as a
non-additive signal disturbance and appears as time variation in the signal amplitude. The
main techniques used to compensate for fading channel impairments and to improve the
received signal quality are equalization, channel coding and diversity [3]. This thesis
concentrates on the equalization technique.
Equalization techniques can be categorized into linear or nonlinear techniques depending on
the way the output of an adaptive equalizer is used for subsequent control of the equalizer.
The decision-making device of the receiver processes the equalizer’s output and determines
the value of the digital data bit being received by applying a slicing or thresholding
operation (a nonlinear operation) that reconstructs the message data. If the reconstructed
data is not used in the feedback path to adapt the equalizer, the equalization is linear. If,
on the other hand, the decision-making device feeds the reconstructed data back in order to
alter the equalizer’s subsequent outputs, the equalization is nonlinear [3]. If the channel
itself is nonlinear, linear equalizers cannot reconstruct the transmitted signal.
There are various equalizer structures, among which the linear transversal equalizer (LTE) is
the most common. The simplest LTE uses only feedforward taps; its transfer function is a
polynomial with many zeros but poles only at z = 0. This filter is called a finite impulse
response (FIR) filter or simply a transversal filter. In this type of equalizer, the filter
coefficients linearly weight the current and past values of the received signal, and the
weighted values are summed to produce the output of the equalizer.
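The weighted sum performed by a transversal equalizer can be written in a few lines. This is only a sketch, assuming NumPy; the tap values and the received samples are made up for illustration.

```python
import numpy as np

def transversal_filter(received, taps):
    """FIR equalizer: y[n] = sum_k taps[k] * received[n - k].

    Samples before n = 0 are treated as zero.
    """
    received = np.asarray(received, dtype=float)
    taps = np.asarray(taps, dtype=float)
    # Full convolution, truncated to one output per received sample.
    return np.convolve(received, taps)[: len(received)]

received = np.array([1.0, 0.5, -1.0, 2.0])   # illustrative samples
taps = np.array([1.0, -0.5])                 # illustrative feedforward taps
output = transversal_filter(received, taps)
```

Each output sample depends only on the current and past inputs, which is why the filter has a finite impulse response.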
Some applications nevertheless employ nonlinear equalizers, since linear equalizers cannot
cope with severe channel distortion. Linear equalizers perform poorly on channels with deep
spectral nulls in the passband: in attempting to compensate for the distortion, they place too
much gain near the nulls and thereby enhance the noise present at those frequencies. For
these reasons, nonlinear equalizers outperform linear equalizers. Three very effective
nonlinear methods that improve on linear equalization and are used in 2G and 3G systems are:
1. Decision Feedback Equalization (DFE)
2. Maximum Likelihood Symbol Detection (MLSD)
3. Maximum Likelihood Sequence Estimation (MLSE) [4]
A large number of studies have addressed channel equalization using various methods,
techniques and algorithms. Recently, neural network based fuzzy technology has been widely
used as a powerful tool in the channel equalization of various types of signals. In this type
of equalizer, experts determine the fuzzy rules by utilizing the channel’s input-output data
pairs. As part of this thesis, adaptive channel equalization based on neural networks
employing a multilayer perceptron (MLP) has been developed, enabling the equalization of
Quadrature Amplitude Modulation (QAM) signals of various levels. This has been achieved for
both linear and nonlinear channels using a Nonlinear Neuro-Fuzzy Equalizer (NNFE), which
offers relatively high adaptation speed and accurate equalizer outputs and has proven to be
effective and practical.
The changeable fuzzy IF-THEN rules which configure the fuzzy adaptive filter are formed by
either human experts or the input-output pairs that are matched throughout a procedure of
adaptation. In this study, neural networks and fuzzy technology are used for the development
of a neuro-fuzzy equalizer for channel distortion of Quadrature Amplitude Modulation
(QAM) signals. Even though the QAM signal has a complex form composed of real
(in-phase) and imaginary (quadrature) parts, the complex signal is not applied directly to the
channel and equalizer, since the neuro-fuzzy filter used here is based on real values and best
suits signal processing that takes place in real multidimensional space. The modulation and
demodulation of M-ary QAM (with M = 4 and M = 16) is accomplished by splitting the data bit
stream into in-phase (I) and quadrature (Q) components. Gray coding is employed to map the
I and Q components. The significant feature of this thesis is the application of a
normalization method by which the modulated in-phase and quadrature QAM components are
each normalized to a maximum of one. Each component of the complex signal is brought into
the range between 0 and 1 by first shifting the values so that the minimum value is zero and
then scaling them so that the maximum value is 1. Each component is then input to the
channel and equalizer separately and denormalized separately at the equalizer’s output, where
the two components are recombined to form the final complex QAM signal. The normalization
method is stable and provides better BER and convergence performance, as well as more
accurate equalizer outputs, with a relatively small number of iterations before the minimum
error is attained.
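The shift-and-scale normalization described above can be sketched as follows. This is an illustration under stated assumptions, not the thesis implementation: it assumes NumPy, uses 16-QAM levels of -3, -1, 1 and 3 for the example, and the function names are hypothetical.

```python
import numpy as np

def normalize(component):
    """Shift so the minimum is 0, then scale so the maximum is 1.

    Assumes the component takes at least two distinct values.
    """
    component = np.asarray(component, dtype=float)
    offset = component.min()
    scale = component.max() - offset
    return (component - offset) / scale, offset, scale

def denormalize(normalized, offset, scale):
    """Undo normalize() using the stored offset and scale."""
    return np.asarray(normalized, dtype=float) * scale + offset

# Illustrative 16-QAM symbols; I and Q are handled separately.
symbols = np.array([-3 + 1j, 1 - 3j, 3 + 3j, -1 - 1j])
i_norm, i_off, i_scale = normalize(symbols.real)
q_norm, q_off, q_scale = normalize(symbols.imag)

# In the full system the channel and equalizer would sit between these steps.
recombined = (denormalize(i_norm, i_off, i_scale)
              + 1j * denormalize(q_norm, q_off, q_scale))
```

Keeping the offset and scale of each component makes the denormalization at the equalizer output an exact inverse of the transmitter-side step.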
This thesis consists of five chapters where:
Chapter 1 presents an overview on channel equalization. The state of application of neuro-
fuzzy system and fuzzy logic as well as their properties and features are explained.
Chapter 2 explains the channel equalization, the distortions and noise in the channel.
Mathematical models and formulas representing the channels and nonlinear neuro-fuzzy
equalizer used in the thesis together with its characteristics are described.
Chapter 3 outlines the architecture and operation principles of the nonlinear neuro-fuzzy
network (NNFN). The used learning algorithm, the linguistic data about the target system and
numerical input-output relationships of NNFN are explained in detail. Fuzzy rule-based fuzzy
sets, the parameters and error calculations are analyzed.
Chapter 4 describes in detail the quadrature amplitude modulation (QAM) and its properties.
The application of QAM on NNFN and the features of the thesis design are explained. The
specific technique of normalization used in equalizing QAM signals and its mathematical
implementation are described.
Chapter 5 illustrates the simulation results of the equalization system demonstrating
graphically and statistically the performance of the equalization system. Bit error rate (BER)
versus signal-to-noise ratio (SNR) analysis is made in tabulated and graphical forms proving
the accuracy of the system. Comparisons between the channels and between the two
constellations of QAM are made to illustrate the performance of the equalizer, as well.
Conclusions are discussed at the end.
1.2 Overview
In order to accurately transmit the input signals from the transmitter to the receiver,
minimization and thus equalization of distortions in the channel is critical. This can be
successfully done by employing efficient equalization algorithms and techniques during the
transmission of the signals from the transmitter to the receiver. This chapter considers
methods used in channel equalization. Neural networks, fuzzy and neuro-fuzzy technologies
which form the basis of the adaptive channel equalization are analyzed and discussed.
1.3 The State of Application of Channel Equalization
Linear and nonlinear distortions are the main obstacles in transmitting the input signals to the
receiver of a communications system in their original state. These distortions, namely ISI and
noise, are caused in the channel and channel equalization is needed in order to transmit the
signals as accurately as possible. Even though both linear and nonlinear equalizers can be
used for this purpose, nonlinear equalizers are preferred because they are capable of
compensating for both linear and nonlinear channel distortions effectively.
Two types of equalization are used: sequence estimation and symbol detection. In this
thesis, the symbol detection technique is used to realize the adaptive channel equalization.
This technique maps the input baseband signal onto a feature space determined by the
representation of a learnt property of the transmitted signal. The symbols are separated by
decision regions, which serve to classify the distorted signal.
The ISI problem, which affects all digital communication systems, is mainly caused by
restricted bandwidth. When rectangular multilevel pulses are filtered improperly as they pass
through a communication system, they spread in time and are smeared into adjacent time
slots, causing ISI [2]. This ISI in turn causes errors when transmitting data over the
channel. Additionally, channel characteristics play a significant role in causing distortions,
and the channel response is time-variant, meaning that the channel characteristics are not
known in advance. The time-variant response and the unknown channel characteristics oblige
equalizers to be designed to adjust themselves to the channel response and to track its
variations over time, so as to compensate for the changes in the channel characteristics. Such
equalizers are called adaptive equalizers, and they have received great attention because of
their superior features. In practice, for example, there are situations in which the channel
consists of dial-up telephone lines and the channel transfer function changes from call to
call. In such a case, the equalizer should be an adaptive filter.
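A minimal sketch of an adaptive equalizer follows, using the least mean square (LMS) rule to adjust the taps of an FIR filter against a known training sequence. It assumes NumPy; the two-tap channel, the number of taps and the step size are hypothetical values chosen so that the example converges, and are not taken from the thesis.

```python
import numpy as np

def lms_equalizer(received, training, num_taps=11, step=0.01):
    """Adapt FIR equalizer taps with LMS; return the taps and error history."""
    taps = np.zeros(num_taps)
    buffer = np.zeros(num_taps)              # current and past received samples
    errors = np.zeros(len(received))
    for n in range(len(received)):
        buffer = np.roll(buffer, 1)
        buffer[0] = received[n]
        output = taps @ buffer
        errors[n] = training[n] - output     # compare with the known symbol
        taps = taps + step * errors[n] * buffer
    return taps, errors

rng = np.random.default_rng(1)
training = rng.choice([-1.0, 1.0], size=5000)
channel = np.array([1.0, 0.5])               # hypothetical ISI channel
received = np.convolve(training, channel)[: len(training)]

taps, errors = lms_equalizer(received, training)
early_mse = np.mean(errors[:200] ** 2)
late_mse = np.mean(errors[-200:] ** 2)
```

The squared error shrinks as the taps adapt. If the channel changed from call to call, the same loop would simply adapt again from the new received samples.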
Adaptive equalizers are categorized as supervised and unsupervised equalizers. When the
unpredictable channel characteristics of a communications system make a training sequence
necessary, supervised equalizers are employed. The training sequence allows the response
at the receiver to be compared with the known input so that the parameters of the equalizer
can be updated. On the other hand, some communications systems do not permit a training
sequence to be transmitted. In that case unsupervised equalization is employed. This form
of equalization, which involves a self-recovery method, is also referred to as blind
equalization [5].
Supervised equalization can be realized by either sequence estimation or symbol detection.
Instead of decoding each received symbol on its own, a sequence estimator tests the possible
data sequences and selects the sequence that is most likely as its output [4]. This sequence
estimator is also referred to as the maximum likelihood sequence estimator (MLSE).
Unsupervised or blind equalization is used when the signal has no memory, i.e. the signals
transmitted in successive symbol intervals are not interdependent. In this case, each
transmitted symbol is detected separately. The constant modulus algorithm (CMA), introduced
by Godard [6] and Treichler [7], is a highly significant algorithm for blind equalization. Its
robustness and capability of converging before phase recovery have made this algorithm very
successful [5]. Another algorithm, the multimodulus algorithm (MMA) [8,9], improves on CMA
by providing a low steady-state mean-squared error (MSE) and by removing the need for phase
recovery in steady-state operation [9]. Additionally, hybrid blind equalization algorithms
combine or augment existing cost functions to attain improved
performance [5].
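The idea behind the CMA can be sketched as a stochastic-gradient update that penalizes deviations of the squared output magnitude from a constant, without using any training data. The sketch assumes NumPy; the unit-modulus 4-QAM symbols, the two-tap channel, the centre-spike initialization and the step size are all hypothetical choices for illustration.

```python
import numpy as np

def cma_equalizer(received, num_taps=11, step=0.005, modulus=1.0):
    """Blind FIR equalizer driven by the constant modulus criterion."""
    taps = np.zeros(num_taps, dtype=complex)
    taps[num_taps // 2] = 1.0                    # centre-spike initialization
    buffer = np.zeros(num_taps, dtype=complex)
    outputs = np.zeros(len(received), dtype=complex)
    for n in range(len(received)):
        buffer = np.roll(buffer, 1)
        buffer[0] = received[n]
        y = np.vdot(taps, buffer)                # y = taps^H buffer
        outputs[n] = y
        # Gradient step on the cost (|y|^2 - modulus)^2
        taps = taps - step * (abs(y) ** 2 - modulus) * np.conj(y) * buffer
    return taps, outputs

rng = np.random.default_rng(5)
alphabet = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
symbols = rng.choice(alphabet, size=8000)
channel = np.array([1.0, 0.3])                   # hypothetical ISI channel
received = np.convolve(symbols, channel)[: len(symbols)]

taps, outputs = cma_equalizer(received)
dispersion = np.abs(np.abs(outputs) ** 2 - 1.0)
early, late = dispersion[:500].mean(), dispersion[-500:].mean()
```

Because the cost depends only on the output magnitude, the update leaves the output phase undetermined, which is consistent with the remark that CMA can converge before phase recovery.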
Nonlinear equalizers are considered significant among signal processing techniques because of
their superior performance and improved features compared with linear equalizers, in
addition to the wide variety they offer. One of those features is the ability to form nonlinear
decision boundaries; the Bayesian equalizer provides the optimal performance against which
these equalizers are measured. Decision Feedback Equalizers (DFEs) are one class of nonlinear
equalizers with relatively improved performance. The basis of decision feedback
equalization is that, once an information symbol has been detected and decided upon, the ISI
it induces on future symbols can be estimated and cancelled [4]. The DFE can be realized in
either a direct transversal or a lattice structure. The direct form is made up of a feedforward
filter (FFF) and a feedback filter (FBF). The FBF is driven by the decisions at the output of a
detector located between the FFF and the FBF, and its coefficients are adjusted to cancel the
ISI on the current symbol caused by past detected symbols.
The remarkable feature of the DFE is its superiority over the linear transversal equalizer
(LTE), which is the most common equalizer structure. This superiority lies in its minimum
mean squared error (MMSE), the basic performance criterion of the DFE, which is smaller
than that of the LTE. When the channel is severely distorted or exhibits nulls in its
spectrum, the performance of an LTE degrades and the MMSE of the DFE is considerably
better than that of the LTE.
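The direct-form DFE described above can be sketched with fixed taps. This is a simplified illustration, assuming NumPy, binary symbols and a channel that is known exactly; an adaptive DFE would adjust the taps during operation. The channel and tap values are hypothetical.

```python
import numpy as np

def decision_feedback_equalize(received, ff_taps, fb_taps):
    """Direct-form DFE for binary (+1/-1) symbols with fixed taps."""
    ff_buffer = np.zeros(len(ff_taps))    # current and past received samples
    fb_buffer = np.zeros(len(fb_taps))    # past decisions
    decisions = np.zeros(len(received))
    for n, sample in enumerate(received):
        ff_buffer = np.roll(ff_buffer, 1)
        ff_buffer[0] = sample
        # Subtract the ISI that already-decided symbols cause on this one.
        soft = ff_taps @ ff_buffer - fb_taps @ fb_buffer
        decisions[n] = 1.0 if soft >= 0 else -1.0   # detector (slicer)
        fb_buffer = np.roll(fb_buffer, 1)
        fb_buffer[0] = decisions[n]
    return decisions

rng = np.random.default_rng(2)
symbols = rng.choice([-1.0, 1.0], size=1000)
channel = np.array([1.0, 0.5])            # hypothetical channel, one postcursor
received = np.convolve(symbols, channel)[: len(symbols)]
decisions = decision_feedback_equalize(
    received, ff_taps=np.array([1.0]), fb_taps=np.array([0.5]))
```

In this noiseless case the feedback filter removes the postcursor ISI exactly. With noise present, a wrong decision would be fed back and could affect later symbols.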
The goal in designing a communications system is to transmit information to the receiver with
as little deterioration as possible while satisfying the design constraints of allowed signal
bandwidth, transmitted energy and cost. In digital communications systems, the probability
of bit error (Pe), also called the bit error rate (BER), is generally taken to be the measure of
degradation and performance. In analog communications systems, the signal-to-noise ratio
(SNR) at the receiver output is generally the performance criterion. In nonlinear channel
equalization it is important to attain a low mean square error (MSE) and a high convergence
rate in addition to a low BER. Training sequences are also an important factor in the
efficiency of a communications system. They should be as short as possible, which requires
the adaptation process to end in as few iterations as possible.
The application of linear equalizers to nonlinear channels does not yield the desired BER
performance, since they are based on linear system theory and are intended for the
equalization of linear channels. Recently, neural networks and fuzzy technology have evolved
into a powerful tool for the equalization of nonlinear channel distortions.
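The BER and MSE criteria named in this section can be computed directly from transmitted and detected data. The following is a minimal sketch assuming NumPy; the bit patterns and sample values are invented for illustration.

```python
import numpy as np

def bit_error_rate(sent_bits, detected_bits):
    """Fraction of bit positions in which the two sequences differ."""
    sent_bits = np.asarray(sent_bits)
    detected_bits = np.asarray(detected_bits)
    return np.mean(sent_bits != detected_bits)

def mean_square_error(desired, equalized):
    """Average squared magnitude of the difference (real or complex)."""
    diff = np.asarray(desired) - np.asarray(equalized)
    return np.mean(np.abs(diff) ** 2)

sent = np.array([0, 1, 1, 0, 1, 0, 0, 1])
detected = np.array([0, 1, 0, 0, 1, 0, 1, 1])   # two bits differ
ber = bit_error_rate(sent, detected)
mse = mean_square_error([1.0, -1.0], [0.8, -1.1])
```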
1.4 State of Application of Neural Networks and Fuzzy Technologies for Channel
Equalization
1.4.1 Design of neural network based equalizers
Nonlinear equalizers are capable of compensating for both nonlinear and linear channel
distortion. Adaptive nonlinear equalizers based on neural network models have been used
extensively, primarily for noise cancellation in various applications. The multilayer
perceptron (MLP) is one of the neural network structures used in neural network based
equalizers. MLP networks are feedforward neural networks with one or more layers of
neurons, known as hidden neurons, between the input and output neurons.
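The feedforward structure of an MLP can be sketched as a single forward pass through one hidden layer. The sketch assumes NumPy; the layer sizes, the tanh activation, the random weights and the input window are hypothetical and only illustrate the data flow, not a trained equalizer.

```python
import numpy as np

def mlp_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer of tanh neurons followed by a linear output neuron."""
    hidden = np.tanh(w_hidden @ inputs + b_hidden)
    return w_out @ hidden + b_out

rng = np.random.default_rng(3)
num_inputs, num_hidden = 4, 6                     # hypothetical sizes
w_hidden = rng.normal(scale=0.5, size=(num_hidden, num_inputs))
b_hidden = np.zeros(num_hidden)
w_out = rng.normal(scale=0.5, size=(1, num_hidden))
b_out = np.zeros(1)

received_window = np.array([0.9, -1.1, 1.2, 0.8])  # illustrative received samples
estimate = mlp_forward(received_window, w_hidden, b_hidden, w_out, b_out)
```

In an equalizer, the input vector would typically hold a window of received samples and the weights would be adjusted by a learning algorithm.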
Filtering is the process of changing the relative amplitudes of the frequency components in a
signal, or of eliminating some frequency components completely, in a variety of applications
[10]. Assigning k information bits to the M = 2^k possible signal amplitudes, which can be
carried out in a number of ways, is called mapping or transformation. Generally, nonlinear
equalization includes a channel estimator since the channel information is not available at
the receiver end [12]. Filtering comprises two estimation procedures: one is the estimation
of the mapping from the available samples, and the other is the estimation of the output of
the filter from the input by realizing this mapping [11]. The mapping is more difficult to
realize for a nonlinear filter than for a linear filter, and research on realizing the mapping
of nonlinear filters effectively continues.
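One common way of assigning k bits to M = 2^k amplitudes is a Gray-coded mapping, in which neighbouring amplitudes differ in a single bit. The sketch below is illustrative only: the amplitude set of odd integers and the function name are assumptions, not the mapping used in the thesis.

```python
def gray_pam_levels(k):
    """Map each k-bit tuple to one of M = 2**k amplitudes.

    Amplitudes are the odd integers -(M-1), ..., M-1. Labels follow a
    binary-reflected Gray code, so adjacent amplitudes differ in one bit.
    """
    m = 2 ** k
    mapping = {}
    for index in range(m):
        gray = index ^ (index >> 1)
        bits = tuple((gray >> shift) & 1 for shift in range(k - 1, -1, -1))
        mapping[bits] = 2 * index - (m - 1)
    return mapping

two_bit_mapping = gray_pam_levels(2)    # four amplitudes: -3, -1, 1, 3
three_bit_mapping = gray_pam_levels(3)  # eight amplitudes: -7, ..., 7
```

With such a mapping, a decision error between neighbouring amplitudes corrupts only one bit.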
1.4.2 Channel equalization by using fuzzy logic
Adaptive equalizers for nonlinear channels can be developed in a variety of effective ways.
Bayes’ probability theory [13] yields the optimal solution for a symbol-decision equalizer,
which is referred to as the Bayesian equalizer. Symbol-decision equalizers are simpler and
computationally less complex than the MLSE. A channel estimate is not always necessary for
them. They function as inverse filters [14], and algorithms such as recursive least squares
(RLS) or least mean squares (LMS) are employed to adapt the filter. The adaptive filter finds
the channel inverse and, in the presence of noise, provides a linear decision boundary. In
general, an optimal equalizer requires a decision function that is inherently nonlinear. From
this perspective, equalization is usually regarded as a nonlinear classification problem, and
for this reason the performance of linear equalizers falls short of the optimum. This is why
nonlinear equalizers providing a nonlinear decision function have been sought. Nonlinear
equalizers employing artificial neural networks (ANNs) [15], [16], [17] and radial basis
function (RBF) networks [15], [18], [19] were successfully developed. Nonlinear equalizers
using ANN and RBF networks were shown to provide superior performance to linear
equalizers for channels corrupted with ISI and AWGN [20]. The ANN equalizers suffered from
poor convergence. The RBF equalizers provided the localized functional behavior required by
the optimal equalizer, but their centers were difficult to train. These drawbacks motivated
the search for different nonlinear equalization techniques. As a result, a fuzzy equalizer
based on a fuzzy adaptive filter was suggested in [21], and a fuzzy-system-based equalizer
was proposed in [22]. These equalizers were found to perform well, but they could not
realize the Bayesian equalizer decision function, and the equalizer based on the fuzzy
adaptive filter demanded high computational complexity.
Fuzzy logic is based on fuzzy rules that use input-output data pairs of the channel. This
type of adaptive equalizer operates by processing numerical data and linguistic information.
A fuzzy equalizer depends on fuzzy IF-THEN rules, which are determined by human experts.
These rules use the channel’s input-output data pairs and carry out the construction of the
filter for the nonlinear channel. The bit error rate (BER) and adaptation speed can be
improved by the linguistic and numerical information.
Digital communications involving quadrature amplitude modulation (QAM) can apply the
fuzzy filter with both linear and nonlinear channel characteristics, as has been achieved in
this thesis. The present study proposes a complex fuzzy adaptive filter with changeable fuzzy
IF-THEN rules, which is an extension of the real fuzzy filter. The filter inputs and outputs
are all complex valued. However, the inputs of the channels are real-valued counterparts of
the modulated complex transmitted inputs, and the equalizer outputs are real-valued
estimates of these channel input signals. Afterwards, the normalized real-valued equalizer
outputs are denormalized to form the final complex-valued, equalized estimates at the
receiver. This technique, which is based primarily on normalization applied directly at the
transmitter, presents a new method to successfully equalize complex-valued QAM signals
that are severely distorted in both linear and hostile time-varying nonlinear channel
environments, by using real-valued counterparts of the signals in question. In addition to
the methodology, the membership functions derived from the training data set and the
gradient-descent learning algorithm applied to that data set are significant elements of the
nonlinear neuro-fuzzy equalizer that performs this adaptive channel equalization. Its
superiority rests not only on its high equalization performance but also on its capability of
minimizing or eliminating nonlinear channel distortions, which linear equalizers in general
cannot do. In turn, fuzzy logic based neuro-fuzzy equalization is shown to be efficient on a
complex scheme such as QAM, with high approximation ability in nonlinear problems in
addition to linear ones.
A fuzzy adaptive filter is based on a set of fuzzy IF-THEN rules that change adaptively in
order to minimize some criterion function as new information becomes available [35].
A recursive least squares (RLS) adaptation algorithm is used by the fuzzy adaptive filter.
The construction of the RLS fuzzy adaptive filter is accomplished in the following four steps:
1) Defining fuzzy sets in the filter input space U ⊂ R^n whose membership functions
cover U;
2) Constructing a set of fuzzy IF-THEN rules that either human experts determine or the
adaptation procedure determines by matching input-output data pairs;
3) Constructing a filter that is based on the set of rules; and,
4) Updating the filter’s free parameters by utilizing the RLS algorithm.
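The four steps above can be sketched for a one-dimensional input. The sketch assumes NumPy, Gaussian membership functions, a filter output that is a normalized weighted sum of rule outputs, and a noiseless target generated from hypothetical rule outputs. These modelling choices are assumptions made for illustration; the chapter does not specify them.

```python
import numpy as np

def fuzzy_basis(x, centers, width):
    """Normalized Gaussian membership values, one per rule (1-D input)."""
    membership = np.exp(-0.5 * ((x - centers) / width) ** 2)
    return membership / membership.sum()

def rls_update(theta, p_matrix, basis, desired, forgetting=1.0):
    """One recursive least squares update of the free parameters theta."""
    gain = p_matrix @ basis / (forgetting + basis @ p_matrix @ basis)
    error = desired - theta @ basis
    theta = theta + gain * error
    p_matrix = (p_matrix - np.outer(gain, basis @ p_matrix)) / forgetting
    return theta, p_matrix

centers = np.array([-1.0, 0.0, 1.0])      # step 1: fuzzy sets covering the input range
width = 0.5
theta_true = np.array([-1.0, 0.2, 1.0])   # hypothetical rule outputs to be learnt
theta = np.zeros(3)                       # steps 2 and 3: rule outputs start at zero
p_matrix = 1e4 * np.eye(3)

rng = np.random.default_rng(4)
for x in rng.uniform(-1.5, 1.5, size=300):
    basis = fuzzy_basis(x, centers, width)
    desired = theta_true @ basis          # noiseless target, for illustration only
    theta, p_matrix = rls_update(theta, p_matrix, basis, desired)   # step 4
```

Because the filter output is linear in the free parameters, the RLS recursion applies directly. Linguistic knowledge could enter through the initial value of theta instead of zeros.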
The fuzzy adaptive filter’s main advantage is that linguistic information (in the form of
fuzzy IF-THEN rules) and numerical information (in the form of input-output pairs) can be
integrated into the filter uniformly. When the fuzzy adaptive filter is applied to
equalization problems for nonlinear communication channels, the following results
concerning the RLS and LMS algorithms are reached:
1) The RLS algorithm converges faster than the LMS algorithm.
2) Incorporating some linguistic description of the channel, in fuzzy terms, into the
fuzzy adaptive filter greatly enhances the adaptation speed of RLS.
3) The fuzzy equalizer’s bit error rate is very close to the bit error rate of the
optimal equalizer.
4) The excess mean-square error of the RLS algorithm tends towards zero as the
number of iterations approaches infinity.
Development of a neuro-fuzzy system to equalize channel distortion includes the
following steps:
-First, the methodologies utilized to equalize channel distortions are analyzed, and the state
of application of neural and fuzzy technologies to the development of an equalizer is
considered.
-Second, the data transmission structure is explained and the operation structure of adaptive
channel equalization utilizing neuro-fuzzy network is presented.
-Third, the mathematical model of the neuro-fuzzy network for the development of
equalization system for channel distortion is presented. The learning algorithm of neuro-
fuzzy system is considered.
-Fourth, the development of the neuro-fuzzy equalizer for channel distortion is presented.
-Fifth, the QAM signaling is explained and its application on nonlinear neuro-fuzzy network
is presented. The simulation results of the equalizer using QAM signals and analytical tables
demonstrating the performance of the equalizer are presented. Additionally, tables comparing
the different QAM constellations are presented.
1.5 Summary
In this chapter, the application of channel equalization is explained. The types of distortions in
channels and the types of equalizers used to minimize them are explained with their
classifications and properties. Performance criteria of equalizers, namely bit error rate (BER),
signal-to-noise ratio (SNR) and convergence rate with their ideal indications are stated.
Neural networks and fuzzy logic are discussed and explained with their structures
and features. The methods of equalization using neural networks, specifically filtering, are
described. Different types of algorithms, networks and equalizers used especially for difficult
nonlinear channels are defined.
Fuzzy IF-THEN rules which constitute the basis of fuzzy logic are described to point out their
significance in channel equalization. The steps of constructing a fuzzy adaptive filter using
these rules are defined. The methods used in equalizing QAM signals applied on neuro-fuzzy
network and the gradient-descent learning algorithm as part of the equalization system are
described as well.
CHAPTER 2
STRUCTURE OF CHANNEL EQUALIZATION
2.1 Overview
All communications systems are composed of three fundamental subsystems: transmitter,
channel and receiver (Fig. 2.1). The transmitter’s task is to convert the information signal
into a form that is convenient for transmission and to send it through the physical channel
or transmission medium. The receiver’s task, on the other hand, is to produce an accurate
replica of the transmitted symbol sequence by recovering the message signal contained in the
received signal. The communications channel connects the transmitter and the receiver,
carrying the electrical signal from one to the other. The unknown channel characteristics
cause distortions to the transmitted signal before it reaches the receiver.
Figure 2.1 Basic components of a communications system
Digital communications systems are preferred over analog ones because of the
increasing demand for data communication and because digital transmission provides data
processing options and flexibilities that analog transmission cannot offer. The distinguishing
feature of a digital communications system is that it sends a waveform from a finite set of
possible waveforms during a finite interval of time, as opposed to an analog communication
system, which transmits a waveform from an unlimited number of possible waveforms with
theoretically infinite resolution. The message from the source, which is represented by an
information waveform, is encoded before transmission so that transmission errors can be
detected and corrected by the receiver. At the receiver end, the message signal must be
decoded before being used. The distortions preventing the correct transmission of signals are
mainly intersymbol interference (ISI) and noise. Noise refers to unwanted electrical
signals that exist in electrical systems. Channel equalization is an efficient technique
employed to reduce or eliminate the obscuring effect of distortion caused in the channel. This
chapter outlines the structure of the data transmission system and the functions of its main
components, as well as the equalization of channel distortion.
2.2 Architecture of Data Transmission Systems
A communications channel is an electrical medium that connects the transmitter and the
receiver, carrying data from a source that generates the information to one or more
destinations. In the analysis and design of communication systems, the characteristics of
the physical channels through which the information is transmitted are of particular
importance. Wire lines or free space may be used in the communications path from the
transmitter to the receiver. Examples of wire lines are coaxial cables, wire pairs and
optical fibers, which are widely used in terrestrial telephone networks. Infrared and optical
free-space links, as found in video links, remote controls for TV and hi-fi equipment and
some security systems, may be used in other situations. The transmission medium is where
most of the attenuation and noise is observed [23].
The receiver functions to reverse the signal processing steps performed by the transmitter
recovering the original message signal by compensating for any signal deteriorations caused
by the channel. This involves amplification, filtering, demodulation and decoding and in
general is a more complex task than the transmitting process.
There are many reasons why digital communication systems (DCSs) are preferred over analog
systems, even though they represent an increase in complexity over the equivalent analog
systems. The principal advantages that make DCSs the preferred option over analog
communication systems are:
1. The ease with which digital signals, compared with analog signals, are regenerated.
2. Digital systems are not as prone to distortion and interference as analog systems.
3. Increased demand for data transmission.
4. Increased scale of integration, sophistication and reliability of digital electronics for
signal processing, combined with decreased cost.
5. Facility to source code for data compression.
6. Possibility of channel coding (line and error control coding) to minimize the effects of
noise and interference.
7. Ease with which bandwidth, power and time can be traded off in order to optimize the
use of these limited resources.
8. Standardization of signals, irrespective of their type, origin or the services they
support, leading to an integrated services digital network (ISDN)
9. Digital hardware can be implemented more flexibly than analog hardware.
10. Various types of digital signals such as data, telephone, TV and telegraph can be
considered identical signals in transmission and switching [24].
Modulation, which is part of the transmission and equalization process, involves encoding
information from a message source in a form that is convenient for transmission. It is
accomplished by translating a baseband message signal to a bandpass signal at frequencies
that are very high compared with the baseband frequency. It is also referred to as
the mapping of the baseband input information waveform into the bandpass signal. The
bandpass signal is referred to as the modulated signal and the baseband message signal as
the modulating signal. Modulation can be accomplished by varying the
frequency, phase or amplitude of a high-frequency carrier in accordance with the amplitude of
the message signal. Demodulation, on the other hand, is the process of extracting the
baseband message from the carrier so that the intended receiver (also known as the
sink) can process and interpret it. In digital wireless communication systems, the modulating
signal can be represented as a time sequence of pulses or symbols, where each symbol
has m finite states. Each symbol represents n bits of information, where n = log2 m
bits/symbol [4].
The block diagram in Fig. 2.2 describes a communications system. The data source is the
signal generator that produces the information to be transmitted and modulated.
This information is in the form of message symbols, each of which can consist of a single bit
or a group of bits.
To make the transmission more efficient in terms of the time it takes and/or the bandwidth
it requires, an encoder is employed as a signal processor that converts the digital source
information into binary form, i.e. each symbol is encoded as a binary word. Encoding is
also performed to enable the signal processor in the receiver to detect and correct errors,
which minimizes or eliminates the bit errors caused by noise in the channel.
The procedure used for detecting and correcting errors is called coding. Coding involves
adding redundant (extra) bits to the data stream. The redundant bits, such as parity bits, are
used by the decoder to correct errors at the receiver output, although a high
degree of redundancy may increase the bandwidth of the encoded signal. Codes can be
classified into two broad categories: block codes and convolutional codes. The main
difference is that a block coder is a memoryless device, whereas a coder with memory
produces a convolutional code. Hamming codes, Golay codes, Hadamard codes, cyclic
codes, BCH (Bose-Chaudhuri-Hocquenghem) codes and Reed-Solomon codes are some
examples of block codes. In addition to block codes and convolutional codes, a newer family of
codes, called turbo codes, has come into use and is being incorporated in 3G wireless standards.
Turbo codes combine the capabilities of convolutional codes with channel estimation theory
and can be thought of as nested or parallel convolutional codes. When implemented properly,
turbo codes allow coding gains far superior to those of all previous error correcting codes
and permit a wireless communications link to come surprisingly close to the
Shannon capacity bound [4].
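As a concrete illustration of how redundant bits allow the decoder to correct errors, the following is a minimal sketch of a Hamming(7,4) block code. The bit ordering (parity bits at positions 1, 2 and 4) and the function names are illustrative assumptions and are not taken from this study.

```python
# Illustrative Hamming(7,4) block code: 4 data bits plus 3 parity bits.
# Assumed bit layout (positions 1..7): p1 p2 d1 p3 d2 d3 d4.

def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(word):
    """Correct at most one bit error and return the 4 data bits."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    syndrome = s1 + 2 * s2 + 4 * s3   # 1-based error position, 0 if no error
    if syndrome:
        w[syndrome - 1] ^= 1          # flip the erroneous bit
    return [w[2], w[4], w[5], w[6]]


codeword = hamming74_encode([1, 0, 1, 1])
corrupted = list(codeword)
corrupted[4] ^= 1                     # single bit error introduced by the channel
recovered = hamming74_decode(corrupted)
```

The three redundant bits raise the transmitted word length from 4 to 7 bits, which is the bandwidth cost of redundancy mentioned above, and in return any single bit error in the codeword is corrected.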
[Figure 2.2: block diagram. Transmitter: data source, encoder, filter, modulator. Physical
channel with AWGN. Receiver: demodulator, filter, equalizer, decision device, decoder.]
Figure 2.2 Architecture of a digital communications system [39]
Each digital word has n binary digits, so there are $M = 2^n$ possible unique code words,
each corresponding to a certain amplitude level. However, each
sample value of the analog signal can be any one of an infinite number of levels, so
the digital word that represents the amplitude closest to the actual sampled value is
used. This is known as quantizing [2]. In this thesis study, Gray coding was used to map bits
along the in-phase and quadrature axes of the QAM constellation. The
Gray code was selected because only one bit changes for each step change in the
quantized level. Multisymbol signaling can be thought of as a coding or bit mapping process
in which n binary symbols (bits) are mapped into a single M-ary symbol. A detection error in
a single symbol can therefore translate into several errors in the corresponding decoded bit
sequence. The bit error rate (BER) therefore depends not only on the probability of symbol
error and the symbol entropy but also on the code or bit mapping used and on the types of
error that occur. If a Gray code is used to map binary symbols to phasor states, a symbol error
between adjacent states results in only a single decoded bit error [23]. Consequently, single
errors in the receiver cause minimal errors in the recovered level.
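The single-bit-change property of the Gray code can be checked with a short sketch. The choice of 2 bits per axis (as in a 16-point QAM grid) and the helper names are assumptions made for illustration only.

```python
def gray_encode(level):
    """Map a quantized level index to its Gray-coded label."""
    return level ^ (level >> 1)


def gray_decode(code):
    """Recover the level index from a Gray-coded label."""
    level = 0
    while code:
        level ^= code
        code >>= 1
    return level


def hamming_distance(a, b):
    """Number of bit positions in which two labels differ."""
    return bin(a ^ b).count("1")


N_BITS = 2  # assumed bits per axis (in-phase or quadrature)
labels = [gray_encode(level) for level in range(2 ** N_BITS)]

# Bit errors caused by mistaking a level for its neighbour.
gray_errors = [hamming_distance(labels[i], labels[i + 1])
               for i in range(len(labels) - 1)]
binary_errors = [hamming_distance(i, i + 1) for i in range(len(labels) - 1)]
```

With Gray labels every adjacent-level mistake costs one bit, whereas with natural binary labels the step from level 1 (01) to level 2 (10) costs two bits.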
There are many criteria for evaluating the performance of a communications
system. For digital systems, the optimum system is the one that minimizes the bit error rate
(BER) at the receiver output subject to constraints on channel bandwidth and transmitted
energy. This raises the question of whether a system can be devised with no bit errors at the
output even when there is noise in the channel. Shannon
demonstrated in 1948 that a channel capacity C (bits/s) can be calculated such
that, if the rate of information is less than C, the probability of error approaches zero.
The maximum possible bandwidth efficiency $\eta_{B\max}$, which is defined as the
capability of a modulation scheme to accommodate data within a limited bandwidth, is
therefore restricted by the channel noise and is stated by the channel capacity formula in
Eq. 2.1. Shannon's channel capacity formula is applicable to AWGN and is given by

$$\eta_{B\max} = \frac{C}{B} = \log_2\left(1 + \frac{S}{N}\right) \quad \text{or} \quad C = B \log_2\left(1 + \frac{S}{N}\right) \tag{2.1}$$

in which C is the channel capacity (bits per second), B is the transmission bandwidth, S is the
average power of the transmitted signal and N is the average power of the white
Gaussian noise within that bandwidth. S/N is called the signal-to-noise ratio. Shannon also
showed that the errors induced by a noisy channel can be reduced to any desired level by
encoding the information properly, without sacrificing the rate of information transfer.
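Eq. 2.1 can be evaluated numerically as in the sketch below. The bandwidth of 3 kHz and the signal-to-noise ratio of 30 dB are assumed example values, roughly representative of a voice-band channel, and are not figures from this study.

```python
import math


def max_bandwidth_efficiency(snr_linear):
    """Maximum bandwidth efficiency C/B in bits/s/Hz (Eq. 2.1)."""
    return math.log2(1.0 + snr_linear)


def channel_capacity(bandwidth_hz, snr_linear):
    """Shannon capacity C = B * log2(1 + S/N) in bits/s (Eq. 2.1)."""
    return bandwidth_hz * max_bandwidth_efficiency(snr_linear)


snr_db = 30.0                      # assumed signal-to-noise ratio in dB
snr = 10 ** (snr_db / 10)          # S/N must be a linear power ratio
bandwidth = 3000.0                 # assumed bandwidth in Hz
capacity = channel_capacity(bandwidth, snr)
```

For these values the capacity is about 29.9 kbit/s. Note that S/N enters the formula as a linear power ratio, so a value quoted in decibels has to be converted first.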
The physical medium, or channel, through which the message signal is transmitted introduces
distortions such as intersymbol interference (ISI) and noise. The receiver is
responsible for separating the source information from the received modulated signal, which is
corrupted by noise that is usually modeled as random, additive white Gaussian noise (AWGN).
The receiver takes the corrupted signal at the output of the channel and converts it to a
baseband signal that the baseband processor can handle. The baseband processor removes
or reduces this corruption and delivers an estimate of the source information to the output of
the communications system [2]. Demodulation is applied to the received
signal in order to recover the transmitted signal in its baseband form and make it ready to be
processed by the receiver filter. Finally, the decision device reconstructs the encoded
message signal from the output of the equalizer, and the decoder reconstructs the
sequence of transmitted symbols by performing the inverse operation of the encoder.
2.3 Channel Characteristics
A channel must have a frequency band appropriate to its transmission medium. The
transmitter circuit converts the processed baseband signal into this frequency band. If
the channel is a fiber-optic cable, the carrier circuits convert the baseband input to light
frequencies and the transmitted signal is light.
Channels are classified as wire and wireless channels. Examples of wire channels are
coaxial cables, fiber-optic cables, twisted-pair telephone lines and waveguides,
whereas air, vacuum and seawater are examples of wireless channels.
The constraints that a channel introduces may favor a particular type of signaling.
Generally, the channel attenuates the signal, so that the noise of the channel or the noise
produced by an imperfect receiver causes the delivered information to deteriorate from that
of the source [2]. Noise arises from various sources: natural electrical disturbances such
as lightning, or artificial sources such as ignition systems in cars, switching circuits in a
digital computer or high-voltage transmission lines. The channel may contain amplifying
devices, such as satellite transponders in space communication systems or repeaters in
telephone systems, that keep the signal above the noise level. In addition to noise,
the channel may provide multiple paths between its input and output, each with its own
attenuation characteristics and time delay. The attenuation characteristics may vary with
time, which makes the signal fade at the channel output. Fading of this type can be observed
when listening to distant shortwave stations.
Another significant characteristic of channels is bandwidth. In general terms, bandwidth is
defined as the width of a positive frequency band for waveforms whose magnitude spectra
are even about the origin f = 0. The bandwidth of a channel must be enough to accommodate the
signal but reject the noise. High bandwidth allows more users to be assigned as well as more
information to be transmitted. Telephone channels and digital microwave radio channels are
examples of band-limited channels. When the channel is band limited to W Hz,
any frequency components above W are not passed by the channel. In turn, the bandwidth
of the transmitted signal is limited to W Hz as well. When the channel is not ideal over the
band |f| ≤ W, signal transmission at a symbol rate equal to or exceeding W results in
intersymbol interference (ISI) among a number of adjacent symbols. In addition to telephone
channels, there are other physical channels that exhibit some form of time dispersion and thus
introduce ISI. Radio channels such as shortwave ionospheric propagation (HF)
and tropospheric scatter are two examples of time-dispersive channels. In these channels, time
dispersion, and hence ISI, is the consequence of multiple propagation paths that have different
path delays [1]. In addition to noise, multipath propagation and ISI, channels exhibit other
impairments, specifically nonlinear distortion, frequency offset and phase
jitter. Channel impairments affect the transmission rate over the channel and the modulation
technique to be used. Depending on the rate, bandwidth-efficient modulation techniques and
some form of equalization are employed accordingly.
2.4 Channel Distortions
Channels used to transmit data distort signals in both amplitude and phase. In
addition to the nature of the channel itself, linear distortion, nonlinear
distortion and frequency offset are significant contributors to these distortions.
Linear distortion occurs in linear time-invariant systems, in which channels are characterized
as band-limited linear filters. Such channels, for example telephone channels, are part of
digital communications systems in which distortionless transmission is highly desired. A
linear time-invariant system can produce two types of linear distortion: amplitude distortion
and phase distortion. For distortionless transmission through a linear time-invariant
system, the transfer function of the channel must be given by
$$H(f) = \frac{Y(f)}{X(f)} = A e^{-j 2 \pi f T_d} \tag{2.2}$$
which means that in order to have no distortion at the system output, the following
requirements have to be met:
1. Flat amplitude response. That is,
$$|H(f)| = \text{constant} = A \tag{2.3a}$$
2. The phase response that is a linear function of frequency. That is,
$$\theta(f) = \angle H(f) = -2 \pi f T_d \tag{2.3b}$$
When the first condition is satisfied, no amplitude distortion exists and when the second
condition is satisfied, no phase distortion exists. The second requirement is related with the
time delay of the system and it is defined as
$$T_d(f) = -\frac{1}{2 \pi f}\,\theta(f) = -\frac{1}{2 \pi f}\,\angle H(f) \tag{2.4}$$
and it is compulsory that
$$T_d(f) = \text{constant} \tag{2.5}$$
for distortionless transmission. If $T_d(f)$ is not constant, there is phase distortion since the
phase response, $\theta(f)$, is not a linear function of frequency.
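The two conditions can be checked numerically for a given transfer function, as sketched below. The ideal channel uses assumed values A = 0.5 and T_d = 2 ms, and the single-pole low-pass response `rc_channel` is a hypothetical counterexample chosen only because its phase is not linear in frequency.

```python
import cmath
import math

A = 0.5        # assumed constant gain
T_D = 2e-3     # assumed constant delay in seconds
F0 = 100.0     # assumed corner frequency of the counterexample, in Hz


def ideal_channel(f):
    """Distortionless transfer function H(f) = A exp(-j 2 pi f T_d), Eq. 2.2."""
    return A * cmath.exp(-2j * math.pi * f * T_D)


def rc_channel(f):
    """Single-pole low-pass response, used here as a distorting channel."""
    return 1.0 / (1.0 + 1j * f / F0)


def time_delay(channel, f):
    """T_d(f) = -theta(f) / (2 pi f), Eq. 2.4 (valid for f > 0)."""
    return -cmath.phase(channel(f)) / (2 * math.pi * f)


# 10 Hz .. 200 Hz: the phase of both channels stays inside (-pi, pi),
# so no phase unwrapping is needed in this sketch.
freqs = [10.0 * i for i in range(1, 21)]

ideal_gains = [abs(ideal_channel(f)) for f in freqs]
ideal_delays = [time_delay(ideal_channel, f) for f in freqs]
rc_gains = [abs(rc_channel(f)) for f in freqs]
rc_delays = [time_delay(rc_channel, f) for f in freqs]
```

For the ideal channel both the gain and the delay are the same at every frequency, so Eqs. 2.3a and 2.5 hold. For the low-pass channel both quantities vary with frequency, which corresponds to amplitude and phase distortion.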
Nonlinear distortion in telephone channels arises from nonlinearities in the amplifiers and
compandors used in the telephone system. This type of distortion is usually small, but it is
very difficult to correct [1]. The output signal contains nonlinear distortion if the
voltage gain coefficients of second and higher order are not zero. Three types of
nonlinear distortion are associated with amplifiers: harmonic distortion,
intermodulation distortion (IMD) and cross-modulation distortion. Harmonic distortion
occurs at the amplifier output and is caused by first and second order frequencies of the
amplifier output. Intermodulation distortion is produced by the cross-product term of the
amplifier input-output equation, whereas cross-modulation distortion is caused by the third
order distortion products of the amplifier output.
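The way nonzero higher-order gain coefficients create new frequency components can be seen with a small numerical sketch. It assumes a memoryless polynomial amplifier model, v_out = K1 v + K2 v^2 + K3 v^3, with hypothetical coefficient values, and measures individual spectral components with a single-bin discrete Fourier sum.

```python
import cmath
import math

K1, K2, K3 = 1.0, 0.2, 0.1   # assumed voltage gain coefficients
N = 64                       # samples per record


def amplifier(v):
    """Assumed polynomial input-output characteristic of the amplifier."""
    return K1 * v + K2 * v ** 2 + K3 * v ** 3


def tone_amplitude(samples, k):
    """Amplitude of the component with k cycles per record (k >= 1)."""
    n = len(samples)
    acc = sum(v * cmath.exp(-2j * math.pi * k * i / n)
              for i, v in enumerate(samples))
    return 2 * abs(acc) / n


# Single tone with 1 cycle per record: harmonics appear at k = 2 and k = 3.
single = [amplifier(math.cos(2 * math.pi * i / N)) for i in range(N)]
second_harmonic = tone_amplitude(single, 2)   # K2 / 2 for unit amplitude
third_harmonic = tone_amplitude(single, 3)    # K3 / 4 for unit amplitude

# Two tones at 5 and 7 cycles per record: products appear at new frequencies.
two_tone = [amplifier(math.cos(2 * math.pi * 5 * i / N)
                      + math.cos(2 * math.pi * 7 * i / N)) for i in range(N)]
difference_product = tone_amplitude(two_tone, 7 - 5)       # second-order cross product
third_order_product = tone_amplitude(two_tone, 2 * 5 - 7)  # third-order product
```

With K2 = K3 = 0 all four measured amplitudes would be zero, which matches the statement that distortion is present only when the coefficients of second and higher order are nonzero.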
In addition to linear and nonlinear distortion, signals transmitted through telephone channels
are subject to frequency offset. A small frequency offset, usually
less than 5 Hz, results from the use of carrier equipment in the telephone channel. High-speed
digital transmission systems that use synchronous phase-coherent demodulation cannot
tolerate this type of offset, so it is compensated for by the carrier recovery loop in the
demodulator.
Phase jitter is basically a low-index frequency modulation of the transmitted signal by the
low-frequency harmonics of the power line frequency. Phase jitter poses a serious problem in
high-rate digital transmission. Yet it can be tracked and compensated for, to some extent,
at the demodulator.
Distortion can occur within the transmitter, the receiver and the channel. As opposed to noise
and interference, distortion disappears when the signal is turned off.
2.4.1 Multipath propagation
Multipath fading occurs to varying extents in many different radio applications. It is caused
whenever radio energy reaches the receiver by more than one path. Multiple paths may
occur due to ground reflections, reflections from stable tropospheric layers and refraction by
tropospheric layers with extreme refractive index gradients [23]. Scattering obstacles also
cause multipath propagation in other systems, such as urban cellular radio systems.
Multipath propagation has two principal effects on systems, whose relative severity
depends essentially on the bandwidth of the resulting channel relative to that of
the transmitted signal. For fixed point systems such as the microwave radio relay network,
the fading process is governed by changes in atmospheric conditions. Often the path delay
spread is short enough for the channel frequency response to be essentially constant
over the operating bandwidth. In that case, the fading is said to be flat because all
frequency components of the signal are subject to the same fade at any given instant. If the
path delay spread is longer, the channel frequency response may change rapidly on a
frequency scale comparable with the signal bandwidth. In that case, the fading is
said to be frequency selective and the received signal is subject to severe amplitude and
phase distortion. Adaptive equalizers may then be required to flatten and linearize the overall
channel characteristics. The effects of flat fading can be combated by increasing the
transmitter power, whereas the effects of frequency selective fading cannot. For microwave
links subject to flat fading, a fade margin is usually designed into the link budget to offset
the expected multipath fades. The magnitude of this margin depends on the required
availability of the link.
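The distinction between flat and frequency selective fading can be illustrated with a two-path channel sketch. The model H(f) = 1 + a exp(-j 2 pi f tau), the relative path gain a = 0.9, the 1 MHz signal bandwidth and the two delay values are all assumptions chosen for illustration; the phase of the second path at band centre is arbitrarily set to zero.

```python
import cmath
import math


def two_path_gain(f, delay_s, a=0.9):
    """|H(f)| for a direct path plus one echo of relative amplitude a."""
    return abs(1 + a * cmath.exp(-2j * math.pi * f * delay_s))


def gain_spread_db(bandwidth_hz, delay_s, points=201):
    """Difference between the largest and smallest gain across the band, in dB."""
    freqs = [-bandwidth_hz / 2 + bandwidth_hz * i / (points - 1)
             for i in range(points)]
    gains_db = [20 * math.log10(two_path_gain(f, delay_s)) for f in freqs]
    return max(gains_db) - min(gains_db)


BANDWIDTH = 1e6                                    # assumed signal bandwidth in Hz
flat_spread = gain_spread_db(BANDWIDTH, 10e-9)     # delay much shorter than 1/B
selective_spread = gain_spread_db(BANDWIDTH, 1e-6) # delay comparable with 1/B
```

With the short delay the gain varies by a small fraction of a decibel across the band (flat fading), while with the long delay it varies by more than 20 dB, which is the situation in which an equalizer is needed.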
In time-dispersive channels, multiple propagation paths with different path delays cause time
dispersion and ISI. These channels are called time-variant multipath
channels because the number of paths and the relative time delays among them vary with
time. The time-variant multipath conditions give rise to a wide variety of frequency response
characteristics, so the frequency response characterization used for telephone channels is
inappropriate for time-variant multipath channels. Instead, these radio channels are
characterized statistically by the scattering function, a two-dimensional
representation of the average received signal power as a function of relative time delay and
Doppler frequency.
2.4.2 Intersymbol interference
Rectangular pulse signaling has, in principle, a spectral efficiency of 0 bits/s/Hz since each
rectangular pulse has infinite absolute bandwidth. In practice, of course, rectangular pulses
can be transmitted over channels with finite bandwidth if a degree of distortion can be
tolerated.
In digital communications, it might appear that distortion is unimportant since a receiver need
only distinguish between pulses that have been distorted in the same way. However, if the
pulses are filtered improperly as they pass through a communications system, i.e. if the
distortion is severe enough, they spread in time. The decision instant voltage might then arise
not only from the current symbol but also from one or more preceding pulses. Intersymbol
interference (ISI) is caused when the pulse for each symbol is smeared into adjacent time
slots. With a restricted bandwidth, the pulses also have rounded tops instead of flat ones.
What is important about ISI is the decision instant. The decision instant can be defined as the
sampling instant (or sampling point) at which each time slot of the transmitted or received
waveform begins. It is at this point that ISI matters, because the smearing of the pulses
causes unwanted contributions from the adjacent pulses that are likely to
degrade bit error rate (BER) performance. This highlights an important point: the
performance of digital communications systems is related only to ISI at the decision
instants. ISI that occurs at times other than decision instants does not matter [23].
If the signal pulses could be made to pass through zero at
every decision instant except one, then ISI would no longer be a problem. This suggests a
definition of an ISI-free signal: a signal that passes through zero at all but
one of the sampling instants is an ISI-free signal [23].
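A familiar pulse that satisfies this definition is the sinc pulse, which is used in the sketch below purely as an assumed example (it is not a pulse shape taken from this study). The code samples the pulse at multiples of the symbol period T and at one point between sampling instants.

```python
import math


def sinc_pulse(t, symbol_period):
    """p(t) = sin(pi t / T) / (pi t / T), with p(0) = 1."""
    if t == 0:
        return 1.0
    x = math.pi * t / symbol_period
    return math.sin(x) / x


T = 1.0  # assumed symbol period

# Values at the sampling instants t = nT for n = -3 .. 3.
at_instants = {n: sinc_pulse(n * T, T) for n in range(-3, 4)}

# Value halfway between two sampling instants, where the pulse is not zero.
between_instants = sinc_pulse(1.5 * T, T)
```

The pulse is nonzero between sampling instants, which is harmless, because only the values at the decision instants contribute to the decision.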
When information is transmitted with pulses over an analog channel, the original signal is a
discrete time sequence (or an acceptable approximation), whereas the received signal is a
continuous time signal. The channel can be regarded as an analog low-pass filter that smears
or spreads the impulse train into a continuous signal whose peaks are related
to the amplitudes of the original pulses. Mathematically, the operation is the convolution of
the pulse sequence with the continuous time channel response. The starting point is the
convolution integral

$$x(k) = h(k) * s(k) = \int_{-\infty}^{\infty} h(\tau)\, s(k-\tau)\, d\tau = \int_{-\infty}^{\infty} s(\tau)\, h(k-\tau)\, d\tau \tag{2.6}$$

where x(k) denotes the received signal, h(k) denotes the channel impulse response and s(k)
denotes the input signal. The second integral on the right-hand side follows from the
commutativity property of the convolution operation.
The input s(k) is a pulse train made up of periodically transmitted impulses
of varying amplitudes, so that

$$s(k) = \begin{cases} 0, & k \neq nT \\ s_n, & k = nT \end{cases} \tag{2.7}$$

where T is the symbol period. This means that the only values of the variable
of integration that contribute to the integral in Eq. 2.6 are those for which $\tau = nT$; any
other value amounts to multiplication by 0. For that reason, x(k) can be stated as

$$x(k) = \sum_{n} s_n\, h(k - nT) \tag{2.8}$$
Eq. 2.8 resembles the convolution sum, but it nevertheless describes a continuous time
system. It shows that the received signal is the sum of a large number of shifted and scaled
impulse responses of the continuous time system, where the amplitudes of the transmitted
pulses scale the impulse responses.
At the decision instant of the Nth symbol, k = NT, Eq. 2.8 can be written as
$x(NT) = s_N h(0) + \sum_{n \neq N} s_n h((N-n)T)$. The first term is the component of x(k) due
to the Nth symbol, which is multiplied by the centre tap of the channel impulse response. The
other product terms in the summation are the ISI terms: the input pulses in the neighborhood
of the Nth symbol are scaled by the corresponding samples in the tails of the channel impulse
response.
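The split of a received sample into a desired term and ISI terms can be sketched as follows. The tap values h(mT) and the symbol sequence are hypothetical numbers chosen for illustration.

```python
# Assumed samples h(mT) of the channel impulse response, indexed by m.
# m = 0 is the centre tap; the other entries are the tails.
H_TAPS = {-1: 0.1, 0: 1.0, 1: 0.3, 2: -0.1}


def received_sample(symbols, taps, index):
    """Split x(NT) = sum_n s_n h((N - n)T) into desired and ISI parts (Eq. 2.8)."""
    desired = symbols[index] * taps.get(0, 0.0)
    isi = sum(s * taps.get(index - n, 0.0)
              for n, s in enumerate(symbols) if n != index)
    return desired, isi


symbols = [1, -1, 1, 1, -1]            # assumed transmitted amplitudes s_n
desired, isi = received_sample(symbols, H_TAPS, 2)
sample = desired + isi
```

Here the symbol s_2 = 1 contributes 1.0 through the centre tap, while its neighbours contribute a total of -0.3, so the sample seen by the decision device is 0.7 instead of 1.0. The closer the ISI brings the sample to the decision threshold, the less noise is needed to cause an error.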
2.4.3 Noise
In communications systems, the received waveform is usually divided into the desired part,
which contains the information, and the extraneous or unwanted part. The desired part is the
signal and the unwanted part is the noise. Noise limits our ability to communicate and
increases the power needed to transmit information. The effects of noise can be reduced by
increasing the power of the transmitted signal. However, equipment constraints and
other practical limitations restrict the power level of the transmitted signal.
The most frequently encountered problem in the transmission of signals through any channel
is additive noise, which is generally generated internally at the receiver by components such
as the resistors and solid-state devices used to implement the
communications system. This is sometimes referred to as thermal noise. Thermal noise is
produced by the random motion of free charge carriers (usually electrons) in a resistive
medium. In a storage system, additive noise generated by the electronic components is likewise
present in the readback signal, as in the case of a radio or telephone communication system.
When such noise occupies the same frequency band as the desired signal, its effect can be
minimized by suitable design of the transmitted signal and of its demodulator at the
receiver [23].
Another problem in transmission is non-thermal noise, also known as shot noise.
Although the time-averaged current flowing in a device may be constant, statistical
fluctuations are present if individual charge carriers have to pass through a potential
barrier. The potential barrier may, for example, be the junction of a PN junction diode, the
cathode of a vacuum tube or the emitter-base junction of a bipolar transistor. Such statistical
fluctuations constitute shot noise.
Noise that arises from external sources can be coupled into a communication system by the
receiving antenna. Antenna noise originates from several different sources. Below 30 MHz it
is dominated by the broadband radiation produced in lightning discharges associated with
thunderstorms. This radiation is trapped by the ionosphere and propagates worldwide.
Such noise is sometimes referred to as atmospheric noise.
Noise can be classified into the following categories:
a. White noise: A stochastic process that has a flat power spectral density over the
entire frequency range. Because of its wideband character, this type of noise cannot be
expressed in terms of quadrature components. In problems concerned with the
demodulation of narrowband signals in noise, however, it is mathematically convenient
to model the additive noise process as white and to represent the noise in terms of
quadrature components. This can be done by postulating that
the signals and noise at the receiver have passed through an ideal bandpass filter
whose passband includes the spectrum of the signals but is much wider. The
noise that results from passing the white noise process through a spectrally flat
bandpass filter is referred to as bandpass white noise.
b. Electromagnetic noise: Usually found in electrical devices such as television and radio
transmitters and receivers. It can be present at all frequencies.
c. Impulse noise: An additive disturbance that arises primarily from the switching
equipment in the telephone system. It is made up of short pulses of
random duration and amplitude.
d. Acoustic noise: Present in almost all conversations and a limitation in telecommunications
environments such as telephone circuits and hands-free telephones. It may be
unnoticeable or distinct, depending on the time delay involved. If the delay between
the speech and its echo (noise) is short, the noise is not heard separately but is perceived
as a form of spectral distortion referred to as reverberation. If, however, the delay exceeds
a few tens of milliseconds, the noise is distinctly noticeable [25]. Background noise
generated in a car cabin, by air conditioners and by computer fans are some types of
acoustic noise.
e. Processing noise: Modeled as a zero-mean, white-noise process in data
communication systems. It is the result of digital/analog processing of signals, e.g. lost
data packets in digital data communications systems or quantization noise in digital
coding of image or speech.
f. Colored noise: A Gaussian type of noise that belongs to the wideband signal processes
with a non-constant spectrum. Autoregressive noise and brown noise are some examples
of non-white, colored noise.
Gaussian noise, and specifically additive white Gaussian noise (AWGN), is the most
frequently encountered type of noise in communication systems and gives the simplest
mathematical model for a communication channel. The following subsections describe channel
models that capture the effects of noise on electrical communication and the most important
characteristics of transmission channels.
2.4.3.1 The additive noise channel
Contaminating noise in signal transmission usually has an additive effect, in the sense that
noise adds to the information-bearing signal at various points between the source and the
destination. In the additive noise channel, whose mathematical model is shown in Fig. 2.3,
the transmitted signal s(k) is corrupted by a random additive noise process n(k). The additive
noise is white when the random process has a power spectral density (PSD) that is constant
over all frequencies, and when the noise also has a Gaussian distribution the result is the
most commonly assumed model, additive white Gaussian noise
(AWGN). AWGN has a uniform continuous
frequency spectrum over the frequency band of interest, and this model is applied to the
majority of physical communication channels because it is mathematically tractable.
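[Figure 2.3: the channel input s(k) is summed with the noise n(k) inside the channel to give
the output x(k) = s(k) + n(k).]
Figure 2.3 The additive Gaussian noise channel [39]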
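The additive noise channel of Fig. 2.3 can be simulated in a few lines. In this sketch the binary input levels of +1 and -1, the noise standard deviation of 0.5 and the fixed random seed are assumptions made for illustration.

```python
import random


def awgn_channel(signal, noise_std, rng):
    """Return x(k) = s(k) + n(k) with zero-mean Gaussian noise samples n(k)."""
    return [s + rng.gauss(0.0, noise_std) for s in signal]


rng = random.Random(0)                 # fixed seed so the run is repeatable
NOISE_STD = 0.5                        # assumed noise standard deviation

# Assumed binary antipodal input sequence s(k).
s = [1.0 if rng.random() < 0.5 else -1.0 for _ in range(10000)]
x = awgn_channel(s, NOISE_STD, rng)

# The noise is recovered here only because the simulation knows s(k).
noise = [xk - sk for xk, sk in zip(x, s)]
noise_power = sum(n * n for n in noise) / len(noise)
signal_power = sum(sk * sk for sk in s) / len(s)
snr_estimate = signal_power / noise_power
```

The measured noise power is close to the variance 0.25 and the signal-to-noise ratio is close to 4, i.e. about 6 dB. Independent samples drawn in this way have a flat power spectral density on average, which is the discrete-time counterpart of white noise.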
2.4.3.2 The linear filter channel
Filtering is an operation that extracts information about a quantity of interest
at time t from noisy data measured up to and including time t. A filter is said to be
linear if the filtered, smoothed or predicted quantity at the filter output is a linear
function of the observations applied to the filter input [25].
Linear filter channels are those in which filtering ensures that the transmitted signals remain
within specified bandwidth limits and do not interfere with each other. The mathematical
model, including the additive noise, is illustrated in Fig. 2.4, in which s(k) is the channel
input and the channel output is represented as
$$x(k) = s(k) * h(k) + n(k) = \int_{-\infty}^{\infty} h(\tau)\, s(k-\tau)\, d\tau + n(k) \tag{2.9}$$
in which h(τ) is the linear filter impulse response and * denotes convolution.
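[Figure 2.4: the channel input s(k) passes through the linear filter h(k) and is summed with
the noise n(k) inside the channel to give the output x(k) = s(k) * h(k) + n(k).]
Figure 2.4 Linear filter channel with additive noise [39]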
When the channel only attenuates the signal during transmission, the received signal becomes
$$x(k) = \alpha s(k) + n(k) \tag{2.10}$$
where α is the attenuation factor.
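A discrete-time counterpart of Eq. 2.9 replaces the integral with a sum over the taps of a sampled impulse response. The sketch below assumes a causal, finite-length response `h` and takes the noise samples as an input list so that the result is deterministic; the tap values are hypothetical.

```python
def linear_filter_channel(s, h, noise):
    """x(k) = sum_m h(m) s(k - m) + n(k), a sampled version of Eq. 2.9."""
    x = []
    for k in range(len(s)):
        acc = sum(h[m] * s[k - m] for m in range(len(h)) if k - m >= 0)
        x.append(acc + noise[k])
    return x


# A single input pulse is spread over three samples by the channel.
pulse_response = linear_filter_channel([1.0, 0.0, 0.0, 0.0],
                                       [1.0, 0.5, 0.2],
                                       [0.0, 0.0, 0.0, 0.0])

# A one-tap channel h = [alpha] reduces to Eq. 2.10: x(k) = alpha s(k) + n(k).
ALPHA = 0.5
attenuated = linear_filter_channel([1.0, -1.0], [ALPHA], [0.1, 0.1])
```

The first call shows the channel spreading one transmitted pulse over neighbouring samples, which is the mechanism behind ISI. The second call shows that pure attenuation is the special case of a filter with a single tap.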
2.4.3.3 The linear time-variant filter channel
Mobile systems, such as a receiver in a moving vehicle, and wireless channels such as radio
channels give rise to multipath propagation and hence time-varying fading signals, because
their frequency response characteristics are time-variant. The time-varying characteristics of
the mobile channel call for a channel equalizer that continuously adapts to them,
effectively implementing a filter matched to the channel. Such time-variant linear filters are
characterized by a time-variant channel impulse response h(τ;k), which is the response of the
channel at time k to an impulse applied at time k-τ, where τ stands for the elapsed-
time variable. For the input s(k), the output of the linear time-variant filter channel with
additive noise becomes
$$x(k) = s(k) * h(\tau;k) + n(k) = \int_{-\infty}^{\infty} h(\tau;k)\, s(k-\tau)\, d\tau + n(k) \tag{2.11}$$
in which the time-variant impulse response has the following representation
$$h(\tau;k) = \sum_{n=1}^{L} a_n(k)\, \delta(\tau - \tau_n) \tag{2.12}$$
where $\{a_n(k)\}$ denotes the possibly time-variant attenuation factors for the L multipath
propagation paths. Substituting Eq. 2.12 into Eq. 2.11 gives the received signal
$$x(k) = \sum_{n=1}^{L} a_n(k)\, s(k - \tau_n) + n(k) \tag{2.13}$$
where each of the L multipath components is attenuated by $a_n(k)$ and delayed by $\tau_n$.
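The multipath model above can be sketched in discrete time as follows. The delays are assumed to be whole numbers of samples, and the two paths, their gain functions and the input sequence are hypothetical values chosen so that the output is easy to follow.

```python
import math


def multipath_channel(s, delays, gains, noise):
    """x(k) = sum_n a_n(k) s(k - tau_n) + n(k), with integer sample delays tau_n."""
    x = []
    for k in range(len(s)):
        acc = 0.0
        for tau, gain in zip(delays, gains):
            if k - tau >= 0:
                acc += gain(k) * s[k - tau]
        x.append(acc + noise[k])
    return x


delays = [0, 2]                              # assumed path delays in samples
gains = [lambda k: 1.0,                      # direct path with constant gain
         lambda k: 0.5 * math.cos(0.1 * k)]  # echo whose gain drifts with time

s = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]           # two identical input pulses
x = multipath_channel(s, delays, gains, [0.0] * len(s))
```

The two input pulses are identical, but their echoes arrive with different amplitudes (about 0.49 and 0.44) because the gain of the second path has changed in the meantime. An equalizer designed for the first echo would therefore be mismatched for the second, which is why the equalizer has to adapt continuously.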
These three mathematical models adequately characterize the large majority of physical
channels, and communication systems are analyzed and designed on the basis of these three
channel models.
2.5 Summary
This chapter has outlined the structure of a channel equalization system. The factors that
cause distortion in the channel, and their properties, have been explained and discussed in
detail. The types of noise and interference have been described, together with their effects
on the channel and the ways of removing them from the channel.
The types of channels used within a data transmission system have been discussed.
Mathematical models representing the various types of channels have been outlined and
described. The formulas relating the input, impulse response and output of
the channel have been explained alongside the characteristics of each channel type.
CHAPTER 3
MATHEMATICAL BACKGROUND OF A NEURO-FUZZY EQUALIZER
3.1 Overview
When the channel distortion in a communications application is too severe for linear
equalizers to handle, nonlinear equalizers are employed instead. A linear equalizer
does not perform well on channels whose amplitude characteristics contain
deep spectral nulls or on channels with nonlinear distortion. In attempting to compensate
for the channel distortion, the linear equalizer places a large gain in the vicinity of the
spectral null and consequently enhances the additive noise present in the received signal.
Neural networks can be considered mathematical models of brain and mind activities. Their
main purpose is to organize numerous simple processing
elements into layers in order to accomplish more sophisticated tasks. High computation
rates, a strong capability for nonlinear problems, massive parallelism and continuous
adaptation are among the properties of neural networks. These features make neural networks
desirable tools for many kinds of applications [28]. Neural networks have been proposed for
equalization problems because of these attractive properties and their nonlinear capability.
On the other hand, neural networks have some weaknesses related to their individual
models. Their computational power is low and their learning capability is limited. Fuzzy
systems have therefore been considered as a way to compensate for these weaknesses, with
their capability of reaching conclusions logically on a more advanced (linguistic or semantic)
level.
This chapter describes the synthesis of fuzzy logic with neural networks, and the structure and
operation algorithms of the neuro-fuzzy system that forms the basis for channel equalization
of QAM signals.
3.2 Neuro-Fuzzy System
Whereas classical control is rooted in the theory of linear differential equations, intelligent
control is largely rule based, because the dependencies involved in its deployment are much
too complex to permit an analytical representation. To tackle such dependencies, it is
expedient to use the mathematics of fuzzy systems and neural networks. The power of fuzzy
systems lies in their ability to quantify linguistic inputs and to quickly provide a working
approximation of the complex and frequently unknown input-output rules of a system.
The power of neural networks is in their ability to learn from data. It’s possible to combine
neural networks and fuzzy logic in a number of ways and both have advantages that provide
flexibility and effectiveness when combined. Fuzzy adaptive filters are effective because of
their data approximation ability in nonlinear problems and therefore are widely used in signal
processing problems. Fuzzy logic equalizers usually require fewer training samples than
conventional equalizers, especially for linear channels. They are capable of yielding better
error performance and also perform better in the presence of channel nonlinearities [29].
Neural networks supply algorithms for numeric classification, optimization and associative
storage. When fuzzy logic and neural networks are integrated, the resulting neuro-fuzzy
system can be trained in a shorter time because the network has fewer nodes. There is a
natural synergy between neural networks and fuzzy systems that makes their hybridization a
powerful tool for intelligent control and other applications.
3.3 Fuzzy Inference Systems
3.3.1 Architecture of fuzzy inference systems
Fuzzy Inference Systems (FIS) are one of the best known applications of fuzzy set theory
and fuzzy logic. They are used for classification tasks, process control, offline process
simulation and diagnosis, and online decision support. The power of a FIS lies in its twofold
identity: it can handle linguistic concepts, and it is a universal approximator capable of
performing nonlinear mappings between inputs and outputs.
FIS are often used for process simulation or control. They can be designed either from expert
knowledge or from data. A FIS based solely on expert knowledge may suffer from a loss of
accuracy for complex systems, which is the main motivation for using fuzzy rules inferred
from data [30].
A fuzzy inference system comprises the functional blocks explained below (Figure 3.1).
- Determining a set of fuzzy IF-THEN rules. Fuzzy rules are composed of linguistic
statements which describe the way the FIS makes a decision about the classification of
an input or the controlling of an output.
- Fuzzifying the inputs, which involves transforming the crisp inputs into degrees to
match with linguistic values, using the input membership functions defined by a data
base.
- Combining the fuzzified inputs in accordance with the fuzzy rules to set up a rule
strength (also called weight or fire strength).
- Determining the rule’s consequence by putting together the rule strength and the
membership function of the output.
- Combining the consequences so as to obtain an output distribution.
- Defuzzification of the output distribution, which involves transforming the fuzzy result
of the inference into a crisp output.
The operations upon fuzzy IF-THEN rules are explained in the steps below:
1. Mapping the inputs to membership values of each linguistic label, utilizing a set of
input membership functions on the premise part (fuzzification process).
2. Computation of the rule strength by combining the fuzzified inputs (the membership
values) through fuzzy combination. (The fuzzy “and” operators used in forming a rule
are referred to as T-norms; “or” is realized by a T-conorm and “not”, where it is
needed, by a complement.)
3. Generating the qualified fuzzy or crisp consequent of each rule according to the rule
strength.
4. Combining the entirety of the fuzzy rule outputs to attain one fuzzy output distribution
and then aggregating the qualified consequent to produce a single crisp output
(Defuzzification of output distribution).
Figure 3.1 Structure of fuzzy inference system [39]
3.3.2 Rule base fuzzy if-then rule
The fuzzy knowledge base that includes a set of fuzzy IF-THEN rules forms one of the basic
blocks of a fuzzy system. The following is the form of expression that represents fuzzy IF-
THEN rules or fuzzy conditional statements [37].
If u is A Then y is B (3.1)
where u and y represent the input and output linguistic variables, A and B represent the labels
of the fuzzy sets characterized by appropriate membership functions. A denotes the premise
and B denotes the consequent part of the rule.
There are many forms of IF-THEN rules, among which the Single-Input Single-Output
(SISO) form given by statement (3.1) is the simplest. The Multi-Input Single-Output (MISO)
form of statement (3.2) and the multi-output form of statement (3.3) are other examples.
If $u_1$ is $A_1^j$ and $u_2$ is $A_2^k$ and, …, and $u_n$ is $A_n^l$ Then $y_q$ is $B_q^p$ (3.2)
If $u_1$ is $A_1^j$ and $u_2$ is $A_2^k$ and, …, and $u_n$ is $A_n^l$ Then $y_1$ is $B_1^r$ and $y_2$ is $B_2^s$ (3.3)
The membership functions describe the fuzzy values A and B and Figure 3.2 demonstrates the
most widely used types of membership functions with their shapes.
Figure 3.2 Examples of membership functions (a) bell, (b) triangular, (c) trapezoidal
The following exponential function is one representation of a decision function that produces
a bell curve.
$\mu(x) = \exp\left(-\dfrac{(x - x_0)^2}{2\sigma^2}\right)$ (3.4)
where x is the independent variable on the universe, x0 denotes the position of the peak
relative to the universe and σ denotes the standard deviation.
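The expressions (3.5) and (3.6) represent triangular and trapezoidal membership functions,
respectively.
$\mu(x) = \begin{cases} 1 - \dfrac{x_0 - x}{x_0 - x_l}, & x_l < x < x_0 \\ 1 - \dfrac{x - x_0}{x_r - x_0}, & x_0 < x < x_r \end{cases}$ (3.5)
$\mu(x) = \begin{cases} 1 - \dfrac{x_{l2} - x}{x_{l2} - x_{l1}}, & x_{l1} < x < x_{l2} \\ 1, & x_{l2} < x < x_{r1} \\ 1 - \dfrac{x - x_{r1}}{x_{r2} - x_{r1}}, & x_{r1} < x < x_{r2} \end{cases}$ (3.6)
Here $x_l$ and $x_r$ are the left and right end points of the triangle and $x_0$ is its peak; for
the trapezoid, $x_{l1}$ and $x_{r2}$ are the outer end points and $x_{l2}$, $x_{r1}$ bound the
flat top. Outside the stated intervals the membership is zero.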
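The three membership function shapes of Figure 3.2 can be sketched in a few lines of code. This is a minimal illustration rather than the implementation used in this thesis; the parameter names (`xl`, `x0`, `xr` for the triangle, `xl1`, `xl2`, `xr1`, `xr2` for the trapezoid) are assumptions introduced here for the end points, peak and flat top.

```python
import math

def bell(x, x0, sigma):
    """Bell (Gaussian) membership, eq. (3.4): peak at x0, spread sigma."""
    return math.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))

def triangular(x, xl, x0, xr):
    """Triangular membership, eq. (3.5): feet at xl and xr, peak at x0."""
    if xl < x <= x0:
        return 1.0 - (x0 - x) / (x0 - xl)
    if x0 < x < xr:
        return 1.0 - (x - x0) / (xr - x0)
    return 0.0  # outside the support

def trapezoidal(x, xl1, xl2, xr1, xr2):
    """Trapezoidal membership, eq. (3.6): rises on (xl1, xl2), flat on [xl2, xr1]."""
    if xl1 < x < xl2:
        return 1.0 - (xl2 - x) / (xl2 - xl1)
    if xl2 <= x <= xr1:
        return 1.0
    if xr1 < x < xr2:
        return 1.0 - (x - xr1) / (xr2 - xr1)
    return 0.0  # outside the support

for x in (0.5, 1.0, 2.5):
    print(x, bell(x, 1.0, 0.5), triangular(x, 0.0, 1.0, 2.0), trapezoidal(x, 0.0, 1.0, 2.0, 3.0))
```

All three functions return a degree of membership between 0 and 1, which is what the fuzzification step consumes.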
Rules whose consequent part is a mathematical function of the input variables are called
Takagi and Sugeno fuzzy rules. They have the following form:
If $A_1(x_1), A_2(x_2), \ldots, A_n(x_n)$ then $Y = f(x_1, x_2, \ldots, x_n)$ (3.7)
where the premise part is fuzzy and the function $f$ in the consequent part is usually a linear or
quadratic mathematical function. In the linear case,
$f = a_0 + a_1 x_1 + a_2 x_2 + \ldots + a_n x_n$ (3.8)
Fuzzy IF-THEN rules are widely used in modeling. They are considered the local description
of the system being designed and form the basics of the fuzzy inference system (FIS).
Fuzzification: The aim of fuzzification is mapping the crisp input into a fuzzy set. This input
can be from a set of sensors or features of those sensors like amplitude or frequency, and is
mapped into fuzzy numbers of values from 0 to 1, using a set of input membership functions.
The numeric inputs $u_i \in U_i$ are converted into fuzzy sets by the fuzzification process for
the fuzzy system to use.
Let $U_i^*$ represent the set of all possible fuzzy sets which can be defined on $U_i$. Given
$u_i \in U_i$, $u_i$ is transformed to a fuzzy set denoted by $\hat{A}_i^{fuzz}$ that is defined on
the universe of discourse $U_i$. The fuzzification operator $F$ that produces this transformation
is defined by
$F: U_i \rightarrow U_i^*$
where
$F(u_i) = \hat{A}_i^{fuzz}$.
Frequently, “singleton fuzzification” is used. It produces a fuzzy set
$\hat{A}_i^{fuzz} \in U_i^*$ with a membership function given by
$\mu_{\hat{A}_i^{fuzz}}(x) = \begin{cases} 1, & x = u_i \\ 0, & \text{otherwise} \end{cases}$
Any fuzzy set with this form of membership function is termed a “singleton”. In singleton
fuzzification the input fuzzy set has only a single point of nonzero membership, and the
number $u_i$ is represented by the singleton fuzzy set. Singleton fuzzification is used in
implementations where $u_i$ is taken to be exactly its measured value, with no noise assumed
on the measurement. Other common choices are “Gaussian fuzzification”, which places a
bell type membership function about the input point, and triangular fuzzification, which uses
a triangular shape [38].
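The difference between singleton and Gaussian fuzzification can be made concrete with a short sketch. Both functions below take a measured crisp value `u` and return the membership function of the resulting input fuzzy set; the function names and the spread parameter `sigma` are assumptions made for illustration.

```python
import math

def singleton_fuzzify(u):
    """Return the singleton fuzzy set for the crisp input u: 1 at u, 0 elsewhere."""
    def membership(x):
        return 1.0 if x == u else 0.0
    return membership

def gaussian_fuzzify(u, sigma):
    """Return a bell-shaped fuzzy set centered on u (sigma is a hypothetical spread)."""
    def membership(x):
        return math.exp(-((x - u) ** 2) / (2.0 * sigma ** 2))
    return membership

measured = 2.0
crisp_set = singleton_fuzzify(measured)
noisy_set = gaussian_fuzzify(measured, sigma=0.5)
for x in (1.5, 2.0, 2.5):
    print(x, crisp_set(x), noisy_set(x))
```

The singleton assigns nonzero membership only to the measured value itself, whereas the Gaussian set also gives partial membership to nearby points, which is one way of expressing uncertainty in the measurement.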
3.3.3 Fuzzy inference mechanism
Designing a fuzzy inference system (FIS) from data can be separated into two principal
stages: (1) automatic rule generation and (2) system optimization [30]. Rule generation leads
to a basic system with a given space partitioning and the corresponding set of rules. System
optimization is carried out at several levels. Variable selection can be global or handled rule
by rule. The goal of rule base optimization is to choose the most efficient rules and to make
the best use of the rule conclusions. Space partitioning can be improved by adding or
removing fuzzy sets and by tuning the membership function parameters. Structure
optimization is of great significance: it covers choosing variables, reducing the rule base and
optimizing the number of fuzzy sets.
There are two main tasks associated with the fuzzy inference mechanism:
1. The matching task, which involves determining the degree to which each rule is relevant
to the current situation as characterized by the inputs $u_i$, $i = 1, 2, \ldots, n$.
2. The inference step, which involves drawing conclusions from the current inputs $u_i$ and
the information in the rule base.
When the fuzzy set representing the premise of the $i$th rule is denoted by
$A_1^j \times A_2^k \times \ldots \times A_n^l$, there are two steps for matching:
Step 1: Combining inputs with rule premises: This step finds the fuzzy sets
$\hat{A}_1^j, \hat{A}_2^k, \ldots, \hat{A}_n^l$ with membership functions
$\mu_{\hat{A}_1^j}(u_1) = \mu_{A_1^j}(u_1) * \mu_{\hat{A}_1^{fuzz}}(u_1)$
$\mu_{\hat{A}_2^k}(u_2) = \mu_{A_2^k}(u_2) * \mu_{\hat{A}_2^{fuzz}}(u_2)$
$\vdots$
$\mu_{\hat{A}_n^l}(u_n) = \mu_{A_n^l}(u_n) * \mu_{\hat{A}_n^{fuzz}}(u_n)$
(for all $j, k, \ldots, l$), which combine the fuzzy sets from fuzzification with the fuzzy sets
used in each of the terms in the rules' premises. When singleton fuzzification is used, each of
the input fuzzy sets has only a single point of nonzero membership
(e.g. $\mu_{\hat{A}_1^j}(\bar{u}_1) = \mu_{A_1^j}(\bar{u}_1)$ for $\bar{u}_1 = u_1$ and
$\mu_{\hat{A}_1^j}(\bar{u}_1) = 0$ for $\bar{u}_1 \neq u_1$).
In other words, with singleton fuzzification $\mu_{\hat{A}_i^{fuzz}}(u_i) = 1$ for all
$i = 1, 2, \ldots, n$ for the given inputs $u_i$, resulting in
$\mu_{\hat{A}_1^j}(u_1) = \mu_{A_1^j}(u_1)$
$\mu_{\hat{A}_2^k}(u_2) = \mu_{A_2^k}(u_2)$
$\vdots$
$\mu_{\hat{A}_n^l}(u_n) = \mu_{A_n^l}(u_n)$
Step 2: Determining which rules are on: In this step, membership values
$\mu_i(u_1, u_2, \ldots, u_n)$ are determined for the premise of the $i$th rule; they represent
the certainty that each rule premise is consistent with the given inputs. Defining
$\mu_i(u_1, u_2, \ldots, u_n) = \mu_{\hat{A}_1^j}(u_1) * \mu_{\hat{A}_2^k}(u_2) * \ldots * \mu_{\hat{A}_n^l}(u_n)$
This quantity, $\mu_i(u_1, u_2, \ldots, u_n)$, is a function of the inputs $u_i$ and, when
singleton fuzzification is used, represents the certainty that the antecedent of rule $i$ matches
the input information. It is a multidimensional certainty surface: it expresses the certainty of
the premise of a rule, and hence the degree to which a particular rule applies for a given set of
inputs. The inference step is then taken by calculating, for the $i$th rule, the “implied fuzzy
set” $\hat{B}_q^i$ with the membership function
$\mu_{\hat{B}_q^i}(y_q) = \mu_i(u_1, u_2, \ldots, u_n) * \mu_{B_q^p}(y_q)$ (3.9)
The implied fuzzy set $\hat{B}_q^i$ specifies the certainty level that the output should be a
specific crisp output $y_q$ within the universe of discourse $Y_q$, taking into consideration only
rule $i$. The defuzzification that follows the inference step aggregates the conclusions of all
the rules, which are represented by the implied fuzzy sets.
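The matching and inference steps can be sketched as follows for a single rule, assuming singleton fuzzification so that only the premise membership values at the measured inputs are needed. The operator written as “*” in the text is taken here to be either the product or the minimum; the Gaussian membership functions and all parameter values are hypothetical.

```python
import math

def gaussian(x, center, sigma):
    """Bell-shaped membership function used for the hypothetical premise terms."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def premise_certainty(inputs, premise_mfs, operator="product"):
    """mu_i(u_1, ..., u_n): combine the premise membership values of one rule."""
    values = [mf(u) for mf, u in zip(premise_mfs, inputs)]
    if operator == "min":
        return min(values)
    certainty = 1.0
    for value in values:
        certainty *= value
    return certainty

def implied_membership(mu_i, consequent_mf, y):
    """Implied fuzzy set of eq. (3.9), with the product standing in for '*'."""
    return mu_i * consequent_mf(y)

# One rule with two inputs: "If u1 is near 0 and u2 is near 1 Then y is near 5".
premise = [lambda u: gaussian(u, 0.0, 1.0), lambda u: gaussian(u, 1.0, 1.0)]
consequent = lambda y: gaussian(y, 5.0, 1.0)

mu = premise_certainty([0.0, 2.0], premise)
print(mu, implied_membership(mu, consequent, 5.0))
```

With singleton fuzzification the implied fuzzy set is simply the consequent membership function scaled (or, with the minimum, clipped) by the premise certainty.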
Defuzzification Methods: It is frequently necessary to obtain a single crisp output from a
FIS. For instance, when classifying a letter drawn by hand on a drawing tablet, the FIS has to
produce a crisp number that identifies the letter that was drawn. The process used to obtain
this crisp number is called defuzzification. In other words, defuzzification is a way of
extracting a crisp value from a fuzzy set as a representative value.
Two well-known methods can be used for defuzzifying:
Center of Gravity (COG): This method takes the output distribution and finds its center of
mass to produce one crisp number.
$u = \dfrac{\sum_i \mu(x_i)\, x_i}{\sum_i \mu(x_i)}$ (3.10)
where the crisp output value $u$ is the abscissa of the center of gravity of the fuzzy set,
$\mu(x_i)$ is the membership value in the membership function and $x_i$ is a running point
in a discrete universe. This expression can also be regarded as the weighted average of the
elements in the support set.
The COG method for singletons gives the following expression
$u = \dfrac{\sum_i \mu(s_i)\, s_i}{\sum_i \mu(s_i)}$ (3.11)
where $s_i$ is the position of singleton $i$ in the universe and $\mu(s_i)$ represents the rule
strength $\alpha_i$ of rule $i$. This technique is computationally inexpensive, and $u$ is
differentiable with respect to the singletons $s_i$, which is practical in neuro-fuzzy systems.
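Expressions (3.10) and (3.11) are the same weighted average, applied either to sampled points of the output distribution or to singleton positions weighted by rule strengths. A minimal sketch, with illustrative names and values:

```python
def center_of_gravity(points, memberships):
    """Weighted average of the points, each weighted by its membership value."""
    total = sum(memberships)
    if total == 0.0:
        raise ValueError("all membership values are zero, so no rule has fired")
    return sum(m * x for x, m in zip(points, memberships)) / total

# Singletons at positions 0 and 10 with rule strengths 1 and 3 (hypothetical values).
print(center_of_gravity([0.0, 10.0], [1.0, 3.0]))  # 7.5
```

The guard against a zero denominator is a design choice of this sketch; it covers the case in which no rule fires for the given inputs.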
Center of Average (COA): In this widely used method, a crisp output $y_q^{crisp}$ is selected
using the centers of each of the output membership functions and the maximum certainty of
each of the conclusions represented by the implied fuzzy sets, and is given by
$y_q^{crisp} = \dfrac{\sum_{i=1}^{R} b_i^q \, \sup_{y_q}\{\mu_{\hat{B}_q^i}(y_q)\}}{\sum_{i=1}^{R} \sup_{y_q}\{\mu_{\hat{B}_q^i}(y_q)\}}$ (3.12)
where $R$ is the number of rules and $b_i^q$ is the center of the output membership function
associated with rule $i$. Here “sup” denotes the “supremum” (i.e., the least upper bound,
which can frequently be regarded as the maximum value). Therefore, $\sup_x\{\mu(x)\}$ can
simply be considered the highest value of $\mu(x)$.
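The center-average computation can be sketched as below. The supremum of each implied fuzzy set is approximated by the maximum over a sampled grid of output values, which is an assumption of this sketch; the triangular implied sets, their centers and heights are hypothetical.

```python
def scaled_triangle(center, half_width, height):
    """Hypothetical implied fuzzy set: a triangle scaled to the given height."""
    def membership(y):
        return height * max(0.0, 1.0 - abs(y - center) / half_width)
    return membership

def center_average(centers, implied_mfs, y_grid):
    """Eq. (3.12): centers weighted by the peak value of each implied fuzzy set."""
    heights = [max(mf(y) for y in y_grid) for mf in implied_mfs]  # sampled supremum
    total = sum(heights)
    if total == 0.0:
        raise ValueError("no implied fuzzy set has nonzero membership")
    return sum(b * h for b, h in zip(centers, heights)) / total

y_grid = [i * 0.5 for i in range(-10, 31)]  # samples from -5 to 15
centers = [0.0, 10.0]
implied = [scaled_triangle(0.0, 2.0, 0.2), scaled_triangle(10.0, 2.0, 0.8)]
print(center_average(centers, implied, y_grid))
```

Because only the peak of each implied set enters the formula, the result depends on the rule certainties and the centers but not on the width of the output membership functions.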
Fig. 3.3 illustrates graphically the inference mechanisms of different types of fuzzy systems.
Most fuzzy inference systems can be categorized into three types depending on the type of
fuzzy reasoning.
In Type 1 fuzzy systems, the defuzzifier combines the output sets that correspond to all of the
fired rules into a single output set and then finds a crisp number that represents this combined
set; for example, the centroid defuzzifier forms the union of all the output sets and uses the
centroid of the union as the crisp output [31]. The overall output is the weighted average of
each rule's crisp output, which is induced by the rule's weight and the output membership
function.
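[Diagram: two rules with premise membership functions A1, B1 and A2, B2 over the inputs x and y; the firing strengths w1 and w2 are obtained by multiplication or min. Consequent columns: Type 1 and Type 2 use output sets C1 and C2 (Type 2 aggregates them with max); Type 3 uses z1 = ax + by + c and z2 = px + qy + r. For Type 1 and Type 3 the output is z = (w1 z1 + w2 z2)/(w1 + w2).]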
Figure 3.3 Types of fuzzy reasoning mechanisms [11]
In Type 2 fuzzy systems, fuzzy sets are helpful in situations where it is hard to determine an
exact membership function for a fuzzy set; for this reason, they are useful for incorporating
uncertainties. These uncertainties stem from the knowledge used to construct the rules of a
fuzzy logic system and lead to rules with uncertain antecedents and/or consequents, which in
turn translate into uncertain antecedent and/or consequent membership functions [31]. The
overall fuzzy output is obtained by applying the 'max' operation to the qualified fuzzy
outputs, each of which equals the minimum of the rule strength and the rule's output
membership function.
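The max-min aggregation described in the last two sentences can be sketched as follows. Each rule's output set is clipped at the rule strength (the minimum), the clipped sets are merged point by point with the maximum, and a centroid over a sampled grid is used here as one possible way of producing a crisp value. The triangular consequents, strengths and grid are hypothetical.

```python
def triangle(center, half_width):
    """Hypothetical consequent membership function with peak 1 at center."""
    def membership(z):
        return max(0.0, 1.0 - abs(z - center) / half_width)
    return membership

def max_min_output(strengths, consequent_mfs, z_grid):
    """Aggregated membership: max over rules of min(rule strength, mu_C(z))."""
    return [max(min(w, mf(z)) for w, mf in zip(strengths, consequent_mfs))
            for z in z_grid]

def centroid(z_grid, memberships):
    """Crisp value of the aggregated set, taken as its sampled center of gravity."""
    total = sum(memberships)
    if total == 0.0:
        raise ValueError("the aggregated output set is empty")
    return sum(z * m for z, m in zip(z_grid, memberships)) / total

z_grid = [i * 0.1 for i in range(0, 101)]  # universe from 0 to 10
consequents = [triangle(3.0, 2.0), triangle(7.0, 2.0)]
aggregated = max_min_output([0.5, 0.5], consequents, z_grid)
print(centroid(z_grid, aggregated))
```

With equal strengths and consequents placed symmetrically about 5, the centroid falls at the midpoint; raising one rule's strength pulls the crisp output toward that rule's consequent.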
Type 3 uses Takagi and Sugeno's fuzzy IF-THEN rules. The output of each rule is a crisp
number computed by multiplying each of the inputs by a constant and summing the results,
i.e. a linear combination of the inputs plus a constant term. The final output is the weighted
average of each rule's output.
In Fig. 3.3, a fuzzy inference system with two rules and two inputs is used to demonstrate the
different types of fuzzy rules and fuzzy reasoning described above.
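The two-rule, two-input Type 3 system can be written out as a short sketch. Each rule has linear consequent z_i = a x + b y + c, the firing strength w_i is obtained by multiplication of the premise memberships, and the output is z = (w1 z1 + w2 z2)/(w1 + w2). The Gaussian premise functions and all coefficient values below are hypothetical.

```python
import math

def gaussian(x, center, sigma):
    """Bell-shaped premise membership function (hypothetical choice)."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def sugeno_output(x, y, rules):
    """Weighted average of the linear rule outputs, z = sum(w_i z_i) / sum(w_i)."""
    numerator = 0.0
    denominator = 0.0
    for mf_x, mf_y, (a, b, c) in rules:
        w = mf_x(x) * mf_y(y)   # firing strength by multiplication
        z = a * x + b * y + c   # crisp output of this rule
        numerator += w * z
        denominator += w
    return numerator / denominator

rules = [
    (lambda x: gaussian(x, 0.0, 1.0), lambda y: gaussian(y, 0.0, 1.0), (1.0, 1.0, 0.0)),
    (lambda x: gaussian(x, 2.0, 1.0), lambda y: gaussian(y, 2.0, 1.0), (0.0, 0.0, 5.0)),
]
print(sugeno_output(1.0, 1.0, rules))
```

At the point (1, 1) both rules fire equally, so the output is the plain average of z1 = 2 and z2 = 5; closer to (0, 0) the first rule dominates.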
3.4 Artificial Neural Networks
The recognition that the human brain computes in an entirely different way from the
conventional digital computer has motivated research into artificial neural networks,
commonly referred to as “neural networks”. The brain is a highly complex, nonlinear and
parallel computer (information-processing system). It is capable of organizing its structural
constituents, called neurons, so as to perform certain computations (e.g. pattern recognition,
perception and motor control) many times faster than the fastest digital computer in
existence today [29].
[Diagram: the inputs $x_0, x_1, \ldots, x_n$ are multiplied by the synaptic weights $w_i$ and summed in the processing element, $I = \sum w_i x_i$; the transfer function then produces the output $Y = f(I)$ on the output path.]
Figure 3.4 Artificial neuron
A neuron is an information-processing unit that is fundamental to the operation of a neural
network. The block diagram of Fig. 3.4 shows the model of an artificial neuron, which forms
the basis for designing artificial neural networks.
A set of synapses, also called connecting links, are the basic elements of the neuronal model.
Each synapse is characterized by a weight or strength of its own. Specifically, a signal $x_i$,
$i = 1, 2, \ldots, n$, at the input of a synapse connected to neuron $k$ is multiplied by the
synaptic weight $w_i$. The output is generated by summing these products and passing the
sum through a transfer function.
The output of the artificial neuron displayed in Fig. 3.4 is calculated from
$y_i = f\left(\sum_{j=1}^{n} w_{ij} x_j + \theta_i\right)$ (3.13)
where $x_j$ is the input, $y_i$ is the output signal of the neuron, $w_{ij}$ are the synaptic weight
coefficients, $\theta_i$ denotes the bias and $f$ is the activation function.
The activation function can be linear or nonlinear but a nonlinear sigmoid function is
frequently utilized as the activation function (eq. 3.14).
$y_i = \dfrac{1}{1 + \exp\left[-\left(\sum_{j=1}^{n} w_{ij} x_j + \theta_i\right)\right]}$ (3.14)
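The neuron model amounts to a weighted sum followed by an activation function. The sketch below assumes the bias is added to the weighted sum, in line with the description of the induced local field later in this chapter; the input and weight values are illustrative.

```python
import math

def sigmoid(v):
    """Logistic activation function, 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + math.exp(-v))

def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus the bias, passed through the activation."""
    induced = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(induced)

print(neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.0))      # 0.5
print(neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.0,
                    activation=lambda v: v))                   # linear neuron: 0.0
```

Passing a different `activation` shows the linear case mentioned in the text; with the sigmoid the output is always confined to the interval (0, 1).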
Neural networks are formed by sets of neurons arranged in layers. The neurons are
interconnected by weighted connections at connection points called nodes. In a layered neural
network, the neurons are organized in the form of layers. The simplest layered network has
an input layer of source nodes which projects onto an output layer of neurons (computation
nodes), but not the other way round. Such a network is of a feedforward or acyclic type. The
neurons act as processing elements which multiply their inputs by a set of weights and
nonlinearly transform the result into an output value.
On the whole, three fundamentally different classes of network architecture can be
identified: single-layer feedforward (non-recurrent), multilayer feedforward and recurrent
networks. The feedforward neural network structures are shown in Fig. 3.5.
Figure 3.5 (a) A single layer network, (b) A simple multilayer network [11]
3.4.1 Neural network’s learning
The most important property of a neural network is its ability to learn from its environment
and to improve its performance through learning. A neural network learns about its
environment through an interactive process of adjustments applied to its synaptic weights and
bias levels. With each iteration of the learning process the network becomes more
knowledgeable about its environment. In the context of neural networks, learning can be
defined as a process by which the free parameters of the network are adapted through
stimulation by the environment in which the network is embedded. The type of learning is
determined by the manner in which the parameter changes take place [29].
A well-defined set of rules for the solution of a learning problem is referred to as a learning
algorithm. As might be expected, there is no unique learning algorithm for the design of
neural networks. Learning algorithms differ from each other fundamentally in the way the
adjustment to a neuron's synaptic weight is formulated. Another factor to be considered is
the manner in which a neural network (learning machine), made up of a set of interconnected
neurons, relates to its environment. In this latter context, the term learning paradigm refers to
a model of the environment in which the neural network operates [29].
There are two fundamental learning paradigms associated with neural networks: (1) Learning
with a teacher (known as supervised learning) and (2) Learning without a teacher which is
divided into two subdivisions that are unsupervised learning and reinforcement learning.
Supervised learning involves training with a teacher. The teacher can be thought of as having
knowledge of the environment, represented by a set of input-output examples. The neural
network, on the other hand, does not know the environment. Suppose that a training vector
drawn from the environment is applied to both the teacher and the neural network; the
teacher is then able to supply the neural network with a desired response for that training
vector. The network parameters, i.e. the connection weights, are adjusted under the combined
influence of the training vector and the error signal. The error signal is the difference
between the desired response and the actual response of the network. This adjustment is
carried out iteratively, step by step, with the aim of eventually making the neural network
emulate the teacher; the emulation is presumed to be optimum in some statistical sense. In
this way the knowledge of the environment available to the teacher is transferred to the
neural network through training as fully as possible. When this condition is reached, the
teacher may be removed and the neural network left to deal with the environment entirely on
its own.
The form of supervised learning just described is error-correction learning. It is a closed-loop
feedback system, but the unknown environment is not in the loop. The performance criterion
for the system is the mean-square error or the sum of squared errors over the training
samples, defined as a function of the free parameters of the system. This function may be
visualized as a multidimensional error-performance surface, or simply error surface, with the
free parameters as coordinates. The true error surface is averaged over all possible
input-output examples. Any given operation of the system under the teacher's supervision is
represented as a point on the error surface. For the system to improve its performance over
time, and therefore learn from the teacher, the operating point has to move down
successively toward a minimum point of the error surface; the minimum point may be a local
minimum or a global minimum. A supervised learning system is able to do this using the
information it has about the gradient of the error surface corresponding to the current
behavior of the system. The negative of the gradient of an error surface at any point is a
vector that points in the direction of steepest descent. Given an algorithm designed to
minimize the cost function, an adequate set of input-output examples and enough time to
carry out the training, a supervised learning system is generally capable of performing tasks
such as pattern classification and function approximation [29].
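Error-correction learning can be illustrated with the simplest possible learner, a single linear neuron trained by a least-mean-square style update. In the sketch below the "teacher" is a hypothetical linear function that supplies the desired response for each training vector; the learning rate, number of epochs and training grid are assumptions chosen for illustration.

```python
def train_lms(samples, n_inputs, learning_rate=0.05, epochs=200):
    """Adjust weights and bias step by step against the instantaneous squared error."""
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        for x, desired in samples:
            actual = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = desired - actual  # error signal: desired minus actual response
            # Move against the gradient of 0.5 * error**2 with respect to each parameter.
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias

def teacher(x):
    """Hypothetical teacher that knows the environment: d = 2*x1 - x2 + 0.5."""
    return 2.0 * x[0] - 1.0 * x[1] + 0.5

grid = [-1.0, 0.0, 1.0]
samples = [((a, b), teacher((a, b))) for a in grid for b in grid]
weights, bias = train_lms(samples, n_inputs=2)
print(weights, bias)
```

After training, the learned parameters approach those of the teacher, at which point the teacher is no longer needed and the neuron can respond to new inputs on its own.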
3.4.2 Multilayer perceptrons & backpropagation algorithm
Multilayer feedforward networks form an important class of neural networks. Typically, the
network consists of a set of sensory units (source nodes) that constitute the input layer, one
or more hidden layers of computation nodes, and an output layer of computation nodes. The
input signal propagates through the network in a forward direction, on a layer-by-layer basis.
These neural networks are called multilayer perceptrons (MLP) and represent a
generalization of the single-layer perceptron.
Multilayer perceptrons have been applied successfully to difficult and diverse problems by
training them with a widely used algorithm known as the error back-propagation algorithm.
This algorithm is based on the error-correction learning rule. It may be considered a
generalization of an equally popular adaptive filtering algorithm, the least mean square
(LMS) algorithm, which corresponds to the special case of a single linear neuron [29].
Error back-propagation learning consists of two passes through the different layers of the
network: a forward pass and a backward pass. In the forward pass, an activity pattern (input
vector) is applied to the sensory nodes of the network and its effect propagates through the
network layer by layer. Finally, a set of outputs is produced as the actual response of the
network. During the forward pass all of the synaptic weights of the network are fixed. During
the backward pass, the synaptic weights are all adjusted in accordance with an
error-correction rule. Specifically, the actual response of the network is subtracted from a
desired (target) response to produce an error signal. This error signal is then propagated
backward through the network, against the direction of the synaptic connections, hence the
name “error back-propagation”. The synaptic weights are adjusted so as to move the actual
response of the network closer to the desired response in a statistical sense. The error
back-propagation algorithm is also referred to in the literature as the back-propagation
algorithm, or simply back-prop, and the learning process performed with the algorithm is
called back-propagation learning.
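The two passes can be sketched for a network with one hidden layer and sigmoid neurons. This is a minimal illustration under stated assumptions, not the training procedure of this thesis: the error measure is taken to be half the squared error on a single training pair, and the layer sizes, learning rate, initial weights and training pair are hypothetical.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: weights stay fixed while activations flow layer by layer."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w_hidden, b_hidden)]
    y = [sigmoid(sum(w * hj for w, hj in zip(row, h)) + b)
         for row, b in zip(w_out, b_out)]
    return h, y

def backprop_step(x, target, w_hidden, b_hidden, w_out, b_out, lr=0.5):
    """One forward and one backward pass; updates the weights in place.
    Returns the error 0.5 * sum(e^2) measured before the update."""
    h, y = forward(x, w_hidden, b_hidden, w_out, b_out)
    errors = [d - yk for d, yk in zip(target, y)]
    # Local gradients of the output neurons: error times sigmoid derivative y(1 - y).
    delta_out = [e * yk * (1.0 - yk) for e, yk in zip(errors, y)]
    # Error signal propagated back through the (not yet updated) output weights.
    delta_hidden = [hj * (1.0 - hj) * sum(delta_out[k] * w_out[k][j]
                                          for k in range(len(w_out)))
                    for j, hj in enumerate(h)]
    for k, row in enumerate(w_out):
        for j in range(len(row)):
            row[j] += lr * delta_out[k] * h[j]
        b_out[k] += lr * delta_out[k]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] += lr * delta_hidden[j] * x[i]
        b_hidden[j] += lr * delta_hidden[j]
    return 0.5 * sum(e * e for e in errors)

random.seed(0)
n_in, n_hidden, n_out = 2, 3, 1
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [[random.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
b_out = [0.0] * n_out

x, target = [1.0, 0.0], [1.0]
history = [backprop_step(x, target, w_hidden, b_hidden, w_out, b_out) for _ in range(50)]
print(f"error before training: {history[0]:.4f}, after 50 steps: {history[-1]:.4f}")
```

Repeating the step on the same training pair drives the actual response toward the target, so the recorded error shrinks; in practice the step would be applied over many training pairs.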
There are three distinguishing characteristics of a multilayer perceptron:
1. The model of each neuron includes a nonlinear activation function. The nonlinearity
is smooth, in other words, differentiable everywhere. A commonly used form of
nonlinearity that satisfies this requirement is the sigmoidal nonlinearity defined by
the logistic function:
$y_j = \dfrac{1}{1 + \exp(-v_j)}$ (3.15)
where $v_j$ is the induced local field (i.e. the weighted sum of all synaptic inputs plus the
bias) of neuron $j$, and $y_j$ is the output of the neuron.
2. The network contains one or more layers of hidden neurons that are not part of the
input or output of the network. These hidden neurons enable the network to learn
complex tasks by extracting progressively more meaningful features from the input
patterns (vectors).
3. The network exhibits a high degree of connectivity, determined by the synapses of the
network. A change in the connectivity of the network requires a change in the
population of synaptic connections or their weights.
The multilayer perceptron derives its computing power from the combination of these
characteristics with the ability to learn from experience through training. The
back-propagation algorithm is of great significance in neural networks because it provides a
computationally efficient method for training multilayer perceptrons.
Fig. 3.6 shows the architectural graph of a multilayer perceptron with one hidden layer and
an output layer. The illustrated network is fully connected, meaning that a neuron in any
layer of the network is connected to all the nodes/neurons in the previous layer. Signal flow
through the network progresses in a forward direction, from left to right and on a
layer-by-layer basis. The value of each neuron is computed by first adding the bias to the
weighted sum of its inputs and then applying $f(\text{sum})$ (the sigmoid function) to obtain
the neuron's activation.
Figure 3.6 Multilayer feedforward network [11]
Next, the training process of the three-layer feedforward network is analyzed. The
feedforward phase in the network is described in three stages, corresponding to the input (I),
hidden (H) and output (O) layers.
Input Layer (I): The input of the hidden layer is equal to the output of the input layer.
CHAPTER 1
REVIEW ON CHANNEL EQUALIZATION
1.1 INTRODUCTION
Communication systems comprise three fundamental elements: transmitter, channel and
receiver. When signals are transmitted through a communications system, they are obstructed
by some distortions which are mainly intersymbol interference (ISI) and noise. The
transmitted signal is distorted by ISI which is caused by multipath effect in band limited
(frequency selective) time dispersive channels and is the cause of bit errors on the receiver
side. ISI is considered the main factor negatively affecting fast transmission of data over
wireless channels. In order to eliminate or minimize these distortions, equalizers are
employed in these systems. Equalization is the method of compensating for, eliminating or
reducing the amplitude and phase distortion introduced by the transmission medium in
communications systems. In a general meaning, the term equalization refers to any signal
processing operation which minimizes ISI. An equalizing filter overcomes the ISI caused by
individual received symbols of a transmitted data stream, as well as the crosstalk that for
example occurs due to coupling of a transmitted pulse or that results from the capacitive
coupling of the transmitted pulse on an outgoing pair interfering with the received pulse on an
incoming pair. The task of equalizers is to provide efficient and error free communications by
ensuring that signals transmitted through the channel are recovered as original at the end of
the receiver that communications system has.
Distortions may be linear or nonlinear depending on the channel characteristics of channel.
When transmitting information through a physical channel, various mechanisms distort the
transmitted signal significantly, causing degradation or even failure in the communications.
These mechanisms can be classified as additive thermal noise, man-made noise and
atmospheric noise. In practice, many of the physical channels are characterized by various
channel models. The most frequently encountered channel of communications is that with
additive noise. An additive random noise process is involved in this channel model. The
factors causing the additive noise process are amplifiers and electronic components on the
2
receiver side of the communications system the transmission’s interference as radio signal
transmission, for example. Thermal noise is the category of noise that electronic components
and amplifiers cause. Statistically, that sort of noise gets classified as a random Gaussian
noise process and modeling the channel in terms of mathematics is named the additive
Gaussian noise channel. The mathematical model becomes an additive white Gaussian noise
(AWGN) channel in the case of the random process being a white-noise process. The random
process is a white-noise process when the power spectral density (PSD) is flat (constant) over
all frequencies [1,2].
When compared with AWGN channels, mobile radio channel deficiencies make the signal on
the receiver side greatly distorted or cause its significant fading. This fading is classified as a
non-additive signal disturbance and appears as time variation in the signal amplitude. Some
techniques are utilized to compensate for fading channel deficiencies. The main techniques
used in compensating for fading channel impairments can be classified as equalization,
channel coding and diversity that are employed to compensate for the signal distortions and
improve the received signal quality [3]. This thesis concentrates on equalization technique.
Equalization techniques can be categorized as linear or nonlinear depending on
how the output of an adaptive equalizer is used for subsequent control of the equalizer.
The decision making device of the receiver processes the equalizer’s output and determines
the value of the digital data bit being received by applying a slicing or thresholding
operation (a nonlinear operation) that yields the reconstructed message data. If
this data is not used in the feedback path to adapt the equalizer, the equalization is
linear; if, on the other hand, the decision making device feeds the reconstructed data
back in order to alter the equalizer’s subsequent outputs, the equalization is nonlinear [3]. If
the channels are nonlinear, linear equalizers cannot reconstruct the transmitted signal.
There are various equalizer structures, among which the linear transversal equalizer (LTE) is
the most common. The simplest LTE, whose transfer function is a polynomial, uses only
feedforward taps and has many zeros but poles only at z = 0. This filter is called a finite
impulse response (FIR) filter or simply a transversal filter. In this type of equalizer, the filter
coefficients linearly weight the current and past values of the received signal before summing
them to produce the output of the equalizer.
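The weighted sum performed by a transversal filter can be sketched as follows. The tap values and the input samples are arbitrary illustrative numbers, and the function name transversal_filter is an assumption.

```python
import numpy as np

def transversal_filter(received, taps):
    """FIR output y[n] = sum_k taps[k] * received[n - k], with zero initial state."""
    received = np.asarray(received, dtype=float)
    taps = np.asarray(taps, dtype=float)
    # Full convolution truncated to the input length gives the causal FIR output.
    return np.convolve(received, taps)[: len(received)]

# Hypothetical two-tap example.
output = transversal_filter([1.0, 2.0, 3.0], [0.5, 0.25])
print(output)
```

Each output sample depends only on the current and past inputs, so the transfer function is a polynomial in z^-1 and all poles lie at z = 0.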
Some applications, however, employ nonlinear equalizers, since linear equalizers cannot deal
with severe channel distortion. Linear equalizers perform poorly on channels with deep
spectral nulls in the passband: in attempting to compensate for the distortion, they place too
much gain at those frequencies and thereby enhance the noise present there. For these
reasons, nonlinear equalizers outperform linear equalizers on such channels. Three very
effective nonlinear methods which offer improvements over linear equalization methods and
which are used in 2G and 3G systems are:
1. Decision Feedback Equalization (DFE)
2. Maximum Likelihood Symbol Detection (MLSD)
3. Maximum Likelihood Sequence Estimation (MLSE) [4]
A large number of studies have addressed channel equalization using various
methods, techniques and algorithms. Recently, neural network based fuzzy technology has
been widely used as a powerful tool in channel equalization of various types of
signals. In this type of equalizer, experts determine the fuzzy rules by utilizing the channel’s
input-output data pairs. As part of this thesis, adaptive channel equalization based on neural
networks employing a multilayer perceptron (MLP) has been developed, enabling the
equalization of Quadrature Amplitude Modulation (QAM) signals of various
levels. This has been achieved for both linear and nonlinear channels using a Nonlinear
Neuro-Fuzzy Equalizer (NNFE), which attains a relatively high adaptation speed and accurate
equalizer outputs and has proven to be effective and practical.
The changeable fuzzy IF-THEN rules which configure the fuzzy adaptive filter are formed
either by human experts or from the input-output pairs that are matched during the
adaptation procedure. In this study, neural networks and fuzzy technology are used to develop
a neuro-fuzzy equalizer for channel distortion of Quadrature Amplitude Modulation
(QAM) signals. Even though the QAM signal has a complex form composed of real
(in-phase) and imaginary (quadrature) parts, the complex signal is not applied directly to the
channel and equalizer, since the neuro-fuzzy filter used is based on real values and best suits
signal processing in real multidimensional space. The modulation and
demodulation of M-ary QAM (where M = 4 and M = 16) is accomplished by splitting the data
bit stream into the in-phase (I) and quadrature (Q) components. Gray coding is employed to
map the I and Q components. The significant feature of this thesis is the
application of a ‘normalization’ method by which the modulated in-phase and quadrature QAM
signal is normalized to a maximum of one. Each component of the complex
signal attains values between 0 and 1 by first shifting the values such that the minimum value
is zero and then scaling them such that the maximum value is 1. Each component is then input
to the channel and equalizer separately and denormalized separately at the equalizer’s output,
where the components are recombined to form the final complex QAM signal. The
normalization method is stable and provides better BER and convergence performance,
together with more accurate equalizer outputs and a relatively small number of iterations
before the minimum error is attained.
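The shift-and-scale normalization described above can be sketched as follows. This is a minimal illustration under assumed names (normalize, denormalize) and an assumed set of 16-QAM sample points; the thesis's own implementation is described in Chapter 4 and may differ in detail.

```python
import numpy as np

def normalize(component):
    """Shift so the minimum is 0, then scale so the maximum is 1."""
    component = np.asarray(component, dtype=float)
    lo, hi = component.min(), component.max()
    return (component - lo) / (hi - lo), lo, hi

def denormalize(normalized, lo, hi):
    """Invert normalize() using the stored minimum and maximum."""
    return np.asarray(normalized, dtype=float) * (hi - lo) + lo

# Hypothetical 16-QAM symbols; I and Q are handled as separate real signals.
symbols = np.array([-3 + 3j, -1 - 1j, 1 + 3j, 3 - 3j])
i_norm, i_lo, i_hi = normalize(symbols.real)
q_norm, q_lo, q_hi = normalize(symbols.imag)

# After (ideal) equalization, each component is denormalized and recombined.
recovered = denormalize(i_norm, i_lo, i_hi) + 1j * denormalize(q_norm, q_lo, q_hi)
print(i_norm, q_norm, recovered)
```

In this sketch the channel and equalizer are omitted, so the recovered symbols equal the transmitted ones; in a full system the normalized components would pass through both before denormalization.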
This thesis consists of five chapters where:
Chapter 1 presents an overview on channel equalization. The state of application of neuro-
fuzzy system and fuzzy logic as well as their properties and features are explained.
Chapter 2 explains the channel equalization, the distortions and noise in the channel.
Mathematical models and formulas representing the channels and nonlinear neuro-fuzzy
equalizer used in the thesis together with its characteristics are described.
Chapter 3 outlines the architecture and operation principles of the nonlinear neuro-fuzzy
network (NNFN). The learning algorithm used, the linguistic data about the target system and
the numerical input-output relationships of the NNFN are explained in detail. Fuzzy rule-based
fuzzy sets, the parameters and the error calculations are analyzed.
Chapter 4 describes in detail the quadrature amplitude modulation (QAM) and its properties.
The application of QAM on NNFN and the features of the thesis design are explained. The
specific technique of normalization used in equalizing QAM signals and its mathematical
implementation are described.
Chapter 5 illustrates the simulation results of the equalization system demonstrating
graphically and statistically the performance of the equalization system. Bit error rate (BER)
versus signal-to-noise ratio (SNR) analysis is made in tabulated and graphical forms proving
the accuracy of the system. Comparisons between the channels and between the two
constellations of QAM are made to illustrate the performance of the equalizer, as well.
Conclusions are discussed at the end.
1.2 Overview
In order to accurately transmit the input signals from the transmitter to the receiver,
minimization and thus equalization of distortions in the channel is critical. This can be
successfully done by employing efficient equalization algorithms and techniques during the
transmission of the signals from the transmitter to the receiver. This chapter considers
methods used in channel equalization. Neural networks, fuzzy and neuro-fuzzy technologies
which form the basis of the adaptive channel equalization are analyzed and discussed.
1.3 The State of Application of Channel Equalization
Linear and nonlinear distortions are the main obstacles in transmitting the input signals to the
receiver of a communications system in their original state. These distortions, namely ISI and
noise, are caused in the channel and channel equalization is needed in order to transmit the
signals as accurately as possible. Even though both linear and nonlinear equalizers can be
used for this purpose, nonlinear equalizers are generally preferred because they are capable
of compensating for both linear and nonlinear channel distortions effectively.
Two types of equalization are used: sequence estimation and symbol detection. In
this thesis, the symbol detection technique is used to realize the adaptive channel equalization.
This technique maps the input baseband signal onto a feature space determined by a
learnt representation of a property of the transmitted signal. The symbols are
separated by decision regions, which serve to classify the distorted signal.
The ISI problem, which affects all digital communication systems, is mainly caused by
restricted bandwidth. When rectangular multilevel pulses are filtered improperly as they pass
through a communication system, they spread in
time and are smeared into adjacent time slots, causing ISI [2]. This ISI in turn causes errors
when transmitting data over the channel. Additionally, channel characteristics play a
significant role in causing distortions, and the channel response is time-variant, meaning that
the channel characteristics are not known in advance. The time-variant channel response and
the unknown channel characteristics require equalizers that adjust themselves
to the channel response and adapt to its variations over time
so as to compensate for changes in the channel characteristics. Such equalizers are
called adaptive equalizers and they have been receiving great attention because of their
superior features. In practice, as an example, there are situations when the channel consists of
dial-up telephone lines and the channel transfer function changes from call to call. In such a
case, the equalizer should be an adaptive filter.
Adaptive equalizers are categorized as supervised and unsupervised equalizers. When the
unpredictable channel characteristics of a communications system make a training sequence
necessary, supervised equalizers are employed. The training sequence allows the
channel response to be compared with the known input so that the parameters of the
equalizer can be updated. On the other hand, in some communications systems a training
sequence cannot be transmitted. In this case unsupervised equalization is employed. Because
it involves a self-recovery method, it is also referred to as blind equalization [5].
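A supervised equalizer of the kind described above can be illustrated with a least mean square (LMS) transversal equalizer trained on a known sequence. This is only a sketch under stated assumptions: the channel taps [1.0, 0.4], the step size, the number of taps and the noise-free BPSK training sequence are all hypothetical choices, and LMS is used here simply as a familiar adaptation rule rather than as the algorithm of the thesis.

```python
import numpy as np

def lms_equalizer(received, training, num_taps=7, mu=0.02):
    """Adapt FIR equalizer taps so the output tracks the known training symbols."""
    weights = np.zeros(num_taps)
    buffer = np.zeros(num_taps)          # current and past received samples
    outputs = np.zeros(len(received))
    for n in range(len(received)):
        buffer = np.roll(buffer, 1)
        buffer[0] = received[n]
        y = weights @ buffer             # equalizer output
        error = training[n] - y          # compare with the known transmitted symbol
        weights = weights + mu * error * buffer   # LMS update
        outputs[n] = y
    return weights, outputs

# Hypothetical setup: BPSK training symbols through a two-tap ISI channel.
rng = np.random.default_rng(0)
symbols = rng.choice([-1.0, 1.0], size=3000)
channel = np.array([1.0, 0.4])
received = np.convolve(symbols, channel)[: len(symbols)]

weights, outputs = lms_equalizer(received, symbols)
residual_mse = np.mean((outputs[-500:] - symbols[-500:]) ** 2)
print(weights.round(3), residual_mse)
```

Once the training period ends, the adapted taps would be applied to unknown data, with the decision device operating on the equalizer output.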
Supervised equalization can be realized by either sequence estimation or symbol
detection. Instead of decoding each received symbol on its own, a sequence estimator tests
the possible sequences of data and selects the sequence that is most likely to be the
output [4]. This sequence estimator is also referred to as the
maximum likelihood sequence estimator (MLSE).
Unsupervised or blind equalization is used when the signal has no memory, i.e. the signals
transmitted in successive symbol intervals are not interdependent. In this case, each
transmitted symbol is detected separately. The constant modulus algorithm (CMA), introduced
by Godard [6] and Treichler [7], is a highly significant algorithm for blind equalization. Its
robustness and its capability of converging before phase recovery have made this algorithm
very successful [5]. Another algorithm, the multimodulus algorithm (MMA) [8,9], improves on
the performance of CMA, since it provides a low steady-state mean-squared error
(MSE) and removes the need for phase recovery in steady-state operation [9].
Additionally, hybrid blind equalization algorithms combine or augment existing cost
functions to attain improved performance [5].
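To make the idea of blind adaptation concrete, the following sketch shows one stochastic-gradient step of the constant modulus algorithm in its common form, which penalizes deviations of the squared output modulus from a constant. The function name cma_step, the dispersion constant r2 = 1 and the numerical values are assumptions for illustration; the cited references should be consulted for the original formulations.

```python
import numpy as np

def cma_step(weights, x, mu, r2=1.0):
    """One CMA update for output y = weights . x and cost (|y|^2 - r2)^2."""
    y = np.dot(weights, x)
    error = np.abs(y) ** 2 - r2           # zero when the output lies on the modulus
    new_weights = weights - mu * error * y * np.conj(x)
    return new_weights, y

# Hypothetical example: the output modulus is too large, so the step shrinks it.
w = np.array([1.0 + 0j, 0.0 + 0j])
x = np.array([2.0 + 0j, 0.0 + 0j])
w_next, y = cma_step(w, x, mu=0.01)
print(abs(y), abs(np.dot(w_next, x)))

# When the output already has the desired modulus, the weights are unchanged.
w_same, _ = cma_step(w, np.array([1.0 + 0j, 0.5 + 0j]), mu=0.01)
```

No training symbol appears in the update: only the modulus of the equalizer output is used, which is what makes the method blind and insensitive to carrier phase.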
Nonlinear equalizers are considered significant among signal processing techniques because
of their superior performance and improved features compared with linear equalizers, in
addition to the wide variety they offer. One of those features is the ability to form nonlinear
decision boundaries; the Bayesian equalizer sets the benchmark for the performance of these
equalizers. Decision Feedback Equalizers (DFEs) are one class of nonlinear equalizers with
relatively improved performance. The basis of decision feedback equalization is that, once an
information symbol has been detected and decided upon, the ISI that it induces on future
symbols can be estimated and cancelled [4]. The DFE can take one of two structures:
direct transversal or lattice. The direct form is made up of a feed forward filter
(FFF) and a feedback filter (FBF). The FBF is driven by the decisions at the output of a
detector located between the FFF and FBF, and its coefficients are adjusted to cancel the ISI
on the current symbol caused by past detected symbols.
The remarkable feature of the DFE is its superiority over the linear transversal equalizer (LTE),
which is the most common equalizer structure. This superiority lies in its smaller minimum
mean squared error (MMSE). When the channel is severely distorted or exhibits nulls in the
spectrum, the performance of an LTE degrades, and the MMSE, which is the basic
performance criterion, is considerably better for the DFE than for the LTE.
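The direct-form DFE described above can be sketched with fixed (non-adaptive) taps. The channel [1.0, 0.5, 0.2], the single feedforward tap and the feedback taps equal to the channel's postcursor coefficients are assumptions chosen so that the cancellation is exact in the noise-free case; an actual DFE would adapt both filters.

```python
import numpy as np

def dfe_detect(received, ff_taps, fb_taps):
    """Decision feedback equalizer with fixed taps and a BPSK slicer."""
    ff = np.asarray(ff_taps, dtype=float)
    fb = np.asarray(fb_taps, dtype=float)
    ff_buffer = np.zeros(len(ff))         # current and past received samples
    past_decisions = np.zeros(len(fb))    # most recent decision first
    decisions = np.zeros(len(received))
    for n, sample in enumerate(received):
        ff_buffer = np.roll(ff_buffer, 1)
        ff_buffer[0] = sample
        # Subtract the ISI estimated from symbols that were already decided.
        z = ff @ ff_buffer - fb @ past_decisions
        d = 1.0 if z >= 0.0 else -1.0     # nonlinear decision device
        past_decisions = np.roll(past_decisions, 1)
        past_decisions[0] = d
        decisions[n] = d
    return decisions

# Hypothetical noise-free example with postcursor ISI only.
rng = np.random.default_rng(0)
symbols = rng.choice([-1.0, 1.0], size=1000)
channel = np.array([1.0, 0.5, 0.2])
received = np.convolve(symbols, channel)[: len(symbols)]
decisions = dfe_detect(received, ff_taps=[1.0], fb_taps=[0.5, 0.2])
print(np.mean(decisions != symbols))
```

Because the feedback path uses decisions rather than noisy samples, the ISI is removed without the noise enhancement that a linear inverse filter would introduce.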
The goal in designing a communications system is to transmit information to the receiver with
as little deterioration as possible while at the same time satisfying the design constraints of
allowed signal bandwidth, transmitted energy and cost. In digital communications systems, the
probability of bit error (Pe), also called the bit error rate (BER), is generally taken as the
measure of degradation and performance. In analog communications systems, the
signal-to-noise ratio (SNR) at the receiver output is generally the performance
criterion. In nonlinear channel equalization, it is important to attain a low mean square error
(MSE) and a high convergence rate in addition to a low BER. Training sequences are also an
important factor in the efficiency of a communications system. They should be as
short as possible, which requires the adaptation process to end in as few iterations as possible.
The application of linear equalizers to nonlinear channels does not yield the desired BER
performance since they are based on linear system theory and are used for equalization of
linear channels. Recently, neural networks and fuzzy technology have evolved into a powerful
tool in the equalization of nonlinear channel distortions.
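The BER criterion mentioned above can be illustrated with a short sketch that counts bit errors and, for comparison, evaluates the well-known theoretical BER of BPSK over an AWGN channel, 0.5 erfc(sqrt(Eb/N0)). BPSK is used here only as a simple reference case and is an assumption of this example; the thesis itself deals with QAM signals. The function names are likewise hypothetical.

```python
import math
import numpy as np

def bit_error_rate(sent_bits, detected_bits):
    """Fraction of positions where the detected bit differs from the sent bit."""
    sent = np.asarray(sent_bits)
    detected = np.asarray(detected_bits)
    return float(np.mean(sent != detected))

def bpsk_awgn_ber(ebn0_db):
    """Theoretical BER of coherent BPSK over AWGN for a given Eb/N0 in dB."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    return 0.5 * math.erfc(math.sqrt(ebn0))

# Hypothetical example: one error in four bits.
measured = bit_error_rate([0, 1, 1, 0], [0, 1, 0, 0])
print(measured, bpsk_awgn_ber(0.0), bpsk_awgn_ber(10.0))
```

Plotting the measured BER against SNR alongside such a reference curve is the usual way of presenting equalizer performance.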
1.4 State of Application of Neural Networks and Fuzzy Technologies for Channel
Equalization
1.4.1 Design of neural network based equalizers
Nonlinear equalizers are capable of compensating for both nonlinear and linear channel
distortion. Adaptive nonlinear equalizers implementing neural network models have been used
extensively, primarily for noise cancellation, in various applications. The multilayer perceptron
(MLP) is one of the neural network structures used in neural network based
equalizers. MLP networks are feedforward neural networks having one or more layers
of neurons, known as hidden neurons, between the input and output neurons.
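The structure of an MLP with a single hidden layer can be sketched as a forward pass. The layer sizes, the tanh activation and the random weights are assumptions made for illustration; training of the weights is not shown.

```python
import numpy as np

def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: input -> hidden neurons (tanh) -> output neurons (tanh)."""
    hidden = np.tanh(w_hidden @ x + b_hidden)
    return np.tanh(w_out @ hidden + b_out)

# Hypothetical equalizer-style input: three consecutive received samples.
rng = np.random.default_rng(0)
x = np.array([0.9, -1.1, 1.05])
w_hidden = rng.normal(size=(5, 3))   # 5 hidden neurons, 3 inputs
b_hidden = np.zeros(5)
w_out = rng.normal(size=(1, 5))      # 1 output neuron
b_out = np.zeros(1)

y = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)
print(y)
```

The nonlinear activations in the hidden layer are what allow such a network to form the nonlinear decision boundaries that a linear transversal filter cannot.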
Filtering is the process of changing the relative amplitudes of the frequency components in a
signal, or eliminating some frequency components completely, in a variety of applications [10].
Assigning k information bits to the M = 2^k possible signal amplitudes, which can be carried
out in a number of ways, is called mapping or transformation. Generally, nonlinear
equalization includes a channel estimator, since the channel information is not available at the
receiver end [12]. Filtering comprises two estimation procedures: one is the estimation of the
mapping from the available samples, and the other is the estimation of the output of
the filter from the input by realizing this mapping [11]. The mapping is more difficult
for a nonlinear filter than for a linear filter, but research continues on realizing the
mapping of nonlinear filters effectively.
1.4.2 Channel equalization by using fuzzy logic
Adaptive equalizers for nonlinear channels can be developed in a variety of effective ways.
Bayes’ probability theory [13] yields the optimal solution for a symbol
equalizer, which is referred to as the Bayesian equalizer. Symbol decision equalizers are
simpler and computationally less complex than the MLSE.
A channel estimate is not always necessary for them. They function as inverse filters [14], and
algorithms such as recursive least squares (RLS) or least mean square (LMS) are employed to
adapt the filter. The adaptive filter finds the channel inverse and
provides a linear decision boundary. In general, an optimal equalizer requires a decision
function that is inherently nonlinear. From this perspective, equalization is usually treated as
a nonlinear classification problem, and for this reason the performance of linear equalizers
falls short of the optimum. This has motivated the search for nonlinear
equalizers providing a nonlinear decision function. Nonlinear equalizers
employing artificial neural networks (ANNs) [15], [16], [17] and radial basis function (RBF)
networks [15], [18], [19] were successfully developed. Nonlinear equalizers using ANN and
RBF networks were shown to provide superior performance to linear equalizers for channels
corrupted with ISI and AWGN [20]. The ANN equalizers, however, suffered from poor
convergence, and although RBF equalizers provided the localized functional behavior required
by the optimal equalizer, their centers were difficult to train. These drawbacks prompted the
search for different nonlinear equalization techniques. As a result, a fuzzy equalizer based on
a fuzzy adaptive filter was suggested in [21], and a fuzzy system
related equalizer was offered in [22]. These equalizers were found to perform well,
but the Bayesian equalizer decision function could not be found, and
the fuzzy adaptive filter based equalizer demanded high computational complexity.
Fuzzy logic is based on fuzzy rules that use input-output data pairs of the channel. This
type of adaptive equalizer operates by processing numerical data and linguistic information.
A fuzzy equalizer depends on fuzzy IF-THEN rules which are determined by human experts.
These rules use the channel’s input-output data pairs and carry out the construction of the
filter for the nonlinear channel. The bit error rate (BER) and adaptation speed can be improved
by the linguistic and numerical information.
Digital communications involving quadrature amplitude modulation (QAM) can apply the
fuzzy filter with both linear and nonlinear channel characteristics as has been achieved in this
thesis. The present study proposes a complex fuzzy adaptive filter with changeable fuzzy IF-
THEN rules, which is an extension of the real fuzzy filter. The filter inputs and outputs are all
complex valued. However, the inputs of the channels are real reciprocals of the modulated
complex transmitted inputs and the equalizer outputs are real reciprocal estimates of the
reciprocal channel input signals. Afterwards, the reciprocal normalized equalizer outputs are
denormalized to form the final complex-valued, equalized estimate outputs of the receiver.
This technique, which is primarily based on normalization and is applied directly at the
transmitter, presents a new method of successfully equalizing complex-valued
QAM signals which are severely distorted in both linear and hostile time-varying nonlinear
channel environments, by using real-valued reciprocals of the signals in question. In addition
to the methodology, the membership functions derived from the training data set and the
gradient-descent learning algorithm used for training are significant elements
of the nonlinear neuro-fuzzy equalizer that make this adaptive channel equalization possible. Its
superiority rests not only on its high equalization performance but also on its capability of
minimizing or eliminating nonlinear channel distortions, which linear equalizers in general
cannot do. In turn, fuzzy logic based neuro-fuzzy equalization is shown to
be efficient on a complex scheme such as QAM, with high approximation ability
in nonlinear problems in addition to linear ones.
A fuzzy adaptive filter is based on a set of fuzzy IF-THEN rules that change
adaptively in order to minimize some criterion function as new information becomes
available [35]. A recursive least squares (RLS) adaptation algorithm is used by the fuzzy
adaptive filter. The construction of the RLS fuzzy adaptive filter is accomplished in the
following four steps:
1) Defining fuzzy sets in the filter input space U ⊂ R^n whose membership functions
cover U;
2) Constructing a set of fuzzy IF-THEN rules that either human experts determine or the
adaptation procedure determines by matching input-output data pairs;
3) Constructing a filter that is based on the set of rules; and,
4) Updating the filter’s free parameters by utilizing the RLS algorithm.
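The four steps above can be sketched for a one-dimensional input. This is a simplified illustration under explicit assumptions: Gaussian membership functions with fixed centers and width, one rule per fuzzy set with a single free parameter in its THEN part, normalized (fuzzy basis function) inference, and the standard RLS recursion with forgetting factor lam and initial matrix delta * I. None of these specific choices is taken from the thesis, and the names fuzzy_basis and rls_fuzzy_filter are hypothetical.

```python
import numpy as np

def fuzzy_basis(x, centers, sigma):
    """Steps 1-3: Gaussian memberships covering the input space, normalized per rule."""
    membership = np.exp(-0.5 * ((x - centers) / sigma) ** 2)
    return membership / membership.sum()

def rls_fuzzy_filter(inputs, desired, centers, sigma=0.3, lam=1.0, delta=1e4):
    """Step 4: update the free parameters (rule outputs) with the RLS recursion."""
    num_rules = len(centers)
    theta = np.zeros(num_rules)           # one free parameter per IF-THEN rule
    P = delta * np.eye(num_rules)
    for x, d in zip(inputs, desired):
        p = fuzzy_basis(x, centers, sigma)
        gain = P @ p / (lam + p @ P @ p)
        theta = theta + gain * (d - theta @ p)   # correct by the a priori error
        P = (P - np.outer(gain, p @ P)) / lam
    return theta

# Hypothetical test: learn a target generated by known rule outputs.
rng = np.random.default_rng(1)
centers = np.linspace(-1.0, 1.0, 5)
theta_true = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
inputs = rng.uniform(-1.0, 1.0, size=400)
desired = np.array([theta_true @ fuzzy_basis(x, centers, 0.3) for x in inputs])

theta = rls_fuzzy_filter(inputs, desired, centers, sigma=0.3)
predictions = np.array([theta @ fuzzy_basis(x, centers, 0.3) for x in inputs])
print(np.max(np.abs(predictions - desired)))
```

Linguistic knowledge would enter this sketch through the initial value of theta: rules supplied by an expert give a starting point better than zero, which is why such information can speed up adaptation.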
The fuzzy adaptive filter’s main advantage is the possibility of integrating linguistic
information (in the form of fuzzy IF-THEN rules) and numerical information (in the form of
input-output pairs) into the filter uniformly. When the fuzzy
adaptive filter is applied to equalization problems for nonlinear communication channels, the
following observations on RLS and LMS are made:
1) The RLS algorithm converges faster than the LMS algorithm.
2) Incorporating some linguistic description of the channel, in fuzzy terms,
into the fuzzy adaptive filter greatly enhances the adaptation speed of RLS.
3) The fuzzy equalizer’s bit error rate is close to the bit error rate of the
optimal equalizer.
4) The excess mean-square error of the RLS algorithm tends towards zero as the
number of iterations approaches infinity.
Development of neuro-fuzzy system in order to equalize channel distortion includes the
following steps:
-First, the methodologies utilized to equalize channel distortions are analyzed and state of
application problems of neural and fuzzy technologies for the development of an equalizer is
considered.
-Second, the data transmission structure is explained and the operation structure of adaptive
channel equalization utilizing neuro-fuzzy network is presented.
-Third, the mathematical model of the neuro-fuzzy network for the development of
equalization system for channel distortion is presented. The learning algorithm of neuro-
fuzzy system is considered.
-Fourth, the development of the neuro-fuzzy equalizer for channel distortion is presented.
-Fifth, the QAM signaling is explained and its application on nonlinear neuro-fuzzy network
is presented. The simulation results of the equalizer using QAM signals and analytical tables
demonstrating the performance of the equalizer are presented. Additionally, tables comparing
the different QAM constellations are presented.
1.5 Summary
In this chapter, the application of channel equalization is explained. The types of distortions in
channels and the types of equalizers used to minimize them are explained with their
classifications and properties. Performance criteria of equalizers, namely bit error rate (BER),
signal-to-noise ratio (SNR) and convergence rate with their ideal indications are stated.
Neural networks and fuzzy logic are discussed and explained with their structures
and features. The methods of equalization using neural networks, specifically filtering, are
described. Different types of algorithms, networks and equalizers used especially for difficult
nonlinear channels are defined.
Fuzzy IF-THEN rules which constitute the basis of fuzzy logic are described to point out their
significance in channel equalization. The steps of constructing a fuzzy adaptive filter using
these rules are defined. The methods used in equalizing QAM signals applied on neuro-fuzzy
network and the gradient-descent learning algorithm as part of the equalization system are
described as well.
CHAPTER 2
STRUCTURE OF CHANNEL EQUALIZATION
2.1 Overview
All communications systems are composed of three fundamental subsystems which are
transmitter, channel and receiver (Fig. 2.1). The transmitter’s task is to convert the information
signal into a form which is convenient for transmission and to send it through the physical
channel or transmission medium. The receiver’s task, on the other hand, is to produce an
accurate replica of the transmitted symbol sequence by recovering the message signal that the
received signal contains. The communications channel acts as a connector between the
transmitter and the receiver sending the electrical signal from the transmitter to the receiver.
The unknown channel characteristics cause distortions to the transmitted signal before it
reaches the receiver.
Figure 2.1 Basic components of a communications system
Digital communications systems are preferred more compared with the analog ones due to
increasing demand for data communication and because digital transmission provides data
processing options and flexibilities that analog transmission cannot offer. The distinguishing
feature of a digital communications system is that it sends a waveform from a finite set of
possible waveforms during a finite interval of time as opposed to an analog communication
system that transmits a waveform from unlimited number of various waveforms which have
theoretically infinite resolution. The message from the source which is represented by an
information waveform is encoded before transmission so that transmission error can be
detected and corrected by the receiver. At the receiver end, the message signal must be
decoded before being used. The distortions preventing the correct transmission of signals are
mainly intersymbol interference (ISI) and noise. Noise refers to unwanted electrical
signals which exist in electrical systems. Channel equalization is an efficient technique
employed to reduce or eliminate the obscuring effect of distortion caused in the channel. This
chapter outlines the structure of data transmission system and the functions of its main
components as well as the equalization of channel distortion.
2.2 Architecture of Data Transmission Systems
A communications channel is an electrical medium which connects the transmitter and the
receiver, providing the data transmission from a source which generates the information to
one or more destinations. In the analysis and design of communication systems, the
characteristics of the physical channels through which the information is transmitted, are of
particular importance. Wire lines or free space may be used in the communications path from
the transmitter to the receiver. The examples for wire lines are coaxial cables, wire pairs and
optical fibers. These are widely used in terrestrial telephone networks, although infrared
and optical free space links, as in video, remote controls for TV and hi-fi equipment as well
as some security systems, may also be used in different situations. The
transmission medium is where most of the attenuation and noise is observed [23].
The receiver functions to reverse the signal processing steps performed by the transmitter
recovering the original message signal by compensating for any signal deteriorations caused
by the channel. This involves amplification, filtering, demodulation and decoding and in
general is a more complex task than the transmitting process.
There are many reasons why digital communication systems are preferred over analog
systems, even though digital communication systems (DCSs) represent an increase in
complexity over the equivalent analog systems. The principal reasons for preferring DCSs
to analog communication systems are:
1. The ease with which digital signals, compared with analog signals, are regenerated.
2. Digital systems are not as prone to distortion and interference as analog systems.
3. Increased demand for data transmission.
4. Increased scale of integration, sophistication and reliability of digital electronics for
signal processing, combined with decreased cost.
5. Facility to source code for data compression.
6. Possibility of channel coding (line and error control coding) to minimize the effects of
noise and interference.
7. Ease with which bandwidth, power and time can be traded off in order to optimize the
use of these limited resources.
8. Standardization of signals, irrespective of their type, origin or the services they
support, leading to an integrated services digital network (ISDN).
9. Digital hardware can be implemented more flexibly than analog hardware.
10. Various types of digital signals such as data, telephone, TV and telegraph can be
considered identical signals in transmission and switching [24].
Modulation, which is part of the transmission and equalization process, involves encoding
information from a message source in a way that is convenient for transmission. It is
accomplished by translating a baseband message signal to a bandpass signal at frequencies
which are quite high when compared with the frequency of baseband. It is also referred to as
the mapping of the baseband input information waveform into the bandpass signal. The
bandpass signal is referred to as the modulated signal and the baseband message signal is
referred to as the modulating signal. Modulation can be accomplished by varying the
frequency, phase or amplitude of a high frequency carrier in conformity with the amplitude of
the message signal. Demodulation, on the other hand, is the process of extracting the
baseband message from the carrier in order to enable the aimed receiver (also known as the
sink) to process and interpret it. In digital wireless communication systems, it’s possible to
represent the modulating signal as a time sequence of pulses or symbols, where each symbol
has m finite states. Each symbol represents n bits of information, where
n = log2 m bits/symbol [4].
The block diagram illustrated in Fig. 2.2 can describe communications systems. The source of
data is the signal generator that produces the information to be transmitted and modulated.
This information is in the form of a message symbol that can consist of a single bit or a
grouping of bits.
In order to make the transmission more efficient in terms of the time it takes and/or the
bandwidth it requires, an encoder is employed as a signal processor that converts the source’s
digital information into binary form, i.e. each symbol is encoded as a binary word. Encoding is
also performed so as to enable the signal processor in the receiver to detect and correct errors,
which minimizes or eliminates bit errors caused by noise in the
channel.
The procedure used for detecting and correcting errors is called coding. Coding involves
adding redundant (extra) bits to the stream of data. The redundant bits, such as parity bits, are
used by the decoder to correct errors at the receiver output, although a high
degree of redundancy may increase the bandwidth of the encoded signal. Codes can be
classified into two broad categories: block codes and convolutional codes. The main
difference is that a block coder is a memoryless device whereas a convolutional coder has
memory. Hamming Codes, Golay Codes, Hadamard Codes, Cyclic
Codes, BCH (Bose-Chaudhuri-Hocquenghem) Codes and Reed-Solomon Codes are some
examples of block codes. In addition to block codes and convolutional codes, a newer family of
codes, called turbo codes, has come into use and is being incorporated in 3G wireless standards.
Turbo codes combine the capabilities of convolutional codes with channel estimation theory
and can be thought of as nested or parallel convolutional codes. When implemented properly,
turbo codes allow coding gains which are far superior to all previous error correcting codes
and permit a link of wireless communications to come surprisingly near to realizing the
Shannon capacity bound [4].
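As an illustration of how redundant bits allow errors to be corrected, the following sketch implements a Hamming (7,4) block code, one of the block codes named above, in systematic form. The particular parity matrix is one standard choice and is an assumption of this example; the thesis does not state which code, if any, was used in its simulations.

```python
import numpy as np

# Systematic Hamming (7,4): codeword = [4 data bits | 3 parity bits].
PARITY = np.array([[1, 1, 0],
                   [1, 0, 1],
                   [0, 1, 1],
                   [1, 1, 1]])
GENERATOR = np.hstack([np.eye(4, dtype=int), PARITY])        # 4 x 7
PARITY_CHECK = np.hstack([PARITY.T, np.eye(3, dtype=int)])   # 3 x 7

def hamming_encode(data_bits):
    """Append three parity bits to four data bits."""
    return (np.asarray(data_bits, dtype=int) @ GENERATOR) % 2

def hamming_decode(received_bits):
    """Correct at most one bit error using the syndrome, then return the data bits."""
    word = np.array(received_bits, dtype=int) % 2
    syndrome = (PARITY_CHECK @ word) % 2
    if syndrome.any():
        # The syndrome equals the column of the parity-check matrix at the error position.
        position = int(np.where((PARITY_CHECK.T == syndrome).all(axis=1))[0][0])
        word[position] ^= 1
    return word[:4]

# Hypothetical example: flip one bit in the channel and recover the data.
data = np.array([1, 0, 1, 1])
codeword = hamming_encode(data)
corrupted = codeword.copy()
corrupted[2] ^= 1
print(codeword, corrupted, hamming_decode(corrupted))
```

The code is memoryless in the sense used above: each block of four data bits is encoded independently of all other blocks.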
Figure 2.2 Architecture of a digital communications system [39]. Transmitter: data source,
encoder, filter, modulator. Physical channel with AWGN. Receiver: demodulator, filter,
equalizer, decision device, decoder.
Each digital word has n binary digits, so there are $M = 2^n$ unique code words, each of
which corresponds to a certain amplitude level. However, each sample value from the analog
signal could be any one of an infinite number of levels, so the digital word which represents
the amplitude closest to the actual sampled value is used. This is known as quantizing [2]. In
this thesis study, Gray coding was used to map bits along the in-phase and quadrature axes of
the QAM constellation. The Gray code was selected because only one bit changes for each step
change in the quantized level. Multisymbol signaling can be thought of as a coding or bit
mapping process in which n binary symbols (bits) are mapped into a single M-ary symbol. A
detection error in a single symbol can therefore translate into several errors in the
corresponding decoded bit sequence. The bit error rate (BER) therefore depends not only on the
probability of symbol error and the symbol entropy, but also on the code or bit mapping used
and the types of error which occur. If a Gray code is used to map binary symbols to phasor
states, the most probable type of error, in which a symbol is mistaken for an adjacent state,
results in only a single decoded bit error [23]. Consequently, single errors in the receiver
cause minimal errors in the recovered level.
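The single-bit-change property of the Gray code described above can be checked with a short
Python sketch. The helper names are hypothetical, and the 16 levels used in the usage note are
an arbitrary choice for illustration.

```python
def binary_to_gray(level):
    """Return the Gray code word (as an integer) for a quantized level index."""
    return level ^ (level >> 1)


def gray_to_binary(gray):
    """Invert binary_to_gray by XOR-ing together all right shifts of the word."""
    level = 0
    while gray:
        level ^= gray
        gray >>= 1
    return level


def hamming_distance(a, b):
    """Number of bit positions in which two code words differ."""
    return bin(a ^ b).count("1")
```

For example, `[format(binary_to_gray(i), "04b") for i in range(16)]` lists the code words for 16
levels, and `hamming_distance` between any two neighboring words is 1, so mistaking a level for
its neighbor corrupts only one bit.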
There are many criteria used in the evaluation of the performance of a communications
system. The optimum system that is considered close to being ideal or perfect for digital
systems is the one that minimizes the bit error rate (BER) at the receiver output subject to
constraints on channel bandwidth and transmitted energy. This raises the question of whether a
system can be devised with no bit errors at the output even when there is noise in the channel.
Shannon demonstrated in 1948 that a channel capacity C (bits/s) can be calculated such that, if
the rate of information is less than C, the probability of error approaches zero.
In this case, the maximum possible bandwidth efficiency $\eta_{B\max}$, which is defined as the
capability of a modulation scheme to accommodate data within a limited bandwidth, is
restricted by the channel noise and is stated by the channel capacity formula in Eq. 2.1.
Shannon’s channel capacity formula is applicable to AWGN and is given by
$$ \eta_{B\max} = \frac{C}{B} = \log_2\left(1 + \frac{S}{N}\right) $$
or    (2.1)
$$ C = B \log_2\left(1 + \frac{S}{N}\right) $$
in which C is the channel capacity (bits per second), B is the transmission bandwidth (Hz), S
is the average power of the received signal and N is the average power of the white Gaussian
noise within the bandwidth B. S/N is called the signal-to-noise ratio. Shannon also showed that
the errors induced by a noisy channel can be reduced to any desired level by encoding the
information properly, without sacrificing the rate of information transfer.
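As a worked illustration of the capacity formula in Eq. 2.1, the following Python sketch
evaluates C and the maximum bandwidth efficiency for a given bandwidth and signal-to-noise
ratio. The function names and the example figures (3 kHz bandwidth, S/N of 1023) are
assumptions chosen for round numbers, not values from the thesis.

```python
import math


def db_to_linear(snr_db):
    """Convert a signal-to-noise ratio in dB to a linear power ratio."""
    return 10 ** (snr_db / 10)


def max_bandwidth_efficiency(snr_linear):
    """Maximum bandwidth efficiency C/B in bits/s/Hz for a linear S/N."""
    return math.log2(1 + snr_linear)


def channel_capacity(bandwidth_hz, snr_linear):
    """Shannon capacity C = B * log2(1 + S/N) in bits per second."""
    return bandwidth_hz * max_bandwidth_efficiency(snr_linear)


# Assumed example: B = 3 kHz and S/N = 1023 give log2(1024) = 10 bits/s/Hz.
example_capacity = channel_capacity(3000, 1023)
```

Because the capacity grows only logarithmically with S/N, doubling the bandwidth at a fixed S/N
raises C far more than doubling the signal power does.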
The physical medium or channel through which the message signal is transmitted induces
distortions such as intersymbol interference (ISI) and noise. The receiver, on the other hand,
is responsible for separating the source information from the received modulated signal, which
is corrupted by noise that is usually modeled as random, additive white Gaussian noise (AWGN).
The receiver’s duty is to take the corrupted signal at the output of the channel and to convert
it to a baseband signal that the baseband processor can handle. The baseband processor removes
or reduces the corruption in this signal and delivers an estimate of the source information to
the output of the communications system [2]. Demodulation is applied to the received signal in
order to recover the transmitted signal in its baseband form and make it ready to be processed
by the receiver filter. Finally, the decision device reconstructs the encoded message signal
from the equalizer output and the decoder reconstructs the sequence of transmitted signals by
performing the inverse operation of the encoder.
2.3 Channel Characteristics
Each channel has a frequency band that is appropriate for its transmission medium. The
transmitter carrier circuits convert the processed baseband signal into this frequency band. If
the channel is a fiber-optic cable, for example, the carrier circuits convert the baseband
input to light frequencies and the transmitted signal is light.
Channels are classified as wire and wireless channels. Examples of wire channels are coaxial
cables, fiber-optic cables, twisted-pair telephone lines and waveguides, whereas air, vacuum
and seawater are examples of wireless channels.
Channel characteristics may impose constraints that favor a particular type of signaling.
Generally, the channel attenuates the signal so that the noise of the channel, or the noise
introduced by an imperfect receiver, causes the delivered information to deteriorate from that
of the source [2]. Noise comes from various sources: natural electrical disturbances such as
lightning, and artificial sources such as ignition systems in cars, switching circuits in a
digital computer or high voltage transmission lines. The channel may contain amplifying
devices, such as satellite transponders in space communication systems or repeaters in
telephone systems, that keep the signal above the noise level. In addition, the channel may
contain multiple paths between its input and output, each with its own time delay and
attenuation characteristics. The attenuation characteristics may vary with time, which makes
the signal fade at the channel output. Fading of this type can be observed while listening to
distant shortwave stations.
Another significant characteristic of channels is bandwidth. In general terms, bandwidth is
defined as the width of a positive frequency band for waveforms whose magnitude spectra are
even about the origin f = 0. The bandwidth of a channel must be large enough to accommodate the
signal but reject the noise. A high bandwidth allows more users to be assigned as well as more
information to be transmitted. Some examples of band limited channels are telephone channels
and digital microwave radio channels. When the channel is band limited to W Hz, any frequency
components above W will not be passed by the channel. In turn, the bandwidth of the transmitted
signal is limited to W Hz as well. When the channel is not ideal for |f| ≤ W, signal
transmission at a symbol rate equal to or exceeding W results in intersymbol interference (ISI)
among a number of adjacent symbols. In addition to telephone channels, there are other physical
channels which exhibit some form of time dispersion and thus introduce ISI. Radio channels such
as shortwave ionospheric propagation (HF) and tropospheric scatter are two examples of
time-dispersive channels. In these channels, time dispersion, and hence ISI, is the consequence
of multiple propagation paths that have different path delays [1]. In addition to noise,
multipath propagation and ISI, there are other impairments in channels, specifically nonlinear
distortion, frequency offset and phase jitter. Channel impairments affect the transmission rate
over the channel and the modulation technique to be used. Depending on the rate, bandwidth
efficient modulation techniques and some form of equalization are employed accordingly.
2.4 Channel Distortions
Channels which are used to transmit data distort signals in both amplitude and phase. In
addition to the nature of the channel itself, other factors like linear distortion, nonlinear
distortion and frequency offset are significant factors causing these distortions.
Linear distortion occurs in linear time-invariant systems in which channels are characterized
as band-limited linear filters. Those channels like telephone channels are part of digital
communications systems where distortionless transmission is highly desired. A linear time-
invariant system will produce two types of linear distortion which are amplitude distortion
and phase distortion. In order to have distortionless transmission with linear time-invariant
systems, the first requirement is that the transfer function of the channel must be given by
$$ H(f) = \frac{Y(f)}{X(f)} = A e^{-j 2\pi f T_d} $$    (2.2)
which means that in order to have no distortion at the system output, the following
requirements have to be met:
1. Flat amplitude response. That is,
$$ |H(f)| = \text{constant} = A $$    (2.3a)
2. A phase response that is a linear function of frequency. That is,
$$ \theta(f) = \angle H(f) = -2\pi f T_d $$    (2.3b)
When the first condition is satisfied, no amplitude distortion exists and when the second
condition is satisfied, no phase distortion exists. The second requirement is related with the
time delay of the system and it is defined as
$$ T_d(f) = -\frac{1}{2\pi f}\,\theta(f) = -\frac{1}{2\pi f}\,\angle H(f) $$    (2.4)
and it is compulsory that
$$ T_d(f) = \text{constant} $$    (2.5)
for distortionless transmission. If $T_d(f)$ is not constant, there is phase distortion since
the phase response, $\theta(f)$, is not a linear function of frequency.
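The two distortionless conditions can be checked numerically. The sketch below builds the ideal
transfer function of Eq. 2.2 and evaluates the time delay of Eq. 2.4 at several frequencies.
The function names, the gain of 0.8 and the delay of 0.4 ms are assumptions for illustration.
The frequencies are kept low enough that the phase stays within one turn, since `cmath.phase`
returns a wrapped angle.

```python
import cmath
import math


def delay_channel(f, gain, delay):
    """Ideal distortionless transfer function H(f) = A * exp(-j*2*pi*f*Td)."""
    return gain * cmath.exp(-2j * math.pi * f * delay)


def time_delay(h, f):
    """Time delay Td(f) = -angle(H(f)) / (2*pi*f), valid while |angle| < pi."""
    return -cmath.phase(h) / (2 * math.pi * f)


# Assumed example values: A = 0.8, Td = 0.4 ms, test tones from 100 Hz to 1 kHz.
GAIN, DELAY = 0.8, 0.4e-3
results = [
    (f, abs(delay_channel(f, GAIN, DELAY)), time_delay(delay_channel(f, GAIN, DELAY), f))
    for f in range(100, 1001, 100)
]
```

Every entry of `results` has the same magnitude and the same delay, which is what the
conditions (2.3a) and (2.5) require. A channel whose computed delay varied with frequency would
exhibit phase distortion.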
Nonlinear distortion in telephone channels arises from nonlinearities in amplifiers and
compandors used in the telephone system. This type of distortion is usually small and it is
very difficult to correct [1]. There will be nonlinear distortion on the output signal if the
voltage gain coefficients of second and higher order are not zero. There are three types of
nonlinear distortion associated with amplifiers: harmonic distortion, intermodulation
distortion (IMD) and cross-modulation distortion. Harmonic distortion appears at the amplifier
output as components at integer multiples of the input frequency and is caused by the second
and higher order terms of the amplifier input-output equation. Intermodulation distortion is
produced by the cross-product terms of the amplifier input-output equation, whereas
cross-modulation distortion is caused by the third order distortion products of the amplifier
output.
In addition to linear and nonlinear distortions, signals transmitted through telephone channels
are subject to the impairment of frequency offset. A small frequency offset which is mostly
less than 5 Hz, results from the use of carrier equipment in the telephone channel. High-speed
digital transmission systems that use synchronous phase-coherent demodulation cannot
tolerate this type of offset. This offset is compensated for by the carrier recovery loop in the
demodulator.
Phase jitter is basically a low-index frequency modulation of the transmitted signal with the
low frequency harmonics of the power line frequency. Phase jitter poses a serious problem in
digital transmission of high rates. Yet, it can be tracked and compensated for, to some extent,
at the demodulator.
Distortion can occur within the transmitter, the receiver and the channel. As opposed to noise
and interference, distortion disappears when the signal is turned off.
2.4.1 Multipath propagation
Multipath fading occurs to varying extents in many different radio applications. It is caused
whenever radio energy reaches the receiver by more than one path. Multiple paths may also
occur due to ground reflections, reflections from stable tropospheric layers and refraction by
tropospheric layers with extreme refractive index gradients [23]. Scattering obstacles also
cause multipath propagation in other systems such as urban cellular radio systems.
There are two principal effects of multipath propagation on systems, their relative severity
depending essentially on the relative bandwidth of the resulting channel compared with that of
the signal being transmitted. The fading process is governed by changes in atmospheric
conditions for fixed point systems such as the microwave radio relay network. The path delay
spread often is adequately short for the channel frequency response to be essentially constant
over its operating bandwidth. If that happens, fading is considered flat because all signal
frequency components become prone to the same fade at any given instant. In the case of path
delay spread being longer, the channel frequency response is likely to change rapidly on a
frequency scale that can be compared with signal bandwidth. If that happens, the fading is
considered frequency selective and the received signal is subject to severe amplitude and
phase distortion. Adaptive equalizers may then be required to flatten and linearize the overall
characteristics of channel. The flat fading effects can be combated by increasing transmitter
power whilst the effects of frequency selective channel cannot. A fade margin is usually
designed into the link budget to offset the expected multipath fades for microwave links
which are subject to flat fading. The magnitude of this margin depends on the required
availability of the link.
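The difference between flat and frequency selective fading can be illustrated with a two-ray
channel, i.e. a direct path plus a single echo. This is a simplified model assumed for the
example, and the function names, echo gain, delays and bandwidth are hypothetical values. The
sketch measures how much the magnitude response varies across the signal bandwidth.

```python
import cmath
import math


def two_ray_response(f, echo_gain, delay_spread):
    """Frequency response of a direct path plus one echo delayed by delay_spread seconds."""
    return 1 + echo_gain * cmath.exp(-2j * math.pi * f * delay_spread)


def ripple(bandwidth, echo_gain, delay_spread, points=201):
    """Ratio of the largest to the smallest |H(f)| over a band centered on f = 0."""
    freqs = [-bandwidth / 2 + i * bandwidth / (points - 1) for i in range(points)]
    mags = [abs(two_ray_response(f, echo_gain, delay_spread)) for f in freqs]
    return max(mags) / min(mags)


# Assumed example: 1 MHz signal bandwidth and an echo at half the direct amplitude.
flat_case = ripple(1e6, 0.5, 10e-9)      # delay spread much shorter than 1/bandwidth
selective_case = ripple(1e6, 0.5, 2e-6)  # delay spread longer than 1/bandwidth
```

With the short delay spread the response is essentially constant over the band (flat fading),
whereas with the long delay spread the response swings between reinforcement and cancellation
inside the band (frequency selective fading), which is the case where an equalizer is needed.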
Multiple propagation paths that have different path delays cause time dispersion and ISI in
time-dispersive channels. These channels are called time-variant multipath channels because
the number of paths and the relative time delays among the paths vary with time. The
time-variant multipath conditions give rise to a wide variety of frequency response
characteristics. Consequently, the frequency response characterization that is used for
telephone channels is inappropriate for time-variant multipath channels. Instead, these radio
channels are characterized statistically by the scattering function. The scattering function is
a two-dimensional representation of the average received signal power as a function of Doppler
frequency and relative time delay.
2.4.2 Intersymbol interference
Rectangular pulse signaling, in principle, has a spectral efficiency of 0 bits/s/Hz since each
rectangular pulse has infinite absolute bandwidth. In practice, of course, rectangular pulses
can be transmitted over channels with finite bandwidth if a degree of distortion can be
tolerated.
In digital communications, it might appear that distortion is unimportant since a receiver must
only distinguish between pulses which have been distorted in the same way. However, if the
pulses are filtered improperly as they pass through a communications system, i.e. if the
distortion is severe enough, they will spread in time. The decision instant voltage might then
arise not only from the current symbol but also from one or more preceding pulses. Intersymbol
interference (ISI) is caused when the pulse for each symbol is smeared into adjacent time
slots. With a restricted bandwidth, the pulses have rounded tops instead of flat ones. What
matters about ISI is the decision instant, i.e. the sampling instant (or sampling point) in
each time slot of the received waveform at which the symbol decision is made. It is at this
point that ISI takes effect, because the smearing causes unwanted contributions from the
adjacent pulses that are likely to degrade bit error rate (BER) performance. This leads to an
important point: the performance of digital communications systems depends only on ISI at the
decision instants. ISI that occurs at times other than the decision instants does not matter
[23].
If the signal pulses could be made to pass through zero at every decision instant except one,
then ISI would no longer be a problem. This suggests a definition of an ISI-free signal: a
signal which passes through zero at all but one of the sampling instants is an ISI-free signal
[23].
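A standard example of a pulse that meets this ISI-free definition is the sinc pulse, which is
used here purely as an illustration and is not named in the text above. The following Python
sketch evaluates it at the sampling instants; the function name and the symbol period of 1 ms
are assumptions.

```python
import math


def sinc_pulse(t, symbol_period):
    """Sinc pulse sin(pi*t/T) / (pi*t/T) with unit peak at t = 0."""
    x = t / symbol_period
    if x == 0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


# Assumed symbol period of 1 ms; sample the pulse at the instants -5T, ..., 5T.
T = 1e-3
samples_at_decision_instants = [sinc_pulse(n * T, T) for n in range(-5, 6)]
value_between_instants = sinc_pulse(0.5 * T, T)
```

The samples are zero (to within rounding) at every sampling instant except n = 0, so
neighboring pulses do not disturb each other at the decision instants, even though the pulse
is clearly non-zero between them.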
When information is transmitted with pulses over an analog channel, the original signal is a
discrete time sequence (or an acceptable approximation), whereas the received signal is a
continuous time signal. The channel can be considered a low-pass analog filter which smears or
spreads the impulse train into a continuous signal with peaks that are related to the
amplitudes of the original pulses. Mathematically, the operation can be described as the
convolution of the pulse sequence with a continuous time channel response. The starting point
is the convolution integral:
$$ x(k) = h(k) * s(k) = \int_{-\infty}^{\infty} h(\tau)\, s(k-\tau)\, d\tau = \int_{-\infty}^{\infty} s(\tau)\, h(k-\tau)\, d\tau $$    (2.6)
where x(k) denotes the received signal, h(k) denotes the channel impulse response and s(k)
denotes the input signal. The second integral on the right side of the above equation
illustrates the commutativity property of the convolution operation.
The component s(k) is the input pulse train, which consists of periodically transmitted
impulses of varying amplitudes. For that reason,
$$ s(k) = \begin{cases} s_n, & k = nT \\ 0, & k \neq nT \end{cases} $$    (2.7)
where T is the symbol period. This means that the only significant values of the variable of
integration in the second integral of equation (2.6) are those for which τ = nT. Any other
value of τ amounts to multiplication by 0 and, for that reason, x(k) can be stated as
$$ x(k) = \sum_{n} s_n\, h(k - nT) $$    (2.8)
Although the above equation for x(k) resembles the convolution sum, it is nevertheless the
description of a continuous time system. It shows that the received signal is the sum of a
large number of shifted and scaled impulse responses of the continuous time system. The
amplitudes of the transmitted pulses of s(k) scale the impulse responses.
When x(k) is sampled at the decision instant k = NT, the term of Eq. 2.8 with n = N is the
component of x(k) due to the Nth symbol. It is multiplied by the centre tap h(0) of the channel
impulse response. The other product terms in the summation are the ISI terms: the input pulses
in the neighborhood of the Nth symbol are scaled by the appropriate samples in the tails of the
channel impulse response.
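The split of Eq. 2.8 into a desired term and ISI terms at the decision instant k = NT can be
sketched as follows. The function name, the dictionary representation of the sampled channel
response h(mT) and the numeric values are assumptions for illustration only.

```python
def received_sample(symbols, taps, N):
    """Split x(NT) = sum_n s_n * h((N - n)T) into the desired term and the ISI term.

    symbols: list of transmitted amplitudes s_n
    taps:    dict mapping the integer offset m to the channel sample h(mT)
    N:       index of the symbol being detected
    """
    desired = symbols[N] * taps.get(0, 0.0)  # centre tap h(0) scales the Nth symbol
    isi = sum(
        symbols[n] * taps.get(N - n, 0.0)    # tail samples scale the neighboring symbols
        for n in range(len(symbols))
        if n != N
    )
    return desired, isi


# Assumed example: binary symbols and a channel with one precursor and one postcursor tap.
symbols = [1, -1, 1, 1, -1]
taps = {-1: 0.1, 0: 1.0, 1: 0.3}
desired, isi = received_sample(symbols, taps, 2)
```

Here the sample used for the decision is `desired + isi`. The decision remains correct as long
as the ISI term is too small to move the sample across the decision threshold.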
2.4.3 Noise
In communications systems, the received waveform is usually classified as the desired part
which contains the information and the extraneous or unwanted part. The desired part is the
signal and the unwanted part is the noise. Noise limits our ability to communicate and causes
more power to be consumed during the transmission of information. The effects of noise can be
reduced by increasing the power of the transmitted signal. Yet, factors like equipment and
various practical limitations restrict the level of power in the transmitted signal.
The most frequently encountered problem in the transmission of signals through any channel
is additive noise that is generally generated internally at the receiver end by components like
solid-state devices of a subsystem and resistors employed in the implementation of the
communications system. That is at times referred to as thermal noise. Thermal noise is
produced by the random motion of free charge carriers (usually electrons) in a resistive
medium. Additive noise generated by the electronic components is usually found in a storage
system’s readback signal, as in the case of a radio or telephone communication system. When
such noise occupies the same frequency band that the desired signal occupies, suitable design
of the transmitted signal and its demodulator at the receiver can minimize its effect [23].
Another problem in transmission is non-thermal noise such as shot noise. Although the time
averaged current flowing in a device may be constant, statistical fluctuations will be present
if individual charge carriers have to pass through a potential barrier. The potential barrier
may, for example, be the junction of a PN junction diode, the cathode of a vacuum tube or the
emitter-base junction of a bipolar transistor. Such statistical fluctuations constitute shot
noise.
Noise that arises from external sources can be coupled into a communication system by the
receiving antenna. Antenna noise originates from several different sources. Below 30 MHz it is
dominated by the broadband radiation produced in lightning discharges associated with
thunderstorms. This radiation is trapped by the ionosphere and propagates worldwide.
Such noise is sometimes referred to as atmospheric noise.
Noise can be classified into categories as:
a. White noise: A stochastic process which has a flat power spectral density over the
entire frequency range. Because of its wideband character, this type of noise cannot be
expressed in terms of quadrature components. However, in problems concerning the
demodulation of narrowband signals in noise, it is mathematically convenient to model
the additive noise process as white and to represent the noise in terms of quadrature
components. This can be accomplished by postulating that the signals and noise at the
receiver have passed through an ideal bandpass filter whose passband includes the
spectrum of the signals but is much wider. The noise that results from passing the
white noise process through a spectrally flat bandpass filter is referred to as
bandpass white noise.
b. Electromagnetic Noise: Usually found in electrical devices like television and radio
transmitters and receivers. They can be present at all frequencies.
c. Impulse Noise: An additive disturbance which arises primarily from the switching
equipment in the telephone system. It is made up of short-duration pulses having
random duration and amplitude.
d. Acoustic Noise: Present in almost all conversations and in telecommunications
environments such as telephone circuits and hands-free telephones. It may be
unnoticeable or distinct, depending on the time delay involved. If the delay between
the speech and its echo (noise) is short, the noise is not perceived as a separate
sound but as a form of spectral distortion referred to as reverberation. If, however,
the delay exceeds a few tens of milliseconds, the noise is distinctly noticeable [25].
Background noise generated in a car cabin, by air conditioners and by computer fans are
some types of acoustic noise.
e. Processing Noise: Modeled as a zero-mean, white-noise process in data
communication systems. It is the result of digital/analog processing of signals, e.g.
lost data packets in digital data communications systems or quantization noise in
digital coding of image or speech.
f. Colored Noise: A Gaussian type of noise which belongs to wideband signal processes
with a non-constant spectrum. Autoregressive noise and brown noise are some examples
of non-white, colored noise.
Gaussian noise, and specifically additive white Gaussian noise (AWGN), is the most
frequently encountered type of noise in communication systems. It represents the simplest
mathematical model for a communication channel. The channel models listed below describe the
effects of noise on electrical communication and the most important characteristics of the
transmission channels.
2.4.3.1 The additive noise channel
Contaminating noise in signal transmission usually has an additive effect in the sense that
noise often adds to the information bearing signal at various points between the source and the
destination. In the mathematical model shown in Fig. 2.3, a random additive noise process n(k)
corrupts the transmitted signal s(k) and the received signal is x(k) = s(k) + n(k). The
additive noise is white when the random process has a power spectral density (PSD) which is
constant over all frequencies. When the noise also has a Gaussian distribution, the model
becomes the most often assumed one, additive white Gaussian noise (AWGN). AWGN has a uniform
continuous frequency spectrum over a particular frequency band, and this model is applied to
the majority of physical communication channels since it is mathematically tractable.
Figure 2.3 The additive Gaussian noise channel [39]. The noise n(k) is added to the channel
input s(k) to give the output x(k) = s(k) + n(k).
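The additive noise channel x(k) = s(k) + n(k) can be simulated in a few lines of Python. This
is a minimal sketch in which the function name, the noise standard deviation and the use of a
seeded `random.Random` generator are assumptions made for reproducibility.

```python
import random


def awgn_channel(s, noise_std, rng):
    """Return x(k) = s(k) + n(k), with n(k) drawn from a zero-mean Gaussian distribution."""
    return [sample + rng.gauss(0.0, noise_std) for sample in s]


# Assumed example: a short bipolar sequence passed through a channel with noise_std = 0.1.
rng = random.Random(0)
transmitted = [1.0, -1.0, 1.0, 1.0, -1.0]
received = awgn_channel(transmitted, 0.1, rng)
```

Successive noise samples are drawn independently, which corresponds to a flat power spectral
density over the simulated band.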
2.4.3.2 The linear filter channel
Filtering is an operation which involves extracting information about a quantity of interest
at time t by using noisy data measured up to and including t. A filter is considered linear
when the filtered, smoothed or predicted quantity at the filter output depends linearly on the
observations applied to the filter input [25].
Linear filter channels are those that enable the transmitted signals to remain in specified
bandwidth limitations without interfering with each other. The mathematical model including
the additive noise is illustrated in Fig. 2.4 in which s(k) is the channel input and the channel
output is represented as
$$ x(k) = s(k) * h(k) + n(k) = \int_{-\infty}^{\infty} h(\tau)\, s(k-\tau)\, d\tau + n(k) $$    (2.9)
in which h(τ) is the linear filter impulse response and * denotes convolution.
Figure 2.4 Linear filter channel with additive noise [39]. The input s(k) passes through the
linear filter h(k) and the noise n(k) is added to give the output x(k) = s(k) ∗ h(k) + n(k).
When attenuation is applied to the signal while being transmitted, the received signal becomes
x(k)=αs(k)+n(k) (2.10)
where α is the attenuation factor.
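A discrete-time version of the linear filter channel of Eq. 2.9 is sketched below, with the
integral replaced by a finite convolution sum. The function names and tap values are
hypothetical. A single-tap filter h = [α] reduces the model to the attenuation-only case of
Eq. 2.10.

```python
import random


def convolve(s, h):
    """Discrete convolution of the input sequence s with the channel taps h."""
    out = [0.0] * (len(s) + len(h) - 1)
    for i, s_value in enumerate(s):
        for j, h_value in enumerate(h):
            out[i + j] += s_value * h_value
    return out


def linear_filter_channel(s, h, noise_std=0.0, rng=None):
    """Return x(k) = (s * h)(k) + n(k) with zero-mean Gaussian noise n(k)."""
    rng = rng or random.Random()
    return [value + rng.gauss(0.0, noise_std) for value in convolve(s, h)]


# Assumed example: a two-tap channel smears each symbol into the next time slot.
noiseless = linear_filter_channel([1.0, 2.0, 3.0], [1.0, 0.5])
attenuated = linear_filter_channel([1.0, -1.0], [0.5])  # Eq. 2.10 with alpha = 0.5
```

The second tap of the channel is what introduces ISI: each output sample contains half of the
previous input sample in addition to the current one.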
2.4.3.3 The linear time-variant filter channel
Mobile systems such as a moving vehicle and wireless channels such as radio channels give rise
to multipath propagation, which results in time-varying fading signals because the frequency
response characteristics of these channels are time-variant. The time-varying mobile channel
characteristics necessitate a channel equalizer which continuously adapts to these
characteristics, effectively implementing a filter which is matched to them. Such time-variant
linear filters are characterized by a time-variant channel impulse response h(τ;k), which is
the response of the channel at time k due to an impulse applied at time k-τ, where τ stands for
the elapsed-time variable. For the linear time-variant filter channel with additive noise, the
channel output signal when s(k) is the input becomes
$$ x(k) = s(k) * h(\tau;k) + n(k) = \int_{-\infty}^{\infty} h(\tau;k)\, s(k-\tau)\, d\tau + n(k) $$    (2.11)
in which the time-variant impulse response has the following representation
$$ h(\tau;k) = \sum_{n=1}^{L} a_n(k)\, \delta(\tau - \tau_n) $$    (2.12)
where the {an(k)} denotes the possibly time-variant attenuation factors for the L multipath
propagation paths. Substituting Eq. 2.12 into Eq. 2.11 makes the received signal
$$ x(k) = \sum_{n=1}^{L} a_n(k)\, s(k - \tau_n) + n(k) $$    (2.13)
where each of the L multipath components is attenuated by {an(k)} and delayed by {τn}.
The three mathematical models defined above adequately represent a large majority of physical
channels, and communication systems are analyzed and designed on the basis of these three
channel models.
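The multipath model of Eq. 2.13 can be sketched in discrete time by restricting the path delays
to whole samples. The function name, the representation of each path as a (gain function,
delay) pair and the example values are assumptions for illustration.

```python
def multipath_channel(s, paths, noise=None):
    """Return x(k) = sum_n a_n(k) * s(k - tau_n) + n(k) for integer sample delays.

    s:     list of input samples s(k)
    paths: list of (gain_fn, delay) pairs, where gain_fn(k) gives a_n(k)
           and delay is tau_n expressed in samples
    noise: optional list of noise samples n(k), same length as s
    """
    x = []
    for k in range(len(s)):
        total = 0.0
        for gain_fn, delay in paths:
            if k - delay >= 0:  # the input is taken as zero before k = 0
                total += gain_fn(k) * s[k - delay]
        if noise is not None:
            total += noise[k]
        x.append(total)
    return x


# Assumed example: a constant direct path and an echo whose gain grows with time.
paths = [(lambda k: 1.0, 0), (lambda k: 0.1 * k, 1)]
output = multipath_channel([1.0, 1.0, 1.0], paths)
```

Because the echo gain depends on k, the same input sample produces a different output at
different times, which is why an equalizer for such a channel has to adapt continuously.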
2.5 Summary
This chapter outlines the structure of channel equalization system. The factors causing
distortions in the channel and their properties are explained and discussed in detail. The noise
types and interferences are described in detail in addition to their effects on the channel and
the ways of removing them from the channel.
The types of channels used within the data transmission system have been discussed.
Mathematical models representing various types of channels have been outlined and
described. Mathematical formulas representing the input, impulse response and the output of
the channel have been explained beside the channel characteristics of each type.
CHAPTER 3
MATHEMATICAL BACKGROUND OF A NEURO-FUZZY EQUALIZER
3.1 Overview
When the channel distortion in communications applications is too severe for linear equalizers
to handle, nonlinear equalizers are employed instead. A linear equalizer does not perform well
on channels whose amplitude characteristics contain deep spectral nulls or on channels with
nonlinear distortion. In an effort to compensate for the channel distortion, the linear
equalizer places a large gain in the vicinity of the spectral null and consequently enhances
the additive noise present in the received signal.
Neural networks can be considered mathematical models of brain and mind activities. The
main purpose of neural networks is to form the organization of numerous simple processing
elements into layers for achieving tasks with higher level sophistication. High computation
rates, high capability for nonlinear problems, massive parallelism and continuous adaptation
are among the properties of neural networks. Those features turn neural networks into desired
tools for different sorts of applications [28]. Neural networks have been put forward for
equalization problems because of these attractive properties and their nonlinear capability.
On the other hand, neural networks have some weaknesses related to their individual
models. Their computational power is low and their learning capability is limited. For this
reason, fuzzy systems have been considered to compensate for these weaknesses with their
capability of reaching logical conclusions on a more advanced (linguistic or semantic) level.
This chapter describes the synthesizing of fuzzy logic with neural networks, the operation and
structure algorithms of neuro-fuzzy system as the channel equalization basis of QAM signals.
3.2 Neuro-Fuzzy System
In contrast to classical control, which is rooted in the theory of linear differential
equations, intelligent control is largely rule based because the dependencies involved in its
deployment are much too complex to permit an analytical representation. In tackling such
dependencies, it is expedient to use the mathematics of fuzzy systems and neural networks. The
power of fuzzy systems lies in their ability to quantify linguistic inputs and to quickly
provide a working approximation of complex and frequently unknown input-output rules of a
system.
The power of neural networks is in their ability to learn from data. It’s possible to combine
neural networks and fuzzy logic in a number of ways and both have advantages that provide
flexibility and effectiveness when combined. Fuzzy adaptive filters are effective because of
their data approximation ability in nonlinear problems and therefore are widely used in signal
processing problems. Fuzzy logic equalizers usually require fewer training samples than
conventional equalizers, especially for linear channels. They are capable of yielding better
error performance and also perform better in the presence of channel nonlinearities [29].
Neural networks supply algorithms for numeric classification, optimization and associative
storage. When fuzzy logic and neural networks are integrated, the emerging neuro-fuzzy
system becomes capable of training the network in a shorter time as a result of decreased
number of nodes of the network. There is a natural synergy between neural networks and
fuzzy systems that makes their hybridization a powerful tool for intelligent control and other
applications.
3.3 Fuzzy Inference Systems
3.3.1 Architecture of fuzzy inference systems
Fuzzy Inference Systems (FIS) are one of the well known applications of fuzzy sets theory
and fuzzy logic. They are used in achieving classification tasks, process control, offline
process simulation and diagnosis and online decision support tools. The power of FIS depends
on the twofold identity of both being capable of managing linguistic concepts and being
universal approximators which are capable of performing nonlinear mappings between inputs
and outputs.
FIS are often utilized for process simulation or control. They can be designed from either
expert knowledge or data. For complex systems, FIS based solely on expert knowledge may suffer
from a loss of accuracy, which is the most important motivation for using fuzzy rules inferred
from data [30]. A fuzzy inference system comprises the functional blocks explained below
(Figure 3.1).
- Determining a set of fuzzy IF-THEN rules. Fuzzy rules are composed of linguistic
statements which describe the way the FIS makes a decision about the classification of
an input or the controlling of an output.
- Fuzzifying the inputs, which involves transforming the crisp inputs into degrees to
match with linguistic values, using the input membership functions defined by a data
base.
- Combining the fuzzified inputs in accordance with the fuzzy rules to set up a rule
strength (also called weight or fire strength).
- Determining the rule’s consequence by putting together the rule strength and the
membership function of the output.
- Combining the consequences so as to obtain an output distribution.
- Defuzzification of the output distribution, which involves transforming the fuzzy results
of the inference into a crisp output.
The operations upon fuzzy IF-THEN rules are explained in the steps below:
1. Mapping the inputs to membership values of each linguistic label, utilizing a set of
input membership functions on the premise part (fuzzification process).
2. Computation of the rule strength by combining the fuzzified inputs (combining the
membership values), by utilizing the process of the fuzzy combination. (The fuzzy
combinations are also referred to as T-norms. They are used in making a fuzzy rule
and involve the operators “and”, “or” and sometimes “not”.)
3. Generating the qualified fuzzy or crisp consequent of each rule according to the rule
strength.
4. Combining the entirety of the fuzzy rule outputs to attain one fuzzy output distribution
and then aggregating the qualified consequent to produce a single crisp output
(Defuzzification of output distribution).
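Figure 3.1 Structure of fuzzy inference system [39] (blocks: Input, Fuzzification Interface, Knowledge base, Decision-making, Defuzzification Interface, Output)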
3.3.2 Rule base and fuzzy IF-THEN rules
The fuzzy knowledge base that includes a set of fuzzy IF-THEN rules forms one of the basic
blocks of a fuzzy system. The following is the form of expression that represents fuzzy IF-
THEN rules or fuzzy conditional statements [37].
If u is A Then y is B (3.1)
where u and y represent the input and output linguistic variables, A and B represent the labels
of the fuzzy sets characterized by appropriate membership functions. A denotes the premise
and B denotes the consequent part of the rule.
IF-THEN rules can take many forms, of which the Single Input Single Output (SISO) form given by statement (3.1) is the simplest. Other forms are the Multi-Input Single Output (MISO) form of statement (3.2) and the multi-output form of statement (3.3).
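$$\text{If } u_1 \text{ is } A_1^{j} \text{ and } u_2 \text{ is } A_2^{k} \text{ and, } \ldots, \text{ and } u_n \text{ is } A_n^{l} \text{ Then } y_q \text{ is } B_q^{p} \tag{3.2}$$
$$\text{If } u_1 \text{ is } A_1^{j} \text{ and } u_2 \text{ is } A_2^{k} \text{ and, } \ldots, \text{ and } u_n \text{ is } A_n^{l} \text{ Then } y_1 \text{ is } B_1^{r} \text{ and } y_2 \text{ is } B_2^{s} \tag{3.3}$$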
The membership functions describe the fuzzy values A and B and Figure 3.2 demonstrates the
most widely used types of membership functions with their shapes.
Figure 3.2 Examples of membership functions (a) bell, (b) triangular, (c) trapezoidal
The following exponential function is one representation of a decision function that produces a bell curve.
$$\mu(x) = \exp\left(-\frac{(x - x_0)^2}{2\sigma^2}\right) \tag{3.4}$$
where $x$ is the independent variable on the universe, $x_0$ denotes the position of the peak relative to the universe and $\sigma$ denotes the standard deviation.
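The expressions (3.5) and (3.6) represent triangular and trapezoidal membership functions, respectively.
$$\mu(x) = \begin{cases} 1 - \dfrac{x_0 - x}{x_0 - x_l}, & x_l < x < x_0 \\ 1 - \dfrac{x - x_0}{x_r - x_0}, & x_0 < x < x_r \end{cases} \tag{3.5}$$
$$\mu(x) = \begin{cases} 1 - \dfrac{x_{l2} - x}{x_{l2} - x_{l1}}, & x_{l1} < x < x_{l2} \\ 1, & x_{l2} < x < x_{r1} \\ 1 - \dfrac{x - x_{r1}}{x_{r2} - x_{r1}}, & x_{r1} < x < x_{r2} \end{cases} \tag{3.6}$$
Here the subscripts $l$ and $r$ mark the break points to the left and right of the peak $x_0$ (triangle) or of the plateau (trapezoid); the membership is zero outside the stated intervals.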
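The three membership functions of Eqs. (3.4) to (3.6) can be sketched in a few lines of Python. This is only an illustrative sketch: the function and parameter names (`gaussian_mf`, `left_foot`, `right_top` and so on) are assumptions introduced here, and the end points of the intervals are assigned to the neighbouring branch so that the functions are defined everywhere.

```python
import math

def gaussian_mf(x, x0, sigma):
    """Bell-shaped membership of Eq. (3.4): peak at x0, spread sigma."""
    return math.exp(-((x - x0) ** 2) / (2.0 * sigma ** 2))

def triangular_mf(x, left, peak, right):
    """Triangular membership of Eq. (3.5); zero outside (left, right)."""
    if left < x <= peak:
        return 1.0 - (peak - x) / (peak - left)
    if peak < x < right:
        return 1.0 - (x - peak) / (right - peak)
    return 0.0

def trapezoidal_mf(x, left_foot, left_top, right_top, right_foot):
    """Trapezoidal membership of Eq. (3.6); plateau on [left_top, right_top]."""
    if left_foot < x < left_top:
        return 1.0 - (left_top - x) / (left_top - left_foot)
    if left_top <= x <= right_top:
        return 1.0
    if right_top < x < right_foot:
        return 1.0 - (x - right_top) / (right_foot - right_top)
    return 0.0
```

For instance, `triangular_mf(1.0, 0.0, 2.0, 4.0)` evaluates the rising edge halfway between the left foot and the peak.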
Rules of the following form are called Takagi and Sugeno fuzzy rules; their distinguishing feature is that the consequent part of the rule is a mathematical function of the input variables.
$$\text{If } A_1(x_1), A_2(x_2), \ldots, A_n(x_n) \text{ then } Y = f(x_1, x_2, \ldots, x_n) \tag{3.7}$$
where the premise part is fuzzy and the function $f$ in the consequent part is usually a linear or quadratic mathematical function. In the linear case,
$$f = a_0 + a_1 x_1 + a_2 x_2 + \ldots + a_n x_n \tag{3.8}$$
Fuzzy IF-THEN rules are widely used in modeling. They are considered the local description
of the system being designed and form the basics of the fuzzy inference system (FIS).
Fuzzification: The aim of fuzzification is to map a crisp input into a fuzzy set. The input can come from a set of sensors, or be a feature of the sensor signals such as amplitude or frequency, and is mapped to membership values between 0 and 1 using a set of input membership functions. The numeric inputs $u_i \in U_i$ are converted into fuzzy sets by the fuzzification process for the fuzzy system to use.
Let $U_i^*$ represent the set of all possible fuzzy sets which can be defined on $U_i$. Given $u_i \in U_i$, $u_i$ is transformed to a fuzzy set denoted by $\hat{A}_i^{fuz}$ that is defined on the universe of discourse $U_i$. The fuzzification operator $F$ that produces this transformation is defined by
$$F: U_i \rightarrow U_i^*$$
where
$$F(u_i) = \hat{A}_i^{fuz}$$
Frequently, “singleton fuzzification” is used. It produces a fuzzy set $\hat{A}_i^{fuz} \in U_i^*$ with a membership function given by
$$\mu_{\hat{A}_i^{fuz}}(x) = \begin{cases} 1, & x = u_i \\ 0, & \text{otherwise} \end{cases}$$
Any fuzzy set with this form of membership function is termed a “singleton”. With singleton fuzzification the input fuzzy set has only a single point of nonzero membership, and the number $u_i$ is represented by the singleton fuzzy set. Implementations that use singleton fuzzification therefore take $u_i$ at its measured value and assume that no noise is involved. Other common examples are “Gaussian fuzzification”, which places bell-type membership functions about the input points, and triangular fuzzification, which uses triangular shapes [38].
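As a rough illustration of the fuzzification operator, the sketch below returns the membership function of the fuzzy set produced for a measured input. The helper names `singleton_fuzzify` and `gaussian_fuzzify`, and the `sigma` parameter that models measurement spread, are assumptions made for this example.

```python
import math

def singleton_fuzzify(u):
    """Return the membership function of the singleton fuzzy set at u."""
    def membership(x):
        return 1.0 if x == u else 0.0
    return membership

def gaussian_fuzzify(u, sigma):
    """Return a bell-type membership function centred on the measured value u."""
    def membership(x):
        return math.exp(-((x - u) ** 2) / (2.0 * sigma ** 2))
    return membership

# Hypothetical measured input
mu_singleton = singleton_fuzzify(0.7)
mu_gaussian = gaussian_fuzzify(0.7, 0.1)
```

The singleton set is nonzero only at the measured value, whereas the Gaussian set also assigns partial membership to nearby points.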
3.3.3 Fuzzy inference mechanism
Designing a fuzzy inference system (FIS) from data can be separated into two principal stages: (1) automatic rule generation and (2) system optimization [30]. Rule generation leads to a basic system with a given space partitioning and the corresponding set of rules. System optimization can be carried out at several levels. Variable selection can be global or handled rule by rule. The goal of rule base optimization is to select the most useful rules and to optimize the rule conclusions. Space partitioning can be improved by adding or removing fuzzy sets and by tuning the membership function parameters. Structure optimization is of great importance: selecting variables, reducing the rule base and optimizing the number of fuzzy sets.
There are two main tasks associated with the fuzzy inference mechanism:
1. The matching task, which involves determining the degree to which each rule is relevant to the current situation as characterized by the inputs $u_i$, $i = 1, 2, \ldots, n$.
2. The inference step, which involves drawing conclusions from the current inputs $u_i$ and the information in the rule base.
When the fuzzy set representing the premise of the $i$th rule is denoted by $A_1^j \times A_2^k \times \ldots \times A_n^l$, matching takes two steps:
Step 1: Combining inputs with rule premises: This step finds the fuzzy sets $\hat{A}_1^j, \hat{A}_2^k, \ldots, \hat{A}_n^l$ with membership functions
$$\mu_{\hat{A}_1^j}(u_1) = \mu_{A_1^j}(u_1) * \mu_{\hat{A}_1^{fuz}}(u_1)$$
$$\mu_{\hat{A}_2^k}(u_2) = \mu_{A_2^k}(u_2) * \mu_{\hat{A}_2^{fuz}}(u_2)$$
$$\vdots$$
$$\mu_{\hat{A}_n^l}(u_n) = \mu_{A_n^l}(u_n) * \mu_{\hat{A}_n^{fuz}}(u_n)$$
(for all j, k, …, l), which combine the fuzzy sets from fuzzification with the fuzzy sets used in each of the terms in the rules’ premises. When singleton fuzzification is used, each of the input fuzzy sets has only a single point of nonzero membership (e.g. $\mu_{\hat{A}_1^j}(\bar{u}_1) = \mu_{A_1^j}(\bar{u}_1)$ for $\bar{u}_1 = u_1$ and $\mu_{\hat{A}_1^j}(\bar{u}_1) = 0$ for $\bar{u}_1 \neq u_1$).
In other words, with singleton fuzzification $\mu_{\hat{A}_i^{fuz}}(u_i) = 1$ for all $i = 1, 2, \ldots, n$ for the given inputs $u_i$, resulting in
$$\mu_{\hat{A}_1^j}(u_1) = \mu_{A_1^j}(u_1)$$
$$\mu_{\hat{A}_2^k}(u_2) = \mu_{A_2^k}(u_2)$$
$$\vdots$$
$$\mu_{\hat{A}_n^l}(u_n) = \mu_{A_n^l}(u_n)$$
Step 2: Determining which rules are on: In this step, membership values $\mu_i(u_1, u_2, \ldots, u_n)$ are determined for the premise of the $i$th rule; they represent the certainty that each rule premise is consistent with the given inputs. Defining
$$\mu_i(u_1, u_2, \ldots, u_n) = \mu_{\hat{A}_1^j}(u_1) * \mu_{\hat{A}_2^k}(u_2) * \ldots * \mu_{\hat{A}_n^l}(u_n)$$
which is a function of the inputs $u_i$, $\mu_i(u_1, u_2, \ldots, u_n)$ represents the certainty that the antecedent of rule $i$ matches the input information when singleton fuzzification is used. The function $\mu_i(u_1, u_2, \ldots, u_n)$ is a multidimensional certainty surface. It stands for the certainty of a rule’s premise and hence for the degree to which a particular rule applies for a given set of inputs. The inference step is then taken by calculating, for the $i$th rule, the “implied fuzzy set” $\hat{B}_q^i$ with the membership function
$$\mu_{\hat{B}_q^i}(y_q) = \mu_i(u_1, u_2, \ldots, u_n) * \mu_{B_q^p}(y_q) \tag{3.9}$$
The implied fuzzy set $\hat{B}_q^i$ specifies the certainty level that the output should be a specific crisp output $y_q$ within the universe of discourse $Y_q$, considering rule $i$ alone. The defuzzification that follows the inference step aggregates the conclusions of all the rules, which the implied fuzzy sets represent.
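The matching and inference steps can be illustrated with a short sketch for a single rule with two inputs. It assumes singleton fuzzification and takes the product as the "*" operation (minimum would be another common choice); the Gaussian premise and consequent sets and all numeric values are hypothetical.

```python
import math

def gaussian_mf(x, center, sigma):
    """Bell-shaped membership function used for the hypothetical fuzzy sets."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def premise_certainty(memberships):
    """mu_i(u_1, ..., u_n): product of the premise membership values."""
    certainty = 1.0
    for m in memberships:
        certainty *= m
    return certainty

def implied_membership(mu_i, consequent_mf, y):
    """Eq. (3.9): membership of y in the implied fuzzy set of rule i."""
    return mu_i * consequent_mf(y)

def consequent(y):
    """Hypothetical output set B of the rule, centred on y = 2."""
    return gaussian_mf(y, 2.0, 0.5)

# Rule i: If u1 is A1 and u2 is A2 Then y is B
u1, u2 = 0.0, 1.0
mu_A1 = gaussian_mf(u1, 0.0, 1.0)  # input sits on the centre of A1
mu_A2 = gaussian_mf(u2, 0.0, 1.0)  # input is one sigma away from the centre of A2
mu_i = premise_certainty([mu_A1, mu_A2])
peak = implied_membership(mu_i, consequent, 2.0)
```

With product inference the implied fuzzy set is the consequent set scaled by the premise certainty, so its peak equals `mu_i`.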
Defuzzification Methods: It is frequently necessary to obtain a single crisp output from a FIS. For instance, when classifying a letter drawn by hand on a drawing tablet, the FIS has to produce a crisp number that identifies the letter that was drawn. This crisp number is obtained by a process called defuzzification. In other words, defuzzification is the way of extracting a crisp value from a fuzzy set as a representative value.
Two well-known defuzzification methods are described below:
Center of Gravity (COG): The method takes the output distribution and finds its center of mass to produce one crisp number.
$$u = \frac{\sum_i \mu(x_i)\, x_i}{\sum_i \mu(x_i)} \tag{3.10}$$
where the crisp output value $u$ is the abscissa of the center of gravity of the fuzzy set, $\mu(x_i)$ is the membership value in the membership function and $x_i$ is a running point in a discrete universe. This expression can also be viewed as the weighted average of the elements in the support set.
For singletons, the COG method takes the following form
$$u = \frac{\sum_i \mu(s_i)\, s_i}{\sum_i \mu(s_i)} \tag{3.11}$$
where $s_i$ is the position of singleton $i$ in the universe and $\mu(s_i)$ represents the rule strength $\alpha_i$ of rule $i$. This technique is computationally inexpensive, and $u$ is differentiable with respect to the singletons $s_i$, which is useful in neuro-fuzzy systems.
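Equations (3.10) and (3.11) share the same weighted-average structure, so one small function covers both. The sketch below is illustrative only; the function name and the singleton positions and rule strengths in the example are assumptions.

```python
def center_of_gravity(points, memberships):
    """Weighted average of Eq. (3.10)/(3.11): sum(mu * x) / sum(mu)."""
    total = sum(memberships)
    if total == 0:
        raise ValueError("no rule fired: all membership values are zero")
    return sum(m * x for x, m in zip(points, memberships)) / total

# Hypothetical singleton positions s_i and rule strengths alpha_i
singletons = [-1.0, 0.0, 2.0]
strengths = [0.2, 0.0, 0.6]
u_crisp = center_of_gravity(singletons, strengths)
```

The guard against a zero denominator is a design choice for the sketch: when no rule fires, the center of gravity is undefined.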
Center of Average (COA): In this widely used method, a crisp output $y_q^{crisp}$ is computed from the centers of the output membership functions and the maximum certainty of each of the conclusions represented by the implied fuzzy sets. It is described as
$$y_q^{crisp} = \frac{\sum_{i=1}^{R} b_i^q \, \sup_{y_q}\{\mu_{\hat{B}_q^i}(y_q)\}}{\sum_{i=1}^{R} \sup_{y_q}\{\mu_{\hat{B}_q^i}(y_q)\}} \tag{3.12}$$
where $R$ is the number of rules, $b_i^q$ is the center of the output membership function associated with the implied fuzzy set $\hat{B}_q^i$ of rule $i$, and “sup” is the supremum (i.e., the least upper bound, which can frequently be regarded as the maximum value). Therefore, $\sup_x\{\mu(x)\}$ can simply be considered the highest value of $\mu(x)$.
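A minimal sketch of Eq. (3.12) is given below. It approximates the supremum by the maximum over a finite grid of output values, which is an assumption of this example, as are the triangular output sets, their centers and the rule strengths.

```python
def tri(y, center, half_width):
    """Symmetric triangular output set with peak value 1 at its center."""
    return max(0.0, 1.0 - abs(y - center) / half_width)

def scaled(strength, center, half_width):
    """Implied set under product inference: consequent scaled by rule strength."""
    return lambda y: strength * tri(y, center, half_width)

def center_average(centers, implied_sets, grid):
    """Eq. (3.12), with sup approximated by the maximum over the grid."""
    heights = [max(mu(y) for y in grid) for mu in implied_sets]
    total = sum(heights)
    if total == 0:
        raise ValueError("no rule fired: all implied sets are empty")
    return sum(b * h for b, h in zip(centers, heights)) / total

# Two hypothetical rules with output centers 0 and 10
grid = [i * 0.5 for i in range(-10, 31)]
centers = [0.0, 10.0]
implied = [scaled(0.25, 0.0, 5.0), scaled(0.75, 10.0, 5.0)]
y_crisp = center_average(centers, implied, grid)
```

Because only the height of each implied set enters the formula, the shape of the output membership functions does not affect the result as long as their centers stay fixed.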
Fig. 3.3 outlines the inference mechanisms of different types of fuzzy systems graphically. Most fuzzy inference systems can be categorized into three types depending on the type of fuzzy reasoning they employ.
In Type 1 fuzzy systems, the defuzzifier combines the output sets that correspond to all the fired rules into a single output set and then produces a crisp number that represents this combined set; e.g., the centroid defuzzifier forms the union of all the output sets and uses the centroid of the union as the crisp output [31]. The overall output is the weighted average of each rule’s crisp output, induced by the rule’s weight and the output membership functions.
Figure 3.3 Types of fuzzy reasoning mechanisms [11] (two rules with premise sets A1, B1 and A2, B2 give the weights w1 and w2 by multiplication or min; Type 1 and Type 3 compute z = (w1 z1 + w2 z2)/(w1 + w2), with z1 = ax + by + c and z2 = px + qy + r in Type 3; Type 2 combines the consequents C1 and C2 with max)
In Type 2 fuzzy systems, fuzzy sets are useful when it is hard to determine an exact membership function for a fuzzy set, and they are therefore helpful for incorporating uncertainties. These uncertainties stem from the knowledge used to construct the rules of a fuzzy logic system and lead to rules with uncertain antecedents and/or consequents, which in turn translate into uncertain antecedent and/or consequent membership functions [31]. The overall fuzzy output is obtained by applying the 'max' operation to the qualified fuzzy outputs, each of which equals the minimum of the rule strength and the rule’s output membership function.
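The max-of-min aggregation described in the last two sentences can be sketched as follows. The example clips each consequent set at its rule strength (min), merges the clipped sets pointwise (max) and, as one possible final step, takes the discrete center of gravity of the result. The triangular consequents, the grid and the rule strengths are assumptions for illustration.

```python
def tri(x, left, peak, right):
    """Triangular membership function, zero outside (left, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

def max_min_output(strengths, consequents, grid):
    """Clip each consequent at its rule strength, aggregate with max, return COG."""
    aggregated = [
        max(min(w, mf(y)) for w, mf in zip(strengths, consequents))
        for y in grid
    ]
    total = sum(aggregated)
    if total == 0:
        raise ValueError("no rule fired: aggregated output set is empty")
    return sum(y * m for y, m in zip(grid, aggregated)) / total

# Two hypothetical consequent sets placed symmetrically about y = 5
consequents = [lambda y: tri(y, 0.0, 2.5, 5.0), lambda y: tri(y, 5.0, 7.5, 10.0)]
grid = [i / 10 for i in range(101)]
balanced = max_min_output([0.5, 0.5], consequents, grid)
skewed = max_min_output([0.2, 0.8], consequents, grid)
```

With equal rule strengths the aggregated set is symmetric and the crisp output sits midway between the two consequents; raising the second rule strength pulls the output toward the second consequent.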
Type 3 uses Takagi and Sugeno’s fuzzy IF-THEN rules. The output of each rule is a crisp number computed as a linear combination of the inputs plus a constant term (e.g. z1 = ax + by + c in Fig. 3.3). The overall output is the weighted average of each rule’s output.
In Fig. 3.3, a fuzzy inference system with two rules and two inputs is used to demonstrate the
different types of fuzzy rules and fuzzy reasoning described above.
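A two-rule, two-input system of the Type 3 (Takagi and Sugeno) kind can be sketched as below. The rule weights are obtained by multiplication, and the crisp output is the weighted average of the two linear consequents. The Gaussian premise sets and the coefficient values are hypothetical and chosen only to make the example easy to check by hand.

```python
import math

def gaussian_mf(x, center, sigma):
    """Bell-shaped premise membership function."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def sugeno_two_rules(x, y):
    """Weighted average z = (w1*z1 + w2*z2) / (w1 + w2) of two linear rules."""
    # Rule strengths by multiplication of the premise memberships
    w1 = gaussian_mf(x, 0.0, 1.0) * gaussian_mf(y, 0.0, 1.0)  # sets A1, B1
    w2 = gaussian_mf(x, 2.0, 1.0) * gaussian_mf(y, 2.0, 1.0)  # sets A2, B2
    # Linear consequents with hypothetical coefficients
    z1 = 1.0 * x + 1.0 * y + 0.0   # z1 = a*x + b*y + c
    z2 = 2.0 * x - 1.0 * y + 3.0   # z2 = p*x + q*y + r
    return (w1 * z1 + w2 * z2) / (w1 + w2)

z = sugeno_two_rules(1.0, 1.0)
```

At the point (1, 1) both rules fire equally, so the output is the plain average of z1 = 2 and z2 = 4.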
3.4 Artificial Neural Networks
Research into artificial neural networks, also known simply as “neural networks”, has been motivated by the recognition that the human brain computes in an entirely different way from the conventional digital computer. The brain is a highly complex, nonlinear and parallel computer (information-processing system). It is capable of organizing its structural constituents, called neurons, so as to carry out certain computations (e.g. pattern recognition, perception and motor control) many times faster than the fastest digital computer of the present time [29].
Figure 3.4 Artificial neuron (the inputs xi are multiplied by the synaptic weights wi; the processing element forms the sum I = Σ wi xi and applies the transfer function Y = f(I) on the output path)
A neuron is an information-processing unit that is fundamental to the operation of a neural network. The block diagram of Fig. 3.4 shows the model of an artificial neuron, which forms the basis for designing artificial neural networks.
A set of synapses, also called connecting links, are the basic elements of the neuronal model. Each synapse is characterized by a weight or strength of its own. Specifically, a signal $x_i$, $i = 1, 2, \ldots, n$, at the input of a synapse connected to neuron $k$ is multiplied by the synaptic connection weight $w_i$. The neuron’s result is generated by summing these products and passing the sum through a transfer function to the output.
The output of the artificial neuron displayed in Fig. 3.4 is calculated from
$$y_i = f\left(\sum_{j=1}^{n} w_{ij} x_j + \theta_i\right) \tag{3.13}$$
where $x_j$ is the input, $y_i$ is the output signal of the neuron, $w_{ij}$ are the synaptic weight coefficients, $\theta_i$ denotes the bias and $f$ is the activation function.
The activation function can be linear or nonlinear but a nonlinear sigmoid function is frequently utilized as the activation function (eq. 3.14).
$$y_i = \frac{1}{1 + \exp\left[-\left(\sum_{j=1}^{n} w_{ij} x_j + \theta_i\right)\right]} \tag{3.14}$$
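The neuron computation of Eqs. (3.13) and (3.14) amounts to a weighted sum followed by the sigmoid. The sketch below assumes that the bias is added to the weighted sum; the function name and the numeric values are illustrative only.

```python
import math

def neuron_output(weights, inputs, bias):
    """Sigmoid neuron: weighted sum of the inputs plus bias, passed through the logistic function."""
    v = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-v))

# Hypothetical two-input neuron whose weighted inputs cancel out
y_balanced = neuron_output([0.5, -0.5], [1.0, 1.0], 0.0)
# A positive net input pushes the output above one half
y_active = neuron_output([1.0, 1.0], [1.0, 1.0], 0.0)
```

A net input of zero gives an output of exactly one half, the midpoint of the sigmoid.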
Neural networks are formed by sets of neurons arranged in one or more layers. The neurons are interconnected by weighted connections that meet at connection points called nodes. In a layered neural network the neurons are organized in the form of layers. The simplest layered network has an input layer of source nodes that projects onto an output layer of neurons (computation nodes), but not the other way round. Such a network is of the feedforward or acyclic type. The neurons in the network act as processing elements which multiply the inputs by a set of weights and nonlinearly transform the result into an output value.
In general, three fundamentally different classes of network architecture can be identified: single-layer feedforward (non-recurrent), multilayer feedforward and recurrent networks. The feedforward neural network structures are shown in Fig. 3.5.
Figure 3.5 (a) A single layer network, (b) A simple multilayer network [11]
3.4.1 Neural network’s learning
The most important property of a neural network is its ability to learn from its environment and to improve its performance through learning. A neural network learns about its environment through an interactive process of adjustments applied to its synaptic weights and bias levels. The network becomes more knowledgeable about its environment after each iteration of the learning process. In the context of neural networks, learning can be defined as a process by which the free parameters of the network are adapted through stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place [29].
A well-defined set of rules for solving a learning problem is referred to as a learning algorithm. As one would expect, there is no unique learning algorithm for the design of neural networks. Learning algorithms differ from each other fundamentally in the way the adjustment to a neuron’s synaptic weight is formulated. Another factor to consider is the way in which a neural network (learning machine), made up of a set of interconnected neurons, relates to its environment. In this latter context we speak of a learning paradigm, which refers to a model of the environment in which the neural network operates [29].
There are two fundamental learning paradigms associated with neural networks: (1) Learning
with a teacher (known as supervised learning) and (2) Learning without a teacher which is
divided into two subdivisions that are unsupervised learning and reinforcement learning.
Supervised learning involves training with a teacher. The teacher can be thought of as having knowledge of the environment, represented by a set of input-output examples. The neural network, on the other hand, does not know the environment. When a training vector drawn from the environment is applied to both the teacher and the neural network, the teacher can supply the network with a desired response for that training vector. The network parameters, i.e. the connection weights, are adjusted under the combined influence of the training vector and the error signal. The error signal is the difference between the desired response and the actual response of the network. The adjustment is carried out iteratively, step by step, with the aim of eventually making the neural network emulate the teacher; this emulation is presumed to be optimum in some statistical sense. In this way, the knowledge of the environment available to the teacher is transferred to the neural network through training as fully as possible. Once this condition is reached, the teacher may be removed and the neural network left to deal with the environment entirely on its own.
The form of supervised learning just described is error-correction learning, which involves a closed-loop feedback system, although the loop does not contain the unknown environment. The performance criterion for the system is the mean-square error, or the sum of squared errors over the training samples, defined as a function of the free parameters of the system. This criterion may be visualized as a multidimensional error-performance surface, or simply error surface, with the free parameters as coordinates. The true error surface is averaged over all possible input-output examples. Any given operation of the system under the teacher’s supervision is represented as a point on the error surface. For the system to improve its performance over time, and thus learn from the teacher, the operating point has to move down successively toward a minimum point of the error surface; this minimum may be a local minimum or a global minimum. A supervised learning system can do this using the information it has about the gradient of the error surface corresponding to the current behavior of the system. The gradient of the error surface at any point is a vector that points in the direction of steepest ascent, so the operating point is moved along the negative gradient, which is the direction of steepest descent. Given an algorithm designed to minimize the cost function, an adequate set of input-output examples and enough time to carry out the training, a supervised learning system is generally capable of performing tasks such as pattern classification and function approximation [29].
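Error-correction learning on an error surface can be illustrated with the simplest possible case, a single linear neuron trained by batch gradient descent on the sum of squared errors. This is a sketch under stated assumptions rather than the procedure used in this work: the "teacher" is a hypothetical target line d = 2x + 1, and the learning rate and number of epochs are chosen only so that the example converges.

```python
def train_linear_neuron(samples, learning_rate=0.05, epochs=500):
    """Gradient descent on E = 0.5 * sum(e^2) for the model y = w*x + b."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, desired in samples:
            error = desired - (w * x + b)  # error signal: desired minus actual
            grad_w -= error * x            # dE/dw accumulated over the samples
            grad_b -= error                # dE/db accumulated over the samples
        # Move the operating point against the gradient (steepest descent)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Input-output examples supplied by a hypothetical teacher: d = 2x + 1
samples = [(x, 2.0 * x + 1.0) for x in (0.0, 1.0, 2.0, 3.0)]
w, b = train_linear_neuron(samples)
```

For this quadratic error surface there is a single global minimum, so the operating point (w, b) approaches the teacher's parameters; with nonlinear networks the same procedure may instead settle in a local minimum.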
3.4.2 Multilayer perceptrons & backpropagation algorithm
Multilayer feedforward networks form an important class of neural networks. Such a network typically consists of a set of sensory units (source nodes) that constitute the input layer, one or more hidden layers of computation nodes, and an output layer of computation nodes. The input signal propagates through the network in a forward direction, layer by layer. These neural networks are called multilayer perceptrons (MLP) and represent a generalization of the single-layer perceptron.
Multilayer perceptrons have been applied successfully to a range of difficult and diverse problems by training them with a widely used algorithm known as the error back-propagation algorithm. This algorithm is based on the error-correction learning rule. It may be considered a generalization of an equally popular adaptive filtering algorithm, the least mean square (LMS) algorithm, which covers the special case of a single linear neuron [29].
Error back-propagation learning consists of two passes through the different layers of the network: a forward pass and a backward pass. In the forward pass, an activity pattern (input vector) is applied to the sensory nodes of the network and its effect propagates through the network layer by layer. Finally, a set of outputs is produced as the actual response of the network. During the forward pass, all of the synaptic weights of the network are fixed. During the backward pass, all the synaptic weights are adjusted according to an error-correction rule. Specifically, the actual response of the network is subtracted from a desired (target) response to produce an error signal. The error signal is propagated backward through the network, against the direction of the synaptic connections, hence the name “error back-propagation”. The synaptic weights are adjusted so that the actual response of the network moves closer to the desired response in a statistical sense. In the literature the error back-propagation algorithm is also referred to as the back-propagation algorithm, or simply back-prop, and the learning process carried out with it is referred to as back-propagation learning.
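The two passes can be sketched for a small network with two inputs, two hidden sigmoid neurons and one sigmoid output neuron. This is a minimal illustration, not the training code of this work: the network size, the initial weights, the learning rate and the squared-error cost E = 0.5 (d - y)^2 for a single training pair are all assumptions.

```python
import math

def sigmoid(v):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, W1, b1, W2, b2):
    """Forward pass: weights stay fixed, signals move input -> hidden -> output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)
    return hidden, output

def backward(x, target, W1, b1, W2, b2):
    """Backward pass: gradients of E = 0.5 * (target - output)**2."""
    hidden, output = forward(x, W1, b1, W2, b2)
    error = target - output                       # error signal
    delta_out = -error * output * (1.0 - output)  # dE/dv at the output neuron
    grad_W2 = [delta_out * h for h in hidden]
    # The output delta travels back through W2 to each hidden neuron.
    delta_hid = [delta_out * w * h * (1.0 - h) for w, h in zip(W2, hidden)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hid]
    return grad_W1, delta_hid, grad_W2, delta_out

def train_step(x, target, W1, b1, W2, b2, rate=0.5):
    """One error-correction update: move every weight against its gradient."""
    gW1, gb1, gW2, gb2 = backward(x, target, W1, b1, W2, b2)
    W1 = [[w - rate * g for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)]
    b1 = [b - rate * g for b, g in zip(b1, gb1)]
    W2 = [w - rate * g for w, g in zip(W2, gW2)]
    return W1, b1, W2, b2 - rate * gb2

# Hypothetical 2-2-1 network and a single training pair
x, target = [1.0, 0.0], 1.0
W1, b1 = [[0.1, -0.2], [0.4, 0.3]], [0.0, 0.0]
W2, b2 = [0.5, -0.5], 0.0
grad_W1, grad_b1, grad_W2, grad_b2 = backward(x, target, W1, b1, W2, b2)
new_W1, new_b1, new_W2, new_b2 = train_step(x, target, W1, b1, W2, b2)
```

One way to gain confidence in such an implementation is to compare the gradients from `backward` with finite-difference estimates of the cost, since the two should agree closely.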
There are three distinguishing characteristics of a multilayer perceptron:
1. The model of each neuron includes a nonlinear activation function. The nonlinearity is smooth, in other words differentiable everywhere. A commonly used form that satisfies this requirement is the sigmoidal nonlinearity defined by the logistic function:
$$y_j = \frac{1}{1 + \exp(-v_j)} \tag{3.15}$$
where $v_j$ is the induced local field (i.e. the weighted sum of all synaptic inputs plus the bias) of neuron $j$, and $y_j$ is the output of the neuron.
2. The network contains one or more layers of hidden neurons that are not part of the input or output of the network. These hidden neurons enable the network to learn complex tasks by extracting progressively more meaningful features from the input patterns (vectors).
3. The network exhibits a high degree of connectivity, determined by the synapses of the network. A change in the connectivity of the network requires a change in the population of synaptic connections or their weights.
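The differentiability required in the first characteristic matters because back-propagation needs the derivative of each neuron's activation. For the logistic function of Eq. (3.15) this derivative follows directly from the quotient rule and can be written in terms of the neuron output itself:

```latex
\frac{dy_j}{dv_j}
  = \frac{\exp(-v_j)}{\bigl(1 + \exp(-v_j)\bigr)^{2}}
  = \frac{1}{1 + \exp(-v_j)} \cdot \frac{\exp(-v_j)}{1 + \exp(-v_j)}
  = y_j \,(1 - y_j)
```

The derivative is largest at $v_j = 0$, where $y_j = 0.5$ and $dy_j/dv_j = 0.25$, and it tends to zero as the neuron saturates toward 0 or 1.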
The multilayer perceptron derives its computing power from the combination of these characteristics with the ability to learn from experience through training. The back-propagation algorithm is of great significance in neural networks because it provides a computationally efficient method for training multilayer perceptrons.
Fig. 3.6 shows the architectural graph of a multilayer perceptron with one hidden layer and an output layer. The network shown is fully connected, meaning that a neuron in any layer of the network is connected to all the nodes/neurons in the previous layer. Signal flow through the network progresses in a forward direction, from left to right and layer by layer. The value of each neuron is computed by first summing its weighted inputs and the bias and then applying $f$ (the sigmoid function) to the sum to obtain the neuron’s activation.
Figure 3.6 Multilayer feedforward network [11]
Next, the training process of the three-layer feedforward network is analyzed. The feedforward phase of the network is described in three stages, one for each of the input (I), hidden (H) and output (O) layers.
Input Layer (I): The input of the hidden layer is equal to the output of the input layer.
$$\text{Input}^{H} = \text{Output}^{I}$$