The importance of having efficient and effective methods for data mining and knowledge discovery (DM&KD), to which the present book is devoted, grows every day, and numerous such methods have been developed in recent decades. There exists a great variety of different settings for the main problem studied by data mining and knowledge discovery, and it seems that a very popular one is formulated in terms of binary attributes. In this setting, states of nature of the application area under consideration are described by Boolean vectors defined on some attributes, that is, by data points defined in the Boolean space of the attributes. It is postulated that there exists a partition of this space into two classes, which should be inferred as patterns on the attributes when only several data points are known, the so-called positive and negative training examples. The main problem in DM&KD is defined as finding rules for recognizing (classifying) new data points of unknown class, i.e., deciding which of them are positive and which are negative. In other words, the task is to infer the binary value of one more attribute, called the goal or class attribute. To solve this problem, some methods have been suggested which construct a Boolean function separating the two given sets of positive and negative training data points.
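To make the setting concrete, the following is a minimal sketch of one way to construct a separating Boolean function in disjunctive normal form from positive and negative training examples. It is an illustrative greedy procedure chosen for brevity, not the method developed in the book; the names `covers`, `learn_dnf`, and `classify`, as well as the sample data, are hypothetical.

```python
from typing import List, Tuple

Vector = Tuple[int, ...]
# A term is a conjunction of literals, each given as (attribute index, required value).
Term = Tuple[Tuple[int, int], ...]


def covers(term: Term, x: Vector) -> bool:
    """Return True if x satisfies every literal of the conjunction."""
    return all(x[i] == v for i, v in term)


def learn_dnf(positives: List[Vector], negatives: List[Vector]) -> List[Term]:
    """Greedily build a DNF that accepts all positives and rejects all negatives.

    Assumption: this is a simple illustrative heuristic, not the book's algorithm.
    """
    if set(positives) & set(negatives):
        raise ValueError("a data point cannot be both positive and negative")
    dnf: List[Term] = []
    for p in positives:
        if any(covers(t, p) for t in dnf):
            continue  # already accepted by an earlier term
        # Start from the full conjunction that describes p exactly.
        term = [(i, v) for i, v in enumerate(p)]
        # Drop literals one at a time while no negative example becomes covered.
        for lit in list(term):
            trial = [l for l in term if l != lit]
            if not any(covers(tuple(trial), n) for n in negatives):
                term = trial
        dnf.append(tuple(term))
    return dnf


def classify(dnf: List[Term], x: Vector) -> bool:
    """A new point is classified positive if any term of the DNF covers it."""
    return any(covers(t, x) for t in dnf)


# Hypothetical training data on three binary attributes.
positives = [(1, 0, 1), (1, 1, 1)]
negatives = [(0, 0, 1), (1, 0, 0)]
rules = learn_dnf(positives, negatives)
```

On this sample data the learned function consists of a single term requiring the first and third attributes to equal 1, which reads directly as an "if ... then positive" rule. Different literal-dropping orders can yield different, equally consistent functions.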
The importance of having efficient and effective methods for data mining and knowledge discovery (DM&KD) is rapidly growing. This is due to the wide use of fast and affordable computing power and data storage media, and also to the gathering of huge amounts of data in almost all aspects of human activity and interest. While numerous methods have been developed, this book focuses on algorithms and applications of one popular approach that is formulated in terms of binary attributes, i.e., in terms of Boolean functions defined on several attributes that are easily transformed into rules that can express new knowledge.
This book presents methods that deal with key data mining and knowledge discovery issues in an intuitive manner, in a natural sequence, and in a way that can be easily understood and interpreted by a wide array of experts and end users. The presentation provides a unique perspective into the essence of some fundamental DM&KD issues, many of which come from important real-life applications such as breast cancer diagnosis.
Applications and algorithms are accompanied by extensive experimental results and are presented in such a way that anyone with a minimum background in mathematics and computer science can benefit from the exposition. Rigor in mathematics and algorithmic development is not compromised, and each chapter systematically offers some possible extensions for future research.