ISBN-13: 9783659914218 / English / Paperback / 2017 / 76 pp.
An enormous amount of data is being collected and stored in databases across the world, and the volume keeps increasing every year. Extracting the information hidden in such databases and classifying it are among the most important tasks in data mining. When the datasets are imbalanced, they become hard to handle. Since prediction is one of the fundamental tasks in data mining, working with imbalanced datasets to predict possible outcomes is a difficult problem. A dataset is imbalanced when one class holds many more instances than another. The class with fewer samples is called the minority class and the class with more samples is called the majority class; they are often labeled the positive (minority) class and the negative (majority) class. Imbalanced datasets cause serious issues in data mining, mainly because standard classification algorithms assume the dataset is balanced and as a result are biased towards the majority class. For applications such as medical diagnosis, this bias can have very serious consequences. Hence, balancing the dataset is critical for many real-world applications.