Top Two Data Mining Challenges

1. Development of parallel or high-performance algorithms, theoretical models and data mining techniques.
Distributed data mining algorithms must support the complete data mining process (pre-processing, data mining and post-processing) just as their centralized versions do. This means that all data mining tasks, including data cleaning, attribute discretization, concept generalization and so on, should be performed in parallel.

Several distributed algorithms have been derived from their centralized versions. For instance, parallel algorithms have been developed for association rules (Agrawal and Shafer, 1996; Ashrafi, Taniar and Smith, 2004), classification rules (Zaki, Ho and Agrawal, 1999; Cho and Wüthrich, 2002) and clustering (Kargupta et al., 2001; Rajasekaran, 2005).
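
To make the idea of deriving a distributed algorithm from a centralized one concrete, here is a minimal sketch of support counting for association rules over partitioned data. It is not the method of any of the cited papers. It only illustrates the general pattern of counting locally on each partition and then summing the counts. The function names, the toy transactions and the use of threads in place of separate machines are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def local_counts(partition, k):
    """Count every k-itemset that occurs in one partition of transactions."""
    counts = Counter()
    for transaction in partition:
        # Sort the items so the same itemset always yields the same tuple.
        for itemset in combinations(sorted(set(transaction)), k):
            counts[itemset] += 1
    return counts


def frequent_itemsets(partitions, k, min_support):
    """Sum the per-partition counts, then keep itemsets meeting min_support."""
    # Threads stand in for worker nodes here (an assumption of this sketch);
    # a real deployment would run local_counts on separate processes or hosts.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda p: local_counts(p, k), partitions))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return {itemset: n for itemset, n in total.items() if n >= min_support}


# Hypothetical toy data: two partitions holding two transactions each.
partitions = [
    [["bread", "milk"], ["bread", "beer", "milk"]],
    [["milk", "beer"], ["bread", "milk", "eggs"]],
]
result = frequent_itemsets(partitions, k=2, min_support=3)
print(result)
```

Only the small count tables cross partition boundaries, not the transactions themselves, which is what makes this pattern attractive when the data is spread over several nodes. The sketch counts every k-itemset without candidate pruning, so it would not scale to large item vocabularies as written.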

2. Design of new data mining systems and architectures to deal with the efficient use of computing resources.
Although some effort has been made towards the development of efficient distributed data mining algorithms, environmental aspects such as task scheduling and resource management are also critical to the success of distributed data mining.

Therefore, the deployment of data mining applications within high-performance and distributed computing infrastructures becomes a challenge for future developments in the field of data mining. This volume is intended to cover this dimension.
