DATA MINING


Computer Engineering
Electronics Engineering
Civil Engineering

This paper introduces data mining, a powerful decision support tool, in the context of knowledge management. Two of the most striking capabilities of data mining techniques are clustering and prediction. Clustering offers a comprehensive analysis of student characteristics, while prediction estimates the likelihood of a variety of student outcomes, such as transferability, persistence, retention and success in classes. Compared to traditional analytical studies, which are often retrospective and aggregate, data mining is forward looking and oriented to individual students.

A real-life project illustrates data mining at work: predicting, for every student currently enrolled at a community college in Silicon Valley, the likelihood of returning to school. The project applies a neural network, C&RT and C5.0 to choose the best prediction, followed by a clustering analysis using TwoStep. The list of students predicted as less likely to return is then turned over to faculty and management for direct or indirect intervention.

The benefit of data mining is its ability to provide a deeper understanding of patterns that were previously unseen with currently available reporting capabilities. Further, prediction gives the college an opportunity to act before a student drops out, or to plan resource allocation with the confidence gained from knowing how many students will transfer or take a particular course.

Data mining is the process of extracting patterns from data. It is becoming an increasingly important tool for transforming data into information. It is commonly used in a wide range of profiling practices, such as marketing, surveillance, fraud detection and scientific discovery. Data mining can be used to uncover patterns in data but is often carried out only on samples of data.
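To make the idea of choosing the best prediction on sampled data concrete, here is a minimal sketch, not the project's actual method. It assumes a handful of hypothetical student records (units enrolled, GPA, whether the student returned) and two toy models, a majority-class baseline and a one-split decision stump, standing in for the neural network, C&RT and C5.0 named in the text. Each model is fitted on one sample and scored on a separate held-out sample, and the higher-scoring model is kept.

```python
from collections import Counter

# Hypothetical records: (units_enrolled, gpa, returned_next_term)
TRAIN = [
    (3, 2.0, 0), (4, 3.5, 0), (6, 2.5, 0),
    (9, 2.2, 1), (12, 3.0, 1), (15, 3.6, 1), (10, 2.9, 1), (7, 3.1, 1),
]
HOLDOUT = [(4, 3.0, 0), (6, 3.8, 0), (11, 2.1, 1), (14, 3.3, 1), (8, 2.4, 0)]


def majority_model(rows):
    """Baseline: always predict the most common label in the training rows."""
    label = Counter(r[-1] for r in rows).most_common(1)[0][0]
    return lambda features: label


def stump_model(rows):
    """One-split tree: predict 1 when a single feature is >= a threshold.

    Tries every feature and every observed value as the threshold and keeps
    the pair with the fewest training errors.
    """
    best = None
    for f in range(len(rows[0]) - 1):
        for t in sorted({r[f] for r in rows}):
            errors = sum((r[f] >= t) != bool(r[-1]) for r in rows)
            if best is None or errors < best[0]:
                best = (errors, f, t)
    _, f, t = best
    return lambda features: int(features[f] >= t)


def accuracy(model, rows):
    """Share of rows whose label the model predicts correctly."""
    return sum(model(r[:-1]) == r[-1] for r in rows) / len(rows)


# Fit every candidate on the training sample, score on the held-out sample.
candidates = {"majority": majority_model(TRAIN), "stump": stump_model(TRAIN)}
scores = {name: accuracy(model, HOLDOUT) for name, model in candidates.items()}
best_name = max(scores, key=scores.get)

# Students the chosen model predicts as not returning (label 0).
at_risk = [r[:-1] for r in HOLDOUT if candidates[best_name](r[:-1]) == 0]
print(scores, best_name, at_risk)
```

Scoring on records that were not used for fitting is the point of the sketch: a model that only looks good on the sample it was built from has not yet shown that its pattern holds in the larger body of data.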
The mining process will be ineffective if the samples are not a good representation of the larger body of data. Data mining cannot discover patterns that may be present in the larger body of data if those patterns are not present in the sample being "mined". Inability to find patterns may become a cause of disputes between customers and service providers. Data mining is therefore not foolproof, but it may be useful if sufficiently representative data samples are collected. The discovery of a particular pattern in a particular set of data does not necessarily mean that the pattern is found elsewhere in the larger data from which that sample was drawn. An important part of the process is the verification and validation of patterns on other samples of data.

The related terms data dredging, data fishing and data snooping refer to the use of data mining techniques on samples that are (or may be) too small for statistical inferences to be made about the validity of any patterns discovered (see also data-snooping bias). Data dredging may, however, be used to develop new hypotheses, which must then be validated with sufficiently large sample sets.

With the continuous development of database technology and the extensive application of database management systems, the volume of data stored in databases is increasing rapidly, and much important information is hidden in these large amounts of data. If this information can be extracted from the database, it can create a lot of potential profit for companies; the technology of mining information from massive databases is known as data mining. Data mining tools can forecast future trends and activities to support people's decisions. For example, by analyzing the whole database system of a company, data mining tools can answer questions such as



