Introduction to Data Mining
The world now runs on data. From the moment we get up in the morning until we go back to bed at night, our lives are connected to data in one way or another, and much of that data ends up stored in data warehouses. Unfortunately, the majority of this data remains unused.
But the trend has changed with the rise of data mining. So let us discuss what exactly data mining is and why we need it.
What is Data Mining?
In the 1960s, statisticians used the terms “data fishing” or “data dredging” to refer to what they considered the bad practice of analyzing data without a prior hypothesis. Data mining emerged as a field of research in the 1990s and is very popular today, sometimes under different names such as “big data” and “data science”, which have closely related, though not identical, meanings.
Data mining is the science of deriving knowledge from data, typically large data sets in which meaningful information, trends, and other useful insights need to be discovered. The aim is to look past the randomness in the data and discover the hidden patterns. Data mining uses machine learning and statistical methods to extract useful “nuggets” of information from what would otherwise be a very intimidating data set.
Having a lot of data in databases is great. However, to really benefit from this data, it is necessary to analyze it and understand it. Data that we cannot understand or draw meaningful conclusions from is useless. So how can the data stored in large databases be analyzed? Traditionally, data has been analyzed by hand to discover interesting knowledge. However, this is time-consuming and prone to error, it may miss important information, and it is simply not realistic on large databases. To address this problem, automatic techniques have been designed to analyze data and extract interesting patterns, trends, or other useful information. This is the purpose of data mining.
Why is data mining so popular?
Data mining has become popular because storing data electronically has become very cheap and because data can now be transferred very quickly over today's fast computer networks. As a result, many organizations now have huge amounts of data stored in databases that need to be analyzed.
Data mining is a young and promising field for the present generation because of its wide range of applications. It has attracted a great deal of attention in the information industry and in society, due to the wide availability of huge amounts of data and the pressing need to turn such data into useful information and knowledge. The information and knowledge gained can be used for applications ranging from market analysis, fraud detection, and customer retention to production control and science exploration. This is the reason why data mining is also called knowledge discovery from data.
Process of Data Mining:
- Data cleaning: This step consists of cleaning the data by removing noise or other inconsistencies that could be a problem for analyzing the data.
- Data integration: This step consists of integrating data from various sources to prepare the data that needs to be analyzed. For example, if the data is stored in multiple databases or files, it may be necessary to integrate the data into a single file or database to analyze it.
- Data selection: This step consists of selecting the relevant data for the analysis to be performed.
- Data transformation: This step consists of transforming the data to a proper format that can be analyzed using data mining techniques. For example, some data mining techniques require that all numerical values are normalized.
- Data mining: This step consists of applying some data mining techniques (algorithms) to analyze the data and discover interesting patterns or extract interesting knowledge from this data.
- Evaluating the knowledge that has been discovered: This step consists of evaluating the knowledge that has been extracted from the data. This can be done in terms of objective and/or subjective measures.
- Visualization: Finally, the last step is to visualize the knowledge that has been extracted from the data.
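The first five steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the records, the field names, and the helper functions (`clean`, `integrate`, `select`, `normalize`, `mine`) are all hypothetical, and the "mining" step is a deliberately trivial placeholder (flagging above-average values) rather than a real data mining algorithm.

```python
from statistics import mean

# Hypothetical toy records from two sources (all values are made up).
store_a = [
    {"customer": "c1", "age": 25, "spend": 120.0},
    {"customer": "c2", "age": None, "spend": 80.0},  # missing age
]
store_b = [
    {"customer": "c3", "age": 40, "spend": 300.0},
    {"customer": "c4", "age": 55, "spend": 200.0},
]

def clean(records):
    """Data cleaning: drop records that contain missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def integrate(*sources):
    """Data integration: merge several sources into a single list."""
    return [r for source in sources for r in source]

def select(records, fields):
    """Data selection: keep only the fields relevant to the analysis."""
    return [{f: r[f] for f in fields} for r in records]

def normalize(records, field):
    """Data transformation: min-max scale one numeric field to [0, 1]."""
    values = [r[field] for r in records]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant fields
    return [{**r, field: (r[field] - lo) / span} for r in records]

def mine(records, field):
    """Data mining (placeholder): keep records above the mean of a field."""
    threshold = mean(r[field] for r in records)
    return [r for r in records if r[field] > threshold]

def run_pipeline():
    data = integrate(clean(store_a), clean(store_b))
    data = select(data, ["age", "spend"])
    data = normalize(data, "age")
    data = normalize(data, "spend")
    return data, mine(data, "spend")

data, patterns = run_pipeline()
```

In practice each step would be far more involved, and the evaluation and visualization steps would follow on the returned `patterns`.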
Data Mining Applications:
- Business forecasting
- Analyzing the behavior of customers in terms of what they buy
- Self-driving cars
- Hazards of new medicines
- Space research
- Fraud detection
- Stock market price prediction
- Weather forecasting
- Social networks
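As a small illustration of the customer behavior application, the sketch below counts which pairs of items are frequently bought together. It is a simplified example, not a full frequent-pattern mining algorithm; the function name `frequent_pairs`, the `min_support` parameter, and the shopping baskets are all assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(baskets, min_support):
    """Return item pairs that occur together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so that each pair has one canonical form.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Hypothetical shopping baskets, invented for illustration.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]

result = frequent_pairs(baskets, min_support=2)
print(result)  # {('bread', 'milk'): 3}
```

Counting every pair directly works for small data; on large databases, dedicated algorithms are normally used to avoid examining all combinations.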