E-GOVERNANCE NOTE (BSCCSIT/BCA/IOE)
INTRODUCTION
Data Warehousing:
- Definition: Data warehousing involves the collection, storage, and management of data from various sources in a centralized repository called a data warehouse.
- Purpose: The primary purpose of a data warehouse is to provide a consolidated and optimized platform for reporting and data analysis. It acts as a single, reliable source of truth for an organization's historical and current data.
- Characteristics:
  - Subject-Oriented: Data warehouses are organized around specific subjects, such as sales, finance, or customer data.
  - Integrated: Data from different sources is integrated and transformed into a consistent format within the warehouse.
  - Time-Variant: Data warehouses store historical data, allowing for trend analysis and comparisons over time.
  - Non-volatile: Once data is stored in the warehouse, it is not typically altered or deleted, maintaining a reliable historical record.
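The four characteristics above can be illustrated with a small sketch. It is only a toy model under stated assumptions: the two source systems (`crm_rows`, `pos_rows`), their field names, and the `SalesWarehouse` class are hypothetical and exist only for illustration. A real warehouse would use a database and an ETL tool.

```python
from datetime import date

# Hypothetical source systems that describe the same kind of sale differently.
crm_rows = [{"cust": "A17", "amount_usd": "120.50", "day": "2024-01-05"}]
pos_rows = [{"customer_id": "a17", "cents": 9900, "date": date(2024, 2, 1)}]


def from_crm(row):
    """Integrated: map a CRM record onto the shared warehouse format."""
    return {
        "customer_id": row["cust"].upper(),
        "amount": float(row["amount_usd"]),
        "sale_date": date.fromisoformat(row["day"]),
    }


def from_pos(row):
    """Integrated: map a point-of-sale record onto the same format."""
    return {
        "customer_id": row["customer_id"].upper(),
        "amount": row["cents"] / 100,
        "sale_date": row["date"],
    }


class SalesWarehouse:
    """Subject-oriented: this store holds sales facts only."""

    def __init__(self):
        self._facts = []

    def load(self, rows):
        # Non-volatile: rows are appended; nothing is updated or deleted.
        self._facts.extend(rows)

    def total_by_month(self):
        # Time-variant: every fact keeps its date, so history can be compared.
        totals = {}
        for fact in self._facts:
            key = (fact["sale_date"].year, fact["sale_date"].month)
            totals[key] = totals.get(key, 0.0) + fact["amount"]
        return totals


warehouse = SalesWarehouse()
warehouse.load(from_crm(r) for r in crm_rows)
warehouse.load(from_pos(r) for r in pos_rows)
monthly = warehouse.total_by_month()
```

Here `monthly` maps each (year, month) pair to the total sales amount, which is the kind of consolidated, historical view that reporting relies on.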
Data Mining:
- Definition: Data mining is the process of discovering patterns, trends, correlations, and valuable insights from large datasets using various techniques, including statistical analysis, machine learning, and artificial intelligence.
- Purpose: The main goal of data mining is to extract meaningful knowledge from data, uncover hidden patterns, and make predictions or classifications based on that knowledge.
- Techniques:
  - Classification: Assigning items to predefined categories based on their characteristics.
  - Clustering: Grouping similar items together based on their features, without predefined categories.
  - Association Rule Mining: Identifying relationships and associations between variables in the data, such as items that are frequently bought together.
  - Regression Analysis: Predicting a numerical value from one or more input variables, using a model fitted to historical data.
  - Anomaly Detection: Identifying unusual patterns or outliers in the data.
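Three of the techniques listed above can be sketched in a few lines of plain Python. This is a minimal illustration on made-up data, not a production approach: the `baskets`, `xs`/`ys`, and `readings` values are invented, the regression uses ordinary least squares with one input variable, and the anomaly check uses a simple z-score rule with an assumed threshold of 2.

```python
from statistics import mean, pstdev

# --- Association rule mining: support and confidence of {bread} -> {butter}
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
]


def support(itemset):
    """Fraction of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent):
    """Of the baskets containing `antecedent`, the share that also contain `consequent`."""
    return support(antecedent | consequent) / support(antecedent)


rule_support = support({"bread", "butter"})
rule_confidence = confidence({"bread"}, {"butter"})

# --- Regression analysis: fit y = slope * x + intercept by least squares
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
x_bar, y_bar = mean(xs), mean(ys)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = y_bar - slope * x_bar

# --- Anomaly detection: flag readings more than 2 standard deviations from the mean
readings = [10, 11, 9, 10, 12, 10, 11, 50]
mu, sigma = mean(readings), pstdev(readings)
outliers = [r for r in readings if abs(r - mu) / sigma > 2]
```

In this toy data the rule {bread} -> {butter} holds in two of the three baskets that contain bread, the fitted line recovers the relationship used to generate `ys`, and only the reading of 50 is flagged as an outlier.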
Relationship between Data Warehousing and Data Mining:
- Data mining commonly draws on the data stored in a data warehouse to discover patterns and generate insights. Because warehouse data is already integrated and organized, it is a suitable source for data mining activities.
- Data warehouses provide the historical and current data needed to train and validate data mining models, which helps ground the results in consistent, reliable information.
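One way to picture training and validating a model on warehouse data is a time-based split: older records train the model and more recent records check it. The sketch below assumes a hypothetical extract of (date, ad spend, revenue) rows and a deliberately simple model (a single revenue-per-spend ratio). Both are illustrative choices, not something the notes prescribe.

```python
from datetime import date

# Hypothetical warehouse extract: (sale_date, ad_spend, revenue)
history = [
    (date(2023, 1, 1), 10, 20),
    (date(2023, 2, 1), 20, 40),
    (date(2023, 3, 1), 30, 60),
    (date(2023, 4, 1), 40, 84),
    (date(2023, 5, 1), 50, 96),
]

# Time-variant data allows a chronological split at an assumed cutoff date.
cutoff = date(2023, 4, 1)
train = [row for row in history if row[0] < cutoff]
validation = [row for row in history if row[0] >= cutoff]

# "Training": estimate revenue per unit of ad spend from the older records.
ratio = sum(rev for _, _, rev in train) / sum(spend for _, spend, _ in train)

# "Validation": mean absolute error of the predictions on the newer records.
errors = [abs(rev - ratio * spend) for _, spend, rev in validation]
mae = sum(errors) / len(errors)
```

Splitting by date rather than at random keeps the validation set strictly later than the training set, which mirrors how the model would be used on future data.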