Describe COM_MIN approach in horizontal fragmentation, How do you obtain clustered affinity matrix in vertical fragmentation? Explain with suitable example.

COM_MIN Approach in Horizontal Fragmentation

The COM_MIN (Complete and Minimal) approach is used in horizontal fragmentation to divide a database table into smaller, meaningful parts based on query conditions. The goal is to ensure that each fragment contains only the necessary data, making queries faster and reducing communication between database servers.

How it Works:

i. Identifying Conditions (Predicates): First, we find the common conditions used in queries, such as “region = ‘North'” or “year = 2020”.
ii. Combining Conditions: These conditions are combined to create logical fragments. For example, one fragment may store sales records for “North region in 2020”, while another may store “South region in 2021”.
iii. Ensuring Completeness and Minimality: The fragments must include all relevant data (completeness) and should not have unnecessary data (minimality).
Example:
Consider a Sales table with attributes (Sale_ID, Product, Region, Year, Revenue). If many queries filter data based on Region and Year, the COM_MIN approach may create these fragments:

Fragment 1: Sales data for North region in 2020.
Fragment 2: Sales data for South region in 2021.
This approach improves efficiency since queries only need to scan relevant fragments instead of the entire table.

Clustered Affinity Matrix in Vertical Fragmentation

Vertical fragmentation divides a table by columns rather than rows, keeping related columns together in separate fragments. A clustered affinity matrix is used to decide how to group the columns.

How it Works:

i. Analyze Queries: Identify how often each pair of columns is used together in queries.
ii. Create an Affinity Matrix: A table is created where each cell shows how frequently two columns appear together. Higher values indicate stronger relationships.
iii. Group Similar Columns: Columns with high affinity are grouped into separate fragments to minimize the need for joins.
Example:
Consider an Employee table with attributes (EmpID, Name, Address, Department, Salary). After analyzing queries, the affinity matrix might show:

(Name, Address) often appear together (used for employee details).
(EmpID, Department, Salary) are frequently used together (used for payroll and HR).
Using this matrix, we can create two fragments:

Fragment 1 (Personal Details): EmpID, Name, Address
Fragment 2 (Job Details): EmpID, Department, Salary
This reduces the need for unnecessary joins, making queries faster and improving system performance.

Hence, both COM_MIN for horizontal fragmentation and clustered affinity matrix for vertical fragmentation help improve query performance and efficiency by organizing data based on how it is accessed. Horizontal fragmentation reduces data transfer between servers, while vertical fragmentation groups related attributes to minimize join operations. These techniques ensure a well-optimized distributed database system.

Leave a Comment