Fragmentation in Distributed Databases
Fragmentation is the process of dividing a large database into smaller parts, called fragments, which are stored at different locations. Fragments let each location manage and access the data it needs efficiently, and they are defined so that the original database can be rebuilt without losing any information.
For example, if a table T is divided into T1, T2, T3, …, TN, the fragments taken together contain enough data to reconstruct the original table when needed. No single fragment has to hold the whole table.
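
The reconstruction property can be illustrated with a minimal Python sketch. The table contents, the `fragment` and `reconstruct` helpers, and the id-modulo splitting rule are all assumptions made for illustration; a real distributed database chooses its own fragmentation rule.

```python
# Hypothetical table T: each row is a tuple (id, name, region).
T = [
    (1, "Asha", "India"),
    (2, "Ben", "USA"),
    (3, "Chen", "USA"),
    (4, "Divya", "India"),
]


def fragment(rows, n):
    """Split rows into n fragments by id modulo n (an arbitrary, illustrative rule)."""
    fragments = [[] for _ in range(n)]
    for row in rows:
        fragments[row[0] % n].append(row)
    return fragments


def reconstruct(fragments):
    """Rebuild the table by combining the rows of every fragment."""
    return sorted(row for frag in fragments for row in frag)


fragments = fragment(T, 3)   # T1, T2, T3
rebuilt = reconstruct(fragments)

# No row is lost or duplicated: the fragments together give back T.
assert rebuilt == sorted(T)
```

Each row lands in exactly one fragment here, so combining the fragments returns every row once.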
Comparison of Horizontal and Vertical Fragmentation
Feature | Horizontal Fragmentation | Vertical Fragmentation |
---|---|---|
Definition | Divides a table into smaller parts based on rows (records) using a condition (e.g., region or department). | Divides a table into smaller parts based on columns, repeating the primary key in every fragment so they can be linked back together. |
Structure | Each fragment has the same columns but different rows. | Each fragment covers the same rows, identified by the primary key, but holds different columns. |
Use Case | Useful when queries frequently access specific rows based on conditions. | Useful when queries frequently request only specific columns of a table. |
Performance Optimization | Improves performance by reducing the number of rows a query has to process. | Improves performance by retrieving only the columns a query needs, so less data is read and transferred. |
Reconstruction | The original table is reconstructed using the UNION operation. | The original table is reconstructed using the JOIN operation with the primary key. |
Example Scenario | A multinational company stores employee records by country (e.g., USA, India). | A hospital separates patient records into personal details and medical history. |
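
Both rows of the comparison, fragment structure and reconstruction, can be tried out with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the `employee` table, its columns, and the fragment table names are hypothetical, and all fragments live in one in-memory database rather than at separate sites.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical source table (names and values are illustrative only).
cur.execute(
    "CREATE TABLE employee "
    "(id INTEGER PRIMARY KEY, name TEXT, country TEXT, salary INTEGER)"
)
cur.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "Asha", "India", 50000),
        (2, "Ben", "USA", 70000),
        (3, "Chen", "USA", 65000),
    ],
)

# Horizontal fragmentation: same columns, rows chosen by a condition.
cur.execute("CREATE TABLE employee_usa AS SELECT * FROM employee WHERE country = 'USA'")
cur.execute("CREATE TABLE employee_india AS SELECT * FROM employee WHERE country = 'India'")

# Vertical fragmentation: different columns, primary key kept in each fragment.
cur.execute("CREATE TABLE employee_personal AS SELECT id, name, country FROM employee")
cur.execute("CREATE TABLE employee_pay AS SELECT id, salary FROM employee")

original = cur.execute("SELECT * FROM employee ORDER BY 1").fetchall()

# Horizontal fragments are recombined with UNION.
from_horizontal = cur.execute(
    "SELECT * FROM employee_usa UNION SELECT * FROM employee_india ORDER BY 1"
).fetchall()

# Vertical fragments are recombined with a JOIN on the primary key.
from_vertical = cur.execute(
    "SELECT p.id, p.name, p.country, s.salary "
    "FROM employee_personal AS p JOIN employee_pay AS s ON p.id = s.id "
    "ORDER BY p.id"
).fetchall()

assert from_horizontal == original
assert from_vertical == original
conn.close()
```

The two conditions used for the horizontal fragments cover every row exactly once, which is why the UNION returns the full table. If a row matched neither condition, it would be missing from the reconstruction.
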
Thus, fragmentation helps manage large databases by distributing data efficiently. Horizontal fragmentation divides a table by rows, while vertical fragmentation divides it by columns. In both cases queries touch less data, which improves performance and reduces processing time.