A-D-H Architecture of Distributed Database (DDB)
The A-D-H architecture is a design model used for distributed databases, consisting of three layers that organize the system’s components and their functions. Here’s a breakdown of each layer:
- A – Application Layer:
- This is the top layer where users and applications interact with the distributed database system. It includes the client-side interfaces, user applications, reports, and queries.
- Users access the system through this layer, requesting data or making updates without needing to understand the underlying data distribution or storage methods.
- D – Data Layer:
- This is the middle layer, where the actual data resides. The data is distributed across multiple physical sites or locations in the network.
- This layer is responsible for managing how data is stored, retrieved, and updated across different sites in the distributed database.
- H – Hardware Layer:
- This bottom layer refers to the physical infrastructure, including servers, storage devices, and network connections.
- The hardware layer provides the resources needed to store data and to transmit it between the distributed nodes.
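The separation between the three layers can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: the site names, the sample rows, and the functions `find_customer` and `customer_name` are assumptions, and plain dictionaries stand in for real servers and storage.

```python
# Hardware layer (stand-in): each entry represents a server with local storage.
SITES = {
    "site_ny": {1: {"name": "Ana", "region": "NY"}},
    "site_ca": {2: {"name": "Bo", "region": "CA"}},
}

# Data layer: knows how rows are spread over sites and hides that from callers.
def find_customer(customer_id):
    for storage in SITES.values():
        if customer_id in storage:
            return storage[customer_id]
    return None

# Application layer: asks for data without naming any site.
def customer_name(customer_id):
    row = find_customer(customer_id)
    return row["name"] if row else None

print(customer_name(2))  # Bo
```

The application-level function never mentions a site, which is the point of the layering: the location of the data can change without changing the caller.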
Why Is Fragmentation Important in Distributed Databases?
Fragmentation is the process of dividing a database into smaller, manageable pieces called fragments, which are then distributed across multiple sites in a distributed system. Fragmentation is a key part of database design for several reasons:
- Improved Performance:
- Splitting the database into smaller parts and placing each fragment close to where it is most often accessed speeds up data retrieval and reduces response times for users.
- Scalability:
- As data grows, fragmentation allows the system to scale more easily: new fragments, and new sites to hold them, can be added to accommodate the increased data load.
- Load Balancing:
- Fragmenting data helps distribute the workload more evenly across different sites. This makes it less likely that a single site becomes overloaded with requests and improves overall system performance.
- Availability and Fault Tolerance:
- Fragments can be replicated across multiple sites, so the system can continue to operate even if one site goes down. The availability and fault tolerance come from the replication; fragmentation defines the units that are replicated.
- Data Localization:
- Fragmentation allows data that is frequently accessed together to be stored at the same site. This reduces the need for remote access to distant sites, further improving data retrieval times.
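The performance and data localization points can be made concrete with a small placement sketch. It assumes access counts per fragment and per site are already known; the function name `place_fragments`, the fragment names, and the counts are all hypothetical. Real allocation also weighs storage limits, update costs, and replication, which this sketch ignores.

```python
def place_fragments(access_counts):
    """Map each fragment to the site that accesses it most often."""
    placement = {}
    for fragment, per_site in access_counts.items():
        # Pick the site with the highest access count for this fragment.
        placement[fragment] = max(per_site, key=per_site.get)
    return placement

# Hypothetical access statistics: fragment -> {site: number of accesses}.
access_counts = {
    "customers_ny": {"site_ny": 900, "site_ca": 40},
    "customers_ca": {"site_ny": 25, "site_ca": 700},
}

print(place_fragments(access_counts))
# {'customers_ny': 'site_ny', 'customers_ca': 'site_ca'}
```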
Correctness Rules of Fragmentation
When designing the fragmentation of data in a distributed database, it’s important to follow certain correctness rules to ensure the integrity and reliability of the system. Here are the main rules:
- Completeness:
- Rule: Every piece of data from the original database must be included in at least one of the fragments. No data should be lost during the fragmentation process. Example: If you have a customer database, every part of each customer record (name, address, phone number, etc.) should appear in one or more of the fragments, so that no data is left out.
- Disjointness:
- Rule: A data item should appear in only one fragment. This avoids unintended duplication and ensures that the fragments contain distinct, non-overlapping sets of data. The exception is vertical fragmentation, where the primary key is repeated in every fragment so that the rows can be joined back together. Example: If you split customer data by region (e.g., New York customers in one fragment), then a customer should not appear in more than one regional fragment unless the data is explicitly replicated.
- Correctness of Data Retrieval (Reconstruction):
- Rule: It must be possible to rebuild the original data from the fragments, so that a query needing data from multiple fragments gets a correctly reassembled result. Horizontal fragments are recombined with a union of rows; vertical fragments are recombined with a join on the key. Example: If you query all customers across several regions, the system should pull the rows from each regional fragment and combine them into a single result with no missing or repeated customers.
- Replication Consistency:
- Rule: If data is replicated across multiple sites, any update made to a fragment must be reflected consistently across all of its replicas, so that the data is up to date everywhere. Example: If customer information is replicated across two locations, and a change is made to the address at one location, the change must be propagated to the other location to maintain consistency.
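The first three rules can be checked mechanically for a horizontal fragmentation. The following is a minimal sketch under stated assumptions: a table is modeled as a dictionary from primary key to row, fragments are a list of such dictionaries, and `check_horizontal` is a hypothetical helper name. Replication consistency is not covered here because it concerns copies of a fragment rather than the fragmentation itself.

```python
def check_horizontal(original, fragments):
    """Check completeness, disjointness, and reconstruction.

    original:  dict mapping primary key -> row
    fragments: list of dicts with the same shape
    """
    seen = set()      # keys found in any fragment so far
    disjoint = True
    rebuilt = {}      # union of all fragment rows
    for frag in fragments:
        keys = set(frag)
        if seen & keys:          # a key already appeared in an earlier fragment
            disjoint = False
        seen.update(keys)
        rebuilt.update(frag)
    return {
        "complete": set(original) <= seen,   # no original row is missing
        "disjoint": disjoint,
        "reconstructs": rebuilt == original, # the union gives back the table
    }

# Hypothetical sample data.
original = {
    1: {"name": "Ana", "region": "NY"},
    2: {"name": "Bo", "region": "CA"},
    3: {"name": "Cy", "region": "NY"},
}
by_region = [
    {1: original[1], 3: original[3]},  # New York fragment
    {2: original[2]},                  # California fragment
]

print(check_horizontal(original, by_region))
# {'complete': True, 'disjoint': True, 'reconstructs': True}
```

A fragmentation that drops customer 3 would report `complete` and `reconstructs` as `False`, and one that places customer 1 in both fragments would report `disjoint` as `False`.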
Types of Fragmentation
There are three main types of fragmentation used in distributed database systems:
- Horizontal Fragmentation:
- A table is divided by rows, with each fragment containing a subset of the rows of the original table and all of its columns. The split is based on a specific condition, such as a geographic region or customer category. Example: A customer table is fragmented horizontally by region, with one fragment containing customers from New York, another containing customers from California, and so on.
- Vertical Fragmentation:
- A table is divided by columns, with each fragment containing a subset of the columns (attributes) of the original table. Every fragment also keeps the primary key so that the original rows can be rebuilt with a join. This is useful when only specific attributes are needed frequently, because it reduces the amount of data transferred. Example: A customer table could be vertically fragmented so that one fragment contains the customer ID and name, while another contains the customer ID, address, and contact information.
- Hybrid Fragmentation:
- A combination of horizontal and vertical fragmentation, applied one after the other. Example: A customer table could first be fragmented horizontally by region (e.g., New York, California), and then each regional fragment is further split vertically so that it contains only specific columns (e.g., name in one fragment, address and contact info in another).
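The three types can be sketched on an in-memory table. This is an illustrative sketch only: the rows, the column names, and the helper functions `horizontal`, `vertical`, and `hybrid` are assumptions made for the example, and a list of dictionaries stands in for a real table.

```python
# Hypothetical customer table.
customers = [
    {"id": 1, "name": "Ana", "region": "NY", "address": "1 Main St"},
    {"id": 2, "name": "Bo", "region": "CA", "address": "2 Oak Ave"},
    {"id": 3, "name": "Cy", "region": "NY", "address": "3 Elm Rd"},
]

def horizontal(rows, predicate):
    """Keep whole rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]

def vertical(rows, columns, key="id"):
    """Keep only the chosen columns, always including the key column."""
    keep = [key] + [c for c in columns if c != key]
    return [{c: row[c] for c in keep} for row in rows]

def hybrid(rows, predicate, columns, key="id"):
    """Horizontal split first, then a vertical split of the result."""
    return vertical(horizontal(rows, predicate), columns, key)

ny_rows = horizontal(customers, lambda r: r["region"] == "NY")
names = vertical(customers, ["name"])
ny_names = hybrid(customers, lambda r: r["region"] == "NY", ["name"])

print(ny_names)
# [{'id': 1, 'name': 'Ana'}, {'id': 3, 'name': 'Cy'}]
```

Keeping the key in every vertical fragment is a deliberate choice in this sketch: without it, the fragments could not be joined back into the original rows.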
Fragmentation is a central part of distributed database design because it helps improve performance, scalability, and, when combined with replication, fault tolerance. Dividing data into smaller fragments and distributing them across different sites supports efficient and reliable access to data. Following the correctness rules of fragmentation (completeness, disjointness, correctness of data retrieval, and replication consistency) helps keep the system consistent and reliable. Understanding and applying the different types of fragmentation (horizontal, vertical, and hybrid) is key to building a robust distributed database system.