Describe the properties of a transaction. What is a distributed transaction? Give an example of a dirty read in concurrent transactions. Define conflict serializability.

Properties of a Transaction (ACID Properties)

In database management systems (DBMS), a transaction is a sequence of one or more operations (such as insert, update, delete, or read) that is executed as a single logical unit of work. The ACID properties ensure that transactions remain correct and reliable, particularly under failures and concurrent access. The ACID properties are:

Atomicity:
- Atomicity means that a transaction is treated as a single unit, which either completely succeeds or completely fails. If any part of the transaction fails, the entire transaction is rolled back, leaving the database in the state it was in before the transaction began.
- Example: If a bank transfer transaction involves withdrawing money from one account and depositing it into another, atomicity ensures that both actions occur or neither does.
Consistency:
- A transaction must bring the database from one consistent state to another consistent state. Any transaction that violates the integrity constraints of the database should be aborted and rolled back.
- Example: If a transaction transfers money from one account to another, the total balance in the system must remain unchanged before and after the transaction.
Isolation:
- Isolation ensures that the execution of one transaction is isolated from the execution of other transactions. Even though transactions may execute concurrently, the intermediate states of one transaction should not be visible to other transactions.
- Example: If two transactions are happening concurrently, one transferring money from Account A to Account B and another from Account B to Account C, isolation ensures that each transaction operates as if it were the only transaction running.
Durability:
- Durability guarantees that once a transaction is committed, its effects are permanent and persist, even if the system crashes immediately after the commit.
- Example: If a transaction that adds a new record to a database is committed, the record will remain in the database even if the system crashes right after the commit.
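
The bank-transfer example above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration rather than a production design: the accounts table, the account names, and the transfer helper are assumptions introduced here. The CHECK constraint stands in for an integrity rule, and the deposit is deliberately applied first so that a failed withdrawal has partial work to undo.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the withdrawal; the deposit was undone too.
        return False

def balances(conn):
    """Return the committed balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))

# Hypothetical schema: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
with conn:
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 1000), ("B", 500)])

ok = transfer(conn, "A", "B", 200)       # succeeds: both updates commit
failed = transfer(conn, "A", "B", 5000)  # fails: both updates are rolled back
print(ok, failed, balances(conn))
```

After the failed transfer the balances are the same as after the successful one, and the total across both accounts is unchanged, which is the atomicity and consistency behaviour described above.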

Distributed Transaction

A distributed transaction is a transaction that spans two or more databases or other systems, typically running on different servers or at different geographical locations. These databases are usually part of a distributed database system or a distributed computing environment.

In a distributed transaction, the ACID properties still apply, but they are harder to guarantee because the participating systems can fail independently and the network between them may be unavailable at times. Atomicity in particular requires a coordination protocol such as Two-Phase Commit (2PC), in which a coordinator first asks every participant whether it can commit and then tells all of them to commit only if every participant voted yes; otherwise all of them abort.

Example of a Distributed Transaction:

Consider an e-commerce platform that uses two different databases for order management and payment processing. When a customer places an order, the system might need to perform the following:

Database A (order management) updates the order status.
Database B (payment processing) updates the payment status.

A distributed transaction ensures that both updates commit together or, if either system fails, that neither takes effect.
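
The all-or-nothing outcome across the two databases can be sketched as a toy Two-Phase Commit coordinator. This is a simplified, in-process illustration: the Participant class, its can_commit flag, and the two_phase_commit function are hypothetical names introduced here, and the sketch leaves out logging, timeouts, and recovery from coordinator failure, all of which a real protocol needs.

```python
class Participant:
    """Hypothetical resource manager: votes in phase 1, applies the decision in phase 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # assumption: whether local work can be committed
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes only if the local changes can be made durable.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (voting): ask every participant to prepare.
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(votes) else "abort"
    # Phase 2 (completion): send the same global decision to everyone.
    for p in participants:
        if decision == "commit":
            p.commit()
        else:
            p.abort()
    return decision


# Both databases can commit, so the order and the payment are both recorded.
orders, payments = Participant("orders"), Participant("payments")
result_ok = two_phase_commit([orders, payments])

# The payment database cannot commit, so the order update is aborted as well.
orders2 = Participant("orders")
payments2 = Participant("payments", can_commit=False)
result_fail = two_phase_commit([orders2, payments2])
print(result_ok, result_fail)
```

The key design point is that no participant commits until every participant has voted yes, so a single "no" vote aborts the transaction everywhere.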

Dirty Read in Concurrent Transactions

A dirty read occurs when one transaction reads data that has been modified by another transaction but not yet committed. This can lead to inconsistencies because the uncommitted data might be rolled back, making the read value invalid.

Example of Dirty Read:

Transaction T1 starts and updates the balance of account A from $1000 to $800, but does not commit yet.
Transaction T2 starts and reads the updated balance of account A ($800) before T1 commits.
T1 then fails and rolls back its changes, restoring account A’s balance to $1000.

In this case, T2 has read a value (the $800 balance) that was never committed and, after the rollback, never existed in the database. This is a dirty read, and any decision T2 makes based on that value is unreliable.
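
The same sequence can be replayed in a small in-memory simulation. This is only a sketch: ToyStore and its allow_dirty flag are hypothetical, and they model the difference between a reader that may see uncommitted writes (as under the SQL READ UNCOMMITTED isolation level) and one that sees only committed data. It is not how a real DBMS implements isolation.

```python
class ToyStore:
    """In-memory store that keeps committed values and pending writes separately."""

    def __init__(self, data):
        self.committed = dict(data)
        self.uncommitted = {}  # transaction id -> {key: pending value}

    def write(self, txn, key, value):
        self.uncommitted.setdefault(txn, {})[key] = value

    def read(self, txn, key, allow_dirty):
        # A transaction always sees its own pending writes.
        if key in self.uncommitted.get(txn, {}):
            return self.uncommitted[txn][key]
        if allow_dirty:
            # Dirty read: return another transaction's uncommitted write.
            for writes in self.uncommitted.values():
                if key in writes:
                    return writes[key]
        return self.committed[key]

    def commit(self, txn):
        self.committed.update(self.uncommitted.pop(txn, {}))

    def rollback(self, txn):
        self.uncommitted.pop(txn, None)


store = ToyStore({"A": 1000})
store.write("T1", "A", 800)                        # T1 updates A but has not committed
dirty = store.read("T2", "A", allow_dirty=True)    # T2 reads T1's uncommitted value
clean = store.read("T2", "A", allow_dirty=False)   # an isolated read sees committed data
store.rollback("T1")                               # T1 fails and rolls back
final = store.read("T2", "A", allow_dirty=False)
print(dirty, clean, final)
```

The dirty read returns 800 even though the committed balance is 1000 both before and after the rollback.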

Conflict Serializability

Conflict serializability is a criterion for deciding whether a schedule (an interleaved sequence of operations from several transactions) is equivalent to some serial schedule, that is, a schedule in which the transactions execute one after another without any overlap. A schedule is conflict-serializable if it can be transformed into a serial schedule by repeatedly swapping adjacent non-conflicting operations.

Conditions for Conflict:

Two operations conflict if all three of the following hold:

They belong to different transactions.
They operate on the same data item.
At least one of the operations is a write operation.

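Swapping two conflicting operations could change the outcome, so the relative order of every conflicting pair must be preserved. A schedule is therefore conflict-serializable exactly when some serial schedule orders every pair of conflicting operations the same way the original schedule does.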
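
The conflict conditions above can be checked mechanically with the standard precedence-graph test: draw an edge from Ti to Tj whenever an operation of Ti conflicts with a later operation of Tj, and the schedule is conflict-serializable exactly when the graph has no cycle. The sketch below assumes a schedule is a list of (transaction, action, item) tuples with action "R" or "W"; that encoding, the function names, and both sample schedules are illustrative choices made here.

```python
from itertools import combinations

def conflicts(op1, op2):
    """Different transactions, same data item, at least one write."""
    t1, a1, x1 = op1
    t2, a2, x2 = op2
    return t1 != t2 and x1 == x2 and "W" in (a1, a2)

def precedence_edges(schedule):
    # combinations() keeps schedule order, so `first` always precedes `second`.
    return {
        (first[0], second[0])
        for first, second in combinations(schedule, 2)
        if conflicts(first, second)
    }

def is_conflict_serializable(schedule):
    edges = precedence_edges(schedule)
    nodes = {op[0] for op in schedule}
    # Repeatedly remove transactions that have no remaining predecessor.
    while nodes:
        free = {
            n for n in nodes
            if not any(dst == n and src in nodes for src, dst in edges)
        }
        if not free:
            return False  # every remaining transaction waits on another: a cycle
        nodes -= free
    return True

# Hypothetical interleaved schedule in which every conflict orders T1 before T2.
s_ok = [("T1", "W", "X"), ("T2", "R", "X"), ("T1", "W", "Y"), ("T2", "W", "Y")]
# Hypothetical schedule with conflicts in both directions (T1 -> T2 and T2 -> T1).
s_bad = [("T1", "R", "X"), ("T2", "W", "X"), ("T2", "W", "Y"), ("T1", "W", "Y")]

print(is_conflict_serializable(s_ok), is_conflict_serializable(s_bad))
```

When the graph is acyclic, the order in which transactions are removed is itself an equivalent serial order.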

Example of Conflict Serializability:

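Consider the following schedule involving two transactions, T1 and T2, with operations listed in execution order:

Operation	Transaction	Data Item
Write	T1	A
Read	T2	A
Write	T2	A
Read	T1	A

This schedule is not serial because the operations of T1 and T2 are interleaved: T2 runs between T1’s write and T1’s read. It is also not conflict-serializable. T1’s write of A precedes and conflicts with T2’s read and write of A, so T1 must come before T2 in any equivalent serial schedule. But T2’s write of A precedes and conflicts with T1’s final read of A, so T2 must also come before T1. No serial order satisfies both requirements.

By contrast, if T1’s read of A were executed before T2’s write, the schedule would be conflict-serializable:

T1 writes A.
T2 reads A.
T1 reads A.
T2 writes A.

Here every conflict orders T1 before T2, and the two adjacent reads do not conflict, so swapping them gives the serial schedule in which T1 writes and reads A and then T2 reads and writes A.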

In conclusion, transactions are essential for maintaining consistency and reliability in a database system. The ACID properties define what correct execution means under failures and concurrent access. Distributed transactions extend these guarantees across multiple databases and systems, using protocols such as Two-Phase Commit. Dirty reads show what can go wrong when isolation is not enforced, and conflict serializability gives a precise test for whether an interleaved schedule is equivalent to a serial one.