What is DML in SQL?

Free Coding Questions Catalog
Boost your coding skills with our essential coding questions catalog. Take a step towards a better tech career now!

Data Manipulation Language (DML) is a core subset of SQL (Structured Query Language) used to interact with and manipulate the data stored within relational databases. Unlike Data Definition Language (DDL), which focuses on defining and managing database structures, DML is primarily concerned with the actual data—retrieving, inserting, updating, and deleting records. Mastering DML is essential for effectively managing and utilizing the data within your databases.

Key Components of DML

DML encompasses several fundamental commands that allow users to perform various operations on the data. The most commonly used DML commands include:

  1. SELECT
  2. INSERT
  3. UPDATE
  4. DELETE
  5. MERGE (in some SQL dialects)

1. SELECT

Purpose:
The SELECT statement retrieves data from one or more tables in a database. It allows you to specify exactly which columns you want to view and apply various conditions to filter the results.

Syntax:

SELECT column1, column2, ... FROM table_name WHERE condition;

Example: Retrieve the first name and email of all employees in the Sales department.

SELECT FirstName, Email FROM Employees WHERE Department = 'Sales';

Advanced Usage:

  • Joining Tables:
    SELECT Employees.FirstName, Departments.DepartmentName FROM Employees INNER JOIN Departments ON Employees.DepartmentID = Departments.DepartmentID;
  • Aggregating Data:
    SELECT Department, COUNT(*) AS NumberOfEmployees FROM Employees GROUP BY Department HAVING COUNT(*) > 5;

2. INSERT

Purpose:
The INSERT statement adds new records (rows) to a table. It allows you to specify the values for each column in the new record.

Syntax:

INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);

Example: Add a new employee to the Employees table.

INSERT INTO Employees (EmployeeID, FirstName, LastName, Department, Email) VALUES (101, 'Jane', 'Doe', 'Marketing', 'jane.doe@example.com');

Bulk Insert: Insert multiple records in a single statement.

INSERT INTO Employees (EmployeeID, FirstName, LastName, Department, Email) VALUES (102, 'John', 'Smith', 'Sales', 'john.smith@example.com'), (103, 'Alice', 'Johnson', 'IT', 'alice.johnson@example.com');

3. UPDATE

Purpose:
The UPDATE statement modifies existing records in a table. You can update one or more columns and apply conditions to specify which records should be updated.

Syntax:

UPDATE table_name SET column1 = value1, column2 = value2, ... WHERE condition;

Example: Update the department of an employee with EmployeeID 101 to 'Sales'.

UPDATE Employees SET Department = 'Sales' WHERE EmployeeID = 101;

Updating Multiple Records: Change the department for all employees in 'Marketing' to 'Sales'.

UPDATE Employees SET Department = 'Sales' WHERE Department = 'Marketing';

4. DELETE

Purpose:
The DELETE statement removes existing records from a table. You can specify conditions to determine which records should be deleted. Without a WHERE clause, all records in the table will be deleted.

Syntax:

DELETE FROM table_name WHERE condition;

Example: Delete the employee record with EmployeeID 101.

DELETE FROM Employees WHERE EmployeeID = 101;

Deleting Multiple Records: Remove all employees in the 'IT' department.

DELETE FROM Employees WHERE Department = 'IT';

Caution:
Always use a WHERE clause with DELETE statements to avoid unintentionally removing all records from a table.

5. MERGE (Available in Some SQL Dialects)

Purpose:
The MERGE statement allows you to perform INSERT, UPDATE, or DELETE operations in a single statement based on whether a condition is met.

Syntax:

MERGE INTO target_table USING source_table ON (condition) WHEN MATCHED THEN UPDATE SET column1 = value1, ... WHEN NOT MATCHED THEN INSERT (column1, column2, ...) VALUES (value1, value2, ...);

Example: Synchronize data between two tables, updating existing records and inserting new ones.

MERGE INTO Employees AS Target USING NewEmployees AS Source ON Target.EmployeeID = Source.EmployeeID WHEN MATCHED THEN UPDATE SET Target.FirstName = Source.FirstName, Target.LastName = Source.LastName, Target.Department = Source.Department WHEN NOT MATCHED THEN INSERT (EmployeeID, FirstName, LastName, Department, Email) VALUES (Source.EmployeeID, Source.FirstName, Source.LastName, Source.Department, Source.Email);

Differences Between DML and Other SQL Subsets

AspectData Manipulation Language (DML)Data Definition Language (DDL)Data Control Language (DCL)Transaction Control Language (TCL)
Primary PurposeManipulate and query the dataDefine and manage database structuresControl access and permissions to dataManage transactions within the database
Common CommandsSELECT, INSERT, UPDATE, DELETE, MERGECREATE, ALTER, DROP, TRUNCATE, RENAMEGRANT, REVOKECOMMIT, ROLLBACK, SAVEPOINT
ImpactChanges the actual data stored in tablesAlters the structure of the databaseModifies security settingsControls the flow of transactions
Transactional NatureCan be part of transactions, allowing commits or rollbacksTypically auto-committed, cannot be rolled backGenerally immediate effects, part of security settingsExplicitly manages transaction boundaries

Best Practices for Using DML

  1. Use Transactions:

    • Group related DML operations within transactions to ensure atomicity. This means either all operations succeed or none do, maintaining data integrity.
    BEGIN TRANSACTION; INSERT INTO Orders (OrderID, CustomerID, OrderDate) VALUES (1001, 1, '2024-04-01'); UPDATE Inventory SET Quantity = Quantity - 1 WHERE ProductID = 101; COMMIT TRANSACTION;
    • If any operation fails, you can roll back the entire transaction.
    ROLLBACK TRANSACTION;
  2. Always Use WHERE Clauses with UPDATE and DELETE:

    • To prevent unintended modifications or deletions, always specify conditions.
    UPDATE Employees SET Department = 'HR' WHERE EmployeeID = 102;
  3. Back Up Data Before Mass Updates or Deletes:

    • Before performing bulk operations, back up your data to avoid accidental loss.
    BACKUP DATABASE YourDatabase TO DISK = 'backup.bak';
  4. Validate Data Before Insertion:

    • Ensure that the data being inserted adheres to the table constraints and business rules.
    INSERT INTO Employees (EmployeeID, FirstName, LastName, Department, Email) VALUES (104, 'Mark', 'Taylor', 'Finance', 'mark.taylor@example.com') WHERE NOT EXISTS ( SELECT 1 FROM Employees WHERE Email = 'mark.taylor@example.com' );
  5. Use Parameterized Queries:

    • When interacting with databases through applications, use parameterized queries to prevent SQL injection attacks.
    -- Example in a programming language like Python cursor.execute("INSERT INTO Employees (FirstName, LastName) VALUES (?, ?)", (first_name, last_name))
  6. Optimize Queries for Performance:

    • Use appropriate indexing and avoid unnecessary columns in SELECT statements to enhance performance.
    SELECT FirstName, LastName FROM Employees WHERE Department = 'IT';
  7. Maintain Clear and Consistent Naming Conventions:

    • Use descriptive and consistent names for tables and columns to make queries more readable and maintainable.
    -- Good Naming Convention CREATE TABLE CustomerOrders ( OrderID INT PRIMARY KEY, CustomerID INT, OrderDate DATE, TotalAmount DECIMAL(10, 2) );
  8. Monitor and Audit DML Operations:

    • Keep track of changes made to the data for auditing purposes and to identify potential issues.
    -- Example: Creating an audit table CREATE TABLE Employees_Audit ( AuditID INT PRIMARY KEY AUTO_INCREMENT, EmployeeID INT, Action VARCHAR(10), ActionDate DATETIME, ChangedBy VARCHAR(50) ); -- Example: Trigger to log updates CREATE TRIGGER trg_Employees_Update AFTER UPDATE ON Employees FOR EACH ROW BEGIN INSERT INTO Employees_Audit (EmployeeID, Action, ActionDate, ChangedBy) VALUES (OLD.EmployeeID, 'UPDATE', NOW(), USER()); END;

Handling Common DML Scenarios

a. Inserting Data with Conflict Handling

Example: Insert a new user only if the email does not already exist.

INSERT INTO Users (Username, Email, PasswordHash) VALUES ('newuser', 'newuser@example.com', 'hashedpassword') WHERE NOT EXISTS ( SELECT 1 FROM Users WHERE Email = 'newuser@example.com' );

Alternative in PostgreSQL:

INSERT INTO Users (Username, Email, PasswordHash) VALUES ('newuser', 'newuser@example.com', 'hashedpassword') ON CONFLICT (Email) DO NOTHING;

b. Updating Multiple Columns Based on a Condition

Example: Update the department and email of an employee.

UPDATE Employees SET Department = 'Research', Email = 'new.email@example.com' WHERE EmployeeID = 105;

c. Deleting Records Based on a Subquery

Example: Delete employees who have not placed any orders.

DELETE FROM Employees WHERE EmployeeID NOT IN ( SELECT DISTINCT EmployeeID FROM Orders );

Common Pitfalls and How to Avoid Them

  1. Omitting WHERE Clauses:

    • Forgetting to include a WHERE clause in UPDATE or DELETE statements can result in modifying or removing all records in a table.
    • Solution: Always double-check your queries to ensure that conditions are correctly specified.
  2. Overusing SELECT *:

    • Using SELECT * retrieves all columns, which can be inefficient, especially with large tables.
    • Solution: Specify only the columns you need.
    SELECT FirstName, LastName FROM Employees WHERE Department = 'HR';
  3. Not Handling NULL Values Appropriately:

    • Operations involving NULL values can lead to unexpected results.
    • Solution: Use functions like COALESCE or IS NULL to handle NULL values effectively.
    SELECT FirstName, LastName, COALESCE(Email, 'No Email') AS EmailStatus FROM Employees;
  4. Ignoring Data Types and Constraints:

    • Inserting data with incorrect data types or violating constraints can cause errors.
    • Solution: Ensure that the data being inserted or updated adheres to the table’s schema and constraints.
  5. Neglecting Transaction Management:

    • Failing to use transactions for related DML operations can lead to partial updates and data inconsistencies.
    • Solution: Use transactions to group related operations and maintain atomicity.
    BEGIN TRANSACTION; UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1; UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 2; COMMIT TRANSACTION;

Conclusion

Data Manipulation Language (DML) is indispensable for interacting with and managing the data within your SQL databases. By mastering DML commands—SELECT, INSERT, UPDATE, DELETE, and MERGE—you can efficiently retrieve, add, modify, and remove data as needed. Adhering to best practices, such as using transactions, validating data, optimizing queries, and implementing proper constraints, ensures that your data remains accurate, consistent, and secure.

Whether you're developing applications, performing data analysis, or administering databases, a solid understanding of DML empowers you to handle data effectively, maintain integrity, and support the dynamic needs of your organization.

TAGS
Coding Interview
System Design Interview
CONTRIBUTOR
Design Gurus Team

GET YOUR FREE

Coding Questions Catalog

Design Gurus Newsletter - Latest from our Blog
Boost your coding skills with our essential coding questions catalog.
Take a step towards a better tech career now!
Explore Answers
Is the system design interview book worth it?
Is the system design interview book worth it?
How to nail a Microsoft interview?
What comes under technical assessment?
Related Courses
Image
Grokking the Coding Interview: Patterns for Coding Questions
Image
Grokking Data Structures & Algorithms for Coding Interviews
Image
Grokking Advanced Coding Patterns for Interviews
Image
One-Stop Portal For Tech Interviews.
Copyright © 2024 Designgurus, Inc. All rights reserved.