Fading Coder

One Final Commit for the Last Sprint

Home > Notes > Content

SQL Fundamentals: Data Definition and Query Foundations

Notes May 8 3

Core Concepts of Structured Query Language

Structured Query Language (SQL) serves as the universal interface for interacting with relational databases. Its capabilities extend beyond simple queries to encompass schema definition, data manipulation, security enforcement, and integrity control. This material examines the essential DDL and DML features defined in the SQL-92 standard.

Defining Database Structures

The Data Definition Language (DDL) communicates the logical structure of a database to the system. It specifies relation schemas, attribute domains, integrity constraints, and storage details. The fundamental operations for managing these schemas are presented below.

Supported Data Types

SQL defines a set of core data types. Choosing the correct type is critical for storage efficiency and data accuracy.

  • character(n) or char(n): A fixed-width string padded with spaces.
  • character varying(n) or varchar(n): A variable-width string with a maximum length. This is general recommended over char to avoid pitfalls when comparing strings of unequal lengths, where behavior can vary across database systems.
  • integer or int: A typical integer value.
  • smallint: A smaller integer value.
  • numeric(p, d): A fixed-point number with p total digits, where d digits follow the decimal point.
  • real and double precision: Approximate floating-point numbers.
  • float(n): A floating-point number with a user-defined minimum precision.

Every type supports a special null marker, signifying a missing or unknown value.

Schema Management Statements

The create table command establishes a new relation.

create table department
 ( dept_id    varchar(8),
   name       varchar(40) not null,
   building   varchar(15),
   budget     numeric(10,2),
   primary key (dept_id)
 );

create table instructor
 ( id         varchar(5),
   full_name  varchar(25) not null,
   dept_id    varchar(8),
   salary     numeric(8,2),
   primary key (id),
   foreign key (dept_id) references department
 );

To remove a relation entirely, use drop table relation_name;. To modify an existing relation's structure, employ alter table.

alter table instructor add office_location varchar(10);
alter table instructor drop office_location;

Data is loaded into these schemas with the insert statement.

insert into instructor
values('10211', 'Elena Rossi', 'CS', 75000);

Querying Data with Select Statements

The core of data retrieval relies on the select, from, and where clauses. A query processes the relations defined in the from clause, filters tuples using where predicates, and constructs an output relation from the select list.

A simple query on one table might retrieve all department names:

select distinct building
from department;

The select clause can also evaluate expressions.

select id, full_name, dept_id, salary * 1.08 as projected_salary
from instructor;

For queries involving multiple tables, the join condition is typically specified in the where clause.

select instructor.full_name, department.building
from instructor, department
where instructor.dept_id = department.dept_id;

Joining Relations Explicitly

Instead of listing tables in the from clause and using where for join conditions, explicit join operators can be used. The natural join operation automatically matches tuples on columns that share the same name.

select full_name, course_id
from instructor natural join teaches;

To specify the join columns without relying on all identically named attributes, the join ... using construct is available.

select full_name, course_id
from instructor join teaches using (id);

Refining Queries with Additional Operations

Aliasing and Self-Joins

The as keyword renames columns for output clarity or tables to enable self-joins, which compare tuples within the same relation.

select distinct A.full_name as senior_faculty, B.full_name as junior_faculty
from instructor as A, instructor as B
where A.salary > B.salary and B.dept_id = 'Math';

Pattern Matching and Ordering

String matching uses the like operator with % (any substring) and _ (any single character).

select name
from department
where building like '%Hall%';

Results are ordered with order by, specifying asc (default) or desc.

select *
from instructor
order by salary desc, full_name asc;

Tuple Comparisons

A compact syntax allows comparing multi-attribute tuples using lexicographic ordering.

select full_name, course_id
from instructor, teaches
where (instructor.id, instructor.dept_id) = (teaches.id, 'Physics');

This is equivalent to where instructor.id = teaches.id and instructor.dept_id = 'Physics'.

Set Operations

SQL replicates set theory operations using union, intersect, and except. All three automatically eliminate duplicate tuples unless all is specified.

To find all courses offered in Fall 2023 or Spring 2024:

(select course_id from section where semester = 'Fall' and year = 2023)
union
(select course_id from section where semester = 'Spring' and year = 2024);

Working with Nulls and Aggregation

Null values introduce three-valued logic (true, false, unknown). Any arithmetic expression with a null operand yields null. A comparison involving null is evaluated as unknown. To test for null explicitly, use is null or is not null.

Aggregate Functions

Functions like avg, min, max, sum, and count compute a single value from a collection. The group by clause partitions data into sets for aggregation, and having filters those groups after they are formed.

select dept_id, avg(salary) as average_salary
from instructor
group by dept_id
having avg(salary) > 65000;

All aggregate functions except count(*) ignore nulls. An empty set evaluated by count returns 0, while other functions return null.

Subqueries and Complex Predicates

A subquery is a select-from-where expression nested inside another query. They enable sophisticated checks for set membership, comparison, and emptiness.

Membership and Comparison

Use in to test membership in a subquery's result set.

select distinct title
from course
where dept_id = 'CS' and course_id in
  (select course_id from section where semester = 'Spring' and year = 2024);

The some and all keywords compare a value against a set. For instance, > all checks if a value exceeds every element in a subquery's result.

Existence and Uniqueness Checks

The exists keyword returns true if a subquery yields any rows; not exists checks for an empty result. To verify that all elements of one set are contained in another, a common pattern is not exists (B except A).

The unique construct tests for duplicate tuples in a subquery's output.

Subqueries in Other Clauses

Subqueries that return a single value, known as scalar subqueries, can appear in the select, where, or even the set clause of an update.

select dept_id,
  (select count(*) from instructor where department.dept_id = instructor.dept_id) as faculty_count
from department;

The with clause defines a temporary named query that improves readability for complex operations.

with dept_funds(dept_id, total_budget) as
  (select dept_id, sum(salary) from instructor group by dept_id)
select dept_id
from dept_funds
where total_budget > (select avg(total_budget) from dept_funds);

Modifying the Database State

Deleting Tuples

The delete command removes entire rows from a single table.

delete from instructor
where salary < (select avg(salary) from instructor);

Inserting Tuples

New rows can be provided as literal values or the result of a query.

insert into course(course_id, title, dept_id, credits)
values('CS-437', 'Database Systems', 'CS', 4);

insert into instructor
select id, full_name, dept_id, 25000
from student
where dept_id = 'Art' and total_cred > 120;

When inserting based on a query, the source query is evaluated completely before any rows are added to the target table, preventing infinite loops.

Updating Tuples

The update statement modifies attribute values. A powerful case expression allows condittional updates in a single pass.

update instructor
set salary = case
   when salary <= 80000 then salary * 1.04
   else salary * 1.02
end;

Related Articles

Deploying a Maven Web Application to Tomcat 9 Using the Tomcat Manager

Tomcat 9 does not provide a dedicated Maven plugin. The Tomcat Manager interface, however, is backward-compatible, so the Tomcat 7 Maven Plugin can be used to deploy to Tomcat 9. This guide shows two...

Skipping Errors in MySQL Asynchronous Replication

When a replica halts because the SQL thread encounters an error, you can resume replication by skipping the problematic event(s). Two common approaches are available. Methods to Skip Errors 1) Skip a...

Spring Boot MyBatis with Two MySQL DataSources Using Druid

Required dependencies application.properties: define two data sources and poooling Java configuration for both data sources MyBatis mappers for each data source Controller endpoints to verify both co...

Leave a Comment

Anonymous

◎Feel free to join the discussion and share your thoughts.