Excel Duplicate Identification and SQL Join Techniques
Identifying Duplicate Entries in Excel
Three methods to locate repeated data in a column:
- Conditional Formatting: Select the target column. Navigate to
Home>Conditional Formatting>Highlight Cells Rules>Duplicate Values. Choose a foramtting style to visually mark duplicates. - COUNTIF Function: In an adjacent column, apply a formula like
=COUNTIF($A$2:$A$100, A2)>1. This returnsTRUEfor duplicate entries. Adjust the range as needed. - PivotTable Summary: Insert a PivotTable from the target column. Drag the column field to both the Rows and Values areas. Set the Value Field Settings to Count. The resulting table displays the frequency of each entry.
SQL Statement Order vs. Execution Sequence
The logical writing order of a SQL query differs from its actual execution plan.
Written Order:
SELECT [columns]
FROM [table]
[INNER|LEFT|RIGHT] JOIN [other_table] ON [join_condition]
WHERE [filter_conditions]
GROUP BY [grouping_columns]
HAVING [group_filter_conditions]
ORDER BY [sort_columns];
Execution Order:
FROM: Specifies the source tables and performs joins.WHERE: Filters rows based on conditions.GROUP BY: Aggregates data into groups.HAVING: Filters aggregated groups.SELECT: Projects the final columns and computes expressions.ORDER BY: Sorts the result set.
Example: Find employees with over 5 years of service and an average project score above 90, ordered by score descending.
SELECT full_name, AVG(project_score) AS avg_score
FROM employees
WHERE years_of_service > 5
GROUP BY full_name
HAVING AVG(project_score) > 90
ORDER BY avg_score DESC;
Finding Maximum Salary per Department with SQL
Given a departments table (dept_id, dept_name) and a staff_salaries table (emp_id, dept_id, compensation), retrieve the highest salary in each department.
WITH DeptMaxSalaries AS (
SELECT
d.dept_name,
MAX(s.compensation) AS max_compensation
FROM departments d
JOIN staff_salaries s ON d.dept_id = s.dept_id
GROUP BY d.dept_name
)
SELECT *
FROM DeptMaxSalaries;