Strategies for Removing Duplicate Entries in JavaScript Arrays

Removing duplicate values from an array is a frequent requirement in data processing. The techniques below differ in which equality algorithm they apply and in whether they mutate the input, and those differences show up in edge cases such as NaN.

1. Native Set Construction

The Set data structure inherently enforces uniqueness based on the ECMAScript SameValueZero algorithm. When combined with array conversion utilities, this provides a concise, non-mutating solution. Importantly, SameValueZero explicitly treats NaN as equivalent to itself, resolving the standard strict equality paradox (NaN !== NaN).

const originalList = [1, 2, 2, 'abc', 'abc', true, true, false, false, undefined, undefined, NaN, NaN];

const uniqueViaSet = Array.from(new Set(originalList));
console.log(uniqueViaSet);
// Output: [ 1, 2, 'abc', true, false, undefined, NaN ]
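
As a small illustration of what SameValueZero does and does not treat as equal, the sketch below runs a Set over a few edge cases. The `sample` array and the variable names are made up for this example and are not part of the original list.

```javascript
// Illustrative values only: signed zeros, NaN, a string/number pair, and two distinct objects.
const sample = [0, -0, NaN, NaN, '1', 1, {}, {}];

// Spread syntax is an equivalent way to turn the Set back into an array.
const uniqueSample = [...new Set(sample)];

// 0 and -0 collapse into one entry, as do the two NaN values.
// '1' and 1 stay separate (no type coercion), and the two object
// literals stay separate because objects are compared by reference.
console.log(uniqueSample.length); // 6
```

Objects with identical contents are therefore not deduplicated by this approach; only repeated references to the same object are.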

2. Dual-Loop Iteration with Array Mutation

A procedural approach involves nested iteration paired with direct index manipulation. By shifting indices downward after removal, the algorithm avoids skipping elements. This technique relies on strict equality (===), meaning pairs of NaN values will persist because they never satisfy the comparison condition.

function eliminateRepeatsMutate(source) {
  let length = source.length;
  
  for (let outer = 0; outer < length; outer++) {
    for (let inner = outer + 1; inner < length; inner++) {
      if (source[outer] === source[inner]) {
        source.splice(inner, 1);
        length--;
        inner--;
      }
    }
  }
  return source;
}

console.log(eliminateRepeatsMutate([...originalList]));
// Output: [ 1, 2, 'abc', true, false, undefined, NaN, NaN ]
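
One way to sidestep the index bookkeeping is to walk the inner loop backward, so that a removal only shifts elements that have already been checked. The function below is a hypothetical variant written for illustration, not the article's implementation; it keeps the same strict equality comparison and therefore the same NaN behavior.

```javascript
// Hypothetical variant: same in-place strategy, but the inner loop runs from the end.
function eliminateRepeatsBackward(source) {
  for (let outer = 0; outer < source.length; outer++) {
    for (let inner = source.length - 1; inner > outer; inner--) {
      if (source[outer] === source[inner]) {
        // Elements after `inner` were already compared, so no index fix-up is needed.
        source.splice(inner, 1);
      }
    }
  }
  return source;
}

const input = [1, 2, 2, 2, 3, 1, NaN, NaN];
const returned = eliminateRepeatsBackward(input);

console.log(returned === input); // true: the caller's array itself was modified
console.log(input);              // [ 1, 2, 3, NaN, NaN ]
```

Because the caller's array is modified, pass a copy (for example `[...list]`) when the original must be preserved.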

3. Linear Scanning via indexOf

Accumulating distinct values into a new array while checking the entries collected so far offers a non-mutating imperative alternative. The indexOf method scans linearly until it finds a match or reaches the end of the array. Like the mutation approach, it compares with strict equality, so no NaN ever matches a previously collected NaN and every one of them is kept.

function trackFirstOccurrence(dataset) {
  const collected = [];
  
  for (const current of dataset) {
    if (collected.indexOf(current) === -1) {
      collected.push(current);
    }
  }
  return collected;
}

console.log(trackFirstOccurrence([...originalList]));
// Output: [ 1, 2, 'abc', true, false, undefined, NaN, NaN ]

4. Inclusion Testing with includes

Replacing indexOf with includes reduces the membership test to a single boolean call. Array.prototype.includes is specified to compare with SameValueZero, so it detects NaN correctly without a separate Number.isNaN check.

function deduplicateWithIncludes(data) {
  const result = [];
  
  for (const value of data) {
    if (!result.includes(value)) {
      result.push(value);
    }
  }
  return result;
}

console.log(deduplicateWithIncludes([...originalList]));
// Output: [ 1, 2, 'abc', true, false, undefined, NaN ]
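
The difference between the two lookups can be checked directly. This is a minimal sketch; the `probe` array and variable names are invented for the comparison.

```javascript
const probe = [NaN, 0];

const nanViaIndexOf = probe.indexOf(NaN);      // -1: strict equality never matches NaN
const nanViaIncludes = probe.includes(NaN);    // true: SameValueZero matches NaN
const negZeroViaIndexOf = probe.indexOf(-0);   // 1: 0 === -0 under strict equality
const negZeroViaIncludes = probe.includes(-0); // true: SameValueZero also equates 0 and -0

console.log(nanViaIndexOf, nanViaIncludes, negZeroViaIndexOf, negZeroViaIncludes);
// -1 true 1 true
```

NaN is the only value on which the two methods disagree; signed zeros are treated as equal by both.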

5. First-Index Validation Using filter

A filter callback can keep an element only when its current position equals the index of its first occurrence. For every later occurrence, indexOf returns that earlier, lower index, so the comparison fails and the duplicate is dropped.

function selectByFirstAppearance(list) {
  return list.filter((element, position) => {
    return list.indexOf(element) === position;
  });
}

console.log(selectByFirstAppearance([...originalList]));
// Output: [ 1, 2, 'abc', true, false, undefined ]

Because indexOf never finds NaN, it returns -1, which cannot equal any position. The comparison is false for every NaN, so all of them are discarded from the result, including the first.
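
If the filter style is preferred but one NaN should survive, the first-index lookup can be done with findIndex and an explicit comparison. The sketch below is one possible variant under that assumption; `sameValueZero` and `selectByFirstAppearanceNaNSafe` are hypothetical names introduced here.

```javascript
// Hypothetical helper approximating SameValueZero: strict equality plus NaN matching NaN.
const sameValueZero = (a, b) => a === b || (Number.isNaN(a) && Number.isNaN(b));

function selectByFirstAppearanceNaNSafe(list) {
  return list.filter((element, position) => {
    // findIndex accepts a predicate, so the NaN-aware comparison can be plugged in.
    const firstIndex = list.findIndex((candidate) => sameValueZero(candidate, element));
    return firstIndex === position;
  });
}

const nanSafeResult = selectByFirstAppearanceNaNSafe([1, NaN, 1, NaN, 'a']);
console.log(nanSafeResult); // [ 1, NaN, 'a' ]
```

The cost is still quadratic in the worst case, since each element triggers its own scan from the start of the array.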

6. Key-Based Tracking with Map

A Map provides fast key lookups (sublinear on average, typically hash-based), so recording each value on first encounter deduplicates the array in a single pass instead of rescanning the results for every element. Like Set and includes, Map.prototype.has compares keys with SameValueZero.

function trackWithHash(data) {
  const seen = new Map();
  const output = [];
  
  for (const item of data) {
    if (!seen.has(item)) {
      seen.set(item, true);
      output.push(item);
    }
  }
  return output;
}

console.log(trackWithHash([...originalList]));
// Output: [ 1, 2, 'abc', true, false, undefined, NaN ]
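
Since the stored value is never read back, the same single-pass pattern can be written with a Set as the "seen" tracker. This is a sketch of that alternative, with `trackWithSet` as a hypothetical name; it relies on the same key comparison as the Map version.

```javascript
// Hypothetical variant: a Set records membership only, which is all the loop needs.
function trackWithSet(data) {
  const seen = new Set();
  const output = [];

  for (const item of data) {
    if (!seen.has(item)) {
      seen.add(item);
      output.push(item);
    }
  }
  return output;
}

const trackedResult = trackWithSet([1, '1', NaN, NaN, 1]);
console.log(trackedResult); // [ 1, '1', NaN ]
```

Keys keep their types in both Map and Set, so the number 1 and the string '1' are tracked separately.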

7. Property Lookup on Plain Objects

Plain objects can emulate a hash set because property keys are unique. Non-symbol values are coerced to strings when used as property names, so values that share a string form collide: 1 and '1', or true and 'true', are treated as duplicates of each other. Strings that match inherited property names such as 'toString' also produce false hits unless the lookup object has no prototype. The pattern is therefore only reliable for arrays that hold a single primitive type.

function deduplicateViaProperties(arrayData) {
  const distinctItems = [];
  // A prototype-less object avoids false hits on inherited keys such as 'toString'.
  const flags = Object.create(null);
  
  for (const val of arrayData) {
    // Property names are strings, so val is coerced (1 and '1' share one key).
    if (!flags[val]) {
      flags[val] = true;
      distinctItems.push(val);
    }
  }
  return distinctItems;
}

console.log(deduplicateViaProperties([...originalList]));
// Output: [ 1, 2, 'abc', true, false, undefined, NaN ]
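
The effect of key coercion and of inherited properties can be seen by running the same lookup logic against different flag objects. The helper `dedupeWithFlags` and the `mixed` array below are hypothetical and exist only for this comparison.

```javascript
// Hypothetical helper mirroring the property-lookup idea; the lookup object is
// passed in so that an object literal and a prototype-less object can be compared.
function dedupeWithFlags(values, flags) {
  const distinct = [];
  for (const val of values) {
    if (!flags[val]) {
      flags[val] = true;
      distinct.push(val);
    }
  }
  return distinct;
}

const mixed = [1, '1', true, 'true', 'toString'];

const viaLiteral = dedupeWithFlags(mixed, {});
const viaNullProto = dedupeWithFlags(mixed, Object.create(null));
const viaSet = [...new Set(mixed)];

console.log(viaLiteral);   // [ 1, true ]
console.log(viaNullProto); // [ 1, true, 'toString' ]
console.log(viaSet);       // [ 1, '1', true, 'true', 'toString' ]
```

With an object literal, 'toString' is dropped because the inherited method makes the lookup truthy. In both object-based runs, '1' and 'true' are lost to string coercion, while the Set keeps all five values.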

Selecting the appropriate strategy depends on mutation requirements, performance constraints, and specific handling needs for special numeric values.
