Generating a Unified Diff in Jest

This challenge focuses on creating a utility function that generates a unified diff string between two TypeScript code snippets. Unified diffs are a standard way to represent changes between versions of a file, and this exercise will test your ability to split text into lines, identify the differences between them, and format the result according to the unified diff format. This is useful for automated code review, change tracking, and generating human-readable diffs for debugging.

Problem Description

You are tasked with creating a TypeScript function generateUnifiedDiff that takes two strings representing TypeScript code snippets as input and returns a unified diff string representing the differences between them. The function should adhere to the unified diff format, including the two file headers, a hunk header (the @@ line) that carries the line ranges, and a one-character indicator at the start of each line: + for added lines, - for removed lines, and a single space for unchanged lines.

Key Requirements:

  • Unified Diff Format: The output must strictly adhere to the unified diff format. This includes the header line for the original file (e.g., --- a/original.ts), the header line for the modified file (e.g., +++ b/modified.ts), a hunk header with the line ranges (e.g., @@ -1,2 +1,2 @@), and the diff lines, each starting with its indicator.
  • Line-by-Line Comparison: The diff should be generated by comparing the input strings line by line.
  • Accurate Change Identification: The function must correctly identify added, removed, and unchanged lines.
  • Header Generation: The header lines should be dynamically generated based on the input file names (or default names if none are provided).
  • No External Dependencies (besides Jest for testing): You should not use any external diffing libraries. The goal is to implement the diff logic yourself.
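
One way to meet the line-by-line comparison and change identification requirements without a diffing library is a longest-common-subsequence (LCS) table. The sketch below is only an illustration under that assumption: the names DiffOp and diffLines are hypothetical and not part of the challenge, and the quadratic table is a simple choice that suits small inputs rather than the algorithm a production diff tool would use.

```typescript
// Hypothetical helper type: one output line of the diff with its indicator.
type DiffOp = { kind: " " | "-" | "+"; text: string };

// Hypothetical helper: compares two arrays of lines using an LCS table.
function diffLines(a: string[], b: string[]): DiffOp[] {
  const n = a.length;
  const m = b.length;

  // lcs[i][j] holds the LCS length of a[i..] and b[j..].
  const lcs: number[][] = Array.from({ length: n + 1 }, () =>
    new Array<number>(m + 1).fill(0)
  );
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk the table from the start; on ties prefer a removal so that
  // "-" lines are emitted before the "+" lines that replace them.
  const ops: DiffOp[] = [];
  let i = 0;
  let j = 0;
  while (i < n && j < m) {
    if (a[i] === b[j]) {
      ops.push({ kind: " ", text: a[i] });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      ops.push({ kind: "-", text: a[i] });
      i++;
    } else {
      ops.push({ kind: "+", text: b[j] });
      j++;
    }
  }
  // Whatever remains on either side is a pure removal or addition.
  while (i < n) ops.push({ kind: "-", text: a[i++] });
  while (j < m) ops.push({ kind: "+", text: b[j++] });
  return ops;
}
```

Joining each entry as kind + text with "\n" yields the body lines of the diff. Lines are compared with ===, so a change in indentation alone makes two lines differ.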

Expected Behavior:

The function should return a string containing the unified diff. If the input strings are identical, it should still return the two file headers and a hunk in which every line is marked as unchanged (see Example 3). If there are differences, it should accurately represent those changes in the unified diff format.

Edge Cases to Consider:

  • Empty Input Strings: Handle cases where one or both input strings are empty.
  • Identical Input Strings: Return a diff indicating no changes.
  • Large Input Strings: Consider potential performance implications for very large input strings (though extreme optimization is not required for this challenge).
  • Whitespace Differences: The diff should accurately reflect differences even if they are solely due to whitespace changes (e.g., indentation).
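
The empty-input case is easy to get wrong because "".split("\n") returns [""], a single empty line, rather than an empty array. A minimal sketch of one way to handle it follows; the name splitLines is hypothetical, and treating an empty string as zero lines is an assumption, since the challenge does not specify the expected output for empty inputs.

```typescript
// Hypothetical helper: turns a snippet into an array of lines.
// Assumption: an empty string represents a file with zero lines.
function splitLines(text: string): string[] {
  if (text === "") {
    return [];
  }
  // Whitespace inside each line is kept as-is, so indentation-only
  // changes remain visible to the comparison step.
  return text.split("\n");
}
```

With this helper, an empty original and a non-empty modified snippet produce only added lines, and two empty snippets produce no lines at all.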

Examples

Example 1:

Input:
original: "const x = 1;\nconsole.log(x);"
modified: "const x = 2;\nconsole.log(x);"
Output:
"--- a/original.ts\n+++ b/modified.ts\n@@ -1,2 +1,2 @@\n-const x = 1;\n+const x = 2;\n console.log(x);"
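Explanation: Only the value of 'x' changed, so the first line appears twice: once with '-' (the old version) and once with '+' (the new version). The unchanged second line starts with a single space. The @@ line gives the line ranges: 2 lines starting at line 1 in both files.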

Example 2:

Input:
original: "const x = 1;\nconsole.log(x);"
modified: "const x = 1;\nconsole.log(x);\nconst y = 3;"
Output:
"--- a/original.ts\n+++ b/modified.ts\n@@ -1,2 +1,3 @@\n const x = 1;\n console.log(x);\n+const y = 3;"
Explanation: A new line was added at the end, so it's marked with a '+', and the modified range in the @@ line grows from 2 lines to 3.

Example 3: (Edge Case - Identical Strings)

Input:
original: "const x = 1;\nconsole.log(x);"
modified: "const x = 1;\nconsole.log(x);"
Output:
"--- a/original.ts\n+++ b/modified.ts\n@@ -1,2 +1,2 @@\n const x = 1;\n console.log(x);"
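Explanation: The strings are identical, so every line is marked as unchanged, but the output still includes the file headers and the @@ line. Note that this is a requirement of this challenge; tools such as diff -u print nothing for identical inputs.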

Constraints

  • Input String Length: The maximum length of each input string is 1000 characters.
  • Input Type: Input must be strings.
  • Performance: The function should complete within 100ms for the given input length constraint.
  • Output Format: The output must be a valid unified diff string.

Notes

  • Consider splitting the input strings into arrays of lines for easier comparison.
  • The unified diff format is well-documented online. Refer to it for precise formatting requirements.
  • Focus on correctness first, then consider optimizing for performance if needed.
  • You can assume the input strings are valid TypeScript code (no need to perform syntax validation).
  • The file names in the header can be defaults like "original.ts" and "modified.ts" if no other names are provided.
  • The @@ lines have the form @@ -oldStart,oldCount +newStart,newCount @@, giving the starting line and the number of lines covered in the original and modified files. Calculate these correctly based on the changes.
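
The range calculation in the last note can be sketched as follows, assuming the whole file is emitted as a single hunk, as in the examples. DiffOp and hunkHeader are hypothetical names, and always printing the count (even when it is 1) is an assumption chosen to match the example outputs. In the unified format, a range that covers zero lines conventionally uses 0 as its start.

```typescript
// Hypothetical helper type: one output line of the diff with its indicator.
type DiffOp = { kind: " " | "-" | "+"; text: string };

// Hypothetical helper: builds the @@ line for a single hunk that spans
// every line of both files.
function hunkHeader(ops: DiffOp[]): string {
  let oldCount = 0;
  let newCount = 0;
  for (const op of ops) {
    if (op.kind !== "+") oldCount++; // unchanged and removed lines exist in the original
    if (op.kind !== "-") newCount++; // unchanged and added lines exist in the modified file
  }
  // An empty side is written as start 0, count 0.
  const oldStart = oldCount > 0 ? 1 : 0;
  const newStart = newCount > 0 ? 1 : 0;
  return `@@ -${oldStart},${oldCount} +${newStart},${newCount} @@`;
}
```

The full output is then the two file headers, this line, and each entry rendered as kind + text, joined with "\n".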