Git word-diff
If you can’t easily spot what has changed between two long lines of code or data when viewing the output of git diff, you can use word-diff, which shows changes at a “word” level instead of per line:
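A minimal sketch of the basic invocation, run in a throwaway repository. The file name and its contents are made up for illustration:

```shell
# Scratch repository so nothing real is touched (hypothetical file and text).
repo=$(mktemp -d)
git init -q "$repo"
printf 'The quick brown fox jumps over the lazy dog\n' > "$repo/demo.txt"
git -C "$repo" add demo.txt          # stage the original version
printf 'The quick brown cat jumps over the lazy dog\n' > "$repo/demo.txt"

# Compare the working tree against the staged version, word by word.
out=$(git -C "$repo" diff --word-diff)
printf '%s\n' "$out"
```

In the default output mode, removed words are wrapped in [-…-] and added words in {+…+}, so only the changed word is marked rather than the whole line.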
It does this by having a default definition of what counts as a word, but you can override this by defining your own pattern with a regular expression (or regex). With a regex pattern that matches any single character (the dot) you can see changes per-character within the line:
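As a rough sketch of the per-character variant, again in a scratch repository with an invented one-line data file:

```shell
# Scratch repository with a made-up data file.
repo=$(mktemp -d)
git init -q "$repo"
printf 'id=12345;status=active\n' > "$repo/data.txt"
git -C "$repo" add data.txt
printf 'id=12945;status=active\n' > "$repo/data.txt"

# The dot makes every single character its own "word".
# --word-diff-regex implies --word-diff, so that flag is not repeated here.
out=$(git -C "$repo" diff --word-diff-regex='.')
printf '%s\n' "$out"
```

Only the one digit that changed is marked; the rest of the line is printed as context.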
When comparing CSV files, separating on the delimiter (comma) and the quotation marks is very useful:
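One possible pattern for this, offered as an assumption rather than the only way to do it: treat every run of characters that is neither a comma nor a double quote as a word. The CSV row below is invented:

```shell
# Scratch repository with a made-up CSV row.
repo=$(mktemp -d)
git init -q "$repo"
printf '1,"Alice",30,NY\n' > "$repo/people.csv"
git -C "$repo" add people.csv
printf '1,"Alicia",30,NY\n' > "$repo/people.csv"

# Words are runs of characters other than comma and double quote
# (an illustrative pattern; adjust it for other delimiters).
out=$(git -C "$repo" diff --word-diff-regex='[^,"]+')
printf '%s\n' "$out"
```

With this pattern the delimiters themselves are not words, so Git ignores them when comparing and marks only the changed field value.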
The following seems to work well for JSON data where curly brackets define an object and commas separate the key-value pairs within an object:
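A sketch of one pattern along these lines, assuming single-line JSON objects: every run of characters that is not a curly bracket or a comma counts as a word, so each key-value pair is compared as a unit. The sample object is made up:

```shell
# Scratch repository with a made-up single-line JSON object.
repo=$(mktemp -d)
git init -q "$repo"
printf '{"id":1,"name":"Alice","age":30}\n' > "$repo/person.json"
git -C "$repo" add person.json
printf '{"id":1,"name":"Alice","age":31}\n' > "$repo/person.json"

# Words are runs of characters other than '{', '}' and ','
# (an illustrative pattern, not necessarily the best one for nested data).
out=$(git -C "$repo" diff --word-diff-regex='[^{},]+')
printf '%s\n' "$out"
```

The changed pair is marked as a whole, which keeps the key visible next to the old and new value.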
(Note that Git uses POSIX “extended” regular expressions and not PCRE (Perl Compatible Regular Expressions), e.g. [[:space:]] instead of \s for the whitespace character class.)
The default regex pattern used when you don’t specify one differs per file format (it comes from the diff driver assigned to the file), but generally splits on whitespace characters while not splitting up multi-byte Unicode characters [source], for example:
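To illustrate the difference, here is a small sketch that diffs the same change twice: once with no diff driver assigned, and once after assigning Git's built-in cpp driver to *.c files through a .gitattributes file. The C snippet and file names are invented for the demo:

```shell
# Scratch repository with a made-up line of C code.
repo=$(mktemp -d)
git init -q "$repo"
printf 'total = foo(bar) + 1;\n' > "$repo/demo.c"
git -C "$repo" add demo.c
printf 'total = foo(baz) + 1;\n' > "$repo/demo.c"

# No driver assigned: words are runs of non-whitespace characters.
plain_out=$(git -C "$repo" diff --word-diff)
printf '%s\n' "$plain_out"

# Assign the built-in cpp driver to *.c files, then diff again.
printf '*.c diff=cpp\n' > "$repo/.gitattributes"
driver_out=$(git -C "$repo" diff --word-diff)
printf '%s\n' "$driver_out"
```

Without a driver the whole token foo(bar) is reported as changed; with the cpp driver, identifiers and punctuation are separate words, so only the argument is marked.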
Addendum: What’s even more useful is that you can use git’s diff command to compare any files, even when they are not in a git repo, by adding --no-index:
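A minimal sketch, using two made-up files in a plain temporary directory that is not a repository:

```shell
# Two loose files outside any repository (hypothetical names and contents).
dir=$(mktemp -d)
printf 'host=example.org port=8080 mode=debug\n' > "$dir/old.conf"
printf 'host=example.org port=9090 mode=debug\n' > "$dir/new.conf"

# With --no-index, git diff exits with status 1 when the files differ,
# so "|| true" keeps a strict (set -e) script from stopping here.
out=$(git diff --no-index --word-diff "$dir/old.conf" "$dir/new.conf") || true
printf '%s\n' "$out"
```

The same --word-diff-regex patterns can be combined with --no-index in the same way.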