Class DiffAlgorithm

  • Direct Known Subclasses:
    LowLevelDiffAlgorithm

    public abstract class DiffAlgorithm
    extends java.lang.Object
    Compares two Sequences to create an EditList of changes.

    An algorithm's diff method must be callable from concurrent threads without data collisions. This permits some algorithms to use a singleton pattern, with concurrent invocations using the same singleton. Other algorithms may support parameterization, in which case the caller can create a unique instance per thread.

    • Constructor Detail

      • DiffAlgorithm

        public DiffAlgorithm()
    • Method Detail

      • getAlgorithm

        public static DiffAlgorithm getAlgorithm(DiffAlgorithm.SupportedAlgorithm alg)
        Returns an implementation of the requested diff algorithm.
        Parameters:
        alg - the diff algorithm for which an implementation should be returned
        Returns:
        an implementation of the specified diff algorithm
      • diff

        public <S extends Sequence> EditList diff(SequenceComparator<? super S> cmp,
                                                  S a,
                                                  S b)
        Compare two sequences and identify a list of edits between them.
        Parameters:
        cmp - the comparator supplying the element equivalence function.
        a - the first (also known as old or pre-image) sequence. Edits returned by this algorithm will reference indexes using the 'A' side: Edit.getBeginA(), Edit.getEndA().
        b - the second (also known as new or post-image) sequence. Edits returned by this algorithm will reference indexes using the 'B' side: Edit.getBeginB(), Edit.getEndB().
        Returns:
        a modifiable edit list comparing the two sequences. If empty, the sequences are identical according to cmp's rules. The result list is never null.
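        The A-side and B-side index ranges of a returned edit can be read as in the following minimal sketch. It assumes each range is half-open (begin inclusive, end exclusive) and does not use the library: EditRegionSketch, Kind and classify are hypothetical names chosen for illustration.

```java
final class EditRegionSketch {
    // Hypothetical stand-in for the kinds of edit an index pair can describe.
    enum Kind { INSERT, DELETE, REPLACE, EMPTY }

    /** Classifies an edit from its A range [beginA, endA) and B range [beginB, endB). */
    static Kind classify(int beginA, int endA, int beginB, int endB) {
        boolean touchesA = beginA < endA; // elements taken out of the old sequence
        boolean touchesB = beginB < endB; // elements added in the new sequence
        if (touchesA && touchesB) {
            return Kind.REPLACE;
        }
        if (touchesA) {
            return Kind.DELETE;
        }
        if (touchesB) {
            return Kind.INSERT;
        }
        return Kind.EMPTY;
    }

    public static void main(String[] args) {
        // Three old elements removed, nothing added.
        System.out.println(classify(0, 3, 0, 0));
        // Nothing removed, two new elements added at position 5 of B.
        System.out.println(classify(5, 5, 5, 7));
    }
}
```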
      • coverEdit

        private static <S extends Sequence> Edit coverEdit(S a,
                                                           S b)
      • normalize

        private static <S extends Sequence> EditList normalize(SequenceComparator<? super S> cmp,
                                                               EditList e,
                                                               S a,
                                                               S b)
        Reorganize an EditList for better diff consistency.

        DiffAlgorithms may return Edit.Type.INSERT or Edit.Type.DELETE edits that can be "shifted". For example, the deleted section

         -a
         -b
         -c
          a
          b
          c
         
        can be shifted down by 1, 2 or 3 locations.

        To avoid later merge issues, we shift such edits to a consistent location. normalize uses a simple strategy of shifting such edits to their latest possible location.

        This strategy may not always produce an aesthetically pleasing diff. For instance, it works well with

          function1 {
           ...
          }
        
         +function2 {
         + ...
         +}
         +
          function3 {
           ...
          }
         
        but less so for
          #
          # comment1
          #
          function1() {
          }
        
          #
         +# comment3
         +#
         +function3() {
         +}
         +
         +#
          # comment2
          #
          function2() {
          }
         
        More sophisticated strategies are possible, say by calculating a suitable "aesthetic cost" for each possible position and using the lowest cost, but normalize just shifts edits to the end as much as possible.
        Type Parameters:
        S - type of sequence being compared.
        Parameters:
        cmp - the comparator supplying the element equivalence function.
        e - a modifiable edit list comparing the provided sequences.
        a - the first (also known as old or pre-image) sequence.
        b - the second (also known as new or post-image) sequence.
        Returns:
        a modifiable edit list with edit regions shifted to their latest possible location. The result list is never null.
        Since:
        4.7
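        The "shift to the latest possible location" idea can be illustrated with a small standalone sketch. It is not the library's implementation: it handles only a single non-empty deleted region in a list of strings, and ShiftSketch, maxShiftDown and the limit parameter (the index the region must not move past, for example the start of the next edit) are hypothetical names.

```java
import java.util.List;

final class ShiftSketch {
    /**
     * Returns how many positions the deleted region a[begin, end) can move down.
     * Moving down by one is possible while the element leaving the top of the
     * region equals the element entering at its bottom. Requires begin < end.
     */
    static int maxShiftDown(List<String> a, int begin, int end, int limit) {
        if (begin >= end) {
            throw new IllegalArgumentException("region must be non-empty");
        }
        int shift = 0;
        while (end + shift < limit
                && a.get(begin + shift).equals(a.get(end + shift))) {
            shift++;
        }
        return shift;
    }

    public static void main(String[] args) {
        // The a, b, c example from the text: the first three lines are deleted.
        List<String> a = List.of("a", "b", "c", "a", "b", "c");
        int shift = maxShiftDown(a, 0, 3, a.size());
        System.out.println("shift down by " + shift
                + ", region becomes [" + shift + ", " + (3 + shift) + ")");
    }
}
```

        Shifting by the maximum amount, rather than stopping at the first valid position, is what makes the chosen location consistent from one run to the next.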
      • diffNonCommon

        public abstract <S extends Sequence> EditList diffNonCommon(SequenceComparator<? super S> cmp,
                                                                    S a,
                                                                    S b)
        Compare two sequences and identify a list of edits between them. This method should be invoked only after the two sequences have been proven to have no common starting or ending elements. The expected elimination of common starting and ending elements is automatically performed by the diff(SequenceComparator, Sequence, Sequence) method, which invokes this method using Subsequences.
        Parameters:
        cmp - the comparator supplying the element equivalence function.
        a - the first (also known as old or pre-image) sequence. Edits returned by this algorithm will reference indexes using the 'A' side: Edit.getBeginA(), Edit.getEndA().
        b - the second (also known as new or post-image) sequence. Edits returned by this algorithm will reference indexes using the 'B' side: Edit.getBeginB(), Edit.getEndB().
        Returns:
        a modifiable edit list comparing the two sequences.
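        The elimination of common starting and ending elements described for this method can be sketched on plain lists. This is an illustration only, assuming String elements compared with equals; CommonEndsSketch and commonEnds are hypothetical names, not part of the library.

```java
import java.util.List;

final class CommonEndsSketch {
    /** Returns {prefixLength, suffixLength} of elements shared at both ends. */
    static int[] commonEnds(List<String> a, List<String> b) {
        int max = Math.min(a.size(), b.size());

        // Count equal elements from the front.
        int prefix = 0;
        while (prefix < max && a.get(prefix).equals(b.get(prefix))) {
            prefix++;
        }

        // Count equal elements from the back, never overlapping the prefix.
        int suffix = 0;
        while (suffix < max - prefix
                && a.get(a.size() - 1 - suffix).equals(b.get(b.size() - 1 - suffix))) {
            suffix++;
        }
        return new int[] { prefix, suffix };
    }

    public static void main(String[] args) {
        List<String> a = List.of("x", "y", "old", "z");
        List<String> b = List.of("x", "y", "new1", "new2", "z");
        int[] ends = commonEnds(a, b);

        // Only these middle regions would be left for the inner comparison.
        List<String> midA = a.subList(ends[0], a.size() - ends[1]);
        List<String> midB = b.subList(ends[0], b.size() - ends[1]);
        System.out.println(midA + " vs " + midB);
    }
}
```

        After trimming, the remaining middle regions share no first or last element, which is the precondition this method states.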