Package at.letto.tools
Class diff_match_patch
java.lang.Object
at.letto.tools.diff_match_patch
Class containing the diff, match and patch methods.
Also contains the behaviour settings.
-
Nested Class Summary
Nested Classes
static class diff_match_patch.Diff - Class representing one diff operation.
protected static class diff_match_patch.LinesToCharsResult - Internal class for returning results from diff_linesToChars().
static enum diff_match_patch.Operation - The data structure representing a diff is a Linked list of Diff objects: {Diff(Operation.DELETE, "Hello"), Diff(Operation.INSERT, "Goodbye"), Diff(Operation.EQUAL, " world.")}
static class diff_match_patch.Patch - Class representing one patch operation.
-
Field Summary
Fields
private Pattern BLANKLINEEND
private Pattern BLANKLINESTART
short Diff_EditCost - Cost of an empty edit operation in terms of edit characters.
float Diff_Timeout - Number of seconds to map a diff before giving up (0 for infinity).
int Match_Distance - How far to search for a match (0 = exact location, 1000+ = broad match).
private short Match_MaxBits - The number of bits in an int.
float Match_Threshold - At what point is no match declared (0.0 = perfection, 1.0 = very loose).
float Patch_DeleteThreshold - When deleting a large block of text (over ~64 characters), how close do the contents have to be to match the expected contents. (0.0 = perfection, 1.0 = very loose).
short Patch_Margin - Chunk size for context length.
-
Constructor Summary
Constructors
diff_match_patch()
-
Method Summary
protected LinkedList<diff_match_patch.Diff> diff_bisect(String text1, String text2, long deadline) - Find the 'middle snake' of a diff, split the problem in two and return the recursively constructed diff.
private LinkedList<diff_match_patch.Diff> diff_bisectSplit(String text1, String text2, int x, int y, long deadline) - Given the location of the 'middle snake', split the diff in two parts and recurse.
protected void diff_charsToLines(LinkedList<diff_match_patch.Diff> diffs, List<String> lineArray) - Rehydrate the text in a diff from a string of line hashes to real lines of text.
void diff_cleanupEfficiency(LinkedList<diff_match_patch.Diff> diffs) - Reduce the number of edits by eliminating operationally trivial equalities.
void diff_cleanupMerge(LinkedList<diff_match_patch.Diff> diffs) - Reorder and merge like edit sections.
void diff_cleanupSemantic(LinkedList<diff_match_patch.Diff> diffs) - Reduce the number of edits by eliminating semantically trivial equalities.
void diff_cleanupSemanticLossless(LinkedList<diff_match_patch.Diff> diffs) - Look for single edits surrounded on both sides by equalities which can be shifted sideways to align the edit to a word boundary.
private int diff_cleanupSemanticScore(String one, String two) - Given two strings, compute a score representing whether the internal boundary falls on logical boundaries.
protected int diff_commonOverlap(String text1, String text2) - Determine if the suffix of one string is the prefix of another.
int diff_commonPrefix(String text1, String text2) - Determine the common prefix of two strings.
int diff_commonSuffix(String text1, String text2) - Determine the common suffix of two strings.
private LinkedList<diff_match_patch.Diff> diff_compute(String text1, String text2, boolean checklines, long deadline) - Find the differences between two texts.
LinkedList<diff_match_patch.Diff> diff_fromDelta(String text1, String delta) - Given the original text1, and an encoded string which describes the operations required to transform text1 into text2, compute the full diff.
protected String[] diff_halfMatch(String text1, String text2) - Do the two texts share a substring which is at least half the length of the longer text?
private String[] diff_halfMatchI(String longtext, String shorttext, int i) - Does a substring of shorttext exist within longtext such that the substring is at least half the length of longtext?
int diff_levenshtein(LinkedList<diff_match_patch.Diff> diffs) - Compute the Levenshtein distance; the number of inserted, deleted or substituted characters.
private LinkedList<diff_match_patch.Diff> diff_lineMode(String text1, String text2, long deadline) - Do a quick line-level diff on both strings, then rediff the parts for greater accuracy.
protected diff_match_patch.LinesToCharsResult diff_linesToChars(String text1, String text2) - Split two texts into a list of strings.
private String diff_linesToCharsMunge(String text, List<String> lineArray, Map<String, Integer> lineHash) - Split a text into a list of strings.
LinkedList<diff_match_patch.Diff> diff_main(String text1, String text2) - Find the differences between two texts.
LinkedList<diff_match_patch.Diff> diff_main(String text1, String text2, boolean checklines) - Find the differences between two texts.
private LinkedList<diff_match_patch.Diff> diff_main(String text1, String text2, boolean checklines, long deadline) - Find the differences between two texts.
String diff_prettyHtml(LinkedList<diff_match_patch.Diff> diffs) - Convert a Diff list into a pretty HTML report.
String diff_text1(LinkedList<diff_match_patch.Diff> diffs) - Compute and return the source text (all equalities and deletions).
String diff_text2(LinkedList<diff_match_patch.Diff> diffs) - Compute and return the destination text (all equalities and insertions).
String diff_toDelta(LinkedList<diff_match_patch.Diff> diffs) - Crush the diff into an encoded string which describes the operations required to transform text1 into text2.
int diff_xIndex(LinkedList<diff_match_patch.Diff> diffs, int loc) - loc is a location in text1, compute and return the equivalent location in text2.
match_alphabet(String pattern) - Initialise the alphabet for the Bitap algorithm.
protected int match_bitap(String text, String pattern, int loc) - Locate the best instance of 'pattern' in 'text' near 'loc' using the Bitap algorithm.
private double match_bitapScore(int e, int x, int loc, String pattern) - Compute and return the score for a match with e errors and x location.
int match_main(String text, String pattern, int loc) - Locate the best instance of 'pattern' in 'text' near 'loc'.
protected void patch_addContext(diff_match_patch.Patch patch, String text) - Increase the context until it is unique, but don't let the pattern expand beyond Match_MaxBits.
String patch_addPadding(LinkedList<diff_match_patch.Patch> patches) - Add some padding on text start and end so that edges can match something.
Object[] patch_apply(LinkedList<diff_match_patch.Patch> patches, String text) - Merge a set of patches onto the text.
LinkedList<diff_match_patch.Patch> patch_deepCopy(LinkedList<diff_match_patch.Patch> patches) - Given a list of patches, return another list that is identical.
List<diff_match_patch.Patch> patch_fromText(String textline) - Parse a textual representation of patches and return a List of Patch objects.
LinkedList<diff_match_patch.Patch> patch_make(String text1, String text2) - Compute a list of patches to turn text1 into text2.
LinkedList<diff_match_patch.Patch> patch_make(String text1, String text2, LinkedList<diff_match_patch.Diff> diffs) - Deprecated. Prefer patch_make(String text1, LinkedList<Diff> diffs).
LinkedList<diff_match_patch.Patch> patch_make(String text1, LinkedList<diff_match_patch.Diff> diffs) - Compute a list of patches to turn text1 into text2.
LinkedList<diff_match_patch.Patch> patch_make(LinkedList<diff_match_patch.Diff> diffs) - Compute a list of patches to turn text1 into text2.
void patch_splitMax(LinkedList<diff_match_patch.Patch> patches) - Look through the patches and break up any which are longer than the maximum limit of the match algorithm.
String patch_toText(List<diff_match_patch.Patch> patches) - Take a list of patches and return a textual representation.
private static String unescapeForEncodeUriCompatability(String str) - Unescape selected chars for compatibility with JavaScript's encodeURI.
-
Field Details
-
Diff_Timeout
public float Diff_Timeout
Number of seconds to map a diff before giving up (0 for infinity).
-
Diff_EditCost
public short Diff_EditCost
Cost of an empty edit operation in terms of edit characters.
-
Match_Threshold
public float Match_Threshold
At what point is no match declared (0.0 = perfection, 1.0 = very loose).
-
Match_Distance
public int Match_Distance
How far to search for a match (0 = exact location, 1000+ = broad match). A match this many characters away from the expected location will add 1.0 to the score (0.0 is a perfect match).
-
Patch_DeleteThreshold
public float Patch_DeleteThreshold
When deleting a large block of text (over ~64 characters), how close do the contents have to be to match the expected contents. (0.0 = perfection, 1.0 = very loose). Note that Match_Threshold controls how closely the end points of a delete need to match.
-
Patch_Margin
public short Patch_Margin
Chunk size for context length.
-
Match_MaxBits
private short Match_MaxBits
The number of bits in an int.
-
BLANKLINEEND
-
BLANKLINESTART
-
-
Constructor Details
-
diff_match_patch
public diff_match_patch()
-
-
Method Details
-
diff_main
Find the differences between two texts. Run a faster, slightly less optimal diff. This method allows the 'checklines' of diff_main() to be optional. Most of the time checklines is wanted, so default to true.
Parameters:
text1 - Old string to be diffed.
text2 - New string to be diffed.
Returns:
Linked List of Diff objects.
-
diff_main
Find the differences between two texts.
Parameters:
text1 - Old string to be diffed.
text2 - New string to be diffed.
checklines - Speedup flag. If false, then don't run a line-level diff first to identify the changed areas. If true, then run a faster slightly less optimal diff.
Returns:
Linked List of Diff objects.
-
diff_main
private LinkedList<diff_match_patch.Diff> diff_main(String text1, String text2, boolean checklines, long deadline)
Find the differences between two texts. Simplifies the problem by stripping any common prefix or suffix off the texts before diffing.
Parameters:
text1 - Old string to be diffed.
text2 - New string to be diffed.
checklines - Speedup flag. If false, then don't run a line-level diff first to identify the changed areas. If true, then run a faster slightly less optimal diff.
deadline - Time when the diff should be complete by. Used internally for recursive calls. Users should set Diff_Timeout instead.
Returns:
Linked List of Diff objects.
-
diff_compute
private LinkedList<diff_match_patch.Diff> diff_compute(String text1, String text2, boolean checklines, long deadline)
Find the differences between two texts. Assumes that the texts do not have any common prefix or suffix.
Parameters:
text1 - Old string to be diffed.
text2 - New string to be diffed.
checklines - Speedup flag. If false, then don't run a line-level diff first to identify the changed areas. If true, then run a faster slightly less optimal diff.
deadline - Time when the diff should be complete by.
Returns:
Linked List of Diff objects.
-
diff_lineMode
Do a quick line-level diff on both strings, then rediff the parts for greater accuracy. This speedup can produce non-minimal diffs.
Parameters:
text1 - Old string to be diffed.
text2 - New string to be diffed.
deadline - Time when the diff should be complete by.
Returns:
Linked List of Diff objects.
-
diff_bisect
Find the 'middle snake' of a diff, split the problem in two and return the recursively constructed diff. See Myers 1986 paper: An O(ND) Difference Algorithm and Its Variations.
Parameters:
text1 - Old string to be diffed.
text2 - New string to be diffed.
deadline - Time at which to bail if not yet complete.
Returns:
LinkedList of Diff objects.
-
diff_bisectSplit
private LinkedList<diff_match_patch.Diff> diff_bisectSplit(String text1, String text2, int x, int y, long deadline)
Given the location of the 'middle snake', split the diff in two parts and recurse.
Parameters:
text1 - Old string to be diffed.
text2 - New string to be diffed.
x - Index of split point in text1.
y - Index of split point in text2.
deadline - Time at which to bail if not yet complete.
Returns:
LinkedList of Diff objects.
-
diff_linesToChars
Split two texts into a list of strings. Reduce the texts to a string of hashes where each Unicode character represents one line.
Parameters:
text1 - First string.
text2 - Second string.
Returns:
An object containing the encoded text1, the encoded text2 and the List of unique strings. The zeroth element of the List of unique strings is intentionally blank.
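The line-to-character reduction described here can be illustrated with a minimal standalone sketch. It is not the class's own code: the class name LinesToCharsSketch and the method name encode are hypothetical, and the sketch assumes that each line keeps its trailing newline and that the number of unique lines fits in a single char.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical standalone sketch; not part of at.letto.tools.
final class LinesToCharsSketch {
    // Replace every line of text with one char whose value indexes lineArray.
    static String encode(String text, List<String> lineArray, Map<String, Integer> lineHash) {
        StringBuilder chars = new StringBuilder();
        int lineStart = 0;
        while (lineStart < text.length()) {
            int newline = text.indexOf('\n', lineStart);
            // Assumption: a line keeps its trailing newline; the last line may lack one.
            int lineEnd = (newline == -1) ? text.length() : newline + 1;
            String line = text.substring(lineStart, lineEnd);
            Integer index = lineHash.get(line);
            if (index == null) {
                // First time this line is seen: give it the next free index.
                lineArray.add(line);
                index = lineArray.size() - 1;
                lineHash.put(line, index);
            }
            chars.append((char) index.intValue());
            lineStart = lineEnd;
        }
        return chars.toString();
    }

    public static void main(String[] args) {
        List<String> lineArray = new ArrayList<>();
        lineArray.add(""); // index 0 is intentionally left blank
        Map<String, Integer> lineHash = new HashMap<>();
        String chars1 = encode("alpha\nbeta\nalpha\n", lineArray, lineHash);
        String chars2 = encode("beta\nalpha\nbeta\n", lineArray, lineHash);
        System.out.println(chars1.length() + " " + chars2.length() + " " + lineArray.size());
    }
}
```

Sharing one lineArray and one lineHash across both texts is what makes equal lines in text1 and text2 map to the same character, so a character-level diff of the two encoded strings behaves like a line-level diff.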
-
diff_linesToCharsMunge
private String diff_linesToCharsMunge(String text, List<String> lineArray, Map<String, Integer> lineHash)
Split a text into a list of strings. Reduce the texts to a string of hashes where each Unicode character represents one line.
Parameters:
text - String to encode.
lineArray - List of unique strings.
lineHash - Map of strings to indices.
Returns:
Encoded string.
-
diff_charsToLines
Rehydrate the text in a diff from a string of line hashes to real lines of text.
Parameters:
diffs - LinkedList of Diff objects.
lineArray - List of unique strings.
-
diff_commonPrefix
Determine the common prefix of two strings.
Parameters:
text1 - First string.
text2 - Second string.
Returns:
The number of characters common to the start of each string.
-
diff_commonSuffix
Determine the common suffix of two strings.
Parameters:
text1 - First string.
text2 - Second string.
Returns:
The number of characters common to the end of each string.
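As an illustration of what the prefix and suffix lengths mean, here is a minimal sketch using a plain linear scan. The class and method names are hypothetical, and the scan strategy is an assumption; the class itself may compute the same lengths differently.

```java
// Hypothetical standalone sketch; not part of at.letto.tools.
final class CommonAffixSketch {
    // Number of leading characters the two strings share.
    static int commonPrefix(String a, String b) {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        return n;
    }

    // Number of trailing characters the two strings share.
    static int commonSuffix(String a, String b) {
        int la = a.length();
        int lb = b.length();
        int n = Math.min(la, lb);
        for (int i = 1; i <= n; i++) {
            if (a.charAt(la - i) != b.charAt(lb - i)) {
                return i - 1;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(commonPrefix("1234abcdef", "1234xyz"));
        System.out.println(commonSuffix("abcdef1234", "xyz1234"));
    }
}
```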
-
diff_commonOverlap
Determine if the suffix of one string is the prefix of another.
Parameters:
text1 - First string.
text2 - Second string.
Returns:
The number of characters common to the end of the first string and the start of the second string.
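The overlap length can be sketched with a naive search that tries the longest candidate first. The names are hypothetical and the quadratic strategy is a simplification chosen for readability, not a claim about how the class implements it.

```java
// Hypothetical standalone sketch; not part of at.letto.tools.
final class CommonOverlapSketch {
    // Length of the longest suffix of a that is also a prefix of b.
    static int commonOverlap(String a, String b) {
        int max = Math.min(a.length(), b.length());
        for (int k = max; k > 0; k--) {
            // Compare the last k chars of a with the first k chars of b.
            if (a.regionMatches(a.length() - k, b, 0, k)) {
                return k;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(commonOverlap("123456xxx", "xxxabcd"));
    }
}
```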
-
diff_halfMatch
Do the two texts share a substring which is at least half the length of the longer text? This speedup can produce non-minimal diffs.
Parameters:
text1 - First string.
text2 - Second string.
Returns:
Five element String array, containing the prefix of text1, the suffix of text1, the prefix of text2, the suffix of text2 and the common middle. Or null if there was no match.
-
diff_halfMatchI
Does a substring of shorttext exist within longtext such that the substring is at least half the length of longtext?
Parameters:
longtext - Longer string.
shorttext - Shorter string.
i - Start index of quarter length substring within longtext.
Returns:
Five element String array, containing the prefix of longtext, the suffix of longtext, the prefix of shorttext, the suffix of shorttext and the common middle. Or null if there was no match.
-
diff_cleanupSemantic
Reduce the number of edits by eliminating semantically trivial equalities.
Parameters:
diffs - LinkedList of Diff objects.
-
diff_cleanupSemanticLossless
Look for single edits surrounded on both sides by equalities which can be shifted sideways to align the edit to a word boundary.
Parameters:
diffs - LinkedList of Diff objects.
-
diff_cleanupSemanticScore
Given two strings, compute a score representing whether the internal boundary falls on logical boundaries. Scores range from 6 (best) to 0 (worst).
Parameters:
one - First string.
two - Second string.
Returns:
The score.
-
diff_cleanupEfficiency
Reduce the number of edits by eliminating operationally trivial equalities.
Parameters:
diffs - LinkedList of Diff objects.
-
diff_cleanupMerge
Reorder and merge like edit sections. Merge equalities. Any edit section can move as long as it doesn't cross an equality.
Parameters:
diffs - LinkedList of Diff objects.
-
diff_xIndex
loc is a location in text1, compute and return the equivalent location in text2. e.g. "The cat" vs "The big cat", 1->1, 5->9
Parameters:
diffs - LinkedList of Diff objects.
loc - Location within text1.
Returns:
Location within text2.
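The translation of a text1 location into a text2 location can be sketched by walking the diff list and counting characters on both sides. This is a minimal illustration under assumptions: Op and Diff are hypothetical stand-ins for the class's Operation and Diff types, and a location that falls inside a deletion is assumed to map to the start of that deletion.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical standalone sketch; Op and Diff stand in for Operation and Diff.
final class XIndexSketch {
    enum Op { DELETE, INSERT, EQUAL }

    static final class Diff {
        final Op op;
        final String text;
        Diff(Op op, String text) { this.op = op; this.text = text; }
    }

    static Diff del(String s) { return new Diff(Op.DELETE, s); }
    static Diff ins(String s) { return new Diff(Op.INSERT, s); }
    static Diff eq(String s) { return new Diff(Op.EQUAL, s); }

    static int xIndex(List<Diff> diffs, int loc) {
        int chars1 = 0, chars2 = 0;         // characters consumed in text1 / text2
        int lastChars1 = 0, lastChars2 = 0; // totals before the current diff
        Diff hit = null;
        for (Diff d : diffs) {
            if (d.op != Op.INSERT) chars1 += d.text.length(); // equality or deletion
            if (d.op != Op.DELETE) chars2 += d.text.length(); // equality or insertion
            if (chars1 > loc) { hit = d; break; }             // overshot the location
            lastChars1 = chars1;
            lastChars2 = chars2;
        }
        if (hit != null && hit.op == Op.DELETE) {
            return lastChars2; // assumption: a deleted location maps to the deletion start
        }
        return lastChars2 + (loc - lastChars1);
    }

    public static void main(String[] args) {
        System.out.println(xIndex(Arrays.asList(del("a"), ins("1234"), eq("xyz")), 2));
    }
}
```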
-
diff_prettyHtml
Convert a Diff list into a pretty HTML report.
Parameters:
diffs - LinkedList of Diff objects.
Returns:
HTML representation.
-
diff_text1
Compute and return the source text (all equalities and deletions).
Parameters:
diffs - LinkedList of Diff objects.
Returns:
Source text.
-
diff_text2
Compute and return the destination text (all equalities and insertions).
Parameters:
diffs - LinkedList of Diff objects.
Returns:
Destination text.
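How the source and destination texts fall out of a diff list can be shown with a short standalone sketch. Op and Diff are hypothetical stand-ins for the class's Operation and Diff types; the sample list is the one quoted in the nested class summary.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical standalone sketch; Op and Diff stand in for Operation and Diff.
final class DiffTextSketch {
    enum Op { DELETE, INSERT, EQUAL }

    static final class Diff {
        final Op op;
        final String text;
        Diff(Op op, String text) { this.op = op; this.text = text; }
    }

    // Source text: everything except insertions.
    static String text1(List<Diff> diffs) {
        StringBuilder sb = new StringBuilder();
        for (Diff d : diffs) {
            if (d.op != Op.INSERT) sb.append(d.text);
        }
        return sb.toString();
    }

    // Destination text: everything except deletions.
    static String text2(List<Diff> diffs) {
        StringBuilder sb = new StringBuilder();
        for (Diff d : diffs) {
            if (d.op != Op.DELETE) sb.append(d.text);
        }
        return sb.toString();
    }

    static List<Diff> sample() {
        return Arrays.asList(
                new Diff(Op.DELETE, "Hello"),
                new Diff(Op.INSERT, "Goodbye"),
                new Diff(Op.EQUAL, " world."));
    }

    public static void main(String[] args) {
        System.out.println(text1(sample()));
        System.out.println(text2(sample()));
    }
}
```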
-
diff_levenshtein
Compute the Levenshtein distance; the number of inserted, deleted or substituted characters.
Parameters:
diffs - LinkedList of Diff objects.
Returns:
Number of changes.
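One way to read "inserted, deleted or substituted" off a diff list is to treat an adjacent deletion and insertion as substitutions, so each run of edits between equalities costs the larger of its two lengths. The sketch below illustrates that reading; it is an assumption about the counting rule, and Op and Diff are hypothetical stand-ins for the class's types.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical standalone sketch; Op and Diff stand in for Operation and Diff.
final class LevenshteinSketch {
    enum Op { DELETE, INSERT, EQUAL }

    static final class Diff {
        final Op op;
        final String text;
        Diff(Op op, String text) { this.op = op; this.text = text; }
    }

    static Diff del(String s) { return new Diff(Op.DELETE, s); }
    static Diff ins(String s) { return new Diff(Op.INSERT, s); }
    static Diff eq(String s) { return new Diff(Op.EQUAL, s); }

    static int levenshtein(List<Diff> diffs) {
        int total = 0, insertions = 0, deletions = 0;
        for (Diff d : diffs) {
            switch (d.op) {
                case INSERT:
                    insertions += d.text.length();
                    break;
                case DELETE:
                    deletions += d.text.length();
                    break;
                case EQUAL:
                    // An equality closes the current run of edits.
                    total += Math.max(insertions, deletions);
                    insertions = 0;
                    deletions = 0;
                    break;
            }
        }
        return total + Math.max(insertions, deletions);
    }

    public static void main(String[] args) {
        System.out.println(levenshtein(Arrays.asList(del("abc"), ins("1234"), eq("xyz"))));
    }
}
```

Under this rule, deleting "abc" and inserting "1234" in the same place counts as 4 changes, while the same two edits separated by an equality count as 7.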
-
diff_toDelta
Crush the diff into an encoded string which describes the operations required to transform text1 into text2. E.g. =3\t-2\t+ing -> Keep 3 chars, delete 2 chars, insert 'ing'. Operations are tab-separated. Inserted text is escaped using %xx notation.
Parameters:
diffs - Array of Diff objects.
Returns:
Delta text.
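The delta format described here can be sketched as follows. This is an illustration under assumptions: Op and Diff are hypothetical stand-ins for the class's types, inserted text is escaped with java.net.URLEncoder (with '+' turned back into a space), and the extra unescaping step for encodeURI compatibility is left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.List;

// Hypothetical standalone sketch; Op and Diff stand in for Operation and Diff.
final class DeltaSketch {
    enum Op { DELETE, INSERT, EQUAL }

    static final class Diff {
        final Op op;
        final String text;
        Diff(Op op, String text) { this.op = op; this.text = text; }
    }

    static Diff del(String s) { return new Diff(Op.DELETE, s); }
    static Diff ins(String s) { return new Diff(Op.INSERT, s); }
    static Diff eq(String s) { return new Diff(Op.EQUAL, s); }

    static String toDelta(List<Diff> diffs) {
        StringBuilder out = new StringBuilder();
        for (Diff d : diffs) {
            if (out.length() > 0) out.append('\t'); // operations are tab-separated
            switch (d.op) {
                case INSERT:
                    out.append('+').append(escape(d.text)); // inserted text travels in full
                    break;
                case DELETE:
                    out.append('-').append(d.text.length()); // only the length is needed
                    break;
                case EQUAL:
                    out.append('=').append(d.text.length());
                    break;
            }
        }
        return out.toString();
    }

    // Assumption: %xx escaping via URLEncoder, with '+' mapped back to a space.
    static String escape(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8").replace('+', ' ');
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toDelta(Arrays.asList(eq("jum"), del("ps"), ins("ing"))));
    }
}
```

Because deletions and equalities carry only lengths, the delta can be expanded back into a full diff only together with the original text1.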
-
diff_fromDelta
public LinkedList<diff_match_patch.Diff> diff_fromDelta(String text1, String delta) throws IllegalArgumentException
Given the original text1, and an encoded string which describes the operations required to transform text1 into text2, compute the full diff.
Parameters:
text1 - Source string for the diff.
delta - Delta text.
Returns:
LinkedList of Diff objects.
Throws:
IllegalArgumentException - If invalid input.
-
match_main
Locate the best instance of 'pattern' in 'text' near 'loc'. Returns -1 if no match found.
Parameters:
text - The text to search.
pattern - The pattern to search for.
loc - The location to search around.
Returns:
Best match index or -1.
-
match_bitap
Locate the best instance of 'pattern' in 'text' near 'loc' using the Bitap algorithm. Returns -1 if no match found.
Parameters:
text - The text to search.
pattern - The pattern to search for.
loc - The location to search around.
Returns:
Best match index or -1.
-
match_bitapScore
Compute and return the score for a match with e errors and x location.
Parameters:
e - Number of errors in match.
x - Location of match.
loc - Expected location of match.
pattern - Pattern being sought.
Returns:
Overall score for match (0.0 = good, 1.0 = bad).
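A plausible shape for this score, consistent with the Match_Distance description (a match that many characters away adds 1.0), is the error fraction plus the distance from the expected location divided by Match_Distance. The sketch below is an assumption, not the class's code: the names are hypothetical, matchDistance is passed in as a parameter instead of being read from a field, and the pattern is assumed to be non-empty.

```java
// Hypothetical standalone sketch; not part of at.letto.tools.
final class BitapScoreSketch {
    static double score(int errors, int x, int loc, String pattern, int matchDistance) {
        // Assumption: the error term is the fraction of pattern characters that are wrong.
        double accuracy = (double) errors / pattern.length();
        int proximity = Math.abs(loc - x);
        if (matchDistance == 0) {
            // Exact-location mode: any offset at all is treated as the worst score.
            return proximity == 0 ? accuracy : 1.0;
        }
        return accuracy + (double) proximity / matchDistance;
    }

    public static void main(String[] args) {
        System.out.println(score(2, 10, 10, "abcd", 1000));
    }
}
```

With this shape, a candidate is worth keeping only while its score stays at or below Match_Threshold, which is how the two settings would interact.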
-
match_alphabet
Initialise the alphabet for the Bitap algorithm.
Parameters:
pattern - The text to encode.
Returns:
Hash of character locations.
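A Bitap alphabet is commonly a map from each pattern character to a bitmask of the positions where it occurs. The sketch below assumes that layout, with the first pattern character in the highest used bit; the class name, the method name and the Map<Character, Integer> return type are assumptions, since the return type is not shown on this page. The pattern must fit in the number of bits in an int.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical standalone sketch; not part of at.letto.tools.
final class AlphabetSketch {
    static Map<Character, Integer> alphabet(String pattern) {
        Map<Character, Integer> masks = new HashMap<>();
        int n = pattern.length(); // assumed to be at most 32, the bits in an int
        for (int i = 0; i < n; i++) {
            int bit = 1 << (n - i - 1); // first character gets the highest used bit
            masks.merge(pattern.charAt(i), bit, (a, b) -> a | b);
        }
        return masks;
    }

    public static void main(String[] args) {
        System.out.println(alphabet("abcaba").get('a'));
    }
}
```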
-
patch_addContext
Increase the context until it is unique, but don't let the pattern expand beyond Match_MaxBits.
Parameters:
patch - The patch to grow.
text - Source text.
-
patch_make
Compute a list of patches to turn text1 into text2. A set of diffs will be computed.
Parameters:
text1 - Old text.
text2 - New text.
Returns:
LinkedList of Patch objects.
-
patch_make
Compute a list of patches to turn text1 into text2. text1 will be derived from the provided diffs.
Parameters:
diffs - LinkedList of Diff objects for text1 to text2.
Returns:
LinkedList of Patch objects.
-
patch_make
public LinkedList<diff_match_patch.Patch> patch_make(String text1, String text2, LinkedList<diff_match_patch.Diff> diffs)
Deprecated. Prefer patch_make(String text1, LinkedList<Diff> diffs).
Compute a list of patches to turn text1 into text2. text2 is ignored, diffs are the delta between text1 and text2.
Parameters:
text1 - Old text.
text2 - Ignored.
diffs - LinkedList of Diff objects for text1 to text2.
Returns:
LinkedList of Patch objects.
-
patch_make
public LinkedList<diff_match_patch.Patch> patch_make(String text1, LinkedList<diff_match_patch.Diff> diffs)
Compute a list of patches to turn text1 into text2. text2 is not provided, diffs are the delta between text1 and text2.
Parameters:
text1 - Old text.
diffs - LinkedList of Diff objects for text1 to text2.
Returns:
LinkedList of Patch objects.
-
patch_deepCopy
public LinkedList<diff_match_patch.Patch> patch_deepCopy(LinkedList<diff_match_patch.Patch> patches)
Given a list of patches, return another list that is identical.
Parameters:
patches - LinkedList of Patch objects.
Returns:
LinkedList of Patch objects.
-
patch_apply
Merge a set of patches onto the text. Return a patched text, as well as an array of true/false values indicating which patches were applied.
Parameters:
patches - LinkedList of Patch objects.
text - Old text.
Returns:
Two element Object array, containing the new text and an array of boolean values.
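Because the result is an untyped two element Object array, callers have to cast each slot. The sketch below shows one way to unpack such a result; the helper names are hypothetical, and it assumes that element 0 is a String and element 1 is a boolean[], as the description suggests. The array in main is built by hand rather than produced by a real patch_apply call.

```java
// Hypothetical standalone sketch; not part of at.letto.tools.
final class PatchResultSketch {
    // Element 0 is assumed to hold the patched text.
    static String patchedText(Object[] result) {
        return (String) result[0];
    }

    // Element 1 is assumed to hold one flag per patch, true if it was applied.
    static boolean allApplied(Object[] result) {
        for (boolean applied : (boolean[]) result[1]) {
            if (!applied) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for a patch_apply result.
        Object[] result = new Object[] {"new text", new boolean[] {true, false}};
        System.out.println(patchedText(result) + " " + allApplied(result));
    }
}
```

Checking the flags matters because a partially applied patch list still yields a text in slot 0.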
-
patch_addPadding
Add some padding on text start and end so that edges can match something. Intended to be called only from within patch_apply.
Parameters:
patches - LinkedList of Patch objects.
Returns:
The padding string added to each side.
-
patch_splitMax
Look through the patches and break up any which are longer than the maximum limit of the match algorithm. Intended to be called only from within patch_apply.
Parameters:
patches - LinkedList of Patch objects.
-
patch_toText
Take a list of patches and return a textual representation.
Parameters:
patches - List of Patch objects.
Returns:
Text representation of patches.
-
patch_fromText
Parse a textual representation of patches and return a List of Patch objects.
Parameters:
textline - Text representation of patches.
Returns:
List of Patch objects.
Throws:
IllegalArgumentException - If invalid input.
-
unescapeForEncodeUriCompatability
Unescape selected chars for compatibility with JavaScript's encodeURI. In speed critical applications this could be dropped since the receiving application will certainly decode these fine. Note that this function is case-sensitive. Thus "%3f" would not be unescaped. But this is ok because it is only called with the output of URLEncoder.encode which returns uppercase hex. Example: "%3F" -> "?", "%24" -> "$", etc.
Parameters:
str - The URL-encoded string to unescape.
Returns:
The string with the selected characters unescaped.
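The selective unescaping can be sketched as a chain of literal replacements applied to URLEncoder output. The names below are hypothetical, and the exact set of characters is an assumption: the sketch restores the punctuation that JavaScript's encodeURI leaves unescaped, which includes the two examples given in the description.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical standalone sketch; not part of at.letto.tools.
final class UnescapeSketch {
    // Assumed set: punctuation that JavaScript's encodeURI does not escape.
    private static final String[][] PAIRS = {
        {"%21", "!"}, {"%7E", "~"}, {"%27", "'"}, {"%28", "("}, {"%29", ")"},
        {"%3B", ";"}, {"%2F", "/"}, {"%3F", "?"}, {"%3A", ":"}, {"%40", "@"},
        {"%26", "&"}, {"%3D", "="}, {"%2B", "+"}, {"%24", "$"}, {"%2C", ","},
        {"%23", "#"}
    };

    // Case-sensitive on purpose: only uppercase hex escapes are restored.
    static String unescape(String str) {
        for (String[] pair : PAIRS) {
            str = str.replace(pair[0], pair[1]);
        }
        return str;
    }

    static String encodeThenUnescape(String raw) {
        try {
            return unescape(URLEncoder.encode(raw, "UTF-8"));
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encodeThenUnescape("a?b$c"));
    }
}
```

Characters outside the assumed set, such as the percent sign itself, stay escaped, so the result remains decodable.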
-