rope.ss: Ropes for fast string concatenation and subsequencing
_rope.ss_: Ropes for fast string concatenation and subsequencing
Danny Yoo ([email protected] / [email protected])
Index terms: _ropes_
Introduction
============
This presents a _rope_ data structure which is very much like a
string, except that it provides constant-time concatenation, as well
as inexpensive subsequencing. It's pretty much what's described in
the paper "Ropes: an Alternative to Strings" by Boehm, Atkinson, and
Plauss. This library also allows special values to be part of a rope.
Example
=======
> (require (planet "rope.ss" ("dyoo" "rope.plt" 2)))
> (define a-rope
(rope-append
(string->rope
"hello, this is a test of the emergency broadcast")
(string->rope "system; this is only a test")))
> a-rope
> #<struct:rope:concat>
> (rope-length a-rope)
75
> (subrope a-rope 5)
#<struct:rope:concat>
> (rope->string (subrope a-rope 5 10))
", thi"
> (rope-ref (subrope a-rope 20) 0)
#\t
;; Example with specials
> (define rope-with-specials
(rope-append (string->rope "hello")
(rope-append (special->rope (box " "))
(string->rope "world"))))
> (reverse (rope-fold/leaves cons '() rope-with-specials))
("hello" #&" " "world")
> (reverse (rope-fold cons '() rope-with-specials))
(#\h #\e #\l #\l #\o #&" " #\w #\o #\r #\l #\d)
API
===
A rope is defined to be either:
* A string node,
* A special node, or
* A concatenation node made of a left rope and a right rope.
Use string->rope and special->rope to convert those types to their
respective ropes. Concatenation nodes are built with rope-append.
> string->rope: string -> rope
Given a long string, breaks it up into smaller rope fragments and
appends them all together. The resulting rope is balanced.
> special->rope: (not string?) -> rope
Creates a rope element of length one whose content is the given
element.
Caveat: The special should not be a string. If it's necessary to
construct and distinguish such a special, one workaround is to wrap
with a box. Rationale: the behavior of rope-fold/leaves becomes
ambiguous and otherwise wouldn't let a client distinguish between a
string and a special-as-a-string.
> rope-append: rope rope -> rope
Joins two ropes together in constant time.
> rope?: any -> boolean
Returns true if the input is a rope.
> rope-has-special?: rope -> boolean
Returns true if the rope contains a special node.
> rope-length: rope -> natural-number
Returns the length of a rope. Cost is constant time.
> rope-ref: rope natural-number -> char-or-special
Returns the nth character or special in a rope. Cost is proportional
to the rope-depth of the rope.
> subrope: rope start [end] -> rope
Returns a subsequence of the rope containing all the character from
start up to (but not including) the end. If the end is not provided,
then the subrope extends to the end of the rope. Cost is proportional
to the rope-depth of the rope.
> rope->string: rope -> string
Creates a string from the content in the rope.
Note: if the rope has a special, this function will raise an error.
> rope-for-each: (char-or-special -> void) rope -> void
Applies the input function on every character or special in the rope.
> rope-fold: (char-or-special X) X rope -> X
Folds an accumulating function across every character or special in
the rope.
> rope-fold/leaves: (immutable-string-or-special X) X rope -> X
Folds an accumulating function across every string or special in the
rope.
> open-input-rope: rope -> input-port
Returns an input port whose byte content comes from the rope.
Specials can be read with read-byte-or-special.
Other API functions
===================
> rope-depth: rope -> natural-number
Returns the depth of the rope data structure.
> rope-node-count: rope -> natural-number
Returns the number of nodes in the rope.
> rope-balance: rope -> rope
Returns a balanced version of the rope whose rope-depth is roughly
logarithmic to its rope-node-count.
Internal structures
===================
The following structure definitions are provided for those who need to
do structured traversal over the nodes of a rope.
> (define-struct rope ())
> (define-struct (rope:string rope) (s))
where s is a string.
> (define-struct (rope:special rope) (s))
where s is not a string.
> (define-struct (rope:concat rope) (l r len))
where l and r are ropes, and len is a natural number.
Notes
=====
Unlike the description in the paper, this library does not
automatically rebalance the data structure when applying subrope or
rope-append. Explicit rebalancing may improve the performance of
subsequencing and referencing on the balanced tree.
All operations are done without mutation.
Recent changes
==============
Version 2.3: improved implementation of open-input-rope. Fixed silly
bug involving subrope and specials.
Version 2.2: exposed structure definitions, although most users won't
touch these directly.
Version 2.1: library API extended to support "special" objects. Also,
strings must be explicitly converted into strings now.
References
==========
Hans-Juergen Boehm and Russell R. Atkinson and Michael F. Plass.
"Ropes: an Alternative to Strings." Software - Practice and
Experience, Volume 25, No. 12. pp 1315--1330. (1995).