TkScript reference guide | Strings
Table of Contents:
- 1. Strings - A word or two about strings
- 2. Splitting strings - Ways to slice a string into an array of strings
- 2.1. splitChar - Split at delimiter character
- 2.2. splitCharset - Split at one or more delimiter characters
- 2.3. splitSpace - Split into words and substrings
- 2.4. Parsing - String splitting via indexOf, substring, ..
- 3. Strings and File I/O
- 4. XML Parser - Chopping up a String into a tree of hash tables
- 4.1. XFM - The tkui user interface description language, a sample application for the XML parser
1. Strings
TkScript uses NUL-terminated ASCII strings. The end-of-string character 0 is included in the length count of a string, i.e. an empty String will usually have the length 1.
Strings are mutable, i.e. the character data in a String can be modified without creating a copy of the String object.
Example:
String s = "hallo, welt.";
s.replace("hallo", "hello");
s.replace("welt", "world");
print s;
New String objects may be created automatically when a String is used in an arithmetic expression with two operands (e.g. +).
Example:
String s = "hello";
print (s + ", world."); // Create new temporary String, does not modify "s"
Strings can be compared using the relational operators.
Example:
String s = "hello";
String t = "hello";
String u = ", world.";
String v = ", ";
String h = "hello, world.";
trace (s == t); // equal
trace (s != u); // not equal
trace (h <= t); // starts with right-hand side string
trace (h >= u); // ends with right-hand side string
trace (h & v); // contains right-hand side string
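The same operators can presumably be used in conditions. The following is a minimal sketch that only relies on the operators listed above; the file name is a made-up value, and using the comparison results inside if() is an assumption derived from the trace example.

```tks
String name = "readme.txt";   // hypothetical file name, for illustration only

if(name >= ".txt")            // ends with ".txt"
{
   trace "looks like a text file";
}

if(name <= "read")            // starts with "read"
{
   trace "name starts with \"read\"";
}

if(name & "me.")              // contains "me."
{
   trace "name contains \"me.\"";
}
```

Note that <= and >= test for a prefix and a suffix here, not for lexical ordering.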
Strings support the return object syntax. This means that some string methods, like String.append(), String.insert(), String.replace() or String.replaceRegion(), can be given a pre-allocated String object that will receive the result of the operation.
Example:
String s = "hello, welt.";
String d;
s.replace("welt", "world") => d; // result goes to pre-allocated String "d"
The way a string method is called determines where the result string is placed:
- If the method is called as a statement, "this" will be modified
- If the method is called as a statement and the => return object syntax is used, the right-hand side object will be modified
- If the method is called as an expression, the result will be placed in a new String object (which will be returned by the method call)
Example:
String t, s = "abc";
s.replace("abc", "def"); // modify "self"
s.replace("def", "abc") => t; // place result in "t"
t <= s.replace("def", "abc"); // called as an expression, return new String object
1.1. Buffering
The String class uses buffering, e.g. to speed up successive String.append() operations.
Example:
String buf; // Creates String instance and assigns it to variable "buf"
buf.alloc(1024); // Create space for 1024 characters
buf.empty(); // Reset the number of used chars
buf.append("hello, world."); // No buffer re-allocation necessary
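As a hedged sketch of why this matters, the snippet below builds a longer string with repeated append() calls. It only uses the calls shown above plus the loop statement; the buffer size and the appended text are arbitrary values chosen for illustration.

```tks
String buf;
buf.alloc(256);        // reserve space once, up front
buf.empty();           // reset the number of used chars

loop(16)
{
   buf.append("item,"); // called as a statement, so "buf" itself is modified
}

trace buf;
```

Since 16 * 5 characters fit into the allocated space, none of the append() calls should need to re-allocate the buffer.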
1.2. Character data ownership
Strings keep track of the ownership of the actual character data. If a String object holds a reference to non-deletable character data when the string is about to be modified, a copy of that data will be created automatically.
In a technical sense, this can make Strings immutable to a certain degree. From a script API point of view, however, a String always appears mutable.
Example:
String s <= "hello"; // set buffer to (the instance of) a constant string
// the following append operation will automatically create a copy
// of the constant character data
s.append(", world");
1.3. Preallocated strings
There are several variants of the String class available that come with a pre-allocated buffer. See String8, String16, String32, String64, String128 for details.
Using pre-allocated strings has the advantage that the char data is allocated right along with the memory for the String object, thus saving a call to the internal memory allocator (malloc).
1.4. The [] operator
The [] operator is used to access single chars of a String.
If the application tries to read/write characters beyond the end of the string, an ArrayOutOfBounds exception will be raised.
The String.getc() method can be used to safely read out of bounds: it simply returns 0 when an invalid index is used.
Example:
String s <= "hello, world.";
trace "the 6th char of the string is:\'"+tcchar(s[5])+"\'.";
Note: The function tcchar() is used to convert an ASCII character code (s[5]=44==',') into a printable String (",") that is 2 chars long, including the ASCIIZ terminator.
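The difference between the [] operator and String.getc() described above can be sketched as follows. This is a minimal example that assumes getc() takes the character index as its only argument; see the String API reference for the actual signature.

```tks
String s <= "hello";

trace "getc(1)="   + s.getc(1);    // valid index: character code of 'e'
trace "getc(100)=" + s.getc(100);  // invalid index: 0 instead of an ArrayOutOfBounds exception
```

Reading s[100] instead would raise the exception, so getc() is the more convenient choice when the index comes from untrusted input.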
If the string buffer is large enough, writing a character to the exact last index will append it to the string and thus increase the string's length.
Example:
String s;
s.alloc(4); // Allocate buffer that can hold 4 characters
trace "start length="+s.length;
loop(3)
{
   s[s.length-1] = '*';
   trace "s.length is now "+s.length;
}
trace s;
2. Splitting strings
The String class features several ways to chop up a String into a StringArray.
The split* methods return a StringArray after splitting a String up at word endings or delimiter characters.
The substrings are copies of the corresponding parts of the original string.
2.1. splitChar
The String.splitChar() method is used to split the string at a given delimiter character.
Example:
String s = "
hello
world
";
trace #(s.splitChar('\n')); // split string into lines
2.2. splitCharset
The String.splitCharset() method can split a string at multiple delimiter characters.
Example:
String s = "sin(2PI*0.3)+1";
trace #(s.splitCharset("(*)+ \n"));
Note: The delimiter characters will not be included in the StringArray!
2.3. splitSpace
String.splitSpace() is used to split a String into an array of words.
The String.splitSpace() method can only split at whitespace, but it is able to recognize and parse quoted (escaped) substrings.
Example:
String s = "\"hello\" \", world\"";
trace #(s.splitSpace(true));
2.4. Parsing
The traditional way of chopping up strings in a parser is to search for the next token, handle it and continue.
The String.indexOf() and String.indexOfChar() methods are used to search for the next occurrence of a substring or character starting at a given offset. The String.charsetIndexOf() method is used to search for one or more characters.
The String.substring() method is used to extract a range of characters from a string.
Take a look at the String API reference for further details. Also take a look at the source code of "The DOG" (see The DOG manual) for a real-world example.
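The search-then-extract approach described above can be sketched as follows. This is a minimal example, not taken from the API reference: it assumes that indexOfChar() takes (character, start offset) and returns a negative value when nothing is found, and that the first substring() argument is the start offset. The "key=value" input is a made-up value.

```tks
String s = "key=value";          // hypothetical input line

int i = s.indexOfChar('=', 0);   // assumption: (character, start offset)
if(i >= 0)                       // assumption: a negative result means "not found"
{
   // Extract everything before the delimiter.
   // Starting at offset 0, the second argument is the same number
   // whether it denotes a length or an end offset.
   trace "key=" + s.substring(0, i);
}
```

A complete parser would repeat this step, passing the position after the last match as the next start offset.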
3. Strings and File I/O
Strings can be loaded from local files, pak files, buffers or other Stream classes. See Buffer, File, PakFile, StdOutStream, StdInStream, StdErrStream for more about streams.
3.1. Loading/saving local files
The method String.loadLocal() is used to load a String from a local file and optionally convert CR (carriage return) characters to whitespace.
Example:
String s;
if(s.loadLocal("test.txt", true)) // true=convert CR to whitespace
{
   trace "s=\""+s+"\".";
   s.append("..test..");
   if(s.saveLocal("new.txt"))
   {
      trace "[...] wrote file \"new.txt\".";
   }
}
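Loading can be combined with the split methods from section 2, e.g. to process a file line by line. The following is a minimal sketch under the assumption that the file uses '\n' as its line terminator; "test.txt" is the same placeholder file name as above.

```tks
String s;
StringArray lines;

if(s.loadLocal("test.txt", true)) // true=convert CR to whitespace
{
   lines <= s.splitChar('\n');    // one array element per line
   trace #(lines);
}
```

Because CR characters are converted to whitespace while loading, lines from files with CR/LF line endings may end with a trailing whitespace character.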
3.2. Loading pak files (VFS)
String.load() is used to load a String from a pakfile (i.e. from the virtual file system of a .tkx package).
String s;
if(s.load("test.txt", true)) // true=convert CR to whitespace
{
   trace "s=\""+s+"\".";
}
Note: If you run the .tkp project file instead of the .tkx package, the file will still be loaded from the local filesystem according to the file mapping specified in the project file.
3.3. (De-)serialization
Strings can also be (de-)serialized from/to Stream classes:
String s <= "hello, world.";
Buffer bufferStream;
bufferStream.size = 256;
bufferStream.offset = 0;
bufferStream << s; // Serialize String into Buffer Stream
trace "bufferStream.offset is now " + bufferStream.offset;
s = "n/a";
bufferStream.offset = 0;
s << bufferStream; // Deserialize String from Buffer Stream
trace "s is now \""+s+"\" again.";
4. XML Parser
The String class features a simplified XML parser which only performs basic syntax and nesting checks (e.g. <e> must be closed with </e> on the same level). DTD validation is not supported.
The String.parseXML() method splits a String into an L/R tree; the left nodes link elements of the same hierarchy level, the right linked nodes lead to subtrees of an element structure.
The attribute lists of elements are converted to HashTables (associative arrays).
In order to make it accessible from scripts, the element structure of a document is converted to TreeNodes, which store the element HashTables.
Example:
String s <= "<test><body><text value=\" \'. . .\' <test>\"/></body></test>";
TreeNode t <= s.parseXML();
TreeNode u <= t.right;
trace "u.name=" + u.name;
u <= u.right;
HashTable r <= u.objectValue;
trace "text=\"" + r["value"] + "\"";
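The example above only follows the right links, i.e. it descends into sub-elements. As a hedged sketch of the other direction, the following snippet steps from one element to its sibling. It assumes that TreeNode exposes the left link as a "left" property, by analogy with "right"; the <list>/<item> document is made up for illustration.

```tks
String s <= "<list><item id=\"a\"/><item id=\"b\"/></list>";
TreeNode t <= s.parseXML();

TreeNode c <= t.right;          // first sub-element of <list>
HashTable attr <= c.objectValue;
trace c.name + " id=" + attr["id"];

c <= c.left;                    // assumption: "left" leads to the next element on the same level
attr <= c.objectValue;
trace c.name + " id=" + attr["id"];
```

The sketch does not check whether a link is actually set, so it only works for documents with at least two sub-elements.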
Flow text between a start-tag (<element>) and an end-tag (</element>) is accessible in the "<>" attribute.
Example:
String xml <= "<test>hello, world.</test>";
TreeNode t <= xml.parseXML();
print "<" + t.name + ">.[]=\"" + t.objectValue["<>"] + "\"";
4.1. XFM
The tkui plugin uses the String XML parser to read .xfm user interface description files.
Example project file testxfm.tkp:
Example source file testxfm.tks:
use namespace ui;
UI.Initialize(Arguments);
Window window <= Window.New(XMLForm.New('
<xfm>
<Button caption="Hello, world." onClick=\'print "hello, world.";\'/>
</xfm>
'
),
"My window", 100, 100, 320, 240);
UI.AddFloatingLayer(window);
UI.OpenWindow(800, 600);
UI.Run();
auto-generated by "DOG", the TkScript document generator. Wed, 31/Dec/2008 15:53:35