Swift Strings Are Not C Strings or NSStrings

In many popular programming languages, strings are little more than an array of characters, often referred to as C strings since C was one of the first languages to take this approach. As we learned in the last post, Swift’s use of Unicode characters in extended grapheme clusters breaks that assumption, so you have a bit more work to do when working with the String and NSString types in Swift.

Download the exercise file, and you’ll find a copy of the completed exercise file from the Unicode character tip. Run the app on an iPhone Simulator.

The app is supposed to count the characters in the string. It accurately counts 9 characters, but to do so it counts extended grapheme clusters, so the combining arrows are not counted separately. This makes sense if you are counting full characters, but it runs into a few problems.

Converting between String and NSString is one of those problems. I’ll add an NSString above the label assignment:

let nsYummy = NSString(string: yummy)


NSString does not have a count property but a length property. I’ll print both, starting on a new line to make the output easier to read.

print(String(format: "\n %i %i", yummy.count, nsYummy.length))

Run this. NSString’s length counts UTF-16 code units, not grapheme clusters, so we get 12 instead of 9.

I’d expect 11, since I add the two arrows to the number of characters. I’ll come back to where that extra one comes from.

Neither of these is an array of Character. You can’t do this:

let yummyChar = yummy[4]

Or this:

let nsYummyChar = nsYummy[4]

You’ll get an error.

That restriction exists to keep track of all those clusters. NSString has a method character(at:) which gives you the character as a 16-bit integer, a single UTF-16 code unit. I’ll change the assignment to

let nsYummyChar = nsYummy.character(at: 4)

Since I’ve been working in hex, I’ll print the hex value and the character to the console:

print(String(format:"%X %C",nsYummyChar, nsYummyChar))

Comment out the yummyChar line for now. Run, and we get 67, which is the g.
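The same 16-bit value can be inspected without NSString through the string’s utf16 view. This is only a sketch using the standard library; the names units and unit are mine, not from the exercise file. It also shows that the two values making up the doughnut are not valid characters on their own.

```swift
// Same string as in the post.
let yummy = "D\u{1f369}ugh\u{20d7}n\u{20ed}uts"

// Copy the UTF-16 code units into an array so they can be indexed by Int.
let units = Array(yummy.utf16)

// Position 4 holds the same value NSString's character(at: 4) reports.
let unit = units[4]
print(String(unit, radix: 16, uppercase: true)) // prints "67"

// A lone code unit is only a character if it is a valid Unicode scalar.
if let scalar = Unicode.Scalar(unit) {
    print(Character(scalar)) // prints "g"
}

// Positions 1 and 2 are the two halves of the doughnut's surrogate pair.
// Neither half is a valid scalar by itself, so the initializer returns nil.
let firstHalfIsScalar = Unicode.Scalar(units[1]) != nil
let secondHalfIsScalar = Unicode.Scalar(units[2]) != nil
print(firstHalfIsScalar, secondHalfIsScalar) // prints "false false"
```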

For String, I have to use an index relative to the beginning or end of the string. String has a nested type called String.Index that I can use with a subscript, and a few useful properties of that type. For the index of the first character there is the property startIndex. The endIndex property is the position one past the last character, so it cannot be used as a subscript itself. Remove the comment and change the subscript to

let yummyChar = yummy[yummy.startIndex]

That will get me the first character. For the fourth character, I can use the method index(_:offsetBy:). I’ll just print that character to the console.

let yummyChar = yummy[yummy.index(yummy.startIndex,offsetBy:4)]
print(yummyChar)

Run this. In the console, we get the h with the arrow, because offsetBy is an offset from the starting index: it is the number of characters away from the starting character D. I subtract 1 from 4 to get an offset of 3 for the fourth character.

let yummyChar = yummy[yummy.index(yummy.startIndex,offsetBy:3)]
print(yummyChar)
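An offset that runs past the end of the string will crash at run time. One way to guard against that is the limitedBy variant of the same method, sketched here. The function name character(in:atOffset:) is hypothetical and not part of the exercise file.

```swift
// Same string as in the post.
let yummy = "D\u{1f369}ugh\u{20d7}n\u{20ed}uts"

// Hypothetical helper: returns the character at a zero-based offset,
// or nil when the offset falls outside the string.
func character(in text: String, atOffset offset: Int) -> Character? {
    guard offset >= 0,
          let i = text.index(text.startIndex, offsetBy: offset, limitedBy: text.endIndex),
          i < text.endIndex // endIndex is one past the last character
    else { return nil }
    return text[i]
}

if let fourth = character(in: yummy, atOffset: 3) {
    print(fourth) // prints "g"
}

// An offset equal to the character count lands on endIndex, so it yields nil.
let pastTheEnd = character(in: yummy, atOffset: yummy.count)
print(pastTheEnd == nil) // prints "true"
```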

Iterating through characters is different too. For an NSString, you’ll have to iterate through an index and use character(at:):

 
for index in 0..<nsYummy.length{
    let nsYummyChar = nsYummy.character(at: index)
    print(String(format:"%C",nsYummyChar))
}
 

Swift Strings are sequences, so you can use for directly on the string. char here is of type Character, and for comparison, I convert it to a String for printing.

for char in yummy{
    print(String(char))
}

Run this. In the console you’ll find the results. First there is the NSString iteration, which iterates over every UTF-16 code unit, splitting up the grapheme clusters and turning the doughnut emoji into two question marks.

Then the Swift String iteration returns grapheme clusters, with the doughnut intact.

If you noticed earlier, there were 12 characters in the NSString while String had 9. We expected each arrow to be another character, for a total of 11. But looking at the console we can see that the emoji, here represented by those two question marks, takes two 16-bit values. On the other hand, iterating over the Swift String prints just 9 grapheme clusters, one Character each.
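The three numbers can be checked side by side with the views String provides. This is a minimal sketch using only the standard library; the constant names are mine. The utf16 count is the value NSString’s length reports.

```swift
// Same string as in the post: doughnut emoji plus two combining arrows.
let yummy = "D\u{1f369}ugh\u{20d7}n\u{20ed}uts"

// Grapheme clusters: what String.count reports.
let clusterCount = yummy.count

// Unicode scalars: each combining arrow is counted separately.
let scalarCount = yummy.unicodeScalars.count

// UTF-16 code units: the emoji needs a surrogate pair, so it counts twice.
let utf16Count = yummy.utf16.count

print(clusterCount, scalarCount, utf16Count) // prints "9 11 12"
```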

When working with strings, and especially when converting between String and NSString, be careful. Due to Unicode clusters, they might not be as simple as they look.

Some Common Functions for String

That’s great theory, but how does it apply to strings, not characters? If you’re familiar with languages that treat strings as C strings or character arrays, you’re familiar with a few simple string manipulation functions. In BASIC I knew them as right$, left$, and mid$. I’ve created a simple extension to String that adds three of these common functions to help you better understand strings.

First make the extension

extension String{

}

String.Index is odd about numbering, and I like keeping consistent with arrays, so I’ll write a function to return a valid position. It takes a position counted from 1, converts it to an offset counted from 0, and makes sure we are in range. Out-of-range values are clamped to the first and last characters. I did this for speed in coding the rest of this. It would be better to return an optional Int and check for nil.

private func pos(position:Int)->Int{
    var pos = position
    if pos > 0 {pos -= 1} else {pos = 0}
    if pos >= count {pos = count - 1}
    return pos
}
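For comparison, here is a sketch of the optional-returning alternative. The name checkedOffset(forPosition:) is hypothetical and is not used in the rest of this tip. It returns nil instead of clamping, which also covers the empty string, where the clamping version would return -1.

```swift
extension String {
    // Hypothetical alternative to pos(position:): converts a 1-based
    // position to a 0-based offset, returning nil when the position
    // is out of range instead of clamping it.
    func checkedOffset(forPosition position: Int) -> Int? {
        guard position >= 1, position <= count else { return nil }
        return position - 1
    }
}

let word = "Doughnuts"
if let offset = word.checkedOffset(forPosition: 3) {
    // Only reached for a valid position, so the index lookup is safe.
    let index = word.index(word.startIndex, offsetBy: offset)
    print(word[index]) // prints "u"
}
```

The caller now has to decide what to do with nil, which is more code than clamping but makes out-of-range requests visible.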

I’ll need the position of the character I’m interested in. I made another function for that, using the offsetBy we already used. Since this is an extension of String, I use self for the object.

private func index(_ position:Int)->String.Index{
    return self.index(startIndex, offsetBy: pos(position:position))
}

For all the characters to the left of the position, I make a one-sided range ending at position to get a left string function.

func leftString(from position:Int)-> String{
     return  self[...index(position)]
}

In the extension, I use self to refer to my string. I get a substring on self by using a range for the subscripts on indices.

You’ll notice we get a weird error.

Substrings are not strings: the range subscript returns a Substring, while the function is declared to return a String. Convert it to a String and the error disappears.

return String(self[...index(position)])
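The difference between the two types can be seen outside the extension as well. This is a small standalone sketch; the names end, slice, and copy are mine.

```swift
// Same string as in the post.
let yummy = "D\u{1f369}ugh\u{20d7}n\u{20ed}uts"
let end = yummy.index(yummy.startIndex, offsetBy: 2)

// Subscripting with a range gives a Substring, a view into yummy.
let slice = yummy[...end]

// The String initializer turns that view into an independent String.
let copy = String(slice)

print(type(of: slice)) // prints "Substring"
print(type(of: copy))  // prints "String"
```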

For midString, a string starting at one character position and going for a length, I use a closed range of a start and end position. I calculate the end by adding the length, but still have to take 1 off.

func midString(from position:Int, length:Int)-> String{
    let endPosition = position + length - 1
    return String(self[index(position)...index(endPosition)])
}

Getting the rightmost characters is a little bit more difficult. rightIndex has to be counted from the trailing side, so I’ll find it by offsetting backward from endIndex:

func rightString(from position:Int)-> String{
    let rightIndex = self.index(endIndex, offsetBy: -pos(position:position + 1))
    return String(self[...rightIndex])
}

Now to test all this, I’ll add print statements to viewDidLoad in the class I’ve been working in.

print(yummy)
print(yummy.leftString(from: 3))
print(yummy.midString(from: 3, length: 3))
print(yummy.rightString(from: 3))

Run this and you get in the console:

I printed the full string first for comparison. The third from the left character is u. For the leftString, I print the first, second and third characters. For midString I print the third character and the next two characters for a total of three characters. The u happens to be the third from the right character too, so the rightString prints from the second u to the beginning of the string.

You can tweak this to your preferences. I just wanted to show you how to manipulate the string using indices. These three functions I added in the extension are not in Swift because there are a lot more powerful things you can do with Swift strings. I’ll be covering those in upcoming tips.
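As a point of comparison, the standard library’s prefix, suffix, and dropFirst methods cover similar ground. This is a rough sketch, not a drop-in replacement for the extension: suffix returns the trailing characters, whereas rightString above returns everything up to that position.

```swift
// Same string as in the post.
let yummy = "D\u{1f369}ugh\u{20d7}n\u{20ed}uts"

// First three characters, like leftString(from: 3).
let left = String(yummy.prefix(3))

// Skip two characters, then take three, like midString(from: 3, length: 3).
let mid = String(yummy.dropFirst(2).prefix(3))

// Last three characters.
let right = String(yummy.suffix(3))

print(left)
print(mid)
print(right) // prints "uts"
```

These methods also return a Substring, so the String conversion is still needed when a String is required.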

The Whole Code

You’ll find the code below for cut and paste. You can also find it on GitHub here, but in a slightly different format. The extension is in a playground file.

//
//  A Demo for iOS Development Tips Weekly
//  by Steven Lipton (C)2018, All rights reserved
//  For videos go to http://bit.ly/TipsLinkedInLearning
//  For code go to http://bit.ly/AppPieGithub
//🍩

import UIKit

class ViewController: UIViewController {
    var yummy = "D\u{1f369}ugh\u{20d7}n\u{20ed}uts"
    
    @IBOutlet weak var label: UILabel!
    
    override func viewDidLoad() {
        super.viewDidLoad()
        //yummy = "\u{1f369}"
        //yummy = "Bun\u{0303}elos"
        let nsYummy = NSString(string: yummy)
        
        let yummyChar = yummy[yummy.index(yummy.startIndex,offsetBy: 3)]
        print(yummyChar)
        
        let nsYummyChar = nsYummy.character(at: 4)
        
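        for index in 0..<nsYummy.length{
            let nsYummyChar = nsYummy.character(at: index)
            print(String(format:"%C",nsYummyChar))
        }

        for char in yummy{
            print(String(char))
        }

        print(yummy)
        print(yummy.leftString(from: 3))
        print(yummy.midString(from: 3, length: 3))
        print(yummy.rightString(from: 3))
    }
}

extension String{
    private func pos(position:Int)->Int{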
        var pos = position
        if pos > 0 {pos -= 1} else {pos = 0}
        if pos >= count {pos = count - 1}
        return pos
    }
    
    private func index(_ position:Int)->String.Index{
        return self.index(startIndex, offsetBy: pos(position:position))
    }
    
    func leftString(from position:Int)-> String{
        return String(self[...index(position)])
    }
    func midString(from position:Int, length:Int)-> String{
        let endPosition = position + length - 1
        return String(self[index(position)...index(endPosition)])
    }
    func rightString(from position:Int)-> String{
        let rightIndex = self.index(endIndex, offsetBy: -pos(position:position + 1))
        return String(self[...rightIndex])
    }
}

