Strings in Swift: Going back to BASICs

A few tips ago, we went under the hood with Unicode characters and their relationship to the Swift String type. For most, that’s great theory, but how does it apply to strings, not characters? When I started programming back in the 1980s, I had three string functions in BASIC: RIGHT$, LEFT$, and MID$. Let’s create a simple extension to String that will let you use MID$ using string ranges, then discuss RIGHT$ and LEFT$ equivalents, and how these get returned.

Download the exercise files. Head to the embedded playground and hide everything but the playground. Set run to manual if necessary.

You’ll find I started an extension. 

extension String{
}

I also added an example string. If you run the playground, you’ll find the string has a lot of grapheme clusters built from arrows and emoji.

Strings in Swift do not use integer indexes to access characters, because a single character may be a grapheme cluster made of several Unicode scalars. Unlike arrays, strings are ordered collections indexed by String.Index values, which you get by stepping a relative distance from one of two properties, startIndex and endIndex.
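
As a minimal sketch of what that looks like in practice, the snippet below reaches characters by stepping from startIndex and endIndex. The names word, third, and last are made up for illustration; startIndex, endIndex, index(_:offsetBy:), and index(before:) are standard library members.

```swift
let word = "D\u{1f369}ugh"          // D, doughnut emoji, u, g, h

// word[2] does not compile: String has no Int subscript.
// Instead, ask the string for an index two characters past startIndex.
let third = word.index(word.startIndex, offsetBy: 2)
print(word[third])                   // u

// endIndex is the position just past the last character, so the last
// character sits one step before it.
let last = word.index(before: word.endIndex)
print(word[last])                    // h

// count is measured in characters (grapheme clusters), not code units.
print(word.count)                    // 5
print(word.utf16.count)              // 6, the emoji takes two UTF-16 code units
```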

For my function I’ll need those string indexes. I made you a function to get those, returning an optional value. If we are out of range, it returns nil.  

func index(_ position:Int)->String.Index!{
        // Valid character positions run from 0 to count - 1.
        if position < 0 || position >= self.count {return nil}
        return self.index(self.startIndex, offsetBy: position)
    }
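
The standard library also has a bounds-checked call, index(_:offsetBy:limitedBy:), which returns nil instead of trapping when the offset would run past a limit. Here is a sketch of how a helper like the one above might lean on it. The name safeIndex is hypothetical and not part of the exercise files.

```swift
extension String {
    // Hypothetical helper: nil unless position is the offset of a character.
    func safeIndex(_ position: Int) -> String.Index? {
        guard position >= 0,
              let i = index(startIndex, offsetBy: position, limitedBy: endIndex),
              i != endIndex          // endIndex is past the last character
        else { return nil }
        return i
    }
}

let sample = "abc"
print(sample.safeIndex(2) != nil)    // true
print(sample.safeIndex(3) == nil)    // true
```

This version walks the string once rather than counting it first and then walking it again.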

For my midString method, I want a string starting at one character position and going for a length:

func midString(from index:Int, length:Int)->String!{

}

First I’ll check if my starting position is a valid one, returning nil if it is not:

if let startPosition = self.index(index){

}
return nil

String subscripts take ranges of String.Index, so I use a closed range from the startPosition to an endPosition. I calculate the endPosition by adding the length to the start, then taking 1 off, since a closed range includes its last position. That index is an optional too, so I’ll unwrap it with optional binding inside the first if let, this way:

if let endPosition = self.index(index + length - 1){

}

Then I should be able to return the characters in that range.

 return self[startPosition...endPosition]

Except that gets me an error. When you subscript a String with a range, you don’t get a String back but a Substring, and the function is declared to return a String. Substrings are cheap to create because they share storage with the parent string instead of copying it, but that also means a substring keeps the whole parent string alive in memory, so they are meant for short-term use. As soon as you are done slicing, instantiate the result as a String.

return String(self[startPosition...endPosition])
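
To see the two types side by side, here is a small sketch. The names parent, slice, and copy are made up for illustration.

```swift
let parent = "Doughnuts"
let start = parent.index(parent.startIndex, offsetBy: 2)
let end = parent.index(parent.startIndex, offsetBy: 4)

let slice = parent[start...end]      // a Substring that shares parent's storage
let copy = String(slice)             // a String with its own storage

print(type(of: slice))               // Substring
print(type(of: copy))                // String
print(copy)                          // ugh
```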

Now to test all this with a string with extended grapheme clusters:

print(yummy.midString(from: 3, length: 3))

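And I get the three characters starting at position 3: the g, followed by the h and the n with their combining marks.

I’ll change this to an invalid length: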

print(yummy.midString(from: 3, length: 35))

Which returns nil. That works as we wanted. 
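
One edge these tests do not cover is a length of zero or less. The closed range would then end before it starts, which traps at runtime. Here is a sketch of a variant that guards against that, using a half-open range so no "- 1" is needed. The name safeMidString is hypothetical and not in the exercise files.

```swift
extension String {
    // Hypothetical variant of midString: nil unless the whole span fits.
    func safeMidString(from position: Int, length: Int) -> String? {
        guard position >= 0, length > 0, length <= count - position else {
            return nil
        }
        let start = index(startIndex, offsetBy: position)
        let end = index(start, offsetBy: length)   // one past the last character
        return String(self[start..<end])
    }
}

let treat = "D\u{1f369}ugh\u{20d7}n\u{20ed}uts"
print(treat.safeMidString(from: 3, length: 3) ?? "nil")    // g, h, n with their marks
print(treat.safeMidString(from: 3, length: 35) ?? "nil")   // nil
print(treat.safeMidString(from: 3, length: 0) ?? "nil")    // nil
```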

Prefix and Suffix

Ranges work in strings to get you substrings. If you want the beginning or end of a string, you use prefix and suffix, which also return substrings. 

I can use

 print(yummy.prefix(3))

to get the substring D🍩u.

Or I can use 

print(yummy.suffix(3))

And get the substring uts.

If I overflow these, 

yummy.suffix(35)
yummy.prefix(35)

I get the full string, but as a substring. All of what I did here works on grapheme clusters. Try one with both:

yummy.suffix(5).prefix(1)

returns the h with its combining arrow, still as a substring.
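
The prefix and suffix behavior described here can be checked with a short sketch. The names snack, left, right, and middle are made up for illustration. The last line is one possible MID$-style alternative that clamps instead of returning nil, not the method built earlier.

```swift
let snack = "D\u{1f369}ugh\u{20d7}n\u{20ed}uts"

let left = snack.prefix(3)           // Substring: D, doughnut emoji, u
let right = snack.suffix(3)          // Substring: u, t, s
print(type(of: left))                // Substring

// Asking for more characters than exist clamps to the whole string.
print(snack.prefix(35).count)        // 9
print(String(snack.suffix(35)) == snack)   // true

// dropFirst plus prefix skips 3 characters, then takes up to 3 more.
let middle = String(snack.dropFirst(3).prefix(3))
```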

There are other ways of breaking apart a string if you want the diacritical marks to be separate from the associated glyph, but for most applications, this is all you need. 
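
One of those other ways is the unicodeScalars view, which exposes the letter and its combining mark as separate elements. Here is a minimal sketch, with marked as a made-up name.

```swift
let marked = "h\u{20d7}"             // h plus a combining arrow above

print(marked.count)                  // 1, one grapheme cluster
print(marked.unicodeScalars.count)   // 2, the letter and the mark

// Walk the scalars to see the pieces separately, as hex code points.
for scalar in marked.unicodeScalars {
    print(String(scalar.value, radix: 16))   // 68, then 20d7
}
```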

The Whole Code

Here’s the code for the playground if you don’t want to download it from GitHub.

var yummy = "D\u{1f369}ugh\u{20d7}n\u{20ed}uts"
extension String{
    func index(_ position:Int)->String.Index!{
        // Valid character positions run from 0 to count - 1.
        if position < 0 || position >= self.count {return nil}
        return self.index(self.startIndex, offsetBy: position)
    }
    func midString(from index:Int, length:Int)->String!{
        if let startPosition = self.index(index){
            if let endPosition = self.index(index + length - 1){
                return String(self[startPosition...endPosition])
            }
        }
        return nil
    }
}

//Now to test all this with a string with extended grapheme clusters
yummy
yummy.midString(from: 3, length: 3)
yummy.midString(from: 3, length: 35)

yummy.prefix(3)
yummy.suffix(3)
yummy.suffix(35)
yummy.prefix(35)
yummy.suffix(5).prefix(1)
