You can find a video of this lesson on iOS Development Tips Weekly from LinkedIn Learning.
There’s a lot you can do with language easily in iOS and watchOS. The Apple ecosystem has a natural language processing (NLP) system built in, and it’s easy to use. Let’s look at language recognition.
Download the starter file. It’s a playground with a dictionary of sample sentences in a few languages, which I produced with Google Translate. I’m trusting Google Translate here. At least one of the non-Latin-script languages was garbled in copying, and I deleted it. I can’t be sure the others came through intact, but they will give us what we need.
The class we’ll use is NSLinguisticTagger. It has a class method to find the dominant language in text. Add this to your playground:
var language = NSLinguisticTagger.dominantLanguage(for: str[lang]!)
Run it and you get en, the standard language code for English.
You can try a few more languages too, even in other scripts. Try Spanish (es) or Hindi (hi). Chinese (zh) is interesting: there are several scripts and dialects, so the result for Chinese carries a qualifier telling you more about the language.
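To compare several languages in one run rather than changing lang by hand, you could loop over the sample sentences. This is only a sketch: the samples dictionary repeats two entries from the starter file, and detectedLanguages is a name I made up for illustration. It assumes an Apple platform where NSLinguisticTagger is available.

```swift
import Foundation

// Two sample sentences copied from the starter dictionary.
let samples: [String: String] = [
    "English": "Where is the nearest Pizza Restaurant? Can I get a Pizza Margherita there? Steve loves pizza Margherita.",
    "Spanish": "¿Dónde está el Pizza Restaurant más cercano? ¿Puedo conseguir una Pizza Margherita allí? Steve adora la pizza Margherita."
]

// detectedLanguages is a hypothetical name, not part of the lesson's code.
var detectedLanguages: [String: String] = [:]
for (name, text) in samples {
    // dominantLanguage(for:) returns an optional language code;
    // fall back to "und" (undetermined) if nothing comes back.
    detectedLanguages[name] = NSLinguisticTagger.dominantLanguage(for: text) ?? "und"
}

// Print in a stable order, since dictionary order is not guaranteed.
for name in detectedLanguages.keys.sorted() {
    print("\(name): \(detectedLanguages[name]!)")
}
```

Because the method returns an optional, the sketch supplies a fallback instead of force-unwrapping the result.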
That’s fun, and a great way to start automatic localization. But there’s more you can do with tagging. I’ll give you one more example: lexical tagging. This finds parts of speech, paragraph and sentence structure for you. You’ll need an instance of the tagger to do this:
let tagger = NSLinguisticTagger(tagSchemes:[.nameTypeOrLexicalClass], options: 0)
There are several tag schemes you can use. Two common ones are name type and lexical class. There’s also a combined scheme, nameTypeOrLexicalClass, which is the one I passed to the tagger above.
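Which schemes you actually get depends on the language. As a rough sketch, assuming iOS 11 or later (or the matching macOS release), you can ask the class which schemes it offers for a given unit and language code. The codes array and the lexicalSupport dictionary are names I chose for illustration, and the results may differ from device to device.

```swift
import Foundation

// Language codes to probe: the same kind of code dominantLanguage returns.
let codes = ["en", "fr", "hi"]

// lexicalSupport is a hypothetical name, not part of the lesson's code.
var lexicalSupport: [String: Bool] = [:]

for code in codes {
    // Ask which tag schemes are offered for word-level tagging in this language.
    let schemes = NSLinguisticTagger.availableTagSchemes(for: .word, language: code)
    lexicalSupport[code] = schemes.contains(.lexicalClass)
}

for code in codes {
    print("\(code): lexical class available = \(lexicalSupport[code]!)")
}
```

If a language reports no lexical class scheme, tagging parts of speech in that language is unlikely to return anything useful.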
Assign your string to the tagger:
tagger.string = str[lang]!
The tagger has a method, enumerateTags, which finds all the tags within a range of the string. To check the entire string, I’ll need a value for the full range.
let fullRange = NSRange(location: 0, length: (str[lang]?.utf16.count)!)
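The length comes from utf16.count because NSRange is measured in UTF-16 code units, not Swift characters. The two counts differ as soon as the text contains an emoji. Here is a small sketch of that difference, along with another way to build the same range without force-unwrapping. The sample string is my own, not from the starter file.

```swift
import Foundation

let text = "Pizza 🍕 Margherita"

// The pizza emoji is one Swift Character but two UTF-16 code units.
let characterCount = text.count
let utf16Count = text.utf16.count

// Covers the same span as NSRange(location: 0, length: text.utf16.count).
let wholeRange = NSRange(text.startIndex..<text.endIndex, in: text)

print(characterCount, utf16Count, wholeRange.length)
```

Using the character count for the length would cut the range short by one unit here, so the tagger would never see the end of the string.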
I’ll use the enumerateTags method, which has several parameters. The first is the range, for which I’ll use fullRange.
tagger.enumerateTags(in: fullRange,
Next is the unit to break the text into: the full document, paragraphs, sentences, or words. I’ll use words.
unit: .word,
Next is the scheme, which must match one of the schemes the tagger was created with.
scheme: .nameTypeOrLexicalClass,
You can set options, and I’ll remove whitespace and punctuation from my scan.
options: [.omitPunctuation,.omitWhitespace])
This method works a lot like a for loop, looping through all the units and running a closure for each one. I’ll set up the closure with the tag, the range of the word in the string, and a pointer to a Boolean value that you can set to true to stop the loop early.
{ (tag, range, stop) in
With my setup, the tag identifies the part of speech. I’ll check for nouns.
    if tag == .noun {
        let word = (tagger.string! as NSString).substring(with: range)
        print(word)
    }
}
Run this with English as the language, and you’ll get a set of nouns. Change the tag to .verb, and you’ll get verbs. Change the language to French, and you’ll get French verbs. Try Italian. Change the tag to .personalName. Change to Hindi, and you’ll see nothing but the language code. Not all languages are supported, and I’ve not seen a list of which languages work with tags and which don’t. For the languages that do work, there’s a lot more you can tag and learn about a sentence. Take a look at the documentation and the WWDC 2017 video for more.
<h1>The Whole Code</h1>
Here's the completed playground code for this lesson. You can <a href="http://bit.ly/NLPTaggingEnd">download it from GitHub</a>.
//: Playground - noun: a place where people can play
import UIKit
let str:[String:String] = [
    "English":"Where is the nearest Pizza Restaurant? Can I get a Pizza Margherita there? Steve loves pizza Margherita.",
    "Chinese":"最近的披萨餐厅在哪里?我可以在那里得到一份玛格丽塔披萨吗?史蒂夫喜欢披萨玛格丽塔。",
    "Spanish":"¿Dónde está el Pizza Restaurant más cercano? ¿Puedo conseguir una Pizza Margherita allí? Steve adora la pizza Margherita.",
    "French":"Où est le restaurant Pizza le plus proche? Puis-je avoir une Pizza Margherita là-bas? Steve adore la pizza Margherita.",
    "Italian":"Dov'è il ristorante pizzeria più vicino? Posso avere una pizza Margherita lì? Steve ama la pizza Margherita.",
    "Hawaiian":"ʻAuhea kahi Pizza Mea kokoke loa? Hiki iaʻu ke loaʻa i kahi Pizza Margherita ma laila? Paiʻo Steve i ka pizzaʻo Margherita.",
    "Hindi":"निकटतम पिज्जा रेस्तरां कहां है? क्या मुझे पिज्जा मार्गरिता मिल सकती है? स्टीव पिज्जा Margherita प्यार करता है।",
    "Japanese":"一番近いピザレストランはどこですか?そこにピザマルゲリータを手に入れることはできますか?スティーブはマルゲリータのピザが好きです。"
]
let lang = "Chinese"
var language = NSLinguisticTagger.dominantLanguage(for: str[lang]!)
let tagger = NSLinguisticTagger(tagSchemes: [.nameTypeOrLexicalClass], options: 0)
tagger.string = str[lang]!
let fullRange = NSRange(location: 0, length: (str[lang]?.utf16.count)!)
tagger.enumerateTags(in: fullRange,
                     unit: .word,
                     scheme: .nameTypeOrLexicalClass,
                     options: [.omitPunctuation, .omitWhitespace]) { (tag, range, stop) in
    if tag == .noun {
        let word = (tagger.string! as NSString).substring(with: range)
        print(word)
    }
}
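If you’d rather get the matching words back as an array than print them, the same call can be wrapped in a function. Treat this as a sketch rather than part of the lesson’s playground: words(in:tagged:limit:) and its limit parameter are names I made up for illustration. It uses the same scheme and options as the listing above, and it also puts the stop pointer to work by ending the enumeration once enough words are collected.

```swift
import Foundation

// words(in:tagged:limit:) is a hypothetical helper, not part of the lesson's code.
func words(in text: String, tagged wanted: NSLinguisticTag, limit: Int = Int.max) -> [String] {
    let tagger = NSLinguisticTagger(tagSchemes: [.nameTypeOrLexicalClass], options: 0)
    tagger.string = text
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    var found: [String] = []

    tagger.enumerateTags(in: fullRange,
                         unit: .word,
                         scheme: .nameTypeOrLexicalClass,
                         options: [.omitPunctuation, .omitWhitespace]) { tag, range, stop in
        // Convert the NSRange back to a Swift range to slice the string safely.
        if tag == wanted, let wordRange = Range(range, in: text) {
            found.append(String(text[wordRange]))
        }
        if found.count >= limit {
            // Setting the pointee to true ends the enumeration early.
            stop.pointee = true
        }
    }
    return found
}

let sentence = "Where is the nearest Pizza Restaurant? Steve loves pizza Margherita."
print(words(in: sentence, tagged: .noun))
print(words(in: sentence, tagged: .personalName, limit: 1))
```

Passing the tag as a parameter means you can switch between nouns, verbs, and names without editing the closure each time.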