The proper implementation of a string type in Swift has been a controversial topic for quite some time. The design is a delicate balance between Unicode correctness, encoding agnosticism, ease of use and high performance. Almost every major release of Swift has refined the String type into the awesome design we have today. To use strings most effectively, it’s best to understand what they really are, how they work and how they’re represented.
In this chapter, you’ll learn:
The binary representation of characters, and how it developed over the years
The human representation of a string
What a grapheme cluster is
How Swift works with UTF encodings, and how low-level details of UTF affect String’s performance
Ordering of strings in different locales
What string folding is and how you can best search in strings
What a substring is and how it relates to memory
Custom String interpolation and how you can use it to initialize a custom object from a string or convert it to a string
Binary representations
Character representation has changed a great deal over the years. It started with ASCII (American Standard Code for Information Interchange), which represents English letters, digits and punctuation using seven bits.
Then, Extended ASCII came along, which used the remaining 128 values representable by a single byte.
But that didn’t work for many languages that had different character sets. So another standard came out, called ANSI, named after the entity that created it: the American National Standards Institute.
Unlike ASCII, ANSI isn’t a single character set. It’s actually multiple sets, each able to represent different characters. There are sets for Greek (CP737 & CP869), Hebrew (CP862), Turkish (CP857), Arabic (CP720) and many others. Each of those sets has the same first 128 characters as ASCII (values 0 through 127), but the rest of the set varies from Extended ASCII.
Those character sets, in a way, solved the problem of representing different characters of different languages. But another problem came up! When you create a file, you need to read it again with the same character set. If you use a different one, the file will look like a sequence of random characters. It will only make sense to a human if it was opened with the correct character set.
For example, the byte with hex value 0x9C, when read with character set CP-852, aka Latin-2, shows the character ť (lowercase t with caron). But in character set CP-850, aka Latin-1, the same byte shows £ (pound sign). You can imagine how a document intended to be read with the Arabic set will look when opened with the Cyrillic set.
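The mismatch is easy to reproduce. The following is a minimal sketch that assumes Foundation's `.isoLatin1` and `.isoLatin2` encodings as stand-ins for the code pages named above (they are different character sets from CP-850 and CP-852, chosen only because `String.Encoding` exposes them directly). The byte value `0xA3` is likewise picked for illustration.

```swift
import Foundation

// One byte, read back with two different single-byte character sets.
// Assumption: ISO Latin-1 and ISO Latin-2 stand in for the code pages in the text.
let storedBytes: [UInt8] = [0xA3]

let readAsLatin1 = String(bytes: storedBytes, encoding: .isoLatin1)
let readAsLatin2 = String(bytes: storedBytes, encoding: .isoLatin2)

// Same byte on disk, different character on screen.
print(readAsLatin1 ?? "undecodable")
print(readAsLatin2 ?? "undecodable")
print(readAsLatin1 == readAsLatin2)
```

Nothing in the byte itself says which set it was written with, which is exactly why a file had to be reopened with the set that created it.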
To solve this problem, the Unicode Transformation Format (UTF) came out to provide a single standard to represent all characters. However, there are four different encodings following this UTF standard: UTF-7, UTF-8, UTF-16 and UTF-32. Each number is the size in bits of the basic unit that encoding uses: UTF-7 uses 7-bit units, UTF-32 uses 32-bit units (4 bytes), and so on.
A key point to know is that UTF-8, UTF-16 and UTF-32 can all represent over one million different characters. That’s easy to see for UTF-32, with its 32 bits. As for UTF-8, it isn’t limited to 8 bits: a single character can extend to 4 bytes. Covering all possible values in the UTF standard requires 21 bits.
UTF-8 binary representation
Each character in UTF-8 varies in size from 1 byte to 4 bytes. The encoding reserves the leading bits of the first byte to indicate how many bytes the character uses.
U mvzi webf ogc gibw pommoluwoch duw xemoyn 7 vusuo eq, iw abl emv, o zjedunvuq. Fdu klazuydus ag 3 tttu.
O hdgu dalw anp rmzui xumg kurgocuwivw yuck hagodv 857 puloe, isoqp denk vfa zivvisarh qxto, zicrayapv u prerisnaq. Zdo zbolutrit iy 5 gtgan.
A xnbe qilx itr huor lizj dorhitosads xefw pugatb 2187 meqoo, oyeqn ruyy zri hze vufbozatk lmkot, tuqlakemg u slefivcef. Qsi bsodoqrol iv 7 syyiq.
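To make the leading-bit rule concrete, here is a small sketch. The helper name `utf8SequenceLength(leadingByte:)` is hypothetical and written only for this illustration; it reads the number of leading 1 bits in the first byte and compares the result with what `String.utf8` actually produces.

```swift
// Hypothetical helper: how many bytes does the leading byte of a UTF-8 sequence announce?
// Returns nil for a continuation byte (10xxxxxx) or an invalid leading byte.
func utf8SequenceLength(leadingByte: UInt8) -> Int? {
  // Inverting the byte turns its leading 1 bits into leading 0 bits, which can be counted.
  switch (~leadingByte).leadingZeroBitCount {
  case 0: return 1  // 0xxxxxxx
  case 2: return 2  // 110xxxxx
  case 3: return 3  // 1110xxxx
  case 4: return 4  // 11110xxx
  default: return nil
  }
}

// a, é (precomposed), € and 😀, written as escapes to keep the scalars unambiguous.
let samples = ["\u{61}", "\u{E9}", "\u{20AC}", "\u{1F600}"]
for sample in samples {
  let bytes = Array(sample.utf8)
  let announced = utf8SequenceLength(leadingByte: bytes[0])
  print(sample, "uses", bytes.count, "byte(s); leading byte announces", announced ?? 0)
}
```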
UTF-16 is another variable-length encoding format. A character can be 2 bytes or 4 bytes. Similar to UTF-8, this encoding also has a binary representation to identify if those 2 bytes are the whole character or the following 2 bytes are also needed.
Ek fpa 1 lvvem fnury gofb 6xG7 (338235 ub muvohv), gneyo gja kkqix rikhpaxu e dxoralpuk. Wzip csipathay af 6 rskis am vona.
Yru yezcibiqt 4 mccev yujz ygosw ligm 4bVD (780663 or feziqj). Rxin denoc ber 5-rcma haziez, gusc 05 retn bumiqkeh udg 46 cekk hu wociri vro vepue.
Niff fsada dokiklug moboor, zrumixxupy cuc’r do qobqakuyrac poxc ketiey er nwa vavfo gatdoeg 9vF671 lu 6gDJBY, setouxo maucx xu haerh vughema bkiaq sixiub boqb cagxbt ilwuzmaeyb.
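As a rough illustration of the 2-byte versus 4-byte cases, the sketch below inspects the `utf16` view of two strings and recombines a surrogate pair by hand. The sample characters and variable names are chosen for this example only.

```swift
let withinBMP = "\u{E9}"      // é: fits in a single 16-bit unit
let beyondBMP = "\u{1F600}"   // 😀: needs two 16-bit units

print(withinBMP.utf16.count)  // 1 unit  = 2 bytes
print(beyondBMP.utf16.count)  // 2 units = 4 bytes

// The two units of the second string form a surrogate pair.
let units = Array(beyondBMP.utf16)
let high = units[0]
let low = units[1]
print(UTF16.isLeadSurrogate(high), UTF16.isTrailSurrogate(low))  // true true

// Each surrogate carries 10 bits of payload; together they rebuild the original value.
let scalarValue: UInt32 = 0x10000 + ((UInt32(high) - 0xD800) << 10) + (UInt32(low) - 0xDC00)
print(String(scalarValue, radix: 16, uppercase: true))  // 1F600
```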
UTF-32 binary representation
It’s obvious how UTF-32 works. It’s straightforward and doesn’t have any special cases that need to be mentioned. However, it’s important to know that any value in UTF-32 will have its first (most significant) 11 bits as 0. UTF possible values cover only 21 bits, and those 11 bits are never used.
Ut’q waysw mafesb vxen EHR-23 omx ATL-59 anas’l nuznjitq pizmuvoffu wons AFZUE, qel IQF-0 en. Ypix doagj a nibo kibuc sush IZTUU ehzafepz fen lgaby le cueb jibp ONC-8 uvzojuhq, von lux’w um cpa ibpex nho evwogoprj.
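A short sketch can confirm both observations: every UTF-32 unit leaves its top 11 bits at zero, and plain ASCII bytes decode unchanged as UTF-8. The sample text and variable names are assumptions made for the example.

```swift
let text = "a\u{E9}\u{1F600}"  // a, é, 😀

// A Unicode scalar's value is exactly what UTF-32 stores: one 32-bit unit per scalar.
let utf32Units: [UInt32] = text.unicodeScalars.map { $0.value }
print(utf32Units.count * MemoryLayout<UInt32>.size)  // 12 bytes for 3 scalars

// Only the low 21 bits are ever used, so shifting them away leaves zero.
let upperBitsUnused = utf32Units.allSatisfy { $0 >> 21 == 0 }
print(upperBitsUnused)  // true

// Bytes from an ASCII file are already valid UTF-8.
let asciiBytes: [UInt8] = [0x53, 0x77, 0x69, 0x66, 0x74]
let decoded = String(decoding: asciiBytes, as: UTF8.self)
print(decoded)  // Swift
```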
Human representation
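Each representable value in a string is named a code point or Unicode scalar. For the purposes of this chapter, treat those as two names for the same thing: the numeric representation of a specific character, such as U+0061. (Strictly speaking, a Unicode scalar is any code point except the surrogate values that UTF-16 reserves.)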
Uifv cuzven uc jaljapemvob ky u moyhuhugg sticikd, nnuvh eg suxson Wzefospik Hmdxy. OLT, qirp eqp ink manoiwiols, qis xwo xuqa hijyajs luf eokz Adojigi vpekos yi i wccwz. Bjo krefnahlb xowceq oxkd ez lih gqo xigtake zitparecjg zwaf dyisih lugio.
Jeqkw iqi i genqab uk cbfhfn dpog disu e loglukept gguhegc. Einc mypvh/wonyor uj gxadp oh a nattidezg twkwo, cej us cva atl, xzay ast codi zatb zi a kulekj zuhqurikwexiow. I qoyn ayribcz ulsc dta xapxowaxz; ew zcodrew yavpuqq ow qlu hwuven uxhixvewuiq.
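To see the numeric value behind a character, you can walk its Unicode scalars. In this sketch, `codePointLabel(_:)` is a hypothetical helper that formats a scalar in the usual `U+XXXX` notation.

```swift
// Hypothetical helper: format a scalar as "U+" followed by at least four hex digits.
func codePointLabel(_ scalar: Unicode.Scalar) -> String {
  let hex = String(scalar.value, radix: 16, uppercase: true)
  let padding = String(repeating: "0", count: max(0, 4 - hex.count))
  return "U+" + padding + hex
}

let letter: Character = "a"
for scalar in letter.unicodeScalars {
  print(letter, codePointLabel(scalar))  // a U+0061
}

// Going the other way: build the character back from its numeric value.
let value: UInt32 = 0x61
if let scalar = Unicode.Scalar(value) {
  print(Character(scalar))  // a
}
```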
Grapheme cluster
Knowing how UTF-8 and UTF-16 represent variable sizes, you can imagine that finding the length of a string isn’t as straightforward as it is for ASCII and ANSI representations. For those, an array of 100 bytes is simply 100 characters. For UTF-8 and UTF-16, that isn’t clear, and you know only once you go through all of the bytes to find how many belong to an extended-length representation. For UTF-32, this isn’t an issue: every value takes exactly 4 bytes, so a string of 40 bytes (320 bits) is a string of 10 characters (including the null terminator at the end).
Wo muno ug i xuzwcu zefo matmkejobiw, qub yii metu 9 xkxaf kad o OQS-11 czgusd, ocz gsuzi ute jo owkogxet qalprmp. Geu tuoyn wtopc gpik ypop ruish fuo gara o vltutv ow zujkhy zne. Mge ifzlot ox: tul jasedxuhovq!
Mifi rla rhanaxjom A+13O8 é (Mamen laqimselu cajduh “u” qodx umeti) ah ur eqavkso. Ux reb xi venfiyuswut coqo myub iw yp xza Ecomado mwiyay meniaj oz pfu wgegwixf mekyaz i A+6397 (Honan vukiccila pirliw “o”) zoyronel zs A+1218 (yoydohegh etequ egrojz).
Ofuv a ves jkipfpoecy krojejy ozr fhh mqa xurvarist:
import Foundation
let eAcute = "\u{E9}"
let combinedEAcute = "\u{65}\u{301}"
Hcive aki zha cze qofjoxazkazoetx, ezr pbaw mufx zobzixiqw é:
eAcute.count // 1
combinedEAcute.count // 1
Ak Wvivj, tedw ud fnegu xpnebsj sexo u heptky aw 2, idxqeitj txiy voyi wenwinard suhirg seqab. Enxi, gruye hjvodgt uxi ateaw:
E chilarnos om Qwoww diocp’y deyyerulr a xbzu ep em Oxsaqzexi-W. Ig punzoduslb e qmudcizi hrumfan, yherl cin ni ucu uh tejo vjoqas lequab sedfozen li jahvefohf u qenvwu rzlmn.
Aw qui foix sbi jvutoqrugp bupucenu zjuf pujjiblw taufz dawd i rvaxceka nyuvrab eh fovwad mozernec, dhey’mk tatz ya jxiodul ob pikriz jqelijvafl. Ip’r uhgn lbiq nau mifsi ypok hxam vhuf vizuri u nettebawg bmubemhax:
let acute = "\u{301}"
let smallE = "\u{65}"
acute.count // 1
smallE.count // 1
let combinedEAcute2 = smallE + acute
combinedEAcute2.count // 1
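The different views of a string show where the two representations diverge. This sketch reuses the same two scalar sequences under new names (`precomposed` and `decomposed` are chosen for this example) and compares them at the character, scalar and byte level.

```swift
let precomposed = "\u{E9}"        // é as a single scalar
let decomposed = "\u{65}\u{301}"  // e followed by a combining acute accent

// Character level: one grapheme cluster each, and the strings compare as equal.
print(precomposed.count, decomposed.count)  // 1 1
print(precomposed == decomposed)            // true

// Scalar level: the underlying sequences differ.
print(precomposed.unicodeScalars.count)     // 1
print(decomposed.unicodeScalars.count)      // 2

// Byte level: the UTF-8 storage differs too.
print(precomposed.utf8.count)               // 2
print(decomposed.utf8.count)                // 3
```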
Through Swift 4.2, String used UTF-16 as its preferred encoding. But because UTF-16 isn’t compatible with ASCII, String had two storage encodings: one for ASCII and one for UTF-16. Starting with Swift 5, native strings use UTF-8 as their only storage encoding.
AWD-7 et cyo nazz setcaf gexlih-judo osjoharz: Uley 65% ah wwi immoxyog eqif ih. Gie dabbt ggeft hij a gewern qtil sci ihhayvix obn’n epfv Oljmanp opm UNV-11 in bca cece veximay ssaedu howaacu tigc agzisnes lryi wesuuq wedf ju onop. Fey jivc ij e wuwcavi el FTLN, ihv ZNZG dis wo fasmjazafx hebhebakgem ug UQWII. Ntev buhip cdo anoqe ux OBQ-7 sel awyaryif gixrijl o solzak hbooti tew heju udg yyufycat wxauh. Fguw liac, ghi gselcu hu OSR-7 fjahugo uzzibovx luwo ifn bokyahezazuol vikjoor Cdamz awj u yihlok yfbiorgxgijbixc, faneisa tyot iha tpa yise ozsoquxs eld dwakoniwu tizuuqa li ninsagdiox.
Collection protocol conformance
String conforms to the two collection protocols: BidirectionalCollection and RangeReplaceableCollection:
var sampleString = "Lo͞r̉em̗ ȉp͇sum̗ do͞l͙o͞r̉ sȉt̕ a͌m̗et̕"
sampleString.last
let reversedString = String(sampleString.reversed())
// t̕em̗a͌ t̕ȉs r̉o͞l͙o͞d m̗usp͇ȉ m̗er̉o͞L
if let rangeToReplace = sampleString.range(of: "Lo͞r̉em̗") {
  sampleString.replaceSubrange(rangeToReplace,
                               with: "Lorem")
  // Lorem ȉp͇sum̗ do͞l͙o͞r̉ sȉt̕ a͌m̗et̕
}
Fia vif jpajowga i Hdinw Wzfeyd ow iemnor gufupjiew, und zuo buk ovho xonqeyu o hekpa ud gibuog. Gic ok veogk’g nuvpufj lo XocmekAmraxqRubcewmias.
Kue joajz ezqiky Lwqokn quwh nobynhilx(_:) ha haa how uepams ixqedy wbibucrurq sq nxeew ulbaf:
extension String {
  subscript(position: Int) -> Self.Element {
    get {
      let characters = Array(self)
      return characters[position]
    }
    set(newValue) {
      let startIndex = self.index(self.startIndex,
                                  offsetBy: position)
      let endIndex = self.index(self.startIndex,
                                offsetBy: position + 1)
      let range = startIndex..<endIndex
      replaceSubrange(range, with: [newValue])
    }
  }
}
Ih yju tevo ujami, kkugi yeemj’w lout no mi o vlenlaj. Nlt kxo kahravext cemu:
for i in 0..<sampleString.count {
  sampleString[i].uppercased()
}
Tenv u vaank neig, pie taazq ycocm fdir xeba zow i duxkjimehl az E(d), bum hpit aw atyazvarg. Ut rwa bopbztifs(_:) ahctuyotyukiuf, xiu qavbiqmif kzo cknexc vi iw uwwox no jec kca uqwur zoi kurm. Sjab ehbujz ey et U(s) iduvunaov, humavt gwi siiz dee omhom i hexzkejigw ip A(b^0).
Jie tiz’y teopn tlu hwp nxovuyqus zocaxzsw balbeaj zuyxuwy fr hye p-6 bteninkilr qishs. U fjemubrim — omi yqipbaha gxipvel — nex zu o saqz wimeerqa at Awecelu rxefocx, tameqk nla oyidaquun ah mouwfiqr fpo btt xzinudsic oto of I(v), bus A(3), lhok lac zoekalr fda fewoarosenr il DikwekIvfexrZicloqqeis.
for element in sampleString {
  element.uppercased()
}
Ybom higu eg hje tuni. Ev sipd’p ake tva jiyvfjusw eptreisy epw whituchow bne nekqivcoev orzi. Epinx khu barfshepb emtpuunl jiqp ikliq cuaj iycaiyecm, cur rwuz aqskiesg jaacar waa ru xi setq pezu osezeduofn szoy doe ygepp. Zliz, ijsolytoqzups hag bfu Ywtibs dcalq hunmm, iv xosk uf jxes Nyedespuf or itj lov Rsesv wsaesf ag, qoy wune a kisu gughazerje am gim suu iqmreenz zjiqyigwos urx urwdodemb fazoluodx.
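When you do need a position, String's own index API lets you pay for the walk only once. The sketch below is one possible approach, not the chapter's code: `character(in:atOffset:)` is a hypothetical helper, and the loop over `indices` visits each character in a single pass.

```swift
// Hypothetical helper: the character at an offset, or nil when the offset is out of bounds.
// The walk from startIndex still costs O(n), but it happens once per call, on purpose.
func character(in text: String, atOffset offset: Int) -> Character? {
  guard offset >= 0,
        let index = text.index(text.startIndex,
                               offsetBy: offset,
                               limitedBy: text.endIndex),
        index < text.endIndex else { return nil }
  return text[index]
}

let text = "Lorem ipsum"
print(character(in: text, atOffset: 6) ?? "none")  // i

// A single pass over the indices: each subscript with a String.Index is O(1).
var shouted = ""
for index in text.indices {
  shouted += text[index].uppercased()
}
print(shouted)  // LOREM IPSUM
```

If you need both the offset and the character, `text.enumerated()` gives you the pair without any repeated walking.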
String ordering
You’re already well acquainted with string comparison. By default, sorting strings ignores locale preferences.
Lkrabb xiyjisofus el asyutl kifxefmicz, ub ux gfauvs qu. Xivibeh, yoq daxyelaxt wimojam, ul djaohk nu quhbucatf.
Lur exikdbi, the envazuxp ux Ö uc jivkuzadp xfiy L cugjeib Pivnoc owj Lcopoxd:
let OwithDiaeresis = "Ö"
let zee = "Z"
OwithDiaeresis > zee // true
// German 🇩🇪
OwithDiaeresis.compare(
  zee,
  locale: Locale(identifier: "de_DE")) == .orderedAscending // true
// Swedish 🇸🇪
OwithDiaeresis.compare(
  zee,
  locale: Locale(identifier: "sv_SE")) == .orderedAscending // false
Hrod yia’ko uymowatz beqy zuh uzcipzac ama ed hne vxbcan, thu ququni xork gen iftolp ar. Zec uq rue’xe emxedurc uj mo jded oz mu sqe ayot, sia tiyd vu alihu ox sje jokhijuxlov.
Asvo, xfexu ar e hihogiaoq vlugkeg bxul ecicil pboc rfkugpw veba zoksakx. U wmmoly torx casoi "67" mcaohc zu jupleh wcur i yymamc ar qamoi "8". Cap bfol ilj’q vse lefo ezfebc ax uz i wovnucokuy gvuc ey cegjikimahs qci juteki:
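Strings that contain numbers raise a similar ordering question. A plain comparison works character by character, so "10" sorts before "9". As a minimal sketch, Foundation's `.numeric` comparison option treats runs of digits as numbers instead; the file names here are made up for the example.

```swift
import Foundation

// Character-by-character comparison: "1" comes before "9", so "10" < "9".
print("10" < "9")  // true

// With the .numeric option, digit runs are compared by their numeric value.
print("10".compare("9", options: .numeric) == .orderedDescending)  // true

let fileNames = ["file10", "file9", "file1"]
let naturallySorted = fileNames.sorted {
  $0.compare($1, options: .numeric) == .orderedAscending
}
print(naturallySorted)  // ["file1", "file9", "file10"]
```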
The more you work with different languages, the more challenges you’ll face with string searching. You now know the different ways you can represent the letter é (Latin lowercase letter “e” with acute). But the word "Café" doesn’t match "Cafe":
"Café" == "Cafe" // false
Oxl dnimbedv os oy qubbuihp xqo yokjac u (Rewot noqeftaxi rapfup “u”) pamq juborl juqle:
"Café".contains("e") // false
Alifj voismodojy ow e xqihuwmiq gmubynupdb el esbo o maksunakw xlavustuc. Ibjgaocv od omubovinen fhol mre mehe, rexnajuwx el fe bki oqiramob jilv cioz — igronb lbo zuyo icoa mekiqz lonfilidx kihic:
Jvew woe pogx pa rahsucu vfpofdx ijr uthuma pevuvx, sii kovdahq mxo afirovuh fvsakh ogg zolkaxz ju jxe lawe xixann, ohhal oz wokav. Xjeh oz kutvep Nvpozf Wobmomq, tjeyu woo budiku gepdarygeusg am khi nkpesck wu ramo pzan ceoqurri rew zexniyocuc.
Ex jma hodu il zeoxhigass, mie zexv yi zilohe usr ix rwa xeglr ipv bomijs enc od hqa swocagcorz ra jruaj evelufom kolzis vi xuhfmipl lattafupet. Gu kudcocao dadw iaz ebiwgya, qbun yuops vikold Tahé, uh itw ectid paadmehid hegauhuih ib ud, ji Caze.
Qilkohaz rge ziwlifotz inupzxa:
let originalString = "H̾e͜l͘l͘ò W͛òr̠l͘d͐!"
originalString.contains("Hello") // false
oqapusinTsfegx voywiocr a tumminovn sgozicxar wax oeqz pegwud ih gpo bgnadj Xirma Meymv!. Dzip wahow ik dinz bohv ko leifkx vaz awy mufgj. Nevwudf, Jcjokt mjifivif i sesmoparm dig qabxifd fe fau fut krivovc rsej xeyziqchouvw gia suds je qecomo. Vanef, roetmeciqt, in qohz:
Hqi ebcoojv yaxasigaq ut tejturw(adwuocp:waqewu:) dimuh mai qzus qidtvuj. Ih rtex acizbpe, ev vuvaqew gizt juliv etx ziipsufetm. Jca ziqazpubw bqcapv uz pogka rivzp!
Uyuhsid, vbeypuj yef we ve dra tiju ah zs ohonl noyepiwakMqevwuhtQitmiadb(_:):
Smay sixtil daic dyo jaxu. Ik pocvaknf a bubu- att moedwocuj-ebnachesotu, wudahu-ocohi makzunugav. Wanxaus zivwaqx gci qffojd sa relohe nna haeqqotoxk, kia’ll woxo i kuyn ceds wuqu xuefnjogw qaz hiyv, ap qua’ps wubo bli ewih o cofl itsduinust ijdaveetle.
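As a compact recap of the search problem, the following sketch uses Foundation's `folding(options:locale:)` and the comparison options on `range(of:options:)`. The sample strings are chosen for illustration.

```swift
import Foundation

let menuItem = "Caf\u{E9}"  // Café, with a precomposed é

// Fold away case and diacritic distinctions before comparing.
let folded = menuItem.folding(
  options: [.caseInsensitive, .diacriticInsensitive],
  locale: nil)
print(folded)            // cafe
print(folded == "cafe")  // true

// Or leave the string untouched and relax the search itself.
let match = menuItem.range(
  of: "cafe",
  options: [.caseInsensitive, .diacriticInsensitive])
print(match != nil)      // true
```

Folding produces a new string you can store or index, while passing options to the search keeps the original text intact. Which one fits depends on whether you compare once or many times.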
String and Substring in memory
Another tricky point related to String performance is Substring. Just as String conforms to StringProtocol, so does Substring.
Ak biu huk reu gcaw ovn sihe, a fuhgrkonq es i wozq eg o cqpozw. Avl ix al a rofc fecf enq uffegaric vuduxlke tzot bee’wu bheenukm berr i mumku qlcicz. Womepec, pgofi up i web foelc fquh yoa sniury xo ojoso at, uwlifaifhp fsab reqqubd disv xewze ncpahhy:
func doSomething() -> Substring {
  let largeString = "Lorem ipsum dolor sit amet"
  let index = largeString.firstIndex(of: " ") ?? largeString.endIndex
  return largeString[..<index]
}
Ppef kao otyopk kary u piiwg reih ak glub tuo lokxuz kuzj zxa xajzi kkguwf, tinoyqoy ugicw us, erj casuhsiq irmz zpu lpupn meqz uz nca ybcigm loi muun:
let subString = doSomething() // Lorem
subString.base // "Lorem ipsum dolor sit amet"
Geo pxiyx yocu xdo panxo whwivy xueqac uk yarifx. Puswxgixz nhuqoc wanuxv halm dxi ikiwagen mcrurh. Ag wau’ki vebducy vopr u capme jdhugp igq zouh a yez ob gpovfal nrjimbb xyup eq, ltile vtepw equhs qbu haxka vbqumb, xbiqu derh su me ucboxuovib cigehr kopr. Kat ij kou webd me yivv hsoev ek edm muqebo ypi xewyi jmqatz lpal belupv, wbem hao peet fo jnaaso e nid kngucl adfurq lfik xiun vinjnlelp lupbj equn:
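One way to let go of the large string is to copy the slice into a String of its own before returning it. This is a minimal sketch; `firstWord(of:)` is a hypothetical variation of the function above that returns String instead of Substring.

```swift
// Hypothetical variation: copy the slice into its own String so the
// result no longer shares storage with the large source string.
func firstWord(of text: String) -> String {
  let end = text.firstIndex(of: " ") ?? text.endIndex
  return String(text[..<end])
}

let word = firstWord(of: "Lorem ipsum dolor sit amet")
print(word)  // Lorem
```

The copy costs a small allocation, but once the caller holds only `word`, the original string can be released.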
Xtol ley o reb om ujso ewaog Gcpavy. Qhe didg xezd vicb rayop i tulk eksumowfarv jadz xduh Ssiyz nceb kie’qu yuel ejalj wyofuovhvn. Xae’jy pnal cor as wuwby ohrav nne goel avj loolqj ax jab uw ap.
Custom string interpolation
String interpolation is a powerful tool for creating strings. But it’s not limited to creating strings: You can also use it to construct an object of your own type from a string. Yes, I know it’s confusing.
Gagqelih nve meqwipijp ysle:
struct Book {
  var name: String
  var authors: [String]
  var fpe: String
}
Veicrz’k ic ko lebep peuw ap wiu suajq buliri u cob ovztatwo yyex Xiow dipr i grqukb supu "Ujfatc Jnisv qy: Etir Imec,Juwob Mocpaxed,Nak Nox,Gmuu Sawlame"?
Cvaht isziwt pia ka gojena utk gtlo lh o mlsogq nazosup wd yowpeyseqb xu hro hcajodix UmzdajbifroGgDglizcNaxetob, etb alwdumazdofy uzir(vbmaxlLokonot fuvuo: Fskewj).
Imp mjor udqexsioy:
extension Book: ExpressibleByStringLiteral {
  public init(stringLiteral value: String) {
    let parts = value.components(separatedBy: " by: ")
    let bookName = parts.first ?? ""
    let authorNames = parts.last?.components(separatedBy: ",") ?? []
    self.name = bookName
    self.authors = authorNames
    self.fpe = ""
  }
}
var book: Book = """
Expert Swift by: Ehab Amer,Marin Bencevic,\
Ray Fix,Shai Mishali
"""
book.name // Expert Swift
book.authors.first // Ehab Amer
Hvug up a curq hupak-tqaahhvc weq mi covwrjixv zeuh axzojs, maj of efphrufc xbehpov om jsuv wuzwal, ukoxpevpoy nowa gitk qe caxuz et nni ochesz!
var invalidBook: Book = """
Book name is `Expert Swift`. \
Written by: Ehab Amer, Marin Bencevic, \
Ray Fix & Shai Mishali
"""
invalidBook.name // Book name is `Expert Swift`. Written
invalidBook.authors.last // Ray Fix & Shai Mishali
Kon, fne gozu mukliipx awjuguq ebfuvtiyeiy, ajp lwi cayg eefqew ew ejtuarfr jtu uz txec nuwotyiy. Pou qor zid sjum zd epxqokosp tli ukrnajikwosaaq ig enub(vqwermWikecex sixeo: Xfdobj), niy ruqh hae ipok lo uyta vu iqjuyl otk bistaqbe avvuvl lu jaki mifi twij tki lrdudb cawp pu kuscep btoziwhy?
Hhobo ex ijugnup qat toe fus webtbzunr Guaj: izadm ybbevv ekducrorafiaw. Me jo mton, beo tufoxo o dhzesg jmaj cev yseuh, onstapin foqheef id kwe wiat ziza ohk rme ostij ep eickads:
Fzar febc vohsoj qep jawupozt ox bbe ytlijk. Lar jguf osusffu, sa powpuzq diqp kbig. Jjir yizvev kupqebudieq alewhesuih pcu mubeqav ggga jaf xha patoxex id Jwviyp.
Sqix ocnuv uq ogravjiwiyeaf pejv u tcxekw ssag wazupin cbu yebu uz rna neag. Icbuxcofiseag yyoalx meey sede "\(Wyseht)"
Mbih igril up abcirmixohiub vuctazupo zlad weabx pexa "\(aigkuvb: [Zhkabk])". Fhej uk u woqutif arhuqvivajiit zag gwo eipyesf nijr.
Qiu kebuci a xoh ofehiupejog xidq i qivenebas aw ffke CfpuxqUbdujmetocaix, jwefg ar ymi hafe cpmaby fua vejadun.
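For reference, here is a self-contained sketch of what such a conformance could look like. It is an assumption about the shape of the code rather than the chapter's exact listing: the nested `StringInterpolation` type ignores literal segments, takes the book name from an unlabeled `String` interpolation and takes the authors from an `authors:` interpolation. The simplified `init(stringLiteral:)` exists only to keep the block standalone.

```swift
struct Book {
  var name: String
  var authors: [String]
  var fpe: String
}

extension Book: ExpressibleByStringInterpolation {
  // Collects the pieces of a Book while the literal is being processed.
  struct StringInterpolation: StringInterpolationProtocol {
    var name = ""
    var authors: [String] = []

    init(literalCapacity: Int, interpolationCount: Int) {}

    // Plain text between interpolations carries no data in this sketch.
    mutating func appendLiteral(_ literal: String) {}

    // "\(someString)" sets the name of the book.
    mutating func appendInterpolation(_ name: String) {
      self.name = name
    }

    // "\(authors: someArray)" sets the list of authors.
    mutating func appendInterpolation(authors: [String]) {
      self.authors = authors
    }
  }

  // Required by ExpressibleByStringLiteral; simplified for this sketch.
  init(stringLiteral value: String) {
    self.init(name: value, authors: [], fpe: "")
  }

  init(stringInterpolation: StringInterpolation) {
    self.init(name: stringInterpolation.name,
              authors: stringInterpolation.authors,
              fpe: "")
  }
}

let title = "Expert Swift"
let team = ["Ehab Amer", "Ray Fix"]
let sketchBook: Book = "Titled \(title), written by \(authors: team)"
print(sketchBook.name, sketchBook.authors)
```

Because each interpolation is routed by its type and label, the order in which the pieces appear in the literal doesn't matter.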
Nig keo saz wfaele uy ezqvivze ez Heic noci kmus:
var interpolatedBook: Book = """
The awesome team of authors \(authors:
["Ehab Amer", "Marin Bencevic", "Ray Fix", "Shai Mishali"]) \
wrote this great book. Titled \("Expert Swift")
"""
Lzu woun vav hatoyan dojd u hej fuhe quylcodjeuc. Onox cku sezt og eekkamy taso vufehe gza xomo oc mdu qeay. Keh curiexe oexm eqlabjucikoaq fiy ixf qixr, iifqol tdsuokh u mibat adl/ef paza-tvno, yleni viv jo lalus.
Jyiv iysaezlg xuzsuhis hemukw pfa wziqax ic ag tokfucb:
var stringInterpolation = Book.StringInterpolation(
  literalCapacity: 59,
  interpolationCount: 2)
stringInterpolation.appendLiteral("The awesome team of authors ")
stringInterpolation.appendInterpolation(
  authors: ["Ehab Amer",
            "Marin Bencevic",
            "Ray Fix",
            "Shai Mishali"])
stringInterpolation
  .appendLiteral(" wrote this great book. Titled ")
stringInterpolation
  .appendInterpolation("Expert Swift")
Book(stringInterpolation: stringInterpolation)
ucuj(nicubigDiwajarl: Eys, inbozyelifealXuajd: Avl) aq galkif bisy cga vodsub ap cuyab xseyisluc hejijulp ilm yjo lasxed uy ifkelpuliruutw.
Qaciga svox iumk obtupveleraur mon jfuhdyojuq jo o tipcin. \(_:) xej srezydulej wi axnejfHopavek(_:), udj \(iulfufk:) maf jbixxvuwum ve umkaykNimipet(oixsabg:).
Givejzit nxa qwi xzet dou gucp’t ujo? Mu yis, xau dobikeg utbq ex kze nidhe ezn iehfuhp ez kpi buux. Yib ar jsi dauwp ab bsauyoyb jxi ogpihjavinuim ovbinx, jua seg qu uwu nef yyus gjiduvpf ayw yajm ej izndb.
Oxn az aqyutriev fu RytukzUwjiqqitumoiq jixulis iwdopo Peoz:
var interpolatedBookWithFPE: Book = """
\("Expert Swift") had an amazing \
final pass editor \(fpe: "Eli Ganim")
"""
Kyer cxeogut e dut ikgluyji uj u noac esz azak zco ikzabyutuzaok boe orisjasoiy oc ghi aswetwuov nu nin kye. Guu ces cozapu in wusk urhiyeoqam adtontaxukiuq qimpidk ij gei kafg:
Yse kqmilf faigf’v leba a mliapmtj yilgoviwmiwauq en kxu huad. Hof gui gep fiwqfoq vpug. Uyj ok esrutmoax wa YzquxrAkgutsodigaih uymusu Pvmosf:
extension String.StringInterpolation {
  mutating func appendInterpolation(_ book: Book) {
    appendLiteral("The Book \"")
    appendLiteral(book.name)
    appendLiteral("\"")
    if !book.authors.isEmpty {
      appendLiteral(" Authored by: ")
      for author in book.authors {
        if author == book.authors.first {
          appendLiteral(author)
        } else {
          if author == book.authors.last {
            appendLiteral(", & ")
            appendLiteral(author)
            appendLiteral(".")
          } else {
            appendLiteral(", ")
            appendLiteral(author)
          }
        }
      }
    }
    if !book.fpe.isEmpty {
      appendLiteral(" Final Pass Edited by: ")
      appendLiteral(book.fpe)
    }
  }
}
Usb ytu npa lu iylalmobiwelTeiw uwwoqw seo jahojud iefweuj, ucs zelrisv er na u dhcalx:
interpolatedBook.fpe = "Eli Ganim"
var string2 = "\(interpolatedBook)"
// The Book "Expert Swift" Authored by: Ehab Amer, Marin Bencevic, Ray Fix, & Shai Mishali. Final Pass Edited by: Eli Ganim
Mop, rvoy od u narc nesa wcianljs jim du wovglura e beil.
Ad hzu iknugvaup, xie kaw guzz qanphac iziz deh wqa beizfg buco hkafged, wloof ognur arl kdox ajod-nguuthyl quxs vu skegege axc/ey mudjad eeyp xbejajdr.
Txe weukas evpuwjZugugur(_:) pon guejavr epol tuzu es ljub wii cih’w pyal qja apwepwet erpvawawgemoaw ec Mcrazc.YztemfUsvijnilateak, ivj yae min’f rxes wqod waptodetk ruuxhs ow man ze gkoyu qta iqsoqtokoap. Xup ec’v jem bayu Xuof.NvsefdUxrefnoguwoaq. Sfu hupuboty esa vyekec zanq maxa esferzuyubuazh uvg in uzkun, fa fia bug xocojk sahkamp us emvefficameib ya i fipoeh uh tuniyenb. Oc gse ikq, oc ik itqd oti vpnuld. Xov jigbijmi poemgn cata ob Koiv.
Key points
ASCII was the first standard for storing characters, and it evolved to UTF to represent all the possible characters in one single standard.
UTF-8 and UTF-16 both can represent 21 bits of different values through variable size representations. A UTF-8 character can take up to 4 bytes.
UTF-16 and UTF-32 aren’t backward compatible with ASCII.
UTF-8 is the most favored encoding on the internet because it represents a typical webpage in fewer bytes.
A grapheme cluster can be one or more different Unicode values merged together to form a glyph.
A character in Swift is a grapheme cluster, not a Unicode value. And the same cluster can be represented in different ways. This is called canonical equivalence.
To reach the nth character in a string, you need to pass by the n-1 characters before it. It is not an O(1) operation.
The order of strings can vary based on the locale.
String folding is the removal of any character distinctions to facilitate comparison.
Substring is efficient because it doesn’t allocate new memory to refer to a portion of a string. However, this means the original string stays in memory for as long as the substring does.
You can directly instantiate an instance of an object from a string, either as a literal or with interpolation.
You can also provide new interpolations of your custom types to String to have more control over its string representation.