it's my first time at cppcon I've thought about going for a lot of years
and now kind of uh regretting that I didn't go
[Music] sooner hello everyone let's get started
um so my name is danil uh today I'm going to talk about a long journey of
changing standard sword implementation at scale so who am I I a senior software
engineer at Google I work on software efficiency for all our data centers at
Google so um if you want to talk about performance like come to me we'll have a
lovely chat so what's the agenda for today
today we're going to talk a little bit about the history of sorting like how it
was developed in the standard Library what's out there like and the research
behind it so why have we decided to change anything at Google uh a lot of
bugs so you'll love them uh implementation which implementation we
decided to choose why with prons and cons efficiencies and efficiencies so
and last but not the least what can you do to improve your sortings to improve
or maybe just to update the library um all right so couple reminders
so uh I believe if you wrote C++ for about a week maybe a month you wanted to
sort some elements and the Sorting is pretty much just the ordering of
elements today I'm going to talk about the calls which about like the calls St
sword stable sword range sword and um like similar partial sword
nth element so all this stuff um yeah so you can see that um
they might accept two or three arguments two arguments like for begin and end and
the third argument a user defined comparator that's a quite
important detail which we're going to talk later so this is where most
interest most interest interesting stuff comes from all right um many of you
should know what quick sort is but just um short reminder so quick sort is a
type of sorting General General sorting so pretty much you take any element all
those who are less than this element go to the left those who are more to the
right and and then you recurse on both parts and you can have different
strategies for taking this element maybe a median of three maybe just the middle
one maybe I don't know something out of some set doesn't really matter quick
sort is like the general um description of any of such
algorithm um and if we going to look at first STL version by Stan and Lee we're
going to see these like four or five lines so third line is like quick sort
Loop and then final insertion sort so how does quick sort Loop look like uh it
look like it looks like um like the strategy for partitioning it takes um
median of three elements and then then it does partitioning like as again less
to the left more to the right and um uh it does some thresholding so what what
does it mean it means that if you have in the range more than elements just do
as usual quick sort if you don't have more than these elements don't do
anything and the range will be unsorted STL threshold in the first
implementation was set to 16 uh because you don't want to go a lot of deep and
like um deep you want to you want to have you don't want to have deep
recursion like and when you have 16 elements you're going to go like four
times so other sorts might be useful in this case uh and uh yeah final insertion
insertion sort pretty much just Smooths out these inaccuracies for 16 elements
it just does insertion sort for 16 elements insertion sort is a type of
sorting where you just go element by element and put it into the place where
it should be sorted uh to the prefix um but it works quadratic
time um so this is pretty much how looks like the first sorting from uh like
first STL you can see that you have more and more sorted range and you see these
little little inaccuracies uh that uh for every 16 ele
that occur for every 16 elements and in the end uh it does this um final
insertion sort and just Smooths everything out and now you have a
complete sorted array so this is kind of the first implementation of sorting that
has been available in somewhat standard library of
C++ okay but standard Library Bol from STL so STL is kind of notion of standard
template library now we call it like just standard library of C++ and uh we
have the following um the following definitions for them so the Sorting can
take two arguments can can take three arguments and the only restriction that
it should it has that it should it should have like an Logan comparisons
this actually hasn't been always the case so for some time it was analog and
comparisons on average but in C++ 11 it was changed to have worst case scenario
Anan and quick sort uh as you might know but it's easy to understand worst case
scenario arrow is squared because you can just pick the worst element and only
eliminate in Rec cursion only one element which is not great but this is
possible so the um so quick initial quick sort no longer works as a standard
compliant algorithm so what has been done and GCC leaps did C++ as well as L
leaps C++ but that's in a couple slides introduced a notion called intort what
does intort mean so this is the intort Lo and you can see final insertion sord
is still there so it hasn't gone anywhere but like in this int like and
you yeah the the difference is only in these calls and interport loop has on
line eight this argument so it has argument of two times logarithm it's
base two of plus minus first so what does it mean so what does it actually do
um so this algorithm is is called intort have no idea when it was named like that
but uh whatever so it's almost like a quick swort as you can see it has the
same um almost the same arguments and if we suspect a lot of recursion calls so
for example I don't know we don't want to go quadratic otherwise like you will
um exhaust your stack so in this case just use Heap sword and this argument
actually is what just limits the number of layers you go deep into the recursion
uh Microsoft standard Library does pretty much the same um but with the
only difference that when it when it comes to 16 elements in the range it
just calls insertion sord whereas GCC leap C++ calls insertion sord at the end
uh like the in the GCC version might be a little bit faster because if you do uh
like linear accesses you might get better cach locality um which might
result in better efficiency of performance of standard sword all right
but if we think a little bit like about the worst case scenario like it's just
not great I mean um still does the job but it does it like slower and uh you
can see that uh performance of sorting can vary from input to input that's
usual in in this um area This research so and you can see in the end it also
tries to smooth everything out all
right um today I'm going to talk a lot about LM leap C++ because at Google we
use it primarily for pretty much everything that been built all products
all that are written in C++ are built with standard library from llm and but
they didn't do a lot for like they didn't do anything for a long time so in
2014 uh the issue was reported that standard sword has worst case quadratic
worst case but standard mandates and Logan however like it took only seven
years years kind of to fix it so what's wrong um like first of all why was that
um the answer is just simple actually anybody of you could do and just
Implement that the thing was nobody really cared a lot and um like that
that's the only factor and after that we implemented that we found that only 0.1%
of calls got into this Heap Sort fallback in production so um yeah this
is I mean Google might be Google production might be biased of course but
that's what we what we've seen so far um so if we talk a little bit about
lvm history and sorting lvm history it doesn't like it doesn't do only quick
sort it doesn't doesn't it didn't want to do only quick sort it only wants to
be a little bit faster than leaped C++ back then um it this is the slide from
lolm conference from 2010 I guess uh C++ 11 hasn't been adopted and uh what it
wanted to do it just wanted to adapt to S sort like to certain patterns for
example equal patterns ascending descending even odds some pipes like
maybe almost almost sorted and random of course random
random sorts are also important and as you can see Lam leap C++ really worked
well on um these patterns and the truth is such patterns are actually quite um
they are not rare they do exist a lot especially in source that are not big
for example I know 100 elements like a lot of cases we've seen where it was
sorting almost already sorted um array and like here comes the question a
very good question that we want to uh kind of like answer before we you know
do anything with the implementation what what what what what what does constitute
a good sort what is a good sword so first of all it should like to have
analog and comparisons that's B like that's written in the standard uh
probably it wants to recognize sorted patterns we don't want to sort a lot of
random we don't want to waste a lot of time on already sorted um Ray um it
should be kind of fast for Modern Hardware and fewer comparisons for heavy
sorting for heavy comparison sorting so assume you have sorting comparison which
I don't know reads from uh reads from chunks of data for like SQL database and
the fewer comparisons you do the fewer reads you do uh this like the hardest
part of all that is actually the third bullet because Hardware changes and we
need to change the implementation as well but how we going to change it so is
it okay just to change a CD sort I mean on one hand yes it's good but let's see
first of all why why do we why do we need other other patterns as well so lip
C++ on a sending already finished and you can see lips C++ on a sending does
some weird stuff uh which I know hasn't finished yet but we'll just wait and see
how it sorts the El all right now it finished so patterns are important
Hardware is important and Lan comparisons is
important so what can we do about it
um so uh patterns first of all uh I want to talk a little bit about how how lip
C++ has achieved that and what it does it when it tries to recurse on subarrays
into quick sort it what it does it does a little bit of insertion sort but
insertion sort up to the end will be quite bad it will be quadratic in the
worst case but so what it does it just sets the limit so it tries to insert at
most eight elements so it just goes sees okay this is the sorted sorted prefix oh
this is the element that not in this place let me move it and only move it
only only do such operations eight times so find eight elements at most which are
not sorted and put them uh to the right place once you put them into the right
place you can recurse on the part that is not sorted and everything will be
really nice and sorted at the end and it in the wave for example it recognizes as
sending p uh patterns like patterns where like almost everything is sorted
and it's good yeah for sorting and Hardware the
story is somewhat interesting because every now and then and actually research
has start like research for sorting has started I don't know since
1950s and continues even now and you can see a lot of stuff has been done for
example I don't know x86 seemd sword VQ sword vectorized portable quick sword
somebody wrote a faster sorted algorithm somebody said okay part and defeating
like sort is better boost library has four different sorts like what should we
do like which which sort should we choose and and why the answer is we
probably as C++ programmers should choose the default like call to STD sort
and the question then okay can we change it to some something faster because I
mean 15 like 10 years has pass have passed can we change to something faster
yes and like our main motivation was to save Cycles uh one more motivation was
that we've seen some production failures but that's like a little bit
later um so we tried uh I love working in Google because we have hundreds of
millions lines of code and I just can take some I just can change something
and just see how it fails and it failed quite a lot so um let's look at this
picture a little bit longer than for other slides um this is pretty simple
don't read too much uh in the middle you have um like a simple SQL script which
inserts some like idea one and like which inserts id1 and id2 topples into
the table and then what it does it just selects everything from this table order
by only First Column you can see on the left
POS pogress um just outputed like has the output which is different from my
SQL and why is that uh the truth is we just ordered by the first element and
second element doesn't promise any order when you have ties and we call this
problem ties and we have we had a lot of these problems let's look at some C++
code say you have a vector pair of vectors and you want to sort say I don't
know by first element and if first elements are equal then the order of the
second element is not defined because this sword like standard sword is
unstable sort um it doesn't promise any ordering of equal elements and a lot of
people wrote tests which pretty much said okay do something it does some sort
in the middle of the function and then expect some I don't know PR above Json
some golden test and this actually caused a lot of pain when we try to
change change the implementation yes um and we decided
okay um in order to change STD sort to something better and at the same time
we're going to improve testing because people who depend on Golden tests
probably don't do real good job in what they really test anytime we're going to
change anything they going to be broken and we wanted to find such cases we
wanted to fix them we wanted to make them more
robust so how can you do that the answer is pretty simple just uh do the
randomization so this is the randomization which we added before
sorting so just randomize elements and then do your usual sort uh we decided
not to include it in like optimized builds that's on the macro however like
we included this macro and find a lot of cases a lot of flacky tests a lot of
problematic golden tests um and uh um how how can we do that actually in the
standard library because uh finally finally enough in the standard Library
when you want to do something just I know some Shuffle you need to depend on
the header random and header random probably depends on header algorithm
because it needs STD Min STD Max element and something like that um we use
technique so-called address space layout optimization so um you have I don't know
a global variable which has its address if it's loaded randomly into memory it
can be a good seeding it can be a good seed for for for um linear I know random
generator and it's uh it's by default on Mac OS it can be turn like it's by
default on shared libraries on Linux so it was pretty good for
us all right uh we thought a little bit about randomization so you can randomize
STD sort before before doing the actual sort however there are other calls that
like deserve randomization as well for example SD nth element and a CD partial
sword a CDN element is an algorithm which um it has three arguments um first
and third arguments are begin and the end of the range and the second argument
takes a iterator to Something in the middle of this range and it promises to
to order the elements in the way that this middle element is put into the
array as if it was sorted for partial sort um first uh first elements up to
this point up to this iterator will be sorted and you can see uh fun fact that
yeah an element of begin plus 9 um like has the element sorted only for partial
sort which has begin plus 10 so this plus minus one is actually very tricky
and we found a lot of bugs when we decided to randomize the ranges which
where the order is undefined and other or for example for nth element the order
of both parts are undefined so anything can be there nobody just should rely on
that if we want to I know Implement a new algorithm for nth element like that
uh that should not be relied on a partial sort pretty much the same the
right part uh has has a property that the order is
undefined uh so how could you do that pretty simple just randomize range for
nth element randomize range before randomize range that comes up to the nth
element and randomized range range that comes after nth element strictly after
for partial sword these are two randomizations just before and the right
part so box uh probably the most interesting part and honestly uh I know
maybe I did a couple bugs as well first of all determinism golden tests and for
example this is a this is an example from click house which is written a lot
of in C++ uh and how how was the test fixed pretty much they extended the
ordering in some tests and moved the reference files here and there we've
seen a lot of failures in my SQL in progress so in my SQL we haven't seen
failures in progress because it's mostly c um but yeah a lot of other places as
well inside Google so um harder cases involved more
time for example assume you have you sort some data or a search and you cach
it somewhere then you have ties and they handled randomly and now we have lower
ke lower lower heat rate cash heit rate because ties can change something that's
on the border where you want to I don't know limit limit the result so you might
have different data depending on different sorts and your cash might
invalidate we couldn't afford that even if we change it only once because a
human change change only once and then throughout the roll out you will you
will pretty much dunk your cash which is not great for services and we tried to
find that we were pretty successful in it uh but that's these are harder cases
for goldens um sometimes yeah in order to
fix such goldens um you need to use probably stable sword stable sword
promises that the elements in which they were in the original array equal
elements it promises that equal elements in the original array will be in the
same order in the resulting array so it's um
deterministic and actually I wondered why in my computer science course uh we
talked a lot about like stable source and the answer is this you don't you you
want like stable sorts if you want determinism all right other bugs let's
try to write a median function um I mean it's pretty good so
we just do an elements to the median um just take the result of this median well
if there are even number of elements we should take um median from like minus
one and like take the average of these two values and the result will be just
um element in the vector from medium and medium minus one over two and we just
return the result um anybody does anybody see the bug in
this in this snippet all right quite a few
hands okay um yeah the bug in this snippet is
that you call nth element twice and the second nth element U might re rewrite
the element for example on the position Med like V of medium might not be like
the one that was here because an element doesn't doesn't specify any order on
both parts um but you can you can get lucky because an element actually puts
elements puts elements around and close to Total ordering
and um let me show you examples from I don't know from GitHub repositories so
this is exactly what happened in some repository with three 3,000
Stars U this happened in repository for m m plot I believe it's a famous library
for I guess python but I might be wrong might M plot lip so you can see they try
to um to use to compute inter quartile range but they called an element first
then an element second one and then they access the
access the the elements in the array and probably go probably they have incorrect
results um another element uh actually it's a quite another another bug
actually it's quite funny because there is a snippet that this was taken from
Geek forces and probably you know the site and yeah actually you can go now to
geek forces and there will be exactly this
bug um another bug another bug
it's Ableton um another bug medical image analysis that's
scary [Music]
uh uh another bug and I
can continue like infinitely so what you can do if you use lvm liap C++ please
use this macro macro in your builds if you want to randomize um randomize
sortings an elements partial sortings it proved to be really useful in finding
bugs fighting golden tests uh and like it improved the test how people think
about testing so it's probably the most important
part all right now let's talk a little bit more more complicated stuff in order
to sort or do anything complicated with ordering in C++ your elements and your
comparator should satisfy so-called strict weak ordering
what does it mean it contains of four properties so if you compare element
with itself the result is always false like
it's it's very logical a symmetry if one element is less than another then the
other way around cannot be true so trans transitivity if x is less than y y is
less than Z then X is less than Z and fourth property is a little bit more
complicated but it's still still needed uh what it means that the incomparable
elements like they considered equal among each other so for example like if
x is less than y is false and Y is less than x is false as well then we consider
these elements equal and if two elements are equal and another two elements are
equal then first and first and third should be also
equal um like interesting interesting fact and
I I've heard a lot um even from the standard Library um imp from the from
the people who Implement standard library that such checks are very
expensive so in order to check everything here for example for the
third for the third and fourth condition you need kind of to Traverse over
triples of all elements in the array and in order to validate that your input
like satisfies strict wig ordering we need like cubic time that's completely
inappropriate for analog algorithm as as sort so but let's think a little bit but
what are the risks what if we don't do that um on paper it's undefined Behavior
but I'm more interested as an engineer what happens in practice and in practice
um I mean best case scenario best case scenario you can sort and then SD is
sorted uh returns false well not great but at least
doesn't fail probably yeah you but probably it's not great um
like let's look at this example probably you see something uh on the left um it's
a very simple very simple code I just have a vector of integers which is
filled with minus one and then I sort them the way that to the left I want all
negative integers and to the right I want Total ordering of positive
integers and uh at first it seems an okay comparator but if uh but um num
alums if I put num Elms to 30 this just works
fine it just sorts and doesn't do anything so that's okay but if we put
301 elements it actually fails with segmentation fault uh and this is
actually very tricky because like I can prove that no like with this comparator
there there is no single case that when it fails with less or equal than 30
elements and this is actually quite interesting because okay if you don't
have in your range 30 elements then all good if you have more them something can
fail but I haven't written a wellth thought test for 30 plus elements just
to test sorting and we found a lot of attack vectors in this area so if you
have incorrect comparator then I can fail your sword with 31 elements like
with out of Bounce read pretty much so and this is one with this is
also one one one reason why we decided to change anything because at Google
like once in several months we found some problems like literally production
failures out of bounds reads segmentation faults scary stuff attack
vectors security like doesn't work and all that stuff we were scared like weeks
of debugging all that stuff and yeah um why does why does it actually can have
like why is it ever why is it even happening the truth the truth is when
you um do the partitioning in your quick sort you kind of wait for an element to
like to be false because for example if you start with a pivot and then you go
further you some at some point will find that there is a pivot element um and you
compare pivot with pivot and this should result in false and this while loop
should end but if it but if it returns a lot of truths
then it doesn't end and then then reads um out of bounds one solution of course
is just to have a bound check here um which we'll discuss a little like later
however well like if you know that you like your comparator is correct you
probably don't even think oh yeah I'll definitely find an element which um
which is less than pivot which is not less than pivot yeah
so but still uh the risks are actually quite huge so we have failing production
we have incorrect sortings um like but still the check actually is uh cubic
what what what can we do um first thing we use some maths to reduce the
complexity of such check um I'm not going to dive into why it works I'm just
going to tell you the algorithm how it works so first of all we need to sort
the range with some algorithm that doesn't fail um this is a little bit
harder than you might think but just yeah imagine any algorithm that just
finishes and correctly sorts when the comparative like satisfy satisfies uh
strict we ordering just sort the range if it's not sorted return false if it's
not sorted after you sort it then your comparison does not satisfy strict week
ordering um then find the prefix of all equal elements so just compare your
first element with the next one and until you find something that's bigger
uh check that these elements are equal among each other uh check that they are
less than all that remain and just remove them um it's a little bit like
not very obvious like why it works you can read the proof uh by the link above
and so but it works so still uh we have a squared algorithm for check and stct
reordering um and what we decided to do in the end is okay we have a sort we
have a quadratic algorithm that kind of tells you like an oracle does your
comparator satisfy St reordering or maybe it's not maybe it does not and we
decided to try out okay let's check just first
100 elements surprisingly it worked insanely Well because most sorts of
small like majority of cases are sorts on 100 elements um having I don't know
10,000 comparisons extra comparisons in debug builds like hasn't time like
hasn't resulted in any timeout across all tests that we had at Google and uh
uh yeah we were very happy actually to discover that limiting this check works
and finds most issues for cubic algorithm we thought about 20 elements
because I mean we if you have like 100 elements for a cubic algorithm that's
going to result around million uh comparisons which is probably not which
is probably a little bit like harder to do so
square like quadratic algorithm allowed us to do it like 100 elements all right
but there are some caveats to this algorithm so first of all the Sorting
should not fail but we've seen that it like you can uh you can give invalid
comparator some input just reads out of bounds um uh it only tells that the
strict ordering is met or not it doesn't give you the triple there is no
algorithm to give you a triple uh which uh like no algorith quadratic algor
which gives you the triple that violates a strict we ordering and arbitrary
comparator can do anything so like sorting can fail for example um
comparator Can fail if you give it I don't know two elements like second and
998th element and probably in your sorting you never call that but we we
might um third bullet like was not very relevant we haven't seen many of such
cases um the hardest was the first one so uh what we decided to do we
decided to extend the debug do the debug builds so one thing to kind of fight uh
out of bound re is to do like asan builds uh addess sanitizer build debug
builds to find most cases or to have another I know oracle which can give you
can point you to some problems in your program and yeah we decided to go with
this request uh we decided to go with this proposal to Elam Community which
was accepted pretty fast um so first thing we did is just out of bounce check
in cases where we partition the elements in quick sort that just gives us um the
opportunity to find um if it fails in for example like online 5 510 then your
comparator is not strict Rec ordering uh then we enabled St week
ordering checks for stable partial STD sort and it was rolled out in lbm 177
which probably which which release was probably a week ago and if you use lvm
and lip C++ you can use these command line Flags to enable strict quiek
ordering check in your projects all right um at Google it took
us around seven six to seven months to fix everything I wouldn't say that we
worked like full-time on fixing everything it was just more like a thing
that okay we uh I know I had I don't know 20 minutes 20 spare minutes we have
a thing called um I know it's it's a collaborative project you can kind of
jump in fix a couple targets and move on with your project so some Community
contribution and yeah but it took us overall six to seven months to fix to
find issues uh we found fewer of them but they were much
scarer so I'm going to list like um not the actual bar bugs but the family the
families of bugs so first bug is when people tried okay I have STD less I'm
just returning I like left hand side less than right hand side oh if I want
to sort sort in the other order what I do is just I need to negate it no
unfortunately not you cannot do that because uh IR reflex IR reflexivity
would be uh would not be met um the more complicated bug in this case was looked
like that so you you were reading a flag uh do you want to sort descending or
ascending and bug was only happening when the
desending is false but when it was true everything worked
fine um another issue if you have one element in your vector no comparator is
called because why would you like one element is already a sorted Vector like
Vector of one element is already s sorted Vector but if you have um I know
comparator which I know the references some pointer and you have no like and
you have an element which doesn't have any value and the referencing is just
null um then like you're the referencing null pointer that just results in in a
failure um another thing was um like okay we like another failure was to
check if something doesn't have any value and just to partition it let's
just okay I do have some um I do have some elements which I
want to partition to the left but they don't have any value and those who do
have value they will be like um ordered by their oper operator less um this is
also not correct and because comparator of the same element can be true and this
is exactly the thing that I showed you in godbolt where I can give you 31
elements and your sorting would read out of balance
um the correct fix here is to swap first if with a second if then it's a correct
it's a correct comparison um another another bug um
another family of bugs was uh people wanted people treated ties differently
from how they compared it so in this case um like they computed something
from the element some difference between days and then they returned absolute
value like with this differences and um you can construct example which
fail such sort which is also not great um assume you're Google Drive and you
comput something should be uh what should be deleted first and you delete
in incorrect order which is pro that's probably not
great um another another family of bugs was that like I don't know they were
comparators with thousands lines of code and they were writing and writing and
writing this comparisons and they were good good and good and in the end they
just okay just return true no unfortunately that's not also correct
you need to return false when you have completely equal
elements um another family of bugs was people wanted to compare users wanted to
developers wanted to compare some elements and sometimes the comparison
just doesn't result in a correct I would say it cannot compare it because I mean
something has gone wrong I don't know the networking issue or something like
that and what they decided to go okay decided to do is oh if we cannot compare
anything with just R and false unfortunately that's also that's also
exploitable that doesn't work and you need other mechanisms to kind of
understand um what what can go wrong maybe throw an exception from here maybe
do something else um another bug which is um might be
less obvious to people uh that sometimes you want to okay uh I want to sort some
floats but I want to like some some floats should be equal I mean I don't
know I did some ranking and if the the abson absolon comparison results and
true these are probably like they have the same ranking so I would just return
by ad however this is not a correct sorting because in this almost equals
they do compare floating numbers by some Epsilon and a can be equal to a plus
Epsilon um a plus Epsilon can be equals can can can be equal to a plus 2 Epsilon
but a is not equal to a plus 2 Epsilon because 2 Epsilon might be more than
they they compare their floats and this results just in Incorrect sortings and
unfortunately transitivity of incomparability is lost here and um this
is actually probably the the bug that we Face the most so a lot of uh a lot of
developers thought that it's okay to I mean it's generally good rule to compare
floats by Epsilon so you don't want to say okay this less than this or just
looking at they like matisa and exponent however yeah in sorting it might result
in something bad um another another issue which which
probably I don't know happens from time to time on heren news or whatever um if
you sort the vector of floats it can be undefined Behavior why um floating
numbers can be none not a number U not a number uh numbers by I
e754 this is a very scary name for me but doesn't matter uh it always returns
false so X is less than none returns false none is less than x returns false
by uh incomparability it means that X is equal to none and it means that every
element is equal to none but not all elements are equal among each other so
it means that if you have nuns in your container which you want to sort
unfortunately it might result in incorrect
sorting um this is a very funny bug this is a very another funny bug so we found
that um so on the left uh you have a vector of floats and you want to sort um
from begin to end but with a greater and you may debug integers and on the right
I put w-w everything so give me all warnings
that you have and doesn't give you any warning so you want to sort Vector
floats with a greater comparator uh however you might get you might misspell
float you might forget this is a vector of float so actually you might want to
change the type of vectoral float and nothing will give you a warning so even
though C++ is a type safe language sometimes I wonder how such things even
happen um a lot of bugs in open source as well so this is some Android um yeah
this is the bug of the case where if all elements are equal to CFA regag then
this might result in in a failure uh BFB Trace same bug 5,000 stars um another
repository which just returns something less than equal for some reason I don't
know uh chromium web RTC also very interesting bug um you can exploit it
but it already has been fixed so um yeah so as you can see I can
like I could like give more examples here but um I don't have much time and
let me talk about interesting stuff which you might never consider what you
can do with your sortings um this is a snippet of code which might be hard to
read and but you don't need to to understand what gas means or candidate
or num solid whatever all you can see that here that these elements are
changed along the way of like calling anth element or STD sort so this these
elements this element is changed sometimes and then you kind of order
rate so you can construct I don't know comparator which self
modifies some array and return sorting by giving you I don't know it just um
let me put it this way the algorithm gives you some elements which it wants
to sort and you can trick this algorithm for example in the worst case scenario
and this is the snippet where you can trick I don't know SD sort or nth
element to to be the worst case scenario always um this is somewhat artificial I
would say um you don't you probably never write comparators which modify the
array like along the along the Sorting but it's valid it's it's okay it
satisfies strict quick ordering and however sdn element will always result
in quadratic worth case for that and uh this there is an issue in GitHub uh
which provided a little bit of like talks and comments like like is it
actually valid do we need to fix anything
and nth element requires linear complexity on average however we have a
comparator where every time you call the sord on any range like it results in
quadratic Behavior because you kind of can trick an element to be quadratic
that's also like a funny book it was discovered
um but like long long time ago pretty much in the case how you can you find um
problems and how can you find problems in your quick
sorts all right um some sort in trivia uh can we sort 1.02 1.01
1.0 probably yes yeah I I don't see any problems all right can we sort 3.0 none
and 4.0 I explained no because um none thinks that it's equal to 3.0 and 4.0
and 3.0 is not equal to 4.0 can we sort none 3.0 and
none surprisingly the answer is yes because all elements are equal and there
is no I don't know 4.0 which is equal to 3.0 so uh even though your vector might
have nuns you might get lucky and you can't sort this if you have like nuns
and some other element which um which is equal to all other elements so for
example if you add a couple 3.0 here that also going to be
fine all right um last last section what are we changing our sorting algorithm
with so we made cleanups that result all golden tests fixed all bugs most bugs on
strict week ordering so now we are ready to change something uh we decided to
choose a mix of block quicksort and PN def pattern defeating quicksort um let
me briefly talk about what what what they do um the necessity actually comes
from the partitioning algorithm and partitioning algorithm
has these two while Loops these two while Loops are data dependent and
Branch pred and Branch predictor like doesn't do a very good job on finding um
like on predicting these branches because I mean you have random data and
it cannot predict anything so we saw a lot of branch mispredictions and sorting
so what we decided to do we decided to use a technique called block quick sort
so what we do we just store the results in a bit set pretty pry much in an
integer of 64 bits uh from left and right uh then what we do we just find
with uh count trailing zeros uh call uh like the bits uh that um need to be
swapped we lower the bit lower um these bits from one to zero um these are just
one instructions on a x86 and just swap these elements this help to reduce
branches a lot and another change that we decided to do which is more relevant
to PN defeating quick sort is soal Takis nther so when you want to find this any
element which partitions quick sort it's called pivot um you take nine elements
find the median in every three and find the median out of these three medians
that's called n Takis nther and uh this proved to be really good for Ren data
and like reduces the number of comparisons
overall um yeah this is godbolt which gives us that yeah block quick sort is
really nice to vectorized uh when you compare a lot of
integers um so which results do we have on integers on random integers we have
up to 50% results uh on some other patterns like ascending descending we
regress but just by a bit uh but still like when you are like you have an
ascending array if you sorted like like NC per element or 1.03 NC per element
that's probably doesn't matter much um on more interesting benchmarks like
string strings because strings uh cannot be vectorized properly and you have
pointer in directions from array to Strings and so on we have around like
10% and um all right um some thoughts so we're heading towards the end just a
couple minutes more um C++ is a weird language in a way that our default
sorting is unstable other languages like python rest and Java have stable default
sorts so if you if you call do sort in python or rust is just a stable sort it
doesn't have any problems with golden tests and it has like specific like sort
and stable calls uh for C++ it's the other way around it's CD sort as a St
and you need to call stable SW kind of to stabilize
things um another interesting question why do
you even Implement some algorithms that can read out of bound why shouldn't we
just be a little bit safer U from the standard Library standpoint um good
question probably we should um um fun fact um I heard a lot
that Rangers sort and overall Rangers wanted to have one argument a lot uh
like just container argument um that because because if you have two
arguments which take the beginning and the end of the range they might be from
different containers and there is no warning against that so I we found
literally zero bugs where we had different um iterators in a standard
like two argument standard sort maybe we're just lucky maybe it's not but I
don't think that this argument for migrating from STD sword SD range of
Sword reduces a lot of bucks um C++ 20 allows us to be allows
us to make fewer bugs we return from parator spaceship the ordering that we
need um it works much better you don't need to write your own comparators which
might be buggy so what are what what what are the
results of all that so overall the whole effort we call a comparator sanitizer so
we find golden tests we find broken comparators and it prevents from out of
bound reads prevents prevents exploits I believe it it deserves a name of
sanitizer uh we saw in production around 78% Improvement not 50 as we've seen for
random sorts uh for random integer sorts well that's we expected a little bit
more okay I'll take 78% Improvement in sorting and reduced a lot of Branch
mispredictions we gave this capacity U Branch predictor capacity to something
else uh fixed literally thousands of bugs for the HMS law um like my take of
all that that even simplest things are hard to do right when I started this
project two year ago I thought all right I mean I I would finish that in six
months no it took around two and a half years definitely not fulltime I was like
on and off of this project this was fun to do okay what can you do um use these
command line Flags if you use LEP C+ Le leap C++ uh because um they enable you
um randomization checks and strict week
ordering checks
um um probably you can make randomization for other algorithms
probably you want to um make randomization for standard partition or
SD make hip um I was a little bit lazy to do that um like we probably can do
strict week order and checks for nth element I didn't implement it for nth
element because in the algorithm that I DES in the quadratic algorithm that I
described you need to sort the range probably you don't want sort the range
kind of to do anti element but that might be okay for the back
build what can you do else um remember y today talked about invalidation of
vector iterators well um I don't know there can be ideas which we which were
very successful for strict with ordering checks let's just in debug build
relocate Vector on first 100 pushbacks we'll find a lot of bugs with dangling
references dangling iterators uh probably it will help a a
lot um these are the links that you can read uh my blog where I described all
this effort lvm request for comments sort and change itself uh fabric review
in llvm um here are my details if you want
to get in touch please do um that's it I'm happy to take your
[Applause] questions
yes if you if you have questions please go to the
microphone uh thank you uh was very
interesting I believe that the reason you don't see an issue is greater and is
that leap C++ disables warnings in Stand Library hers yeah probably probably
you're right yeah yeah I know about that thank you oh I also have a suggestion
for that it's more of a comment but um instead of stood grer in use stood
greater uh that colon element type yeah yeah that also yeah that also
works but yeah finally enough second question
about this this weird bug okay thank you for the great talk uh you mentioned
finding a lot of bugs uh regarding in like improper use did you ever find any
attacks against Google Google code kind of exploiting this D or so or um okay
that's an interesting question even if I knew about some something I believe I
wouldn't be allowed to talk about them so but I don't think that I know about
anything thank have you looked at what happens if
a St map gets one of these bad compilat um I'll say just question
thanks um this is actually my next kind of plan for this adventure let let's try
not only to get to the algorithms but also like to um containers uh I believe
for standard map standard set you might get weird results because well these are
trees these are red black trees and if you mess up with red black trees you get
like you might get I don't know linear trees or something like that I might
suspect that if you if you supply some comparator which is not great it might
result in something in something very long so yeah I do think I do think you
can exploit that right now as well I guess there are no more questions
thank you so [Applause]
much