Professional Documents
Culture Documents
I keep running into code that uses strcpy, sprintf, strncpy, _snprintf (Microsoft only), wcsncpy, swprintf,
and morally equivalent functions. Please stop. There are alternatives which are far safer, and they
actually require less typing.
The focus of this post is on fixed-size string buffers, but the technique applies to any type of fixed-length
buffer. If you dont use fixed-size buffers then this post is not relevant to you. Some people do use them,
sometimes for valid reasons, and this post is for them.
The dangers of strcpy and sprintf should require little explanation I hope. Neither function lets you
specify the size of the output buffer so buffer overruns are often a risk. Using strcpy to copy data from
network packets or to copy a large array into a smaller one are particularly dangerous, but even when
youre certain that the string will fit, its really not worth the risk.
n
The dangers of strncpy, _snprintf, and wcsncpy should be equally well known, but apparently they are
not. These functions let you specify the size of the buffer but and this is really important they do not
guarantee null-termination. If you ask these functions to write more characters than will fill the buffer
then they will stop thus avoiding the buffer overrun but they will not null-terminate the buffer. In
order to use these functions correctly you have to do this sort of nonsense.
char buffer[5];
strncpy(buffer, Thisisalongstring, sizeof(buffer));
buffer[sizeof(buffer)-1] = 0;
A non-terminated string in C/C++ is a time-bomb just waiting to destroy your code. My understanding is
that strncpy was designed for inserting text into the middle of strings, and then was repurposed for
secure coding even though it is a terrible fit. Meanwhile _snprintf followed the strncpy pattern, but
snprintf did not. That is, snprintf guarantees null-termination, but strncpy and _snprintf do not. Is it any
wonder that developers get confused? Is it any wonder that developers often do this:
lstrcpy is a similarly named Microsoft design defect that appears to behave like strlcpy but is actually a
security bug. It uses structured exception handling to catch access violations and then return, so in some
situations it will cover up crashes and give you a non-terminated buffer. Awesome.
Its also worth emphasizing that of the seven functions listed above only one is even plausibly safe to
use. And its not great either.
The solution is
We are programmers, are we not? If the functions we are given to deal with strings are difficult to use
correctly then we should write new ones. And it turns out it is easy. Here I present to you the safest way
to copy a string to an array:
The syntax is a bit weird since it combines templating on an integral value (instead of a type) with
passing an array by reference, both of which are unfamiliar to many programmers. For more
information on passing arrays by reference see this stack overflow article. Or, you can use the template
magic quite effectively without worrying about the details of how it works.
<note>Commenters correctly pointed out that strncpy followed by null-termination is not the ideal
implementation of strcpy_safe because it is inefficient (strncpy will zero all bytes to the end of the
buffer) and maybe cut UTF-8 characters in half. Fixing this is beyond the scope of this post, which was
supposed to be focused on automatically inferring the buffer size through template magic. So, dont
forget to implement YourCopyNFunction and maybe Ill post a version next time.</note>
I challenge you to use this function incorrectly. You can make it crash by passing an invalid source
pointer, but in many years of using this technique I have never seen a case where the buffer size was not
inferred correctly. If you pass a pointer as the destination then, because the size cannot be inferred, the
code will fail to compile. It only works with a static string buffer as destination (no std::string, or
std::vector) but different overloads can be made for those destination types.
I think that strcpy_safe is (ahem) a perfect function. It is either used correctly, or it fails to compile. It. Is
Perfect. And its only six lines. Five if you indent like K&R.
Because strcpy_safe is so tiny it just calls strncpy and then stores a zero it will inline automatically in
VC++ and gcc in optimized builds. If you want to reduce code size further you could write a non-inline
helper function (strlcpy?) that would do the null-termination and have strcpy_safe call this function. Its
up to you.
One could certainly debate the name maybe you would rather call it acme_strcpy, or
acme_strncpy_safe. I really dont care. You could even call it strcpy and let template overloading
magically improve the safety of your code.
Unicode
String truncation causes problems with UTF-8 encoding. If you want to truncate at a character (or is that
code-point I cant remember) boundaries then need to add some extra code to scan backwards to a
character boundary. Its not too complicated, but its outside the scope of this post which was focused
on using templates to infer array sizes.
Extrapolation
Similar wrappers can obviously be made for all of the string functions that you use. You can even invent
new ones, like sprintf_cat_safe. In fact, when I write a member function that takes a pointer and a size I
usually make it private and write a template wrapper to handle the size. Its a versatile technique that
you should get used to using. Templates arent just for writing unreadable meta-code.
String classes
Yes, to be clear, I am aware of the existence of std::string. For better or for worse most game developers
try to avoid dynamically allocated memory and std::string generally implies just that. There are valid
reasons to use character arrays (fewer allocations, better cache locality), even if those valid reasons are
just that youve been handed a million lines of legacy code with security and reliability problems in all
directions. strcpy_safe and friends are unique in that they let you improve code security and reliability
with a simple s/strcpy/strcpy_safe/.
As I say at the top, if you dont need to use fixed-length buffers then congratulations, this post does not
apply to you.
Summary
We started with this code two lines of error prone verbosity:
strcpy_safe(buffer, Thisisalongstring);
You should be using the second option, or explain to me why the heck not. It is constructive laziness of
the highest order. Doing less typing in order to create code that is less fragile seems like it just might be
a good idea.
One could certainly argue that silently truncating is bad if that is your policy then your wrapper
function can assert, log, or abort(). Those are implementation specific details. However we can all agree
that avoiding buffer overruns is progress.