You are on page 1of 5

strncpy Problems

I keep running into code that uses strcpy, sprintf, strncpy, _snprintf (Microsoft only), wcsncpy, swprintf,
and morally equivalent functions. Please stop. There are alternatives which are far safer, and they
actually require less typing.

The focus of this post is on fixed-size string buffers, but the technique applies to any type of fixed-length
buffer. If you dont use fixed-size buffers then this post is not relevant to you. Some people do use them,
sometimes for valid reasons, and this post is for them.

The dangers of strcpy and sprintf should require little explanation I hope. Neither function lets you
specify the size of the output buffer so buffer overruns are often a risk. Using strcpy to copy data from
network packets or to copy a large array into a smaller one are particularly dangerous, but even when
youre certain that the string will fit, its really not worth the risk.
n

functions considered dangerous

The dangers of strncpy, _snprintf, and wcsncpy should be equally well known, but apparently they are
not. These functions let you specify the size of the buffer but and this is really important they do not
guarantee null-termination. If you ask these functions to write more characters than will fill the buffer
then they will stop thus avoiding the buffer overrun but they will not null-terminate the buffer. In
order to use these functions correctly you have to do this sort of nonsense.

char buffer[5];
strncpy(buffer, Thisisalongstring, sizeof(buffer));
buffer[sizeof(buffer)-1] = 0;

A non-terminated string in C/C++ is a time-bomb just waiting to destroy your code. My understanding is
that strncpy was designed for inserting text into the middle of strings, and then was repurposed for
secure coding even though it is a terrible fit. Meanwhile _snprintf followed the strncpy pattern, but
snprintf did not. That is, snprintf guarantees null-termination, but strncpy and _snprintf do not. Is it any
wonder that developers get confused? Is it any wonder that developers often do this:

// Make snprintf available on Windows:


// Dont ever do this! These two functions are different!

#define snprintf _snprintf

strlcpy and lstrcpy


strlcpy is designed to solve the null-termination problems it always null-terminates. Its certainly an
improvement over strncpy, however it isnt natively available in VC++.

lstrcpy is a similarly named Microsoft design defect that appears to behave like strlcpy but is actually a
security bug. It uses structured exception handling to catch access violations and then return, so in some
situations it will cover up crashes and give you a non-terminated buffer. Awesome.

Wide characters worse?


swprintf is a function that defies prediction. It lacks an n in its name but it takes a character count,
however it doesnt guarantee null-termination. Its enough to make ones head explode.

Wheres the pattern?


If you find the list below obvious or easy to remember then you may be a prodigy, or a liar:

May overrun the buffer: strcpy, sprintf


Sometimes null-terminates: strncpy, _snprintf, swprintf, wcsncpy, lstrcpy
Always null-terminates: snprintf, strlcpy
The documentation for these functions (man pages, MSDN) is typically pretty weak. I want bold letters
at the top telling me whether it will null-terminate, but it typically takes a lot of very careful reading to
be sure. Its usually faster to write a test program.

Its also worth emphasizing that of the seven functions listed above only one is even plausibly safe to
use. And its not great either.

More typing means more errors


But wait, its actually worse. Because it turns out that programmers are imperfect human beings, and
therefore programmers sometimes pass the wrong buffer size. Not often probably not much more
than one percent of the time but these mistakes definitely happen, and being careful doesnt actually
help. Ive seen developers pass hard-coded constants (the wrong ones), pass named constants (the
wrong ones), used sizeof(the wrong buffer), or use sizeof on a wchar_t array (thus getting a byte count
instead of character count). I even saw one piece of code where the address of the string was passed
instead of the size, and due to a mixture of templates and casting it actually compiled! Passing sizeof() to
a function that expects a character count is the most common failure, but they all happen even
snprintf and strlcpy are misused. Using annotations and /analyze can help catch these problems, but we
can do so much better.

The solution is
We are programmers, are we not? If the functions we are given to deal with strings are difficult to use
correctly then we should write new ones. And it turns out it is easy. Here I present to you the safest way
to copy a string to an array:

template <size_t charCount>


void strcpy_safe(char (&output)[charCount], const char* pSrc)
{
YourCopyNFunction(output, pSrc, charCount);
// Copy the string dont copy too many bytes.
//strncpy(output, pSrc, charCount);
// Ensure null-termination.
//output[charCount 1] = 0;
}

// Call it like this:


char buffer[5];
strcpy_safe(buffer, Thisisalongstring);

The syntax is a bit weird since it combines templating on an integral value (instead of a type) with
passing an array by reference, both of which are unfamiliar to many programmers. For more
information on passing arrays by reference see this stack overflow article. Or, you can use the template
magic quite effectively without worrying about the details of how it works.

<note>Commenters correctly pointed out that strncpy followed by null-termination is not the ideal
implementation of strcpy_safe because it is inefficient (strncpy will zero all bytes to the end of the
buffer) and maybe cut UTF-8 characters in half. Fixing this is beyond the scope of this post, which was
supposed to be focused on automatically inferring the buffer size through template magic. So, dont
forget to implement YourCopyNFunction and maybe Ill post a version next time.</note>

I challenge you to use this function incorrectly. You can make it crash by passing an invalid source
pointer, but in many years of using this technique I have never seen a case where the buffer size was not
inferred correctly. If you pass a pointer as the destination then, because the size cannot be inferred, the
code will fail to compile. It only works with a static string buffer as destination (no std::string, or
std::vector) but different overloads can be made for those destination types.

I think that strcpy_safe is (ahem) a perfect function. It is either used correctly, or it fails to compile. It. Is
Perfect. And its only six lines. Five if you indent like K&R.

Because strcpy_safe is so tiny it just calls strncpy and then stores a zero it will inline automatically in
VC++ and gcc in optimized builds. If you want to reduce code size further you could write a non-inline
helper function (strlcpy?) that would do the null-termination and have strcpy_safe call this function. Its
up to you.

One could certainly debate the name maybe you would rather call it acme_strcpy, or
acme_strncpy_safe. I really dont care. You could even call it strcpy and let template overloading
magically improve the safety of your code.

Unicode
String truncation causes problems with UTF-8 encoding. If you want to truncate at a character (or is that
code-point I cant remember) boundaries then need to add some extra code to scan backwards to a
character boundary. Its not too complicated, but its outside the scope of this post which was focused
on using templates to infer array sizes.

Extrapolation
Similar wrappers can obviously be made for all of the string functions that you use. You can even invent
new ones, like sprintf_cat_safe. In fact, when I write a member function that takes a pointer and a size I
usually make it private and write a template wrapper to handle the size. Its a versatile technique that
you should get used to using. Templates arent just for writing unreadable meta-code.

String classes
Yes, to be clear, I am aware of the existence of std::string. For better or for worse most game developers
try to avoid dynamically allocated memory and std::string generally implies just that. There are valid
reasons to use character arrays (fewer allocations, better cache locality), even if those valid reasons are
just that youve been handed a million lines of legacy code with security and reliability problems in all
directions. strcpy_safe and friends are unique in that they let you improve code security and reliability
with a simple s/strcpy/strcpy_safe/.

As I say at the top, if you dont need to use fixed-length buffers then congratulations, this post does not
apply to you.

Summary
We started with this code two lines of error prone verbosity:

strncpy(buffer, Thisisalongstring, sizeof(buffer));


buffer[sizeof(buffer)-1] = 0;

We ended with this code simpler, and impossible to get wrong:

strcpy_safe(buffer, Thisisalongstring);

You should be using the second option, or explain to me why the heck not. It is constructive laziness of
the highest order. Doing less typing in order to create code that is less fragile seems like it just might be
a good idea.

One could certainly argue that silently truncating is bad if that is your policy then your wrapper
function can assert, log, or abort(). Those are implementation specific details. However we can all agree
that avoiding buffer overruns is progress.

You might also like