Everyone, when they're starting out on the Arduino and similar boards, learns to use the String object for working with text. Or they think they do.
Well, you should forget all you think you have learned about using Strings on the Arduino, because it is all wrong.
Strings are a bit of a tricky area on the Arduino. The String object was created to make working with blocks of text easier for people that don't really know what they are doing when it comes to low level C++ programming. However, if you don't know what you're doing when it comes to low level C++ programming then it is very easy to abuse the String object in such a way that it makes everything about your sketch fragile and unstable.
So I'm here to tell you firstly why you shouldn't use String objects, and secondly, if you really must use them, how you go about using them properly.
Ok. First a little bit of computer theory - especially to do with memory.
The Arduino's RAM is split up into different chunks for different purposes. There's a chunk where all the global and static variables are stored (aka the BSS and data areas). There's the stack, where local variables created within functions are stored, and finally there's the heap, which is where dynamic variables are stored (more on those in a moment).
If you want to know more about how these chunks of memory relate to each other you can read more on Wikipedia.
The main part we are concerned with here is the heap.
So let's describe the stack and the heap so you can grasp the difference.
Imagine you have a load of coins. Each one represents a variable in your program. Local variables in functions are placed on the stack. As its name suggests this is like a single stack of coins. You place a coin on the top of the stack, and you can remove a coin from the top of the stack. Variables are placed on the stack, and the stack grows. Variables are removed from the stack, and the stack shrinks. Just like placing coins on top of the stack of coins and then removing them. You have to remove them in the opposite order that you placed them on. If you tried taking a coin out from the middle of the stack the whole stack would fall over.
Now the heap is completely different. Again, you have a load of coins. The coins are of different sizes. Some big, some small, and maybe you even have some notes (if you're lucky). The heap is more like you are laying the coins on the table side by side in a line. You can place a coin at the end of the line, and you can remove coins from the middle of the line. If there is a gap in the line big enough for you to fit a new coin in then you can place it in there. If the coin is smaller than the gap you end up with a tiny gap between that new coin and the next one. That gap's too small to fit another coin into, so the next time you want to place a coin down it has to go on the end of the line. After a while of adding and removing coins of different sizes you end up with a line of coins full of gaps that's much longer than it needs to be.
This is called Heap Fragmentation and is a real problem when you only have a very limited amount of memory. All those gaps in the heap are wasted memory that is very hard to re-use.
Dynamic variables are variables that are created by asking the memory management routines to give you a chunk of memory to work with. Once you have finished working with that memory you are supposed to hand it back to the memory management routines so that it can be re-used by another part of your program. The main functions used here are malloc(), free() and realloc(). The first asks for a block of memory, the second gives it back again. The third says "This block of memory I have is too small. Give me a bigger one instead".
When you ask for a block of memory the memory manager tries to find a hole in the heap that it can use for your request. If there is one it will use part or all of it. If there isn't it will add your block to the end of the heap, making the heap bigger. If you ask for a bigger amount of memory with realloc() and there isn't room where your current allocation is, it will move your allocation elsewhere, leaving a hole behind.
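To make that a little more concrete, here is a minimal sketch of the three calls working together. The helper names duplicateText and appendText are made up for this illustration and are not part of any Arduino library. It only shows the shape of the calls, not a recommended way of handling text.

```cpp
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: copies `text` into a freshly malloc()'d block.
// Returns NULL if the heap cannot satisfy the request.
char *duplicateText(const char *text) {
    size_t needed = strlen(text) + 1;            // +1 for the terminating 0
    char *block = (char *)malloc(needed);
    if (block != NULL) {
        memcpy(block, text, needed);
    }
    return block;
}

// Hypothetical helper: grows `block` so that `extra` can be added to its end.
// realloc() is free to hand back a different address, in which case the old
// location becomes a hole in the heap.
char *appendText(char *block, const char *extra) {
    size_t oldLen = strlen(block);
    size_t extraLen = strlen(extra);
    char *bigger = (char *)realloc(block, oldLen + extraLen + 1);
    if (bigger == NULL) {
        return NULL;                             // the original block is still valid
    }
    memcpy(bigger + oldLen, extra, extraLen + 1); // copies the terminator too
    return bigger;
}
```

Whatever comes back from these helpers has to be handed to free() once you are done with it. Forget that and the memory is never returned to the heap.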
So you can see why dynamic memory allocation can be a problem in systems with very small amounts of RAM.
And the String class uses dynamic memory in abundance.
The biggest problem is that when you perform many common operations with Strings you inadvertently create new String objects that you don't need. Every one of those potentially creates holes in the heap.
For instance, take this simple snippet:
String hi = "Hello";
String name = "Fred";
String greeting = hi + " " + name;
How many Strings do you count there? Nope, there are four, not three. You have the String "hi" which has allocated RAM to store the word "Hello", a second one that stores the name "Fred", and a third that stores the result of adding the others together. But there is a fourth. You see, to build up the result for "greeting" it has to do it in stages. First it takes the String "hi" and adds a space to the end of it. That is placed into a new String object. Then to that new String object it adds the contents of the "name" object. So you have what is called a temporary. Just that, a temporary. Not a temporary variable or anything, just simply a temporary. It's created in the process of doing the work and then thrown away again afterwards. And of course that has the potential of leaving a hole behind it.
So what should you do if you want to avoid these temporaries? Well, the String class has a handy function concat() which will add things to the end of an existing string.
So you could have something more like:
String greeting = hi;
greeting.concat(" ");
greeting.concat(name);
BUT there is another little gotcha there. Every time you add to the end of the String it has to make more space to store the extra text in it. If the hole it's currently in is too small it will end up moving elsewhere and leaving behind it a hole, and the heap will grow. So even that is not a good solution.
So the best solution is not to use the String class at all, but to do all the work in proper native "C strings". I'll cover working with those a little later on.
So let's look at another surprising little example. Take this bit of code:
void PrintVal(String tag, String value) {
    Serial.print(tag);
    Serial.print(" = ");
    Serial.println(value);
}
A little function which takes two Strings and prints them to the serial port separated by =.
You call the function with two Strings. Say you call it with:
String tempname = "Temperature";
String temp = "23C";
PrintVal(tempname, temp);
You have two Strings already, tempname and temp. You then call the function, and in doing so you inadvertently create copies of both tempname and temp. Those copies are called tag and value in the function. So suddenly you have twice as much heap used by Strings as you did before. So you have made, without thinking about it, 2 extra Strings that you didn't really mean to. And that has a real noticeable effect on the integrity of your heap.
By now it's looking like Swiss cheese.
So how do you avoid those inadvertent copies all over the place? Well, the trick here is to pass the String as references instead of copies. That calls for the & reference operator:
void PrintVal(String &tag, String &value) {
    Serial.print(tag);
    Serial.print(" = ");
    Serial.println(value);
}
Now when you call the function the Strings tempname and tag are the exact same Strings. The same for temp and value. You have simply given the Strings new names for the function use. That's saved two whole extra copies.
So now you can see why the String class, which was created for use by people who don't have advanced programming skills, is not a good thing for people who don't have advanced programming skills to use.
But even with advanced programming skills you can never work around all the shortcomings of the String class.
So instead you really need to learn how to do without the String class altogether. And that means using "C strings".
First a little bit of anatomy.
A C string is simply an array of characters, but it is an array of characters that must obey certain rules.
The biggest rule of C strings is that they are NULL terminated. That means that the very last character of every C string must be ASCII character 0.
The standard C string manipulation functions I will be introducing in a moment all look for that final NULL character as a marker to show where the end of the string is. The reason is that in C an array, although you may have specified a size at compile time, doesn't have that size stored as part of it, and neither does a C string. It is perfectly possible to say you're working with a string of 10 characters and then fill it with 20 characters instead. That is a very bad thing to do, so you must learn to take care of these things. The result is known as a buffer overrun and is one of the most common attacks used by cyber-criminals - fill up an input buffer with more data than it can handle until you end up writing your data over part of the running program's memory - and then your data (which could quite happily be instructions for a program) could get executed, thus compromising the device. So care must really be taken to avoid that.
Creating a C string is as simple as:
char string[30];
That will create an array of up to 30 characters. If you do it globally it will be stored in the BSS area mentioned above. If you do it in a function it will be stored in the stack. Not a hint of it even going near the heap.
Don't forget that the 30 character space that you have reserved includes the NULL character, so actually you only have room for 29 characters if you are to be able to follow the rules for a C string.
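If you want to see that arithmetic spelled out, here is a small sketch. The helper roomLeft is a made-up name for this illustration. It relies on strlen(), which counts the characters up to (but not including) the NULL, so it can only be used on a buffer that already holds a properly terminated string.

```cpp
#include <string.h>

// Hypothetical helper: how many more characters fit into a buffer of
// `capacity` bytes that currently holds the terminated string `text`,
// always keeping one byte back for the terminating 0.
size_t roomLeft(const char *text, size_t capacity) {
    size_t used = strlen(text) + 1;   // visible characters plus the terminator
    return (used >= capacity) ? 0 : capacity - used;
}
```

For a 30 byte array holding an empty string this comes out as 29, which is the limit described above. Note that you have to pass the capacity in yourself (sizeof works on the array itself, but not on a pointer to it) because the string has no idea how big its buffer is.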
Getting things into C strings is a little more tricky though. You can specify some content right at the start if you like:
char string[30] = "This is a string";
And that's simple enough. But what about if you want to change the content on the fly? Unlike with the String class you can't just do this:
string = "New content";
Instead you have to change each of the characters in the array individually. You see string doesn't contain the text, it just points to where it is in memory. So you need to manipulate the memory that it's pointing to, not the pointer itself.
So to change what is in the string you can use the strcpy() function:
strcpy(string, "New content");
That will iterate character by character over the second string and place those characters into the first string's memory.
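Since strcpy() will happily keep copying past the end of the destination, one option is to check the length first. The following is only a sketch of that idea, and copyIfFits is a name invented for this example, not a library function.

```cpp
#include <string.h>

// Hypothetical helper: copies `src` into `dest` only when it fits,
// terminator included. Returns true on success. On failure `dest` is
// left exactly as it was.
bool copyIfFits(char *dest, size_t destSize, const char *src) {
    if (strlen(src) + 1 > destSize) {
        return false;                 // would overrun the buffer
    }
    strcpy(dest, src);                // safe: we know it fits
    return true;
}
```

With the 30 byte array from above you would call it as copyIfFits(string, sizeof(string), "New content") and check the return value before relying on the new content.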
Another thing you can't do with C strings is adding them together. This will not work:
char hi[7] = "Hello ";
char name[5] = "Fred";
char all[14] = hi + name;
Remember, the variables hi and name just point to locations in memory where those strings are stored. So in fact what you are doing there is adding two addresses (numbers) together and ending up with some bigger number which you then try to assign to an array (which doesn't work).
Instead you need to use the handy strcat() function:
char hi[7] = "Hello ";
char name[5] = "Fred";
char all[14] = "";
strcat(all, hi);
strcat(all, name);
The strcat function, like the strcpy function, copies the memory content character by character from the right hand string to the left hand one. Unlike strcpy though, strcat starts from the end of the first string, not the start.
Now, when it comes to buffer overruns, there is a special variation of many of the string handling functions available. Several C string functions have an n variant, for instance strcpy has strncpy and strcat has strncat. These variations will work on up to n characters. That allows you to limit the maximum number of characters you will work with, and thus helps you to prevent buffer overruns from occurring. Be aware that strncpy does not add the terminating NULL character if the source string is n characters or longer, so in that case you have to put it there yourself.
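Here is one way the n variants might be wrapped up so that the destination always ends up terminated. Treat it as a sketch. The names boundedCopy and boundedAppend are invented for this example, and silently chopping off whatever doesn't fit is a design choice that may or may not suit your sketch.

```cpp
#include <string.h>

// Hypothetical helper: bounded copy that always leaves `dest` terminated.
// strncpy() on its own writes no terminator when `src` is too long.
void boundedCopy(char *dest, size_t destSize, const char *src) {
    if (destSize == 0) {
        return;                              // nowhere to put even the terminator
    }
    strncpy(dest, src, destSize - 1);
    dest[destSize - 1] = '\0';
}

// Hypothetical helper: bounded append. The n given to strncat() is the
// maximum number of characters to add, not the size of the destination,
// so the space already used has to be subtracted first.
void boundedAppend(char *dest, size_t destSize, const char *src) {
    size_t used = strlen(dest);
    if (used + 1 >= destSize) {
        return;                              // buffer is already full
    }
    strncat(dest, src, destSize - used - 1); // strncat adds the terminator itself
}
```

Used on the 14 byte all array from the strcat example, boundedCopy(all, sizeof(all), "Hello ") followed by boundedAppend(all, sizeof(all), "Fred") gives the same "Hello Fred", but a longer name would be cut short instead of running off the end of the array.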
One of the hardest parts of working with C strings, though, is that of working out what is in a string. You can't just compare strings:
char a[10] = "Part A";
char b[10] = "Part B";
if (a == b) { .... }
That kind of comparison is merely comparing the pointers to the memory where the strings are stored. Instead you must compare the content of the memory character by character.
Fortunately, again, there are functions to do that for you. strcmp is a good one to start with.
strcmp will take two strings and compare them character by character. If the strings are the same it will return 0. If the first string is logically "less" than the second (in that "a" is less than "g") it will return a negative number. If the first string is greater than the second it will return a positive number. Don't rely on those being exactly -1 and +1. Most of the time you don't care about greater than or less than, only whether they are equal. So to compare the two strings above you would use:
if (strcmp(a, b) == 0) { .... }
Another useful variation on that function is strcasecmp. This does the exact same job but it doesn't care about upper or lowercase letters. So "Hello" would equal "hello" with strcasecmp but not with strcmp.
Again there are n variants of both those functions available, strncmp and strncasecmp.
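As a small illustration of how those comparison functions tend to get used, here are two wrappers. The names sameText and startsWith are made up for this example. The second one shows a handy use of the n variant: comparing only the first few characters.

```cpp
#include <string.h>

// Hypothetical helper: true when both strings hold exactly the same characters.
bool sameText(const char *a, const char *b) {
    return strcmp(a, b) == 0;
}

// Hypothetical helper: true when `text` begins with `prefix`.
// Only the first strlen(prefix) characters are compared.
bool startsWith(const char *text, const char *prefix) {
    return strncmp(text, prefix, strlen(prefix)) == 0;
}
```

With the a and b arrays from above, sameText(a, b) is false because the last letters differ, while startsWith(a, "Part") is true. Both are case sensitive, just like strcmp.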
So far I have only shown you one way of creating strings: char string[num], optionally followed by = "content". But there are others, and they each have special meaning.
char *string;
That format just creates a pointer to a string. It doesn't actually point to any string at all - it's like a string with no size to it whatsoever. You may think that's useless, but it's not - it's incredibly useful. You can use it to point to any other string, and since it's really just a number, you can move it around anywhere within a string. More on pointer manipulation later.
char *string = "This is text";
That is creating a pointer to some text in memory. That memory could well be Flash memory, or it may have been copied into RAM first, depending on your architecture. It is never safe to change the content of that string since it may be in read-only memory. The size of the string is determined by the length of the content at compile time.
char string[] = "This is text";
Just like above this will create a block of memory whose size is equal to the length of the content, with the content in it, and point the string at it. However, this differs from the one above in that it will always be copied into RAM first. It is perfectly safe to change the content of the string.
Note the subtle difference between those last two. The first is supposed to be read only and could cause untold problems if you try to change it - the second, though it looks almost identical, is safe to change. In the former it's normal to prefix the declaration with const which tells the compiler "I will never change this. Don't let me even try".
const char *string = "This is text";
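To show the difference in practice, here is a sketch of a function that changes a string in place. The name shout is invented for this example. Because it modifies the text, its parameter is a plain char pointer, and it should only ever be handed the array form.

```cpp
#include <ctype.h>
#include <string.h>

// Hypothetical helper: upper-cases a writable string in place.
// Callers must pass memory they are allowed to change, such as a
// char array, never a string literal.
void shout(char *text) {
    for (size_t i = 0; text[i] != '\0'; i++) {
        text[i] = (char)toupper((unsigned char)text[i]);
    }
}
```

Given char editable[] = "This is text"; a call to shout(editable) is fine and leaves "THIS IS TEXT" in the array. Given const char *fixedText = "This is text"; the compiler will refuse shout(fixedText), which is exactly the protection that const is there to give you.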
Now onto that aforementioned pointer manipulation. Passing C strings to functions.
Unlike a String object you cannot pass the entire string to a function as a parameter. Instead all you pass is the pointer to the memory where the string data is located. So the example function PrintVal that we looked at way up the page would look like this:
void PrintVal(char *tag, char *value) {
    Serial.print(tag);
    Serial.print(" = ");
    Serial.println(value);
}
Just a tiny change, but one with lots to it.
Now you're passing the address in memory of where the two strings are stored, and those are then passed on to the print function which knows how to handle them. It then, starting at the address given, works character by character printing each one in turn up until it reaches that all important NULL character.
However, all is not well there. Imagine you call the function as:
char temp[] = "23C";
PrintVal("Temperature", temp);
You'd think that would be fine. And most of the time it would be. However, the compiler isn't completely happy with it. If you allow the compiler to show warnings (turned off by default in most Arduino and Arduino-like IDEs) you would see it moaning. Simply because you have specified the parameters as char *, which means, as parameters, "Pointers to memory that I can edit", yet you are passing the first parameter as a string literal "Temperature". That's not a string in memory that you can edit, it's a string literal that may well be in read-only memory. So it moans.
So the rule of thumb is: any char pointers that you pass into a function, and that you know will not be modified by the function, must be declared as const char pointers:
void PrintVal(const char *tag, const char *value) {
    Serial.print(tag);
    Serial.print(" = ");
    Serial.println(value);
}
Now the compiler knows that you're never going to modify the memory pointed to by the pointers you pass, and so it is perfectly happy for you to pass a read-only string.
I have mentioned a few times phrases like "iterates over each character until it reaches the NULL character". But what do I mean and, more importantly, how does it do it? Well, let's take a little look at an example - printing a string to Serial character by character.
There's a number of ways this can be done, but I'm going to show you the pointer way of doing it. Here's a little function to whet your appetite. I have purposely broken it down into lots of small steps so we can analyse what is going on:
void PrintString(const char *str) {
    const char *p;
    p = str;
    while (*p) {
        Serial.print(*p);
        p++;
    }
}
I know, you're thinking "WTF?!", right? Well, don't worry, it's all quite simple.
Firstly you should recognise that this is a function to which we're passing a pointer to a block of memory that we know we won't be modifying (const char *str). So we have a string pointer str to work with.
Next we are creating a new string pointer, but it's not pointing anywhere and hasn't got any size. That "useless" one from before, remember? (const char *p)
Next we are pointing that new variable to the same area of memory that str is pointing to (p = str) - so p now contains the same address as str does, and they both point to the same piece of memory - that is, the start of the string we want to print.
Now comes a while loop, with the enigmatic test "*p". The * in front of the p means "Give me the value that is stored in the memory that p is pointing at". Initially, then, that means "Give me the first character from the string" since p is pointing to the start of the string. Now the magic here is that the NULL character at the end of the string is 0, which is the same as FALSE, so when *p equates to 0 the while loop will finish.
Next we use that same * operator again to get the character that is currently pointed to and print it.
Finally, the last operation in the while loop, is to increment p (p++). Because p is just a number (the memory address), incrementing p causes it to point to the next address in memory. That means p is now pointing to the next character in the string.
And so it continues until the character pointed to by p is NULL.
So you can now see the importance of that NULL character - without it how would functions like these know when to stop?
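The same walk-until-NULL pattern works for far more than printing. As a sketch, here are two small functions built on exactly the same loop. The names countChars and countOf are invented for this example. The first does the same job as the standard strlen().

```cpp
#include <stddef.h>

// Hypothetical helper: counts the characters before the terminating 0
// using the same pointer walk as PrintString.
size_t countChars(const char *str) {
    const char *p = str;
    while (*p) {
        p++;                          // step to the next character
    }
    return (size_t)(p - str);         // distance walked = length of the string
}

// Hypothetical helper: counts how many times `wanted` appears in `str`.
size_t countOf(const char *str, char wanted) {
    size_t hits = 0;
    for (const char *p = str; *p; p++) {
        if (*p == wanted) {
            hits++;
        }
    }
    return hits;
}
```

Both take const char pointers because neither of them changes the text, so you can pass them a string literal or a char array without the compiler complaining.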
There are far more functions for working with C strings than I can cover here, but hopefully I have shown you the basics and you can now delve in and discover more for yourself and finally do away with that String class that causes so many problems.
For further reading you might like to check out these links:
- http://www.tutorialspoint.com/c_standard_library/string_h.htm
- http://www.tutorialspoint.com/c_standard_library/c_function_strtok.htm
- https://en.wikibooks.org/wiki/C_Programming/Strings