There are two important string-related concepts you must come to fully understand:
- the differences between the Java String class and the C++ std::string class;
- the differences between the C++ std::string class and C++ "character arrays".
You will undoubtedly need, and hence need to understand fully, both C++ std::string instances and C++ character arrays. You need to understand when each is typically employed, how usage of the two differs, and how to convert from one to the other.

1. Java String class versus C++ std::string class

std::string instances can be compared using == in C++, whereas equals must be used in Java:
Java:

    void compare(String s1, String s2) {
        if (s1.equals(s2))
            System.out.println("s1 and s2 are the same");
        else
            System.out.println("s1 and s2 are different");
    }

C++:

    #include <iostream>
    #include <string>

    void compare(std::string s1, std::string s2) {
        if (s1 == s2)
            std::cout << "s1 and s2 are the same\n";
        else
            std::cout << "s1 and s2 are different\n";
    }
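The == comparison on std::string looks at the characters held by the two objects, not at whether they are the same object, and the other relational operators (such as <) are defined as well. The following is a minimal sketch of that behavior; the helper names isYes, comesBefore, and builtSeparatelyButEqual are invented here for illustration.

```cpp
#include <string>

// Hypothetical helper: a std::string can be compared directly with a literal.
bool isYes(const std::string& answer) {
    return answer == "yes";
}

// Hypothetical helper: operator< compares the two strings lexicographically.
bool comesBefore(const std::string& a, const std::string& b) {
    return a < b;
}

// Two distinct objects built in different ways still compare equal
// when they hold the same characters.
bool builtSeparatelyButEqual() {
    std::string first = "abc";
    std::string second = "ab";
    second += "c";
    return first == second;
}
```

In Java the corresponding content comparison is s1.equals(s2), as in the compare method above.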
Java String objects are immutable, whereas C++ std::string objects can be modified. This difference is often not noticeable. Here is one example where it is noticeable:
Java:

    public class StringTest {
        // Since String is a class in Java, "str" refers to the same object
        // as the caller's variable, much as the reference parameter does
        // in the C++ version.
        private static void addToString(String str) {
            // Strings are immutable, so += creates a new String and
            // rebinds only the local variable "str".
            str += "def";
            System.out.println("In addToString, str = " + str);
        }

        public static void main(String[] args) {
            String myString = "abc";
            System.out.println(
                "In main before call to addToString, myString = " + myString);
            addToString(myString);
            System.out.println(
                "In main after call to addToString, myString = " + myString);
        }
    }

Java output:

    In main before call to addToString, myString = abc
    In addToString, str = abcdef
    In main after call to addToString, myString = abc

C++:

    #include <iostream>
    #include <string>

    void addToString(std::string& str) {
        str += "def";
        std::cout << "In addToString, str = " << str << '\n';
    }

    int main() {
        std::string myString = "abc";
        std::cout << "In main before call to addToString, myString = "
                  << myString << '\n';
        addToString(myString);
        std::cout << "In main after call to addToString, myString = "
                  << myString << '\n';
        return 0;
    }

C++ output:

    In main before call to addToString, myString = abc
    In addToString, str = abcdef
    In main after call to addToString, myString = abcdef
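The C++ result above depends on the parameter being declared as a reference (std::string&). As a rough sketch of the contrast, the two helpers below (appendToCopy and appendInPlace, names made up for this illustration) show a by-value parameter working on a private copy and a reference parameter changing the caller's object in place.

```cpp
#include <string>

// Hypothetical helper: the parameter is passed by value, so the function
// modifies its own copy and the caller's string is left untouched.
std::string appendToCopy(std::string str) {
    str += "def";
    return str;
}

// Hypothetical helper: the parameter is a reference, so the caller's
// std::string object itself is modified.
void appendInPlace(std::string& str) {
    str += "def";
    str[0] = 'A';  // individual characters can be overwritten as well
}
```

Neither operation has a direct Java equivalent on String, because a Java String can never change once it has been created.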
2. C++ std::string class versus C++ character arrays (char*)

In addition to std::string, C++ can also hold character data in arrays whose base type is char. Such instances are first and foremost arrays, hence all the aspects of C++ arrays mentioned in the Arrays section apply. There are many ways to declare and use character arrays, but you must be careful that the actual array is sufficiently large for all operations you will perform.

As a very simple example, suppose I want to create a character array, msg, that holds "abc". I could do the following:

    const char* msg = "abc"; // See the Note below.

Note: The declaration of "msg" above is a syntactical shortcut for creating the character array:

    const char msg[ ] = { 'a', 'b', 'c', '\0' };

The special character '\0' is a zero byte that marks the end of the string. This syntactical shortcut is a remnant from the old C days (and old C++ days before the std::string class was added) when this was the only way character string data could be stored and manipulated in a C/C++ program. The reason msg should be declared as "const" is that character string literals have type "const char[]" in standard C++. As we saw in item #1 above, such literals can be used to initialize std::string instances; however, trying to use one to initialize a char* pointer as:

    char* msg = "abc";

will result in most compilers issuing a warning message. Some will flag it as an error (the conversion was removed from the language in C++11).

The fact that the compiler appends a sentinel zero byte to the array is important because I can output this string by executing:

    std::cout << msg;
The obvious question is: "How can this work, since msg is an array passed to the output operator of std::cout without also passing its length?" (Recall that a C++ function receiving an array has no way of knowing the array's length.) The answer is that functions that receive character arrays like this expect to see the end of the character string marked with the special '\0' sentinel. In this case, for example, when sending msg to std::cout, the implementation just starts writing one character after another to the screen until it encounters the '\0' character.
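The scan for the sentinel described above can be written out by hand. The following is a minimal sketch, not how any particular library implements "<<"; countChars and collectChars are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts the characters that precede the '\0' sentinel.
std::size_t countChars(const char* text) {
    std::size_t n = 0;
    while (text[n] != '\0') {
        ++n;
    }
    return n;
}

// Hypothetical helper: visits one character after another until the
// sentinel is reached, collecting them instead of writing to the screen.
std::string collectChars(const char* text) {
    std::string out;
    for (const char* p = text; *p != '\0'; ++p) {
        out += *p;
    }
    return out;
}
```

If the array had no '\0' at the end, both loops would run past the end of the array, which is why the sentinel matters.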
The output operator "<<" is not unique in operating this way. There are in fact several standard C/C++ functions that accept character arrays that are assumed to be terminated by this '\0' sentinel. They perform various operations like comparison, concatenation, determining the length of the string in the character array, etc. See Appendix H of Carrano for a listing of all such functions, some of which are declared when you #include <cstring>; others when you #include <cstdlib>.
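A few of the <cstring> functions can be sketched in use as follows. The wrappers joinInto and sameText are hypothetical names chosen for this example; the standard functions they call are std::strlen, std::strcpy, std::strcat, and std::strcmp. Note the explicit check that the destination array is large enough, since none of these functions performs that check for you.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: concatenates two zero-terminated strings into dest,
// but only if dest has room for every character plus the '\0' sentinel.
bool joinInto(char* dest, std::size_t destSize,
              const char* first, const char* second) {
    const std::size_t needed = std::strlen(first) + std::strlen(second) + 1;
    if (needed > destSize) {
        return false;            // the array is too small; do nothing
    }
    std::strcpy(dest, first);    // copies first, including its '\0'
    std::strcat(dest, second);   // appends second after first
    return true;
}

// Hypothetical helper: std::strcmp returns 0 when the contents match.
bool sameText(const char* a, const char* b) {
    return std::strcmp(a, b) == 0;
}
```

Comparing two char* values with == would compare the addresses rather than the characters, which is why std::strcmp is used here.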
This use of character arrays has been available since the earliest versions of the C language (hence the name of the header file: <cstring>). By contrast, the std::string class is a relative newcomer, introduced as C++ evolved from C.
While std::string is generally much easier to work with, there are times when the character array must be used. One example that we have already seen is the C++ main function, as we saw in item #5 on the Similarities page. You will undoubtedly encounter many other examples when using various C-based APIs. There is a plethora of such toolkits, including graphics interfaces, database systems, GPU programming interfaces, and many other very useful toolkits originally developed as C-based APIs.
Given a zero-byte terminated character array, you can create an equivalent std::string instance as follows:

    void hereIsACharArray(char* msg) {
        std::string msgAsString(msg);
        // … work with msgAsString …
    }
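The same constructor can be applied to each element when a C-style interface hands over a whole array of character arrays. The sketch below assumes such an array arrives together with a count, in the style of the arguments of main; the function name toStrings and its parameter names are invented for this illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: converts "count" zero-terminated character arrays
// into std::string instances. Each std::string holds its own copy of the
// characters, so it stays valid even if the original arrays go away.
std::vector<std::string> toStrings(int count, char* items[]) {
    std::vector<std::string> result;
    for (int i = 0; i < count; ++i) {
        result.push_back(std::string(items[i]));
    }
    return result;
}
```

After the conversion, the rest of the program can work with std::string operations such as == and += instead of the <cstring> functions.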
You can extract the character array from an std::string instance by using the c_str method:

    void hereIsAString(std::string myString) {
        const char* myStringAsCharArray = myString.c_str();
        // … work with myStringAsCharArray …
    }
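A typical reason to call c_str is to hand the characters to a function that only accepts a zero-terminated character array. In the sketch below, legacyLength is a hypothetical stand-in for such a C-based API, while std::atoi (from <cstdlib>) and std::strlen (from <cstring>) are real standard functions of that kind.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C-based API that expects a
// zero-terminated character array.
std::size_t legacyLength(const char* text) {
    return std::strlen(text);
}

// Passes the read-only character array of a std::string to the C-style API.
std::size_t lengthViaCApi(const std::string& s) {
    return legacyLength(s.c_str());
}

// std::atoi only accepts a const char*, so c_str bridges the gap.
int parseNumber(const std::string& digits) {
    return std::atoi(digits.c_str());
}
```

The pointer returned by c_str refers to storage owned by the std::string, so it should only be used while that string is alive and unmodified.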
You are not allowed to modify the character array returned from c_str; notice that myStringAsCharArray is declared as const char*.
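When a modifiable character array is required, one possible approach (an assumption of this sketch, not something prescribed above) is to copy the characters, sentinel included, into a buffer that you own. The helper names toModifiableBuffer and upperCaseFirst are hypothetical.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: copies the characters of s, plus the '\0' sentinel,
// into a std::vector<char> that the caller is free to modify.
std::vector<char> toModifiableBuffer(const std::string& s) {
    const char* readOnly = s.c_str();
    return std::vector<char>(readOnly, readOnly + s.size() + 1);
}

// Hypothetical helper: edits the copy, then converts it back to std::string.
std::string upperCaseFirst(const std::string& s) {
    std::vector<char> buffer = toModifiableBuffer(s);
    buffer[0] = static_cast<char>(
        std::toupper(static_cast<unsigned char>(buffer[0])));
    return std::string(buffer.data());  // data() yields a char* to the copy
}
```

Because the buffer is a separate copy, changes made to it never affect the original std::string.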