Learn Tokenizing Strings with strtok | Practical String Processing
Working with Strings in C

Tokenizing Strings with strtok

Definition: Tokenization is the process of splitting a string into smaller parts, called tokens, based on specified delimiter characters. In C, the strtok function is used to perform tokenization by modifying the original string and returning pointers to each token in sequence.

When you need to break a string into individual pieces, such as words or fields separated by spaces or commas, tokenization is the approach to use, and the strtok function in C is designed for this purpose. On each call it skips any leading delimiter characters, scans to the end of the next token, overwrites the delimiter that ends that token with the null terminator ('\0'), and returns a pointer to the start of the token. You typically call strtok in a loop until it returns NULL to process all tokens in a string.

Common use cases for strtok include parsing sentences into words, breaking up comma-separated values, or processing data from files where fields are separated by specific characters.

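main.c

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        /* A writable array: strtok modifies the string it tokenizes. */
        char sentence[] = "C programming is fun";

        /* First call: pass the string itself. */
        char *token = strtok(sentence, " ");

        while (token != NULL) {
            printf("Token: %s\n", token);
            /* Later calls: pass NULL to continue with the same string. */
            token = strtok(NULL, " ");
        }

        return 0;
    }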

In this code, you declare a character array containing the sentence "C programming is fun". An array is used rather than a pointer to a string literal because strtok writes into the string it is given, and modifying a string literal is undefined behavior. The strtok function is called with the sentence and a single space as the delimiter. On the first call, strtok finds the first space, replaces it with the null terminator, and returns a pointer to the first word. Each subsequent call with NULL as the first argument resumes just past the end of the previous token and returns the next word; once no tokens are left, strtok returns NULL and the loop ends.

It is important to understand that strtok modifies the original string: it overwrites the delimiter that ends each token with a null terminator. After tokenization the buffer no longer holds the original text (printing it as a string typically stops at the end of the first token), which can affect later use of the same data.

main.c

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char data[] = "apple, orange;banana|grape";
        /* Every character in this string counts as a delimiter. */
        const char *delimiters = ",;| ";

        char *token = strtok(data, delimiters);

        while (token != NULL) {
            printf("Token: %s\n", token);
            token = strtok(NULL, delimiters);
        }

        return 0;
    }

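Despite its usefulness, strtok has some limitations. It is not reentrant, and not guaranteed to be thread-safe, because it keeps track of its position in the string using internal static state; for the same reason you cannot interleave the tokenization of two strings, for example in nested loops. Because it modifies the original string, you must work on a copy if you need to preserve the original data. It also treats a run of consecutive delimiters as a single separator, so empty fields are skipped. If you need a thread-safe or reentrant alternative, you can use strtok_r (specified by POSIX, so only where available) or write your own tokenization logic.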

What does strtok return when there are no more tokens found in the string?

Section 5. Chapter 1
