Check if a String is a Valid URL in C
To check if a string is a valid URL in C, you can use simple string functions to verify common URL prefixes or employ POSIX regular expressions to validate the overall structure.
Example 1: Using String Comparison
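This example demonstrates how to check if a URL string starts with “http://” or “https://”, which are typical prefixes for web URLs. Note that this confirms only the scheme prefix; it does not verify that the rest of the string is well formed.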
main.c
#include <stdio.h>
#include <string.h>

int main() {
    char url[] = "https://www.example.com";

    // Check if the URL starts with "http://" or "https://"
    if (strncmp(url, "http://", 7) == 0 || strncmp(url, "https://", 8) == 0) {
        printf("Valid URL\n");
    } else {
        printf("Invalid URL\n");
    }

    return 0;
}
Explanation:
- char url[]: Declares and initializes the URL string to be validated.
- strncmp(): Compares the beginning of the URL with “http://” (7 characters) and “https://” (8 characters) to ensure the string starts with one of these prefixes.
- The if condition checks if either comparison returns 0, indicating a match.
- If the condition is true, printf() outputs “Valid URL”; otherwise, it outputs “Invalid URL”.
Output:
Valid URL
Example 2: Using POSIX Regular Expressions
This example uses the POSIX regex library to validate the URL structure. We compile a regular expression pattern that requires the URL to start with “http://” or “https://”, followed by a domain made of letters, digits, dots, or hyphens, an optional port, and an optional path.
main.c
#include <stdio.h>
#include <regex.h>

int main() {
    char url[] = "http://www.example.com/path";
    regex_t regex;
    int ret;

    // Regular expression pattern for basic URL validation:
    // ^(https?://)              -> URL must start with "http://" or "https://"
    // ([a-zA-Z0-9.-]+)          -> Domain name with letters, digits, dots, or hyphens
    // (:[0-9]+)?                -> Optional port number
    // (/[a-zA-Z0-9./?=&%-]*)?$  -> Optional path and query parameters
    char *pattern = "^(https?://)([a-zA-Z0-9.-]+)(:[0-9]+)?(/[a-zA-Z0-9./?=&%-]*)?$";

    // Compile the regular expression with extended syntax
    ret = regcomp(&regex, pattern, REG_EXTENDED);
    if (ret) {
        printf("Could not compile regex\n");
        return 1;
    }

    // Execute the regular expression on the URL string
    ret = regexec(&regex, url, 0, NULL, 0);
    if (!ret) {
        printf("Valid URL\n");
    } else {
        printf("Invalid URL\n");
    }

    // Free the compiled regular expression
    regfree(&regex);
    return 0;
}
Explanation:
- char url[]: Stores the URL string to be validated.
- regex_t regex: Declares a variable to hold the compiled regular expression.
- char *pattern: Contains the regex pattern that specifies the expected URL format. The pattern ensures the URL begins with “http://” or “https://”, followed by a valid domain, an optional port, and an optional path.
- regcomp(): Compiles the regex pattern into a regex_t structure. The REG_EXTENDED flag enables extended regular expression syntax.
- regexec(): Executes the compiled regex against the URL string. A return value of 0 indicates a match, meaning the URL is valid.
- regfree(): Releases the memory allocated for the compiled regular expression.
Output:
Valid URL
Conclusion
In this tutorial, we learned how to check if a string is a valid URL in C by exploring two different approaches:
- String Comparison: Uses strncmp() to verify that the URL begins with common prefixes like “http://” or “https://”.
- Regular Expressions: Utilizes the POSIX regex library to validate the overall structure of the URL, ensuring it meets specific format criteria.