How to Use Curl to Check if a Remote Resource is Available

Do you want a short, effective way to check if a file is available on a remote server before attempting to download it? If so, and you’re not sure how, you’ll learn how in this post.


Recently, I was writing a shell script as part of a new Docker container setup. While I was pretty happy with the script, it wasn’t as defensive as I’d have liked it to be. This was because it didn’t check if a remote tarball was available before attempting to download and use it.

Sure, it’s a personal project that only I use, so no one else would ever get burned if the resource wasn’t available. However, you never know, one day, it may be publicly available, and I want it to be as good as I can make it.

So, how should it check if the resource was available, before attempting to download it? The first idea that came to mind was to make a HEAD request with Curl, by passing the -I or --head flags, and see if the status code returned was an HTTP 200.

Here are the contents of the man page for those options:

-I, --head (HTTP FTP FILE) Fetch the headers only! HTTP-servers feature the command HEAD which this uses to get nothing but the header of a document. When used on an FTP or FILE file, curl displays the file size and last modification time only.

And here’s an example of the output that you might expect to see by making a HEAD request:

HTTP/1.1 200 OK
X-Powered-By: React/alpha
Content-Type: image/png
Transfer-Encoding: chunked

There, at the top, you see that the server’s using HTTP 1.1 and that the resource is there, as the status code is a 200 OK. So far, so good.

Now, how do I efficiently extract just the status code and nothing else? As the response headers are all text, many of the standard Linux shell commands, e.g., awk, sed, and grep, could be used. After a bit of experimenting, I came up with the following bash one-liner (formatted for readability).

(($(curl --silent -I https://example.com/my.remote.tarball.gz \
    | grep -E "^HTTP" \
    | awk -F " " '{print $2}') == 200)) \
    && echo "file exists"

It starts with Curl making a HEAD request and then pipes the returned headers through to grep. Grep, using a regular expression, extracts the status line (the one starting with HTTP) and pipes it to awk. Awk then extracts the status code, which is the second space-separated field on that line. Finally, if the status code is 200, then “file exists” is echoed to the terminal.
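To see what each stage of that pipe contributes without touching the network, here is a minimal sketch that feeds canned headers through the same grep and awk steps. The function name status_from_headers is made up for illustration, and the printf line only stands in for what curl --silent -I would return.

```shell
# Extract the status code from raw response headers read on stdin.
# grep keeps the status line; awk prints its second field.
status_from_headers() {
    grep -E '^HTTP' | awk '{print $2}'
}

# Canned headers standing in for the output of `curl --silent -I <url>`.
printf 'HTTP/1.1 200 OK\r\nContent-Type: image/png\r\n\r\n' \
    | status_from_headers
```

If Curl were also told to follow redirects, there would be one status line per response, so a pipe like this would likely need an extra step (tail -n 1, for example) to keep only the last one.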

You could argue that I should be happy at this point. It works. Use it. Move on. Nope! No can do. Something seems wrong here.

Surely, I think to myself, if I can make a HEAD request with Curl, then Curl can also filter out the status code itself, without having to pass the response headers through a pipe. So, I go back through the Curl man page, looking for an option to use.

Nothing stands out, at least at first. Then, after a bit of reading, along with suggestions from the Twitter-verse, I uncovered a new option: -w or --write-out. Here are the option’s essential details from Curl’s man page:

-w, --write-out Make curl display information on stdout after a completed transfer. The variables present in the output format will be substituted by the value or text that Curl thinks fit, as described below. All variables are specified as %{variable_name}…

Looks like a winner! After looking through the list of available variables, I hit pay dirt with http_code. Here’s what it does:

http_code The numerical response code that was found in the last retrieved HTTP(S) or FTP(s) transfer. In 7.18.2 the alias response_code was added to show the same info.

By using this option, there’s no need to parse regular expressions or to pipe the output elsewhere for further text processing. Now that’s efficient! So, what does the revised command look like? Here it is:

curl --silent --head --write-out '%{http_code}' \
    https://example.com/my.remote.tarball.gz

Running it, the output is now changed to:

[14/Aug/2019 08:27:58] "HEAD /assets/img/logos/ms-logo-46x46-transparent.png HTTP/1.1" 200
HTTP/1.1 200 OK
X-Powered-By: React/alpha
Content-Type: image/png
Transfer-Encoding: chunked

200%

Hmmm, almost there: the 200 code is displayed last, but the other headers are still present. I don’t want all of the header data, just the extracted code. To get rid of the headers, I used the -o or --output option, which writes output to a file instead of stdout. As I’m not interested in the content, I decided to write it to /dev/null. So, here’s the final version:

curl -o /dev/null --silent -Iw '%{http_code}' \
    https://example.com/my.remote.tarball.gz

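The output is now just the HTTP status code, as shown below. (The trailing % is not part of the code; it most likely comes from the shell, as zsh prints it when output doesn’t end with a newline.)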

200%

Perfect!
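For use inside a script, the final command can be wrapped in a couple of small functions. What follows is only a sketch under a few assumptions: the function names (remote_status, resource_available, download_if_available) are invented for illustration, the URL is whatever the caller passes in, and only a 200 counts as “available”. Curl reports 000 for http_code when it gets no HTTP response at all, so that case is treated as unavailable too.

```shell
# Print only the HTTP status code for the URL given as $1,
# using a HEAD request and discarding the headers.
remote_status() {
    curl --output /dev/null --silent --head \
        --write-out '%{http_code}' "$1"
}

# Succeed only if the HEAD request came back with a 200.
# Anything else (404, 500, or 000 for "no response") fails.
resource_available() {
    [ "$(remote_status "$1")" = "200" ]
}

# Download the resource only after the check passes.
download_if_available() {
    if resource_available "$1"; then
        curl --silent --remote-name "$1"
    else
        echo "not available: $1" >&2
        return 1
    fi
}
```

A script would then call something like download_if_available "https://example.com/my.remote.tarball.gz" and stop if it returns a non-zero status. Redirects are deliberately not followed here, to match the commands above.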

In Conclusion

Use Curl to check if a remote resource is available before attempting to download it, regardless of whether that resource is an image, a tarball (or other compressed file), a text file, or whatever else you’re after.

Curl’s man page contains a wealth of information, which just goes to show that the old adage of “search, and you shall find” is apt in the modern day. That said, it can take some searching (at times) to find what you’re looking for, and the right command (or command combination) may not always be evident at first.

However, it’s worth spending the time to read man pages in-depth. If, after doing so, you don’t find an option that helps, then start building a pipe. But don’t build one as a first resort.

Finally, a big thank you for the feedback I received on Twitter — especially to Dan Allen, who pointed out that I wasn’t as clear as I should have been with my initial question, and Thomas Boerger, for sharing a link to an example script that he had, which did almost exactly what I was after. Thanks, everyone!

