Albert Ge

A silly cURL-ing problem

October 3, 2019

I ran into a silly problem playing with cURL the other day. Here are the steps I took to track it down and fix it.

I am running cURL on a remote server, as follows (domain replaced with www.example.com):

remote> curl www.example.com

<!DOCTYPE html>
<html style="height:100%">
<head><title> 301 Moved Permanently
...
</div></div></body></html>

Redirection

The 301 status code means the page has moved permanently, so I should tell cURL to follow the redirect with -L.

remote> curl -L www.example.com
curl: (7) Failed to connect to example.com port 443: Connection refused

which fails.
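One way to see where a 301 points without actually following it is to read the Location header of the response. The snippet below is a minimal sketch of that idea; the helper name location_header is made up for illustration and is not part of cURL. It assumes the header is written in the usual "Location: <url>" form with a space after the colon.

```shell
# Print the value of the first Location header found in the HTTP
# response headers read from stdin.
# (location_header is a hypothetical helper name, not a cURL feature.)
location_header() {
    # HTTP header lines end in CRLF, so drop the carriage returns first.
    # Header names are case-insensitive, hence the tolower().
    tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Possible usage (needs network access):
#   curl -sI www.example.com | location_header
```

Here -I asks cURL for the response headers only and -s hides the progress meter. If the printed URL names a different host or scheme than the one requested, that host is the one the follow-up connection will go to.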

I load the page in a browser on my local machine to check that the site is reachable. Yes, it connects.

Copying request headers

So then I think: maybe the site only accepts HTTP requests with specific headers, or a specific request method? I go into the browser and manually copy its request headers into the cURL command. Something crazy, like

That doesn't work. How about I try cURL-ing from the local machine?

local> curl -L www.example.com
<!doctype html>
<html>
<head>
    <title>Example Domain</title>
    ...

That works, which is so strange. How can the same tool produce different output? Is it version-dependent?

...No. They're both version 7.66 (released 2019-09-11).

Verbose output

How about I try to check the verbose output to really see what's going on?

local> curl -v -L www.example.com > curl_out
...
* Issue another request to this URL: 'https://example.com/'
*   Trying <ip-addr>...
...
< HTTP/1.1 200 OK
< Date: Thu, 03 Oct 2019 03:39:39 GMT
...<rest of response header>

remote> curl -v -L www.example.com > curl_out
...
* Issue another request to this URL: 'https://example.com/'
*   Trying 0.0.0.0...
...
curl: (7) Failed to connect to example.com port 443: Connection refused

So it looks like cURL is trying to connect to two different addresses for the same redirect URL! How can that be? Maybe the DNS server used by the remote machine is out of date? That cannot be, because both the local and remote machines are on the same network (my home).
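To compare the two machines without reading the whole trace, the "Trying" lines can be pulled out on their own. This is a rough sketch rather than anything cURL ships with: trying_addrs is a hypothetical helper name, and it assumes the trace lines look like the ones shown above. Note that cURL writes its verbose trace to stderr, so it has to be redirected before it can be piped.

```shell
# Print every address that appears in a "* Trying <addr>..." line of a
# curl -v trace read from stdin.
# (trying_addrs is a hypothetical helper name.)
trying_addrs() {
    awk '$1 == "*" && $2 == "Trying" {
        sub(/\.\.\.$/, "", $3)   # strip the trailing "..."
        print $3
    }'
}

# Possible usage (needs network access); -o /dev/null discards the body
# and 2>&1 sends the verbose trace down the pipe:
#   curl -sv -L -o /dev/null www.example.com 2>&1 | trying_addrs
```

Running the same pipeline on both machines and comparing the output side by side makes a mismatch like 0.0.0.0 stand out immediately.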

A passing comment on a Stack Overflow post gives me the idea to check the /etc/hosts file on each machine. Then I realize... www.example.com had been blocked by the remote machine's hosts file, which I had set up years ago to deny outgoing connections to ad sites. That explains the 0.0.0.0 in the verbose output.

A quick change to the hosts file, and everything was working as normal.
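The hosts file check is easy to script. The sketch below assumes the usual hosts file layout (an address followed by one or more names, with # starting a comment); hosts_entries is a hypothetical helper name, and the second argument exists only so the function can be pointed at a file other than /etc/hosts.

```shell
# Print "<address> <name>" for every active (non-comment) line of a
# hosts file that lists the given hostname.
# Usage: hosts_entries <hostname> [hosts-file]
# (hosts_entries is a hypothetical helper name.)
hosts_entries() {
    host="$1"
    file="${2:-/etc/hosts}"
    awk -v h="$host" '
        { sub(/#.*/, "") }   # drop comments; awk re-splits the fields
        {
            # Field 1 is the address, the remaining fields are names.
            for (i = 2; i <= NF; i++)
                if (tolower($i) == tolower(h)) { print $1, $i; break }
        }
    ' "$file"
}

# Possible usage:
#   hosts_entries www.example.com
```

Any output means the name is being answered from the hosts file instead of DNS; an address such as 0.0.0.0 or 127.0.0.1 there is the typical signature of an ad-blocking entry.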

Conclusion

Even though the problem was my own doing, I learned a bit about the structure of HTTP requests. Additionally, the --verbose flag is really useful for understanding at a deeper level what is going on. Thanks, man curl!