I ran into a silly problem playing with cURL the other day. Here are the steps I took to track it down and fix it.
I am running cURL on a remote server, as follows (domain replaced with
remote> curl www.example.com
<!DOCTYPE html>
<html style="height:100%">
<head><title> 301 Moved Permanently
...
</div></div></body></html>
The 301 status code means the page has moved permanently, so I should try to follow the redirect.
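Before following a 301, it can help to see where it points by pulling the Location header out of the response. This is a minimal sketch, assuming a POSIX shell with awk available; `redirect_target` is a hypothetical helper name, not part of cURL.

```shell
#!/bin/sh
# redirect_target: read raw HTTP response headers on stdin and print the
# value of the first Location header (hypothetical helper, not part of curl).
redirect_target() {
    # Header names are case-insensitive, and header lines end in CRLF,
    # so lowercase the name and strip the trailing carriage return.
    awk 'tolower($1) == "location:" { sub(/\r$/, ""); print $2; exit }'
}

# Example use against a live site (requires network access):
#   curl -sI www.example.com | redirect_target
```

Here `-I` asks cURL for the headers only and `-s` hides the progress meter, so the pipe receives nothing but the header lines.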
remote> curl -L www.example.com
curl: (7) Failed to connect to gtop100.com port 443: Connection refused
I try loading the page in the browser on my local machine to test that the connection works. Yes, it connects.
So then I think, maybe the site only accepts HTTP requests with a specific kind of header, or request type? I go into the browser and attempt to manually copy the request headers. Something crazy, like
That doesn't work. How about I try cURL-ing from the local machine?
local> curl -L www.example.com
<!doctype html>
<html>
<head>
<title>Example Domain</title>
...
That works, which is so strange. How can the same tool produce different output? Is it version dependent?
...No. They're both version 7.66 (released 2019-09-11).
How about I try to check the verbose output to really see what's going on?
local> curl -v -L www.example.com > curl_out
...
* Issue another request to this URL: 'https://example.com/'
* Trying <ip-addr>...
...
< HTTP/1.1 200 OK
< Date: Thu, 03 Oct 2019 03:39:39 GMT
...<rest of response header>
remote> curl -v -L www.example.com > curl_out
...
* Issue another request to this URL: 'https://example.com/'
* Trying 0.0.0.0...
...
< HTTP/1.1 200 OK
< Date: Thu, 03 Oct 2019 03:39:39 GMT
...<rest of response header>
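The verbose trace is long, and only a few of its lines show which address cURL actually connects to. Here is a rough sketch for filtering them out, assuming a POSIX shell with grep; `connection_lines` is a hypothetical helper, and the patterns are based on the trace lines shown above, so they may need adjusting for other cURL versions.

```shell
#!/bin/sh
# connection_lines: keep only the informational lines of a `curl -v` trace
# that show redirects and connection attempts (hypothetical helper).
connection_lines() {
    grep -E '^\* +(Issue another request|Trying|Connected to)'
}

# Example use (requires network access). curl writes the verbose trace to
# stderr, so 2>&1 is needed to send it into the pipe; -o /dev/null discards
# the response body, and -sS hides the progress meter but keeps error messages:
#   curl -sS -v -L -o /dev/null www.example.com 2>&1 | connection_lines
```

Running the same pipeline on both machines and comparing the `Trying` lines makes the difference in addresses easy to spot.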
So it looks like cURL is trying to connect to two different IP addresses for the same redirect URL! How can that be? Maybe the DNS records seen by the remote machine are out of date? That cannot be, because both the local and remote machines reside on the same network (my home).
A passing comment on an SO post gives me the idea to check the /etc/hosts file on each machine. Then I realize... www.example.com had been blocked by the hosts file on the remote machine, which I personally set up years ago to deny outgoing connections to ad sites. That explains the 0.0.0.0 address in the verbose output.
A quick change to the hosts file, and everything was working as normal.
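A check like the following would have found the culprit much sooner. It is a minimal sketch, assuming a POSIX shell with awk and the usual hosts file layout (an address followed by one or more names, with `#` starting a comment); `hosts_entries` is a hypothetical helper name.

```shell
#!/bin/sh
# hosts_entries: print the address that a hosts file assigns to the given
# hostname, one line per matching entry.
# Usage: hosts_entries HOSTNAME [FILE]   (hypothetical helper)
hosts_entries() {
    awk -v host="$1" '
        { sub(/#.*/, "") }                      # drop comments
        {
            for (i = 2; i <= NF; i++)           # field 1 is the address
                if (tolower($i) == tolower(host)) { print $1; break }
        }
    ' "${2:-/etc/hosts}"
}

# Example use on each machine:
#   hosts_entries www.example.com
```

If this prints 0.0.0.0 or 127.0.0.1 for a site that should be reachable, the hosts file is overriding DNS. On most Linux systems, `getent hosts www.example.com` shows the address the system resolver returns, which takes the hosts file into account.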
Even though the problem was of my own making, I learned a bit about the structure of HTTP requests. Additionally, the
--verbose flag is really useful for understanding at a deeper level what is going on. Thanks,