Proxies

                              A proxy is a machine or software that does
                                    something on behalf of you, the
                                    client.
                            
                              You can also see it as a middle man that
                                    sits between you and the server you want to
                                    work with, a middle man that you connect to
                                    instead of the actual remote server. You ask
                                    the proxy to perform your desired operation
                                    for you and then it will run off and do that
                                    and then return the data to you.
                            
                              There are several different types of
                                    proxies and we shall list and discuss them
                                    further down in this section.
                            
                                Discover your proxy
                                      
                              Some networks are setup to require a proxy
                                    in order for you to reach the Internet or
                                    perhaps that special network you are
                                    interested in. The use of proxies are
                                    introduced on your network by the people and
                                    management that run your network for policy
                                    or technical reasons.
                            
                              In the networking space there are a few
                                    methods for the automatic detection of
                                    proxies and how to connect to them, but none
                                    of those methods are truly universal and
                                    curl supports none of them. Furthermore,
                                    when you communicate to the outside world
                                    through a proxy that often means that you
                                    have to put a lot of trust on the proxy as
                                    it will be able to see and modify all the
                                    non-secure network traffic you send or get
                                    through it. That trust is not easy to assume
                                    automatically.
                            
                              If you check your browser's network
                                    settings, sometimes under an advanced
                                    settings tab, you can learn what proxy or
                                    proxies your browser is configured to use.
                                    Chances are big that you should use the same
                                    one or ones when you use curl.
                            
                              As an example, you can find
                                  proxy settings for Firefox browser
                                    in Preferences => General => Network
                                    Settings as shown below:
                            
                                      proxy settings for Firefox
                                    
                                PAC
                                      
                              Some network environments provides several
                                    different proxies that should be used in
                                    different situations, and a customizable way
                                    to handle that is supported by the browsers.
                                    This is called "proxy auto-config", or PAC.
                            
                              A PAC file contains a JavaScript function
                                    that decides which proxy a given network
                                    connection (URL) should use, and even if it
                                    should not use a proxy at all. Browsers most
                                    typically read the PAC file off a URL on the
                                    local network.
                            
                              Since curl has no JavaScript capabilities,
                                    curl does not support PAC files. If your
                                    browser and network use PAC files, the
                                    easiest route forward is usually to read the
                                    PAC file manually and figure out the proxy
                                    you need to specify to run curl
                                    successfully.
                            
                                Captive portals
                                      
                              These are not proxies but they're
                                    blocking the way between you and the server
                                    you want to access.
                            
                              A "captive portal" is one of
                                    these systems that are popular to use in
                                    hotels, airports and for other sorts of
                                    network access to a larger audience. The
                                    portal will "capture" all network
                                    traffic and redirect you to a login web page
                                    until you have either clicked OK and
                                    verified that you have read their conditions
                                    or perhaps even made sure that you have paid
                                    plenty of money for the right to use the
                                    network.
                            
                              curl's traffic will of course also
                                    captured by such portals and often the best
                                    way is to use a browser to accept the
                                    conditions and "get rid of" the
                                    portal since from then on they often allow
                                    all other traffic originating from that same
                                    machine (MAC address) for a period of
                                    time.
                            
                              Most often you can use curl too to submit
                                    that "ok" affirmation, if you just
                                    figure out how to submit the form and what
                                    fields to include in it. If this is
                                    something you end up doing many times, it
                                    may be worth exploring.
                            
                                Proxy type
                                      
                              curl supports several different types of
                                    proxies.
                            
                              The default proxy type is HTTP so if you
                                    specify a proxy host name (or IP address)
                                    without a scheme part (the part that is
                                    often written as "http://") curl
                                    goes with assuming it's an HTTP
                                    proxy.
                            
                              curl also allows a number of different
                                    options to set the proxy type instead of
                                    using the scheme prefix. See the
                                  SOCKS
                                    section below.
                            
                                HTTP
                                      
                              An HTTP proxy is a proxy that the client
                                    speaks HTTP with to get the transfer done.
                                    curl will, by default, assume that a host
                                    you point out with -x
                                    or --proxy
                                    is an HTTP proxy, and unless you also
                                    specify a port number it will default to
                                    port 1080 (and the reason for that
                                    particular port number is purely
                                    historical).
                            
                              If you want to request the example.com web
                                    page using a proxy on 192.168.0.1 port 8080,
                                    a command line could look like:
                            
curl -x 192.168.0.1:8080 http://example.com/

                              Recall that the proxy receives your
                                    request, forwards it to the real server,
                                    then reads the response from the server and
                                    then hands that back to the client.
                            
                              If you enable verbose mode with -v
                                    when talking to a proxy, you will see that
                                    curl connects to the proxy instead of the
                                    remote server, and you will see that it uses
                                    a slightly different request line.
                            
                                HTTPS with HTTP proxy
                                      
                              HTTPS was designed to allow and provide
                                    secure and safe end-to-end privacy from the
                                    client to the server (and back). In order to
                                    provide that when speaking to an HTTP proxy,
                                    the HTTP protocol has a special request that
                                    curl uses to setup a tunnel through the
                                    proxy that it then can encrypt and verify.
                                    This HTTP method is known as CONNECT.
                            
                              When the proxy tunnels encrypted data
                                    through to the remote server after a CONNECT
                                    method sets it up, the proxy cannot see nor
                                    modify the traffic without breaking the
                                    encryption:
                            
curl -x proxy.example.com:80 https://example.com/

                                MITM-proxies
                                      
                              MITM means Man-In-The-Middle. MITM-proxies
                                    are usually deployed by companies in
                                    "enterprise environments" and
                                    elsewhere, where the owners of the network
                                    have a desire to investigate even TLS
                                    encrypted traffic.
                            
                              To do this, they require users to install a
                                    custom "trust root" (Certificate
                                    Authority (CA) certificate) in the client,
                                    and then the proxy terminates all TLS
                                    traffic from the client, impersonates the
                                    remote server and acts like a proxy. The
                                    proxy then sends back a generated
                                    certificate signed by the custom CA. Such
                                    proxy setups usually transparently capture
                                    all traffic from clients to TCP port 443 on
                                    a remote machine. Running curl in such a
                                    network would also get its HTTPS traffic
                                    captured.
                            
                              This practice, of course, allows the middle
                                    man to decrypt and snoop on all TLS
                                    traffic.
                            
                                Non-HTTP protocols over an HTTP
                                      proxy
                                      
                              An "HTTP proxy" means the proxy
                                    itself speaks HTTP. HTTP proxies are
                                    primarily used to proxy HTTP but it is also
                                    fairly common that they support other
                                    protocols as well. In particular, FTP is
                                    fairly commonly supported.
                            
                              When talking FTP "over" an HTTP
                                    proxy, it is usually done by more or less
                                    pretending the other protocol works like
                                    HTTP and asking the proxy to "get this
                                    URL" even if the URL is not using HTTP.
                                    This distinction is important because it
                                    means that when sent over an HTTP proxy like
                                    this, curl does not really speak FTP even
                                    though given an FTP URL; thus FTP-specific
                                    features will not work:
                            
curl -x http://proxy.example.com:80 ftp://ftp.example.com/file.txt

                              What you can do instead then, is to
                                    "tunnel through" the HTTP
                                    proxy!
                            
                                HTTP proxy tunneling
                                      
                              Most HTTP proxies allow clients to
                                    "tunnel through" it to a server on
                                    the other side. That's exactly
                                    what's done every time you use HTTPS
                                    through the HTTP proxy.
                            
                              You tunnel through an HTTP proxy with curl
                                    using -p
                                    or --proxytunnel.
                            
                              When you do HTTPS through a proxy you
                                    normally connect through to the default
                                    HTTPS remote TCP port number 443, so
                                    therefore you will find that most HTTP
                                    proxies white list and allow connections
                                    only to hosts on that port number and
                                    perhaps a few others. Most proxies will deny
                                    clients from connecting to just any random
                                    port (for reasons only the proxy
                                    administrators know).
                            
                              Still, assuming that the HTTP proxy allows
                                    it, you can ask it to tunnel through to a
                                    remote server on any port number so you can
                                    do other protocols "normally" even
                                    when tunneling. You can do FTP tunneling
                                    like this:
                            
curl -p -x http://proxy.example.com:80 ftp://ftp.example.com/file.txt

                              You can tell curl to use HTTP/1.0 in its
                                    CONNECT request issued to the HTTP proxy by
                                    using --proxy1.0 [proxy]
                                    instead of -x.
                            
                                SOCKS types
                                      
                              SOCKS is a protocol used for proxies and
                                    curl supports it. curl supports both SOCKS
                                    version 4 as well as version 5, and both
                                    versions come in two flavors.
                            
                              You can select the specific SOCKS version
                                    to use by using the correct scheme part for
                                    the given proxy host with -x, or you can specify it with a separate
                                    option instead of -x.
                            
                              SOCKS4 is for the version 4 and SOCKS4a is
                                    for the version 4 without resolving the host
                                    name locally:
                            
curl -x socks4://proxy.example.com http://www.example.com/
​
curl --socks4 proxy.example.com http://www.example.com/

                              The SOCKS4a versions:
                            
curl -x socks4a://proxy.example.com http://www.example.com/
​
curl --socks4a proxy.example.com http://www.example.com/

                              SOCKS5 is for the version 5 and
                                    SOCKS5-hostname is for the version 5 without
                                    resolving the host name locally:
                            
curl -x socks5://proxy.example.com http://www.example.com/
​
curl --socks5 proxy.example.com http://www.example.com/

                              The SOCKS5-hostname versions. This sends
                                    the host name to the server so there's
                                    no name resolving done locally:
                            
curl -x socks5h://proxy.example.com http://www.example.com/
​
curl --socks5-hostname proxy.example.com http://www.example.com/

                                Proxy authentication
                                      
                              HTTP proxies can require authentication, so
                                    curl then needs to provide the proper
                                    credentials to the proxy to be allowed to
                                    use it, and failing to do will only make the
                                    proxy return HTTP responses using code
                                    407.
                            
                              Authentication for proxies is similar to
                                    "normal" HTTP authentication. It
                                    is separate from the server authentication
                                    to allow clients to independently use both
                                    normal host authentication as well as proxy
                                    authentication.
                            
                              With curl, you set the user name and
                                    password for the proxy authentication with
                                    the -U user:password
                                    or --proxy-user user:password
                                    option:
                            
curl -U daniel:secr3t -x myproxy:80 http://example.com

                              This example will default to using the
                                    Basic authentication scheme. Some proxies
                                    will require another authentication scheme
                                    (and the headers that are returned when you
                                    get a 407 response will tell you which) and
                                    then you can ask for a specific method with --proxy-digest, --proxy-negotiate, --proxy-ntlm. The above example command again, but
                                    asking for NTLM auth with the proxy:
                            
curl -U daniel:secr3t -x myproxy:80 http://example.com --proxy-ntlm

                              There's also the option that asks curl
                                    to figure out which method the proxy wants
                                    and supports and then go with that (with the
                                    possible expense of extra roundtrips) using --proxy-anyauth. Asking curl to use any method the proxy
                                    wants is then like this:
                            
curl -U daniel:secr3t -x myproxy:80 http://example.com --proxy-anyauth

                                HTTPS to proxy
                                      
                              All the previously mentioned protocols to
                                    speak with the proxy are clear text
                                    protocols, HTTP and the SOCKS versions.
                                    Using these methods could allow someone to
                                    eavesdrop on your traffic the local network
                                    where you or the proxy reside.
                            
                              One solution for that is to use HTTPS to
                                    the proxy, which then establishes a secure
                                    and encrypted connection that is safe from
                                    easy surveillance.
                            
                              When a HTTPS proxy is specified, the
                                    default port will be 443.
                            
                                Proxy environment variables
                                      
                              curl checks for the existence of specially
                                    named environment variables before it runs
                                    to see if a proxy is requested to get
                                    used.
                            
                              You specify the proxy by setting a variable
                                    named [scheme]_proxy
                                    to hold the proxy host name (the same way
                                    you would specify the host with -x). So if you want to tell curl to use a
                                    proxy when access a HTTP server, you set the
                                    'http_proxy' environment variable.
                                    Like this:
                            
http_proxy=http://proxy.example.com:80
curl -v www.example.com

                              While the above example shows HTTP, you
                                    can, of course, also set ftp_proxy,
                                    https_proxy, and so on. All these proxy
                                    environment variable names except http_proxy
                                    can also be specified in uppercase, like
                                    HTTPS_PROXY.
                            
                              To set a single variable that controls all
                                    protocols, the ALL_PROXY exists. If a
                                    specific protocol variable one exists, such
                                    a one will take precedence.
                            
                              When using environment variables to set a
                                    proxy, you could easily end up in a
                                    situation where one or a few host names
                                    should be excluded from going through the
                                    proxy. This is then done with the NO_PROXY
                                    variable. Set that to a comma- separated
                                    list of host names that should not use a
                                    proxy when being accessed. You can set
                                    NO_PROXY to be a single asterisk
                                    ('*') to match all hosts.
                            
                              As an alternative to the NO_PROXY variable,
                                    there's also a --noproxy
                                    command line option that serves the same
                                    purpose and works the same way.
                            
                                http_proxy in lower case only
                                      
                              The HTTP version of the proxy environment
                                    variables is treated differently than the
                                    others. It is only accepted in its lower
                                    case version because of the CGI protocol,
                                    which lets users run scripts in a server
                                    when invoked by an HTTP server. When a CGI
                                    script is invoked by a server, it
                                    automatically creates environment variables
                                    for the script based on the incoming headers
                                    in the request. Those environment variables
                                    are prefixed with uppercase HTTP_!
                            
                              An incoming request to a HTTP server using
                                    a request header like Proxy: yada
                                    will therefore create the environment
                                    variable HTTP_PROXY
                                    set to contain yada
                                    before the CGI script is started. If that
                                    CGI script runs curl...
                            
                              Accepting the upper case version of this
                                    environment variable has been the source for
                                    many security problems in lots of software
                                    through times.
                            
                                Proxy headers
                                      
                              When you want to add HTTP headers meant
                                    specifically for a proxy and not for the
                                    remote server, the --header
                                    option falls short.
                            
                              For example, if you issue a HTTPS request
                                    through a HTTP proxy, it will be done by
                                    first issuing a CONNECT
                                    to the proxy that establishes a tunnel to
                                    the remote server and then it sends the
                                    request to that server. That first CONNECT
                                    is only issued to the proxy and you may want
                                    to make sure only that receives your special
                                    header, and send another set of custom
                                    headers to the remote server.
                            
                              Set a specific different User-Agent:
                                    only to the proxy:
                            
curl --proxy-header "User-Agent: magic/3000" -x proxy https://example.com/