zhangguanzhang's Blog

A quick primer on the client side of Go's net/http package

2019/07/07

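A while ago I set out to write a simple API call. After searching Baidu and asking a lot of people, I found that very few are really familiar with the HTTP client. Others, not knowing the package well, reinvent inefficient wheels. Some of the official packages also have pitfalls that have bitten me, so here is a quick primer.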

Simple GET and POST

The http package provides the following request functions that can be used directly:

func Head(url string) (resp *Response, err error)
func Get(url string) (resp *Response, err error)
func Post(url string, bodyType string, body io.Reader) (resp *Response, err error)
func PostForm(url string, data url.Values) (resp *Response, err error)

The variable DefaultClient is the default Client used by the package-level functions Get, Head, and Post.

var DefaultClient = &Client{}

For example, calling them directly:

package main

import (
	"bytes"
	"net/http"
	"net/url"
)

func main() {
	var buf bytes.Buffer // request body for the upload

	resp, err := http.Get("http://example.com/")
	...
	resp, err = http.Post("http://example.com/upload", "image/jpeg", &buf)
	...
	resp, err = http.PostForm("http://example.com/form",
		url.Values{"key": {"Value"}, "id": {"123"}})
	...
}

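If a file downloaded with GET turns out corrupted or garbled after you write its bytes straight to disk, try writing the binary stream with the other byte order (big-endian vs. little-endian).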

Setting headers

Looking at the source, the package-level http.Get simply calls the Get method of the default client:

func Get(url string) (resp *Response, err error) {
	return DefaultClient.Get(url)
}

Client.Get first creates an http.Request object and then calls Client.Do to send the request:

func (c *Client) Get(url string) (resp *Response, err error) {
	req, err := NewRequest("GET", url, nil)
	if err != nil {
		return nil, err
	}
	return c.Do(req)
}
...
func (c *Client) Do(req *Request) (*Response, error) {
	return c.do(req)
}

http.NewRequest returns a *Request, and the Request struct has a Header field:

type Request struct {
	...
	Header Header
	...
}

The Header type implements the following methods for setting and reading the headers sent with a request:

type Header map[string][]string

func (h Header) Add(key, value string) {
	textproto.MIMEHeader(h).Add(key, value)
}

func (h Header) Set(key, value string) {
	textproto.MIMEHeader(h).Set(key, value)
}

func (h Header) Get(key string) string {
	return textproto.MIMEHeader(h).Get(key)
}
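The difference between these three methods is easy to miss from the signatures alone. A small sketch, with buildHeader as a hypothetical helper and made-up header values: Set replaces any existing values, Add appends, and Get returns only the first value, with keys canonicalized so that lookups are case-insensitive.

```go
package main

import (
	"fmt"
	"net/http"
)

// buildHeader returns a header that exercises both Set and Add.
func buildHeader() http.Header {
	h := http.Header{}
	h.Set("accept", "text/html")        // the key is stored as "Accept"
	h.Add("Accept", "application/json") // Add appends a second value
	h.Set("X-Token", "a")
	h.Set("X-Token", "b") // Set replaces whatever was there
	return h
}

func main() {
	h := buildHeader()
	fmt.Println(h.Get("Accept"))    // first value only
	fmt.Println(h.Values("Accept")) // every value stored under the key
	fmt.Println(h.Get("x-token"))   // lookup ignores the case of the key
}
```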

So custom headers can be written like this: create a request with http.NewRequest, set the headers with the request's Header.Set, and finally call the client's Do(req) to send it.

package main

import (
	"log"
	"net/http"
)

func main() {
	req, err := http.NewRequest("GET", "http://example.com/", nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Header.Set("Origin", "xxxxxx")
	req.Header.Set("Accept-Encoding", "gzip, deflate, br")
	req.Header.Set("Accept-Language", "zh-CN,zh;q=0.9")
	req.Header.Set("User-Agent", "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36")
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "application/json, text/javascript, */*; q=0.01")
	req.Header.Set("Referer", "xxxxxx")
	req.Header.Set("X-Requested-With", "XMLHttpRequest")
	req.Header.Set("Connection", "keep-alive")
	req.Header.Set("X-Csrftoken", "xxxxxx")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
}

For example, sending a POST request:

// Excerpt from a login function that returns an error; sessionUrl, baseUrl
// and the CSR type are defined elsewhere.
jar, _ := cookiejar.New(nil)
client := &http.Client{Jar: jar}

body := strings.NewReader(`username=admin&password=Password%40_`)
req, err := http.NewRequest("POST", sessionUrl, body)
if err != nil {
	return err
}

// Setting Accept-Encoding by hand turns off net/http's transparent gzip
// decoding, so the body read below may arrive compressed.
req.Header.Set("Accept-Encoding", "gzip, deflate, br")
req.Header.Set("Accept-Language", "zh-CN,zh;q=0.9")
req.Header.Set("User-Agent", "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36")
req.Header.Set("Content-Type", "application/x-www-form-urlencoded; charset=UTF-8")
req.Header.Set("Accept", "application/json, text/javascript, */*; q=0.01")
req.Header.Set("Referer", baseUrl)
req.Header.Set("X-Requested-With", "XMLHttpRequest")
req.Header.Set("Connection", "keep-alive")

resp, err := client.Do(req)
if err != nil {
	return err //errors.New("Login Timeout")
}
defer resp.Body.Close()

respBody, err := ioutil.ReadAll(resp.Body)
if err != nil {
	return err
}
var data = &CSR{}
if err := json.Unmarshal(respBody, data); err != nil {
	return err
}
//fmt.Println(string(respBody))
if data.PasswordModify != 0 {
	return errors.New("Password Wrong")
}
return nil

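Custom clients

Everything above uses the default client defined in the package. Sometimes that is not enough: some sites use a certificate that was not issued by a trusted authority, so we have to turn off certificate verification in the client, much like curl -k. Or we may want to set a client timeout.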

client := &http.Client{
	Timeout: time.Second * 3,
	Transport: &http.Transport{
		// Skips certificate verification, like curl -k.
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
	},
}
// Error handling omitted for brevity.
req, err := http.NewRequest("GET", "http://example.com/", nil)
resp, err := client.Do(req)

cookies

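Everything so far needs no login, or at least is not an API-style site. API-style sites usually have you authenticate first (basic auth, OAuth2, or the like) to obtain a token, then send that token with every later call, with no further header juggling. But some sites offer no API at all, so the HTTP client has to keep a session and imitate a human login.
The session lives in the HTTP Cookie header, e.g. cookie: xxx=yyy; aaa=bbb; session_id=93728560xxxx; ..... (HTTP header keys are case-insensitive). After the client sends a request, the server's response carries Set-Cookie, and the client then stores those key-value pairs as cookies on its own. Some sites put a token in the cookies as the authenticated identity: the old Baidu Tieba auto sign-in tools and pandownload's login download both asked users to dig a few field values out of their cookies and paste them in, and the program sent them along with its requests. What goes into a cookie is up to each web server, so I won't go into detail here.

A browser handles Set-Cookie automatically, so the net/http source must contain a matching piece of code. The Do method ends up calling the send method to send the request: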

func (c *Client) send(req *Request, deadline time.Time) (resp *Response, didTimeout func() bool, err error) {
	if c.Jar != nil {
		for _, cookie := range c.Jar.Cookies(req.URL) {
			req.AddCookie(cookie)
		}
	}
	resp, didTimeout, err = send(req, c.transport(), deadline)
	if err != nil {
		return nil, didTimeout, err
	}
	if c.Jar != nil {
		if rc := resp.Cookies(); len(rc) > 0 {
			c.Jar.SetCookies(req.URL, rc)
		}
	}
	return resp, nil, nil
}

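If the client's Jar is not nil, send calls SetCookies on it. So one way to use cookies is to write them into the header ourselves: log in with a browser, press F12, capture the traffic in the Network tab, click the request, and find the Cookie value in its headers.

Once found, set it with req.Header.Set("Cookie", "xxxxxx") or req.AddCookie(xxx). That is very tedious, so normally we just create a new client whose Jar is not nil: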

jar, _ := cookiejar.New(nil)
h.http = &http.Client{
	...
	Jar: jar,
	...
}

From then on, this client handles the cookies sent by the server automatically, just like a browser, and keeps the session alive.

multipart/form-data

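When a file is uploaded over HTTP, the body is sent in parts, and a random string (the boundary) is generated to separate them.

The boundary is generated by each HTTP client; Chrome's seems to look different from the others. Either way, the Content-Type of a file upload is: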

Content-Type: multipart/form-data; boundary=the random string that separates the parts

The boundary in the Content-Type is random, so we have to use the "mime/multipart" package to handle it:

import (
	"bytes"
	"fmt"
	"io"
	"io/ioutil"
	"mime/multipart"
	"net/http"
	"os"
	"path/filepath"
)

func postFile(filename string, targetUrl string) error {
	bodyBuf := &bytes.Buffer{}                 // in-memory buffer for the request body
	bodyWriter := multipart.NewWriter(bodyBuf) // writer that creates the parts

	// The key step. Take the field name from your own browser capture (mine
	// was fwimage), and use filepath.Base so the file name carries no path.
	fileWriter, err := bodyWriter.CreateFormFile("file", filepath.Base(filename))
	if err != nil {
		fmt.Println("error writing to buffer")
		return err
	}

	fh, err := os.Open(filename)
	if err != nil {
		fmt.Println("error opening file")
		return err
	}
	defer fh.Close()

	// Copy the file contents into the part.
	if _, err = io.Copy(fileWriter, fh); err != nil {
		return err
	}

	_ = bodyWriter.WriteField("id", "WU_FILE_0")

	// write some others if needed
	// p1w, _ := bodyWriter.CreateFormField("name")
	// p1w.Write([]byte("Tony Bai"))

	// Must be closed before the request is sent: Close writes the
	// terminating boundary that marks the end of the body.
	if err := bodyWriter.Close(); err != nil {
		return err
	}

	req, err := http.NewRequest("POST", targetUrl, bodyBuf)
	if err != nil {
		return err
	}

	...
	req.Header.Set("Content-Type", bodyWriter.FormDataContentType()) // Content-Type including the boundary

	resp, err := http.DefaultClient.Do(req) // use your own client here; don't copy this blindly
	if err != nil {
		return err
	}
	defer resp.Body.Close()

	resp_body, err := ioutil.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	fmt.Println(resp.Status)
	fmt.Println(string(resp_body))
	return nil
}

digest auth

First, look at what a digest auth exchange looks like with curl:

curl -svX GET --digest -u admin:'xxxxxxx' http://100.64.16.10:8080/cas/casrs/operator/getAuthUrl -H 'Accept: application/json'
* About to connect() to 100.64.16.10 port 8080 (#0)
* Trying 100.64.16.10...
* Connected to 100.64.16.10 (100.64.16.10) port 8080 (#0)
* Server auth using Digest with user 'admin'
> GET /cas/casrs/operator/getAuthUrl HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 100.64.16.10:8080
> Accept: application/json
>
< HTTP/1.1 401 Unauthorized
< Cache-Control: no-cache, no-store, max-age=0, must-revalidate
< Pragma: no-cache
< Expires: 0
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: DENY
< X-Content-Type-Options: nosniff
< Set-Cookie: JSESSIONID=F65AC0489A30F3A69175FD59B10F9CC5; Path=/cas/; HttpOnly
< WWW-Authenticate: Digest realm="VMC RESTful Web Services", qop="auth", nonce="MTU2NDUzNjE3MTUwMzo0MTJjOGM3MGU1MDlmZDhiMDlhM2YzNTBhYjExOGRhMg=="
< Content-Type: text/html;charset=ISO-8859-1
< Content-Length: 686
< Date: Wed, 31 Jul 2019 01:17:51 GMT
< Server: CVM
<
* Ignoring the response-body
* Connection #0 to host 100.64.16.10 left intact
* Issue another request to this URL: 'http://100.64.16.10:8080/cas/casrs/operator/getAuthUrl'
* Found bundle for host 100.64.16.10: 0xb50040
* Re-using existing connection! (#0) with host 100.64.16.10
* Connected to 100.64.16.10 (100.64.16.10) port 8080 (#0)
* Server auth using Digest with user 'admin'
> GET /cas/casrs/operator/getAuthUrl HTTP/1.1
> Authorization: Digest username="admin", realm="VMC RESTful Web Services", nonce="MTU2NDUzNjE3MTUwMzo0MTJjOGM3MGU1MDlmZDhiMDlhM2YzNTBhYjExOGRhMg==", uri="/cas/casrs/operator/getAuthUrl", cnonce="ICAgICAgICAgICAgICAgICAgICAgICAgIDIxMTgzODA=", nc=00000001, qop=auth, response="bc0c68695d1d51daa364d7a4976566b7"
> User-Agent: curl/7.29.0
> Host: 100.64.16.10:8080
> Accept: application/json
>
< HTTP/1.1 200 OK
< Cache-Control: no-cache, no-store, max-age=0, must-revalidate
< Pragma: no-cache
< Expires: 0
< X-XSS-Protection: 1; mode=block
< X-Frame-Options: DENY
< X-Content-Type-Options: nosniff
< Set-Cookie: JSESSIONID=F06F0016DB0589032994C4D2C296B604; Path=/cas/; HttpOnly
< Content-Type: application/json
< Transfer-Encoding: chunked
< Date: Wed, 31 Jul 2019 01:17:51 GMT
< Server: CVM
<
* Connection #0 to host 100.64.16.10 left intact

The HTTP digest auth flow is:

  • After the first request, the server responds with 401 and a WWW-Authenticate header that carries three fields: Digest realm, qop, and nonce.
  • curl repeats the request with an Authorization header whose contents are:
    • Digest username is the user name
    • realm is the value of Digest realm; nonce and qop are the values returned by the server
    • uri is the URL with the host part removed
    • nc is the nonce count, a counter used to guard against replay attacks; this is the first use, so it is 1
    • cnonce is a random string the client sends to the server
    • response is built from two hashes, HA1 and HA2; which expressions apply depends on the algorithm and qop fields. Pseudocode is the easiest way to show it:
      • If algorithm is undefined or MD5:
        HA1 = MD5(fmt.Sprintf("%s:%s:%s", username, realm, password))
      • If algorithm is MD5-sess (same as above, with :nonce:cnonce appended):
        HA1 = MD5(fmt.Sprintf("%s:%s:%s", MD5(fmt.Sprintf("%s:%s:%s", username, realm, password)), nonce, cnonce))
      • If qop is undefined or auth:
        HA2 = MD5(fmt.Sprintf("%s:%s", method, digestURI))
      • If qop is auth-int:
        HA2 = MD5(fmt.Sprintf("%s:%s:%s", method, digestURI, MD5(entityBody)))
      • If qop is auth or auth-int, response is:
        response = MD5(fmt.Sprintf("%s:%s:%s:%s:%s:%s", HA1, nonce, nonceCount, cnonce, qop, HA2))
      • If qop is undefined:
        response = MD5(fmt.Sprintf("%s:%s:%s", HA1, nonce, HA2))
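The formulas can be checked against a known answer. The sketch below covers only the algorithm=MD5 (or undefined), qop=auth case and feeds it the sample values from the worked example in RFC 2617, section 3.5, since the password behind the curl trace above is not shown. md5Hex and digestResponse are hypothetical helper names.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// md5Hex returns the lowercase hex MD5 digest of s.
func md5Hex(s string) string {
	sum := md5.Sum([]byte(s))
	return hex.EncodeToString(sum[:])
}

// digestResponse computes the response value for algorithm=MD5 (or
// undefined) combined with qop=auth.
func digestResponse(username, realm, password, method, uri, nonce, nc, cnonce, qop string) string {
	ha1 := md5Hex(username + ":" + realm + ":" + password)
	ha2 := md5Hex(method + ":" + uri)
	return md5Hex(ha1 + ":" + nonce + ":" + nc + ":" + cnonce + ":" + qop + ":" + ha2)
}

func main() {
	// Sample values taken from the example in RFC 2617, section 3.5.
	fmt.Println(digestResponse(
		"Mufasa", "testrealm@host.com", "Circle Of Life",
		"GET", "/dir/index.html",
		"dcd98b7102dd2f0e8b11d0f600bfb0c093", "00000001", "0a4f113b", "auth"))
}
```

With those inputs the RFC lists the response as 6629fae49393a05397450978507c4ef1, which makes a convenient test vector before pointing the code at a real server.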

So the code can be written like this:

package cas

import (
	"crypto/md5"
	"crypto/rand"
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"io/ioutil"
	"net/http"
	"net/http/cookiejar"
	"strings"
)

type CAS struct {
	client  *http.Client
	baseUrl string
}

func NewCAS(url, user, pass string) (*CAS, error) {
	c := &CAS{
		baseUrl: url,
	}

	//tr := &http.Transport{
	//	TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
	//}
	jar, _ := cookiejar.New(nil)
	c.client = &http.Client{
		Jar: jar,
	}

	// First request without credentials; the server is expected to answer 401.
	req, err := c.NewRequest("GET", "/cas/casrs/operator/getAuthUrl", nil)
	if err != nil {
		return nil, err
	}

	resp, err := c.client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode == http.StatusUnauthorized {
		digestParts := getDigestParts(resp)
		// uri must be the request path without the host, as in the curl trace.
		digestParts["uri"] = req.URL.RequestURI()
		digestParts["method"] = req.Method
		digestParts["username"] = user
		digestParts["password"] = pass
		req, err = c.NewRequest("GET", "/cas/casrs/operator/getAuthUrl", nil)
		if err != nil {
			return nil, err
		}
		req.Header.Set("Authorization", getDigestAuthrization(digestParts))
		req.Header.Set("Content-Type", "application/json")
		resp, err = c.client.Do(req)
		if err != nil {
			return nil, err
		}
		defer resp.Body.Close()
		body, _ := ioutil.ReadAll(resp.Body)

		var data = struct {
			URI string `json:"uri"`
		}{}
		err = json.Unmarshal(body, &data)
		if err != nil {
			fmt.Println(string(body))
			return nil, fmt.Errorf("NewCAS:json.Unmarshal failed|%s", err.Error())
		}
		if data.URI == "" {
			return nil, errors.New("maybe username or password error!")
		}
	}

	return c, nil
}

func (c *CAS) NewRequest(method, url string, body io.Reader) (*http.Request, error) {
	req, err := http.NewRequest(method, fmt.Sprintf("%s%s", c.baseUrl, url), body)
	if err != nil {
		return nil, err
	}

	req.Header.Set("Accept", "application/json, text/plain, */*")
	req.Header.Set("Connection", "keep-alive")
	if body != nil {
		req.Header.Set("Content-Type", "application/json;charset=UTF-8")
	}
	return req, nil
}

// getDigestParts extracts nonce, realm, qop and algorithm from the
// WWW-Authenticate header of a 401 response.
func getDigestParts(resp *http.Response) map[string]string {
	result := map[string]string{}
	// Header.Get canonicalizes the key; indexing the map with
	// "WWW-Authenticate" directly would miss the stored "Www-Authenticate".
	header := resp.Header.Get("WWW-Authenticate")
	wantedHeaders := []string{"nonce", "realm", "qop", "algorithm"}
	for _, r := range strings.Split(header, ",") {
		kv := strings.SplitN(strings.TrimSpace(r), "=", 2)
		if len(kv) != 2 {
			continue
		}
		key := strings.TrimPrefix(kv[0], "Digest ")
		for _, w := range wantedHeaders {
			if key == w {
				// Values may be quoted (nonce) or bare (algorithm=MD5).
				result[w] = strings.Trim(kv[1], `"`)
			}
		}
	}
	return result
}

func getDigestAuthrization(digestParts map[string]string) string {
	var ha1, ha2, response string
	d := digestParts

	getMD5 := func(text string) string {
		hasher := md5.New()
		hasher.Write([]byte(text))
		return hex.EncodeToString(hasher.Sum(nil))
	}

	getCnonce := func() string {
		b := make([]byte, 8)
		_, _ = io.ReadFull(rand.Reader, b)
		return fmt.Sprintf("%x", b)[:16]
	}
	cnonce := getCnonce()
	ha1 = getMD5(d["username"] + ":" + d["realm"] + ":" + d["password"])
	if d["algorithm"] == "MD5-sess" {
		ha1 = getMD5(ha1 + ":" + d["nonce"] + ":" + cnonce)
	}

	// qop=auth-int needs a hash of the entity body and is not handled here.
	if d["qop"] != "auth-int" {
		ha2 = getMD5(d["method"] + ":" + d["uri"])
	}
	nonceCount := "00000001" // nc is sent as eight hex digits
	if len(d["qop"]) == 0 {
		response = getMD5(fmt.Sprintf("%s:%s:%s", ha1, d["nonce"], ha2))
	} else {
		response = getMD5(fmt.Sprintf("%s:%s:%s:%s:%s:%s", ha1, d["nonce"], nonceCount, cnonce, d["qop"], ha2))
	}

	// nc and qop are left unquoted, matching the curl trace above.
	authorization := fmt.Sprintf(`Digest username="%s", realm="%s", nonce="%s", uri="%s", cnonce="%s", nc=%s, qop=%s, response="%s"`,
		d["username"], d["realm"], d["nonce"], d["uri"], cnonce, nonceCount, d["qop"], response)
	return authorization
}

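I haven't figured out what entityBody is yet, so the qop=auth-int case is not written.

Some pitfalls

The Host field of the header

Earlier I tried porting some logic written with curl to Go and it kept failing; the culprit turned out to be the Host field. In the captured traffic, the Host field differed from the URL being requested (not a common situation, but it does happen, e.g. when we hit an IP from the command line and set the header to simulate accessing a domain name). In the end, the only thing that worked was removing the Host header setting.
For details see https://github.com/golang/go/issues/7682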
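For the case where the Host really does have to differ from the URL, net/http provides the Host field on the request itself; a Host entry placed in req.Header is not written to the wire by the client. The sketch below shows both against a local httptest server. echoHost and seenHost are hypothetical helpers, and example.com is just a placeholder host name.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// echoHost starts a local server that replies with the Host it received.
func echoHost() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Host)
	}))
}

// seenHost sends a GET to url and returns the Host the server saw.
// With viaField the override goes into req.Host, otherwise into req.Header.
func seenHost(url, host string, viaField bool) (string, error) {
	req, err := http.NewRequest("GET", url, nil)
	if err != nil {
		return "", err
	}
	if viaField {
		req.Host = host // sent as the Host header
	} else {
		req.Header.Set("Host", host) // skipped when the request is written
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	srv := echoHost()
	defer srv.Close()
	viaHeader, _ := seenHost(srv.URL, "example.com", false)
	viaField, _ := seenHost(srv.URL, "example.com", true)
	fmt.Println(viaHeader) // the test server's own address
	fmt.Println(viaField)  // example.com
}
```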

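The Content-Type of uploaded files

The official CreateFormFile function fixes the part's Content-Type to application/octet-stream, and there is no plan to change that, yet many backends care about this type. As my browser capture earlier showed, the type in my case was application/octet-binary, so we can write the following function to handle it: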

// Needs the fmt, io, mime/multipart and net/textproto packages.
func createAudioFormFile(w *multipart.Writer, fieldname, filename string) (io.Writer, error) {
	h := make(textproto.MIMEHeader)
	h.Set("Content-Disposition", fmt.Sprintf(`form-data; name="%s"; filename="%s"`, fieldname, filename))
	h.Set("Content-Type", "application/octet-binary")
	return w.CreatePart(h)
}

It can be used like this:

fileWriter, err := bodyWriter.CreateFormFile("fwimage", filepath.Base(filename))
// becomes
fileWriter, err := createAudioFormFile(bodyWriter, "fwimage", filepath.Base(filename))

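A JSON pitfall

This one is hard to spot in a capture, but it is part of the problem: the data from the API I was calling passed through a CDN, and int values often came back with a trailing .0, so I declared the wrong type and json.Unmarshal returned an error. jq also drops the .0 and shows a whole number, which hides the issue.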
