Commit cdc9a0a

updated transfer protocols chapter
1 parent 184063c commit cdc9a0a

1 file changed

+184
-2
lines changed

text/54_Transfer_Protocols/0_Transfer_Protocols.markdown

Lines changed: 184 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,187 @@
11
## Transfer Protocols ##
22

3-
receive-pack
3+
Here we will go over how clients and servers talk to each other to
4+
transfer Git data around.
5+
6+
### Fetching Data over HTTP ###
7+
8+
Fetching over an http/s URL will make Git use a slightly dumber protocol.
9+
In this case, all of the logic is entirely on the client side. The server
10+
requires no special setup - any static webserver will work fine if the
11+
git directory you are fetching from is in the webserver path.
12+
13+
In order for this to work, you do need to run a single command on the
14+
server repo every time anything is updated, though - linkgit:git-update-server-info[1],
15+
which updates the objects/info/packs and info/refs files to list which refs
16+
and packfiles are available, since you can't do a listing over http. When
17+
that command is run, the objects/info/packs file looks something like this:
18+
19+
P pack-ce2bd34abc3d8ebc5922dc81b2e1f30bf17c10cc.pack
20+
P pack-7ad5f5d05f5e20025898c95296fe4b9c861246d8.pack
21+
22+
That way, if the fetch can't find a loose object, it can try these packfiles. The
23+
info/refs file will look something like this:
24+
25+
184063c9b594f8968d61a686b2f6052779551613 refs/heads/development
26+
32aae7aef7a412d62192f710f2130302997ec883 refs/heads/master
27+
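As a rough illustration of how a client might read these two files, here is a minimal Python sketch. The function names `parse_info_refs` and `parse_info_packs` are made up for this example and are not part of Git; the sketch only assumes that each info/refs line is a SHA followed by whitespace and a ref name, and that each objects/info/packs line is `P` followed by a pack name, as in the listings above.

```python
def parse_info_refs(text):
    """Map each ref name to the SHA it points at (hypothetical helper)."""
    refs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # SHA and ref name are separated by whitespace.
        sha, name = line.split(None, 1)
        refs[name] = sha
    return refs


def parse_info_packs(text):
    """Return the pack file names listed on 'P <name>' lines (hypothetical helper)."""
    packs = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "P":
            packs.append(parts[1])
    return packs
```

With the refs in hand, the client knows which SHA to start walking from; the pack list is only consulted when a loose object is missing.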
28+
Then when you fetch from this repo, it will start with these refs and walk the
29+
commit objects until the client has all the objects that it needs.
30+
31+
For instance, if you ask to fetch the master branch, it will see that master is
32+
pointing to <code>32aae7ae</code> and that your master is pointing to <code>ab04d88</code>,
33+
so you need <code>32aae7ae</code>. You fetch that object:
34+
35+
CONNECT http://myserver.com
36+
GET /git/myproject.git/objects/32/aae7aef7a412d62192f710f2130302997ec883 - 200
37+
38+
and it looks like this:
39+
40+
tree aa176fb83a47d00386be237b450fb9dfb5be251a
41+
parent bd71cad2d597d0f1827d4a3f67bb96a646f02889
42+
author Scott Chacon <schacon@gmail.com> 1220463037 -0700
43+
committer Scott Chacon <schacon@gmail.com> 1220463037 -0700
44+
45+
added chapters on private repo setup, scm migration, raw git
46+
47+
So now it fetches the tree <code>aa176fb8</code>:
48+
49+
GET /git/myproject.git/objects/aa/176fb83a47d00386be237b450fb9dfb5be251a - 200
50+
51+
which looks like this:
52+
53+
100644 blob 6ff87c4664981e4397625791c8ea3bbb5f2279a3 COPYING
54+
100644 blob 97b51a6d3685b093cfb345c9e79516e5099a13fb README
55+
100644 blob 9d1b23b8660817e4a74006f15fae86e2a508c573 Rakefile
56+
57+
So then it fetches those objects:
58+
59+
GET /git/myproject.git/objects/6f/f87c4664981e4397625791c8ea3bbb5f2279a3 - 200
60+
GET /git/myproject.git/objects/97/b51a6d3685b093cfb345c9e79516e5099a13fb - 200
61+
GET /git/myproject.git/objects/9d/1b23b8660817e4a74006f15fae86e2a508c573 - 200
62+
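The GET requests above all follow the same pattern: the first two hex characters of the SHA become a directory name and the remaining 38 become the file name. The Python sketch below shows that mapping, plus how a downloaded loose object could be unpacked, given that loose objects are stored zlib-compressed with a `<type> <size>` header followed by a null byte. The helper names are invented for illustration.

```python
import zlib


def loose_object_path(sha):
    """Return the repository-relative path of a loose object (hypothetical helper)."""
    if len(sha) != 40:
        raise ValueError("expected a full 40-character SHA")
    return "objects/%s/%s" % (sha[:2], sha[2:])


def parse_loose_object(raw):
    """Decompress a loose object and split it into (type, body)."""
    data = zlib.decompress(raw)
    header, _, body = data.partition(b"\0")
    kind, size = header.split(b" ")
    # The header records the body length; check it as a sanity test.
    if int(size.decode("ascii")) != len(body):
        raise ValueError("size in header does not match body length")
    return kind.decode("ascii"), body
```

For example, `loose_object_path("6ff87c4664981e4397625791c8ea3bbb5f2279a3")` gives the path used in the first GET above, relative to the repository directory.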
63+
It actually does this with Curl, and can open up multiple parallel threads to
64+
speed up this process. When it's done recursing the tree pointed to by the
65+
commit, it fetches the next parent.
66+
67+
GET /git/myproject.git/objects/bd/71cad2d597d0f1827d4a3f67bb96a646f02889 - 200
68+
69+
Now in this case, the commit that comes back looks like this:
70+
71+
tree b4cc00cf8546edd4fcf29defc3aec14de53e6cf8
72+
parent ab04d884140f7b0cf8bbf86d6883869f16a46f65
73+
author Scott Chacon <schacon@gmail.com> 1220421161 -0700
74+
committer Scott Chacon <schacon@gmail.com> 1220421161 -0700
75+
76+
added chapters on the packfile and how git stores objects
77+
78+
and we can see that the parent, <code>ab04d88</code>, is where our master branch
79+
is currently pointing. So, we recursively fetch this tree and then stop, since
80+
we know we have everything before this point. You can force Git to double check
81+
that we have everything with the '--recover' option. See linkgit:git-http-fetch[1]
82+
for more information.
83+
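The walk described above can be sketched in a few lines of Python. This is a simplified model, not Git's actual implementation: the object store is an in-memory dictionary rather than a web server, and the names `objects_to_fetch`, `commits`, and `trees` as well as the toy object ids in the usage note are all made up for the example.

```python
def objects_to_fetch(want, have, commits, trees):
    """List the objects a client would need, walking back from `want`.

    commits maps a commit id to (tree id, [parent ids]).
    trees maps a tree id to the ids of its entries (blobs or subtrees).
    have is the set of commit ids the client already has.
    """
    needed = []
    seen = set()
    pending = [want]
    while pending:
        commit = pending.pop()
        # Stop at commits we already have: everything behind them is local.
        if commit in seen or commit in have:
            continue
        seen.add(commit)
        needed.append(commit)
        tree_id, parents = commits[commit]
        # Recurse through the tree this commit points to.
        tree_pending = [tree_id]
        while tree_pending:
            obj = tree_pending.pop()
            if obj in seen:
                continue
            seen.add(obj)
            needed.append(obj)
            tree_pending.extend(trees.get(obj, []))
        pending.extend(parents)
    return needed
```

With a toy history `c3 -> c2 -> c1` where the client already has `c1`, the function returns `c3`, `c2`, and the trees and blobs they reference, and never touches `c1` or its tree.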
84+
If one of the loose object fetches fails, Git will download the packfile indexes
85+
looking for the sha that it needs, then download that packfile.
86+
87+
It is important if you are running a git server that serves repos this way to
88+
implement a post-receive hook that runs the 'git update-server-info' command
89+
after each push; otherwise info/refs and objects/info/packs go stale and fetches may miss new commits.
90+
91+
### Fetching Data with Upload Pack ###
92+
93+
For the smarter protocols, fetching objects is much more efficient. A socket
94+
is opened, either over ssh or over port 9418 (in the case of the git:// protocol),
95+
and the linkgit:git-fetch-pack[1] command on the client begins communicating with
96+
a forked linkgit:git-upload-pack[1] process on the server.
97+
98+
Then the server will tell the client which SHAs it has for each ref,
99+
and the client figures out what it needs and responds with a list of SHAs it
100+
wants and already has.
101+
102+
At this point, the server will generate a packfile with all the objects that
103+
the client needs and begin streaming it down to the client.
104+
105+
Let's look at an example.
106+
107+
The client connects and sends the request header. The clone command
108+
109+
$ git clone git://myserver.com/project.git
110+
111+
produces the following request:
112+
113+
0033git-upload-pack /project.git\000host=myserver.com\000
114+
115+
The first four bytes contain the hex length of the line (including 4 byte line
116+
length and trailing newline if present). Following are the command and
117+
arguments. This is followed by a null byte and then the host information. The
118+
request is terminated by a null byte.
119+
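The length prefix is easy to compute by hand or in code. Here is a minimal Python sketch of the framing just described; `pkt_line` and `FLUSH` are names chosen for this example, not Git identifiers.

```python
FLUSH = b"0000"  # a length of 0000 marks the end of a section


def pkt_line(payload):
    """Prefix payload with its length as four hex digits.

    The length counts the four length bytes themselves plus the payload,
    including any trailing newline.
    """
    length = len(payload) + 4
    if length > 0xFFFF:
        raise ValueError("payload too long for a four-digit hex length")
    return b"%04x" % length + payload
```

For instance, `pkt_line(b"done\n")` yields `b"0009done\n"`: four bytes of length, four letters, and one newline.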
120+
The request is processed and turned into a call to git-upload-pack:
121+
122+
$ git-upload-pack /path/to/repos/project.git
123+
124+
This immediately returns information about the repo:
125+
126+
007c74730d410fcb6603ace96f1dc55ea6196122532d HEAD\000multi_ack thin-pack side-band side-band-64k ofs-delta shallow no-progress
127+
003e7d1665144a3a975c05f1f43902ddaf084e784dbe refs/heads/debug
128+
003d5a3f6be755bbb7deae50065988cbfa1ffa9ab68a refs/heads/dist
129+
003e7e47fe2bd8d01d481f44d7af0531bd93d3b21c01 refs/heads/local
130+
003f74730d410fcb6603ace96f1dc55ea6196122532d refs/heads/master
131+
0000
132+
133+
Each line starts with a four byte line length declaration in hex. The section
134+
is terminated by a line length declaration of 0000.
135+
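Reading this advertisement back is the reverse operation: read four hex digits, take that many bytes, and stop at 0000. The Python sketch below assumes the whole response is already in memory as bytes; the function names are hypothetical. It also splits off the capability list that follows the null byte on the first line.

```python
def read_pkt_lines(data):
    """Yield each packet's payload until the 0000 flush packet."""
    pos = 0
    while pos + 4 <= len(data):
        length = int(data[pos:pos + 4].decode("ascii"), 16)
        if length == 0:
            return  # flush packet: end of this section
        if length < 4:
            raise ValueError("invalid packet length")
        yield data[pos + 4:pos + length]
        pos += length


def parse_ref_advertisement(data):
    """Return ({ref name: sha}, [capabilities]) from the server's first reply."""
    refs = {}
    capabilities = []
    for index, payload in enumerate(read_pkt_lines(data)):
        line = payload.rstrip(b"\n")
        # Only the first line carries capabilities, after a null byte.
        if index == 0 and b"\0" in line:
            line, caps = line.split(b"\0", 1)
            capabilities = caps.decode("ascii").split()
        sha, name = line.decode("ascii").split(" ", 1)
        refs[name] = sha
    return refs, capabilities
```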
136+
This is sent back to the client verbatim. The client responds with another
137+
request:
138+
139+
0054want 74730d410fcb6603ace96f1dc55ea6196122532d multi_ack side-band-64k ofs-delta
140+
0032want 7d1665144a3a975c05f1f43902ddaf084e784dbe
141+
0032want 5a3f6be755bbb7deae50065988cbfa1ffa9ab68a
142+
0032want 7e47fe2bd8d01d481f44d7af0531bd93d3b21c01
143+
0032want 74730d410fcb6603ace96f1dc55ea6196122532d
144+
00000009done
145+
146+
This is sent to the open git-upload-pack process which then streams out the
147+
final response:
148+
149+
"0008NAK\n"
150+
"0023\002Counting objects: 2797, done.\n"
151+
"002b\002Compressing objects: 0% (1/1177) \r"
152+
"002c\002Compressing objects: 1% (12/1177) \r"
153+
"002c\002Compressing objects: 2% (24/1177) \r"
154+
"002c\002Compressing objects: 3% (36/1177) \r"
155+
"002c\002Compressing objects: 4% (48/1177) \r"
156+
"002c\002Compressing objects: 5% (59/1177) \r"
157+
"002c\002Compressing objects: 6% (71/1177) \r"
158+
"0053\002Compressing objects: 7% (83/1177) \rCompressing objects: 8% (95/1177) \r"
159+
...
160+
"005b\002Compressing objects: 100% (1177/1177) \rCompressing objects: 100% (1177/1177), done.\n"
161+
"2004\001PACK\000\000\000\002\000\000\n\355\225\017x\234\235\216K\n\302"..."\b<M^*\343\362\302s"
162+
"2005\001\360\204{\225\376\330\345]z\226\273"..."\361\326\245\036\036\334*78w)\327\"/"
163+
...
164+
"0037\002Total 2797 (delta 1799), reused 2360 (delta 1529)\n"
165+
...
166+
"<\276\255L\273s\005\001w0006\001[0000"
167+
168+
See the earlier Packfile chapter for the actual format of the packfile data
169+
in the response.
170+
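In the response above, every packet after the NAK begins with a one-byte channel marker, because the client asked for side-band-64k: `\001` carries packfile data, `\002` carries progress messages meant for the terminal, and `\003` is reserved for error messages. A client might separate the channels roughly as in this Python sketch, which takes packet payloads that have already had their length prefix removed. The function name and return shape are assumptions made for the example.

```python
def demux_side_band(payloads):
    """Split packet payloads into (pack bytes, progress lines, other lines)."""
    pack = bytearray()
    progress = []
    other = []
    for payload in payloads:
        if not payload:
            continue
        band, body = payload[0], payload[1:]
        if band == 1:
            pack.extend(body)  # packfile data
        elif band == 2:
            progress.append(body.decode("utf-8", "replace"))  # progress text
        elif band == 3:
            raise RuntimeError(body.decode("utf-8", "replace"))  # remote error
        else:
            other.append(payload)  # e.g. the NAK line, which has no marker
    return bytes(pack), progress, other
```

Concatenating everything from channel 1 gives the packfile, which starts with the `PACK` signature seen in the example.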
171+
### Pushing Data ###
172+
173+
Pushing data over the git and ssh protocols is similar, but simpler. The client
174+
requests a receive-pack instance, which is started up if the client has
175+
access. The server then returns the SHAs of all its ref heads again, and the
176+
client generates a packfile of everything the server needs (generally only if
177+
what is on the server is a direct ancestor of what is being pushed). The
178+
client sends that packfile upstream, where the server either stores it on
179+
disk and builds an index for it, or unpacks it (if there aren't many objects
180+
in it).
181+
182+
This entire process is accomplished through the linkgit:git-send-pack[1] command
183+
on the client, which is invoked by linkgit:git-push[1], and the
184+
linkgit:git-receive-pack[1] command on the server side, which is invoked by
185+
the ssh connect process or git daemon (if it's an open push server).
186+
4187

5-
upload-pack
