| 1 | + |
| 2 | +Hyperlink |
| 3 | +========= |
| 4 | + |
| 5 | +Word allows hyperlinks to be placed in a document. |
| 6 | + |
| 7 | +The target of a hyperlink may be external, such as a web site, or internal, |
| 8 | +to another location in the document. |
| 9 | + |
| 10 | +A hyperlink can contain multiple runs of text, each with its own distinct |
| 11 | +text formatting (font). |
| 12 | + |
| 13 | + |
| 14 | +Candidate protocol |
| 15 | +------------------ |
| 16 | + |
| 17 | +An external hyperlink has an address and an optional anchor. An internal |
| 18 | +hyperlink has only an anchor. |
| 19 | + |
| 20 | +.. highlight:: python |
| 21 | + |
| 22 | +**Add the external hyperlink** `http://us.com#about`:: |
| 23 | + |
| 24 | + >>> hyperlink = paragraph.add_hyperlink('About', address='http://us.com', anchor='about') |
| 25 | + >>> hyperlink |
| 26 | + <docx.text.hyperlink.Hyperlink at 0x7f...> |
| 27 | + >>> hyperlink.text |
| 28 | + 'About' |
| 29 | + >>> hyperlink.address |
| 30 | + 'http://us.com' |
| 31 | + >>> hyperlink.anchor |
| 32 | + 'about' |
| 33 | + |
| 34 | +**Add an internal hyperlink (to a bookmark)**:: |
| 35 | + |
| 36 | + >>> hyperlink = paragraph.add_hyperlink('Section 1', anchor='Section_1') |
| 37 | + >>> hyperlink.text |
| 38 | + 'Section 1' |
| 39 | + >>> hyperlink.anchor |
| 40 | + 'Section_1' |
| 41 | + >>> print(hyperlink.address) |
| 42 | + None |
| 43 | + |
| 44 | +**Modify hyperlink properties**:: |
| 45 | + |
| 46 | + >>> hyperlink.text = 'Froogle' |
| 47 | + >>> hyperlink.text |
| 48 | + 'Froogle' |
| 49 | + >>> hyperlink.address = 'mailto:info@froogle.com?subject=sup dawg?' |
| 50 | + >>> hyperlink.address |
| 51 | + 'mailto:info@froogle.com?subject=sup%20dawg%3F' |
| 52 | + >>> hyperlink.anchor = None |
| 53 | + >>> print(hyperlink.anchor) |
| 54 | + None |
| 55 | + |
| 56 | +**Add additional runs to a hyperlink**:: |
| 57 | + |
| 58 | + >>> hyperlink.text = 'A ' |
| 59 | + >>> # .insert_run inserts a new run at idx, defaults to idx=-1 |
| 60 | + >>> hyperlink.insert_run(' link').bold = True |
| 61 | + >>> hyperlink.insert_run('formatted', idx=1).bold = True |
| 62 | + >>> hyperlink.text |
| 63 | + 'A formatted link' |
| 64 | + >>> [r for r in hyperlink.iter_runs()] |
| 65 | + [<docx.text.run.Run at 0x7fa...>, |
| 66 | + <docx.text.run.Run at 0x7fb...>, |
| 67 | + <docx.text.run.Run at 0x7fc...>] |
| 68 | + |
| 69 | +**Iterate over the run-level items a paragraph contains**:: |
| 70 | + |
| 71 | + >>> paragraph = document.add_paragraph('A paragraph having a link to: ') |
| 72 | + >>> hyperlink = paragraph.add_hyperlink(text='github', address='http://github.com') |
| 73 | + >>> [item for item in paragraph.iter_run_level_items()] |
| 74 | + [<docx.text.run.Run at 0x7fd...>, <docx.text.hyperlink.Hyperlink at 0x7fe...>] |
| 75 | + |
| 76 | +**Paragraph.text now includes text contained in a hyperlink**:: |
| 77 | + |
| 78 | + >>> paragraph.text |
| 79 | + 'A paragraph having a link to: github' |
| 80 | + |
| 81 | + |
| 82 | +Word Behaviors |
| 83 | +-------------- |
| 84 | + |
| 85 | +* What are the semantics of the w:history attribute on w:hyperlink? It may |
| 86 | + indicate whether the link is shown blue (unvisited) or purple (visited). If |
| 87 | + so, it probably belongs on hyperlink as a read/write property. We should |
| 88 | + check what the MS API does here. |
| 89 | + |
| 90 | +* We probably need to enforce some character-set restrictions on w:anchor. |
| 91 | + Word doesn't seem to like spaces or hyphens, for example. The simple type |
| 92 | + ST_String doesn't look like it takes care of this. |
| 93 | + |
| 94 | +* We'll need to test URL escaping of special characters like spaces and |
| 95 | + question marks in Hyperlink.address. |
| 96 | + |
| 97 | +* What does Word do when loading a document containing an internal hyperlink |
| 98 | + whose anchor value doesn't match an existing bookmark? We'll want to know |
| 99 | + because we're sure to get support inquiries from folks who don't match those |
| 100 | + up and wonder why they get a repair error or some other failure. |
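The escaping and anchor-naming questions above could be explored with a small sketch like the following. Both helper names are hypothetical and are not part of any existing API; the character set allowed in an anchor is an assumption drawn only from the observation that Word seems to reject spaces and hyphens.

```python
import re
from urllib.parse import quote


def escape_address(address):
    """Percent-encode unsafe characters in a URL (hypothetical helper).

    The text before the first '?' and the query after it are escaped
    separately so that a literal '?' inside the query is encoded too.
    Input that is already percent-encoded would be escaped a second time.
    """
    base, sep, query = address.partition("?")
    return quote(base, safe=":/@#") + sep + quote(query, safe="=&")


def sanitize_anchor(name):
    """Replace anything outside [A-Za-z0-9_] with '_' (assumed rule)."""
    return re.sub(r"\W", "_", name, flags=re.ASCII)


escaped = escape_address("mailto:info@froogle.com?subject=sup dawg?")
anchor = sanitize_anchor("Section 1-a")
```

Under these assumptions `escaped` matches the value shown for `hyperlink.address` in the candidate protocol, and `anchor` becomes `Section_1_a`.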
| 101 | + |
| 102 | + |
| 103 | +Specimen XML |
| 104 | +------------ |
| 105 | + |
| 106 | +.. highlight:: xml |
| 107 | + |
| 108 | + |
| 109 | +External links |
| 110 | +~~~~~~~~~~~~~~ |
| 111 | + |
| 112 | +The address (URL) of an external hyperlink is stored in the document.xml.rels |
| 113 | +file, keyed by the w:hyperlink@r:id attribute:: |
| 114 | + |
| 115 | + <w:p> |
| 116 | + <w:r> |
| 117 | + <w:t xml:space="preserve">This is an external link to </w:t> |
| 118 | + </w:r> |
| 119 | + <w:hyperlink r:id="rId4"> |
| 120 | + <w:r> |
| 121 | + <w:rPr> |
| 122 | + <w:rStyle w:val="Hyperlink"/> |
| 123 | + </w:rPr> |
| 124 | + <w:t>Google</w:t> |
| 125 | + </w:r> |
| 126 | + </w:hyperlink> |
| 127 | + </w:p> |
| 128 | + |
| 129 | +... mapping to this relationship in document.xml.rels:: |
| 130 | + |
| 131 | + <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"> |
| 132 | + <Relationship Id="rId4" TargetMode="External" Type="http://..." Target="http://google.com/"/> |
| 133 | + </Relationships> |
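To make the r:id lookup concrete, here is a minimal sketch that resolves a hyperlink's address from the relationships part using only the standard library. The XML strings are trimmed copies of the specimens above; the variable names are illustrative and this is not how python-docx itself implements the lookup.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG = "http://schemas.openxmlformats.org/package/2006/relationships"

document_xml = f"""
<w:p xmlns:w="{W}" xmlns:r="{R}">
  <w:hyperlink r:id="rId4">
    <w:r><w:t>Google</w:t></w:r>
  </w:hyperlink>
</w:p>
""".strip()

rels_xml = f"""
<Relationships xmlns="{PKG}">
  <Relationship Id="rId4" TargetMode="External" Type="http://..." Target="http://google.com/"/>
</Relationships>
""".strip()

# Map each relationship id to its target URL.
targets = {
    rel.get("Id"): rel.get("Target")
    for rel in ET.fromstring(rels_xml).iter(f"{{{PKG}}}Relationship")
}

# Pair the visible text of each hyperlink with the address its r:id points to.
links = []
for hyperlink in ET.fromstring(document_xml).iter(f"{{{W}}}hyperlink"):
    text = "".join(t.text or "" for t in hyperlink.iter(f"{{{W}}}t"))
    links.append((text, targets[hyperlink.get(f"{{{R}}}id")]))
```

Here `links` ends up holding one pair, the link text `Google` and the target `http://google.com/`.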
| 134 | + |
| 135 | +A hyperlink can contain multiple runs of text (and a whole lot of other |
| 136 | +stuff, including nested hyperlinks, at least as far as the schema indicates):: |
| 137 | + |
| 138 | + <w:p> |
| 139 | + <w:hyperlink r:id="rId2"> |
| 140 | + <w:r> |
| 141 | + <w:rPr> |
| 142 | + <w:rStyle w:val="Hyperlink"/> |
| 143 | + </w:rPr> |
| 144 | + <w:t xml:space="preserve">A hyperlink containing an </w:t> |
| 145 | + </w:r> |
| 146 | + <w:r> |
| 147 | + <w:rPr> |
| 148 | + <w:rStyle w:val="Hyperlink"/> |
| 149 | + <w:i/> |
| 150 | + </w:rPr> |
| 151 | + <w:t>italicized</w:t> |
| 152 | + </w:r> |
| 153 | + <w:r> |
| 154 | + <w:rPr> |
| 155 | + <w:rStyle w:val="Hyperlink"/> |
| 156 | + </w:rPr> |
| 157 | + <w:t xml:space="preserve"> word</w:t> |
| 158 | + </w:r> |
| 159 | + </w:hyperlink> |
| 160 | + </w:p> |
| 161 | + |
| 162 | + |
| 163 | +Internal links |
| 164 | +~~~~~~~~~~~~~~ |
| 165 | + |
| 166 | +An internal link provides "jump to another document location" behavior in the |
| 167 | +Word UI. An internal link is distinguished by the absence of an r:id |
| 168 | +attribute. In this case, the w:anchor attribute is required. The value of the |
| 169 | +anchor attribute is the name of a bookmark in the document. |
| 170 | + |
| 171 | +Example:: |
| 172 | + |
| 173 | + <w:p> |
| 174 | + <w:r> |
| 175 | + <w:t xml:space="preserve">See </w:t> |
| 176 | + </w:r> |
| 177 | + <w:hyperlink w:anchor="Section_4"> |
| 178 | + <w:r> |
| 179 | + <w:rPr> |
| 180 | + <w:rStyle w:val="Hyperlink"/> |
| 181 | + </w:rPr> |
| 182 | + <w:t>Section 4</w:t> |
| 183 | + </w:r> |
| 184 | + </w:hyperlink> |
| 185 | + <w:r> |
| 186 | + <w:t xml:space="preserve"> for more details.</w:t> |
| 187 | + </w:r> |
| 188 | + </w:p> |
| 189 | + |
| 190 | +... referring to this bookmark elsewhere in the document:: |
| 191 | + |
| 192 | + <w:p> |
| 193 | + <w:bookmarkStart w:id="0" w:name="Section_4"/> |
| 194 | + <w:r> |
| 195 | + <w:t>Section 4</w:t> |
| 196 | + </w:r> |
| 197 | + <w:bookmarkEnd w:id="0"/> |
| 198 | + </w:p> |
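Since an internal link only works when its anchor names an existing bookmark, a check along these lines might help answer the mismatched-anchor question under Word Behaviors. It is a sketch only: `dangling_anchors` is a hypothetical helper, and the second hyperlink in the sample XML is invented to show a mismatch.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

body_xml = f"""
<w:body xmlns:w="{W}">
  <w:p>
    <w:hyperlink w:anchor="Section_4"><w:r><w:t>Section 4</w:t></w:r></w:hyperlink>
    <w:hyperlink w:anchor="Missing"><w:r><w:t>Nowhere</w:t></w:r></w:hyperlink>
  </w:p>
  <w:p>
    <w:bookmarkStart w:id="0" w:name="Section_4"/>
    <w:r><w:t>Section 4</w:t></w:r>
    <w:bookmarkEnd w:id="0"/>
  </w:p>
</w:body>
""".strip()


def dangling_anchors(root):
    """Return anchors of internal hyperlinks that match no bookmark name."""
    bookmarks = {b.get(f"{{{W}}}name") for b in root.iter(f"{{{W}}}bookmarkStart")}
    anchors = [h.get(f"{{{W}}}anchor") for h in root.iter(f"{{{W}}}hyperlink")]
    # Hyperlinks without w:anchor are external and are skipped.
    return [a for a in anchors if a is not None and a not in bookmarks]


missing = dangling_anchors(ET.fromstring(body_xml))
```

In this sample `missing` contains only `Missing`, because `Section_4` is backed by a bookmark.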
| 199 | + |
| 200 | + |
| 201 | +Schema excerpt |
| 202 | +-------------- |
| 203 | + |
| 204 | +.. highlight:: xml |
| 205 | + |
| 206 | +:: |
| 207 | + |
| 208 | + <xsd:complexType name="CT_P"> |
| 209 | + <xsd:sequence> |
| 210 | + <xsd:element name="pPr" type="CT_PPr" minOccurs="0"/> |
| 211 | + <xsd:group ref="EG_PContent" minOccurs="0" maxOccurs="unbounded"/> |
| 212 | + </xsd:sequence> |
| 213 | + <xsd:attribute name="rsidRPr" type="ST_LongHexNumber"/> |
| 214 | + <xsd:attribute name="rsidR" type="ST_LongHexNumber"/> |
| 215 | + <xsd:attribute name="rsidDel" type="ST_LongHexNumber"/> |
| 216 | + <xsd:attribute name="rsidP" type="ST_LongHexNumber"/> |
| 217 | + <xsd:attribute name="rsidRDefault" type="ST_LongHexNumber"/> |
| 218 | + </xsd:complexType> |
| 219 | + |
| 220 | + <xsd:group name="EG_PContent"> <!-- denormalized --> |
| 221 | + <xsd:choice> |
| 222 | + <xsd:element name="r" type="CT_R"/> |
| 223 | + <xsd:element name="hyperlink" type="CT_Hyperlink"/> |
| 224 | + <xsd:element name="fldSimple" type="CT_SimpleField"/> |
| 225 | + <xsd:element name="sdt" type="CT_SdtRun"/> |
| 226 | + <xsd:element name="customXml" type="CT_CustomXmlRun"/> |
| 227 | + <xsd:element name="smartTag" type="CT_SmartTagRun"/> |
| 228 | + <xsd:element name="dir" type="CT_DirContentRun"/> |
| 229 | + <xsd:element name="bdo" type="CT_BdoContentRun"/> |
| 230 | + <xsd:element name="subDoc" type="CT_Rel"/> |
| 231 | + <xsd:group ref="EG_RunLevelElts"/> |
| 232 | + </xsd:choice> |
| 233 | + </xsd:group> |
| 234 | + |
| 235 | + <xsd:complexType name="CT_Hyperlink"> |
| 236 | + <xsd:group ref="EG_PContent" minOccurs="0" maxOccurs="unbounded"/> |
| 237 | + <xsd:attribute name="tgtFrame" type="s:ST_String"/> |
| 238 | + <xsd:attribute name="tooltip" type="s:ST_String"/> |
| 239 | + <xsd:attribute name="docLocation" type="s:ST_String"/> |
| 240 | + <xsd:attribute name="history" type="s:ST_OnOff"/> |
| 241 | + <xsd:attribute name="anchor" type="s:ST_String"/> |
| 242 | + <xsd:attribute ref="r:id"/> |
| 243 | + </xsd:complexType> |
| 244 | + |
| 245 | + <xsd:group name="EG_RunLevelElts"> |
| 246 | + <xsd:choice> |
| 247 | + <xsd:element name="proofErr" type="CT_ProofErr"/> |
| 248 | + <xsd:element name="permStart" type="CT_PermStart"/> |
| 249 | + <xsd:element name="permEnd" type="CT_Perm"/> |
| 250 | + <xsd:element name="bookmarkStart" type="CT_Bookmark"/> |
| 251 | + <xsd:element name="bookmarkEnd" type="CT_MarkupRange"/> |
| 252 | + <xsd:element name="moveFromRangeStart" type="CT_MoveBookmark"/> |
| 253 | + <xsd:element name="moveFromRangeEnd" type="CT_MarkupRange"/> |
| 254 | + <xsd:element name="moveToRangeStart" type="CT_MoveBookmark"/> |
| 255 | + <xsd:element name="moveToRangeEnd" type="CT_MarkupRange"/> |
| 256 | + <xsd:element name="commentRangeStart" type="CT_MarkupRange"/> |
| 257 | + <xsd:element name="commentRangeEnd" type="CT_MarkupRange"/> |
| 258 | + <xsd:element name="customXmlInsRangeStart" type="CT_TrackChange"/> |
| 259 | + <xsd:element name="customXmlInsRangeEnd" type="CT_Markup"/> |
| 260 | + <xsd:element name="customXmlDelRangeStart" type="CT_TrackChange"/> |
| 261 | + <xsd:element name="customXmlDelRangeEnd" type="CT_Markup"/> |
| 262 | + <xsd:element name="customXmlMoveFromRangeStart" type="CT_TrackChange"/> |
| 263 | + <xsd:element name="customXmlMoveFromRangeEnd" type="CT_Markup"/> |
| 264 | + <xsd:element name="customXmlMoveToRangeStart" type="CT_TrackChange"/> |
| 265 | + <xsd:element name="customXmlMoveToRangeEnd" type="CT_Markup"/> |
| 266 | + <xsd:element name="ins" type="CT_RunTrackChange"/> |
| 267 | + <xsd:element name="del" type="CT_RunTrackChange"/> |
| 268 | + <xsd:element name="moveFrom" type="CT_RunTrackChange"/> |
| 269 | + <xsd:element name="moveTo" type="CT_RunTrackChange"/> |
| 270 | + <xsd:group ref="EG_MathContent" minOccurs="0" maxOccurs="unbounded"/> |
| 271 | + </xsd:choice> |
| 272 | + </xsd:group> |
| 273 | + |
| 274 | + <xsd:complexType name="CT_R"> |
| 275 | + <xsd:sequence> |
| 276 | + <xsd:group ref="EG_RPr" minOccurs="0"/> |
| 277 | + <xsd:group ref="EG_RunInnerContent" minOccurs="0" maxOccurs="unbounded"/> |
| 278 | + </xsd:sequence> |
| 279 | + <xsd:attribute name="rsidRPr" type="ST_LongHexNumber"/> |
| 280 | + <xsd:attribute name="rsidDel" type="ST_LongHexNumber"/> |
| 281 | + <xsd:attribute name="rsidR" type="ST_LongHexNumber"/> |
| 282 | + </xsd:complexType> |
| 283 | + |
| 284 | + <xsd:simpleType name="ST_OnOff"> |
| 285 | + <xsd:union memberTypes="xsd:boolean ST_OnOff1"/> |
| 286 | + </xsd:simpleType> |
| 287 | + |
| 288 | + <xsd:simpleType name="ST_OnOff1"> |
| 289 | + <xsd:restriction base="xsd:string"> |
| 290 | + <xsd:enumeration value="on"/> |
| 291 | + <xsd:enumeration value="off"/> |
| 292 | + </xsd:restriction> |
| 293 | + </xsd:simpleType> |
| 294 | + |
| 295 | + <xsd:simpleType name="ST_RelationshipId"> |
| 296 | + <xsd:restriction base="xsd:string"/> |
| 297 | + </xsd:simpleType> |
| 298 | + |
| 299 | + <xsd:simpleType name="ST_String"> |
| 300 | + <xsd:restriction base="xsd:string"/> |
| 301 | + </xsd:simpleType> |
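The w:history attribute is typed ST_OnOff, a union of xsd:boolean and the on/off enumeration shown above. A reader for that attribute could be sketched as follows; `parse_on_off` is a hypothetical name and the function accepts only the lexical values those two member types define.

```python
def parse_on_off(value):
    """Convert an ST_OnOff attribute string to a bool (hypothetical helper)."""
    # xsd:boolean allows "true", "false", "1" and "0"; ST_OnOff1 adds "on" and "off".
    if value in ("true", "1", "on"):
        return True
    if value in ("false", "0", "off"):
        return False
    raise ValueError(f"not a valid ST_OnOff value: {value!r}")


history = parse_on_off("1")
```

Rejecting unknown strings rather than defaulting keeps malformed attribute values visible during testing.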