Commit df7b416

Author: XiaomingSu
Message: get links of html
1 parent 309a00e commit df7b416

1 file changed, +22 -0 lines changed

PyBeaner/0009/link_in_html.py

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+# coding=utf-8
+__author__ = 'PyBeaner'
+from bs4 import BeautifulSoup
+
+
+def get_links(html):
+    soup = BeautifulSoup(html, "html.parser")  # name the parser explicitly
+    links = []
+    for link in soup.find_all("a", href=True):  # skip <a> tags with no href
+        href = link["href"]
+        if href.startswith("http"):  # keep absolute http(s) links only
+            links.append(href)
+    return links
+
+
+if __name__ == '__main__':
+    import requests
+
+    r = requests.get("https://github.com/")
+    html = r.text
+    links = get_links(html)
+    print(links)
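
Note on the filtering in get_links: the function keeps only hrefs that start with "http", so relative links and anchors without an href are dropped. The sketch below reproduces that filtering on a small inline HTML string so the behavior can be checked without a network request. It is not the commit's code: it swaps BeautifulSoup for the standard library's html.parser to stay dependency-free, and the names LinkCollector, get_links_stdlib, and sample are hypothetical, introduced only for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects hrefs of <a> tags that start with "http"."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag != "a":
            return
        for name, value in attrs:
            # value is None for a bare attribute such as <a href>.
            if name == "href" and value and value.startswith("http"):
                self.links.append(value)


def get_links_stdlib(html):
    """Stdlib-only stand-in for get_links, assuming the same "http" prefix rule."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Illustrative input: one absolute link, one anchor without href, one relative link.
sample = (
    '<a href="https://github.com/">GitHub</a>'
    '<a name="top">no href</a>'
    '<a href="/about">relative</a>'
)
print(get_links_stdlib(sample))  # prints ['https://github.com/']
```

Only the absolute link survives. If relative links should be kept as well, one option is to resolve them against the page URL with urllib.parse.urljoin before filtering, which would change the function's signature.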
