URLs and selectors are outdated #10
Comments
New URLs probably need something like this:

    relativeURL = '/area/106316122/hawaii'
    start_urls = [domain + relativeURL]
    allowed_domains = ['mountainproject.com']
    rules = [
        Rule(
            LinkExtractor(allow='area/(.+)'),
            callback='parse',
            follow=True
        )
    ]
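To see which links a pattern like `allow='area/(.+)'` would accept, here is a rough standard-library check. It assumes `domain` is `https://www.mountainproject.com` (the comment above does not say), and it mimics how Scrapy's `LinkExtractor` treats `allow` as a regular expression searched against the absolute URL. The helper name `is_area_link` is made up for this sketch.

```python
import re
from urllib.parse import urljoin

# Assumption: the site root; the comment above only refers to `domain`.
DOMAIN = "https://www.mountainproject.com"
RELATIVE_URL = "/area/106316122/hawaii"

# Same pattern as in the rule above, compiled once.
AREA_PATTERN = re.compile(r"area/(.+)")


def is_area_link(url: str) -> bool:
    """Return True if the pattern is found anywhere in the URL (hypothetical helper)."""
    return AREA_PATTERN.search(url) is not None


# Build the absolute start URL from the root and the relative path.
start_url = urljoin(DOMAIN, RELATIVE_URL)

if __name__ == "__main__":
    print(start_url)
    print(is_area_link(start_url))
    print(is_area_link(DOMAIN + "/v/alabama/105905173"))
```

One thing that may matter here: the Scrapy documentation advises against using `parse` as a rule callback in a `CrawlSpider`, because `CrawlSpider` uses `parse` itself to apply the rules. Renaming the callback (for example to something like `parse_area`) might be worth trying.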
New state pages have

    <div class="col-md-3 left-nav float-md-left mb-2">
    <div class="mp-sidebar">

So probably

And on the main page it has

    <div class="col-xs-12">
    <div class="title-with-border-bottom mb-2">
    <h2 class="inline-block mr-half">Rock Climbing Guide</h2>
    </div>
    <div class="row" id="route-guide">

So probably

Still doesn't work, though.
    yield scrapy.Request(url, callback=self.parse_coordinates)
I'm not sure why the original code says this:

    if 'Location' not in response.css('#rspCol800 div.rspCol table tr:nth-child(2) td ::text').extract()[0]:
        return response.css('#rspCol800 div.rspCol table tr:nth-child(3) td ::text').extract()[1].strip()
    else:
        return response.css('#rspCol800 div.rspCol table tr:nth-child(2) td ::text').extract()[1].strip()

In the case that it doesn't list for example. (Now in the new layout it's "GPS:", though.)
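One way to avoid depending on whether the coordinates sit in the second or third table row is to look the row up by its label instead of its position. The sketch below works on already-extracted (label, value) text pairs rather than on a live response, so the sample rows and the helper name `find_value` are illustrative assumptions; only the labels "Location" and "GPS:" come from this thread.

```python
from typing import Iterable, Optional, Tuple


def find_value(rows: Iterable[Tuple[str, str]], labels: Iterable[str]) -> Optional[str]:
    """Return the stripped value of the first row whose label contains one of `labels`.

    Hypothetical helper: `rows` stands in for (label, value) text pairs
    pulled from the page's details table.
    """
    wanted = tuple(labels)
    for label, value in rows:
        if any(w in label for w in wanted):
            return value.strip()
    return None  # no matching row on this page


# Made-up rows for illustration only; real pages may differ.
old_layout = [("Elevation:", "500 ft"), ("Location:", " 34.1, -86.2 ")]
new_layout = [("Elevation:", "500 ft"), ("GPS:", " 34.1, -86.2 ")]

if __name__ == "__main__":
    print(find_value(old_layout, ["Location", "GPS"]))
    print(find_value(new_layout, ["Location", "GPS"]))
```

Matching on the label means an extra or missing row above the coordinates should not shift the result, and a page without either label returns `None` instead of raising an `IndexError`.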
(I've got it working, but I made a bunch of clunky changes with the help of ChatGPT that I don't fully understand.)
The /v/ URLs redirect to a new scheme:

<div id="viewerLeftNavColContent" class="rspCollapsedContent"> was present in old pages: https://web.archive.org/web/20161122233413/http://www.mountainproject.com/v/alabama/105905173 but no longer.
<span class="destArea"> was present on old homepage: https://web.archive.org/web/20171016232313/https://www.mountainproject.com/ but no longer.
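A quick way to tell which layout a saved page uses is to look for the markers mentioned in this thread. The sketch below uses only Python's standard `html.parser`; the class name `MarkerFinder`, the `detect_layout` helper, and the "old"/"new" labels are all made up for this example, and it only checks the handful of ids and classes quoted above.

```python
from html.parser import HTMLParser


class MarkerFinder(HTMLParser):
    """Collect every id and class token seen in start tags."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue  # attribute without a value, nothing to record
            if name == "id":
                self.ids.add(value)
            elif name == "class":
                self.classes.update(value.split())


def detect_layout(html: str) -> str:
    """Classify a page as 'old', 'new', or 'unknown' from the markers in this thread."""
    finder = MarkerFinder()
    finder.feed(html)
    # Markers reported as present only on archived (old) pages.
    if "viewerLeftNavColContent" in finder.ids or "destArea" in finder.classes:
        return "old"
    # Markers reported on the current state pages and main page.
    if "mp-sidebar" in finder.classes or "route-guide" in finder.ids:
        return "new"
    return "unknown"


if __name__ == "__main__":
    print(detect_layout('<div class="mp-sidebar">'))
```

Running this against a downloaded copy of a page before rewriting the spider's selectors could help confirm which set of selectors applies.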