Adding site map generation #185 #210

Merged
merged 1 commit into from
Oct 3, 2014
1 change: 1 addition & 0 deletions Gemfile
@@ -121,6 +121,7 @@ gem 'rakismet'
gem 'ruby-progressbar'
gem 'sanitize'
gem 'simple_form'
gem 'sitemap_generator'
gem 'tweet-button'
gem 'local_time'

3 changes: 3 additions & 0 deletions Gemfile.lock
@@ -619,6 +619,8 @@ GEM
      rack (~> 1.4)
      rack-protection (~> 1.4)
      tilt (~> 1.3, >= 1.3.4)
    sitemap_generator (5.0.5)
      builder
    slim (2.0.3)
      temple (~> 0.6.6)
      tilt (>= 1.3.3, < 2.1)
@@ -804,6 +806,7 @@ DEPENDENCIES
  simple_form
  simplecov
  sinatra
  sitemap_generator
  slim-rails
  spring
  spring-commands-rspec
4 changes: 4 additions & 0 deletions app/clock.rb
@@ -44,5 +44,9 @@
  ClearExpiredSessionsJob.perform_async
end

every(1.day, 'sitemap:refresh', at: '06:00') do
  SitemapRefreshWorker.perform_async
end

# This is tied with broken code. Probably should delete
# every(1.day, 'facts:system', at: '00:00') {}
40 changes: 40 additions & 0 deletions app/workers/sitemap_refresh_worker.rb
@@ -0,0 +1,40 @@
class SitemapRefreshWorker
  include Sidekiq::Worker
  sidekiq_options queue: :high

  def perform
    SitemapGenerator::Sitemap.default_host = "https://coderwall.com"
    SitemapGenerator::Sitemap.public_path = 'tmp/'
Contributor

tmp?

Contributor Author

@seuros /tmp is recommended here for Heroku, to store files temporarily before uploading them to S3: https://github.com/kjvarga/sitemap_generator/wiki/Generate-Sitemaps-on-read-only-filesystems-like-Heroku

    SitemapGenerator::Sitemap.sitemaps_host = "http://#{ENV['FOG_DIRECTORY']}.s3.amazonaws.com/"
Contributor

Don't use the CDN to serve the sitemap.

Contributor Author

@seuros Should this use the CDN? I believe the URL above points at the S3 bucket directly.

Contributor

I see. Fine then, let's keep it. I didn't know that Heroku is read-only.

Contributor

After further reading, it seems that only the Bamboo stack is read-only. I think we are on Cedar.
cc @just3ws

Contributor Author

@seuros @just3ws I don't believe anything is persisted between dynos.

From https://devcenter.heroku.com/articles/how-heroku-works: "Changes to the filesystem on one dyno are not propagated to other dynos and are not persisted across deploys and dyno restarts."

Also, from the "Ephemeral File System" section of https://devcenter.heroku.com/articles/dynos#isolation-and-security:
"Each dyno gets its own ephemeral filesystem, with a fresh copy of the most recently deployed code. During the dyno’s lifetime its running processes can use the filesystem as a temporary scratchpad, but no files that are written are visible to processes in any other dyno and any files written will be discarded the moment the dyno is stopped or restarted."
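
The scratchpad pattern discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `build_and_upload_sitemap` and the stand-in uploader are hypothetical names, and a real setup would hand the file to an S3 client or a sitemap_generator adapter instead.

```ruby
require 'tmpdir'
require 'zlib'

# Hypothetical helper (not part of this PR): write a gzipped sitemap into a
# scratch directory, pass the finished file to an uploader, then let the
# directory be cleaned up. Nothing is expected to survive on local disk.
def build_and_upload_sitemap(urls, uploader)
  Dir.mktmpdir('sitemaps') do |dir|
    path = File.join(dir, 'sitemap.xml.gz')
    Zlib::GzipWriter.open(path) do |gz|
      gz.write(%(<?xml version="1.0" encoding="UTF-8"?>\n))
      gz.write(%(<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n))
      # String#encode(xml: :text) escapes &, < and > for use inside an element.
      urls.each { |url| gz.write("  <url><loc>#{url.encode(xml: :text)}</loc></url>\n") }
      gz.write("</urlset>\n")
    end
    # The upload must happen inside the block: the directory is removed
    # as soon as the block returns.
    uploader.call(path)
  end
end

# Stand-in uploader (an assumption for illustration) that captures the bytes
# in memory instead of sending them to remote storage.
uploaded = {}
fake_uploader = ->(path) { uploaded[File.basename(path)] = File.binread(path) }

build_and_upload_sitemap(['https://coderwall.com/welcome'], fake_uploader)
```

The point of the sketch is that the local file is treated as disposable: whatever the dyno does with its filesystem afterwards, the only durable copy is the one the uploader stored.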

    SitemapGenerator::Sitemap.sitemaps_path = 'sitemaps/'
    SitemapGenerator::Sitemap.adapter = SitemapGenerator::WaveAdapter.new

    SitemapGenerator::Sitemap.create do
      add '/welcome', :priority => 0.7, :changefreq => 'monthly'
      add '/contact_us', :priority => 0.5, :changefreq => 'monthly'
      add '/blog', :priority => 0.7, :changefreq => 'weekly'
      add '/api', :priority => 0.5, :changefreq => 'monthly'
      add '/faq', :priority => 0.5, :changefreq => 'monthly'
      add '/privacy_policy', :priority => 0.2, :changefreq => 'monthly'
      add '/tos', :priority => 0.2, :changefreq => 'monthly'
      add '/jobs', :priority => 0.7, :changefreq => 'daily'
      add '/employers', :priority => 0.7, :changefreq => 'monthly'
      Protip.find_each do |protip|
        add protip_path(protip), :lastmod => protip.updated_at
      end
      Team.all.each do |team|
        add teamname_path(slug: team.slug), :lastmod => team.updated_at
        team.jobs.each do |job|
          add job_path(:slug => team.slug, :job_id => job.public_id), :lastmod => job.updated_at
        end
      end
      User.find_each do |user|
        add badge_path(user.username), :lastmod => user.updated_at
      end
      BlogPost.all_public.each do |blog_post|
        add blog_post_path(blog_post.id), :lastmod => blog_post.posted
      end
    end
    SitemapGenerator::Sitemap.ping_search_engines
  end
end
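
The sitemaps.org protocol only permits a fixed set of `changefreq` values, so a misspelled string such as 'montlhy' would end up in the generated XML as an invalid value. A small guard along these lines could catch that before generation. It is a sketch only: `VALID_CHANGEFREQS` and `validated_changefreq` are hypothetical names, not part of sitemap_generator or of this PR.

```ruby
# Values the sitemaps.org protocol allows for <changefreq>.
VALID_CHANGEFREQS = %w[always hourly daily weekly monthly yearly never].freeze

# Hypothetical guard: returns the value as a string when it is allowed,
# raises ArgumentError otherwise so a typo fails loudly instead of
# silently reaching the sitemap.
def validated_changefreq(value)
  freq = value.to_s
  return freq if VALID_CHANGEFREQS.include?(freq)

  raise ArgumentError,
        "invalid changefreq #{value.inspect}; expected one of #{VALID_CHANGEFREQS.join(', ')}"
end

# Possible use inside the create block (assumed, not from the PR):
#   add '/welcome', :priority => 0.7, :changefreq => validated_changefreq('monthly')
```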
1 change: 1 addition & 0 deletions public/robots.txt
@@ -6,3 +6,4 @@

User-agent: EasouSpider
Disallow: /
Sitemap: https://coderwall-assets-0.s3.amazonaws.com/sitemaps/sitemap.xml.gz
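
For orientation, the URL in this `Sitemap:` line appears to be the worker's `sitemaps_host`, followed by its `sitemaps_path`, followed by the index filename. The sketch below reproduces that composition with plain `URI.join`. The helper name `sitemap_index_url` is hypothetical, and the bucket name is assumed to be the value of `FOG_DIRECTORY`, which the PR does not state.

```ruby
require 'uri'

# Hypothetical helper: joins host, path and filename the way the robots.txt
# entry is laid out. The filename default mirrors the robots.txt line.
def sitemap_index_url(sitemaps_host, sitemaps_path, filename = 'sitemap.xml.gz')
  URI.join(sitemaps_host, sitemaps_path, filename).to_s
end

bucket = 'coderwall-assets-0' # assumption: the value of ENV['FOG_DIRECTORY']
url = sitemap_index_url("https://#{bucket}.s3.amazonaws.com/", 'sitemaps/')
puts "Sitemap: #{url}"
```

Note that the worker configures the host with `http://` while robots.txt advertises `https://`, so the two are worth keeping in sync if either one changes.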