Go scanner #147

Merged
merged 25 commits into from
Jul 13, 2013
Changes from 1 commit
Commits
abb92f3
New: *Go Encoder*
Eric-Guo Jul 8, 2012
74eeb08
Merge remote-tracking branch 'Eric-Guo/go-scanner' into go-scanner
korny Jul 28, 2012
ec39f80
fix whitespace
korny Jul 28, 2012
7d2c3e5
Merge branch 'master' into go-scanner
korny Mar 10, 2013
addcbd4
Merge branch 'master' into go-scanner
korny Jun 12, 2013
0013b64
Merge branch 'master' into go-scanner
korny Jun 23, 2013
fd8c81f
additional types: string, error
nathany Jun 23, 2013
004d0c8
whitespace
korny Jun 23, 2013
afa2be7
add string as predefined type
korny Jun 23, 2013
7ef6f77
add support for raw strings in Go
korny Jun 23, 2013
a24c207
fix empty token in Go scanner
korny Jun 23, 2013
4669b70
fix label_expected (test case?)
korny Jun 23, 2013
c0d0280
make predefined-type a bit more bright/blue
korny Jun 23, 2013
52eadb6
Merge branch 'go-scanner' of github.com:rubychan/coderay into go-scanner
korny Jun 23, 2013
17946d7
changelog
korny Jun 23, 2013
85275cf
Go doesn't have a "f" suffix for floats like C.
nathany Jun 23, 2013
cad9a00
add imaginary numbers to Go scanner
nathany Jun 23, 2013
4c877bf
predeclared identifiers
nathany Jun 23, 2013
dd9ec43
yup, no C-style directives (auto extern static)
nathany Jun 23, 2013
a2c625b
Merge branch 'master' into go-scanner
korny Jun 30, 2013
e1abb68
Merge branch 'master' into go-scanner
korny Jul 13, 2013
4c24863
create nathany for Go scanner, too
korny Jul 13, 2013
bbe4d72
tweak numeral tokens handling (#147)
korny Jul 13, 2013
6dd14ef
be a bit more graceful with buggy Go strings
korny Jul 13, 2013
82864ef
allow unicode characters in char literals
korny Jul 13, 2013
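
Several of the commits listed above adjust literal handling (85275cf drops the C-style "f" float suffix, cad9a00 adds imaginary numbers). As a rough illustration of what those rules mean for a scanner, here is a minimal sketch using Ruby's standard `StringScanner`. The helper name `classify_go_number` is hypothetical and this is not the code from those commits; it only covers the classic Go literal forms (decimal, octal, hex, float, imaginary).

```ruby
require 'strscan'

# Hypothetical helper, not part of CodeRay: classifies the numeric literal
# at the start of +source+. Returns [kind, text], or nil if no number starts there.
def classify_go_number(source)
  s = StringScanner.new(source)
  # Imaginary literals are tried first because they extend the other forms with "i".
  if (m = s.scan(/(?:\d*\.\d+|\d+\.\d*|\d+)(?:[eE][+-]?\d+)?i/))
    [:imaginary, m]
  elsif (m = s.scan(/0[xX][0-9A-Fa-f]+/))
    [:hex, m]
  # Floats need a dot or an exponent; there is no trailing "f" as in C.
  elsif (m = s.scan(/\d*\.\d+(?:[eE][+-]?\d+)?|\d+\.\d*(?:[eE][+-]?\d+)?|\d+[eE][+-]?\d+/))
    [:float, m]
  elsif (m = s.scan(/0[0-7]+/))
    [:octal, m]
  elsif (m = s.scan(/\d+/))
    [:integer, m]
  end
end
```

With this sketch, `classify_go_number("1.5f")` consumes only `1.5` as a float and leaves the `f` for the next token, which is the behaviour the "no f suffix" commit message describes.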
New: *Go Encoder*
Draft version, copy from c
Eric-Guo committed Jul 8, 2012
commit abb92f30b12e11781afa76f43a344627520b5b34
37 changes: 19 additions & 18 deletions lib/coderay/helpers/file_type.rb
@@ -1,5 +1,5 @@
module CodeRay

# = FileType
#
# A simple filetype recognizer.
@@ -8,18 +8,18 @@ module CodeRay
#
# # determine the type of the given file
# lang = FileType[file_name]
#
#
# # return :text if the file type is unknown
# lang = FileType.fetch file_name, :text
#
#
# # try the shebang line, too
# lang = FileType.fetch file_name, :text, true
module FileType

UnknownFileType = Class.new Exception

class << self

# Try to determine the file type of the file.
#
# +filename+ is a relative or absolute path to a file.
@@ -30,7 +30,7 @@ def [] filename, read_shebang = false
name = File.basename filename
ext = File.extname(name).sub(/^\./, '') # from last dot, delete the leading dot
ext2 = filename.to_s[/\.(.*)/, 1] # from first dot

type =
TypeFromExt[ext] ||
TypeFromExt[ext.downcase] ||
@@ -39,10 +39,10 @@ def [] filename, read_shebang = false
TypeFromName[name] ||
TypeFromName[name.downcase]
type ||= shebang(filename) if read_shebang

type
end

# This works like Hash#fetch.
#
# If the filetype cannot be found, the +default+ value
@@ -51,7 +51,7 @@ def fetch filename, default = nil, read_shebang = false
if default && block_given?
warn 'Block supersedes default value argument; use either.'
end

if type = self[filename, read_shebang]
type
else
@@ -60,9 +60,9 @@ def fetch filename, default = nil, read_shebang = false
raise UnknownFileType, 'Could not determine type of %p.' % filename
end
end

protected

def shebang filename
return unless File.exist? filename
File.open filename, 'r' do |f|
@@ -73,9 +73,9 @@ def shebang filename
end
end
end

end

TypeFromExt = {
'c' => :c,
'cfc' => :xml,
@@ -86,6 +86,7 @@ def shebang filename
'dpr' => :delphi,
'erb' => :erb,
'gemspec' => :ruby,
'go' => :go,
'groovy' => :groovy,
'gvy' => :groovy,
'h' => :c,
@@ -128,16 +129,16 @@ def shebang filename
for cpp_alias in %w[cc cpp cp cxx c++ C hh hpp h++ cu]
TypeFromExt[cpp_alias] = :cpp
end

TypeFromShebang = /\b(?:ruby|perl|python|sh)\b/

TypeFromName = {
'Capfile' => :ruby,
'Rakefile' => :ruby,
'Rantfile' => :ruby,
'Gemfile' => :ruby,
}

end

end
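
The only functional change in this file is the new `'go' => :go` entry; the rest is whitespace. To show how that entry takes part in the lookup order used by `FileType[]` (exact extension, lowercased extension, then file name), here is a minimal sketch. `MiniFileType` and its tiny tables are hypothetical stand-ins, not CodeRay's API, and unlike the real `fetch` this version returns the default instead of raising `UnknownFileType`.

```ruby
# Hypothetical, trimmed-down stand-in for CodeRay::FileType. Only the lookup
# order is modeled; the tables hold a handful of sample entries.
module MiniFileType
  TYPE_FROM_EXT = {
    'c'  => :c,
    'go' => :go,   # the entry this pull request adds
    'rb' => :ruby,
  }.freeze

  TYPE_FROM_NAME = {
    'Rakefile' => :ruby,
    'Gemfile'  => :ruby,
  }.freeze

  # Returns the language symbol for +filename+, or +default+ when unknown.
  def self.fetch(filename, default = nil)
    name = File.basename(filename.to_s)
    ext  = File.extname(name).sub(/\A\./, '')  # extension without the leading dot
    TYPE_FROM_EXT[ext] ||
      TYPE_FROM_EXT[ext.downcase] ||
      TYPE_FROM_NAME[name] ||
      TYPE_FROM_NAME[name.downcase] ||
      default
  end
end
```

For example, `MiniFileType.fetch('src/main.go')` yields `:go`, and `MiniFileType.fetch('notes.xyz', :text)` falls back to `:text`.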
195 changes: 195 additions & 0 deletions lib/coderay/scanners/go.rb
@@ -0,0 +1,195 @@
module CodeRay
module Scanners

# Scanner for Go, adapted from the C scanner
class Go < Scanner

register_for :go
file_extension 'go'

# http://golang.org/ref/spec#Keywords
KEYWORDS = [
'break', 'default', 'func', 'interface', 'select',
'case', 'defer', 'go', 'map', 'struct',
'chan', 'else', 'goto', 'package', 'switch',
'const', 'fallthrough', 'if', 'range', 'type',
'continue', 'for', 'import', 'return', 'var',
] # :nodoc:

# http://golang.org/ref/spec#Types
PREDEFINED_TYPES = [
'bool',
'uint8', 'uint16', 'uint32', 'uint64',
'int8', 'int16', 'int32', 'int64',
'float32', 'float64',
'complex64', 'complex128',
'byte', 'rune',
'uint', 'int', 'uintptr',
] # :nodoc:

PREDEFINED_CONSTANTS = [
'nil', 'iota',
'true', 'false',
] # :nodoc:

DIRECTIVES = [
'go_no_directive', # placeholder: Go does not seem to have C-style directives
] # :nodoc:

IDENT_KIND = WordList.new(:ident).
add(KEYWORDS, :keyword).
add(PREDEFINED_TYPES, :predefined_type).
add(DIRECTIVES, :directive).
add(PREDEFINED_CONSTANTS, :predefined_constant) # :nodoc:

ESCAPE = / [rbfntv\n\\'"] | x[a-fA-F0-9]{1,2} | [0-7]{1,3} /x # :nodoc:
UNICODE_ESCAPE = / u[a-fA-F0-9]{4} | U[a-fA-F0-9]{8} /x # :nodoc:

protected

def scan_tokens encoder, options

state = :initial
label_expected = true
case_expected = false
label_expected_before_preproc_line = nil
in_preproc_line = false

until eos?

case state

when :initial

if match = scan(/ \s+ | \\\n /x)
if in_preproc_line && match != "\\\n" && match.index(?\n)
in_preproc_line = false
label_expected = label_expected_before_preproc_line
end
encoder.text_token match, :space

elsif match = scan(%r! // [^\n\\]* (?: \\. [^\n\\]* )* | /\* (?: .*? \*/ | .* ) !mx)
encoder.text_token match, :comment

elsif match = scan(/ [-+*=<>?:;,!&^|()\[\]{}~%]+ | \/=? | \.(?!\d) /x)
label_expected = match =~ /[;\{\}]/
if case_expected
label_expected = true if match == ':'
case_expected = false
end
encoder.text_token match, :operator

elsif match = scan(/ [A-Za-z_][A-Za-z_0-9]* /x)
kind = IDENT_KIND[match]
if kind == :ident && label_expected && !in_preproc_line && scan(/:(?!:)/)
kind = :label
match << matched
else
label_expected = false
if kind == :keyword
case match
when 'case', 'default'
case_expected = true
end
end
end
encoder.text_token match, kind

elsif match = scan(/L?"/)
encoder.begin_group :string
if match[0] == ?L
encoder.text_token 'L', :modifier
match = '"'
end
encoder.text_token match, :delimiter
state = :string

elsif match = scan(/ \# \s* if \s* 0 /x)
match << scan_until(/ ^\# (?:elif|else|endif) .*? $ | \z /xm) unless eos?
encoder.text_token match, :comment

elsif match = scan(/#[ \t]*(\w*)/)
encoder.text_token match, :preprocessor
in_preproc_line = true
label_expected_before_preproc_line = label_expected
state = :include_expected if self[1] == 'include'

elsif match = scan(/ L?' (?: [^\'\n\\] | \\ #{ESCAPE} )? '? /ox)
label_expected = false
encoder.text_token match, :char

elsif match = scan(/\$/)
encoder.text_token match, :ident

elsif match = scan(/0[xX][0-9A-Fa-f]+/)
label_expected = false
encoder.text_token match, :hex

elsif match = scan(/(?:0[0-7]+)(?![89.eEfF])/)
label_expected = false
encoder.text_token match, :octal

elsif match = scan(/(?:\d+)(?![.eEfF])L?L?/)
label_expected = false
encoder.text_token match, :integer

elsif match = scan(/\d[fF]?|\d*\.\d+(?:[eE][+-]?\d+)?[fF]?|\d+[eE][+-]?\d+[fF]?/)
label_expected = false
encoder.text_token match, :float

else
encoder.text_token getch, :error

end

when :string
if match = scan(/[^\\\n"]+/)
encoder.text_token match, :content
elsif match = scan(/"/)
encoder.text_token match, :delimiter
encoder.end_group :string
state = :initial
label_expected = false
elsif match = scan(/ \\ (?: #{ESCAPE} | #{UNICODE_ESCAPE} ) /mox)
encoder.text_token match, :char
elsif match = scan(/ \\ | $ /x)
encoder.end_group :string
encoder.text_token match, :error
state = :initial
label_expected = false
else
raise_inspect "else case \" reached; %p not handled." % peek(1), encoder
end

when :include_expected
if match = scan(/<[^>\n]+>?|"[^"\n\\]*(?:\\.[^"\n\\]*)*"?/)
encoder.text_token match, :include
state = :initial

elsif match = scan(/\s+/)
encoder.text_token match, :space
state = :initial if match.index ?\n

else
state = :initial

end

else
raise_inspect 'Unknown state', encoder

end

end

if state == :string
encoder.end_group :string
end

encoder
end

end

end
end
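
The `scan_tokens` method above is a small state machine: the `:initial` state dispatches on the next token, and the `:string` state consumes content and escapes until the closing quote. The following sketch reproduces that shape in isolation with Ruby's standard `StringScanner`. `MiniGo`, its short keyword list, and the simplified escape pattern are assumptions made for illustration; it returns plain `[text, kind]` pairs rather than calling a CodeRay encoder.

```ruby
require 'strscan'

# Hypothetical miniature of the scanner loop; not CodeRay code.
module MiniGo
  KEYWORDS = %w[func package return var].freeze
  # Simplified Go escapes: single-character, \xHH, \OOO, \uXXXX, \UXXXXXXXX.
  ESCAPE = /[abfnrtv\\'"]|x[0-9a-fA-F]{2}|[0-7]{3}|u[0-9a-fA-F]{4}|U[0-9a-fA-F]{8}/

  def self.tokens(source)
    s = StringScanner.new(source)
    out = []
    state = :initial
    until s.eos?
      case state
      when :initial
        if (m = s.scan(/\s+/))
          out << [m, :space]
        elsif (m = s.scan(/[A-Za-z_]\w*/))
          out << [m, KEYWORDS.include?(m) ? :keyword : :ident]
        elsif (m = s.scan(/"/))
          out << [m, :delimiter]
          state = :string
        else
          out << [s.getch, :error]  # anything this sketch does not model
        end
      when :string
        if (m = s.scan(/[^\\\n"]+/))
          out << [m, :content]
        elsif (m = s.scan(/"/))
          out << [m, :delimiter]
          state = :initial
        elsif (m = s.scan(/\\(?:#{ESCAPE})/o))
          out << [m, :char]
        else
          # Lone backslash or newline inside the string: flag it and recover.
          out << [s.getch, :error]
          state = :initial
        end
      end
    end
    out
  end
end
```

Calling `MiniGo.tokens('var s "a\n"')` walks through both states: keyword, identifier, opening delimiter, content, one escape token, closing delimiter. The error branch always consumes one character, so the loop cannot stall on unexpected input.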