IOError when scanning file with odd chars

I inadvertently pasted some text from Word, which contained curly quotes (inverted commas), into a Ruby comment. When I ran `CodeRay.scan_file` on that file, it failed with:

```
IOError (Cannot run program "file" (in directory "C:\tb\port_compare"): CreateProcess error=2, The system cannot find the file specified)
```

...which was thrown at `lib/coderay/scanner.rb:120` (method `guess_encoding`). Further up the stack, in `normalize`, I could see where it was branching to `encode_with_encoding` (as opposed to `to_unix`), so I commented that branch out to force it to use `to_unix`.

Then I retried and received this error:

```
CodeRay::Scanners::Scanner::ScanError (

***ERROR in scanner.rb:200:in `tokenize': invalid byte sequence in UTF-8 (after 0 tokens)

tokens:


current line: 55  column: 89  pos: 1673
matched: "# WTF? AND data_srce_sys_cde / id_prod_cmpnt_cde_1 are in \x93Interest Only\x94 list"  state: "Error in CodeRay::Scanners::Ruby#scan_tokens, initial state was: :initial"
bol? = false,  eos? = false

surrounding code:
"_1 are in \u0093Interest Only\u0094 list"  ~~  "\n          return :bullet_inte"


***ERROR***
```

...which helped me diagnose the root problem.
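For anyone hitting the same thing, here is a small sketch of why those two bytes trip up the scanner. It assumes the file was saved as Windows-1252 (which is what "ANSI" usually means on a Western-locale Windows install, though that is an assumption here). In that encoding, 0x93 and 0x94 are the Word-style opening and closing curly quotes. Read as UTF-8, they are stray continuation bytes and therefore invalid.

```ruby
# Bytes copied from the "matched:" line of the error output.
raw = "in \x93Interest Only\x94 list".dup.force_encoding(Encoding::BINARY)

# Read as UTF-8, the lone 0x93 / 0x94 bytes form an invalid sequence.
as_utf8 = raw.dup.force_encoding(Encoding::UTF_8)
utf8_ok = as_utf8.valid_encoding?

# Read as Windows-1252 (assumed source encoding), they transcode cleanly
# to U+201C and U+201D, the curly double quotes.
converted = raw.dup.force_encoding(Encoding::Windows_1252).encode(Encoding::UTF_8)

puts "valid as UTF-8? #{utf8_ok}"
puts converted
```

Re-saving the file as UTF-8, or replacing the curly quotes with plain ASCII quotes, should avoid the problem on the input side.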

It would be good if there were some error handling around the `IO.popen` call to help with diagnosis, or if the decision to call `guess_encoding` were stricter (assuming it was called in error here). I'm not sure how to do this, but I thought I'd log it here anyway in case someone else hits the same error.
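A rough sketch of the kind of error handling meant here, not CodeRay's actual code. Both method names (`charset_from_file_command`, `guess_encoding_safely`) are hypothetical, and the Windows-1252 fallback is an assumption that suits this particular file rather than a general rule. The idea is to skip the external `file` command when the source is already valid UTF-8, and to treat a missing `file` binary as "unknown" instead of letting the exception escape.

```ruby
# Hypothetical helper, not part of CodeRay: ask the external `file`
# command for a charset, returning nil instead of raising when the
# command is missing (as on a stock Windows install) or its answer
# is not an encoding name Ruby recognises.
def charset_from_file_command(source)
  IO.popen(['file', '-b', '--mime', '-'], 'r+') do |io|
    io.write source[0, 1024]
    io.close_write
    name = io.gets.to_s[/charset=([-\w]+)/, 1]
    name && Encoding.find(name)
  end
rescue SystemCallError, IOError, ArgumentError
  # Errno::ENOENT on MRI, IOError on JRuby, ArgumentError from Encoding.find.
  nil
end

# Hypothetical wrapper: only shell out when the bytes are not valid UTF-8,
# and fall back to an assumed default when nothing better is known.
def guess_encoding_safely(source, fallback = Encoding::Windows_1252)
  utf8 = source.dup.force_encoding(Encoding::UTF_8)
  return Encoding::UTF_8 if utf8.valid_encoding?

  charset_from_file_command(source) || fallback
end

plain = "# an ordinary comment\n"
odd   = "# in \x93Interest Only\x94 list\n".dup.force_encoding(Encoding::BINARY)

puts guess_encoding_safely(plain)
puts guess_encoding_safely(odd)
```

With something along these lines, a machine without `file` on its path would get a fallback encoding rather than an `IOError`. What the second call prints depends on whether `file` is installed and what it reports.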

Environment: Windows XP, Notepad++, file saved as ANSI


IOError when scanning file with odd chars #98
