Mirror of https://github.com/ralsina/tartrazine.git, synced 2025-07-02 04:47:07 -03:00

Compare commits (16 commits)
Commits:
27008640a6
7db8fdc9e4
ad664d9f93
0626c8619f
3725201f8a
6f64b76c44
5218af6855
c898f395a1
56e49328fb
8d7faf2098
2e87762f1b
88f5674917
ce6f3d29b5
46d6d3f467
78ddc69937
b1ad7b64c0
`.ameba.yml` (106 changed lines)

```diff
@@ -1,5 +1,5 @@
 # This configuration file was generated by `ameba --gen-config`
-# on 2024-08-04 23:09:09 UTC using Ameba version 1.6.1.
+# on 2024-08-12 22:00:49 UTC using Ameba version 1.6.1.
 # The point is for the user to remove these configuration records
 # one by one as the reported problems are removed from the code base.
 
@@ -9,7 +9,7 @@ Documentation/DocumentationAdmonition:
   Description: Reports documentation admonitions
   Timezone: UTC
   Excluded:
-  - src/tartrazine.cr
+  - src/lexer.cr
   - src/actions.cr
   Admonitions:
   - TODO
@@ -17,3 +17,105 @@ Documentation/DocumentationAdmonition:
   - BUG
   Enabled: true
   Severity: Warning
+
+# Problems found: 22
+# Run `ameba --only Lint/MissingBlockArgument` for details
+Lint/MissingBlockArgument:
+  Description: Disallows yielding method definitions without block argument
+  Excluded:
+  - pygments/tests/examplefiles/cr/test.cr
+  Enabled: true
+  Severity: Warning
+
+# Problems found: 1
+# Run `ameba --only Lint/NotNil` for details
+Lint/NotNil:
+  Description: Identifies usage of `not_nil!` calls
+  Excluded:
+  - pygments/tests/examplefiles/cr/test.cr
+  Enabled: true
+  Severity: Warning
+
+# Problems found: 34
+# Run `ameba --only Lint/ShadowingOuterLocalVar` for details
+Lint/ShadowingOuterLocalVar:
+  Description: Disallows the usage of the same name as outer local variables for block
+    or proc arguments
+  Excluded:
+  - pygments/tests/examplefiles/cr/test.cr
+  Enabled: true
+  Severity: Warning
+
+# Problems found: 1
+# Run `ameba --only Lint/UnreachableCode` for details
+Lint/UnreachableCode:
+  Description: Reports unreachable code
+  Excluded:
+  - pygments/tests/examplefiles/cr/test.cr
+  Enabled: true
+  Severity: Warning
+
+# Problems found: 6
+# Run `ameba --only Lint/UselessAssign` for details
+Lint/UselessAssign:
+  Description: Disallows useless variable assignments
+  ExcludeTypeDeclarations: false
+  Excluded:
+  - pygments/tests/examplefiles/cr/test.cr
+  Enabled: true
+  Severity: Warning
+
+# Problems found: 3
+# Run `ameba --only Naming/BlockParameterName` for details
+Naming/BlockParameterName:
+  Description: Disallows non-descriptive block parameter names
+  MinNameLength: 3
+  AllowNamesEndingInNumbers: true
+  Excluded:
+  - pygments/tests/examplefiles/cr/test.cr
+  AllowedNames:
+  - _
+  - e
+  - i
+  - j
+  - k
+  - v
+  - x
+  - y
+  - ex
+  - io
+  - ws
+  - op
+  - tx
+  - id
+  - ip
+  - k1
+  - k2
+  - v1
+  - v2
+  ForbiddenNames: []
+  Enabled: true
+  Severity: Convention
+
+# Problems found: 1
+# Run `ameba --only Naming/RescuedExceptionsVariableName` for details
+Naming/RescuedExceptionsVariableName:
+  Description: Makes sure that rescued exceptions variables are named as expected
+  Excluded:
+  - pygments/tests/examplefiles/cr/test.cr
+  AllowedNames:
+  - e
+  - ex
+  - exception
+  - error
+  Enabled: true
+  Severity: Convention
+
+# Problems found: 6
+# Run `ameba --only Naming/TypeNames` for details
+Naming/TypeNames:
+  Description: Enforces type names in camelcase manner
+  Excluded:
+  - pygments/tests/examplefiles/cr/test.cr
+  Enabled: true
+  Severity: Convention
```
`README.md` (16 changed lines)

```diff
@@ -4,17 +4,17 @@ Tartrazine is a library to syntax-highlight code. It is
 a port of [Pygments](https://pygments.org/) to
 [Crystal](https://crystal-lang.org/). Kind of.
 
-It's not currently usable because it's not finished, but:
-
-* The lexers work for the implemented languages
-* The provided styles work
-* There is a very very simple HTML formatter
+The CLI tool can be used to highlight many things in many styles.
 
 # A port of what? Why "kind of"?
 
-Because I did not read the Pygments code. And this is actually
-based on [Chroma](https://github.com/alecthomas/chroma) ...
-although I did not read that code either.
+Pygments is a staple of the Python ecosystem, and it's great.
+It lets you highlight code in many languages, and it has many
+themes. Chroma is "Pygments for Go", it's actually a port of
+Pygments to Go, and it's great too.
+
+I wanted that in Crystal, so I started this project. But I did
+not read much of the Pygments code. Or much of Chroma's.
 
 Chroma has taken most of the Pygments lexers and turned them into
 XML descriptions. What I did was take those XML files from Chroma
```
```diff
@@ -1,5 +1,5 @@
 name: tartrazine
-version: 0.2.0
+version: 0.4.0
 
 authors:
 - Roberto Alsina <roberto.alsina@gmail.com>
```
```diff
@@ -14,15 +14,18 @@ unicode_problems = {
   "#{__DIR__}/tests/java/test_string_literals.txt",
   "#{__DIR__}/tests/json/test_strings.txt",
   "#{__DIR__}/tests/systemd/example1.txt",
+  "#{__DIR__}/tests/c++/test_unicode_identifiers.txt",
 }
 
 # These testcases fail because of differences in the way chroma and tartrazine tokenize
 # but tartrazine is correct
 bad_in_chroma = {
   "#{__DIR__}/tests/bash_session/test_comment_after_prompt.txt",
+  "#{__DIR__}/tests/html/javascript_backtracking.txt",
   "#{__DIR__}/tests/java/test_default.txt",
   "#{__DIR__}/tests/java/test_multiline_string.txt",
   "#{__DIR__}/tests/java/test_numeric_literals.txt",
+  "#{__DIR__}/tests/octave/test_multilinecomment.txt",
   "#{__DIR__}/tests/php/test_string_escaping_run.txt",
   "#{__DIR__}/tests/python_2/test_cls_builtin.txt",
 }
@@ -30,19 +33,14 @@ bad_in_chroma = {
 known_bad = {
   "#{__DIR__}/tests/bash_session/fake_ps2_prompt.txt",
   "#{__DIR__}/tests/bash_session/prompt_in_output.txt",
-  "#{__DIR__}/tests/bash_session/test_newline_in_echo_no_ps2.txt",
-  "#{__DIR__}/tests/bash_session/test_newline_in_ls_ps2.txt",
   "#{__DIR__}/tests/bash_session/ps2_prompt.txt",
-  "#{__DIR__}/tests/bash_session/test_newline_in_ls_no_ps2.txt",
-  "#{__DIR__}/tests/bash_session/test_virtualenv.txt",
+  "#{__DIR__}/tests/bash_session/test_newline_in_echo_no_ps2.txt",
   "#{__DIR__}/tests/bash_session/test_newline_in_echo_ps2.txt",
-  "#{__DIR__}/tests/c/test_string_resembling_decl_end.txt",
-  "#{__DIR__}/tests/html/css_backtracking.txt",
+  "#{__DIR__}/tests/bash_session/test_newline_in_ls_no_ps2.txt",
+  "#{__DIR__}/tests/bash_session/test_newline_in_ls_ps2.txt",
+  "#{__DIR__}/tests/bash_session/test_virtualenv.txt",
   "#{__DIR__}/tests/mcfunction/data.txt",
   "#{__DIR__}/tests/mcfunction/selectors.txt",
-  "#{__DIR__}/tests/php/anonymous_class.txt",
-  "#{__DIR__}/tests/html/javascript_unclosed.txt",
-
 }
 
 # Tests that fail because of a limitation in PCRE2
```
```diff
@@ -30,11 +30,11 @@ module Tartrazine
     end
 
     # ameba:disable Metrics/CyclomaticComplexity
-    def emit(match : Regex::MatchData?, lexer : Lexer, match_group = 0) : Array(Token)
+    def emit(match : MatchData, lexer : Lexer, match_group = 0) : Array(Token)
       case type
       when "token"
-        raise Exception.new "Can't have a token without a match" if match.nil?
-        [Token.new(type: xml["type"], value: match[match_group])]
+        raise Exception.new "Can't have a token without a match" if match.empty?
+        [Token.new(type: xml["type"], value: String.new(match[match_group].value))]
       when "push"
         states_to_push = xml.attributes.select { |attrib|
           attrib.name == "state"
@@ -79,23 +79,29 @@ module Tartrazine
         # the action is skipped.
         result = [] of Token
         @actions.each_with_index do |e, i|
-          next if match[i + 1]?.nil?
+          begin
+            next if match[i + 1].size == 0
+          rescue IndexError
+            # FIXME: This should not actually happen
+            # No match for this group
+            next
+          end
           result += e.emit(match, lexer, i + 1)
         end
         result
       when "using"
         # Shunt to another lexer entirely
-        return [] of Token if match.nil?
+        return [] of Token if match.empty?
        lexer_name = xml["lexer"].downcase
         Log.trace { "to tokenize: #{match[match_group]}" }
-        Tartrazine.lexer(lexer_name).tokenize(match[match_group], usingself: true)
+        Tartrazine.lexer(lexer_name).tokenize(String.new(match[match_group].value), usingself: true)
       when "usingself"
         # Shunt to another copy of this lexer
-        return [] of Token if match.nil?
+        return [] of Token if match.empty?
 
         new_lexer = Lexer.from_xml(lexer.xml)
         Log.trace { "to tokenize: #{match[match_group]}" }
-        new_lexer.tokenize(match[match_group], usingself: true)
+        new_lexer.tokenize(String.new(match[match_group].value), usingself: true)
       when "combined"
         # Combine two states into one anonymous state
         states = xml.attributes.select { |attrib|
```
`src/bytes_regex.cr` (new file, 75 lines)

```crystal
module BytesRegex
  extend self

  class Regex
    def initialize(pattern : String, multiline = false, dotall = false, ignorecase = false, anchored = false)
      flags = LibPCRE2::UTF | LibPCRE2::DUPNAMES | LibPCRE2::UCP | LibPCRE2::NO_UTF_CHECK
      flags |= LibPCRE2::MULTILINE if multiline
      flags |= LibPCRE2::DOTALL if dotall
      flags |= LibPCRE2::CASELESS if ignorecase
      flags |= LibPCRE2::ANCHORED if anchored
      if @re = LibPCRE2.compile(
           pattern,
           pattern.bytesize,
           flags,
           out errorcode,
           out erroroffset,
           nil)
      else
        msg = String.new(256) do |buffer|
          bytesize = LibPCRE2.get_error_message(errorcode, buffer, 256)
          {bytesize, 0}
        end
        raise Exception.new "Error #{msg} compiling regex at offset #{erroroffset}"
      end
    end

    def finalize
      LibPCRE2.code_free(@re)
    end

    def match(str : Bytes, pos = 0) : Array(Match)
      match_data = LibPCRE2.match_data_create_from_pattern(@re, nil)
      match = [] of Match
      rc = LibPCRE2.match(
        @re,
        str,
        str.size,
        pos,
        LibPCRE2::NO_UTF_CHECK,
        match_data,
        nil)
      if rc < 0
        # No match, do nothing
      else
        ovector = LibPCRE2.get_ovector_pointer(match_data)
        (0...rc).each do |i|
          m_start = ovector[2 * i]
          m_size = ovector[2 * i + 1] - m_start
          if m_size == 0
            m_value = Bytes.new(0)
          else
            m_value = str[m_start...m_start + m_size]
          end
          match << Match.new(m_value, m_start, m_size)
        end
      end
      LibPCRE2.match_data_free(match_data)
      match
    end
  end

  class Match
    property value : Bytes
    property start : UInt64
    property size : UInt64

    def initialize(@value : Bytes, @start : UInt64, @size : UInt64)
    end
  end
end

# pattern = "foo"
# str = "foo bar"
# re = BytesRegex::Regex.new(pattern)
# p! String.new(re.match(str.to_slice)[0].value)
```
```diff
@@ -9,12 +9,15 @@ module Tartrazine
   # This is the base class for all formatters.
   abstract class Formatter
     property name : String = ""
+    property theme : Theme = Tartrazine.theme("default-dark")
 
-    def format(text : String, lexer : Lexer, theme : Theme) : String
+    # Format the text using the given lexer.
+    def format(text : String, lexer : Lexer) : String
       raise Exception.new("Not implemented")
     end
 
-    def get_style_defs(theme : Theme) : String
+    # Return the styles, if the formatter supports it.
+    def style_defs : String
       raise Exception.new("Not implemented")
     end
   end
```
@ -4,20 +4,23 @@ module Tartrazine
|
|||||||
class Ansi < Formatter
|
class Ansi < Formatter
|
||||||
property? line_numbers : Bool = false
|
property? line_numbers : Bool = false
|
||||||
|
|
||||||
def format(text : String, lexer : Lexer, theme : Theme) : String
|
def initialize(@theme : Theme = Tartrazine.theme("default-dark"), @line_numbers : Bool = false)
|
||||||
|
end
|
||||||
|
|
||||||
|
def format(text : String, lexer : Lexer) : String
|
||||||
output = String.build do |outp|
|
output = String.build do |outp|
|
||||||
lexer.group_tokens_in_lines(lexer.tokenize(text)).each_with_index do |line, i|
|
lexer.group_tokens_in_lines(lexer.tokenize(text)).each_with_index do |line, i|
|
||||||
label = line_numbers? ? "#{i + 1}".rjust(4).ljust(5) : ""
|
label = line_numbers? ? "#{i + 1}".rjust(4).ljust(5) : ""
|
||||||
outp << label
|
outp << label
|
||||||
line.each do |token|
|
line.each do |token|
|
||||||
outp << colorize(token[:value], token[:type], theme)
|
outp << colorize(token[:value], token[:type])
|
||||||
end
|
end
|
||||||
end
|
end
|
||||||
end
|
end
|
||||||
output
|
output
|
||||||
end
|
end
|
||||||
|
|
||||||
def colorize(text : String, token : String, theme : Theme) : String
|
def colorize(text : String, token : String) : String
|
||||||
style = theme.styles.fetch(token, nil)
|
style = theme.styles.fetch(token, nil)
|
||||||
return text if style.nil?
|
return text if style.nil?
|
||||||
if theme.styles.has_key?(token)
|
if theme.styles.has_key?(token)
|
||||||
|
```diff
@@ -15,20 +15,37 @@ module Tartrazine
     property? standalone : Bool = false
     property? surrounding_pre : Bool = true
     property? wrap_long_lines : Bool = false
+    property weight_of_bold : Int32 = 600
 
-    def format(text : String, lexer : Lexer, theme : Theme) : String
-      text = format_text(text, lexer, theme)
+    property theme : Theme
+
+    def initialize(@theme : Theme = Tartrazine.theme("default-dark"), *,
+                   @highlight_lines = [] of Range(Int32, Int32),
+                   @class_prefix : String = "",
+                   @line_number_id_prefix = "line-",
+                   @line_number_start = 1,
+                   @tab_width = 8,
+                   @line_numbers : Bool = false,
+                   @linkable_line_numbers : Bool = true,
+                   @standalone : Bool = false,
+                   @surrounding_pre : Bool = true,
+                   @wrap_long_lines : Bool = false,
+                   @weight_of_bold : Int32 = 600)
+    end
+
+    def format(text : String, lexer : Lexer) : String
+      text = format_text(text, lexer)
       if standalone?
-        text = wrap_standalone(text, theme)
+        text = wrap_standalone(text)
       end
       text
     end
 
     # Wrap text into a full HTML document, including the CSS for the theme
-    def wrap_standalone(text, theme) : String
+    def wrap_standalone(text) : String
       output = String.build do |outp|
         outp << "<!DOCTYPE html><html><head><style>"
-        outp << get_style_defs(theme)
+        outp << style_defs
         outp << "</style></head><body>"
         outp << text
         outp << "</body></html>"
@@ -36,21 +53,21 @@ module Tartrazine
       output
     end
 
-    def format_text(text : String, lexer : Lexer, theme : Theme) : String
+    def format_text(text : String, lexer : Lexer) : String
       lines = lexer.group_tokens_in_lines(lexer.tokenize(text))
       output = String.build do |outp|
         if surrounding_pre?
           pre_style = wrap_long_lines? ? "style=\"white-space: pre-wrap; word-break: break-word;\"" : ""
-          outp << "<pre class=\"#{get_css_class("Background", theme)}\" #{pre_style}>"
+          outp << "<pre class=\"#{get_css_class("Background")}\" #{pre_style}>"
         end
-        "<code class=\"#{get_css_class("Background", theme)}\">"
+        outp << "<code class=\"#{get_css_class("Background")}\">"
         lines.each_with_index(offset: line_number_start - 1) do |line, i|
           line_label = line_numbers? ? "#{i + 1}".rjust(4).ljust(5) : ""
-          line_class = highlighted?(i + 1) ? "class=\"#{get_css_class("LineHighlight", theme)}\"" : ""
+          line_class = highlighted?(i + 1) ? "class=\"#{get_css_class("LineHighlight")}\"" : ""
           line_id = linkable_line_numbers? ? "id=\"#{line_number_id_prefix}#{i + 1}\"" : ""
           outp << "<span #{line_id} #{line_class} style=\"user-select: none;\">#{line_label} </span>"
           line.each do |token|
-            fragment = "<span class=\"#{get_css_class(token[:type], theme)}\">#{token[:value]}</span>"
+            fragment = "<span class=\"#{get_css_class(token[:type])}\">#{token[:value]}</span>"
             outp << fragment
           end
         end
@@ -60,10 +77,10 @@ module Tartrazine
     end
 
     # ameba:disable Metrics/CyclomaticComplexity
-    def get_style_defs(theme : Theme) : String
+    def style_defs : String
       output = String.build do |outp|
         theme.styles.each do |token, style|
-          outp << ".#{get_css_class(token, theme)} {"
+          outp << ".#{get_css_class(token)} {"
           # These are set or nil
           outp << "color: ##{style.color.try &.hex};" if style.color
           outp << "background-color: ##{style.background.try &.hex};" if style.background
@@ -72,7 +89,7 @@ module Tartrazine
           # These are true/false/nil
           outp << "border: none;" if style.border == false
           outp << "font-weight: bold;" if style.bold
-          outp << "font-weight: 400;" if style.bold == false
+          outp << "font-weight: #{@weight_of_bold};" if style.bold == false
           outp << "font-style: italic;" if style.italic
           outp << "font-style: normal;" if style.italic == false
           outp << "text-decoration: underline;" if style.underline
@@ -86,7 +103,7 @@ module Tartrazine
     end
 
     # Given a token type, return the CSS class to use.
-    def get_css_class(token, theme)
+    def get_css_class(token : String) : String
       return class_prefix + Abbreviations[token] if theme.styles.has_key?(token)
 
       # Themes don't contain information for each specific
@@ -98,6 +115,7 @@ module Tartrazine
       }]
     end
 
+    # Is this line in the highlighted ranges?
     def highlighted?(line : Int) : Bool
       highlight_lines.any?(&.includes?(line))
     end
```
`src/lexer.cr` (17 changed lines)

```diff
@@ -1,3 +1,4 @@
+require "baked_file_system"
 require "./constants/lexers"
 
 module Tartrazine
@@ -65,7 +66,7 @@ module Tartrazine
     # is true when the lexer is being used to tokenize a string
     # from a larger text that is already being tokenized.
     # So, when it's true, we don't modify the text.
-    def tokenize(text, usingself = false) : Array(Token)
+    def tokenize(text : String, usingself = false) : Array(Token)
       @state_stack = ["root"]
       tokens = [] of Token
       pos = 0
@@ -76,12 +77,13 @@ module Tartrazine
         text += "\n"
       end
 
+      text_bytes = text.to_slice
       # Loop through the text, applying rules
-      while pos < text.size
+      while pos < text_bytes.size
        state = states[@state_stack.last]
         # Log.trace { "Stack is #{@state_stack} State is #{state.name}, pos is #{pos}, text is #{text[pos..pos + 10]}" }
         state.rules.each do |rule|
-          matched, new_pos, new_tokens = rule.match(text, pos, self)
+          matched, new_pos, new_tokens = rule.match(text_bytes, pos, self)
           if matched
             # Move position forward, save the tokens,
             # tokenize from the new position
@@ -94,8 +96,13 @@ module Tartrazine
         end
         # If no rule matches, emit an error token
         unless matched
-          # Log.trace { "Error at #{pos}" }
-          tokens << {type: "Error", value: "#{text[pos]}"}
+          if text_bytes[pos] == 10u8
+            # at EOL, reset state to "root"
+            tokens << {type: "Text", value: "\n"}
+            @state_stack = ["root"]
+          else
+            tokens << {type: "Error", value: String.new(text_bytes[pos..pos])}
+          end
           pos += 1
         end
       end
```
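The lexer change above walks `text.to_slice` and compares raw bytes, using `10u8` (the UTF-8 byte for `'\n'`) to detect end of line. A minimal standalone sketch of that byte-level pattern (nothing here is project-specific):

```crystal
text = "ab\ncd\n"
bytes = text.to_slice

# Detect newlines the same way the lexer does: compare against 10u8.
newlines = bytes.count { |b| b == 10u8 }
puts newlines # 2

# Slicing bytes[pos..pos] and wrapping in String.new yields the
# one-byte string used for the error token above.
puts String.new(bytes[2..2]) == "\n" # true
```

Because the slice is bytes, `bytes[pos]` is always a single `UInt8`, so the comparison never depends on character decoding.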
`src/main.cr` (10 changed lines)

```diff
@@ -54,6 +54,8 @@ if options["--list-formatters"]
   exit 0
 end
 
+theme = Tartrazine.theme(options["-t"].as(String))
+
 if options["-f"]
   formatter = options["-f"].as(String)
   case formatter
@@ -61,9 +63,11 @@ if options["-f"]
     formatter = Tartrazine::Html.new
     formatter.standalone = options["--standalone"] != nil
     formatter.line_numbers = options["--line-numbers"] != nil
+    formatter.theme = theme
   when "terminal"
     formatter = Tartrazine::Ansi.new
     formatter.line_numbers = options["--line-numbers"] != nil
+    formatter.theme = theme
   when "json"
     formatter = Tartrazine::Json.new
   else
@@ -71,11 +75,9 @@ if options["-f"]
     exit 1
   end
 
-  theme = Tartrazine.theme(options["-t"].as(String))
-
   if formatter.is_a?(Tartrazine::Html) && options["--css"]
     File.open("#{options["-t"].as(String)}.css", "w") do |outf|
-      outf.puts formatter.get_style_defs(theme)
+      outf.puts formatter.style_defs
     end
     exit 0
   end
@@ -83,7 +85,7 @@ if options["-f"]
   lexer = Tartrazine.lexer(name: options["-l"].as(String), filename: options["FILE"].as(String))
 
   input = File.open(options["FILE"].as(String)).gets_to_end
-  output = formatter.format(input, lexer, theme)
+  output = formatter.format(input, lexer)
 
   if options["-o"].nil?
     puts output
```
`src/rules.cr` (56 changed lines)

```diff
@@ -1,8 +1,9 @@
 require "./actions"
+require "./bytes_regex"
 require "./formatter"
+require "./lexer"
 require "./rules"
 require "./styles"
-require "./lexer"
 
 # These are lexer rules. They match with the text being parsed
 # and perform actions, either emitting tokens or changing the
@@ -10,16 +11,21 @@ require "./lexer"
 module Tartrazine
   # This rule matches via a regex pattern
+
+  alias Regex = BytesRegex::Regex
+  alias Match = BytesRegex::Match
+  alias MatchData = Array(Match)
+
   class Rule
-    property pattern : Regex = Re2.new ""
+    property pattern : Regex = Regex.new ""
     property actions : Array(Action) = [] of Action
     property xml : String = "foo"
 
-    def match(text, pos, lexer) : Tuple(Bool, Int32, Array(Token))
+    def match(text : Bytes, pos, lexer) : Tuple(Bool, Int32, Array(Token))
       match = pattern.match(text, pos)
       # We don't match if the match doesn't move the cursor
       # because that causes infinite loops
-      return false, pos, [] of Token if match.nil? || match.end == 0
+      return false, pos, [] of Token if match.empty? || match[0].size == 0
+      # p! match, String.new(text[pos..pos+20])
       # Log.trace { "#{match}, #{pattern.inspect}, #{text}, #{pos}" }
       tokens = [] of Token
       # Emit the tokens
@@ -27,18 +33,21 @@ module Tartrazine
         # Emit the token
         tokens += action.emit(match, lexer)
       end
-      Log.trace { "#{xml}, #{match.end}, #{tokens}" }
-      return true, match.end, tokens
+      Log.trace { "#{xml}, #{pos + match[0].size}, #{tokens}" }
+      return true, pos + match[0].size, tokens
     end
 
     def initialize(node : XML::Node, multiline, dotall, ignorecase)
       @xml = node.to_s
-      @pattern = Re2.new(
-        node["pattern"],
-        multiline,
-        dotall,
-        ignorecase,
-        anchored: true)
+      pattern = node["pattern"]
+      # flags = Regex::Options::ANCHORED
+      # MULTILINE implies DOTALL which we don't want, so we
+      # use in-pattern flag (?m) instead
+      # flags |= Regex::Options::MULTILINE if multiline
+      pattern = "(?m)" + pattern if multiline
+      # flags |= Regex::Options::DOTALL if dotall
+      # flags |= Regex::Options::IGNORE_CASE if ignorecase
+      @pattern = Regex.new(pattern, multiline, dotall, ignorecase, true)
       add_actions(node)
     end
 
@@ -80,7 +89,7 @@ module Tartrazine
     def match(text, pos, lexer) : Tuple(Bool, Int32, Array(Token))
       tokens = [] of Token
       actions.each do |action|
-        tokens += action.emit(nil, lexer)
+        tokens += action.emit([] of Match, lexer)
       end
       return true, pos, tokens
     end
@@ -90,25 +99,4 @@ module Tartrazine
       add_actions(node)
     end
   end
-
-  # This is a hack to workaround that Crystal seems to disallow
-  # having regexes multiline but not dot_all
-  class Re2 < Regex
-    @source = "fa"
-    @options = Regex::Options::None
-    @jit = true
-
-    def initialize(pattern : String, multiline = false, dotall = false, ignorecase = false, anchored = false)
-      flags = LibPCRE2::UTF | LibPCRE2::DUPNAMES |
-              LibPCRE2::UCP
-      flags |= LibPCRE2::MULTILINE if multiline
-      flags |= LibPCRE2::DOTALL if dotall
-      flags |= LibPCRE2::CASELESS if ignorecase
-      flags |= LibPCRE2::ANCHORED if anchored
-      flags |= LibPCRE2::NO_UTF_CHECK
-      @re = Regex::PCRE2.compile(pattern, flags) do |error_message|
-        raise Exception.new(error_message)
-      end
-    end
-  end
 end
```
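The rewritten `Rule#initialize` prepends an in-pattern `(?m)` flag rather than passing a MULTILINE compile option, since (per the comments in the diff) the option route dragged in DOTALL behavior it didn't want. A quick sketch of the distinction with only stdlib `Regex` (nothing project-specific):

```crystal
# (?m) makes ^ and $ match at line boundaries...
puts "foo\nbar".matches?(Regex.new("(?m)^bar$")) # true

# ...but unlike DOTALL ((?s)), "." still refuses to cross a newline.
puts "foo\nbar".matches?(Regex.new("(?m)foo.bar")) # false
```

Putting the flag inside the pattern keeps the two behaviors independent, which is exactly what the lexer rules need.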
```diff
@@ -11,7 +11,7 @@ require "xml"
 
 module Tartrazine
   extend self
-  VERSION = "0.2.0"
+  VERSION = {{ `shards version #{__DIR__}`.chomp.stringify }}
 
   Log = ::Log.for("tartrazine")
 end
```
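The `VERSION` change bakes the output of `shards version` into the binary at compile time: inside a macro expression, backticks run the command during compilation and `stringify` turns its output into a string literal. The same trick in isolation (assumption: `echo` is available on the build machine):

```crystal
# The backtick runs at *compile* time; its chomped output becomes
# a string literal embedded in the program.
GREETING = {{ `echo hello`.chomp.stringify }}
puts GREETING # hello
```

This keeps the version in `shard.yml` as the single source of truth instead of duplicating it in code.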