117 Commits

Author SHA1 Message Date
github-actions
afe3bdf45a Updated Linguist to v7.23.0 2023-03-03 14:07:28 +01:00
Alex Bezzubov
8246efecce heuristics regexp engine configurable #3, adapt IsVendor optimization & tests
Regex collation optimization for IsVendor now fails gracefully.
Tests that are affected by non-RE2 syntax are explicitly marked.
2023-02-16 17:55:57 +01:00
Alex Bezzubov
8df9e1ecf2 code-gen: improve ability to debug failures
Code generation failures were hard to identify and understand

 * avoid unnecessary re-formatting & memory allocation
 * return clear formatting errors
2023-02-16 17:47:44 +01:00
Alex Bezzubov
319e630aaf code-gen: syntax-aware generation of vendor regex 2023-01-19 19:50:22 +01:00
Alex Bezzubov
3aeb9879da heuristics regexp engine configurable #2, skip rules at runtime 2023-01-19 19:50:22 +01:00
Alex Bezzubov
d8913b00e9 code-gen: re-generate code & fixtures 2022-12-25 22:37:52 +01:00
Alex Bezzubov
5e590f3554 code-gen: make content heuristics regexp engine configurable & generation syntax-aware 2022-12-25 22:37:52 +01:00
Alex Bezzubov
0b92f97b9c code-gen: refactoring, re-use function map in templates 2022-12-25 22:37:52 +01:00
Alex
a9296f134c
Merge pull request #149 from go-enry/improve-gen-tests
Generator tests: add readable text diff output
2022-12-18 19:32:15 +01:00
Alex Bezzubov
c0176b04e7 gen-test: don't expose diff + attributions 2022-12-12 20:48:35 +01:00
Alex Bezzubov
c79c32f525 gen-test: add readable text diff output
test plan:
 * go test -run '^Test_GeneratorTestSuite$' \
	-testify.m '^(TestGenerationFiles)$' \
	github.com/go-enry/go-enry/v2/internal/code-generator/generator
2022-12-03 18:55:48 +01:00
Alex
2059129b5e
Merge branch 'master' into spelling 2022-12-03 10:48:23 +01:00
Alex
a8344728a7
Merge pull request #143 from go-enry/re-collation-at-codegen
Move vendor RE collation to codegen
2022-12-02 10:11:39 +01:00
Alex Bezzubov
2c708f0b6c gen: re-generate aliases & test fixture for go 1.19 2022-12-02 09:55:50 +01:00
Alex Bezzubov
f4051b0f16 gen: re-generated vendors using build-time optimization 2022-12-01 22:16:54 +01:00
Alex Bezzubov
ede9e478fe IsVendor: move RE collation to code generation phase
test plan:
 * go test -run '^TestIsVendor$' github.com/go-enry/go-enry/v2
2022-12-01 22:16:44 +01:00
Alex Bezzubov
6be1ebe9d6 test: fail fast on suite setup/teardown 2022-12-01 22:00:58 +01:00
Alex Bezzubov
43475949cc test: limit linguist repo history size 2022-12-01 22:00:58 +01:00
Alex Bezzubov
bb7a81ede4 refactoring: unify, extract&reuse maybeCloneLinguist() 2022-12-01 22:00:58 +01:00
Alex Bezzubov
5683b2e7f8 test: refactored to clarify Linguist cloning logic on codegen 2022-10-23 11:07:50 +02:00
Josh Soref
bc7767728d spelling: syntax
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-10-06 13:25:49 -04:00
Josh Soref
500fa07895 spelling: structure
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-10-06 13:25:49 -04:00
Josh Soref
d4d3d66352 spelling: skipping
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-10-06 13:25:49 -04:00
Josh Soref
42c82564ae spelling: reference
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-10-06 13:25:49 -04:00
Josh Soref
799e590e75 spelling: maintaining
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-08-09 00:45:29 -04:00
Josh Soref
2e629094b6 spelling: allows
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-08-08 22:52:56 -04:00
github-actions
60edc790b3 Updated Linguist to v7.21.0 2022-06-09 20:09:50 +00:00
github-actions
9f73cdf211 Updated Linguist to v7.20.0 2022-04-05 20:12:53 +00:00
Lauris BH
ae2b0576a7
Add check for non-backtracking subexpressions 2022-03-21 13:54:11 +02:00
github-actions
8eac4cab85 Updated Linguist to v7.19.0 2022-03-03 20:08:49 +00:00
github-actions
2febea0489 Updated Linguist to v7.18.0 2021-12-15 20:08:13 +00:00
github-actions
b3ee64f627 Updated Linguist to v7.17.0 2021-11-14 18:33:24 +01:00
Luke Francl
02878b9c9f Rename CodemirrorMode to CodeMirrorMode
It is a bit of a Rubyism to translate "CodeMirror Mode" into "codemirror_mode".
This is more in line with Go practices.
2021-10-12 16:18:33 -07:00
Luke Francl
4bde6c61a1 remove obsolete TODO 2021-10-11 14:06:29 -07:00
Luke Francl
b248b21349 Expose LanguageInfo with all Linguist data
As discussed in https://github.com/go-enry/go-enry/issues/54, this provides an
API for accessing a LanguageInfo struct which is populated with all the data
from the Linguist YAML source file. Functions are provided to access the
LanguageInfo by name or ID.

The other top-level functions like GetLanguageExtensions, GetLanguageGroup, etc.
could in principle be implemented using this structure, which would simplify the
code generation. But that would be a big change so I didn't do any of that.
Perhaps in the next major version something like that would make sense.
2021-10-11 13:32:29 -07:00
Lauris BH
0affa3ccca Update to Linguist v7.16.1 2021-09-25 23:57:50 +03:00
Luke Francl
dfb8041dcc Update generated code for Linguist 7.14.0 2021-04-26 09:36:25 -07:00
Luke Francl
eb043e80a8 Add GetLanguageID function
The Linguist-defined language IDs are important to our use case because they are
used as database identifiers. This adds a new generator to extract the language
IDs into a map and uses that to implement GetLanguageID.

Because one language has the ID 0, there is no way to tell if a language name is
found or not. If desired, we could add this by returning (string, bool) from
GetLanguageID. But none of the other functions that take language names do this,
so I didn't want to introduce it here.
2021-04-13 11:49:21 -07:00
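
The ambiguity described in the commit above (a valid ID of 0 versus a name that was not found) can be illustrated with a minimal sketch. This is not the go-enry implementation: the map contents, the language names, and both helper functions are hypothetical, and the sketch assumes integer IDs.

```go
package main

import "fmt"

// languageIDs is a hypothetical stand-in for a generated name-to-ID map.
// The names and IDs below are made up for illustration only.
var languageIDs = map[string]int{
	"Example-A": 0, // a real entry whose ID happens to be the zero value
	"Example-B": 7,
}

// lookupID returns only the ID. A missing name yields 0, which is
// indistinguishable from the valid ID of "Example-A".
func lookupID(name string) int {
	return languageIDs[name]
}

// lookupIDOK additionally reports whether the name was present,
// using Go's comma-ok form of the map index expression.
func lookupIDOK(name string) (int, bool) {
	id, ok := languageIDs[name]
	return id, ok
}

func main() {
	// Both calls print 0, so the caller cannot tell them apart.
	fmt.Println(lookupID("Example-A"), lookupID("No-Such-Language"))

	// The second return value resolves the ambiguity.
	id, ok := lookupIDOK("No-Such-Language")
	fmt.Println(id, ok)
}
```

Returning a second boolean value is the usual Go idiom for this case; the trade-off noted in the commit message is that it would differ from the other name-based lookup functions.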
Lauris BH
c40b34c351 Sync with Linguist v7.13.0 2021-03-07 18:02:04 +02:00
Lauris BH
497e2f85d3 Sync with github/linguist version v7.12.2 2021-01-17 14:10:38 +02:00
Lauris BH
289ac3d9f0 Sync with linguist 7.12.1 2020-11-15 14:32:56 +02:00
Lauris BH
bc76dd38b0 sync to the latest github/linguist v7.11.1 2020-10-12 12:32:48 +03:00
Lauris BH
7c562a6c34 sync to the latest github/linguist v7.11.0 2020-09-17 10:34:41 +03:00
Máximo Cuadros
29bc0a181b
data: replace substring package with regex package 2020-04-15 17:27:48 +02:00
Lauris BH
97a26011a9 Return group color if language has none 2020-03-31 09:30:27 +03:00
Lauris BH
9030d3671b sync to the latest github/linguist v7.9.0 2020-03-30 01:25:57 +03:00
Alexander Bezzubov
3ea961e5ab
generator: change-detector tests on EOL-dependent sample
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2020-03-29 23:23:56 +02:00
Alexander Bezzubov
9be0211f04
generator: skip symlinks on *nix and win
As Git on win does not support symlinks [1], we have to hard-code
the paths to files under ./samples/ in Linguist codebase that are
known to be symlinks.

 1. https://github.com/git-for-windows/git/wiki/Symbolic-Links

TestPlan:
 - go test ./internal/code-generator/generator -run Test_GeneratorTestSuite

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2020-03-29 23:23:56 +02:00
Alexander Bezzubov
78eee0cf7e
generator: flag to debug building of bayesian classifier
It seems that reading ./samples/ from Linguist consumes
a different number of files from filesystem on different OSes.

This change adds ENRY_DEBUG env var to print some debug output
about calculations of token stats from samples.

TestPlan:
 - ENRY_DEBUG=1 go test -v ./internal/code-generator/generator \
	-run Test_GeneratorTestSuite -testify.m TestGenerationFiles

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2020-03-29 19:35:49 +02:00
Alexander Bezzubov
3a5f4b2db1
generator: more debug output in case of failure
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2020-03-25 14:20:26 +01:00