Commit Graph

142 Commits

Author SHA1 Message Date
Alex Bezzubov
5e590f3554 code-gen: make content heuristics regexp engine configurable & generation syntax-aware 2022-12-25 22:37:52 +01:00
Alex Bezzubov
0b92f97b9c code-gen: refactoring, re-use function map in templates 2022-12-25 22:37:52 +01:00
Alex
a9296f134c
Merge pull request #149 from go-enry/improve-gen-tests
Generator tests: add readable text diff output
2022-12-18 19:32:15 +01:00
Alex Bezzubov
c0176b04e7 gen-test: don't expose diff + attributions 2022-12-12 20:48:35 +01:00
Alex Bezzubov
c79c32f525 gen-test: add readable text diff output
test plan:
 * go test -run '^Test_GeneratorTestSuite$' \
	-testify.m '^(TestGenerationFiles)$' \
	github.com/go-enry/go-enry/v2/internal/code-generator/generator
2022-12-03 18:55:48 +01:00
Alex
2059129b5e
Merge branch 'master' into spelling 2022-12-03 10:48:23 +01:00
Alex
a8344728a7
Merge pull request #143 from go-enry/re-collation-at-codegen
Move vendor RE collation at codegen
2022-12-02 10:11:39 +01:00
Alex Bezzubov
2c708f0b6c gen: re-generate aliases & test fixture for go 1.19 2022-12-02 09:55:50 +01:00
Alex Bezzubov
375b301238 code-gen: reformat template for go 1.19
https://tip.golang.org/doc/go1.19#go-doc with introduction of
https://tip.golang.org/doc/comment has broken the code generator tests.
2022-12-02 09:55:50 +01:00
Alex Bezzubov
f4051b0f16 gen: re-generated vendors using build-time optimization 2022-12-01 22:16:54 +01:00
Alex Bezzubov
ede9e478fe IsVendor: move RE collation to code generation phase
test plan:
 * go test -run '^TestIsVendor$' github.com/go-enry/go-enry/v2
2022-12-01 22:16:44 +01:00
Alex Bezzubov
6be1ebe9d6 test: fail fast on suite setup/teardown 2022-12-01 22:00:58 +01:00
Alex Bezzubov
43475949cc test: limit linguist repo history size 2022-12-01 22:00:58 +01:00
Alex Bezzubov
bb7a81ede4 refactoring: unify, extract&reuse maybeCloneLinguist() 2022-12-01 22:00:58 +01:00
Alex Bezzubov
5683b2e7f8 test: refactored to clarify Linguist cloning logic on codegen 2022-10-23 11:07:50 +02:00
Alex Bezzubov
3feb720575 code-gen: fail fast
Stop the code generation process early if any of its
generators fail rather than skipping it with the log message.
2022-10-23 10:56:01 +02:00
Josh Soref
bc7767728d spelling: syntax
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-10-06 13:25:49 -04:00
Josh Soref
500fa07895 spelling: structure
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-10-06 13:25:49 -04:00
Josh Soref
d4d3d66352 spelling: skipping
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-10-06 13:25:49 -04:00
Josh Soref
42c82564ae spelling: reference
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-10-06 13:25:49 -04:00
Josh Soref
799e590e75 spelling: maintaining
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-08-09 00:45:29 -04:00
Josh Soref
2e629094b6 spelling: allows
Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-08-08 22:52:56 -04:00
github-actions
60edc790b3 Updated Linguist to v7.21.0 2022-06-09 20:09:50 +00:00
github-actions
9f73cdf211 Updated Linguist to v7.20.0 2022-04-05 20:12:53 +00:00
Lauris BH
ae2b0576a7
Add check for non-backtracking subexpressions 2022-03-21 13:54:11 +02:00
github-actions
8eac4cab85 Updated Linguist to v7.19.0 2022-03-03 20:08:49 +00:00
github-actions
2febea0489 Updated Linguist to v7.18.0 2021-12-15 20:08:13 +00:00
github-actions
b3ee64f627 Updated Linguist to v7.17.0 2021-11-14 18:33:24 +01:00
Luke Francl
03b31eb4ce
Update internal/code-generator/main.go
Co-authored-by: Lauris BH <lauris@nix.lv>
2021-10-13 10:30:19 -07:00
Luke Francl
02878b9c9f Rename CodemirrorMode to CodeMirrorMode
It is a bit of a Rubyism to translate "CodeMirror Mode" into "codemirror_mode".
This is more in line with Go practices.
2021-10-12 16:18:33 -07:00
Luke Francl
b6b72c6c08 Add documentation to LanguageInfo struct fields
These are adapted from https://github.com/github/linguist/blob/master/lib/linguist/languages.yml
2021-10-12 16:13:59 -07:00
Luke Francl
6212f1fcb4 Remove name -> LanguageInfo mapping per code review
The GetLanguageInfo method is now implemented in terms of GetLanguageInfoByID.
This is possible because you can use GetLanguageID to get the ID for a language.
2021-10-12 13:29:39 -07:00
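
The idea in 6212f1fcb4, dropping the name-keyed map and resolving a name to an ID first, might look something like the sketch below. The `info` type, both maps, and their contents are made up for illustration; the real tables are generated from Linguist data and the real functions have their own signatures.

```go
package main

import "fmt"

// info is a hypothetical, trimmed-down stand-in for a language record.
type info struct {
	ID   int
	Name string
}

// Illustrative data only; these names and IDs are invented.
var idByName = map[string]int{"LangA": 0, "LangB": 7}

var infoByID = map[int]info{
	0: {ID: 0, Name: "LangA"},
	7: {ID: 7, Name: "LangB"},
}

// lookupByID is the single source of truth for language records.
func lookupByID(id int) (info, bool) {
	i, ok := infoByID[id]
	return i, ok
}

// lookupByName needs no name-keyed copy of the records: it resolves the
// name to an ID and delegates to lookupByID.
func lookupByName(name string) (info, bool) {
	id, ok := idByName[name]
	if !ok {
		return info{}, false
	}
	return lookupByID(id)
}

func main() {
	fmt.Println(lookupByName("LangB"))
	fmt.Println(lookupByName("Missing"))
}
```

Keeping the records in one map means the generator emits them once, at the cost of a second map lookup per name-based query.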
Luke Francl
6279d53f66 clean up whitespace in template 2021-10-11 14:20:25 -07:00
Luke Francl
4bde6c61a1 remove obsolete TODO 2021-10-11 14:06:29 -07:00
Luke Francl
b248b21349 Expose LanguageInfo with all Linguist data
As discussed in https://github.com/go-enry/go-enry/issues/54, this provides an
API for accessing a LanguageInfo struct which is populated with all the data
from the Linguist YAML source file. Functions are provided to access the
LanguageInfo by name or ID.

The other top-level functions like GetLanguageExtensions, GetLanguageGroup, etc.
could in principle be implemented using this structure, which would simplify the
code generation. But that would be a big change so I didn't do any of that.
Perhaps in the next major version something like that would make sense.
2021-10-11 13:32:29 -07:00
Lauris BH
0affa3ccca Update to Linguist v7.16.1 2021-09-25 23:57:50 +03:00
Luke Francl
dfb8041dcc Update generated code for Linguist 7.14.0 2021-04-26 09:36:25 -07:00
Luke Francl
eb043e80a8 Add GetLanguageID function
The Linguist-defined language IDs are important to our use case because they are
used as database identifiers. This adds a new generator to extract the language
IDs into a map and uses that to implement GetLanguageID.

Because one language has the ID 0, there is no way to tell if a language name is
found or not. If desired, we could add this by returning (string, bool) from
GetLanguageID. But none of the other functions that take language names do this,
so I didn't want to introduce it here.
2021-04-13 11:49:21 -07:00
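
The ambiguity noted in eb043e80a8 comes from Go maps returning the zero value for missing keys. A small sketch, using an invented table and hypothetical function names, shows the problem and the extra-boolean variant the message mentions as a possible follow-up:

```go
package main

import "fmt"

// languageIDs is an invented table; the real one is generated from Linguist.
var languageIDs = map[string]int{
	"LangA": 0, // a language whose legitimate ID is 0
	"LangB": 7,
}

// lookupID mirrors the single-result shape: an unknown name yields 0,
// which is indistinguishable from the real ID of "LangA".
func lookupID(name string) int {
	return languageIDs[name]
}

// lookupIDOK uses the comma-ok form so callers can tell the two cases apart.
func lookupIDOK(name string) (int, bool) {
	id, ok := languageIDs[name]
	return id, ok
}

func main() {
	fmt.Println(lookupID("LangA"), lookupID("Missing")) // prints "0 0"

	id, ok := lookupIDOK("Missing")
	fmt.Println(id, ok) // prints "0 false"
}
```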
Lauris BH
c40b34c351 Sync with Linguist v7.13.0 2021-03-07 18:02:04 +02:00
Lauris BH
497e2f85d3 Sync with github/linguist version v7.12.2 2021-01-17 14:10:38 +02:00
Lauris BH
289ac3d9f0 Sync with linguist 7.12.1 2020-11-15 14:32:56 +02:00
Lauris BH
bc76dd38b0 sync to the latest github/linguist v7.11.1 2020-10-12 12:32:48 +03:00
Lauris BH
7c562a6c34 sync to the latest github/linguist v7.11.0 2020-09-17 10:34:41 +03:00
Máximo Cuadros
29bc0a181b
data: replace substring package with regex package 2020-04-15 17:27:48 +02:00
Lauris BH
97a26011a9 Return group color if language has none 2020-03-31 09:30:27 +03:00
Lauris BH
9030d3671b sync to the latest github/linguist v7.9.0 2020-03-30 01:25:57 +03:00
Alexander Bezzubov
3ea961e5ab
generator: change-detector tests on EOL-dependent sample
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2020-03-29 23:23:56 +02:00
Alexander Bezzubov
9be0211f04
generator: skip symlinks on *nix and win
As Git on win does not support symlinks [1], we have to hard-code
the paths to files under ./samples/ in Linguist codebase that are
known to be symlinks.

 1. https://github.com/git-for-windows/git/wiki/Symbolic-Links

TestPlan:
 - go test ./internal/code-generator/generator -run Test_GeneratorTestSuite

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2020-03-29 23:23:56 +02:00
Alexander Bezzubov
78eee0cf7e
generator: flag to debug building of bayesian classifier
It seems that reading ./samples/ from Linguist consumes
a different number of files from filesystem on different OSes.

This change adds ENRY_DEBUG env var to print some debug output
about calculations of token stats from samples.

TestPlan:
 - ENRY_DEBUG=1 go test -v ./internal/code-generator/generator \
	-run Test_GeneratorTestSuite -testify.m TestGenerationFiles

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2020-03-29 19:35:49 +02:00
Alexander
b78e4423f0
generator: drop platform-specific separator
Co-Authored-By: Lauris BH <lauris@nix.lv>
2020-03-25 19:27:46 +01:00