tartrazine

mirror of https://github.com/ralsina/tartrazine.git synced 2025-09-16 10:27:34 +00:00

Author	SHA1	Message	Date
Lauris BH	0affa3ccca	Update to Linguist v7.16.1	2021-09-25 23:57:50 +03:00
Luke Francl	dfb8041dcc	Update generated code for Linguist 7.14.0	2021-04-26 09:36:25 -07:00
Luke Francl	eb043e80a8	Add GetLanguageID function The Linguist-defined language IDs are important to our use case because they are used as database identifiers. This adds a new generator to extract the language IDs into a map and uses that to implement GetLanguageID. Because one language has the ID 0, there is no way to tell if a language name is found or not. If desired, we could add this by returning (string, bool) from GetLanguageID. But none of the other functions that take language names do this, so I didn't want to introduce it here.	2021-04-13 11:49:21 -07:00
Lauris BH	c40b34c351	Sync with Linguist v7.13.0	2021-03-07 18:02:04 +02:00
Lauris BH	497e2f85d3	Sync with github/linguist version v7.12.2	2021-01-17 14:10:38 +02:00
Lauris BH	289ac3d9f0	Sync with linguist 7.12.1	2020-11-15 14:32:56 +02:00
Lauris BH	bc76dd38b0	sync to the latest github/linguist v7.11.1	2020-10-12 12:32:48 +03:00
Lauris BH	7c562a6c34	sync to the latest github/linguist v7.11.0	2020-09-17 10:34:41 +03:00
Máximo Cuadros	29bc0a181b	data: replace substring package with regex package	2020-04-15 17:27:48 +02:00
Lauris BH	97a26011a9	Return group color if language has none	2020-03-31 09:30:27 +03:00
Lauris BH	9030d3671b	sync to the latest github/linguist v7.9.0	2020-03-30 01:25:57 +03:00
Alexander Bezzubov	3ea961e5ab	generator: change-detector tests on EOL-dependant sample Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2020-03-29 23:23:56 +02:00
Alexander Bezzubov	9be0211f04	generator: skip symlinks on *nix and win As Git on win does not support symlinks [1], we have to hard-code the paths to files under ./samples/ in Linguist codebase that are known to be a symlink. 1. https://github.com/git-for-windows/git/wiki/Symbolic-Links TestPlan: - go test ./internal/code-generator/generator -run Test_GeneratorTestSuite Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2020-03-29 23:23:56 +02:00
Alexander Bezzubov	78eee0cf7e	generator: flag to debug building of bayesian classifier It seems that reading ./samples/ from Linguist consumes a different number of files from filesystem on different OSes. This change adds ENRY_DEBUG env var to print some debug output about calculations of token stats from samples. TestPlan: - ENRY_DEBUG=1 go test -v ./internal/code-generator/generator \ -run Test_GeneratorTestSuite -testify.m TestGenerationFiles Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2020-03-29 19:35:49 +02:00
Alexander	b78e4423f0	generator: drop platform-specific separator Co-Authored-By: Lauris BH <lauris@nix.lv>	2020-03-25 19:27:46 +01:00
Alexander Bezzubov	3a5f4b2db1	generator: mode debug output in case of failure Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2020-03-25 14:20:26 +01:00
Alexander Bezzubov	b0f94ad693	generator: CLI tool fix to support win paths On Win `make code-generate` produces unreasonable Bayesian classifier weights from Linguist samples silently, failing only the final classification tests. TestPlan: - go test ./internal/code-generator/... \ -run Test_GeneratorTestSuite -testify.m TestGenerationFiles Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2020-03-25 14:00:24 +01:00
Alexander Bezzubov	78d8f43a88	tokenizer: hide flex-based impl, avoid build failures on win TestPlan: - go test -run TestTokenize ./internal/tokenizer - go test -tags flex -run TestTokenize ./internal/tokenizer (should fail as default fixtures are from regex-based tokenizer)	2020-03-19 19:58:48 +01:00
Alexander Bezzubov	1ab8148c10	test: fix platform-dependent paths in tests Test Plan: - go test ./internal/code-generator/... -run Test_GeneratorTestSuite -testify.m TestGenerationFiles	2020-03-19 19:47:22 +01:00
Alexander Bezzubov	e32a70a784	tokenizer: fix a bug and regenerate the code \w latest Go See https://github.com/bzz/enry/pull/4 for details. Test Plan: - go test ./...	2020-03-19 19:08:21 +01:00
Máximo Cuadros	84efad7693	*: module rename to go-enry/go-enry/v4	2020-03-19 17:31:29 +01:00
Alexander Bezzubov	bc5e031cee	Drop src-d org ref except for issues Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2020-03-19 14:04:36 +01:00
Lauris Bukšis-Haberkorns	4e3e15e80d	Sync to linguist v7.5.1 Signed-off-by: Lauris BH <lauris@nix.lv>	2019-08-06 17:18:01 +03:00
Lauris Bukšis-Haberkorns	2f5526ddba	Improve detection of unsupported regexp syntax Signed-off-by: Lauris Bukšis-Haberkorns <lauris@nix.lv>	2019-08-05 22:24:03 +03:00
Lauris Bukšis-Haberkorns	25b29ebdc4	Implement getting color code for languages Signed-off-by: Lauris Bukšis-Haberkorns <lauris@nix.lv>	2019-07-19 23:59:46 +03:00
Alexander Bezzubov	f3ceaa6330	token: refactor & simplify test fixtures Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-05-08 22:17:32 +02:00
Alexander Bezzubov	a724a2f841	token: test case for regexp + non-valid UTF8 Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-05-07 13:46:36 +02:00
Alexander Bezzubov	8bdc830833	token: new test case with Unicode replacement Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-17 19:28:06 +02:00
Alexander Bezzubov	278eaf1c22	tokenizer: move flex-based to modules Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-17 13:54:34 +02:00
Alexander	ae43e1a91f	Merge pull request #219 from bzz/go-mod Introduce Go modules	2019-04-17 13:39:55 +02:00
Alexander Bezzubov	7e136bade8	test: don't export tokenizer fixtures Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-16 19:38:48 +02:00
Alexander Bezzubov	6c7b91cb91	doc: improve API doc on review feedback Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-16 19:38:48 +02:00
Alexander Bezzubov	ada6f15c93	address review feedback Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-16 19:38:48 +02:00
Alexander Bezzubov	7929933eb5	tokenizer: cleanup & attributions Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-14 21:38:16 +02:00
Alexander Bezzubov	8756fbdcb4	refactor to build tags Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-14 21:38:16 +02:00
Alexander Bezzubov	553399ed76	tokenizer: port flex-based C impl from linguist Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-14 21:38:16 +02:00
Alexander Bezzubov	6a5f37e9e2	modules: prepare for v2 release - update go.mod \w v2 - update all import paths Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-14 21:28:12 +02:00
Alexander Bezzubov	20c6d2845a	build: gopkg.in -> github.com imports Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-12 11:49:16 +02:00
Alexander Bezzubov	85d5906b2b	address review feedback - tixing a fypo Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-11 21:36:29 +02:00
Alexander Bezzubov	41478262f3	fix verb mismatch in a format string Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-11 15:28:49 +02:00
Alexander Bezzubov	bdb5603f28	Address code review feedback Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-08 16:07:10 +02:00
Alexander Bezzubov	b2b61c2a8c	gen: refactoring, renaming vars for readability This does not change the logic of the generator but only renames/moves some vars for readability Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-04-03 15:40:23 +02:00
M. J. Fromberger	3a6d42b39a	doc: fix spelling Co-Authored-By: bzz <bzz@users.noreply.github.com>	2019-02-21 09:33:17 +01:00
Alexander Bezzubov	baefa18475	gen: compare generated code to gold ignoring whitespaces Reason is that gofmt can change between versions e.g see https://go-review.googlesource.com/c/go/+/122295/ and this would avoid breaking tests and edit wars Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-02-20 23:22:02 +01:00
Alexander Bezzubov	c8e0f75132	test: make gen test output less verbose Signed-off-by: Alexander Bezzubov <bzz@apache.org>	2019-02-20 23:22:02 +01:00
Alexander	3499750785	Sync to linguist 7.2.0: heuristics.yml support (#189) Sync \w Github Linguist v7.2.0 Includes new way of handling `heuristics.yml` and all `./data/*` re-generated using Github Linguist [v7.2.0](https://github.com/github/linguist/releases/tag/v7.2.0) release tag. - many new languages - better vendoring detection - update doc on update&known issues.	2019-02-14 12:47:45 +01:00
M. J. Fromberger	5245079744	Apply suggestions from review. Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>	2019-01-29 11:28:44 -08:00
M. J. Fromberger	dabb41527f	Apply suggestions from review. Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>	2019-01-29 11:28:42 -08:00
M. J. Fromberger	4027b494b3	Add documentation comments to package tokenizer. Although this package is internal, it still exports an API and deserves some comments. Serves in partial satisfaction of #195. Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>	2019-01-29 11:18:52 -08:00
M. J. Fromberger	7d277b11de	Copy the tokenizer input to avoid modifying the caller's copy. Addresses #196. Several of the tokenizer's processing steps wind up editing the source, and we don't want those changes to be observed by the caller, which may use the source for other purposes afterward. Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>	2019-01-29 10:12:33 -08:00
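
The GetLanguageID commit (eb043e80a8) notes that, because one language has the ID 0, a caller cannot tell a found language from a missing name. A minimal sketch of that ambiguity and of the two-value alternative the message mentions follows. Everything in it is hypothetical: the `languageIDs` map, its entries, and the `lookupID`/`lookupIDOK` helpers are made up for illustration and are not the repository's generated data or API, and the ID is modeled as an `int` by assumption.

```go
package main

import "fmt"

// languageIDs is a hypothetical stand-in for a generated name-to-ID map.
// The names and numbers are illustrative only, not real Linguist IDs.
var languageIDs = map[string]int{
	"ExampleLangA": 0, // a valid entry whose ID happens to be 0
	"ExampleLangB": 7,
}

// lookupID mirrors the single-return shape described in the commit:
// an unknown name yields the map's zero value, which is also 0.
func lookupID(name string) int {
	return languageIDs[name]
}

// lookupIDOK is the two-value variant: the bool reports whether
// the name was present, so ID 0 is no longer ambiguous.
func lookupIDOK(name string) (int, bool) {
	id, ok := languageIDs[name]
	return id, ok
}

func main() {
	// Both calls return 0, so the caller cannot distinguish them.
	fmt.Println(lookupID("ExampleLangA"), lookupID("NoSuchLang"))

	// The comma-ok form separates "found with ID 0" from "not found".
	idA, okA := lookupIDOK("ExampleLangA")
	idX, okX := lookupIDOK("NoSuchLang")
	fmt.Println(idA, okA, idX, okX)
}
```

The commit keeps the single-return form for consistency with the other name-based functions; the two-value form trades that consistency for an unambiguous result.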


107 Commits