Commit Graph

83 Commits

Author SHA1 Message Date
25b29ebdc4 Implement getting color code for languages
Signed-off-by: Lauris Bukšis-Haberkorns <lauris@nix.lv>
2019-07-19 23:59:46 +03:00
f3ceaa6330 token: refactor & simplify test fixtures
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-05-08 22:17:32 +02:00
a724a2f841 token: test case for regexp + non-valid UTF8
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-05-07 13:46:36 +02:00
8bdc830833 token: new test case with Unicode replacement
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-17 19:28:06 +02:00
278eaf1c22 tokenizer: move flex-based to modules
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-17 13:54:34 +02:00
ae43e1a91f Merge pull request #219 from bzz/go-mod
Introduce Go modules
2019-04-17 13:39:55 +02:00
7e136bade8 test: don't export tokenizer fixtures
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-16 19:38:48 +02:00
6c7b91cb91 doc: improve API doc on review feedback
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-16 19:38:48 +02:00
ada6f15c93 address review feedback
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-16 19:38:48 +02:00
7929933eb5 tokenizer: cleanup & attributions
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-14 21:38:16 +02:00
8756fbdcb4 refactor to build tags
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-14 21:38:16 +02:00
553399ed76 tokenizer: port flex-based C impl from linguist
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-14 21:38:16 +02:00
6a5f37e9e2 modules: prepare for v2 release
 - update go.mod \w v2
 - update all import paths

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-14 21:28:12 +02:00
20c6d2845a build: gopkg.in -> github.com imports
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-12 11:49:16 +02:00
85d5906b2b address review feedback - tixing a fypo
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-11 21:36:29 +02:00
41478262f3 fix verb mismatch in a format string
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-11 15:28:49 +02:00
bdb5603f28 Address code review feedback
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-08 16:07:10 +02:00
b2b61c2a8c gen: refactoring, renaming vars for readability
This does not change the logic of the generator
but only renames/moves some vars for readability

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-03 15:40:23 +02:00
3a6d42b39a doc: fix spelling
Co-Authored-By: bzz <bzz@users.noreply.github.com>
2019-02-21 09:33:17 +01:00
baefa18475 gen: compare generated code to gold ignoring whitespaces
Reason is that gofmt can change between versions e.g
see https://go-review.googlesource.com/c/go/+/122295/
and this would avoid breaking tests and edit wars

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-02-20 23:22:02 +01:00
c8e0f75132 test: make gen test output less verbose
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-02-20 23:22:02 +01:00
3499750785 Sync to linguist 7.2.0: heuristics.yml support (#189)
Sync \w Github Linguist v7.2.0

Includes new way of handling `heuristics.yml` and
all `./data/*` re-generated using Github Linguist [v7.2.0](https://github.com/github/linguist/releases/tag/v7.2.0)
release tag.

 - many new languages
 - better vendoring detection
 - update doc on update&known issues.
2019-02-14 12:47:45 +01:00
5245079744 Apply suggestions from review.
Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
2019-01-29 11:28:44 -08:00
dabb41527f Apply suggestions from review.
Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
2019-01-29 11:28:42 -08:00
4027b494b3 Add documentation comments to package tokenizer.
Although this package is internal, it still exports an API and deserves some
comments. Serves in partial satisfaction of #195.

Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
2019-01-29 11:18:52 -08:00
7d277b11de Copy the tokenizer input to avoid modifying the caller's copy.
Addresses #196. Several of the tokenizer's processing steps wind up editing the
source, and we don't want those changes to be observed by the caller, which may
use the source for other purposes afterward.

Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
2019-01-29 10:12:33 -08:00
169060e1cd Add a test that tokenization does not modify the input.
At present this test fails, since the tokenizer replaces text in shared slices
of the input. A subsequent commit will fix that.

Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
2019-01-29 10:03:09 -08:00
15bb13117f Refactor Oniguruma integration
Instead of using a command to change imports before the build, use a build tag to generate the correct binary.

This will allow applications to compile enry using oniguruma with less trouble.

Signed-off-by: Antonio Jesus Navarro Perez <antnavper@gmail.com>
2018-08-29 18:01:13 +03:00
7eafe024af write a canonical header for machine-generated files
Signed-off-by: Denys Smirnov <denys@sourced.tech>
2018-04-30 12:57:39 +03:00
7923b86ebd Rename onigumura to oniguruma
This change names the dependency like it's called. The link to the
package was correct, but all other references were renamed where I could
find time with git grep.

Signed-off-by: Zeger-Jan van de Weg <git@zjvandeweg.nl>
2018-03-28 21:34:54 +02:00
ce5adee8ab Merge pull request #113 from vmarkovtsev/master
Use rubex for faster regular expressions
2017-10-26 18:03:43 +02:00
a66154b7eb Make tokenizer regexps work under rubex
Signed-off-by: Vadim Markovtsev <vadim@sourced.tech>
2017-10-26 17:04:31 +02:00
09d6add804 Fix review
Signed-off-by: Vadim Markovtsev <vadim@sourced.tech>
2017-10-26 17:02:58 +02:00
c97a180da5 Fix review suggestions
Signed-off-by: Vadim Markovtsev <vadim@sourced.tech>
2017-10-26 15:51:02 +02:00
250519bb51 Add the external test linguist dir from env var
This allows to use a cached directory with linguist instead of cloning and speeds up the tests by -10s on my local machine.

Signed-off-by: Vadim Markovtsev <vadim@sourced.tech>
2017-09-28 23:51:38 +02:00
52d7ccd6cf Updated mimeType.gold and regenerated mimeType.go
2017-07-19 10:18:18 +02:00
b2fe3f69ce Added mymeType.gold
2017-07-18 12:47:19 +02:00
ea819f58c2 Renamed mime to mimeType
2017-07-18 12:46:29 +02:00
632422db69 Added pending untracked files
2017-07-18 12:46:29 +02:00
125c802582 Now generates mime file
2017-07-18 12:46:29 +02:00
2045abfa41 use of gopkg.in/toqueteos/substring.v1 in content.go to improve GetLanguagesByContent performance
2017-07-13 08:21:09 +02:00
3f2248084e Moved commit.go to data directory
2017-06-28 11:22:42 +02:00
7e827e47ef moved generated data to data subpackage
2017-06-28 08:31:11 +02:00
b7d4be5fdd commit against tests run is fixed
renamed tmpLinguist to repoLinguist and SimpleLinguistTestSuite to EnryTestSuit in common_test.go

changed receiver's name for TestSuites to 's'

fixed comments
2017-06-26 15:35:53 +02:00
17a6f3dc89 Changed commit ref to .git/HEAD
2017-06-19 11:20:24 +02:00
beda5b73e7 changed signatures for strategies
2017-06-15 10:07:23 +02:00
1fc8cf7a5d changes to improve detection accuracy
2017-06-15 10:07:22 +02:00
ba53e10c7b renamed package and cli to enry
2017-06-13 14:18:23 +02:00
3a470f617c project renamed to enry
2017-06-08 09:27:27 +02:00
0d5dff1979 changes in the API, ready to version 2
2017-06-06 11:30:23 +02:00