Commit Graph

124 Commits

Author SHA1 Message Date
Alexander Bezzubov
7929933eb5
tokenizer: cleanup & attributions
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-14 21:38:16 +02:00
Alexander Bezzubov
8756fbdcb4
refactor to build tags
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-14 21:38:16 +02:00
Alexander Bezzubov
553399ed76
tokenizer: port flex-based C impl from linguist
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-14 21:38:16 +02:00
Alexander Bezzubov
6a5f37e9e2
modules: prepare for v2 release
- update go.mod \w v2
 - update all import paths

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-14 21:28:12 +02:00
Alexander Bezzubov
20c6d2845a
build: gopkg.in -> github.com imports
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-12 11:49:16 +02:00
Alexander Bezzubov
85d5906b2b
address review feedback - tixing a fypo
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-11 21:36:29 +02:00
Alexander Bezzubov
41478262f3
fix verb mismatch in a format string
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-11 15:28:49 +02:00
Alexander Bezzubov
bdb5603f28
Address code review feedback
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-08 16:07:10 +02:00
Alexander Bezzubov
b2b61c2a8c
gen: refactoring, renaming vars for readability
This does not change the logic of the generator
but only renames/moves some vars for readability

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-03 15:40:23 +02:00
M. J. Fromberger
3a6d42b39a
doc: fix spelling
Co-Authored-By: bzz <bzz@users.noreply.github.com>
2019-02-21 09:33:17 +01:00
Alexander Bezzubov
baefa18475
gen: compare generated code to gold ignoring whitespaces
Reason is that gofmt can change between versions e.g
see https://go-review.googlesource.com/c/go/+/122295/
and this would avoid breaking tests and edit wars

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-02-20 23:22:02 +01:00
Alexander Bezzubov
c8e0f75132
test: make gen test output less verbose
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-02-20 23:22:02 +01:00
Alexander
3499750785
Sync to linguist 7.2.0: heuristics.yml support (#189)
Sync \w Github Linguist v7.2.0

Includes new way of handling `heuristics.yml` and
all `./data/*` re-generated using Github Linguist [v7.2.0](https://github.com/github/linguist/releases/tag/v7.2.0)
release tag.

 - many new languages
 - better vendoring detection
 - update doc on update&known issues.
2019-02-14 12:47:45 +01:00
M. J. Fromberger
5245079744 Apply suggestions from review.
Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
2019-01-29 11:28:44 -08:00
M. J. Fromberger
dabb41527f Apply suggestions from review.
Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
2019-01-29 11:28:42 -08:00
M. J. Fromberger
4027b494b3 Add documentation comments to package tokenizer.
Although this package is internal, it still exports an API and deserves some
comments. Serves in partial satisfaction of #195.

Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
2019-01-29 11:18:52 -08:00
M. J. Fromberger
7d277b11de Copy the tokenizer input to avoid modifying the caller's copy.
Addresses #196. Several of the tokenizer's processing steps wind up editing the
source, and we don't want those changes to be observed by the caller, which may
use the source for other purposes afterward.

Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
2019-01-29 10:12:33 -08:00
M. J. Fromberger
169060e1cd Add a test that tokenization does not modify the input.
At present this test fails, since the tokenizer replaces text in shared slices
of the input. A subsequent commit will fix that.

Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
2019-01-29 10:03:09 -08:00
Antonio Jesus Navarro Perez
15bb13117f Refactor Oniguruma integration
Instead of using a command to change imports before the build, use a build tag to generate the correct binary.

This will allow applications to compile enry using oniguruma with less trouble.

Signed-off-by: Antonio Jesus Navarro Perez <antnavper@gmail.com>
2018-08-29 18:01:13 +03:00
Denys Smirnov
7eafe024af write a canonical header for machine-generated files
Signed-off-by: Denys Smirnov <denys@sourced.tech>
2018-04-30 12:57:39 +03:00
Zeger-Jan van de Weg
7923b86ebd
Rename onigumura to oniguruma
This change names the dependency like it's called. The link to the
package was correct, but all other references were renamed where I could
find time with git grep.

Signed-off-by: Zeger-Jan van de Weg <git@zjvandeweg.nl>
2018-03-28 21:34:54 +02:00
Alfredo Beaumont
ce5adee8ab Merge pull request #113 from vmarkovtsev/master
Use rubex for faster regular expressions
2017-10-26 18:03:43 +02:00
Vadim Markovtsev
a66154b7eb Make tokenizer regexps work under rubex
Signed-off-by: Vadim Markovtsev <vadim@sourced.tech>
2017-10-26 17:04:31 +02:00
Vadim Markovtsev
09d6add804 Fix review
Signed-off-by: Vadim Markovtsev <vadim@sourced.tech>
2017-10-26 17:02:58 +02:00
Vadim Markovtsev
c97a180da5 Fix review suggestions
Signed-off-by: Vadim Markovtsev <vadim@sourced.tech>
2017-10-26 15:51:02 +02:00
Vadim Markovtsev
250519bb51 Add the external test linguist dir from env var
This allows to use a cached directory with linguist instead of cloning and speeds up the tests by -10s on my local machine.

Signed-off-by: Vadim Markovtsev <vadim@sourced.tech>
2017-09-28 23:51:38 +02:00
David Paz
52d7ccd6cf Updated mimeType.gold and regenerated mimeType.go 2017-07-19 10:18:18 +02:00
David Paz
b2fe3f69ce Added mymeType.gold 2017-07-18 12:47:19 +02:00
David Paz
ea819f58c2 Renamed mime to mimeType 2017-07-18 12:46:29 +02:00
David Paz
632422db69 Added pending untracked files 2017-07-18 12:46:29 +02:00
David Paz
125c802582 Now generates mime file 2017-07-18 12:46:29 +02:00
Manuel Carmona
2045abfa41 use of gopkg.in/toqueteos/substring.v1 in content.go to improve GetLanguagesByContent performance 2017-07-13 08:21:09 +02:00
David Paz
3f2248084e Moved commit.go to data directory 2017-06-28 11:22:42 +02:00
David Paz
7e827e47ef moved generated data to data subpackage 2017-06-28 08:31:11 +02:00
Manuel Carmona
b7d4be5fdd commit against tests run is fixed
renamed tmpLinguist to repoLinguist and SimpleLinguistTestSuite to EnryTestSuit in common_test.go

changed receiver's name for TestSuites to 's'

fixed comments
2017-06-26 15:35:53 +02:00
David Paz
17a6f3dc89 Changed commit ref to .git/HEAD 2017-06-19 11:20:24 +02:00
Manuel Carmona
beda5b73e7 changed signatures for strategies 2017-06-15 10:07:23 +02:00
Manuel Carmona
1fc8cf7a5d changes to improve detection accuracy 2017-06-15 10:07:22 +02:00
Manuel Carmona
ba53e10c7b renamed package and cli to enry 2017-06-13 14:18:23 +02:00
Máximo Cuadros
3a470f617c project renamed to enry 2017-06-08 09:27:27 +02:00
Manuel Carmona
0d5dff1979 changes in the API, ready to version 2 2017-06-06 11:30:23 +02:00
Manuel Carmona
5b304524d1 Rearranged code 2017-06-02 09:33:55 +02:00
Manuel Carmona
f8b8f7f5c4 Added classifier to the sequence of strategies 2017-05-30 09:07:58 +02:00
Manuel Carmona
fcf30a07c8 Added frequencies.go generation 2017-05-29 12:19:37 +02:00
Manuel Carmona
45314b4903 Added everything necessary to make GetLanguageByAlias functionality work 2017-05-08 11:34:00 +02:00
Manuel Carmona
6f3ad6d30d separated GetLanguageType and languagesType map in different files due to a better generation files 2017-05-03 12:17:54 +02:00
Manuel Carmona
cbf44205e0 fixed GetLanguageType to return Unknown when language is not found in languagesType map 2017-05-03 10:48:28 +02:00
Manuel Carmona
664afe48d4 fixed GetLanguageByContent returned value when there is not a function matcher for the extension 2017-05-03 10:37:34 +02:00
Manuel Carmona
28dc452853 added some corner cases to content.go generation tests 2017-04-27 17:32:42 +02:00
Manuel Carmona
63d4d9bf24 removed templates from test_files directory to use templates from assets directory in tests 2017-04-27 17:32:42 +02:00
Manuel Carmona
f63a25d794 all related to extension strategy renamed to reference it 2017-04-27 17:32:42 +02:00
Manuel Carmona
645bdd7331 added filenames_map.go generation
languagesByFilename now is a map[string]string
2017-04-27 17:30:57 +02:00
Manuel Carmona
f45efec5fb GetLanguageType and Type constants have comments now
type.go comments generated from type.go.tmpl
2017-04-27 16:40:28 +02:00
Manuel Carmona
c6d74bca66 added shebang functionality
fixed autogenerated comment

changed constant types names

GetLanguageByShebang doesn't print errors

languageInfo struct change to have only necessary fields

GetLanguageByShebang has a comment now
2017-04-27 16:40:08 +02:00
Manuel Carmona
2644a7c8da added interpreters_map.go generation
fixed Interpreters comment
2017-04-27 16:39:54 +02:00
Manuel Carmona
6ddbb79af0 changed generator_test.go to use only TestFromFile
modified *.test.yml to contain only necessary information

fixed white spaces

remove duplicated file languages.test.tmpl
2017-04-27 16:39:36 +02:00
Manuel Carmona
1bf555bc4c changed getAlphabeticalOrderedKeys to use sort.Strings 2017-04-27 16:35:23 +02:00
Manuel Carmona
c08b85120d created 'type Type int' for type.go generation 2017-04-17 12:08:54 +02:00
Manuel Carmona
b277944b2a fixed constant iotas 2017-04-17 12:00:50 +02:00
Manuel Carmona
25e835f5fd slice of languages arranged in alphabetical order 2017-04-17 11:55:29 +02:00
Manuel Carmona
9a9968dca0 added comments to constants 2017-04-17 11:55:29 +02:00
Manuel Carmona
ef39403555 added type.go generation 2017-04-17 11:55:29 +02:00
Manuel Carmona
5d61ca93d8 changed langs.go to unmarshal on a languageInfo struct 2017-04-17 11:55:29 +02:00
Manuel Carmona
ca3ae587f3 added documentation_matchers.go generation 2017-04-17 11:52:11 +02:00
Manuel Carmona
65996506ae fixed Vendor function's comment 2017-04-10 10:32:54 +02:00
Manuel Carmona
30772e4ea0 changed executeVendorTemplate's parameters names 2017-04-10 10:27:44 +02:00
Manuel Carmona
f175c2d20b changed Vendor function's comment and parameters names 2017-04-10 10:25:52 +02:00
Manuel Carmona
eaf473743b changed function name executeUtilsTemplate to executeVendorTemplate 2017-04-10 10:20:38 +02:00
Manuel Carmona
e998b0ff2e regexp for vendored files and directories are generated in vendor_matchers.go 2017-04-07 09:27:40 +02:00
Manuel Carmona
13e7886a02 Added utils.go generation 2017-04-06 17:31:17 +02:00
Máximo Cuadros
3a2a62baad move srcd.works to gopkg.in 2017-04-05 18:26:58 +02:00
Manuel Carmona
03c71a9b93 move content.go generation to internal 2017-04-05 18:15:27 +02:00
Manuel Carmona
ba22a0a243 content generator 2017-04-05 18:09:14 +02:00
Manuel Carmona
665b7475e3 code generation move to internal/code-generator 2017-04-05 17:49:58 +02:00