51 Commits

Author SHA1 Message Date
Alex Bezzubov
b41b4e14fe test: refactor a single maybeCloneLinguist() impl 2023-09-06 23:20:43 +03:00
Alex Bezzubov
b5441048b3 test: actually a correct LanguagesByFilename case :/ 2023-09-06 20:58:00 +03:00
Alex Bezzubov
f435dd406f test: add one more LanguagesByFilename case 2023-09-06 20:26:30 +03:00
Alex Bezzubov
86cae02425 test: cover GetLanguageByContent confusing edge cases
And clarify documentation wording, based on discussion
at https://github.com/go-enry/go-enry/issues/145

test plan:
 * go test -run '^Test_EnryTestSuite$' -testify.m '^(TestGetLanguageByContent)$' ./...
2022-12-01 22:10:01 +01:00
Alex Bezzubov
6be1ebe9d6 test: fail fast on suite setup/teardown 2022-12-01 22:00:58 +01:00
Alex Bezzubov
43475949cc test: limit linguist repo history size 2022-12-01 22:00:58 +01:00
Alex Bezzubov
a93364ec79 refactoring: add separate test suite for linguist samples/fixtures 2022-12-01 22:00:58 +01:00
Alex Bezzubov
bb7a81ede4 refactoring: unify, extract&reuse maybeCloneLinguist() 2022-12-01 22:00:58 +01:00
Alex Bezzubov
0c3a5927bb test: case-insensitive language name comparison 2021-11-14 18:28:09 +01:00
Alex Bezzubov
d47102badf test: update language name fixture 2021-11-14 18:28:09 +01:00
Luke Francl
b248b21349 Expose LanguageInfo with all Linguist data
As discussed in https://github.com/go-enry/go-enry/issues/54, this provides an
API for accessing a LanguageInfo struct which is populated with all the data
from the Linguist YAML source file. Functions are provided to access the
LanguageInfo by name or ID.

The other top-level functions like GetLanguageExtensions, GetLanguageGroup, etc.
could in principle be implemented using this structure, which would simplify the
code generation. But that would be a big change so I didn't do any of that.
Perhaps in the next major version something like that would make sense.
2021-10-11 13:32:29 -07:00
Lauris BH
0affa3ccca Update to Linguist v7.16.1 2021-09-25 23:57:50 +03:00
Lauris BH
4686615d9e Improve shebang parsing to detect correct interpreter 2021-09-25 19:24:44 +03:00
Michael Rykov
58f8dccbcf Fixed GetLanguagesByShebang for paths with “env” 2021-06-19 00:49:05 +08:00
Alex
0a9864e6ec
Merge pull request #46 from look/look/add-language-id
Add GetLanguageID function
2021-04-24 08:32:32 +02:00
Luke Francl
cabfdaffc0 Update GetLanguageID to return a found boolean per code review 2021-04-22 16:55:42 -07:00
Luke Francl
bf7167fc44 Rewrite GetLanguages to work like Linguist.detect
Prior to this change, GetLanguages collected all candidate languages from each
strategy to pass to the next strategy (without de-duplicating them). Linguist
only uses the previous strategy's candidates for the next strategy. Also, it
would overwrite languages with nil if a strategy returned that, so you could get
into a situation where you go from multiple languages to no language.

See the Ruby code for details: aad49acc06/lib/linguist.rb (L14-L49)

This addresses https://github.com/src-d/enry/issues/207 because GetLanguages
should not return all candidates detected, otherwise it would work differently
than Linguist.
2021-04-13 12:04:47 -07:00
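
A minimal sketch of the candidate hand-off described in the commit message above, assuming a simplified strategy signature. The names `strategy` and `detect` are hypothetical and are not the library's actual API. Each strategy sees only the candidates left by the previous one, and an empty result does not discard the candidates collected so far.

```go
package main

import "fmt"

// strategy is a hypothetical stand-in for a detection strategy: it receives
// the candidates left by the previous strategy and returns its own guesses.
type strategy func(filename string, content []byte, candidates []string) []string

// detect runs the strategies in order. A single result ends the search,
// several results replace the candidate set, and an empty result leaves
// the previous candidates untouched.
func detect(filename string, content []byte, strategies []strategy) []string {
	var candidates []string
	for _, s := range strategies {
		langs := s(filename, content, candidates)
		switch {
		case len(langs) == 1:
			return langs // unambiguous: stop early
		case len(langs) > 1:
			candidates = langs // narrow the set for the next strategy
		}
		// len(langs) == 0: keep the candidates gathered so far
	}
	return candidates
}

func main() {
	// Illustrative strategies only; real ones inspect the file.
	byExtension := func(string, []byte, []string) []string {
		return []string{"C", "C++", "Objective-C"}
	}
	byContent := func(_ string, _ []byte, candidates []string) []string {
		return nil // this strategy could not decide
	}
	fmt.Println(detect("foo.h", nil, []strategy{byExtension, byContent}))
	// prints: [C C++ Objective-C]
}
```

The early return on a single result mirrors the idea that later strategies only exist to break ties. Nothing is de-duplicated here because each step replaces the candidate set instead of appending to it.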
Luke Francl
eb043e80a8 Add GetLanguageID function
The Linguist-defined language IDs are important to our use case because they are
used as database identifiers. This adds a new generator to extract the language
IDs into a map and uses that to implement GetLanguageID.

Because one language has the ID 0, there is no way to tell if a language name is
found or not. If desired, we could add this by returning (int, bool) from
GetLanguageID. But none of the other functions that take language names do this,
so I didn't want to introduce it here.
2021-04-13 11:49:21 -07:00
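
The ambiguity around ID 0 can be illustrated with a short sketch, assuming integer IDs stored in a map. The map contents and the name `lookupLanguageID` are hypothetical placeholders, not the generated data or the library's function. Because 0 is both a valid ID and the zero value of a missing map entry, a second boolean result is what separates the two cases.

```go
package main

import "fmt"

// languageIDs is a hypothetical stand-in for the generated name-to-ID map.
// The names and values are placeholders, not real Linguist data.
var languageIDs = map[string]int{
	"ExampleZero":  0,
	"ExampleOther": 7,
}

// lookupLanguageID returns the ID and whether the name is known.
// Without the boolean, a missing name and the language with ID 0
// would both come back as 0.
func lookupLanguageID(name string) (int, bool) {
	id, ok := languageIDs[name]
	return id, ok
}

func main() {
	for _, name := range []string{"ExampleZero", "ExampleOther", "Missing"} {
		id, found := lookupLanguageID(name)
		fmt.Printf("%s: id=%d found=%t\n", name, id, found)
	}
}
```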
Lauris BH
323d739170 Fix test 2021-03-07 18:34:08 +02:00
Lauris BH
6d8f15af5b Add XML strategy 2020-11-15 15:43:37 +02:00
Lauris BH
cb353b4b05 Add support for Roff man pages filenames 2020-10-12 12:18:57 +03:00
Lauris BH
7c562a6c34 sync to the latest github/linguist v7.11.0 2020-09-17 10:34:41 +03:00
Lauris BH
97a26011a9 Return group color if language has none 2020-03-31 09:30:27 +03:00
Alexander Bezzubov
1ab8148c10
test: fix platform-dependent paths in tests
Test Plan:
 - go test ./internal/code-generator/... -run Test_GeneratorTestSuite -testify.m TestGenerationFiles
2020-03-19 19:47:22 +01:00
Máximo Cuadros
84efad7693
*: module rename to go-enry/go-enry/v4 2020-03-19 17:31:29 +01:00
Alexander Bezzubov
bc5e031cee Drop src-d org ref except for issues
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2020-03-19 14:04:36 +01:00
Alexander Bezzubov
fa097f4ed4
go: remove Classifier from API
Further reduces the public API surface by
hiding the unused Classifier API for providing
pre-trained classifier weights.

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-10-29 18:20:33 +01:00
Alexander Bezzubov
3f0c4e182b
go: reduce API surface
Don't export defaultClassifier

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-10-29 18:14:43 +01:00
Alexander Bezzubov
6a5f37e9e2
modules: prepare for v2 release
- update go.mod with v2
 - update all import paths

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-14 21:28:12 +02:00
Alexander Bezzubov
20c6d2845a
build: gopkg.in -> github.com imports
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2019-04-12 11:49:16 +02:00
kuba--
5adfee5761
Do not return empty lang.
It's better to return any potential candidate than nothing.

Signed-off-by: kuba-- <kuba@sourced.tech>
2019-03-14 14:08:19 +01:00
Alexander
3499750785
Sync to linguist 7.2.0: heuristics.yml support (#189)
Sync with GitHub Linguist v7.2.0

Includes the new way of handling `heuristics.yml` and
all `./data/*` re-generated using the GitHub Linguist [v7.2.0](https://github.com/github/linguist/releases/tag/v7.2.0)
release tag.

 - many new languages
 - better vendoring detection
 - update doc on update&known issues.
2019-02-14 12:47:45 +01:00
Alexander
ef50154395
Maintenance: batch of minor changes (#183)
* exclude build artifacts from git
* build: simplify building by using src-d/ci
* bench: simplify&fix shell runners
* build: simplify benchmarks* targets
* test: remove dependency on single test suite
* doc: rel image link + linguist cli difference highlight
* suggestions from code review
* bench: add fail fast to all shell runners

Signed-off-by: Alexander Bezzubov <bzz@apache.org>
2018-12-27 11:55:34 +01:00
Alfredo Beaumont
a7cfa65953 tests: Add testcases for empty filenames
Signed-off-by: Alfredo Beaumont <alfredo.beaumont@gmail.com>
2017-12-07 16:42:02 +01:00
Manuel Carmona
8ddce8bc4b Added cases with nil and empty content to TestGetLanguagesByModeline
Signed-off-by: Manuel Carmona <manu.carmona90@gmail.com>
2017-11-08 14:41:31 +01:00
Manuel Carmona
f6649550f0 fixed test for GetLanguagesByShebang function
Signed-off-by: Manuel Carmona <manu.carmona90@gmail.com>
2017-11-08 14:41:31 +01:00
Vadim Markovtsev
c97a180da5 Fix review suggestions
Signed-off-by: Vadim Markovtsev <vadim@sourced.tech>
2017-10-26 15:51:02 +02:00
Vadim Markovtsev
250519bb51 Add the external test linguist dir from env var
This allows using a cached directory with linguist instead of cloning, and speeds up the tests by about 10s on my local machine.

Signed-off-by: Vadim Markovtsev <vadim@sourced.tech>
2017-09-28 23:51:38 +02:00
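
A rough sketch of the idea in the commit message above: prefer a directory named by an environment variable and only fall back to cloning when it is unset. The variable name `EXAMPLE_LINGUIST_DIR` and the function `resolveLinguistDir` are hypothetical and are not taken from the test suite. The lookup function is passed in so the decision can be tested without touching the real environment.

```go
package main

import (
	"fmt"
	"os"
)

// linguistDirEnv is a hypothetical variable name used only for this sketch.
const linguistDirEnv = "EXAMPLE_LINGUIST_DIR"

// resolveLinguistDir reports which directory to use and whether a fresh
// clone is still needed. lookup has the same shape as os.LookupEnv.
func resolveLinguistDir(lookup func(string) (string, bool)) (dir string, needsClone bool) {
	if v, ok := lookup(linguistDirEnv); ok && v != "" {
		return v, false // reuse the cached checkout
	}
	return "", true // nothing cached: the caller has to clone
}

func main() {
	dir, needsClone := resolveLinguistDir(os.LookupEnv)
	if needsClone {
		fmt.Println("no cached linguist directory set, a clone would be needed")
		return
	}
	fmt.Println("using cached linguist directory:", dir)
}
```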
Alexander Bezzubov
3303cf7824 Fix 🐛 on file starting with single shebang 2017-07-25 10:37:11 +02:00
Manuel Carmona
510c430fd0 fixed some tests that were not using a temp-linguist-repo 2017-07-18 13:31:34 +02:00
Manuel Carmona
d8798c2dd9 binary files are returned as OtherLanguage by GetLanguage 2017-07-04 11:38:43 +02:00
David Paz
3f2248084e Moved commit.go to data directory 2017-06-28 11:22:42 +02:00
Manuel Carmona
b7d4be5fdd fixed the commit that tests run against
renamed tmpLinguist to repoLinguist and SimpleLinguistTestSuite to EnryTestSuite in common_test.go

changed receiver's name for TestSuites to 's'

fixed comments
2017-06-26 15:35:53 +02:00
Manuel Carmona
bea1bc3af8 split GetLanguage into GetLanguage and GetLanguages 2017-06-15 13:02:59 +02:00
Manuel Carmona
beda5b73e7 changed signatures for strategies 2017-06-15 10:07:23 +02:00
Manuel Carmona
5f0e92b1a8 changed test LinguistCorpus to use GetLanguage and fail if not assert 2017-06-15 10:07:23 +02:00
Manuel Carmona
ba53e10c7b renamed package and cli to enry 2017-06-13 14:18:23 +02:00
Manuel Carmona
0d5dff1979 changes in the API, ready for version 2 2017-06-06 11:30:23 +02:00
Manuel Carmona
5b304524d1 Rearranged code 2017-06-02 09:33:55 +02:00
Máximo Cuadros
2bbd7ec440 unified GetLanguage function 2016-07-18 16:20:12 +02:00