This mitigates the problem that the tokenizer uses a regex
that matches platform-specific line endings.
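
One way to make such a regex behave the same on every platform is to
normalize line endings before the content reaches the tokenizer. The
sketch below is only an illustration of that idea; `normalizeEOL` is a
hypothetical helper, not necessarily what this change does.

```go
package main

import (
	"bytes"
	"fmt"
)

// normalizeEOL is a hypothetical helper: it rewrites CRLF and lone CR
// line endings to LF so that later regex matching sees only "\n".
func normalizeEOL(content []byte) []byte {
	// Replace CRLF first so that the lone-CR pass does not double the breaks.
	content = bytes.ReplaceAll(content, []byte("\r\n"), []byte("\n"))
	return bytes.ReplaceAll(content, []byte("\r"), []byte("\n"))
}

func main() {
	// A fixture checked out with Windows line endings.
	fixture := []byte("// comment\r\nfunc main() {}\r\n")
	fmt.Printf("%q\n", normalizeEOL(fixture))
}
```

Normalizing the input keeps the regex itself unchanged; the alternative
would be to make the pattern tolerate an optional `\r` before each `\n`.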
TestPlan:
- go test ./internal/code-generator/generator \
-run Test_GeneratorTestSuite -testify.m TestTokenizerOnATS
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
As Git on Windows does not support symlinks [1], we have to hard-code
the paths to files under ./samples/ in the Linguist codebase that are
known to be symlinks.
1. https://github.com/git-for-windows/git/wiki/Symbolic-Links
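
A minimal sketch of the hard-coded lookup, assuming a set keyed by
slash-separated paths relative to ./samples/. The names `knownSymlinks`
and `isKnownSymlink`, and the single entry shown, are hypothetical
placeholders rather than the actual list.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// knownSymlinks holds sample paths, relative to the samples directory,
// that are symlinks upstream. The entry below is a made-up placeholder.
var knownSymlinks = map[string]bool{
	"Lang/link-to-sample.ext": true,
}

// isKnownSymlink reports whether path, located under samplesDir, is one
// of the hard-coded symlinks. Paths are compared in slash-separated form
// so the same table works on Windows and on POSIX systems.
func isKnownSymlink(samplesDir, path string) bool {
	rel, err := filepath.Rel(samplesDir, path)
	if err != nil {
		return false
	}
	return knownSymlinks[filepath.ToSlash(rel)]
}

func main() {
	p := filepath.Join("samples", "Lang", "link-to-sample.ext")
	fmt.Println(isKnownSymlink("samples", p))
}
```

A walker over ./samples/ could call such a check to skip these entries
instead of relying on `os.Lstat` reporting a symlink.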
TestPlan:
- go test ./internal/code-generator/generator -run Test_GeneratorTestSuite
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
It seems that reading ./samples/ from Linguist picks up
a different number of files from the filesystem on different OSes.
This change adds ENRY_DEBUG env var to print some debug output
about calculations of token stats from samples.
TestPlan:
- ENRY_DEBUG=1 go test -v ./internal/code-generator/generator \
-run Test_GeneratorTestSuite -testify.m TestGenerationFiles
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
On Windows, `make code-generate` silently produces unreasonable
Bayesian classifier weights from Linguist samples,
failing only the final classification tests.
TestPlan:
- go test ./internal/code-generator/... \
-run Test_GeneratorTestSuite -testify.m TestGenerationFiles
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
TestPlan:
- go test -run TestTokenize ./internal/tokenizer
- go test -tags flex -run TestTokenize ./internal/tokenizer
(should fail, as the default fixtures are from the regex-based tokenizer)
Further reduces the public API surface by
hiding the unused Classifier API for providing
pre-trained classifier weights.
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
A PoC that exposes a single function,
`enry.language_by_extension()`, and a small
number of helpers to deal with string
conversion between Go<->C<->Python.
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
Fixes #243. The default behaviour for `go get` has changed slightly and we now
need to either provide a module context or disable modules for installation to
work correctly.
Also remove a now-obsolete reference to the source{d} engine CLI.
Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>
Addresses #239. The `go get` command fetches the command-line tool, and does
not match the import path for the library. To make things clearer:
1. Mention explicitly that `go get` fetches the CLI. Also, to avoid potential
issues with pre-modules Go versions, do the fetch in /tmp.
2. Include an import path explicitly in the source examples.
3. Mention explicitly how to import enry into a modules build.
Signed-off-by: M. J. Fromberger <michael.j.fromberger@gmail.com>